
Project: 2676 (Run 3, Clone 134, Gen 157) error

Posted: Sun Jun 07, 2009 2:51 am
by alpha754293

Code:

[23:40:59] *------------------------------*
[23:40:59] Folding@Home Gromacs SMP Core
[23:40:59] Version 2.07 (Sun Apr 19 14:51:09 PDT 2009)
[23:40:59] Preparing to commence simulation
[23:40:59] - Ensuring status. Please wait.
[23:41:00] Called DecompressByteArray: compressed_data_size=4862076 data_size=24072629, decompressed_data_size=24072629 diff=0
[23:41:00] - Digital signature verified
[23:41:00] Project: 2676 (Run 3, Clone 134, Gen 157)
[23:41:00] Assembly optimizations on if available.
[23:41:00] Entering M.D.
[23:41:09] un 3, Clone 134, Gen 157)
[23:41:10] Entering M.D.
NNODES=8, MYRANK=1, HOSTNAME=computenode
NNODES=8, MYRANK=0, HOSTNAME=computenode
NNODES=8, MYRANK=2, HOSTNAME=computenode
NNODES=8, MYRANK=3, HOSTNAME=computenode
NNODES=8, MYRANK=6, HOSTNAME=computenode
NNODES=8, MYRANK=7, HOSTNAME=computenode
NNODES=8, MYRANK=5, HOSTNAME=computenode
NNODES=8, MYRANK=4, HOSTNAME=computenode
NODEID=0 argc=20
NODEID=1 argc=20
NODEID=2 argc=20
NODEID=3 argc=20
                         :-)  G  R  O  M  A  C  S  (-:

NODEID=6 argc=20
                   Groningen Machine for Chemical Simulation

                 :-)  VERSION 4.0.99_development_20090307  (-:

      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2008, The GROMACS development team,
            check out for more information.

                                :-)  mdrun  (-:
NODEID=7 argc=20

Reading file work/wudata_05.tpr, VERSION 3.3.99_development_20070618 (single precision)
NODEID=4 argc=20
NODEID=5 argc=20
Note: tpx file_version 48, software version 64

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 3D domain decomposition 2 x 2 x 2
starting mdrun '23133 system in water'
39500004 steps,  79000.0 ps (continuing from step 39250004,  78500.0 ps).
[23:45:47] pleted 2500 out of 250000 steps  (1%)
[23:50:15] Completed 5000 out of 250000 steps  (2%)
[23:54:44] Completed 7500 out of 250000 steps  (3%)
[23:59:13] Completed 10000 out of 250000 steps  (4%)
[00:03:42] Completed 12500 out of 250000 steps  (5%)
[00:08:11] Completed 15000 out of 250000 steps  (6%)
[00:12:40] Completed 17500 out of 250000 steps  (7%)
[00:17:09] Completed 20000 out of 250000 steps  (8%)
[00:21:38] Completed 22500 out of 250000 steps  (9%)
[00:26:10] Completed 25000 out of 250000 steps  (10%)
[00:30:43] Completed 27500 out of 250000 steps  (11%)
[00:35:16] Completed 30000 out of 250000 steps  (12%)
[00:39:49] Completed 32500 out of 250000 steps  (13%)
[00:44:21] Completed 35000 out of 250000 steps  (14%)
[00:48:53] Completed 37500 out of 250000 steps  (15%)
[00:53:24] Completed 40000 out of 250000 steps  (16%)
[00:57:56] Completed 42500 out of 250000 steps  (17%)
[01:02:29] Completed 45000 out of 250000 steps  (18%)
[01:07:02] Completed 47500 out of 250000 steps  (19%)
[01:11:33] Completed 50000 out of 250000 steps  (20%)
[01:16:06] Completed 52500 out of 250000 steps  (21%)
[01:20:40] Completed 55000 out of 250000 steps  (22%)
[01:25:12] Completed 57500 out of 250000 steps  (23%)
[01:29:46] Completed 60000 out of 250000 steps  (24%)
[01:34:18] Completed 62500 out of 250000 steps  (25%)
[01:38:50] Completed 65000 out of 250000 steps  (26%)
[01:43:23] Completed 67500 out of 250000 steps  (27%)

Program mdrun, VERSION 4.0.99_development_20090307
Source code file: nsgrid.c, line: 357

Range checking error:
Explanation: During neighborsearching, we assign each particle to a grid
based on its coordinates. If your system contains collisions or parameter
errors that give particles very high velocities you might end up with some
coordinates being +-Infinity or NaN (not-a-number). Obviously, we cannot
put these on a grid, so this is usually where we detect those errors.
Make sure your system is properly energy-minimized and that the potential
energy seems reasonable before trying again.

Variable ci has value -2147483479. It should have been within [ 0 .. 1694 ]

For more information and tips for trouble shooting please check the GROMACS Wiki at

Thanx for Using GROMACS - Have a Nice Day

Error on node 7, will try to stop all the nodes
Halting parallel pro
Program mdrun, VERSION 4.0.99_development_20090307
Source code file: nsgrid.c, line: 357

Range checking error:
Explanation: During neighborsearching, we assign each particle to a grid
based on its coordinates. If your system contains collisions or parameter
errors that give particles very high velocities you might end up with some
coordinates being +-Infinity or NaN (not-a-number). Obviously, we cannot
put these on a grid, so this is usually where we detect those errors.
Make sure your system is properly energy-minimized and that the potential
energy seems reasonable before trying again.

Variable ci has value -2147483477. It should have been within [ 0 .. 1530 ]

For more information and tips for trouble shooting please check the GROMACS Wiki at

Thanx for Using GROMACS - Have a Nice Day

Error on node 0, will try to stop all the nodes
Halting parallel pro
Program mdrun, VERSION 4.0.99_development_20090307
Source code file: nsgrid.c, line: 357

Range checking error:
Explanation: During neighborsearching, we assign each particle to a grid
based on its coordinates. If your system contains collisions or parameter
errors that give particles very high velocities you might end up with some
coordinates being +-Infinity or NaN (not-a-number). Obviously, we cannot
put these on a grid, so this is usually where we detect those errors.
Make sure your system is properly energy-minimized and that the potential
energy seems reasonable before trying again.

Variable ci has value -2147483491. It should have been within [ 0 .. 1573 ]

For more information and tips for trouble shooting please check the GROMACS Wiki at

Thanx for Using GROMACS - Have a Nice Day

Error on node 3, will try to stop all the nodes
Halting parallel program mdrun on CPU 7 out of 8

gcq#0: Thanx for Using GROMACS - Have a Nice Day

[cli_7]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 7
gram mdrun on CPU 0 out of 8

gcq#0: Thanx for Using GROMACS - Have a Nice Day

[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0
gram mdrun on CPU 3 out of 8

gcq#0: Thanx for Using GROMACS - Have a Nice Day

[cli_3]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 3
restarting client...
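
The range checking explanation in the log above describes assigning each particle to a grid cell from its coordinates and rejecting indices outside the valid range. Below is a minimal Python sketch of that idea. It is not the code from nsgrid.c: the function name `cell_index`, the box and grid dimensions, and the flat index layout are all assumptions made for illustration. The reported ci values sit just above the smallest 32-bit signed integer (-2147483648), which would be consistent with a non-finite or very large coordinate being converted to an integer, though the log alone does not confirm that.

```python
import math

def cell_index(pos, box, dims):
    """Map a 3D position to a flat grid-cell index (hypothetical layout).

    pos  -- (x, y, z) particle coordinates
    box  -- (lx, ly, lz) box edge lengths, assumed rectangular
    dims -- (nx, ny, nz) number of grid cells along each axis
    Raises ValueError for non-finite or out-of-range coordinates.
    """
    nx, ny, nz = dims
    cells = []
    for x, length, n in zip(pos, box, dims):
        # NaN or +-Infinity cannot be placed on a grid at all.
        if not math.isfinite(x):
            raise ValueError(f"non-finite coordinate {x!r}; cannot assign to grid")
        cells.append(math.floor(x / length * n))
    cx, cy, cz = cells
    ci = (cx * ny + cy) * nz + cz
    # Range check in the spirit of the "Variable ci has value ..." message.
    if not (0 <= cx < nx and 0 <= cy < ny and 0 <= cz < nz):
        raise ValueError(
            f"Variable ci has value {ci}. "
            f"It should have been within [ 0 .. {nx * ny * nz - 1} ]"
        )
    return ci

if __name__ == "__main__":
    print(cell_index((0.5, 0.5, 0.5), (1.0, 1.0, 1.0), (2, 2, 2)))
    for bad in [(float("nan"), 0.0, 0.0), (1e30, 0.0, 0.0)]:
        try:
            cell_index(bad, (1.0, 1.0, 1.0), (2, 2, 2))
        except ValueError as err:
            print("rejected:", err)
```

A particle with a runaway velocity ends up with a coordinate like the second bad example, so the check fires at grid assignment even though the underlying problem happened earlier in the simulation.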

Re: Project: 2676 (Run 3, Clone 134, Gen 157) error

Posted: Mon Jun 08, 2009 3:17 am
by susato
Thanks "Al" - I trust this WU finished properly the second time through? At your < 5 minute frame time it should have finished off several hours ago. Please let us know if it failed again.
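
The frame-time estimate can be checked against the timestamps in the posted log. The sketch below is only illustrative: the helper names `frame_seconds` and `estimated_hours` are hypothetical, and it assumes 100 frames per WU (250000 steps at 2500 steps per frame, as the progress lines suggest) and a constant frame time.

```python
from datetime import datetime

def frame_seconds(stamps):
    """Seconds between consecutive HH:MM:SS stamps, allowing for midnight rollover."""
    times = [datetime.strptime(s, "%H:%M:%S") for s in stamps]
    deltas = []
    for earlier, later in zip(times, times[1:]):
        delta = (later - earlier).total_seconds()
        if delta < 0:
            # The log carries no date, so a negative gap means we crossed midnight.
            delta += 24 * 3600
        deltas.append(delta)
    return deltas

def estimated_hours(stamps, frames=100):
    """Estimated wall-clock hours for a whole WU at the average observed frame time."""
    deltas = frame_seconds(stamps)
    return frames * (sum(deltas) / len(deltas)) / 3600

if __name__ == "__main__":
    # Timestamps copied from the 2% to 6% progress lines in the log above.
    stamps = ["23:50:15", "23:54:44", "23:59:13", "00:03:42", "00:08:11"]
    print(frame_seconds(stamps))
    print(round(estimated_hours(stamps), 2), "hours")
```

For those five log lines each frame takes 269 seconds, which puts a full run at roughly seven and a half hours if the pace holds.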

Re: Project: 2676 (Run 3, Clone 134, Gen 157) error

Posted: Mon Jun 08, 2009 1:09 pm
by alpha754293
susato wrote: Thanks "Al" - I trust this WU finished properly the second time through? At your < 5 minute frame time it should have finished off several hours ago. Please let us know if it failed again.
Yes, it appears that the WU finished successfully on the second try.

Code:

[08:20:22] Project: 2676 (Run 3, Clone 134, Gen 157)

[08:20:22] + Attempting to send results [June 7 08:20:22 UTC]