Project: 2677 (Run 16, Clone 11, Gen 37) CoreStatus = 66 (10

parkut
Posts: 363
Joined: Tue Feb 12, 2008 7:33 am
Hardware configuration: Running exclusively Linux headless blades. All are dedicated crunching machines.
Location: SE Michigan, USA

Project: 2677 (Run 16, Clone 11, Gen 37) CoreStatus = 66 (10

Post by parkut »

Found it "running" but making no progress; system load was all zeros.
Restarted the client and it crashed. Deleted the WU and restarted:
same thing. Repeated until I was assigned a different WU.

model name	: Intel(R) Core(TM)2 CPU          4400  @ 2.00GHz
cpu MHz		: 1999.944
cache size	: 2048 KB
Memory: 1.96 GB physical, 1.94 GB virtual
...
Client Version 6.24R3  
Core: FahCore_a2.exe
Core Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
...
[15:19:12] *------------------------------*
[15:19:12] Folding@Home Gromacs SMP Core
[15:19:12] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[15:19:12] 
[15:19:12] Preparing to commence simulation
[15:19:12] - Ensuring status. Please wait.
[15:19:13] Called DecompressByteArray: compressed_data_size=4836865 data_size=24042173, decompressed_data_size=24042173 diff=0
[15:19:14] - Digital signature verified
[15:19:14] 
[15:19:14] Project: 2677 (Run 16, Clone 11, Gen 37)
[15:19:14] 
[15:19:14] Assembly optimizations on if available.
[15:19:14] Entering M.D.
[15:19:23] Run 16, Clone 11, Gen 37)
[15:19:23] 
[15:19:23] Entering M.D.
[15:19:34] lding@home Core Shutdown: INTERRUPTED
[15:19:38] CoreStatus = 66 (102)
[15:19:38] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[15:19:38] Killing all core threads

Folding@Home Client Shutdown.
Ant P
Posts: 1
Joined: Sat Aug 29, 2009 1:04 pm

p2677 bad WU? (won't run and CoreStatus=66)

Post by Ant P »

When I start ./fah6, it seems to work for a few seconds, then shuts itself down with a lot of output:

[17:50:22] Loaded queue successfully.
[17:50:22] 
[17:50:22] + Processing work unit
[17:50:22] At least 4 processors must be requested.Core required: FahCore_a2.exe
[17:50:22] Core found.
[17:50:22] Working on queue slot 07 [November 10 17:50:22 UTC]
[17:50:22] + Working ...
[17:50:22] 
[17:50:22] *------------------------------*
[17:50:22] Folding@Home Gromacs SMP Core
[17:50:22] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[17:50:22] 
[17:50:22] Preparing to commence simulation
[17:50:22] - Ensuring status. Please wait.
[17:50:23] Called DecompressByteArray: compressed_data_size=4836865 data_size=24042173, decompressed_data_size=24042173 diff=0
[17:50:23] - Digital signature verified
[17:50:23] 
[17:50:23] Project: 2677 (Run 16, Clone 11, Gen 37)
[17:50:23] 
[17:50:23] Assembly optimizations on if available.
[17:50:23] Entering M.D.
[17:50:33] Run 16, Clone 11, Gen 37)
[17:50:33] 
[17:50:33] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=erosion
NODEID=0 argc=20
NNODES=4, MYRANK=1, HOSTNAME=erosion
NODEID=1 argc=20
NNODES=4, MYRANK=2, HOSTNAME=erosion
NODEID=2 argc=20
NNODES=4, MYRANK=3, HOSTNAME=erosion
NODEID=3 argc=20
Reading file work/wudata_07.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 68

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22869 system in water'
9500001 steps,  19000.0 ps (continuing from step 9250001,  18500.0 ps).

t = 18500.003 ps: Water molecule starting at atom 143485 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.005 ps: Water molecule starting at atom 79402 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.007 ps: Water molecule starting at atom 143485 can not be settled.
Check for bad contacts and/or reduce the timestep.
[17:50:41] lding@home Core Shutdown: INTERRUPTED
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Quit
[17:50:45] CoreStatus = 66 (102)
[17:50:45] + Shutdown requested by user. Exiting.
Folding@Home Client Shutdown.
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Project: 2677 (Run 16, Clone 11, Gen 37)

Post by _r2w_ben »

Project: 2677 (Run 16, Clone 11, Gen 37) fails to start.

[16:56:28] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 05 -checkpoint 30 -verbose -lifeline 16790 -version 624'

[16:56:28]
[16:56:28] *------------------------------*
[16:56:28] Folding@Home Gromacs SMP Core
[16:56:28] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[16:56:28]
[16:56:28] Preparing to commence simulation
[16:56:28] - Ensuring status. Please wait.
[16:56:29] Called DecompressByteArray: compressed_data_size=4836865 data_size=24042173, decompressed_data_size=24042173 diff=0
[16:56:29] - Digital signature verified
[16:56:29]
[16:56:29] Project: 2677 (Run 16, Clone 11, Gen 37)
[16:56:29]
[16:56:29] Assembly optimizations on if available.
[16:56:29] Entering M.D.
[16:56:39] Run 16, Clone 11, Gen 37)
[16:56:39]
[16:56:39] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=removed
NNODES=4, MYRANK=1, HOSTNAME=removed
NODEID=0 argc=20
NODEID=1 argc=20
NNODES=4, MYRANK=2, HOSTNAME=removed
NNODES=4, MYRANK=3, HOSTNAME=removed
NODEID=2 argc=20
NODEID=3 argc=20
Reading file work/wudata_05.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 68

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22869 system in water'
9500001 steps,  19000.0 ps (continuing from step 9250001,  18500.0 ps).

t = 18500.003 ps: Water molecule starting at atom 143485 can not be settled.
Check for bad contacts and/or reduce the timestep.
[16:56:51] Completed 0 out of 250000 steps  (0%)

t = 18500.005 ps: Water molecule starting at atom 79402 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 18500.007 ps: Water molecule starting at atom 143485 can not be settled.
Check for bad contacts and/or reduce the timestep.
[16:56:52]
[16:56:52] Folding@home Core Shutdown: INTERRUPTED
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Quit
[16:56:56] CoreStatus = 66 (102)
[16:56:56] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[16:56:56] Killing all core threads

Folding@Home Client Shutdown.
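
For anyone comparing their own FAHlog against the ones posted so far, the shared signature is the same Project/Run/Clone/Gen line, repeated "can not be settled" warnings, and the final CoreStatus line. Below is a minimal Python sketch that pulls those three things out of a pasted log. The function name `summarize_log` and the sample text are made up for illustration, and the regular expressions are only matched against the line formats visible in this thread; other client versions may print them differently. Note that the two numbers on the CoreStatus line appear to be the same value in hex and decimal (0x66 = 102).

```python
import re

# Patterns written against the log lines quoted in this thread (an assumption:
# other client/core versions may format these lines differently).
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = (\w+) \((\d+)\)")
SETTLE_RE = re.compile(r"Water molecule starting at atom (\d+) can not be settled")


def summarize_log(text):
    """Extract the work unit ID, the core status, and the unsettled water atoms."""
    prcg = PRCG_RE.search(text)
    status = STATUS_RE.search(text)
    return {
        "prcg": tuple(int(g) for g in prcg.groups()) if prcg else None,
        "core_status": status.group(1) if status else None,
        "exit_code": int(status.group(2)) if status else None,
        "unsettled_atoms": sorted({int(a) for a in SETTLE_RE.findall(text)}),
    }


# Hypothetical excerpt assembled from lines posted above.
SAMPLE = """\
[16:56:29] Project: 2677 (Run 16, Clone 11, Gen 37)
t = 18500.003 ps: Water molecule starting at atom 143485 can not be settled.
t = 18500.005 ps: Water molecule starting at atom 79402 can not be settled.
t = 18500.007 ps: Water molecule starting at atom 143485 can not be settled.
[16:56:56] CoreStatus = 66 (102)
"""

if __name__ == "__main__":
    print(summarize_log(SAMPLE))
```

If two machines report the same work unit, the same atoms, and the same status, that points at the WU rather than at either machine.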
preet.to
Posts: 19
Joined: Sun Dec 16, 2007 3:20 pm

Re: Project: 2677 (Run 16, Clone 11, Gen 37) CoreStatus = 66 (10

Post by preet.to »

I am getting a bogus interruption on this WU. I keep deleting the queue and the work folder, but keep getting assigned the same WU. The WU is broken.

[15:38:38] Project: 2677 (Run 16, Clone 11, Gen 37)
[15:38:38]
[15:38:38] Assembly optimizations on if available.
[15:38:38] Entering M.D.
[15:38:48] Run 16, Clone 11, Gen 37)
[15:38:48]
[15:38:48] Entering M.D.
[15:39:01] Completed 0 out of 250000 steps  (0%)
[15:39:02]
[15:39:02] Folding@home Core Shutdown: INTERRUPTED
[15:39:06] CoreStatus = 66 (102)
[15:39:06] + Shutdown requested by user. Exiting.
Folding@Home Client Shutdown.
bollix47
Posts: 2951
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project: 2677 (Run 16, Clone 11, Gen 37) CoreStatus = 66 (10

Post by bollix47 »

Ditto on this one .... same as above.

If you also delete machinedependent.dat you should get a different WU. I did.
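
A rough Python sketch of that cleanup, for anyone who wants to script it. Everything here is an assumption layered on the posts above: the helper name `reset_client_state` is invented, `queue.dat` and `work` are assumed to be what "the queue and work folder" refers to, and `machinedependent.dat` is assumed to be the file meant in the tip. Stop the client first, and keep in mind this throws away any progress on the current WU.

```python
import shutil
from pathlib import Path

# Assumed file names inside the client directory (not confirmed by the thread):
# the queue file, the work folder, and the machine ID file.
STALE_ITEMS = ("queue.dat", "work", "machinedependent.dat")


def reset_client_state(client_dir, dry_run=True):
    """Return the paths that were (or, with dry_run, would be) removed."""
    removed = []
    for name in STALE_ITEMS:
        path = Path(client_dir) / name
        if not path.exists():
            continue  # nothing to clean up for this entry
        removed.append(path)
        if dry_run:
            continue  # report only, leave the file or folder in place
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    return removed


if __name__ == "__main__":
    # Dry run against the current directory: lists what would be deleted.
    for p in reset_client_state(".", dry_run=True):
        print("would remove", p)
```

The dry-run default is deliberate, so nothing is deleted until `dry_run=False` is passed explicitly.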
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 2677 (Run 16, Clone 11, Gen 37) CoreStatus = 66 (10

Post by bruce »

This WU has been stopped.