6013 (Run 0, Clone 53, Gen 157)

Posted: Mon May 31, 2010 5:37 am
by Bigstan
Had this die on me twice over the last 12 hours on one machine with error "CoreStatus = FF (255)". No problems with other projects.

[19:39:52] + Processing work unit
[19:39:52] Core required: FahCore_a3.exe
[19:39:52] Core found.
[19:39:52] Working on queue slot 00 [May 30 19:39:52 UTC]
[19:39:52] + Working ...
[19:39:52] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 00 -np 8 -checkpoint 5 -verbose -lifeline 4700 -version 629'

[19:39:52] 
[19:39:52] *------------------------------*
[19:39:52] Folding@Home Gromacs SMP Core
[19:39:52] Version 2.19 (Mar 12, 2010)
[19:39:52] 
[19:39:52] Preparing to commence simulation
[19:39:52] - Looking at optimizations...
[19:39:52] - Created dyn
[19:39:52] - Files status OK
[19:39:53] - Expanded 979415 -> 10427873 (decompressed 1064.7 percent)
[19:39:53] Called DecompressByteArray: compressed_data_size=979415 data_size=10427873, decompressed_data_size=10427873 diff=0
[19:39:53] - Digital signature verified
[19:39:53] 
[19:39:53] Project: 6013 (Run 0, Clone 53, Gen 157)
[19:39:53] 
[19:39:53] Assembly optimizations on if available.
[19:39:53] Entering M.D.
[19:42:08] Completed 0 out of 250000 steps  (0%)
[19:51:52] CoreStatus = FF (255)
[19:51:52] Sending work to server
[19:51:52] Project: 6013 (Run 0, Clone 53, Gen 157)
[19:51:52] - Error: Could not get length of results file work/wuresults_00.dat
[19:51:52] - Error: Could not read unit 00 file. Removing from queue.
[19:51:52] Trying to send all finished work units
[19:51:52] + No unsent completed units remaining.
[19:51:52] - Preparing to get new work unit...
[19:51:52] Cleaning up work directory

[02:20:34] + Processing work unit
[02:20:34] Core required: FahCore_a3.exe
[02:20:34] Core found.
[02:20:34] Working on queue slot 02 [May 31 02:20:34 UTC]
[02:20:34] + Working ...
[02:20:34] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 02 -np 8 -checkpoint 5 -verbose -lifeline 4700 -version 629'

[02:20:34] 
[02:20:34] *------------------------------*
[02:20:34] Folding@Home Gromacs SMP Core
[02:20:34] Version 2.19 (Mar 12, 2010)
[02:20:34] 
[02:20:34] Preparing to commence simulation
[02:20:34] - Looking at optimizations...
[02:20:34] - Created dyn
[02:20:34] - Files status OK
[02:20:34] - Expanded 979415 -> 10427873 (decompressed 1064.7 percent)
[02:20:34] Called DecompressByteArray: compressed_data_size=979415 data_size=10427873, decompressed_data_size=10427873 diff=0
[02:20:34] - Digital signature verified
[02:20:34] 
[02:20:34] Project: 6013 (Run 0, Clone 53, Gen 157)
[02:20:34] 
[02:20:34] Assembly optimizations on if available.
[02:20:34] Entering M.D.
[02:24:23] Completed 0 out of 250000 steps  (0%)
[04:36:12] CoreStatus = FF (255)
[04:36:12] Sending work to server
[04:36:12] Project: 6013 (Run 0, Clone 53, Gen 157)
[04:36:12] - Error: Could not get length of results file work/wuresults_02.dat
[04:36:12] - Error: Could not read unit 02 file. Removing from queue.
[04:36:12] Trying to send all finished work units
[04:36:12] + No unsent completed units remaining.
[04:36:12] - Preparing to get new work unit...
[04:36:12] Cleaning up work directory
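
For anyone comparing logs like the two above, here is a rough Python sketch that pulls the CoreStatus lines out of a pasted log and checks that the hex and decimal values agree. The function name `core_statuses` and the regular expression are my own assumptions based only on the line format shown in this thread; this is not part of the client.

```python
import re

# Assumed line format, taken from the logs in this thread:
#   [19:51:52] CoreStatus = FF (255)
CORE_STATUS_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)"
)

def core_statuses(log_text):
    """Return a list of (timestamp, decimal_code) for each CoreStatus line."""
    results = []
    for match in CORE_STATUS_RE.finditer(log_text):
        stamp, hex_code, dec_code = match.group(1), match.group(2), int(match.group(3))
        # The log prints the same code twice (hex, then decimal); they should agree.
        if int(hex_code, 16) != dec_code:
            raise ValueError(f"hex/decimal mismatch at {stamp}")
        results.append((stamp, dec_code))
    return results

if __name__ == "__main__":
    sample = (
        "[19:51:52] CoreStatus = FF (255)\n"
        "[19:51:52] Sending work to server\n"
        "[04:36:12] CoreStatus = FF (255)\n"
    )
    print(core_statuses(sample))
```

It only reports which codes appear and when; it makes no claim about what a given code means.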

Re: 6013 (Run 0, Clone 53, Gen 157)

Posted: Sat Jun 05, 2010 1:41 pm
by AlanH
I received this unit last night. When I checked F@H today I found that it was taking 90 minutes per %. Checking the bonus calculator, this was forecast to complete in six days, against a final deadline of three days. Not good.

I tried restarting F@H, and restarting my Mac, but performance remained pathetic. So I stopped F@H, discarded the work folder and queue, and restarted it. I now have a P6025 unit which is folding at a reasonable rate. I don't cherry pick, but this seemed to be faulty behaviour.
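
The six-day forecast above can be checked with simple arithmetic. A minimal Python sketch, assuming the 90 minutes per % rate holds for the whole unit (the function name `projected_days` is just for illustration, and this is not how the bonus calculator itself works):

```python
MINUTES_PER_PERCENT = 90   # observed rate from the post above
FINAL_DEADLINE_DAYS = 3    # final deadline from the post above

def projected_days(minutes_per_percent):
    """Days needed for a full unit (100%) at a constant rate."""
    return minutes_per_percent * 100 / 60 / 24

if __name__ == "__main__":
    days = projected_days(MINUTES_PER_PERCENT)
    print(f"Projected: {days:.2f} days, deadline: {FINAL_DEADLINE_DAYS} days")
    print("Misses deadline" if days > FINAL_DEADLINE_DAYS else "Makes deadline")
```

At 90 minutes per % that is 150 hours, a little over six days, so roughly double the three-day deadline.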

Re: 6013 (Run 0, Clone 53, Gen 157)

Posted: Sat Jun 05, 2010 5:50 pm
by bruce
I'm not sure what's wrong, but there are strong indications this is a bad WU, so I'm reporting it. (A third person got a very small credit for an EUE report so that makes three entirely different symptoms.)

Re: 6013 (Run 0, Clone 53, Gen 157)

Posted: Sun Jun 06, 2010 7:15 pm
by ra40
I've got it now... my step time is 1.07. So far it is at 40%, but it won't complete before the 3-day deadline.

Re: 6013 (Run 0, Clone 53, Gen 157)

Posted: Mon Jun 07, 2010 11:57 pm
by ra40
As suspected, it didn't make it anywhere close to the deadline. The last portion of the log file:
[15:17:05] Completed 142500 out of 250000 steps (57%)
[16:24:15] Completed 145000 out of 250000 steps (58%)
[17:31:23] Completed 147500 out of 250000 steps (59%)
[18:38:25] Completed 150000 out of 250000 steps (60%)
[19:45:29] Completed 152500 out of 250000 steps (61%)
[20:53:01] Completed 155000 out of 250000 steps (62%)
[22:00:37] Completed 157500 out of 250000 steps (63%)
[23:08:06] Completed 160000 out of 250000 steps (64%)
[23:08:06] Unit 1's deadline (June 7 22:50) has passed.
[23:08:06] Going to interrupt core and move on to next unit...
[23:08:23] mdrun returned 2
[23:08:23] Gromacs was interrupted
[23:08:23] Folding@home Core Shutdown: INTERRUPTED
[23:08:26] CoreStatus = 66 (102)
[23:08:30] - Preparing to get new work unit...
[23:08:30] Cleaning up work directory
[23:08:30] + Attempting to get work packet
[23:08:30] Passkey found
[23:08:30] - Connecting to assignment server
[23:08:31] - Successful: assigned to (171.64.65.54).
[23:08:31] + News From Folding@Home: Welcome to Folding@Home
[23:08:31] Loaded queue successfully.
[23:08:34] + Closed connections
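
A rough way to put numbers on a log like the one above: the Python sketch below reads the "Completed ... steps" lines and estimates the average time per percent. The function name `seconds_per_percent` and the regular expression are assumptions based on the line format in this thread. Since the timestamps carry no date, it assumes any time that goes backwards means the log crossed midnight once.

```python
import re

# Assumed line format, taken from the logs in this thread:
#   [15:17:05] Completed 142500 out of 250000 steps (57%)
PROGRESS_RE = re.compile(
    r"\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+) out of (\d+) steps"
)

def seconds_per_percent(log_text):
    """Average seconds per 1% between the first and last progress lines."""
    points = []
    day_offset = 0
    previous = None
    for line in log_text.splitlines():
        match = PROGRESS_RE.search(line)
        if not match:
            continue
        hours, minutes, seconds, done, total = map(int, match.groups())
        clock = hours * 3600 + minutes * 60 + seconds
        # Timestamps have no date, so treat a backwards jump as a new day.
        if previous is not None and clock < previous:
            day_offset += 86400
        previous = clock
        points.append((clock + day_offset, 100.0 * done / total))
    if len(points) < 2 or points[-1][1] == points[0][1]:
        raise ValueError("need at least two progress lines at different percentages")
    (t0, p0), (t1, p1) = points[0], points[-1]
    return (t1 - t0) / (p1 - p0)

if __name__ == "__main__":
    sample = (
        "[15:17:05] Completed 142500 out of 250000 steps (57%)\n"
        "[23:08:06] Completed 160000 out of 250000 steps (64%)\n"
    )
    rate = seconds_per_percent(sample)
    print(f"{rate / 60:.1f} minutes per percent")
    print(f"{(100 - 64) * rate / 3600:.1f} hours still needed from 64%")
```

On the log above this comes out at a little over 67 minutes per percent, so the remaining 36% would have needed about 40 more hours after the deadline had already passed.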