Project: 7611 (Run 2, Clone 55, Gen 209) possibly bad
Posted: Tue Jan 22, 2013 4:40 pm
I received this WU earlier this morning and it failed immediately. The client restarted the WU a dozen times over the next 5 hours with the same result. I have dumped the WU and moved on to a different one.
This was run on an iMac (2.8 GHz i7, OS X 10.6.8, SMP 8) using the 7.2.9 client.
I checked the database; there is one other failure report for this WU.
Here is a portion of the log showing the error:
11:52:57:WU01:FS00:Starting
11:52:57:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper "/Library/Application Support/FAHClient/cores/www.stanford.edu/~pande/OSX/AMD64/Core_a4.fah/FahCore_a4" -dir 01 -suffix 01 -version 702 -lifeline 66 -checkpoint 15 -np 8
11:52:57:WU01:FS00:Started FahCore on PID 36224
11:52:57:WU01:FS00:Core PID:36228
11:52:57:WU01:FS00:FahCore 0xa4 started
11:52:58:WU01:FS00:0xa4:
11:52:58:WU01:FS00:0xa4:*------------------------------*
11:52:58:WU01:FS00:0xa4:Folding@Home Gromacs Core
11:52:58:WU01:FS00:0xa4:Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
11:52:58:WU01:FS00:0xa4:
11:52:58:WU01:FS00:0xa4:Preparing to commence simulation
11:52:58:WU01:FS00:0xa4:- Ensuring status. Please wait.
11:53:07:WU01:FS00:0xa4:- Looking at optimizations...
11:53:07:WU01:FS00:0xa4:- Working with standard loops on this execution.
11:53:07:WU01:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
11:53:07:WU01:FS00:0xa4:- Expanded 29848 -> 644556 (decompressed 2159.4 percent)
11:53:07:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=29848 data_size=644556, decompressed_data_size=644556 diff=0
11:53:07:WU01:FS00:0xa4:- Digital signature verified
11:53:07:WU01:FS00:0xa4:
11:53:07:WU01:FS00:0xa4:Project: 7611 (Run 2, Clone 55, Gen 209)
11:53:07:WU01:FS00:0xa4:
11:53:07:WU01:FS00:0xa4:Entering M.D.
11:53:13:WU01:FS00:0xa4:Mapping NT from 8 to 8
11:53:14:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
13:55:57:WU01:FS00:Starting
13:55:57:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper "/Library/Application Support/FAHClient/cores/www.stanford.edu/~pande/OSX/AMD64/Core_a4.fah/FahCore_a4" -dir 01 -suffix 01 -version 702 -lifeline 66 -checkpoint 15 -np 8
13:55:57:WU01:FS00:Started FahCore on PID 36314
13:55:57:WU01:FS00:Core PID:36318
13:55:57:WU01:FS00:FahCore 0xa4 started
13:55:57:WU01:FS00:0xa4:
13:55:57:WU01:FS00:0xa4:*------------------------------*
13:55:57:WU01:FS00:0xa4:Folding@Home Gromacs Core
13:55:57:WU01:FS00:0xa4:Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
13:55:57:WU01:FS00:0xa4:
13:55:57:WU01:FS00:0xa4:Preparing to commence simulation
13:55:57:WU01:FS00:0xa4:- Ensuring status. Please wait.
13:56:07:WU01:FS00:0xa4:- Looking at optimizations...
13:56:07:WU01:FS00:0xa4:- Working with standard loops on this execution.
13:56:07:WU01:FS00:0xa4:Examination of work files indicates 8 consecutive improper terminations of core.
13:56:07:WU01:FS00:0xa4:- Expanded 29848 -> 644556 (decompressed 2159.4 percent)
13:56:07:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=29848 data_size=644556, decompressed_data_size=644556 diff=0
13:56:07:WU01:FS00:0xa4:- Digital signature verified
13:56:07:WU01:FS00:0xa4:
13:56:07:WU01:FS00:0xa4:Project: 7611 (Run 2, Clone 55, Gen 209)
13:56:07:WU01:FS00:0xa4:
13:56:07:WU01:FS00:0xa4:Entering M.D.
13:56:13:WU01:FS00:0xa4:Mapping NT from 8 to 8
13:56:13:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
15:33:19:Server connection id=31 on 0.0.0.0:36330 from 127.0.0.1
15:33:28:FS00:Paused
15:36:11:FS00:Unpaused
15:36:11:WU01:FS00:Starting
15:36:11:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper "/Library/Application Support/FAHClient/cores/www.stanford.edu/~pande/OSX/AMD64/Core_a4.fah/FahCore_a4" -dir 01 -suffix 01 -version 702 -lifeline 66 -checkpoint 15 -np 8
15:36:11:WU01:FS00:Started FahCore on PID 36412
15:36:11:WU01:FS00:Core PID:36416
15:36:11:WU01:FS00:FahCore 0xa4 started
15:36:11:WARNING:WU01:FS00:FahCore returned: MISSING_WORK_FILES (116 = 0x74)
15:36:11:WARNING:WU01:FS00:Fatal error, dumping
15:36:11:WU01:FS00:Sending unit results: id:01 state:SEND error:DUMPED project:7611 run:2 clone:55 gen:209 core:0xa4 unit:0x00000148664f2dd04df0f55c3b107ab0
15:36:11:WARNING:WU01:FS00:Missing original Unit data, cannot send dump report
15:36:11:WU01:FS00:Cleaning up
15:36:12:WU00:FS00:Connecting to assign3.stanford.edu:8080
15:36:12:WU00:FS00:News: Welcome to Folding@Home
15:36:12:WU00:FS00:Assigned to work server 171.67.108.60
15:36:12:WU00:FS00:Requesting new work unit for slot 00: READY smp:8 from 171.67.108.60
15:36:12:WU00:FS00:Connecting to 171.67.108.60:8080
15:36:13:WU00:FS00:Downloading 512.91KiB
15:36:14:WU00:FS00:Download complete
15:36:14:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:8028 run:2248 clone:3 gen:10 core:0xa4 unit:0x000000106652edcc50e796b61f505157
15:36:14:WU00:FS00:Starting
15:36:14:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper "/Library/Application Support/FAHClient/cores/www.stanford.edu/~pande/OSX/AMD64/Core_a4.fah/FahCore_a4" -dir 00 -suffix 01 -version 702 -lifeline 66 -checkpoint 15 -np 8
15:36:14:WU00:FS00:Started FahCore on PID 36417
15:36:14:WU00:FS00:Core PID:36418
15:36:14:WU00:FS00:FahCore 0xa4 started
15:36:15:WU00:FS00:0xa4:
15:36:15:WU00:FS00:0xa4:*------------------------------*
15:36:15:WU00:FS00:0xa4:Folding@Home Gromacs Core
15:36:15:WU00:FS00:0xa4:Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
15:36:15:WU00:FS00:0xa4:
15:36:15:WU00:FS00:0xa4:Preparing to commence simulation
15:36:15:WU00:FS00:0xa4:- Looking at optimizations...
15:36:15:WU00:FS00:0xa4:- Created dyn
15:36:15:WU00:FS00:0xa4:- Files status OK
15:36:15:WU00:FS00:0xa4:- Expanded 524704 -> 1189760 (decompressed 226.7 percent)
15:36:15:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=524704 data_size=1189760, decompressed_data_size=1189760 diff=0
15:36:15:WU00:FS00:0xa4:- Digital signature verified
15:36:15:WU00:FS00:0xa4:
15:36:15:WU00:FS00:0xa4:Project: 8028 (Run 2248, Clone 3, Gen 10)
15:36:15:WU00:FS00:0xa4:
15:36:15:WU00:FS00:0xa4:Assembly optimizations on if available.
15:36:15:WU00:FS00:0xa4:Entering M.D.
15:36:21:WU00:FS00:0xa4:Mapping NT from 8 to 8
15:36:21:WU00:FS00:0xa4:Completed 0 out of 500000 steps (0%)
15:37:55:WU00:FS00:0xa4:Completed 5000 out of 500000 steps (1%)
15:39:29:WU00:FS00:0xa4:Completed 10000 out of 500000 steps (2%)
15:39:29:WARNING:Caught signal SIGPIPE(13) on PID 66
15:39:29:WARNING:Caught signal SIGPIPE(13) on PID 66
15:39:29:Server connection id=31 ended
15:41:01:WU00:FS00:0xa4:Completed 15000 out of 500000 steps (3%)
15:42:32:WU00:FS00:0xa4:Completed 20000 out of 500000 steps (4%)
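For anyone who wants to check their own log for the same pattern of repeated core failures, here is a rough Python sketch that tallies the "FahCore returned" lines per work unit. The function name `count_core_returns` and the idea of passing the log file path on the command line are my own assumptions for illustration; the script is not part of the FAH client and relies only on the line format visible in the log above.

```python
import re
import sys
from collections import Counter

# Matches lines in the format seen in the log above, for example:
#   11:53:14:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
RETURN_RE = re.compile(
    r"(WU\d+):(FS\d+):FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)"
)


def count_core_returns(log_text):
    """Count FahCore return statuses per (work unit, slot, status, code).

    Hypothetical helper for illustration; it only assumes the log line
    format shown in this post.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = RETURN_RE.search(line)
        if match:
            wu, slot, status, code = match.groups()
            counts[(wu, slot, status, int(code))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is supplied by the user; no default location is assumed.
    with open(sys.argv[1], errors="replace") as handle:
        for key, n in sorted(count_core_returns(handle.read()).items()):
            wu, slot, status, code = key
            print(f"{wu} {slot} {status} ({code}): {n}")
```

A high count of the same status for a single WU, as with WU01 here, suggests the unit itself is the problem rather than a one-off interruption.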