P6041 (R0 C247 G155) Unstable WU
Posted: Sun Jul 17, 2011 6:11 pm
I've tried this one 4 times now, and it ends in a failure loop. The first pass got to 65%, and each
pass after that only reached around 14%. I reduced my overclock from 4.0 GHz to 3.8 GHz (which runs even
the sensitive A4s) in an attempt to run this one. Unlike some of the A4s that are a little
overclock-sensitive, this one continues to fail even at the lower clock. This machine has passed
12 hours of MemTest86+, 12 hours of Prime95, and 24 hours of StressCPU @ 4.0 GHz, so I'm going to go out
on a limb here and say the WU's at fault.
I know I stopped the process twice during the first pass to play video games (which may have helped the WU
get to 65%), but it started back up and ran for almost 2 hours after the final restart before failing.
Had to change my machine ID to pick up a new WU (currently running P6053 (R1 C84 G309)).
Here's the log:
[15:29:52] + Attempting to send results [July 16 15:29:52 UTC]
[15:29:52] - Reading file work/wuresults_04.dat from core
[15:29:52] (Read 3794987 bytes from disk)
[15:29:52] Connecting to http://171.64.65.54:8080/
[15:31:01] Posted data.
[15:31:02] Initial: 0000; - Uploaded at ~52 kB/s
[15:31:02] - Averaged speed for that direction ~52 kB/s
[15:31:02] + Results successfully sent
[15:31:02] Thank you for your contribution to Folding@Home.
[15:31:02] + Number of Units Completed: 772
[15:31:06] Trying to send all finished work units
[15:31:06] + No unsent completed units remaining.
[15:31:06] - Preparing to get new work unit...
[15:31:06] Cleaning up work directory
[15:31:06] + Attempting to get work packet
[15:31:06] Passkey found
[15:31:06] - Will indicate memory of 4098 MB
[15:31:06] - Connecting to assignment server
[15:31:06] Connecting to http://assign.stanford.edu:8080/
[15:31:07] Posted data.
[15:31:07] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[15:31:07] + News From Folding@Home: Welcome to Folding@Home
[15:31:07] Loaded queue successfully.
[15:31:07] Sent data
[15:31:07] Connecting to http://171.64.65.54:8080/
[15:31:09] Posted data.
[15:31:10] Initial: 0000; - Receiving payload (expected size: 7885241)
[15:31:34] - Downloaded at ~320 kB/s
[15:31:34] - Averaged speed for that direction ~296 kB/s
[15:31:34] + Received work.
[15:31:34] Trying to send all finished work units
[15:31:34] + No unsent completed units remaining.
[15:31:34] + Closed connections
[15:31:34]
[15:31:34] + Processing work unit
[15:31:34] Core required: FahCore_a3.exe
[15:31:34] Core found.
[15:31:34] Working on queue slot 05 [July 16 15:31:34 UTC]
[15:31:34] + Working ...
[15:31:34] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 05 -np 6 -checkpoint 5 -verbose -lifeline 1196 -version 634'
[15:31:34]
[15:31:34] *------------------------------*
[15:31:34] Folding@Home Gromacs SMP Core
[15:31:34] Version 2.27 (Dec. 15, 2010)
[15:31:34]
[15:31:34] Preparing to commence simulation
[15:31:34] - Looking at optimizations...
[15:31:34] - Created dyn
[15:31:34] - Files status OK
[15:31:36] - Expanded 7884729 -> 10126021 (decompressed 128.4 percent)
[15:31:36] Called DecompressByteArray: compressed_data_size=7884729 data_size=10126021, decompressed_data_size=10126021 diff=0
[15:31:36] - Digital signature verified
[15:31:36]
[15:31:36] Project: 6041 (Run 0, Clone 247, Gen 155)
[15:31:36]
[15:31:36] Assembly optimizations on if available.
[15:31:36] Entering M.D.
[15:31:42] Mapping NT from 6 to 6
[15:31:43] Completed 0 out of 250000 steps (0%)
[15:44:44] Completed 2500 out of 250000 steps (1%)
[15:57:43] Completed 5000 out of 250000 steps (2%)
[16:10:31] Completed 7500 out of 250000 steps (3%)
[16:23:18] Completed 10000 out of 250000 steps (4%)
[16:36:06] Completed 12500 out of 250000 steps (5%)
[16:49:02] Completed 15000 out of 250000 steps (6%)
[17:01:55] Completed 17500 out of 250000 steps (7%)
[17:14:40] Completed 20000 out of 250000 steps (8%)
[17:27:21] Completed 22500 out of 250000 steps (9%)
[17:40:01] Completed 25000 out of 250000 steps (10%)
[17:52:57] Completed 27500 out of 250000 steps (11%)
[18:05:54] Completed 30000 out of 250000 steps (12%)
[18:18:49] Completed 32500 out of 250000 steps (13%)
[18:31:37] Completed 35000 out of 250000 steps (14%)
[18:44:40] Completed 37500 out of 250000 steps (15%)
[18:57:36] Completed 40000 out of 250000 steps (16%)
[19:10:23] Completed 42500 out of 250000 steps (17%)
[19:15:08] - Autosending finished units... [July 16 19:15:08 UTC]
[19:15:08] Trying to send all finished work units
[19:15:08] + No unsent completed units remaining.
[19:15:08] - Autosend completed
[19:23:04] Completed 45000 out of 250000 steps (18%)
[19:35:44] Completed 47500 out of 250000 steps (19%)
[19:48:36] Completed 50000 out of 250000 steps (20%)
[20:01:26] Completed 52500 out of 250000 steps (21%)
[20:14:33] Completed 55000 out of 250000 steps (22%)
[20:27:48] Completed 57500 out of 250000 steps (23%)
[20:41:11] Completed 60000 out of 250000 steps (24%)
[20:55:13] Completed 62500 out of 250000 steps (25%)
[21:09:13] Completed 65000 out of 250000 steps (26%)
[21:22:48] Completed 67500 out of 250000 steps (27%)
[21:35:57] Completed 70000 out of 250000 steps (28%)
[21:49:37] Completed 72500 out of 250000 steps (29%)
[22:03:23] Completed 75000 out of 250000 steps (30%)
[22:16:40] Completed 77500 out of 250000 steps (31%)
[22:29:42] Completed 80000 out of 250000 steps (32%)
[22:42:26] Completed 82500 out of 250000 steps (33%)
[22:55:17] Completed 85000 out of 250000 steps (34%)
[23:08:14] Completed 87500 out of 250000 steps (35%)
[23:21:42] Completed 90000 out of 250000 steps (36%)
[23:34:20] Completed 92500 out of 250000 steps (37%)
[23:46:58] Completed 95000 out of 250000 steps (38%)
[23:59:44] Completed 97500 out of 250000 steps (39%)
[00:12:35] Completed 100000 out of 250000 steps (40%)
[00:25:25] Completed 102500 out of 250000 steps (41%)
[00:38:11] Completed 105000 out of 250000 steps (42%)
[00:46:52] Killing all core threads
[00:46:52] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown at user request.
[00:46:52] ***** Got a SIGTERM signal (2)
[00:46:52] Killing all core threads
[00:46:52] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown.
--- Opening Log file [July 17 02:14:40 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.34
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\Users\Blaine\F@H_SMP\Folding@Home Windows SMP Client V1.01
Executable: C:\Users\Blaine\F@H_SMP\Folding@Home Windows SMP Client V1.01\F@H_SMP-6.34.exe
Arguments: -smp -verbosity 9
[02:14:40] - Ask before connecting: No
[02:14:40] - User name: Blazin420 (Team 420)
[02:14:40] - User ID: 7FEB5F862FE09BC8
[02:14:40] - Machine ID: 1
[02:14:40]
[02:14:40] Loaded queue successfully.
[02:14:40]
[02:14:40] - Autosending finished units... [July 17 02:14:40 UTC]
[02:14:40] + Processing work unit
[02:14:40] Trying to send all finished work units
[02:14:40] Core required: FahCore_a3.exe
[02:14:40] + No unsent completed units remaining.
[02:14:40] Core found.
[02:14:40] - Autosend completed
[02:14:40] Working on queue slot 05 [July 17 02:14:40 UTC]
[02:14:40] + Working ...
[02:14:40] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 05 -np 6 -checkpoint 5 -verbose -lifeline 4708 -version 634'
[02:14:40]
[02:14:40] *------------------------------*
[02:14:40] Folding@Home Gromacs SMP Core
[02:14:40] Version 2.27 (Dec. 15, 2010)
[02:14:40]
[02:14:40] Preparing to commence simulation
[02:14:40] - Ensuring status. Please wait.
[02:14:50] - Looking at optimizations...
[02:14:50] - Working with standard loops on this execution.
[02:14:50] - Previous termination of core was improper.
[02:14:50] - Files status OK
[02:14:51] - Expanded 7884729 -> 10126021 (decompressed 128.4 percent)
[02:14:51] Called DecompressByteArray: compressed_data_size=7884729 data_size=10126021, decompressed_data_size=10126021 diff=0
[02:14:51] - Digital signature verified
[02:14:51]
[02:14:51] Project: 6041 (Run 0, Clone 247, Gen 155)
[02:14:51]
[02:14:51] Entering M.D.
[02:14:57] Using Gromacs checkpoints
[02:14:57] Mapping NT from 6 to 6
[02:14:59] Resuming from checkpoint
[02:14:59] Verified work/wudata_05.log
[02:14:59] Verified work/wudata_05.trr
[02:14:59] Verified work/wudata_05.xtc
[02:14:59] Verified work/wudata_05.edr
[02:15:00] Completed 106686 out of 250000 steps (42%)
[02:19:14] Completed 107500 out of 250000 steps (43%)
[02:32:14] Completed 110000 out of 250000 steps (44%)
[02:45:16] Completed 112500 out of 250000 steps (45%)
[02:58:02] Completed 115000 out of 250000 steps (46%)
[03:10:53] Completed 117500 out of 250000 steps (47%)
[03:23:39] Completed 120000 out of 250000 steps (48%)
[03:36:34] Completed 122500 out of 250000 steps (49%)
[03:49:29] Completed 125000 out of 250000 steps (50%)
[04:02:26] Completed 127500 out of 250000 steps (51%)
[04:15:11] Completed 130000 out of 250000 steps (52%)
[04:27:54] Completed 132500 out of 250000 steps (53%)
[04:40:55] Completed 135000 out of 250000 steps (54%)
[04:54:06] Completed 137500 out of 250000 steps (55%)
[05:07:15] Completed 140000 out of 250000 steps (56%)
[05:20:21] Completed 142500 out of 250000 steps (57%)
[05:33:16] Completed 145000 out of 250000 steps (58%)
[05:46:13] Completed 147500 out of 250000 steps (59%)
[05:49:16] Killing all core threads
[05:49:16] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown at user request.
[05:49:16] ***** Got a SIGTERM signal (2)
[05:49:16] Killing all core threads
[05:49:16] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown.
--- Opening Log file [July 17 07:00:56 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.34
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\Users\Blaine\F@H_SMP\Folding@Home Windows SMP Client V1.01
Executable: C:\Users\Blaine\F@H_SMP\Folding@Home Windows SMP Client V1.01\F@H_SMP-6.34.exe
Arguments: -smp -verbosity 9
[07:00:56] - Ask before connecting: No
[07:00:56] - User name: Blazin420 (Team 420)
[07:00:56] - User ID: 7FEB5F862FE09BC8
[07:00:56] - Machine ID: 1
[07:00:56]
[07:00:56] Loaded queue successfully.
[07:00:56]
[07:00:56] - Autosending finished units... [July 17 07:00:56 UTC]
[07:00:56] + Processing work unit
[07:00:56] Trying to send all finished work units
[07:00:56] Core required: FahCore_a3.exe
[07:00:56] + No unsent completed units remaining.
[07:00:56] Core found.
[07:00:56] - Autosend completed
[07:00:56] Working on queue slot 05 [July 17 07:00:56 UTC]
[07:00:56] + Working ...
[07:00:56] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 05 -np 6 -checkpoint 5 -verbose -lifeline 2888 -version 634'
[07:00:56]
[07:00:56] *------------------------------*
[07:00:56] Folding@Home Gromacs SMP Core
[07:00:56] Version 2.27 (Dec. 15, 2010)
[07:00:56]
[07:00:56] Preparing to commence simulation
[07:00:56] - Ensuring status. Please wait.
[07:01:19] - Looking at optimizations...
[07:01:19] - Working with standard loops on this execution.
[07:01:20] - Previous termination of core was improper.
[07:01:20] - Going to use standard loops.
[07:01:20] - Files status OK
[07:01:21] - Expanded 7884729 -> 10126021 (decompressed 128.4 percent)
[07:01:21] Called DecompressByteArray: compressed_data_size=7884729 data_size=10126021, decompressed_data_size=10126021 diff=0
[07:01:21] - Digital signature verified
[07:01:21]
[07:01:21] Project: 6041 (Run 0, Clone 247, Gen 155)
[07:01:21]
[07:01:21] Entering M.D.
[07:01:27] Using Gromacs checkpoints
[07:01:28] Mapping NT from 6 to 6
[07:01:29] Resuming from checkpoint
[07:01:29] Verified work/wudata_05.log
[07:01:30] Verified work/wudata_05.trr
[07:01:30] Verified work/wudata_05.xtc
[07:01:30] Verified work/wudata_05.edr
[07:01:31] Completed 147264 out of 250000 steps (58%)
[07:02:46] Completed 147500 out of 250000 steps (59%)
[07:15:32] Completed 150000 out of 250000 steps (60%)
[07:28:16] Completed 152500 out of 250000 steps (61%)
[07:42:47] Completed 155000 out of 250000 steps (62%)
[07:55:42] Completed 157500 out of 250000 steps (63%)
[08:08:29] Completed 160000 out of 250000 steps (64%)
[08:21:14] Completed 162500 out of 250000 steps (65%)
[08:26:04] Gromacs cannot continue further.
[08:26:04] Going to send back what have done -- stepsTotalG=250000
[08:26:04] Work fraction=0.6537 steps=250000.
[08:26:11] logfile size=132714 infoLength=132714 edr=0 trr=23
[08:26:11] logfile size: 132714 info=132714 bed=0 hdr=23
[08:26:11] - Writing 133250 bytes of core data to disk...
[08:26:14] CoreStatus = C0000005 (-1073741819)
[08:26:14] Client-core communications error: ERROR 0xc0000005
[08:26:14] Deleting current work unit & continuing...
[08:26:28] Trying to send all finished work units
[08:26:28] + No unsent completed units remaining.
[08:26:28] - Preparing to get new work unit...
[08:26:28] Cleaning up work directory
[08:26:28] + Attempting to get work packet
[08:26:28] Passkey found
[08:26:28] - Will indicate memory of 4098 MB
[08:26:28] - Detect CPU. Vendor: AuthenticAMD, Family: 15, Model: 10, Stepping: 0
[08:26:28] - Connecting to assignment server
[08:26:28] Connecting to http://assign.stanford.edu:8080/
[08:26:29] Posted data.
[08:26:29] Initial: 40AB; - Successful: assigned to (171.64.65.54).
[08:26:29] + News From Folding@Home: Welcome to Folding@Home
[08:26:29] Loaded queue successfully.
[08:26:29] Sent data
[08:26:29] Connecting to http://171.64.65.54:8080/
[08:26:32] Posted data.
[08:26:32] Initial: 0000; - Receiving payload (expected size: 7885241)
[08:26:57] - Downloaded at ~308 kB/s
[08:26:57] - Averaged speed for that direction ~298 kB/s
[08:26:57] + Received work.
[08:26:57] + Closed connections
[08:27:02]
[08:27:02] + Processing work unit
[08:27:02] Core required: FahCore_a3.exe
[08:27:02] Core found.
[08:27:02] Working on queue slot 06 [July 17 08:27:02 UTC]
[08:27:02] + Working ...
[08:27:02] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 06 -np 6 -checkpoint 5 -verbose -lifeline 2888 -version 634'
[08:27:02]
[08:27:02] *------------------------------*
[08:27:02] Folding@Home Gromacs SMP Core
[08:27:02] Version 2.27 (Dec. 15, 2010)
[08:27:02]
[08:27:02] Preparing to commence simulation
[08:27:02] - Looking at optimizations...
[08:27:02] - Created dyn
[08:27:02] - Files status OK
[08:27:03] - Expanded 7884729 -> 10126021 (decompressed 128.4 percent)
[08:27:03] Called DecompressByteArray: compressed_data_size=7884729 data_size=10126021, decompressed_data_size=10126021 diff=0
[08:27:03] - Digital signature verified
[08:27:03]
[08:27:03] Project: 6041 (Run 0, Clone 247, Gen 155)
[08:27:03]
[08:27:03] Assembly optimizations on if available.
[08:27:03] Entering M.D.
[08:27:09] Mapping NT from 6 to 6
[08:27:10] Completed 0 out of 250000 steps (0%)
[08:40:12] Completed 2500 out of 250000 steps (1%)
[08:53:17] Completed 5000 out of 250000 steps (2%)
[09:06:15] Completed 7500 out of 250000 steps (3%)
[09:19:05] Completed 10000 out of 250000 steps (4%)
[09:32:04] Completed 12500 out of 250000 steps (5%)
[09:47:34] Completed 15000 out of 250000 steps (6%)
[10:00:42] Completed 17500 out of 250000 steps (7%)
[10:13:33] Completed 20000 out of 250000 steps (8%)
[10:26:19] Completed 22500 out of 250000 steps (9%)
[10:39:29] Completed 25000 out of 250000 steps (10%)
[10:52:45] Completed 27500 out of 250000 steps (11%)
[11:06:05] Completed 30000 out of 250000 steps (12%)
[11:19:06] Completed 32500 out of 250000 steps (13%)
[11:32:17] Completed 35000 out of 250000 steps (14%)
[11:45:16] CoreStatus = C0000029 (-1073741783)
[11:45:16] Client-core communications error: ERROR 0xc0000029
[11:45:16] Deleting current work unit & continuing...