I have had two Project 7809 work units dumped in the past 24 hours. Before that, I completed a 7809 (project:7809 run:7 clone:325 gen:7) on this machine on Oct 30th.
I'm running an HP P6636F with Windows 7 Home Premium 64-bit. It has 8 GB of RAM. The processor is a Phenom II X4 840 running at 3.2 GHz, not overclocked. Video is a GeForce 9800 GTX (factory-overclocked EVGA 512-P3-N872-AR).
I'm using Folding@Home Client Control 7.1.24 with the four cores split into two 2-core SMP slots, plus a GPU slot, all running together. The system has been remarkably stable except when I try to run the GPU faster.
As you can imagine, the log file is large and difficult to read. I have extracted and commented the parts pertaining to the loading, starting, and completion of the two WUs in question:
~~~~~~~~~~~~~~~~~~~~~~~~~~~First dump of 7809~~~~~~~~~~~~
Previous unit (Unit 01) reaches 99% and new unit ( project:7809 run:6 clone:300 gen:9 ) is fetched in anticipation of completion of Unit 01. New unit will be assigned as Unit 00.
15:48:49:Unit 01:Completed 1980000 out of 2000000 steps (99%)
15:48:49:Connecting to assign3.stanford.edu:8080
15:48:50:News: Welcome to Folding@Home
15:48:50:Assigned to work server 171.64.65.99
15:48:50:Requesting new work unit for slot 01: RUNNING smp:2 from 171.64.65.99
15:48:50:Connecting to 171.64.65.99:8080
15:48:52:Slot 01: Downloading 1.98MiB
15:48:58:Slot 01: 19.01%
15:49:04:Slot 01: 36.95%
15:49:10:Slot 01: 54.15%
15:49:16:Slot 01: 73.65%
15:49:22:Slot 01: 93.14%
15:49:25:Slot 01: Download complete
15:49:25:Received Unit: id:00 state:DOWNLOAD project:7809 run:6 clone:300 gen:9 core:0xa4 unit:0x000000090a3b1e874e3112d79e56bf89
16:11:41:Unit 01:Completed 2000000 out of 2000000 steps (100%)
16:11:41:Unit 01:DynamicWrapper: Finished Work Unit: sleep=10000
16:11:51:Unit 01:
16:11:51:Unit 01:Finished Work Unit:
16:11:51:Unit 01:- Reading up to 446424 from "01/wudata_01.trr": Read 446424
16:11:51:Unit 01:trr file hash check passed.
16:11:51:Unit 01:- Reading up to 261584 from "01/wudata_01.xtc": Read 261584
16:11:51:Unit 01:xtc file hash check passed.
16:11:51:Unit 01:edr file hash check passed.
16:11:51:Unit 01:logfile size: 34947
16:11:51:Unit 01:Leaving Run
16:11:55:Unit 01:- Writing 749831 bytes of core data to disk...
16:11:55:Unit 01:Done: 749319 -> 699677 (compressed to 93.3 percent)
16:11:55:Unit 01: ... Done.
16:12:01:Unit 01:- Shutting down core
16:12:01:Unit 01:
16:12:01:Unit 01:Folding@home Core Shutdown: FINISHED_UNIT
16:12:02:FahCore, running Unit 01, returned: FINISHED_UNIT (100)
16:12:02:Sending unit results: id:01 state:SEND project:7600 run:30 clone:51 gen:14 core:0xa4 unit:0x0000001d664f2dcd4dee8a628f9ca7d4
16:12:02:Unit 01: Uploading 683.78KiB
16:12:02:Starting Unit 00
16:12:02:Connecting to 171.64.65.101:8080
16:12:02:Running core: C:/Users/Squinch/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -lifeline 3536 -version 701 -checkpoint 12 -np 2 -forceasm
16:12:02:Started core on PID 276
16:12:02:FahCore 0xa4 started
16:12:02:Unit 00:
16:12:02:Unit 00:*------------------------------*
16:12:02:Unit 00:Folding@Home Gromacs GB Core
16:12:02:Unit 00:Version 2.27 (Dec. 15, 2010)
16:12:02:Unit 00:
16:12:02:Unit 00:Preparing to commence simulation
16:12:02:Unit 00:- Assembly optimizations manually forced on.
16:12:02:Unit 00:- Not checking prior termination.
16:12:02:Unit 00:- Expanded 2079472 -> 5386224 (decompressed 259.0 percent)
16:12:02:Unit 00:Called DecompressByteArray: compressed_data_size=2079472 data_size=5386224, decompressed_data_size=5386224 diff=0
16:12:03:Unit 00:- Digital signature verified
16:12:03:Unit 00:
16:12:03:Unit 00:Project: 7809 (Run 6, Clone 300, Gen 9)
16:12:03:Unit 00:
16:12:03:Unit 00:Assembly optimizations on if available.
16:12:03:Unit 00:Entering M.D.
16:12:08:Unit 01: 36.85%
16:12:09:Unit 00:Mapping NT from 2 to 2
16:12:09:Unit 00:Completed 0 out of 1500000 steps (0%)
16:12:14:Unit 01: 76.63%
16:12:17:Unit 01: Upload complete
16:12:17:Server responded WORK_ACK (400)
16:12:17:Final credit estimate, 3055.00 points
16:12:17:Cleaning up Unit 01
16:51:50:Unit 00:Completed 15000 out of 1500000 steps (1%)
etc., etc., until Unit 00 ( project:7809 run:6 clone:300 gen:9 ) reaches 99%
09:50:27:Unit 00:Completed 1485000 out of 1500000 steps (99%)
09:50:28:Connecting to assign3.stanford.edu:8080
09:50:28:News: Welcome to Folding@Home
09:50:28:Assigned to work server 129.74.85.15
09:50:28:Requesting new work unit for slot 01: RUNNING smp:2 from 129.74.85.15
09:50:28:Connecting to 129.74.85.15:8080
09:50:29:Slot 01: Downloading 52.87KiB
09:50:29:Slot 01: Download complete
09:50:29:Received Unit: id:03 state:DOWNLOAD project:7008 run:2 clone:58 gen:65 core:0xa4 unit:0x000001000001329c4dfb927a1c2f1bbe
10:32:07:Unit 00:Completed 1500000 out of 1500000 steps (100%)
10:32:07:Unit 00:DynamicWrapper: Finished Work Unit: sleep=10000
10:32:17:Unit 00:
10:32:17:Unit 00:Finished Work Unit:
10:32:17:Unit 00:- Reading up to 2908800 from "00/wudata_01.trr": Read 2908800
10:32:17:Unit 00:trr file hash check passed.
10:32:17:Unit 00:- Reading up to 1554492 from "00/wudata_01.xtc": Read 1554492
10:32:17:Unit 00:xtc file hash check passed.
10:32:17:Unit 00:edr file hash check passed.
10:32:17:Unit 00:logfile size: 43443
10:32:17:Unit 00:Leaving Run
10:32:19:Unit 00:- Writing 4511747 bytes of core data to disk...
10:32:20:Unit 00:Done: 4511235 -> 4326360 (compressed to 95.9 percent)
10:32:20:Unit 00: ... Done.
10:32:44:Unit 02:Completed 80%
10:32:53:Unit 00:- Shutting down core
10:32:53:Unit 00:
10:32:53:Unit 00:Folding@home Core Shutdown: FINISHED_UNIT
10:32:58:FahCore, running Unit 00, returned: FINISHED_UNIT (100)
10:32:58:Sending unit results: id:00 state:SEND project:7809 run:6 clone:300 gen:9 core:0xa4 unit:0x000000090a3b1e874e3112d79e56bf89
10:32:58:Unit 00: Uploading 4.13MiB
10:32:58:Starting Unit 03
10:32:58:Connecting to 171.64.65.99:8080
10:32:58:Running core: C:/Users/Squinch/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 03 -suffix 01 -lifeline 3536 -version 701 -checkpoint 12 -np 2 -forceasm
10:32:58:Started core on PID 5004
10:32:58:FahCore 0xa4 started
10:33:04:Unit 00: 5.68%
10:33:10:Unit 00: 12.02%
10:33:16:Unit 00: 18.36%
10:33:22:Unit 00: 24.61%
10:33:28:Unit 00: 30.96%
10:33:34:Unit 00: 37.20%
10:33:40:Unit 00: 43.36%
10:33:46:Unit 00: 49.60%
10:33:52:Unit 00: 55.66%
10:33:58:Unit 00: 61.91%
10:34:04:Unit 00: 68.06%
10:34:10:Unit 00: 74.31%
10:34:16:Unit 00: 80.65%
10:34:22:Unit 00: 86.62%
10:34:28:Unit 00: 92.96%
10:34:34:Unit 00: 99.21%
10:34:34:Unit 00: Upload complete
Dump of project:7809 run:6 clone:300 gen:9:
10:34:34:Server responded WORK_QUIT (404)
10:34:34:WARNING: Server did not like results, dumping
10:34:35:Cleaning up Unit 00
~~~~~~~~~~~~~~~~~~~~~~~~~~~Second dump of 7809~~~~~~~~~~~~
Previous unit (Unit 02) reaches 99% and new unit ( project:7809 run:5 clone:191 gen:7 ) is fetched in anticipation of completion of Unit 02. New Unit will be assigned as Unit 01.
07:17:23:Unit 02:Completed 1980000 out of 2000000 steps (99%)
07:17:24:Connecting to assign3.stanford.edu:8080
07:17:24:News: Welcome to Folding@Home
07:17:24:Assigned to work server 171.64.65.99
07:17:24:Requesting new work unit for slot 00: RUNNING smp:2 from 171.64.65.99
07:17:24:Connecting to 171.64.65.99:8080
07:17:25:Slot 00: Downloading 1.98MiB
07:17:31:Slot 00: 16.54%
07:17:37:Slot 00: 53.13%
07:17:43:Slot 00: 77.77%
07:17:46:Slot 00: Download complete
07:17:46:Received Unit: id:01 state:DOWNLOAD project:7809 run:5 clone:191 gen:7 core:0xa4 unit:0x000000090a3b1e874e31102fbed63d6d
07:37:16:Unit 02:Completed 2000000 out of 2000000 steps (100%)
07:37:16:Unit 02:DynamicWrapper: Finished Work Unit: sleep=10000
07:37:26:Unit 02:
07:37:26:Unit 02:Finished Work Unit:
07:37:26:Unit 02:- Reading up to 488544 from "02/wudata_01.trr": Read 488544
07:37:26:Unit 02:trr file hash check passed.
07:37:26:Unit 02:- Reading up to 57832 from "02/wudata_01.xtc": Read 57832
07:37:26:Unit 02:xtc file hash check passed.
07:37:26:Unit 02:edr file hash check passed.
07:37:26:Unit 02:logfile size: 45843
07:37:26:Unit 02:Leaving Run
07:37:29:Unit 02:- Writing 616135 bytes of core data to disk...
07:37:29:Unit 02:Done: 615623 -> 546248 (compressed to 88.7 percent)
07:37:29:Unit 02: ... Done.
07:37:35:Unit 02:- Shutting down core
07:37:35:Unit 02:
07:37:35:Unit 02:Folding@home Core Shutdown: FINISHED_UNIT
07:37:36:FahCore, running Unit 02, returned: FINISHED_UNIT (100)
07:37:36:Sending unit results: id:02 state:SEND project:7610 run:332 clone:0 gen:40 core:0xa4 unit:0x00000034664f2dd04de6d3dcbdc79a5a
07:37:36:Unit 02: Uploading 533.95KiB
07:37:36:Starting Unit 01
07:37:36:Connecting to 171.64.65.104:8080
07:37:36:Running core: C:/Users/Squinch/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -lifeline 3536 -version 701 -checkpoint 12 -np 2 -forceasm
07:37:36:Started core on PID 1764
07:37:36:FahCore 0xa4 started
07:37:36:Unit 01:
07:37:36:Unit 01:*------------------------------*
07:37:36:Unit 01:Folding@Home Gromacs GB Core
07:37:36:Unit 01:Version 2.27 (Dec. 15, 2010)
07:37:36:Unit 01:
07:37:36:Unit 01:Preparing to commence simulation
07:37:36:Unit 01:- Assembly optimizations manually forced on.
07:37:36:Unit 01:- Not checking prior termination.
07:37:36:Unit 01:- Expanded 2079605 -> 5386224 (decompressed 259.0 percent)
07:37:36:Unit 01:Called DecompressByteArray: compressed_data_size=2079605 data_size=5386224, decompressed_data_size=5386224 diff=0
07:37:37:Unit 01:- Digital signature verified
07:37:37:Unit 01:
07:37:37:Unit 01:Project: 7809 (Run 5, Clone 191, Gen 7)
07:37:37:Unit 01:
07:37:37:Unit 01:Assembly optimizations on if available.
07:37:37:Unit 01:Entering M.D.
07:37:42:Unit 02: 44.20%
07:37:43:Unit 01:Mapping NT from 2 to 2
07:37:43:Unit 01:Completed 0 out of 1500000 steps (0%)
07:37:48:Unit 02: 95.89%
07:37:49:Unit 02: Upload complete
07:37:49:Server responded WORK_ACK (400)
07:37:49:Final credit estimate, 2642.00 points
07:37:49:Cleaning up Unit 02
08:20:31:Unit 01:Completed 15000 out of 1500000 steps (1%)
etc., etc., until Unit 01 ( project:7809 run:5 clone:191 gen:7 ) reaches 99%
02:52:45:Unit 01:Completed 1485000 out of 1500000 steps (99%)
02:52:46:Connecting to assign3.stanford.edu:8080
02:52:46:News: Welcome to Folding@Home
02:52:46:Assigned to work server 129.74.85.15
02:52:46:Requesting new work unit for slot 00: RUNNING smp:2 from 129.74.85.15
02:52:46:Connecting to 129.74.85.15:8080
02:52:47:Slot 00: Downloading 53.90KiB
02:52:47:Slot 00: Download complete
02:52:47:Received Unit: id:00 state:DOWNLOAD project:7002 run:0 clone:0 gen:65 core:0xa4 unit:0x000000e80001329c4dfb8274b23c77b9
03:34:29:Unit 01:Completed 1500000 out of 1500000 steps (100%)
03:34:30:Unit 01:DynamicWrapper: Finished Work Unit: sleep=10000
03:34:40:Unit 01:
03:34:40:Unit 01:Finished Work Unit:
03:34:40:Unit 01:- Reading up to 2908800 from "01/wudata_01.trr": Read 2908800
03:34:40:Unit 01:trr file hash check passed.
03:34:40:Unit 01:- Reading up to 1554516 from "01/wudata_01.xtc": Read 1554516
03:34:40:Unit 01:xtc file hash check passed.
03:34:40:Unit 01:edr file hash check passed.
03:34:40:Unit 01:logfile size: 44345
03:34:40:Unit 01:Leaving Run
03:34:45:Unit 01:- Writing 4512673 bytes of core data to disk...
03:34:46:Unit 01:Done: 4512161 -> 4327143 (compressed to 95.8 percent)
03:34:46:Unit 01: ... Done.
03:35:19:Unit 01:- Shutting down core
03:35:19:Unit 01:
03:35:19:Unit 01:Folding@home Core Shutdown: FINISHED_UNIT
03:35:24:FahCore, running Unit 01, returned: FINISHED_UNIT (100)
03:35:24:Sending unit results: id:01 state:SEND project:7809 run:5 clone:191 gen:7 core:0xa4 unit:0x000000090a3b1e874e31102fbed63d6d
03:35:24:Unit 01: Uploading 4.13MiB
03:35:24:Starting Unit 00
03:35:24:Connecting to 171.64.65.99:8080
03:35:24:Running core: C:/Users/Squinch/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -lifeline 3536 -version 701 -checkpoint 12 -np 2 -forceasm
03:35:24:Started core on PID 4408
03:35:24:FahCore 0xa4 started
03:35:25:Unit 00:
03:35:25:Unit 00:*------------------------------*
03:35:25:Unit 00:Folding@Home Gromacs GB Core
03:35:25:Unit 00:Version 2.27 (Dec. 15, 2010)
03:35:25:Unit 00:
03:35:25:Unit 00:Preparing to commence simulation
03:35:25:Unit 00:- Assembly optimizations manually forced on.
03:35:25:Unit 00:- Not checking prior termination.
03:35:25:Unit 00:- Expanded 54678 -> 203368 (decompressed 371.9 percent)
03:35:25:Unit 00:Called DecompressByteArray: compressed_data_size=54678 data_size=203368, decompressed_data_size=203368 diff=0
03:35:25:Unit 00:- Digital signature verified
03:35:25:Unit 00:
03:35:25:Unit 00:Project: 7002 (Run 0, Clone 0, Gen 65)
03:35:25:Unit 00:
03:35:25:Unit 00:Assembly optimizations on if available.
03:35:25:Unit 00:Entering M.D.
03:35:30:Unit 01: 5.77%
03:35:30:Unit 00:Mapping NT from 2 to 2
03:35:30:Unit 00:Completed 0 out of 10000000 steps (0%)
03:35:36:Unit 01: 11.93%
03:35:42:Unit 01: 18.27%
03:35:48:Unit 01: 24.51%
03:35:54:Unit 01: 30.95%
03:36:00:Unit 01: 35.87%
03:36:06:Unit 01: 42.31%
03:36:12:Unit 01: 48.84%
03:36:18:Unit 01: 55.27%
03:36:24:Unit 01: 61.71%
03:36:30:Unit 01: 68.15%
03:36:36:Unit 01: 74.39%
03:36:42:Unit 01: 81.02%
03:36:48:Unit 01: 87.55%
03:36:54:Unit 01: 93.98%
03:36:59:Unit 01: Upload complete
Dump of project:7809 run:5 clone:191 gen:7:
03:36:59:Server responded WORK_QUIT (404)
03:36:59:WARNING: Server did not like results, dumping
03:37:00:Cleaning up Unit 01
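In case it helps anyone pick the same events out of their own log, here is a rough Python sketch of what I did by hand above. It relies only on the three line formats visible in the excerpt ("Sending unit results", "Unit NN: Upload complete", and "Server responded ..."). The function name and the assumption that the server response always follows the matching "Upload complete" line are mine, not anything documented by the client, so treat it as a starting point rather than a proper log parser.

```python
import re

# Line formats below are taken from the log excerpt above; everything else
# (names, pairing logic) is an assumption made for this sketch.
SEND_RE = re.compile(
    r"^\d\d:\d\d:\d\d:Sending unit results: id:(?P<id>\d+) .*?"
    r"project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+)"
)
UPLOAD_DONE_RE = re.compile(r"^\d\d:\d\d:\d\d:Unit (?P<id>\d+): Upload complete")
RESPONSE_RE = re.compile(
    r"^\d\d:\d\d:\d\d:Server responded (?P<status>\w+) \((?P<code>\d+)\)"
)


def find_upload_results(lines):
    """Return (project, run, clone, gen, status, code) for each upload.

    Assumes the "Server responded" line refers to the unit whose
    "Upload complete" line appeared most recently before it.
    """
    sent = {}             # unit id -> (project, run, clone, gen)
    last_uploaded = None  # id of the unit that most recently finished uploading
    results = []
    for raw in lines:
        line = raw.strip()
        m = SEND_RE.match(line)
        if m:
            sent[m["id"]] = tuple(
                int(m[key]) for key in ("project", "run", "clone", "gen")
            )
            continue
        m = UPLOAD_DONE_RE.match(line)
        if m:
            last_uploaded = m["id"]
            continue
        m = RESPONSE_RE.match(line)
        if m and last_uploaded in sent:
            results.append(sent.pop(last_uploaded) + (m["status"], int(m["code"])))
            last_uploaded = None
    return results
```

Feeding it the lines of a saved log (for example `find_upload_results(open(path_to_log))`, where the path is whatever your log file happens to be) and filtering for a status of `WORK_QUIT` gives the list of dumped units. Lines that match none of the three patterns, including my own comments in the excerpt, are simply skipped.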
Thanks for your help!
Mod Edit: Changed Quote Tags To Code Tags - PantherX