
Project: 5743 (Run 4, Clone 45, Gen 24)

Posted: Fri Dec 19, 2008 9:44 pm
by briar7
Just had this project fail several times on my HD4850. Never got past 10% before shutting down. GPU is at stock clocks.


[20:29:22] + Processing work unit
[20:29:22] Core required: FahCore_11.exe
[20:29:22] Core found.
[20:29:22] Working on queue slot 01 [December 19 20:29:22 UTC]
[20:29:22] + Working ...
[20:29:22] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -priority 96 -checkpoint 15 -verbose -lifeline 3396 -version 623'

[20:29:22] 
[20:29:22] *------------------------------*
[20:29:22] Folding@Home GPU Core - Beta
[20:29:22] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[20:29:22] 
[20:29:22] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[20:29:22] Build host: amoeba
[20:29:22] Board Type: AMD
[20:29:22] Core      : 
[20:29:22] Preparing to commence simulation
[20:29:22] - Looking at optimizations...
[20:29:22] - Created dyn
[20:29:22] - Files status OK
[20:29:22] - Expanded 70218 -> 360060 (decompressed 512.7 percent)
[20:29:22] Called DecompressByteArray: compressed_data_size=70218 data_size=360060, decompressed_data_size=360060 diff=0
[20:29:22] - Digital signature verified
[20:29:22] 
[20:29:22] Project: 5743 (Run 4, Clone 45, Gen 24)
[20:29:22] 
[20:29:22] Assembly optimizations on if available.
[20:29:22] Entering M.D.
[20:29:29] Working on Protein
[20:29:30] Client config found, loading data.
[20:29:30] Starting GUI Server
[20:42:26] Completed 1%
[20:42:26] mdrun_gpu returned 
[20:42:26] NANs detected on GPU
[20:42:26] 
[20:42:26] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:42:28] CoreStatus = 7A (122)
[20:42:28] Sending work to server
[20:42:28] Project: 5743 (Run 4, Clone 45, Gen 24)
[20:42:28] - Error: Could not get length of results file work/wuresults_01.dat
[20:42:28] - Error: Could not read unit 01 file. Removing from queue.
[20:42:28] Trying to send all finished work units
[20:42:28] + No unsent completed units remaining.
[20:42:28] - Preparing to get new work unit...
[20:42:28] + Attempting to get work packet
[20:42:28] - Will indicate memory of 2046 MB
[20:42:28] - Connecting to assignment server
[20:42:28] Connecting to http://assign-GPU.stanford.edu:8080/
[20:42:28] Posted data.
[20:42:28] Initial: 40AB; - Successful: assigned to (171.64.65.102).
[20:42:28] + News From Folding@Home: GPU folding beta
[20:42:29] Loaded queue successfully.
[20:42:29] Connecting to http://171.64.65.102:8080/
[20:42:29] Posted data.
[20:42:29] Initial: 0000; - Receiving payload (expected size: 70730)
[20:42:29] Conversation time very short, giving reduced weight in bandwidth avg
[20:42:29] - Downloaded at ~138 kB/s
[20:42:29] - Averaged speed for that direction ~110 kB/s
[20:42:29] + Received work.
[20:42:29] Trying to send all finished work units
[20:42:29] + No unsent completed units remaining.
[20:42:29] + Closed connections
[20:42:34] 
[20:42:34] + Processing work unit
[20:42:34] Core required: FahCore_11.exe
[20:42:34] Core found.
[20:42:34] Working on queue slot 02 [December 19 20:42:34 UTC]
[20:42:34] + Working ...
[20:42:34] - Calling '.\FahCore_11.exe -dir work/ -suffix 02 -priority 96 -checkpoint 15 -verbose -lifeline 3396 -version 623'

[20:42:34] 
[20:42:34] *------------------------------*
[20:42:34] Folding@Home GPU Core - Beta
[20:42:34] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[20:42:34] 
[20:42:34] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[20:42:34] Build host: amoeba
[20:42:34] Board Type: AMD
[20:42:34] Core      : 
[20:42:34] Preparing to commence simulation
[20:42:34] - Looking at optimizations...
[20:42:34] - Created dyn
[20:42:34] - Files status OK
[20:42:34] - Expanded 70218 -> 360060 (decompressed 512.7 percent)
[20:42:34] Called DecompressByteArray: compressed_data_size=70218 data_size=360060, decompressed_data_size=360060 diff=0
[20:42:34] - Digital signature verified
[20:42:34] 
[20:42:34] Project: 5743 (Run 4, Clone 45, Gen 24)
[20:42:34] 
[20:42:34] Assembly optimizations on if available.
[20:42:34] Entering M.D.
[20:42:41] Working on Protein
[20:42:41] Client config found, loading data.
[20:42:41] Starting GUI Server
[20:44:52] Completed 1%
[20:44:52] mdrun_gpu returned 
[20:44:52] NANs detected on GPU
[20:44:52] 
[20:44:52] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:44:54] CoreStatus = 7A (122)
[20:44:54] Sending work to server
[20:44:54] Project: 5743 (Run 4, Clone 45, Gen 24)
[20:44:54] - Error: Could not get length of results file work/wuresults_02.dat
[20:44:54] - Error: Could not read unit 02 file. Removing from queue.
[20:44:54] Trying to send all finished work units
[20:44:54] + No unsent completed units remaining.
[20:44:54] - Preparing to get new work unit...
[20:44:54] + Attempting to get work packet
[20:44:54] - Will indicate memory of 2046 MB
[20:44:54] - Connecting to assignment server
[20:44:54] Connecting to http://assign-GPU.stanford.edu:8080/
[20:44:55] Posted data.
[20:44:55] Initial: 40AB; - Successful: assigned to (171.64.65.102).
[20:44:55] + News From Folding@Home: GPU folding beta
[20:44:55] Loaded queue successfully.
[20:44:55] Connecting to http://171.64.65.102:8080/
[20:44:55] Posted data.
[20:44:55] Initial: 0000; - Receiving payload (expected size: 70730)
[20:44:55] Conversation time very short, giving reduced weight in bandwidth avg
[20:44:55] - Downloaded at ~138 kB/s
[20:44:55] - Averaged speed for that direction ~113 kB/s
[20:44:55] + Received work.
[20:44:55] Trying to send all finished work units
[20:44:55] + No unsent completed units remaining.
[20:44:55] + Closed connections
[20:45:00] 
[20:45:00] + Processing work unit
[20:45:00] Core required: FahCore_11.exe
[20:45:00] Core found.
[20:45:00] Working on queue slot 03 [December 19 20:45:00 UTC]
[20:45:00] + Working ...
[20:45:00] - Calling '.\FahCore_11.exe -dir work/ -suffix 03 -priority 96 -checkpoint 15 -verbose -lifeline 3396 -version 623'

[20:45:00] 
[20:45:00] *------------------------------*
[20:45:00] Folding@Home GPU Core - Beta
[20:45:00] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[20:45:00] 
[20:45:00] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[20:45:00] Build host: amoeba
[20:45:00] Board Type: AMD
[20:45:00] Core      : 
[20:45:00] Preparing to commence simulation
[20:45:00] - Looking at optimizations...
[20:45:00] - Created dyn
[20:45:00] - Files status OK
[20:45:00] - Expanded 70218 -> 360060 (decompressed 512.7 percent)
[20:45:00] Called DecompressByteArray: compressed_data_size=70218 data_size=360060, decompressed_data_size=360060 diff=0
[20:45:00] - Digital signature verified
[20:45:00] 
[20:45:00] Project: 5743 (Run 4, Clone 45, Gen 24)
[20:45:00] 
[20:45:00] Assembly optimizations on if available.
[20:45:00] Entering M.D.
[20:45:07] Working on Protein
[20:45:07] Client config found, loading data.
[20:45:07] Starting GUI Server
[20:58:39] Completed 1%
[20:58:39] mdrun_gpu returned 
[20:58:39] NANs detected on GPU
[20:58:39] 
[20:58:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:58:43] CoreStatus = 7A (122)
[20:58:43] Sending work to server
[20:58:43] Project: 5743 (Run 4, Clone 45, Gen 24)
[20:58:43] - Error: Could not get length of results file work/wuresults_03.dat
[20:58:43] - Error: Could not read unit 03 file. Removing from queue.
[20:58:43] Trying to send all finished work units
[20:58:43] + No unsent completed units remaining.
[20:58:43] - Preparing to get new work unit...
[20:58:43] + Attempting to get work packet
[20:58:43] - Will indicate memory of 2046 MB
[20:58:43] - Connecting to assignment server
[20:58:43] Connecting to http://assign-GPU.stanford.edu:8080/
[20:58:43] Posted data.
[20:58:43] Initial: 40AB; - Successful: assigned to (171.64.65.102).
[20:58:43] + News From Folding@Home: GPU folding beta
[20:58:43] Loaded queue successfully.
[20:58:43] Connecting to http://171.64.65.102:8080/
[20:58:43] Posted data.
[20:58:43] Initial: 0000; - Receiving payload (expected size: 70730)
[20:58:44] - Downloaded at ~69 kB/s
[20:58:44] - Averaged speed for that direction ~104 kB/s
[20:58:44] + Received work.
[20:58:44] Trying to send all finished work units
[20:58:44] + No unsent completed units remaining.
[20:58:44] + Closed connections
[20:58:49] 
[20:58:49] + Processing work unit
[20:58:49] Core required: FahCore_11.exe
[20:58:49] Core found.
[20:58:49] Working on queue slot 04 [December 19 20:58:49 UTC]
[20:58:49] + Working ...
[20:58:49] - Calling '.\FahCore_11.exe -dir work/ -suffix 04 -priority 96 -checkpoint 15 -verbose -lifeline 3396 -version 623'

[20:58:49] 
[20:58:49] *------------------------------*
[20:58:49] Folding@Home GPU Core - Beta
[20:58:49] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[20:58:49] 
[20:58:49] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[20:58:49] Build host: amoeba
[20:58:49] Board Type: AMD
[20:58:49] Core      : 
[20:58:49] Preparing to commence simulation
[20:58:49] - Looking at optimizations...
[20:58:49] - Created dyn
[20:58:49] - Files status OK
[20:58:49] - Expanded 70218 -> 360060 (decompressed 512.7 percent)
[20:58:49] Called DecompressByteArray: compressed_data_size=70218 data_size=360060, decompressed_data_size=360060 diff=0
[20:58:49] - Digital signature verified
[20:58:49] 
[20:58:49] Project: 5743 (Run 4, Clone 45, Gen 24)
[20:58:49] 
[20:58:49] Assembly optimizations on if available.
[20:58:49] Entering M.D.
[20:58:58] Working on Protein
[20:59:00] Client config found, loading data.
[20:59:00] Starting GUI Server
[21:01:19] Completed 1%
[21:03:17] Completed 2%
[21:05:12] Completed 3%
[21:07:09] Completed 4%
[21:09:05] Completed 5%
[21:11:02] Completed 6%
[21:12:57] Completed 7%
[21:14:52] Completed 8%
[21:21:48] Completed 9%
[21:26:15] Completed 10%
[21:26:15] mdrun_gpu returned 
[21:26:15] NANs detected on GPU
[21:26:15] 
[21:26:15] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:26:17] CoreStatus = 7A (122)
[21:26:17] Sending work to server
[21:26:17] Project: 5743 (Run 4, Clone 45, Gen 24)
[21:26:17] - Error: Could not get length of results file work/wuresults_04.dat
[21:26:17] - Error: Could not read unit 04 file. Removing from queue.
[21:26:17] EUE limit exceeded. Pausing 24 hours.
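
For anyone who wants to tally failures like these from their own client log, here is a minimal sketch in Python. It assumes only the line formats visible in the log above (the "Working on queue slot", "Completed N%", "NANs detected on GPU" and "Core Shutdown:" lines). The function name `summarize_attempts` and the dictionary keys are hypothetical, made up for this illustration, and are not part of the client.

```python
import re

# Assumed format: each log line starts with a bracketed HH:MM:SS timestamp.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")


def summarize_attempts(log_text):
    """Return one dict per work-unit attempt found in the log text.

    Each dict records the queue slot, the last "Completed N%" value seen,
    whether the NaN message appeared, and the core shutdown reason (if any).
    """
    attempts = []
    current = None
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # blank or non-timestamped line
        msg = m.group(2).strip()

        # A "Working on queue slot NN" line starts a new attempt.
        slot = re.match(r"Working on queue slot (\d+)", msg)
        if slot:
            current = {
                "slot": slot.group(1),
                "last_percent": 0,
                "nan": False,
                "shutdown": None,
            }
            attempts.append(current)
            continue
        if current is None:
            continue

        done = re.match(r"Completed (\d+)%", msg)
        if done:
            current["last_percent"] = int(done.group(1))
        elif msg == "NANs detected on GPU":
            current["nan"] = True
        elif msg.startswith("Folding@home Core Shutdown:"):
            current["shutdown"] = msg.split(":", 1)[1].strip()
    return attempts
```

Run against the log above, this reports four attempts on slots 01 to 04: the first three stop at 1% and the fourth at 10%, all with the NaN message and an UNSTABLE_MACHINE shutdown.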

Re: Project: 5743 (Run 4, Clone 45, Gen 24)

Posted: Fri Dec 19, 2008 10:51 pm
by toTOW
There's no data for this WU in the DB yet ... but it looks like this WU was affected by the "unreadable results" bug :(

Re: Project: 5743 (Run 4, Clone 45, Gen 24)

Posted: Sat Dec 20, 2008 2:46 am
by bruce
briar7:

What version of the client are you running? If you're on 6.20, you should consider upgrading to 6.23.

Re: Project: 5743 (Run 4, Clone 45, Gen 24)

Posted: Sat Dec 20, 2008 11:31 am
by briar7
I've already upgraded to the new 6.23 client, as it was required for the new v1.22 ATI core.