
Project: 5768 (Run 6, Clone 60, Gen 30)

Posted: Tue Feb 21, 2012 10:21 pm
by Fahrenheit451
WU stopped with an UNSTABLE_MACHINE error after 55 successfully finished WUs:

Code:

--- Opening Log file [February 16 23:48:49 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: D:\Programme\Folding@home-Win32-GPU_Vista-623_console_client
Executable: D:\Programme\Folding@home-Win32-GPU_Vista-623_console_client\Folding@home-Win32-GPU.exe
Arguments: -advmethods -verbosity 9 

[23:48:49] - Ask before connecting: No
[23:48:49] - User name: superduper4711 (Team 0)
[23:48:49] - User ID: ECCE00A5AF7AA44
[23:48:49] - Machine ID: 2
[23:48:49] 
[23:48:49] Loaded queue successfully.
[23:48:49] 

<SNIP>

[18:50:57] Project: 5767 (Run 3, Clone 177, Gen 1618)
[18:50:57] - Read packet limit of 540015616... Set to 524286976.


[18:50:57] + Attempting to send results [February 21 18:50:57 UTC]
[18:50:57] - Reading file work/wuresults_05.dat from core
[18:50:57]   (Read 96133 bytes from disk)
[18:50:57] Connecting to http://171.67.108.11:8080/
[18:50:59] Posted data.
[18:50:59] Initial: 0000; - Uploaded at ~47 kB/s
[18:50:59] - Averaged speed for that direction ~44 kB/s
[18:50:59] + Results successfully sent
[18:50:59] Thank you for your contribution to Folding@Home.
[18:50:59] + Number of Units Completed: 3640

[18:51:03] Trying to send all finished work units
[18:51:03] + No unsent completed units remaining.
[18:51:03] - Preparing to get new work unit...
[18:51:03] + Attempting to get work packet
[18:51:03] - Will indicate memory of 2045 MB
[18:51:03] - Connecting to assignment server
[18:51:03] Connecting to http://assign-GPU.stanford.edu:8080/
[18:51:05] Posted data.
[18:51:05] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[18:51:05] + News From Folding@Home: Welcome to Folding@Home
[18:51:05] Loaded queue successfully.
[18:51:05] Connecting to http://171.67.108.11:8080/
[18:51:06] Posted data.
[18:51:06] Initial: 0000; - Receiving payload (expected size: 47170)
[18:51:07] - Downloaded at ~46 kB/s
[18:51:07] - Averaged speed for that direction ~55 kB/s
[18:51:07] + Received work.
[18:51:07] Trying to send all finished work units
[18:51:07] + No unsent completed units remaining.
[18:51:07] + Closed connections
[18:51:07] 
[18:51:07] + Processing work unit
[18:51:07] Core required: FahCore_11.exe
[18:51:07] Core found.
[18:51:07] Working on queue slot 06 [February 21 18:51:07 UTC]
[18:51:07] + Working ...
[18:51:07] - Calling '.\FahCore_11.exe -dir work/ -suffix 06 -priority 96 -checkpoint 15 -verbose -lifeline 2188 -version 623'

[18:51:07] 
[18:51:07] *------------------------------*
[18:51:07] Folding@Home GPU Core
[18:51:07] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[18:51:07] 
[18:51:07] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[18:51:07] Build host: amoeba
[18:51:07] Board Type: Nvidia
[18:51:07] Core      : 
[18:51:07] Preparing to commence simulation
[18:51:07] - Looking at optimizations...
[18:51:07] DeleteFrameFiles: successfully deleted file=work/wudata_06.ckp
[18:51:07] - Created dyn
[18:51:07] - Files status OK
[18:51:07] - Expanded 46658 -> 252912 (decompressed 542.0 percent)
[18:51:07] Called DecompressByteArray: compressed_data_size=46658 data_size=252912, decompressed_data_size=252912 diff=0
[18:51:07] - Digital signature verified
[18:51:07] 
[18:51:07] Project: 5768 (Run 6, Clone 60, Gen 30)
[18:51:07] 
[18:51:07] Assembly optimizations on if available.
[18:51:07] Entering M.D.
[18:51:13] Tpr hash work/wudata_06.tpr:  4220385613 390214888 1684401472 3606630061 3474229529
[18:51:13] 
[18:51:13] Calling fah_main args: 14 usage=100
[18:51:13] 
[18:51:13] Working on Protein
[18:51:14] Client config found, loading data.
[18:51:14] Starting GUI Server
[18:52:05] Completed 1%
[18:52:57] Completed 2%
[18:53:49] Completed 3%
[18:54:41] Completed 4%
[18:55:32] Completed 5%
[18:56:24] Completed 6%
[18:57:16] Completed 7%
[18:58:08] Completed 8%
[18:59:00] Completed 9%
[18:59:52] Completed 10%
[19:00:44] Completed 11%
[19:01:35] Completed 12%
[19:02:27] Completed 13%
[19:03:19] Completed 14%
[19:04:11] Completed 15%
[19:05:03] Completed 16%
[19:05:54] Completed 17%
[19:06:46] Completed 18%
[19:07:38] Completed 19%
[19:08:30] Completed 20%
[19:09:22] Completed 21%
[19:10:14] Completed 22%
[19:11:05] Completed 23%
[19:11:57] Completed 24%
[19:12:49] Completed 25%
[19:13:41] Completed 26%
[19:14:33] Completed 27%
[19:15:25] Completed 28%
[19:16:16] Completed 29%
[19:17:08] Completed 30%
[19:18:00] Completed 31%
[19:18:52] Completed 32%
[19:19:44] Completed 33%
[19:20:36] Completed 34%
[19:21:27] Completed 35%
[19:22:19] Completed 36%
[19:23:11] Completed 37%
[19:24:03] Completed 38%
[19:24:55] Completed 39%
[19:25:47] Completed 40%
[19:26:39] Completed 41%
[19:27:30] Completed 42%
[19:28:22] Completed 43%
[19:29:14] Completed 44%
[19:30:06] Completed 45%
[19:30:58] Completed 46%
[19:31:50] Completed 47%
[19:32:41] Completed 48%
[19:33:33] Completed 49%
[19:34:25] Completed 50%
[19:35:17] Completed 51%
[19:36:09] Completed 52%
[19:37:00] Completed 53%
[19:37:52] Completed 54%
[19:38:44] Completed 55%
[19:39:36] Completed 56%
[19:40:28] Completed 57%
[19:41:20] Completed 58%
[19:42:11] Completed 59%
[19:43:03] Completed 60%
[19:43:55] Completed 61%
[19:44:47] Completed 62%
[19:45:39] Completed 63%
[19:46:31] Completed 64%
[19:47:22] Completed 65%
[19:48:14] Completed 66%
[19:49:06] Completed 67%
[19:49:59] Completed 68%
[19:50:54] Completed 69%
[19:51:47] Completed 70%
[19:52:39] Completed 71%
[19:53:31] Completed 72%
[19:54:25] Completed 73%
[19:55:17] Completed 74%
[19:56:11] Completed 75%
[19:57:03] Completed 76%
[19:57:55] Completed 77%
[19:58:49] Completed 78%
[19:59:42] Completed 79%
[20:00:35] Completed 80%
[20:01:27] Completed 81%
[20:02:20] Completed 82%
[20:03:14] Completed 83%
[20:04:06] Completed 84%
[20:04:59] Completed 85%
[20:05:52] Completed 86%
[20:06:44] Completed 87%
[20:07:36] Completed 88%
[20:08:29] Completed 89%
[20:09:22] Completed 90%
[20:10:15] Completed 91%
[20:11:05] Run: exception thrown during GuardedRun
[20:11:05] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[20:11:06] Going to send back what have done -- stepsTotalG=15000000
[20:11:06] Work fraction=0.9192 steps=15000000.
[20:11:09] logfile size=11953 infoLength=11953 edr=0 trr=23
[20:11:10] + Opened results file
[20:11:10] - Writing 12489 bytes of core data to disk...
[20:11:10] Done: 11977 -> 4195 (compressed to 35.0 percent)
[20:11:10]   ... Done.
[20:11:10] DeleteFrameFiles: successfully deleted file=work/wudata_06.ckp
[20:11:10] 
[20:11:10] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:11:13] CoreStatus = 7A (122)
[20:11:13] Sending work to server
[20:11:13] Project: 5768 (Run 6, Clone 60, Gen 30)
[20:11:13] - Read packet limit of 540015616... Set to 524286976.


[20:11:13] + Attempting to send results [February 21 20:11:13 UTC]
[20:11:13] - Reading file work/wuresults_06.dat from core
[20:11:13]   (Read 4707 bytes from disk)
[20:11:13] Connecting to http://171.67.108.11:8080/
[20:11:14] Posted data.
[20:11:14] Initial: 0000; - Uploaded at ~5 kB/s
[20:11:14] - Averaged speed for that direction ~36 kB/s
[20:11:14] + Results successfully sent
[20:11:14] Thank you for your contribution to Folding@Home.
[20:11:18] Trying to send all finished work units
[20:11:18] + No unsent completed units remaining.
[20:11:18] - Preparing to get new work unit...
[20:11:18] + Attempting to get work packet
[20:11:18] - Will indicate memory of 2045 MB
[20:11:18] - Connecting to assignment server
[20:11:18] Connecting to http://assign-GPU.stanford.edu:8080/
[20:11:19] Posted data.
[20:11:19] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[20:11:19] + News From Folding@Home: Welcome to Folding@Home
[20:11:19] Loaded queue successfully.
[20:11:19] Connecting to http://171.67.108.11:8080/
[20:11:20] Posted data.
[20:11:20] Initial: 0000; - Receiving payload (expected size: 47150)
[20:11:21] - Downloaded at ~46 kB/s
[20:11:21] - Averaged speed for that direction ~53 kB/s
[20:11:21] + Received work.
[20:11:21] Trying to send all finished work units
[20:11:21] + No unsent completed units remaining.
[20:11:21] + Closed connections
[20:11:26] 
[20:11:26] + Processing work unit
[20:11:26] Core required: FahCore_11.exe
[20:11:26] Core found.
[20:11:26] Working on queue slot 07 [February 21 20:11:26 UTC]
[20:11:26] + Working ...
[20:11:26] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -priority 96 -checkpoint 15 -verbose -lifeline 2188 -version 623'

Client continued with another WU.

Re: Project: 5768 (Run 6, Clone 60, Gen 30)

Posted: Tue Mar 13, 2012 10:36 am
by verlyol
Judging from your log and your hardware configuration, it is possible that your GeForce 8800 GTS has a problem, such as overheating or something else ...

Re: Project: 5768 (Run 6, Clone 60, Gen 30)

Posted: Tue Mar 13, 2012 5:47 pm
by bruce
verlyol wrote: Judging from your log and your hardware configuration, it is possible that your GeForce 8800 GTS has a problem, such as overheating or something else ...
That appears to be true. You did receive partial credit for that WU.
Your WU (P5768 R6 C60 G30) was added to the stats database on 2012-02-21 13:02:23 for 324.46 points of credit.

The WU was reassigned and others completed it successfully.

The messages
[20:11:05] Run: exception thrown during GuardedRun
. . .
[20:11:10] Folding@home Core Shutdown: UNSTABLE_MACHINE

mean exactly what they say ... your machine decided to be unstable and reported a hardware exception. We can only guess why that happened.
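
For anyone who wants to check a long FAHlog for this kind of event, here is a minimal sketch in Python. It is not part of the client. The function name `find_core_failures` is made up for illustration, and it assumes that a cleanly finished WU is reported as "Core Shutdown: FINISHED_UNIT", so any other shutdown status is treated as a failure.

```python
import re

# Matches client log lines such as "[20:11:05] Run: exception thrown during GuardedRun".
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")
SHUTDOWN_PREFIX = "Folding@home Core Shutdown:"


def find_core_failures(log_text):
    """Return one dict per core shutdown that was not a clean finish.

    Assumption: a successful WU ends with the status FINISHED_UNIT;
    every other status (e.g. UNSTABLE_MACHINE) is reported.
    """
    failures = []
    project = None       # most recent "Project: ..." line seen
    last_percent = None  # most recent "Completed N%" value for that project
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        stamp, msg = match.groups()
        msg = msg.strip()
        if msg.startswith("Project:"):
            project = msg
            last_percent = None
        elif msg.startswith("Completed ") and msg.endswith("%"):
            last_percent = int(msg[len("Completed "):-1])
        elif msg.startswith(SHUTDOWN_PREFIX):
            status = msg[len(SHUTDOWN_PREFIX):].strip()
            if status != "FINISHED_UNIT":
                failures.append({
                    "time": stamp,
                    "status": status,
                    "project": project,
                    "last_percent": last_percent,
                })
    return failures
```

Run against the log in the first post, this should report a single UNSTABLE_MACHINE shutdown for Project: 5768 (Run 6, Clone 60, Gen 30) after 91%. It only tells you when and how often it happens, not why.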

Re: Project: 5768 (Run 6, Clone 60, Gen 30)

Posted: Tue Mar 13, 2012 11:16 pm
by Fahrenheit451
Thank you for your feedback. But IMHO overheating is definitely not the reason. The GPU runs at stock speed and at about 65°C (according to HW Mon) in a well-cooled big tower (see my system specs for system 1). But I will not rule out that the system sometimes has a hiccup due to general overload (2 FAH single-core instances, the GPU client, and normal use (office, video, internet) at the same time).
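
One rough way to look for such load hiccups is to measure the time between consecutive "Completed N%" lines. In the log above each percent takes 51 to 52 seconds up to 67% and then drifts to between 52 and 55 seconds. The sketch below is only an illustration: the names `frame_times` and `slow_frames` and the default `factor` threshold are assumptions chosen for the example, and a slower frame by itself does not prove that load caused the error.

```python
import re
from statistics import median

# Matches progress lines such as "[19:50:54] Completed 69%".
PROGRESS_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+)%")


def frame_times(log_text):
    """Return [(percent, seconds_since_previous_percent), ...].

    Only consecutive percentages are paired, so a restart of the
    counter (new WU) does not produce a bogus interval.
    """
    result = []
    prev = None  # (percent, seconds since midnight)
    for line in log_text.splitlines():
        match = PROGRESS_RE.match(line.strip())
        if not match:
            continue
        hours, minutes, seconds, percent = map(int, match.groups())
        now = hours * 3600 + minutes * 60 + seconds
        if prev is not None and percent == prev[0] + 1:
            # The modulo handles a log that runs past midnight UTC.
            result.append((percent, (now - prev[1]) % 86400))
        prev = (percent, now)
    return result


def slow_frames(times, factor=1.05):
    """Return the frames that took longer than factor * median frame time."""
    if not times:
        return []
    typical = median(delta for _, delta in times)
    return [(pct, delta) for pct, delta in times if delta > typical * factor]
```

If the slow frames line up with times when the machine was busy with other work, that supports the overload idea. If they do not, the card itself remains a candidate.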

Re: Project: 5768 (Run 6, Clone 60, Gen 30)

Posted: Wed Mar 14, 2012 6:05 pm
by verlyol
Yes, that is possible, but keep an eye on the graphics card, because 8800 GTS cards sometimes became defective in long-term use ... something to follow up on if the problem recurs regularly.