Project: 10111 (Run 876, Clone 4, Gen 145)

Posted: Wed Mar 16, 2011 9:24 am
by ChrisMu
Getting repeated NaNs returned:

[07:12:19] *------------------------------*
[07:12:19] Folding@Home GPU Core
[07:12:19] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[07:12:19] 
[07:12:19] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[07:12:19] Build host: amoeba
[07:12:19] Board Type: Nvidia
[07:12:19] Core      : 
[07:12:19] Preparing to commence simulation
[07:12:19] - Looking at optimizations...
[07:12:19] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[07:12:19] - Created dyn
[07:12:19] - Files status OK
[07:12:19] - Expanded 81873 -> 421543 (decompressed 514.8 percent)
[07:12:19] Called DecompressByteArray: compressed_data_size=81873 data_size=421543, decompressed_data_size=421543 diff=0
[07:12:19] - Digital signature verified
[07:12:19] 
[07:12:19] Project: 10111 (Run 876, Clone 4, Gen 145)
[07:12:19] 
[07:12:19] Assembly optimizations on if available.
[07:12:19] Entering M.D.
[07:12:25] Tpr hash work/wudata_01.tpr:  3399033878 2973792819 1138433328 2405977824 2063172454
[07:12:25] 
[07:12:25] Calling fah_main args: 14 usage=90
[07:12:25] 
[07:12:25] Working on 1174 p10111_ubiquitin_300K
[07:12:26] Client config found, loading data.
[07:12:27] Starting GUI Server
[07:13:09] Completed 1%
[07:13:10] mdrun_gpu returned 
[07:13:10] NANs detected on GPU
[07:13:10] 
[07:13:10] Folding@home Core Shutdown: UNSTABLE_MACHINE
[07:13:13] CoreStatus = 7A (122)
[07:13:13] Sending work to server
[07:13:13] Project: 10111 (Run 876, Clone 4, Gen 145)
[07:13:13] - Error: Could not get length of results file work/wuresults_01.dat
[07:13:13] - Error: Could not read unit 01 file. Removing from queue.
[07:13:13] Trying to send all finished work units
[07:13:13] + No unsent completed units remaining.
[07:13:13] - Preparing to get new work unit...
[07:13:13] Cleaning up work directory
[07:13:13] + Attempting to get work packet
[07:13:13] Passkey found
[07:13:13] - Will indicate memory of 6134 MB
[07:13:13] Gpu type=2 species=13.
[07:13:13] - Connecting to assignment server
[07:13:13] Connecting to xxxx
[07:13:14] Posted data.
[07:13:14] Initial: 40AB; - Successful: assigned to (171.64.65.71).
[07:13:14] + News From Folding@Home: Welcome to Folding@Home
[07:13:14] Loaded queue successfully.
[07:13:14] Gpu type=2 species=13.
[07:13:14] Sent data
[07:13:14] Connecting to xxx
[07:13:14] Posted data.
[07:13:15] Initial: 0000; - Receiving payload (expected size: 82385)
[07:13:15] Conversation time very short, giving reduced weight in bandwidth avg
[07:13:15] - Downloaded at ~160 kB/s
[07:13:15] - Averaged speed for that direction ~160 kB/s
[07:13:15] + Received work.
[07:13:15] Trying to send all finished work units
[07:13:15] + No unsent completed units remaining.
[07:13:15] + Closed connections
[07:13:20] 
[07:13:20] + Processing work unit
[07:13:20] Core required: FahCore_11.exe
[07:13:20] Core found.
[07:13:20] Working on queue slot 02 [March 16 07:13:20 UTC]
[07:13:20] + Working ...
[07:13:20] - Calling '.\FahCore_11.exe -dir work/ -suffix 02 -nice 19 -checkpoint 15 -verbose -lifeline 1224 -version 641'
...and so on, until the client eventually exceeds the maximum number of EUEs (Early Unit Ends) and sleeps for 24 hours.
OS: Vista 64-bit
GPU: GTX295, *very* underclocked and running cool (~70 degrees).
I have tried both the GPU2 and GPU3 clients, reinstalling after deleting all data, and running with different start and target directories.

Any suggestions? Can I skip this unit and try another?

Mod Edit: Added Code Tags & Made PRCG Thread Title - PantherX

Re: Project: 10111 (Run 876, Clone 4, Gen 145)

Posted: Wed Mar 16, 2011 6:49 pm
by PantherX
Welcome to the F@H Forum ChrisMu,

I checked the WU and it was completed successfully by a donor. The username doesn't match your forum name, and the FAHlog doesn't show your username, so I am unsure whether it was you or not. If your username is different from your forum name, you can either post it here or PM it to me.
Your WU (P10111 R876 C4 G145) was added to the stats database on 2011-03-16 06:07:26 for 494 points of credit.
If this was the first time your GPU has encountered an error, then you can manually delete the WU (viewtopic.php?f=19&t=16526).
If you are seeing this error with different WUs, then you will have to troubleshoot it, so please provide us with more details. You can also look at this thread for some ideas (viewtopic.php?f=59&t=15196&start=15).
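
To check whether the error is tied to one WU or shows up across different WUs, something like the following could be run over the text of the FAHlog. This is only a rough sketch, not an official tool: the function name `failed_work_units` is made up for illustration, and it assumes the log lines look like the ones posted above (a "Project: ... (Run ..., Clone ..., Gen ...)" line followed later by "Folding@home Core Shutdown: UNSTABLE_MACHINE").

```python
import re
from collections import Counter

# Patterns based on the log lines shown in this thread (assumed format).
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def failed_work_units(log_text):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen).

    Hypothetical helper: it remembers the most recent "Project:" line and
    attributes each UNSTABLE_MACHINE shutdown to that work unit.
    """
    failures = Counter()
    current = None  # PRCG of the work unit currently being processed
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        match = SHUTDOWN_RE.search(line)
        if match and match.group(1) == "UNSTABLE_MACHINE" and current is not None:
            failures[current] += 1
    return failures


if __name__ == "__main__":
    sample = (
        "[07:12:19] Project: 10111 (Run 876, Clone 4, Gen 145)\n"
        "[07:13:10] NANs detected on GPU\n"
        "[07:13:10] Folding@home Core Shutdown: UNSTABLE_MACHINE\n"
    )
    for prcg, count in failed_work_units(sample).items():
        print(prcg, count)
```

If only one PRCG appears in the result, deleting that WU is probably enough. If several different PRCGs appear, the problem is more likely on the hardware or driver side and needs troubleshooting.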