Project: 10512 (Run 6, Clone 506, Gen 13)

Posted: Tue Sep 21, 2010 9:12 am
by Tobit
It's been a while since I've had a GPU2 WU fail, but here is one. This is one of those pesky "exception thrown during GuardedRun" errors.

Code:

                       Folding@Home Client Version 6.23

Launch directory: C:\fah\gpu2-1
Executable: C:\fah\gpu2-1\Folding@home-Win32-GPU.exe
Arguments: -forcegpu nvidia_g80 -verbosity 9 -gpu 1 

[04:45:05] - Ask before connecting: No
[04:45:05] - User name: Tobit (Team 33)
[05:30:23] - Preparing to get new work unit...
[05:30:23] + Attempting to get work packet
[05:30:23] - Will indicate memory of 4094 MB
[05:30:23] - Connecting to assignment server
[05:30:23] Connecting to http://assign-GPU.stanford.edu:8080/
[05:30:24] Posted data.
[05:30:24] Initial: 40AB; - Successful: assigned to (171.64.65.61).
[05:30:24] + News From Folding@Home: Welcome to Folding@Home
[05:30:24] Loaded queue successfully.
[05:30:24] Connecting to http://171.64.65.61:8080/
[05:30:24] Posted data.
[05:30:24] Initial: 0000; - Receiving payload (expected size: 63403)
[05:30:25] - Downloaded at ~61 kB/s
[05:30:25] - Averaged speed for that direction ~66 kB/s
[05:30:25] + Received work.
[05:30:25] Trying to send all finished work units
[05:30:25] + No unsent completed units remaining.
[05:30:25] + Closed connections
[05:30:25] 
[05:30:25] + Processing work unit
[05:30:25] Core required: FahCore_11.exe
[05:30:25] Core found.
[05:30:25] Working on queue slot 00 [September 21 05:30:25 UTC]
[05:30:25] + Working ...
[05:30:25] - Calling '.\FahCore_11.exe -dir work/ -suffix 00 -checkpoint 15 -verbose -lifeline 1012 -version 623'
[05:30:25] 
[05:30:25] *------------------------------*
[05:30:25] Folding@Home GPU Core
[05:30:25] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[05:30:25] 
[05:30:25] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[05:30:25] Build host: amoeba
[05:30:25] Board Type: Nvidia
[05:30:25] Core      : 
[05:30:25] Preparing to commence simulation
[05:30:25] - Looking at optimizations...
[05:30:25] DeleteFrameFiles: successfully deleted file=work/wudata_00.ckp
[05:30:25] - Created dyn
[05:30:25] - Files status OK
[05:30:25] - Expanded 62891 -> 336076 (decompressed 534.3 percent)
[05:30:25] Called DecompressByteArray: compressed_data_size=62891 data_size=336076, decompressed_data_size=336076 diff=0
[05:30:25] - Digital signature verified
[05:30:25] 
[05:30:25] Project: 10512 (Run 6, Clone 506, Gen 13)
[05:30:25] 
[05:30:25] Assembly optimizations on if available.
[05:30:25] Entering M.D.
[05:30:31] Tpr hash work/wudata_00.tpr:  1629723231 2792467877 3584417046 2069530858 3972113310
[05:30:31] 
[05:30:31] Calling fah_main args: 14 usage=100
[05:30:31] 
[05:30:31] Working on Protein
[05:30:33] Client config found, loading data.
[05:30:33] Starting GUI Server
[05:32:27] Completed 1%
--- snip ---
[07:02:03] Completed 48%
[07:03:40] Run: exception thrown during GuardedRun
[07:03:40] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[07:03:40] Going to send back what have done -- stepsTotalG=15000000
[07:03:40] Work fraction=0.4884 steps=15000000.
[07:03:44] logfile size=19640 infoLength=19640 edr=0 trr=23
[07:03:44] + Opened results file
[07:03:44] - Writing 20176 bytes of core data to disk...
[07:03:44] Done: 19664 -> 5054 (compressed to 25.7 percent)
[07:03:44]   ... Done.
[07:03:44] DeleteFrameFiles: successfully deleted file=work/wudata_00.ckp
[07:03:44] 
[07:03:44] Folding@home Core Shutdown: UNSTABLE_MACHINE
[07:03:47] CoreStatus = 7A (122)
[07:03:47] Sending work to server
[07:03:47] Project: 10512 (Run 6, Clone 506, Gen 13)
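For anyone who wants to pull the key facts out of a log like the one above before posting, here is a minimal Python sketch. The function name `summarize_log` and the regular expressions are illustrative assumptions based only on the line formats visible in this log; other client or core versions may word these lines differently.

```python
import re

# Patterns mirror the line formats seen in the log above (assumed, not official).
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PROGRESS_RE = re.compile(r"Completed (\d+)%")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_log(text):
    """Collect the work unit ID, last progress mark, and exit status from a log."""
    summary = {"prcg": None, "last_percent": None,
               "shutdown": None, "core_status": None}
    for line in text.splitlines():
        m = PRCG_RE.search(line)
        if m and summary["prcg"] is None:
            # Keep the first Project/Run/Clone/Gen tuple that appears.
            summary["prcg"] = tuple(int(g) for g in m.groups())
        m = PROGRESS_RE.search(line)
        if m:
            # Later progress lines overwrite earlier ones.
            summary["last_percent"] = int(m.group(1))
        m = SHUTDOWN_RE.search(line)
        if m:
            summary["shutdown"] = m.group(1)
        m = STATUS_RE.search(line)
        if m:
            # The status is printed in hex, followed by its decimal value.
            summary["core_status"] = int(m.group(1), 16)
    return summary


SAMPLE = """\
[05:30:25] Project: 10512 (Run 6, Clone 506, Gen 13)
[07:02:03] Completed 48%
[07:03:40] Run: exception thrown during GuardedRun
[07:03:44] Folding@home Core Shutdown: UNSTABLE_MACHINE
[07:03:47] CoreStatus = 7A (122)
"""

if __name__ == "__main__":
    print(summarize_log(SAMPLE))
```

Run against the full log, this should report the same Project/Run/Clone/Gen, the last "Completed" percentage, and the shutdown reason, which are the details people usually ask for first in a failure report.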

Re: Project: 10512 (Run 6, Clone 506, Gen 13)

Posted: Tue Sep 21, 2010 9:22 pm
by sortofageek
The good news is that you got partial credit.
Hi Tobit (team 33),
Your WU (P10512 R6 C506 G13) was added to the stats database on 2010-09-21 00:09:22 for 286.7 points of credit.

The other news is that it went to two other people. Both completed it successfully.
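As a rough sanity check on the partial credit, the short Python sketch below relates the 286.7 points to the "Work fraction=0.4884" line in the log. It assumes credit scales linearly with the reported work fraction, which this thread does not confirm, so treat the implied full value as an estimate only.

```python
# Values taken from the log and the stats message in this thread.
work_fraction = 0.4884   # "Work fraction" reported by the core before shutdown
credit_awarded = 286.7   # points recorded in the stats database

# Assumption (not confirmed in the thread): credit = work_fraction * full_credit
implied_full_credit = credit_awarded / work_fraction

print(f"Implied full credit: {implied_full_credit:.1f} points")
```

Under that assumption, the full work unit would have been worth roughly 587 points.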

Re: Project: 10512 (Run 6, Clone 506, Gen 13)

Posted: Wed Sep 22, 2010 12:49 am
by Tobit
sortofageek wrote: The other news is that it went to two other people. Both completed it successfully.
I kind of suspected as much. The error I posted is very similar to a NaN, which usually means the WU didn't like something about the GPU. Thanks for confirming the points I actually received, though.

Re: Project: 10512 (Run 6, Clone 506, Gen 13)

Posted: Wed Sep 22, 2010 5:55 am
by 7im
It would be helpful to include a little system info, like what GPU type you had, if it was overclocked, etc. That way if a problem pattern develops, it's easier to spot.