Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 12:12 pm
by Fahrenheit451
Just FYI: WU quit with BAD_WORK_UNIT

Client used: v7.1.52

Code:


09:05:57:WU01:FS00:News: Welcome to Folding@Home
09:05:57:WU01:FS00:Assigned to work server 171.67.108.11
09:05:57:WU01:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:"G92 [GeForce 8800 GTS 512]" from 171.67.108.11
09:05:57:WU01:FS00:Connecting to 171.67.108.11:8080
09:05:58:WU01:FS00:Downloading 44.82KiB
09:05:59:WU01:FS00:Download complete
09:05:59:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:OK project:5769 run:4 clone:20 gen:2247 core:0x11 unit:0x57addfb850923bbc08c7001400041689
09:06:52:WU00:FS00:0x11:Completed 100%
09:06:52:WU00:FS00:0x11:Successful run
09:06:52:WU00:FS00:0x11:DynamicWrapper: Finished Work Unit: sleep=10000
09:07:02:WU00:FS00:0x11:Reserved 75872 bytes for xtc file; Cosm status=0
09:07:02:WU00:FS00:0x11:Allocated 75872 bytes for xtc file
09:07:02:WU00:FS00:0x11:- Reading up to 75872 from "00/wudata_01.xtc": Read 75872
09:07:02:WU00:FS00:0x11:Read 75872 bytes from xtc file; available packet space=786354592
09:07:02:WU00:FS00:0x11:xtc file hash check passed.
09:07:02:WU00:FS00:0x11:Reserved 15168 15168 786354592 bytes for arc file=<00/wudata_01.trr> Cosm status=0
09:07:02:WU00:FS00:0x11:Allocated 15168 bytes for arc file
09:07:02:WU00:FS00:0x11:- Reading up to 15168 from "00/wudata_01.trr": Read 15168
09:07:03:WU00:FS00:0x11:Read 15168 bytes from arc file; available packet space=786339424
09:07:03:WU00:FS00:0x11:trr file hash check passed.
09:07:03:WU00:FS00:0x11:Allocated 560 bytes for edr file
09:07:03:WU00:FS00:0x11:Read bedfile
09:07:03:WU00:FS00:0x11:edr file hash check passed.
09:07:03:WU00:FS00:0x11:Allocated 11240 bytes for logfile
09:07:03:WU00:FS00:0x11:Read logfile
09:07:03:WU00:FS00:0x11:GuardedRun: success in DynamicWrapper
09:07:03:WU00:FS00:0x11:GuardedRun: done
09:07:03:WU00:FS00:0x11:Run: GuardedRun completed.
09:07:05:WU00:FS00:0x11:+ Opened results file
09:07:05:WU00:FS00:0x11:- Writing 103352 bytes of core data to disk...
09:07:05:WU00:FS00:0x11:Done: 102840 -> 95728 (compressed to 93.0 percent)
09:07:05:WU00:FS00:0x11:  ... Done.
09:07:05:WU00:FS00:0x11:DeleteFrameFiles: successfully deleted file=00/wudata_01.ckp
09:07:05:WU00:FS00:0x11:Shutting down core 
09:07:05:WU00:FS00:0x11:
09:07:05:WU00:FS00:0x11:Folding@home Core Shutdown: FINISHED_UNIT
09:07:05:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
09:07:05:WU00:FS00:Sending unit results: id:00 state:SEND error:OK project:5770 run:3 clone:419 gen:2024 core:0x11 unit:0x16739eb3509226a707e801a30003168a
09:07:05:WU00:FS00:Uploading 93.98KiB to 171.67.108.11
09:07:05:WU00:FS00:Connecting to 171.67.108.11:8080
09:07:06:WU01:FS00:Starting
09:07:06:WU01:FS00:Running FahCore: D:\Programme\FAHClient/FAHCoreWrapper.exe C:/Users/Stephan/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/x86/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 01 -suffix 01 -version 701 -lifeline 3344 -checkpoint 15 -gpu 0
09:07:06:WU01:FS00:Started FahCore on PID 5892
09:07:06:WU01:FS00:Core PID:5712
09:07:06:WU01:FS00:FahCore 0x11 started
09:07:06:WU01:FS00:0x11:
09:07:06:WU01:FS00:0x11:*------------------------------*
09:07:06:WU01:FS00:0x11:Folding@Home GPU Core
09:07:06:WU01:FS00:0x11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
09:07:06:WU01:FS00:0x11:
09:07:06:WU01:FS00:0x11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
09:07:06:WU01:FS00:0x11:Build host: amoeba
09:07:06:WU01:FS00:0x11:Board Type: Nvidia
09:07:06:WU01:FS00:0x11:Core      : 
09:07:06:WU01:FS00:0x11:Preparing to commence simulation
09:07:06:WU01:FS00:0x11:- Looking at optimizations...
09:07:06:WU01:FS00:0x11:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
09:07:06:WU01:FS00:0x11:- Created dyn
09:07:06:WU01:FS00:0x11:- Files status OK
09:07:06:WU01:FS00:0x11:- Expanded 45385 -> 251112 (decompressed 553.2 percent)
09:07:06:WU01:FS00:0x11:Called DecompressByteArray: compressed_data_size=45385 data_size=251112, decompressed_data_size=251112 diff=0
09:07:06:WU01:FS00:0x11:- Digital signature verified
09:07:06:WU01:FS00:0x11:
09:07:06:WU01:FS00:0x11:Project: 5769 (Run 4, Clone 20, Gen 2247)
09:07:06:WU01:FS00:0x11:
09:07:06:WU01:FS00:0x11:Assembly optimizations on if available.
09:07:06:WU01:FS00:0x11:Entering M.D.
09:07:08:WU00:FS00:Upload complete
09:07:08:WU00:FS00:Server responded WORK_ACK (400)
09:07:08:WU00:FS00:Cleaning up
09:07:12:WU01:FS00:0x11:Tpr hash 01/wudata_01.tpr:  1021512327 925865220 2188789024 3123121485 544809619
09:07:12:WU01:FS00:0x11:
09:07:12:WU01:FS00:0x11:Calling fah_main args: 14 usage=100
09:07:12:WU01:FS00:0x11:
09:07:12:WU01:FS00:0x11:Working on Protein
09:07:13:WU01:FS00:0x11:Client config unavailable.
09:07:13:WU01:FS00:0x11:Starting GUI Server
09:08:08:WU01:FS00:0x11:Completed 1%
09:09:05:WU01:FS00:0x11:Completed 2%
09:10:00:WU01:FS00:0x11:Completed 3%
09:10:56:WU01:FS00:0x11:Completed 4%
09:11:51:WU01:FS00:0x11:Completed 5%
09:12:46:WU01:FS00:0x11:Completed 6%
09:13:42:WU01:FS00:0x11:Completed 7%
09:14:36:WU01:FS00:0x11:Completed 8%
09:15:32:WU01:FS00:0x11:Completed 9%
09:16:28:WU01:FS00:0x11:Completed 10%
09:17:23:WU01:FS00:0x11:Completed 11%
09:18:19:WU01:FS00:0x11:Completed 12%
09:19:14:WU01:FS00:0x11:Completed 13%
09:20:10:WU01:FS00:0x11:Completed 14%
09:21:06:WU01:FS00:0x11:Completed 15%
09:22:01:WU01:FS00:0x11:Completed 16%
09:22:54:WU01:FS00:0x11:Completed 17%
09:23:48:WU01:FS00:0x11:Completed 18%
09:24:41:WU01:FS00:0x11:Completed 19%
09:25:35:WU01:FS00:0x11:Completed 20%
09:26:28:WU01:FS00:0x11:Completed 21%
09:27:21:WU01:FS00:0x11:Completed 22%
09:28:15:WU01:FS00:0x11:Completed 23%
09:29:07:WU01:FS00:0x11:Completed 24%
09:30:00:WU01:FS00:0x11:Completed 25%
09:30:55:WU01:FS00:0x11:Completed 26%
09:31:49:WU01:FS00:0x11:Completed 27%
09:32:44:WU01:FS00:0x11:Completed 28%
09:33:40:WU01:FS00:0x11:Completed 29%
09:34:34:WU01:FS00:0x11:Completed 30%
09:35:28:WU01:FS00:0x11:Completed 31%
09:36:22:WU01:FS00:0x11:Completed 32%
09:37:14:WU01:FS00:0x11:Completed 33%
09:38:07:WU01:FS00:0x11:Completed 34%
09:39:02:WU01:FS00:0x11:Completed 35%
09:39:54:WU01:FS00:0x11:Completed 36%
09:40:48:WU01:FS00:0x11:Completed 37%
09:41:41:WU01:FS00:0x11:Completed 38%
09:42:34:WU01:FS00:0x11:Completed 39%
09:43:27:WU01:FS00:0x11:Completed 40%
09:44:20:WU01:FS00:0x11:Completed 41%
09:45:14:WU01:FS00:0x11:Completed 42%
09:46:07:WU01:FS00:0x11:Completed 43%
09:47:01:WU01:FS00:0x11:Completed 44%
09:47:54:WU01:FS00:0x11:Completed 45%
09:48:47:WU01:FS00:0x11:Completed 46%
09:49:39:WU01:FS00:0x11:Completed 47%
09:50:32:WU01:FS00:0x11:Completed 48%
09:51:26:WU01:FS00:0x11:Completed 49%
09:52:19:WU01:FS00:0x11:Completed 50%
09:53:12:WU01:FS00:0x11:Completed 51%
09:54:06:WU01:FS00:0x11:Completed 52%
09:54:59:WU01:FS00:0x11:Completed 53%
09:55:53:WU01:FS00:0x11:Completed 54%
09:56:46:WU01:FS00:0x11:Completed 55%
09:57:40:WU01:FS00:0x11:Completed 56%
09:58:34:WU01:FS00:0x11:Completed 57%
09:59:27:WU01:FS00:0x11:Completed 58%
10:00:20:WU01:FS00:0x11:Completed 59%
10:01:15:WU01:FS00:0x11:Completed 60%
10:02:08:WU01:FS00:0x11:Completed 61%
10:03:01:WU01:FS00:0x11:Completed 62%
10:03:55:WU01:FS00:0x11:Completed 63%
10:04:48:WU01:FS00:0x11:Completed 64%
10:05:42:WU01:FS00:0x11:Completed 65%
10:06:35:WU01:FS00:0x11:Completed 66%
10:07:29:WU01:FS00:0x11:Completed 67%
10:08:22:WU01:FS00:0x11:Completed 68%
10:09:16:WU01:FS00:0x11:Completed 69%
10:10:09:WU01:FS00:0x11:Completed 70%
10:11:03:WU01:FS00:0x11:Completed 71%
10:11:57:WU01:FS00:0x11:Completed 72%
10:12:51:WU01:FS00:0x11:Completed 73%
10:13:45:WU01:FS00:0x11:Completed 74%
10:14:39:WU01:FS00:0x11:Completed 75%
10:15:32:WU01:FS00:0x11:Completed 76%
10:16:26:WU01:FS00:0x11:Completed 77%
10:17:20:WU01:FS00:0x11:Completed 78%
10:18:13:WU01:FS00:0x11:Completed 79%
10:19:07:WU01:FS00:0x11:Completed 80%
10:20:01:WU01:FS00:0x11:Completed 81%
10:20:54:WU01:FS00:0x11:Completed 82%
10:21:47:WU01:FS00:0x11:Completed 83%
10:22:40:WU01:FS00:0x11:Completed 84%
10:23:33:WU01:FS00:0x11:Completed 85%
10:24:25:WU01:FS00:0x11:Completed 86%
10:25:18:WU01:FS00:0x11:Completed 87%
10:26:10:WU01:FS00:0x11:Completed 88%
10:27:04:WU01:FS00:0x11:Completed 89%
10:27:57:WU01:FS00:0x11:Completed 90%
10:28:49:WU01:FS00:0x11:Completed 91%
10:29:43:WU01:FS00:0x11:Completed 92%
10:30:36:WU01:FS00:0x11:Completed 93%
10:31:30:WU01:FS00:0x11:Completed 94%
10:32:23:WU01:FS00:0x11:Completed 95%
10:32:49:WU01:FS00:0x11:Run: exception thrown during GuardedRun
10:32:49:WU01:FS00:0x11:Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
10:32:49:WU01:FS00:0x11:Going to send back what have done -- stepsTotalG=15000000
10:32:49:WU01:FS00:0x11:Work fraction=0.9547 steps=15000000.
10:32:54:WU01:FS00:0x11:logfile size=17645 infoLength=17645 edr=0 trr=23
10:32:54:WU01:FS00:0x11:+ Opened results file
10:32:54:WU01:FS00:0x11:- Writing 18181 bytes of core data to disk...
10:32:54:WU01:FS00:0x11:Done: 17669 -> 4911 (compressed to 27.7 percent)
10:32:54:WU01:FS00:0x11:  ... Done.
10:32:54:WU01:FS00:0x11:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
10:32:54:WU01:FS00:0x11:
10:32:54:WU01:FS00:0x11:Folding@home Core Shutdown: EARLY_UNIT_END
10:32:54:WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
10:32:55:WU01:FS00:Sending unit results: id:01 state:SEND error:FAULTY project:5769 run:4 clone:20 gen:2247 core:0x11 unit:0x57addfb850923bbc08c7001400041689
10:32:55:WU01:FS00:Uploading 5.30KiB to 171.67.108.11
10:32:55:WU01:FS00:Connecting to 171.67.108.11:8080
10:32:55:WU00:FS00:Connecting to assign-GPU.stanford.edu:80
10:32:56:WU01:FS00:Upload complete
10:32:56:WU01:FS00:Server responded WORK_ACK (400)
10:32:56:WU01:FS00:Cleaning up
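
For anyone who wants to pull the core exit status out of a long log like the one above, here is a minimal Python sketch. It assumes only the line layout visible in this log (`HH:MM:SS:WUnn:FSnn:FahCore returned: NAME (code = 0xhex)`); the function name `core_returns` and the regular expression are illustrative, not part of the FAH client, and other client versions may format these lines differently.

```python
import re

# Matches lines shaped like the ones in the log above, e.g.
#   10:32:54:WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
# The layout is inferred from this one log only (assumption).
RETURN_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"FahCore returned: (?P<status>\w+) \((?P<code>\d+) = 0x[0-9a-fA-F]+\)"
)


def core_returns(log_text):
    """Return (time, wu, slot, status, code) for every 'FahCore returned' line."""
    results = []
    for line in log_text.splitlines():
        match = RETURN_RE.match(line.strip())
        if match:
            results.append(
                (
                    match["time"],
                    match["wu"],
                    match["slot"],
                    match["status"],
                    int(match["code"]),
                )
            )
    return results


# Two lines copied from the log in this post.
sample = """\
09:07:05:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
10:32:54:WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
"""

for entry in core_returns(sample):
    print(entry)
```

Run against the full log, this should list one FINISHED_UNIT for WU00 and one BAD_WORK_UNIT for WU01, which makes it easier to spot the failing unit without scrolling through the progress lines.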


Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 12:20 pm
by bollix47
Thank you for your report.

Your failure is the only return in our database so far but we will keep checking for other returns.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 12:26 pm
by Fahrenheit451
OK. Thank you for your feedback.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 3:11 pm
by Jesse_V
You should upgrade to 7.2.9. Download on the homepage.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 3:16 pm
by bollix47
Another folder was able to complete the WU for full points:

Hi xxxxxx (team xxxxxx),
Your WU (P5769 R4 C20 G2247) was added to the stats database on 2012-11-01 06:02:42 for 353 points of credit.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 5:41 pm
by Fahrenheit451
Jesse_V wrote:You should upgrade to 7.2.9. Download on the homepage.
I don't want to play beta tester. On the official FAH download page I can only find 7.1.52.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 5:48 pm
by Joe_H
7.2.9 is no longer beta. It has been the suggested download from http://folding.stanford.edu/English/HomePage since earlier this week. This was announced in the blog, http://folding.typepad.com/news/2012/10 ... lable.html.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Thu Nov 01, 2012 7:31 pm
by Jesse_V
Fahrenheit451 wrote:
Jesse_V wrote:You should upgrade to 7.2.9. Download on the homepage.
I don't want to play beta tester. On the official FAH download page I can only find 7.1.52.
Refresh the page.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Fri Nov 02, 2012 5:29 pm
by Rolo
I got BAD_WORK_UNIT a couple of times as well (likely on a 57xx project); HOWEVER, it was on a couple of GTX260s (each in a different machine) on which I'm also getting UNSTABLE_MACHINE, so I'm assuming hardware is the root cause for both. Apparently the factory overclock (and even the nVidia stock clocks) fails every several hours; I am currently looking for the highest stable clocks (it's looking like 576/1188/1080/1.125v is it).

Is my assumption correct that BAD_WORK_UNIT is most likely due to hardware?

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Fri Nov 02, 2012 10:33 pm
by bruce
I'd probably say "very likely" rather than "most likely" (That's slightly less certain, isn't it?).

Traditionally we say that if there has been only one failure, it's fair to attribute it to a bad WU. If there have been several, it's hardware. When trying to attribute a cause to statistical events, it's hard to be certain.
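
As a rough illustration of that rule of thumb, here is a small Python sketch. The function name `likely_cause`, the set of statuses counted as failures, and the wording of the verdicts are all assumptions made for the example; it is a heuristic, not a diagnosis, and not anything the client or servers actually compute.

```python
# Statuses treated as failures for this sketch (assumption: only the two
# error names mentioned in this thread are counted).
FAILURE_STATUSES = {"BAD_WORK_UNIT", "UNSTABLE_MACHINE"}


def likely_cause(statuses):
    """Apply the one-failure / several-failures rule of thumb.

    statuses: iterable of FahCore return names seen on one machine.
    """
    failures = sum(1 for status in statuses if status in FAILURE_STATUSES)
    if failures == 0:
        return "no failures seen"
    if failures == 1:
        return "single failure: fair to suspect the WU"
    return "several failures: suspect hardware"


# One bad unit among good ones, as in the original report.
print(likely_cause(["FINISHED_UNIT", "BAD_WORK_UNIT"]))

# Repeated errors on the same card, as described for the GTX260s.
print(likely_cause(["BAD_WORK_UNIT", "UNSTABLE_MACHINE", "BAD_WORK_UNIT"]))
```

The thresholds are deliberately crude; with only a handful of events, either verdict stays a guess.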

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Fri Nov 02, 2012 11:58 pm
by Rolo
"We demand rigidly defined areas of doubt and uncertainty!" :D

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Sat Nov 03, 2012 4:05 pm
by Rolo
I got a couple more BAD_WORK_UNIT errors, but I also got more UNSTABLE_MACHINE at still-low clocks (500/1000/1000).
I'm going to start another topic for the latter; I can probably use more eyes on the problem.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Sat Nov 03, 2012 5:38 pm
by 7im
More eyes won't solve an unstable overclock. Lower the clocks, increase the voltage, increase the airflow. Pick your poison.

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Sat Nov 03, 2012 5:42 pm
by P5-133XL
Also check your RAM

Re: Project: 5769 (Run 4, Clone 20, Gen 2247) BAD_WORK_UNIT

Posted: Sat Nov 03, 2012 7:55 pm
by bruce
You also could have a bad GPU. Some of the GeForce 8000 series had troubles with chips cracking, though I don't remember the details.
P5-133XL wrote:Also check your RAM
Both main RAM and GPU VRAM. See the information on MemtestG80 at http://folding.stanford.edu/English/DownloadUtils