Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Thu May 01, 2014 9:13 pm
by mikepcap
Been folding for years, but I'm a total newbie on GPUs.
I'm constantly getting nothing but bad work units on this project. Brand-new, stock, out-of-the-box Gateway quad-core computer with gpu:0:BeaverCreek [Radeon HD 6530D]. No OC, no additional drivers. Windows 8.1. I assigned two CPUs and they run every WU just fine. Is this an example of those WUs that just won't work on my particular GPU, which I've read about in other threads about 1300x?

Code:

20:47:20:WU03:FS01:FahCore 0x17 started
20:47:21:WU03:FS01:0x17:*********************** Log Started 2014-05-01T20:47:20Z ***********************
20:47:21:WU03:FS01:0x17:Project: 13000 (Run 399, Clone 0, Gen 14)
20:47:21:WU03:FS01:0x17:Unit: 0x00000022538b3db753100c7048913243
20:47:21:WU03:FS01:0x17:CPU: 0x00000000000000000000000000000000
20:47:21:WU03:FS01:0x17:Machine: 1
20:47:21:WU03:FS01:0x17:Reading tar file state.xml
20:47:22:WU03:FS01:0x17:Reading tar file system.xml
20:47:22:WU03:FS01:0x17:Reading tar file integrator.xml
20:47:22:WU03:FS01:0x17:Reading tar file core.xml
20:47:22:WU03:FS01:0x17:Digital signatures verified
20:47:22:WU03:FS01:0x17:Folding@home GPU core17
20:47:22:WU03:FS01:0x17:Version 0.0.52
20:53:39:WU03:FS01:0x17:ERROR:exception: Force RMSE error of 156.976 with threshold of 5
20:53:39:WU03:FS01:0x17:Saving result file logfile_01.txt
20:53:39:WU03:FS01:0x17:Saving result file log.txt
20:53:39:WU03:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
20:53:40:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
20:53:40:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13000 run:399 clone:0 gen:14 core:0x17 unit:0x00000022538b3db753100c7048913243
20:53:40:WU03:FS01:Uploading 2.32KiB to 140.163.4.231
20:53:40:WU03:FS01:Connecting to 140.163.4.231:8080
20:53:40:WU00:FS01:Connecting to 171.67.108.201:80
20:53:40:WU03:FS01:Upload complete
20:53:40:WU03:FS01:Server responded WORK_ACK (400)
20:53:40:WU03:FS01:Cleaning up
20:53:40:WU00:FS01:Assigned to work server 140.163.4.231
20:53:40:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:BeaverCreek [Radeon HD 6530D] from 140.163.4.231
20:53:40:WU00:FS01:Connecting to 140.163.4.231:8080
20:53:41:WU00:FS01:Downloading 4.84MiB
20:53:45:WU00:FS01:Download complete
20:53:45:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13000 run:1334 clone:0 gen:8 core:0x17 unit:0x0000001d538b3db7531114abe23bb5e0
20:53:46:WU00:FS01:Starting
20:53:46:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/mikepcap/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 704 -lifeline 5064 -checkpoint 6 -gpu 1 -gpu-vendor ati
20:53:46:WU00:FS01:Started FahCore on PID 2028
20:53:46:WU00:FS01:Core PID:1108
20:53:46:WU00:FS01:FahCore 0x17 started
20:53:46:WU00:FS01:0x17:*********************** Log Started 2014-05-01T20:53:46Z ***********************
20:53:46:WU00:FS01:0x17:Project: 13000 (Run 1334, Clone 0, Gen 8)
20:53:46:WU00:FS01:0x17:Unit: 0x0000001d538b3db7531114abe23bb5e0
20:53:46:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
20:53:46:WU00:FS01:0x17:Machine: 1
20:53:46:WU00:FS01:0x17:Reading tar file state.xml
20:53:47:WU00:FS01:0x17:Reading tar file system.xml
20:53:48:WU00:FS01:0x17:Reading tar file integrator.xml
20:53:48:WU00:FS01:0x17:Reading tar file core.xml
20:53:48:WU00:FS01:0x17:Digital signatures verified
20:53:48:WU00:FS01:0x17:Folding@home GPU core17
20:53:48:WU00:FS01:0x17:Version 0.0.52
Mod Edit: Changed Quote Tags To Code Tags - PantherX
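
For anyone who wants to pull these failures out of a long log.txt without scrolling, here is a minimal sketch in Python. It assumes the line layout seen in the log above (a "Project:" line followed later by a "Force RMSE error" line for the same WU); the function name find_rmse_failures is made up for illustration, and other client versions may format their lines differently.

```python
import re

# Patterns assume the log layout shown in this thread; adjust if your
# client prints these lines differently.
PROJECT_RE = re.compile(
    r"(WU\d+):FS\d+:0x17:Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
RMSE_RE = re.compile(
    r"(WU\d+):FS\d+:0x17:ERROR:exception: Force RMSE error of ([\d.]+) "
    r"with threshold of ([\d.]+)"
)


def find_rmse_failures(lines):
    """Return (project, run, clone, gen, rmse, threshold) per RMSE error."""
    current = {}  # WU id -> (project, run, clone, gen) last seen for that WU
    failures = []
    for line in lines:
        m = PROJECT_RE.search(line)
        if m:
            current[m.group(1)] = tuple(int(g) for g in m.groups()[1:])
            continue
        m = RMSE_RE.search(line)
        if m and m.group(1) in current:
            failures.append(
                current[m.group(1)] + (float(m.group(2)), float(m.group(3)))
            )
    return failures


sample = [
    "20:47:21:WU03:FS01:0x17:Project: 13000 (Run 399, Clone 0, Gen 14)",
    "20:53:39:WU03:FS01:0x17:ERROR:exception: Force RMSE error of 156.976 with threshold of 5",
]
print(find_rmse_failures(sample))
```

To run it against a real log, pass the open file instead of the sample list, e.g. find_rmse_failures(open("log.txt")).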

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Thu May 01, 2014 9:18 pm
by PantherX
Can you please tell us what Driver version you are using? I would suggest that you install the latest (14.4 WHQL) and see if you are getting the same error or not. Also, make sure that your GPU temperatures are within normal range. A useful tool is GPU-Z (http://www.techpowerup.com/downloads/SysInfo/GPU-Z/) which can give you a good overview of your GPU in an easy-to-understand interface.

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Thu May 01, 2014 9:19 pm
by bruce
Have you updated your GPU drivers from AMD?

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Thu May 01, 2014 10:35 pm
by mikepcap
OK, first off, thank you for such a quick reply!
Ran GPU-Z: max temp was 56C over about the last 10 minutes with FAH running in the background.
Other details: AMD Radeon HD 6530D, 64-bit, release date: June 15, 2011, 512 MB memory, bandwidth: 12.8 GB/s, GPU clock: 444 MHz, Mem: 800 MHz
Driver according to GPU-Z was atiumdag 13.251.0.0 /win8.1.64
Just installed the new drivers from AMD as suggested in this thread. Now at:
Driver according to GPU-Z is atiumdag 14.100.0.0 (Catalyst 14.4)/win8.1.64

Unfortunately still getting pretty much the same error:

Code:

22:22:23:WU03:FS01:Started FahCore on PID 4808
22:22:23:WU03:FS01:Core PID:4268
22:22:23:WU03:FS01:FahCore 0x17 started
22:22:24:WU03:FS01:0x17:*********************** Log Started 2014-05-01T22:22:23Z ***********************
22:22:24:WU03:FS01:0x17:Project: 13001 (Run 202, Clone 4, Gen 15)
22:22:24:WU03:FS01:0x17:Unit: 0x00000025538b3db753288909030f01d3
22:22:24:WU03:FS01:0x17:CPU: 0x00000000000000000000000000000000
22:22:24:WU03:FS01:0x17:Machine: 1
22:22:24:WU03:FS01:0x17:Reading tar file state.xml
22:22:25:WU03:FS01:0x17:Reading tar file system.xml
22:22:25:WU03:FS01:0x17:Reading tar file integrator.xml
22:22:25:WU03:FS01:0x17:Reading tar file core.xml
22:22:25:WU03:FS01:0x17:Digital signatures verified
22:22:25:WU03:FS01:0x17:Folding@home GPU core17
22:22:25:WU03:FS01:0x17:Version 0.0.52
22:28:35:WU03:FS01:0x17:ERROR:exception: Force RMSE error of 157.031 with threshold of 5
22:28:35:WU03:FS01:0x17:Saving result file logfile_01.txt
22:28:35:WU03:FS01:0x17:Saving result file log.txt
22:28:35:WU03:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
22:28:35:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:28:35:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13001 run:202 clone:4 gen:15 core:0x17 unit:0x00000025538b3db753288909030f01d3
22:28:35:WU03:FS01:Uploading 2.32KiB to 140.163.4.231
22:28:35:WU03:FS01:Connecting to 140.163.4.231:8080
22:28:35:WU03:FS01:Upload complete
22:28:36:WU03:FS01:Server responded WORK_ACK (400)
22:28:36:WU03:FS01:Cleaning up
22:28:36:WU00:FS01:Connecting to 171.67.108.201:80
22:28:36:WU00:FS01:Assigned to work server 140.163.4.231
22:28:36:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:BeaverCreek [Radeon HD 6530D] from 140.163.4.231
22:28:36:WU00:FS01:Connecting to 140.163.4.231:8080
22:28:37:WU00:FS01:Downloading 4.84MiB
22:28:41:WU00:FS01:Download complete
22:28:41:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13001 run:452 clone:6 gen:11 core:0x17 unit:0x0000001a538b3db7532c7c584fcd5421
22:28:41:WU00:FS01:Starting
22:28:41:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/mikepcap/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 704 -lifeline 4340 -checkpoint 3 -gpu 1 -gpu-vendor ati
22:28:41:WU00:FS01:Started FahCore on PID 4248
22:28:41:WU00:FS01:Core PID:1944
22:28:41:WU00:FS01:FahCore 0x17 started
22:28:42:WU00:FS01:0x17:*********************** Log Started 2014-05-01T22:28:41Z ***********************
22:28:42:WU00:FS01:0x17:Project: 13001 (Run 452, Clone 6, Gen 11)
22:28:42:WU00:FS01:0x17:Unit: 0x0000001a538b3db7532c7c584fcd5421
22:28:42:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
22:28:42:WU00:FS01:0x17:Machine: 1
22:28:42:WU00:FS01:0x17:Reading tar file state.xml
22:28:43:WU00:FS01:0x17:Reading tar file system.xml
22:28:43:WU00:FS01:0x17:Reading tar file integrator.xml
22:28:43:WU00:FS01:0x17:Reading tar file core.xml
22:28:43:WU00:FS01:0x17:Digital signatures verified
22:28:43:WU00:FS01:0x17:Folding@home GPU core17
22:28:43:WU00:FS01:0x17:Version 0.0.52
Mod edit: Added Code tags to log

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Fri May 02, 2014 12:30 am
by bruce
Project: 13001 (Run 452, Clone 6, Gen 11) has been reassigned several times and all have failed.

Project: 13000 (Run 399, Clone 0, Gen 14) has a single failure report (yours) and the results of the reassignment have not yet been returned.

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Fri May 02, 2014 11:11 am
by mikepcap
Oh, here are some more 13000, 13001 ERROR reports that have come through just in the last hour or two. Basically every single one.
22:16:08:WU00:FS01:0x17:Project: 13000 (Run 816, Clone 1, Gen 5)
22:22:24:WU03:FS01:0x17:Project: 13001 (Run 202, Clone 4, Gen 15)
22:28:42:WU00:FS01:0x17:Project: 13001 (Run 452, Clone 6, Gen 11)
22:35:01:WU03:FS01:0x17:Project: 13000 (Run 976, Clone 4, Gen 9)
22:41:16:WU00:FS01:0x17:Project: 13001 (Run 111, Clone 1, Gen 5)
22:47:32:WU03:FS01:0x17:Project: 13000 (Run 68, Clone 0, Gen 23)
22:53:48:WU00:FS01:0x17:Project: 13000 (Run 1677, Clone 0, Gen 11)
23:00:07:WU03:FS01:0x17:Project: 13000 (Run 776, Clone 0, Gen 8)
23:06:22:WU00:FS01:0x17:Project: 13000 (Run 813, Clone 0, Gen 13)
:Project: 13000 (Run 872, Clone 0, Gen 12)

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Fri May 02, 2014 12:15 pm
by P5-133XL
Project: 13000 (Run 816, Clone 1, Gen 5) -- Several have attempted it but no one has completed it yet. Reported as a bad WU.
Project: 13001 (Run 202, Clone 4, Gen 15) -- Only you have run this one.
Project: 13001 (Run 452, Clone 6, Gen 11) -- Several have attempted it but no one has completed it yet.
Project: 13000 (Run 976, Clone 4, Gen 9) -- Only you and one other have tried and both failed.
Project: 13001 (Run 111, Clone 1, Gen 5) -- Several have attempted it but no one has completed it yet. Reported as a bad WU
Project: 13000 (Run 68, Clone 0, Gen 23) -- Several have attempted it but no one has completed it yet.
Project: 13000 (Run 1677, Clone 0, Gen 11) -- A couple have tried, but no one has succeeded yet.
Project: 13000 (Run 776, Clone 0, Gen 8) -- Several have attempted it but no one has completed it yet. Reported as a bad WU
Project: 13000 (Run 813, Clone 0, Gen 13) -- Only you and one other have tried and both failed.
Project: 13000 (Run 872, Clone 0, Gen 12) -- Several have attempted it but no one has completed it yet.

*Note, My personal threshold is 6 failures with no successes before I report the WU as bad.
**Note, There is no indication that the problem is at your end. No one has succeeded where you have failed.

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sat May 03, 2014 1:26 am
by mikepcap
"Project: 13000 (Run 813, Clone 0, Gen 13) -- Only you and one other have tried and both failed.
Project: 13000 (Run 872, Clone 0, Gen 12) -- Several have attempted it but no one has completed it yet."

Is there some place that WE can look up stats like this before posting here?
It's been 2 days and no GOOD WU yet from 13000-1. Don't get points for finding broken ones.

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sat May 03, 2014 1:54 am
by Joe_H
mikepcap wrote: Is there some place that WE can look up stats like this before posting here?
Due to stats database performance issues, PG has provided the lookup tool only to forum admins and moderators.

As for the number of bad WU's you have seen, that is highly unusual. Eventually you should get some good WU's and then see whether or not your 6530 is capable of processing Core_17 projects.

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sat May 03, 2014 9:00 am
by mikepcap
"As for the number of bad WU's you have seen, that is highly unusual."
Or you simply need to admit that all the good ones have been depleted and all that are left are previously rejected BAD ones. Here are the next 10 in a row. What a waste of resources. I'll be mining bitcoins until you all sort this mess out.

03:02:04:WU04:FS01:Sending unit results: id:04 state:SEND error:FAULTY project:13001 run:121 clone:7 gen:11 core:0x17 unit:0x0000001a538b3db753287218aa44bac3
03:08:27:WU06:FS01:Sending unit results: id:06 state:SEND error:FAULTY project:13000 run:696 clone:3 gen:9 core:0x17 unit:0x00000019538b3db753106024a382c517
03:14:56:WU04:FS01:Sending unit results: id:04 state:SEND error:FAULTY project:13000 run:1732 clone:0 gen:8 core:0x17 unit:0x0000001a538b3db753118568fa55afad
03:21:20:WU06:FS01:Sending unit results: id:06 state:SEND error:FAULTY project:13000 run:1238 clone:0 gen:10 core:0x17 unit:0x00000013538b3db75310f96eefbb8ff7
03:27:43:WU04:FS01:Sending unit results: id:04 state:SEND error:FAULTY project:13000 run:813 clone:1 gen:7 core:0x17 unit:0x00000013538b3db75310811478065034
03:34:12:WU06:FS01:Sending unit results: id:06 state:SEND error:FAULTY project:13000 run:720 clone:0 gen:12 core:0x17 unit:0x0000001c538b3db7531066d53037701d
03:40:35:WU04:FS01:Sending unit results: id:04 state:SEND error:FAULTY project:13000 run:823 clone:0 gen:6 core:0x17 unit:0x00000018538b3db7531083e4319ff8a8
03:47:06:WU06:FS01:Sending unit results: id:06 state:SEND error:FAULTY project:13001 run:163 clone:6 gen:16 core:0x17 unit:0x00000020538b3db753287e012b1b6f39
03:53:32:WU04:FS01:Sending unit results: id:04 state:SEND error:FAULTY project:13001 run:280 clone:2 gen:15 core:0x17 unit:0x0000001e538b3db753289f1a1d848e6b
03:59:57:WU06:FS01:Sending unit results: id:06 state:SEND error:FAULTY project:13001 run:181 clone:5 gen:9 core:0x17 unit:0x0000001c538b3db7532883178dfd180e
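
A run of "Sending unit results" lines like the ones above can be tallied per project with a few lines of Python. This is only a sketch: it assumes the field order shown in these log lines (error, project, run, clone, gen), and count_results is a hypothetical helper name, not part of the FAH client.

```python
import re
from collections import Counter

# Field order assumed from the log lines quoted in this thread.
RESULT_RE = re.compile(
    r"Sending unit results: id:\d+ state:SEND error:(\w+) "
    r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)


def count_results(lines):
    """Count returned work units keyed by (project, error status)."""
    counts = Counter()
    for line in lines:
        m = RESULT_RE.search(line)
        if m:
            counts[(int(m.group(2)), m.group(1))] += 1
    return counts


sample = [
    "03:02:04:WU04:FS01:Sending unit results: id:04 state:SEND error:FAULTY project:13001 run:121 clone:7 gen:11 core:0x17 unit:0x0000001a538b3db753287218aa44bac3",
    "03:08:27:WU06:FS01:Sending unit results: id:06 state:SEND error:FAULTY project:13000 run:696 clone:3 gen:9 core:0x17 unit:0x00000019538b3db753106024a382c517",
]
for (project, status), n in sorted(count_results(sample).items()):
    print(project, status, n)
```

Any line whose status is something other than FAULTY is counted under its own key, so a single successful return would stand out immediately in the output.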

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sat May 03, 2014 10:09 am
by bollix47
Please post the System Info and Config sections of your log.txt file as explained here.
mikepcap wrote:Or you simply need to admit that all the good ones have been depleted and all that are left are previously rejected BAD ones.
If that were true it would not explain why most folders are not experiencing this problem and are returning successfully completed work units for P13000/1 every day.

It can be very frustrating when something like this happens but the reasons are usually found eventually. I agree with you in that until the source of the problem is determined it's probably better to suspend folding on that GPU temporarily while we gather more information. It may be something as simple as not having enough ram on that GPU ... 512MB is on the low side and depending on what other functions your GPU is trying to perform that may not leave enough for folding. Another area to look at is your Power Options ... are they set at maximum performance? ... have you disabled sleep and hibernation? ... have you disabled your screen saver? ... have you disabled the option that turns off your display? So many questions but no specific answers ... it can be frustrating for us too. :ewink:

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sat May 03, 2014 5:21 pm
by P5-133XL
Project: 13001 (Run 202, Clone 4, Gen 15) -- Successfully completed by someone else.
Project: 13001 (Run 452, Clone 6, Gen 11) -- Successfully completed by someone else.
Project: 13000 (Run 976, Clone 4, Gen 9) -- Successfully completed by someone else.
Project: 13000 (Run 68, Clone 0, Gen 23) -- Several have attempted it but no one has completed it yet.
Project: 13000 (Run 1677, Clone 0, Gen 11) -- Several have attempted it but no one has completed it yet.
Project: 13000 (Run 776, Clone 0, Gen 8) -- Successfully completed by someone else.
Project: 13000 (Run 813, Clone 0, Gen 13) -- Only you and one other have tried and both failed.
Project: 13000 (Run 872, Clone 0, Gen 12) -- Several have attempted it but no one has completed it yet.
project:13001 run:121 clone:7 gen:11 -- Several have attempted it but no one has completed it yet.
project:13000 run:696 clone:3 gen:9 -- Several have attempted it but no one has completed it yet. Reported as a bad WU.
project:13000 run:1732 clone:0 gen:8 -- Only you and one other have tried and both failed.
project:13000 run:1238 clone:0 gen:10 -- Several have attempted it but no one has completed it yet. Reported as a bad WU.
project:13000 run:813 clone:1 gen:7 -- Several have attempted it but no one has completed it yet.
project:13000 run:720 clone:0 gen:12 -- Several have attempted it but no one has completed it yet.
project:13000 run:823 clone:0 gen:6 -- Several have attempted it but no one has completed it yet.
project:13001 run:163 clone:6 gen:16 -- Successfully completed by someone else.
project:13001 run:280 clone:2 gen:15 -- Successfully completed by someone else.
project:13001 run:181 clone:5 gen:9 -- Several have attempted it but no one has completed it yet.

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sat May 03, 2014 5:36 pm
by kyleb
We are looking into the bad WU problem for p13000. I expect to be posting back early next week with a solution.

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sat May 03, 2014 8:02 pm
by bruce
So apparently kyleb is suggesting that there may be something that can be changed in P13000.

It seems like we're looking at a combination of two problems. From the results posted by P5-133XL above, six were successfully completed by someone else and twelve have failed at least one more time. In other words, 6/18=33% of the problems seem to be caused by something that mikepcap can change and 12/18=67% might be caused by something that kyleb can change. It's unlikely that either of you can prevent all errors, but I suggest that both of you need to change something.
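
The split quoted above is easy to re-check as more results come back. A minimal sketch, with the two counts read off P5-133XL's status list by hand (the variable names and the percent helper are illustrative only):

```python
from fractions import Fraction

# Counts read off P5-133XL's status list earlier in the thread.
completed_by_others = 6   # "Successfully completed by someone else"
not_yet_completed = 12    # every other entry in that list
total = completed_by_others + not_yet_completed


def percent(part, whole):
    """Share of `whole` as a whole-number percentage, rounded to nearest."""
    return round(100 * Fraction(part, whole))


print(total, percent(completed_by_others, total), percent(not_yet_completed, total))
```

Updating the two counts after the next status check shows whether the balance is shifting toward the hardware or toward the project.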

Re: Project: 13000 (Run 399, Clone 0, Gen 14) BAD_WORK_UNIT

Posted: Sun May 04, 2014 2:26 am
by mikepcap
> It may be something as simple as not having enough ram on that GPU ... 512MB is on the low side and depending on what other functions your GPU is trying to perform that may not leave enough for folding.

Don't know about that. Looking into whether it is possible to upgrade it.

>Another area to look at is your Power Options ... are they set at maximum performance? ...
Yep, re-confirmed

>have you disabled sleep and hibernation? ...
Yep, re-confirmed

>have you disabled your screen saver? ...
Yep, re-confirmed

> have you disabled the option that turns off your display?
Yep, re-confirmed

Let me know if there are any other settings I should check.