BAD_WORK_UNIT
Posted: Sun Dec 21, 2014 9:56 pm
by gaitskill
It seems like every couple of days FAH reports “BAD_WORK_UNIT”. It happens often enough that I suspect these are not really bad work units.
The error is always “Error downloading array energyBuffer: clEnqueueReadBuffer”. It happens both on core 17 and core 18.
Fah client 7.4.4, ASUS GTX 750 Ti, NVIDIA 340.52, Windows 7.
Any ideas on the cause of these errors?
Re: BAD_WORK_UNIT
Posted: Sun Dec 21, 2014 10:22 pm
by davidcoton
Three standard checks, which may or may not reveal a cause:
1) Any overclocking? Try backing off if there is.
2) What temperature is the GPU running at?
3) Is the power supply adequate (12V rail)?
If that doesn't help, please post a log including the initial (config) sections and a WU that fails.
Re: BAD_WORK_UNIT
Posted: Mon Dec 22, 2014 4:35 am
by gaitskill
1. There is a factory overclock. Using ASUS GPU Tweak I lowered the boost clock to NVIDIA's specified 1085 MHz. I didn’t find a way to reduce the base clock. I believe the memory clock is on spec.
2. The GPU is running at about 50 degrees C.
3. The power supply can put out 54A on 12V+, more than enough.
It may take a day or two to get another failure.
Re: BAD_WORK_UNIT
Posted: Mon Dec 22, 2014 2:11 pm
by gaitskill
It didn't take long to get the error.
*********************** Log Started 2014-12-22T00:59:33Z ***********************
00:59:33:WU01:FS00:Starting
00:59:33:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18.exe -dir 01 -suffix 01 -version 704 -lifeline 5644 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
00:59:33:WU01:FS00:Started FahCore on PID 6652
00:59:33:WU01:FS00:Core PID:6676
00:59:33:WU01:FS00:FahCore 0x18 started
00:59:33:WU01:FS00:0x18:*********************** Log Started 2014-12-22T00:59:33Z ***********************
00:59:33:WU01:FS00:0x18:Project: 10471 (Run 0, Clone 136, Gen 108)
00:59:33:WU01:FS00:0x18:Unit: 0x00000085538b3dbb53beaa9ffea51ba1
00:59:33:WU01:FS00:0x18:CPU: 0x00000000000000000000000000000000
00:59:33:WU01:FS00:0x18:Machine: 0
00:59:33:WU01:FS00:0x18:Digital signatures verified
00:59:33:WU01:FS00:0x18:Folding@home GPU core18
00:59:33:WU01:FS00:0x18:Version 0.0.3
00:59:33:WU01:FS00:0x18: Found a checkpoint file
00:59:54:WU01:FS00:0x18:Completed 625000 out of 5000000 steps (12%)
00:59:57:WU01:FS00:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
01:10:31:WU01:FS00:0x18:Completed 650000 out of 5000000 steps (13%)
01:33:12:WU01:FS00:0x18:Completed 700000 out of 5000000 steps (14%)
01:55:30:WU01:FS00:0x18:Completed 750000 out of 5000000 steps (15%)
02:17:14:WU01:FS00:0x18:Completed 800000 out of 5000000 steps (16%)
02:39:40:WU01:FS00:0x18:Completed 850000 out of 5000000 steps (17%)
03:09:13:WU01:FS00:0x18:Completed 900000 out of 5000000 steps (18%)
03:34:03:WU01:FS00:0x18:Completed 950000 out of 5000000 steps (19%)
03:56:06:WU01:FS00:0x18:Completed 1000000 out of 5000000 steps (20%)
04:17:52:WU01:FS00:0x18:Completed 1050000 out of 5000000 steps (21%)
04:39:26:WU01:FS00:0x18:Completed 1100000 out of 5000000 steps (22%)
04:53:40:WU01:FS00:0x18:ERROR:exception: Error downloading array energyBuffer: clEnqueueReadBuffer (-36)
04:53:40:WU01:FS00:0x18:Saving result file logfile_01.txt
04:53:40:WU01:FS00:0x18:Saving result file log.txt
04:53:40:WU01:FS00:0x18:Folding@home Core Shutdown: BAD_WORK_UNIT
04:54:00:WARNING:WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
04:54:00:WU01:FS00:Sending unit results: id:01 state:SEND error:FAULTY project:10471 run:0 clone:136 gen:108 core:0x18 unit:0x00000085538b3dbb53beaa9ffea51ba1
04:54:00:WU01:FS00:Uploading 3.00KiB to 140.163.4.235
04:54:00:WU01:FS00:Connecting to 140.163.4.235:8080
04:54:00:WU00:FS00:Connecting to 171.67.108.200:80
04:54:00:WU01:FS00:Upload complete
04:54:00:WU01:FS00:Server responded WORK_ACK (400)
04:54:00:WU01:FS00:Cleaning up
04:54:01:WU00:FS00:Assigned to work server 171.67.108.52
04:54:01:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM107 [GeForce GTX 750 Ti] from 171.67.108.52
04:54:01:WU00:FS00:Connecting to 171.67.108.52:8080
04:54:02:WU00:FS00:Downloading 1.53MiB
04:54:02:WU00:FS00:Download complete
04:54:02:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9201 run:674 clone:2 gen:119 core:0x17 unit:0x000000aa6652edc45399f08dacf8da6d
04:54:02:WU00:FS00:Starting
04:54:02:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 704 -lifeline 5644 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
04:54:02:WU00:FS00:Started FahCore on PID 11064
04:54:03:WU00:FS00:Core PID:9916
04:54:03:WU00:FS00:FahCore 0x17 started
04:54:03:WU00:FS00:0x17:*********************** Log Started 2014-12-22T04:54:03Z ***********************
04:54:03:WU00:FS00:0x17:Project: 9201 (Run 674, Clone 2, Gen 119)
04:54:03:WU00:FS00:0x17:Unit: 0x000000aa6652edc45399f08dacf8da6d
04:54:03:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
04:54:03:WU00:FS00:0x17:Machine: 0
04:54:03:WU00:FS00:0x17:Reading tar file state.xml
04:54:03:WU00:FS00:0x17:Reading tar file system.xml
04:54:03:WU00:FS00:0x17:Reading tar file integrator.xml
04:54:03:WU00:FS00:0x17:Reading tar file core.xml
04:54:03:WU00:FS00:0x17:Digital signatures verified
04:54:03:WU00:FS00:0x17:Folding@home GPU core17
04:54:03:WU00:FS00:0x17:Version 0.0.52
04:54:36:WU00:FS00:0x17:Completed 0 out of 5000000 steps (0%)
04:54:36:WU00:FS00:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
04:59:21:WU00:FS00:0x17:Completed 50000 out of 5000000 steps (1%)
05:04:07:WU00:FS00:0x17:Completed 100000 out of 5000000 steps (2%)
05:08:52:WU00:FS00:0x17:Completed 150000 out of 5000000 steps (3%)
05:13:39:WU00:FS00:0x17:Completed 200000 out of 5000000 steps (4%)
05:18:24:WU00:FS00:0x17:Completed 250000 out of 5000000 steps (5%)
05:23:11:WU00:FS00:0x17:Completed 300000 out of 5000000 steps (6%)
05:27:56:WU00:FS00:0x17:Completed 350000 out of 5000000 steps (7%)
05:32:42:WU00:FS00:0x17:Completed 400000 out of 5000000 steps (8%)
05:37:28:WU00:FS00:0x17:Completed 450000 out of 5000000 steps (9%)
05:42:14:WU00:FS00:0x17:Completed 500000 out of 5000000 steps (10%)
05:47:00:WU00:FS00:0x17:Completed 550000 out of 5000000 steps (11%)
05:51:45:WU00:FS00:0x17:Completed 600000 out of 5000000 steps (12%)
05:56:31:WU00:FS00:0x17:Completed 650000 out of 5000000 steps (13%)
06:01:17:WU00:FS00:0x17:Completed 700000 out of 5000000 steps (14%)
06:06:03:WU00:FS00:0x17:Completed 750000 out of 5000000 steps (15%)
06:10:48:WU00:FS00:0x17:Completed 800000 out of 5000000 steps (16%)
06:15:34:WU00:FS00:0x17:Completed 850000 out of 5000000 steps (17%)
06:20:20:WU00:FS00:0x17:Completed 900000 out of 5000000 steps (18%)
06:25:06:WU00:FS00:0x17:Completed 950000 out of 5000000 steps (19%)
06:29:51:WU00:FS00:0x17:Completed 1000000 out of 5000000 steps (20%)
06:34:37:WU00:FS00:0x17:Completed 1050000 out of 5000000 steps (21%)
06:39:23:WU00:FS00:0x17:Completed 1100000 out of 5000000 steps (22%)
06:44:09:WU00:FS00:0x17:Completed 1150000 out of 5000000 steps (23%)
06:48:54:WU00:FS00:0x17:Completed 1200000 out of 5000000 steps (24%)
06:53:40:WU00:FS00:0x17:Completed 1250000 out of 5000000 steps (25%)
06:58:26:WU00:FS00:0x17:Completed 1300000 out of 5000000 steps (26%)
******************************* Date: 2014-12-22 *******************************
07:03:13:WU00:FS00:0x17:Completed 1350000 out of 5000000 steps (27%)
07:07:58:WU00:FS00:0x17:Completed 1400000 out of 5000000 steps (28%)
07:12:44:WU00:FS00:0x17:Completed 1450000 out of 5000000 steps (29%)
07:17:31:WU00:FS00:0x17:Completed 1500000 out of 5000000 steps (30%)
07:22:17:WU00:FS00:0x17:Completed 1550000 out of 5000000 steps (31%)
07:27:04:WU00:FS00:0x17:Completed 1600000 out of 5000000 steps (32%)
07:31:50:WU00:FS00:0x17:Completed 1650000 out of 5000000 steps (33%)
07:36:37:WU00:FS00:0x17:Completed 1700000 out of 5000000 steps (34%)
07:41:24:WU00:FS00:0x17:Completed 1750000 out of 5000000 steps (35%)
07:46:10:WU00:FS00:0x17:Completed 1800000 out of 5000000 steps (36%)
07:50:56:WU00:FS00:0x17:Completed 1850000 out of 5000000 steps (37%)
07:55:42:WU00:FS00:0x17:Completed 1900000 out of 5000000 steps (38%)
08:00:29:WU00:FS00:0x17:Completed 1950000 out of 5000000 steps (39%)
08:05:16:WU00:FS00:0x17:Completed 2000000 out of 5000000 steps (40%)
08:10:02:WU00:FS00:0x17:Completed 2050000 out of 5000000 steps (41%)
08:14:48:WU00:FS00:0x17:Completed 2100000 out of 5000000 steps (42%)
08:19:35:WU00:FS00:0x17:Completed 2150000 out of 5000000 steps (43%)
08:24:21:WU00:FS00:0x17:Completed 2200000 out of 5000000 steps (44%)
08:29:07:WU00:FS00:0x17:Completed 2250000 out of 5000000 steps (45%)
08:33:53:WU00:FS00:0x17:Completed 2300000 out of 5000000 steps (46%)
08:38:39:WU00:FS00:0x17:Completed 2350000 out of 5000000 steps (47%)
08:43:25:WU00:FS00:0x17:Completed 2400000 out of 5000000 steps (48%)
08:48:12:WU00:FS00:0x17:Completed 2450000 out of 5000000 steps (49%)
08:52:57:WU00:FS00:0x17:Completed 2500000 out of 5000000 steps (50%)
08:57:44:WU00:FS00:0x17:Completed 2550000 out of 5000000 steps (51%)
09:02:31:WU00:FS00:0x17:Completed 2600000 out of 5000000 steps (52%)
09:07:17:WU00:FS00:0x17:Completed 2650000 out of 5000000 steps (53%)
09:12:02:WU00:FS00:0x17:Completed 2700000 out of 5000000 steps (54%)
09:16:48:WU00:FS00:0x17:Completed 2750000 out of 5000000 steps (55%)
09:21:34:WU00:FS00:0x17:Completed 2800000 out of 5000000 steps (56%)
09:26:21:WU00:FS00:0x17:Completed 2850000 out of 5000000 steps (57%)
09:31:07:WU00:FS00:0x17:Completed 2900000 out of 5000000 steps (58%)
09:35:53:WU00:FS00:0x17:Completed 2950000 out of 5000000 steps (59%)
09:40:39:WU00:FS00:0x17:Completed 3000000 out of 5000000 steps (60%)
09:45:26:WU00:FS00:0x17:Completed 3050000 out of 5000000 steps (61%)
09:50:12:WU00:FS00:0x17:Completed 3100000 out of 5000000 steps (62%)
09:54:58:WU00:FS00:0x17:Completed 3150000 out of 5000000 steps (63%)
09:59:44:WU00:FS00:0x17:Completed 3200000 out of 5000000 steps (64%)
10:04:30:WU00:FS00:0x17:Completed 3250000 out of 5000000 steps (65%)
10:09:16:WU00:FS00:0x17:Completed 3300000 out of 5000000 steps (66%)
10:14:02:WU00:FS00:0x17:Completed 3350000 out of 5000000 steps (67%)
10:18:47:WU00:FS00:0x17:Completed 3400000 out of 5000000 steps (68%)
10:23:33:WU00:FS00:0x17:Completed 3450000 out of 5000000 steps (69%)
10:28:19:WU00:FS00:0x17:Completed 3500000 out of 5000000 steps (70%)
10:33:05:WU00:FS00:0x17:Completed 3550000 out of 5000000 steps (71%)
10:37:51:WU00:FS00:0x17:Completed 3600000 out of 5000000 steps (72%)
10:42:41:WU00:FS00:0x17:Completed 3650000 out of 5000000 steps (73%)
10:47:27:WU00:FS00:0x17:Completed 3700000 out of 5000000 steps (74%)
10:52:13:WU00:FS00:0x17:Completed 3750000 out of 5000000 steps (75%)
10:56:58:WU00:FS00:0x17:Completed 3800000 out of 5000000 steps (76%)
11:01:45:WU00:FS00:0x17:Completed 3850000 out of 5000000 steps (77%)
11:06:30:WU00:FS00:0x17:Completed 3900000 out of 5000000 steps (78%)
11:11:16:WU00:FS00:0x17:Completed 3950000 out of 5000000 steps (79%)
11:16:02:WU00:FS00:0x17:Completed 4000000 out of 5000000 steps (80%)
11:20:47:WU00:FS00:0x17:Completed 4050000 out of 5000000 steps (81%)
11:25:33:WU00:FS00:0x17:Completed 4100000 out of 5000000 steps (82%)
11:30:18:WU00:FS00:0x17:Completed 4150000 out of 5000000 steps (83%)
11:35:04:WU00:FS00:0x17:Completed 4200000 out of 5000000 steps (84%)
11:39:51:WU00:FS00:0x17:Completed 4250000 out of 5000000 steps (85%)
11:44:36:WU00:FS00:0x17:Completed 4300000 out of 5000000 steps (86%)
11:49:22:WU00:FS00:0x17:Completed 4350000 out of 5000000 steps (87%)
11:54:08:WU00:FS00:0x17:Completed 4400000 out of 5000000 steps (88%)
11:58:54:WU00:FS00:0x17:Completed 4450000 out of 5000000 steps (89%)
12:03:40:WU00:FS00:0x17:Completed 4500000 out of 5000000 steps (90%)
12:08:26:WU00:FS00:0x17:Completed 4550000 out of 5000000 steps (91%)
12:13:12:WU00:FS00:0x17:Completed 4600000 out of 5000000 steps (92%)
12:17:58:WU00:FS00:0x17:Completed 4650000 out of 5000000 steps (93%)
12:22:44:WU00:FS00:0x17:Completed 4700000 out of 5000000 steps (94%)
12:27:30:WU00:FS00:0x17:Completed 4750000 out of 5000000 steps (95%)
12:32:16:WU00:FS00:0x17:Completed 4800000 out of 5000000 steps (96%)
12:37:01:WU00:FS00:0x17:Completed 4850000 out of 5000000 steps (97%)
12:41:47:WU00:FS00:0x17:Completed 4900000 out of 5000000 steps (98%)
12:46:33:WU00:FS00:0x17:Completed 4950000 out of 5000000 steps (99%)
12:46:34:WU02:FS00:Connecting to 171.67.108.200:80
12:46:34:WU02:FS00:Assigned to work server 171.64.65.93
12:46:34:WU02:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:GM107 [GeForce GTX 750 Ti] from 171.64.65.93
12:46:34:WU02:FS00:Connecting to 171.64.65.93:8080
12:46:36:WU02:FS00:Downloading 3.87MiB
12:46:37:WU02:FS00:Download complete
12:46:37:WU02:FS00:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:9105 run:5 clone:6 gen:0 core:0x18 unit:0x000000000a3b1e81546bd1e833576548
12:51:19:WU00:FS00:0x17:Completed 5000000 out of 5000000 steps (100%)
12:51:23:WU00:FS00:0x17:Saving result file logfile_01.txt
12:51:23:WU00:FS00:0x17:Saving result file checkpointState.xml
12:51:23:WU00:FS00:0x17:Saving result file checkpt.crc
12:51:23:WU00:FS00:0x17:Saving result file log.txt
12:51:24:WU00:FS00:0x17:Saving result file positions.xtc
12:51:26:WU00:FS00:0x17:Folding@home Core Shutdown: FINISHED_UNIT
12:51:26:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:51:26:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:9201 run:674 clone:2 gen:119 core:0x17 unit:0x000000aa6652edc45399f08dacf8da6d
12:51:26:WU00:FS00:Uploading 8.41MiB to 171.67.108.52
12:51:26:WU00:FS00:Connecting to 171.67.108.52:8080
12:51:26:WU02:FS00:Starting
12:51:26:WU02:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18.exe -dir 02 -suffix 01 -version 704 -lifeline 5644 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
12:51:26:WU02:FS00:Started FahCore on PID 9740
12:51:26:WU02:FS00:Core PID:6880
12:51:26:WU02:FS00:FahCore 0x18 started
12:51:27:WU02:FS00:0x18:*********************** Log Started 2014-12-22T12:51:27Z ***********************
12:51:27:WU02:FS00:0x18:Project: 9105 (Run 5, Clone 6, Gen 0)
12:51:27:WU02:FS00:0x18:Unit: 0x000000000a3b1e81546bd1e833576548
12:51:27:WU02:FS00:0x18:CPU: 0x00000000000000000000000000000000
12:51:27:WU02:FS00:0x18:Machine: 0
12:51:27:WU02:FS00:0x18:Reading tar file system.xml
12:51:27:WU02:FS00:0x18:Reading tar file integrator.xml
12:51:27:WU02:FS00:0x18:Reading tar file state.xml
12:51:28:WU02:FS00:0x18:Reading tar file core.xml
12:51:28:WU02:FS00:0x18:Digital signatures verified
12:51:28:WU02:FS00:0x18:Folding@home GPU core18
12:51:28:WU02:FS00:0x18:Version 0.0.3
12:51:32:WU00:FS00:Upload 30.46%
12:51:38:WU00:FS00:Upload 66.86%
12:51:47:WU00:FS00:Upload complete
12:51:47:WU00:FS00:Server responded WORK_ACK (400)
12:51:47:WU00:FS00:Final credit estimate, 22178.00 points
12:51:47:WU00:FS00:Cleaning up
12:52:05:WU02:FS00:0x18:Completed 0 out of 2500000 steps (0%)
12:52:05:WU02:FS00:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
******************************* Date: 2014-12-22 *******************************
13:01:18:WU02:FS00:0x18:Completed 25000 out of 2500000 steps (1%)
13:10:20:WU02:FS00:0x18:Completed 50000 out of 2500000 steps (2%)
13:19:22:WU02:FS00:0x18:Completed 75000 out of 2500000 steps (3%)
13:28:24:WU02:FS00:0x18:Completed 100000 out of 2500000 steps (4%)
13:37:38:WU02:FS00:0x18:Completed 125000 out of 2500000 steps (5%)
13:46:40:WU02:FS00:0x18:Completed 150000 out of 2500000 steps (6%)
13:55:42:WU02:FS00:0x18:Completed 175000 out of 2500000 steps (7%)
14:04:46:WU02:FS00:0x18:Completed 200000 out of 2500000 steps (8%)
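For anyone who wants to pull the failures out of a long log like the one above, here is a minimal Python sketch. It assumes the log line layout shown in this post and assumes that the number in parentheses after `clEnqueueReadBuffer` is the raw OpenCL status code. The function name `find_core_errors` and the small lookup table are my own, not part of FAHClient; the code names are taken from the Khronos `cl.h` header.

```python
import re

# Small subset of OpenCL status codes (values as defined in the Khronos cl.h header).
OPENCL_ERRORS = {
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -38: "CL_INVALID_MEM_OBJECT",
}

# Matches core error lines such as:
# 04:53:40:WU01:FS00:0x18:ERROR:exception: ... clEnqueueReadBuffer (-36)
ERROR_RE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d):(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"(?P<core>0x[0-9a-fA-F]+):ERROR:(?P<msg>.*?)(?:\s*\((?P<code>-?\d+)\))?$"
)


def find_core_errors(lines):
    """Return one dict per FahCore ERROR line (hypothetical helper)."""
    found = []
    for line in lines:
        m = ERROR_RE.match(line.strip())
        if not m:
            continue
        # The trailing "(n)" is optional; treat it as an OpenCL status if present.
        code = int(m["code"]) if m["code"] is not None else None
        found.append({
            "time": m["time"],
            "wu": m["wu"],
            "core": m["core"],
            "message": m["msg"],
            "code": code,
            "code_name": OPENCL_ERRORS.get(code, "unknown"),
        })
    return found


sample = [
    "04:39:26:WU01:FS00:0x18:Completed 1100000 out of 5000000 steps (22%)",
    "04:53:40:WU01:FS00:0x18:ERROR:exception: Error downloading array "
    "energyBuffer: clEnqueueReadBuffer (-36)",
]
errors = find_core_errors(sample)
for e in errors:
    print(e["time"], e["wu"], e["core"], e["code"], e["code_name"])
```

If that reading of the code is right, -36 is `CL_INVALID_COMMAND_QUEUE`, meaning the command queue was not valid when the read was attempted. That is an OpenCL-level failure rather than a statement about the work unit data itself.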
Re: BAD_WORK_UNIT
Posted: Mon Dec 22, 2014 3:05 pm
by toTOW
It's a bit surprising that you get failures with Fahcore 18 ...
GTX 750 Ti uses a Maxwell GPU, and there is a bug in OpenCL drivers for this GPU (Maxwell is used in GTX 750 Ti, GTX 970 and GTX 980). I think the symptoms are exactly those you describe.
- With fahcore 17, a small portion of WUs will work with very interesting speeds.
- The remaining will end in errors like those you get.
- With fahcore 18, there is a workaround that will allow you to have a 100% success rate at the cost of a huge performance hit (about 50% of fahcore 17 speed when it works).
Pande Group and nVidia have been in contact about the issue, and the last information I read about it is that nVidia had spotted the bug, but hasn't yet given an ETA for when it will be resolved and/or integrated into a driver release.
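To put a rough number on the speed difference from your own log, a sketch like the following may help. It assumes the "Completed N out of M steps" line format shown in the log earlier in this thread; the function name `seconds_per_percent` is made up for illustration. It averages between the first and last progress lines it is given, so feed it lines from a single work unit and leave out the first line after a checkpoint resume.

```python
import re

# Matches progress lines such as:
# 04:59:21:WU00:FS00:0x17:Completed 50000 out of 5000000 steps (1%)
PROGRESS_RE = re.compile(
    r"^(\d\d):(\d\d):(\d\d):WU\d+:FS\d+:0x[0-9a-fA-F]+:"
    r"Completed (\d+) out of (\d+) steps"
)


def seconds_per_percent(lines):
    """Average wall-clock seconds per 1% of progress (hypothetical helper)."""
    points = []
    for line in lines:
        m = PROGRESS_RE.match(line.strip())
        if m:
            h, mi, s, done, total = map(int, m.groups())
            points.append((h * 3600 + mi * 60 + s, 100.0 * done / total))
    if len(points) < 2:
        return None
    (t0, p0), (t1, p1) = points[0], points[-1]
    if p1 <= p0:
        return None
    # Log times carry no date, so tolerate a single midnight rollover.
    elapsed = (t1 - t0) % 86400
    return elapsed / (p1 - p0)


core17_lines = [
    "04:59:21:WU00:FS00:0x17:Completed 50000 out of 5000000 steps (1%)",
    "05:04:07:WU00:FS00:0x17:Completed 100000 out of 5000000 steps (2%)",
]
core18_lines = [
    "13:01:18:WU02:FS00:0x18:Completed 25000 out of 2500000 steps (1%)",
    "13:10:20:WU02:FS00:0x18:Completed 50000 out of 2500000 steps (2%)",
]
core17_rate = seconds_per_percent(core17_lines)
core18_rate = seconds_per_percent(core18_lines)
print(core17_rate, core18_rate)
```

On those two excerpts this gives 286 s per 1% for the core 17 unit and 542 s per 1% for the core 18 unit. The projects and step counts differ, so treat it as an indication only, not a like-for-like benchmark.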
Re: BAD_WORK_UNIT
Posted: Mon Dec 22, 2014 3:38 pm
by gwildperson
I would run memchkCL to confirm whether you're having memory errors on your gpu or not.
Re: BAD_WORK_UNIT
Posted: Mon Dec 22, 2014 9:15 pm
by gaitskill
Did not find memchkCL with google.
Did try memtestcl set at 1800MB memory, 5400 MHz memory clock, 1072 GPU clock, 100 iterations.
The test showed zero failures. It probably needs to run at least overnight for any confidence.
Re: BAD_WORK_UNIT
Posted: Mon Dec 22, 2014 9:29 pm
by 7im
Unless the ambient temps vary quite a bit overnight, there is no need to run it for very long. It usually finds any memory issues very quickly.
See an existing discussion on this error.
viewtopic.php?p=269876#p269876
Re: BAD_WORK_UNIT
Posted: Tue Dec 23, 2014 9:37 pm
by gaitskill
I’ve lowered the base/boost clocks (they are linked together) to set the base clock to the NVIDIA reference value of 1020 MHz.
Bumped memtestcl up to 1000 iterations and it ran without errors.
From what you guys are telling me, I won’t get rid of the FAH errors until NVIDIA fixes the driver.
Thanks for all of your help.
Re: BAD_WORK_UNIT
Posted: Tue Dec 23, 2014 10:01 pm
by bollix47
https://folding.stanford.edu/home/upgra ... or-core17/
You could try adding the slot option client-type advanced to see if you have more success ... please let us know if it doesn't help.
Also, have you tried a clean install of 344.80 drivers? I have used them successfully with Windows 7 64-bit.
Re: BAD_WORK_UNIT
Posted: Wed Dec 24, 2014 8:45 am
by rwh202
Hi,
If it's only core_18 causing problems, then as bollix47 suggests, the adv flag ought to limit you to core_17 and 15 for now.
The only time I can recall seeing that particular error message was on a GTX 770 (Kepler, not Maxwell), and it was due to inadequate system memory. Since upgrading from 2 GB to a fresh 4 GB there have been no recurrences. If you're similarly memory limited, maybe that would help?
My 6 Maxwell cards have all been running happily on everything given to them over the last 2 months using a mix of drivers, so they might not be efficient, but they are stable with the latest cores and the subset of WUs that are being given out.
Re: BAD_WORK_UNIT
Posted: Wed Dec 24, 2014 6:51 pm
by gaitskill
Added client-type advanced
Upgraded to 344.80 drivers
Have 16GB ram
Thanks all