GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_UNIT
Moderators: Site Moderators, FAHC Science Team
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Hey, thanks for the confirmation. I've been doing some digging and noticed something weird. As the GPU steps up through its power levels, the FAHBench benchmark https://fahbench.github.io/ falls over as soon as the GPU enters power level 3 (P3). I have never overclocked the GPU (GTX 1070), and the PSU is more than big enough to supply its 150 W.
The GPU will happily sit running a hacked-about version of the matrix multiply (just the multiply operation in an infinite loop, recompiled):
Again, thanks for the help.
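For anyone trying to catch the moment the card changes power level, here is a minimal Python sketch that polls `nvidia-smi` while the benchmark runs in another terminal. It assumes `nvidia-smi` is on the PATH and that the driver reports these query fields for the card; the function names are made up for illustration.

```python
import csv
import io
import subprocess
import time

# Fields passed to nvidia-smi's --query-gpu option.
FIELDS = ["timestamp", "pstate", "power.draw", "clocks.sm",
          "clocks.mem", "temperature.gpu"]


def parse_sample(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = next(csv.reader(io.StringIO(line), skipinitialspace=True))
    return dict(zip(FIELDS, (v.strip() for v in values)))


def read_sample(gpu_index=0):
    """Query the driver once. Needs a working NVIDIA driver and nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index),
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.strip().splitlines()[0])


def log_until_interrupted(interval_s=0.5):
    """Print one sample per interval; stop with Ctrl+C."""
    try:
        while True:
            print(read_sample())
            time.sleep(interval_s)
    except KeyboardInterrupt:
        pass


if __name__ == "__main__":
    log_until_interrupted()
```

The last sample printed before the benchmark dies shows which P-state, clocks and power draw the card had reached at that point.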
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
The GPU folding core does not use CUDA, it uses OpenCL.
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Yeah, I realise this. It was more a proof of concept that the GPU is there, will talk to the PC, and will run something without crashing.
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Some more debugging with FAHBench https://fahbench.github.io/...
The GPU benchmark runs to about 10% before falling over with either a "NaN" error or some random exception (usually from clEnqueueMapBuffer).
I'm not too sure what to do with this information, or how to debug further...
Thanks...
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
That's not a sign that your GPU is in good shape ...
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
That's what I thought.
I'm new to GPUs. I'm an electronic engineer, so I don't know the details of GPUs, Nvidia settings, parameters, etc., but I have a pragmatic, considered approach.
What I have learned this evening is that reducing the power limit makes things behave, and I can complete the test.
My current working theory is that the fault only shows up at a particular power level or clock speed, and the lower power limit keeps the GPU from getting there. The GPU came from a friend, and maybe he dabbled with overclocking it in the past and didn't remember. Thanks all.
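For reference, one way to lower the power limit on Linux is `nvidia-smi -pl`. The sketch below is only an illustration: the post does not say which tool or wattage was used, so the helper names and the 120 W figure are assumptions. Setting the limit normally needs root.

```python
import subprocess


def clamp_power_limit(requested_w, min_w, max_w):
    """Keep the requested limit inside the range the board reports."""
    return max(min_w, min(requested_w, max_w))


def read_limits(gpu_index=0):
    """Ask the driver for the allowed power-limit range in watts."""
    out = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index),
         "--query-gpu=power.min_limit,power.max_limit",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    min_w, max_w = (float(v) for v in out.strip().split(","))
    return min_w, max_w


def power_limit_command(watts, gpu_index=0):
    """Build the nvidia-smi call that applies the limit (run it as root)."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", f"{watts:g}"]


if __name__ == "__main__":
    low, high = read_limits()
    # 120 W is a hypothetical starting point, not a value from this thread.
    target = clamp_power_limit(120, low, high)
    print(" ".join(power_limit_command(target)))
```

Stepping the limit down a little at a time and rerunning the benchmark after each change shows roughly where the failures begin.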
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
I've never managed to get a bad GPU to work via downclocking. And I have sent at least 4 cards back for warranty service in the past 5 years. 24/7 folding just has a habit of revealing faults with graphics cards.
You should definitely contact the manufacturer about warranty status on that card.
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Thanks for the heads up.
Using Unigine Heaven https://benchmark.unigine.com/heaven looping, the GPU runs stably at:
- Graphics: -70
- Memory: +500
However, a 10-minute FAHBench session doesn't like that. For FAHBench, I need to run:
- Graphics: -90
- Memory: +300
The card is an Asus ROG Strix GTX 1070 O8G with a heavy factory overclock, so I guess I'm just reducing that default a little. If I had a Windows key, I'd check to see how it performed on Windows. Maybe worth a shot.
Thanks for the help/advice.
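If the figures above are `nvidia-settings` clock offsets (the post does not say which tool was used), they could be applied from a script roughly like this. It is a sketch under several assumptions: Coolbits is enabled in the X configuration, an X session is running, and performance level 3 is the level being adjusted. The level index differs between cards, and the helper name is made up.

```python
def offset_command(graphics_offset, memory_offset, gpu_index=0, perf_level=3):
    """Build an nvidia-settings call that sets both clock offsets."""
    target = f"[gpu:{gpu_index}]"
    return [
        "nvidia-settings",
        "-a", f"{target}/GPUGraphicsClockOffset[{perf_level}]={graphics_offset}",
        "-a", f"{target}/GPUMemoryTransferRateOffset[{perf_level}]={memory_offset}",
    ]


if __name__ == "__main__":
    # The FAHBench-stable values from the post above.
    print(" ".join(offset_command(-90, 300)))
```

Printing the command first and running it by hand makes it easy to check the attribute names against what `nvidia-settings -q all` reports for the card.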
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
m1geo - I do not see the same compile error reported by florinandrei. I think we may be dealing with separate issues. Do you consistently see the BAD_WORK_UNIT warning?
florinandrei - have you tried to compile the CUDA samples as shown by m1geo above?
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Rule number one of overclocking with FAH: don't touch the VRAM clocks! It creates more issues than it adds performance.
However, factory overclocks should work ... if they don't, then the card needs an RMA.
Does the card run fine in FurMark with the manufacturer's default clocks?
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
I finally found my issue. It's bizarre! One of the fan bearings has failed. When the controller tried to spin the fans up, the fan would spin a bit, then jam, then drag the 12V rail down on the GPU. That caused all kinds of weirdness. With that one fan simply unplugged, the card works fine. I have ordered 3 new fans. Thanks for the patience and the advice!
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Wow … damn good spot/catch … at least fans are (I believe) cheaper than a new GPU card!!
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070
(Green/Bold = Active)
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
That was a nasty one ...
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Right on, good catch! I'm curious if florinandrei was able to sort out their compile error. Is there a set of standard programs to test the system's ability to compile OpenCL code?
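Not a standard test suite, but a quick check along these lines can show whether the driver's OpenCL compiler works at all. This is a minimal sketch that assumes the third-party `pyopencl` package is installed; the trivial kernel and the function name are invented for illustration and have nothing to do with the kernels the folding core builds.

```python
KERNEL_SRC = """
__kernel void scale(__global float *data, const float factor) {
    size_t i = get_global_id(0);
    data[i] = data[i] * factor;
}
"""


def check_opencl_build(source=KERNEL_SRC):
    """Try to compile `source` on every OpenCL device found.

    Returns a list of (device name, built_ok, message) tuples.
    """
    try:
        import pyopencl as cl  # third-party; assumed to be installed
    except ImportError:
        return [("<none>", False, "pyopencl is not installed")]

    results = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                try:
                    ctx = cl.Context([device])
                    cl.Program(ctx, source).build()
                    results.append((device.name, True, "kernel compiled"))
                except cl.Error as exc:
                    # The exception text carries the compiler's build log.
                    results.append((device.name, False, str(exc)))
    except cl.Error as exc:
        results.append(("<none>", False, f"OpenCL platform query failed: {exc}"))
    return results


if __name__ == "__main__":
    for name, ok, message in check_opencl_build():
        print(f"{name}: {'OK' if ok else 'FAILED'} ({message})")
```

If even a kernel this small fails to build, the problem is more likely in the driver or OpenCL installation than in the work unit. The `clinfo` utility is another quick way to confirm that the GPU shows up as an OpenCL device.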
Re: GTX1060 Linux drv v430 Error compiling kernel BAD_WORK_U
Nice catch indeed.