Re: 9401 fails on 750ti

Posted: Wed Mar 12, 2014 1:00 pm
by bfromcolo
Same issue in Ubuntu 12.04 with the latest NVIDIA Linux driver 334.21. If there is any additional information I can provide let me know.

12:53:33:WU00:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
12:53:33:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
12:53:33:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9401 run:439 clone:0 gen:20 core:0x17 unit:0x0000001e6652edaf52eaf025adc7f1ce
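
For anyone collecting these failures from several machines, here is a minimal sketch of pulling the status name and code out of the "FahCore returned" line. The regular expression is inferred only from the log excerpt above, and `parse_core_return` is a hypothetical helper name, not part of the FAH client.

```python
import re

# Pattern inferred from the log excerpt above; other client versions may differ.
RETURN_RE = re.compile(r"FahCore returned: (\w+) \((\d+) = (0x[0-9a-fA-F]+)\)")


def parse_core_return(line):
    """Return (status_name, code) if the line reports a FahCore exit status, else None."""
    match = RETURN_RE.search(line)
    if match is None:
        return None
    name, dec_text, hex_text = match.groups()
    code = int(dec_text)
    # The client prints the same code in decimal and in hex; check that they agree.
    if code != int(hex_text, 16):
        raise ValueError(f"inconsistent return code in line: {line!r}")
    return name, code


line = "12:53:33:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)"
print(parse_core_return(line))  # ('BAD_WORK_UNIT', 114)
```

Lines that do not contain a core return status simply yield `None`, so the helper can be run over a whole log file.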

Re: 9401 fails on 750ti

Posted: Wed Mar 12, 2014 9:20 pm
by bruce
Windows monitors various hardware devices, including the GPU, and makes sure it's working. (You've probably seen something like their messages from IE saying site xxx is not responding when bringing up a new page takes longer than they're expecting, sometimes followed immediately by a display of that page.) Windows issues a GPU_RESET which is supposed to get the computer working again and reports it in the event log. FAH generally reports BAD_WORK_UNIT (114=0x72).

Does anybody know if Linux resets the GPU if it fails to respond within some predefined time-limit? Does Linux have a log of such events? If it does, was there an event recorded at that same time? I have not researched how Linux deals with a hung GPU, but I'd be surprised if it didn't have something similar to what MS is doing. (I thought I'd ask you guys first.)

Re: 9401 fails on 750ti

Posted: Wed Mar 12, 2014 10:17 pm
by Freightanimal
That answers my question then. I was hoping that would be a workaround so I could at least get this puppy folding. Does anyone know if there is a way to exclude core 17 projects so I could at least work on core 15 work units?

Re: 9401 fails on 750ti

Posted: Wed Mar 12, 2014 10:20 pm
by P5-133XL
You can exclude Core_17 by running the v6 client. To my knowledge, that is the only way, currently.

Re: 9401 fails on 750ti

Posted: Thu Mar 13, 2014 2:26 am
by PantherX
Does the WU fail immediately, or does it run for X time and then fail? Is your GPU operating within its normal temperature range? I do hope that you are using the 64-bit version of Ubuntu, as it is required for GPU folding.

Re: 9401 fails on 750ti

Posted: Thu Mar 13, 2014 4:19 am
by uddarts
PantherX wrote:Does the WU fail immediately, or does it run for X time and then fail? Is your GPU operating within its normal temperature range? I do hope that you are using the 64-bit version of Ubuntu, as it is required for GPU folding.
the last one i attempted to run, i monitored it with gpuz and the core never showed more than an occasional 1% blip. it would register as failed after about 15 minutes.


ud

Re: 9401 fails on 750ti

Posted: Thu Mar 13, 2014 7:01 am
by Meiz
Has there been any news of an update for core 17? I didn't want to leave my GPU idling, so I removed the Advanced flag, and I had been lucky in getting nothing but core 15 work units since. It just got its first batch of p8900s and, as expected, they all failed. I hope they come up with a fix soon. With core 17 being released to full F@H, I can't fold at all until they do.

Re: 9401 fails on 750ti

Posted: Thu Mar 13, 2014 1:38 pm
by 7im
I don't see anything posted on the news blog about a core update, so no news.

Re: 9401 fails on 750ti plus one here

Posted: Fri Mar 14, 2014 8:02 pm
by tofuwombat
For me, Maxwell hardware (GTX750ti FTW) does NOT fold on v7.
(With or without client-type advanced. "Runs" for several minutes before failing. No steps complete. Same behavior when heavily underclocked.)

Good temps.

Even overclocking (100% power target on precision)
it tests ok:
Explicit: 25.4167 ns/day
Implicit: 104.916 ns/day
"Verify Accuracy" selected

Card: evga P/N: 02G-P4-3757-KR
OS: Win7 pro 64 bit sp1
Nvidia Drivers: 335.23

Re: 9401 fails on 750ti plus one here

Posted: Fri Mar 14, 2014 9:16 pm
by bruce
tofuwombat wrote:For me, Maxwell hardware (GTX750ti FTW) does NOT fold on v7.
(With or without client-type advanced. "Runs" for several minutes before failing. No steps complete. Same behavior when heavily underclocked.)
The Pande Group is aware of the fact that Maxwell is not yet supported, but there's no ETA for a fix.

In my experience, FahCore_17 spends several minutes doing CPU calculations before it actually starts using the GPU. That would have nothing to do with the type of GPU you have or whether it's supported yet.

Notice that between 20:36:56 and 20:42:16, some 5.3 minutes passed initializing (before starting the first frame!)
Notice that between 20:42:16 and 20:47:11, some 4.9 minutes passed doing the first frame.
After that, frame 2 is some 4.3 minutes, frame 3 is 4.3 minutes, frame 4 is 4.3 minutes, and frame 5 is 4.3 minutes.

The actual times will depend on the project and on your hardware, of course, but the 5.3 minutes of initialization seems to be based on CPU speed, and the things I'm calling frame times depend mostly on GPU speed. From that, I predict an unsupported GPU would have died at about 20:42:16.

Code:

20:36:56:WU00:FS01:0x17:Reading tar file core.xml
20:36:56:WU00:FS01:0x17:Digital signatures verified
20:36:56:WU00:FS01:0x17:Folding@home GPU core17
20:42:16:WU00:FS01:0x17:Completed 0 out of 1000000 steps (0%)
20:42:16:WU00:FS01:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
20:47:11:WU00:FS01:0x17:Completed 10000 out of 1000000 steps (1%)
20:51:27:WU00:FS01:0x17:Completed 20000 out of 1000000 steps (2%)
20:55:44:WU00:FS01:0x17:Completed 30000 out of 1000000 steps (3%)
20:59:59:WU00:FS01:0x17:Completed 40000 out of 1000000 steps (4%)
21:04:14:WU00:FS01:0x17:Completed 50000 out of 1000000 steps (5%)
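
The frame times quoted above can be checked directly from the timestamps. This is only a rough sketch: it assumes the `HH:MM:SS:` prefix and the "Completed N out of M steps" wording seen in this excerpt, and `frame_seconds` is a hypothetical helper name, not something the client provides.

```python
import re
from datetime import datetime, timedelta

# Log excerpt copied from the post above.
LOG = """\
20:36:56:WU00:FS01:0x17:Folding@home GPU core17
20:42:16:WU00:FS01:0x17:Completed 0 out of 1000000 steps (0%)
20:47:11:WU00:FS01:0x17:Completed 10000 out of 1000000 steps (1%)
20:51:27:WU00:FS01:0x17:Completed 20000 out of 1000000 steps (2%)
20:55:44:WU00:FS01:0x17:Completed 30000 out of 1000000 steps (3%)
20:59:59:WU00:FS01:0x17:Completed 40000 out of 1000000 steps (4%)
21:04:14:WU00:FS01:0x17:Completed 50000 out of 1000000 steps (5%)
"""

# Matches only the progress lines; the format is assumed from this excerpt.
STEP_RE = re.compile(r"^(\d\d:\d\d:\d\d):.*Completed (\d+) out of (\d+) steps")


def frame_seconds(log_text):
    """Return the seconds elapsed between consecutive 'Completed ... steps' lines."""
    stamps = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    durations = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = cur - prev
        if delta < timedelta(0):
            # The log has no date, so a negative gap means it crossed midnight.
            delta += timedelta(days=1)
        durations.append(int(delta.total_seconds()))
    return durations


for number, seconds in enumerate(frame_seconds(LOG), start=1):
    print(f"frame {number}: {seconds} s ({seconds / 60:.1f} min)")
```

On this excerpt the gaps come out to 295, 256, 257, 255 and 255 seconds, so the first frame is about 4.9 minutes and the later ones are all close to 4.3 minutes.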

Re: 9401 fails on 750ti

Posted: Fri Mar 14, 2014 9:40 pm
by Freightanimal
Meiz wrote:Has there been any news of an update for core 17? I didn't want to leave my GPU idling, so I removed the Advanced flag, and I had been lucky in getting nothing but core 15 work units since. It just got its first batch of p8900s and, as expected, they all failed. I hope they come up with a fix soon. With core 17 being released to full F@H, I can't fold at all until they do.
I have had some of that luck in getting some core 15 work units now also. It made me wonder if there was anything that could be done server side to give out only the core 15 work units to this card in particular.

Re: 9401 fails on 750ti

Posted: Fri Mar 14, 2014 9:53 pm
by tofuwombat
First, thanks for the quick response, I always find good help here.
bruce wrote: The Pande Group is aware of the fact that Maxwell is not yet supported, but there's no ETA for a fix.

In light of the above, why is 750ti in the whitelist???

0x10de:0x1380:2:4:GM107 [GeForce GTX 750 Ti] :?
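
For reference, here is a minimal sketch of splitting a whitelist line like the one above into its colon-separated fields. 0x10de is NVIDIA's PCI vendor ID; the names `gpu_type` and `species` for the third and fourth fields are assumptions made for illustration, and `parse_entry` is a hypothetical helper, not part of the client.

```python
from typing import NamedTuple


class GpuEntry(NamedTuple):
    vendor_id: int    # PCI vendor ID (0x10de is NVIDIA)
    device_id: int    # PCI device ID
    gpu_type: int     # assumed field name for the third value
    species: int      # assumed field name for the fourth value
    description: str  # free text, e.g. chip name and marketing name


def parse_entry(line):
    """Split one whitelist line into its five colon-separated fields."""
    # Limit the split to four cuts so any ':' inside the description is preserved.
    vendor, device, gpu_type, species, description = line.strip().split(":", 4)
    return GpuEntry(int(vendor, 16), int(device, 16),
                    int(gpu_type), int(species), description)


entry = parse_entry("0x10de:0x1380:2:4:GM107 [GeForce GTX 750 Ti]")
print(entry)
```

Nothing in the entry itself records which FahCores are known to work on the card, which is the gap being asked about here.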

Re: 9401 fails on 750ti

Posted: Fri Mar 14, 2014 10:28 pm
by 7im
tofuwombat wrote:First, thanks for the quick response, I always find good help here.
bruce wrote: The Pande Group is aware of the fact that Maxwell is not yet supported, but there's no ETA for a fix.

In light of the above, why is 750ti in the whitelist???

0x10de:0x1380:2:4:GM107 [GeForce GTX 750 Ti] :?

By luck, or whatever, Fahcore_15 seems to fold on Maxwell without any modifications needed. Seems the CUDA core was written well enough to keep working with new CUDA hardware.

OpenCL in Fahcore_17 does not appear to be the same.

If that hardware were blacklisted, they couldn't fold any WUs. And folding some is better than none.

There is a reason they call the newest hardware the bleeding edge. Always check the requirements section of the install guide to see if the new hardware is supported. If that is unclear there, ask here!

Re: 9401 fails on 750ti

Posted: Sat Mar 15, 2014 12:17 am
by bruce
Another reason: Suppose Development is working on a change that would fix the problem with Maxwell (maybe currently true, maybe not). They'd have to test it ... so it would have to be whitelisted.

Suppose there's a serious bug in nVidia's OpenCL support that's not in their CUDA support .... Nothing can be done by Stanford until nVidia fixes OpenCL. (I have no facts supporting this assumption, nor any proving it's not true either.)

Re: 9401 fails on 750ti

Posted: Sat Mar 15, 2014 2:30 am
by tofuwombat
bruce wrote: . . .have to test it ... so it would have to be whitelisted.
The need for development is clear; hopefully they have their own lists.

I thought the point of the public whitelist was to identify supported hardware.
Is there some other list that tells which cards "just work"???

Perhaps there is room for a "shiny" list that ACTUALLY works, without failing on V7.

Easier to supply than the much wanted flag to only pick up core_blah . . .