GPU assignment of big WUs to slow GPUs

foldinghomealone · Post by **foldinghomealone** » Mon Apr 27, 2020 7:00 pm

Some of our team members face the problem that big WUs are assigned to slow GPUs.
Like P13877 to a GTX 980 with a preferred deadline of 1 day.
It's basically not possible for part-time folders to return the WU in time.

FAH always promotes the idea that part time folding is possible but then it counteracts it with preferred deadlines of 1 day.

I don't understand how the assignment process works but heard of many reports that it's not possible for the assignment servers to assign a WU by HW configuration.

If that is true FAH could use 'cause preference' to introduce a new category, like 'part time folding'.
Anyone who chooses this 'cause' would get small WUs or WUs with longer preferred deadlines.

HaloJones · Post by **HaloJones** » Mon Apr 27, 2020 10:10 pm

It's an interesting idea. In a perfect world, FAH-central would be told by the client what the GPU or CPU is and FAH-central would look in its list of work to allocate the most suitable. Sounds great, but there are so many possible GPUs and CPUs (and more being constantly released) that it would be a nightmare to maintain.

But the 'part time folding' thing is an interesting concept. Unfortunately, for the sake of the project and the way it is iterative, projects really want a quick result to determine what to do next. Each unit informs the next unit what it should ask. That's why there are sometimes very short deadlines.

In more normal times, when FAH isn't trying to solve a very specific and time-sensitive problem, your suggestion might be adopted. Maybe.

JimboPalmer · Post by **JimboPalmer** » Mon Apr 27, 2020 10:30 pm

F@H has what it calls Species numbers, but they are based on the capability of a card, not its speed. A card may support OpenCL 1.2 and double-precision floating-point math and thus be in a given Species, but that does not tell the server its speed. As far as I know, the last AMD Species was for the RDNA/Navi cards, as they support Core_22 but not Core_21. Again, that is a measure of ability, not of speed.

Rel25917 · Post by **Rel25917** » Mon Apr 27, 2020 11:52 pm

Before the whole covid thing the timeouts were much longer, usually 5 days for gpu and possibly more for cpu. For the new covid projects they want results really fast so we got 1 day timeouts.

MeeLee · Post by **MeeLee** » Tue Apr 28, 2020 1:27 am

There might be a way to set what projects you support. Some projects might have longer deadlines on their WUs.

foldinghomealone · Post by **foldinghomealone** » Tue Apr 28, 2020 4:53 am

HaloJones wrote:But the 'part time folding' thing is an interesting concept. Unfortunately, for the sake of the project and the way it is iterative, projects really want a quick result to determine what to do next. Each unit informs the next unit what it should ask. That's why there are sometimes very short deadlines.

But FAH still offers the option to use idle processing power and to pause a project.

Assigning big WUs or WUs with short deadlines to slow HW wastes energy and time, because they then have to be reassigned.

If FAH dislikes the term 'part time folding' then call the 'cause' any other name FAH prefers. Maybe 'long deadlines' or whatever.

foldy · Post by **foldy** » Tue Apr 28, 2020 6:23 am

You are right. Project 13877 has an atom count of 150k and should be restricted to fast GPUs with many shaders. So it should not be assigned to NVIDIA Maxwell GPUs or AMD Polaris GPUs.

rwh202 · Post by **rwh202** » Tue Apr 28, 2020 6:32 am

The old v6 client allowed you to 'request work units without deadlines' and there was greater use of the 'packet-size' flags to get bigger or smaller WUs (that was mostly concerned about data bandwidth, but had the same effect).

That's progress for you...

foldinghomealone · Post by **foldinghomealone** » Tue Apr 28, 2020 8:56 am

foldy wrote: You are right. Project 13877 has an atom count of 150k and should be restricted to fast GPUs with many shaders. So it should not be assigned to NVIDIA Maxwell GPUs or AMD Polaris GPUs.

That would be a good start.
However, there are slow GPUs like the 1030 or 1050 that, when not folding 24/7, cannot return the WU before the timeout.

Maybe such projects should only be assigned to Turing and Navi GPUs.

foldy · Post by **foldy** » Sat May 16, 2020 8:31 am

https://github.com/FoldingAtHome/fah-issues/issues/1484

BobWilliams757 · Post by **BobWilliams757** » Sun May 17, 2020 12:40 pm

Having some slower hardware myself, I do see the point. I have now done 10 runs of various 16435 WU's, and they barely meet the timeout folding 24/7. In this case, it takes them about 24 hours, some slightly more and missing the timeout slightly. I don't have a problem letting the machine run overnight so I let them run in the hopes that I'm still providing the first result. But others just don't want to let the system run overnight.

But in all fairness, really breaking them down might be much harder than it seems. Atom count alone does not dictate the architecture needed in all cases. Nor does the step count, or the CPU use. The more I dig, the more I think the WU variances are a lot more complex than most of us realize. Having watched the Beta forum out of curiosity and a possible desire to volunteer, it seems some WUs just work much better (or worse) with specific hardware architectures, and/or possibly driver sets, OS, etc. So short of having a lot more Beta testers with a lot of various older, unique, or rare hardware/OS combinations, it would be almost impossible to really dial them in to guarantee any solid turnaround time.

So I'd have to agree that WUs with lengthy time periods until timeout are really the only quick solution. And I also agree that it would make things more efficient, since the fewer WUs time out, the less overhead there is on both the F@H and donor end.

As an interesting note on how complex it gets with WU's, I had searched and found a thread on a specific WU that gave me a crazy high PPD return. During Beta testing, several testers reported PPD return was low. I assume they adjusted the WU specifics to give it a more "fair" point return. But for whatever reason, the specifics of my onboard graphics, small atom count, number of steps, complexity, etc... I got a PPD nearing double my average. Most of the Beta testers have much more powerful GPU's and they reported lower PPD returns when testing. And it was also a Core 21 project, and most people state 20% or so lower PPD expectations. So every WU probably has a range of hardware that runs it great, some just ok, some struggle, etc.

Post by **bruce** » Sun Jun 28, 2020 12:39 am

BobWilliams757 wrote: But in all fairness, really breaking them down might be much harder than it seems.

That's certainly true. Nevertheless, FAH can improve the assignment logic in incremental stages. It starts by assigning a variety of projects to a variety of GPUs. Most likely it also starts with an assumption that atom count is at least one of the factors influencing final performance, and it gathers additional data from whatever gets returned from those assignments.

Timeouts should also be based on expected performance.

BobWilliams757 wrote: As an interesting note on how complex it gets with WU's, I had searched and found a thread on a specific WU that gave me a crazy high PPD return. During Beta testing, several testers reported PPD return was low. I assume they adjusted the WU specifics to give it a more "fair" point return. But for whatever reason, the specifics of my onboard graphics, small atom count, number of steps, complexity, etc... I got a PPD nearing double my average. Most of the Beta testers have much more powerful GPU's and they reported lower PPD returns when testing. And it was also a Core 21 project, and most people state 20% or so lower PPD expectations. So every WU probably has a range of hardware that runs it great, some just ok, some struggle, etc.

Also true.

SetiCrew486 · Post by **SetiCrew486** » Tue Jun 30, 2020 8:27 pm

I also deactivated my GPU slot a couple of weeks ago, for the same reasons.
Yes, I know my GPU is at the lower end of the list and will probably be dropped with the next update, but currently, it's still supported.

But, as I'm also folding part-time, I personally don't need precise WU control like that mentioned above. I don't have to squeeze the last bit out of my GPU, because it will be shut down sooner or later anyway.
Instead, I would be fine with limiting the size roughly, e.g. by atom count, by base credit, or even by reactivating the max-packet-size flag.
I'm pretty sure I'll be able to find a setting that lets my GPU finish its WUs safely within time.

Post by **bruce** » Wed Jul 01, 2020 3:56 am

foldinghomealone wrote: Some of our team members face the problem that big WUs are assigned to slow GPUs.
Like P13877 to a GTX 980 with a preferred deadline of 1 day.
It's basically not possible for part-time folders to return the WU in time.

Hmmm. I am seeing conflicting information:

> p13877. HFM Benchmark Data. These WUs actually ran well on Linux 970s (hosts F67/68) at a TPF of 4.5 minutes (0.3 days to complete). The 980 should be slightly faster than the 970 and able to easily meet the 1-day timeout.

Are you trying to run your 980 on a 1x riser? What TPF are you seeing? Please post the applicable segments of your log.
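
The 0.3-day figure in the quoted benchmark follows from the TPF if one assumes that a WU consists of 100 frames, each frame being 1% of progress. Below is a minimal Python sketch of that arithmetic, extended with a rough estimate for part-time folding. The function names, the hours-per-day values, and the simple averaging model are illustrative assumptions and not part of any FAH tool.

```python
# Assumption: one frame = 1% of a WU, so a WU is 100 frames.
FRAMES_PER_WU = 100
MINUTES_PER_DAY = 24 * 60


def wu_days(tpf_minutes: float) -> float:
    """Days of continuous (24/7) folding needed to finish one WU."""
    return tpf_minutes * FRAMES_PER_WU / MINUTES_PER_DAY


def wall_clock_days(tpf_minutes: float, hours_per_day: float) -> float:
    """Rough elapsed days when folding only `hours_per_day` hours each day.

    This is a simple average. It ignores when in the day the folding
    session starts, so treat the result as an estimate only.
    """
    if not 0 < hours_per_day <= 24:
        raise ValueError("hours_per_day must be in (0, 24]")
    return wu_days(tpf_minutes) * 24 / hours_per_day


def meets_timeout(tpf_minutes: float, hours_per_day: float,
                  timeout_days: float = 1.0) -> bool:
    """True if the estimated elapsed time fits within the timeout."""
    return wall_clock_days(tpf_minutes, hours_per_day) <= timeout_days


if __name__ == "__main__":
    print(wu_days(4.5))               # 24/7 folding at TPF 4.5 minutes
    print(meets_timeout(4.5, 24))     # full-time folder
    print(meets_timeout(4.5, 6))      # hypothetical 6 hours per day
```

Under these assumptions, a TPF of 4.5 minutes means about 7.5 hours of folding per WU (0.3125 days). A full-time folder meets a 1-day timeout easily, while a hypothetical part-time folder running 6 hours a day would not, which may explain why the benchmark and the original complaint appear to conflict.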

Ichbin3 · Post by **Ichbin3** » Wed Jul 01, 2020 4:27 am

bruce wrote: foldinghomealone wrote:
Some of our team members face the problem that big WUs are assigned to slow GPUs.
Like P13877 to a GTX 980 with a preferred deadline of 1 day.
It's basically not possible for part-time folders to return the WU in time.

I guess that's what he is pointing to ...
