Lack of WUs for CPU<12 systems

Moderators: Site Moderators, FAHC Science Team

muziqaz
Posts: 1979
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 9950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, Intel B580
Location: London

Lack of WUs for CPU<12 systems

Post by muziqaz »

Hey everyone, we heard you about your systems sitting idle. We did some digging, and here is the situation:
The Server Stats page shows 840 000 CPU WUs available. Unfortunately, that number is not correct, as it still counts projects hosted on servers which are down. We have asked the main dev to look into it, and maybe fix it, so that we can see the real number of WUs that are actually available.
At the moment one server, which houses 386 000 WUs, is down. I am not sure why: possibly connection issues, maintenance, or the projects ending. A request has been sent to the server owners to look into it. I personally do not expect much out of this, as the server has been down for 4 days now; if the research was being actively run and monitored, they would have noticed already (but again, I hope I am wrong).
So that leaves us with a real available WU count of somewhere around 380k-400k, give or take. Some 370-odd thousand of those are restricted to the magical thread count of 12 and above. The reason? Those 3 projects run really badly on SSE-based CPUs (old, very old, but not obsolete), and even original AVX struggles. Low thread count CPUs, even modern ones, struggle to finish these in time, too. We are looking into re-balancing these 3 projects to allow them to be assigned to lower thread count CPUs (how low, I don't know yet). This change might take a while, depending on when the researcher responds to our proposal, plus a few days for testing and re-balancing on our end. If the change is made, any CPU without at least the AVX instruction set will be excluded (this includes Apple and any other ARM CPUs, unfortunately). That said, there are some projects which have no restrictions right now, so once (or if) we are done with re-balancing, those smaller projects will be under waay less pressure from higher core count CPUs, and so will be more available to the ARM/Apple crowd.
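To make the constraint logic above a bit more concrete, here is a rough sketch of how a minimum thread count and an AVX requirement could decide which projects a machine is eligible for. This is not how the assignment server is actually implemented; the `Project` and `Cpu` types, the project names, and the post-re-balance minimum of 4 threads are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str         # hypothetical label, not a real project ID
    min_threads: int  # smallest thread count the project accepts
    needs_avx: bool   # True if CPUs without AVX are excluded


@dataclass(frozen=True)
class Cpu:
    threads: int
    has_avx: bool


def eligible(project: Project, cpu: Cpu) -> bool:
    """A CPU qualifies only if it meets every constraint of the project."""
    if cpu.threads < project.min_threads:
        return False
    if project.needs_avx and not cpu.has_avx:
        return False
    return True


def eligible_names(projects, cpu):
    """Names of all projects this CPU could be assigned."""
    return [p.name for p in projects if eligible(p, cpu)]


# Today: the big projects need 12+ threads, the small ones have no real limit.
current = [
    Project("big", min_threads=12, needs_avx=False),
    Project("small", min_threads=1, needs_avx=False),
]

# After a possible re-balance: lower thread minimum (4 is an assumed
# placeholder), but AVX becomes mandatory for the big projects.
rebalanced = [
    Project("big", min_threads=4, needs_avx=True),
    Project("small", min_threads=1, needs_avx=False),
]
```

With these made-up numbers, an 8-thread AVX CPU only qualifies for "small" today but for both after the change, while an 8-thread CPU without AVX (for example ARM) stays on "small" in both cases.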

I'll try to update this thread with any changes, progress or regress.
Thanks

P.S. Any folders who are having DNS-related issues, please check whether you are using Cloudflare DNS as your primary name resolver. If you are, change it to a Google server or something else. FAH cannot do much about this, as FAH is stuck with 3rd party service providers, which have their own issues.
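If you want a quick way to check whether your current resolver can look up a FAH server at all, something along these lines should do. It is only a sketch: `can_resolve` is a made-up helper name, and it just asks whatever DNS your system is configured with, so it cannot tell you which provider is at fault.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves through the system's configured DNS.

    The resolver argument exists only so the lookup can be swapped out
    (for example in a test); by default the standard library call is used.
    """
    try:
        # Port 443 because the client talks to the servers over HTTPS.
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False
```

For example, `can_resolve("assign1.foldingathome.org")` uses the assignment server hostname that shows up in the client log further down this thread. If it returns False with one DNS provider and True after switching, the resolver is the likely culprit.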
FAH Omega tester
Wedge009
Posts: 13
Joined: Fri May 23, 2025 6:16 am

Re: Lack of WUs for CPU<12 systems

Post by Wedge009 »

Thanks for the update.

A question: I notice you seem to use the terms CPU core and thread interchangeably. Does F@h really run that well with SMT? I tend to restrict my machines (not just for F@h) to account for CPU physical core counts, especially since I also run GPU tasks for other projects and they need some CPU power so that the GPU isn't 'starved'. So while my smallest hosts have 12 CPU threads, I'm not going to dedicate all of them to F@h work.

At any rate, hopefully we'll get some progress on work availability in a few days.
muziqaz

Re: Lack of WUs for CPU<12 systems

Post by muziqaz »

Wedge009 wrote: Sat Aug 30, 2025 10:57 am Thanks for the update.

Wherever I mentioned cores, I meant threads :) sorry
FAH uses the Floating Point Unit inside the core (yes, this time I mean core :D). Those FPUs can be easily split with SMT, so it makes no difference whether you use SMT or disable it, as FAH will still use the full extent of the FPU inside the core. SMT is usually very helpful with integer tasks, since those resources are harder to fully utilise without SMT when the core has been designed with it in mind.
That said, FAH still benefits a little bit from SMT, but at higher thread counts (more than 16) scaling of the actual fahcore (the software which does the simulations) becomes an issue, negating any SMT benefit.
GPUs do require 1 thread per GPU to be made available (but not assigned in FAHclient). One thread is usually enough to satisfy a GPU, though some people go further and free up a full core, and some disable CPU folding altogether to let the GPUs gain the maximum PPD possible.
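As a rough illustration of the "leave one thread per GPU free" rule of thumb, the arithmetic looks like this. The function name and the `reserve_per_gpu` parameter are made up for the example; this is not a FAHclient setting.

```python
def cpu_threads_for_folding(total_threads, gpus, reserve_per_gpu=1):
    """Threads left for CPU folding after reserving some for each GPU.

    reserve_per_gpu=1 follows the one-thread-per-GPU rule of thumb;
    use 2 to free a full core on a CPU with 2-way SMT.
    """
    if total_threads < 0 or gpus < 0 or reserve_per_gpu < 0:
        raise ValueError("counts must not be negative")
    # Never go below zero: with enough GPUs, CPU folding is simply off.
    return max(0, total_threads - gpus * reserve_per_gpu)
```

So a 12-thread host with one GPU would leave 11 threads for CPU folding, or 10 if a full SMT core is freed up for the GPU.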
Wedge009

Re: Lack of WUs for CPU<12 systems

Post by Wedge009 »

The GPU projects I run are heavily CPU-based - it depends on the nature of the GPU, of course. My fastest GPUs actually need a full core of a fast CPU - in other words, the tasks are somewhat CPU-bound (but still far more efficient than running on CPU alone).

...but this is irrelevant to the discussion of F@h - I was just mentioning it because your assertions about GPU tasks appear to be primarily about F@h GPU applications.
muziqaz

Re: Lack of WUs for CPU<12 systems

Post by muziqaz »

Wedge009 wrote: Sat Aug 30, 2025 11:16 am The GPU projects I run are heavily CPU-based - it depends on the nature of the GPU, of course. My fastest GPUs actually need a full core of a fast CPU - in other words, the tasks are somewhat CPU-bound (but still far more efficient than running on CPU alone).

That is more the fault of the NVIDIA driver model.
AMD is waaay less CPU dependent.
Plus, Windows driver overhead is a feature.
Wedge009

Re: Lack of WUs for CPU<12 systems

Post by Wedge009 »

...my GPUs are Radeons.

...my hosts are mostly Linux.

...I reiterate/clarify that I'm talking about non-F@h projects.
muziqaz

Re: Lack of WUs for CPU<12 systems

Post by muziqaz »

Update:
There is a workaround we are thinking of implementing in order to get the currently available 380k WUs assigned to a minimum of 3 or 4 threads. Hopefully the beginning of next week will be the day.
The server which went down with 380k WUs will be revived; not sure when, hopefully soon. Projects which have been limited to CPU>=3 will be restored to CPU>=2. Again, hopefully very soon.
Wedge009

Re: Lack of WUs for CPU<12 systems

Post by Wedge009 »

So no changes yet? I only ask because since around 01:30 UTC I've been noticing differing project IDs for the work being assigned - as well as shorter delays between requests and assignments - so I thought something might have changed already. Maybe it's just coincidental.

Anyway, thanks for the update and the team efforts.
muziqaz

Re: Lack of WUs for CPU<12 systems

Post by muziqaz »

A few projects have been changed to CPU>=2. The rest are yet to be changed. There are some issues with saving new constraints on some other projects.
The vav22.temple server has been revived, but still has issues.
Wedge009

Re: Lack of WUs for CPU<12 systems

Post by Wedge009 »

That might be helping a bit, because I just had a task assigned without any 'no appropriate assignment' error:

05:16:55:I1:OUT460:> POST https://assign1.foldingathome.org/api/assign HTTP/1.1
05:16:56:I1:OUT460:< HTTP/1.1 200 HTTP_OK
05:16:56:I1:OUT461:> POST https://vav22.fah.temple.edu/api/assign HTTP/1.1
05:17:00:I1:OUT461:< HTTP/1.1 200 HTTP_OK
Edit: Looking across all my hosts, aside from the one with 20 cores allocated to F@h, their most recent assignments have all been served by vav22. The most recent HTTP_SERVICE_UNAVAILABLE error was received at 01:23:05 UTC.
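For anyone wanting to do the same check across several hosts, here is a small sketch that pairs each request line with its response and reports which server answered and with what status. The regular expressions are guessed from the four log lines above only, so treat the line format as an assumption; `summarize` is a made-up name.

```python
import re

# Request line, e.g. "05:16:55:I1:OUT460:> POST https://host/api/assign HTTP/1.1"
REQUEST = re.compile(r"^(\d\d:\d\d:\d\d):\w+:(OUT\d+):> \w+ https?://([^/\s]+)")
# Response line, e.g. "05:16:56:I1:OUT460:< HTTP/1.1 200 HTTP_OK"
RESPONSE = re.compile(r"^(\d\d:\d\d:\d\d):\w+:(OUT\d+):< HTTP/\d\.\d (\d{3}) (\w+)")


def summarize(lines):
    """Return (response_time, host, status_code, status_text) per exchange."""
    pending = {}  # OUT id -> host of the request still awaiting a response
    results = []
    for raw in lines:
        line = raw.strip()
        req = REQUEST.match(line)
        if req:
            pending[req.group(2)] = req.group(3)
            continue
        resp = RESPONSE.match(line)
        if resp and resp.group(2) in pending:
            host = pending.pop(resp.group(2))
            results.append((resp.group(1), host, int(resp.group(3)), resp.group(4)))
    return results


LOG = """\
05:16:55:I1:OUT460:> POST https://assign1.foldingathome.org/api/assign HTTP/1.1
05:16:56:I1:OUT460:< HTTP/1.1 200 HTTP_OK
05:16:57:I1:OUT461:> POST https://vav22.fah.temple.edu/api/assign HTTP/1.1
05:17:00:I1:OUT461:< HTTP/1.1 200 HTTP_OK
"""

for when, host, code, text in summarize(LOG.splitlines()):
    print(when, host, code, text)
```

Filtering the result for status codes other than 200 would be one way to find the last time a host saw an error such as HTTP_SERVICE_UNAVAILABLE.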