FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 11:44 am
by Reuzenkakatoe
Good day,

I am running FAH 7.5.1 on an AMD K10 CPU and a GeForce GT 630. I have configured FAH to run on the GPU only, so no CPU jobs. I noticed that FahCore_22.exe constantly uses 25% of the CPU time. Since the GPU does most of the work, a 25% CPU load seemed a bit weird to me, so I used BES (Battle Encoder Shirase) to limit FahCore's CPU use to only 1%. The GPU is still grinding away at full speed at only 1% CPU load; I can see the progress bar moving at exactly the same speed. I can now allocate the spare 24% of CPU capacity to BOINC or whatever. Configuring FAH to run both a GPU and a CPU task now takes only 26% of the CPU instead of 50%. If time allows, I think FahCore_22 could be modified so that it does not waste 24% of the CPU time. In the meantime, use BES or a similar tool to prevent FahCore_22.exe from heating up our planet.

Hope this helps.

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 12:12 pm
by PantherX
Welcome to the F@H Forum Reuzenkakatoe,

Please note that, unfortunately, the CPU usage of FahCore_22 is caused by Nvidia's OpenCL implementation. It uses a spin-wait (polling) and will always use one CPU for that. I would suggest passing your feedback to the Nvidia developers so that they can update their drivers.
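
For anyone unsure what a spin-wait costs compared with a blocking wait, here is a minimal sketch. It is not Nvidia's driver code or FahCore code; the "GPU job" is just a sleeping thread, and names such as `fake_gpu_job` are made up for illustration.

```python
import threading
import time


def cpu_seconds_while_waiting(spin: bool, work_seconds: float = 0.2) -> float:
    """Return the CPU time this process burns while waiting for a simulated GPU job."""
    done = threading.Event()

    def fake_gpu_job() -> None:
        # Hypothetical stand-in for a GPU kernel: sleeping uses almost no CPU.
        time.sleep(work_seconds)
        done.set()

    worker = threading.Thread(target=fake_gpu_job)
    start = time.process_time()  # CPU time of the whole process, all threads
    worker.start()
    if spin:
        # Spin-wait: keep asking "are you done yet?" in a tight loop.
        while not done.is_set():
            pass
    else:
        # Blocking wait: the OS parks this thread until the event is set.
        done.wait()
    worker.join()
    return time.process_time() - start


spin_cpu = cpu_seconds_while_waiting(spin=True)
block_cpu = cpu_seconds_while_waiting(spin=False)
print(f"spin-wait: {spin_cpu:.3f} s CPU, blocking wait: {block_cpu:.3f} s CPU")
```

Both variants finish after about the same wall-clock time, but the spinning one keeps a CPU busy for the whole wait. That is the pattern behind one saturated thread showing up in Task Manager.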

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 12:21 pm
by foldinghomealone2
I still hope we can get a CUDA core, as I assume that NV is not really interested in making OpenCL as efficient as possible.

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 12:39 pm
by Neil-B
It wouldn't surprise me to find that a core is being developed that makes better use of both Nvidia and AMD hardware, taking into account their current specs and capabilities and coping with the challenges of each vendor's drivers as best it can. I guess many things can hold that up, though, so I'll keep my fingers crossed for the GPU folding community.

Edit: Comments such as viewtopic.php?f=16&t=33986&start=15#p326666 by PantherX give me hope - but I won't hold my breath yet :)

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 1:09 pm
by Reuzenkakatoe
Thanks to everybody for the clarification. At least I know that my own system is not to blame. It's really not a major issue and limiting the FahCore solves the heating issue for now. Thanks!

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 3:07 pm
by HaloJones
Interesting. I tried this just now. Limiting the running fahcore_22 process to 50% took my TPF from 1:15 to 1:42. I have turned off BES.
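
To put those two TPF figures in perspective, a quick back-of-the-envelope calculation (the helper name `tpf_to_seconds` is made up for this sketch):

```python
def tpf_to_seconds(tpf: str) -> int:
    """Convert a 'minutes:seconds' TPF string such as '1:15' to seconds."""
    minutes, seconds = tpf.split(":")
    return int(minutes) * 60 + int(seconds)


before = tpf_to_seconds("1:15")  # unthrottled
after = tpf_to_seconds("1:42")   # FahCore_22 limited to 50% in BES

extra_time = (after - before) / before  # how much longer each frame takes
lost_throughput = 1 - before / after    # how many fewer frames per hour

print(f"{before} s -> {after} s per frame")
print(f"time per frame up {extra_time:.0%}, throughput down {lost_throughput:.0%}")
```

So each frame takes roughly a third longer, which works out to about a quarter fewer frames per hour.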

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 3:12 pm
by JimboPalmer
foldinghomealone2 wrote:I still hope we can get a CUDA core, as I assume that NV is not really interested in making OpenCL as efficient as possible.
Doubling the number of cores will double the time to get a new core when needed.
If F@H writes a specific Nvidia CUDA core, an AMD OpenCL core, an Apple Metal core, and an ARM OpenCL core, every upgrade will be a slow process, as they all need to stay in sync.

We want all that; we just are not going to be happy living with all that. (The poor researcher stuck with the ARM OpenCL core is never going to finish his/her project.)

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 3:12 pm
by rwh202
HaloJones wrote:Interesting. I tried this just now. Limiting the running fahcore_22 process to 50% took my TPF from 1:15 to 1:42. I have turned off BES.
You're running some serious GPUs so I'd imagine any slowdown in the polling will hit them more.

The OP has listed a GT 630, so maybe 50 times less powerful - it won't be affected by having to wait a bit for a CPU cycle.
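
A toy model of why that would be, with invented numbers (I have not measured kernel times or wake-up delays on either card): assume each step is one GPU kernel, and a throttled host thread notices completion a fixed amount of time late.

```python
def relative_slowdown(kernel_ms: float, wakeup_delay_ms: float) -> float:
    """Fraction of extra wall time per step when the host reacts late.

    Toy model: each step is one GPU kernel of kernel_ms, after which the
    GPU idles for wakeup_delay_ms until the throttled host thread notices
    and queues the next kernel.
    """
    return wakeup_delay_ms / kernel_ms


delay_ms = 0.5            # hypothetical reaction delay of a throttled host
fast_gpu_kernel_ms = 1.0  # hypothetical kernel time on a fast GPU
slow_gpu_kernel_ms = 50 * fast_gpu_kernel_ms  # "maybe 50 times less powerful"

fast = relative_slowdown(fast_gpu_kernel_ms, delay_ms)
slow = relative_slowdown(slow_gpu_kernel_ms, delay_ms)
print(f"fast GPU: +{fast:.0%} time per step, slow GPU: +{slow:.0%}")
```

The same delay that is lost in the noise on a slow card becomes a large fraction of each step on a fast one.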

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Fri Apr 17, 2020 4:27 pm
by bruce
While the statements about nVidia's support of OpenCL doing a spin-wait are true, it should be noted that periodically the FAHCore will temporarily increase CPU usage above the 1-thread saturation that you're reporting. It's generally rather brief, but if you happen to observe it, it is normal.

In the past, CUDA was supported, but that was long ago and I don't remember what pattern of CPU use it had. Development and ongoing support for extra versions of the FAHCore are treated as an unnecessary cost. A single 64-bit version of the OpenCL core for Windows and Linux, which supports both AMD and nVidia GPUs, has been sufficient, and adding others to the release cycle has not been deemed essential.

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Sat Apr 18, 2020 8:13 am
by foldinghomealone2
JimboPalmer wrote:
foldinghomealone2 wrote:I still hope we can get a CUDA core, as I assume that NV is not really interested in making OpenCL as efficient as possible.
Doubling the number of cores will double the time to get a new core when needed.
If F@H writes a specific Nvidia CUDA core, an AMD OpenCL core, an Apple Metal core, and an ARM OpenCL core, every upgrade will be a slow process, as they all need to stay in sync.

We want all that; we just are not going to be happy living with all that. (The poor researcher stuck with the ARM OpenCL core is never going to finish his/her project.)
I understand your points, but:
My assumption is that more than 80% of GPU-related research (~returned WUs) is done by NV-GPUs.
If CUDA offers more performance, which is also my assumption, there would be an increase in FAH's overall performance.
Therefore, instead of making different cores for every system, make two cores. One CUDA for the majority and one OpenCL for all others.

And I don't know why cores would need to stay in sync. When you have an OpenCL-core for AMD and Apple, it still can be computed by NV as NV is OpenCL compatible.

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Sat Apr 18, 2020 8:44 am
by PantherX
foldinghomealone2 wrote:...And I don't know why cores would need to stay in sync. When you have an OpenCL-core for AMD and Apple, it still can be computed by NV as NV is OpenCL compatible.
When a new version of FahCore is released, it is generally done with the intention of adding new scientific features that the researchers can use for their projects. Thus, when the OpenCL version gets updated, the CUDA version will also need to be updated to ensure that both FahCores can produce the same scientific information. Also, if a vendor changes its OpenCL implementation in a new driver, FahCore will need to be updated to support that; otherwise, it may not work. Finally, new architectures may require a new version of FahCore to allow that device to fold.

When a new FahCore is released, it needs to undergo extra testing to ensure that the results produced are matched experimentally. Occasionally, the same happens when a new version of the FahCore is released.

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Sat Apr 18, 2020 9:54 am
by HaloJones
Until NVidia provide the budget for a CUDA version, with ongoing support as required, I don't see there being a fork in the GPU core. I wish they would, as it would be a trivial cost for them.

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Sat Apr 18, 2020 10:01 am
by Cryptoxic
PantherX wrote:Please note that, unfortunately, the CPU usage of FahCore_22 is caused by Nvidia's OpenCL implementation. It uses a spin-wait (polling) and will always use one CPU for that. I would suggest passing your feedback to the Nvidia developers so that they can update their drivers.
Hmm I used to run BOINC with GPUgrid and I noticed that they use OpenMM as well for their WUs but their cores only run on CUDA and Nvidia cards. Maybe the team could look into that?

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Sat Apr 18, 2020 10:33 am
by rwh202
Cryptoxic wrote:
PantherX wrote:Please note that, unfortunately, the CPU usage of FahCore_22 is caused by Nvidia's OpenCL implementation. It uses a spin-wait (polling) and will always use one CPU for that. I would suggest passing your feedback to the Nvidia developers so that they can update their drivers.
Hmm I used to run BOINC with GPUgrid and I noticed that they use OpenMM as well for their WUs but their cores only run on CUDA and Nvidia cards. Maybe the team could look into that?
Yes, OpenMM has options to run on the CUDA, OpenCL, or CPU platforms.

I'd like to think that the developers have at least tried the one-line change to see if it works. However, I appreciate it would then need a fair bit more testing to ensure results are consistent in all cases.
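
For context, this is roughly what picking a platform by name looks like from OpenMM's Python side. It is only a sketch and not FahCore_22's code; I don't know how the core actually selects its platform. `choose_platform` and `openmm_platform_names` are names I made up, while `Platform.getNumPlatforms`, `Platform.getPlatform`, `getName` and `Platform.getPlatformByName` are OpenMM calls.

```python
from typing import Optional, Sequence


def choose_platform(available: Sequence[str],
                    preferred: Sequence[str] = ("CUDA", "OpenCL", "CPU")) -> Optional[str]:
    """Return the first preferred platform name that is available, else None."""
    for name in preferred:
        if name in available:
            return name
    return None


def openmm_platform_names() -> list:
    """List platform names reported by OpenMM, or [] if OpenMM is not installed."""
    try:
        from openmm import Platform  # OpenMM 7.6+; older releases use simtk.openmm
    except ImportError:
        return []
    return [Platform.getPlatform(i).getName()
            for i in range(Platform.getNumPlatforms())]


# Hypothetical machines, for illustration only.
print(choose_platform(["Reference", "CPU", "OpenCL"]))          # no CUDA available
print(choose_platform(["Reference", "CPU", "CUDA", "OpenCL"]))  # CUDA preferred

# On a machine with OpenMM installed, the chosen name would be passed to
# Platform.getPlatformByName(name) when building the Simulation.
print(choose_platform(openmm_platform_names()))
```

Selecting the platform is the easy part; the validation work described above is where the real cost would be.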

Re: FahCore_2 needlessly wasting CPU cycles.

Posted: Sat Apr 18, 2020 10:35 am
by Cryptoxic
rwh202 wrote:
Cryptoxic wrote:
PantherX wrote:Please note that, unfortunately, the CPU usage of FahCore_22 is caused by Nvidia's OpenCL implementation. It uses a spin-wait (polling) and will always use one CPU for that. I would suggest passing your feedback to the Nvidia developers so that they can update their drivers.
Hmm I used to run BOINC with GPUgrid and I noticed that they use OpenMM as well for their WUs but their cores only run on CUDA and Nvidia cards. Maybe the team could look into that?
Yes, OpenMM has options to run on the CUDA, OpenCL, or CPU platforms.

I'd like to think that the developers have at least tried the one-line change to see if it works. However, I appreciate it would then need a fair bit more testing to ensure results are consistent in all cases.
Yeah, I would totally be onboard to beta test the CUDA work units with my systems.