FahCore_2 needlessly wasting CPU cycles.
-
- Posts: 6
- Joined: Wed Apr 01, 2020 11:41 am
FahCore_2 needlessly wasting CPU cycles.
Good day,
Running FAH 7.5.1 on an AMD K10 CPU and a GeForce GT 630. I have configured FAH to run on the GPU only, so no CPU jobs. I noticed that FahCore_22.exe constantly uses 25% of the CPU time. Since the GPU does most of the work, I figured that 25% CPU load is a bit weird. So I used BES (Battle Encoder Shirase) to limit FahCore's CPU use to only 1%. Now the GPU is still grinding away at full speed at only 1% CPU load; I can see the progress bar moving at exactly the same speed. I can now allocate the 24% spare CPU capacity to BOINC or whatever. Configuring FAH to run both a GPU and a CPU task now takes only 26% of the CPU instead of 50%. If time allows, I think FahCore_22 could be modified to not waste 24% CPU time. In the meantime, use BES or other tools to prevent FahCore_22.exe from heating up our planet.
Hope this helps.
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB
Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
- Location: Land Of The Long White Cloud
- Contact:
Re: FahCore_2 needlessly wasting CPU cycles.
Welcome to the F@H Forum Reuzenkakatoe,
Please note that, unfortunately, the CPU usage of FahCore_22 is caused by Nvidia's OpenCL implementation. It uses spin-wait (polling) and will always occupy one CPU thread for that. I would suggest passing your feedback to the Nvidia developers so that they can update their drivers.
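To illustrate the difference between spin-waiting and a blocking wait, here is a minimal Python sketch. It is not Nvidia's driver code: the timer thread merely stands in for a GPU finishing a kernel, and `measure`, `spin_wait` and `blocking_wait` are hypothetical names made up for this example.

```python
import threading
import time


def spin_wait(done: threading.Event) -> None:
    # Polling: keep checking the flag, burning CPU time until it is set.
    while not done.is_set():
        pass


def blocking_wait(done: threading.Event) -> None:
    # Blocking: the thread sleeps until another thread sets the flag.
    done.wait()


def measure(wait_fn, work_seconds: float = 0.2) -> float:
    """Return the CPU seconds this process used while waiting."""
    done = threading.Event()
    # The timer stands in for the GPU signalling that its work is finished.
    timer = threading.Timer(work_seconds, done.set)
    cpu_start = time.process_time()
    timer.start()
    wait_fn(done)
    cpu_used = time.process_time() - cpu_start
    timer.join()
    return cpu_used


spin_cpu = measure(spin_wait)
block_cpu = measure(blocking_wait)
print(f"spin-wait CPU time:     {spin_cpu:.3f} s")
print(f"blocking wait CPU time: {block_cpu:.3f} s")
```

Both calls wait the same wall-clock time, but the polling variant keeps one CPU thread busy for the whole wait, which is the pattern described above.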
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
-
- Posts: 146
- Joined: Sun Jul 30, 2017 8:40 pm
Re: FahCore_2 needlessly wasting CPU cycles.
I still hope we can get a CUDA core, as I assume that NV is not really interested in making OpenCL as efficient as possible.
-
- Posts: 1996
- Joined: Sun Mar 22, 2020 5:52 pm
- Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
- Location: UK
Re: FahCore_2 needlessly wasting CPU cycles.
It wouldn't surprise me to find that a core is being developed to make better use of both Nvidia and AMD GPUs, taking into account their current specs and capabilities and coping with the challenges of each vendor's drivers as best it can. I guess many things can hold that up, though. I'll keep my fingers crossed for the GPU folding community.
Edit: Comments such as viewtopic.php?f=16&t=33986&start=15#p326666 by PantherX give me hope - but I won't hold my breath yet
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070
(Green/Bold = Active)
-
- Posts: 6
- Joined: Wed Apr 01, 2020 11:41 am
Re: FahCore_2 needlessly wasting CPU cycles.
Thanks to everybody for the clarification. At least I know that my own system is not to blame. It's really not a major issue and limiting the FahCore solves the heating issue for now. Thanks!
Re: FahCore_2 needlessly wasting CPU cycles.
Interesting. I tried this just now. Limiting the running fahcore_22 process to 50% took my TPF from 1:15 to 1:42. I have turned off BES.
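For scale, the TPF (time per frame) figures above can be converted into a slowdown factor. A small sketch using only the two numbers quoted in the post; `tpf_to_seconds` is a hypothetical helper written for this example.

```python
def tpf_to_seconds(tpf: str) -> int:
    """Convert an 'M:SS' time-per-frame string into seconds."""
    minutes, seconds = tpf.split(":")
    return int(minutes) * 60 + int(seconds)


before = tpf_to_seconds("1:15")  # unthrottled
after = tpf_to_seconds("1:42")   # fahcore_22 limited to 50%

slowdown = after / before             # how much longer each frame takes
throughput_loss = 1 - before / after  # share of frames per hour lost

print(f"{before} s -> {after} s per frame")
print(f"each frame takes {slowdown:.2f}x as long")
print(f"throughput drops by {throughput_loss:.1%}")
```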
single 1070
-
- Posts: 2522
- Joined: Mon Feb 16, 2009 4:12 am
- Location: Greenwood MS USA
Re: FahCore_2 needlessly wasting CPU cycles.
foldinghomealone2 wrote: I still hope we can get a CUDA core as I assume that NV is not really interested in making OpenCL as efficient as possible.
Doubling the number of cores will double the time to get a new core when needed.
If F@H writes a specific Nvidia CUDA core, an AMD OpenCL core, an Apple Metal core, and an ARM OpenCL core, every upgrade will be a slow process, as they all need to stay in sync.
We want all that; we just are not going to be happy living with all that. (The poor researcher stuck with the ARM OpenCL core is never going to finish his/her project.)
Last edited by JimboPalmer on Fri Apr 17, 2020 3:30 pm, edited 3 times in total.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
-
- Posts: 410
- Joined: Mon Nov 15, 2010 8:51 pm
- Hardware configuration: 8x GTX 1080
3x GTX 1080 Ti
3x GTX 1060
Various other bits and pieces
- Location: South Coast, UK
Re: FahCore_2 needlessly wasting CPU cycles.
HaloJones wrote: Interesting. I tried this just now. Limiting the running fahcore_22 process to 50% took my TPF from 1:15 to 1:42. I have turned off BES.
You're running some serious GPUs so I'd imagine any slowdown in the polling will hit them more.
The OP has listed a GT 630, so maybe 50 times less powerful - it won't be affected by having to wait a bit for a CPU cycle.
Re: FahCore_2 needlessly wasting CPU cycles.
While the statements about nVidia's support of OpenCL doing a spin-wait are true, it should be noted that periodically the FAHCore will temporarily increase the CPU above the 1-thread saturation that you're reporting. It's generally rather brief but if you happen to observe it, it is normal.
In the past, CUDA was supported, but that was long ago and I don't remember its pattern of CPU use. Development and ongoing support for extra versions of the FAHCore are treated as an unnecessary cost. A single 64-bit version of the OpenCL core, used on both Windows and Linux and supporting both AMD and nVidia GPUs, has been sufficient, and adding others to the release cycle has not been deemed essential.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 146
- Joined: Sun Jul 30, 2017 8:40 pm
Re: FahCore_2 needlessly wasting CPU cycles.
foldinghomealone2 wrote: I still hope we can get a CUDA core as I assume that NV is not really interested in making OpenCL as efficient as possible.
JimboPalmer wrote: Doubling the number of cores will double the time to get a new core when needed. If F@H writes a specific Nvidia CUDA core, an AMD OpenCL core, an Apple Metal core, and an ARM OpenCL core, every upgrade will be a slow process, as they all need to stay in sync. We want all that; we just are not going to be happy living with all that. (The poor researcher stuck with the ARM OpenCL core is never going to finish his/her project.)
I understand your points, but:
My assumption is that more than 80% of GPU-related research (~returned WUs) is done by NV-GPUs.
If CUDA offers more performance - which is also my assumption - there would be an increase in FAH's overall performance.
Therefore, instead of making different cores for every system, make two cores. One CUDA for the majority and one OpenCL for all others.
And I don't know why cores would need to stay in sync. When you have an OpenCL-core for AMD and Apple, it still can be computed by NV as NV is OpenCL compatible.
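The argument above can be put as a rough calculation. This is only a sketch: the 80% share is the poster's assumption, and the CUDA speedup factors below are purely hypothetical placeholders, not measured values. `overall_throughput_gain` is a name invented for this example.

```python
def overall_throughput_gain(nv_share: float, cuda_speedup: float) -> float:
    """Relative project-wide GPU throughput if only the NV share speeds up.

    nv_share     -- fraction of GPU work done on Nvidia GPUs (assumed)
    cuda_speedup -- hypothetical speedup of a CUDA core over OpenCL
    """
    if not 0.0 <= nv_share <= 1.0:
        raise ValueError("nv_share must be between 0 and 1")
    return nv_share * cuda_speedup + (1.0 - nv_share)


# Hypothetical speedups, chosen only to show how the numbers scale.
for speedup in (1.0, 1.1, 1.25):
    gain = overall_throughput_gain(0.80, speedup)
    print(f"CUDA speedup {speedup:.2f}x -> overall {gain:.2f}x")
```

Whether any such gain would outweigh the cost of maintaining two cores is the open question in this thread.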
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB
Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
- Location: Land Of The Long White Cloud
- Contact:
Re: FahCore_2 needlessly wasting CPU cycles.
foldinghomealone2 wrote: ...And I don't know why cores would need to stay in sync. When you have an OpenCL-core for AMD and Apple, it still can be computed by NV as NV is OpenCL compatible.
When a new version of FahCore is released, it is generally done with the intention of adding new scientific features which the researchers can use for their projects. Thus, when the OpenCL version gets updated, the CUDA version will also need to be updated to ensure that both FahCores can produce the same scientific information. Also, if the vendor changes its OpenCL implementation in a new driver, FahCore will need to be updated to support that; otherwise, it may not work. Finally, new architectures may require a new version of FahCore to allow that device to fold.
When a new FahCore is released, it needs to undergo extra testing to ensure that the results produced are matched experimentally. Occasionally, the same happens when a new version of the FahCore is released.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Re: FahCore_2 needlessly wasting CPU cycles.
Until NVidia provide the budget for a CUDA version with ongoing support as required I don't see there being a fork in the GPU core. I wish they would as it would be a trivial cost for them.
single 1070
Re: FahCore_2 needlessly wasting CPU cycles.
PantherX wrote: Please note that, unfortunately, the CPU usage of FahCore_22 is caused by Nvidia's OpenCL implementation. It uses spin-wait (polling) and will always occupy one CPU thread for that. I would suggest passing your feedback to the Nvidia developers so that they can update their drivers.
Hmm, I used to run BOINC with GPUgrid and I noticed that they use OpenMM as well for their WUs, but their cores only run on CUDA and Nvidia cards. Maybe the team could look into that?
-
- Posts: 410
- Joined: Mon Nov 15, 2010 8:51 pm
- Hardware configuration: 8x GTX 1080
3x GTX 1080 Ti
3x GTX 1060
Various other bits and pieces
- Location: South Coast, UK
Re: FahCore_2 needlessly wasting CPU cycles.
PantherX wrote: Please note that, unfortunately, the CPU usage of FahCore_22 is caused by Nvidia's OpenCL implementation. It uses spin-wait (polling) and will always occupy one CPU thread for that. I would suggest passing your feedback to the Nvidia developers so that they can update their drivers.
Cryptoxic wrote: Hmm, I used to run BOINC with GPUgrid and I noticed that they use OpenMM as well for their WUs, but their cores only run on CUDA and Nvidia cards. Maybe the team could look into that?
Yes, OpenMM has options to run on the CUDA or OpenCL (or CPU) platforms.
I'd like to think that the developers have at least tried the 1 line change to see if it works. However, I appreciate it would then need a fair bit more testing to ensure results are consistent in all cases.
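For readers unfamiliar with OpenMM, the "1 line change" refers to choosing the compute platform by name. A minimal Python sketch, assuming the `openmm` package (older releases import it as `simtk.openmm`); the import is guarded because it may not be installed. `choose_platform`, `list_openmm_platforms` and the preference order are hypothetical helpers for illustration, and nothing here is taken from the FahCore_22 source.

```python
def choose_platform(available, preference=("CUDA", "OpenCL", "CPU", "Reference")):
    """Return the first preferred platform name that is available."""
    for name in preference:
        if name in available:
            return name
    raise RuntimeError("none of the preferred platforms is available")


def list_openmm_platforms():
    """List the platform names OpenMM reports, or [] if OpenMM is missing."""
    try:
        from openmm import Platform
    except ImportError:
        return []
    return [Platform.getPlatform(i).getName()
            for i in range(Platform.getNumPlatforms())]


installed = list_openmm_platforms()
if installed:
    print("OpenMM platforms:", installed)
    print("would use:", choose_platform(installed))
else:
    print("OpenMM not installed; example choice:",
          choose_platform(["Reference", "CPU", "OpenCL"]))
```

With OpenMM installed, the chosen name would be passed to `Platform.getPlatformByName(name)` and the resulting platform handed to the `Simulation` constructor. As noted above, results from a different platform would still need validation before being used for science.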
Re: FahCore_2 needlessly wasting CPU cycles.
Cryptoxic wrote: Hmm, I used to run BOINC with GPUgrid and I noticed that they use OpenMM as well for their WUs, but their cores only run on CUDA and Nvidia cards. Maybe the team could look into that?
rwh202 wrote: Yes, OpenMM has options to run on the CUDA or OpenCL (or CPU) platforms. I'd like to think that the developers have at least tried the 1 line change to see if it works. However, I appreciate it would then need a fair bit more testing to ensure results are consistent in all cases.
Yeah, I would totally be on board to beta test the CUDA work units with my systems.