CPU slot only using E-cores on i7-12700
Moderator: Site Moderators
-
- Posts: 6
- Joined: Thu Jan 27, 2022 3:33 am
CPU slot only using E-cores on i7-12700
I just built a new machine with a fancy new Intel 12th gen i7-12700 processor. I was running F@H for a couple years on my old machine with a 3rd gen i5, and was looking forward to contributing more. However, when I started it up, the CPU slot refused to use any more than the 4 E-cores on the hybrid architecture! The client recognized I had a chip with 20 logical processors (according to the log), and I guess assigned me a work unit (WU 18402 (105, 4, 25)) based on that expected capability because it was the biggest CPU WU I'd ever received. But then it wouldn't use any more than the 4 E-core CPUs! (from what I read, the E-cores are the last ones shown in the multi-processor view in task manager.) I checked the slot information and it was set to -1 to allow as many as it wanted. I had to run about 14 hours a day for four days at 20% total load to get it done.
Now, I'm still using Windows 10 at the moment, because Windows 11 still needs some work tbh. I realize the hybrid architecture is all super-special and Win10 doesn't fully understand how to use it optimally, but I was under the impression that this meant Windows didn't know the difference between P-cores and E-cores and so just saw 20 identical processors instead of 16 of one and 4 of the other. I thought the main thing with Windows 11 was that it would recognize the difference, and dynamically reassign tasks between types as needed. What's more, when I use other programs (like doing exports from Lightroom), Windows seems quite happy to assign loads to any random processors it feels like, P- or E-, with or without Folding running. Although it does seem like it has *some* awareness of which are the E-cores, because as I sit here at idle, the 16 P-cores are "parked" according to Task Manager and I have minimal load on just the 4 E-cores. If Windows really saw them as identical, you'd think there'd be a little more randomness to the thread assignments. So maybe the issue is that the Folding threads get assigned to E-cores because they are very low priority, but then never get reassigned when load gets high because Windows 10 doesn't know how to do that.
But then, I created a new CPU slot in Folding with the # of CPUs set manually to 12, thinking that since the E-Cores were full with the first CPU slot, the 2nd slot would get assigned to the rest, or at least hoping it would go looking for 12 cores to use. Nope, it just piled the new WU on top of the other one for the E-cores to work on. I tried turning it to full power, and using "slightly higher" folding core priority, but it's still just on the E-cores (unless I need to set those preferences and then wait for a new WU to get pulled?). Granted, it is 100% load on the E-cores, and it seems like that alone is faster than my old processor, which was pegged to 100% all the time when folding!
I searched around the forum here for a bit, and I heard about the Process Lasso/Affinity tools, but I'd really prefer a native solution, if possible. Also would rather not have to go to Windows 11 right now, at least until they fix the taskbar. I'll try out Lasso if it's the only way, but I guess mostly I'm just curious why this is happening. I'm trying to understand my new hardware and how to use it effectively, as I'm having a couple other hiccups in the system as well. Is a new Folding client/core planned to resolve the issue? Maybe there are some "expert" configuration options I can apply? Any help, or even just explanation, is appreciated!
Re: CPU slot only using E-cores on i7-12700
I do not believe there are native/built-in tools to automate what you're asking, hence things like Process Lasso, but you can at least test the theory.
Open Task Manager, go to the Details tab, find the CPU folding core process (Core_A8), right-click it and select "Set affinity".
All of the CPU cores should be selected. Toggle these settings and watch the workload move around the processor.
These processor affinity settings are only temporary and will reset when that specific instance of the process is stopped for any reason.
I would suggest giving Process Lasso a go to automate what you did above; at least it's a free download.
I don't use Process Lasso. I have looked at it but didn't spend any time trying to figure it out, as I already had my own custom solution.
I wrote my own PowerShell script to assign processes to specific CPU cores/threads.
I have no idea if it will work with the new hybrid processors, but I imagine it will. It took me a fair bit of setup: group policy settings, permissions, Task Scheduler.
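For anyone scripting this instead of clicking through Task Manager, affinity is normally expressed as a bitmask with one bit per logical processor (this is the value that, for example, the .NET `ProcessorAffinity` property on a process object takes). The sketch below only computes such masks. It assumes the layout described in the first post for an i7-12700, with the 16 P-core threads numbered 0-15 and the 4 E-cores last at 16-19; that numbering is an assumption and should be checked on the actual machine. The function and constant names are made up for illustration.

```python
def affinity_mask(logical_cpus):
    """Return a bitmask with one bit set for each logical CPU index given."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask


# Assumed layout (not verified): P-core threads first, E-cores last.
P_THREADS = range(0, 16)   # 8 P-cores with Hyper-Threading
E_THREADS = range(16, 20)  # 4 E-cores, one thread each

p_mask = affinity_mask(P_THREADS)
e_mask = affinity_mask(E_THREADS)

print(hex(p_mask))  # 0xffff
print(hex(e_mask))  # 0xf0000
```

Applying `p_mask` to the folding core process would correspond to unticking the last four boxes in the "Set affinity" dialog.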
-
- Posts: 6
- Joined: Thu Jan 27, 2022 3:33 am
Re: CPU slot only using E-cores on i7-12700
ah, now we're cooking with gas! I think I somehow missed before when looking around the internet that "Set affinity" was an option natively available in Windows rather than added by a third party tool. I wasn't looking in the "Details" tab for it.
It looks like if I unselect all the E-cores it will start up on the P-cores for me, but if even one E-core is selected, it'll just use that one core, even though all the P-cores are also selected. OR if I set the priority to "Normal," it will start using ALL the cores, but "low" or "below normal" will be just E-cores. I kind of wish "below normal" would use all cores, that would resolve the overall issue with a Folding client setting I think...
Do you know why this is happening, or indeed even how, if Windows 10 isn't aware of what E-cores are? Does the on-chip thread director know what priority the OS sets for a process? Can anyone confirm that Windows 11 actually handles this better?
I'll see how annoyed I get at having to reset the process affinity every time, and possibly download Process Lasso. Sounds like it might be a generally useful program until I upgrade to Win11.
Cheers!
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: CPU slot only using E-cores on i7-12700
I talked with someone (from Intel) who knows more than me about it, and he said you have only two possibilities:
- on Windows 10, applications with normal priority will be sent to the P-cores. This is not something the client can do automatically, so you'll have to do it with a tool like Process Lasso or Bills2 Process Manager.
- Windows 11 is the definitive fix to this.
In any case, FAH is not aware of P-cores and E-cores, so by default it will try to run as many threads as it detects. If you know how Gromacs works, you know that it runs at the pace of the slowest thread in the mix. So if it uses all P-cores and E-cores in the same slot, the P-cores will spend their time waiting for the E-cores to finish ...
So here are my suggestions. Set a CPU folding slot and force the number of threads to the number of P-core threads (including HT if your CPU has it), then:
- on Windows 10, use a third-party tool to automatically set Fahcore_a8.exe priority to normal (and optionally its affinity to only P-core threads). The E-cores will be wasted ... but they can be used for something else, like feeding a GPU slot for instance ...
- on Windows 11, it should be handled automatically. The E-cores will be wasted ... but they can be used for something else, like feeding a GPU slot for instance ...
I'm still wondering what would happen in Windows 11 if you set one CPU slot with the number of P-core threads and another CPU slot with the number of E-core threads ...
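The "pace of the slowest thread" point can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the work for each step is split evenly across threads, every thread must finish before the next step starts, and the relative speeds are hypothetical numbers rather than measurements. It is not how Gromacs actually schedules work, and it ignores Hyper-Threading sharing and any load balancing.

```python
def step_time(work, thread_speeds):
    """Time for one step when work is split evenly and all threads must
    finish before the next step begins (the slowest thread sets the pace)."""
    share = work / len(thread_speeds)
    return max(share / speed for speed in thread_speeds)


P_SPEED = 1.0  # hypothetical relative speed of a P-core thread
E_SPEED = 0.5  # hypothetical relative speed of an E-core thread

p_only = [P_SPEED] * 16                 # slot limited to the 16 P-core threads
mixed = [P_SPEED] * 16 + [E_SPEED] * 4  # slot using all 20 threads

t_p_only = step_time(1.0, p_only)
t_mixed = step_time(1.0, mixed)

print(t_p_only)  # 0.0625
print(t_mixed)   # 0.1
```

With these made-up speeds, adding the four slower threads makes each step take longer, because the 16 fast threads finish their smaller share early and then wait.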
Re: CPU slot only using E-cores on i7-12700
There are some people aiming for PPD per watt, who claim that folding on the E-cores is very efficient - of course in that case they are running only one thread per E-core.
It would be nice if the client were aware of E- and P-cores, and chose the most efficient configuration based on the folding power slider. Might add that to the huge list of wishes, like native ARM and Metal folding on Mac.
Online: GTX 1660 Super + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 1050 Ti 4G OC, RX580
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: CPU slot only using E-cores on i7-12700
gunnarre wrote:
It would be nice if the client were aware of E- and P-cores, and chose the most efficient configuration based on the folding power slider. Might add that to the huge list of wishes, like native ARM and Metal folding on Mac.

Soon ©
-
- Posts: 6
- Joined: Thu Jan 27, 2022 3:33 am
Re: CPU slot only using E-cores on i7-12700
toTOW wrote:
I talked with someone (from Intel) who knows more than me about it, and he said you have only two possibilities:
- on Windows 10, applications with normal priority will be sent to the P-cores. This is not something the client can do automatically, so you'll have to do it with a tool like Process Lasso or Bills2 Process Manager.
- Windows 11 is the definitive fix to this.
In any case, FAH is not aware of P-cores and E-cores, so by default it will try to run as many threads as it detects. If you know how Gromacs works, you know that it runs at the pace of the slowest thread in the mix. So if it uses all P-cores and E-cores in the same slot, the P-cores will spend their time waiting for the E-cores to finish ...
So here are my suggestions. Set a CPU folding slot and force the number of threads to the number of P-core threads (including HT if your CPU has it), then:
- on Windows 10, use a third-party tool to automatically set Fahcore_a8.exe priority to normal (and optionally its affinity to only P-core threads). The E-cores will be wasted ... but they can be used for something else, like feeding a GPU slot for instance ...
- on Windows 11, it should be handled automatically. The E-cores will be wasted ... but they can be used for something else, like feeding a GPU slot for instance ...
I'm still wondering what would happen in Windows 11 if you set one CPU slot with the number of P-core threads and another CPU slot with the number of E-core threads ...

Thanks, this is illuminating. I am wondering if running Folding in full-hybrid might be a good way to go in terms of thermal performance. All that "waiting for E-cores" the P-cores are doing has allowed my CPU temp to drop by over 20°C! (but also added 50% to time remaining...) It was peaking around 88°C under un-throttled P-core load (though stable around 84). I may have to get another fan on the CPU cooler, or maybe use an AIO, especially in the summer when ambient temp in my apartment goes up by 6 or 7 degrees... also if/when I get a 30-series video card that starts dumping all sorts of heat into the case as well. Or I could always reduce the CPUs assigned to the slot... Anyway I'm sure there's lots of fiddling around I can try...
Anyway I will look into Process Lasso, as I'm almost on my 5th CPU work unit since 9 am and manually reassigning the affinities is already annoying, LOL!
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: CPU slot only using E-cores on i7-12700
Here's the feature description of Process Lasso for Alder Lake : https://bitsum.com/product-update/proce ... dler-lake/
Re: CPU slot only using E-cores on i7-12700
toTOW wrote:
- on Windows 10, applications with normal priority will be sent to the P-cores. This is not something the client can do automatically, so you'll have to do it with a tool like Process Lasso or Bills2 Process Manager.
- Windows 11 is the definitive fix to this.

I have a 12900K, with 8 big cores and 8 little cores. Win10 will use only the E-cores unless the process is manually assigned to just the P-cores with a 3rd-party utility. Win11 is NOT a solution. Win11 spreads the load very evenly across all cores, and it cannot differentiate between physical cores, virtual cores, and E-cores. For example, a 16-threaded F@H task will load each of the 24 threads to about 67%. HOWEVER! Using P-cores and E-cores together drops calculation speed by 0-25%, depending on the task, compared to running F@H only on P-cores. OOPS!
The easiest solution is to disable the E-cores in the BIOS. The best solution today is to use a process manager to set the affinity of the F@H client to only P-cores. The ideal solution is to wait for the F@H programmers to fix the client (which might take a few years...)
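The "about 67%" figure is consistent with simple arithmetic: 16 busy threads spread evenly over 24 logical processors. A minimal sketch, assuming the 12900K exposes 8 P-cores with two threads each plus 8 single-threaded E-cores (the function name is made up for illustration):

```python
def average_load_percent(busy_threads, logical_cpus):
    """Average per-CPU load if busy threads are spread evenly over all CPUs."""
    return 100.0 * busy_threads / logical_cpus


logical_cpus = 8 * 2 + 8  # 8 P-cores with HT plus 8 E-cores = 24
load = average_load_percent(16, logical_cpus)

print(round(load))  # 67
```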
Last edited by VRoman on Sat Feb 05, 2022 5:06 am, edited 3 times in total.
Re: CPU slot only using E-cores on i7-12700
gunnarre wrote:
There are some people aiming for PPD per watt, who claim that folding on the E-cores is very efficient - of course in that case they are running only one thread per E-core.

I found P-cores to be nearly 50% more power efficient in F@H. Yes, P-cores consume more total power, but they outperform E-cores by 2.2 times, so overall performance per watt is much better than on E-cores.
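Taking those two numbers at face value, the implied difference in power draw can be worked out, since performance per watt is performance divided by power. The sketch below treats "nearly 50% more efficient" as a ratio of 1.5, which is a rounding assumption, and uses the 2.2x speed figure from the post; nothing here is a new measurement.

```python
def implied_power_ratio(speed_ratio, efficiency_ratio):
    """Power ratio P/E implied by a speed ratio and a perf-per-watt ratio.

    efficiency = speed / power, so power_ratio = speed_ratio / efficiency_ratio.
    """
    return speed_ratio / efficiency_ratio


power_ratio = implied_power_ratio(speed_ratio=2.2, efficiency_ratio=1.5)

print(round(power_ratio, 2))  # 1.47
```

In other words, the claim amounts to the P-cores drawing roughly one and a half times the power while doing a bit more than twice the work.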
-
- Site Moderator
- Posts: 6359
- Joined: Sun Dec 02, 2007 10:38 am
- Location: Bordeaux, France
Re: CPU slot only using E-cores on i7-12700
In all cases, you have to manually set your CPU slot to use a number of threads equal to the number of P-core threads. Never leave it on automatic (which will use all available threads): Gromacs is not efficient when it mixes threads that don't run at the same speed.
With your CPU, you need a slot set to use 16 threads. You could optionally add a second slot with 8 threads that will be executed on the E-cores ...
I don't see any advantages to these big.LITTLE architectures outside the mobile world (smartphones and laptops) ...
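The thread counts suggested above follow directly from the core counts. A small sketch, assuming each P-core provides two threads when Hyper-Threading is enabled and each E-core provides one (the helper name is hypothetical):

```python
def slot_threads(p_cores, e_cores, hyperthreading=True):
    """Return (threads for the P-core slot, threads for an optional E-core slot)."""
    p_threads = p_cores * (2 if hyperthreading else 1)
    return p_threads, e_cores


print(slot_threads(8, 8))  # (16, 8)  12900K: 8 P-cores, 8 E-cores
print(slot_threads(8, 4))  # (16, 4)  i7-12700: 8 P-cores, 4 E-cores
```

Setting the slot size does not by itself decide which cores run the threads; on Windows 10 that still depends on priority or affinity, as discussed earlier in the thread.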
Re: CPU slot only using E-cores on i7-12700
gunnarre wrote:
There are some people aiming for PPD per watt, who claim that folding on the E-cores is very efficient - of course in that case they are running only one thread per E-core.
VRoman wrote:
I found P-cores to be nearly 50% more power efficient in F@H. Yes, P-cores consume more total power, however they outperform e-cores by 2.2 times and therefore overall performance per watt is way better than e-cores.

Don't mistake the PPD score for efficiency.
It's pretty likely that 2 E-cores will outperform 1 P-core, but the PPD system favors faster hardware over slower hardware.
Are you comparing 1 P-core vs 1 E-core?
Or a WU run on all P-cores vs a WU run on all E-cores?
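Why PPD can favor faster hardware even at equal raw throughput can be sketched with a quick-return bonus. The formula below is an assumption that does not come from this thread: it uses the commonly described square-root form, credit = base * max(1, sqrt(k * deadline / elapsed)). The base points, k, and deadline values are hypothetical.

```python
import math


def credit(base_points, k, deadline_days, elapsed_days):
    """Assumed quick-return bonus: faster returns earn a larger multiplier."""
    bonus = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus


def ppd(base_points, k, deadline_days, elapsed_days):
    """Points per day for a machine that finishes one WU every elapsed_days."""
    return credit(base_points, k, deadline_days, elapsed_days) / elapsed_days


# Hypothetical work unit: 1000 base points, k = 0.75, 3-day deadline.
slow = ppd(1000, 0.75, 3.0, elapsed_days=1.0)  # one slow slot
fast = ppd(1000, 0.75, 3.0, elapsed_days=0.5)  # one slot twice as fast

print(round(slow))      # 1500
print(round(fast))      # 4243
print(round(2 * slow))  # 3000
```

Under these assumptions, two slow slots complete the same number of work units per day as one fast slot but earn about 3000 PPD against about 4243, so comparing PPD per watt is not the same as comparing science done per watt.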