Initial impressions for fast ARM hardware
Moderators: Site Moderators, FAHC Science Team
-
- Site Moderator
- Posts: 1115
- Joined: Sat Dec 08, 2007 1:33 am
- Location: San Francisco, CA
Re: Initial impressions for fast ARM hardware
The efficiency cores just slow down the performance cores. You should set the slot to cpu:8.
You should get a passkey if you haven’t already. Once you complete 10 WUs, you should get more than 150k ppd.
Re: Initial impressions for fast ARM hardware
I've managed to max out the M1 Pro (10-core) CPU.
Configure -> Slots -> 8 CPUs (= number of P-cores)
Configure -> Advanced -> Priority slightly higher (to avoid E-cores)
Output is 145k PPD at 20W SoC power
Some observations:
P-cores are running at 2.1 GHz, fans are nearly inaudible at 1500 rpm, looks like OS throttled performance down
On cold start cores are running at 3 GHz and initial PPD estimation was around 300k PPD
P.S. it's still an emulated x86 process. When do we get native arm64 support?
-
- Site Admin
- Posts: 7936
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Initial impressions for fast ARM hardware
vzim wrote: P.S. it's still an emulated x86 process. When do we get native arm64 support?
My guess is sometime in 2022, but no idea on exactly when during the year that will happen.
For the client it will probably be when they release the next version of the client. Work was resumed recently on version 8, it had been halted in early 2020 as F@h started its response to COVID. I am also aware of some work being done towards creating an Apple Silicon version of the CPU folding core, but testing and distribution of that may take a while. Some changes on the server end may also need to be made.
There is already an ARM64 version of the CPU folding core out for ARM systems running Linux.
iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Re: Initial impressions for fast ARM hardware
+ Macs Fan Control to force min 2500 rpm
Final WU score: 227k PPD @30W or 7566 PPD/W
Properly optimized native version will be around 10k PPD/W
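For anyone who wants to reproduce the PPD/W figures, here is a minimal sketch of the arithmetic. The function name and the run labels are made up for illustration; the PPD and watt values are the SoC-power numbers reported in this thread.

```python
def ppd_per_watt(ppd: float, watts: float) -> float:
    """Return folding efficiency in points per day per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ppd / watts


# Figures reported in this thread (SoC power, not wall power).
# The labels are illustrative only.
runs = {
    "8 P-cores, first report": (145_000, 20),
    "8 P-cores, with Macs Fan Control": (227_000, 30),
}

for label, (ppd, watts) in runs.items():
    print(f"{label}: {ppd_per_watt(ppd, watts):.0f} PPD/W")
```

The second run works out to roughly 7566 PPD/W, matching the figure quoted above.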
-
- Site Admin
- Posts: 7936
- Joined: Tue Apr 21, 2009 4:41 pm
- Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
- Location: W. MA
Re: Initial impressions for fast ARM hardware
vzim wrote: Properly optimized native version will be around 10k PPD/W
Maybe. Rosetta 2 is pretty good at translating Intel code to the ARM code used on Apple Silicon, and it caches the translated code so it is not continuously incurring the overhead of translating running code. So it remains to be seen just how much overhead gets removed by running a native folding core.
Re: Initial impressions for fast ARM hardware
It's possible to run FAHClient exclusively on E-cores.
This involves some QoS magic and LaunchDaemons/org.foldingathome.fahclient.plist editing
MacBook Air M1
PPD est: ~24k (most likely overestimated)
Package power: 465 mW
At this power level I can keep it running 24/7 even on battery power
-
- Posts: 1996
- Joined: Sun Mar 22, 2020 5:52 pm
- Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
- Location: UK
Re: Initial impressions for fast ARM hardware
vzim wrote: At this power level I can keep it running 24/7 even on battery power
How well does it do with completing wus within timeout?
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070
(Green/Bold = Active)
Re: Initial impressions for fast ARM hardware
vzim wrote: At this power level I can keep it running 24/7 even on battery power
Neil-B wrote: How well does it do with completing wus within timeout?
It hasn't completed a single WU yet. After running WU 16955 overnight it's at 30% and 1.26 days left (ETA) with 4.46 days left to complete.
Update on power and PPD: 3.5k PPD @ 0.33 W
-
- Posts: 1996
- Joined: Sun Mar 22, 2020 5:52 pm
- Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
- Location: UK
Re: Initial impressions for fast ARM hardware
16955 timeout is three days with 5 day expiration ... completion within the timeout rather than expiration is preferable as at timeout the wu is assigned to another folder ... from your figures it looks like it will complete in 1.86 days and so within the timeout which is great
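A rough sketch of that check, for anyone who wants to redo it with their own numbers. It assumes the "4.46 days left to complete" figure is the time remaining until the 5-day expiration, so elapsed time is the difference between the two. The function name is made up for illustration.

```python
def projected_total_days(expiration_days: float,
                         days_until_expiration: float,
                         eta_days: float) -> float:
    """Estimate total days from assignment to completion.

    Assumes elapsed time = expiration window minus time left until expiration.
    """
    elapsed = expiration_days - days_until_expiration
    return elapsed + eta_days


TIMEOUT_DAYS = 3.0      # timeout quoted above for project 16955
EXPIRATION_DAYS = 5.0   # expiration quoted above for project 16955

total = projected_total_days(EXPIRATION_DAYS, 4.46, 1.26)
print(f"projected completion after about {total:.2f} days")
print(f"within timeout: {total <= TIMEOUT_DAYS}")
```

Under that assumption the projection lands at roughly 1.8 days, in the same ballpark as the estimate above and comfortably inside the 3-day timeout.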
Re: Initial impressions for fast ARM hardware
Recipe for cooking on E-cores
Edit /Library/LaunchDaemons/org.foldingathome.fahclient.plist so that FAHClient is launched through taskpolicy:
Code:
<key>ProgramArguments</key>
<array>
    <string>/usr/sbin/taskpolicy</string>
    <string>-c</string>
    <string>background</string>
    <string>/usr/local/bin/FAHClient</string>
</array>
Configure the CPU slot in FAH Control to use the proper number of cores (4 in my case)
Restart the folding service:
Code:
sudo launchctl unload /Library/LaunchDaemons/org.foldingathome.fahclient.plist
sudo launchctl load /Library/LaunchDaemons/org.foldingathome.fahclient.plist
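If you would rather script the plist edit than do it by hand, something like the sketch below might work. It uses Python's standard plistlib and only prepends the taskpolicy arguments shown above. The function name and the sample plist contents (including the Label value) are assumptions for illustration, not the real file shipped with the client. Back up the real plist before trying anything like this.

```python
import plistlib

# Arguments from the recipe above: run the client clamped to background QoS.
TASKPOLICY_PREFIX = ["/usr/sbin/taskpolicy", "-c", "background"]


def clamp_to_background(plist_bytes: bytes) -> bytes:
    """Return plist data whose ProgramArguments start with the taskpolicy prefix."""
    job = plistlib.loads(plist_bytes)
    args = list(job.get("ProgramArguments", []))
    # Avoid stacking the prefix twice if the file was already edited.
    if args[:len(TASKPOLICY_PREFIX)] != TASKPOLICY_PREFIX:
        args = TASKPOLICY_PREFIX + args
    job["ProgramArguments"] = args
    return plistlib.dumps(job)


# Hypothetical stand-in for the real LaunchDaemon plist.
sample = plistlib.dumps({
    "Label": "org.foldingathome.fahclient",
    "ProgramArguments": ["/usr/local/bin/FAHClient"],
})

patched = plistlib.loads(clamp_to_background(sample))
print(patched["ProgramArguments"])
```

Writing the result back to /Library/LaunchDaemons needs root, and the service still has to be restarted with the launchctl commands above.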
-
- Site Moderator
- Posts: 1115
- Joined: Sat Dec 08, 2007 1:33 am
- Location: San Francisco, CA
Re: Initial impressions for fast ARM hardware
Interesting.
Do you get the same result using ProcessType Background?
Does FAHControl work properly with the background clamped client?
What app do you use to get cpu core usage and clock speeds?
Thanks.
Re: Initial impressions for fast ARM hardware
calxalot wrote: Do you get the same result using ProcessType Background?
I haven't tested it. I suppose it should be equivalent in terms of CPU core scheduling. Not sure if there are additional restrictions applied by the OS for "background" daemons.
calxalot wrote: Does FAHControl work properly with the background clamped client?
I see no difference in FAHControl operation.
calxalot wrote: What app do you use to get cpu core usage and clock speeds?
sudo powermetrics
Re: Initial impressions for fast ARM hardware
vzim wrote: + Macs Fan Control to force min 2500 rpm. Final WU score: 227k PPD @30W or 7566 PPD/W. Properly optimized native version will be around 10k PPD/W
Are you measuring SOC power, or total system power?
The PPD values in PPD/W should be measured at the power socket, not from a program like HWMonitor that gets SOC power consumption.
It would be interesting to see which GPU system would be matching it in terms of PPD/W.
I would guess that if the above watt rating is SOC power, and not total system power, you're closer to 8k PPD/W, which should be close to the PPD/W of a GTX 1660 paired with a 35W Intel Celeron CPU, running at a total system power of 150W.
The desktop would be crunching out closer to 1.2M PPD though.
Re: Initial impressions for fast ARM hardware
I measure SoC power impact w/ FAH process active.
Power at wall (with screen sleep) is usually 2-5W higher
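As a rough sketch of what that means at the wall, assuming the 2-5 W overhead also applies to the 30 W run with 227k PPD, and taking the 1.2M PPD / 150 W desktop estimate above at face value. The function name is made up for illustration.

```python
def ppd_per_watt_range(ppd: float, soc_watts: float,
                       wall_overhead_watts: tuple = (2.0, 5.0)) -> tuple:
    """Return (worst, best) PPD/W at the wall for a range of overhead."""
    low_overhead, high_overhead = wall_overhead_watts
    best = ppd / (soc_watts + low_overhead)
    worst = ppd / (soc_watts + high_overhead)
    return worst, best


worst, best = ppd_per_watt_range(227_000, 30)
desktop = 1_200_000 / 150  # desktop estimate quoted earlier in the thread

print(f"M1 Pro at the wall: {worst:.0f} to {best:.0f} PPD/W")
print(f"GPU desktop estimate: {desktop:.0f} PPD/W")
```

So the wall-power figure drops somewhat below the SoC-only 7566 PPD/W, while the desktop estimate works out to 8000 PPD/W.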