WU taking too long on Odroid N2


UdoA
Posts: 12
Joined: Sun Feb 07, 2021 10:26 am

WU taking too long on Odroid N2

Post by UdoA »

Hello,

I am new to this forum, although I have been folding for about a year.
Now I have stumbled upon an issue with one of my folding workhorses.

It is the above-mentioned Odroid N2 running Armbian, with folding client 7.6.21 using all cores for some time now.
For the last few weeks, however, I have fairly often received WUs that take slightly longer than the time the server allows, so after about 3 days of work on the task the deadline passes and the work is thrown away.
I assume this is due to the big.LITTLE configuration of the N2's ARM64 CPU, which has 4 more powerful cores and 2 less powerful ones.

Is this known to the community, and how do I resolve it to avoid wasting CPU time?

Thanks and regards from Germany.

Udo
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WU taking too long on Odroid N2

Post by bruce »

I don't know the answer, but I do have a guess. What happens if you reconfigure through FAHControl, setting the number of cores to the number of powerful threads that your hardware has? I.e., tell FAH not to depend on the slow devices.

FAH folds at the speed of the slowest thread and the default setting may cause the server to assign projects that are too complex to run on the slower cores.
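
For anyone unsure how many "powerful threads" their board has, here is a rough sketch of one way to count them on Linux. It is my own illustration, not anything the FAH client does: it assumes an ARM kernel whose /proc/cpuinfo prints a "CPU part" line per logical CPU, and the helper name cores_by_part is made up for this example. As far as I know, the part values 0xd03 and 0xd09 correspond to Cortex-A53 and Cortex-A73, but please check that against your own board.

```python
from collections import Counter


def cores_by_part(cpuinfo_text):
    """Count logical CPUs per 'CPU part' value in /proc/cpuinfo-style text.

    Hypothetical helper for illustration; assumes the ARM layout where
    each processor entry carries a 'CPU part : 0x...' line.
    """
    counts = Counter()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines of the form "CPU part : 0xd09" are of interest here.
        if sep and key.strip() == "CPU part":
            counts[value.strip().lower()] += 1
    return dict(counts)
```

On the board itself one would call it as cores_by_part(open("/proc/cpuinfo").read()) and then set the CPU count in FAHControl to the size of the group that belongs to the big cores.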
UdoA
Posts: 12
Joined: Sun Feb 07, 2021 10:26 am

Re: WU taking too long on Odroid N2

Post by UdoA »

I just limited the CPU usage to 4 cores, and now the client stays within the time limit and the WU seems to run much faster. Apparently only the high-power cores are now used by the FAH client.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WU taking too long on Odroid N2

Post by bruce »

Thanks for the feedback. That's exactly what I hoped would happen.

Now I'm off to write an enhancement request unless somebody else has already done so.
JimboPalmer
Posts: 2522
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: WU taking too long on Odroid N2

Post by JimboPalmer »

[This is my understanding, but I am not a CPU designer and had no part in ARM CPU design]

big.LITTLE comes in 3 flavors:

1) The most primitive uses n LITTLE CPUs matched with n big CPUs. When program load exceeds the ability of a LITTLE CPU, all n CPUs are switched to big CPUs, so either all the LITTLE CPUs are active or all the big CPUs are.

2) A less primitive one uses n LITTLE CPUs matched with n big CPUs. When program load exceeds the ability of a LITTLE CPU, that LITTLE CPU is switched to a big CPU, so only n CPUs are active, but they can be a mix of LITTLE CPUs and big CPUs.

3) The most versatile has some LITTLE CPUs and some big CPUs, but they do not replace each other: the LITTLE CPUs are always active and the big CPUs are activated when needed. The number of active CPUs is not constant as in 1) and 2). The number of LITTLE and big CPUs does not have to be equal.

So in both 1) and 2) it is true that we only want to fold on the number of big CPUs. I am less sure about scenario 3). I think both the LITTLE and big CPUs have a NEON SIMD engine.

https://en.wikipedia.org/wiki/ARM_big.LITTLE
https://en.wikipedia.org/wiki/ARM_archi ... SIMD_(NEON)

[That is my understanding, but I am no expert]
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WU taking too long on Odroid N2

Post by bruce »

Experimentally, we have observed that the OS decides. If you run a heavy process using n threads (where you have n LITTLE threads), the dispatcher in the OS will schedule them on the LITTLE threads if it can. If you have M big threads and your heavy processes exceed n, then some threads will be processed by big threads and others by LITTLE threads, so some will finish first and will have to wait for the others to catch up. Thus FAH runs at the speed of the slowest thread being used. By limiting the number of threads used to <= the number of fast threads, synchronization of the various threads happens much sooner and the whole process runs faster. In short, allow FAH to avoid (or minimize) the use of slow threads.
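
To put rough numbers on "runs at the speed of the slowest thread", here is a toy model. It is only a sketch under stated assumptions, not how the FAH core actually divides its work: it assumes the work is split evenly across threads and that every thread must finish before the next step starts. The function names and the relative speeds (big = 1.0, LITTLE = 0.5) are invented for illustration, not measured on an N2.

```python
def step_time(work_per_thread, speeds):
    """Time for one synchronized step: all threads wait for the slowest one."""
    return max(work_per_thread / s for s in speeds)


def throughput(total_work, speeds):
    """Work done per unit time when total_work is split evenly across threads."""
    per_thread = total_work / len(speeds)
    return total_work / step_time(per_thread, speeds)


# Hypothetical relative core speeds, chosen only to illustrate the effect.
BIG, LITTLE = 1.0, 0.5

four_big = throughput(12.0, [BIG] * 4)                   # 4 threads, all fast
six_mixed = throughput(12.0, [BIG] * 4 + [LITTLE] * 2)   # 6 threads, 2 slow
print(four_big, six_mixed)
```

In this model throughput is just the thread count times the slowest speed, so with these made-up speeds four fast threads beat six mixed ones. Six threads would only win if a LITTLE core ran at more than 4/6 of a big core's speed.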