Re: Project: 14576 (Run 0, Clone 2096, Gen 48)
Posted: Wed May 13, 2020 11:34 pm
Many years ago, FAH had an option specifically aimed at systems with large numbers of CPUs (often with NUMA memory designs) that was colloquially referred to as "bigadv". One of the requirements imposed on those projects was to prevent the allocation of distinct CPUs for PME (i.e., -npme=0). Since that time, most CPU projects have run on systems with small numbers of threads. With the advent of home-based systems with high CPU counts, we have re-entered that era.
_r2w_ben wrote: I don't believe this is a GROMACS issue. The parameters FAHclient passes to mdrun results in PME being used. It allows for better utilization of high thread counts. PME could be disabled by passing -npme = 0 but would cause this problem to occur more often.
GROMACS is designed for the scientist running on a dedicated computer where he can freely adjust all of the parameters. When GROMACS was adopted by FAH, it inherited a requirement that projects must run on systems with an unpredictable number of threads, with nobody around who is responsible for intercepting a job that doesn't want to work on certain numbers of CPU threads and correcting that value. As CPU counts grew, it was easy to establish rules that intercepted 7 and 11 and 13 and 14 (and maybe 5 and 10) and reduced the requested number of threads by 1 or 2, allowing the job to run. Those rules are based on both GROMACS policies and direct observation. (5 is often an acceptable factor. Early testing of a project may find that 20 does or does not work, and the project can then be excluded from assignment to systems with that CPU count.)
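To make the kind of rule described above concrete, here is a minimal Python sketch. It is not the actual FAH Core Wrapper logic; `adjust_threads`, `prime_factors`, and the `allowed_primes` parameter are hypothetical names, and the assumption is simply that a thread count is usable when all of its prime factors are in an allowed set (2, 3, and optionally 5).

```python
def prime_factors(n):
    """Return the set of prime factors of n (for n >= 2)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def adjust_threads(requested, allowed_primes=(2, 3, 5)):
    """Hypothetical rule: step down to the largest count <= requested
    whose prime factors are all in allowed_primes."""
    if requested < 1:
        raise ValueError("requested must be at least 1")
    allowed = set(allowed_primes)
    n = requested
    while n > 1 and not prime_factors(n) <= allowed:
        n -= 1
    return n


# The counts mentioned in the post, treating 5 as an acceptable factor.
for requested in (7, 11, 13, 14):
    print(requested, "->", adjust_threads(requested))

# Treating 5 as unacceptable changes the outcome for some counts.
print(26, "->", adjust_threads(26, allowed_primes=(2, 3)))
```

Under these assumptions, 7, 11, 13 and 14 each drop by 1 or 2, which matches the behaviour described. Whether 5 belongs in the allowed set is exactly the per-project uncertainty mentioned above, so it is left as a parameter.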
The current procedures that FAH uses have not been extended to large values, nor do they accommodate -npme adjustments. 48 is a perfectly acceptable value PROVIDED PME does not reduce it to 40. Perhaps you'd like to explain when GROMACS is going to change the number of PP threads and when it isn't?
For example, 49 contains the prohibited factor of 7, so the Core Wrapper will probably reduce it to 48, but that doesn't work because GROMACS will then re-adjust it. Will 48 ALWAYS become 40 or is that unpredictable?
For systems with fewer than 24 threads (or maybe fewer than 32), I think the current system works provided npme=0.
Are we doing it wrong? Suppose we start with 26 (13 is a factor). Reducing it to 25 is bad (5 is an unpredictable factor) but 24 is officially acceptable. When npme=0, the actual PME calculations are still performed, just not by dedicated threads. Can we allocate npme=2 to the 26 we started with and will GROMACS be more efficient but still complete the PME calculations on the PP threads?
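The arithmetic behind that question can be sketched in Python. This only does the bookkeeping: it assumes the PP count is simply the total minus npme and that a PP count is "safe" when its largest prime factor is at most a chosen limit. It says nothing about what GROMACS itself would choose or accept, and `split_options`, `largest_prime_factor`, `max_npme`, and `max_factor` are hypothetical names.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n, or 1 when n == 1."""
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    # Whatever remains above 1 is a prime larger than any factor found so far.
    return max(largest, n) if n > 1 else largest


def split_options(total, max_npme=8, max_factor=3):
    """List (npme, pp) pairs where pp = total - npme has no prime
    factor larger than max_factor (an assumed acceptance test)."""
    options = []
    for npme in range(0, max_npme + 1):
        pp = total - npme
        if pp >= 1 and largest_prime_factor(pp) <= max_factor:
            options.append((npme, pp))
    return options


# 26 threads: which small npme values leave a PP count built only from 2s and 3s?
print(split_options(26, max_npme=4))

# 48 threads: npme=8 leaves 40 PP threads, which brings in the factor 5.
print(split_options(48, max_npme=8))
print(split_options(48, max_npme=8, max_factor=5))
```

With 5 excluded, 26 threads has only the npme=2 split (24 PP threads) among small npme values, and 48 threads is only acceptable with npme=0; the 48 to 40 case shows up only once 5 is allowed as a factor.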