[I am not an authority; this is just my silly, wild-assed guess.]
I believe core_a7 supports AVX2 as well as SSE2. GROMACS 2018 adds the following CPU enhancements (quoted from the GROMACS documentation linked below):
Achieved speedup on Intel KNL processors of around 11% for PME spread/gather on typical simulation systems.
In the simple case of leap-frog without pressure coupling and with at most one temperature-coupling group, the update of velocities and coordinates is now implemented with SIMD intrinsics for improved simulation rate.
AMD Ryzen appears to always perform slightly better with OpenMP than MPI, up to using all 16 threads on the 8-core die.
While Ryzen supports 256-bit AVX2, the internal units are organized to execute either a single 256-bit instruction or two 128-bit SIMD instructions per cycle. Since most of our kernels are slightly less efficient for wider SIMD, this improves performance by roughly 10%.
On AMD Zen, tabulated Ewald kernels are always faster than analytical. And with AVX2_256 2xNN kernels are faster than 4xN. These faster choices are now made based on CpuInfo at run time.
The group-scheme kernels can use AVX instructions from either the AVX_128_FMA or the AVX_256 extensions. But hardware that supports the new AVX2_128 extensions also supports AVX_256, so we enable such support for the group-scheme kernels.
Recent Intel x86 hardware can have multiple AVX-512 FMA units, and the number of those units and the way their use interacts with the way the CPU chooses its clock speed mean that it can be advantageous to avoid using AVX-512 SIMD support in GROMACS if there is only one such unit. Because there is no way to query the hardware to count the number of such units, we run code at CMake and mdrun time to compare the performance from using such units, and recommend the version that is best. This may mean that building GROMACS on the front-end node of the cluster might not suit the compute nodes, even when they are all from the same generation of Intel’s hardware. -
http://manual.gromacs.org/documentation ... mance.html
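To make the last quoted item concrete: since the number of AVX-512 FMA units cannot be queried, the choice is made by timing. Here is a rough sketch of that "run both and keep the faster one" idea. It is only my illustration, not GROMACS code; `best_time`, `recommend` and the candidate names are hypothetical, and the workloads you pass in stand in for the real kernels.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest of `repeats` wall-clock timings of fn(), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def recommend(candidates, repeats=5):
    """Time every candidate callable and return (fastest_name, all_timings).

    `candidates` maps a label (for example a SIMD build name) to a
    zero-argument callable that runs a representative workload.
    """
    timings = {name: best_time(fn, repeats) for name, fn in candidates.items()}
    fastest = min(timings, key=timings.get)
    return fastest, timings
```

Taking the minimum over several runs rather than the average filters out most scheduling noise. The result is only valid for the machine that ran the timing, which is exactly why a build made on a front-end node may not suit the compute nodes.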
Notice that some of these improvements inspect CpuInfo at run time and pick the best option, which is easy for F@H to use. Others have to be fixed at compile time (CMake), so they can make bad choices if the binary runs on any CPU other than the one it was compiled on. F@H will not easily take advantage of those optimizations.
I do not believe F@H changes the GROMACS version within a core, but I do think they always use the latest stable version for a new core. (I do not know if any new CPU core is far enough along to have locked in which version of GROMACS to use.)
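For anyone wondering what a run-time choice looks like, here is a minimal sketch, assuming Linux-style `/proc/cpuinfo` text. The functions `read_field` and `pick_simd` are names I made up, and the decision rules are a simplification based only on the notes quoted above (for example, treating every AMD CPU with AVX2 as a Ryzen). GROMACS has its own detection code that is more thorough.

```python
def read_field(cpuinfo_text, key):
    """Return the value of the first 'key : value' line, or '' if absent."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == key:
            return value.strip()
    return ""


def pick_simd(flags, vendor=""):
    """Pick a SIMD level label from a set of CPU feature flags.

    Labels follow GROMACS naming; the rules themselves are illustrative.
    """
    if "avx2" in flags:
        # Assumption: AMD + AVX2 means Ryzen, where 128-bit wide AVX2
        # is reported to be roughly 10% faster than 256-bit.
        return "AVX2_128" if vendor == "AuthenticAMD" else "AVX2_256"
    if "avx" in flags:
        return "AVX_256"
    if "sse4_1" in flags:
        return "SSE4.1"
    if "sse2" in flags:
        return "SSE2"
    return "None"


def detect(cpuinfo_text):
    """Combine the two steps: parse the text, then choose a SIMD level."""
    flags = set(read_field(cpuinfo_text, "flags").split())
    return pick_simd(flags, read_field(cpuinfo_text, "vendor_id"))
```

On a Linux machine you could call `detect(open("/proc/cpuinfo").read())`. The point is that this decision happens on the machine that runs the work, so one binary can adapt to whatever CPU it lands on. A compile-time choice cannot do that.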