How useful are CPUs to the science?

Post by **Sarr** » Tue Apr 21, 2020 9:41 pm

I know that when optimizing builds for maximum PPD, people tend to focus on GPUs and on getting as many of them running as possible. This makes sense because they produce the most PPD. But I know that GPU and CPU projects are separate and use separate software. What I want to know is: how vital are the CPU projects to the science being done? Some of them say in the description that they supplement GPU projects, or something similar. Hypothetically, if there were no more CPUs running on the project, would things grind to a halt because GPUs alone can't do calculations that can only be done on a CPU? And would it follow that getting many CPUs, or larger multi-core CPUs, online would still be a big help to a project, even if it isn't the most efficient route PPD-wise? Or would going out of my way to get many CPUs going just not be anywhere near as useful as focusing my efforts on GPUs?

Post by **bruce** » Tue Apr 21, 2020 10:00 pm

GPUs are scientifically very valuable.

Certain types of projects run more effectively on CPUs than on GPUs, and the scientists make an intelligent choice about which pool of hardware will give them the results they want most quickly and most efficiently. Also, some are much more familiar with running projects on GROMACS or on OpenMM and may choose one or the other for that reason.

Read lots of project descriptions and you might find a pattern (or might not). I doubt it will be explicitly explained, though. Both will work, and there's some overlap in the type of protein that works best.

If there were no CPUs, I believe those projects could be restructured, but some would not run efficiently.

Also, there are specific analysis tools built into each FAHCore that might not be built into the other platform. Their development histories are not identical.

Currently, there is a development effort going on because some project(s) need a different type of internal function. It's being added to a future version of a FAHCore, so there will be projects that can't be done on the other FAHCore ... at least until somebody decides to develop that function on the other core. Both are Open Software, so that may or may not happen.

Post by **JimboPalmer** » Tue Apr 21, 2020 10:30 pm

I am going to explain what I know, but understand I am not a molecular modeler, just a programmer who spent 36 years doing Inventory and General Ledger applications. I know enough to be dangerous.

In old 'traditional' programming you would do repetitive calculations in a loop, over and over again, until you got a result. Around 2000, CPUs added SIMD hardware, which applies one instruction to multiple data items at once and cuts the time spent looping.
https://en.wikipedia.org/wiki/SIMD

Modern CPUs can do 8 single-precision operations at once per thread and may have as many as 256 threads (although 8 is more common). So potentially a CPU using avx_256 is about 2000 times as fast as a CPU using normal math.
https://en.wikipedia.org/wiki/Advanced_ ... Extensions
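The arithmetic behind those figures can be sketched as follows. This is only a back-of-the-envelope upper bound: it assumes 32-bit single-precision values packed into a 256-bit or 512-bit register, and it ignores memory bandwidth, clock speeds, and everything else that keeps real code below the theoretical peak. The function names are made up for this illustration.

```python
FLOAT_BITS = 32  # size of one single-precision value


def lanes(register_bits: int) -> int:
    """How many single-precision values fit in one SIMD register."""
    return register_bits // FLOAT_BITS


def peak_speedup(register_bits: int, threads: int) -> int:
    """Theoretical best case versus one thread doing one value at a time."""
    return lanes(register_bits) * threads


print(lanes(256))              # 8 values per instruction (avx_256)
print(lanes(512))              # 16 values per instruction (avx_512)
print(peak_speedup(256, 256))  # 2048, the "about 2000 times" figure
print(peak_speedup(256, 8))    # 64, for the more common 8-thread CPU
```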

avx_512 offers 16 at once, but F@H does not use it yet, and it is not widely used except in servers.

A GPU has a much 'dumber' instruction set, but can do an instruction on as many as 5000 data points at once.
https://en.wikipedia.org/wiki/General-p ... sing_units
https://en.wikipedia.org/wiki/OpenCL

So far, as you can see, this favors GPUs over CPUs. But the logic part of a GPU program has only the 'dumb' operations of the GPU to work with, while the CPU's SIMD array has an entire modern CPU right there at hand to do the logic. If you try to use the CPU's logic to help the GPU, you have to move data across the PCIe bus. This slows the GPU code quite a bit compared to the CPU code, so it is not as lopsided as it seems at first.

[This example used to be true; I am not sure it is still true.] As an example of brute force versus subtle ability, GPUs have to consider the water the protein is in as a continuous fluid (called implicit solvation), while CPUs can consider each water molecule as an item (explicit solvation), giving much more accurate results. So some parts of the art of modeling medicines to combat disease are faster on GPUs, and some parts need the better logic of CPUs to get finer detail.
https://en.wikipedia.org/wiki/Implicit_solvation
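A rough way to see why explicit solvation is so much more work is to count atoms. The sketch below uses made-up atom counts (not taken from any real F@H project) and a naive all-pairs interaction count. Real molecular dynamics engines use cutoffs and neighbor lists, so actual cost grows more slowly than this, and an implicit solvent model adds some extra terms of its own. Treat it as an illustration of scale only.

```python
def pair_count(n_atoms: int) -> int:
    """Number of distinct atom pairs if every atom interacts with every other."""
    return n_atoms * (n_atoms - 1) // 2


# Hypothetical system sizes, chosen only for illustration.
protein_atoms = 5_000
water_molecules = 20_000
atoms_per_water = 3  # two hydrogens and one oxygen

# Implicit solvation: the water is a continuous fluid, so only protein atoms are tracked.
implicit_atoms = protein_atoms

# Explicit solvation: every atom of every water molecule is tracked as well.
explicit_atoms = protein_atoms + water_molecules * atoms_per_water

print("implicit:", implicit_atoms, "atoms,", pair_count(implicit_atoms), "pairs")
print("explicit:", explicit_atoms, "atoms,", pair_count(explicit_atoms), "pairs")
```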

Post by **bruce** » Wed Apr 22, 2020 2:52 am

An implicit solvent calculation is a lot simpler than repeating the same equations for each atom in each molecule. That math is easy for the CPU but less accurate.

For GPUs, the solvent is typically many molecules of water, perhaps with some dissolved trace ions. Just considering the three atoms in H2O, those three atoms do not line up along a straight line, so the molecule is somewhat polar, with the positive charge on one side and the negative charge on the other side. Thus the molecule probably needs to rotate somewhat while other things are happening. That's A LOT more parallel calculations, but it's also more accurate scientifically.
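To make the point about the bent shape concrete, here is a small sketch that computes the dipole moment of a three-point-charge water model. The charges and geometry are commonly quoted TIP3P-style values and are an assumption for this illustration; nothing in this thread says which water model any particular project uses. With the oxygen at the origin, the sideways components of the two hydrogens cancel and only the component along the bisector of the H-O-H angle survives.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.803  # 1 elementary charge * 1 angstrom is about 4.803 debye


def water_dipole_debye(q_h: float, r_oh: float, angle_deg: float) -> float:
    """Dipole of a 3-site water: oxygen (charge -2*q_h) at the origin,
    two hydrogens (charge +q_h each) at distance r_oh, separated by angle_deg."""
    half_angle = math.radians(angle_deg) / 2
    return 2 * q_h * r_oh * math.cos(half_angle) * E_ANGSTROM_TO_DEBYE


# Assumed TIP3P-style parameters: charge in units of e, bond length in angstroms.
Q_H = 0.417
R_OH = 0.9572

bent = water_dipole_debye(Q_H, R_OH, 104.52)     # the real, bent geometry
straight = water_dipole_debye(Q_H, R_OH, 180.0)  # imaginary straight-line water

print(f"bent water:     {bent:.2f} D")
print(f"straight water: {straight:.2f} D")
```

If the three atoms did sit on a straight line, the two hydrogen contributions would cancel and the molecule would have essentially no dipole, so there would be nothing to rotate toward nearby charges.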

Proteins, too, have portions which are hydrophobic or hydrophilic. That turns out to be important to FAH. See Wikipedia if you're still interested.
