Your thoughts....

psaam0001 · Post by **psaam0001** » Sun Dec 06, 2020 11:37 pm

So, I went looking around AMD's site, hoping I could order the processor I need to finish rebuilding my former Windows 7 folding machine directly from AMD. And I stumbled across this: (https://www.amd.com/en/products/server- ... inct-mi100).

Would any of you have thoughts about these new data center cards potentially being used by someone for F@H or BOINC projects? You may have to view the specs sheet (which I'm sure was written to add a lot of marketing hype to sell the product), to give an assessment.

Paul

MeeLee · Post by **MeeLee** » Mon Dec 07, 2020 3:35 am

They possibly could do it, but at 23Tflops; which is a bit slower than a 2080Ti; AND it does it at 300W, vs the 2080Ti 275W (which can be lowered to 190W and still be faster).
I wouldn't buy this, if it was just for folding.
However, if you have the hardware, I see no reason why it wouldn't fold.
All it needs is an OpenCL driver.

psaam0001 · Post by **psaam0001** » Mon Dec 07, 2020 5:43 am

My thoughts: I'll stick to what I can safely spend the money on. And right now, I'm working on buying serious upgrade parts for what was my Win 7 folding rig.

Even the RTX 3090 I wanted will have to wait until I'm done paying for that AMD Ryzen 9 16-core CPU; motherboard; CPU cooling fan & heat sink; and a strong power supply.

Paul

aetch · Post by **aetch** » Mon Dec 07, 2020 6:31 am

I wouldn't buy the Instinct either.
It's powerful for FP16 and FP64 but not FP32, which is kinda important as FAH does a lot of work on FP32.
Cooling: the Instinct is passively cooled, so it relies on the server fans to push air through it.
It has no rear ports, so you cannot plug a monitor into it.
Cost: if you gotta ask...

foldy · Post by **foldy** » Mon Dec 07, 2020 8:23 am

Server GPUs like AMD Radeon Instinct are very expensive, because of AMD support and because they carry a lot of expensive VRAM, which FAH does not use. A consumer GPU like the AMD RX 6800 XT or Nvidia RTX 3080 is always the better choice.

psaam0001 · Post by **psaam0001** » Mon Dec 07, 2020 2:13 pm

I'll take practicality over performance, when the budget constraints bring my dreams back to the real world.

So, I should be able to afford an RTX 2080 or better consumer-level Nvidia card by April.

Paul

Yeroon · Post by **Yeroon** » Mon Dec 07, 2020 6:13 pm

MeeLee wrote:They possibly could do it, but at 23Tflops; which is a bit slower than a 2080Ti; AND it does it at 300W, vs the 2080Ti 275W (which can be lowered to 190W and still be faster).
I wouldn't buy this, if it was just for folding.
However, if you have the hardware, I see no reason why it wouldn't fold.
All it needs is an OpenCL driver.

The 2080 Ti's FP32 is only 13.5 TF, not 23 TF, and it manages less than half a TF in FP64.
The MI100, at 23.1 TF FP32 and 11.5 TF FP64, would significantly outpace the 2080 Ti if specs were all that mattered. Given that FAH cores run better on Nvidia, with OpenCL or now also with CUDA, this card would possibly still lose in perf/watt even if it were reasonably priced and accessible.
However, I doubt this card is anywhere close to a price anyone folding would want to pay (~$6400 for low-volume units).

EDIT: specs are FMA32 at 23.1Tflops, and FP32 at 46.1Tflops. I am not sure which version we'd be comparing for FAH compute.
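
For anyone who wants to redo the paper comparison, here is a minimal Python sketch of the arithmetic. It only uses the spec-sheet figures quoted in this thread (23.1 TF at 300 W for the MI100, 13.5 TF at 275 W for the 2080 Ti, roughly $6400 for the MI100); the function names are made up for illustration, and none of this says anything about measured FAH throughput.

```python
# Spec-sheet figures quoted in this thread (peak FP32 TFLOPS, rated board power in W).
# These are paper numbers, not measured FAH results.
CARDS = {
    "MI100": {"tflops_fp32": 23.1, "watts": 300},
    "RTX 2080 Ti": {"tflops_fp32": 13.5, "watts": 275},
}

MI100_PRICE_USD = 6400  # approximate low-volume price quoted in the thread


def gflops_per_watt(tflops: float, watts: float) -> float:
    """Peak FP32 GFLOPS per watt of rated board power."""
    return tflops * 1000.0 / watts


def usd_per_tflop(price_usd: float, tflops: float) -> float:
    """Purchase price per peak FP32 TFLOPS."""
    return price_usd / tflops


if __name__ == "__main__":
    for name, spec in CARDS.items():
        eff = gflops_per_watt(spec["tflops_fp32"], spec["watts"])
        print(f"{name}: {eff:.1f} GFLOPS/W (paper)")
    cost = usd_per_tflop(MI100_PRICE_USD, CARDS["MI100"]["tflops_fp32"])
    print(f"MI100: ${cost:.0f} per peak TFLOPS")
```

On paper the MI100 comes out ahead per watt, which is exactly why the caveat about how FAH cores actually run on each vendor's hardware matters more than this calculation.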

MeeLee · Post by **MeeLee** » Tue Dec 08, 2020 4:39 pm

Yeroon wrote:
MeeLee wrote:They possibly could do it, but at 23Tflops; which is a bit slower than a 2080Ti; AND it does it at 300W, vs the 2080Ti 275W (which can be lowered to 190W and still be faster).
I wouldn't buy this, if it was just for folding.
However, if you have the hardware, I see no reason why it wouldn't fold.
All it needs is an OpenCL driver.
2080ti fp32 is only 13.5TF, not 23TF, and less than 1/2 a TF in fp64
MI100 at 23.1TF fp32, and 11.5TF fp64 would significantly outpace the 2080ti, if specs were all that mattered. Given that fah cores run better on Nvidia, opencl or now also with cuda, even if this card were reasonably priced/accessible, would possibly lose in perf/watt.
However, i doubt this card is anywhere close to a price anyone folding would want to pay. (~$6400 for low volume units)

EDIT: specs are FMA32 at 23.1Tflops, and FP32 at 46.1Tflops. I am not sure which version we'd be comparing for FAH compute.

Thanks for catching that. I just googled the results instead of going to a reliable source like techpowerup.com (which ironically appears to be down right now, lol).
Anyway, 23 Tflops is somewhere between a 3070 and a 3080, and those two combined cost the same as a 2080Ti.

bruce · Post by **bruce** » Tue Dec 08, 2020 5:42 pm

I'm not sure what percentage of FAH's GPU performance depends on FP32, but it's a big number. The percentages that depend on FP64, on "other OPs", and on PCIe bus configuration are all small and probably depend on the exact protein you're folding. For that reason, I tend to ignore FP64 numbers, although they're easy to find and you can legitimately argue that they matter (but not much).

Theoretical performance numbers may be a good sales tool, but they are never an accurate measure of FAH's performance on a variety of proteins anyway.

Yeroon · Post by **Yeroon** » Tue Dec 08, 2020 9:29 pm

From AMD's CDNA whitepaper in reference to the FMA32 vs FP32, I think my previous numbers were switched (FP32 @ 23.1Tflops should be correct).
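
As a rough check on where a 23.1 Tflops figure can come from: peak FP32 is conventionally quoted as lanes × 2 × clock, because one fused multiply-add (FMA) counts as two floating-point operations. The Python sketch below assumes lane counts and boost clocks taken from public spec sheets (7680 stream processors at 1502 MHz for the MI100, 4352 CUDA cores at 1545 MHz for the 2080 Ti); those four numbers are not from this thread, so treat them as assumptions.

```python
def peak_fp32_tflops(fp32_lanes: int, clock_mhz: float) -> float:
    """Peak FP32 TFLOPS, counting each FMA as two floating-point operations."""
    flops_per_lane_per_clock = 2  # one multiply + one add per FMA
    return fp32_lanes * flops_per_lane_per_clock * clock_mhz * 1e6 / 1e12


# Assumed from public spec sheets, not from this thread:
# (FP32 lanes, boost clock in MHz)
MI100 = (7680, 1502)
RTX_2080_TI = (4352, 1545)

if __name__ == "__main__":
    print(f"MI100:       {peak_fp32_tflops(*MI100):.2f} TFLOPS")
    print(f"RTX 2080 Ti: {peak_fp32_tflops(*RTX_2080_TI):.2f} TFLOPS")
```

Under those assumptions the formula lands close to the 23.1 and 13.5 figures discussed above, which suggests the 23.1 number already includes the factor of two for FMA.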

Anyways, this is the best I could find in there about FMA32:

As Figure 5 illustrates, the CUs are augmented with new matrix engines to handle the MFMA instructions and boost throughput and energy efficiency. The matrix execution unit has several advantages over the traditional vector pipelines in GCN. First, the execution unit reduces the number of register file reads, since in a matrix multiplication many input values are re-used. Second, the narrower datatypes create a huge opportunity for workloads that do not require full FP32 precision, e.g., machine learning.

Ampere FP32 numbers aren't directly comparable to Turing (2080 Ti) numbers.

HW times Ampere deep dive

Each of the four partitions in an SM has two datapaths or pipelines; One with a cluster of 32 CUDA cores purely dedicated to FP32 operations while another that can do both, FP32 or INT32. This means that the 2x FP32 or 128 FMA per clock of performance that NVIDIA is touting will only be true when the workloads are purely composed of FP32 instructions which is rarely the case. This is why we don’t see an increase of 2x in performance despite the fact that the core count increases by the same figure.
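
The effect described in that quote can be illustrated with a toy model in Python. It is only a sketch under loud assumptions: both generations are treated as having two equal-width datapaths per partition, Turing is assumed to have one FP32-only path and one INT32-only path, Ampere one FP32-only path and one shared FP32/INT32 path, and clocks, memory, and scheduling are ignored entirely. The function names are hypothetical.

```python
def turing_cycles(fp_ops: float, int_ops: float) -> float:
    """Assumed Turing model: FP32 and INT32 run on separate paths in parallel."""
    return max(fp_ops, int_ops)


def ampere_cycles(fp_ops: float, int_ops: float) -> float:
    """Assumed Ampere model: path A is FP32-only, path B takes FP32 or INT32.

    All INT32 work must go through path B, and the total work is spread
    over two paths, so the time is bounded below by both quantities.
    """
    return max(int_ops, (fp_ops + int_ops) / 2)


def ampere_speedup(int_fraction: float) -> float:
    """Per-clock speedup of the Ampere model over the Turing model
    for a workload where `int_fraction` of the instructions are INT32."""
    if not 0.0 <= int_fraction <= 1.0:
        raise ValueError("int_fraction must be between 0 and 1")
    fp_ops = 1.0 - int_fraction
    return turing_cycles(fp_ops, int_fraction) / ampere_cycles(fp_ops, int_fraction)


if __name__ == "__main__":
    for frac in (0.0, 0.25, 0.5):
        print(f"INT32 fraction {frac:.2f}: {ampere_speedup(frac):.2f}x")
```

In this model the full 2x only appears for a pure FP32 workload, and the advantage shrinks as the INT32 share grows, which matches the point of the quote even though real kernels are far messier.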

aetch · Post by **aetch** » Tue Dec 08, 2020 11:32 pm

bruce wrote:Theoretical performance numbers may be a good sales tool, but they are never an accurate measure of FAH's performance on a variety of proteins anyway.

Yeah, pretty much.
On paper my GTX 1080 Ti should trounce my RTX 2070 Super at Folding but it doesn't. My RTX 2070 Super actually scores higher and draws less power doing it.
