AMD CPU systems & PCIE 4.0

Post by **bruce** » Sat Jun 08, 2019 5:50 pm

Theodore wrote:2- Another interesting thing to note is that no one knows yet whether PCIE 4.0 (which no doubt will be faster in throughput) has increased latency compared to PCIE 3.0.

It should be noted that the FAH Bonus points don't care whether your WU is slowed down by transfer speed or by latency. When data needs to transfer to/from the GPU, the total time including {latency+data_size/speed} is what matters. Getting the first byte there sooner doesn't matter... it's when the last byte of the block of data arrives.
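A minimal sketch of the {latency+data_size/speed} point, in Python. The latency and speed figures are placeholders chosen for illustration, not measured values for any PCIE generation, and `transfer_time` is a hypothetical helper name.

```python
def transfer_time(data_size_bytes, latency_s, speed_bytes_per_s):
    """Seconds until the last byte arrives: latency + data_size / speed."""
    return latency_s + data_size_bytes / speed_bytes_per_s

# Placeholder figures (assumptions), not measured PCIE values.
slow_link = {"latency_s": 1e-6, "speed_bytes_per_s": 1e9}
fast_link = {"latency_s": 2e-6, "speed_bytes_per_s": 2e9}

for size in (100, 1_000_000):
    t_slow = transfer_time(size, **slow_link)
    t_fast = transfer_time(size, **fast_link)
    print(f"{size:>9} bytes: slow link {t_slow:.3e} s, fast link {t_fast:.3e} s")
```

Only the total per block matters for the WU, so which link wins depends on the block size as well as on the two link parameters.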

Post by **MeeLee** » Sun Jun 09, 2019 12:31 pm

bruce wrote:
Theodore wrote:2- Another interesting thing to note is that no one knows yet whether PCIE 4.0 (which no doubt will be faster in throughput) has increased latency compared to PCIE 3.0.
It should be noted that the FAH Bonus points don't care whether your WU is slowed down by transfer speed or by latency. When data needs to transfer to/from the GPU, the total time including {latency+data_size/speed} is what matters. Getting the first byte there sooner doesn't matter... it's when the last byte of the block of data arrives.

Larger clumps of data definitely will benefit from the higher bandwidth, but smaller chunks of data benefit more from lower latencies.
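To put a number on "larger" versus "smaller", here is a rough Python sketch that solves for the block size at which two links take the same total time. The function name `break_even_size` and the example figures are assumptions for illustration only; nobody in this thread has measured PCIE 4.0 latency.

```python
def break_even_size(lat_a, bw_a, lat_b, bw_b):
    """Block size in bytes at which links A and B take the same total time.

    Solves lat_a + s / bw_a == lat_b + s / bw_b for s.
    Returns None when the two never cross at a positive size.
    """
    if bw_a == bw_b:
        return None
    s = (lat_b - lat_a) / (1.0 / bw_a - 1.0 / bw_b)
    return s if s > 0 else None

# Hypothetical example: link B has double the bandwidth and double the latency.
size = break_even_size(lat_a=1e-6, bw_a=1e9, lat_b=2e-6, bw_b=2e9)
print(f"break-even block size: {size:.0f} bytes")
```

Blocks smaller than the break-even size finish sooner on the lower-latency link, larger blocks finish sooner on the higher-bandwidth link.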

I guess we will only know for sure once PCIE 4.0 compatible CPUs become available to the public.

Nathan_P wrote:If you want a low power cpu to run your gpu folding rig look at xeon, especially older socket 115x xeon, those chips can go down as low as 25w.

Older Xeons do suffer from either low CPU frequency (lower than 3Ghz), few cores, or inefficient 65nm designs.
Many older Xeons are also limited to PCIE 2.0 speeds.
Motherboards using all PCIE lanes are also very hard to find.

Post by **bruce** » Sun Jun 09, 2019 4:12 pm

MeeLee wrote:
Larger clumps of data definitely will benefit from the higher bandwidth, but smaller chunks of data benefit more from lower latencies.

I don't remember reading anything about the size of the data chunks for FAHCore_2*. It's bound to depend on the size of the protein and maybe on the characteristics of the GPU. Has anybody ever evaluated it?

Post by **MeeLee** » Sun Jun 09, 2019 6:40 pm

bruce wrote:
MeeLee wrote:Larger clumps of data definitely will benefit from the higher bandwidth, but smaller chunks of data benefit more from lower latencies.
I don't remember reading anything about the size of the data chunks for FAHCore_2*. It's bound to depend on the size of the protein and maybe on the characteristics of the GPU. Has anybody ever evaluated it?

Aside from seeing GPU activity (very visible on GTX GPUs running on a PCIE 2.0 1x slot) in Windows task manager, I don't know much about it.
All I can say is that every few seconds (8 or more), the GPU gets fed data. The transfer lasts less than 0.5 seconds, which is the shortest refresh interval of Task Manager.
It just shows as a blip on the 'copy' graph, as well as a drop in GPU % activity.

If it was less than 0.5 seconds of data transferring at PCIE 2.0 1x speeds, it would be less than 125MB of data, since PCIE 2.0 1x speed is 500MB/s up or down => 250MB/s up => 1/2 second = <125 MB.
This happens on GT 1030 and GTX 1050 and 1060 cards at 1x speed. At PCIE 2.0 4x speed, PCIE 3.0 1x speed, or faster, the blip doesn't register in Task Manager.
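The upper bound above is just duration times link speed. A small Python sketch, with `max_transfer_mb` as a hypothetical helper name: the 250 MB/s figure is the assumption made in this post, while the nominal PCIE 2.0 rate is 500 MB/s per lane in each direction, which would double the bound.

```python
def max_transfer_mb(duration_s, link_mb_per_s):
    """Upper bound on data moved in one blip: duration * link speed."""
    return duration_s * link_mb_per_s

# 0.5 s is the shortest blip Task Manager can resolve, per the post.
print(max_transfer_mb(0.5, 250))  # post's assumed 250 MB/s one-way rate
print(max_transfer_mb(0.5, 500))  # nominal PCIE 2.0 1x rate per direction
```

Either way this is only a ceiling. The actual block could be far smaller, since Task Manager cannot show anything shorter than its refresh interval.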

Post by **Nathan_P** » Mon Jun 10, 2019 8:51 pm

MeeLee wrote:

Nathan_P wrote:If you want a low power cpu to run your gpu folding rig look at xeon, especially older socket 115x xeon, those chips can go down as low as 25w.
Older Xeons do suffer from either low CPU frequency (lower than 3Ghz), few cores, or inefficient 65nm designs.
Many older Xeons are also limited to PCIE 2.0 speeds.
Motherboards using all PCIE lanes are also very hard to find.

A 22nm quad core with HT is perfectly capable of driving at least 2 GPUs at full speed on PCIe 3.0 with a modest 2.8GHz boost clock. I could add a 3rd GPU without problems apart from heat.

Post by **MeeLee** » Mon Jun 10, 2019 11:43 pm

Nathan_P wrote:
MeeLee wrote:

Nathan_P wrote:If you want a low power cpu to run your gpu folding rig look at xeon, especially older socket 115x xeon, those chips can go down as low as 25w.
Older Xeons do suffer from either low CPU frequency (lower than 3Ghz), few cores, or inefficient 65nm designs.
Many older Xeons are also limited to PCIE 2.0 speeds.
Motherboards using all PCIE lanes are also very hard to find.
A 22nm quad core with HT is perfectly capable of driving at least 2 GPUs at full speed on PCIe 3.0 with a modest 2.8GHz boost clock. I could add a 3rd GPU without problems apart from heat.

Why would you use a Xeon, when you can do the same on an Intel Core i5 quad core?
Most ATX motherboards run first slot at 8x, second at 4x, third at 4x. And add any additional 1x slots from there.
Haswell Xeons (22nm) are still very expensive compared to a Core i processor (6th to 8th gen), and the only cheap Xeon motherboards are Chinese ones.
Regular Xeon motherboards are more expensive than Core i motherboards.

Post by **Nathan_P** » Tue Jun 11, 2019 5:18 pm

Lower power. Find me an i5 that runs at 25W TDP.

It's running in an Asus Z87 WS, with 4 x16 PCIe 3.0 slots via PLX switches; nothing cheap about that board.

Post by **MeeLee** » Tue Jun 11, 2019 9:57 pm

I looked through the list of all Xeon processors.
If the 25 Watt, 4 core / 8 thread chip you mentioned is the Xeon E3-1240L v5 or the Xeon E3-1505L v5, it only offers 2Ghz max speed.
It should limit RTX cards significantly.
There are 35W alternatives that run at 2.5-3.2Ghz.

There are also 3+Ghz, quad core + HT, Core i5s and i7s with 35W TDP.
One desktop version (Intel® Core™ i7-7700T), and a few Ultra low voltage versions.

There are lots of 45 Watt Core i processors with 3+Ghz core speed. The increase in PPD (no CPU bottlenecks on RTX cards) more than makes up for the extra 10 to 20 Watts.

Even if a CPU is rated at 65 Watts, when it feeds Nvidia cards, whatever keeps the thread locked doesn't consume much power.
I'd estimate that a 65W TDP CPU runs at 35-40 Watts with all threads active (feeding GPUs), since kernel times are pretty low (20-80%).

Post by **Nathan_P** » Wed Jun 12, 2019 4:38 pm

My Xeon is an E3-1230L v3.

Have you actually tested whether a CPU bottlenecks the GPUs at a lower clock speed?

Post by **MeeLee** » Fri Jun 14, 2019 4:22 am

Yes, I ran tests on 2Ghz processors. I think it was Foldy (but could have been another member) who lowered CPU frequency, and wrote down the PPD performance.
It's also easy to see on slower computers (less than 3Ghz) how much kernel time is used while folding on GPU.
With 3 GPUs fed by 1 quad core, either Task Manager in Windows, or top/htop in Linux, will show you which cores are under 100% load, and what percentage of that load is actual kernel time.
On an RTX 2080 Ti, the kernel times are 2 to 3x higher than on an RTX 2060, nearing 80% on my Core i5 @ 3.25Ghz.
Not sure if a xeon has different kernel times than a Core i processor, but for feeding a GPU, I think they should show pretty much the same results.
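For anyone who wants a number rather than eyeballing htop, here is a rough Python sketch that derives a per-core kernel share from two samples of a `cpuN` line in Linux `/proc/stat`. The helper names and the sample lines are made up for illustration, and counting `system + irq + softirq` as "kernel time" is an assumption that may not match exactly what htop or Task Manager displays.

```python
def parse_cpu_line(line):
    """Return (name, counters) from one 'cpuN ...' line of /proc/stat."""
    fields = line.split()
    return fields[0], [int(x) for x in fields[1:]]

def kernel_share(before, after):
    """Fraction of non-idle time spent in kernel mode between two samples.

    Field order in /proc/stat: user nice system idle iowait irq softirq ...
    """
    _, b = parse_cpu_line(before)
    _, a = parse_cpu_line(after)
    d = [y - x for x, y in zip(b, a)]
    user, nice, system, idle, iowait, irq, softirq = d[:7]
    busy = user + nice + system + irq + softirq
    kernel = system + irq + softirq  # assumption: all three count as kernel
    return kernel / busy if busy else 0.0

# Made-up sample lines, not taken from a real folding rig.
before = "cpu0 1000 0 500 8000 0 0 0 0 0 0"
after = "cpu0 1060 0 540 8000 0 0 0 0 0 0"
print(f"kernel share on cpu0: {kernel_share(before, after):.0%}")
```

On a real system the two lines would be read from `/proc/stat` a few seconds apart, since the counters are cumulative since boot.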
