PCI-e bandwidth/capacity limitations
Moderator: Site Moderators
Forum rules
Please read the forum rules before posting.
Re: PCI-e splitter?
That would be greatly appreciated. I really want to get the most processing power out of the $10k I plan to drop on this.
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: PCI-e splitter?
This site shows how to test different PCIe speeds using the BIOS and some sticky-note paper.
Maybe someone can run a test like this for folding, e.g. using FahBench with a demanding work unit?
(The article's conclusion that PCIe speed does not matter only holds for gaming, not for GPGPU.)
https://www.pugetsystems.com/labs/artic ... mance-518/
I measured my GTX 970 at PCIe 2.0 x8 with FahBench 2.2.5 and a real work unit, and MSI Afterburner showed 55% bus usage.
PCIe 2.0 x8 has 4 GB/s of bandwidth, so F@H uses about 4 GB/s * 0.55 = 2.2 GB/s.
So PCIe 2.0 x4 or PCIe 3.0 x2, each having 2 GB/s of bandwidth, will be the lower limit, maybe losing 10% PPD.
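For anyone who wants to redo the arithmetic, here is a rough Python sketch of that estimate. It assumes the work unit always needs the same ~2.2 GB/s measured above on PCIe 2.0 x8 and that PPD scales linearly with the fraction of that bandwidth a slot can actually deliver - both simplifications, not measurements.
Code:
# Crude estimate: required bandwidth from the measured bus usage,
# then the fraction of it each smaller slot could still deliver.
MEASURED_SLOT_GBPS = 4.0      # PCIe 2.0 x8
MEASURED_BUS_USAGE = 0.55     # 55% reported by MSI Afterburner
needed_gbps = MEASURED_SLOT_GBPS * MEASURED_BUS_USAGE   # ~2.2 GB/s

slots = {                     # approximate one-way bandwidth in GB/s
    "PCIe 2.0 x4": 2.0,
    "PCIe 3.0 x2": 2.0,
    "PCIe 3.0 x1": 1.0,
}
for name, gbps in slots.items():
    fraction = min(1.0, gbps / needed_gbps)
    print(f"{name}: ~{(1 - fraction) * 100:.0f}% estimated PPD loss")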
Re: PCI-e splitter?
So you're saying that PCIe v2.0 x4 would place some limitations on a GTX 960, but it would probably be fine for a slower GPU running FahBench and some undefined production WU. Scaling that down to the x1 PCIe splitter being discussed, many of the Kepler GPUs could manage using the adapter through an x1 v2 slot, but most of the Maxwell GPUs could not.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: PCI-e splitter?
I'm unfamiliar with the measurement process. How do you guys obtain that data?
Now basing this off foldy's data, if 3.0 x2 would see a 10% loss, then 3.0 x1, being only 1GB/s would see a...45% loss? Does that sound right? Or is my maths wrong here (I did fail maths in high school, so I wouldn't be surprised)?
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: PCI-e splitter?
Hi hiigaran, I obtain that data from my GTX 970, using the Windows tool MSI Afterburner, which shows the PCIe bandwidth used.
If the math is right then yes, a 55% loss of PPD, but we need somebody to prove that in reality.
Does anyone have a GTX 970 or similar running on PCIe slower than PCIe 3.0 x2 or PCIe 2.0 x4, or want to try the paper limit trick, or can limit it in the BIOS?
@Nathan_P: Is your rig running again?
@Bruce: I don't know if a slower GPU would use lower PCIe bandwidth today on Core_21.
But tests years ago with older GPUs and FahCores showed a much lower PCIe bandwidth limit.
The undefined production WU is just one copied from the Folding@home work unit folder.
I used it to test if FahBench and F@H use the same PCIe bandwidth on the same work unit - they do.
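For reference, the arithmetic behind that 55% figure, under the same simple assumption that PPD scales with the fraction of the required ~2.2 GB/s the slot can supply: PCIe 3.0 x1 delivers about 1 GB/s, and 1 / 2.2 ≈ 0.45, so roughly 45% of full speed remains, i.e. about a 55% loss.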
Last edited by foldy on Sat Jun 04, 2016 1:38 pm, edited 1 time in total.
-
- Posts: 1164
- Joined: Wed Apr 01, 2009 9:22 pm
- Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)
Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS
Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only) - Location: Jersey, Channel islands
Re: PCI-e splitter?
My tests have been unsuccessful: every time I move a GPU in my Z9PE it starts a different WU, so I can't do an apples-to-apples comparison. I have a new mobo on order for a different project, so I'll see if I can get any testing done on that. Ideally you would need a mobo that allows you to set the PCIe slot speed in the BIOS.
Edit: Well, it looks like my new mobo gives me the option to set the PCIe generation a slot runs at but not the link width, so in theory I should still be able to simulate x4 and x8 speeds.
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: PCI-e splitter?
Use FahBench 2.2.5 and select the WU "nav", which is a real work unit. It creates real PCIe bandwidth use, which I compared between FahBench and F@H using MSI Afterburner's GPU bus usage monitoring. I think all Core_21 work units have similar PCIe bandwidth usage on a given GPU, so an apples-to-apples comparison is OK.
This test should only be an estimation, so even with different work units you will see a trend as long as they are all Core_21.
What I expect the test to show is that when the available bandwidth is half the needed bandwidth, PPD or the FahBench score also drops to nearly half.
For a GTX 970 we need speeds lower than PCIe 3.0 x2 or PCIe 2.0 x4 to probably see the expected performance drop to half - or to prove it wrong.
I tried using tape on my GTX 970 pins to force PCIe x4 and x1 mode, but then the mainboard no longer recognizes the card.
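One way to check what link a card has actually negotiated (after the tape trick, a BIOS setting, or a riser) is to ask the driver. Below is a small Python sketch around nvidia-smi's PCIe link query; it assumes nvidia-smi is on the PATH, and note that NVIDIA cards drop to a lower link generation when idle, so run it while the GPU is folding.
Code:
import subprocess

# Query the currently negotiated PCIe generation and lane width,
# plus the maximum the card supports.
fields = "pcie.link.gen.current,pcie.link.width.current,pcie.link.gen.max,pcie.link.width.max"
out = subprocess.run(
    ["nvidia-smi", f"--query-gpu={fields}", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()
for line in out.splitlines():          # one line per GPU
    gen_cur, width_cur, gen_max, width_max = [v.strip() for v in line.split(",")]
    print(f"Running at PCIe gen {gen_cur} x{width_cur} (card maximum: gen {gen_max} x{width_max})")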
Last edited by foldy on Sat Jun 04, 2016 6:36 pm, edited 1 time in total.
Re: PCI-e splitter?
Pick an arbitrary goal for maximum PCIe bandwidth and a suitable test WU.

foldy wrote:
I don't know if a slower GPU would use lower PCIe bandwidth today on Core_21.
But tests years ago with older GPUs and FahCores showed a much lower PCIe bandwidth limit.
The undefined production WU is just one copied from the Folding@home work unit folder.
I used it to test if FahBench and F@H use the same PCIe bandwidth on the same work unit - they do.
Run a slow GPU. The calculations on each block of data transferred will take a certain amount of time, and the data transfers will overlap with most of them. The GPU shaders will be the limiting factor.
Run the same test with a fast GPU. The calculations will take less time, so the shaders will spend more time waiting for data to be transferred. The processing percentage will go down and the PCIe transfers will become the limiting factor.
There's also a factor somewhere for the size of VRAM, which may allow more data blocks to be queued for future processing, but in the early days of GPU folding we also concluded that the size of VRAM was unimportant. Nobody has studied Core_21 to see if that matters today.
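As a toy illustration of that overlap argument (the numbers are invented, only the relationship matters): with compute and transfer overlapped, each block takes as long as the slower of the two stages, so whichever stage is slower is the limiting factor.
Code:
# Toy model: per block of work, compute and PCIe transfer run overlapped,
# so the block time is set by whichever stage is slower.
def block_time_ms(compute_ms, transfer_ms):
    return max(compute_ms, transfer_ms)

transfer_ms = 2.0   # same data per block over the same slot
for name, compute_ms in [("slow GPU", 8.0), ("fast GPU", 1.0)]:
    limiter = "shaders" if compute_ms >= transfer_ms else "PCIe transfers"
    print(f"{name}: {block_time_ms(compute_ms, transfer_ms):.0f} ms per block, limited by {limiter}")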
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: PCI-e splitter?
I currently have a Core_18 and GPU bus usage is 60% on PCIe 2.0 x8. VRAM usage is 221 MB (122 MB with FAH paused), so I guess there is some room left to queue a little work, since most GPUs have at least 1 GB. So VRAM is not a limit today.
It may be worth studying whether a reduction in transfer bandwidth is possible and whether queuing more work on the GPU could speed up folding even on PCIe 3.0 x16. On the other hand, the future PCIe 4.0 standard will double bandwidth again.
But first we should prove what the bandwidth limit is.
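For reference, the per-lane numbers behind the bandwidth figures in this thread (usable one-way bandwidth after encoding overhead), as a small Python helper:
Code:
# Approximate usable bandwidth per lane, one direction, in GB/s.
GBPS_PER_LANE = {"1.0": 0.25, "2.0": 0.5, "3.0": 0.985, "4.0": 1.969}

def slot_gbps(gen, lanes):
    return GBPS_PER_LANE[gen] * lanes

print(slot_gbps("3.0", 16))   # ~15.8 GB/s
print(slot_gbps("2.0", 8))    # ~4 GB/s
print(slot_gbps("3.0", 1))    # ~1 GB/s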
Re: PCI-e splitter?
Out of curiosity, is the data running through the slot compressed? Not that I have any understanding of how data flows and interacts with different hardware and their components here, but I'd assume that reducing lane bottlenecks through compression at the cost of a little processing power would be beneficial.
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: PCI-e splitter?
These are developer questions we could raise as an issue at OpenMM, but only if we can reproduce the problem here and prove it.
Re: PCI-e splitter?
I'm guessing here, but most of the data being transferred is probably binary numbers (which don't compress).
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 2040
- Joined: Sat Dec 01, 2012 3:43 pm
- Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441
Re: PCI-e splitter?
Nvidia GTX 1080 on PCIe 3.0 x16: bus usage 40%
viewtopic.php?f=38&t=28784&start=75
If the numbers are correct, a PCIe 2.0 x8 bus like mine would throttle this card because the bus would be about 60% too slow.
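If the 40% figure holds, the arithmetic behind "60% too slow" is: 40% of PCIe 3.0 x16 (about 15.8 GB/s) is roughly 6.3 GB/s, while PCIe 2.0 x8 supplies about 4 GB/s, so the card would want about 1.6 times the bandwidth the slot can deliver.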
Re: PCI-e splitter?
Reverting to the original question, that seems to imply that a majority of projects would be bandwidth limited when running through an x1 slot extender, even with a rather slow GPU.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: PCI-e splitter?
My GeForce 970 uses between 17-25% of 3.0 x16. That means one should not go below 3.0 x4, 2.0 x8, or 1.0 x16, if I am correct?
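A quick check of that arithmetic: 17-25% of PCIe 3.0 x16 (about 15.8 GB/s) is roughly 2.7-4 GB/s, and PCIe 3.0 x4, 2.0 x8, and 1.0 x16 each deliver about 4 GB/s, so by this estimate those would indeed be about the floor for that card.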