GPU: Stream Processors - Bits, Number - what is important?
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 82
- Joined: Sat Dec 17, 2011 4:22 pm
- Hardware configuration: none anymore, FAH doesn't want it, it seems.
GPU: Stream Processors - Bits, Number - what is important?
(I wasn't sure which forum is best suited, so I chose GD. Feel free to move.)
I've tried to find an answer through searching the forum, but the bits I found didn't really answer my questions, no doubt because I'm not completely up-to-date with regard to how GPUs actually work.
I understand that AMD and nVidia are using different approaches regarding their GPUs, so anything I say is only directed towards AMD. (And Folding@Home, of course.)
From what I can see, the most commonly advertised features of a GPU are the number of Stream Processors, the Memory (Bus?) Width, the clock Speed (GPU and memory), and the number of Texture Units. Perhaps the shader version as well.
Which of those are important to FAH, and in what relation?
My guess would be the number of SPs and the width of the memory interface (as in, if it's smaller than the numbers being processed, it'll take two steps to get the whole number through), then clock speed.
(Of course, only within limits; no number of processors is going to make up for an endlessly slow bus.)
Is the shader model relevant at all? What about texture units?
However, what is the required bus width? Does it have to be 128-bit, or would 64-bit suffice as well?
Is FAH (currently, or in the near future) going to use up all the processors, or is there a reasonable limit above which it's better to go for more clock speed or a wider bus?
It seems I can't write a signature that both conveys my feelings and doesn't look like a miserable trolling attempt...
-
- Posts: 221
- Joined: Fri Jul 24, 2009 12:30 am
- Hardware configuration: 2 x GTX 460 (825/1600/1650)
AMD Athlon II X2 250 3.0Ghz
Kingston 2Gb DDR2 1066 Mhz
MSI K9A2 Platinum
Western Digital 500Gb Sata II
LiteOn DVD
Coolermaster 900W UCP
Antec 902
Windows XP SP3 - Location: Malvern, UK
Re: GPU: Stream Processors - Bits, Number - what is important?
[WHGT]Cyberman wrote: ...
I understand that AMD and nVidia are using different approaches regarding their GPUs, so anything I say is only directed towards AMD. (And Folding@Home, of course.)
From what I can see, the most commonly advertised features of a GPU are the number of Stream Processors, the Memory (Bus?) Width, the clock Speed (GPU and memory), and the number of Texture Units. Perhaps the shader version as well.
...
Is the shader model relevant at all? What about texture units?
However, what is the required bus width? Does it have to be 128-bit, or would 64-bit suffice as well?
Is FAH (currently, or in the near future) going to use up all the processors, or is there a reasonable limit above which it's better to go for more clock speed or a wider bus?
Most important are:
1) number of Stream Processors
2) GPU clock speed
3) Power consumption
Regarding shader model/version: generally a later model/version may be better, but support may depend on the development of a new core to utilise the GPU effectively (with Nvidia, the Kepler GPUs did not fold until the new Core 15 v2.25 was released).
Least significant are:
a) Memory speed
b) memory size
c) memory bus
d) texture units
History with projects for Nvidia suggests that we will see GPU WUs with larger numbers of atoms, so it is unlikely that FAH is going to use up all the processors.
-
- Posts: 2948
- Joined: Sun Dec 02, 2007 4:36 am
- Hardware configuration: Machine #1:
Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).
Machine #2:
Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.
Machine 3:
Dell Dimension 8400, 3.2GHz P4, 4x512MB RAM, Video card GTX 460, Windows 7 X32
I am currently folding just on the 5x GTX 460's for approx. 70K PPD - Location: Salem, OR USA
Re: GPU: Stream Processors - Bits, Number - what is important?
I would add that the number of single-precision floating point units (FPUs) is also highly important, while the number of double-precision FPUs is not.
-
- Pande Group Member
- Posts: 148
- Joined: Fri Sep 28, 2012 11:03 pm
- Location: Stanford, CA
- Contact:
Re: GPU: Stream Processors - Bits, Number - what is important?
As a general rule of thumb, we look at the # of cores multiplied by the clock speed. Unfortunately we don't support double precision for now, because all GPUs for a given project would need to use the same precision. In most cases, I don't think we are bound by memory. When enough GPUs out there start supporting double precision, we can perhaps set up some double-precision-exclusive projects.
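As a rough illustration of that rule of thumb, here is a minimal sketch. This is not an official FAH formula, just cores x clock as a crude figure of merit; the card names and specs below are hypothetical examples, not real benchmarks:

```python
def figure_of_merit(cores, clock_mhz):
    """Crude relative-throughput estimate: cores x clock, in arbitrary units."""
    return cores * clock_mhz / 1000.0

# Hypothetical example cards (made-up specs for illustration only):
cards = {
    "many_cores_modest_clock": (1536, 1006),
    "few_cores_similar_clock": (384, 900),
}

for name, (cores, clock) in cards.items():
    print(f"{name}: {figure_of_merit(cores, clock):.0f}")
```

On this estimate, the card with four times the cores at a similar clock comes out far ahead, which matches derrick's ranking of stream processor count first and clock speed second.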
-
- Posts: 128
- Joined: Thu Dec 06, 2007 9:48 pm
- Location: Norway
Re: GPU: Stream Processors - Bits, Number - what is important?
proteneer wrote: When enough GPUs out there start supporting double precision we can perhaps set up some double precision exclusive projects.
The hardware isn't a problem, just look at Milkyway@home.
Re: GPU: Stream Processors - Bits, Number - what is important?
The real question is whether DP improves science or not.
First, some history: Gromacs for the CPU was first introduced running pure x86 code. Later, optimized code was introduced to use Single Precision SSE. At that stage of hardware development, some donors had SSE support and others did not, so both codepaths were included in the same FahCore, with switches to manage whether to run optimized or not. Later, Double Precision was seriously considered, and a new FahCore and a new set of projects were introduced for SSE2, along with the server logic to exclude assigning those projects to non-SSE2 machines.
When running the same protein, double precision ran significantly slower (at that time, and with that generation of hardware). After a short time, no new projects for FahCore_79 were introduced. From that I conclude that they apparently decided the scientific gain of double precision was less than the cost of reduced performance on the same protein. (Also: for protein folding, does SP support sound scientific conclusions, or would increased precision show additional results that might not be found from SP results?)
Fast-forward to today's hardware, but ask the same questions. How would performance compare for DP vs. SP across the entire spectrum of DP-capable GPUs, and how would that compare to whatever scientific benefit would derive from having more accurate results? If the answer favors DP, would the servers be able to differentiate between DP-capable GPUs and SP-only GPUs so that the assignment process can use both classes of GPUs effectively?
Just because you have a nice hardware feature doesn't necessarily mean you need it -- and what's needed by project X is not necessarily what's needed by project Y.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 128
- Joined: Thu Dec 06, 2007 9:48 pm
- Location: Norway
Re: GPU: Stream Processors - Bits, Number - what is important?
bruce wrote: Later, Double Precision was seriously considered and a new FahCore and a new set of projects were introduced for SSE2 along with the server logic to exclude assigning those projects to non-SSE2 machines.
(snip)
Fast Forward to today's hardware, but ask the same questions. How would performance compare for DP vs. SP across the entire spectrum of DP-capable GPUs and how would that compare to whatever scientific benefit would derive from having more accurate results? If the answer favors DP, would the servers be able to differentiate between DP-capable GPUs and SP-only GPUs so that the assignment process can use both classes of GPUs effectively?
Well, I do remember back in the days of "SSE2-only" WUs, I deleted some of them, since they got downloaded to my SSE-only computer. The server logic didn't work very well back then, and as for how it's working now, just look at some of the other forum threads...
Re: GPU: Stream Processors - Bits, Number - what is important?
Yeah, I remember those days when my Dothan laptop at 1.6GHz ran the same WU 50% faster than my Athlons at 2.4GHz because it had SSE2. 50% slower clock for 50% faster WUs.
But yeah, as derrick said: SP, clock, power, then I'd say cooling (to OC more)
-
- Pande Group Member
- Posts: 148
- Joined: Fri Sep 28, 2012 11:03 pm
- Location: Stanford, CA
- Contact:
Re: GPU: Stream Processors - Bits, Number - what is important?
On the double precision point - GPUs are unfortunately terribad at double precision (even the Teslas). Our internal testing shows 1/6-1/8th the performance of Single Precision.
-
- Site Moderator
- Posts: 2850
- Joined: Mon Jul 18, 2011 4:44 am
- Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4 - Location: Western Washington
Re: GPU: Stream Processors - Bits, Number - what is important?
proteneer wrote: GPUs are unfortunately terribad at double precision
Thanks for the new word. "Terribad". Huh. http://www.urbandictionary.com/define.php?term=Terribad
F@h is now the top computing platform on the planet and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Lets end it together.
-
- Posts: 1024
- Joined: Sun Dec 02, 2007 12:43 pm
Re: GPU: Stream Processors - Bits, Number - what is important?
proteneer wrote: On the double precision point - GPUs are unfortunately terribad at double precision (even the Teslas). Our internal testing shows 1/6-1/8th the performance of Single Precision.
So I guess that means even if you do create some Double Precision projects, nobody is going to want to run them. In the days that Bruce is talking about, SSE2 was half as fast as SSE, so DP on GPUs is a lot less practical.
Re: GPU: Stream Processors - Bits, Number - what is important?
Oh yeah, SSE2 WUs on hardware with SSE2 smoked much faster hardware (MHz) if it didn't have SSE2. Hopefully they would be able to assign those WUs to just the better GPUs that can do them best. But if they benchmark them like the SSE2 WUs, they will be average PPD on non-DP GPUs but excellent on DP GPUs, as they will be able to complete much faster.
-
- Posts: 128
- Joined: Thu Dec 06, 2007 9:48 pm
- Location: Norway
Re: GPU: Stream Processors - Bits, Number - what is important?
mmonnin wrote: Oh yeah SSE2 WUs on hardware with SSE2 smoked much faster hardware (MHz) if it didn't have SSE2. Hopefully that would be able to assign those WUs to just the better GPUs that can do them the best. But if they benchmark them like the SSE2 WUs, they will be average PPD on non-DP GPUs but excellent on DP GPUs as they will be able to complete much faster.
Well, chances are a double-precision application will just error out if one tries to run it on a non-DP GPU, but even assuming it would work, the performance would be really bad.
As for FAH using DP on GPU, with FAH's dreadful AMD performance and Nvidia's abysmally low double-precision speed, especially on the 6xx series of cards, I wouldn't expect it...