I'm using the SMP and GPU (Fermi) clients and know that GPU WUs have many more base points than CPU WUs.
My question is: generally, how much more complex are GPU WUs than SMP WUs? 3 times? 10 times? 20 times? Does the number of base points (ignoring the bonus points for SMP WUs) indicate the complexity of a WU?
GPU WUs are how much more complex than SMP WUs?
Re: GPU WUs are how much more complex than SMP WUs?
There is no way to make this comparison from the base points yet: GPU and CPU currently use different benchmark hardware. But when the new GPU QRB system is released, you will be able to do it.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: GPU WUs are how much more complex than SMP WUs?
Complexity is a function of what is being studied... the number of atoms, the forces involved, the specific type of equations used to study those subjects... not the machine used to do the calculations.
Points should be something like watts, a measure of the rate of work, or more exactly, watt-hours (that rate of work accumulated over time).
A very complex computation can be multiplied by few steps, or a simple calculation can be multiplied by many steps to yield a roughly equivalent amount of work, right?
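To make that concrete, here is a toy sketch; the per-step costs and step counts are made-up numbers, not real project figures. It just shows how a heavy calculation run for few steps and a light calculation run for many steps can add up to the same total work:

```python
# Hypothetical WUs: total work = cost per step x number of steps.
complex_wu = {"flops_per_step": 8e9, "steps": 125_000}  # heavy steps, few of them
simple_wu = {"flops_per_step": 2e9, "steps": 500_000}   # light steps, many of them

for name, wu in (("complex", complex_wu), ("simple", simple_wu)):
    total = wu["flops_per_step"] * wu["steps"]
    print(f"{name}: {total:.2e} total FLOPs")
# Both work out to 1.00e+15 FLOPs -- roughly equivalent total work despite
# very different per-step "complexity".
```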
So, just pointing out that your premise of complexity, while a part, is not the only consideration in determining the base points.
When dealing with bus size, latency issues, different native calculating abilities, variations in clock speed (Hz), memory frequency, and so on, trying to determine a roughly equal amount of calculation throughput across widely different calculation projects and studies was a pretty daunting task, and I suspect it will continue to be.
One way to simplify this rather difficult and complex "work yield evaluation" problem is to simply test rather than predict.
This is of course what is done everywhere else... you build an engine and predict the horsepower, but until you put it on the dyno you don't really know.
And of course, the entire graph shows you the variation at different RPM.
The entire concept of FLOPS is the same: supercomputers vary widely in the speed of their calculations, their ability to store and access data, and the range of calculations they are geared to be best at... yet an agreed standard test yields a certain amount of work accomplished, for comparison's sake.
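As a rough illustration of that "agreed standard test" idea, here is a minimal sketch that times a fixed dense matrix multiply and reports a GFLOP/s figure you could compare across machines. The matrix size and the conventional 2n^3 flop count are just textbook assumptions, nothing FAH- or LINPACK-specific:

```python
import time
import numpy as np

def measure_gflops(n=2048):
    """Time one n x n matrix multiply and report a comparable GFLOP/s figure."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    start = time.perf_counter()
    a @ b                        # the fixed, agreed-upon unit of work
    elapsed = time.perf_counter() - start
    flops = 2 * n ** 3           # conventional flop count for dense matmul
    return flops / elapsed / 1e9

print(f"This machine: ~{measure_gflops():.1f} GFLOP/s on the standard test")
```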
Anyway, my point:
the new WU test for QRB on GPU moves the base points from theoretical estimates to actual testing, by running the same calculations on both SMP and GPU for the first time.
Now the comparison will be direct, and not subject to the artificial variables and equating compensations that the lack of direct testing used to require.
So, in the very near future, your question will no longer be relevant. The "complexity" will be the same. The only relevant variables remaining will be number of steps, size, and ability to accomplish that work over time.
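For reference, here is a sketch of how a quick-return bonus behaves, assuming the commonly described form credit = base x sqrt(k x deadline / elapsed). The k value, base points, and deadline below are made up for illustration, and the exact GPU QRB details had not been published at the time of this thread:

```python
import math

def qrb_credit(base_points, k, deadline_days, elapsed_days):
    """Quick-return-bonus style credit: faster returns earn a larger multiplier."""
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)

# Hypothetical WU: 500 base points, k = 2, 6-day deadline.
for elapsed in (0.5, 1.0, 3.0):
    print(f"returned in {elapsed} days -> {qrb_credit(500, 2, 6, elapsed):.0f} points")
```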
Transparency and Accountability, the necessary foundation of any great endeavor!
Re: GPU WUs are how much more complex than SMP WUs?
mdk777 wrote: Anyway, my point: the new WU test for QRB on GPU moves the base points from theoretical estimates to actual testing, by running the same calculations on both SMP and GPU for the first time.
That is a very nice explanation. But you bring up a point that I have been wondering about, though the answer may not be known yet: will the "new" work units run the same calculations on the CPU as on the GPU? In other words, would you do the same science on both, but just give the user the choice of which to use? Or alternatively, will there continue to be different science projects run on the GPU as compared to the CPU?
One reason I ask is that on the World Community Grid/HCC project, they have adapted the CPU work units to run on the GPU, so they are performing the same calculations, just a lot faster on the GPU. Given the wide variety of work on Folding, that probably won't be practical for a long time, but is that the direction they are heading?
Re: GPU WUs are how much more complex than SMP WUs?
Figuring out how to efficiently use a GPU for computing is significantly more difficult than using the CPU. Usually it requires a complete restructuring of the algorithms based on the GPU architecture, or the use of intermediate libraries. The PG uses the latter approach, including their own library, OpenMM. So yes, they are more complex. But if you can use them efficiently, for certain calculations GPUs are a lot faster.
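To give a feel for the "intermediate library" route, below is a minimal sketch using OpenMM's public Python API against a hypothetical input.pdb. It is not how the FAH cores are wired up internally, but it shows how the platform choice is abstracted away from the science code:

```python
from openmm import app, unit, LangevinIntegrator, Platform

# Load a (hypothetical) structure and a standard force field.
pdb = app.PDBFile("input.pdb")
forcefield = app.ForceField("amber14-all.xml", "amber14/tip3p.xml")
system = forcefield.createSystem(pdb.topology,
                                 nonbondedMethod=app.PME,
                                 nonbondedCutoff=1.0 * unit.nanometer)
integrator = LangevinIntegrator(300 * unit.kelvin,
                                1.0 / unit.picosecond,
                                0.002 * unit.picoseconds)

# The same science setup runs on "CUDA", "OpenCL", or "CPU"; the library
# hides the GPU-specific restructuring of the algorithms.
platform = Platform.getPlatformByName("OpenCL")
simulation = app.Simulation(pdb.topology, system, integrator, platform)
simulation.context.setPositions(pdb.positions)
simulation.step(1000)
```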
F@h is now the top computing platform on the planet and nothing unites people like a dedicated fight against a common enemy. This virus affects all of us. Let's end it together.
Re: GPU WUs are how much more complex than SMP WUs?
JimF wrote: ...but is that the direction they are heading?
Well, I don't speak for the project, but it appears so.
That is ultimately the beauty of heterogeneous compute and OpenCL. The code really isn't written for one specific piece of hardware, but for any hardware that is compliant.
The simplification for PG is extreme in the long run. It is not just switching to GPU over CPU.
The efficiency of CPU vs. GPU vs. a blend of CPU and GPU can be optimized in real time. This has already been demonstrated at heterogeneous compute conferences.
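As a small sketch of what "any hardware that is compliant" looks like in practice, this PyOpenCL snippet just enumerates whatever OpenCL devices a machine exposes, CPUs and GPUs alike, without being written for any one vendor. Purely illustrative, nothing FAH-specific:

```python
import pyopencl as cl

# List every OpenCL-compliant device on this machine, regardless of vendor.
for platform in cl.get_platforms():
    for device in platform.get_devices():
        kind = cl.device_type.to_string(device.type)
        mib = device.global_mem_size // 2**20
        print(f"{platform.name}: {device.name} ({kind}), {mib} MiB global memory")
```

The same kernel source could then be built for whichever device (or blend of devices) a scheduler picks, which is what makes real-time CPU/GPU balancing possible in principle.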
I agree, it will not happen overnight, but I think it is the direction they are headed.
EDIT: I agree with Jesse_V that this was the past or current approach. I am just not sure that, in the long run, they won't give up some efficiency for simplicity and ubiquitous hardware conformity. Time will tell.
Transparency and Accountability, the necessary foundation of any great endeavor!
Re: GPU WUs are how much more complex than SMP WUs?
JimF wrote: ... the answer may not be known yet: will the "new" work units run the same calculations on the CPU as the GPU? In other words, would you do the same science on both, but just give the user the choice of which to use? Or alternatively, will there continue to be different science projects run on the GPU as compared to the CPU?
You're right ... the PG has not answered that question yet.
They did say that they expect to be able to run both implicit and explicit projects on either SMP or GPU, which gives them the ability to benchmark a project on either one. They did NOT say whether they planned to run some/all projects on the Donor's choice of platforms so all we can do at this point is guess. The possibility of hybrid projects (mentioned above) introduces a third possibility of running a single project on all available resources. To me, that seems really complicated, but I really don't know if it would be worth the software development costs.
I'm pretty confident that some projects will be limited to SMP by the size of GPU VRAM just like some SMP projects are currently limited to 64-bit OSs.
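A rough sketch of the kind of check that implies, with entirely made-up numbers: compare a project's memory footprint against the card's VRAM and fall back to SMP when it doesn't fit:

```python
def assign_platform(project_mem_bytes, gpu_vram_bytes):
    """Hypothetical assignment rule: fall back to SMP when a WU won't fit in VRAM."""
    return "GPU" if project_mem_bytes <= gpu_vram_bytes else "SMP"

# A large (hypothetical) explicit-solvent system vs. a 1 GiB and a 6 GiB card:
print(assign_platform(3 * 2**30, 1 * 2**30))  # -> SMP
print(assign_platform(3 * 2**30, 6 * 2**30))  # -> GPU
```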
Posting FAH's log:
How to provide enough info to get helpful support.
Re: GPU WUs are how much more complex than SMP WUs?
bruce wrote: The possibility of hybrid projects (mentioned above) introduces a third possibility of running a single project on all available resources. To me, that seems really complicated, but I really don't know if it would be worth the software development costs.
It may also complicate the choice of a graphics card. If Gromacs 4.6 uses OpenCL, there is the possibility that AMD will out-perform Nvidia. But as you say, that might not be used on all projects, and Nvidia with CUDA might do better on the others. It seems to me that this is the time to sit tight on the hardware and wait to see which way the software is going.