Blog post: "Unified GPU/SMP benchmarking scheme ..."

Post by **csvanefalk** » Sun Dec 09, 2012 8:33 am

TonyStewart14 wrote: I wonder if these changes will lead to greater Kepler GPU support. I'm wondering if the 560Ti is still the best buy or if the 6xx series will take over.

My guess is that Kepler will outdo Fermi by a sizeable margin once the optimizations make it into FaH. Of course, that is something we will have to see in time. I would hold off on any investment before those numbers are actually in. Remember that Kepler overall compares very well with Fermi even without the optimizations, so it does look promising.

Post by **mmonnin** » Mon Dec 10, 2012 3:13 am

Compute power has been lowered in Kepler so I'm not sure it'll ever be able to reach Fermi levels.

Post by **bruce** » Mon Dec 10, 2012 3:31 am

mmonnin wrote: Compute power has been lowered in Kepler so I'm not sure it'll ever be able to reach Fermi levels.

The stated goal of Kepler was to reduce the power consumption and heat while maintaining the same performance.

Post by **PinHead** » Mon Dec 10, 2012 3:53 am

mmonnin wrote: Compute power has been lowered in Kepler so I'm not sure it'll ever be able to reach Fermi levels.

The double-pumped per-clock data went to single-pumped per-clock data on Kepler because of the power required to double-pump the data. However, NVidia threw a lot more cores and several new technologies into Kepler (such as Dynamic Parallelism and Hyper-Q), where performance gains might not have been harvested by the client. It was mostly done for HPC clusters, but who knows; it could arrive here also.

Post by **csvanefalk** » Mon Dec 10, 2012 12:52 pm

mmonnin wrote: Compute power has been lowered in Kepler so I'm not sure it'll ever be able to reach Fermi levels.

Kepler already outperforms Fermi watt-for-watt, so I don't think that is an issue.

Post by **mmonnin** » Mon Dec 10, 2012 2:47 pm

csvanefalk wrote:
mmonnin wrote: Compute power has been lowered in Kepler so I'm not sure it'll ever be able to reach Fermi levels.
Kepler already outperforms Fermi watt-for-watt, so I don't think that is an issue.

Watt to Watt is purely based on the process shrink from 40nm to 28nm.

Post by **csvanefalk** » Mon Dec 10, 2012 4:49 pm

mmonnin wrote:
csvanefalk wrote:
mmonnin wrote: Compute power has been lowered in Kepler so I'm not sure it'll ever be able to reach Fermi levels.
Kepler already outperforms Fermi watt-for-watt, so I don't think that is an issue.
Watt to Watt is purely based on the process shrink from 40nm to 28nm.

Sure, but I was reacting to your statement that you were not sure Kepler would ever reach Fermi levels. Kepler pulls good PPD even without the optimizations from what I have seen (there have been reports of some Kepler cards outperforming equivalent Fermi cards for some projects as well). With full support in place, I have little doubt that Kepler will outdo Fermi across the board, but that is of course speculation.

In either case, IMHO the best metric you can judge any folding hardware by is effective PPD/Watt, and on that front Fermi simply won't be able to compete with Kepler once it is supported.

Post by **JimF** » Mon Dec 10, 2012 5:17 pm

csvanefalk wrote: In either case, IMHO the best metric you can judge any folding hardware by is effective PPD/Watt, and on that front Fermi simply won't be able to compete with Kepler once it is supported.

Maybe. Some people complained right when Kepler was announced that doing away with the clock-doubling would hurt the power-efficiency of their application (whatever it was) without any benefit from the new circuitry, which is geared mainly for games. And some of the optimizations will apply to Fermi too, so its efficiency should increase also.

My limited knowledge of CUDA (actually, non-existent if you want to be technical) suggests that Folding has not even gotten up to CUDA 4.2, which is common to both Fermi and Kepler. I think there is no special advantage for Kepler until you get up to CUDA 5, and on the GPUGrid.net forums, they say that CUDA 5 provides them no performance advantage even on Kepler. So whatever differences there are in PPD/Watt may be minimal. I think it is all irrelevant anyway with GROMACS 4.6 coming out. That is a whole new ballgame, and even the AMD cards may do comparatively well, so I am not getting overly excited about Kepler versus Fermi at the moment. I would love to have an excuse to upgrade to Kepler if one ever comes along; it hasn't thus far.

Post by **somata** » Fri Dec 21, 2012 8:39 am

Can anybody clarify what the deal is with AMD GPU project(s?) regarding the new unified benchmarking scheme? In the blog post a couple weeks ago it said:

We are still in the process of re-benchmarking some old projects assigned to ATI and G80 GPUs.

The only AMD project I ever get is p11293, which doesn't appear to have been re-benchmarked yet. The fact that they couldn't re-benchmark the one and only apparently active AMD project before enacting the new benchmarking scheme makes me nervous. Coupled with the mere fact that there aren't more AMD projects, I get the feeling that the OpenCL core is little more than a novelty at present... perhaps ever so slightly more useful than GPU1.

Anyway, whatever the case, it would be nice to know a broad timescale on when the rest of the GPU projects will be re-benchmarked. Are we talking days, weeks, or months? In general, progress has been very slow and remarkably opaque on the OpenCL core/projects, so I guess I shouldn't hold my breath. It would just be nice to know how much FaH really values these WUs. Of course, I realize the standard response will be "Pande group doesn't comment on release dates" or something like that, so I suppose this post is really just to vent my frustration.

Post by **mmonnin** » Fri Dec 21, 2012 5:36 pm

I get a mix of 11292 and 11293 WUs.

Post by **somata** » Fri Dec 21, 2012 7:12 pm

OK, I suppose I should've counted both projects that show up in the project summary stats. I'm only an occasional folder, so my experience may not be fully representative of the situation. Regardless, it doesn't really change the sentiment of my original post.

Post by **bruce** » Fri Dec 21, 2012 8:10 pm

Whether OpenCL is a wonderful plan or a novelty depends on who you listen to. The actual development depends primarily on the people who develop the drivers, not on Stanford. In your case, that means ATI/AMD. I'm sure there is some kind of behind-the-scenes cooperative effort between Stanford developers and AMD, but that arrangement puts Stanford in the position of testing the code developed by AMD, reporting any limitations that it finds, and then waiting for AMD to provide fixes. [The same statements apply if you substitute "NVidia" or "Intel", etc. in place of "AMD" except for the fact that NVidia does provide a workable alternative called CUDA. In the case of AMD, there is no other alternative.]

Based on forum reports, there have been several recent revisions of AMD drivers which have contained bugs that interfere with the reliable use of OpenCL for AMD GPUs. Until AMD resolves those issues, there's little incentive for a researcher to release new projects for AMD. Personally, I have great hopes that AMD will resolve these issues quickly and completely, but they're just that: hopes. I have no data that can suggest when, or if, those issues might be resolved -- which is probably why the Pande Group offers no predictions, either, though they may have slightly more information than I do.

Have you asked the same questions on an AMD forum?

Post by **somata** » Sat Dec 22, 2012 9:13 am

bruce wrote: Based on forum reports, there have been several recent revisions of AMD drivers which have contained bugs that interfere with the reliable use of OpenCL for AMD GPUs. Until AMD resolves those issues, there's little incentive for a researcher to release new projects for AMD.

Does that not imply the current AMD projects are using a (possibly severely) simplified model, one that fits within the current limitations of AMD's OpenCL platform? So simple, it would seem, that despite having >1 PFLOPS available for the platform nobody bothers to run new projects on it because the results are... less than desired?

I guess it's just a shame that Nvidia had to get everyone hooked on CUDA instead of endorsing an open standard like OpenCL. Ok, so it appears there are problems with AMD's OpenCL drivers, but what about Nvidia's? When GPU3 was introduced I had hoped there would be "one core to rule them all" based on OpenCL and it would run seamlessly on either platform. That way PG's attention could be focused on maintaining as few cores as possible. But nooo, apparently GPGPU is still too immature to get correct behavior/good performance if you abstract too far from the hardware, so CUDA is still preferred on Nvidia, completely defeating the purpose of OpenCL.

If there were only a single, OpenCL-based core, then if it worked on Nvidia and not on AMD I could blame AMD's drivers. But since Nvidia doesn't use OpenCL either, it's not obvious where the problem is without feedback from PG (monthly or even quarterly status updates would be nice). And of course, that completely ignores that PG actually uses OpenMM as the foundation for GPU3. My understanding is that PG is essentially waiting for OpenMM's group to work with AMD in order to get a usable model worked out, but I'd also imagine PG does some coding as well; I'm not sure what the division of labor looks like between PG and OpenMM. Oh well, I guess all I can do is wait... and maybe take a look at the OpenMM project to see what the problem is.

Post by **mdk777** » Sat Dec 22, 2012 12:59 pm

and maybe take a look at the OpenMM project to see what the problem is.

Only very general information posts every 6 to 9 months.
I'm convinced that everything is a spare-time project for a handful of graduate students.

Post by **chaosdsm** » Sat Dec 22, 2012 3:50 pm

csvanefalk wrote:
Sure, but I was reacting to your statement that you were not sure Kepler would ever reach Fermi levels. Kepler pulls good PPD even without the optimizations from what I have seen (there have been reports of some Kepler cards outperforming equivalent Fermi cards for some projects as well). With full support in place, I have little doubt that Kepler will outdo Fermi across the board, but that is of course speculation.

In either case, IMHO the best metric you can judge any folding hardware by is effective PPD/Watt, and on that front Fermi simply won't be able to compete with Kepler once it is supported.

After the GPU work unit points adjustment, my GTX-460 with 1GB 256-bit memory, core version 2.22, and clocked at 825 core / 1650 shader / 1800 memory was getting 27,300 PPD on p7623 and 26,880 PPD on p7624. My non-Ti 660 running core version 2.25 reaches a clock speed of 1215 MHz core/shader and is getting 29,626 PPD on p7623 and 29,483 PPD on p7624... I know, I know... it's a 4xx Fermi instead of a 5xx Fermi, but we don't even have Kepler optimizations yet and I don't have a 5xx Fermi. Now my just-added 660Ti is also core version 2.25, also clocked at 1215 MHz core/shader, and it's cranking out 39,026 PPD on p7623 and 38,778 PPD on p7624, and that's with the same power consumption as a 560 and lower consumption than a 560Ti.
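
The relative gains in the figures above can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the function names `relative_ppd` and `ppd_per_watt` are made up here, and no wattage values are included because the post does not give any, so anyone wanting PPD/Watt would have to supply their own power measurements.

```python
# PPD figures as quoted in the post above (points per day, per project).
ppd = {
    "GTX 460": {"p7623": 27300, "p7624": 26880},
    "GTX 660": {"p7623": 29626, "p7624": 29483},
    "GTX 660 Ti": {"p7623": 39026, "p7624": 38778},
}


def relative_ppd(card, baseline="GTX 460"):
    """Return {project: card PPD / baseline PPD} for every project of `card`."""
    return {proj: ppd[card][proj] / ppd[baseline][proj] for proj in ppd[card]}


def ppd_per_watt(ppd_value, watts):
    """Efficiency metric discussed in the thread.

    `watts` is an assumption the reader must measure; the post gives no numbers.
    """
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ppd_value / watts


# Print each Kepler card's PPD as a multiple of the GTX 460's PPD.
for card in ("GTX 660", "GTX 660 Ti"):
    for proj, ratio in relative_ppd(card).items():
        print(f"{card} on {proj}: {ratio:.2f}x the GTX 460")
```

On these numbers the 660 comes out about 8.5 to 9.7 percent ahead of the GTX 460, and the 660 Ti about 43 to 44 percent ahead, before any power draw is taken into account.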

Folding Forum