GROMACS 4.6
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 17
- Joined: Sun Nov 04, 2012 2:42 am
GROMACS 4.6
I am wondering when FAH will start using GROMACS 4.6. It is now out of beta and released, and it comes with support for new CPU instruction sets and much better support for Bulldozer and Piledriver CPUs. A lot of bigadv systems use Bulldozer, so this would boost the amount of work that gets done by quite a bit.
Re: GROMACS 4.6
We're looking at it. Unfortunately, Gromacs 4.6 sets CPU optimizations at compile time rather than run time (which is when Gromacs 4.5 and prior set optimizations). Therefore it presents a more complicated set of options for FAH. We definitely have plans to develop 4.6-based FAH cores, however.
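To make the compile-time versus run-time distinction concrete, here is a minimal sketch in Python. It is only a model under stated assumptions: the kernel names and CPU flag names are hypothetical and are not taken from GROMACS or from any FahCore. With run-time selection, one binary inspects the host CPU and picks the best kernel it can run. With compile-time selection, each binary is hard-wired to one kernel, so a separate build is needed per CPU family.

```python
# Hypothetical model of the two strategies. Kernel and flag names are
# illustrative assumptions, not real GROMACS or FahCore identifiers.

# Kernels ordered from most to least specialised, each with the CPU
# flags it needs in order to run.
KERNELS_BY_PREFERENCE = [
    ("avx_256", {"avx"}),
    ("sse4_1", {"sse4_1"}),
    ("sse2", {"sse2"}),
    ("plain_c", set()),  # fallback that runs anywhere
]


def pick_kernel_at_run_time(cpu_flags):
    """Run-time selection: one binary picks the best kernel the host supports."""
    for name, required in KERNELS_BY_PREFERENCE:
        if required <= cpu_flags:  # subset test: all required flags present
            return name
    raise RuntimeError("unreachable: plain_c has no requirements")


def build_binary(target_kernel):
    """Compile-time selection: the kernel is fixed when the binary is built.

    Returns a callable standing in for the built binary. It either runs
    its one kernel or refuses to run on a CPU that lacks the flags.
    """
    required = dict(KERNELS_BY_PREFERENCE)[target_kernel]

    def run(cpu_flags):
        if not required <= cpu_flags:
            raise RuntimeError(
                f"binary built for {target_kernel} cannot run on this CPU"
            )
        return target_kernel

    return run
```

In this model, `pick_kernel_at_run_time({"sse2", "sse4_1"})` selects the SSE4.1 kernel from a single binary, while `build_binary("avx_256")` produces a binary that refuses to run on that same CPU. That is why compile-time selection implies distributing several builds.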
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: GROMACS 4.6
Spazturtle wrote:I am wondering when FAH will start using GROMACS 4.6. It is now out of beta and released, and it comes with support for new CPU instruction sets and much better support for Bulldozer and Piledriver CPUs. A lot of bigadv systems use Bulldozer, so this would boost the amount of work that gets done by quite a bit.
How much is "quite a bit?"
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: GROMACS 4.6
Thanks, that answers my question of whether the Nvidia cards have much of a future. I was beginning to wonder whether, with all of the capability built into Core_17, that would do everything you need.
But it is all subject to revision anyway; this is research as I like to remind everyone (including myself).
Re: GROMACS 4.6
There's a children's story about "Chicken Little" who runs around trying to convince everyone that The Sky is Falling based on incorrect information. Recent generation NVidia cards DO have a bright future, in spite of what the naysayers are saying. Core_17 is quite capable of getting good performance out of Fermi/Kepler hardware and that's not really related to Gromacs, version 4.x. I'm sure both will be used for quite some time to come.
For future hardware, a lot will depend on upcoming versions of drivers for that hardware. Past FahCores have been designed for a uniprocessor, for an SMP processor, and for basic GPUs, and could be developed for hardware that's still relatively uncommon. Donors have decided that NVidia+SMP is a reasonable combination, but to get AMD+SMP to work efficiently, you have to leave a free CPU for the GPU. Bulldozer, Piledriver, Intel GPUs or Intel MIC, etc. can be supported either by better drivers, by features yet to be developed in a FahCore, by features yet to be incorporated into OpenCL, or by donors figuring out what combinations of existing FahCores make good use of the hardware. Already I'm seeing complaints from donors that FahCore_17 is distributing its work to both GPUs and CPUs, but there's an unexplored future there and we may find that SMP+CPUs working on separate WUs is inferior to having them work together through a single FahCore. It's really hard to predict the future when we know it's likely to be different from what we've seen in the past.
Some psychologist did a study of how we perceive change. Ask a bunch of people how much the world has changed in the past N years (pick a number N). Then ask them how much they think the world will change in the next N years (same number). People will recognize how much it has changed in the past but will consistently predict that the future will be better, but with a lot fewer changes than we've seen in the past.
Posting FAH's log:
How to provide enough info to get helpful support.
Re: GROMACS 4.6
I perceive change through the numbers I see on the beta forum (and my own Core_17 tests on my HD 7770). Of course that is very preliminary, but it is better than guessing.
My conclusions, by the way, are that not only will Kepler take a big hit, but even the lower-end AMD cards won't do so well (maybe due to low QRB? I don't know.). That is not a disaster, just useful knowledge; I am perfectly willing to upgrade to a better AMD card when the situation calls for it.
-
- Posts: 17
- Joined: Sun Nov 04, 2012 2:42 am
Re: GROMACS 4.6
kasson wrote:We're looking at it. Unfortunately, Gromacs 4.6 sets CPU optimizations at compile time rather than run time (which is when Gromacs 4.5 and prior set optimizations). Therefore it presents a more complicated set of options for FAH. We definitely have plans to develop 4.6-based FAH cores, however.
Would it be possible to compile multiple copies of the same core with different optimisations set, so that there is an AMD and an Intel version of the Gromacs 4.6 core and both fold the same projects?
Re: GROMACS 4.6
Possible? Certainly. Probable? Probably not, but that remains to be seen.
Pick a number of different cores that you think would be needed. Now recognize that the Client gathers some data about what features your system has but it would have to gather a more detailed set of data so that the server will be smart enough to choose the right FahCore to download to your system. That probably means an upgrade to the Client plus an upgrade to the Assignment Server code to accept the extra data describing your system. Third, there has to be a storage structure to store all the copies of each FahCore. Fourth, if a bug is found in one, but not all copies (say an error that only occurs for systems using driver version X but not if someone upgrades to driver version Y) somebody has to decide what to do about the problem.
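The server-side selection step described above could be sketched roughly as follows. This is a hypothetical illustration, not how the Assignment Server actually works: the `CoreBuild` type, the build names, the flag names, and the "most required flags wins" rule are all assumptions made for the example. The idea is that the client reports a set of CPU feature flags and the server hands out the most specialised build whose requirements the client meets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreBuild:
    """One compiled copy of a FahCore (hypothetical structure)."""
    name: str
    required_flags: frozenset  # CPU features this build needs to run


# Hypothetical catalogue of builds of the same core.
BUILDS = [
    CoreBuild("generic", frozenset()),
    CoreBuild("sse2", frozenset({"sse2"})),
    CoreBuild("avx", frozenset({"sse2", "avx"})),
    CoreBuild("avx_fma4", frozenset({"sse2", "avx", "fma4"})),
]


def choose_core_build(reported_flags, builds):
    """Return the most specialised build the reporting client can run.

    'Most specialised' is approximated here as 'requires the most
    flags', which is an assumption made for this sketch.
    """
    runnable = [b for b in builds if b.required_flags <= reported_flags]
    if not runnable:
        raise LookupError("no build of this core can run on the reported CPU")
    return max(runnable, key=lambda b: len(b.required_flags))
```

Even in this toy form, the costs listed above are visible: the client has to report accurate flags, the server has to store and index every build, and a bug in any one build (say `avx_fma4`) affects only the donors who were assigned that copy.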
In the past, the FahCores have depended on run-time optimization whenever possible rather than on compile-time optimization. The general statement of policy is that multiple compile-time optimizations will improve individual throughput slightly but in terms of TOTAL FAH performance, it's not worth the costs associated with the points mentioned in paragraph 1. Most people would be surprised how accurate that statement turns out to be but are preconditioned not to accept it by the advertisements distributed by the manufacturers of new hardware. Even in cases where a measurable improvement could be made, the PG weighs the costs of maintaining more versions of the same core against the costs of increasing the total number of donors, and adding more donors turns out to be the best use of the available funds.
There currently are multiple versions of GPU cores, so it's not like it has never been done. CPUs are well optimized and OSs do a good job of utilizing them. Vendors cannot compete unless they provide reasonable performance for code that's optimized for their competitor. That has not been true for various generations of GPUs but it's becoming more true every day. The FahCore-17 that's currently being tested is written to the OpenCL standard and will eventually replace several previous FahCores which were customized to proprietary interfaces, putting the compatibility burden on the developers of OpenCL and on the drivers, both of which are sold as part of a new GPU purchase, whether it's from AMD, NVidia, (... Intel, etc.). The big advantage for FAH is that they only have to maintain one FahCore.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 141
- Joined: Sun Jun 15, 2008 4:39 pm
- Hardware configuration: Intel® Core™ 2 Duo processor E8500, dual 3.16GHz cores, 6MB L2 Cache, 1333MHz FSB (45nm); 4096MB Corsair™ XMS2 DDR2-800 RAM; 256MB eVGA™ NVIDIA® GeForce™ 8600 GT Video Card
- Location: NYC Metro Area
Re: GROMACS 4.6
bruce wrote: ...The FahCore-17 that's currently being tested is written to the OpenCL standard and will eventually replace several previous FahCores which were customized to proprietary interfaces, putting the compatibility burden on the developers of OpenCL and on the drivers, both of which are sold as part of a new GPU purchase, whether it's from AMD, NVidia, (... Intel, etc.). The big advantage for FAH is that they only have to maintain one FahCore.
I like the fact that according to the recent FAH News posting, "we can [now] do just about any [work unit] calculation on any piece of [GPU and CPU] hardware, it's strange to benchmark [the work units] separately".
Practically speaking, I can now upgrade an older computer with a more efficient video card, and postpone the expense of a new system while increasing my folding capability. Works for me!
-
- Posts: 887
- Joined: Wed May 26, 2010 2:31 pm
- Hardware configuration: Atom330 (overclocked):
Windows 7 Ultimate 64bit
Intel Atom330 dualcore (4 HyperThreads)
NVidia GT430, core_15 work
2x2GB Kingston KVR1333D3N9K2/4G 1333MHz memory kit
Asus AT3IONT-I Deluxe motherboard
- Location: Finland
Re: GROMACS 4.6
Yep. I recalled today that Zagen30 did post FahBench results on a 3770K CPU; I reminisced on it in the "CPU Vs. GPU? (Which is now more important?) (3770K versus GT430)" topic. Intel doing OpenCL on the CPU seems a bit weird at first, but starts to make more sense when you consider the new CPU HW optimizations such as AVX.
Back to topic. The recent GPU QRB update blog post gives a clue that SMP will be the yardstick for a long time:
"our plan is to treat all WUs identically, i.e. benchmark on a single benchmark machine (SMP) and use those points."
I'm hazarding a guess that PG will pit GROMACS 4.6 CPU cores against AMD/NVidia OpenCL (FahCore_17) at some point. Seeing a core fight between OpenCL and GROMACS 4.6 - both on "AVX" capable CPUs - would also be nice.
Interesting times...
Win7 64bit, FAH v7, OC'd
2C/4T Atom330 3x667MHz - GT430 2x832.5MHz - ION iGPU 3x466.7MHz
NaCl - Core_15 - display
Re: GROMACS 4.6
Yes, interesting times. FAH has always been known for being on the bleeding edge of MD technology. That isn't for everyone, but personally it keeps me intrigued and forces me to learn new things that are beyond my comfort zone.
Posting FAH's log:
How to provide enough info to get helpful support.