
Nine new papers out; three exceptionally impressive

Posted: Mon Feb 25, 2013 6:19 pm
by Jesse_V
Nine new papers have recently been added to the Papers page on the FAH website (we're now up to 109; good job, everyone!). One presents some new results on Alzheimer's, another is about immunotherapeutics, there's one on improving pain medicines, and the rest are more general. I found three of the papers exceptionally impressive and interesting. In particular, these quotes from their bodies stood out to me:

#101: Slow unfolded-state structuring in Acyl-CoA binding protein folding revealed by simulation and experiment.
To gain further insight into the folding mechanism, we independently perform a large-scale molecular simulation study of ACBP folding. Previously, systems such as ACBP have been too large and slow for simulation studies, which have been limited to nanosecond to microsecond time scales for proteins with less than 40 residues. Today, recent advances in simulation methodology(8-10) and network models called Markov state models (MSMs), in which conformational dynamics is modeled as transitions between kinetically metastable states, make it possible to model folding on the millisecond time scale. Here, we use over 30 ms of trajectory data to construct a MSM of ACBP folding, which predicts residual unfolded-state structure and kinetics consistent with experiment.
Even the most sophisticated single-molecule experiments, however, cannot resolve the entire microscopic complexity of folding due to the limited number of photons that can be detected on the microsecond time scale. It is therefore likely that ensemble and single-molecule fast kinetic observables cannot capture the full complexity of folding, and instead we must turn to computer simulation. We expect Markov state model approaches to be increasingly useful in this regard, as direct comparisons to experiment can be made by projecting predicted microscopic dynamics onto macroscopic observables.
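To make the MSM idea in that quote concrete, here's a toy two-state sketch in plain Python (all numbers are hypothetical and mine, not from the paper): count transitions in a discretized trajectory, row-normalize the counts into a transition probability matrix, and read off an implied timescale.

```python
import math

# Hypothetical discretized trajectory: each frame is assigned to a
# metastable state (0 = unfolded, 1 = folded). The values and the
# lag time below are made up for illustration.
traj = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
tau = 10.0  # lag time between frames, in ns (hypothetical)

# Count transitions between consecutive frames.
counts = [[0, 0], [0, 0]]
for a, b in zip(traj, traj[1:]):
    counts[a][b] += 1

# Row-normalize the counts into a transition probability matrix T.
T = [[c / sum(row) for c in row] for row in counts]

# For a 2x2 stochastic matrix the eigenvalues are 1 and
# lambda2 = T[0][0] + T[1][1] - 1; the corresponding implied
# (relaxation) timescale is -tau / ln(lambda2).
lam2 = T[0][0] + T[1][1] - 1
timescale = -tau / math.log(lam2)
print("T =", T, "| implied timescale (ns):", round(timescale, 2))
```

A real MSM has hundreds or thousands of states and uses a full eigendecomposition, but the pipeline — discretize, count, normalize, extract slow timescales — is the same.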
#108: OpenMM 4: A Reusable, Extensible, Hardware Independent Library for High Performance Molecular Simulation
The CUDA simulations were run on a single Nvidia GTX 580 GPU. For OpenCL, simulations were run on both the GTX 580 and on an AMD Radeon HD 7970. The OpenCL platform can parallelize an explicit solvent simulation across multiple GPUs. We therefore repeated each of the above benchmarks using one to four Nvidia C2070 GPUs in parallel. The results are shown in Table 5.

Code:

				1 GPU		2 GPUs		3 GPUs		4 GPUs
Explicit-RF, H-bonds	  	25.9 (1.0)	40.2 (1.55)	48.5 (1.87)	52.3 (2.02)
Explicit-RF, H-angles		47.6 (1.0)	69.4 (1.46)	80.8 (1.70)	87.9 (1.85)
Explicit-PME, H-bonds		16.5 (1.0)	27.1 (1.64)	30.1 (1.83)	29.8 (1.81)
Explicit-PME, H-angles		30.9 (1.0)	49.7 (1.61)	55.5 (1.80)	54.9 (1.78)
All results are in ns/day. The value in parentheses is the speedup relative to a single GPU.
The scaling with the number of GPUs is much less than linear. Using up to three GPUs produces a significant speedup, but there is little benefit beyond that. For PME, using four GPUs is actually slightly slower than using three. This is primarily due to the cost of transferring data over the high latency PCIe bus.
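As a rough sanity check on that explanation, here's a quick Python sketch comparing the measured Explicit-PME (H-bonds) speedups from Table 5 against Amdahl's law. This is my own back-of-the-envelope model, not from the paper; Amdahl's law assumes a fixed serial fraction and ignores communication cost entirely:

```python
# Amdahl's law: speedup(n) = 1 / (s + (1 - s) / n), where s is the
# serial (non-parallelizable) fraction of the work.
measured = {1: 1.0, 2: 1.64, 3: 1.83, 4: 1.81}  # Explicit-PME, H-bonds (Table 5)

# Solve for the serial fraction s implied by the measured 2-GPU speedup.
n = 2
s = (1 / measured[n] - 1 / n) / (1 - 1 / n)

def amdahl(n, s):
    return 1 / (s + (1 - s) / n)

pred4 = amdahl(4, s)
print("implied serial fraction:", round(s, 3), "| predicted 4-GPU speedup:", round(pred4, 2))
# Amdahl predicts ~2.4x on 4 GPUs, yet only 1.81x was measured -- and the
# gap grows with GPU count. That's consistent with a per-step communication
# overhead (the PCIe transfers the paper blames) rather than a fixed
# serial fraction.
```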
Edit: OpenMM is one of the key pieces of software behind the FAH GPU cores. Proteneer informed me that OpenMM 5 powers the upcoming FahCore 17 (Zeta), and that all of the numbers above were generated with OpenMM 4. OpenMM 5.1, not yet released, should introduce a lot of exciting scientific features.

#109: To milliseconds and beyond: challenges in the simulation of protein folding
The list of potential questions the folding field might hope to address through simulation is long. A few of the most exciting include:
  • Can we build models allowing for the detailed comparison of simulations to experiment in order to both test simulations and aid in the interpretation of experiments [81]? Further, simulations might be able to direct the design of future experiments, suggesting those with the greatest impact.
  • With a detailed comparison to experiment in hand as a test of simulation accuracy, can we answer how particular proteins fold? Why do so many proteins appear to fold in a two-state manner? What is the nature of ‘downhill’ folding? Can we describe these in microscopic, physical terms?
  • With knowledge of the mechanism by which particular proteins fold, can we learn how this mechanism is encoded in the inherent physical interactions of the amino acids in a given protein sequence?
  • With the knowledge of how many individual proteins fold, can simulations help reveal general features of protein folding amongst broad groups of proteins (or ideally some general properties for all proteins)?
Much effort has been poured into advancing molecular simulation, and in this decade the fruits of that effort are coming to bear. Hopefully with continued progress in sampling and forcefields, combined with powerful analysis techniques, simulation can play a key role — alongside experiment and theory — in discovering how proteins fold.
These papers have some really neat abstracts on the site as well. I'm pretty impressed by these new papers and thought I'd share. In particular, the information from #108 helps answer a question that has been asked a number of times on this forum.

Re: Nine new papers out; three exceptionally impressive

Posted: Tue Feb 26, 2013 8:09 am
by SodaAnt
The really interesting thing here seems to be the use of multiple GPUs. The loss of scaling at four GPUs doesn't sound like much of a problem, especially since someone with four GPUs could always run two separate calculations on 2 GPUs each if that is more efficient. Further, I wonder what the main bottleneck is; I would assume it's the PCIe bus, but is that bandwidth-related, latency-related, or something else?

Re: Nine new papers out; three exceptionally impressive

Posted: Tue Feb 26, 2013 8:39 am
by bruce
Latency is a big factor, but it can also be bandwidth. With a slow GPU, the I/O between main RAM and VRAM can easily be overlapped with computing. With a fast GPU, it's a lot harder to feed it enough data to keep it busy. (That's oversimplified, but still true.)
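A toy numeric example of that overlap effect (hypothetical times, plain Python): when transfer overlaps with compute, the per-step time is max(compute, transfer) instead of their sum, so a fast GPU ends up transfer-bound even though the transfer itself hasn't changed:

```python
transfer_ms = 2.0  # per-step PCIe transfer time (made-up number)

for label, compute_ms in [("slow GPU", 20.0), ("fast GPU", 1.5)]:
    overlapped = max(compute_ms, transfer_ms)  # transfer hidden behind compute
    serial = compute_ms + transfer_ms          # no overlap at all
    hidden = transfer_ms <= compute_ms         # is the I/O fully hidden?
    print(label, "| overlapped:", overlapped, "ms | serial:", serial,
          "ms | transfer hidden:", hidden)
# The slow GPU hides the 2 ms transfer behind 20 ms of compute; the fast
# GPU finishes computing in 1.5 ms and sits idle waiting on the same
# transfer -- it has become the bottleneck.
```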

Re: Nine new papers out; three exceptionally impressive

Posted: Tue Feb 26, 2013 8:53 am
by SodaAnt
bruce wrote:Latency is a big factor, but it can also be bandwidth. With a slow GPU, the I/O between main RAM and VRAM can easily be overlapped with computing. With a fast GPU, it's a lot harder to feed it enough data to keep it busy. (That's oversimplified, but still true.)
Can the SLI or xFire link be used for normal data transfer, or is that restricted for other purposes?

Re: Nine new papers out; three exceptionally impressive

Posted: Wed Feb 27, 2013 5:41 am
by bruce
The information we were given several years ago was that SLI/XFire links are too slow for FAH, though I've never seen an explanation of how moving data via SLI/XFire compares with PCIe data transfers. Since the FahCores have been designed to prepare work packets in CPU RAM which can be processed by a single GPU, I'm not sure that direct GPU-to-GPU transfers would be useful, but there might be unexpected optimizations if a FahCore were somehow designed to use more than one GPU on the same WU.

Re: Nine new papers out; three exceptionally impressive

Posted: Wed Feb 27, 2013 8:40 am
by SodaAnt
bruce wrote:The information we were given several years ago was that SLI/XFire links are too slow for FAH, though I've never seen an explanation of how moving data via SLI/XFire compares with PCIe data transfers. Since the FahCores have been designed to prepare work packets in CPU RAM which can be processed by a single GPU, I'm not sure that direct GPU-to-GPU transfers would be useful, but there might be unexpected optimizations if a FahCore were somehow designed to use more than one GPU on the same WU.
Well, there's always the option of combining the two (PCIe and the SLI/XFire link) for greater throughput between the GPUs, though I'm not sure how well that approach would work without some hard data.

Re: Nine new papers out; three exceptionally impressive

Posted: Thu Feb 28, 2013 1:15 am
by bruce
SodaAnt wrote:Well, there's always the option of combining the two (PCIe and the SLI/XFire link) for greater throughput between the GPUs, though I'm not sure how well that approach would work without some hard data.
Paper #108 seems like hard data to me and it's not encouraging.

Can performance be optimized beyond that level? Perhaps -- perhaps not. Naturally PG will try to improve those numbers, but there's no way to tell until they try and we get more hard data on the results.

Re: Nine new papers out; three exceptionally impressive

Posted: Thu Feb 28, 2013 10:40 am
by PantherX
The 2-GPU scaling is rather promising. I do hope that we will reach speedups of 1.8 to 1.95 in the near future. Regarding the results, do these depend on the driver version used? If so, newer, improved drivers might yield better results for 2 GPUs.