launching/starting F@H using mpirun
Posted: Sun May 31, 2009 8:16 pm
by alpha754293
Is there a way to start F@H using mpirun?
As most of you already know, one of the professors I work with is planning on getting a 64-way system, and I am thinking about using F@H to test the cluster. I am wondering if there is any way for me to start the client using mpirun directly (without having to go through F@H's "CLI").
I am planning on using an existing WU that one of my other systems has downloaded, since I am quite certain that there aren't any 64-way WUs available (yet). In the interest of science, the original WU will be returned normally; I would duplicate it for testing purposes only and WILL NOT be returning the test copy. (For those who are wondering/concerned about that: the WU will be transferred from my main network onto a subnet, after which the subnet will be physically disconnected from my main network in order to prevent an accidental upload when the WU completes.)
IF it works, current estimates put the time required to complete an a2 WU (P2671) at ~1 hour on a 64-way system.
Any help in potentially setting this test up would be greatly appreciated (if it can be done at all).
(If not, then the best I would be able to do would be either 8x8 or 16x4.)
Re: launching/starting F@H using mpirun
Posted: Sun May 31, 2009 10:45 pm
by toTOW
How is the cluster seen by the OS/applications (sorry, I know nothing about clusters)?
If you can make it recognized as a single machine with 64 CPUs, you can run the FAH client with the -smp 64 flag.
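For example, on a box the OS sees as a single 64-CPU machine, the launch might look like this (a minimal sketch; the fah6 binary name and install directory are assumptions):
Code:
    cd ~/folding
    ./fah6 -smp 64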
Re: launching/starting F@H using mpirun
Posted: Sun May 31, 2009 11:34 pm
by alpha754293
toTOW wrote:How is the cluster seen by the OS/applications (sorry, I know nothing about clusters)?
If you can make it recognized as a single machine with 64 CPUs, you can run the FAH client with the -smp 64 flag.
It will likely be either independent OS installations, or the slave nodes will be booted by the master via PXE. So, no, I don't think it would be a monolithic OS install/setup.
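For what it's worth, with independent nodes an MPI launcher would address the cluster through a host list. A hypothetical MPICH-style machinefile for eight 8-way nodes might look like this (node names invented):
Code:
    node01:8
    node02:8
    node03:8
    node04:8
    node05:8
    node06:8
    node07:8
    node08:8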
Re: launching/starting F@H using mpirun
Posted: Mon Jun 01, 2009 10:21 am
by toTOW
For the time being, I think your only solution is to load one SMP client per node ... but I think Pande Group is working on cluster versions of the SMP client, so if they are able to get a decent working build, you might be able to use it.
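A rough sketch of what one client per node could look like, assuming passwordless ssh and the client installed at the same path on every node (node names and paths invented):
Code:
    for node in node01 node02 node03 node04 node05 node06 node07 node08; do
        ssh "$node" 'cd ~/folding && nohup ./fah6 -smp 8 > fah.log 2>&1 &'
    done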
Re: launching/starting F@H using mpirun
Posted: Mon Jun 01, 2009 2:33 pm
by alpha754293
I'm asking because I think that GROMACS can be launched via mpirun, provided that you have all of your run/simulation definition files set up for it.
The way F@H runs right now, it downloads the WU as a single package, unpacks it, and then launches the run itself. Upon completion, it packs everything up and sends the results back.
For what I am trying to do, I would need to step in just before it starts the MPI run in order to modify the run parameters. In GROMACS, you can edit the run definition, but it doesn't really "work backwards" to call mpirun to start it. (Well, OK, it does, but not necessarily in the sense that I would be able to run it across multiple systems, the way an MPICH Beowulf cluster can launch programs. I hope that makes sense.)
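For comparison, this is roughly how a stand-alone GROMACS run is launched over MPI (a sketch assuming a GROMACS 4.x install with an MPI-enabled mdrun_mpi binary; file names are illustrative):
Code:
    # build the binary run input (.tpr) from the run definition files
    grompp -f run.mdp -c conf.gro -p topol.top -o topol.tpr
    # launch 64 ranks across the nodes listed in the machinefile
    mpirun -np 64 -machinefile machines mdrun_mpi -s topol.tpr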
Re: launching/starting F@H using mpirun
Posted: Mon Jun 01, 2009 7:55 pm
by bruce
alpha754293 wrote:I'm asking because I think that GROMACS can be launched via mpirun, provided that you have all of your run/simulation definition files set up for it.
The way F@H runs right now, it downloads the WU as a single package, unpacks it, and then launches the run itself. Upon completion, it packs everything up and sends the results back.
For what I am trying to do, I would need to step in just before it starts the MPI run in order to modify the run parameters. In GROMACS, you can edit the run definition, but it doesn't really "work backwards" to call mpirun to start it. (Well, OK, it does, but not necessarily in the sense that I would be able to run it across multiple systems, the way an MPICH Beowulf cluster can launch programs. I hope that makes sense.)
If your goal is to analyze proteins, you can install a stand-alone version of GROMACS on a cluster. If your goal is to process work assigned by Folding@Home, I'm reasonably confident that this isn't going to work for you.
Stanford University does use clusters to process some of their work -- especially the proteins that are not well suited for distribution as part of FAH. They have not developed a cluster-based version of FAH, and I have no reason to believe such a version might be distributed soon (if at all). That could change at any time, but I wouldn't count on it.
What you're trying to do may have some real merit, but the applicability would be rather limited. FAH was specifically designed to take advantage of "home" computers, and there just aren't that many home users who have high-powered clusters. The FAH developers have to weigh the value of the WUs processed (both quantity and speed) against the support costs of developing and maintaining a new and rather complex client. They then have to compare that to spending those same resources on other items on their to-do list.
Please read the EULA carefully.
Re: launching/starting F@H using mpirun
Posted: Mon Jun 01, 2009 9:38 pm
by alpha754293
Yes, I realize and understand that. But considering that Intel is already working on a 32-core processor, I think it is only a matter of time before those questions become all the more relevant.
If there's a way for me to put the cluster to work on a WU as a single 64-way MPI run, I would love to try it and see what kind of results we'd get.
But in no way am I modifying the underlying core for this run. (And if I do coax it to work 64-way, the system will be physically disconnected from the network to prevent it from accidentally uploading the result, as stated before. So either I try to do it on my own, or, hopefully, I might be able to get a bit of support, help, and assistance to pull it off.)
With all the debate lately about speeding up science, wouldn't it be nice to take a current a2 WU (which takes about 8 hours to run on my 8-way system) and complete it in an hour?
It isn't very often that we get to play with such hardware, and if it works, I think it would give an interesting preview of the level of performance we may see in the future.
Granted, getting it to run on a cluster is more complicated than on a single 32-core system, I'll admit that. But from an mpirun perspective, it makes no difference (or very nearly none, since Beowulf handles a fair bit of it).
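To illustrate: with a placeholder ./mpi_app standing in for whatever MPI binary gets launched, the two cases differ only in the host list:
Code:
    # single 64-core machine: all ranks run locally
    mpirun -np 64 ./mpi_app
    # Beowulf-style cluster: same command plus a machinefile naming the nodes
    mpirun -np 64 -machinefile machines ./mpi_app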