
Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 1:30 am
by 7im
alpha754293 wrote:Gawd...I LOVE what a bunch of mindless drones you guys are.

While I can certainly understand and appreciate the lack of human resources to port the F@H program over to other platforms, I'd like to speculate (however wrong this may be) that this might just be a case of "you don't know because you've never tried."

And as for the recent advances in porting the client to GPUs and the PS3, from what I've read they're still very specialized clients due to the architecture of those processors; while they return results faster, they can only handle a limited subset of the simulations at hand, which means that the normal x86/x64 processors still have to do the remainder.

IF there is so much computational work left to be done, I am surprised that there isn't a greater emphasis on eliciting computational assistance from some of the fastest mainframes. Yes, I would agree and admit that (as the distributed.net project shows) the cumulative computational contribution from those would be very, very small. BUT, on the other hand, for those of us who will be entering the workforce within the next 5-10 years or so, and who will be placed in charge of those very same systems, WE might actually have the authority to make the call on whether to put F@H on the mainframe or not.

As I've also mentioned before, I'm not a programmer, and therefore I am incapable of effecting change by making code contributions to the project. However, if the intent is to make this program usable by the greatest number of people (and I've read the 200+ pages on the GROMACS core, which actually looks like the same caliber of simulation work that I do for my job), then I am surprised that there hasn't been more emphasis or effort put into it (yes, I am making a presumption here and I could be VERY wrong about it, but I'm okay with being wrong).

If it's not Windows/Linux/Mac on Intel/AMD (or, for brevity, PPC), it's as if we put the shaders on and bury ourselves to wallow in our self-pity.

I know that if I were a CTO or CIO, I'd be running F@H company-wide. Oh wait. I can't, cuz there are no clients for it. *rolls eyes*

Mindless drones? Gawd... I love your naive enthusiasm. :roll:

Sorry, but codysluder and I (and many others) have been with the project a long time. We're politely trying to say "been there, done that" several times over. Reality is not self-pity. We already know and understand the limitations of fah and of Pande Group. As I suggested in other threads, porting to Sun/Solaris has already been discussed at length; that is how we know this information, rather than just posting unsupported negative comments. Search really is your friend.

But I'm glad you recognize the limited nature of Pande Group's development resources. You cited the GPU2 port as an example of a porting success on different hardware. And I'll cite the GPU2 port as an example of PG using their limited resources to pursue only the most productive option (faster, and with a larger number of potential clients) to get the most bang for their development buck. The potential power of the GPU2 port was worth the top priority their developers are giving it. That puts porting to Sun or AIX much lower on the priority list, i.e. possible but not probable.

With limited resources, Pande Group does try to support as wide a range of hardware as is feasible. It's naive to conclude from the absence of clients for some platforms that they aren't trying to support many types of hardware. You can't use a negative to prove a positive.

This too was mentioned in several of the other porting threads. Go to a couple of the other larger distributed computing projects and see if they support Solaris. You will find one or three that still do. Now look at the percentage of Sun clients compared to the other types of hardware. As I last recall, Sun was less than 1 percent. Which is especially odd, considering how few DC projects actually support Solaris. You see, any willing Solaris donors would tend to migrate toward and concentrate in those few DC projects that support Sun clients. So even with a concentrated number of Solaris clients, the number of donors is still very small overall. And then consider that the GPU2 clients are producing about half of the TFLOPS for fah. Your fraction of a percent against half of the fah production... hmm, do you really think they would remove resources from the most productive fah client to pursue a tiny Sun client when there is so much more optimization that can be squeezed out of the current client?

We're glad you are an enthusiastic fah supporter, alpha754293, but please be aware there are a few "old hats" around here who have seen it all from day 1. I don't want to dampen your spirit, but most new suggestions are not new to fah or this forum. I do hope you become a CIO someday.

Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 7:34 am
by alpha754293
7im wrote:
[...]
I think that perhaps the reason Solaris isn't supported as much is that there isn't a client for it. Yes, there's no argument in my mind that Windows/Linux/Mac/GPU2/PS3 are the most common platforms.

What I am saying is that perhaps they're perceived to be the most common for all the wrong reasons.

Couple that with companies not being willing to spend their excess computational resources on such projects, and while all the data would point to "there's no demand for it," I don't REALLY buy into that argument.

I think that if spare computational capacity were used for projects like F@H and the like, and the clients existed, people just MIGHT go for it. But if the clients don't exist and companies don't want to "pay," then you're stuck in this profit-based culture where the shallowness of human behavior will, once again, triumph over human endeavor. And that's just a sad state of affairs.


- * - * - * -

To answer your question, I would think so, yes. It depends on where the Pande Group plans to head next. It states in the FAQs that although the PS3 and the GPU clients turn around results VERY quickly, both classes can only work on a limited number of projects due to the nature of the computational architecture/core of the host hardware. Therefore, everything else is still up to your good old, generic x86/x64 processors.

My idea is to port the general x86/x64 client over to these larger, mainframe-class systems, so that companies would be motivated to run it (hopefully, although I really doubt it'd work, because companies are staffed by people, and people are generally inherently shallow).

On the other hand, what's also interesting to me is that you have people who run F@H on their home machines, yet it's "forbidden" (nearly or practically so) on the systems, servers, mainframes, workstations, etc. that they might have an influence on at work, which therefore aren't running the F@H client. So I wonder how people can live with such hypocrisy.

I talk about HPC clients and servers and mainframes and being a CIO/CTO not because it's "fun," but because there are significant practical ramifications. And while it might take some time for me to get to a position where I would have direct influence over my company's computational resources, what I REALLY hope is that by the time I get there, all of these issues, clients, etc. will be worked out, so that I can just issue the massive deployment order for the F@H client and, whenever my work systems aren't busy, they'd be contributing.

And as we face our current generation of graduates today, someone please tell me how that would be a bad idea.

As far as I know, and to the best of my capabilities, this is the way I know best to contribute to the project.

On a side note, I am already planning my first blade server system: an estimated $1.2M, 768-core, 42RU cabinet (as a personal acquisition). The timeframe goal I have set for myself for this acquisition is 5-7 years. That ought to be able to run 96 instances of the 8-core Linux SMP client. (The actual system is realistically probably only going to be about $750k; the other $500k would be for the supporting and infrastructural upgrades needed to prepare for and operate the system.)

Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 10:11 am
by P5-133XL
alpha754293 wrote:On a side note, I am already planning my first blade server system: an estimated $1.2M, 768-core, 42RU cabinet (as a personal acquisition). The timeframe goal I have set for myself for this acquisition is 5-7 years. That ought to be able to run 96 instances of the 8-core Linux SMP client. (The actual system is realistically probably only going to be about $750k; the other $500k would be for the supporting and infrastructural upgrades needed to prepare for and operate the system.)
Well, I can't fault you for not thinking big, but I also can't see what, on a personal level, you would ever need such a costly monstrosity for. If it is to create a folding farm, then you are far better off with mass GPU folding rather than blade servers: it is just plain much more productive for the money.

Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 10:27 am
by alpha754293
P5-133XL wrote:
[...]
My "normal" CFD/FEA simulations (would require such a "massive monstrosity").

I was involved in a project before where we were converting one of the city buses from diesel to a diesel-electric hybrid, and my team/group was responsible for the structural side of it. We had to truncate our simulations because the system ran out of RAM (in a run that I believe took 13 hours).

On the CFD side, I've done simulations of internal combustion engines (the combustion process). Those got truncated as well, because a single cylinder took about 100 hours to run for two of the four strokes, and we figured it would have taken nearly 240 hours to run the full four strokes. We never got as far as calculating how long it would have taken to simulate the entire engine, as problems came up during the mesh update loops.

Luckily, external aerodynamics doesn't take that long or that much, at least for a simple flow field. If I were to simulate it with varying degrees of traffic and wind conditions, then yes, that would change things quite substantially.

This does not include the "crazy" stuff. :) ;)

*edit*
The folding farm comes in whenever there's downtime on the system, while I'm working things out and preparing the next big simulation runs.

Remember that GPU folding farms aren't the be-all and end-all of F@H. (gahhh... *mildly annoyed*) Remember that the PS3 and GPU clients can only work on a rather limited subset of the F@H projects because of their processor architecture. Everything else still requires the good ol' x86/x64 CPUs. Yes, the GPU clients are fast. Yes, they generate lots of PPD. But that DEFINITELY doesn't make them more important than any of the other clients.

Besides, if I were to run a PS3 or GPU farm, I'd have to pretty much rewrite all of the other programs that I normally use for simulations for those architectures. Yeah... that's SOOO not happening.

Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 10:50 am
by P5-133XL
I didn't say GPU folding was more important. I said it was more productive. There is a big difference between those two statements.

What you are describing certainly doesn't sound like a personal purchase, but rather more like a business purchase. Since the timeframe is quite a ways out, the money is big enough, and the simulations appear to be perfect for GPU calculations, I'd be trying to hire a team to convert your simulations into GPGPU code. You may get an order of magnitude faster calculation and cheaper hardware costs, which may well pay for that software development team.

Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 11:12 am
by alpha754293
P5-133XL wrote: [...]
Perhaps, but that's only if the software vendors make GPGPU solvers available COTS. I could understand it if it were something I was developing for myself to be able to leverage the power of the GPUs, but since I'm not the one developing the code, sadly, I don't have control over that.

- * - * - * -

I took the push for a GPU F@H farm to be an implicit statement about their importance. My bad. But it's also important to understand why I said what I said, and how I may have arrived at that assumption.

- * - * - * -

Considering that my room is a mini computing lab right now (see the pictures posted in the "most powerful folders" thread), it shouldn't really surprise anyone.

Besides, I also have it planned that if and when I do get a house, I'd like to build it around the Sun MD20. I'm still planning and working out the details, because the current blade servers don't have the computational efficiency that I'm looking for (considering that I can get 100 MFLOPS/W using 2U systems, with the downside being computational density).

The blade server ends up at about 114 MFLOPS/W with a total capacity of 3.84 TFLOPS. The alternate configuration is 21 2U systems, each at 100 MFLOPS/W (minimum), but it results in a total computational capacity of only 1.68 TFLOPS. So... I'm still doing some shopping around (and planning). A rough back-of-the-envelope sketch of what those numbers imply for power draw is below.
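Here's a small, purely illustrative calculation of what those figures work out to in power draw; only the TFLOPS and MFLOPS/W values above come from my actual planning, and everything derived from them (total kW, the per-node split across the 21 boxes) is just worked out for the sake of illustration.

Code: Select all

/* Back-of-the-envelope comparison of the two configurations above.
 * The TFLOPS and MFLOPS/W numbers are the ones quoted in the post;
 * everything derived here is illustrative, not a vendor spec. */
#include <stdio.h>

static double watts_needed(double tflops, double mflops_per_watt)
{
    /* 1 TFLOPS = 1.0e6 MFLOPS, so power = MFLOPS / (MFLOPS per watt) */
    return (tflops * 1.0e6) / mflops_per_watt;
}

int main(void)
{
    double blade_w = watts_needed(3.84, 114.0);  /* 42RU blade cabinet   */
    double twou_w  = watts_needed(1.68, 100.0);  /* 21 separate 2U boxes */

    printf("Blade cabinet: 3.84 TFLOPS at roughly %.1f kW\n", blade_w / 1000.0);
    printf("21 x 2U nodes: 1.68 TFLOPS at roughly %.1f kW (about %.0f W per node)\n",
           twou_w / 1000.0, twou_w / 21.0);
    return 0;
}

So the denser blade cabinet buys a bit more than twice the FLOPS for roughly twice the power, which is exactly why I'm still weighing density against efficiency.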

Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 7:45 pm
by bruce
alpha754293 wrote:Perhaps, but that's only if the software vendors make GPGPU solvers available COTS. I could understand it if it were something I was developing for myself to be able to leverage the power of the GPUs, but since I'm not the one developing the code, sadly, I don't have control over that.
That's probably not an unreasonable hope, but it is also not unreasonable to consider developing something yourself, or to expect to get it in the next update from your software vendor. CFD and FEA are not terribly dissimilar to a simplified MD code. With CFD/FEA, the nodes in the mesh are mostly uniform except at specific boundaries. With MD, each atom has its own mass and force characteristics, so there is nothing like a "mesh" -- just a lot of unique nodes with some additional stochastic characteristics.

The hardware folks do provide an API that would allow a programmer to off-load the parallel node calculations to a GPU for some pretty significant speed gains even if nothing else is changed in the fundamental CFD/FEA code. I'd expect that the code vendors would already be pretty far along with that.
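To make that concrete, here's a minimal sketch of what off-loading a uniform per-node calculation to a GPU can look like -- my own illustration of the idea, not FAH code and not any vendor's solver. Each GPU thread updates one interior node of a simple 1-D diffusion mesh; the boundary nodes are left alone, which mirrors the "uniform except at specific boundaries" point above.

Code: Select all

// Minimal sketch: one explicit diffusion step over a uniform 1-D mesh,
// with one GPU thread per node. Illustrative only.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void diffuse_step(const float *u_old, float *u_new, int n, float r)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)   // interior nodes only; boundaries untouched
        u_new[i] = u_old[i] + r * (u_old[i - 1] - 2.0f * u_old[i] + u_old[i + 1]);
}

int main()
{
    const int n = 1 << 20;        // one million mesh nodes
    const float r = 0.25f;        // diffusion number, kept within the stability limit

    float *u_old, *u_new;
    cudaMalloc(&u_old, n * sizeof(float));
    cudaMalloc(&u_new, n * sizeof(float));
    cudaMemset(u_old, 0, n * sizeof(float));
    cudaMemset(u_new, 0, n * sizeof(float));

    dim3 block(256), grid((n + 255) / 256);
    for (int step = 0; step < 1000; ++step) {
        diffuse_step<<<grid, block>>>(u_old, u_new, n, r);
        float *tmp = u_old; u_old = u_new; u_new = tmp;   // swap buffers
    }
    cudaDeviceSynchronize();
    printf("ran %d steps over %d nodes\n", 1000, n);

    cudaFree(u_old);
    cudaFree(u_new);
    return 0;
}

The per-node arithmetic is identical for every interior node, so the GPU can keep thousands of threads busy; the boundary handling and everything problem-specific stays on the host side.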

Re: MPICH vs. OpenMP

Posted: Fri Jan 23, 2009 9:59 pm
by alpha754293
bruce wrote:
[...]
In truth, I have no idea what they're working on. I know that the last time I was in a training session with them, they said that they're more focused on merging the two separate and distinct CFD codes (from company acquisitions) into a unified solver. So whether they're working on offloading the computations to GPUs, who knows.

They might have some stuff deep in the skunkworks, but I have no connections there, so I really don't know what's going on. If it happens, it would be interesting. If it doesn't, it wouldn't surprise me.

Besides, I'm not sure how a company would build a supercomputer or a compute cluster around GPUs, though. I haven't really read a heck of a lot about ultra-high computational density with MPICH or OpenMP driving GPUs. *shrug* Who knows... maybe you're right. *shrug* A bare-bones sketch of how that combination is usually described is below.
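For what it's worth, here's a bare-bones sketch of the pattern that usually gets described for clusters like that: one MPI (e.g. MPICH) rank per GPU, each rank crunching its own slice of the data on its own device, and the partial results combined over MPI at the end. This is purely my own illustration under those assumptions -- it isn't FAH code, and the build line in the comment is just one plausible toolchain.

Code: Select all

// Bare-bones MPI + CUDA pattern: one rank per GPU, each rank works on its
// own slice of the data, then partial results are combined with MPI.
// Purely illustrative. Assumed build: nvcc -ccbin mpicxx hybrid.cu -o hybrid
#include <cstdio>
#include <vector>
#include <mpi.h>
#include <cuda_runtime.h>

__global__ void square(const double *x, double *y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = x[i] * x[i];
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int ngpus = 0;
    cudaGetDeviceCount(&ngpus);
    cudaSetDevice(rank % (ngpus > 0 ? ngpus : 1));   // one rank per GPU

    const int n = 1 << 20;                           // this rank's slice of the data
    std::vector<double> h_x(n), h_y(n);
    for (int i = 0; i < n; ++i) h_x[i] = rank + 1.0;

    double *d_x, *d_y;
    cudaMalloc(&d_x, n * sizeof(double));
    cudaMalloc(&d_y, n * sizeof(double));
    cudaMemcpy(d_x, h_x.data(), n * sizeof(double), cudaMemcpyHostToDevice);

    square<<<(n + 255) / 256, 256>>>(d_x, d_y, n);
    cudaMemcpy(h_y.data(), d_y, n * sizeof(double), cudaMemcpyDeviceToHost);

    double local = 0.0, global = 0.0;
    for (int i = 0; i < n; ++i) local += h_y[i];

    // Combine each rank's (each GPU's) partial result across the cluster.
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum of squares across %d ranks = %.3e\n", size, global);

    cudaFree(d_x);
    cudaFree(d_y);
    MPI_Finalize();
    return 0;
}

The MPI layer only ever sees plain host buffers here, so the same skeleton works whether each rank's work runs on a CPU core or a GPU; the density question then becomes how many GPUs you can physically pack and power per rack.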