32-bit vs. 64-bit SMP clients?

Moderators: Site Moderators, FAHC Science Team

alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

32-bit vs. 64-bit SMP clients?

Post by alpha754293 »

Is there a real difference between 32-bit and 64-bit SMP clients (i.e. Windows vs. Linux/Mac)?

I understand that 64-bit will be required if a WU needs more than 4 GB of RAM on any given system, but for WUs that need less than 4 GB, is there any real performance benefit?

Or is it more because of what the program is compiled with/in?

Is it really that difficult/different to write a program for 64-bit vs. 32-bit or is it supposed to be as they say it is, that you can just use your existing 32-bit code and recompile it (with minimal changes) for 64-bit?
dschief
Posts: 146
Joined: Tue Dec 04, 2007 5:56 am
Hardware configuration: ASUS P5K-E, Q6600/ 8 gig ram Win-7

2X ASUS z97-K 16 G Ram Win-7_64

Re: 32-bit vs. 64-bit SMP clients?

Post by dschief »

You will see an improvement in frame times for the SMP clients when switching from Windows to Linux.
I have boxes running XP Pro SP3 & Vista Home Premium to support 4 9800 GTX+'s; I'll toss in an occasional Win SMP client to use up processor cycles. Under Windows I'm seeing frame times of around 15 min per %.
Under Fedora FC 8 x86_64, I'm getting frame times of 10 min per %.
{I will be chastised for saying it, but using "taskset" to lock each client to a pair of cores, I've seen Linux finish 2 WUs in less time than it takes to finish one under Windows.}
I doubt memory will be a bottleneck; I have only one box that has more than 2 GB, and usage rarely goes above 35-40%.
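
For anyone who wants to try the core-pinning idea, here is a rough sketch in Python of what locking a process to a pair of cores looks like on Linux. It is not how the FAH client itself does anything; the helper names core_pairs and pin_to_cores are made up for illustration, and it assumes that consecutive core IDs are a sensible pairing, which depends on how your kernel numbers the cores. os.sched_setaffinity is the standard-library call that does the same job as the taskset command.

```python
import os


def core_pairs(n_cores: int) -> list[set[int]]:
    """Split core IDs 0..n_cores-1 into consecutive pairs (hypothetical helper).

    An odd core left over at the end is simply not assigned to any pair.
    """
    return [{i, i + 1} for i in range(0, n_cores - 1, 2)]


def pin_to_cores(pid: int, cores: set[int]) -> None:
    """Restrict an already-running process to the given cores (Linux only)."""
    os.sched_setaffinity(pid, cores)


if __name__ == "__main__":
    # Show which pairs would be used on this machine; nothing is pinned here.
    for pair in core_pairs(os.cpu_count() or 1):
        print(sorted(pair))
```

To actually pin a client you would call pin_to_cores with the client's process ID and one of the pairs. Whether adjacent IDs share a cache or a socket varies by machine, so check your topology before assuming the pairing is a good one.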
P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine 3:

Dell Dimension 8400, 3.2GHz P4 4x512GB Ram, Video card GTX 460, Windows 7 X32

I am currently folding just on the 5x GTX 460's for aprox. 70K PPD
Location: Salem. OR USA

Re: 32-bit vs. 64-bit SMP clients?

Post by P5-133XL »

There is a big difference between Linux and Windows SMP clients, but it really isn't a 64-bit issue. Rather, Linux has the A2 core while Windows does not, and that specific core is much more efficient and scales much better. For some unknown reason the A2 core has not been successfully transferred to Windows. If you use the A1 core WUs that run in both Windows and Linux, the speed difference between the two OSes is only a slight gain for Linux.
alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Re: 32-bit vs. 64-bit SMP clients?

Post by alpha754293 »

P5-133XL wrote: There is a big difference between Linux and Windows SMP clients, but it really isn't a 64-bit issue. Rather, Linux has the A2 core while Windows does not, and that specific core is much more efficient and scales much better. For some unknown reason the A2 core has not been successfully transferred to Windows. If you use the A1 core WUs that run in both Windows and Linux, the speed difference between the two OSes is only a slight gain for Linux.
So, in other words, for those that want the optimum PPD, they should really go with Linux SMP beta client?

Would there be any preference for Linux clients? (like has anyone ever done a test to see if one distribution is faster than others?)
bollix47
Posts: 2965
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 32-bit vs. 64-bit SMP clients?

Post by bollix47 »

I can't say which distribution is "best" but if you go with Ubuntu, use 8.04.1 rather than 8.10 as the SMP client will produce roughly 30-40% more PPD on 8.04.1. At least that's my experience and I've seen others report the same especially on 4 or more cores.

viewtopic.php?f=44&t=7362
viewtopic.php?f=44&t=6704&start=15
alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Re: 32-bit vs. 64-bit SMP clients?

Post by alpha754293 »

bollix47 wrote:I can't say which distribution is "best" but if you go with Ubuntu, use 8.04.1 rather than 8.10 as the SMP client will produce roughly 30-40% more PPD on 8.04.1. At least that's my experience and I've seen others report the same especially on 4 or more cores.

viewtopic.php?f=44&t=7362
viewtopic.php?f=44&t=6704&start=15
Historically, I've run the F@H SMP Linux client on Red Hat Enterprise Linux 4 AS (in order to support some of the other simulation work that I do).

I don't know if some of the "normal" desktop Linux distributions will support 8 cores or more.
bollix47
Posts: 2965
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 32-bit vs. 64-bit SMP clients?

Post by bollix47 »

Some of us are running Ubuntu on an i7 with HT on and using the -SMP 8 flag and it's working very well. :wink:

On my 940 @ stock of 2.93 GHz I routinely get ~7800 PPD.
alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Re: 32-bit vs. 64-bit SMP clients?

Post by alpha754293 »

bollix47 wrote:Some of us are running Ubuntu on an i7 with HT on and using the -SMP 8 flag and it's working very well. :wink:

On my 940 @ stock of 2.93 GHz I routinely get ~7800 PPD.
I would think that HTT would be bad for something like F@H: you still have only 4 physical cores, and given how computationally intense F@H is, enabling HTT and then forcing -smp 8 (which I don't think actually works, because the clients are hardcoded to 4 cores) just constrains FPU resources.

What's your PPD without HTT? What would happen if you were to run two instances of the FAH Linux SMP client?

Additionally, running Ubuntu on a Core i7 with HTT still doesn't mean 8 cores. (In my case, I am running a true 8-core system: quad AMD Opteron 880.) I'm asking because, in my experience, OSes that are locked to 4 processing units (regardless of configuration) typically have a hard time with my system: they don't know how to handle/address the other 4 cores, and because the sockets/cores are keyed, I can't run two instances of the operating system instead (and actually bind the entire OS to a 64-bit hex key range).
alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Re: 32-bit vs. 64-bit SMP clients?

Post by alpha754293 »

bollix47 wrote:I can't say which distribution is "best" but if you go with Ubuntu, use 8.04.1 rather than 8.10 as the SMP client will produce roughly 30-40% more PPD on 8.04.1. At least that's my experience and I've seen others report the same especially on 4 or more cores.

viewtopic.php?f=44&t=7362
viewtopic.php?f=44&t=6704&start=15
Why is there such a huge difference between Ubuntu 8.04 and 8.10 with the SMP client?
bollix47
Posts: 2965
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 32-bit vs. 64-bit SMP clients?

Post by bollix47 »

HT on the i7 works much better than it did on previous versions. I don't remember what my PPD was when I had HT turned off but it was significantly less.

I can't comment on your 8-core system other than to say that I've seen comments on this forum about the AMD having less cache than the Intel and thus producing less PPD, since a larger cache does speed up some FAH calculations.

The L3 cache on the i7 is 8meg and it's available to all cores at the same time rather than having 2 L2 caches of 4 meg each as in some previous Intel multi-core processors.

With 8.04 as I said I get around 7800 PPD. With 8.10 it drops to around 5400. Some have speculated it has something to do with the later kernel in 8.10 and that it has been fixed in later kernels but with Ubuntu that may have to wait until 9.04 or manually compiling the newer one.
bollix47
Posts: 2965
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 32-bit vs. 64-bit SMP clients?

Post by bollix47 »

Some results with -SMP versus -SMP 8 can be seen at:

viewtopic.php?f=44&t=7266
alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Re: 32-bit vs. 64-bit SMP clients?

Post by alpha754293 »

bollix47 wrote:HT on the i7 works much better than it did on previous versions. I don't remember what my PPD was when I had HT turned off but it was significantly less.

I can't comment on your 8-core system other than to say that I've seen comments on this forum about the AMD having less cache than the Intel and thus producing less PPD, since a larger cache does speed up some FAH calculations.

The L3 cache on the i7 is 8meg and it's available to all cores at the same time rather than having 2 L2 caches of 4 meg each as in some previous Intel multi-core processors.

With 8.04 as I said I get around 7800 PPD. With 8.10 it drops to around 5400. Some have speculated it has something to do with the later kernel in 8.10 and that it has been fixed in later kernels but with Ubuntu that may have to wait until 9.04.
I haven't tested my 8-core system (by itself) for PPD so I don't know for sure either. Couple that with the issues that the 6.22 Windows SMP beta client was having with EUEs, I'm pretty sure that I lost a fair bit of work to that. I just might have to spend a bit of time to test that system.

Additionally, from what I can see, I don't think that my dual-cores HAVE an L3 cache (they don't need it). It'll be a while before I move to either a 16-core or 32-core system using the AMD quad-core processor (mostly because of cost).

From what I can tell though, it wouldn't entirely surprise me if a single Core i7 940 would be faster (for F@H) than my current 8-core system.

Having said that though, the downside with the Core i7 is that it isn't quite realistically practical for it to be able to use/support > 16 GB of RAM, which is nearly a requirement for a lot of the simulation stuff that I do.

(On a sidenote, my friend and I have benched my Q9550 OC'd to 3.4 GHz and it was getting about the same level of performance (minus the obvious RAM limitation of course) as my old quad-socket system (4x AMD Opteron 870, 2.0 GHz dual-core)).

If someone can help me calculate the PPD based on the times from the console outputs, I can definitely post that and we can find out how much my 8-core system is contributing (without having to actually run it for a few days straight to stabilize the PPD).
bollix47
Posts: 2965
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 32-bit vs. 64-bit SMP clients?

Post by bollix47 »

alpha754293 wrote: If someone can help me calculate the PPD based on the times from the console outputs, I can definitely post that and we can find out how much my 8-core system is contributing (without having to actually run it for a few days straight to stabilize the PPD).
There are a couple of 3rd party tools for calculating PPD. FAHMON (Windows binary and Linux via compile) and FahSpy (Windows only) are two of the more popular ones.

viewtopic.php?f=14&t=52
alpha754293 wrote: Additionally, from what I can see, I don't think that my dual-cores HAVE an L3 cache (they don't need it). It'll be a while before I move to either a 16-core or 32-core system using the AMD quad-core processor (mostly because of cost).
The L3 is new for Intel in Nehalem although AMD has had it for some time.
alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Re: 32-bit vs. 64-bit SMP clients?

Post by alpha754293 »

bollix47 wrote:
If someone can help me calculate the PPD based on the times from the console outputs, I can definitely post that and we can find out how much my 8-core system is contributing (without having to actually run it for a few days straight to stabilize the PPD).
There are a couple of 3rd party tools for calculating PPD. FAHMON (Windows binary and Linux via compile) and FahSpy (Windows only) are two of the more popular ones.

viewtopic.php?f=14&t=52
Is that how people are calculating it? I thought that they were just reading the console output readings and just getting it from there. Hmm...that would be interesting if they're not doing the PPD calculations themselves.
bollix47 wrote:
Additionally, from what I can see, I don't think that my dual-cores HAVE an L3 cache (they don't need it). It'll be a while before I move to either a 16-core or 32-core system using the AMD quad-core processor (mostly because of cost).
The L3 is new for Intel in Nehalem although AMD has had it for some time.
Not for my dual-cores they don't.
P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am

Re: 32-bit vs. 64-bit SMP clients?

Post by P5-133XL »

Yes, most people use a 3rd party tool such as FAHMON to calculate PPD. However, you can calculate your own, if that is your desire.

Look in your fahlog.txt to get the WU you are working on and two adjacent frame times (one % apart).

Subtract the two times. Do note that normal subtraction won't work because they are times in the form of hours:minutes:seconds. When you subtract the seconds and need to borrow from the minutes, each minute borrowed is worth 60 seconds, and the same goes for minutes and hours. After you struggle with base 60, you will get a frame time in the form of xx:yy (xx minutes and yy seconds).

Now you need to convert that into decimal time by dividing the yy seconds by 60 and adding that to the xx. You now have the time in the form xx.z minutes.

But that is only one percent of the entire WU, so an entire WU takes 100 times as long. Multiply xx.z * 100 to get aa.c, the number of minutes per WU.

Next you need to find out how many WUs you can complete in a day by dividing the 1440 minutes in a day (60 minutes/hour * 24 hours/day) by aa.c: 1440 / aa.c = bb.d

Now look up the point value of your specific WU at: psummary

Now you calculate the points per day by multiplying the point value of the WU by bb.d (the number of WUs you can complete in a day).

That is how you calculate PPD and that is also why most people choose to use a 3rd party program to do the calculating for them.
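
For those who would rather not do the base-60 arithmetic by hand, the steps above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the timestamps are plain HH:MM:SS strings copied from the log by hand, the two frames are exactly one percent apart, and the point value of 1000 is a placeholder, not a real WU value. The function names frame_minutes and points_per_day are made up for this example.

```python
from datetime import datetime, timedelta

MINUTES_PER_DAY = 24 * 60  # 1440


def frame_minutes(earlier: str, later: str) -> float:
    """Minutes between two HH:MM:SS timestamps that are one percent apart."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    if delta < timedelta(0):
        # The later timestamp rolled past midnight, so add a day back.
        delta += timedelta(days=1)
    return delta.total_seconds() / 60


def points_per_day(frame_min: float, wu_points: float) -> float:
    """PPD from the time for one percent of a WU and the WU's point value."""
    wu_minutes = frame_min * 100                # a WU is 100 frames
    wus_per_day = MINUTES_PER_DAY / wu_minutes  # minutes in a day / minutes per WU
    return wu_points * wus_per_day


if __name__ == "__main__":
    # Frame times quoted earlier in this thread; the point value is a placeholder.
    for label, minutes in (("Windows, 15 min/frame", 15.0), ("Linux, 10 min/frame", 10.0)):
        print(label, round(points_per_day(minutes, 1000), 1))
```

Swap in your own two timestamps and the real point value for your WU. The ratio between the two results does not depend on the point value, so it also gives a quick way to compare two machines running the same WU.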