64 bit version?
Moderators: Site Moderators, FAHC Science Team
64 bit version?
Why don't you make a 64-bit version of your client / cores?
Wouldn't that bring more performance?
Re: 64 bit version?
greeny wrote: Why don't you make a 64-bit version of your client / cores? Wouldn't that bring more performance?
The FAH Linux v6 beta is 64-bit already, but the bit "count" will not help, because the FPU does not use the general registers.
-
- Posts: 450
- Joined: Tue Dec 04, 2007 8:36 pm
The Linux version uses both 64-bit and 32-bit code, so you need support for both instruction sets.
The Windows version is all 32-bit and, like the Linux version, would gain almost nothing from being recompiled for 64-bit. That's because 99% of the time is spent waiting for the FPU to do its work. Speeding up the other 1% by, say, 20% would improve the total time by 0.2%, and FAH would have to support a whole new set of 64-bit clients.
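The arithmetic behind that 0.2% figure can be checked with a minimal sketch. The 1% and 20% values are the estimates from this post, not measurements, and the function names are made up for illustration. "20% faster" can be read two ways, so both are shown:

```python
def saved_if_time_cut(fraction, cut):
    """Share of total run time saved when `fraction` of the run takes `cut` less time."""
    return fraction * cut


def saved_if_rate_raised(fraction, gain):
    """Share of total run time saved when `fraction` of the run executes (1 + gain) times faster."""
    return fraction * (1.0 - 1.0 / (1.0 + gain))


non_fpu = 0.01   # the post's estimate: 1% of run time is not FPU work
benefit = 0.20   # the post's hypothetical 20% gain from a 64-bit build

print(f"non-FPU time cut by 20%: {saved_if_time_cut(non_fpu, benefit):.2%} of total")
print(f"non-FPU rate up by 20%:  {saved_if_rate_raised(non_fpu, benefit):.2%} of total")
```

Either way the saving stays at or below 0.2% of the total run time, because the part being sped up is so small to begin with.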
A better question would be: what about making the current Linux version run on 32-bit hardware?
The Windows SMP Beta is less stable than the Linux version, particularly if it's on a machine where it needs to stop and restart a lot, but even there, it's worth a try.
I am not sure what you mean by FPU (maybe Floating Point Unit; my knowledge of computer architecture is quite limited), but I understand that 64-bit cores would not really be a performance boost.
The Linux SMP client is not a real alternative for me, because I have to work with Windows applications most of the time and the deadlines are too short to do just "free-time folding".
-
- Posts: 1024
- Joined: Sun Dec 02, 2007 12:43 pm
The ALU (Arithmetic Logic Unit) and the general registers do the bulk of processing for many types of computer code (FAH is one notable exception). The arithmetic operations that it can do are integer operations, working only on whole numbers.
The FPU (Floating Point Unit) along with a different set of registers perform floating point operations, which can have extremely small (or large) values that are not necessarily whole numbers.
In the very early days of computers, the CPU only had integer operations, and if you needed to do floating point arithmetic, you had to add a separate chip called a coprocessor. By the time the first Pentium rolled around, both were incorporated into the same chip.
Integers come in several flavors such as 8-bit / 16-bit / 32-bit / 64-bit, and they're also used to calculate the addresses of different parts of the computer code, such as when you branch to a different segment of code. For that reason, 64-bit hardware can address much, much larger memory than 32-bit. Floating point numbers are almost always Single Precision, but they are sometimes Double Precision or other things that are even rarer. That was already the case on the coprocessors for the very old 16-bit hardware, such as the 8087 and 80287.
http://en.wikipedia.org/wiki/Coprocessor
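To illustrate the distinction drawn above, here is a small sketch using only the Python standard library. It assumes a flat address space for the memory figures, which are theoretical limits rather than what any real machine supports, and it uses the standard IEEE 754 sizes for the floating point formats:

```python
import struct

# Integer widths set the range of whole numbers (and addresses) that fit.
for bits in (8, 16, 32, 64):
    print(f"{bits:2d}-bit unsigned integer: 0 .. {2**bits - 1}")

# A flat address space with N-bit addresses spans 2**N bytes (theoretical limit).
GIB = 2**30
print(f"32-bit addresses span {2**32 // GIB} GiB")
print(f"64-bit addresses span {2**64 // GIB} GiB")

# Floating point widths are a separate matter: single and double precision
# are 4 and 8 bytes regardless of the CPU's integer width.
single_bytes = struct.calcsize("<f")
double_bytes = struct.calcsize("<d")
print(f"single precision: {single_bytes} bytes, double precision: {double_bytes} bytes")

# Round-tripping 0.1 through single precision shows the lower precision.
x = 0.1
as_single = struct.unpack("<f", struct.pack("<f", x))[0]
print(f"0.1 stored as single: {as_single!r}")
print(f"0.1 stored as double: {x!r}")
```

The point of the sketch is that the size of a floating point value does not change when the integer and address width goes from 32 to 64 bits.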
Re: 64 bit version?
Hi folks, apologies if I missed this in other posts or FAQs,
Under Intel 64 or AMD64, it looks like in 64-bit mode you get double the number of SSE registers (see e.g. section 3.1.1 of http://www.intel.com/design/processor/m ... 253665.pdf, and section 4.3.1 of http://www.amd.com/us-en/assets/content ... /24592.pdf ).
On the GROMACS web site they have the following mention about SSE speedups: "Assembly loops using SSE and 3DNow! multimedia instructions are provided for i386 processors, separate versions using all x86-64 registers are used on Opteron x86-64 and Xeon EM64t machines." (see first bullet point of http://www.gromacs.org/content/view/12/176/ )
Do the GROMACS cores used by F@H make use of the extra registers in 64-bit mode ?
Best regards,
- Robo
Cheers,
- robo (folding as "FoldInTime")
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: 64 bit version?
You are confusing two different things here.
Intel added more SSE registers at the same time they added 64 bit support. The two features are not directly related.
Why do the Core 2 Duo CPUs process Gromacs work units twice as fast as Pentium 4s running at twice the clock speed? Not because of 64 bit support, because they still run twice as fast in Windows XP-32 with a 32 bit fah client. It is because Intel improved the bit width and the number of SSE instructions it could process in a single clock cycle.
64 bit support adds nothing fah can use to go faster. The other CPU architecture improvements made at the same time have helped, but they do not require a 64 bit OS or client.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: 64 bit version?
Hi 7im, thanks for the reply. I see that this has been discussed in former times, with some benchmarks done and maintenance/support costs dominating.
For what it's worth, the main Gromacs developers do seem to have added some inner loops specifically to make use of additional SSE registers:
http://www.gromacs.org/content/view/18/132/ :
Erik Lindahl 27 Dec 2004 X86-64 assembly loops SSE (single) and SSE2 (double) assembly loops have been added for the x86-64 architecture.
http://www.gromacs.org/pipermail/gmx-us ... 16963.html :
The retuned loops are for x86-64; I've rescheduled them to use all
available registers (16, iso 8 for ia32).
http://www.gromacs.org/pipermail/gmx-an ... 00005.html
And e.g. section 4.3.1 of the AMD processor manual indicates that the high 8 XMM registers are available for programs running in 64-bit mode by using the REX instruction prefix.
I can see the issues with integration with the core launcher and the support costs. There would have to be a sizeable benchmark improvement for a 64-bit version to be worthwhile.
Cheers,
- robo (folding as "FoldInTime")