Intel MIC (aka Xeon Phi)
Moderator: Site Moderators
Forum rules
Please read the forum rules before posting.
Re: Intel MIC (aka Xeon Phi)
Yes and no.
Standard loops are non-SSE, but that code no longer exists except in FahCore_78. The message comes from some old code that should be removed; at that point it's really saying that it will attempt to turn off optimizations, but since the standard loops no longer exist, the message actually means nothing. [Two different programmers were working on two different code segments, and the most recent update skipped a couple of (nonessential) cleanup steps.]
Posting FAH's log:
How to provide enough info to get helpful support.
Re: Intel MIC (aka Xeon Phi)
http://software.intel.com/en-us/blogs/2 ... oprocessor
So the GPU paths should be capable of running as well.
Edit: since the link got truncated and people likely missed the point, MIC can run OpenCL.
Last edited by mhouston on Thu Nov 15, 2012 5:06 am, edited 1 time in total.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Intel MIC (aka Xeon Phi)
Means nothing. Cosmetic error. SSE is hard coded to ON despite that fahcore message.
Check for yourself... Frame times are exactly the same whether you see that message or not. Or forum search for the history on this. Been mentioned many times.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Posts: 285
- Joined: Tue Jan 24, 2012 3:43 am
- Hardware configuration: Quad Q9550 2.83 contains the GPU 57xx - running SMP and GPU
Quad Q6700 2.66 running just SMP
2P 32core Interlagos SMP on linux
Re: Intel MIC (aka Xeon Phi)
The specs mention a 512-bit SIMD register, but not specifically which SSE x.x instructions it will run. The price for 60 1.05 GHz cores that use MPI, which this project no longer supports, kinda makes the SSE a moot point. Intel didn't release specs good enough to base a decision on. Usually in PC hardware, if it doesn't say it will do it, then it will not. They don't miss a chance to brag.
Re: Intel MIC (aka Xeon Phi)
In other words, it isn't going to be any use for folding. Oh well, every cloud has a silver lining, and I can tick two of these off my Christmas list.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Intel MIC (aka Xeon Phi)
PinHead wrote: The specs mention a 512-bit SIMD register, but not specifically which SSE x.x instructions it will run. The price for 60 1.05 GHz cores that use MPI, which this project no longer supports, kinda makes the SSE a moot point. Intel didn't release specs good enough to base a decision on. Usually in PC hardware, if it doesn't say it will do it, then it will not. They don't miss a chance to brag.
Which SSEx optimizations? None.
http://software.intel.com/en-us/article ... e#overview
Intel® Many Integrated Core Architecture Overview
The Intel® Xeon Phi™ Coprocessor has up to 61 in-order Intel® MIC Architecture processor cores running at 1GHz (up to 1.3GHz). The Intel® MIC Architecture is based on the x86 ISA, extended with 64-bit addressing and new 512-bit wide SIMD vector instructions and registers. Each core supports 4 hardware threads. In addition to the cores, there are multiple on-die memory controllers and other components.
Each core includes a newly-designed Vector Processing Unit (VPU). Each vector unit contains 32 512-bit vector registers. To support the new vector processing model, a new 512-bit SIMD ISA was introduced.
The VPU is a key feature of the Intel® MIC Architecture-based cores. Fully utilizing the vector unit is critical for best Intel® Xeon Phi™ Coprocessor performance. It is important to note that Intel® MIC Architecture cores do not support other SIMD ISAs (such as MMX™, Intel® SSE, or Intel® AVX).
Each core has a 32KB L1 data cache, a 32KB L1 instruction cache, and a 512KB L2 cache. The L2 caches of all cores are interconnected with each other and the memory controllers via a bidirectional ring bus, effectively creating a shared last-level cache of up to 32MB. The design of each core includes a short in-order pipeline. There is no latency in executing scalar operations and low latency in executing vector operations. Due to the short in-order pipeline, the overhead for branch misprediction is low.
Adding OpenCL support is great, and so far it's the only way FAH could run on this coprocessor, but the speed isn't there yet in OpenCL, even with all those cores...
60 cores, 240 threads, 1.0 GHz. With no SSE, each thread runs at about 1/3 speed at best, so it's like 80 cores at 1 GHz, or 40 cores at 2.0 GHz, which is about what a BigAdv machine does now. At $3000, Phi is not cost effective yet. People are building BA machines for half that.
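The core-equivalence estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it reads the post's figure as each of the 240 threads running at roughly one-third speed without SSE, which is the post's own guess rather than a measurement, and the function name is made up for illustration.

```python
def equivalent_cores(threads, slowdown, clock_ghz, target_clock_ghz):
    """Rough count of full-speed cores at target_clock_ghz that match
    `threads` hardware threads each running `slowdown` times slower."""
    effective_ghz = threads / slowdown * clock_ghz
    return effective_ghz / target_clock_ghz

# Assumed inputs: 240 threads at 1.0 GHz, 3x slowdown without SSE (the post's guess)
at_1ghz = equivalent_cores(240, 3, 1.0, 1.0)
at_2ghz = equivalent_cores(240, 3, 1.0, 2.0)

print(f"Like {at_1ghz:.0f} cores at 1 GHz, or {at_2ghz:.0f} cores at 2 GHz")
```

Changing the slowdown factor shows how sensitive the comparison is to that one guess.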
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
-
- Pande Group Member
- Posts: 2058
- Joined: Fri Nov 30, 2007 6:25 am
- Location: Stanford
Re: Intel MIC (aka Xeon Phi)
The kicker here is the 512-bit SIMD register unit. It's not SSE compliant (so more coding to do) but it could be quite powerful.
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
Re: Intel MIC (aka Xeon Phi)
VijayPande wrote: The kicker here is the 512-bit SIMD register unit. It's not SSE compliant (so more coding to do) but it could be quite powerful.
If you believe Intel's marketing, I think it does ~1 TF at DP. Potentially good for FAH research teams, not good for us run-of-the-mill folders (very high price).
If FAH code is still SP, then it should be around 2 TeraFLOPS per card
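For what it's worth, the ~1 TF DP and ~2 TF SP figures line up with a simple theoretical-peak calculation. The sketch below uses the 60 cores and 1.05 GHz mentioned earlier in the thread and the 512-bit vector width from Intel's overview; the factor of 2 FLOPs per lane per cycle (a fused multiply-add) is my assumption, and the function name is hypothetical. Theoretical peak says nothing about what FAH code would actually reach.

```python
def peak_flops(cores, clock_hz, simd_bits, element_bits, flops_per_lane=2):
    """Theoretical peak: cores x clock x SIMD lanes x FLOPs per lane per cycle.

    flops_per_lane=2 assumes one fused multiply-add per lane per cycle.
    """
    lanes = simd_bits // element_bits
    return cores * clock_hz * lanes * flops_per_lane

dp = peak_flops(60, 1.05e9, 512, 64)  # 8 double-precision lanes
sp = peak_flops(60, 1.05e9, 512, 32)  # 16 single-precision lanes

print(f"DP peak: {dp / 1e12:.2f} TFLOPS")
print(f"SP peak: {sp / 1e12:.2f} TFLOPS")
```

Halving the element width doubles the lane count, which is where the 2x between DP and SP comes from.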
Re: Intel MIC (aka Xeon Phi)
The GROMACS code pretty much runs on everything that's in production or has been in recent years. FAH doesn't support all the platforms supported by GROMACS but focuses on hardware that is relatively common in the "at home" environment ... the x86 PC, the PowerPC, then SIMD optimizations, then multi-core and GPUs. (Some older platforms like PowerPC and older GPUs have been deprecated.) If the Intel MIC becomes commonplace with academic researchers, it will likely be supported by gromacs.org. If the price tag prevents the home environment from having enough installed devices to provide a valuable resource for FAH, it may not be supported by FAH. The market acceptance of the device will influence FAH's decision to invest in supporting it or not, and only time will tell.
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 10179
- Joined: Thu Nov 29, 2007 4:30 pm
- Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
- Location: Arizona
- Contact:
Re: Intel MIC (aka Xeon Phi)
FAH is SP, and some of the Intel docs say the SP is as much as 10X faster than the DP.
http://www.intel.com/content/www/us/en/ ... etail.html
Compared with Intel® Xeon® processor E5 family-based servers, the Intel® Xeon Phi™ coprocessor delivers:
• Up to 10x better performance on certain financial services applications*1,9
*9. 2 socket Intel® Xeon® processor E5-2600 product family server vs. Intel® Xeon Phi™ coprocessor (10.75x: Measured by Intel October 2012. 2 socket E5-2670 (8 core, 2.6GHz) vs. 1 Intel® Xeon Phi™ coprocessor SE10P (61 cores, 1.1GHz) on a Single Precision Monte Carlo Simulation. 45,501 options/sec vs. 489,354 options/sec )
But real world performance is never up to the estimates.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Re: Intel MIC (aka Xeon Phi)
What they didn't tell us about that benchmark was whether the Xeon was using SSE to get "up to 4x the performance" of scalar single-core operations, or whether they're just comparing against basic x86 code. Since the advertising copy was written to sell the new product, I think I know, though. A 10x improvement sounds really nice, but if you have to remove the real-world 3.5x optimized code for it to work at all, that's only going to be half as useful as they're advertising.
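To put rough numbers on that concern, here is a small sketch using the throughput figures from Intel's footnote. The 3.5x SSE gain is the figure from this post, not something Intel states, and whether the Xeon baseline was already SSE-optimized is unknown, so the second number is purely a what-if.

```python
# Throughput figures quoted in Intel's footnote (options/sec)
xeon_rate = 45_501    # 2-socket E5-2670
phi_rate = 489_354    # Xeon Phi SE10P

measured_speedup = phi_rate / xeon_rate

# What-if: suppose the Xeon baseline was plain x86 and SSE would have
# made it 3.5x faster (assumed figure, taken from the post above).
assumed_sse_gain = 3.5
speedup_vs_sse_xeon = measured_speedup / assumed_sse_gain

print(f"Advertised speedup: {measured_speedup:.2f}x")
print(f"Against an SSE-optimized baseline (what-if): {speedup_vs_sse_xeon:.2f}x")
```

If the baseline was in fact already using SSE, the what-if line doesn't apply and the advertised ratio stands as measured.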
Posting FAH's log:
How to provide enough info to get helpful support.
-
- Posts: 2948
- Joined: Sun Dec 02, 2007 4:36 am
- Hardware configuration: Machine #1:
Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).
Machine #2:
Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.
Machine 3:
Dell Dimension 8400, 3.2GHz P4 4x512GB Ram, Video card GTX 460, Windows 7 X32
I am currently folding just on the 5x GTX 460's for approx. 70K PPD
- Location: Salem, OR USA
Re: Intel MIC (aka Xeon Phi)
Price and volume are going to be the problems, not performance. I have no doubt that GROMACS can make it work with relatively small changes in coding, since it is basically multiple x86 cores on a coprocessor board. The problem will be that it will be a specialty item and not mass produced for the consumer market. As such it will be expensive; not many will be produced, and those that are sold will be meeting a specific need that is not related to folding. Finding and convincing those people to fold will be like trying to convince companies to fold on their 4P servers, i.e. a very hard sell.
-
- Posts: 10
- Joined: Fri Aug 03, 2012 2:44 pm
- Hardware configuration: ========== RIG #0 ==========
Sony PS3
========== RIG #1 ==========
CPU: AMD Phenom II 965 BE
MEM: OCZ 8GB DDR3 (4x2GB)
GPU: ASUS GTX-560ti (448 core)
GPU: ASUS GTX-560ti (384 core)
PSU: Thermaltake 1200W
HDD: Seagate Baracuda 1TB 7200 RPM
OS: Win-7 Ultimate (64-bit)
========== RIG #2 ==========
CPU: AMD FX-8120
MEM: Crucial 8GB DDR3 (4x2GB)
GPU: MSI GTX-560ti (384 core)
PSU: DiabloTek 600W
HDD: Seagate Baracuda 1TB 7200 RPM
OS: Win-7 Home Premium (64-bit)
========== RIG #3 ==========
CPU: AMD FX-8150
MEM: Corsair 32GB DDR3 (4x8GB)
GPU: EVGA GTX-690 (3072 core)
PSU: OCZ 1250W 80-Plus Gold
SSD: OCZ Vertex-4 256GB
HDD: (2) Seagate Baracuda 1TB 7200 RPM
OS: Win-7 Professional (64-bit)
- Location: Frostbite Falls, Minnesota, USA
- Contact:
Re: Intel MIC (aka Xeon Phi)
Ben_Lamb wrote: The Semiaccurate article on the Phi is very interesting and indicates that it thinks gpgpu is a dead duck. It also indicates that the launch of the Phi is a killer blow to Nvidia's gpgpu dominance. ...
My take on the Intel MIC is that it is squarely aimed at FP64 applications.
The Semiaccurate article makes the assertion that the Xeon Phi will - by orders of magnitude - outperform any GPGPU in the FP64 sphere.
However, they go on to say that GPGPU will still have the clear advantage on FP32 apps... and for quite some time to come.
It seems that Intel is taking a very aggressive, competitive swipe at Nvidia (specifically the Tesla K20 market) for dominance in the FP64 arena.
There are some other DC/HPC projects out there that could definitely benefit from massive, parallel FP64 threads available on Xeon Phi.
However, as was mentioned earlier, there are two considerations germane to F@H:
1) Computational accuracy required by the 'science'
2) Price-point barriers
If the actual computational science does not need FP64 - why build a client?
If the cost-of-ownership is too outrageous, it may not be worthwhile for Stanford to invest in developing a FAHCore client app that will be available only to a few 'elite' donors/folders.
That's my two-cents - for whatever that's worth.
Re: Intel MIC (aka Xeon Phi)
KWSN_PToT wrote: However, as was mentioned earlier, there are two considerations germane to F@H:
1) Computational accuracy required by the 'science'
2) Price-point barriers
I agree completely.
FAH seems to be accomplishing a huge amount of science with FP32 and the only FP64 folding that they've done was a long time ago. I'm assuming that means they decided they didn't need it then. There's just not enough information to predict the future but my guess is that FP32 will be sufficient (for a long time).
The Intel compiler should be able to create code for the MIC ("soon"). Machines in a University Lab (Stanford or researchers elsewhere) can be run with a custom compiled version of GROMACS. If the need for FP64 research is truly very limited it's likely to be done on custom machines so I predict that no attempt will be made to distribute that work through FAH together with a custom FahCore -- especially if there are very few "@HOME" computers that include MIC hardware. We are not in the customer group targeted by Intel.
Posting FAH's log:
How to provide enough info to get helpful support.
Hope for Knights Landing processor?
The next generation Intel Phi processor will have AVX instructions, unlike the previous Phi. Does this give it a chance to get folding support? Does AVX mean it's likely to support folding without changes? It's still a year+ away, unfortunately.
Some info here: http://www.zdnet.com/intel-takes-its-ne ... 000030775/