The Xeon Phi at work at TACC

by Johan De Gelas on November 14, 2012 1:44 PM EST

The Xeon Phi comes on a PCIe card, much like a GPU. Given the architecture's origins as a GPU, the form factor shouldn't come as a surprise. Like modern HPC GPUs, however, the Xeon Phi card has no display output; its role is strictly compute.

The Xeon Phi acts as a multi-core system on a chip running its own operating system, a modified Linux kernel. Each Xeon Phi card has its own IP address; however, the Xeon Phi cannot operate on its own. A "normal" Xeon serves as the host CPU and the Xeon Phi card is a coprocessor, similar to the way your CPU and GPU work together.

Below you can see the SKUs that Intel will offer.

The Xeon Phis inside Stampede are special edition parts. These special editions get 61 cores and run at a slightly higher clock speed (1.1 GHz).

The commercially available 5110P has one fewer core and runs 50 MHz slower than the special edition Phi, but comes with 8 GB of ECC memory. The P suffix indicates that it's passively cooled, relying on the host server for airflow. The 5110P is not cheap at $2699, but it's still more affordable than NVIDIA's Tesla K20 ($3199). The Xeon Phi 5100 series is really intended for memory bandwidth bound applications, thanks to the use of GDDR5 at a 5 GHz data rate and a fully populated 512-bit memory interface.
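As a rough sanity check, the core counts and clock speeds quoted above can be turned into a theoretical peak double-precision figure. The sketch below assumes each core has a 512-bit vector unit (8 double-precision lanes) that can issue one fused multiply-add per lane per cycle; neither detail is stated on this page, and the function name is purely illustrative.

```python
def peak_dp_gflops(cores, clock_ghz, dp_lanes=8, flops_per_lane=2):
    """Theoretical peak double-precision GFLOPS.

    dp_lanes=8 assumes a 512-bit vector unit (8 x 64-bit doubles) and
    flops_per_lane=2 assumes one fused multiply-add per lane per cycle.
    Both are assumptions, not figures from the article.
    """
    return cores * clock_ghz * dp_lanes * flops_per_lane


# Special edition: 61 cores at 1.1 GHz (from the text).
special = peak_dp_gflops(61, 1.1)
# 5110P: one fewer core and 50 MHz less, i.e. 60 cores at 1.05 GHz.
retail = peak_dp_gflops(60, 1.05)

print(f"special edition: {special:.0f} GFLOPS, 5110P: {retail:.0f} GFLOPS")
```

Under those assumptions both parts land at roughly 1 TFLOPS of peak throughput; sustained performance on real code will be lower.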

For compute bound applications, however, Intel will offer the Xeon Phi 3100 series in the first half of next year for less than $2000. The Xeon Phi 3100 will come with 6 GB of GDDR5 (5 GHz data rate) and only a 384-bit memory interface. Core clock should be higher, delivering over 1 TFLOPS of DP FP performance.
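The difference between the two memory interfaces is easy to quantify. The following is a minimal sketch that treats the quoted 5 GHz data rate as 5 gigatransfers per second per pin (an interpretation, not something the article spells out) and multiplies it by the bus width; the function name is hypothetical.

```python
def peak_bandwidth_gb_s(data_rate_gtps, bus_width_bits):
    """Peak memory bandwidth in GB/s.

    Each transfer moves bus_width_bits across the interface, so the
    result is transfers per second times bytes per transfer.
    """
    return data_rate_gtps * bus_width_bits / 8


bw_5100 = peak_bandwidth_gb_s(5, 512)  # 5100 series: 512-bit interface
bw_3100 = peak_bandwidth_gb_s(5, 384)  # 3100 series: 384-bit interface

print(f"5100 series: {bw_5100:.0f} GB/s, 3100 series: {bw_3100:.0f} GB/s")
```

By this estimate the 384-bit interface gives up a quarter of the peak bandwidth, which fits the positioning of the 3100 series toward compute bound work.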

The Xeon Phi cards use a PCIe 2.0 interface, as Intel found that moving to PCIe 3.0 resulted in slightly higher overhead.
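For context, the raw link rates of the two PCIe generations can be compared with a short calculation. This sketch uses the standard per-lane signaling rates and line encodings (5 GT/s with 8b/10b for PCIe 2.0, 8 GT/s with 128b/130b for PCIe 3.0) and assumes an x16 link, which is not stated above. It covers line encoding only and says nothing about the protocol overhead Intel refers to.

```python
def pcie_gb_s(rate_gtps, lanes, payload_bits, line_bits):
    """Per-direction link bandwidth in GB/s after line encoding.

    payload_bits/line_bits is the encoding efficiency, e.g. 8/10 for
    8b/10b. Packet and protocol overhead are not modeled.
    """
    return rate_gtps * lanes * payload_bits / line_bits / 8


LANES = 16  # assumption: the card uses an x16 link

gen2 = pcie_gb_s(5.0, LANES, 8, 10)     # PCIe 2.0: 5 GT/s, 8b/10b
gen3 = pcie_gb_s(8.0, LANES, 128, 130)  # PCIe 3.0: 8 GT/s, 128b/130b

print(f"PCIe 2.0 x16: {gen2:.2f} GB/s, PCIe 3.0 x16: {gen3:.2f} GB/s")
```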

46 Comments

Kevin G - Saturday, November 17, 2012
That rumor has a grain of truth to it. A slide deck about Larrabee from Intel indicated a socketable version fitting into a quad socket Xeon motherboard. This was while Intel still had consumer plans for Larrabee, which have since radically changed.

Source:
http://arstechnica.com/gadgets/2007/06/clearing-up...

alpha754293 - Wednesday, November 14, 2012
I don't even know what generally (and publicly) accessible programs are available that you would be able to use to do this sort of HPC testing.

OpenMP code is sort of "easier" to come by. A program that has both an OpenMP and a CUDA version where it's a straight port - I can't even think of one.

The only one that might be a possibility would be Ansys 13/14 because they do have some limited static structural/mechanical FEA capabilities that can run on the GPU, but I don't know how you'd be able to force it onto the Xeon Phis.

Hmmm....

TeXWiller - Wednesday, November 14, 2012
The next version of OpenMP should have accelerator support via the OpenACC scheme. I'd bet that most engineering applications will be able to support most accelerators like Phi, Tesla and APUs in a transparent manner simply through the math libraries, perhaps not in the most optimal way, but at least in a sufficiently worthwhile one.

rad0 - Wednesday, November 14, 2012
One thing I've yet to understand about the Xeon Phi is: do you get to run anything you want on it, or not?

Could you run Oracle's JVM (or any other JVM) on it? I know HPC isn't all that interested in Java, but a cheap 60-thread Java machine would be very interesting to play with.

Can you just ssh into the embedded Linux and run anything you want?

coder543 - Wednesday, November 14, 2012
Why Java? A dozen negative adjectives pop into my mind at the mere mention of the word outside of a coffee shop.

madmilk - Wednesday, November 14, 2012
You can probably run Java on it, but it will not run well. Most Java code is application code: very branchy, something the Phi's memory architecture cannot handle well. The JVM certainly will not vectorize code either, so you have all those vector units being wasted.

This is really much closer to a GPU in terms of the kind of optimizations that must be done for performance, even if the underlying instruction set is x86.

Jaybus - Thursday, November 15, 2012
No, it is much closer to a CPU than a GPU. This is an area where it differs VASTLY from a GPU. In fact, the cores are CPUs.

llninja1 - Thursday, November 15, 2012
According to Tom's Hardware, you can log in to the Xeon Phi card and get a command line prompt

http://www.tomshardware.com/reviews/xeon-phi-larra...

so that implies you can do whatever you want with some finagling. Whether your 60-thread JVM thought would work well or not on this architecture remains to be seen.

extide - Wednesday, November 14, 2012
Do some Folding@Home benchmarks on a Phi if at all possible!

Thanks!

tipoo - Wednesday, November 14, 2012
Like the people in charge of F@H would develop and release a new folding core so that it could run on one of these, on the off chance some enthusiast has one of these multi-thousand-dollar cards and a computer system that can run it?

Not going to happen. This isn't a general CPU core that any existing software can run on, nor is it aimed at home users.
