The Intel Xeon D Review: Performance Per Watt Server SoC Champion?

Author: Johan De Gelas

by Johan De Gelas on June 23, 2015 8:35 AM EST


Memory Subsystem: Bandwidth

Because additional memory channels complicate motherboard design and can be a problem for dense servers, the Xeon D, Xeon E3 and Atom C2000 have only two memory channels. That makes quad-channel operation a good way to set the Xeon E5 apart from the chips below it. The Xeon E3 and Atom are limited to DDR3-1600 as per JEDEC specifications, whereas the Xeon D should be able to command more bandwidth thanks to its DDR4-2133 DIMMs.
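To put rough numbers on those configurations, theoretical peak bandwidth can be estimated as channels × transfer rate × bus width. The sketch below is only that arithmetic, assuming the standard 64-bit DDR channel; the function name is made up for illustration, and the quad-channel DDR4-2133 line is a hypothetical comparison point rather than a configuration stated here. Measured Stream results will land below these ceilings.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes a standard 64-bit data bus per DDR channel (ECC bits excluded).
    """
    bytes_per_transfer = bus_width_bits // 8
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Dual-channel DDR3-1600 (Xeon E3, Atom C2000)
ddr3_dual = peak_bandwidth_gbs(2, 1600)
# Dual-channel DDR4-2133 (Xeon D)
ddr4_dual = peak_bandwidth_gbs(2, 2133)
# Hypothetical quad-channel DDR4-2133 setup, assumed only for comparison
ddr4_quad = peak_bandwidth_gbs(4, 2133)

print(f"2x DDR3-1600: {ddr3_dual:.1f} GB/s")
print(f"2x DDR4-2133: {ddr4_dual:.1f} GB/s")
print(f"4x DDR4-2133: {ddr4_quad:.1f} GB/s")
```

On paper, going from two channels of DDR3-1600 to two channels of DDR4-2133 raises the ceiling by about a third, while doubling the channel count doubles it.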

We measured the memory bandwidth in Linux. The binary was compiled with the Open64 compiler 5.0 (Opencc). It is a multi-threaded, OpenMP-based, 64-bit binary. The following compiler switches were used:

-Ofast -mp -ipa

To keep things simple, we only report the Triad sub-benchmark of our OpenMP enabled Stream benchmark.
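For readers unfamiliar with Triad, the kernel is simply a[i] = b[i] + scalar * c[i] over large arrays, and Stream counts three 8-byte accesses (two reads, one write) per element. The following is a minimal single-threaded illustration of the kernel and the byte accounting, not the OpenMP binary we ran; the function names are ours, and interpreted Python will be limited by the interpreter rather than by the memory subsystem, so the figure it returns says nothing about the hardware.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """STREAM Triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes(n, bytes_per_element=8):
    """Bytes STREAM attributes to one Triad pass: read b, read c, write a."""
    return 3 * n * bytes_per_element


def measure(n=1_000_000, scalar=3.0):
    """Run one Triad pass over n doubles; return the result array and GB/s."""
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    start = time.perf_counter()
    triad(a, b, c, scalar)
    elapsed = time.perf_counter() - start
    return a, triad_bytes(n) / elapsed / 1e9
```

Calling `measure()` returns the output array and an apparent bandwidth in GB/s. The real benchmark repeats the kernel several times, uses arrays much larger than the caches, and spreads the loop across all cores.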

Stream Triad

Using DIMMs with a 33% higher clock, the Xeon D gets a 25-38% boost in bandwidth compared to the Xeon E3. Basically, every percent increase in memory clock speed translates into higher bandwidth. The Xeon E5 has almost twice as much bandwidth for only 50% more cores and should as a result do better in some bandwidth-intensive applications (mostly HPC).
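One way to read those percentages is as a scaling efficiency: the bandwidth gain divided by the memory clock gain. The short sketch below only reworks the figures quoted above (DDR3-1600 versus DDR4-2133, and the 25% and 38% ends of the range); the helper names are ours, and no new measurements are implied.

```python
def relative_gain(new, old):
    """Fractional increase of new over old, e.g. 0.25 for +25%."""
    return new / old - 1.0


def scaling_efficiency(bandwidth_gain, clock_gain):
    """Share of the clock increase that shows up as extra bandwidth."""
    return bandwidth_gain / clock_gain


# DDR4-2133 versus DDR3-1600: roughly a 33% higher transfer rate
clock_gain = relative_gain(2133, 1600)

# Low and high ends of the bandwidth boost quoted in the text
low = scaling_efficiency(0.25, clock_gain)
high = scaling_efficiency(0.38, clock_gain)

print(f"clock gain: {clock_gain:.1%}")
print(f"scaling efficiency: {low:.2f} to {high:.2f}")
```

A value near 1.0 means bandwidth tracks the memory clock almost one to one, which is what the range above brackets.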


90 Comments


JohanAnandtech - Wednesday, June 24, 2015 - link
Hi Patrick, the base clock of our chip is 2 GHz, not 1.9 GHz like the pre-production version that we got from Intel. I still have to check the turbo clocks, but I do believe we measured 2.6 GHz. I'll double-check.
pjkenned - Wednesday, June 24, 2015 - link
Awesome! Our ES ones were 1.9GHz.
Chrisrodinis1 - Tuesday, June 23, 2015 - link
For comparison, this server uses Xeons. It is the HP ProLiant BL460c G9 blade server: https://www.youtube.com/watch?v=0s_w8JVmvf0
MrDiSante - Wednesday, June 24, 2015 - link
Why use only -O2 when compiling the benchmarks? I would imagine that in order to squeeze out every last bit of performance, all production software is compiled with all optimizations turned up to 11. I noticed that their github uses -O2 as an example - is it that TinyMemBenchmark just doesn't play nice with -O3?
JohanAnandtech - Wednesday, June 24, 2015 - link
The standard makefile had no optimization whatsoever. If you want to measure latency, you do not want maximum performance but rather accuracy, so I played it safe and used -O2. I am not convinced that all production software is compiled with every optimization turned on.
diediealldie - Wednesday, June 24, 2015 - link
Intel seems disARMing them... X-Gene 2 doesn't look so promising, as they'll have to fight mighty Skylake-based Xeons, not Broadwell ones.

Thanks for great article again.
jfallen - Wednesday, June 24, 2015 - link
Thanks Johan for the great article. I'm a tech enthusiast, and will never buy or use one of these. But it makes great reading and I appreciate the time you take to research and write the article.

Regards
Jordan
JohanAnandtech - Wednesday, June 24, 2015 - link
Happy to read this! :-)
TomWomack - Wednesday, June 24, 2015 - link
This looks very much consistent with my experience; the disconcertingly high idle power (I looked at the board with a thermal camera; the hot chips were the gigabit PHY, the inductors for the power supply, and the AST2400 management chip), the surprisingly good memory performance, the fairly hot SoC (running sixteen threads of number-crunching I get a power draw of 83W at the plug) and the generally pretty good computation.

I'm not entirely sure it was a better buy for my use case than a significantly cheaper 6-core Haswell E - Haswell E is not that hot, electricity not that expensive, and from my supplier the X10SDV-F board and memory were £929 whilst Scan get me an i7-5820K board, CPU and memory for £702. And four-channel DDR4 probably is usefully faster than two-channel for what I do.

I quite strongly don't believe in server mystique - the outbuilding is big enough that I run out of power before I run out of space for micro-ATX cases, and I am lucky enough to be doing calculations which are self-checking to the point that ECC is a waste of money.
JohanAnandtech - Wednesday, June 24, 2015 - link
Hi Tom, I believe we saw up to 90 Watt at the wall when running OpenFOAM (10 Gbit enabled). That is less relevant for this chip, however, as it is not meant to be an HPC chip, as we have shown in the article. HPC really screams for an E5.
