Single-Threaded Performance

I admit, the following two benchmarks are almost irrelevant for anyone buying a Xeon E7 based machine. But still, we have to quench our curiosity: how much have the new cores been improved? There is a lot that can be said about all the sophisticated "uncore" improvements (cache coherency policies, low latency rings, and so on) that allow this multi-core monster to scale, but at the end of the day, good performance starts with a good core. And since we have listed the many subtle core improvements, we could not resist the opportunity to check how each core compares.

The results aren't totally meaningless either, as the profile of a compression algorithm is somewhat similar to many server workloads: hard to extract instruction level parallelism (ILP) and sensitive to memory parallelism and latency. The instruction mix is a bit different, but it's still somewhat similar to many server workloads. And as one more reason to test performance in this manner, the 7-zip source code is available under the GNU LGPL license. That allows us to recompile the source code on every machine with the -O2 optimization with gcc 4.8.1.

We've run an additional data point for this particular set of tests. The new Ivy Bridge EX was tested at 2.8GHz and downclocked to 2.4GHz, so that we can do a clock-for-clock comparison with Westmere EX. Since we're only testing single-threaded performance here, other than perhaps slight differences due to having more total L3 cache, it doesn't matter which particular E7 v2 chip we use.

LZMA single threaded performance: compression

The latest Xeon E7 v2 "Ivy Bridge EX" is capable of extracting 33% more ILP out of the complex compression code than the older Xeon E7 "Westmere-EX" at the same clock speed. That is pretty amazing and shows how all the small micro-architecture improvements have accumulated into a large performance increase. The Opteron core is also better than most people think: at 2.4GHz it would deliver about 2481 MIPs. That is about 80% of Intel's best server core at the moment—not enough, but nothing to be ashamed about.

Also interesting to note is that the Westmere core was indeed a "tick": any performance increase over the Xeon X7560 (Codename "Beckton", 45nm Nehalem core) is simply the result of the higher clockspeed of the 32nm chip.

Let us see how the chips compare in decompression. Decompression is an even lower IPC (Instructions Per Clock) workload, as it is pretty branch intensive and depends on the latencies of the multiply and shift instructions.

LZMA single threaded performance: decompression

Again, we note a 30% improvement in integer performance going from the Xeon E7 "Westmere" (Xeon E7-4870 at 2.4GHz) to the Xeon E7 v2 "Ivy Bridge EX" (Xeon E7-4890 v2 clocked down to 2.4GHz).

To summarize: the new 15-core Xeon E7 v2 is built upon a strong core architecture that has improved significantly compared to the predecessor.

Our Benchmarks and Configuration Multi-Threaded Integer Performance
Comments Locked

125 Comments

View All Comments

  • JohanAnandtech - Friday, February 21, 2014 - link

    I don't see the error. "Beckton" (Nehalem-EX, X7560) is at 2.4 GHz
  • mslasm - Sunday, February 23, 2014 - link

    > I don't see the error.

    The article says "The Opteron core is also better than most people think: at 2.4GHz it would deliver about 2481 MIPs." - but, according to the graph, Opteron already delivers 2723 @ 2.3Ghz. So it is puzzling to see that it "would" deliver less MIPS (2481 vs 2723) at higher frequency (2.4 vs 2.3 Ghz) (regardless of any Intel results/frequencies)
  • silverblue - Saturday, February 22, 2014 - link

    It's entirely possible that the score is down to the 6376's 3.2GHz turbo mode.
  • plext0r - Friday, February 21, 2014 - link

    Would be nice to run benchmarks against a Quad E5-4650 system for comparison.
  • blaktron - Friday, February 21, 2014 - link

    ... you know you can't, right?
  • blaktron - Friday, February 21, 2014 - link

    Nevermind, read v2 there where you didn't write it. Too much coffee....
  • usernametaken76 - Friday, February 21, 2014 - link

    For the more typo-sensitive reader (perhaps both technically astute and typo-senstive):

    "A question like "Does the SPARC T5 also support both single-threaded and multi-threaded applications?" must sound particularly hilarious to the our technically astute readers."

    ...to the our...
  • JohanAnandtech - Friday, February 21, 2014 - link

    Fixed. Thx!
  • TiGr1982 - Friday, February 21, 2014 - link

    From the conclusion:
    "The Xeon E7 v2 chips are slated to remain in data centers for the next several years as the most robust—and most expensive—offerings from Intel."

    I don't think it will be really "several" years - maybe 1-2 years later this Ivy Bridge-EX-based E7 v2 will probably be superseded by Haswell-EX-based E7 v3 with Haswell cores with AVX2/FMA, which should make a difference in pro floating point calculations and data processing, and working with DDR4.
  • Kevin G - Friday, February 21, 2014 - link

    The Ivy Bridge-EX -> Haswell-EX transition will mimic the Nehalem-EX -> Westere-EX transition in that the core systems provided by the big OEM will stay the same. The OEM's offer Haswell-EX as a drop in replacement to their existing socket 2011v1 systems. Haswell-EX -> Broadwell-EX will again be using the same socket and follow a similarly quick transition. SkyLake-EX will bring a new socket design (perhaps with some optical interconnects?).

    At some point Intel will offer new memory buffer chips to support DDR4. This will likely require a system to swap out all the memory daughter cards but the motherboard from big OEM's shouldn't change. There may also be a period where these large systems can be initially configured with either DDR3 or DDR4 based upon customer requests.

Log in

Don't have an account? Sign up now