Intel's Benchmarks

Since time constraints meant that we were not able to run a ton of benchmarks ourselves, it's useful to check out Intel's own benchmarks as well. In our experience Intel's own benchmarking has a good track record for producing accurate numbers and documenting configuration details. Of course, you have to read all the benchmarking information carefully to make sure you understand just what is being tested.

The OLTP and virtualization benchmarks show that the new Xeon E7 v3 is about 25 to 39% faster than the previous Xeon E7 (v2). In some of those benchmarks the new Xeon had twice as much memory, but it is safe to say that this makes only a small difference. We think it's reasonable to conclude that the Xeon E7 v3 is 25 to 30% faster, which is also what we found in our integer benchmarks.

The increase in legacy FP applications is much lower. For example, Cinebench was 14% faster, SPECFP 9%, and our own OpenFOAM about 4% faster. Meanwhile, LINPACK benchmarks are pretty useless to most of the HPC world, so we have more faith in our own benchmarking. Intel's own realistic HPC benchmarking showed at best a 19% increase, which is nothing to write home about.

The exciting part about this new Xeon E7 is that data analytics/mining runs a lot faster on the v3. The 72% faster SAS analytics number is not really accurate, as part of the speedup was due to using P3700 SSDs instead of S3700 SSDs. Still, Intel claims that replacing the E7 v2 with the v3 is good for a 55-58% speedup.
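As a rough sanity check on those SAS numbers, the sketch below backs out how much of the 72% figure would be left for the storage upgrade. It assumes the CPU and SSD gains compound multiplicatively, which is our assumption and not something Intel states, and the function name is purely illustrative.

```python
def residual_speedup(total_gain_pct, cpu_gain_pct):
    """Speedup factor left over once the CPU gain is divided out.

    Assumes total = cpu * other, i.e. the gains multiply (an assumption).
    """
    total = 1 + total_gain_pct / 100
    cpu = 1 + cpu_gain_pct / 100
    return total / cpu


# 72% is the headline SAS number; 55% and 58% are the CPU-only claims.
for cpu_gain in (55, 58):
    other = residual_speedup(72, cpu_gain)
    print(f"CPU +{cpu_gain}% -> remaining factor {other:.2f}x")
```

Under that assumption, the SSD swap would account for a factor of roughly 1.09x to 1.11x, so the bulk of the gain would still come from the CPU.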

The most spectacular benchmark is of course SAP HANA. It is not 6x faster as Intel claims, but rather 3.3x (see our comments about TSX). That is still spectacular and the result of excellent software and hardware engineering.

Final Words: Comparing Xeon E7 v3 vs v2

For those of us running scale-up, reasonably priced HPC or database applications, it is hard to get excited about the Xeon E7 v3. The performance increases are small but tangible, yet the new Xeon E7 also costs a bit more. Meanwhile, as far as our (HPC) energy measurements go, there is no tangible increase in performance per watt.

The Xeon E7 in its natural habitat: heavy heatsinks, hotpluggable memory

However, organizations running SAP HANA will welcome the new Xeon E7 with open arms: they get massive speedups for a budget increase of 0.1% or less. The rest of the data mining community with expensive software will benefit too, as the new Xeon E7 is at least 50% faster in those applications thanks to TSX.

Ultimately we wonder how the rest of us will fare. Will SAP/SAS speedups also be visible in open source Big Data software such as Hadoop and Elastic Search? Currently we are still struggling to get the full potential out of the 144 threads. Some of these tests run for a few days only to end with a very vague error message: big data benchmarking is hard.

Comparing Xeon E7 v3 and POWER8

Although the POWER8 is still a power-gobbling monster, just like its older brother the POWER7, there is no denying that IBM has made enormous progress. Few people will be surprised that IBM's much more expensive enterprise systems beat Intel-based offerings in some high-end benchmarks like SAP's. But the fact that 24 POWER8 cores in a relatively reasonably priced IBM POWER8 server can beat 36 Intel Haswell cores by a considerable margin is new.
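To put those core counts in perspective, here is a small back-of-the-envelope sketch. It assumes throughput scales linearly with core count, which is a simplification of ours, and the helper name is hypothetical.

```python
def breakeven_per_core_ratio(fewer_cores, more_cores):
    """Per-core throughput ratio the smaller chip needs just to tie.

    Assumes total throughput = cores * per-core throughput (a simplification).
    """
    return more_cores / fewer_cores


# 24 POWER8 cores versus 36 Haswell cores, as in the SAP comparison.
ratio = breakeven_per_core_ratio(24, 36)
print(f"Each POWER8 core must deliver {ratio:.2f}x a Haswell core to tie")
```

Under that assumption, a tie already requires 1.5x the per-core throughput, so a win by a considerable margin implies more than that.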

It is also interesting that our own integer benchmarking shows that the POWER8 core is capable of keeping up with Intel's best core at the same clockspeed (3.3-3.4 GHz). Well, at least as long as you feed it enough threads in IPC unfriendly code. But that last sentence is the exact description of many server workloads. It also means that the SAP benchmark is not an exception: the IBM POWER8 is definitely not the best CPU to run Crysis (not enough threads) but it is without a doubt a dangerous competitor for Xeon E7 when given enough threads to fill up the CPU.

Right now the threat to Intel is not dire: IBM still asks way too much for its best POWER8 systems, and the Xeons have a much better performance-per-watt ratio. But once the OpenPOWER Foundation partners start offering server solutions, there is a good chance that Intel will receive some very significant performance-per-dollar competition in the server market.


146 Comments


  • DanNeely - Friday, May 8, 2015 - link

    Intel's 94% market share is still only ~184k systems. That's tiny compared to the mainstream x86 market; and doesn't give a lot of (budgetary) room to make radical changes to CPU vs just scaling shared designs to a huger layout.
  • theeldest - Friday, May 8, 2015 - link

    184k for 4S systems. The number of 2S systems *greatly* outnumbers the 184k.
  • Samus - Sunday, May 10, 2015 - link

    by 100 orders of magnitude, easily.

    2S systems are everywhere these days, I picked up a Lenovo 2S Xeon system for $600 NEW (driveless, 4GB RAM) from CDW.

    4S, on the other hand, is considerably more rare and starts at many thousands, even with 1 CPU included.
  • erple2 - Sunday, May 10, 2015 - link

    Well, maybe 2 orders of magnitude. 100 orders of magnitude would imply, based on the 184k 4S systems, more 2S systems than atoms in the universe. Ok, I made that up, I don't know how many atoms are in the universe, but 10^100 is a really big number. Well, 10^105, if we assume 184k 4S systems.

    I think you meant 2 orders of magnitude.
  • mapesdhs - Sunday, May 10, 2015 - link

    Yeah, that made me smile too, but we know what he meant. ;)
  • evolucion8 - Monday, May 11, 2015 - link

    That would be right if Intel's cores were wide enough, which they aren't compared to IBM's. For example, according to this review, enabling two-way SMT boosted performance by 45%, and adding two more threads added 30% more performance. On the other hand, enabling two-way SMT on the latest i7 architecture gains at most 30% in the best-case scenario.
  • chris471 - Friday, May 8, 2015 - link

    Great article, and I'm looking forward to see more Power systems.

    I would have loved to see additional benchmarks with gcc flags -march=native -Ofast. Should not change stream triad results, but I think 7zip might profit more on Power than on Xeon. Most software is not affected by the implied -ffast-math.
  • close - Friday, May 8, 2015 - link

    It reminds me of the time when Apple gave up on PowerPC in mobiles because the new G5s were absolute power guzzlers and made space heaters jealous. And then gave up completely and switched to Intel because the 2 dual core PowerPC 970MP CPUs at 2.5GHz managed to pull 250W of power and needed liquid cooling to be manageable.

    IBM is learning nothing from past mistakes. They couldn't adapt to what the market wanted and the more nimble competition was delivering 25-30 years ago when fighting Microsoft, it already lost business to Intel (which is actually only nimble by comparison), and it's still doing business and building hardware like we're back in the '70s mainframe age.
  • name99 - Friday, May 8, 2015 - link

    You are assuming that the markets IBM sells into care about the things you appear to care about (in particular CPU performance per watt). This is a VERY dubious assumption.
    The HPC users MAY care (but I'd need to see evidence of that). For the business users, the cost of the software running on these systems dwarfs the lifetime cost of their electricity.
  • SuperVeloce - Saturday, May 9, 2015 - link

    They surely care. Why wouldn't they. A whole server rack or many of them in fact do use quite a bit of power. And cooling the server room is very expensive.
