CPU Choices

For this article, the only current-generation Intel Broadwell-EP processors we had in the lab were the Xeon E5-2699 v4 and the Xeon E5-2650 v4. Comparing the IBM POWER8 with the former would not be fair: at $4115, the Xeon costs almost three times as much as the midrange POWER8 chip ($1500). The latter, with a TDP of only 90W, was not an option either. There are no Intel chips with a 190W TDP, so we had to compromise.

The most comparable CPU available to us was the Xeon E5-2690 v3. It is a higher-end midrange Intel SKU (135W TDP) that came out around the same time as the POWER8. If the 190W TDP POWER8 cannot beat this 135W TDP chip, IBM's microarchitects have not done a very good job. Don't let the 2.6 GHz label fool you: this Haswell Xeon can boost to 3.1 GHz when all cores are active and to 3.5 GHz in a single-threaded situation. So compared to the POWER8, it has two extra cores and similar clock speeds.

However, we can't ignore the current-generation Broadwell-EP entirely. To get a better idea of how the midrange POWER8 compares to the latest Xeons, we needed a midrange Xeon E5 v4 SKU as well, so we enabled only 14 of the 22 cores of the Xeon E5-2699 v4. This gives us a chip that sits somewhere between the Xeon E5-2660 v4 (14 cores at 2 GHz) and the E5-2680 v4 (14 cores at 2.4 GHz). Well, at least on paper. The Xeon E5-2680 v4 runs most of the time at 2.9 GHz in heavily multi-threaded situations (+5 turbo steps, all cores active), while our Xeon E5-2699 v4 with 14 cores runs at 2.8 GHz (+6 turbo steps). As the TDP of the latter is higher, the turbo clock will be sustained for a higher percentage of the time. Bottom line: our Xeon E5-2699 v4 with 14 cores is very similar to an E5-2680 v4, albeit with a 145W TDP. As the Xeon E5-2680 v4 costs around $1745, it is in the right price range. From a price/performance point of view, that is as fair as we can get it.
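The clock and price arithmetic above can be checked with a short sketch. It assumes that one turbo step is 100 MHz, which the quoted figures imply (2.4 GHz + 5 steps = 2.9 GHz) but the text does not state outright. The function and variable names are illustrative only.

```python
# Assumption: one turbo step ("bin") equals 100 MHz. The figures in the
# text imply this, but it is not stated there. All names are illustrative.
TURBO_STEP_MHZ = 100


def turbo_clock_mhz(base_mhz: int, steps: int) -> int:
    """Return the clock reached after `steps` turbo steps above base."""
    return base_mhz + steps * TURBO_STEP_MHZ


def price_ratio(price_usd: float, reference_usd: float) -> float:
    """Return how many times more expensive a chip is than the reference."""
    return price_usd / reference_usd


if __name__ == "__main__":
    # Base clocks, turbo steps and prices as quoted in the text.
    e5_2680_v4 = turbo_clock_mhz(2400, 5)      # real SKU, all cores active
    e5_2699_v4_14c = turbo_clock_mhz(2200, 6)  # 22-core part limited to 14 cores
    print(f"E5-2680 v4, all cores active: {e5_2680_v4} MHz")
    print(f"E5-2699 v4 with 14 cores:     {e5_2699_v4_14c} MHz")
    print(f"E5-2680 v4 vs POWER8 price:   {price_ratio(1745, 1500):.2f}x")
    print(f"E5-2699 v4 vs POWER8 price:   {price_ratio(4115, 1500):.2f}x")
```

Under that assumption the two 14-core configurations land within 100 MHz of each other, and the E5-2680 v4 is priced at roughly 1.16 times the POWER8 versus roughly 2.74 times for the full E5-2699 v4.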

For those looking to get the best performance per watt: we'll save you some time and tell you that it does not get any better than the Xeon E5-2600 v4 series. Intel really went all the way to make sure that the Broadwell-EP Xeon is a power sipper. And although the performance step is small, the Xeon E5-2600 v4 consumes much less power than a similar Xeon E5 v3 SKU, let alone a CPU with a 190W TDP (plus 60-80W for the memory buffers).

Benchmark Configuration and Methodology

Our testing was conducted on Ubuntu Server 15.10 (kernel 4.2.0) with gcc version 5.2.1. We did not update because this was the only version on which we got everything working.

Last but not least, we want to note how the performance graphs have been color-coded. Orange is used for the reviewed POWER8 CPU. The latest generation of the Intel Xeon (v4) gets dark blue, the previous one (v3) gets light blue. Older Xeon generations are colored with the default gray.

IBM S812LC (2U)

The IBM S812LC is based on Tyan's "Habanero" platform, so the board inside the IBM server is designed by Tyan.

CPU             One IBM POWER8 2.92 GHz (up to 3.5 GHz Turbo)
RAM             256 GB (16x16GB) DDR3-1333
Internal Disks  2x Samsung 850Pro 960 GB
Motherboard     Tyan SP012
PSU             Delta Electronics DSP-1200AB 1200W

Intel's Xeon E5 Server – S2600WT (2U Chassis)

CPU             One Intel Xeon processor E5-2699 v4 (2.2 GHz, 22c, 55MB L3, 145W)
                One "simulated" Intel Xeon processor E5-2680 v4 (2.2 GHz, 14c, 35MB L3, 145W)
                One Intel Xeon processor E5-2699 v3 (2.3 GHz, 18c, 45MB L3, 145W)
                One Intel Xeon processor E5-2690 v3 (2.6 GHz, 12c, 30MB L3, 135W)
RAM             128 GB (8x16GB) Kingston DDR4-2400
                or
                256 GB (8x32GB) Hynix DDR4-2133
Internal Disks  2x Samsung 850Pro 960 GB
Motherboard     Intel Server Board Wildcat Pass
PSU             Delta Electronics 750W DPS-750XB A (80+ Platinum)

All C-states are enabled in the BIOS.

SuperMicro 6027R-73DARF (2U Chassis)

CPU             Two Intel Xeon processor E5-2697 v2 (2.7 GHz, 12c, 30MB L3, 130W)
RAM             128 GB (8x16GB) Samsung at 1866 MHz
Internal Disks  2x Intel SSD3500 400GB
Motherboard     SuperMicro X9DRD-7LN4F
PSU             Supermicro 740W PWS-741P-1R (80+ Platinum)

All C-states are enabled in the BIOS.

Other Notes

Both servers are fed by a standard European 230V (16 Amps max.) power line. The room temperature is monitored and kept at 23°C by our Airwell CRACs.


49 Comments

  • nils_ - Monday, September 26, 2016 - link

    Isn't the limit slightly lower than 32 GiB? At some point the JVM switches to 64-bit pointers, which means you'll lose a lot of the available heap to larger pointers. I think you might want to lower your settings. I'm curious, what kind of GC times are you seeing with your heap size? I don't currently have access to Java running on non-virtualised hardware, so I would like to know if the overhead is significant (mostly running Elasticsearch here).
  • CajunArson - Thursday, September 15, 2016 - link

    All in all the Power chip isn't terrible but the power consumption coupled with the sheer amount of tuning that is required just to get it competitive with the Xeons isn't too encouraging. You could spend far less time tuning the Xeons and still have higher performance or go ahead with tuning to get even more performance out of those Xeons.

    On top of the fact that this isn't a supposedly "high end" model, the higher end power parts cost more and will burn through even more power, and that's an expense that needs to be considered for the types of real-world applications that use these servers.
  • dgingeri - Thursday, September 15, 2016 - link

    That ad on the last page that claims lower equipment cost of course compares that to an HP DL380, the most overpriced Xeon E5 system out right now. (I know because I shopped them.) Comparing it to a comparable Dell R730 would show less expense, better support, and better expansion options.
  • Morawka - Thursday, September 15, 2016 - link

    you mean a company made a slide that uses the most extreme edge cases to make their product look good?!?! Shocking /s
  • Gondalf - Thursday, September 15, 2016 - link

    Something is wrong in these power consumption data. The platform idles at 221W and draws only 260W under full load?? Has the CPU vanished?? POWER8 at over 3 GHz has an active power of only 40W??
    Either 1) the idle value is wrong or 2) the under-load value is wrong. None of this is consistent with IBM's official TDP values.
    IMO the energy consumption page of the article has to be rewritten.
  • JohanAnandtech - Thursday, September 15, 2016 - link

    We have double checked those numbers. It is probably an indication that many of the power saving features do not work well under Linux right now.
    BTW, just to give you an idea: running c-ray (floating point) caused the consumption to go to 361W.
  • Kevin G - Thursday, September 15, 2016 - link

    I presume that c-ray uses the 256 bit vector unit on POWER8?

    Also have you done any energy consumption testing that takes advantage of the hardware decimal unit?
  • mapesdhs - Thursday, September 15, 2016 - link

    C-ray isn't that smart. :D It's a very simple code, brute force basically, and the smaller dataset can easily fit in a modern cache (actually the middling size test probably does too on CPUs like these). Hmm, I suppose it's possible one could optimise the compilation a bit to help, but I doubt anything except a full rewrite could make decent use of any vector tech, and I don't want to allow changes to the code, that would make comparisons to all other test results null. Compiler optimisations are ok, but not multi-pass optimisations that feed back info about the target data into the initial compile, that's cheating IMO (some people have done this to obtain what look like really silly run times, but I don't include them on my main C-ray page).

    Ian.
  • Gondalf - Tuesday, September 20, 2016 - link

    Ummm, so in short the software used doesn't stress the CPU at all, not even the hot caches near the memory banks. We need a benchmark with high memory utilization and a balanced mix of integer and FP, more in line with real-world utilization.

    I don't know if this test is enough to say POWER8 is power/perf competitive with Haswell in 22nm.
    In fact POWER market share is definitely at its historic minimum and 14nm Broadwell is pretty young, so this disaster is not its fault.
  • jesperfrimann - Wednesday, September 21, 2016 - link

    If you have an OPAL system (a bare-metal system that cannot run PowerVM), then all the power-saving features are off by default AFAIR.
    Try to have a look at:
    https://public.dhe.ibm.com/common/ssi/ecm/po/en/po...

    Many of the features do have a performance impact, ranging from negative through neutral to positive, depending on the feature.

    But again, I think your comparison with 'vanilla' software stacks is relevant. This is what people would see out of the box with an existing software stack.
    It is 101% relevant to do that comparison, as this is the market that IBM is trying to break into with these servers.

    But what would be fun to see is some tests where all the bells and whistles are utilized, as many have written here: use of hardware-supported decimal floating point, the vector execution unit, the ability to do hardware-assisted memory compression, etc.

    // Jesper
