Benchmark Configuration and Methodology

For our testing we installed 64-bit Ubuntu 15.04 (kernel 3.19.0), which allowed us to use GCC 4.9.2, a version with better support for the POWER8. We have tried to keep the colors in our benchmark graphs consistent: dark blue is IBM, light blue is the latest Intel Xeon generation (Haswell, E5 v3), and gray is reserved for older Intel systems.
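For readers who want to approximate our "per core" 7-Zip methodology, the sketch below shows one way to collect a single-threaded LZMA rating with p7zip's built-in benchmark. It assumes the 7za binary and the Linux taskset utility are available; the run count, the pinned core, and the output parsing are illustrative assumptions, not the exact harness used for this article.

    # percore_7zip.py - a minimal sketch, not the article's exact harness.
    import statistics
    import subprocess

    RUNS = 5      # repeat to smooth out run-to-run fluctuation
    CORE = "0"    # pin to a single core for a "per core" rating

    def run_once():
        # taskset pins the process to one core; -mmt1 limits 7-zip's
        # internal LZMA benchmark to a single thread.
        out = subprocess.run(
            ["taskset", "-c", CORE, "7za", "b", "-mmt1"],
            capture_output=True, text=True, check=True).stdout
        # The final "Tot:" row carries the combined rating; column layout
        # varies between p7zip versions, so treat this parsing as a sketch.
        tot = [l for l in out.splitlines() if l.startswith("Tot:")][-1]
        return float(tot.split()[-1])   # overall rating in MIPS (assumed)

    scores = [run_once() for _ in range(RUNS)]
    print(f"median: {statistics.median(scores):.0f} MIPS over {RUNS} runs")

Reporting the median of several runs rather than a single pass keeps one noisy run from skewing the rating, which matters most once every hardware thread is loaded.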

As a quick aside, we should point out that IBM's servers also support PowerVM and KVM virtualization; however, we decided not to use either in order to keep the complexity of the tests under control. As we explained in the introduction, porting and tuning the usual benchmarks was quite a challenge, and virtualization makes benchmarking a lot more complex. Testing virtualized workloads was thus beyond the scope of this article.

All tests have been done with the help of Kirth and Wannes of the Sizing Servers Lab.

IBM S822L (2U Chassis)

CPU: Two IBM POWER8, 3.425 GHz, 10 cores each
RAM: 128GB (8x16GB) IBM CDIMMs
Internal Disks: 2x 300GB 15K RPM SAS (boot)
                1x Intel DC P3700 400GB (data and benchmarks)
Motherboard: No idea
BIOS version: OPAL v3
PSU: Dual Emerson 1400W

Intel's Xeon E5 Server – "Wildcat Pass" (2U Chassis)

CPU (four configurations tested):
  Two Intel Xeon E5-2699 v3 (2.3 GHz, 18 cores, 45MB L3, 145W)
  Two Intel Xeon E5-2695 v3 (2.3 GHz, 14 cores, 35MB L3, 120W)
  Two Intel Xeon E5-2667 v3 (3.2 GHz, 8 cores, 20MB L3, 135W)
  Two Intel Xeon E5-2650L v3 (1.8 GHz, 12 cores, 30MB L3, 65W)
RAM: 128GB (8x16GB) Samsung M393A2G40DB0 (RDIMM)
Internal Disks: 2x Intel SSD 710 200GB (MLC, boot)
                1x Intel DC P3700 400GB (data and benchmarks)
Motherboard: Intel S2600WTT
BIOS version: 1.01
PSU: Delta Electronics 750W DPS-750XB A (80 Plus Platinum)

All C-states were left enabled in the BIOS of both servers.
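As a sanity check, the Linux cpuidle interface reports which C-states the kernel exposes and whether any have been disabled from the OS side. A minimal sketch, assuming the standard sysfs layout (the number and names of the states differ between the POWER8 and Xeon machines):

    # check_cstates.py - reads the kernel's cpuidle view for cpu0.
    from pathlib import Path

    cpu0 = Path("/sys/devices/system/cpu/cpu0/cpuidle")
    for state in sorted(cpu0.glob("state*")):
        name = (state / "name").read_text().strip()
        # "disable" reads 0 when the state is enabled for this CPU.
        disabled = (state / "disable").read_text().strip()
        print(f"{state.name}: {name} (disable={disabled})")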

Other Notes

Both servers are fed by a standard European 230V (16 Amps max.) power line. The room temperature is monitored and kept at 23°C by our Airwell CRACs.

Comments

  • usernametaken76 - Thursday, November 12, 2015

    Technically this is not true. IBM had a working version of AIX running on PS/2 systems as late as the 1.3 release. Unfortunately support was withdrawn and future releases of AIX were not compiled for x86 compatible processors. One can still find a copy of this release if one knows where to look. It's completely useless to anyone but a museum or curious hobbyist, but it's out there.
  • Steven Perron - Monday, November 23, 2015

    Hello Johan,

    I was reading this article, and I found it interesting. Since I am a developer for the IBM XL compiler, the comparisons between GCC and XL were particularly interesting. I tried to reproduce the results you are seeing for the LZMA benchmark. My results were similar, but not exactly the same.

    When I compared GCC 4.9.1 (I know, a slightly different version than yours) to XL 13.1.2 (I assume this is the version you used), I saw XL consistently ahead of GCC, even when I used -O3 for both compilers.

    I'm still interested in reproducing your results so I can see where XL can do better, and I have a couple of questions about areas that could differ.

    1) What version of the XL compiler did you use? I assumed 13.1.2, but it is worth double checking.
    2) Which version of the 7-zip software did you use? I picked up p7zip 15.09.
    3) Also, I noticed that when the POWER8 machine was running at full capacity (for me that was 192 threads on a 24-core machine), the results would fluctuate a bit. How many runs did you do for each configuration? Were the results stable?
    4) Did you try XL with less aggressive and more stable options like "-O3" or "-O3 -qhot"?

    Thanks for your time.
  • Toyevo - Wednesday, November 25, 2015

    Other than the ridiculous price of CDIMMs, the power efficiency just doesn't look healthy. For data centers leasing out their hardware, like Amazon AWS, Google App Engine, Azure, Rackspace, etc., clients who pay for hardware yet fail to use their full allocation significantly help those companies' bottom line through reduced overheads. For everyone else, high utilization is a mandatory part of the ROI equation during the hardware's life as an operating asset, so power consumption is a real cost. Even with our small cluster of 12 nodes, power efficiency is a real consideration, let alone for companies standardizing on IBM and utilising hundreds or thousands of nodes that are arguably less efficient.

    Perhaps you could devise some sort of theoretical total cost of ownership breakdown for these articles (a rough sketch of that math appears after this thread). My biggest question after all of this is: which one gets the most work done with the lowest overheads? Don't get me wrong though, I commend you and AnandTech on the detail you already provide.
  • AstroGuardian - Tuesday, December 8, 2015

    It's good to have someone challenging Intel, since AMD craps their pants on a regular basis.
  • dba - Monday, July 25, 2016

    Dear Johan:

    Can you extrapolate how much faster the SPARC S7 will be in your cluster benchmarking if the two on-die InfiniBand ports are activated: 5, 10, 20%?

    Thank you, dennis b.
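On Toyevo's TCO question above: below is a back-of-the-envelope sketch of the kind of breakdown he describes, folding hardware price and energy cost into a single cost-per-unit-of-work figure. Every number in it is a placeholder for illustration, not a measurement from this review, and the flat energy price and three-year window are assumptions.

    # tco_sketch.py - placeholder numbers only, not data from this review.
    EUR_PER_KWH = 0.20   # assumed flat energy price
    YEARS = 3            # assumed depreciation window

    def tco(hw_price_eur, avg_watts, jobs_per_hour):
        # Energy cost: average draw, running 24/7 over the whole window.
        energy_eur = avg_watts / 1000 * 24 * 365 * YEARS * EUR_PER_KWH
        total = hw_price_eur + energy_eur
        work = jobs_per_hour * 24 * 365 * YEARS
        return total, total / work   # total cost, cost per job

    # Hypothetical servers: (hardware price, average watts, jobs/hour)
    for name, args in {"server A": (12000, 700, 150),
                       "server B": (9000, 400, 110)}.items():
        total, per_job = tco(*args)
        print(f"{name}: {total:,.0f} EUR over {YEARS} years, "
              f"{per_job:.4f} EUR/job")

The point of the exercise: once energy is folded in, a box with a lower sticker price can still lose on cost per job, which is exactly the "most work done with the lowest overheads" comparison.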
