Memory Subsystem: Bandwidth

As we mentioned before, the IBM POWER8 has a memory subsystem which is more similar to the Xeon E7's than the E5's. The IBM POWER8 connects to 4 "Centaur" buffer cache chips, which have both a 19.2 GB/s read and 9.6 GB/s write link to the processor, or 28.8 GB/s in total. So the 105 GB/s aggregate bandwidth of the POWER8 is not comparable to Intel's peak bandwidth. Intel's peak bandwidth is the result of 4 channels of DDR4-2400 that can either write or read at 76.8 GB/s (2.4 GHz x 8 bytes per channel x 4 channels).

Bandwidth is of course measured with John McCalpin's Stream bandwidth benchmark. We compiled the stream 5.10 source code with gcc 5.2.1 64 bit. The following compiler switches were used on gcc:

-Ofast -fopenmp -static -DSTREAM_ARRAY_SIZE=120000000

The latter option makes sure that stream tests with array sizes which are not cacheable by the Xeons' huge L3 caches.

It is important to note why we use the GCC compiler and not vendors' specialized compilers: the GCC compiler is not as good at vectorizing the code. Intel's ICC compiler does that very well, and as result shows the bandwidth available to highly optimized HPC code, which is great for that code in the real world, but it's not realistic for multi-threaded server applications.

With ICC, Intel can use the very wide 256-bit load units to their full potential and we measured up to 65 GB/s per socket. But you also have to consider that ICC is not free, and GCC is much easier to integrate and automate into the daily operations of any developer. No licensing headaches, no time consuming registrations.

Stream Triad (GCC)

The combination of the powerful four load and two store subsystem of the POWER8 and the read/write interconnect between the CPU and the Centaur chips makes it much easier to offer more bandwidth. The IBM POWER8 delivers a solid 90 GB/s despite using old DDR3-1333 memory technology.

Intel claims higher bandwidth numbers, but those numbers can only be delivered in vectorized software.

Configuration and Benchmark Selection Memory Subsystem: Latency Measurements


View All Comments

  • close - Thursday, July 21, 2016 - link

    This right here is why I keep coming back to Anandtech. Thumbsup! Reply
  • jardows2 - Thursday, July 21, 2016 - link

    Agreed. There are plenty of places you can go to find out how pretty your games will look, but this sort of stuff is much more interesting to me!

    Looking forward to the application numbers. Power8 may shape up to be a nice server alternative. I would like to see about virtualization. With the threaded capabilities, it might just be a good platform for that.
  • Brutalizer - Friday, July 22, 2016 - link

    Regarding virtualization, SPARC M7 is more than 4x faster than POWER8 on SPECvirt_sc2013, and more than 2x faster than x86
    Regarding SPECcpu2006, SPARC M7 is 1.9x and 1.8x faster than POWER8, and is faster than x86 as well:
    Regarding memory bandwidth, SPARC M7 is 2.2x and 1.7x faster than POWER8 and 2.4x faster than x86 on STREAM benchmarks:
    If you dig a bit on that web site, you will find 30ish world records, where SPARC M7 is 2-3x faster than POWER8 and x86, all the way up to 11x faster.

    It is interesting to delve in to the technology behind POWER8 and x86, but in the end, what really matters, is how fast the cpu performs in real life workloads and benchmarks. SPARC has lower IPC than x86, but as real life server workloads have an IPC of 0.8, SPARC which is a server cpu, is much faster than x86 in practice. In theory, x86 and POWER8 are fast, but in practice, they are much slower than SPARC. So, you can theoretize all you want, but in the end - which cpu is fastest in real workloads and in real benchmarks? SPARC. Just look at all the benchmarks above, where SPARC M7 is faster in number crunching, Big data, neural networks, Hadoop, virtualization, memory bandwidth, etc etc. And if you also factor in the business benchmarks, such as SAP, Peoplesoft, databases etc - there is no contest. You get twice the performance, or more, with a SPARC M7 server than the competitors.

    SPARC M7 can also turn on encryption on everything, and loose 2-3% performance. Whereas encryption on POWER8 and x86 typically reduces performance down to 33% or lower. So, if you benchmark encrypted workloads, then SPARC M7 is not typically 2-3x faster, but another 3 times faster again - i.e. typically faster 6-9x.
  • Kevin G - Friday, July 22, 2016 - link

    Oracle marketing at its finest.

    The virtualization score is good vs. POWER8 mainly based on the radical different in core count: 32 vs. 6. Yeah, even with lower IPC, I'd expect the higher core count system to fair better. Also note that IBM offers such higher core count systems and at higher clock speeds which would close that gap.

    Same for the claims of being twice as fast in raw benchmarks: Oracle isn't comparing there best against IBM's best POWER8. There choice of comparison point was simply arbitrary to make SPARC look good, as is the job of their marketing department. Real performance comparisons come from independent reports.

    To get the memory bandwidth advantage Oracle proclaims, they have to use twice as many sockets.
  • SarahKerrigan - Friday, July 22, 2016 - link

    These supposed Oracle "wins" are all based on worst-case scenarios for Power8 - ie, testing a DCM based system and counting each DCM as two processors. This isn't very useful for comparison to Power8 overall, as the entry-level machines like the one in this article, and the S822LC positioned above it, all use SCM's (with as many as twelve cores.)

    M7 is a first-rate CPU, but it's also in a totally different cost class; the cheapest M7 config listed on Oracle's website costs over US$40k, for a one-processor machine. Considering you can get a pair of 10-core Power8's with 256GB of RAM in an S822LC for US$14,300 list, this is an exceptionally tough sell for those not wedded to Solaris (and by the way, there's no RHEL, SLES, or Ubuntu for SPARC - Solaris is pretty much the only game in town.)

    My company is currently deploying an S812LC and intends to deploy an S822LC in the future; we briefly considered SPARC but found the style of marketing that Oracle and its proxies seem to favor to be deeply offputting, as is the relatively poor perf/$ compared to both Power and Intel. Our loads (mainly a large PostgreSQL application) scale well with memory bandwidth and cache sizes, and we've found S812LC perf/$ to be first-rate. The main downsides have just been related to the relative immaturity of the ppc64le platform (occasional lack of available packages, etc.)
  • Brutalizer - Sunday, July 31, 2016 - link

    These oracle sparc m7 benchmarks vs IBM power8 are not worst case. The DCM Power8 module, actually consists of two power8 CPUs, in one socket. So there is nothing wrong with these benchmarks. It is up to IBM to release benchmarks with two power8 CPUs in one socket, not oracle choice. IBM has for decades promoted few strong cores instead of many weaker cores. For instance, IBM claimed "dual core power6 @ 5 ghz was superior to 8core sparc niagara2 @ 1.6 ghz because databases runs best on few but strong cores" and IBM talked about future super strong single/dual core 6-7 ghz power CPUs and mocked sparc many but weaker cores because databases are worthless on sparc. Back then sparc were first with 8 cores, and it was very controversial having that many cores. Later IBM realized laws of physics prohibit highly clocked CPUs, so IBM abandoned that path and followed sparc with many knower clocked cores. Just like Intel abanoned Prescott with high clocks. Today everybody have many lower clocked cores, just like spare decades ago.

    Of course, if IBM released benchmarks with other configurations of power8, oracle would be happy to use them, but IBM has not. Oracle has no choice than to use those benchmarks that IBM has released. It is not oracles choice what benchmarks IBM release.

    We also know that power8 is slower than the latest Intel xeons, and we know that sparc m7 is typically 2-3x faster than Intel Xeon, so probably these benchmarks from IBM vs sparc m7 benchmarks are true. If you find other IBM power8 benchmarks I am sure oracle will compare to them instead. But you can only bench against ibm's own results, right?

    Regarding my credibility, yes, I am an sparc supporter. What is the problem with being an supporter? I know there are IBM supporters here, and there are nvidia, Amd, Intel etc supporters. What is wrong with that? Does the fact that I consider sparc to be superior, invalidate the official oracle vs IBM vs Intel benchmarks? I have not created those benchmarks, IBM has. And oracle. And Intel. Instead of you, IBM supporters, linking to official superior IBM power8 benchmarks you claim that because I am an sparc supporter, those official vendor benchmarks can not be trusted. Instead of proving that power8 is faster with benchmarks, you resort to attacking me. That does not win you any discussions. Show us facts and benchmarks if you want invalidate my linked benchmarks, instead of attacking me. Fact is, you have not proven anything regarding power8 inferiority.

    And why do I keep talking about sparc m7? Well, it seems people believe that Intel and power8 is so fast, but in fact there are another cpu out there, 2-3x faster, up to 11x faster. People just don't know that sparc is the worlds fastest CPU. I would like anandtech to talk about the best CPU in the world instead of slow IBM power or Intel Xeon CPUs. But anandtech don't.

    Regarding myself, yes I have been interviewed in Swedish media, and it is evident that I have always worked finance. I have never worked at Sun nor Oracle. Just read the interview. The last years I am an quantitative analyst concocting trading strategies. I have never worked in IT. i just happen to be a nerd and geek, and i only support the best tech, and it is sparc and Solaris. IBM and Intel sucks. Just compare their lousy performance to sparc m7
  • SarahKerrigan - Sunday, July 31, 2016 - link

    "The DCM Power8 module, actually consists of two power8 CPUs, in one socket."

    Dude, nobody outside of Oracle marketing cares, just like they didn't care when Xeon and Opteron used MCM's. IBM has SCM's going all the way up to 12 cores and 8 Centaur links, they just use DCM's for cost reasons on some (but not all) smaller machines. These have the same number of Centaur links per socket as the big SCM's, and they're priced as one would expect of one or two socket enterprise systems. Realistically, the 8-Centaur SCM has roughly equivalent memory bandwidth to the 8-Centaur DCM.

    "Later IBM realized laws of physics prohibit highly clocked CPUs, so IBM abandoned that path and followed sparc with many knower clocked cores. Just like Intel abanoned Prescott with high clocks. Today everybody have many lower clocked cores, just like spare decades ago."

    You mean like when Oracle replaced 16-core 1.65GHz T3 with 8-core 3GHz T4? Which, by the way, had very similar throughput performance (which you say is all that matters) to the T3, but had far higher single-thread and single-core performance? If only throughput matters, why would Oracle do such a thing? It's quite a thing for you to imply Oracle doesn't know what they're doing!

    They also have been publishing benchmarks for their shiny new S7 chip where they lose per-chip to the Xeon - but they win per-core, which you've said on many occasions doesn't matter. Here are some examples:

    Comparisons to IBM are conspicuously absent, I suspect because Power perf/core is rather impressive.

    "For instance, IBM claimed "dual core power6 @ 5 ghz was superior to 8core sparc niagara2 @ 1.6 ghz because databases runs best on few but strong cores" and IBM talked about future super strong single/dual core 6-7 ghz power CPUs and mocked sparc many but weaker cores because databases are worthless on sparc."

    IBM has never reduced per-core or single-thread performance generation to generation. P7 and P8 were both massive improvements in both categories. IBM has not historically shown interest in "weaker" cores for Power.

    "Well, it seems people believe that Intel and power8 is so fast, but in fact there are another cpu out there"

    Yes. For the low, low price of over forty thousand dollars for the lowest-end, one-processor M7 system with public prices on Oracle's website.

    "2-3x faster"

    Consulting officially published results on an industry-standard benchmark:

    Xeon E7-8890v4, 2.2GHz: SPECint rate result of 927/chip, 24 cores (38/core)
    Power8 SCM, 4GHz: SPECint rate result of 900/chip, 12 cores (75/core)
    SPARC M7, 4.13GHz: SPECint rate result of 1200/chip, 32 cores (37/core)

    Not that impressive - especially given M7's price. And certainly not 2-3x of anything (or even 1.9x). It's 1.3x... while having 2.5x as many cores. Additionally, for a large range of applications, single-thread performance matters.

    "up to 11x faster."

    When running in-memory queries inside Oracle DB using accelerator instructions added to SPARC M7 specifically for Oracle DB, yes.

    By the way, since you mentioned memory bandwidth... how does it feel to have two-processor SPARC S7 losing on STREAM Triad to entry-level, one-processor Power8 machines that cost significantly less? Compare to the entry-level Power8 results in the article we're commenting on!

    Oracle proponents need to do better than this. At least Phil Dunn resorts less to copypasta...

    SPEC references:
  • close - Tuesday, August 02, 2016 - link

    "What is the problem with being an supporter? What is wrong with that?"
    Lying, deceiving, etc.

    This is what Oracle does because simply put ever since they acquired Sun those products went to sh*t. Oracle are reverse-alchemists. Whether it's software (like Java) or hardware (like the Sparc) Oracle managed to turn those gold nuggets into lead weights. Java was buried by Google, Sparc was buried by Intel and IBM.

    Oracle always resorts to this kind of piss-poor advertising and it's not for the customers themselves. They try to save face with numbers on that site with one reason only: to have something to show during their conferences. Because companies don't rely on numbers in a benchmark when committing to multi-year contracts and getting tied into a specific ecosystem.
    Right now only a handful of government institutions and some in regulated industries still rely on Sparc and only in corner cases. Most times it's just until they manage to migrate off them.
  • close - Tuesday, August 02, 2016 - link

    The EXA products might be the only ones with some solid popularity because it's the full package but they do come with plenty of caveats. Having worked in the defense and financial sectors for a long time I've seen plenty of consolidation being done on newer Oracle/Sparc systems but not so many new deployments (a handful). And the proof is in the numbers. Oracle can't seem to make any headway into this.
    This isn't the kind of runaway success you'd expect for such an "overpowering" system.

    P.S. Google for "For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer" and see the army of posters Oracle is employing and the kind of tactics Oracle they resort to. And that's just the official posters.
  • close - Tuesday, August 02, 2016 - link

    Their engineered systems for integrated infrastructure and platforms (the latter being their driver) are great but not because of the hardware or the CPU in particular. It's because of the value of the whole package that includes the software layer. Nobody actually cares about the CPU in those particular products and if the CPU were being sold they would have tough time.
    And not least, they almost always HAVE to heavily discount the price in order to make the sale. From personal and recent experience Oracle was eager enough to undercut competitors like Cisco, VCE or HP (HP has 3 digit growth in this segment YoY for 2-3 years now) and discounted so aggressively that we ended up with 50% savings...

Log in

Don't have an account? Sign up now