Bandwidth Monster

Previous versions of Intel's flagship Xeon always came with very conservative memory configurations, as RAM capacity and reliability were the priorities. Typically, these systems came with memory extension buffers for increased capacity, but those buffers also increase memory latency. As a result, these quad- and octal-socket monsters had a hard time competing with the best dual-Xeon setups in memory-intensive applications.

The new Xeon E7 v2 still has plenty of memory buffers (code named "Jordan Creek"), and it now supports three instead of two DIMMs per channel. The memory riser cards with two buffers now support 12 DIMMs instead of the eight of the Xeon Westmere-EX. Using relatively affordable 32GB DIMMs, this allows you to load a machine up with up to 3TB of RAM. If you break the bank and use 64GB LRDIMMs, 6TB of RAM is possible.
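
As a quick sanity check, the capacity figures fall out of simple multiplication. The sketch below assumes a quad-socket box with two 12-DIMM riser cards per socket (96 DIMM slots in total); that layout is an assumption for illustration, not something stated above.

```python
# Back-of-the-envelope capacity check for a quad-socket Xeon E7 v2 system.
# Assumed layout (not stated in the text): two 12-DIMM riser cards per socket,
# i.e. eight risers and 96 DIMM slots in total.
SOCKETS = 4
RISERS_PER_SOCKET = 2        # assumption for illustration
DIMMS_PER_RISER = 12         # 2 buffers x 2 channels x 3 DIMMs per channel

dimm_slots = SOCKETS * RISERS_PER_SOCKET * DIMMS_PER_RISER   # 96 slots

for dimm_gb in (32, 64):     # 32GB DIMMs vs 64GB LRDIMMs
    print(f"{dimm_gb}GB DIMMs -> {dimm_slots * dimm_gb / 1024:.0f}TB")
# 32GB DIMMs -> 3TB, 64GB LRDIMMs -> 6TB
```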

With the previous platform, having eight memory channels only increased capacity and not bandwidth, as they ran in lockstep. Each channel delivers half a cache line, and the Jordan Creek buffer then combines those halves and sends the result off to the requesting memory controller. The high-speed serial interface, or Scalable Memory Interconnect (SMI), must run at the same speed as the DDR3 channels. With Westmere-EX, this resulted in the SMI running at a maximum of 1066MHz. With the Xeon E7 v2, we get four SMI interconnects running at speeds up to 1600MHz. In lockstep, the system can survive a dual-device error; as a result, RAS (Reliability, Availability, Serviceability) is best in lockstep mode.
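
To make the lockstep mechanism a bit more concrete, here is a minimal conceptual sketch. The FakeChannel class and lockstep_read function are invented purely for this example and have nothing to do with Intel's actual implementation; they just mimic two channels each supplying half a cache line that the buffer recombines.

```python
# Conceptual sketch of lockstep mode; FakeChannel and lockstep_read are invented
# for illustration and are not Intel's implementation.
CACHE_LINE = 64  # bytes

class FakeChannel:
    """Stand-in for one DDR3 channel behind the Jordan Creek buffer."""
    def __init__(self, fill: int):
        self.fill = fill

    def read(self, addr: int, length: int) -> bytes:
        return bytes([self.fill]) * length   # pretend DRAM contents

def lockstep_read(chan_a: FakeChannel, chan_b: FakeChannel, addr: int) -> bytes:
    # Each channel supplies half a cache line; the buffer stitches the halves
    # together and forwards the full line over SMI. Because SMI runs at the
    # same speed as a single DDR3 channel, the pair adds capacity and RAS,
    # not bandwidth.
    half_a = chan_a.read(addr, CACHE_LINE // 2)
    half_b = chan_b.read(addr, CACHE_LINE // 2)
    return half_a + half_b

line = lockstep_read(FakeChannel(0xAA), FakeChannel(0xBB), addr=0x1000)
assert len(line) == CACHE_LINE
```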

With the Ivy Bridge EX version of the Xeon E7, the channels can also run independently. This mode is called performance mode, and here each channel can deliver a full cache line. To cope with twice the amount of bandwidth, the SMI interconnect must run twice as fast as the memory channels: the SMI channel can run at 2667 MT/s while the two DDR3 channels behind it work at 1333 MT/s. That means that, in theory, the E7 v2 chip could deliver as much as 85GB/s (1333 MT/s * 8 channels * 8 bytes per channel) of bandwidth, which is 2.5x more than what the previous platform delivered. The disadvantage is that only a single device error can be corrected: more speed, less RAS.
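
The 85GB/s and 2.5x figures can be reproduced with a bit of arithmetic. The sketch below assumes the Westmere-EX lockstep ceiling is set by four SMI links per socket running at 1066 MT/s; that link count is an assumption inferred from the lockstep description above, not a figure quoted in the text.

```python
# Theoretical peak bandwidth behind the 85GB/s and 2.5x figures in the text.
def peak_gbs(channels: int, mt_per_s: int, bytes_per_transfer: int = 8) -> float:
    """DDR3 moves 8 bytes per transfer, so GB/s = channels * MT/s * 8 / 1000."""
    return channels * mt_per_s * bytes_per_transfer / 1000

# Xeon E7 v2, performance mode: 8 independent channels at 1333 MT/s.
e7v2_perf = peak_gbs(channels=8, mt_per_s=1333)            # ~85 GB/s

# Westmere-EX, lockstep: assumed 4 SMI links per socket capped at 1066 MT/s.
westmere_lockstep = peak_gbs(channels=4, mt_per_s=1066)    # ~34 GB/s

print(f"E7 v2 performance mode: {e7v2_perf:.0f} GB/s")
print(f"Westmere-EX lockstep:   {westmere_lockstep:.0f} GB/s")
print(f"Improvement: {e7v2_perf / westmere_lockstep:.1f}x")  # ~2.5x
```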

According to Intel, both latency and bandwidth are improved tremendously compared to the Westmere-EX platform. As a result, the new quad Xeon E7 v2 platform should perform a lot better in memory-intensive HPC applications.

Comments

  • Kevin G - Saturday, February 22, 2014 - link

    Not 100% sure since I'm not an IEEE member and can't view it, but this paper may be the source for the POWER7+ figures:
    http://ieeexplore.ieee.org/xpl/articleDetails.jsp?...
  • Phil_Oracle - Monday, February 24, 2014 - link

    TDP is great for comparing chip to chip, but what really matters is system performance/watt. And although Intel's latest Xeon E7 v2 may have better TDP specs than either POWER7+ or SPARC T5, when you look at the total system performance/watt, SPARC T5 actually leads today due to its higher throughput, core count, 4x more threads, built-in encryption engines and higher optimization with the Oracle SW stack.
  • Flunk - Friday, February 21, 2014 - link

    8-core consumer chips now, please. If you have to take the GPU off, go for it.
  • DanNeely - Friday, February 21, 2014 - link

    Assuming you mean 8 identical cores, it's not going to happen until mainstream consumer apps become common that can use more CPU resources than the 4 HT cores in Intel's high-end consumer chips but can't benefit from GPU acceleration.

    I suppose Intel could do a big.LITTLE-type implementation in the future with either Core and Atom, or Atom and the super-low-power 486-ish architecture they announced a few months ago. But in addition to thinking it was worthwhile for the power savings, they'd also need to license/work around ARM's patents. I suppose a mobile version might happen someday; but I don't really see a plausible benefit for laptop/desktop systems that don't need continuous connected standby like phones do.
  • Kevin G - Friday, February 21, 2014 - link

    Intel hasn't announced any distinct plans to go this route, but they're at least exploring the idea at some level. Skylake and Knights Landing are set to support the same ISA extensions, and in principle a program could migrate between the two types of cores.
  • StevoLincolnite - Saturday, February 22, 2014 - link

    Er. You don't need apps to use more than 4 threads to make use of an 8-core processor.
    Whatever happened to running several demanding applications at once? Surely I am not the only one who does this...
    My Sandy Bridge-E processor, being a few years old, is starting to show its age in such instances. I would cry tears of blood for an 8-core Haswell-based processor to replace my current 6-core chip.
  • psyq321 - Monday, March 10, 2014 - link

    Well, you can buy a bigger Ivy Bridge EP Xeon CPU and fit it in your LGA2011 system.

    This way you can go up to 12 cores and not have to wait for the 8-core Haswell-E.
  • SirKnobsworth - Friday, February 21, 2014 - link

    8-core Haswell-E chips are due out later this year. You can already buy 6-core Ivy Bridge-E chips with no integrated graphics.
  • TiGr1982 - Friday, February 21, 2014 - link

    Did you know:
    Haswell-E is supposed to be released in Q3 this year, to have up to 8 Haswell cores with HT, fit in the new revision of Socket LGA2011 (incompatible with the current desktop LGA2011), and work with DDR4 and the X99 chipset. No GPU there, since it's a byproduct of the server Haswell-EP.
  • Harry Lloyd - Friday, February 21, 2014 - link

    That will not help much unless they release a 6-core chip for around $300, replacing the lowest LGA2011 4-core chips. It is about time.
