Capacity: the New Arms Race

Some of the hottest software trends of today are Big Data and In-Memory Business Analytics. Both applications benefit from fast processors, but even more importantly they are virtually insatiable when it comes to RAM capacity. Another important area that's much closer to the daily work of many IT professionals is virtualization. As heavier applications are being virtualized, the typical amount of memory allocated to virtual machines has increased rapidly. As announced at the latest VMworld, vSphere 6 will support virtual machines that allocate up to 4TB (!!) of memory. The days when virtual machines were limited to only a fraction of what "native" operating systems could use are behind us.

With the above developments, support for and development of high capacity DIMMs is crucial. Intel has been steadily improving the support for LRDIMMs (here's some additional information on LRDIMMs). The first Xeon E5-2600 had support for LRDIMMs but it only delivered higher capacity at the expense of lower bandwidth and higher latency. The memory controller of the Xeon E5-2600 v2 had several improvements specifically for LRDIMMs and as a result the latency and throughput tax was greatly reduced.

The advent of DDR4 has given the engineers of IDT the opportunity to give LRDIMMs a performance advantage instead of a disadvantage. By introducing data buffers close to the DRAM chips, they managed to reduce the I/O trace lengths tremendously. See the figure below.

DDR4 and DDR3 LRDIMMs compared, image courtesy of IDT.

The latency overhead of the extra buffering is thus significantly lower on DDR4 LRDIMMs. In other words, compared to Registered DDR4 running at the same speed with 1 DPC (1 DIMM per channel), the latency overhead will be small. As soon as you start to use more DIMMs per channel, LRDIMMs actually offer lower latency as they can run at higher speeds.

Below you can see the evolution of LRDIMM support over the three generations of Xeon E5s. On the far right is the speed of DIMMs on Sandy Bridge EP, in the middle is Ivy Bridge EP, and on the left is the speed of DIMMs on the new Haswell EP Xeon.

On Sandy Bridge EP (Xeon E5-2600), LRDIMMs were only clocked faster than RDIMMs at three DPC. On Ivy Bridge EP (Xeon E5-2600 v2), the LRDIMMs were faster at two and three DPC. And on Haswell EP (Xeon E5-2600 v3), the speed gap at two and three DPC has increased while the latency tax (not seen in the picture) has been reduced.

Samsung LRDIMM on top, RDIMM below. Notice the data buffers on the LRDIMM

Several sources tell us that LRDIMMs will be about 20%-25% more expensive. Our task then is to help you decide whether or not the investment is worth it. In this review, we will show some preliminary results.

The latency penalty has been reduced, but what about capacity? As you can see by the 4G marking in the photo above, the DIMMs used in our current servers are still using the mature 4Gbit DRAM chip technology. So currently, the Xeon E5-2600 v3 platform is limited to 384GB of registered DDR4 or 768GB of LRDIMMs. Quad-ranked RDIMMs, which were expensive, slow, and could only be used at 2DPC, are dead. The current 64GB LRDIMMs can be used at 3DPC, but they are octal-rank (!) modules using quad-die packages. As a result they are slow at 3DPC and power hungry.

But the future looks bright. At the end of this year, dual-ranked modules, such as the ones you can see above, will use 8Gbit chips. This results in 64GB LRDIMMs and 32GB RDIMMs. That means the Xeon E5 platform will soon be able to address up to 1.5TB of physical RAM. In the second half of 2015, 128GB LRDIMMs should be available too, allowing up to 3TB of RAM.
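The capacity figures above all follow from the same multiplication: sockets times memory channels times DIMMs per channel times module size. Here is a minimal sketch of that arithmetic. It assumes a dual-socket server with four memory channels per socket, populated at 3DPC; the function name and the labels are ours, for illustration only.

```python
# Rough capacity arithmetic for a Xeon E5 server.
# Assumptions (not spelled out in the text above): 2 sockets,
# 4 memory channels per socket, 3 DIMMs per channel (3DPC).

def max_capacity_gb(dimm_gb, sockets=2, channels_per_socket=4, dimms_per_channel=3):
    """Return the total RAM in GB when every slot holds a DIMM of dimm_gb."""
    slots = sockets * channels_per_socket * dimms_per_channel
    return slots * dimm_gb

# Module sizes taken from the text; the labels are illustrative.
modules = [
    ("16GB RDIMM (4Gbit chips)", 16),
    ("32GB LRDIMM (4Gbit chips)", 32),
    ("64GB LRDIMM (8Gbit chips)", 64),
    ("128GB LRDIMM (expected 2H 2015)", 128),
]

for label, size_gb in modules:
    total = max_capacity_gb(size_gb)
    print(f"{label}: {total} GB ({total / 1024:.2f} TB)")
```

With 24 slots, 16GB RDIMMs give 384GB and 32GB LRDIMMs give 768GB, while 64GB and 128GB LRDIMMs give 1536GB (1.5TB) and 3072GB (3TB) respectively. Dropping to 2DPC or 1DPC scales the totals down proportionally.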


85 Comments


  • bsd228 - Friday, September 12, 2014 - link

    Now go price memory for M class Sun servers...even small upgrades are 5 figures and going 4 years back, a mid sized M4000 type server was going to cost you around 100k with moderate amounts of memory.

    And take up a large portion of the rack. Whereas you can stick two of these 18 core guys in a 1U server and have 10 of them (180 cores) for around the same sort of money.

    Big iron still has its place, but the economics will always be lousy.
  • platinumjsi - Tuesday, September 9, 2014 - link

    ASRock are selling boards with DDR3 support, any idea how that works?

    http://www.asrockrack.com/general/productdetail.as...
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Well... ASRock is generally famous for "marrying" different gen hardware.
    But here, since this is about DDR RAM, governed by the CPU itself (because the memory controller is inside the CPU), my only guess is that Xeon E5 v3 may have a dual-mode memory controller (supporting either DDR4 or DDR3), similar to what Phenom II had back in 2009-2011, which supported either DDR2 or DDR3, depending on where you plugged it in.

    If so, then probably just the performance of E5 v3 with DDR3 may be somewhat inferior in comparison with DDR4.
  • alpha754293 - Tuesday, September 9, 2014 - link

    No LS-DYNA runs? And yes, for HPC applications, you actually CAN have too many cores (because you can't keep the working cores pegged with work/something to do, so you end up with a lot of data migration between cores, which is bad, since moving data means that you're not doing any useful work ON the data).

    And how you decompose the domain (for both LS-DYNA and CFD) makes a HUGE difference on total runtime performance.
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    No, I hope to get that one done in the more Windows/ESXi oriented review.
  • Klimax - Tuesday, September 9, 2014 - link

    Nice review. Next stop: Windows Server. (And MS-SQL..)
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    Agreed. PCIe Flash and SQL Server look like a nice combination to test these new Xeons.
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Xeon 5500 series (Nehalem-EP): up to 4 cores (45 nm)
    Xeon 5600 series (Westmere-EP): up to 6 cores (32 nm)
    Xeon E5 v1 (Sandy Bridge-EP): up to 8 cores (32 nm)
    Xeon E5 v2 (Ivy Bridge-EP): up to 12 cores (22 nm)
    Xeon E5 v3 (Haswell-EP): up to 18 cores (22 nm)

    So, in this progression, core count increases by 50% (1.5 times) almost each generation.

    So, what's gonna be next:

    Xeon E5 v4 (Broadwell-EP): up to 27 cores (14 nm) ?

    Maybe four rows with 5 cores and one row with 7 cores (4 x 5 + 7 = 27) ?
  • wallysb01 - Wednesday, September 10, 2014 - link

    My money is on 24 cores.
  • SuperVeloce - Tuesday, September 9, 2014 - link

    What's the story with 2637v3? Only 4 cores and the same frequency and $1k price as 6core 2637v2? By far the most pointless cpu on the list.
