Capacity: the New Arms Race

Some of the hottest software trends of today are Big Data and in-memory business analytics. Both benefit from fast processors, but even more importantly they are virtually insatiable when it comes to RAM capacity. Another important area that's much closer to the daily work of many IT professionals is virtualization. As heavier applications are being virtualized, the typical amount of memory allocated to virtual machines has increased rapidly. As announced at the latest VMworld, vSphere 6 will support virtual machines that allocate up to 4TB (!!) of memory. The days when virtual machines were limited to only a fraction of the resources of "native" operating systems are behind us.

With the above developments, support for and development of high capacity DIMMs is crucial. Intel has been steadily improving the support for LRDIMMs (here's some additional information on LRDIMMs). The first Xeon E5-2600 had support for LRDIMMs but it only delivered higher capacity at the expense of lower bandwidth and higher latency. The memory controller of the Xeon E5-2600 v2 had several improvements specifically for LRDIMMs and as a result the latency and throughput tax was greatly reduced.

The advent of DDR4 has given the engineers of IDT the opportunity to give LRDIMMs a performance advantage instead of a disadvantage. By introducing data buffers close to the DRAM chips, they managed to reduce the I/O trace lengths tremendously. See the figure below.

DDR4 and DDR3 LRDIMMs compared, image courtesy of IDT.

The latency overhead of the extra buffering is thus significantly lower on DDR4 LRDIMMs. In other words, compared to Registered DDR4 running at the same speed with 1 DPC (1 DIMM per channel), the latency overhead will be small. As soon as you start to use more DIMMs per channel, LRDIMMs actually offer lower latency as they can run at higher speeds.

Below you can see the evolution of LRDIMM support over the three generations of Xeon E5s. On the far right is the speed of DIMMs on Sandy Bridge EP, in the middle is Ivy Bridge EP, and on the left is the speed of DIMMs on the new Haswell EP Xeon.

On Sandy Bridge EP (Xeon E5-2600), LRDIMMs were only clocked faster than RDIMMs at three DPC. On Ivy Bridge EP (Xeon E5-2600 v2), the LRDIMMs were faster at two and three DPC. And on Haswell EP (Xeon E5-2600 v3), the speed gap at two and three DPC has increased while the latency tax (not seen in the picture) has been reduced.

Samsung LRDIMM on top, RDIMM below. Notice the data buffers on the LRDIMM

Several sources tell us that LRDIMMs will be about 20%-25% more expensive than RDIMMs. Our task then is to help you decide whether or not the investment is worth it. In this review, we will show some preliminary results.

The latency penalty has been reduced, but what about capacity? As you can see from the 4G marking in the photo above, the DIMMs used in our current servers still use the mature 4Gbit DRAM chip technology. So currently, the Xeon E5-2600 v3 platform is limited to 384GB of registered DDR4 or 768GB of LRDIMMs. Quad-ranked RDIMMs, which were expensive, slow, and could only be used at 2 DPC, are dead. The current 64GB LRDIMMs can be used at 3 DPC, but they are octal-rank (!) modules using quad-die packages. As a result they are slow at 3 DPC and power hungry.
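
The relationship between chip density, rank count, and module capacity can be sketched with a little arithmetic. The snippet below is only an illustration under stated assumptions: it assumes x4 DRAM devices on a 64-bit data bus (so 16 data chips per rank) and ignores the extra ECC chips; the function name `dimm_capacity_gb` is ours, not an industry term.

```python
def dimm_capacity_gb(chip_density_gbit: int, ranks: int, device_width: int = 4) -> int:
    """Usable capacity of one DIMM in GB, ignoring ECC chips.

    Assumptions (not stated in the article): each rank drives a 64-bit
    data bus and is built from x4 devices unless told otherwise.
    """
    data_bus_bits = 64
    chips_per_rank = data_bus_bits // device_width   # 16 chips for x4 devices
    rank_gbit = chips_per_rank * chip_density_gbit   # capacity of one rank in Gbit
    return rank_gbit * ranks // 8                    # convert Gbit to GB


if __name__ == "__main__":
    for ranks, label in [(2, "dual-rank"), (4, "quad-rank"), (8, "octal-rank")]:
        print(f"{label} module with 4Gbit chips: {dimm_capacity_gb(4, ranks)}GB")
```

Under these assumptions, reaching 64GB with 4Gbit chips takes eight ranks, which is consistent with the octal-rank 64GB LRDIMMs mentioned above.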

But the future looks bright. At the end of this year, dual-ranked modules, such as the ones you can see above, will use 8Gbit chips. This results in 64GB LRDIMMs and 32GB RDIMMs. That means the Xeon E5 platform will soon be able to address up to 1.5TB of physical RAM. In the second half of 2015, 128GB LRDIMMs should be available too, allowing up to 3TB of RAM.
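
The platform totals quoted in this section can be reproduced with a quick calculation. This is a rough sketch that assumes a dual-socket server with four memory channels per socket and three DIMMs per channel (24 slots in total); those topology constants and the helper name `platform_capacity_gb` are our assumptions for illustration.

```python
# Assumed topology (not spelled out in the text): dual-socket Xeon E5 server,
# four memory channels per socket, up to three DIMMs per channel.
SOCKETS = 2
CHANNELS_PER_SOCKET = 4
MAX_DPC = 3


def platform_capacity_gb(dimm_gb: int, dpc: int = MAX_DPC) -> int:
    """Total RAM in GB when every channel holds `dpc` DIMMs of `dimm_gb` each."""
    return SOCKETS * CHANNELS_PER_SOCKET * dpc * dimm_gb


if __name__ == "__main__":
    for dimm_gb in (16, 32, 64, 128):
        total = platform_capacity_gb(dimm_gb)
        print(f"{dimm_gb}GB DIMMs at {MAX_DPC} DPC: {total}GB ({total / 1024:g}TB)")
```

With 24 slots, 16GB and 32GB modules give the 384GB and 768GB limits, while 64GB and 128GB modules give the 1.5TB and 3TB figures.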




  • MorinMoss - Friday, August 9, 2019

    Hello from 2019.
    AMD has a LOT of ground to make up but it's a new world and a new race
  • Kevin G - Monday, September 8, 2014

    As an owner of a dual Opteron 6376 system, I shudder at how far behind that platform is. Then I look down and see that I have both of my kidneys as I didn't need to sell one for a pair of Xeons so I don't feel so bad. For the price of one E5-2660v3 I was able to pick up two Opteron 6376's.
  • wallysb01 - Monday, September 8, 2014

    But the rest of the system cost is about the same. So you get 1/2 the performance for a 10% discount. YEPPY!
  • Kevin G - Monday, September 8, 2014

    Nope. Build price after all the upgrades over the course of two years is somewhere around $3600 USD. The two Opterons accounted for a bit more than a third of that price. Not bad for 32 cores and 128 GB of memory. Even with Haswell-E being twice as fast, I'd have to spend nearly twice as much (CPUs cost twice as much, as does DDR4 compared to when I bought my DDR3 memory). To put it into perspective, a single Xeon E5 2699v3 might be faster than my build but I was able to build an entire system for less than the price of Intel's flagship server CPU.

    I will say something odd - component prices have increased since I purchased parts. RAM prices have gone up by 50% and the motherboard I use has seemingly increased in price by $100 due to scarcity. Enthusiast video card prices have also gotten crazy over the past couple of years so a high end video card is $100 more for top of the line in the consumer space.
  • wallysb01 - Tuesday, September 9, 2014

    Going to the E5 2699 isn’t needed. A pair of 2660 v3s is probably going to be nearly 2x as fast as the 6376, especially for floating point where your 32 cores are more like 16 cores or for jobs that can’t use very many threads. True, a pair of 2660s will be twice as expensive. On a total system it would add about $1.5K. We’ll have to wait for the workstation slanted view, but for an extra $1.5K, you’d probably have a workstation that’s much better at most tasks.
  • Kevin G - Friday, September 12, 2014

    Actually if you're aiming to double the performance of a dual Opteron 6376, two E5-2695v3's look to be a good pick for that target according to this review. A pair of those will set you back $4848, which is more than what my complete system build cost.

    Processors are only one component. So while a dual Xeon E5-2695v3 system would be twice as fast, total system cost is also approaching double due to memory and motherboard pricing differences.
  • Kahenraz - Monday, September 8, 2014

    I'm running a 6376 server as well and, although I too yearn for improved single-threaded performance, I could actually afford to own this one. As delicious as these Intel processors are, they are not priced for us mere mortals.

    From a price/performance standpoint, I would still build another Opteron server unless I knew that single-threaded performance was critical.
  • JDG1980 - Tuesday, September 9, 2014

    The E5-2630 v3 is cheaper than the Opteron 6376 and I would be very surprised if it didn't offer better performance.
  • Kahenraz - Tuesday, September 9, 2014

    6376s can be had very cheaply on the second-hand market, especially bundled with a motherboard. Additionally, the E5-2630 v3 requires both a premium on the board and DDR4 memory.

    I'd wager you could still build an Opteron 6376 system for half or less.
  • Kevin G - Tuesday, September 9, 2014

    It'd only be fair to go with the second hand market for the E5-2630v3's but being new means they don't exist. :)

    Still going by new prices, an Opteron 6376 will be cheaper, but only by roughly 33% from what I can tell. You're correct that the new Xeons carry a price premium on motherboards and DDR4 memory.
