DDR4

Intel and the rest of the DRAM world are switching over to DDR4, and with good reason. DDR4 is a large step forward, and its highlights include the following:

  • Speeds up to 3200 MT/s (1.6GHz Double Data Rate)
  • Lower DRAM I/O voltage (1.2 instead of 1.5 V VDDQ)
  • Twice the capacity (using the same DRAM chips)
  • Improved RAS

The improvements start with the internal organization. A DDR3 chip has eight independent banks, while DDR4 comes with 16, organized as four bank groups of four banks each. More banks mean that more pages can stay open (more page hits, lower latency) at a small power cost, which is more than offset by a whole range of power efficiency features (see below). The power efficiency gains are rather large; Samsung quantifies them in the slide below.
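The 4x4 organization above can be made concrete with a small sketch. The field widths (two bank-group bits BG0-BG1, two bank bits BA0-BA1) come from the DDR4 spec, but the mapping of a flat index onto those bits here is purely illustrative:

```python
# Illustrative sketch: splitting a flat DDR4 bank index (0-15) into its
# bank-group and bank-within-group parts. DDR4 has 16 banks arranged as
# 4 bank groups x 4 banks; the bit layout chosen here is an assumption.

def split_bank_index(bank_index: int) -> tuple[int, int]:
    """Map a flat bank index (0-15) to (bank_group, bank_within_group)."""
    assert 0 <= bank_index < 16, "DDR4 exposes 16 banks total"
    bank_group = bank_index >> 2   # upper 2 bits: BG0-BG1
    bank = bank_index & 0b11       # lower 2 bits: BA0-BA1
    return bank_group, bank

# All 16 combinations of group and bank are reachable:
assert split_bank_index(0) == (0, 0)
assert split_bank_index(7) == (1, 3)
assert split_bank_index(15) == (3, 3)
```

The point of the grouping is that consecutive accesses to different bank groups can overlap more aggressively than accesses within one group, which is part of how DDR4 reaches its higher transfer rates.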

Samsung claims about 21% lower power thanks to the drop in operating voltage (1.5V to 1.2V). Low Power DDR4 will run at 1.05V and will lower power usage even further. But there is more to DDR4 than a lower voltage: Samsung claims that, when both are manufactured on the same process technology, DDR4 runs at two-thirds of the power that DDR3L needs.
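As a rough sanity check on that voltage drop: CMOS switching power scales approximately with the square of the supply voltage, so the I/O switching component alone shrinks by about a third. Total device power also includes components that scale differently (leakage, the core arrays), which is consistent with Samsung's smaller ~21% whole-chip figure. A quick back-of-the-envelope calculation:

```python
# Back-of-the-envelope only: switching power scales roughly with V^2.
# This estimates the ratio for the I/O switching component, not the
# whole device, which also has leakage and core-array power.

v_ddr3, v_ddr4 = 1.5, 1.2
ratio = (v_ddr4 / v_ddr3) ** 2
print(f"switching power ratio: {ratio:.2f}")  # 0.64, i.e. a ~36% drop
```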

Micron gives a breakdown of the features that make DDR4 more power efficient besides the obvious drop in VDDQ.

Note that the total power efficiency increase is 30-35%, and this is not just a result of the VDD reduction (20%). In that sense, DDR4 is a larger step forward than previous DDR technology transitions. Of course, the 30-35% improvement in power efficiency is measured with the RAM running at the same speed. It's also possible to run DDR4 at much higher speeds (3200 MT/s vs 1866 MT/s) while sacrificing some of the power savings. The DDR4 memory we are using for testing runs at 2100 MT/s, a good compromise between a mild speed increase and power efficiency.
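To put those transfer rates in perspective, peak per-channel bandwidth follows directly from the MT/s figure: a standard DIMM has a 64-bit (8-byte) data bus, and MT/s already counts both clock edges. A small sketch of the arithmetic:

```python
# Peak theoretical bandwidth per memory channel for the transfer rates
# mentioned above. Assumes a standard 64-bit (8-byte) data bus; real
# achievable bandwidth is of course lower.

def peak_bandwidth_gbps(mt_per_s: int, bus_bytes: int = 8) -> float:
    """Peak bandwidth in GB/s for one channel at the given transfer rate."""
    return mt_per_s * bus_bytes / 1000  # MT/s * bytes -> MB/s -> GB/s

print(peak_bandwidth_gbps(1866))  # DDR3-1866: ~14.9 GB/s
print(peak_bandwidth_gbps(2100))  # the DDR4 tested here: 16.8 GB/s
print(peak_bandwidth_gbps(3200))  # DDR4's top speed: 25.6 GB/s
```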

A more elaborate discussion will follow in our next server memory article, but each bank also has much smaller rows (a quarter the size), so the cycle time of the DRAM can be much shorter. The result is lower latency.

The improved signal-to-noise ratio and the extra addressing pins allow DDR4 to support stacks of eight DRAM dies instead of four (DDR3). As a result, DDR4 can support twice the capacity of DDR3 using the same (4-16Gb) DRAM chips. This will require the use of 3D stacking technology, which will take time to implement. However, since 8Gb chips are now in use, registered DIMMs of 32GB should soon be a reality, as well as 64GB LRDIMMs. We'll discuss this in more detail on the next page.
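The capacity math behind those DIMM sizes can be sketched as follows. This assumes x4 DRAM chips, so an ECC rank has 18 chips (16 supplying the 64 data bits, 2 for ECC), with only the 16 data chips counting toward usable capacity; the numbers are illustrative, not a parts list:

```python
# Illustrative DIMM capacity arithmetic, assuming x4 chips and an ECC
# rank of 18 chips, of which 16 carry data. Only the data chips count
# toward the module's usable capacity.

def rdimm_capacity_gb(chip_gbit: int, ranks: int, data_chips: int = 16) -> int:
    """Usable capacity in GB of a registered ECC DIMM."""
    return chip_gbit * data_chips * ranks // 8  # Gbit -> GB

assert rdimm_capacity_gb(chip_gbit=4, ranks=2) == 16  # a common DDR3 RDIMM
assert rdimm_capacity_gb(chip_gbit=8, ranks=2) == 32  # the 32GB RDIMMs above
assert rdimm_capacity_gb(chip_gbit=8, ranks=4) == 64  # a 64GB quad-rank module
```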

Comments

  • LostAlone - Saturday, September 20, 2014 - link

    Given the difference in size between the two companies it's not really all that surprising though. Intel are ten times AMD's size, and I have to imagine that Intel's chip R&D department budget alone is bigger than the whole of AMD. And that is sad really, because I'm sure most of us were learning our computer science when AMD were setting the world on fire, so it's tough to see our young loves go off the rails. But Intel have the money to spend, and can pursue so many more potential avenues for improvement than AMD and that's what makes the difference.
  • Kevin G - Monday, September 8, 2014 - link

    I'm actually surprised they released the 18 core chip for the EP line. In the Ivy Bridge generation, it was the 15 core EX die that was harvested for the 12 core models. I was expecting the same thing here with the 14 core models, though more to do with power binning than raw yields.

    I guess with the recent TSX errata, Intel is just dumping all of the existing EX dies into the EP socket. That is a good means of clearing inventory of a notably buggy chip. When Haswell-EX formally launches, it'll be of a stepping with the TSX bug resolved.
  • SanX - Monday, September 8, 2014 - link

You have teased us with the claim that the added FMA instructions double floating point performance. Wow! Is it still possible to do that with FP units which are already close to the limit, approaching just one clock cycle? This was a good review of integer-related performance, but please work with Ian to continue with the FP one.
  • JohanAnandtech - Monday, September 8, 2014 - link

Ian is working on his workstation-oriented review of the latest Xeon.
  • Kevin G - Monday, September 8, 2014 - link

FMA is commonplace in many RISC architectures. The reason why we're just seeing it now on x86 is that until recently, the ISA only permitted two operands per instruction.

Improvements in this area may be coming down the line even for legacy code. Intel's micro-op fusion has the potential to take an ordinary multiply and add and fuse them into one FMA operation internally. This type of optimization is something I'd like to see in a future architecture (Sky Lake?).
  • valarauca - Monday, September 8, 2014 - link

    The Intel compiler suite I believe already converts

    x *= y;
    x += z;

    into an FMA operation when confronted with them.
  • Kevin G - Monday, September 8, 2014 - link

    That's with source that is going to be compiled. (And don't get me wrong, that's what a compiler should do!)

    Micro-op fusion works on existing binaries, even years-old ones, so no recompile is necessary. However, micro-op fusion may not work in all situations depending on the actual instruction stream. (Hypothetically, the multiply and the add may have to be adjacent in the instruction stream to be fused, but an ancient compiler could have slipped other instructions in between them to hide execution latencies as an optimization, so it would never work in that binary.)
  • DIYEyal - Monday, September 8, 2014 - link

    Very interesting read.
    And I think I found a typo: page 5 (power optimization). It is well known that THE (not needed) Haswell HAS (is/ has been) optimized for low idle power.
  • vLsL2VnDmWjoTByaVLxb - Monday, September 8, 2014 - link

    Colors or labeling for your HPC Power Consumption graph don't seem right.
  • JohanAnandtech - Monday, September 8, 2014 - link

    Fixed, thanks for pointing it out.
