Limitations

Where are the virtualization benchmarks? We only got ESXi running a few days before the launch, after performing a necessary BIOS update. A little bit later, disaster struck: our iSCSI target was gone, as some of the disks in the RAID array failed. Unfortunately, that means we will have to post our virtualization findings in a later article.

The other main limitation of this review is that we did not have sufficient time to experiment with different servers to measure power consumption. We have started asking around to get different kinds of servers in the lab, and we will be updating our tools to measure power draw of the different components inside the servers soon.

Conclusions so Far...

This has been a massive review and there's a lot of information to digest. However, if there is one thing you should remember, it's that no single SKU is the best in every situation. The results vary enormously depending on the workload. Some workloads, like our kernel compilation test, prefer the higher clocked SKUs, and those who thought the 14-core and 18-core processors at 2.3GHz would only excel in easily scaling software are wrong. Turbo Boost has improved vastly, and the massive core monsters can deftly wield this weapon when few threads are running.

The Xeon E5-2695 v3 is an interesting SKU for those searching for high performance in integer workloads. It is also relatively power efficient, never asking for too many amps, and it performs very well in almost every (integer!) application. Of course the price tag is heavy, and it only makes sense if you can use all that processing power.

It is clear that server buyers could really benefit from some serious competition in the market, but you can hardly blame Intel at this stage. We hope that AMD can make a comeback in 2015. If not, it does not look like Intel will have any real competition in the midrange server market.

The Xeon E5-2650L v3, however, is the true star of this review. It is power efficient (obviously) and, contrary to previous low power offerings, it still offers a good response time. Perhaps more surprising is that it even performs well in our FP intensive applications.

At the other end of the spectrum, the Xeon E5-2699 v3 is much more power hungry than we are used to from a high end part. It shines in SAP, where hardware costs are dwarfed by the consulting invoices, and delivers maximum performance in HPC. However, the peak power draw of this CPU is nothing to laugh about. Of course, the HPC crowd are used to power hogs (e.g. GPGPU), but there's a reason Intel doesn't usually offer >130W TDP processors.

Considering the new Haswell EP processors will require a completely new platform – motherboards, memory, and processors all need to be upgraded – at least initially the parts will mostly be of interest to new server buyers. There are also businesses that demand the absolute fastest servers available and they'll be willing to upgrade, but for many the improvements with Haswell EP may not be sufficient to entice them into upgrading. The arrival of the 14 nm Broadwell EP will likely be a better time to upgrade servers, but that's still a year or so away.

85 Comments

  • LostAlone - Saturday, September 20, 2014

    Given the difference in size between the two companies it's not really all that surprising though. Intel are ten times AMD's size, and I have to imagine that Intel's chip R&D department budget alone is bigger than the whole of AMD. And that is sad really, because I'm sure most of us were learning our computer science when AMD were setting the world on fire, so it's tough to see our young loves go off the rails. But Intel have the money to spend, and can pursue so many more potential avenues for improvement than AMD and that's what makes the difference.
  • Kevin G - Monday, September 8, 2014

    I'm actually surprised they released the 18 core chip for the EP line. In the Ivy Bridge generation, it was the 15 core EX die that was harvested for the 12 core models. I was expecting the same thing here with the 14 core models, though more to do with power binning than raw yields.

    I guess with the recent TSX errata, Intel is just dumping all of the existing EX dies into the EP socket. That is a good means of clearing inventory of a notably buggy chip. When Haswell-EX formally launches, it'll be of a stepping with the TSX bug resolved.
  • SanX - Monday, September 8, 2014

    You have teased us with the claim that the added FMA instructions double floating point performance. Wow! Is it still possible to do that with FP operations, which are already close to the limit of just one clock cycle? This was a good review of integer related performance, but please team up with Ian to continue with the FP one.
  • JohanAnandtech - Monday, September 8, 2014

    Ian is working on his workstation oriented review of the latest Xeon.
  • Kevin G - Monday, September 8, 2014

    FMA is commonplace in many RISC architectures. The reason why we're just seeing it now on x86 is that, until recently, the ISA only permitted two operands per instruction.

    Improvements in this area may be coming down the line even for legacy code. Intel's micro-op fusion has the potential to take an ordinary multiply and add and fuse them into one FMA operation internally. This type of optimization is something I'd like to see in a future architecture (Skylake?).
  • valarauca - Monday, September 8, 2014

    The Intel compiler suite, I believe, already converts

    x *= y;
    x += z;

    into an FMA operation when confronted with them.
  • Kevin G - Monday, September 8, 2014

    That's with source that is going to be compiled. (And don't get me wrong, that's what a compiler should do!)

    Micro-op fusion works on existing binaries years old so there is no recompile necessary. However, micro-op fusion may not work in all situations depending on the actual instruction stream. (Hypothetically the fusion of a multiply and an add in an instruction stream may have to be adjacent to work but an ancient compiler could have slipped in some other instructions in between them to hide execution latencies as an optimization so it'd never work in that binary.)
  • DIYEyal - Monday, September 8, 2014

    Very interesting read.
    And I think I found a typo: page 5 (power optimization). It is well known that THE (not needed) Haswell HAS (is/ has been) optimized for low idle power.
  • vLsL2VnDmWjoTByaVLxb - Monday, September 8, 2014

    Colors or labeling for your HPC Power Consumption graph don't seem right.
  • JohanAnandtech - Monday, September 8, 2014

    Fixed, thanks for pointing it out.
