Limitations

Where are the virtualization benchmarks? We only got ESXi running a few days before the launch, after performing a necessary BIOS update. Shortly afterwards, disaster struck: we lost our iSCSI target when some of the disks in the RAID array failed. Unfortunately, that means we will have to post our virtualization findings in a later article.

The other main limitation of this review is that we did not have sufficient time to experiment with different servers to measure power consumption. We have started asking around to get different kinds of servers in the lab, and we will be updating our tools to measure power draw of the different components inside the servers soon.

Conclusions so Far...

This has been a massive review and there's a lot of information to digest. However, if there is one thing you should remember, it's that no single SKU is the best in every situation. The results vary enormously depending on the workload. Some workloads, like our kernel compilation test, prefer the higher clocked SKUs, and those who thought the 14-core and 18-core processors at 2.3GHz would only excel in easily scaling software are wrong. Turbo Boost has improved vastly, and the massive core monsters can deftly wield this weapon when few threads are running.

The Xeon E5-2695 v3 is an interesting SKU for those searching for high performance in integer workloads. It is also relatively power efficient, never asking for too many amps, and it performs very well in almost every (integer!) application. Of course the price tag is heavy, and it only makes sense if you can use all that processing power.

It is clear that server buyers could really benefit from some serious competition in the market, but you can hardly blame Intel at this stage. We hope that AMD can make a comeback in 2015. If not, it does not look like Intel will have any real competition in the midrange server market.

The Xeon E5-2650L v3, however, is the true star of this review. It is power efficient (obviously) and, contrary to previous low power offerings, it still offers a good response time. Perhaps more surprising is that it even performs well in our FP intensive applications.

At the other end of the spectrum, the Xeon E5-2699 v3 is much more power hungry than we are used to from a high end part. It shines in SAP where hardware costs are dwarfed by the consulting invoices and delivers maximum performance in HPC. However, the peak power draw of this CPU is nothing to laugh about. Of course, the HPC crowd are used to powerhogs (e.g. GPGPU), but there's a reason Intel doesn't usually offer >130W TDP processors.

Considering that the new Haswell EP processors require a completely new platform (motherboards, memory, and processors all need to be upgraded), the parts will, at least initially, mostly be of interest to new server buyers. There are also businesses that demand the absolute fastest servers available and they'll be willing to upgrade, but for many the improvements with Haswell EP may not be sufficient to entice them into upgrading. The launch of the 14 nm Broadwell EP will likely be a better time to update servers, but that's still a year or so away.

85 Comments

  • martinpw - Monday, September 8, 2014

    There is a nice tool called i7z (can google it). You need to run it as root to get the live CPU clock display.
  • kepstin - Monday, September 8, 2014

    Most Linux distributions provide a tool called "turbostat" which prints statistical summaries of real clock speeds and c state usage on Intel cpus.
  • kepstin - Monday, September 8, 2014

    Note that if turbostat is missing or too old (doesn't support your cpu), you can build it yourself pretty quick - grab the latest linux kernel source, cd to tools/power/x86/turbostat, and type 'make'. It'll build the tool in the current directory.
  • julianb - Monday, September 8, 2014

    Finally the e5-xxx v3s have arrived. I too can't wait for the Cinebench and 3DS Max benchmark results.
    Any idea if now that they are out the e5-xxxx v2s will drop down in price?
    Or Intel doesn't do that...
  • MrSpadge - Tuesday, September 9, 2014

    Correct, Intel does not really lower prices of older CPUs. They just gradually phase out.
  • tromp - Monday, September 8, 2014

    As an additional test of the latency of the DRAM subsystem, could you please run the "make speedup" scaling benchmark of my Cuckoo Cycle proof-of-work system at https://github.com/tromp/cuckoo ?
    That will show if 72 threads (2 cpus with 18 hyperthreaded cores) suffice to saturate the DRAM subsystem with random accesses.

    -John
  • Hulk - Monday, September 8, 2014

    I know this is not the workload these parts are designed for, but just for kicks I'd love to see some media encoding/video editing apps tested. Just to see what this thing can do with a well coded mainstream application. Or to see where the apps fade out core-wise.
  • Assimilator87 - Monday, September 8, 2014

    Someone benchmark F@H bigadv on these, stat!
  • iwod - Tuesday, September 9, 2014

    I am looking forward to a native 16-core die on 14nm Broadwell next year, when DDR4 has matured and pricing is much better.
  • Brutalizer - Tuesday, September 9, 2014

    Yawn, the new upcoming SPARC M7 cpu has 32 cores. SPARC has had 16 cores for ages. Since some generations back, the SPARC cores are able to dedicate all resources to one thread if need be. This way the SPARC core can have one very strong thread, or massive throughput (many threads). The SPARC M7 cpu is 10 billion transistors:
    http://www.enterprisetech.com/2014/08/13/oracle-cr...
    and it will be 3-4x faster than the current SPARC M6 (12 cores, 96 threads) which holds several world records today. The largest SPARC M7 server will have 32 sockets, 1024 cores, 64TB RAM and 8,192 threads. One SPARC M7 cpu will be as fast as an entire Sunfire 25K. :)

    The largest Xeon E5 server will top out at 4-sockets probably. I think the Xeon E7 cpus top out at 8-socket servers. So, if you need massive RAM (more than 10TB) and massive performance, you need to venture into Unix server territory, such as SPARC or POWER. Only they have 32-socket servers capable of reaching the highest performance.

    Of course, the SGI Altix/UV2000 servers have 10,000s of cores and 100s of TBs of RAM, but they are clusters, like a tiny supercomputer, only doing HPC number crunching workloads. You will never find these large Linux clusters running SAP Enterprise workloads; there are no such SAP benchmarks, because clusters suck at non-HPC workloads.

    -Clusters are typically serving one user who picks which workload to run for the next few days. All SGI benchmarks are HPC; not a single Enterprise benchmark exists, for instance for SAP or other Enterprise systems. They serve one user.

    -Large SMP servers with as many as 32 sockets (or even 64-sockets!!!) are typically serving thousands of users, running Enterprise business workloads, such as SAP. They serve thousands of users.
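
As a footnote to kepstin's turbostat tip earlier in the thread, the build steps described there (go to tools/power/x86/turbostat inside a kernel source tree and run make) can be sketched as a small shell function. This is a minimal sketch, not an official procedure: the function name build_turbostat and the ~/src/linux path are made up for illustration, and it assumes you have already downloaded a kernel source tree and have make and a C compiler installed.

```shell
#!/bin/sh
# Sketch: build turbostat from a kernel source tree that is already on disk.
# build_turbostat is a hypothetical helper name, not part of any distribution.
build_turbostat() {
    kernel_src="$1"
    tool_dir="$kernel_src/tools/power/x86/turbostat"

    # Bail out early if the tree does not contain the turbostat directory.
    if [ ! -d "$tool_dir" ]; then
        echo "turbostat source not found in $tool_dir" >&2
        return 1
    fi

    # Run make in a subshell so the caller's working directory is unchanged;
    # the binary is built inside $tool_dir.
    ( cd "$tool_dir" && make )
}

# Example (assumes the kernel source lives in ~/src/linux, an illustrative path):
#   build_turbostat "$HOME/src/linux" \
#     && sudo "$HOME/src/linux/tools/power/x86/turbostat/turbostat"
```

The function only builds the tool; running it is left as a commented example because turbostat normally needs root privileges to read the CPU counters.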
