LRDIMMs: Capacity and Real World Performance

As we have shown that Intel invested a lot of time to improve the support for LRDIMMs, we also wanted to do a few real world tests to understand when capacity or higher speed at high DPC matters.

First we test with our CDN test. We simulate our CDN test with one CDN server and three client machines on a 10 Gbit/s network. Each client machine simulates thousands of users requesting different files. Our server runs Ubuntu 13.04 with Apache.

The static files requested originate from sourceforge.org, a mirror of the /a directory, containing 1.4TB of data and 173,000 files. To model the real world load for the CDN two different usage patterns or workloads are executed simultaneously, one that accesses a limited set of files more frequently and a second that accesses less frequently requested files. This workload simulates users that are requesting both current as well as older or less frequently accessed files. You can read more about our CDN test here. We used the Xeon E5-2695 v3.

Content Delivery Network test

There is no doubt about it: some applications are all about caching and LRDIMMs are incredibly useful in these situations. Some HPC workloads supposedly require a lot of memory as well and can be very sensitive to memory bandwidth. Really? Even with a 35MB L3 cache? We decided to find out.

Actiflow OpenFOAM – influence of memory

LRDIMMs are just as fast as RDIMM in a 1DPC configuration. Once you plug in two DIMMs per channel, LRDIMMs outperform RDIMMs by 18%. That is quite surprising considering that our CPU is outfitted with an exceptionally large L3. This test clearly shows there are still applications out there craving more memory bandwidth. In the case of OpenFOAM, the amount of bandwidth largely determines how many cores your CPU can keep busy.

Energy and HPC Conclusions so Far...
Comments Locked

85 Comments

View All Comments

  • bsd228 - Friday, September 12, 2014 - link

    Now go price memory for M class Sun servers...even small upgrades are 5 figures and going 4 years back, a mid sized M4000 type server was going to cost you around 100k with moderate amounts of memory.

    And take up a large portion of the rack. Whereas you can stick two of these 18 core guys in a 1U server and have 10 of them (180 cores) for around the same sort of money.

    Big iron still has its place, but the economics will always be lousy.
  • platinumjsi - Tuesday, September 9, 2014 - link

    ASRock are selling boards with DDR3 support, any idea how that works?

    http://www.asrockrack.com/general/productdetail.as...
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Well... ASRock is generally famous "marrying" different gen hardware.
    But here, since this is about DDR RAM, governed by the CPU itself (because memory controller is inside the CPU), then my only guess is Xeon E5 v3 may have dual-mode memory controller (supporting either DDR4 or DDR3), similarly as Phenom II had back in 2009-2011, which supported either DDR2 or DDR3, depending on where you plugged it in.

    If so, then probably just the performance of E5 v3 with DDR3 may be somewhat inferior in comparison with DDR4.
  • alpha754293 - Tuesday, September 9, 2014 - link

    No LS-DYNA runs? And yes, for HPC applications, you actually CAN have too many cores (because you can't keep the working cores pegged with work/something to do, so you end up with a lot of data migration between cores, which is bad, since moving data means that you're not doing any useful work ON the data).

    And how you decompose the domain (for both LS-DYNA and CFD makes a HUGE difference on total runtime performance).
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    No, I hope to get that one done in the more Windows/ESXi oriented review.
  • Klimax - Tuesday, September 9, 2014 - link

    Nice review. Next stop: Windows Server. (And MS-SQL..)
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    Agreed. PCIe Flash and SQL server look like a nice combination to test this new Xeons.
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Xeon 5500 series (Nehalem-EP): up to 4 cores (45 nm)
    Xeon 5600 series (Westmere-EP): up to 6 cores (32 nm)
    Xeon E5 v1 (Sandy Bridge-EP): up to 8 cores (32 nm)
    Xeon E5 v2 (Ivy Bridge-EP): up to 12 cores (22 nm)
    Xeon E5 v3 (Haswell-EP): up to 18 cores (22 nm)

    So, in this progression, core count increases by 50% (1.5 times) almost each generation.

    So, what's gonna be next:

    Xeon E5 v4 (Broadwell-EP): up to 27 cores (14 nm) ?

    Maybe four rows with 5 cores and one row with 7 cores (4 x 5 + 7 = 27) ?
  • wallysb01 - Wednesday, September 10, 2014 - link

    My money is on 24 cores.
  • SuperVeloce - Tuesday, September 9, 2014 - link

    What's the story with 2637v3? Only 4 cores and the same freqency and $1k price as 6core 2637v2? By far the most pointless cpu on the list.

Log in

Don't have an account? Sign up now