Memory Subsystem: Latency

To measure latency, we use the open-source TinyMemBench benchmark. The source was compiled for x86 with gcc 4.8.2, with optimization set to "-O2". The measurement is described well by the TinyMemBench manual:

Average time is measured for random memory accesses in the buffers of different sizes. The larger the buffer, the more significant the relative contributions of TLB, L1/L2 cache misses, and DRAM accesses become. All the numbers represent extra time, which needs to be added to L1 cache latency (4 cycles).
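
To make the access pattern concrete, here is a minimal Python sketch of a random-access latency loop. It is not TinyMemBench's code (the benchmark is written in C and generates its addresses differently); it swaps in pointer chasing over a randomly permuted buffer, a common way to force dependent, unpredictable reads. The names build_chain and chase are hypothetical, and because interpreter overhead dominates in Python, the timings illustrate the method rather than real hardware latency.

```python
import random
import time
from array import array


def build_chain(n_slots, seed=0):
    """Return an array where chain[i] is the next slot to visit.

    Sattolo's algorithm yields a permutation with a single cycle, so a
    walk starting at slot 0 touches every slot once before returning.
    """
    rng = random.Random(seed)
    chain = array("q", range(n_slots))  # 8 bytes per slot
    for i in range(n_slots - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain, steps):
    """Follow the chain for `steps` dependent reads.

    Returns the final slot index and the average time per read in ns.
    """
    idx = 0
    start = time.perf_counter()
    for _ in range(steps):
        idx = chain[idx]  # each read depends on the previous one
    elapsed = time.perf_counter() - start
    return idx, elapsed / steps * 1e9


if __name__ == "__main__":
    # Sweep a few buffer sizes (in bytes), as the benchmark does.
    for size_bytes in (64 * 1024, 1024 * 1024, 8 * 1024 * 1024):
        chain = build_chain(size_bytes // 8)
        _, ns_per_read = chase(chain, 200_000)
        print(f"{size_bytes // 1024:>6} KB: {ns_per_read:.1f} ns per read")
```

A C version of the same loop would be needed to get numbers comparable to the ones discussed in this article.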

We tested with dual random read, as we wanted to see how the memory system coped with multiple read requests. To keep the graph readable, we limited ourselves to the CPUs that differ from each other.

L3 caches have grown significantly over the past few years, but it is not all good news. The L3 cache of the Xeon E3 responds very quickly (about 10 ns, or fewer than 30 cycles at 2.8 GHz), while the L3 cache of the new generation needs almost twice as much time to respond (about 20 ns, or roughly 50 cycles at 2.6 GHz). Larger L3 caches are not always a blessing and can come at the cost of higher latency. Some applications, such as search engines and HPC applications that work on huge amounts of data, have only a relatively small portion of cacheable data and instructions.
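
The cycle counts quoted for the two L3 caches follow from multiplying the latency in nanoseconds by the clock speed in GHz. The short Python sketch below redoes that arithmetic; the function name ns_to_cycles is ours, and the inputs are the rounded figures from the text rather than raw measurements.

```python
def ns_to_cycles(latency_ns, clock_ghz):
    """Convert a latency in nanoseconds to clock cycles.

    One GHz is one cycle per nanosecond, so cycles = ns * GHz.
    """
    return latency_ns * clock_ghz


# Rounded figures from the text: Xeon E3 L3 at 2.8 GHz,
# newer-generation L3 at 2.6 GHz.
xeon_e3_l3 = ns_to_cycles(10, 2.8)   # about 28 cycles
new_gen_l3 = ns_to_cycles(20, 2.6)   # about 52 cycles

print(f"Xeon E3 L3: {xeon_e3_l3:.0f} cycles")
print(f"New generation L3: {new_gen_l3:.0f} cycles")
```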

It gets worse for the "large L3 cache" models when we look at latency of accessing memory (measured at 64 MB): 

Latency in memory

The higher L3 cache latency makes memory accesses more costly for the Xeon E5. Despite having access to DDR4-2133 DIMMs, the Xeon E5-2650L accesses memory more slowly than the Xeon E3-1230L. Memory latency is also a major weakness of the Atom C2750, which has a much less sophisticated memory controller and prefetching.

90 Comments

  • extide - Wednesday, June 24, 2015 - link

    So, I was thinking last night, that this chip is THE PERFECT enthusiast chip! All Intel needs to do is release an unlocked and socketed version (although that would be complex because there is currently no platform for it ...) although if we could get at least an unlocked version on an enthusiast style board it would be awesome.

    Think about it:
    8 Broadwell cores -- Great!
    12MB L3 -- Great!
    24 Lanes PCIe 3.0 -- More than 16 or even Skylake's rumored 20, pretty good. You could do things like 16x + 8x, or 8x + 8x + 4x + 4x (the two 4x being M.2 SSDs), which would support CF or SLI quite well and some fast SSDs.
    2ch DDR4 -- plenty for gaming and most enthusiast applications
    Dual 10GbE -- Just added gravy here, but would def help adoption of 10G in the enthusiast realm.

    COME ON INTEL!!
  • extide - Wednesday, June 24, 2015 - link

    Also, I forgot to add:

    This would be a great intermediate between the current regular consumer stuff (LGA 115x) and HPDE (LGA 2011x) -- A lot of people really see the LGA 2011 platform as overkill, even for enthusiasts, and it gets so expensive, with quad channel DDR4 and all that. This chip just seems to make so much sense. Now if Intel priced it no more than the $500 mark, that would be awesome. Imagine, if AMD was more competitive, we might actually have that scenario.... Hopefully Zen is just great!
  • Namisecond - Saturday, June 27, 2015 - link

    Intel's tray price for this chip is listed at $199 for the 4-core and $581 for the 8-core. The price for the CPU+motherboard is almost $1K for the 8-core, which indicates the problem is not in the price of the chip itself.

    If you want cheap and low power consumption, I'd direct you to the S1150 platform with Xeon E3 V3 "L" series (13-45W) processors.
  • spikebike - Wednesday, June 24, 2015 - link

    For a home machine, small server, workstation, or similar, the Xeon D 1520 looks even better. Faster clock, 1/3rd the price, same maximum RAM, ECC, etc. Sure, it's got 4 cores/8 threads instead of more, but for many use cases that's not a big limitation. In quite a few cases spending the $400 difference on RAM or SSDs will make a bigger difference.
  • hifiaudio2 - Thursday, June 25, 2015 - link

    Where can you get a 1520? Google searching is not finding anything for sale...
  • hifiaudio2 - Thursday, June 25, 2015 - link

    If I cannot find the 1520 for sale, what is the best bang for the buck i3 and MB combo (want to use ECC ram as well) for a Media server/transcode/nas? Low TDP, etc..
  • jaziniho - Thursday, June 25, 2015 - link

    Any word on whether HP plans to make a Moonshot cartridge featuring Xeon D? The 45W TDP seems to match up with some of the previous chips they have used.
  • jeffsci - Monday, June 29, 2015 - link

    Why do the results use a variety of OSS compilers? For an Intel Xeon processor, the Intel compilers are the most reliable. Is Open64 actively developed for Intel processors? And switching from GCC 4.8 to 4.9 with different flags...how is this even remotely scientific?
  • needforsuv - Saturday, July 11, 2015 - link

    so they just done to the 'regular' 4/8 i7/e3 what they did to the C2D in making the C2Q but more sophisticated I like it now wheres that lga 115x 8 core
  • tabascosauz - Sunday, July 19, 2015 - link

    I hope that Mr. de Gelas will continue to learn and improve as a writer, because the grammar in this article is, in numerous places, rather iffy and AT has traditionally excelled in delivering detailed, grammatically correct content.
