Memory Subsystem: Latency

To measure latency, we use the open-source TinyMemBench benchmark. The source was compiled for x86 with gcc 4.8.2 at optimization level "-O2". The TinyMemBench manual describes the measurement well:

Average time is measured for random memory accesses in the buffers of different sizes. The larger the buffer, the more significant the relative contributions of TLB, L1/L2 cache misses, and DRAM accesses become. All the numbers represent extra time, which needs to be added to L1 cache latency (4 cycles).

We tested with dual random read, as we wanted to see how the memory system coped with multiple read requests. To keep the graph readable, we limited ourselves to the CPUs that differ from each other.
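To make the access pattern concrete, here is a minimal sketch of a random-read latency test. It is not TinyMemBench's code: we assume a pointer-chasing approach, where a buffer is filled with a random single cycle of indices and every read depends on the result of the previous one. The names `build_random_cycle`, `chase`, `chase_dual`, and `time_ns_per_read` are hypothetical. In Python the interpreter overhead dominates the timings, so the output only illustrates the single versus dual read pattern and cannot be compared with the numbers in this article.

```python
import random
import time


def build_random_cycle(n, seed=0):
    """Return a list nxt where following nxt[i] visits all n slots exactly once.

    Uses Sattolo's algorithm, which always produces one single cycle.
    """
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i keeps the permutation a single n-cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Single random read: each read depends on the previous one."""
    i = start
    for _ in range(steps):
        i = nxt[i]
    return i


def chase_dual(nxt, steps, start_a, start_b):
    """Dual random read: two independent dependency chains advance together."""
    a, b = start_a, start_b
    for _ in range(steps):
        a = nxt[a]
        b = nxt[b]
    return a, b


def time_ns_per_read(fn, reads):
    """Run fn once and return the average wall-clock time per read in ns."""
    t0 = time.perf_counter()
    fn()
    return (time.perf_counter() - t0) * 1e9 / reads


if __name__ == "__main__":
    steps = 200_000
    for n in (2**10, 2**14, 2**18):  # buffer sizes in slots, assumed values
        nxt = build_random_cycle(n)
        single = time_ns_per_read(lambda: chase(nxt, steps), steps)
        dual = time_ns_per_read(
            lambda: chase_dual(nxt, steps, 0, n // 2), 2 * steps
        )
        print(f"{n:>7} slots: single {single:6.1f} ns/read, dual {dual:6.1f} ns/read")
```

In a compiled benchmark, the two chains in `chase_dual` let the memory system overlap two outstanding misses, which is why a dual random read shows how well it copes with multiple requests.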

L3 caches have grown significantly over the past few years, but it is not all good news. The L3 cache of the Xeon E3 responds very quickly (about 10 ns, or fewer than 30 cycles at 2.8 GHz), while the L3 cache of the new generation needs almost twice as much time to respond (about 20 ns, or roughly 50 cycles at 2.6 GHz). Larger L3 caches are not always a blessing and can come at the cost of higher latency: some applications, such as search engines and HPC applications that work on huge amounts of data, have only a relatively small share of cacheable data and instructions.
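The cycle counts quoted here follow from a simple conversion: at 1 GHz a core completes one cycle per nanosecond, so cycles = latency in ns × clock in GHz. A small sketch, using only the rounded latencies and clock speeds mentioned above (the helper name `ns_to_cycles` is our own):

```python
def ns_to_cycles(latency_ns, clock_ghz):
    """Convert a latency in nanoseconds to core clock cycles.

    1 GHz equals one cycle per nanosecond, so the product is a cycle count.
    """
    return latency_ns * clock_ghz


# Rounded figures from the text, not exact measurements.
print(ns_to_cycles(10, 2.8))  # Xeon E3 L3 cache: about 28 cycles
print(ns_to_cycles(20, 2.6))  # new generation L3 cache: about 52 cycles
```

Because the inputs are rounded, the results are approximations too, which is why the text speaks of "fewer than 30" and "roughly 50" cycles.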

It gets worse for the "large L3 cache" models when we look at the latency of accessing memory (measured at 64 MB):

Latency in memory

The higher L3-cache latency makes memory accesses more costly for the Xeon E5. Despite having access to DDR4-2133 DIMMs, the Xeon E5-2650L accesses memory more slowly than the Xeon E3-1230L. Memory latency is also a major weakness of the Atom C2750, which has a much less sophisticated memory controller and prefetching.

90 Comments

  • Krysto - Tuesday, June 23, 2015 - link

    Betteridge law.
  • Metaluna - Tuesday, June 23, 2015 - link

    ...fails in this case. Did you read the review?
  • CajunArson - Tuesday, June 23, 2015 - link

    While desktop Broadwell isn't all that great, these server parts really show off Intel's accomplishments in improving power efficiency and performance-per-watt with 14nm.

    ARM has a huge hill to climb to really compete with these parts, and we've already seen AMD effectively skip its first iteration of an ARM product because they probably got wind of the Xeon D and decided they would have to do both a die-shrink and completely customized ARM core just to keep up.
  • The_Assimilator - Tuesday, June 23, 2015 - link

    I very much doubt whether we'll ever see another server CPU from AMD, regardless of ARM cores or not. If they even manage to get Zen out the door, *and* it's not another massive flop, I will be impressed.
  • Refuge - Tuesday, June 23, 2015 - link

    I root for them every day, but let's not give them too big of a hill to climb with a broken leg now. lol
  • extide - Tuesday, June 23, 2015 - link

    Take it easy man, AMD is not going down the drain any time soon, and we WILL see some future server-oriented parts come from them. But how fast will they be? That's the question and we won't know for a while...
  • Kjella - Tuesday, June 23, 2015 - link

    Really? Last quarter they had a $187 million total comprehensive loss on $1030 million in revenue, even if you exclude the restructuring cost they lost $100 million for a -10% deficit. The stockholder's equity is almost gone with $17 million left, after that getting funding or a credit limit will become much harder.

    And Q2 is probably going to be another bloody quarter with no major CPU or GPU launches and firesales of old Win8 stock in preparation for Win10. The console ramp-up is usually in Q3 in preparation for Christmas, not before the summer. Last quarter's loss they took almost entirely from their cash reserves, they're now in the lower end of what they need to operate, if they lose this quarter too they must cut where it hurts bad.
  • Guspaz - Tuesday, June 23, 2015 - link

    When we needed a low-power and low-cost server solution, we went with a desktop i3, because for some reason Intel supports ECC RAM on the i3 and lower, but not in the i5 and higher.
  • julianb - Tuesday, June 23, 2015 - link

    Very interested in this SOC.

    If possible, could we see how the Xeon D deals with the Cinebench multithreaded test?
    I am into 3D CPU rendering and would like to know how the Xeon D-1540 compares to, say, the i7-3930K or i7-4790K.
    I realize the purpose of Xeon D-1540's existence is different but still...
    Thank you.
  • MrSpadge - Saturday, June 27, 2015 - link

    An eco-tuned 5820K seems better. I don't suppose you're going to render 24/7 all the time, so the electricity savings from the 14 nm Broadwell will have a hard time making up for the massive difference in initial cost.
