Memory Subsystem: Latency

To measure latency, we use the open-source TinyMemBench benchmark. The source was compiled for x86 with gcc 4.8.2, with optimization set to "-O2". The measurement is described well by the TinyMemBench manual:

Average time is measured for random memory accesses in the buffers of different sizes. The larger the buffer, the more significant the relative contributions of TLB, L1/L2 cache misses, and DRAM accesses become. All the numbers represent extra time, which needs to be added to L1 cache latency (4 cycles).

We tested with dual random read, as we wanted to see how the memory system coped with multiple read requests. To keep the graph readable we limited ourselves to the CPUs that were different.
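To make the idea of "random memory accesses in buffers of different sizes" concrete, here is a minimal Python sketch of a random pointer chase. It is not TinyMemBench's implementation: it swaps in a common alternative technique (a single-cycle random permutation built with Sattolo's algorithm, followed by dependent loads), and the names `build_random_cycle` and `chase` are made up for this illustration. Interpreter overhead dominates the timings, so the absolute numbers are not comparable to the results in this article; only the access pattern and the growth with buffer size are illustrated.

```python
import random
import time


def build_random_cycle(n, seed=0):
    """Build a permutation nxt in which following i -> nxt[i] visits all
    n slots exactly once before returning to the start (Sattolo's algorithm)."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, which guarantees one single cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps):
    """Perform `steps` dependent accesses starting at slot 0.

    Each access needs the result of the previous one, so the accesses
    cannot overlap. Returns (final index, average seconds per access)."""
    idx = 0
    start = time.perf_counter()
    for _ in range(steps):
        idx = nxt[idx]
    elapsed = time.perf_counter() - start
    return idx, elapsed / steps


if __name__ == "__main__":
    # Hypothetical buffer sizes (in slots); larger buffers spill out of the caches.
    for slots in (1 << 10, 1 << 14, 1 << 18, 1 << 22):
        chain = build_random_cycle(slots)
        _, per_access = chase(chain, 1_000_000)
        print(f"{slots:>8} slots: {per_access * 1e9:.1f} ns per access")
```

Because the permutation forms one cycle, the chase touches every slot before repeating, which keeps a hardware prefetcher from guessing the next address.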

L3 caches have grown significantly over the past few years, but it is not all good news. The L3 cache of the Xeon E3 responds very quickly (about 10 ns, or fewer than 30 cycles at 2.8 GHz), while the L3 cache of the new generation needs almost twice as much time to respond (about 20 ns, or roughly 50 cycles at 2.6 GHz). Larger L3 caches are not always a blessing and can come at the cost of higher latency: some applications, such as search engines and HPC applications that work on huge amounts of data, have only a relatively small portion of cacheable data and instructions.
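The cycle counts quoted above follow from multiplying the latency in nanoseconds by the clock speed in GHz. The short Python sketch below checks that arithmetic; the helper name `ns_to_cycles` is made up for this illustration, and the inputs are the rounded figures from the text rather than exact measurements.

```python
def ns_to_cycles(latency_ns, clock_ghz):
    """Convert a latency in nanoseconds to clock cycles.

    A clock of f GHz completes f cycles per nanosecond."""
    return latency_ns * clock_ghz


# Rounded figures taken from the text above.
xeon_e3_l3 = ns_to_cycles(10, 2.8)   # about 28 cycles, i.e. "fewer than 30"
new_gen_l3 = ns_to_cycles(20, 2.6)   # about 52 cycles, i.e. "roughly 50"

print(f"Xeon E3 L3: ~{xeon_e3_l3:.0f} cycles")
print(f"New generation L3: ~{new_gen_l3:.0f} cycles")
```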

It gets worse for the "large L3 cache" models when we look at the latency of accessing memory (measured at 64 MB):

Latency in memory

The higher L3 cache latency makes memory accesses more costly in terms of latency for the Xeon E5. Despite having access to DDR4-2133 DIMMs, the Xeon E5-2650L accesses memory more slowly than the Xeon E3-1230L. Memory latency is also a major weakness of the Atom C2750, which has a much less sophisticated memory controller and prefetcher.


90 Comments


  • Kjella - Tuesday, June 23, 2015 - link

    Server on a chip? It's not intended for use with a display, it does all it's "supposed to" do for the hyperscale market without any display.
  • close - Tuesday, June 23, 2015 - link

    "Intel was able to combine 8 of them together with dual 10 Gbit, 4 USB 3.0 controllers, 6 SATA 3 controller and quite a bit more".
    This ^^ makes it a SoC. Ok, a video output would be nice but that certainly doesn't disqualify it.
  • ats - Tuesday, June 23, 2015 - link

    cause video isn't required or even wanted in this market segment. It is a SoC, which simply means system on a chip and doesn't have some ironclad definition. Hell, most "SoC" chips aren't really systems on a chip anyways and require significant supporting logic (this is true for just about any cell phone SoC on the market too).
  • bill.rookard - Tuesday, June 23, 2015 - link

    Exactly, you would tend to use remote management over the network to admin this type of a unit. I have several rackmounted servers in my basement (I do some home-serving of websites over a business class connection) and while I do have them actually hooked up to a display, I can hardly remember the last time I looked at them as 99.9% of the time I SSH into everything for administration.

    About the only time you'd ever really use a display is if you were doing multiple VMs of assorted types. Beyond that, it's wattage wasted.
  • ats - Tuesday, June 23, 2015 - link

    Yeah honestly, having several SM boards with their ILM system, the only time I'd ever hook up a display is if the network was down. The SM ILM will fully proxy pretty much anything you want and give you a 1200p display that works for just about anything. And you can remotely hook up CDs, DVDs, BRs, USB, etc through it along with the stand console and keyboard/mouse functions. It's a very nice solution.
  • nightbringer57 - Tuesday, June 23, 2015 - link

    Basically, you don't need video output.
    Even if you do, mainboard manufacturers usually include a third-party chip with dedicated functions that, along other things, provide a VGA port usable for a server use.
    In this case, the AST2400 chip offers some basic GPU functions with a VGA port along with many remote control-related stuff.
    Adding all those functions to the Intel SoC would be awfully expensive. The chip only requires a simple PCIe x1 connection from the SoC, but provides hundreds of additional pins. Not only would those functions probably be hard to implement on a relatively recent 14nm process, but it would require at least 300 new pins on the SoC to add all the 3rd party chip's functions on it, which is almost impossible to do.
  • Th-z - Tuesday, June 23, 2015 - link

    There doesn't seem to be a concrete definition for the term SoC, but the SoC bandwagon has become ridiculous. Everything seems to be called an "SoC" these days as long as a chip has more than one function integrated. One example is that people even call the current consoles' integrated CPU and GPU chip an SoC, even though it has no networking or other peripheral units in it. When a system has so many "SoCs" inside, the term really has lost its meaning and significance.
  • redzo - Tuesday, June 23, 2015 - link

    I'm thinking this is a bad name for a product like this. It reminds of the infamous Celeron D and Pentium D line.
  • nandnandnand - Tuesday, June 23, 2015 - link

    Anyone who can figure out Xeon D exists can probably tell the difference
  • wussupi83 - Tuesday, June 23, 2015 - link

    I agree with redzo, I think anyone who can figure out a 'Xeon D' exists AND remembers that Pentium & Celeron D's existed would initially assume this is a budget Xeon - which it's clearly not. E4 sounds pretty logical. But sure lets just put D...
