Memory Performance

Seeing as how the huge L3 cache and quad-channel memory interface are big parts of what makes Ivy Bridge E unique, I thought it might make sense to look at memory latency and bandwidth. We'll start with memory latency, compared to Ivy Bridge, Haswell and Haswell + Crystalwell.

The larger L3 cache buys IVB-E lower latency accesses for a wider range of addresses, but once you exceed the 15MB L3 cache space we see latency about on par with everything else. Only Haswell + Crystalwell manages to hold out for longer. Unfortunately that's not really a part desktop enthusiasts can buy so it's mostly an academic comparison.

The bandwidth story is an interesting one. Sandra maxes out bandwidth by driving all cores at the same time, so you get some uplift here simply because there are more cores under IVB-E's hood. But even if you divide by the number of cores, you get per-core cache bandwidth figures that are extremely high (at least outside of L1). The L3 cache in particular is quite bandwidth happy.

Going outside of the L3 cache, we also see a doubling of memory bandwidth - which is expected given the doubling of memory interface width. In reality the peak memory bandwidth advantage would be even larger as IVB-E officially supports DDR3-1866 (if you only populate 1 DIMM per channel, otherwise either 1333 or 1600 is officially supported).
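To put rough numbers on that, here is a minimal Python sketch of theoretical peak DRAM bandwidth, computed as channels × bytes per transfer × transfer rate. The 64-bit channel width is standard for DDR3; the dual-channel DDR3-1600 baseline for the mainstream desktop parts is an assumption made for illustration, and the function name `peak_bandwidth_gbs` is hypothetical. These are theoretical ceilings, not the Sandra results discussed above.

```python
def peak_bandwidth_gbs(channels: int, transfer_rate_mts: float,
                       bus_width_bits: int = 64) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8  # 8 bytes per 64-bit channel
    return channels * bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


# Assumed configurations for illustration only; not measured results.
configs = {
    "dual-channel DDR3-1600 (assumed mainstream baseline)": (2, 1600),
    "quad-channel DDR3-1600": (4, 1600),
    "quad-channel DDR3-1866": (4, 1866),
}

baseline = peak_bandwidth_gbs(2, 1600)
for name, (channels, rate) in configs.items():
    bw = peak_bandwidth_gbs(channels, rate)
    print(f"{name}: {bw:.1f} GB/s ({bw / baseline:.2f}x baseline)")
```

Doubling the channel count at the same memory speed doubles the ceiling, and moving to DDR3-1866 on top of that pushes the theoretical advantage somewhat past 2x, which is consistent with the point made above.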

General Performance

I don't know that I've ever seen an Intel slide before that called out a performance degradation, but there's a first time for everything:

The problem with IVB-E vs. Haswell is that the extra large L3 cache and quad-channel memory interface are generally only useful in heavily threaded applications, which of course benefit from its 6-core configuration. In those tests that aren't heavily threaded however, IVB-E typically sees a single threaded performance deficit compared to Haswell. Given that the 4960X and Haswell based Core i7-4770K run at very similar frequencies, it's not surprising to see IVB-E take a backseat to Haswell in "everyday computing" tasks. Intel's slide above claims about an 18% reduction in "everyday computing" performance compared to the 4770K, but in practice I found the gap to be much narrower.

Although not the best indication of overall system performance, the SYSMark 2012 suite does give us a good idea of lighter workloads than we're used to testing.

SYSMark 2012 - Overall

There's pretty much no advantage to the 4960X over the 3970X here. Remember Ivy Bridge's architectural improvements were very limited on the CPU side. As clock speeds didn't really go up between the 3970X and 4960X, the performance parity here isn't surprising. Haswell manages a ~6% performance advantage over the 4960X at an obviously lower power and price point.

Although I retired SYSMark 2007 a while ago, I do have much older performance data here which lets us compare the 4960X back as far as the early Pentium 4 based Extreme Edition parts:

SYSMark 2007 - Overall

The Haswell advantage grows a bit here to around 8%, but the 4960X remains among the top three performers. It's very clear that for most users, there are far more cost-effective ways of getting great performance than IVB-E.

Our final lightly threaded test is Mozilla's Kraken JavaScript benchmark. This test includes some forward-looking JavaScript code designed to showcase performance of future rich web applications on today's software and hardware. We run the test under IE10:

Windows 8 - Mozilla Kraken Javascript Benchmark

Ivy Bridge always had good single threaded performance, but once again these lightly threaded use cases are better served by an architecture with higher IPC. The Haswell advantage isn't huge, but it's a lower power/more cost effective way to get the best performance here.

If you are still on LGA-1366, you'll note that the performance gains here are good, but not earth shattering. Compared to Intel's first 6-core platform, the 4960X manages a 27% increase in performance over the Core i7-990X. That's a healthy gain, but it's still small enough that there's no immediate need to upgrade.


120 Comments


  • ShieTar - Tuesday, September 3, 2013 - link

    What's the point? A 10-core only runs at 2GHz, and an 8-core only runs at 3 GHz, so both have less overall performance than a 6-core overclocked to more than 4GHz. You simply cannot put more computing power into a reasonable power envelope for a single socket. If a water-cooled Enthusiast 6-core is not enough for your needs, you automatically need a 2-socket system.

    And it's not like that is not feasible for enthusiasts. The ASUS Z9PE-D8 WS, the EVGA Classified SR-X and the Supermicro X9DAE are mainboards aimed at the enthusiast / workstation market, combining two sockets for XEON-26xx with the capability to run GPUs in SLI/CrossFire. And if you are looking to spend significantly more than 1k$ for a CPU, the 400$ on those boards and the extra cost for ECC Memory should not scare you either.

    Just go and check Anandtech's own benchmarking: http://www.anandtech.com/show/6808/westmereep-to-s... . It's clear that you need two 8-cores to be faster than the enthusiast 6-cores even before overclocking is taken into account.

    Maybe with Haswell-E we can get 8 cores with >3.5GHz into <130W, but with Ivy Bridge, there is simply no point.
  • f0d - Tuesday, September 3, 2013 - link

    who cares if the power envelope is "reasonable"?
    i already have my SBE overclocked to 5.125Ghz and if they release a 10core i would oc that thing like a mutha******

    that link you posted is EXACTLY why i want a 10/12 core instead of dual socket (which i could afford if it made sense performance wise) - its obvious that video encoding doesnt work well with NUMA and dual sockets but it does work well with multi cored single cpu's

    so i say give me a 10 core and let me OC it like crazy - i dont care if it ends up using 350W+ i have some pretty insane watercooling to suck it up (3k ultra kaze's in push/pull on a rx480rad 24v laingd5s raystorm wb - a little over the top but isnt that what these extreme cpu's are for?)
  • 1Angelreloaded - Tuesday, September 3, 2013 - link

    I have to agree with you. In the extreme market, who gives a damn about being green? Most will run 1200 watt Plat mod PSUs with an added extra 450 watt in the background, and 4 GPUs, as this is pretty much the only reason to buy into 2011 socket in the first place: 2 extra cores and 40x PCIe lanes.
  • crouton - Tuesday, September 3, 2013 - link

    I could not agree with you more! I have a OC'd i920 that just keeps chugging along and if I'm going to drop some coin on an upgrade, I want it to be an UPGRADE. Let ME decide what's reasonable for power consumption. If I burn up a 8/10 core CPU with some crazy cooling solution then it's MY fault. I accept this. This is the hobby that I've chosen and it comes with risks. This is not some elementary school "color by numbers" hobby where you can follow a simple set of instructions to get the desired result in 10 minutes. This is for the big boys. It takes weeks or more to get it right and even then, we know we can do better. Not interested in XEON either.
  • Assimilator87 - Tuesday, September 3, 2013 - link

    The 12 core models run at 2.7Ghz, which will be slightly faster than six cores at 5.125Ghz. You could also bump up the bclk to 105, which would put the CPU at 2.835Ghz.
  • Casper42 - Tuesday, September 3, 2013 - link

    2690 v2 will be 10c @ 3.0 and 130W. Effectively 30Ghz.
    2697 v2 will be 12c @ 2.7 and 130W. Effectively 32.4Ghz

    Assuming a 6 Core OC'd to 5Ghz Stable, 6c @ 5.0 and 150W? (More Power due to OC)
    effectively 30Ghz.

    So tell me again how a highly OC'd and largely unavailable to the masses 6c is better than a 10/12c when you need Multiple Threads?
    Keep in mind those 10 and 12 core Server CPUs are almost entirely AIR cooled and not overclocked.

    I think they should have released an 8 and 10 core Enthusiast CPU. Hike up the price and let the market decide which one they want.
  • MrSpadge - Tuesday, September 3, 2013 - link

    6c @ 5.0 will eat more like 200+ W instead of 130/150.
  • ShieTar - Wednesday, September 4, 2013 - link

    For Sandy Bridge, we had:
    2687, 8c @ 3.1 GHz => 24.8 GHz effectively
    3970X, 6c @ 3.5 GHz => 21 GHz before overclocking, only 4.2 GHz required to exceed the Xeon.

    Fair enough, for Ivy Bridge Xeons, the 10core at 3 GHz has been announced. I'll believe that claim when I see some actual benchmarks on it. I have some serious doubts that a 10core at 3 GHz can actually use less power than an 8 core at 3.4 GHz. So let's see at what frequency those parts will actually run under load.

    Furthermore, the effective GHz are not the whole truth, even on highly parallel tasks. While cache seems to scale with the number of cores for most Xeons, memory bandwidth does not, and there are always overheads due to the common use of the L3 cache and the memory.

    Finally, not directly towards you but to several people talking about "green": Entirely not the point. No matter how much power your cooling system can remove, you are always creating thermal gradients when generating too much heat in a very small space. Why do you guys think there was no 3.5GHz 8 core for Sandy Bridge-EP? The silicon is the same for 6-core and 8-core, the core itself could run at that speed. But Intel is not going to verify the continued operation of a chip with a TDP >150W.

    They give a little leeway when it comes to the K-class, because there the risk is with the customer to a certain point. But they just won't go and sell a CPU which reliably destroys itself or the MB the very moment somebody tries to overclock it.
  • psyq321 - Thursday, September 5, 2013 - link

    I am getting 34.86 @Cinebench with dual Xeon 2697 v2 running @3 GHz (max all-core turbo).

    Good luck reaching that with superclocked 4930/4960X ;-)
  • piroroadkill - Tuesday, September 3, 2013 - link

    All I really learn from these high end CPU results is that if you actually invested in high end 1366 in the form of 980x all that time ago, you've got probably the longest lasting system in terms of good performance that I can even think of.
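The "effective GHz" arithmetic traded in the comments above (cores multiplied by clock speed) can be sketched in a few lines of Python. This is a minimal illustration that uses only the core counts and clocks quoted by the commenters; the function names are hypothetical. As ShieTar points out, the metric ignores memory bandwidth and shared-cache overheads, and it also assumes equal per-clock performance across parts, so treat it as a rough upper bound rather than a benchmark.

```python
def aggregate_ghz(cores: int, clock_ghz: float) -> float:
    """Naive 'effective GHz': core count times per-core clock."""
    return cores * clock_ghz


def break_even_clock(target_aggregate_ghz: float, cores: int) -> float:
    """Clock a chip with `cores` cores needs to match a target aggregate."""
    return target_aggregate_ghz / cores


# Core counts and clocks exactly as quoted in the comments above.
quoted = [
    ("2690 v2 (10c @ 3.0)", 10, 3.0),
    ("2697 v2 (12c @ 2.7)", 12, 2.7),
    ("6c overclocked to 5.0", 6, 5.0),
    ("6c overclocked to 5.125", 6, 5.125),
    ("2687 (8c @ 3.1)", 8, 3.1),
    ("3970X (6c @ 3.5)", 6, 3.5),
]

for name, cores, clock in quoted:
    print(f"{name}: {aggregate_ghz(cores, clock):.2f} GHz aggregate")

# Clock a 6-core needs to match the 8-core 2687 on this naive metric.
needed = break_even_clock(aggregate_ghz(8, 3.1), 6)
print(f"6-core break-even vs. 2687: {needed:.2f} GHz")
```

On this metric a 6-core needs a little over 4.1 GHz to match the 8-core 2687, in line with the 4.2 GHz figure ShieTar gives for exceeding it.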
