Evolution in Performance

The underlying architecture in Haswell-E is nothing new. Haswell desktop processors were first released in June 2013 to replace Ivy Bridge, and at the time we stated an expected 3-17% increase over Ivy Bridge, especially in floating point heavy benchmarks. Users moving from Sandy Bridge should expect a ~20% increase all around, with Nehalem users in the 40% range. Because the extreme platform mainly adds more cores to the same architecture, we could assume that the same guidance applies to Haswell-E over IVB-E and its predecessors, but we tested afresh for this review in order to check those assumptions.

For our test, we took our previous CPU review samples from as far back as Nehalem. This means the i7-990X, i7-3960X, i7-4960X and the Haswell-E i7-5960X.

Each processor was set to 3.2 GHz on all cores, and limited to four cores with HyperThreading disabled.

Memory was set to the CPU-supported frequency at JEDEC settings, meaning that if Intel has significantly adjusted memory controller performance between these platforms, this would show as well. For detailed explanations of these tests, refer to our main results section in this review.

On average, the results show a 17% jump from Nehalem to SNB-E, 7% from SNB-E to IVB-E, and a final 6% from IVB-E to Haswell-E. This makes for a 31% (rounded) overall stretch in three generations.
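As a rough sanity check on how per-generation gains stack up, the short Python sketch below chains the three rounded averages quoted above. The function name `compound` is our own for illustration. Multiplying rounded per-step averages will not necessarily reproduce the overall figure from the measured data exactly, since that figure is averaged across the individual benchmark results.

```python
from math import prod

def compound(gains_pct):
    """Chain successive percentage gains into one overall percentage gain."""
    return (prod(1 + g / 100 for g in gains_pct) - 1) * 100

# Rounded per-generation averages quoted in the text:
# Nehalem -> SNB-E, SNB-E -> IVB-E, IVB-E -> Haswell-E
steps = [17, 7, 6]

print(f"compounded: {compound(steps):.1f}%")
print(f"simple sum: {sum(steps)}%")
```

Gains multiply rather than add, so the compounded value sits a little above the simple sum of the three steps.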

Web benchmarks are constrained by the browser environment, although HTML5 offers some ways to use as many cores in the system as possible. The biggest jump was in SunSpider, and overall there is a 34% jump from Nehalem to Haswell-E here. This is split into 14% from Nehalem to SNB-E, 6% from SNB-E to IVB-E and 12% from IVB-E to Haswell-E.

Purchasing managers often look to the PCMark and SYSmark data to inform decisions, and the important number here is that Haswell-E took a 7% average jump in scores over Ivy Bridge-E. This translates to a 24% jump since Nehalem.

Some of the more common synthetic benchmarks in multithreaded mode showed an average 8% jump from Ivy Bridge-E, with a 29% jump overall. Nehalem to Sandy Bridge-E was a bigger single jump, giving 14% average.

In the single threaded tests, a smaller overall 23% improvement was seen from the i7-990X, with 6% in this final generation.

The take-home message from these results, if there is one, is that:

Haswell-E has an 8% improvement in performance over Ivy Bridge-E clock for clock for pure CPU based workloads.

This also means an overall 13% jump from Sandy Bridge-E to Haswell-E.
From Nehalem, we have a total 28% raise in clock-for-clock performance.
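The cumulative figures above also imply the size of the individual steps. As a minimal sketch (the helper name `implied_step` is hypothetical, and the inputs are the rounded percentages from the text), the remaining step can be backed out by dividing the growth factors:

```python
def implied_step(total_pct, known_pct):
    """Gain of the remaining step when the total is the known step compounded with it."""
    return ((1 + total_pct / 100) / (1 + known_pct / 100) - 1) * 100

# SNB-E -> IVB-E, from 13% (SNB-E -> Haswell-E) and 8% (IVB-E -> Haswell-E)
snb_to_ivb = implied_step(13, 8)

# Nehalem -> SNB-E, from 28% (Nehalem -> Haswell-E) and 13% (SNB-E -> Haswell-E)
neh_to_snb = implied_step(28, 13)

print(f"SNB-E -> IVB-E:   {snb_to_ivb:.1f}%")
print(f"Nehalem -> SNB-E: {neh_to_snb:.1f}%")
```

Because the inputs are rounded, the outputs are only approximate and should be read as a consistency check, not as measured results.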

Looking at gaming workloads, the difference shrinks. Unfortunately our Nehalem system decided to stop working while taking this data, but we can still see some generational improvements. First up, a GTX 770 at 1080p Max settings:

The only title that gets much improvement is F1 2013, which uses the EGO engine and is the most amenable to better hardware under the hood. The rise in minimum frame rates is quite impressive.

For SLI performance:

All of our titles except Tomb Raider get at least a small improvement in our clock-for-clock testing, and this time Bioshock also gets in on the action in both average and minimum frame rates.

If we were to go on clock-for-clock testing alone, these numbers do not particularly show a benefit from upgrading from a Sandy Bridge system, except in F1 2013. However, our numbers later in the review for stock and overclocked speeds might change that.

Memory Latency and CPU Architecture

Haswell is a tock, meaning the second crack at 22nm. Anand went for a deep dive into the details previously, but in brief Haswell brought better branch prediction, two new execution ports and larger buffers to feed a wider set of parallel execution resources. Haswell adds support for AVX2, which includes an FMA operation to increase floating point performance. As a result, Intel doubled the L1 cache bandwidth. While TSX was part of the instruction set as well, this has since been disabled due to a fundamental silicon flaw and will not be fixed in this generation.

The increase in L3 cache sizes for the highest CPU comes from an increased core count, extending the lower latency portion of the L3 to larger data accesses. The move to DDR4 2133 C15 would seem to have latency benefits over previous DDR3-1866 and DDR3-1600 implementations as well.
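For reference, the CAS portion of DRAM latency can be converted from clock cycles to nanoseconds with a short calculation. The Python sketch below is illustrative only: the function name `cas_latency_ns` is our own, and the DDR3 timings are assumptions because the text gives a CAS value only for DDR4-2133. It also covers only the DRAM's CAS delay, not the end-to-end latency a benchmark measures, which additionally depends on the memory controller and caches.

```python
def cas_latency_ns(data_rate_mts, cas_cycles):
    """First-word CAS latency in ns: cycles divided by the I/O clock (half the data rate)."""
    clock_mhz = data_rate_mts / 2          # DDR moves two transfers per clock
    return cas_cycles / clock_mhz * 1000   # cycles / MHz gives microseconds; x1000 for ns

configs = {
    "DDR4-2133 C15": (2133, 15),  # timing quoted in the text
    "DDR3-1866 C10": (1866, 10),  # assumed timing, not stated in the article
    "DDR3-1600 C9": (1600, 9),    # assumed timing, not stated in the article
}

for name, (rate, cl) in configs.items():
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

With other DDR3 timings the comparison shifts, so the assumed CAS values should be swapped for those of the kits actually being compared.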


203 Comments

  • Michael REMY - Friday, August 29, 2014 - link

    again, in your table of extreme core i7 cpus, you forgot the last 4-core Nehalem which is : the i7-975X at 3.3GHz .
    No, the 965X is not the latest 4-core extreme !
  • Death666Angel - Friday, August 29, 2014 - link

    Considering this would have cost me ~340€ over my i7-4770K (which I have @ 4.5GHz and delidded), because of the price difference in CPU and the fact that I had a 1150 socket mainboard from my retired mining rig, I'm not too salty about it. At least it is 6 core at the low end, that is encouraging. I've been mostly fine with my i7-860 so I guess the i7-4770k will serve me a while.
  • Death666Angel - Saturday, August 30, 2014 - link

    "With ASUS motherboards, they have implemented a new onboard button which tells 2x/3x GPU users which slots to go in with LEDs on the motherboard to avoid confusion."
    Because looking stuff up in the manual is way too complicated!
  • anactoraaron - Friday, August 29, 2014 - link

    The 5820 can be had for $299 at micro center and they will also discount a compatible motherboard by $40. Jus' sayin'. IDK if there's some kind of ad agreement, etc for listing Newegg's price... Anyone shopping for anything should always shop around.
  • tuxRoller - Friday, August 29, 2014 - link

    "Very few PC games lose out due to having PCIe 3.0 x8 over PCIe 3.0 x16"

    Any? Even BF4 might be more due to other factors. It might be more useful to determine these bottlenecks with uhd.
  • Ian Cutress - Monday, September 1, 2014 - link

    I want to try with UHD. Need the monitors though.
  • Mr Perfect - Friday, August 29, 2014 - link

    The 28 lanes of the i7-5820K has almost no effect on SLI gaming at 1080p.


    I realize you were trying to CPU limit the benchmarks by using such a low resolution, but does this still hold up when running, say, three 1440p monitors? Wouldn't that be the time when the GPUs are maxed out and start shuttling large amounts of data between themselves?
  • Ian Cutress - Monday, September 1, 2014 - link

    I want to test with higher resolutions in the near future, although my monitor situation is not as fruitful as I would hope. There is no big AnandTech warehouse, we all work in our corner of the world so shipping around this HW is difficult.
  • KAlmquist - Friday, August 29, 2014 - link

    "The move to DDR4 2133 C15 would seem to have latency benefits over previous DDR3-1866 and DDR3-1600 implementations as well."

    If my math is correct, this is wrong. With DDR4 2133 timings of 15-15-15, each of those 15's corresponds to 14.1 nanoseconds. (Divide 2133 by two to get the actual frequency, then divide the clock count by the frequency.) With DDR3 1600 and the common 9-9-9 timings, each time is only 11.25 nanoseconds. With DDR3, the actual transfer of the data takes four clock cycles (there are eight transfers, but "DDR" stands for "double data rate" meaning that there are two transfers per clock cycle). That translates to 5 nanoseconds on DDR3 1600. DDR4 transfers twice as much data at a time, so with DDR4 2133 a transfer takes eight clock cycles or 7.5 nanoseconds. So DDR3 1600 has lower latency than the DDR4 2133 memory.

    So why does Sandra report a memory latency of around 28.75 nanoseconds (92 clock cycles at 3.2 GHz) as shown in the chart on page 2 of this review? If a bank does not have an open page, then the memory latency should be 15+15+8 clock cycles, or 35.6 nanoseconds, not counting the latency internal to the processor. So the Sandra benchmark result seems implausible to me. As far as I can tell, the source code for the Sandra benchmark is not available so there is no way to tell exactly what it is measuring.
  • JumpingJack - Monday, September 1, 2014 - link

    Good points.
