Original Link: http://www.anandtech.com/show/8426/the-intel-haswell-e-cpu-review-core-i7-5960x-i7-5930k-i7-5820k-tested
The Intel Haswell-E CPU Review: Core i7-5960X, i7-5930K and i7-5820K Tested
by Ian Cutress on August 29, 2014 12:00 PM EST
Today marks the release of Intel’s latest update to its Extreme processor line with a trio of Haswell-E models including Intel’s first consumer socketed 8-core product. This is the update from Ivy Bridge-E, which includes an IPC increase, a new X99 chipset, the first consumer platform with DDR4 memory, and a new CPU socket that is not backwards compatible. We managed to get all three CPUs ahead of launch to test.
August 29th, The Haswell-E Launch
As part of PAX Prime today, three major launches are occurring:
- New line of Haswell-E i7 CPUs
- New line of X99 motherboards using the new LGA2011-3 socket
- An upgrade from DDR3 to DDR4 memory, using the new 288-pin slots
Each of these launches is an upgrade over the previous enthusiast models in the market. The Haswell-E processors will support up to 8 cores on i7, the X99 motherboards have increased connectivity and focus on newer storage methods, and the DDR4 memory supports higher frequency memory at lower voltages than DDR3.
Our coverage will be split across all three major launches. This article covers the Haswell-E CPUs; we will have another article discussing the new X99 chipset and motherboards, and a third about the new DDR4 memory. There is a small amount of overlap in the data between the three, but check out our other articles this week to find out more.
The New CPUs
Getting straight to the heart of the matter, Intel is keeping the enthusiast extreme range simple by only releasing three models, similar to the initial Sandy Bridge-E and Ivy Bridge-E launches.
The top of the line will be the 8-core i7-5960X with HyperThreading, using a 3.0 GHz base frequency and 40 PCIe 3.0 lanes, for $999 per unit in 1000-unit quantities. This pricing is in line with previous extreme edition processor launches, but the base frequency is quite low. This is due to the TDP limitation: adding two extra cores produces more energy lost as heat, and in order to keep the TDP in check, the base clock has to be reduced relative to the six-core models. This is a common trend in the Xeon range, and as a result it might affect the feel of day-to-day performance.
The mid-range i7-5930K model mimics the older i7-4960X from Ivy Bridge-E by having six cores and 40 PCIe 3.0 lanes, however it does differ in the frequencies (the 5930K is slower) and the memory (5930K supports DDR4-2133). Pricing for this model is aimed to be highly competitive at just under the $600 mark.
The entry level model is a slightly slower i7-5820K, also featuring six cores and DDR4-2133 support. The main difference here is that it only has 28 PCIe 3.0 lanes. When I first read this, I was relatively shocked, but if you consider it from a point of segmentation in the product stack, it makes sense. For example:
For Ivy Bridge-E and Sandy Bridge-E, the i7-4820K and i7-3820 CPUs both had four cores, separating them from the six-core parts in their series. For Nehalem, the quad-core i7-920 was a much lower clocked version of the quad-core i7-965, with the hex-core i7-980X released later. In these circumstances, the options for the lower-priced ~$400 part were either fewer cores or lower frequency. Intel has decided to give the lower cost Haswell-E processor fewer PCIe 3.0 lanes instead, and this is arguably a better scenario for most consumers:
Having 28 PCIe 3.0 lanes means dual GPU setups run at PCIe 3.0 x16/x8 (rather than x16/x16), and tri-GPU setups at x8/x8/x8 (rather than x16/x16/x8). Very few PC games lose out from having PCIe 3.0 x8 rather than x16, meaning that performance should be almost identical. On paper, this setup should cost less performance than a reduced frequency would, and fewer cores would have drawn complaints. Having six cores puts it above the i7-4790K in terms of market position and pricing, and the overall loss is that an i7-5820K user cannot run 4-way SLI, which affects a very small minority of users to begin with.
The only downside to the 28 PCIe 3.0 lanes is that there is no way for the user to improve the situation. If the frequency were low, the user could overclock; if there were fewer cores, overclocking would also help mitigate that. A reduced lane count is fixed in silicon. Despite this, on paper the performance difference should be minimal.
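As a rough illustration of how those lane counts shake out, the sketch below greedily assigns link widths to GPUs from a fixed lane budget. `split_lanes` is a hypothetical helper (real motherboards hard-wire their slot configurations), but it reproduces the configurations quoted above:

```python
def split_lanes(total_lanes: int, num_gpus: int) -> list[int]:
    """Illustrative greedy split of CPU PCIe lanes across GPUs."""
    widths: list[int] = []
    remaining = total_lanes
    for _ in range(num_gpus):
        # give each card the widest standard link that still leaves
        # at least x8 for every card not yet placed
        for width in (16, 8, 4):
            cards_left = num_gpus - len(widths) - 1
            if remaining - width >= 8 * cards_left:
                widths.append(width)
                remaining -= width
                break
    return widths

# 40-lane CPUs (i7-5960X / i7-5930K) vs the 28-lane i7-5820K
print(split_lanes(40, 2))  # [16, 16]
print(split_lanes(28, 2))  # [16, 8]
print(split_lanes(40, 3))  # [16, 16, 8]
print(split_lanes(28, 3))  # [8, 8, 8]
```

The 28-lane part drops each configuration by one width step, which is exactly the trade-off discussed above.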
The rise in TDP from 130W to 140W puts extra strain on user cooling. Intel still recommends its TS13X liquid cooling solution as a bare minimum – this is the same cooling solution Intel suggested for Ivy Bridge-E. Users wanting to overclock might expect another 150W of draw when pushing the i7-5960X up to 4.3 GHz (see our overclocking results later in the review), suggesting that an aftermarket thicker/longer-radiator liquid cooler might be in order.
The base silicon for the three mainstream Haswell-E processors is of a similar construction to the previous generation, with a dedicated L3 cache in the middle and the processors around the outside connected by a ring:
All eight cores in the silicon have access to the cache on the top-of-the-line Core i7-5960X. For the six-core models, the i7-5930K and the i7-5820K, one pair of cores is disabled; which pair is disabled is not fixed, but it will always be a left-to-right pair from the four rows shown in the image. Unlike the Xeon range, where the additional cache from disabled cores is sometimes still available, the L3 cache for these two cores is disabled as well.
Intel was quite happy to share the dimensions of the die and the transistor counts, which allows us to update this table with the new information:
|CPU Specification Comparison|
|CPU||Manufacturing Process||Cores||GPU||Transistor Count (Schematic)||Die Size|
|Intel Haswell-E 8C||22nm||8||N/A||2.6B||356mm2|
|Intel Haswell ULT GT3 2C||22nm||2||GT3||-||-|
|Intel Ivy Bridge-E 6C||22nm||6||N/A||1.86B||257mm2|
|Intel Ivy Bridge 4C||22nm||4||GT2||-||-|
|Intel Sandy Bridge-E 6C||32nm||6||N/A||2.27B||435mm2|
|Intel Sandy Bridge 4C||32nm||4||GT2||995M||216mm2|
This shows how moving from a six-core Ivy Bridge-E die to an eight-core Haswell-E increases the die area from 257 mm2 to 356 mm2 (a 39% increase) and the number of transistors from 1.86 billion to 2.6 billion (a 40% increase). In other words, adding 33% more cores requires proportionally more space and more transistors than the cores alone would suggest. Part of the increase might also be the migration to a DDR4 memory controller.
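The percentages above can be checked with a few lines of arithmetic, using only the die sizes and transistor counts quoted in this section:

```python
# Growth from Ivy Bridge-E (6C) to Haswell-E (8C), from the table above.
ivb_e = {"cores": 6, "area_mm2": 257, "transistors_b": 1.86}
hsw_e = {"cores": 8, "area_mm2": 356, "transistors_b": 2.60}

core_growth = hsw_e["cores"] / ivb_e["cores"] - 1                   # +33%
area_growth = hsw_e["area_mm2"] / ivb_e["area_mm2"] - 1             # +39%
xtor_growth = hsw_e["transistors_b"] / ivb_e["transistors_b"] - 1   # +40%

print(f"cores +{core_growth:.0%}, area +{area_growth:.0%}, "
      f"transistors +{xtor_growth:.0%}")
```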
Intel's extreme processor lineup has historically followed a distinct pattern. The CPUs with fewer cores are often clocked the fastest, but over time the speed of the SKU with the most cores catches up to the lower core count models. Then, when the next update arrives with more cores, the frequency is again reduced:
|Intel Extreme Edition Comparison|
|CPU||Cores||L2 Cache|
|i7-920||4||1.0MB|
|i7-965||4||1.0MB|
|i7-3820||4||1.0MB|
|i7-4820K||4||1.0MB|
|i7-980X||6||1.5MB|
|i7-990X||6||1.5MB|
|i7-3930K||6||1.5MB|
|i7-3960X||6||1.5MB|
|i7-3970X (150W)||6||1.5MB|
|i7-4930K||6||1.5MB|
|i7-4960X||6||1.5MB|
|i7-5820K||6||1.5MB|
|i7-5930K||6||1.5MB|
|i7-5960X||8||2.0MB|
When you take the cache sizes into account (click a CPU to see the cache size), it becomes very difficult to do a like-for-like comparison. For example, the i7-990X and the i7-5930K are both six-core, 3.5 GHz base frequency models, but the i7-5930K has 3MB more L3 cache. Similarly with the i7-980X and the i7-3960X.
The X99 Chipset
We will go into more detail in our motherboard review piece, but the basic X99 chipset layout from Intel is as follows:
For CPUs with 40 PCIe lanes, the chipset diagram above allows x16/x16/x8 scenarios, or x8/x8/x8/x8/x8 with additional clock generators. For the 28-lane CPU, this becomes x16/x8/x4, which might make some PCIe slots on the motherboard redundant; it is worth checking the manual first, which should show each combination. ASUS has implemented a new onboard button that tells 2x/3x GPU users which slots to use via LEDs on the motherboard, to avoid confusion.
The platform now uses DDR4 memory, which has a base frequency of 2133 MHz. Almost all consumer motherboards will use either one DIMM per channel or two DIMMs per channel, making up to 64GB of memory possible with the latter. Should 16GB UDIMM DDR4 modules come along, it is assumed that with a microcode update, Intel will support these as well.
X99 will also support 10 SATA 6 Gbps ports from the chipset. This is a rather odd addition, because only six of those ports will be RAID capable. Most motherboards will list which ones are specifically for RAID, but this dichotomy makes me believe that the chipset might use a SATA hub on die in order to extend the number of possible ports.
The socket looks roughly the same from X79 to X99, but the main differences include the notches inside the socket, ensuring that you cannot install the wrong CPU in the wrong socket. The pin layouts are also different, making them incompatible. The socket arms for fixing the CPU in place also change, with the X99 arms needing to be pushed around and out rather than out then in.
All the main motherboard manufacturers will have models ready on day one. These will be in the micro-ATX and ATX form factors, with most models aiming at the high end for functionality and performance, such as the ASUS X99 Deluxe and the ASRock X99 OC Formula. There will be a few models for the cheaper side of the market, such as the MSI X99S SLI PLUS and the GIGABYTE X99-UD3.
Prices should range from around $230 to $400+. See our X99 motherboard coverage for a more in-depth look.
DDR4 and JEDEC
All the Haswell-E CPUs support DDR4 only, and the DRAM slots will not accept DDR3 due to a different placement of the notch and DDR4's higher pin count. DDR4 modules are also a slightly different shape, whereby the middle pins of the module are longer than those on the outside.
For motherboards with single-sided latches, this can make modules a little trickier to install, because a module might feel seated when both ends are not yet firmly in the slot.
The CPUs are listed as supporting DDR4-2133, which in terms of JEDEC timings is 15-15-15. This is similar to when DDR3 first launched at DDR3-1066 7-7-7, a speed that seems low now but was high at the time. While DDR4-2133 CL15 sounds slow, DRAM module manufacturers will be launching models up to DDR4-3200 CL16. This takes the DRAM Performance Index (MHz divided by CAS) from 142 to 200.
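The DRAM Performance Index used here is simply the data rate divided by the CAS latency, which a couple of lines make concrete:

```python
# DRAM Performance Index as used in the text: data rate / CAS latency.
def performance_index(mhz: int, cas: int) -> float:
    return mhz / cas

print(round(performance_index(2133, 15)))  # 142 for JEDEC DDR4-2133 C15
print(round(performance_index(3200, 16)))  # 200 for DDR4-3200 C16
```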
DDR4 is also at a lower voltage than DDR3, with 2133 C15 modules requiring 1.2 volts. Prior to launch, G.Skill, Corsair, Crucial and ADATA all sent out preview images of their modules, with a few even releasing pricing to etailers ahead of time.
Modules should be available from DDR4-2133 to DDR4-3200 at launch, with the higher end of the spectrum being announced by both G.Skill and Corsair. See our DDR4 article later this week for more extensive testing.
Haswell-E and the Battle with Xeons
One of the main issues Intel has with its Extreme platform is the respective enterprise platform based on its high-end Xeon processors. In the server world, customers demand a certain level of consistency for each platform to match their upgrade and replacement cycles. As a result, while mainstream Haswell processors launched in June 2013, it has taken another 14 months for the enthusiast versions to hit the market. This cadence difference between mainstream and extreme silicon is primarily driven by the Xeon market requiring the same platform for two generations. In this case, the Sandy Bridge-E and Ivy Bridge-E platforms, both on the LGA2011-0 socket, were held in place for three years before the upgrade to Haswell-E with LGA2011-3. If you were wondering why there is such a big gap between the Haswell and Haswell-E release dates, there is your answer.
That being said, the consumer range of extreme processors is actually a small market for Intel compared to the Xeons. The range is driven more by prosumer-level customers who require performance at a lower cost, and serves as a platform for Intel to show how fast it can go at a given power limit and then let extreme overclockers blow through that limit as much as possible.
The prosumer market is the important one for the consumer-grade silicon. For small businesses that rely on CPU-limited throughput, such as video editing, video production, scientific computation and virtualization, having high performance in a single, lower-cost product can produce a significant upgrade in throughput, allowing projects to be completed quicker or with more accuracy. While these prosumers would love the higher-powered Xeons, the cost is prohibitive, particularly in the long term, and the lack of overclocking capability counts against them.
With this long delay in extreme platform upgrades, it gives Intel the chance to test new functionality out on the mainstream segment. One of the prevailing problems with Ivy Bridge-E is that it relies on the X79 chipset which is showing its age. The new Haswell-E platform and the X99 chipset borrows plenty of cues from Z87 and Z97 in terms of input/output and connectivity support, based on the Xeon customer request of ‘SATA Express looks good, we will have that’.
The drive for lower power is also true, even in high performance systems. For datacenters, the majority of the cost of the facility is typically the energy usage for the systems and the cooling. Thus if a datacenter can use a more energy efficient system, it probably will. So the transition from DDR3 to DDR4 also involves a drop in DRAM voltage from 1.5 volts to 1.2 volts. This does not sound like much for a home system with 4-8 DRAM modules, but in a datacenter with several thousand systems, each using 8-64 sticks of memory, saving a few kW helps bring down the power bill.
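To put a rough number on that, here is a back-of-the-envelope sketch. The assumption that DRAM power scales roughly with the square of the supply voltage, plus the per-module wattage and system counts, are illustrative guesses rather than measured figures:

```python
# Back-of-the-envelope sketch: assume DRAM power scales roughly with V^2,
# so dropping from 1.5 V (DDR3) to 1.2 V (DDR4) saves ~36% per module.
# Per-module wattage and system counts below are illustrative assumptions.
V_DDR3, V_DDR4 = 1.5, 1.2
WATTS_PER_DDR3_MODULE = 4.0   # assumed active power per module

saving_fraction = 1 - (V_DDR4 / V_DDR3) ** 2      # ~0.36
systems, modules_per_system = 5000, 16            # hypothetical datacenter
saved_kw = (systems * modules_per_system *
            WATTS_PER_DDR3_MODULE * saving_fraction / 1000)
print(f"~{saved_kw:.0f} kW saved")  # ~115 kW across the facility
```

Even under these loose assumptions, the facility-wide saving lands in the tens-of-kilowatts range the text alludes to.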
This extreme cadence will eventually land Intel with a bit of an issue. If the gap between the mainstream CPU architecture and the performance CPU architecture widens more, then at some point there will be a two-generation difference. This means the server side will have to decide if having fewer faster cores with the highest IPC on the market is better than 2-3 year old slower processors. This would also mean a dichotomy based on whatever features are added. This would suggest that at some point, Intel may have to cut out an entire platform of processors but still maintain the two-generation platform consistency that the server market requires.
Competition and Market
Perhaps unsurprisingly, Intel's main competition on the consumer CPU side is itself. As in the table above, the 5960X now leads the new charge of 8-core processors, with the 6-core i7-5820K sitting at the back with a reduced lane count but also a reduced price. Doing a direct comparison based solely on frequency and core count, we can see that the i7-3960X matches the i7-5820K, showing how the platform (as well as the positioning of the price points) evolves over time. This bodes well, perhaps suggesting that, should the trend continue, Skylake-E's lowest processor will be specified similarly to the Haswell-E i7-5960X but with a higher IPC.
Intel's nearest challenger for consumer CPUs from outside is still the FX-9590, which we reviewed recently; at 220W it draws over 50% more power and is only competitive in a few choice benchmarks, albeit at a third of the cost.
From Intel’s Haswell-E CPU launch, several questions immediately spring to mind:
- How much faster is Haswell-E over Ivy Bridge-E?
- How well do these CPUs overclock?
- I have an i7-3960X at 4.8 GHz / i7-4960X at 4.5 GHz, should I upgrade?
- I already have the i7-4960X and run at stock, should I upgrade?
- Do the 28 PCIe 3.0 lanes on the i7-5820K affect gaming?
For those asking whether to upgrade from X58 or X79, a big part of the answer will always come down to the chipset, which we will cover in the motherboard review.
But our testing here aims to answer all these questions, in terms of a stock vs. stock comparison through to an overclocked comparison for prosumers making the most of their enthusiast system or users attempting to go down the low-cost X99 route. All of our benchmark results will be in Bench as well for comparisons to other consumer and server processors.
Evolution in Performance
The underlying architecture in Haswell-E is not anything new. Haswell desktop processors were first released in June 2013 to replace Ivy Bridge, and at the time we stated an expected 3-17% increase, especially in floating-point-heavy benchmarks. Users moving from Sandy Bridge should expect a ~20% increase all around, with Nehalem users in the 40% range. Because the extreme systems mainly add more cores, we could assume the same recommendations hold for Haswell-E over Ivy Bridge-E and the others, but we tested afresh for this review in order to verify those assumptions.
For our test, we took our previous CPU review samples from as far back as Nehalem. This means the i7-990X, i7-3960X, i7-4960X and the Haswell-E i7-5960X.
Each of the processors was set to 3.2 GHz on all cores, and limited to four cores without HyperThreading.
Memory was set to the CPU-supported frequency at JEDEC settings, meaning that should Intel have significantly adjusted the performance of the memory controllers between these platforms, this would show as well. For detailed explanations of these tests, refer to our main results section in this review.
Average results show a 17% jump from Nehalem to SNB-E, 7% from SNB-E to IVB-E, and a final 6% from IVB-E to Haswell-E. This makes for a roughly 31% overall gain across three generations.
Web benchmarks are largely limited by single-threaded performance, although HTML5 offers some ways to use as many cores in the system as possible. The biggest jump was in SunSpider, and overall there is a 34% gain from Nehalem to Haswell-E here, split as 14% from Nehalem to SNB-E, 6% from SNB-E to IVB-E, and 12% from IVB-E to Haswell-E.
Purchasing managers often look to the PCMark and SYSmark data to clarify decisions and the important number here is that Haswell-E took a 7% average jump in scores over Ivy Bridge-E. This translates to a 24% jump since Nehalem.
Some of the more common synthetic benchmarks in multithreaded mode showed an average 8% jump from Ivy Bridge-E, with a 29% jump overall. Nehalem to Sandy Bridge-E was a bigger single jump, giving 14% average.
In the single threaded tests, a smaller overall 23% improvement was seen from the i7-990X, with 6% in this final generation.
The take home message, if there is one, from these results is that:
- Haswell-E has an 8% clock-for-clock performance improvement over Ivy Bridge-E for pure CPU workloads.
- This also means an overall 13% jump from Sandy Bridge-E to Haswell-E.
- From Nehalem, we have a total 28% rise in clock-for-clock performance.
Looking at gaming workloads, the difference shrinks. Unfortunately our Nehalem system decided to stop working while taking this data, but we can still see some generational improvements. First up, a GTX 770 at 1080p Max settings:
The only title that gets much improvement is F1 2013 which uses the EGO engine and is most amenable to better hardware under the hood. The rise in minimum frame rates is quite impressive.
For SLI performance:
All of our titles except Tomb Raider get at least a small improvement in our clock-for-clock testing with this time Bioshock also getting in on the action in both average and minimum frame rates.
If we were to go on clock-for-clock testing alone, these numbers do not particularly show a benefit from upgrading from a Sandy Bridge system, except in F1 2013. However our numbers later in the review for stock and overclocked speeds might change that.
Memory Latency and CPU Architecture
Haswell is a tock, meaning Intel's second crack at 22nm. Anand went for a deep dive into the details previously, but in brief, Haswell brought better branch prediction, two new execution ports, and increased buffers to feed a wider set of parallel execution resources. Haswell also adds support for AVX2, which includes an FMA operation to increase floating point performance; as a result, Intel doubled the L1 cache bandwidth. While TSX was part of the instruction set as well, it has since been disabled due to a fundamental silicon flaw and will not be fixed in this generation.
The increase in L3 cache sizes for the highest CPU comes from an increased core count, extending the lower latency portion of the L3 to larger data accesses. The move to DDR4 2133 C15 would seem to have latency benefits over previous DDR3-1866 and DDR3-1600 implementations as well.
Intel Haswell-E Overclocking
One of the burgeoning questions relating to overclocking over the past couple of years has been the quality of Intel’s construction under the heatspreader relating to thermal interface material and adhesives. This caused enough of a talking point for Intel to release Devil’s Canyon (read our review here) which featured an upgraded interface and essentially reduced the thermal pressure restricting the overclock.
Thankfully Intel has not decided to play around with the extreme edition platform too much since Nehalem. Although recent reports suggest that Intel is using an epoxy to bind the die to the heatspreader, one tell-tale sign that a goopy TIM is not being used is the hole in the heatspreader in one of the corners.
Looking through the previous generations, Sandy Bridge-E, Ivy Bridge-E and Haswell-E all show this hole, which is typically thought to allow for expansion of the heatspreader and/or the gas trapped inside due to heat. Also, because of the way the epoxy is applied, the heatspreader cannot be removed without force and without destroying parts of the silicon die.
Because of the way the CPU is arranged, with the cores to the left and right of center, different recommendations may emerge for the various methods of applying thermal paste, as the sources of heat will most likely be in these two regions. I would advise the normal procedure here: a small blob in the middle, allowing the heatsink to spread the TIM through applied pressure. This helps remove air bubbles as the TIM spreads; spreading it out manually leads to air bubbles all over the place and is not recommended for high thermal sources.
In our results below we are using a Cooler Master Nepton 140XL closed loop liquid CPU cooler, and following the instructions above our CPU temperatures stay extremely low until we pile on the overclock. In fact I was seeing less than 30ºC while idle, which should bode well for overclocking.
Our standard overclocking methodology is as follows. We select the overclock options and test for stability with PovRay and OCCT to simulate high-end workloads. These stability tests aim to catch any immediate causes for memory or CPU errors.
For manual overclocks, based on the information gathered from stock testing, we start at a nominal voltage and CPU multiplier, and the multiplier is increased until the stability tests are failed. The CPU voltage is increased gradually until the stability tests are passed, and the process repeated until the motherboard reduces the multiplier automatically (due to safety protocol) or the CPU temperature reaches a stupidly high level (100ºC+). Our test bed is not in a case, which should push overclocks higher with fresher (cooler) air.
Due to the timing of our testing, we were only able to test two i7-5960X CPUs. Both of these were M0 stepping samples, the same as the retail stepping as far as we understand. The i7-5960X for reference is a 3.0 GHz base clock CPU with 8 cores, with a stock load voltage around 1.050 volts. Standard turbo modes allow 3.5 GHz, and so we start our testing at 3.5 GHz on all cores at 1.000 volts set in the BIOS. Where load line calibration was possible, it was enabled to match our setting as closely as possible, but otherwise only the CPU voltage was adjusted.
The first sample has a lot of early headroom, with +0.100 volts allowing for an extra +1.1 GHz, or a 36.7% overclock. It has been a long while since numbers like +36.7% have been bandied around Intel's extreme range, with only the i7-920 class of Nehalem CPUs taking that sort of overclock in its stride.
The sweet spot for this CPU seems to be at around 4.4 GHz where the CPU voltage is just starting to rise but peak temperatures are under 75ºC.
Unfortunately our second sample was pretty much a dud by comparison. The voltage needed early on in the overclock went up quickly. This time we were unable to monitor temperatures due to a BIOS issue, but had a power meter on hand. We still managed a +1.1 GHz overclock easily enough, although +0.175 volts was required.
At 4.1 GHz, peak power is +104W over the system power draw at stock, with another 40W at 4.3 GHz. This shows that Haswell-E can be a power hog with even a small overclock, and thus users must have cooling to match. If we add the 140W TDP and the +140W more from the overclock (it would most likely be more than this due to the change of efficiency along the PSU curve), then a mildly overclocked CPU is fast approaching 300W. One can imagine that a highly clocked 4.7 GHz sample would be nearer 400W, and thus users should purchase power supplies to match.
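As a sketch of that arithmetic, using the measured deltas above and an assumed flat 90% PSU efficiency (the real figure varies along the load curve):

```python
# Rough wall-power arithmetic for an overclocked i7-5960X. The deltas
# come from our meter readings; the 90% PSU efficiency is an assumption.
def wall_power(dc_watts: float, psu_efficiency: float = 0.90) -> float:
    """At-the-wall draw for a given DC load."""
    return dc_watts / psu_efficiency

oc_cpu_watts = 140 + 104 + 40   # TDP + delta at 4.1 GHz + extra at 4.3 GHz
print(oc_cpu_watts)             # 284 W of DC load, fast approaching 300 W
print(round(wall_power(oc_cpu_watts)))  # ~316 W at the wall
```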
A Problem with Haswell
One issue from Haswell does crop up with Haswell-E: the variability in the quality of the processors. Intel only guarantees that the processor will run at the specific frequency and voltage that is applied out of the factory: any other speed is out of specification and not supported. With Haswell LGA1150 CPUs, while the turbo frequency of the i7-4770K was 3.9 GHz, some CPUs barely managed 4.2 GHz for a 24/7 system.
If we consider that the i7-4770K only needs one of those CPU cores to be below quality to ruin overclockability, then placing double the cores on the i7-5960X is asking for double the trouble. Time to put some numbers to this:
In ASUS’ press deck for overclocking recommendations that came with the X99-Deluxe, they tell us the following:
- i7-5960X at 4.4 GHz with 1.300 volts is below average
- i7-5960X at 4.5 GHz with 1.300 volts is average
- i7-5960X at 4.6 GHz with 1.300 volts is above average
By that standard our first CPU is around average, and the second CPU we tested is below average. Even with these guidelines, it would seem that other reviewers and even manufacturers are seeing a wide spread of results. I have heard reports of CPUs reaching 4.7 GHz on a water loop, whereas others testing a range of CPUs cannot get more than 4.4 GHz, like our second sample.
ASUS is recommending that anything over 1.25 volts requires a water/liquid cooling as a bare minimum, with up to 1.35V needing a triple (3x120mm) radiator setup depending on ambient temperatures. As with most overclocked setups, this means that the enthusiast user must decide between clock speed or fan noise for their machine.
Another issue with Haswell-E is the current draw of the CPU. ASUS states that the standard current draw for the CPU can reach 25 amps, meaning that the power supply must be able to deliver at least 30 amps on the EPS12V cable. Most home-build non-OEM power supplies with an 80 PLUS rating cover this, but a cheap power supply might trigger its over-current protection early.
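Those figures check out with simple arithmetic on the 12 V rail (current = power / voltage):

```python
# Current on the 12 V EPS cable is just power divided by voltage, so the
# quoted 25 A corresponds to 300 W through the CPU connector, and a PSU
# rated for 30 A on EPS12V offers 360 W of headroom on that cable.
def eps12v_amps(watts: float) -> float:
    return watts / 12.0

print(eps12v_amps(300))   # 25.0 A
print(30 * 12)            # 360 W available from a 30 A rating
```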
Comparison to Ivy Bridge-E, Sandy Bridge-E, Haswell
As part of our testing, we hooked up our older i7-4960X and i7-3960X to the ASUS Rampage IV Black Edition, as well as compared to our previous i7-4790K Haswell overclocks:
Our i7-3960X sample at the time was actually a really nice overclocking CPU, in comparison to our i7-4960X, which was below average. I put two values here for the i7-5960X, showing that a 4.3 GHz overclock, while lower in absolute terms than the 4.8 GHz of the i7-3960X, is actually around the same percentage overclock. With a good i7-5960X, a +50% overclock comes very easily.
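Expressed as a percentage of base clock (3.3 GHz for the i7-3960X, 3.0 GHz for the i7-5960X), the comparison looks like this:

```python
# Comparing overclocks as a percentage of base clock rather than raw GHz.
def relative_oc(base_ghz: float, oc_ghz: float) -> float:
    return oc_ghz / base_ghz - 1

print(f"i7-3960X at 4.8 GHz: +{relative_oc(3.3, 4.8):.0%}")  # +45%
print(f"i7-5960X at 4.3 GHz: +{relative_oc(3.0, 4.3):.0%}")  # +43%
print(f"i7-5960X at 4.5 GHz: +{relative_oc(3.0, 4.5):.0%}")  # +50%
```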
The next question is which one is better for performance. While the Haswell-E CPUs have a lower frequency than the previous generations, they have the benefit of a higher IPC and DDR4 memory. There is also the core count, with the i7-5960X offering 8 cores at 4.3/4.6 GHz against the six or four cores of the others.
It should be obvious that for single core throughput, the i7-4790K wins at 4.7 GHz:
In most benchmarks, the 5960X, 4960X and 3960X are actually evenly matched for single threaded performance, with the 5960X taking the edge on software that can take advantage of the newer instruction sets.
For multithreaded tasks, an overclocked i7-5960X is the only way to go:
The graphs later in the review comparing each of these processors at stock will have our overclocked results as well.
Load Delta Power Consumption
Power consumption was tested on the system in a single MSI GTX 770 Lightning GPU configuration, with a wall meter connected to the OCZ 1250W power supply. This power supply is Gold rated, and as I am in the UK on a 230-240 V supply, it gives ~75% efficiency under 50W and 90%+ efficiency at 250W, suitable for both idle and multi-GPU loading. This method of power reading allows us to compare how well the UEFI and the board supply components with power under load, and includes typical PSU efficiency losses.
We take the power delta between idle and load as our tested value, giving an indication of the power increase from the CPU when placed under stress. Unfortunately we were not in a position to test the power consumption of the two 6-core CPUs due to the timing of testing.
Because not all processors of the same designation leave the Intel fabs with the same stock voltages, there can be a mild variation and the TDP given on each CPU is understandably an absolute stock limit. Due to power supply efficiencies, we get higher results than TDP, but the more interesting results are the comparisons. The 5960X is coming across as more efficient than Sandy Bridge-E and Ivy Bridge-E, including the 130W Ivy Bridge-E Xeon.
|Test Setup|
|Processors||Intel Core i7-5820K (3.3 GHz / 3.6 GHz), Intel Core i7-5930K (3.5 GHz / 3.7 GHz), Intel Core i7-5960X (3.0 GHz / 3.5 GHz)|
|Motherboards||ASUS X99 Deluxe, ASRock X99 Extreme4|
|CPU Cooler||Cooler Master Nepton 140XL|
|Power Supplies||OCZ 1250W ZX Series (80 PLUS Gold), Corsair AX1200i (80 PLUS Platinum)|
|Memory||Corsair 4x8 GB DDR4|
|Video Cards||MSI GTX 770 Lightning 2GB (1150/1202 Boost)|
|Video Drivers||NVIDIA Drivers 337.88|
|Hard Drive||OCZ Vertex 3|
|Optical Drive||LG GH22NS50|
|Case||Open Test Bed|
|Operating System||Windows 7 64-bit SP1|
|USB 2/3 Testing||OCZ Vertex 3 240GB with SATA->USB Adaptor|
Many thanks to...
We must thank the following companies for kindly providing hardware for our test bed:
Thank you to OCZ for providing us with PSUs and SSDs.
Thank you to G.Skill for providing us with memory.
Thank you to Corsair for providing us with an AX1200i PSU and a Corsair H80i CLC.
Thank you to MSI for providing us with the NVIDIA GTX 770 Lightning GPUs.
Thank you to Rosewill for providing us with PSUs and RK-9100 keyboards.
Thank you to ASRock for providing us with some IO testing kit.
Thank you to Cooler Master for providing us with Nepton 140XL CLCs and JAS minis.
A quick word to the manufacturers who sent us the extra testing kit for review, including G.Skill’s Ripjaws 4 DDR4-2133 CL15, Corsair for similar modules, and Cooler Master for the Nepton 140XL CLCs. We will be reviewing the DDR4 modules in due course, including Corsair's new extreme DDR4-3200 kit, but we have already tested the Nepton 140XL in a big 14-way CLC roundup. Read about it here.
The dynamics of CPU Turbo modes, with both Intel and AMD, can cause concern during environments with a variable threaded workload. There is also an added issue of the motherboard remaining consistent, depending on how the motherboard manufacturer wants to add in their own boosting technologies over the ones that the CPU manufacturer would prefer they used. In order to remain consistent, we implement an OS-level unique high performance mode on all the CPUs we test which should override any motherboard manufacturer performance mode.
HandBrake v0.9.9: link
For HandBrake, we take two videos (a 2h20 640x266 DVD rip and a 10min double UHD 3840x4320 animation short) and convert them to x264 format in an MP4 container. Results are given in terms of the frames per second processed, and HandBrake uses as many threads as possible.
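As a back-of-envelope illustration of how HandBrake's result metric works, the processed frame rate is simply the source frame count divided by the wall-clock encode time. The source frame rate and encode time below are assumed numbers for illustration, not figures from our testing:

```python
# HandBrake reports frames per second processed:
# processed FPS = total frames in the source / wall-clock encode time.
# The 25 fps source rate and 700 s encode time are assumptions
# for illustration only.

def encode_fps(duration_s, source_fps, elapsed_s):
    """Frames per second processed by the encoder."""
    total_frames = duration_s * source_fps
    return total_frames / elapsed_s

# A 2h20m (8400 s) rip at an assumed 25 fps is 210,000 frames;
# finishing the encode in 700 s would mean 300 fps processed.
print(encode_fps(8400, 25, 700))  # → 300.0
```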
The variable turbo speeds of the CPUs result in only small differences in the low quality conversion, where the high single core frequency of the 4790K wins. The 4K conversion is more parallel, and the extra cores of the 5960X push it ahead of the pack; the 5930K and 5820K both sit behind the 4960X, however.
Agisoft Photoscan – 2D to 3D Image Manipulation: link
Agisoft Photoscan creates 3D models from 2D images, a process which is very computationally expensive. The algorithm is split into four distinct phases, and different phases of the model reconstruction require either fast memory, fast IPC, more cores, or even OpenCL compute devices to hand. Agisoft supplied us with a special version of the software to script the process, where we take 50 images of a stately home and convert it into a medium quality model. This benchmark typically takes around 15-20 minutes on a high end PC on the CPU alone, with GPUs reducing the time.
Photoscan's four separate components favor different balances of high frequency versus core count: check our Bench database for more detailed results, but overall the 5960X comes out on top. That being said, the 5820K is less than 40% of the price and is only 1.2 minutes behind.
Dolphin Benchmark: link
Many emulators are often bound by single thread CPU performance, and general reports tended to suggest that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that raytraces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy of the speed of Dolphin CPU emulation, which is an intensive single core task using most aspects of a CPU. Results are given in minutes, where the Wii itself scores 17.53 minutes.
Dolphin loves single core speed and efficiency, meaning the 4790K wins out again. Interestingly the large L3 cache of the 5960X also helps it here against the 5820K, despite the 5820K's higher single thread frequency.
WinRAR 5.0.1: link
WinRAR is a variable thread workload, but more cores still wins out. Interestingly the xx60X CPUs are ahead of the xx30K CPUs, followed by the xx20K. After this comes the 4790K, with the 990X on par with it, showing how far three generations of Intel CPUs have developed.
PCMark8 v2 OpenCL
A new addition to our CPU testing suite is PCMark8 v2, where we test the Work 2.0 and Creative 3.0 suites in OpenCL mode.
PCMark v8 relies on a number of factors, and it would seem that frequency is preferred over cache and memory. Interestingly the 4930K beat the 4960X in the Creative Suite with no obvious explanation.
Hybrid x265: link
Hybrid is a new benchmark, where we take a 4K 1500 frame video and convert it into an x265 format without audio. Results are given in frames per second.
Converting 4K video in Hybrid x265 again shows the preference for more cores. The 5820K matches the 3960X, showing the progression of CPU generational development.
3D Particle Movement
3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian Motion simulations and testing them for speed. High floating point performance, MHz and IPC win in the single thread version, whereas the multithreaded version scales with threads and loves more cores.
FastStone Image Viewer 4.9
FastStone is the program I use to perform quick or bulk actions on images, such as resizing, adjusting for color and cropping. In our test we take a series of 170 images in various sizes and formats and convert them all into 640x480 .gif files, maintaining the aspect ratio. FastStone does not use multithreading for this test, and results are given in seconds.
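The aspect-ratio-preserving resize at the heart of this test can be sketched as a simple fit-within-bounds calculation. This is our own illustration of the geometry, not FastStone's actual code:

```python
# Sketch of an aspect-ratio-preserving "fit" resize into a 640x480
# bounding box, as performed when batch-converting images. Our own
# illustration of the math, not FastStone's implementation.

def fit_within(width, height, max_w=640, max_h=480):
    """Scale (width, height) to fit inside (max_w, max_h), keeping aspect."""
    scale = min(max_w / width, max_h / height, 1.0)  # never upscale
    return round(width * scale), round(height * scale)

# A 1920x1080 photo fits as 640x360; a portrait 600x800 becomes 360x480.
print(fit_within(1920, 1080))  # → (640, 360)
print(fit_within(600, 800))    # → (360, 480)
```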
FastStone is a purely single threaded exercise, showing how the lower core count CPUs with high turbo frequencies perform best, and by quite a margin.
One of the important things to test in our gaming benchmarks this time around is the effect of the Core i7-5820K having 28 PCIe 3.0 lanes rather than the normal 40. This means that the CPU is limited to x16/x8 operation in SLI, rather than x16/x16.
First up is F1 2013 by Codemasters. I am a big Formula 1 fan in my spare time, and nothing makes me happier than carving up the field in a Caterham, waving to the Red Bulls as I drive by (because I play on easy and take shortcuts). F1 2013 uses the EGO Engine, and like other Codemasters games ends up being very playable on old hardware quite easily. In order to beef up the benchmark a bit, we devised the following scenario for the benchmark mode: one lap of Spa-Francorchamps in the heavy wet, the benchmark follows Jenson Button in the McLaren who starts on the grid in 22nd place, with the field made up of 11 Williams cars, 5 Marussia and 5 Caterham in that order. This puts emphasis on the CPU to handle the AI in the wet, and allows for a good amount of overtaking during the automated benchmark. We test at 1920x1080 on Ultra graphical settings.
Nothing here really shows any advantage of Haswell-E over Ivy Bridge-E, although the 10% gaps to the 990X for minimum frame rates offer some perspective.
Bioshock Infinite was Zero Punctuation’s Game of the Year for 2013, uses the Unreal Engine 3, and is designed to scale with both cores and graphical prowess. We test the benchmark using the Adrenaline benchmark tool and the Xtreme (1920x1080, Maximum) performance setting, noting down the average frame rates and the minimum frame rates.
Bioshock Infinite likes a mixture of cores and frequency, especially when it comes to SLI.
The next benchmark in our test is Tomb Raider. Tomb Raider is an AMD optimized game, lauded for its use of TressFX creating dynamic hair to increase the immersion in game. Tomb Raider uses a modified version of the Crystal Engine, and enjoys raw horsepower. We test the benchmark using the Adrenaline benchmark tool and the Xtreme (1920x1080, Maximum) performance setting, noting down the average frame rates and the minimum frame rates.
Tomb Raider is blissfully CPU agnostic it would seem.
Sleeping Dogs is a benchmarking wet dream – a highly complex benchmark that can bring the toughest setup and high resolutions down into single figures. Having an extreme SSAO setting can do that, but at the right settings Sleeping Dogs is highly playable and enjoyable. We run the basic benchmark program laid out in the Adrenaline benchmark tool, and the Xtreme (1920x1080, Maximum) performance setting, noting down the average frame rates and the minimum frame rates.
The biggest indicator of CPU performance change is the minimum frame rate while in SLI - the 5960X reaches 67.4 FPS minimum, with only the xx60X CPUs of each generation moving above 60 FPS. That being said, all the Intel CPUs in our test are above 55 FPS, though it would seem that the xx60X processors have some more headroom.
The EA/DICE series that has taken countless hours of my life away is back for another iteration, using the Frostbite 3 engine. AMD is also piling its resources into BF4 with the new Mantle API for developers, designed to cut the time required for the CPU to dispatch commands to the graphical sub-system. For our test we use the in-game benchmarking tools and record the frame time for the first ~70 seconds of the Tashgar single player mission, an on-rails sequence involving the generation and rendering of objects and textures. We test at 1920x1080 at Ultra settings.
Battlefield 4 is the only benchmark where we see the 5820K with its 28 PCIe lanes down by any reasonable margin against the other two 5xxx processors, and even then this is around 5% when in SLI. Not many users will notice the difference between 105 FPS and 110 FPS, and minimum frame rates are still 75 FPS+ on all Intel processors.
As part of our reviews here at AnandTech we have recently been including a section on overclocked results, because in the end a +10% overclock does not always mean an extra +10% on performance. For our overclocking escapades mentioned earlier in the review, while we were able to achieve 4.6 GHz on the Core i7-5960X, the sweet spot was around 4.3 GHz at a very comfortable temperature. This leads to a +43% overclock over the base frequency, similar to what we saw with Sandy Bridge-E overclocking.
For our overclocking tests, we are using the same graphs as in the last two pages, but adding the data from our overclocked Sandy Bridge-E, Ivy Bridge-E, Haswell and Haswell-E CPUs as well, tested fresh for this review on our latest benchmark suite.
In the past, overclocking was all about getting the same or better performance for a lower cost; however, with Ivy Bridge-E and its lower frequency ceiling, it was a battle to keep on par with Sandy Bridge-E. Now that Haswell-E has the same frequency deficit (200 MHz) but a +8% increase in IPC, it raises the question of whether Sandy Bridge-E users with good 4.8 GHz+ CPUs should consider upgrading (for anything other than more cores and an upgraded chipset).
SYSmark sees the biggest uplift in its media and office benchmark suites when overclocked, although the financial suite does enjoy the extra cores, putting the 5960X ahead.
HandBrake v0.9.9: link
Interestingly the overclocked 5960X does aid low quality conversion, showing that with enough frequency all the cores can be constantly fed with data. The 5960X takes the top two spots for 4K conversion.
Agisoft Photoscan – 2D to 3D Image Manipulation: link
Photoscan also enjoys overclocking in combination with the cores, but the 3960X overclocked will beat the 5960X at stock despite the extra cores of the 5960X.
Dolphin Benchmark: link
Dolphin prefers single threaded speed, so the Haswell CPUs at 4.7 GHz win here. Haswell does well in Dolphin's emulation overall, hence why the older extreme processors, even when overclocked, are further down.
WinRAR 5.0.1: link
More top spots for the 5960X, with the two extra cores at stock beating the other extreme processors.
3D Particle Movement
FastStone Image Viewer 4.9
When overclocked to 4.3 GHz, the 5960X would seem to produce a similar experience in FastStone to the 4790K at stock. This makes sense as the 4790K at stock is 4.4 GHz in turbo mode.
POV-Ray 3.7 Beta RC14
The overclocked 5960X scores a few points in minimum frame rates, giving another +20% while in SLI.
Bioshock average frame rates seem to get a small boost when overclocked, but minimum frame rates are more responsive to the 84W and 88W parts. The variation might be more indicative of the benchmark as a whole, as it only takes one errant slow frame to produce a low result in the minimum FPS results.
Intel Haswell-E Conclusion
The new Haswell-E affords several advances in the consumer and prosumer desktop PC world:
- Eight cores for the high-end i7-5960X processor over the six cores in the i7-4960X
- The movement to Haswell architecture and the on-die voltage regulator
- A jump from DDR3 to DDR4 memory
- The X99 chipset
Since the release of Ivy Bridge-E last year, many users have been complaining about the antiquity of the X79 chipset compared to the mainstream line. X99 comes up to par with Z97 in terms of PCIe storage implemented into the RST along with a full array of SATA 6 Gbps ports and USB 3.0. In fact, the additional PCIe 3.0 lanes of the extreme CPUs (40 on all but the i7-5820K) make more sense for PCIe storage on X99, especially when it is most likely prosumers taking advantage of the newer standards.
The movement from DDR3 to DDR4 makes more sense in the data-center space, where saving every watt of power helps bring down costs. DDR4 uses 1.2 volts as standard compared to 1.5 volts for DDR3, and spread over thousands of modules that saving makes a real difference in the power bill. For regular users, it does mean that every purchase of a Haswell-E CPU will require a new kit of memory. For the last several years we have been able to reuse DDR3, but now everyone has to factor in the cost of DDR4. This makes the DRAM module manufacturers happy, and as a result, in order to win your sale, they are offering DDR4-2133 to DDR4-3200 in various shapes and sizes.
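The voltage saving can be put into rough numbers. As a first-order sketch only: dynamic CMOS power scales roughly with the square of voltage, and the 3 W per DDR3 module figure below is an assumption for illustration, ignoring frequency, current and termination differences:

```python
# First-order estimate of the DRAM power saving from the voltage drop.
# Dynamic CMOS power scales roughly with V^2; this deliberately ignores
# frequency, current and termination differences.

DDR3_V = 1.5
DDR4_V = 1.2

relative_power = (DDR4_V / DDR3_V) ** 2
print(f"DDR4 voltage-dependent power vs DDR3: {relative_power:.2f}")  # → 0.64

# Spread over a data center: at an assumed 3 W per DDR3 module,
# 10,000 modules would shed roughly 10.8 kW under this naive model.
saving_w = 10_000 * 3 * (1 - relative_power)
print(f"{saving_w / 1000:.1f} kW")  # → 10.8 kW
```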
For the CPUs themselves, there are several clear points we can make:
Clock for clock, Haswell-E affords an 8% average boost over Ivy Bridge-E. This we already knew from the jump from Ivy Bridge to Haswell; on top of that, the caches on Haswell-E grow along with the extra cores. This would seem to make little difference on its own, but it means that a Haswell-E six-core processor performs like an Ivy Bridge-E processor with a +300-500 MHz advantage, depending on the benchmark.
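The "+300-500 MHz" equivalence is simple arithmetic: an 8% IPC uplift at a given clock is worth 8% of that clock in effective frequency. A quick sketch with illustrative clocks (the 3.7 GHz and 4.5 GHz figures are examples, not measured data):

```python
# An 8% IPC uplift at clock f behaves like an extra 0.08 * f of
# frequency on the older architecture. Clocks below are illustrative.

def equivalent_mhz_gain(clock_ghz, ipc_uplift=0.08):
    """Extra effective MHz an IPC uplift is worth at a given clock."""
    return clock_ghz * ipc_uplift * 1000

for clock in (3.7, 4.5):
    print(f"{clock} GHz -> +{equivalent_mhz_gain(clock):.0f} MHz effective")
# Around a 3.7 GHz turbo this is roughly +296 MHz; at a 4.5 GHz
# overclock it is +360 MHz, consistent with the +300-500 MHz range
# once benchmark-to-benchmark IPC variation is included.
```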
Despite the low clock speed of the 5960X, it comes out on top in multithreaded benchmarks. With two more cores, and thus four more threads, than the i7-4960X, any benchmark that can use more than 12 threads sees a distinct improvement despite the frequency deficit. This includes WinRAR, which has a variable thread workload.
The i7-5820K is on par with the i7-3960X at just over a third of the release cost. These two processors have the same core count and same frequency, but differ in their architecture, PCIe lane count and price. With the i7-5820K being two generations newer, it should afford a 10-15% performance improvement in CPU limited benchmarks. This is quite amazing if we consider the release price of the i7-3960X was $990 and the release price of the i7-5820K is $389.
The added benefit of the i7-5820K is also the X99 chipset, although one downside is the number of PCIe lanes from the CPU.
What this all means is that for $600 less, that two-to-three year upgrade will offer a 10-15% boost in CPU limited workloads, or moving up to the i7-5930K will increase throughput even more.
The 28 lanes of the i7-5820K have almost no effect on SLI gaming at 1080p. One question that will come from all sides is whether the reduced lane count affects gaming: the CPU forces an x16/x8 configuration in two-way SLI and x8/x8/x8 in three-way SLI, rather than x16/x16 or x16/x16/x8. We tested at 1080p maximum settings with two GTX 770 Lightning GPUs, and the only benchmark showing any significant difference was the average frame rate in Battlefield 4, which dropped from 110 FPS with the 5930K to 105 FPS with the 5820K. Testing this at 4K is a sensible next step.
In terms of raw frequency, on average, Haswell-E overclocks lower than Ivy Bridge-E. Both our overclock testing and ASUS' recommendations suggest that 4.3 GHz to 4.4 GHz will be a happy medium for most Haswell-E CPUs, though the chances of getting a good clocking CPU might be slimmer than on Ivy Bridge-E. In our tests of an i7-3960X at 4.8 GHz, an i7-4960X at 4.5 GHz and an i7-5960X at 4.3 GHz, all three CPUs performed similarly unless a benchmark takes advantage of the newer instructions or needs the eight cores of the i7-5960X. Users expecting a day-to-day difference in performance while overclocked should not get their hopes up.
But even then, the 5960X has two extra cores in hand. This will be the bottom line: prosumers who invest in the high end platform are more often than not CPU limited in their content creation workloads, and the extra cores address exactly that.
Encoding a 4K60 video to x265 has a 16% boost moving from the i7-4960X to the i7-5960X, which extends to 30% when both are overclocked.
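A naive cores x frequency x IPC model shows why the gain grows when overclocked. This assumes perfect thread scaling, so it overshoots the measured numbers; the clocks are the stock base and overclocked frequencies quoted in this review, and the 8% IPC uplift is our clock-for-clock average:

```python
# Naive throughput model: performance ~ cores * frequency * relative IPC.
# Assumes perfect thread scaling, so it overshoots real encoder results.

def throughput(cores, ghz, ipc=1.0):
    return cores * ghz * ipc

# 5960X (8C, 3.0 GHz base, +8% IPC) vs 4960X (6C, 3.6 GHz base)
stock = throughput(8, 3.0, 1.08) / throughput(6, 3.6)
# Both overclocked: 5960X at 4.3 GHz vs 4960X at 4.5 GHz
oc = throughput(8, 4.3, 1.08) / throughput(6, 4.5)

print(f"stock: +{(stock - 1) * 100:.0f}%, overclocked: +{(oc - 1) * 100:.0f}%")
# The model lands near +20% stock and +38% overclocked versus the
# measured 16% and 30% - the gap is the non-ideal scaling of real encoders.
```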
I want to go back to those original questions from the first page of this review and answer them:
- How much faster is Haswell-E over Ivy Bridge-E? Clock for clock, 8% on average.
- How well do these CPUs overclock? Not as well as Ivy Bridge-E or Sandy Bridge-E, but performance is comparable.
- I have an i7-3960X at 4.8 GHz / i7-4960X at 4.5 GHz, should I upgrade? Only if you need more cores.
- I already have the i7-4960X and run at stock, should I upgrade? Only if you need more cores.
- Do the 28 PCIe 3.0 lanes on the i7-5820K affect gaming? Not at 1080p in SLI.
The most promising member of the three CPUs launched today is the i7-5820K, as now the lowest end CPU for the extreme Intel platform has more cores than the highest member of the mainstream platform, the i7-4790K. Users can pick up a low-cost X99 motherboard for the same price as a mid-range Z97 motherboard, but the main barrier to adoption might be the high price of DDR4, which stands at around $250 for a 16GB quad channel kit.
The i7-5960X comes across as the new champion in terms of non-Xeon throughput, although the kudos will rest more on having the up-to-date chipset that users have been requesting. Most people moving from a Sandy Bridge-E or Ivy Bridge-E will not see a day-to-day difference in the speed of their workflow on the new platform, and the real benefit will be for those that are CPU limited. Haswell-E does mark the time that Nehalem and Westmere users, or 3820K/4820K users, who do anything other than gaming, might consider switching.
Because of the trifecta of new releases today, we put together some systems for users thinking of upgrading. Each one caters to a different crowd, and after the release we will update the pricing as appropriate.
Example Haswell-E Builds
An Average Introduction to Haswell-E
|CPU||Intel Core i7-5820K||$389|
|Motherboard||MSI X99S SLI PLUS||$230|
|DRAM||G.Skill 4x4GB DDR4-2133 C15||$260|
|Power Supply||Corsair AX860 Platinum||$150|
|GPUs||AMD R9 285 x2||$500|
|Case||Corsair Carbide 400R||$100|
|CPU Cooler||Cooler Master Nepton 140XL||$100|
|SSD||Crucial MX100 512GB||$220|
This first system is meant to be representative of a user moving from either a Sandy Bridge or Ivy Bridge mainstream system to the extreme side. This mimics my position back with Nehalem, moving from an AMD X2 system all the way up to the i7-920 at the time. For users wanting an introduction to the six-core i7-5820K, this build under $2000 uses a lower cost motherboard and a suitable power supply for dual GPU gaming. We picked the R9 285, which is soon to be released; AMD announced its pricing a few days ago at their AMD30Live event. Given my success overclocking with the Nepton 140XL in this review, the system should offer some headroom, especially when using the MSI OC Genie button.
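As a quick sanity check on the sub-$2000 claim, summing the component prices from the table above:

```python
# Component prices from the introductory Haswell-E build table.
build = {
    "Intel Core i7-5820K": 389,
    "MSI X99S SLI PLUS": 230,
    "G.Skill 4x4GB DDR4-2133 C15": 260,
    "Corsair AX860 Platinum": 150,
    "AMD R9 285 x2": 500,
    "Corsair Carbide 400R": 100,
    "Cooler Master Nepton 140XL": 100,
    "Crucial MX100 512GB": 220,
}

total = sum(build.values())
print(f"${total}")  # → $1949, comfortably under the $2000 mark
```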
|CPU||Intel Core i7-5960X||$1000|
|Motherboard||MSI X99S SLI PLUS||$230|
|DRAM||G.Skill 4x4GB DDR4-2133 C15||$250|
|Power Supply||Corsair CS550W Gold||$85|
|GPUs||AMD R7 240||$60|
|Case||Corsair Carbide 200R||$60|
|CPU Cooler||Cooler Master Seidon 120V||$50|
|SSD||Crucial MX100 128GB||$80|
There will be some prosumers interested in just the 8-core CPU, so everything else needs to be lightweight. We have stripped down most of the components here, using a simple 128GB SSD as well as a cheaper liquid cooler, and the R7 240 is there mostly as a graphics output. One of the major barriers to a super cheap build is DDR4 pricing: at $250 for a basic DDR4-2133 kit, the quad channel memory requirement pushes this build close to the price of our introduction to Haswell-E build.
A Mid Range Build
|CPU||Intel Core i7-5930K||$583|
|Motherboard||ASRock X99 WS||$324|
|DRAM||G.Skill 4x8GB DDR4-2133 C15||$480|
|Power Supply||Corsair RM1000 Gold||$180|
|GPUs||AMD R9 290 x2||$800|
|Case||Corsair Carbide Air 540||$130|
|CPU Cooler||Cooler Master Nepton 140XL||$100|
|SSD||2 x Samsung 850 Pro 256GB||$400|
Amusingly enough we didn't intend this build to be almost $3000, but it sets a good starting point for an i7-5930K build. Moving up to the i7-5960X would be another $417, making it perhaps prohibitive. The X99 WS sits in the middle of X99 pricing, but the workstation designation should indicate a higher level of compatibility with add-in cards. DDR4 is still pretty expensive here, even when selecting a 32GB kit. We used 2x R9 290s although for that price perhaps 3x R9 285s might be an interesting diversion. For a mid-range build the user has the option of a single Samsung 850 Pro 512GB or we can put two 256GB models in RAID.
|CPU||Intel Core i7-5960X||$1000|
|DRAM||G.Skill 8x8GB DDR4-2666 C15||$1010|
|Power Supply||Corsair AX1500i||$450|
|GPUs||NVIDIA 780 Ti x 3||$2160|
|Case||Corsair Obsidian 900D||$320|
|CPU Cooler||Cooler Master Nepton 280L||$140|
|SSD||4 x Samsung 850 Pro 512GB||$1600|
In an almost no-holds-barred system, using the 8-core monster and 64GB of DDR4 does some financial damage. Adding in a high end ASUS X99-Deluxe, the 1500W power supply, the case and the CPU cooler takes another $1400 or so, which is surpassed by the four-way RAID SSD setup. One option here would be to look at M.2 SSDs; however, there are few high capacity models on the market right now, so a user might go cheaper with the MX100 and then purchase an M.2 drive next year - take a look at our SSD roundup for more information. With a high end CPU and power supply, it makes sense to go all out on GPUs with three 780 Tis in the mix. Don't forget to add the price of a 4K monitor (or three) in order for the system to stretch its legs, and make sure to take advantage of ASUS' 5-Way Optimization overclocking.