CPU Benchmarks

The dynamics of CPU Turbo modes, on both Intel and AMD, can be a concern in environments with a variably threaded workload. There is also the added issue of consistency between motherboards, since each manufacturer may layer its own boosting technologies on top of the ones the CPU manufacturer would prefer they used. To keep results consistent, we apply a unique OS-level high performance mode to every CPU we test, which should override any motherboard manufacturer performance mode.

HandBrake v0.9.9: link

For HandBrake, we take two videos (a 2h20 640x266 DVD rip and a 10min double UHD 3840x4320 animation short) and encode them with x264 into an MP4 container. Results are given in frames per second processed, and HandBrake uses as many threads as possible.

HandBrake v0.9.9 LQ Film

HandBrake v0.9.9 2x4K

The variable turbo speeds of the CPUs result in only a small difference in the low quality conversion, where the high single core frequency of the 4790K wins. For the 4K conversion the problem becomes more parallel, and the extra cores of the 5960X push it ahead of the pack. The 5930K and 5820K, however, are both behind the 4960X.

Agisoft Photoscan – 2D to 3D Image Manipulation: link

Agisoft Photoscan creates 3D models from 2D images, a very computationally expensive process. The algorithm is split into four distinct phases, and different phases of the model reconstruction require either fast memory, fast IPC, more cores, or even OpenCL compute devices to hand. Agisoft supplied us with a special version of the software to script the process, where we take 50 images of a stately home and convert them into a medium quality model. This benchmark typically takes around 15-20 minutes on a high end PC on the CPU alone, with GPUs reducing the time.

Agisoft PhotoScan Benchmark - Total Time

Photoscan's four separate components rely to different degrees on high frequency versus many cores. Check our Bench database for more detailed results, but overall the 5960X comes out on top. That being said, the 5820K is less than 40% of the price and is only 1.2 minutes behind.

Dolphin Benchmark: link

Many emulators are bound by single thread CPU performance, and general reports suggested that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that raytraces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy for the speed of Dolphin CPU emulation, which is an intensive single core task using most aspects of a CPU. Results are given in minutes, where the Wii itself scores 17.53 minutes.

Dolphin Emulation Benchmark

Dolphin loves single core speed and efficiency, meaning the 4790K wins out again. Interestingly, the large L3 cache of the 5960X also helps here against the 5820K, despite the 5820K having a higher single thread frequency.

WinRAR 5.01: link

WinRAR 5.01, 2867 files, 1.52 GB

WinRAR is a variable thread workload, but more cores still win out. Interestingly, the xx60X CPUs are ahead of the xx30K CPUs, followed by the xx20K. After these comes the 4790K, with the 990X roughly on par, showing how far three generations of Intel CPU have developed.

PCMark8 v2 OpenCL

A new addition to our CPU testing suite is PCMark8 v2, where we test the Work 2.0 and Creative 3.0 suites in OpenCL mode.

PCMark8 v2 Work 2.0 OpenCL with R7 240 DDR3

PCMark8 v2 Creative 3.0 OpenCL with R7 240 DDR3

PCMark8 relies on a number of factors, and it would seem that frequency is preferred over cache and memory. Interestingly, the 4930K beat the 4960X in the Creative suite with no obvious explanation.

Hybrid x265: link

Hybrid is a new benchmark, where we take a 4K 1500 frame video and encode it with x265 without audio. Results are given in frames per second.

Hybrid x265, 4K Video

Converting 4K video with Hybrid x265 is another step in the preference for more cores. The 5820K matches the 3960X, showing the progress made across CPU generations.

Cinebench R15

Cinebench R15 - Single Threaded

Cinebench R15 - Multi-Threaded

3D Particle Movement

3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian Motion simulations and testing them for speed. High floating point performance, MHz and IPC win in the single thread version, whereas the multithread version has to handle the threads and loves more cores.
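To give a rough idea of the kind of work described here, the following is a minimal single threaded sketch of a 3D random-walk kernel. It is not the 3DPM source: the function names (`random_unit_step`, `move_particles`, `benchmark`), the choice of a unit-length step in a uniformly random direction, and the "steps per second" score are all assumptions made for illustration.

```python
import math
import random
import time


def random_unit_step(rng):
    """Return a unit-length step in a uniformly random 3D direction."""
    # Picking z uniformly in [-1, 1] and an angle uniformly in [0, 2*pi)
    # gives a uniform distribution over the surface of the unit sphere.
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return r * math.cos(phi), r * math.sin(phi), z


def move_particles(n_particles, n_steps, seed=0):
    """Move each particle from the origin by n_steps random unit steps."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        x = y = z = 0.0
        for _ in range(n_steps):
            dx, dy, dz = random_unit_step(rng)
            x += dx
            y += dy
            z += dz
        positions.append((x, y, z))
    return positions


def benchmark(n_particles=1000, n_steps=100):
    """Time the kernel and return (positions, steps per second)."""
    start = time.perf_counter()
    positions = move_particles(n_particles, n_steps)
    elapsed = time.perf_counter() - start
    return positions, (n_particles * n_steps) / elapsed
```

Each step is a handful of floating point operations plus a square root and two trigonometric calls, which is why a kernel of this shape tends to reward floating point throughput and clock speed when run on one thread. Because the particles are independent of each other, a multithreaded variant could split them across cores.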

3D Particle Movement: Single Threaded

3D Particle Movement: MultiThreaded

FastStone Image Viewer 4.9

FastStone is the program I use to perform quick or bulk actions on images, such as resizing, adjusting for color and cropping. In our test we take a series of 170 images in various sizes and formats and convert them all into 640x480 .gif files, maintaining the aspect ratio. FastStone does not use multithreading for this test, and results are given in seconds.

FastStone Image Viewer 4.9

FastStone is a purely single threaded exercise, showing here how the lower core count CPUs with high turbo perform best, and by quite a margin.

203 Comments

  • Michael REMY - Friday, August 29, 2014 - link

    again, in your table of extreme core i7 cpus, you forgot the last 4-core Nehalem which is : the i7-975X at 3.3GHz .
    No, the 965X is not the latest 4-core extreme !
  • Death666Angel - Friday, August 29, 2014 - link

    Considering this would have cost me ~340€ over my i7-4770K (which I have @ 4.5GHz and delidded), because of the price difference in CPU and the fact that I had a 1150 socket mainboard from my retired mining rig, I'm not too salty about it. At least it is 6 core at the low end, that is encouraging. I've been mostly fine with my i7-860 so I guess the i7-4770k will serve me a while.
  • Death666Angel - Saturday, August 30, 2014 - link

    "With ASUS motherboards, they have implemented a new onboard button which tells 2x/3x GPU users which slots to go in with LEDs on the motherboard to avoid confusion."
    Because looking stuff up in the manual is way too complicated!
  • anactoraaron - Friday, August 29, 2014 - link

    The 5820 can be had for $299 at micro center and they will also discount a compatible motherboard by $40. Jus' sayin'. IDK if there's some kind of ad agreement, etc for listing Newegg's price... Anyone shopping for anything should always shop around.
  • tuxRoller - Friday, August 29, 2014 - link

    "Very few PC games lose out due to having PCIe 3.0 x8 over PCIe 3.0 x16"

    Any? Even BF4 might be more due to other factors. It might be more useful to determine these bottlenecks with uhd.
  • Ian Cutress - Monday, September 1, 2014 - link

    I want to try with UHD. Need the monitors though.
  • Mr Perfect - Friday, August 29, 2014 - link

    The 28 lanes of the i7-5820K has almost no effect on SLI gaming at 1080p.


    I realize you were trying to CPU limit the benchmarks by using such a low resolution, but does this still hold up when running, say, three 1440p monitors? Wouldn't that be the time when the GPUs are maxed out and start shuttling large amounts of data between themselves?
  • Ian Cutress - Monday, September 1, 2014 - link

    I want to test with higher resolutions in the near future, although my monitor situation is not as fruitful as I would hope. There is no big AnandTech warehouse, we all work in our corner of the world so shipping around this HW is difficult.
  • KAlmquist - Friday, August 29, 2014 - link

    "The move to DDR4 2133 C15 would seem to have latency benefits over previous DDR3-1866 and DDR3-1600 implementations as well."

    If my math is correct, this is wrong. With DDR4 2133 timings of 15-15-15, each of those 15's corresponds to 14.1 nanoseconds. (Divide 2133 by two to get the actual frequency, then divide the clock count by the frequency.) With DDR3 1600 and the common 9-9-9 timings, each time is only 11.25 nanoseconds. With DDR3, the actual transfer of the data takes four clock cycles (there are eight transfers, but "DDR" stands for "double data rate" meaning that there are two transfers per clock cycle). That translates to 5 nanoseconds on DDR3 1600. DDR4 transfers twice as much data at a time, so with DDR4 2133 a transfer takes eight clock cycles or 7.5 nanoseconds. So DDR3 1600 has lower latency than the DDR4 2133 memory.

    So why does Sandra report a memory latency of around 28.75 nanoseconds (92 clock cycles at 3.2 GHz) as shown in the chart on page 2 of this review? If a bank does not have an open page, then the memory latency should be 15+15+8 clock cycles, or 35.6 nanoseconds, not counting the latency internal to the processor. So the Sandra benchmark result seems implausible to me. As far as I can tell, the source code for the Sandra benchmark is not available so there is no way to tell exactly what it is measuring.
  • JumpingJack - Monday, September 1, 2014 - link

    Good points.
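
The latency arithmetic in KAlmquist's comment above can be reproduced with a short sketch. The helper name `cycles_to_ns` is hypothetical, and the cycle counts (including the eight-cycle DDR4 burst) are taken as given from the comment rather than checked against a memory specification.

```python
def cycles_to_ns(cycles, data_rate_mts):
    """Convert memory clock cycles to nanoseconds.

    data_rate_mts is the marketed DDR rate (e.g. 2133); the actual clock
    is half of that because data moves on both clock edges.
    """
    clock_mhz = data_rate_mts / 2.0
    return cycles * 1000.0 / clock_mhz


# Timing figures as quoted in the comment (assumed, not verified).
ddr4_cas_ns = cycles_to_ns(15, 2133)              # each "15" on DDR4-2133 C15
ddr3_cas_ns = cycles_to_ns(9, 1600)               # each "9" on DDR3-1600 C9
ddr3_burst_ns = cycles_to_ns(4, 1600)             # four-cycle transfer
ddr4_burst_ns = cycles_to_ns(8, 2133)             # eight-cycle transfer, per the comment
closed_page_ns = cycles_to_ns(15 + 15 + 8, 2133)  # no open page in the bank

# Sandra's reported figure: 92 CPU cycles at 3.2 GHz.
sandra_ns = 92 / 3.2

for label, value in [
    ("DDR4-2133 C15 per timing", ddr4_cas_ns),
    ("DDR3-1600 C9 per timing", ddr3_cas_ns),
    ("DDR3-1600 burst", ddr3_burst_ns),
    ("DDR4-2133 burst", ddr4_burst_ns),
    ("DDR4-2133 closed page", closed_page_ns),
    ("Sandra reported", sandra_ns),
]:
    print(f"{label}: {value:.2f} ns")
```

Under those assumptions the figures come out close to the ones quoted in the comment, with the closed-page estimate several nanoseconds above the Sandra result.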