CPU Benchmarks

The dynamics of CPU Turbo modes, both Intel and AMD, can cause concern in environments with a variably threaded workload. There is also the added issue of consistency between motherboards, depending on how the motherboard manufacturer layers its own boosting technologies over the ones that Intel would prefer they used. In order to remain consistent, we implement an OS-level unique high performance mode on all the CPUs we test, which should override any motherboard manufacturer performance mode.

HandBrake v0.9.9

For HandBrake, we take two videos (a 2h20 640x266 DVD rip and a 10min double UHD 3840x4320 animation short) and encode them with x264 into an MP4 container. Results are given in terms of the frames per second processed, and HandBrake uses as many threads as possible.

HandBrake v0.9.9 LQ Film

Low quality conversion favors faster individual cores, so the W processor does well thanks to its higher full-load frequency. Nonetheless, the fast consumer grade processors win here by a large margin.

HandBrake v0.9.9 2x4K

In full double-4K mode, the balance of cores, frequency and architecture upgrade puts the E5-2687W v3 above the 12-core E5-2697 v2.

Agisoft Photoscan – 2D to 3D Image Manipulation

Agisoft Photoscan creates 3D models from 2D images, a process which is very computationally expensive. The algorithm is split into four distinct phases, and different phases of the model reconstruction require either fast memory, fast IPC, more cores, or even OpenCL compute devices on hand. Agisoft supplied us with a special version of the software to script the process, where we take 50 images of a stately home and convert them into a medium quality model. This benchmark typically takes around 15-20 minutes on a high end PC on the CPU alone, with GPUs reducing the time.

Agisoft PhotoScan Benchmark - Total Time

Dolphin Benchmark

Many emulators are bound by single thread CPU performance, and general reports tended to suggest that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that raytraces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy for the speed of Dolphin CPU emulation, which is an intensive single core task using most aspects of a CPU. Results are given in minutes, where the Wii itself scores 17.53 minutes.
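To put a Dolphin result in context, it can help to express it relative to the Wii's own 17.53 minutes. The short Python sketch below is only an illustration: the function name and the example time are hypothetical and are not taken from the charts.

```python
# Reference time from the text: the Wii itself completes the scene in 17.53 minutes.
WII_MINUTES = 17.53


def speedup_over_wii(cpu_minutes: float) -> float:
    """Return how many times faster than the Wii a run was (lower time is better)."""
    if cpu_minutes <= 0:
        raise ValueError("time must be positive")
    return WII_MINUTES / cpu_minutes


# Hypothetical result for illustration only, not a measured value.
example_minutes = 8.0
print(f"{speedup_over_wii(example_minutes):.2f}x the speed of the Wii")
```

Because the result is a time, a lower number is better, and a value above 1.0 from this helper means the emulated run finished sooner than the real console.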

Dolphin Emulation Benchmark

A single emulation instance benefits from a fast single core.

WinRAR 5.0.1

WinRAR 5.01, 2867 files, 1.52 GB

WinRAR seems to enjoy Haswell-EP over Ivy-EP, although it still needs a high frequency to achieve top speeds.

PCMark8 v2 OpenCL

A new addition to our CPU testing suite is PCMark8 v2, where we test the Work 2.0 suite in OpenCL mode. 

PCMark8 v2 Work 2.0 OpenCL with R7 240 DDR3

Hybrid x265

Hybrid is a new benchmark, where we take a 1500-frame 4K video and encode it with x265, without audio. Results are given in frames per second.
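Since the clip has a fixed 1500 frames, the frames per second figure follows directly from the wall-clock encode time. A minimal Python sketch of that arithmetic is shown here; the function name and the 600-second example are assumptions for illustration, not measured results.

```python
# Frame count of the 4K test clip, as stated in the text.
FRAMES = 1500


def frames_per_second(elapsed_seconds: float, frames: int = FRAMES) -> float:
    """Convert a wall-clock encode time into a frames-per-second score."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return frames / elapsed_seconds


# Hypothetical encode time of 10 minutes, for illustration only.
print(frames_per_second(600.0))
```

Unlike a time-based result, a higher number is better here, so halving the encode time doubles the score.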

Hybrid x265, 4K Video

Hybrid also takes advantage of the new architecture, giving a 5% advantage to the E5-2687W v3 despite two fewer cores.

Cinebench R15

Cinebench R15 - Single Threaded

Cinebench R15 - Multi-Threaded

3D Particle Movement

3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian Motion simulations and testing them for speed. High floating point performance, MHz and IPC win in the single thread version, whereas the multithread version has to handle the threads and loves more cores.
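The 3DPM source is not shown here, so the following Python sketch is only a rough illustration of the kind of work involved, assuming a simple model in which each particle takes unit-length steps in uniformly random directions. All function names are hypothetical. The chunked variant mirrors how a multithreaded run could divide independent particles between workers, although in this sketch the chunks are processed one after another.

```python
import math
import random


def random_unit_vector(rng):
    """Pick a direction uniformly distributed on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)              # cosine of the polar angle
    phi = rng.uniform(0.0, 2.0 * math.pi)   # azimuthal angle
    r = math.sqrt(1.0 - z * z)
    return r * math.cos(phi), r * math.sin(phi), z


def walk(n_particles, n_steps, seed=0):
    """Move each particle n_steps unit steps and return final (x, y, z) positions."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        x = y = z = 0.0
        for _ in range(n_steps):
            dx, dy, dz = random_unit_vector(rng)
            x += dx
            y += dy
            z += dz
        positions.append((x, y, z))
    return positions


def walk_in_chunks(n_particles, n_steps, n_chunks, seed=0):
    """Split the particles into independent chunks, one per would-be worker."""
    base, extra = divmod(n_particles, n_chunks)
    results = []
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        # Each chunk gets its own seed so chunks do not share random state.
        results.extend(walk(size, n_steps, seed=seed + i))
    return results
```

The inner loop is dominated by floating point math on a single particle, which fits the single thread behaviour described above, while the particles never interact, which is why the workload spreads well over more cores.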

3D Particle Movement: Single Threaded

3D Particle Movement: MultiThreaded

FastStone Image Viewer 4.9

FastStone is the program I use to perform quick or bulk actions on images, such as resizing, adjusting for color and cropping. In our test we take a series of 170 images in various sizes and formats and convert them all into 640x480 .gif files, maintaining the aspect ratio. FastStone does not use multithreading for this test, and results are given in seconds.
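As an illustration of what maintaining the aspect ratio means for a 640x480 target, the Python sketch below computes the output size under the assumption that each image is scaled to fit inside the 640x480 box. This is not FastStone's actual code, and the function name is hypothetical.

```python
def fit_within(width, height, max_w=640, max_h=480):
    """Scale (width, height) to fit inside max_w x max_h, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    # The tighter of the two constraints decides the scale factor.
    scale = min(max_w / width, max_h / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 16:9 source is limited by width, a tall portrait source by height.
print(fit_within(3840, 2160))
print(fit_within(1000, 2000))
```

Note that this sketch also enlarges images smaller than the target box; whether the real batch conversion does so depends on its settings.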

FastStone Image Viewer 4.9

Web Benchmarks

General usability is a big factor in the user experience, especially as we move into the HTML5 era of web browsing. For our web benchmarks, we take four well known tests with Chrome 35 as a consistent browser.

Sunspider 1.0.2

Sunspider 1.0.2

Mozilla Kraken 1.1

Kraken 1.1

WebXPRT

WebXPRT

Google Octane v2

Google Octane v2


27 Comments


  • JarredWalton - Monday, October 13, 2014

    For ten cores I wouldn't expect a huge bump over the "minimum guaranteed" speed. It's one thing to boost a few cores by a large amount, but the whole problem with multi-core designs is that if you load up all the cores then either you have massive power consumption or you need to curtail the clocks. Honestly, running ten cores at 100% and still hitting 3.1GHz is impressive in my book -- and it still consumes up to 160W.
  • Carl Bicknell - Monday, October 13, 2014

    I got my numbers a bit wrong: the 2687W is 3.1 GHz default and 3.2 GHz all cores on turbo, according to wikipedia.

    That's disappointing.

    Apart from anything else, they've managed to get their best 12 (yes twelve!) core CPU (E5-2690 v3) to operate at 3.1 GHz turbo all cores in a 135 W design.

    With two fewer cores and an extra 25 watts I'd hope for more than a mere 100 MHz performance.
  • NovoRei - Monday, October 13, 2014

    Ian, could you comment on performance with pure AVX2 and mixed AVX instructions and where the W version stands?

    Thanks.
  • Laststop311 - Monday, October 13, 2014

    4100 for an 18 core ill take 2
  • ruthan - Tuesday, October 14, 2014

    I would like to see benchmarks of some of those low power (6/12 or 12/24) 55W and 65W models.
  • pokazene_maslo - Tuesday, October 14, 2014

    Is it possible to override turbo boost to force all cores to run at maximum turbo frequency? (E5-2687W-v3 running all cores at 3.5GHz)
  • alpha754293 - Tuesday, October 14, 2014

    Well, testing these "big" multicore systems is no different than testing a large SMP system. You have to use programs or applications where it makes sense to use them. For engineering analyses and simulations, even HOW a problem is divided up (from a single, much larger problem) can have an impact on not only the speed of the analysis/simulation, but also the accuracy of the simulation, and you have to have a pretty sound understanding of the math and physics involved in order to make the best determination.

    And for some applications, there is such a thing as TOO many cores (where you've divided up a problem so much that it's now so small that it can't fully load a core up anymore, and the process of dividing and re-assembling the results takes an extremely large amount of time). (You can run into that with some of the FEA analysis.)

    I was working with Johan and studying a whole slew of parameters using LS-DYNA to study how the various ways of decomposing a problem can have an impact on the crash test simulation results, and how swap performance means EVERYTHING when it comes to mechanical engineering simulations.
  • mapesdhs - Thursday, October 16, 2014

    Oddly enough this can be the case with animation rendering as well. I know a movie studio which uses a system that can exclude cores from a render pipeline so there is more RAM and cache bandwidth available with a fewer number of cores. This can matter because sometimes complex film renders can use huge amounts of data. Someone at SPI told me one frame of a big movie can involve 500GB of data.

    Interesting how the same issue can crop up in such widely different fields.

    Ian.
  • RAMdiskSeeker - Tuesday, October 14, 2014

    Could you please test these motherboards for supporting ECC unbuffered DIMMs, reporting that ECC is active, and overclocking potential with ECC DIMMs? It would be good to know whether Xeon chips on non-server motherboards can use ECC.
  • nutral - Tuesday, October 14, 2014

    What still is strange to me is that there is still no workstation cpu focused on a workstation with single threaded software. Wouldn't an i7 cpu still be much faster than this workstation cpu?
