Benchmark Overview

Depending on the product, we tailor the presentation of our global benchmark suite to what buyers of this hardware are likely to run. Barring abnormalities, we still run the full test suite to gather data, and all the results are placed into Bench, our benchmark database, for users who want to look at non-typical benchmarks or legacy data.

The benchmarks fall into several areas:

Short Form CPU

Our short form testing script is a straight run-through of a mixture of known apps and workloads, and requires about four hours. These are typically the CPU tests we run in our motherboard suite to identify any performance anomalies.
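As a rough illustration of what a straight run-through with averaged results can look like, here is a minimal sketch. It is not our actual test script: the function names (`time_once`, `run_suite`) and the two stand-in workloads are hypothetical, and a real suite would launch the actual applications rather than small Python functions.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_once(workload: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by a single call to workload."""
    start = time.perf_counter()
    workload()
    return time.perf_counter() - start


def run_suite(workloads: Dict[str, Callable[[], object]], runs: int = 6) -> Dict[str, float]:
    """Run each workload `runs` times, one after another, and average the timings."""
    results: Dict[str, float] = {}
    for name, workload in workloads.items():  # straight run-through, in insertion order
        timings: List[float] = [time_once(workload) for _ in range(runs)]
        results[name] = statistics.mean(timings)
    return results


# Hypothetical stand-in workloads; a real suite would launch the actual applications.
example_workloads = {
    "sum_of_squares": lambda: sum(i * i for i in range(100_000)),
    "sort_reversed": lambda: sorted(range(50_000, 0, -1)),
}

averages = run_suite(example_workloads, runs=6)
for name, seconds in averages.items():
    print(f"{name}: {seconds:.4f} s (mean of 6 runs)")
```

Running the workloads strictly in sequence keeps them from competing for the CPU, and averaging several runs smooths out one-off stalls.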

CPU Short Form Benchmarks
Three Dimensional Particle Movement (3DPM) v1: 3DPM is a self-penned benchmark, derived from my academic research years looking at particle movement parallelism. The code for this tool is rough, but it mirrors the real world in being written by a non-CompSci-trained coder for a scientific endeavor. The code is unoptimized, but the test uses OpenMP to move particles around a field using one of six 3D movement algorithms in turn, each of which is found in the academic literature. The test is run in both single-threaded and multithreaded modes, and uses purely floating point numbers. The code was written in Visual Studio 2008 in Release mode with all optimizations (including fast math and -Ox) enabled. We take the average of six runs in each instance.
3DPM v2: The second version of this benchmark is similar to the first; however, it has been re-written in VS2012 with one major difference: the code has been written to address the issue of false sharing. If data required by multiple threads, say four, sits in the same cache line, the threads cannot work on that line independently: each write by one thread invalidates the copies held by the other cores, so the four accesses are effectively serialized. The new software splits the data onto separate cache lines so that accesses can proceed in parallel and stalls are minimized.

As v2 is fairly new, we are still gathering data and results are currently limited.
WinRAR 5.01: WinRAR is compression software that reduces file size at the expense of CPU cycles. We use the version that has been a stable part of our benchmark database through 2015, and run the default settings on a 1.52GB directory containing over 2800 files representing a small website with around thirty half-minute videos. We take the average of several runs in this instance.
POV-Ray 3.7 beta: POV-Ray is a common ray-tracing tool used to generate realistic-looking scenes. We've used POV-Ray in its various guises over the years as a good benchmark for performance, as well as a tool on the march to ray-tracing limited immersive environments. We use the built-in multithreaded benchmark.
HandBrake: HandBrake is a freeware video conversion tool. We use the tool to process two different videos: first a 'low quality' two hour video at 640x388 resolution to x264, then a 'high quality' ten minute video at 4320x3840. The low quality video scales on lower performance hardware, whereas the buffers required for the high quality video can stretch even the biggest processors. At present, this is a CPU-only test.
7-Zip: 7-Zip is a freeware compression/decompression tool that is widely deployed across the world. We run the included benchmark tool using a 50MB library and take the average of a set of fixed-time results.
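The false sharing fix described for 3DPM v2 can be illustrated with a small layout calculation. This is only a sketch and not the benchmark's code: the 64-byte cache line, the 8-byte per-thread value, the assumption that the array starts on a cache line boundary, and the function names are all assumptions made for illustration.

```python
CACHE_LINE_BYTES = 64  # assumed line size; common on x86, but not stated in the text
VALUE_BYTES = 8        # assumed size of one double-precision accumulator


def cache_line_of(byte_offset: int) -> int:
    """Return the index of the cache line containing the given byte offset."""
    return byte_offset // CACHE_LINE_BYTES


def lines_touched(num_threads: int, stride_bytes: int) -> list:
    """Cache line used by each thread's accumulator when slots are stride_bytes apart."""
    return [cache_line_of(thread * stride_bytes) for thread in range(num_threads)]


# Packed layout: four 8-byte accumulators sit next to each other.
packed = lines_touched(num_threads=4, stride_bytes=VALUE_BYTES)

# Padded layout: each accumulator gets a full cache line to itself.
padded = lines_touched(num_threads=4, stride_bytes=CACHE_LINE_BYTES)

print("packed:", packed)  # all four threads share line 0
print("padded:", padded)  # each thread has its own line
```

In the packed layout every thread's data lands in the same line, which is the situation that causes false sharing. Padding costs some memory but gives each thread a line of its own.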

 

Web, Synthetic and Legacy

The web tests are our usual mix of Octane and Kraken, with WebXPRT included. Synthetic and Legacy CPU testing relates to our long-term data under Cinebench and x264.

Web and Synthetic Benchmarks
Google Octane 2.0: Lots of factors go into web development, including the tools used and the browser those tools play in. One of the common and widely used benchmarks to judge performance is Google Octane, now in version 2.0. To quote: 'The updated Octane 2.0 benchmark includes four new tests to measure new aspects of JavaScript performance, including garbage collection / compiler latency and asm.js-style JavaScript performance.'
Mozilla Kraken 1.1: Kraken is a similar tool to Google Octane, focusing on web tools and processing power. Kraken's tools include searching algorithms, audio processing, image filtering, flexible database parsing and cryptographic routines.
WebXPRT 2013/2015: WebXPRT aims to be a souped-up version of Octane and Kraken, using these tools in real time to display data in photograph enhancement, sorting, stock options, local storage manipulation, graphical interfaces and even filtering algorithms on scientific datasets. We run the 2013 and 2015 versions of the benchmark.
Cinebench: Cinebench is a widely known benchmarking tool for measuring performance relative to MAXON's animation software Cinema 4D. Cinebench has been optimized over a decade and focuses purely on CPU horsepower, meaning if there is a discrepancy in pure throughput characteristics, Cinebench is likely to show that discrepancy. Arguably other software doesn't make use of all the tools available, so the real world relevance might be purely academic, but given our large database of data for Cinebench it seems difficult to ignore a small five minute test. We run the modern version 15 in this test, as well as the older 11.5 due to our back data.
x264 HD 3.0: Similarly, the x264 HD 3.0 package we use here is kept for historic regression data. The latest version is 5.0.1, and encodes a 1080p video clip into a high quality x264 file. Version 3.0 only performs the same test on a 720p file, and in most circumstances the software performance hits its limit on high end processors, but it still works well for mainstream and low-end parts. Also, this version only takes a few minutes, whereas the latest can take over 90 minutes to run.
TrueCrypt 7.1: Before its discontinuation, TrueCrypt was a popular tool for Windows XP to offer software encryption to a file system. The near-latest version, 7.1, is still widely used; however, the developers stopped supporting it in 5/2014, citing the encrypted disk support built into Windows 8/7/Vista, and as such any new security issues are unfixed.

 

Long Form and Professional Benchmarks

For reviews that require a little more depth, we invoke our long form CPU tests. These scripts include the short form tests, the web tests, and bundle some real-world tests that are influenced by processor frequency, core count, cache sizes and memory support. Generational advances between CPU microarchitectures show up here as well. Our professional tests involve a 2D to 3D image and model conversion tool used by archivists and modelers as well as Linux Bench. We are currently looking into expanding our professional testing suite to include code compilation as well as FPGA workflows.

Long Form and Professional Benchmarks
Dolphin 4.0 (Wii Emulation): Many emulators are bound by single-thread CPU performance, and general reports have suggested that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that raytraces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy of the speed of Dolphin CPU emulation, which is an intensive single core task using most aspects of a CPU.
Agisoft Photoscan 1.0: Photoscan is professional software that takes a series of 2D images (as few as 50, usually 250+) and 'performs calculations' to determine where the pictures were taken and whether it can create a 3D model and textures of what the images show. This model can then be exported to other software for touch-ups or implementation in physics engines/games or, in the case of the reader who directed me to it, national archiving. The tool has four phases, one of which can be OpenCL accelerated, while the other three are a mix of single thread and variable thread workloads.
Linux Bench: Linux Bench is a collection of Linux based benchmarks compiled together by ServeTheHome. The idea for this is to have some non-Windows based tools that are easy enough to run with a USB key, an internet connection and three lines of code in a terminal. The tests in Linux Bench include standard synthetic compute, compression, matrix manipulation, database tools and key-value storage.

 

Gaming 

Our Gaming test suite is still our 2015 implementation, which remains fairly solid over gaming title updates. We are still working on a 2016 suite update, with a move to Windows 10. This will allow most of the titles to be replaced with DirectX 12, indie and eSports games.

Gaming Benchmarks
Alien: Isolation: If first person survival mixed with horror is your sort of thing, then Alien: Isolation, based on the Alien franchise, should be an interesting title. Developed by The Creative Assembly and released in October 2014, Alien: Isolation has won numerous awards from Game Of The Year to several top 10s/25s and Best Horror titles, racking up over a million sales by February 2015. Alien: Isolation uses a custom built engine which includes dynamic sound effects and should be fully multi-core enabled.
Total War: Attila: The Total War franchise moves on to Attila, another development from The Creative Assembly, and is a stand-alone strategy title set in 395 AD where the main story line lets the gamer take control of the leader of the Huns in order to conquer parts of the world. Graphically the game can render hundreds or thousands of units on screen at once, all with their individual actions, and can put some of the big cards to task.
Grand Theft Auto V: The highly anticipated iteration of the Grand Theft Auto franchise finally hit the shelves on April 14th 2015, with both AMD and NVIDIA in tow to help optimize the title. GTA doesn’t provide graphical presets, but opens up the options to users and extends the boundaries by pushing even the hardest systems to the limit using Rockstar’s Advanced Game Engine. Whether the user is flying high in the mountains with long draw distances or dealing with assorted trash in the city, when cranked up to maximum it creates stunning visuals but hard work for both the CPU and the GPU.
GRID: Autosport: No graphics tests are complete without some input from Codemasters and the EGO engine, which means for this round of testing we point towards GRID: Autosport, the next iteration in the GRID and racing genre. As with our previous racing testing, each update to the engine aims to add in effects, reflections, detail and realism, with Codemasters making ‘authenticity’ a main focal point for this version.
Middle-Earth: Shadow of Mordor: The final title in our testing is another battle of system performance with the open world action-adventure title, Shadow of Mordor. Produced by Monolith using the LithTech Jupiter EX engine and numerous detail add-ons, SoM goes for detail and complexity to a large extent, despite having to be cut down from the original plans. The main story itself was written by the same writer as Red Dead Redemption, and it received Zero Punctuation’s Game of The Year in 2014.

94 Comments


  • tipoo - Monday, August 8, 2016

    Looks like even a Skylake i3 may be able to retire the venerable 2400/2500K, higher frame rates and better frame times at that. However a native quad does prevent larger dips.
  • Kevin G - Monday, August 8, 2016

    I have a feeling much of that is due to the higher base clock on the Skylake i3 vs. the i5 2500K. Skylake's IPC improvements also help boost performance here too.

    The real challenge is if the i3 6320 can best the i5 2500K at the same 3.9 GHz base clock speed. Sandy Bridge was a good overclocker so hitting those figures shouldn't be difficult at all.
  • tipoo - Monday, August 8, 2016

    That's true, overclocked the difference would diminish. But you also get modernities like high clocked DDR4 in the switchover.

    At any rate, funny that a dual core i3 can now fluidly run just about everything, its two cores are probably faster than the 8 in the current consoles.
  • Lolimaster - Monday, August 8, 2016

    Benchmarks don't tell you about the hiccups when playing with a dual core. Especially with things like Crysis 3 or even worse Rise of the Tomb Raider where you get like half the fps just by using a dual core vs a cheapo Athlon 860K.
  • gamerk2 - Monday, August 8, 2016

    That's why Frame Times are also measured, which catches those hitches.
  • Samus - Tuesday, August 9, 2016

    I had a lot of issues with my Sandy Bridge i3-2125 in Battlefield 3 circa 2011 with lag and poor minimum frame rates.

    After long discussions on the forums, it was determined disabling hyper threading actually improved frame rate consistency. So at least in the Sandy Bridge IPC, and probably dating back to Nehalem or even Prescott, Jackson Technology or whatever you want to call it, has a habit of stalling the pipeline if there are too many cache misses to complete the instruction. Obviously more cache resolves this, so the issue isn't as prominent on the i7's, and it would certainly explain why the 4MB i3's are more consistent performers than the 3MB variety.

    Of course the only way to prove if hyper threading is causing performance inconsistency is to disable it. It'd be a damn unique investigation for Anandtech to look at how IPC improvements have affected hyper-threading performance over the years, perhaps even dating back to the P4.
  • AndrewJacksonZA - Wednesday, August 10, 2016

    HOW ON EARTH DID I MISS THIS?!?!

    Thank you for introducing me to Intel's tech known as "Jackson!" This is now *SO* on my "To Buy" list!

    Thank you Samus! :-D
  • bug77 - Monday, August 8, 2016

    Neah, I went i5-2500k -> i5-6600k and there's no noticeable difference. The best part of the upgrade was those new I/O ports on the new motherboard, but it's a sad day when you upgrade after 4 years and the most you have to show is your new M.2 or USB 3.1 ports (and USB 3.1 is only added through a 3rd party chip).
    Sure, if I bench it, the new i5 is faster, but since the old i5 wasn't exactly slow, I can't say that I see a significant improvement.

    Now, if you mean that instead of getting an i5-2500k one can now look at a Skylake i3, I'm not going to argue with you there. Though (money permitting) the boost speed might be nice to have anyway.
  • Cellar Door - Monday, August 8, 2016

    This is a poorly educated comment:

    a) Your perceived speed might be limited by your storage
    b) You don't utilize your cpu's multitasking abilities fully(all cores)
  • Duckeenie - Monday, August 8, 2016

    Why did you continue to post your comment if you believed you were making poorly educated points?
