The 2017 Benchmark Suite

For our review, we are implementing our fresh CPU testing benchmark suite, using new scripts developed specifically for this testing. This means that with a fresh OS install, we can configure the OS to be more consistent, install the new benchmarks, maintain version consistency without random updates and start running the tests in under 5 minutes. After that it's a one button press to start an 8-10hr test (with a high-performance core) with nearly 100 relevant data points in the benchmarks given below for CPUs, followed by our CPU gaming tests which run for 4-5 hours for each of the GPUs used. The CPU tests cover a wide range of segments, some of which will be familiar but some of the tests are new to benchmarking in general, but still highly relevant for the markets they come from.

Our new CPU tests go through six main areas. We cover the Web (we've got an un-updateable version of Chrome 56), general system tests (opening tricky PDFs, emulation, brain simulation, AI, 2D image to 3D model conversion), rendering (ray tracing, modeling), encoding (compression, AES, h264 and HEVC), office based tests (PCMark and others), and our legacy tests, throwbacks from another generation of bad code but interesting to compare.

All of our benchmark results can also be found in our benchmark engine, Bench.

A side note on OS preparation. As we're using Windows 10, there's a large opportunity for something to come in and disrupt our testing. So our default strategy is multiple: disable the ability to update as much as possible, disable Windows Defender, uninstall OneDrive, disable Cortana as much as possible, implement the high performance mode in the power options, and disable the internal platform clock which can drift away from being accurate if the base frequency drifts (and thus the timing ends up inaccurate).

Web Tests on Chrome 56

Sunspider 1.0.2
Mozilla Kraken 1.1
Google Octane 2.0
WebXPRT15

System Tests

PDF Opening
FCAT
3DPM v2.1
Dolphin v5.0
DigiCortex v1.20
Agisoft PhotoScan v1.0

Rendering Tests

Corona 1.3
Blender 2.78
LuxMark v3.1 CPU C++
LuxMark v3.1 CPU OpenCL
POV-Ray 3.7.1b4
Cinebench R15 ST
Cinebench R15 MT

Encoding Tests

7-Zip 9.2
WinRAR 5.40
AES Encoding (TrueCrypt 7.2)
HandBrake v1.0.2 x264 LQ
HandBrake v1.0.2 x264-HQ
HandBrake v1.0.2 HEVC-4K

Office / Professional

PCMark8
Chromium Compile (v56)
SYSmark 2014 SE

Legacy Tests

3DPM v1 ST / MT
x264 HD 3 Pass 1, Pass 2
Cinebench R11.5 ST / MT
Cinebench R10 ST / MT

CPU Gaming Tests

For our new set of GPU tests, we wanted to think big. There are a lot of users in the ecosystem that prioritize gaming above all else, especially when it comes to choosing the correct CPU. If there's a chance to save $50 and get a better graphics card for no loss in performance, then this is the route that gamers would prefer to tread. The angle here though is tough - lots of games have different requirements and cause different stresses on a system, with various graphics cards having different reactions to the code flow of a game. Then users also have different resolutions and different perceptions of what feels 'normal'. This all amounts to more degrees of freedom than we could hope to test in a lifetime, only for the data to become irrelevant in a few months when a new game or new GPU comes into the mix. Just for good measure, let us add in DirectX 12 titles that make it easier to use more CPU cores in a game to enhance fidelity.

Our original list of nine games planned in February quickly became six, due to the lack of professional-grade controls on Ubisoft titles. If you want to see For Honor, Steep or Ghost Recon: Wildlands benchmarked on AnandTech, please point Ubisoft Annecy or Ubisoft Montreal in my direction. While these games have in-game benchmarks worth using, unfortunately they do not provide enough frame-by-frame detail to the end user, despite using it internally to produce the data the user eventually sees (and it typically ends up obfuscated by another layer as well). I would instead perhaps choose to automate these benchmarks via inputs, however the extremely variable loading time is a strong barrier to this.

So we have the following benchmarks as part of our 4/2 script, automated to the point of a one-button run and out pops the results four hours later, per GPU. Also listed are the resolutions and settings used.

  • Civilization 6 (1080p Ultra, 4K Ultra)
  • Ashes of the Singularity: Escalation* (1080p Extreme, 4K Extreme)
  • Shadow of Mordor (1080p Ultra, 4K Ultra)
  • Rise of the Tomb Raider #1 - GeoValley (1080p High, 4K Medium)
  • Rise of the Tomb Raider #2 - Prophets (1080p High, 4K Medium)
  • Rise of the Tomb Raider #3 - Mountain (1080p High, 4K Medium)
  • Rocket League (1080p Ultra, 4K Ultra)
  • Grand Theft Auto V (1080p Very High, 4K High)

For each of the GPUs in our testing, these games (at each resolution/setting combination) are run four times each, with outliers discarded. Average frame rates, 99th percentiles and 'Time Under x FPS' data is sorted, and the raw data is archived.

The four GPUs we've managed to obtain for these tests are:

  • MSI GTX 1080 Gaming X 8G
  • ASUS GTX 1060 Strix 6G
  • Sapphire Nitro R9 Fury 4GB
  • Sapphire Nitro RX 480 8GB

In our testing script, we save a couple of special things for the GTX 1080 here. The following tests are also added:

  • Civilization 6 (8K Ultra, 16K Lowest)

This benchmark, with a little coercion, are able to be run beyond the specifications of the monitor being used, allowing for 'future' testing of GPUs at 8K and 16K with some amusing results. We are only running these tests on the GTX 1080, because there's no point watching a slideshow more than once.

*As an additional to this review, we do not have any CPU gaming data on Skylake-X. We ran a set of tests before Threadripper arrived, but now having had a chance to analyze the data, despite being on the latest BIOS and setup, there are still issues with performance that we need to nail down once this review is out of the way.

Test Bed and Setup Benchmarking Performance: CPU System Tests
Comments Locked

347 Comments

View All Comments

  • lefty2 - Thursday, August 10, 2017 - link

    except that they haven't
  • Dr. Swag - Thursday, August 10, 2017 - link

    How so? You have the performance numbers, and they gave you power draw numbers...
  • bongey - Thursday, August 10, 2017 - link

    Just do a avx512 benchmark and Intel will jump over 300watts , 400watts(overclocked) only from the cpu. (prime95 avx512 benchmark).See der8auer's video "The X299 VRM Disaster (en)"
  • DanNeely - Thursday, August 10, 2017 - link

    The Chromium build time results are interesting. Anandtech's results have the 1950X only getting 3/4ths of the 7900X's performance. Arstechnica's getting almost equal results on both CPUs, but at 16 compiles per day vs 24 or 32 is seeing significantly worse numbers all around.

    I'm wondering what's different between the two compile benchmarks to see such a large spread.
  • cknobman - Thursday, August 10, 2017 - link

    I think it has a lot to do with the RAM used by Anandtech vs Arstechnica .
    For all the regular benchmarking Anand used DDR4 2400, only the DDR 3200 was used in some overcloking.
    Arstechnica used DDR4 3200 for all benchmarking.
    Everyone already knows how faster DDR4 memory helps the Zen architecture.
  • DanNeely - Thursday, August 10, 2017 - link

    If ram was the determining factor, Ars should be seeing faster build times though not slower ones.
  • carewolf - Thursday, August 10, 2017 - link

    Anandtech must have misconfigured something. Building chromium is scales practically linearly. You can move jobs all the way across a slow network and compile on another machine and you still get linear speed-ups with more added cores.
  • Ian Cutress - Thursday, August 10, 2017 - link

    We're using a late March v56 code base with MSVC.
    Ars is using a newer v62 code base with clang-cl and VC++ linking

    We locked in our versions when we started testing Windows 10 a few months ago.
  • supdawgwtfd - Friday, August 11, 2017 - link

    Maybe drop it then as it is not at all usefull info.
  • Johan Steyn - Thursday, August 10, 2017 - link

    I refrained from posting on the previous article, but now I'm quite sure Anand is being paid by Intel. It is not that I argue against the benchmarks, but how it is presented. I was even under the impression that this was an Intel review.

    The previous article was stated as "Introducing Intel's Desktop Processor" Huge marketing research is done on how to market products. By just stating one thing first or in a different way, quite different messages can be conveyed without lying outright.

    By making the "Most Powerful, Most Scalable" Bold, that is what the readers read first, then they read "Desktop Processor" without even reading that is is Intel's. This is how marketing works, so Anand used slanted journalism to favour Intel, yet most people will just not realise it eat it up.

    In this review there are so many slanted journalism problems, it is just sad. If you want, just compare it to other sites reviews. They just omit certain tests and list others at which Intel excel.

    I have lost my respect for Anandtech with these last two articles of them, and I have followed Anandtech since its inception. Sad to see that you are also now bought by Intel, even though I suspected this before. Congratulations for making this so clear!!!

Log in

Don't have an account? Sign up now