CPU Performance: System Tests

Our System Test section focuses on real-world testing and user experience, with a slight nod to throughput. In this section we cover application loading time, image processing, simple scientific physics, emulation, neural simulation, optimized compute, and 3D model development, using a combination of readily available and custom software. Some of these areas are also covered by the bigger suites such as PCMark (we publish those values in our office section), although multiple perspectives are always beneficial. In all our tests we will explain in depth what is being tested, and how we are testing it.

All of our benchmark results can also be found in our benchmark engine, Bench.

For our graphs, some of them have two values: a regular value in orange, and one in red called 'Intel Spec'. ASUS offers the option to 'open up' the power and current limits of the chip, so the CPU still runs at the same frequency but is not throttled by those limits. Although Intel says it recommends the 'Intel Spec' settings, the system it sent us to test was actually set up with the power limits opened up, and the results Intel provided for us to compare against internally also correlated with that setting. As a result, we're providing both sets of results for our CPU tests.

Application Load: GIMP 2.10.4

One of the most important aspects of user experience and workflow is how fast a system responds. A good test of this is to see how long it takes for an application to load. Most applications these days, when stored on an SSD, load almost instantly, however some office tools require asset pre-loading before becoming usable. Most operating systems employ caching as well, so when certain software is loaded repeatedly (web browser, office tools), it can be initialized much more quickly.

In our last suite, we tested how long it took to load a large PDF in Adobe Acrobat. Unfortunately this test was a nightmare to program for, and didn't transfer over to Win10 RS3 easily. In the meantime we discovered an application that can automate this test, and we put it up against GIMP, a popular free and open-source photo editing tool, and the major alternative to Adobe Photoshop. We set it to load a large 50MB design template, and perform the load 10 times with 10 seconds between each attempt. Because the first 3-5 results are often slower than the rest due to caching, and the time to cache can be inconsistent, we take the average of the last five results to show CPU processing on cached loading.
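The gist of the method can be sketched with a small harness. This is an illustration of the procedure, not the AppTimer tool itself; the command line is a placeholder, and for simplicity it waits for process exit where AppTimer actually measures time-to-window:

```python
import statistics
import subprocess
import time

def timed_launch(cmd):
    """Launch a process and return wall-clock seconds until it exits.

    Simplification: AppTimer measures time until the application window
    appears, not until the process terminates.
    """
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

def cached_load_average(times, keep_last=5):
    """Average only the last few runs, once OS caching has warmed up."""
    return statistics.mean(times[-keep_last:])

if __name__ == "__main__":
    cmd = ["gimp-2.10", "large_template.xcf"]  # placeholder command line
    results = []
    for _ in range(10):           # 10 loads...
        results.append(timed_launch(cmd))
        time.sleep(10)            # ...with 10 seconds between each
    print(f"Cached load average: {cached_load_average(results):.2f} s")
```

The averaging deliberately discards the early, cold-cache runs so the number reflects CPU processing on cached loading rather than storage speed.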

AppTimer: GIMP 2.10.4


FCAT: Image Processing

The FCAT software was developed to help detect microstuttering, dropped frames, and runt frames in graphics benchmarks when two accelerators were paired together to render a scene. Due to game engines and graphics drivers, not all GPU combinations performed ideally, which led to this software affixing a color to each rendered frame and recording the raw output with a video capture device.

The FCAT software takes that recorded video, which in our case is 90 seconds of a 1440p run of Rise of the Tomb Raider, and processes that color data into frame time data so the system can plot an ‘observed’ frame rate, and correlate that to the power consumption of the accelerators. This test, by virtue of how quickly it was put together, is single threaded. We run the process and report the time to completion.
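The core of that processing step can be illustrated in a few lines (a simplified sketch, not NVIDIA's actual FCAT code): the capture device runs at a fixed rate, each change in the per-frame overlay color marks a new rendered frame, so run lengths of identical colors convert directly into frame times:

```python
from itertools import groupby

CAPTURE_FPS = 60  # assumed fixed capture rate for the recorded video

def frame_times(colors, capture_fps=CAPTURE_FPS):
    """Convert the sequence of per-captured-frame overlay colors into
    frame times (ms). Each run of identical colors is one rendered frame
    held on screen for that many capture intervals."""
    return [len(list(run)) * 1000.0 / capture_fps for _, run in groupby(colors)]

def observed_fps(colors, capture_fps=CAPTURE_FPS):
    """'Observed' frame rate: rendered frames divided by elapsed time."""
    times = frame_times(colors, capture_fps)
    return len(times) / (sum(times) / 1000.0)
```

Walking a 90-second 1440p capture frame by frame like this is inherently serial, which is why the real tool ends up single threaded.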

FCAT Processing ROTR 1440p GTX980Ti Data


3D Particle Movement v2.1: Brownian Motion

Our 3DPM test is a custom-built benchmark designed to simulate particle movement in a 3D space using six different algorithms. The algorithms were developed as part of my PhD, and while they ultimately perform best on a GPU, they provide a good idea of how instruction streams are interpreted by different microarchitectures.

A key part of the algorithms is the random number generation – we use relatively fast generation, which ends up creating dependency chains in the code. The upgrade over the naïve first version of this code solved for false sharing in the caches, a major bottleneck. We are also looking at AVX2 and AVX-512 versions of this benchmark for future reviews.

For this test, we run a stock particle set over the six algorithms for 20 seconds apiece, with 10 second pauses, and report the total rate of particle movement, in millions of operations (movements) per second. We have a non-AVX version and an AVX version, with the latter implementing AVX512 and AVX2 where possible.

3DPM v2.1 can be downloaded from our server: 3DPMv2.1.rar (13.0 MB)

3D Particle Movement v2.1


3D Particle Movement v2.1 (with AVX)


Dolphin 5.0: Console Emulation

One of the most frequently requested tests for our suite relates to console emulation. Being able to pick up a game from an older system and run it as expected depends on the overhead of the emulator: it takes a significantly more powerful x86 system to accurately emulate an older non-x86 console, especially if code for that console was written to exploit certain physical quirks in the hardware.

For our test, we use the popular Dolphin emulation software, and run a compute project through it to determine how close our processors can get to the performance of a standard console system. In this test, a native Nintendo Wii would take around 1050 seconds.

The latest version of Dolphin can be downloaded from https://dolphin-emu.org/

Dolphin 5.0 Render Test


DigiCortex 1.20: Sea Slug Brain Simulation

This benchmark was originally designed for simulation and visualization of neuron and synapse activity, as is commonly found in the brain. The software comes with a variety of benchmark modes, and we take the small benchmark which runs a 32k neuron / 1.8B synapse simulation, equivalent to a Sea Slug.

Example of a 2.1B neuron simulation

We report the results as the ability to simulate the data as a fraction of real-time, so anything above a ‘one’ is suitable for real-time work. Out of the two modes, a ‘non-firing’ mode which is DRAM heavy and a ‘firing’ mode which has CPU work, we choose the latter. Despite this, the benchmark is still affected by DRAM speed a fair amount.
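The reported number is simply simulated time divided by wall-clock time, so anything at or above 1.0 keeps pace with real time. For clarity (our arithmetic, not DigiCortex code):

```python
def realtime_fraction(simulated_seconds, wall_seconds):
    """DigiCortex-style score: >= 1.0 means the simulation runs
    at or faster than real time."""
    return simulated_seconds / wall_seconds

# e.g. simulating 10 s of neuron activity in 25 s of wall time
# scores 0.4, i.e. the system runs at 0.4x real time.
```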

DigiCortex can be downloaded from http://www.digicortex.net/

DigiCortex 1.20 (32k Neuron, 1.8B Synapse)


y-Cruncher v0.7.6: Microarchitecture Optimized Compute

I’ve known about y-Cruncher for a while, as a tool to help compute various mathematical constants, but it wasn’t until I began talking with its developer, Alex Yee, a researcher from NWU and now a software optimization developer, that I realized just how heavily he has optimized the software for performance. Naturally, any computation that can take 20+ days benefits from even a 1% performance increase! Alex started y-cruncher as a high-school project, but it is now at a state where Alex keeps it up to date to take advantage of the latest instruction sets before they are even made available in hardware.

For our test we run y-cruncher v0.7.6 through all the different optimized variants of the binary, both single-threaded and multi-threaded, including the AVX-512 optimized binaries. The test is to calculate 250 million digits of Pi.
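Automating this amounts to timing each binary variant in turn. A minimal harness might look like the following; note the binary names and arguments below are placeholders for illustration, not y-cruncher's actual command line:

```python
import subprocess
import time

def time_command(cmd):
    """Run a command to completion and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

if __name__ == "__main__":
    # Placeholder names: the real package ships one binary tuned per
    # instruction-set target (e.g. SSE4, AVX2, AVX-512).
    variants = ["y-cruncher-sse4", "y-cruncher-avx2", "y-cruncher-avx512"]
    for binary in variants:
        for label in ("1T", "MT"):  # single- and multi-threaded runs
            # Arguments are illustrative only.
            seconds = time_command([binary, "pi-250m", label])
            print(f"{binary} {label}: {seconds:.1f} s")
```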

Users can download y-cruncher from Alex’s website: http://www.numberworld.org/y-cruncher/

y-Cruncher 0.7.6 Single Thread, 250m Digits

y-Cruncher 0.7.6 Multi-Thread, 250m Digits


Agisoft Photoscan 1.3.3: 2D Image to 3D Model Conversion

One of the ISVs that we have worked with for a number of years is Agisoft, who develop software called PhotoScan that transforms a set of 2D images into a 3D model. This is an important tool in model development and archiving, and it relies on a number of single-threaded and multi-threaded algorithms to carry the data from one end of the computation to the other.

In our test, we take v1.3.3 of the software with a good-sized data set of 84 x 18 megapixel photos and push it through a reasonably fast variant of the algorithms, one that is still more stringent than our 2017 test. We report the total time to complete the process.
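Because the pipeline mixes single-threaded and multi-threaded stages, the total time scales in an Amdahl-like way: extra cores only shrink the parallel stages, while the serial stages put a floor under the result. A rough model (our illustration, not Agisoft's code, assuming ideal scaling of the parallel stages):

```python
def pipeline_time(serial_stage_seconds, parallel_stage_seconds, cores):
    """Estimate total runtime of a mixed pipeline: serial stages are
    fixed cost, parallel stages divide (ideally) across cores."""
    return (sum(serial_stage_seconds)
            + sum(t / cores for t in parallel_stage_seconds))

# With 180 s of serial work and 1200 s of single-core parallel work,
# doubling from 4 to 8 cores cuts the total from 480 s to only 330 s,
# not in half -- the serial stages dominate at higher core counts.
```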

Agisoft’s Photoscan website can be found here: http://www.agisoft.com/

Agisoft Photoscan 1.3.3, Complex Test



  • Kevin G - Wednesday, January 30, 2019 - link

    For $3000 USD, a 28 core unlocked Xeon chip isn't terribly bad. The real issue is its incredibly low volume nature and that in effect only two motherboards are going to be supporting it. LGA 3647 is a widespread platform but the high 255W TDP keeps it isolated.

    Oddly I think Intel would have had better success if they also simultaneously launched an unlocked 18 core part with even higher base/turbo clocks. This would have threaded the needle better in terms of per thread performance and overall throughput. The six channel memory configuration would have assisted in performance to distinguish itself from the highend Core i9 Extreme chips.

    The other aspect is that there is no clear upgrade path from the current chips: pretty much one chip to board ratio for the life time of the product. There is a lot on the Xeon side Intel has planned like on-package FPGAs, Omnipath fabric and Nervana accelerators which could stretch their wings with a 255 W TDP. The Xeon Gold 6138P is an example of this as it comes with an Arria 10 FPGA inside but a slightly reduced clock 6138 die as well at a 195 W TDP. At 255 W, that chip wouldn't have needed to compromise the CPU side. For the niche market Intel is targeting, an FPGA solution would be interesting if they pushed ideas like OpenCL and DirectCompute to run on the FPGA alongside the CPU. Doing something really bold like accelerating PhysX on the FPGA would have been an interesting demo of what that technology could do. Or leverage the FPGA for DSP audio effects in a full 3D environment. That'd give something for these users to look forward to.

    Well there is the opportunity to put other LGA 3647 parts into these boards, but starting off with a 28 core unlocked chip means that other offerings are a downgrade. With luck, Ice Lake-SP would be an upgrade but Intel hasn't committed to it on LGA 3647.

    Ultimately this looks like AMD's old 4x4/QuadFX efforts that'll be quickly forgotten by history.

    Speaking of AMD, Intel missing the launch window by a few months places it closer to the imminent launch of new Threadripper designs leveraging Zen 2 and AMD's chiplet strategy. I wouldn't expect AMD to go beyond 32 cores for Threadripper but the common IO die should improve performance overall on top of the Zen 2 improvements. Intel has some serious competition coming.
  • twtech - Wednesday, January 30, 2019 - link

    Nobody really upgrades workstation CPUs, but it sounds like getting a replacement in the event of failure could be difficult if the stock will be so limited.

    If Dell and HP started offering this chip in their workstation lineup - which I don't expect to happen given the low-volume CPU production and needing a custom motherboard - then I think it would have been a popular product.
  • DanNeely - Wednesday, January 30, 2019 - link

    Providing the replacement part (and thus holding back enough stock to do so) is on Dell/HP/etc via the support contract. By the time it runs out in a few years the people who buy this sort of prebuilt system will be upgrading to something newer and much faster anyway.
  • MattZN - Wednesday, January 30, 2019 - link

    I have to disagree re: upgrades. Intel has kinda programmed consumers into believing that they have to buy a whole new machine whenever they upgrade. In the old old days we actually did have to upgrade in order to get better monitor resolutions because the busses kept changing.

    But in modern times that just isn't the case any more. For Intel, it turned into an excuse to get people to pay more money. We saw it in spades with offerings last year where Intel forced people into a new socket for no reason (a number of people were actually able to get the cpu to work in the old socket with some minor hackery). I don't recall the particular CPU but it was all over the review channels.

    This has NOT been the case for Intel's commercial offerings. The Xeons traditionally have had a whole range of socket-compatible upgrade options. It's Intel's shtick 'Scalable Xeon CPUs' for the commercial space. I've upgraded several 2S Intel Xeon systems by buying CPUs on E-Bay... it's an easy way to double performance on the cheap and businesses will definitely do it if they care about their cash burn.

    AMD has thrown cold water on this revenue source on the consumer side. I think consumers are finally realizing just how much money Intel has been squeezing out of them over the last decade and are kinda getting tired of it. People are happily buying new AMD CPUs to upgrade their existing rigs.

    I expect that Intel will have to follow suit. Intel traditionally wanted consumers to buy whole new computers but now that CPUs offer only incremental upgrades over prior models consumers have instead just been sticking with their old box across several CPU cycles before buying a new one. If Intel wants to sell more CPUs in this new reality, they will have to offer upgradability just like AMD is. I have already upgraded two of my AM4 boxes twice just by buying a new CPU and I will probably do so again when Zen 2 comes out. If I had had to replace the entire machine it would be a non-starter. But since I only need to do a BIOS update and buy a new CPU... I'll happily pay AMD for the CPU.

    Intel's W-3175X is supposed to compete against Threadripper, but while it supposedly supports ECC, I do not personally believe the socket has any longevity, and it is a complete waste of money and time to buy into it versus buying into Threadripper's far more stable socket and far saner thermals. Intel took a Xeon design that is meant to run closer to the maximally efficient performance/power point on the curve and tried to turn it into a prosumer or small-business competitor to Threadripper by removing OC limits and running it hot, on an unstable socket. No thanks.

    -Matt
  • Kevin G - Thursday, January 31, 2019 - link

    I would disagree with this. Workstations around here are being retrofitted with old server hand-me-downs from the data center as that equipment is quietly retired. Old workstations make surprisingly good developer boxes, especially considering that the cost is just moving parts from one side of the company to the other.

    Though you do have a point that the major OEMs themselves are not offering upgrades.
  • drexnx - Wednesday, January 30, 2019 - link

    wow, I thought (and I think many people did) that this was just a vanity product, limited release, ~$10k price, totally a "just because we're chipzilla and we can" type of thing

    looks like they're somewhat serious with that $3k price
  • MattZN - Wednesday, January 30, 2019 - link

    The word 'nonsensical' comes to mind. But setting aside the absurdity of pumping 500W into a socket and trying to pass it off as a usable workstation for anyone, I have to ask Anandtech ... did you run with the scheduler fixes necessary to get reasonable results out of the 2990WX in the comparisons? Because it kinda looks like you didn't.

    The Windows scheduler is pretty seriously broken when it comes to both the TR and EPYCs and I don't think Microsoft has pushed fixes for it yet. That's probably what is responsible for some of the weird results. In fact, your own article referenced Wendel's work here:

    https://www.anandtech.com/show/13853/amd-comments-...

    That said, of course I would still expect this insane monster of Intel's to put up better results. It's just that... it is impractical and hazardous to actually configure a machine this way and expect it to have any sort of reasonable service life.

    And why would Anandtech run any game benchmarks at all? This is a 28-core Xeon... actually, it's two 14-core Xeons haphazardly pasted together (but that's another discussion). Nobody in their right mind is going to waste it by playing games that would run just as well on a 6-core cpu.

    I don't actually think Intel has any intention of actually selling very many of these things. This sort of configuration is impractical with 14nm and nobody in their right mind would buy it with AMD coming out with 10nm high performance parts in 5 months (and Intel probably a bit later this year). Intel has no business putting a $3000 price tag on this monster.

    -Matt
  • eddman - Thursday, January 31, 2019 - link

    "it's two 14-core Xeons haphazardly pasted together"

    Where did you get that info? Last time I checked each xeon scalable chip, be it LCC, HCC or XCC, is a monolithic die. There is no pasting together.
  • eddman - Thursday, January 31, 2019 - link

    Didn't you read the article? It's right there: "Now, with the W-3175X, Intel is bringing that XCC design into the hands of enthusiasts and prosumers."

    Also, der8auer delidded it and confirmed it's an XCC die. https://youtu.be/aD9B-uu8At8?t=624
  • mr_tawan - Wednesday, January 30, 2019 - link

    I'm surprised you put the Duron 900 on the image. That makes me expect test results from that CPU too!!
