CPU Rendering Tests

Rendering tests are a long-time favorite of reviewers and benchmarkers, as the code used by rendering packages is usually highly optimized to squeeze out every last bit of performance. Rendering programs can also end up being heavily memory dependent: with that many threads in flight and a large amount of data, low-latency memory can be key. Here we take a few of the usual rendering packages under Windows 10, as well as a few new interesting benchmarks.

All of our benchmark results can also be found in our benchmark engine, Bench.

Corona 1.3: link

Corona is a standalone package designed to assist software like 3ds Max and Maya with photorealism via ray tracing. It's simple - shoot rays, get pixels. OK, it's more complicated than that, but the benchmark renders a fixed scene six times and offers results in terms of time and rays per second. The official benchmark tables list user-submitted results in terms of time; however, I feel rays per second is a better metric (in general, scores where higher is better seem to be easier to explain anyway). Corona likes to pile on the threads, so the results end up being very staggered based on thread count.
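To make the time-versus-throughput distinction concrete, here is a minimal Python sketch of how a rays-per-second score relates to a render time. The function name and the ray count are hypothetical, chosen only for illustration; they are not taken from Corona or from our results.

```python
def rays_per_second(total_rays: int, elapsed_seconds: float) -> float:
    """Convert a render time into a throughput score (higher is better)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_rays / elapsed_seconds


# Hypothetical workload size, for illustration only (not a measured value).
TOTAL_RAYS = 480_000_000

# Two hypothetical systems rendering the same fixed scene.
fast = rays_per_second(TOTAL_RAYS, 100.0)  # finishes in 100 seconds
slow = rays_per_second(TOTAL_RAYS, 200.0)  # finishes in 200 seconds

# Halving the time doubles the score, so "bigger bar wins" holds on a chart.
print(f"fast: {fast:,.0f} rays/s, slow: {slow:,.0f} rays/s")
```

Because the scene is fixed, the two metrics carry the same information; the throughput form simply turns a shorter time into a larger number.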

Rendering: Corona Photorealism

Corona loves threads. Game Mode falls behind the 1800X due to its lower frequency.

Blender 2.78: link

For a renderer that has been around for what seems like ages, Blender is still a highly popular tool. We managed to wrap up a standard workload into the February 5 nightly build of Blender and measure the time it takes to render the first frame of the scene. As Blender is one of the bigger open source tools out there, both AMD and Intel work actively to help improve the codebase, for better or for worse on their own/each other's microarchitecture.

Rendering: Blender 2.78

Blender loves threads.

LuxMark v3.1: link

As a synthetic, LuxMark might come across as somewhat arbitrary as a renderer, given that it's mainly used to test GPUs, but it does offer both an OpenCL and a standard C++ mode. In this instance, aside from seeing the comparison in each coding mode for cores and IPC, we also get to see the difference in performance moving from a C++ based code-stack to an OpenCL one with a CPU as the main host.

Rendering: LuxMark CPU C++

Rendering: LuxMark CPU OpenCL

Like Blender, LuxMark is all about the thread count. Ray tracing is very nearly a textbook case for easy multi-threaded scaling, although a couple of things pop up in the OpenCL version. Aside from the scores being lower, the jump from the 1920X to the 1950X isn't that large, and the quad-channel DRAM of the 1950X in Game Mode puts it over the 1800X.
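As a rough illustration of why thread count dominates these results, the sketch below applies Amdahl's law, a standard scaling model rather than anything LuxMark itself reports. The 99% parallel fraction is an assumed figure for illustration only; the thread counts are 16 for the 1800X, 24 for the 1920X and 32 for the 1950X with all cores and SMT enabled.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Ideal speedup over one thread when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Assumed value for illustration: 99% of the render is perfectly parallel.
PARALLEL_FRACTION = 0.99

# Thread counts with all cores and SMT enabled.
THREADS = {"1800X": 16, "1920X": 24, "1950X": 32}

for name, count in THREADS.items():
    speedup = amdahl_speedup(PARALLEL_FRACTION, count)
    print(f"{name}: {count} threads -> {speedup:.2f}x over one thread")
```

Under these assumptions the speedup keeps growing with thread count, but each step adds less than the raw thread ratio would suggest. The model also ignores memory channels and inter-die latency entirely, which is where effects like those seen in the OpenCL results come from.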

POV-Ray 3.7.1b4: link

Another regular in most benchmark suites, POV-Ray is also a ray tracer, and one that has been around for many years. It just so happens that during the run-up to AMD's Ryzen launch, the code base became active again, with developers making changes to the code and pushing out updates. Our version and benchmarking started just before that happened, but given time we will see where the POV-Ray code ends up and adjust in due course.

Rendering: POV-Ray 3.7

POV-Ray loves threads.

Cinebench R15: link

The latest version of Cinebench has also become one of those 'used everywhere' benchmarks, particularly as an indicator of single thread performance. High IPC and high frequency give performance in ST, whereas good scaling and many cores are where the MT test wins out.

Rendering: CineBench 15 MultiThreaded

Rendering: CineBench 15 SingleThreaded

Multithreaded results are as expected, and single thread seems to benefit a bit from more DRAM channels, although a 200 MHz frequency advantage is enough to put the 1800X over the 1950X in Game Mode.


104 Comments

  • MrSpadge - Thursday, August 17, 2017 - link

    It's definitely good that reviewers test the game mode and the others, so that we know what to expect from them. If they only tested creator mode the internets would be full of people shouting foul play to bash AMD.
  • deathBOB - Thursday, August 17, 2017 - link

    Ian - why not just enable NUMA and leave SMT on?
  • Ian Cutress - Thursday, August 17, 2017 - link

    The fourth corner of testing :)
  • lelitu - Thursday, August 17, 2017 - link

    I'm looking at setting up a home VM host and Linux development workstation, which makes NUMA with SMT the most useful set of benchmarks for my use case.

    I'm particularly interested in TR, because it's brought the price of entry low enough that I can actually consider building such a system.
  • Ratman6161 - Friday, August 18, 2017 - link

    ThreadRipper is big bucks for your purposes if I'm reading this correctly. For a home lab sort of environment a lot of cores helps as does a lot of RAM, but you don't necessarily need a boatload of CPU power. For example, in my home ESXi system I've got an FX8350 which VMWare sees as an 8 Core CPU. I've also given it 32 GB of DDR3 RAM (purchased when that was cheap). The 990FX motherboards work great for this since they have plenty of PCIe lanes available. In my case, those are used for an ancient ATI video card I happened to have in a drawer, an LSI x8 RAID card and an x4 Intel dual port gigabit NIC. The RAID card has 4 1 TB desktop drives hooked up to it in a RAID 5.

    All of the above can be had pretty cheap these days. I'm thinking of upgrading my storage to 4x2 TB SAS drives - available for $35 each on Amazon...brand new (but old models). The system is running 6 to 7 VM's (Windows Servers mostly) at any given time. But with only two users, I don't run into many cases where more than two VM's are actually doing anything at the same time. Example: Web server and SQL Server serving up a web app.

    For this environment, having a storage setup where the VM's are not contending for the disks and also having plenty of RAM seems to make a lot more difference than the CPU.

    Of course, if you have the bucks and just want to, ThreadRipper would be terrific for this - just way too expensive and overkill for me.
  • lelitu - Monday, August 21, 2017 - link

    That depends a lot on what you want the VMs for. Unfortunately for the sort of performance testing and development I do a VM toaster isn't actually good enough. Each VM needs at least 4 uncontended cores, and 10GB uncontended RAM. Two VMs is the absolute minimum, 3 would be better.

    That's not going to fit into anything less than a Ryzen 7 minimum, and a Threadripper, *if* it performs as I expect in SMT + NUMA mode, would be almost perfect. Unfortunately, you're right, it's a *lot* of coin to drop on something I don't know will actually do what I need well enough.

    Thus, I wish there were SMT+NUMA workstation and VM benchmarks here.
  • JasonMZW20 - Thursday, August 17, 2017 - link

    Seems like Game Mode should have bumped up the base clocks to 1800X levels, especially for Nvidia cards using a software scheduler that seems to scale with CPU frequency. AMD's hardware scheduler is apparent in overall FPS stability and being mostly CPU agnostic.

    Matching base clocks with 1800X or even 1900X (3.8GHz) might be better on TR for gaming in Game Mode.
  • lordken - Friday, August 18, 2017 - link

    Also, for some weird reason the 1800X is much faster, with higher fps, in Civilization and Tomb Raider?
  • peevee - Thursday, August 17, 2017 - link

    "because the 1920X has fewer cores per CCX, it actually falls behind the 1950X in Game Mode and the 1800X despite having more cores. "

    Sorry, but when 12 cores with twice the memory bandwidth are compiling slower than 8, you are doing something wrong. Yes, AnandTech, you. I'd seriously investigate. For example, the maximum number of threads was set at 24 or something.
  • Ian Cutress - Thursday, August 17, 2017 - link

    When you have a bank of cores that communicate with each other, and replace it with more cores but uneven communication latencies, it is a difference and it can affect code paths.
