Benchmarking Performance: CPU Rendering Tests

Rendering tests are a long-time favorite of reviewers and benchmarkers, as the code used by rendering packages is usually highly optimized to squeeze every last bit of performance out. Rendering programs can also end up heavily memory-dependent - when you have that many threads in flight with a ton of data, low-latency memory can be key to everything. Here we run a few of the usual rendering packages under Windows 10, as well as a few new and interesting benchmarks.

All of our benchmark results can also be found in our benchmark engine, Bench.

Corona 1.3: link

Corona is a standalone package designed to assist software like 3ds Max and Maya with photorealism via ray tracing. It's simple - shoot rays, get pixels. OK, it's more complicated than that, but the benchmark renders a fixed scene six times and offers results in terms of time and rays per second. The official benchmark tables list user-submitted results in terms of time; however, I feel rays per second is a better metric (in general, scores where higher is better seem to be easier to explain anyway). Corona likes to pile on the threads, so the results end up being heavily staggered based on thread count.
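
To make the point about metrics concrete, here is a minimal sketch of why the rays-per-second form is easier to reason about: with a fixed scene the total ray count is constant, so halving the render time exactly doubles the score. The ray budget below is a hypothetical number, not Corona's actual figure.

```python
# Minimal sketch, hypothetical numbers: converting a render time for a
# fixed scene into a rays-per-second score. TOTAL_RAYS is an assumed
# value, not Corona's real ray budget.
TOTAL_RAYS = 8_000_000_000

def rays_per_second(render_time_s: float) -> float:
    return TOTAL_RAYS / render_time_s

for t in (100.0, 50.0):  # a baseline CPU and one twice as fast
    print(f"{t:6.1f} s  ->  {rays_per_second(t) / 1e6:8.1f} Mrays/s")
```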

Rendering: Corona Photorealism

Blender 2.78: link

For a renderer that has been around for what seems like ages, Blender is still a highly popular tool. We managed to wrap a standard workload into the February 5 nightly build of Blender and measure the time it takes to render the first frame of the scene. Being one of the bigger open-source tools out there means that both AMD and Intel work actively to help improve the codebase, for better or worse on their own and each other's microarchitectures.
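
For reference, a first-frame render like this can be timed from the command line using Blender's own headless flags. The sketch below assumes a placeholder scene file named scene.blend rather than our actual workload.

```python
# Minimal sketch: timing a headless first-frame render with Blender's
# command-line interface (-b runs without the UI, -f 1 renders frame 1).
# "scene.blend" is a placeholder, not our actual test file.
import subprocess
import time

start = time.perf_counter()
subprocess.run(["blender", "-b", "scene.blend", "-f", "1"], check=True)
print(f"Frame 1 rendered in {time.perf_counter() - start:.1f} s")
```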

Rendering: Blender 2.78

LuxMark v3.1: link

As a synthetic, LuxMark might come across as a somewhat arbitrary choice of renderer, given that it's mainly used to test GPUs, but it offers both an OpenCL mode and a standard C++ mode. In this instance, aside from comparing cores and IPC within each mode, we also get to see the difference in performance when moving from a C++-based code stack to an OpenCL one with the CPU as the main host.
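
For readers unfamiliar with the idea of the CPU acting as an OpenCL device, here is a minimal sketch - using the pyopencl bindings, not LuxMark's own code - of how a host program would target the CPU rather than a GPU.

```python
# Minimal sketch (pyopencl, not LuxMark's code): enumerating OpenCL
# platforms and building a context on a CPU device, so kernels compile
# for and run on the CPU instead of a GPU.
import pyopencl as cl

cpu_devices = []
for platform in cl.get_platforms():
    try:
        cpu_devices += platform.get_devices(device_type=cl.device_type.CPU)
    except cl.Error:
        pass  # this platform exposes no CPU device

ctx = cl.Context(devices=cpu_devices[:1])
print("OpenCL host device:", cpu_devices[0].name)
```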

Rendering: LuxMark CPU C++

POV-Ray 3.7.1b4: link

A regular in most benchmark suites, POV-Ray is another ray-tracer, but one that has been around for many years. It just so happens that during the run-up to AMD's Ryzen launch, the codebase became active again, with developers making changes and pushing out updates. Our version and benchmarking started just before that was happening, but given time we will see where the POV-Ray code ends up and adjust in due course.
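
POV-Ray ships with a standard built-in benchmark scene, and a run like ours can be timed from a script. A minimal sketch follows; note the --benchmark switch is an assumption based on the Unix builds - check `povray --help` for your build, as the Windows binary exposes the benchmark through its menus instead.

```python
# Minimal sketch: timing POV-Ray's built-in standard benchmark.
# Assumption: the --benchmark switch of the Unix builds; verify with
# `povray --help`, since option spelling can differ between builds.
import subprocess
import time

start = time.perf_counter()
subprocess.run(["povray", "--benchmark"], check=True)
print(f"Standard benchmark finished in {time.perf_counter() - start:.1f} s")
```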

Rendering: POV-Ray 3.7

Cinebench R15: link

The latest version of Cinebench has also become one of those 'used everywhere' benchmarks, particularly as an indicator of single-threaded performance. High IPC and high frequency give performance in the ST test, whereas good scaling across many cores is where the MT test wins out.
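
One way to read the two scores together is as a rough scaling ratio: the MT result divided by the ST result times the core count indicates how efficiently a chip spreads work across its threads. A minimal sketch with hypothetical, illustrative scores (not measured results):

```python
# Minimal sketch, hypothetical scores: MT score ~= ST score x cores x
# scaling efficiency, so dividing the scores out exposes the scaling.
def scaling_efficiency(st_score: float, mt_score: float, cores: int) -> float:
    return mt_score / (st_score * cores)

# Illustrative numbers only, not measured results.
st, mt, cores = 190.0, 3300.0, 18
print(f"Scaling efficiency: {scaling_efficiency(st, mt, cores):.0%}")
```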

Rendering: CineBench 15 MultiThreaded

Rendering: CineBench 15 SingleThreaded


152 Comments


  • CrazyHawk - Tuesday, September 26, 2017 - link

    "Intel also launched Xeon-W processors in the last couple of weeks."

    Just where can one purchase these mythical Xeon-W processors? There hasn't been a single peep about them since the "launch" week. I've only heard of two motherboards that will support them. They seem to be total vaporware. On Intel's own site, it says they were "Launched" in 3Q2017. Intel had better hurry up, 3Q will be up in 4 days!
  • samer1970 - Tuesday, September 26, 2017 - link

    I don't understand why Intel disables ECC on their i9 CPUs; they are losing low-budget workstation buyers who will 100% choose AMD Threadripper over the Intel i9.

    Even if they are doing this to protect their Xeon chips, they could enable unbuffered ECC and not allow registered ECC on the i9 - problem solved. Unbuffered ECC has size limitations, and people who want more RAM will go for Xeons.

    Remember that their i3 has ECC support, but only the i3...

    Intel, you are stupid.
  • vladx - Wednesday, September 27, 2017 - link

    Newsflash: these chips don't target "low budget workstation buyers". The golden rule is always: "If you can't afford it, you're not the target customer."
  • samer1970 - Wednesday, September 27, 2017 - link

    That's not a golden rule anymore with the Threadripper chips around. It's called a "stupid rule"...

    They are allowing AMD to steal the low-budget workstation buyers by not offering them an alternative to choose from.
  • vladx - Wednesday, September 27, 2017 - link

    The "low budget workstation buyers" as you call them are a really insignificant percentage of an already really small piece of the huge pie of Intel customers.
  • samer1970 - Wednesday, September 27, 2017 - link

    Who told you so? Most engineering students at universities need one, and art students who render a lot as well. All these people will buy a Threadripper CPU and avoid Intel, because Intel Xeons are 50% more expensive.

    And I don't care about the percentage of Intel's pie... hundreds of thousands of students enter universities around the world each year. Low percentage or not, they are a lot...

    How much do you think a low-budget workstation costs? They start from $3000... and with Xeon pricing, it will be very difficult to add a lot of RAM, a good workstation card, and a fast SSD.
  • esi - Wednesday, September 27, 2017 - link

    What's the explanation for some of the low scores of the 7980XE on the SPECwpc benchmarks? Particularly Poisson, where the 6950X is 3.5X higher.
  • ZeDestructor - Wednesday, September 27, 2017 - link

    Most likely cache-related
  • esi - Wednesday, September 27, 2017 - link

    Maybe. But one that really makes no sense is the Dolphin 5.0 render test. How can the 7980XE take nearly twice as long as the 7960X?
  • esi - Wednesday, September 27, 2017 - link

    So I ran the Poisson benchmark on my 6950X. It uses all 10 cores (20 h/w threads), but can be configured to run in different ways: you can set the number of s/w threads per process. It then creates enough processes to ensure there's one s/w thread per h/w thread. Changing the s/w threads per process significantly affects the result:

    20 - 1.34
    10 - 2.5
    5 - 3.31
    4 - 3.47
    2 - 3.67
    1 - 0.19

    Each process only uses about 2.5MB of RAM, so the one-thread-per-process case probably scores low because its total footprint exceeds the L3 cache (20 processes x 2.5MB = 50MB, against the 6950X's 25MB of L3), whereas the other configurations should all fit in.

    Would be interesting to see what was used for the 7980/7960. Perhaps the unusual number of cores resulted in a less than optimal process/thread mapping.
