HEDT Benchmarks: Encoding Tests

With the rise of streaming, vlogs, and video content as a whole, encoding and transcoding tests are becoming ever more important. Not only do more home users and gamers need to convert video files into something more manageable, for streaming or archival purposes, but the servers that manage the output also handle data and log files with compression and decompression. Our encoding tasks are focused around these important scenarios, with input from the community for the best implementation of real-world testing.

Handbrake 1.1.0: Streaming and Archival Video Transcoding

A popular open source tool, Handbrake is the anything-to-anything video conversion software that a number of people use as a reference point. The danger is always in version numbers and optimization: for example, the latest versions of the software can take advantage of AVX-512 and OpenCL to accelerate certain types of transcoding and algorithms. The version we use here is a pure CPU play, with common transcoding variations.

We have split Handbrake up into several tests, using a Logitech C920 1080p60 native webcam recording (essentially a streamer recording), which we convert into two types of streaming formats and one for archival. The output settings used are:

  1. 720p60 at 6000 kbps constant bit rate, fast setting, high profile
  2. 1080p60 at 3500 kbps constant bit rate, faster setting, main profile
  3. 1080p60 HEVC at 3500 kbps variable bit rate, fast setting, main profile
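As a rough sketch of how the three output settings above could be written down for a scripted run, the snippet below builds HandBrakeCLI argument lists without executing them. The exact command lines used for the review are not given here, so the mapping is an assumption: the `Preset` class, the `handbrake_args` helper, the file names, and the choice of the software x265 encoder for HEVC are all illustrative. Note that `--vb` sets a target bitrate in kbps and `--cfr` requests a constant frame rate, which is not the same thing as a constant bit rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    """One output configuration (hypothetical container for this sketch)."""
    label: str
    encoder: str       # HandBrakeCLI encoder name: "x264" or "x265"
    width: int
    height: int
    bitrate_kbps: int
    speed: str         # encoder preset, e.g. "fast" or "faster"
    profile: str       # encoder profile, e.g. "high" or "main"


# The three settings from the list above; x265 for HEVC is an assumption.
PRESETS = [
    Preset("720p60 x264 6000 kbps Fast", "x264", 1280, 720, 6000, "fast", "high"),
    Preset("1080p60 x264 3500 kbps Faster", "x264", 1920, 1080, 3500, "faster", "main"),
    Preset("1080p60 HEVC 3500 kbps Fast", "x265", 1920, 1080, 3500, "fast", "main"),
]


def handbrake_args(preset, source, output):
    """Build a HandBrakeCLI argument list for one preset (nothing is run)."""
    return [
        "HandBrakeCLI",
        "--input", source,
        "--output", output,
        "--encoder", preset.encoder,
        "--vb", str(preset.bitrate_kbps),    # target video bitrate in kbps
        "--encoder-preset", preset.speed,
        "--encoder-profile", preset.profile,
        "--width", str(preset.width),
        "--height", str(preset.height),
        "--rate", "60",                      # 60 frames per second
        "--cfr",                             # constant frame rate output
    ]


if __name__ == "__main__":
    for preset in PRESETS:
        # "webcam.mp4" and "out.mp4" are placeholder file names.
        print(preset.label, "->", " ".join(handbrake_args(preset, "webcam.mp4", "out.mp4")))
```

Keeping the settings as data makes it easy to see that only the resolution, bitrate, speed preset, profile, and encoder change between the three tests.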

Handbrake 1.1.0 - 720p60 x264 6000 kbps Fast
Handbrake 1.1.0 - 1080p60 x264 3500 kbps Faster
Handbrake 1.1.0 - 1080p60 HEVC 3500 kbps Fast

Video encoding is always an interesting mix of multi-threading, memory latency, and compute. The Core i9, with AVX2 instructions, sets a commanding lead in all three tests. The AMD processors seem to fluctuate a bit, with the 1950X and 2700X being the best of the bunch. Unfortunately we didn’t get 2950X results in our initial runs, but I would expect it to be competitive with the Core i9 for sure, given where the 1950X is. However the 2990WX does fall behind a bit.

7-zip v1805: Popular Open-Source Encoding Engine

Out of our compression/decompression tool tests, 7-zip is the most requested and comes with a built-in benchmark. For our test suite, we’ve pulled the latest version of the software and we run the benchmark from the command line, reporting the compression, decompression, and a combined score.
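A minimal sketch of that command-line flow is below. It calls 7-Zip's built-in benchmark with the `b` command and leaves the output as raw text. The executable name `7z` being on the PATH is an assumption, and so is the definition of the combined score: the sketch takes it to be the arithmetic mean of the compression and decompression ratings, which may not be exactly how the combined graph is calculated. The function names are hypothetical.

```python
import subprocess


def run_7zip_benchmark(executable="7z"):
    """Run 7-Zip's built-in benchmark ("b" command) and return its text output.

    Assumes the 7-Zip command-line executable is reachable under this name.
    """
    result = subprocess.run(
        [executable, "b"], capture_output=True, text=True, check=True
    )
    return result.stdout


def combined_score(compression, decompression):
    """Arithmetic mean of the two ratings (assumed definition of 'combined')."""
    return (compression + decompression) / 2


if __name__ == "__main__":
    # Hypothetical ratings, not measured results: a lopsided chip and a
    # balanced chip can land on the same combined figure.
    print(combined_score(40000, 120000))
    print(combined_score(80000, 80000))
```

The two hypothetical calls illustrate why a combined number on its own can hide very different compression and decompression behaviour.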

It is noted in this benchmark that the latest multi-die processors have very bi-modal performance between compression and decompression, performing well in one and badly in the other. There are also discussions around how the Windows Scheduler is placing every thread. As we get more results, it will be interesting to see how this plays out.

7-Zip 1805 Compression

7-Zip 1805 Decompression

7-Zip 1805 Combined

Oh boy, this was an interesting set of tests. When we initially published this review, without commentary, the compression graph with the 2990WX at the bottom was shared around social media like crazy, trying to paint a picture of why AMD performance isn’t great. It was also used in conjunction with Phoronix’s tests, which showed a much better picture on Linux.

But what confuses me is that almost no one also posted the decompression graph. Here AMD’s 32-core processors take a commanding lead, with the 16/18-core parts being the best of the rest.

If you plan to share out the Compression graph, please include the Decompression one. Otherwise you’re only presenting half a picture.

WinRAR 5.60b3: Archiving Tool

My compression tool of choice is often WinRAR, having been one of the first tools a number of my generation used over two decades ago. The interface has not changed much, although the integration with Windows right click commands is always a plus. It has no in-built test, so we run a compression over a set directory containing over thirty 60-second video files and 2000 small web-based files at a normal compression rate.

WinRAR is variable threaded but also susceptible to caching, so in our test we run it 10 times and take the average of the last five, leaving the test purely for raw CPU compute performance.
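The run-ten-keep-five approach can be sketched as follows. This is an illustration rather than the actual test harness: `time_command` and `average_of_last` are hypothetical helper names, and the command passed in is a placeholder for whatever archiving command line is being timed.

```python
import subprocess
import time
from statistics import mean


def time_command(command, runs=10):
    """Run a command several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def average_of_last(durations, keep=5):
    """Average only the final `keep` runs, discarding the earlier warm-up runs."""
    if len(durations) < keep:
        raise ValueError("not enough runs to average")
    return mean(durations[-keep:])


if __name__ == "__main__":
    # Hypothetical timings in seconds: the first runs are slower while
    # the files are still being pulled into cache.
    timings = [20, 18, 16, 14, 12, 10, 10, 10, 10, 10]
    print(average_of_last(timings))
```

Discarding the first five runs means the reported figure reflects the steady state after caching effects have settled, rather than the first cold pass over the files.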

WinRAR 5.60b3

A set of high frequency cores and good memory is usually beneficial, but sometimes some more memory bandwidth and lower latency helps. At the top is AMD’s R7 2700X, with the Intel 10-core just behind. I’m surprised not to see the 8700K in there; perhaps its six cores are not enough. But the higher core count AMD parts struggle to gain traction here, with the 32-core parts taking some sweet time to finish this test.

AES Encryption: File Security

A number of platforms, particularly mobile devices, are now offering encryption by default with file systems in order to protect the contents. Windows based devices have these options as well, often applied by BitLocker or third-party software. In our AES encryption test, we used the discontinued TrueCrypt for its built-in benchmark, which tests several encryption algorithms directly in memory.

The data we take for this test is the combined AES encrypt/decrypt performance, measured in gigabytes per second. The software does use AES instructions on processors that offer hardware acceleration, but it does not use AVX-512.
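To make the units concrete, here is a small sketch of how an in-memory throughput figure in gigabytes per second can be derived from a buffer size and a timing. It does not perform AES: `stand_in_transform` is a placeholder byte substitution used only so the harness has something to time, and every function name is hypothetical. Treating 1 GB as 10^9 bytes and the combined figure as the mean of the encrypt and decrypt rates are both assumptions.

```python
import time

# Placeholder byte-substitution table; this is NOT AES, just work to time.
_TABLE = bytes((b ^ 0x5A) for b in range(256))


def stand_in_transform(buffer):
    """Stand-in for an encrypt or decrypt pass over an in-memory buffer."""
    return buffer.translate(_TABLE)


def throughput_gbps(num_bytes, seconds):
    """Throughput in gigabytes per second, assuming 1 GB = 1e9 bytes."""
    return num_bytes / seconds / 1e9


def measure(transform, buffer):
    """Time one pass of `transform` over `buffer` and return GB/s."""
    start = time.perf_counter()
    transform(buffer)
    elapsed = time.perf_counter() - start
    return throughput_gbps(len(buffer), elapsed)


def combined_rate(encrypt_gbps, decrypt_gbps):
    """Mean of the two directions (assumed definition of 'combined')."""
    return (encrypt_gbps + decrypt_gbps) / 2


if __name__ == "__main__":
    data = bytes(64 * 1024 * 1024)  # 64 MiB buffer held entirely in memory
    rate = measure(stand_in_transform, data)
    print(f"stand-in transform: {rate:.2f} GB/s")
```

Because the buffer never touches storage, a test of this shape stresses the cores and the memory subsystem rather than the drive.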

AES Encoding

Normally we see this test go very well when there are plenty of cores, but it would seem that the bi-modal nature of the cores and memory controllers in the 2990WX gives a poor result. The EPYC 7601, with eight memory controllers, does a better job, however the 1950X wins here. The 2950X, where all cores have a similar access profile, scores top here, well above Intel’s 18-core Core i9.


171 Comments


  • just4U - Monday, August 13, 2018 - link

    Ian, were you testing this with the CM Wraith Cooler? If not is it something you plan to review?
  • Ian Cutress - Monday, August 13, 2018 - link

    Most of the testing data is with the Liqtech 240 liquid cooler, rated at 500W. I do have data taken with the Wraith Ripper, and I'll be putting some of that data out when this is wrapped up.
  • IGTrading - Monday, August 13, 2018 - link

    To be honest, with the top of the line 32-core model, it is interesting to identify as many positive effect cases as possible, to see if that entire set of applications that truly benefit from the added cores will persuade power users to purchase it.

    Like you've said, it is a niche of a niche and seeing it be X% faster or Y% slower is not as interesting as seeing what it can actually do when it is used efficiently and if this makes a compelling argument for power users.
  • PixyMisa - Tuesday, August 14, 2018 - link

    Phoronix found that a few tests ran much faster on Linux - for 7zip compression in particular, 140% faster (as in, 2.4x). Some of these benchmarks could improve a lot with some tweaking to the Windows scheduler.
  • phoenix_rizzen - Wednesday, August 15, 2018 - link

    It'd be interesting to redo these tests on a monthly basis after Windows/BIOS updates are done, to see how performance changes over time as the Windows side of things is tweaked to support the new NUMA setup for TR2.

    At the very least, a follow-up benchmark run in 6 months would be nice.
  • Kevin G - Monday, August 13, 2018 - link

    Chiplets!

    The power consumption figures are interesting but TR does have to manage one thing that the high end desktop chips from Intel don't: off-die traffic. The amount of power to move data off die is significantly higher than moving it around on-die. Even in that context, TR's energy consumption for just the fabric seems high. When only threads are loaded, they should only be with dies with the memory controllers leaving two dies idle. It doesn't appear that the fabric is powering down while those remote dies are also powering down. Any means of watching cores enter/exit sleep states in real time?

    It'd also be fun to see with Windows Server what happens when all the cores on a die are unplugged from the system. Considering that AMD puts the home agent on the memory controller on each die, even without cores or memory attached, chances are that the home agent is still alive consuming power. It'd be interesting to see what happens on Sky Lake-SP as well if the home agents on the grid eventually power themselves down when there is nothing directly connected to them. It'd be worth comparing to the power consumption when a core is disabled in BIOS/EFI.

    I also feel that this would be a good introduction for what is coming down the road with server chips and may reach the high end consumer products: chiplets. This would permit the removal of the off-die Infinity Links for something that is effectively on-die throughout the cluster of dies. That alone will save AMD several watts. The other thing about chiplets is that it would greatly simplify Thread Ripper: only two memory controller chiplets would need to be in the package vs. four as we have now. That should save AMD lots of power. (And for those reading this comment, yes, Intel has chiplet plans as well.) The other thing AMD could do is address how their cache coherency protocols work. AMD has hinted at some caching changes for Zen 2 but lacks specificity.
  • gagegfg - Monday, August 13, 2018 - link

    do not seem to exist more than once the 16 additional core of the 2990wx compared to the 2950x
  • Ian Cutress - Monday, August 13, 2018 - link

    https://www.anandtech.com/bench/product/2133?vs=21...
  • Chaitanya - Monday, August 13, 2018 - link

    Built for scientific workload.
  • woozle341 - Monday, August 13, 2018 - link

    Do you think the lack of AVX512 is an issue? I might build a workstation soon for data processing with R and Python for some Fortran models and post-processing. Skylake-X looks pretty good with its quad memory channels despite its high price.
