CPU Tests: Synthetic

Most people in our industry have a love/hate relationship with synthetic tests. On the one hand, they’re often good for quick summaries of performance and are easy to use, but most of the time the tests aren’t related to any real software. Synthetic tests are often very good at drilling down into a specific set of instructions and maximizing performance on them. Due to requests from a number of our readers, we have the following synthetic tests.

Linux OpenSSL Speed: SHA256

One of our readers reached out in early 2020 and said he was interested in looking at OpenSSL hashing rates in Linux. Luckily, OpenSSL in Linux has a built-in ‘speed’ command that lets the user measure how fast the system is with any given hashing algorithm, as well as at signing and verifying messages.

OpenSSL offers a lot of algorithms to choose from, and based on a quick Twitter poll, we narrowed it down to the following:

  1. rsa2048 sign and rsa2048 verify
  2. sha256 at 8K block size
  3. md5 at 8K block size

We run each of these tests in both single-threaded and multi-threaded mode. All the graphs are in our benchmark database, Bench, and we use the sha256 results in published reviews.
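For readers who want a rough guide to reproducing these numbers, here is a minimal sketch of driving the tests from Python. It is an illustration rather than our exact harness: it assumes an OpenSSL 1.1.1 or newer binary on the PATH, and the arguments used should be verified against ‘openssl speed -help’ on your system.

    # Minimal sketch: drive 'openssl speed' for the algorithms listed above.
    # Assumes an OpenSSL 1.1.1+ binary is on the PATH; verify the flags with
    # 'openssl speed -help' before relying on them.
    import os
    import subprocess

    def openssl_speed(*args):
        """Run 'openssl speed' with the given arguments and return its summary output."""
        result = subprocess.run(["openssl", "speed", *args],
                                capture_output=True, text=True, check=True)
        # 'speed' prints progress on stderr and the summary table on stdout.
        return result.stdout

    # rsa2048 sign and verify rates (single-threaded)
    print(openssl_speed("rsa2048"))

    # sha256 and md5 throughput; the 8192-byte column in the summary table
    # corresponds to the 8K block size quoted in the charts.
    print(openssl_speed("-evp", "sha256"))
    print(openssl_speed("-evp", "md5"))

    # Parallel run: '-multi' forks one worker per logical CPU. One worker per
    # logical CPU is a hypothetical choice; the review does not state the count.
    print(openssl_speed("-multi", str(os.cpu_count()), "-evp", "sha256"))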

(8-3c) Linux OpenSSL Speed sha256 8K Block (1T)
(8-4c) Linux OpenSSL Speed sha256 8K Block (nT)

AMD has had SHA256 acceleration in its processors for many years, whereas Intel only enabled SHA acceleration on its mainstream desktop parts with Rocket Lake. That's why we see RKL matching TR in 1T mode, but when all the cores fire up, TR and TR Pro streak ahead on the strength of their available threads and memory bandwidth. This test is all about thread count, and 128 threads really matter here.

GeekBench 5

As a common tool for cross-platform testing between mobile, PC, and Mac, GeekBench is the ultimate exercise in synthetic testing across a range of algorithms looking for peak throughput. Tests include encryption, compression, fast Fourier transforms, memory operations, n-body physics, matrix operations, histogram manipulation, and HTML parsing.

I’m including this test due to popular demand, although the results do come across as overly synthetic.

(8-1c) Geekbench 5 Single Thread
(8-1d) Geekbench 5 Multi-Thread

DRAM Bandwidth

As we move from 2-channel memory on Ryzen to 4-channel memory on Threadripper and then 8-channel memory on Threadripper Pro, each platform has an associated theoretical bandwidth maximum, and there is a case for testing whether those maximums can actually be reached. In this test, we perform a simple memory write to measure peak bandwidth.

For 2-channel DDR4-3200, the theoretical maximum is 51.2 GB/s.
For 4-channel DDR4-3200, the theoretical maximum is 102.4 GB/s.
For 8-channel DDR4-3200, the theoretical maximum is 204.8 GB/s.
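The per-channel figure comes from 3200 MT/s multiplied by 8 bytes per 64-bit transfer, i.e. 25.6 GB/s. The sketch below is a simple illustration of that arithmetic plus a crude single-threaded write test using numpy; it is not the AIDA64 test shown in the chart, and tuned multi-threaded tools will report considerably higher numbers.

    # Theoretical DDR4-3200 peak bandwidth, plus a crude write-bandwidth check.
    # Illustration only: this is not the AIDA64 test used for the chart below.
    import time
    import numpy as np

    TRANSFER_RATE = 3200      # DDR4-3200, in MT/s
    BYTES_PER_TRANSFER = 8    # one 64-bit channel

    per_channel = TRANSFER_RATE * BYTES_PER_TRANSFER / 1000  # GB/s
    for channels in (2, 4, 8):
        print(f"{channels}-channel DDR4-3200: {channels * per_channel:.1f} GB/s theoretical")

    # Crude single-threaded write test: repeatedly fill a buffer far larger
    # than any cache and time it. Expect well below the theoretical peak.
    buf = np.zeros(32 * 1024 * 1024, dtype=np.float64)   # 256 MiB buffer
    reps = 10
    start = time.perf_counter()
    for _ in range(reps):
        buf.fill(1.0)
    elapsed = time.perf_counter() - start
    print(f"Measured write bandwidth: {reps * buf.nbytes / elapsed / 1e9:.1f} GB/s")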

(8-2b) AIDA DRAM Write Speed

Here we see all the 4-channel Threadripper processors getting around 83 GB/s, whereas Threadripper Pro only gets closer to its theoretical maximum when more cores are present. Along with memory controller bandwidth, AMD has to manage internal Infinity Fabric bandwidth and power to get the most out of the system. The fact that the 64C/64T configuration achieves better results than 64C/128T might suggest some congestion when all 128 threads are in flight.

Comments

  • Spunjji - Friday, July 16, 2021 - link

    Having seen how modern processors behave with insufficient cooling, Threska's right that it won't get "fried", but you're correct to infer that it would result in unpredictably sub-optimal performance.

    Anecdotally, I had a friend with a Sandy Bridge system with a cooling issue that he only noticed when he bought a new GPU and ran 3DMark and got unexpectedly low results. The "cooling issue" was that the stock heatsink wasn't even making contact with the CPU heat-spreader; he'd been gaming with the system for 3 years by that point. 😬
  • serpretetsky - Friday, July 16, 2021 - link

I had to do some thermal shutdown testing on a consumer Intel CPU. I forget which one. Maybe an i5/i7 8000 series?

    With server CPUs this was usually pretty easy: remove the fan and wait for shutdown. With the consumer CPU it kept running. So I completely removed the heatsink, and the thing simply downclocked to 800 MHz and continued running happily with no heatsink. It booted to Linux and ran everything great, still with no heatsink (actually, once it booted to Linux I think it even started clocking back up once in a while). I had to get a hot-air soldering gun to heat it up until it shut down.
  • mode_13h - Saturday, July 17, 2021 - link

    5-10 years ago, there was a heatsink gasket where you had to get near 100 degrees C to melt the material so it fused with the heatsink and CPU. I forget the name, but I'm wondering if it's even possible to do that any more.
  • skaurus - Wednesday, July 14, 2021 - link

    That's great analysis.
  • Threska - Wednesday, July 14, 2021 - link

    It would be nice to see how these MBs do with VFIO, since that has considerations most users don't have.
  • mode_13h - Wednesday, July 14, 2021 - link

    Ian, is the source code for your 3DPM benchmark published anywhere? If not, it would be nice if we could see it and compare the AVX2 path with the AVX-512 one. Also, maybe someone could add support for ARM NEON or SVE.
  • techguymaxc - Wednesday, July 14, 2021 - link

    I'm slightly confused by the concluding remarks.

    "Performance between Threadripper Pro and Threadripper came in three stages. Either (a) the results between similar processors was practically identical, (b) Threadripper beat TR Pro by a small margin due to slightly higher frequencies, or (c) TR Pro thrashed Threadripper due to memory bandwidth availability. That last point, (c), only really kicks in for the 32c and 64c processors it should be noted. Our 16c TR Pro had the same memory bandwidth results as TR, most likely due to only having two chiplets in its design."

    (a) and (b) are observable, but (c) only proves true in synthetic benchmarks (and Pi calculation). Is there a real-world use case for the additional memory bandwidth, outside of calculating Pi?
  • Blastdoor - Wednesday, July 14, 2021 - link

    The advantage shows up with multi-threaded SPEC. SPEC is essentially a composite of a suite of real-world tasks. I guess you could call it 'synthetic' due to it being a composite, but the individual tasks don't strike me as 'synthetic.' For example, here's a description of namd: https://www.spec.org/cpu2017/Docs/benchmarks/508.n...
  • techguymaxc - Wednesday, July 14, 2021 - link

    Thanks for that info. It would be nice to see the breakdown of individual test results from the SPEC suite.
  • arashi - Saturday, July 17, 2021 - link

    Bench
