Conclusions

The art of building a good CPU is balance: you want something that is fast for individual streams of instructions and data, but also fast for multiple streams. It also needs to be power efficient, high yielding, and easy to put together, with software already out there able to take advantage of what you have made.

“Opportunities multiply as they are seized.”

AMD has succeeded at a time when its competitor has struggled. As AMD launched its Zen 2 hardware across its Ryzen and EPYC product lines, built on TSMC's 7nm process, CEO Dr. Lisa Su stated in interviews with AnandTech that:

“We've executed our roadmap from the previous five years and we’re extending it into the next five years, all while assuming our competition will be competitive and even beat their public targets.”

At a time when Intel is struggling with its 10nm manufacturing process, AMD is targeting where Intel should have been if it had executed on time. The fact that Intel has suffered issues has benefited AMD, with its latest Ryzen and EPYC CPUs drawing high praise. The follow-on from these has been Threadripper, and the first two Zen 2 based Threadripper CPUs were quite good – I even used the word ‘bloodbath’ in the review, it was that impressive compared to what Intel had to offer.

Read our Initial Threadripper 3000 Series Review Here

With this third Threadripper 3000 processor, the 3990X, AMD is hoping to capitalize on its successes. The concept here is relatively simple: more of the same. Double the high-performance Zen 2 cores, at only slightly lower frequencies per core, for the same power – if a user has the right workload, then it’s the ideal processor.

And therein lies the crux of this CPU: what is the right workload?

“Know yourself and you will win all battles”

One of the continual talking points about new CPUs is whether the ecosystem is ready for them, especially with AMD pushing core counts ever higher. There’s no point having a million cores if everything is written for a few cores – not everyone runs a thousand copies of the same workload at the same time. Unfortunately, that is what has happened with the 3990X. We’re in a situation where only a few of the software packages we tested work great with the CPU, and the operating system is behind as well.

In our reviews, I prefer Windows, both for familiarity and because a lot of the user base is on Windows. We typically use Windows 10 Pro, but because this CPU has 128 total threads, the regular version of Windows 10 Pro has issues – we had to move to Windows 10 Enterprise in order to see a difference. The alternative was to disable simultaneous multithreading, taking us back to one thread per core, which actually worked really well for a lot of tests, but also left some performance on the table. We suggest that 3990X users who typically run Windows 10 Pro do one of two things: either disable SMT, or use Win10 Pro for Workstations/Enterprise. This issue comes down to how Windows tracks processor groups, a holdover from multi-socket platforms which shouldn’t apply here – but because the 64-thread boundary is hard-coded into the OS, it’s a pain.
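
To make the processor group problem concrete, here is a minimal sketch (ours, not from the review) of the Win32 group-affinity calls an application has to make once a machine exposes more than 64 logical processors. Error handling is abbreviated, and the group and mask choices are illustrative.

    /* Minimal sketch: enumerate processor groups and move the current
     * thread into the second group. Requires Windows 7 or later. */
    #define _WIN32_WINNT 0x0601
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        WORD groups = GetActiveProcessorGroupCount();
        printf("Active processor groups: %u\n", groups);

        for (WORD g = 0; g < groups; g++)
            printf("  group %u: %lu logical processors\n",
                   (unsigned)g, GetActiveProcessorCount(g));

        if (groups > 1) {
            DWORD n = GetActiveProcessorCount(1);
            GROUP_AFFINITY ga = { 0 };
            ga.Group = 1;  /* the second 64-thread group */
            ga.Mask = (n >= 64) ? ~(KAFFINITY)0
                                : (((KAFFINITY)1 << n) - 1);
            if (!SetThreadGroupAffinity(GetCurrentThread(), &ga, NULL))
                fprintf(stderr, "SetThreadGroupAffinity failed: %lu\n",
                        GetLastError());
        }
        return 0;
    }

Applications that never call these APIs stay confined to a single 64-thread group, which is why some software simply cannot see the whole chip.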

Then there’s the workload issue: we saw a number of tests, like Corona, Blender, and even NAMD, work great, which points to rendering and scientific compute benefiting from such a high core count processor. However, other programs, such as 7-Zip, LuxMark, and Photoscan, did not see much (if any) improvement in performance compared to AMD’s own 32-core CPU.

I’ve heard a lot of silicon engineers say that adding cores helps some workloads, but adding frequency helps everything. The question then becomes whether you target the workloads that scale out (more cores), or whether scaling up (more frequency) is the better solution. We either end up with CPUs targeted at one or the other, or a combination CPU that tries to do both.
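
A back-of-the-envelope Amdahl’s law sketch makes the trade-off concrete: speedup = 1 / ((1 - p) + p/N) for a workload with parallel fraction p on N cores. The fractions below are hypothetical, not from our benchmark data.

    /* Amdahl's law sketch: speedup(N) = 1 / ((1 - p) + p / N).
     * The parallel fractions p are illustrative, not measured. */
    #include <stdio.h>

    static double speedup(double p, int n)
    {
        return 1.0 / ((1.0 - p) + p / n);
    }

    int main(void)
    {
        const double fractions[] = { 0.50, 0.90, 0.99 };
        for (int i = 0; i < 3; i++) {
            double p = fractions[i];
            printf("p=%.2f: 32 cores -> %5.2fx, 64 cores -> %5.2fx\n",
                   p, speedup(p, 32), speedup(p, 64));
        }
        return 0;
    }

At p = 0.99 the jump from 32 to 64 cores is worth ~24.4x to ~39.3x, but at p = 0.90 it is only ~7.8x to ~8.8x: doubling cores pays off only for near-perfectly parallel work, while extra frequency lifts every value of p.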

“[He] who wishes to fight must first count the cost”

In this review we evaluated two directions for AMD’s 64-core 3990X. The first was at the consumer/prosumer level, looking up to improve on their high-end desktop system. The second was at the enterprise level, looking down to see if that single 64-core CPU is actually worth it compared to a dual socket system. The conclusion might shock you. (It might not.)

For the first stage, the consumer/prosumer level, our conclusion is that the usefulness of the 3990X is limited. Aside from a few select instances (as mentioned: Corona, Blender, NAMD), the 32-core Threadripper, at half the price, performed on par or within the margin of error. For this market, the $2000 saved between the 64-core and the 32-core can easily net another RTX 2080 Ti for GPU acceleration, and this would probably be the preferred option. Unless you run those specific tests (or ones like them), go for the 32-core and spend the money elsewhere. Aside from the core count, there is little to differentiate the two parts.

At the second stage, the enterprise level, it becomes a no-brainer to consolidate a dual socket system into a single AMD CPU – the initial outlay is substantially lower, and the long-term power costs also come into play. This is what the enterprise likes to combine into ‘Total Cost of Ownership’, or TCO. The TCO and performance advantage of AMD here is plain to see in the benchmarks and the pricing. The situation gets a little muddier when deciding which AMD CPU to choose: a server market typically wants RDIMM memory, which is only supported on the EPYC processors. The difference between the 64-core EPYC 7702P and Threadripper 3990X is minor in terms of cost (under $500), and each CPU has its benefits: EPYC gets more PCIe lanes (128 vs 64) and more memory (8-channel RDIMM vs 4-channel UDIMM), while Threadripper gets better frequencies (2900/4300 vs 2000/3350) at a higher TDP (280W vs 200W). From a server perspective, if you need more IO or more memory, get the EPYC; otherwise Threadripper merits consideration.
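
As a rough sketch of how that calculation shapes up: the list prices and TDPs below are real (two Xeon Platinum 8280s at $10,009 list versus the 3990X), but the electricity rate, duty cycle, and three-year window are assumptions, and a full TCO model would also count boards, memory, cooling, and depreciation.

    /* Rough TCO sketch: purchase price plus electricity over three
     * years of 24/7 use. Rate and duty cycle are assumptions. */
    #include <stdio.h>

    int main(void)
    {
        const double rate  = 0.12;             /* assumed $ per kWh     */
        const double hours = 3.0 * 8760.0;     /* three years, 24/7     */

        const double xeon_cost = 2 * 10009.0;  /* 2x Xeon 8280 list     */
        const double xeon_watt = 2 * 205.0;    /* 2x 205 W TDP          */
        const double tr_cost   = 3990.0;       /* TR 3990X              */
        const double tr_watt   = 280.0;        /* 280 W TDP             */

        double xeon_tco = xeon_cost + xeon_watt / 1000.0 * hours * rate;
        double tr_tco   = tr_cost   + tr_watt   / 1000.0 * hours * rate;

        printf("2x Xeon 8280: $%.0f   3990X: $%.0f   saved: $%.0f\n",
               xeon_tco, tr_tco, xeon_tco - tr_tco);
        return 0;
    }

Under those assumptions the single-socket 3990X comes out roughly $16,000 ahead, almost all of it up front; the power saving is real but secondary.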

“Many calculations lead to victory”

In the end, the situation for the 3990X is not as clear as it was with the 3970X. It’s a good chip, but it’s not the best chip for everything. I will tell you what it is good at, though: ever seen Cinebench R20 complete in 16 seconds? This chip can do it.

A final thought. The AMD TR 3990X is amusingly priced at $3990. It’s a great marketing idea, and it gets people talking. I’m proud to say that this price was my idea – AMD originally had a different price in mind. I don’t often influence change in the industry in such an obvious way, but this one was fun.


Comments


  • GreenReaper - Saturday, February 8, 2020 - link

    64 sockets, 64 cores, 64 threads per CPU - x64 was never intended to surmount these limits. Heck, affinity groups were only introduced in Windows XP and Server 2003.

    Unfortunately they hardcoded the 64-CPU limit by using a DWORD, and had to add Processor Groups as a hack in Win7/2008 R2 for the sake of a stable kernel API.

    Linux's sched_setaffinity() had the foresight to use a length parameter and a pointer: https://www.linuxjournal.com/article/6799

    I compile my kernels to support a specific number of CPUs, as there are costs to supporting more, albeit relatively small ones (it assumes that you might hot-add them).
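
    For comparison, a minimal sketch of that Linux interface (glibc's cpu_set_t macros; the mask length is passed in, so nothing special happens at CPU 64):

        /* Sketch: pin the calling thread to CPU 100 on Linux. The mask
         * size is a parameter, so there is no hard 64-CPU ceiling. */
        #define _GNU_SOURCE
        #include <sched.h>
        #include <stdio.h>

        int main(void)
        {
            cpu_set_t mask;
            CPU_ZERO(&mask);
            CPU_SET(100, &mask);              /* fine above CPU 63 */
            if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
                perror("sched_setaffinity");  /* pid 0 = this thread */
            return 0;
        }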
  • Gonemad - Friday, February 7, 2020 - link

    Seeing a $4k processor club a $20k processor to death and take its lunch (in more than one metric) is priceless.

    If you know what you need, you can save 15 to 16 grand building an AMD machine, and that's incredible.

    It shows how greedy and lazy Intel has become.

    It may not be the best chip for, say, a gaming machine, but it can beat a 20-grand Intel setup, and that secures the chip a spot – it's far from useless.
  • Khenglish - Friday, February 7, 2020 - link

    I doubt that really anyone would practically want to do this, but in Windows 10 if you disable the GPU driver, games and benchmarks will be fully CPU software rendered. I'm curious how this 64 core beast performs as a GPU!
  • Hulk - Friday, February 7, 2020 - link

    Not very well. Modern GPUs have thousands of specialized processors.
  • Kevin G - Friday, February 7, 2020 - link

    The shaders themselves are remarkably programmable. The only thing really missing from them, relative to more traditional CPUs in terms of capability, is how they handle interrupts for IO. Otherwise they'd be functionally complete. Granted, the per-thread performance would be abysmal compared to modern CPUs, which are fully pipelined, OoO monsters. One other difference is that since GPU tasks are embarrassingly parallel by nature, these shaders have hardware thread management to quickly switch between them and partition resources, achieving some fairly high utilization rates.

    The real specialization is in the fixed-function units: the TMUs and ROPs.
  • willis936 - Friday, February 7, 2020 - link

    Will they really? I don’t think graphics APIs fall back on software rendering for most essential features.
  • hansmuff - Friday, February 7, 2020 - link

    That is incorrect. Software rendering is never done by Windows just because you don't have rendering hardware. Games no longer come with software renderers like they used to many, many moons ago.
  • Khenglish - Friday, February 7, 2020 - link

    I love how everyone had to jump in and say I was wrong, without spending 30 seconds to disable their GPU driver, try it themselves, and find out they are wrong.

    There are a lot of issues with the Win10 software renderer (full-screen mode is mostly broken, and only DX11 seems supported), but it does work. My Ivy Bridge gets fully loaded at 70W+ just to pull off 7 fps at 640x480 in Unigine Heaven, but this is something you can do.
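
    (The software renderer in question is Microsoft's WARP rasterizer. As a sketch, an application can also request it directly rather than disabling the GPU driver:)

        /* Sketch: create a D3D11 device on WARP, Microsoft's CPU
         * rasterizer, instead of the GPU. Link with d3d11.lib. */
        #include <d3d11.h>
        #include <stdio.h>

        int main(void)
        {
            ID3D11Device *dev = NULL;
            ID3D11DeviceContext *ctx = NULL;
            D3D_FEATURE_LEVEL level;

            HRESULT hr = D3D11CreateDevice(
                NULL, D3D_DRIVER_TYPE_WARP, NULL, 0,
                NULL, 0, D3D11_SDK_VERSION, &dev, &level, &ctx);

            if (SUCCEEDED(hr))
                printf("WARP device, feature level 0x%x\n",
                       (unsigned)level);
            else
                printf("D3D11CreateDevice failed: 0x%lx\n",
                       (unsigned long)hr);
            return 0;
        }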
  • extide - Friday, February 7, 2020 - link

    No – the Windows UI will drop back to software mode, but games have not included software renderers for ~two decades.
  • FunBunny2 - Friday, February 7, 2020 - link

    " games have not included software renderers for ~two decades."

    Which is a deja vu experience: in the beginning, DOS was a nice, benign control program. Then Lotus discovered that the only way to run 1-2-3 faster than molasses uphill in winter was to fiddle with the hardware directly, which DOS was happy to let it do. It didn't take long for the evil folks to discover that they could too, and the virus was born. One has to wonder how much exposure this latest GPU hardware presents?
