Conclusions

There are three main ways to increase modern computing performance: more cores, higher frequency, and higher instruction throughput per cycle (IPC).

The one everyone loves, and the hardest to achieve, is increasing IPC. Most modern processor designs, if they are evolutions of previous designs, try to ensure that IPC increases faster than power consumption, such that for every 1% increase in power there might be a 2% increase in IPC. This improves efficiency, and it helps everyone.
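To put rough numbers on that trade-off, here is a quick back-of-the-envelope sketch in Python. The figures are illustrative assumptions rather than anything AMD has published: it simply shows what a consistent "+2% performance for +1% power" relationship does to performance per watt over ten such steps.

    # Illustrative only: assume every +1% power buys +2% performance (e.g. via IPC).
    power, perf = 1.00, 1.00
    for _ in range(10):                 # ten hypothetical +1% power steps
        power *= 1.01
        perf *= 1.02
    print(f"power +{(power - 1) * 100:.1f}%, perf +{(perf - 1) * 100:.1f}%, "
          f"perf/watt +{(perf / power - 1) * 100:.1f}%")
    # power +10.5%, perf +21.9%, perf/watt +10.4%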

As we’ve seen with some recent consumer processors, IPC is nothing unless you can also match the frequency of the previous generation. Increasing frequency should sound easy: just increase the voltage, but that comes with the unfortunate side effects of extra heat and lower efficiency. There’s also another element at play here: physical design. The ability to produce a processor floorplan in which different parts of the CPU do not drag down each other’s clock speeds is a key tenet of good physical design, and it can help boost maximum frequencies. If you can’t get more IPC, then an increase in frequency also helps everyone.
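The efficiency cost of chasing frequency falls out of the usual textbook approximation for dynamic power, P ≈ C·V²·f: power scales linearly with frequency but with the square of voltage. A minimal sketch with hypothetical numbers follows; the extra 0.10 V needed to hold the higher clock is an assumption for illustration, not a measured figure for any of these chips.

    # Textbook dynamic power approximation: P ~ C * V^2 * f (illustrative numbers only).
    def dynamic_power(c, volts, freq_ghz):
        return c * volts ** 2 * freq_ghz

    base  = dynamic_power(1.0, 1.00, 3.5)    # baseline operating point
    boost = dynamic_power(1.0, 1.10, 3.9)    # ~11% higher clock, hypothetical +0.10 V
    print(f"frequency +{(3.9 / 3.5 - 1) * 100:.0f}%, dynamic power +{(boost / base - 1) * 100:.0f}%")
    # frequency +11%, dynamic power +35%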

An increase in core count is harder to quantify. More cores only help users whose workloads scale across multiple cores, or give an opportunity for more users to work at once. There also has to be an interconnect to feed those cores, which scales up the power requirements. More cores don’t always help everyone, but they can be one of the easier ways to scale out certain types of performance.
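The classic way to see why core counts only help scalable workloads is Amdahl’s law: speedup is capped by the serial fraction of the job. A minimal sketch, assuming a hypothetical workload that is 90% parallel:

    # Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction.
    def amdahl_speedup(p, cores):
        return 1.0 / ((1.0 - p) + p / cores)

    for cores in (8, 16, 32, 64):
        print(cores, round(amdahl_speedup(0.90, cores), 2))
    # 8 -> 4.71, 16 -> 6.4, 32 -> 7.8, 64 -> 8.77: each doubling of cores buys less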

With the new 7F range of Rome processors, AMD is hoping to straddle that first and second rung of the ladder. These new parts offer more frequency, but also improve the L3 cache-to-core ratio, which will certainly help a number of edge cases that are L3 limited or interconnect limited. There is a lot of demand for high frequency hardware, and given the success of the Naples 7371 processor from the previous generation, AMD has expanded its remit into three new 7F processors. The F is for Frequency.

The processor we tested today was the 7F52, the most expensive offering ($3100), which has 16 cores with a base frequency of 3.5 GHz and a turbo of 3.9 GHz. This is the highest turbo of any AMD EPYC processor, and the CPU is built with 256 MB of L3 cache, offering the highest cache-to-core ratio of any x86 processor. At a full 16 MB per core, there is less chance of congestion between threads at the L3 level, which is an important consideration for caching workloads that reuse data.
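The cache-per-core arithmetic is straightforward. The 64-core comparison point in the sketch below (the EPYC 7742, which also carries 256 MB of L3) is taken from public spec sheets and included purely for context:

    l3_mb = 256
    print(l3_mb / 16)   # EPYC 7F52: 16 cores -> 16.0 MB of L3 per core
    print(l3_mb / 64)   # EPYC 7742: 64 cores ->  4.0 MB of L3 per core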

Our tests showed very good single thread performance, and a speedy ramp from idle to high power, suitable for bursty workloads where responsiveness matters. For high throughput performance, we saw some good numbers in our test suite, especially for rendering.

Personally, it’s great to see companies like AMD expanding their product portfolio into these niche areas. High frequency parts, high cache parts, or custom designs are all par for the course in the enterprise market, depending on the size of the customer (for a custom SKU) or the size of the demand (to make the SKU public). AMD has been doing this for generations, and in the past even created modified Opterons for the Ferrari F1 team to do more computational fluid dynamics within a given maximum FLOPS. I’m hoping AMD lets us in on any of these special projects in the future.


(Image: Threadripper, Rome, and Naples. AMD introducing RGB to CPUs.)

Comments

  • eastcoast_pete - Tuesday, April 14, 2020 - link

    Thanks Ian! Two questions: 1. Could you and some of your readers give specific examples of applications for which these high frequency CPUs are of great interest?
    2. Any recent moves by Intel to make software developers use AVX512 even more, basically whenever it would make any sense?
    The reason I am asking the second question is that this seems to be the last bastion Intel holds, almost regardless of CPU class. Except for AVX512, AMD is beating them in price/performance quite badly, now from servers to workstations to desktop to mobile.
  • schujj07 - Tuesday, April 14, 2020 - link

    DB servers are one place where you want a fast CPU. SAP HANA, for example, loves frequency and RAM. I've seen PRD systems with all of 16 CPUs but 1.5 TB of RAM.
  • DanNeely - Tuesday, April 14, 2020 - link

    AVX is a compute feature. Rendering and math heavy scientific/engineering workloads are where it'd shine. Databases, typical webservers, and most other 'conventional' business related software don't care.
  • Shorty_ - Thursday, April 16, 2020 - link

    Web serving is another place where frequency really helps. I run Threadrippers with ECC UDIMMs for PHP hosting for this express reason.
  • Mikewind Dale - Tuesday, April 14, 2020 - link

    Unfortunately, this breaks AMD's trend of being cheaper than Intel. A 20-core Xeon Gold 5218R boosts up to 4.0 GHz and costs $1273. This new EPYC has only 16 cores, boosts only up to 3.9 GHz, and costs $3100.

    Usually, AMD is cheaper than Intel, but this seems to be an exception. A pity.
  • Fataliity - Tuesday, April 14, 2020 - link

    That's because it's a specialized processor. If you are buying one of these, you won't be worried about the price.

    To get that much cache, they are using 6-8 chiplets, so as many as their top-of-the-line products. So yeah, it's going to cost more because there's more silicon.
  • schujj07 - Tuesday, April 14, 2020 - link

    The 5218R that you referenced isn't what the 7F52 is competing against. With a base clock of 2.1GHz the 5218R isn't a frequency optimized part. Most of Intel's CPUs have high boost clocks and middle of the road base clocks. The actual competition is the 6246R which has a 3.4GHz base and 4.1GHz boost clock. These high base clocks are for sustained performance in a given scenario.
  • MFinn3333 - Tuesday, April 14, 2020 - link

    That is also because it has about 12x as much L3 cache per CPU core. A combined 256 MB vs 30 MB cache size speaks for itself.
  • edzieba - Tuesday, April 14, 2020 - link

    It's down to two design choices: process choice, and core choice.

    AMD's hands are somewhat tied when it comes to process choice. They get what TSMC has on offer, and what TSMC has on offer is geared towards mobile devices because that's where the volume market is. The high-performance variants are variants, rather than the baseline.
    But even in general, as you shrink your process from 21nm on down, it gets harder and harder to clock up. Gate oxide thickness hit its limit generations ago, which is why gate voltage has remained near constant (~1.1 V) for so long. This is only going to get harder as processes shrink further: the gate oxide thickness stays constant while transistors are crammed ever closer together without interfering with each other.

    In AMD's hands is the design goal of cramming as many cores as possible in. Great for multi-core workloads, but not so great for single core speed. Getting CPUs to clock higher means using multiple transistors per gate (2-3 or even more as processes shrink), and AMD figured they may as well use these transistors for more cores instead of faster cores. The obvious downside is the difficulty in getting Zen cores to even approach 5GHz (with Zen 2 being notable for getting above 4GHz without overkill cooling), and that any workloads that do not span beyond one thread leave those transistors sitting idle.
  • twtech - Tuesday, April 14, 2020 - link

    On the 7H12 - Dell offers it in their EPYC servers, such as the R7525. It's currently about a $375 upgrade over the 7742.
