Polaris Refined: Better Fab Yields & a New Memory State

One of the components of AMD’s marketing angle for the Radeon RX 500 series is that it’s Polaris Refined. These are still the same Polaris 10 GPUs as in the Radeon RX 400 series, but AMD wants to highlight the small improvements that they have made/gained in the past year. For RX 400 owners this doesn’t amount to much – your cards are still shiny and chrome – but it serves to help differentiate the new cards from the old cards. And for the owners of cards like the R9 280 and 380 series that AMD is trying to reach and convince to upgrade, it’s the justification for why AMD thinks they should want to upgrade now after having passed on the RX 400 series.

There are two elements to Polaris Refined: silicon improvements, and a new memory clock state. The former in turn is comprised of both the benefits in improving fab yields and quality that AMD has enjoyed over the past year, and a new revision of Polaris 10.

In the case of fab yields, all of the revised Polaris chips are being manufactured on what AMD is calling the “Latest Generation FinFET 14” process. This is a bit of a mouthful, but in short it’s AMD calling attention to the improvements partners GlobalFoundries and Samsung have made to their 14nm LPP processes in the last year. Yields are up and overall chip quality is better, which improves the average performance (clockspeed & power) characteristics of the chips. Both foundries have also been making other undisclosed, small tweaks to their lines to further boost chip quality. It’s not a new fab process (it’s still 14nm LPP) but it’s an improvement over where Polaris 10 production started nearly a year ago.

Typically these kinds of yearly gains would simply be rolled into a product line without any fanfare – these improvements are gradual over time anyhow, not a binary event – but for the RX 500 series AMD wants to call attention to them to explain why clockspeeds are improved versus the RX 400 series cards released last year. Though to be clear here, the difference isn’t dramatic; the gains from a year’s optimization to a manufacturing line are a fraction of a full node improvement.

Meanwhile AMD is also releasing a new revision of Polaris 10, which is being used in the RX 580/570 launch. These revised chips have received further tweaking to reach higher clockspeeds, allowing AMD to reliably clock up a bit higher and/or reduce power consumption a bit. The new revision also fixes a couple of minor issues with the GPUs. Specifically, AMD is adding a new mid-power memory clock state so that applications that require memory clocks faster than idle – primarily mixed-resolution multi-monitor and video decoding – no longer cause the memory to clock up to its most power-demanding speeds, keeping overall power consumption down.

One thing to note is that while AMD’s chip quality has improved through the combination of manufacturing improvements and revised silicon, for the desktop AMD is investing all of those gains into improving clockspeeds. This is why the TBPs have gone up by 30-35W over the RX 480 and RX 470.

Power Consumption: By the Numbers

Since all of AMD’s optimizations relate to power consumption in some way, let’s take a look at that now. There are a few different things we can look at here, and I’ll start with what’s probably the most burning question: just how much better is the new revision of Polaris 10 over the old revision?

To test this, I’ve taken the Radeon RX 580 sample AMD sent over – PowerColor’s Red Devil RX 580 – and underclocked it to the same clockspeeds as the RX 480. It should be noted however that this process is a bit more complex than just underclocking to the RX 480’s official boost clock of 1266MHz. Because the RX 480 power throttles under both FurMark and Crysis 3, it’s necessary to match the RX 480’s specific clockspeeds in those scenarios.

After doing so, what we find are mixed results.

Load Power Testing, Normalized GPU Clockspeeds (Power Draw at Wall)
                 FurMark (740MHz)    Crysis 3 (1230MHz)
Radeon RX 480    231W                301W
Radeon RX 580    205W                314W

Even after dialing the RX 580 down to 1230MHz for Crysis 3 to match the reference RX 480, power consumption at the wall is still 13W higher than the RX 480 (314W versus 301W). Performance is the same, so the RX 580 isn’t doing more work, but nonetheless power consumption at a system level is still a bit higher.

On the other hand, turning the RX 580 down to 740MHz to match the RX 480 on FurMark (power viruses cause significant throttling), we find the RX 580 ahead by a rather shocking 26W. Power consumption at the wall is 205W, versus 231W for the RX 480.
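To put the two results side by side, here is a minimal Python sketch that works out the RX 580's difference from the RX 480 in each test. The readings are copied from the table above; the dictionary layout and the `rx580_delta` helper are just illustrative names, not part of any test tooling used for this review.

```python
# Wall-power readings (watts) copied from the load power table above.
wall_power_w = {
    "FurMark (740MHz)": {"RX 480": 231, "RX 580": 205},
    "Crysis 3 (1230MHz)": {"RX 480": 301, "RX 580": 314},
}


def rx580_delta(readings):
    """Return (watts, percent) by which the RX 580 differs from the RX 480."""
    delta = readings["RX 580"] - readings["RX 480"]
    return delta, 100.0 * delta / readings["RX 480"]


for test, readings in wall_power_w.items():
    watts, pct = rx580_delta(readings)
    # A negative value means the RX 580 drew less power at the wall.
    print(f"{test}: {watts:+d}W ({pct:+.1f}%)")
```

Keep in mind these are system-level numbers, so the percentages understate the change at the card itself.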

Broadly speaking, although FurMark isn’t always the best tool for load power measurement on cross-vendor cards, it has proven to be very reliable when looking at cards based on the same architecture. It suffers from a very specific limitation: it will push a card to its TDP limit, and this can vary among manufacturers, but even with that it typically gives you a consistent and sane metric to compare like-cards.

Consequently I tend to favor the FurMark numbers here. However it doesn’t change the fact that power consumption numbers under Crysis 3 are wildly different, and paint the RX 580 as being worse. So they can’t both be right, can they?

As it stands, I suspect we’re getting into the area of random variation – with a sample size of 1 on each Radeon card, the random variations in quality from GPU to GPU are drowning out the actual data. It’s entirely possible we’re looking at a worse-than-average RX 480 and a better-than-average RX 580, especially as the latter has been binned for factory overclocking. However I’m not ready to rule out that something more complex may be going on here: that the improvements to Polaris 10’s power curve aren’t linear/consistent. It may be that AMD’s greatest gains are at lower clockspeeds and voltages, and that those improvements taper off at higher clockspeeds and voltages.

But for the moment, I’m ruling it a push. The FurMark data is interesting, but without Crysis 3 being in agreement it’s not enough to say anything definitive.

That New Memory State

Finally, let’s take a look at the specific benefits AMD is touting for the new memory state that the company has included with the new Polaris 10 revision. The new mid-power state allows the memory to be clocked at 4Gbps GDDR5 on the RX 580. The other power states on the RX 580 (and the RX 480) are 1.2Gbps (idle) and 8Gbps (full load), so on the RX 480 if AMD ever needed to increase the memory clocks above idle, their only option was to go to full clocks, which on GDDR5 is relatively expensive.
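As a rough illustration of why the extra state matters, the sketch below models memory state selection as "pick the slowest state that is fast enough." To be clear, this policy is an assumption made for illustration; AMD hasn't disclosed how the driver or firmware actually chooses a state, and the 2.5Gbps requirement is a made-up figure standing in for a workload that needs somewhat more than idle.

```python
# Memory data rates (Gbps) for each card's power states, as listed above.
RX_480_STATES = (1.2, 8.0)
RX_580_STATES = (1.2, 4.0, 8.0)


def pick_memory_state(required_gbps, states):
    """Pick the slowest state that meets the requirement (hypothetical policy)."""
    for rate in sorted(states):
        if rate >= required_gbps:
            return rate
    # Demand exceeds every available state, so run at the fastest one.
    return max(states)


# Hypothetical workload that needs a bit more than the idle state can supply.
need_gbps = 2.5
print("RX 480 selects:", pick_memory_state(need_gbps, RX_480_STATES), "Gbps")
print("RX 580 selects:", pick_memory_state(need_gbps, RX_580_STATES), "Gbps")
```

Under this simple model the RX 480 has nowhere to go but 8Gbps, while the RX 580 can stop at 4Gbps.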

The two scenarios AMD is looking to address with this new memory clock state are multi-monitor configurations and video playback. In the case of the former, mismatched monitors would require the RX 480 to go to its full memory clocks even when idling. Due to the timing differences, the higher memory clock is needed to avoid flickering. Matched monitors avoid this problem, as they have identical timings. Otherwise in the case of video playback, while AMD has their fixed function decoder to offload most of the work, it still generates a lot of video data, which can require the memory to jump to a higher clock state to keep up. Though the video playback scenario is particularly complex as the GPU clock itself can also jump up if the video decoder needs a higher performance state for itself.

Putting this to the test, I ran both the RX 480 and RX 580 through a mix of multi-monitor and video playback scenarios.

Multi-Monitor Power Testing (Power Draw at Wall)
                        Single Monitor    Multi-Monitor Matched    Multi-Monitor Mismatched
                        (1080p)           (1080p + 1080p)          (1080p + 1440p)
Radeon RX 480           76W               76W                      100W
Radeon RX 580           74W               74W                      100W
GeForce GTX 1060 6GB    73W               73W                      73W

Starting with the multi-monitor testing, the results were not what I was expecting. While AMD tells me that this should trigger the new mid-power state, I haven’t been able to successfully trigger it. With matched monitors the RX 580 can go to full idle, just like the RX 480. Otherwise with mismatched monitors, it always goes to 8Gbps, skipping past 4Gbps and never returning. Even with a few different monitors, the results were always the same. Due to the quick launch I haven’t had time to further debug the issue, so I’m not sure if it’s related to the monitors or if it’s something specific to the Red Devil RX 580.

Video Playback Power Testing (Power Draw at Wall)
                        Idle    High Bitrate H.264    High Bitrate HEVC
Radeon RX 480           76W     125W                  125W
Radeon RX 580           74W     90W                   93W
GeForce GTX 1060 6GB    73W     96W                   96W

On the plus side however, AMD’s new memory state worked as expected with video playback. Whereas the RX 480 would have to settle for an 8Gbps memory clock when playing back high-bitrate H.264 and HEVC video in Media Player Classic – Home Cinema, the RX 580 would settle at 4Gbps. In fact the RX 580 actually performed a bit better than expected; the RX 480 would typically have to go to higher core clock speeds as well, compounding the power cost. As a result power consumption at the wall was notably lower on the RX 580 than the RX 480.

And just for reference, this is actually a bit better than the GeForce GTX 1060 6GB. NVIDIA’s midrange card goes to its maximum memory clock in the same tests, and as a result power consumption at the wall was a few watts higher than the RX 580.
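Since the three cards idle at slightly different levels, it can help to look at the cost of playback over idle rather than the absolute numbers. Here is a small Python sketch that does the subtraction using the readings from the video playback table above; the `playback_cost` helper and the data layout are illustrative only.

```python
# Wall power (watts) from the video playback table: (idle, H.264, HEVC).
playback_w = {
    "Radeon RX 480": (76, 125, 125),
    "Radeon RX 580": (74, 90, 93),
    "GeForce GTX 1060 6GB": (73, 96, 96),
}


def playback_cost(card):
    """Return watts above idle for (H.264, HEVC) playback on the given card."""
    idle, h264, hevc = playback_w[card]
    return h264 - idle, hevc - idle


for card in playback_w:
    h264, hevc = playback_cost(card)
    print(f"{card}: +{h264}W H.264, +{hevc}W HEVC over idle")
```

As with the other tests, these are whole-system measurements, so the deltas include whatever extra work the rest of the system is doing during playback.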


129 Comments


  • BrokenCrayons - Wednesday, April 19, 2017 - link

    Eh, the use of TBP did leap out as an error right away until I spent a few minutes thinking about it. I get that the industry (well the industry of all two of the world's GPU companies) is constantly trying to buzzword its way into presenting a product in the best light possible, but it seems like a stupid and pointless terminology change from over here.
  • Wineohe - Tuesday, April 18, 2017 - link

    I drank the koolaid and bought a pair of RX480's when they first came out. Bad idea. I followed this up with a GTX 1080, which is what I should have bought in the first place. If my budget was a single RX480/580 then I would definitely opt for a 1060 instead.
  • Lolimaster - Tuesday, April 18, 2017 - link

    RX570 4GB is damn good deal for 1080p or 1600x1200 CRT-
  • Lolimaster - Tuesday, April 18, 2017 - link

    Then sell it in 2019 for a nice RX770 10nm gpu.
  • Pork@III - Wednesday, April 19, 2017 - link

    Excessive consumption of electricity for devices from the middle class as performance. AMD again began to offer plates, which can fry eggs.
  • SydneyBlue120d - Wednesday, April 19, 2017 - link

    Is VP9 with HDR decoding enabled with these cards?
  • slickr - Wednesday, April 19, 2017 - link

    Good job? Anandtech has become the laughing stock of hardware reviews. The suite they are using is painfully outdated, with games 3-4 years old on average, the number of games is so small, the titles are outdated, they tested Battlefield 4, instead of Battlefield 1. BF1 supports DX12 as well, its the new engine that most EA games use and are going to use in the future and they are testing BF4 which is a very old title with an old engine that no one is using anymore!

    Dirt Rally, Crysis 3, all old games. Even Hitman is an old game now and should be dropped. Where is For Honor, Mass Effect Andromeda, Ghost Recon Wildlands, Deus EX: MD, Watch Dogs 2, Mafia 3, Forza 3, Sniper Elite 4, BF1, etc...

    The only relevant games they use are ROTR, AOTS, The Division and Witcher 3. No minimums tested, no maximums, no frametime, no overclocking, no custom clocked cards on the Nvidia side, etc...

    This is a barebones review!
  • milkod2001 - Wednesday, April 19, 2017 - link

    Im afraid you are right. Anand is no longer in charge of this site, it is owned by Purch, advertising company which only care about ad clicks. They could not give a slightest sh.t. about your latest games benchmarks.
  • Ryan Smith - Wednesday, April 19, 2017 - link

    True, the site is owned by Purch. But it is run by me, and when it comes to Editorial, the buck stops here.

    Every article you see posted here and every choice made in how we benchmark is my responsibility. Even on those articles I don't write, the editor in charge has spoken to me at some point to gather my feedback and to solicit my advice. This is to ensure that the articles you guys get live up to the quality that AnandTech is known for. And I do that precisely because I do care; I care about bringing you guys the information and analysis you need to see, and I care about trying to bring you the things you'd like to see.
  • Ryan Smith - Wednesday, April 19, 2017 - link

    I commented on the game selection elsewhere in the thread: http://www.anandtech.com/comments/11278/amd-radeon...

    In short we refresh once per year, and the next refresh will be Vega (the current suite was rolled out with the Pascal launch). This ensures consistency between articles, and makes Bench more useful for you guys. There are other sites out there that do differently - and it's totally a valid way to test - but it's not how we want to do things. Our goal is apples-to-apples, and sometimes that requires being methodical and a bit slow. The benefit is that we can stand behind our data knowing full well that the results make sense, and that we have a very good understanding of the tests used.

    As for the number of games, there are 9 games here. On the one hand this is more than most sites, so I'd like to think we're doing well enough here, and on the other hand there is a practical limit to how many games we can have, due to how long it takes to run all of those games. If we added more games, we'd have to give up something else. And I should note that every data point you see here was collected or validated for this article, so you're looking at 9 games, 13 video cards, and multiple resolutions per card. It adds up very quickly.

    As for custom clocked cards, this has a bit of a history to it:

    http://www.anandtech.com/show/3988/the-use-of-evga...

    The last time we included the opposition's factory overclocked cards, you guys rightfully called us out on it, and made it clear that you wanted apples-to-apples testing. Since then, this is exactly what we've delivered: reviews and their conclusions are based around stock-clocked cards/configurations. This ensures that what you see is the baseline performance of a card, and that no retail card should be slower than the card we've tested. Especially when most buyers purchase the cheapest card they can find, it's not the fastest card that matters, it's the slowest.
