The Polaris Architecture: In Brief

For today’s preview I’m going to quickly hit the highlights of the Polaris architecture.

In their announcement of the architecture this year, AMD laid out a basic overview of what components of the GPU would see major updates with Polaris. Polaris is not a complete overhaul of past AMD designs, but AMD has combined targeted performance upgrades with a chip-wide energy efficiency upgrade. As a result Polaris is a mix of old and new, and a lot more efficient in the process.

At its heart, Polaris is based on AMD’s 4th generation Graphics Core Next architecture (GCN 4). GCN 4 is not significantly different from GCN 1.2 (Tonga/Fiji), and in fact GCN 4’s ISA is identical to that of GCN 1.2. So everything we see here today comes not from broad architectural changes, but from low-level microarchitectural changes that improve how instructions execute under the hood.

Overall AMD is claiming that GCN 4 (via RX 480) offers a 15% improvement in shader efficiency over GCN 1.1 (R9 290). This comes from two changes: instruction prefetching and a larger instruction buffer. In the case of the former, GCN 4 can, with the driver’s assistance, attempt to prefetch future instructions, something GCN 1.x could not do. When done correctly, this reduces or eliminates the need for a wave to stall while waiting on an instruction fetch, keeping the CU fed and active more often. Meanwhile the per-wave instruction buffer (which is separate from the register file) has been increased from 12 DWORDs to 16 DWORDs, allowing more instructions to be buffered and, according to AMD, improving single-threaded performance.
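To put the buffer change in perspective, here is a minimal back-of-the-envelope sketch. It assumes GCN's base instruction encodings of one DWORD (32-bit) or two DWORDs (64-bit) and ignores trailing literal constants; the function name is hypothetical and this is not a model of how the hardware actually schedules fetches.

```python
# Rough capacity estimate for the per-wave instruction buffer.
# Assumption: instructions occupy 1 DWORD (32-bit encoding) or
# 2 DWORDs (64-bit encoding); literal constants are ignored.

OLD_BUFFER_DWORDS = 12  # GCN 1.x, per the article
NEW_BUFFER_DWORDS = 16  # GCN 4, per the article


def buffered_instructions(buffer_dwords, dwords_per_instruction):
    """How many whole instructions of a given size fit in the buffer."""
    return buffer_dwords // dwords_per_instruction


# Relative growth in raw buffer capacity (one third larger).
capacity_growth = (NEW_BUFFER_DWORDS - OLD_BUFFER_DWORDS) / OLD_BUFFER_DWORDS

# Best case (all 32-bit instructions) and worst case (all 64-bit).
old_range = (buffered_instructions(OLD_BUFFER_DWORDS, 2),
             buffered_instructions(OLD_BUFFER_DWORDS, 1))
new_range = (buffered_instructions(NEW_BUFFER_DWORDS, 2),
             buffered_instructions(NEW_BUFFER_DWORDS, 1))
```

Under those assumptions the buffer goes from holding roughly 6 to 12 instructions per wave to roughly 8 to 16, which gives a sense of how far ahead a single wave can run before it needs another fetch.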

Outside of the shader cores themselves, AMD has also made enhancements to the graphics front-end for Polaris. AMD’s latest architecture integrates what AMD calls a Primitive Discard Accelerator. True to its name, the job of the discard accelerator is to remove (cull) triangles that are too small to be used, and to do so early enough in the rendering pipeline that the rest of the GPU is spared from having to deal with these unnecessary triangles. Degenerate triangles are culled before they even hit the vertex shader, while small triangles are culled a bit later, after the vertex shader but before they hit the rasterizer. There’s no visual quality impact to this (only triangles that can’t be seen/rendered are culled), and as claimed by AMD, the benefits of the discard accelerator increase with MSAA levels, as MSAA otherwise exacerbates the small triangle problem.
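The two culling stages can be illustrated with a small software sketch. This is not AMD's hardware logic, which has not been published in this detail; it assumes that degenerate triangles are detected from repeated vertex indices (positions are not yet known before the vertex shader) and that a "small" triangle is one whose screen-space bounding box contains no pixel center, with one sample per pixel at (x + 0.5, y + 0.5). All function names are hypothetical.

```python
import math


def is_degenerate(i0, i1, i2):
    """Stage 1 (before vertex shading): a triangle that reuses a vertex
    index has zero area and can be dropped without knowing positions."""
    return len({i0, i1, i2}) < 3


def _range_has_pixel_center(lo, hi):
    """True if some pixel center k + 0.5 (integer k) lies in [lo, hi]."""
    return math.ceil(lo - 0.5) <= math.floor(hi - 0.5)


def misses_all_samples(a, b, c):
    """Stage 2 (after vertex shading, before rasterization): conservative
    test on screen-space (x, y) positions. If the bounding box contains no
    pixel center, the triangle cannot cover any sample and is safe to cull.
    Thin triangles that slip between samples may still be kept."""
    xs = (a[0], b[0], c[0])
    ys = (a[1], b[1], c[1])
    return not (_range_has_pixel_center(min(xs), max(xs))
                and _range_has_pixel_center(min(ys), max(ys)))


# A sub-pixel triangle that falls between pixel centers is culled,
# while a triangle spanning several pixels is kept.
tiny = ((0.1, 0.1), (0.4, 0.1), (0.1, 0.4))
large = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
tiny_culled = misses_all_samples(*tiny)
large_culled = misses_all_samples(*large)
```

Because the test only rejects triangles that provably touch no sample, it is consistent with the article's point that culling has no visual quality impact.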

Along these lines, Polaris also implements a new index cache, again meant to improve geometry performance. The index cache is designed specifically to accelerate geometry instancing performance, allowing small instanced geometry to stay close by in the cache, avoiding the power and bandwidth costs of shuffling this data around to other caches and VRAM.

Finally, at the back-end of the GPU, the ROP/L2/Memory controller partitions have also received their own updates. Chief among these is that Polaris implements the next generation of AMD’s delta color compression technology, which uses pattern matching to reduce the size and resulting memory bandwidth needs of frame buffers and render targets. As a result, color compression delivers a de facto increase in available memory bandwidth and a decrease in power consumption, at least so long as the buffer is compressible. With Polaris, AMD supports a larger pattern library to better compress more buffers more often, improving on GCN 1.2 color compression by around 17%.
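The general idea behind delta compression can be shown with a toy example. AMD has not disclosed its actual patterns or tile formats, so the sketch below simply assumes a single 8-bit channel and one hypothetical pattern: store the first pixel in full and every other pixel as a small signed difference from it, falling back to raw storage when a difference does not fit.

```python
# Toy lossless delta compression of one tile of single-channel pixels.
# Assumptions: 8-bit pixels, 4-bit signed deltas, and one pattern only.
# The bit needed to flag the mode is ignored for simplicity.

def compress_tile(pixels, delta_bits=4):
    """Encode as base + deltas if every delta fits in delta_bits."""
    base = pixels[0]
    deltas = [p - base for p in pixels[1:]]
    limit = 1 << (delta_bits - 1)  # signed range is [-limit, limit)
    if all(-limit <= d < limit for d in deltas):
        return {"mode": "delta", "base": base, "deltas": deltas}
    return {"mode": "raw", "pixels": list(pixels)}


def decompress_tile(tile):
    """Exactly reconstruct the original pixels (compression is lossless)."""
    if tile["mode"] == "delta":
        return [tile["base"]] + [tile["base"] + d for d in tile["deltas"]]
    return list(tile["pixels"])


def size_in_bits(tile, pixel_bits=8, delta_bits=4):
    """Storage cost of a tile under the assumptions above."""
    if tile["mode"] == "delta":
        return pixel_bits + delta_bits * len(tile["deltas"])
    return pixel_bits * len(tile["pixels"])


smooth = [100, 101, 102, 103, 104, 105, 106, 107]  # gentle gradient
noisy = [0, 255, 3, 200, 17, 90, 250, 1]           # incompressible

smooth_tile = compress_tile(smooth)
noisy_tile = compress_tile(noisy)
```

Here the smooth tile shrinks from 64 bits to 36 while the noisy tile stays at 64, which is the "so long as the buffer is compressible" caveat in miniature. A larger pattern library, in this framing, means more ways for a tile to qualify for the compact form.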

Otherwise we’ve already covered the increased L2 cache size, which is now at 2MB. Paired with this is AMD’s latest generation memory controller, which can now officially go to 8Gbps, and even a bit more than that when overclocking.
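For a sense of what 8Gbps means in practice, peak memory bandwidth is the per-pin data rate multiplied by the bus width. The sketch below assumes the RX 480's 256-bit memory bus, which is not stated in this section, and the function name is hypothetical.

```python
def peak_bandwidth_gb_per_s(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin rate (Gbps) x pins / 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# Assumption: 256-bit bus (RX 480); 8 Gbps is the rate quoted above.
rx480_bandwidth = peak_bandwidth_gb_per_s(8, 256)
```

This is a theoretical peak; effective bandwidth depends on access patterns and on how much the color compression described above can save.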

449 Comments

  • just4U - Wednesday, June 29, 2016 - link

    yeah.. competes with a 970 on some games.. and the reference 980 on others. Was a good card.
  • akamateau - Wednesday, June 29, 2016 - link

    Seriously?

    2 RX 480 in Crossfire CRUSHES GTX 1080!!!!

    rtflol
  • AntDX316 - Thursday, June 30, 2016 - link

    The hype was fake.

    I mean honestly releasing a 14nm flagship slower than their previous gen is a step in the wrong direction. I wouldn't be surprised if they just release an $800 version of the 14nm in early 2017 with masssive power. They just need more time to get their fabrications correct. I assume there could be some unforeseen problems and if the problems do arise with the $200 version it won't leak into the common video card world. It would be kept rare and quiet so that stuff can be fixed for their $800 flagship.
  • slickr - Thursday, June 30, 2016 - link

    This isn't a flagship card you moron! This is a mid range mainstream card, created specifically for the mass market. Their flagship card is coming in 2017, its Vega and its got HBM2, matured 14nm process, 4000+ stream processors, etc...
  • Yojimbo - Thursday, June 30, 2016 - link

    It's not a flagship card. They don't have a new flagship card so they tried to hype their mainstream card with "This is what you really want/need!" The recent trend is for gamers to buy more expensive cards, not cheaper ones, though, so in my opinion unless the economy tanks it's a bad strategy.

    If it performed 20% better or they sold it for $160 instead of $200 it would be all the things they tried to hype it as being. But as it is, it just looks like the first 14/16nm mainstream card to market. No more, no less. It's a solid card but it'll never be that impressive to launch a mainstream card that slots right into what the competition will offer in 1 or 2 months while leaving the rest of the market uncovered.
  • Gigaplex - Thursday, June 30, 2016 - link

    Next you'll be telling us that the NVIDIA 750 Ti was the Maxwell flagship. It came first, that does not make it a flagship.
  • ihatenividiaastheyareaholes - Saturday, July 09, 2016 - link

    heres a thought how about people stop arguaing about this and wake the fuck up to what is going on that being the consoles trying to knock pc of its glorious pedastel
  • IronTed - Wednesday, June 29, 2016 - link

    You sir are a moron.
  • slickr - Thursday, June 30, 2016 - link

    The 390 easily beast the gimped and fraudulent 970 3.5GB trash for LESS money. The trash 970 still costs around $300, in extremely rare cases $280 for only 3.5GB.

    The RX 480 costs $200 or $240, consumes less, has full DX12 support, its cooler, quieter, overclocks a lot more.
  • sonicmerlin - Friday, July 01, 2016 - link

    The reference cards don't overclock at all. If you want an AIB card expect to pay significantly more.
