The Polaris Architecture: In Brief

For today’s preview I’m going to quickly hit the highlights of the Polaris architecture.

In their announcement of the architecture this year, AMD laid out a basic overview of what components of the GPU would see major updates with Polaris. Polaris is not a complete overhaul of past AMD designs, but AMD has combined targeted performance upgrades with a chip-wide energy efficiency upgrade. As a result Polaris is a mix of old and new, and a lot more efficient in the process.

At its heart, Polaris is based on AMD’s 4th generation Graphics Core Next architecture (GCN 4). GCN 4 is not significantly different from GCN 1.2 (Tonga/Fiji), and in fact GCN 4’s ISA is identical to that of GCN 1.2. So everything we see here today comes not from broad architectural changes, but from low-level microarchitectural changes that improve how instructions execute under the hood.

Overall AMD is claiming that GCN 4 (via RX 480) offers a 15% improvement in shader efficiency over GCN 1.1 (R9 290). This comes from two changes: instruction prefetching and a larger instruction buffer. In the case of the former, GCN 4 can, with the driver’s assistance, attempt to pre-fetch future instructions, something GCN 1.x could not do. When done correctly, this reduces or eliminates the need for a wave to stall while waiting on an instruction fetch, keeping the CU fed and active more often. Meanwhile the per-wave instruction buffer (which is separate from the register file) has been increased from 12 DWORDs to 16 DWORDs, allowing more instructions to be buffered and, according to AMD, improving single-threaded performance.

Outside of the shader cores themselves, AMD has also made enhancements to the graphics front-end for Polaris. AMD’s latest architecture integrates what AMD calls a Primitive Discard Accelerator. True to its name, the job of the discard accelerator is to remove (cull) triangles that are too small to be used, and to do so early enough in the rendering pipeline that the rest of the GPU is spared from having to deal with these unnecessary triangles. Degenerate triangles are culled before they even hit the vertex shader, while small triangles are culled a bit later, after the vertex shader but before they hit the rasterizer. There’s no visual quality impact to this (only triangles that can’t be seen/rendered are culled), and as claimed by AMD, the benefits of the discard accelerator increase with MSAA levels, as MSAA otherwise exacerbates the small triangle problem.
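To make the two culling stages concrete, here is a minimal sketch of the kind of tests involved. It is not AMD's hardware logic, which has not been published in detail. The function names are hypothetical, and the small-triangle check assumes a single sample at each pixel center and uses only the triangle's bounding box, so it is conservative.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def is_degenerate(indices: Tuple[int, int, int]) -> bool:
    """A triangle that reuses a vertex index has zero area.

    This can be decided from the index data alone, which is why such
    triangles can be dropped before any vertex shading happens.
    """
    a, b, c = indices
    return a == b or b == c or a == c


def covers_no_sample(p0: Point, p1: Point, p2: Point) -> bool:
    """Conservative small-triangle test on screen-space positions.

    Assumption: one sample per pixel, located at (x + 0.5, y + 0.5).
    If the triangle's bounding box contains no pixel center, the
    triangle cannot cover any sample and can be discarded before
    rasterization.
    """
    min_x, max_x = min(p0[0], p1[0], p2[0]), max(p0[0], p1[0], p2[0])
    min_y, max_y = min(p0[1], p1[1], p2[1]), max(p0[1], p1[1], p2[1])
    # First pixel center at or after the box's minimum on each axis.
    first_cx = math.ceil(min_x - 0.5) + 0.5
    first_cy = math.ceil(min_y - 0.5) + 0.5
    return first_cx > max_x or first_cy > max_y


# A triangle that repeats index 3 is degenerate.
print(is_degenerate((3, 3, 7)))
# A sliver that sits entirely between pixel centers covers no sample.
print(covers_no_sample((10.1, 10.1), (10.4, 10.1), (10.1, 10.4)))
```

The first test needs only indices, while the second needs transformed positions, which mirrors the split between culling before and after the vertex shader described above.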

Along these lines, Polaris also implements a new index cache, again meant to improve geometry performance. The index cache is designed specifically to accelerate geometry instancing performance, allowing small instanced geometry to stay close by in the cache, avoiding the power and bandwidth costs of shuffling this data around to other caches and VRAM.

Finally, at the back-end of the GPU, the ROP/L2/Memory controller partitions have also received their own updates. Chief among these is that Polaris implements the next generation of AMD’s delta color compression technology, which uses pattern matching to reduce the size and resulting memory bandwidth needs of frame buffers and render targets. The result is a de facto increase in available memory bandwidth and a decrease in power consumption, at least so long as the buffer is compressible. With Polaris, AMD supports a larger pattern library to better compress more buffers more often, improving on GCN 1.2 color compression by around 17%.
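The general idea behind delta-based compression can be illustrated with a toy example. This is a generic base-plus-deltas encoder, not AMD's actual scheme or pattern library, neither of which is publicly specified. The function names, the 8-pixel tile, and the 4-bit delta width are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def delta_encode_tile(
    pixels: List[int], delta_bits: int = 4
) -> Optional[Tuple[int, List[int]]]:
    """Try to store a tile as one base value plus small signed deltas.

    Returns (base, deltas) when every delta fits in `delta_bits`,
    otherwise None, meaning the tile would be stored uncompressed.
    """
    base = pixels[0]
    lo = -(1 << (delta_bits - 1))
    hi = (1 << (delta_bits - 1)) - 1
    deltas = [p - base for p in pixels[1:]]
    if all(lo <= d <= hi for d in deltas):
        return base, deltas
    return None


def delta_decode_tile(base: int, deltas: List[int]) -> List[int]:
    """Losslessly rebuild the original tile from base and deltas."""
    return [base] + [base + d for d in deltas]


def compressed_bits(n_pixels: int, bits_per_pixel: int = 8, delta_bits: int = 4) -> int:
    """Size of a compressed tile: one full value plus (n - 1) deltas."""
    return bits_per_pixel + (n_pixels - 1) * delta_bits


smooth = [200, 201, 199, 203, 200, 198, 202, 201]  # gentle gradient
noisy = [0, 255, 3, 240, 17, 128, 64, 250]          # high-contrast noise

encoded = delta_encode_tile(smooth)
print(encoded is not None)                 # smooth tile compresses
print(delta_encode_tile(noisy) is None)    # noisy tile does not
print(compressed_bits(len(smooth)), "bits vs", 8 * len(smooth), "raw")
```

Smooth regions compress and noisy ones fall back to raw storage, which is why the bandwidth benefit holds only while the buffer is compressible. A larger set of recognized patterns means more tiles take the compressed path.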

Otherwise we’ve already covered the increased L2 cache size, which is now at 2MB. Paired with this is AMD’s latest generation memory controller, which can now officially go to 8Gbps, and even a bit more than that when overclocking.
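For a sense of scale, the per-pin data rate converts to peak memory bandwidth as sketched below. The 256-bit bus width is an assumption taken from the RX 480's published specifications rather than from this section, and the helper name is hypothetical.

```python
def peak_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbps) times bus width, over 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# Assumed 256-bit bus (RX 480 spec) at the official 8Gbps rate.
print(peak_bandwidth_gbs(8.0, 256))
```

This is a theoretical peak. Delivered bandwidth depends on access patterns and on how much of the traffic color compression removes.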


449 Comments


  • catavalon21 - Wednesday, July 13, 2016 - link

    The review HardOCP did on the 480 in CF mode against the 1080 and 1070 suggests YOUR statement missed the mark...if only I could type, or proofread, or something.
  • AbbieHoffman - Wednesday, June 29, 2016 - link

    Well! I was going to buy the RX 480 to replace my GTX 970, But it looks like there is no point! I really thought the 480X was going to perform better than the 980.
  • Meteor2 - Thursday, June 30, 2016 - link

    If you want to replace your 970 you're going to have buy a 1070.
  • Laststop311 - Thursday, June 30, 2016 - link

    Hopefully this brings price of 1070 down to 299.99 for the custom ones.
  • vladx - Thursday, June 30, 2016 - link

    Good luck with that
  • amitp05 - Thursday, June 30, 2016 - link

    AMD need to push performance UP by 15% and power consumption DOWN 15%. To make this card truly tempting and to match the hype they created.

    But I'll still buy AMD. We need them :(

    AMD: Please don't Hype Zen too much. It feels bad when expectation you created are not met.
  • AntDX316 - Thursday, June 30, 2016 - link

    If you want to support AMD just get XB2 and PS5.
  • D. Lister - Thursday, June 30, 2016 - link

    Nah, last time my GPU died, I spent several months on a crappy IGPU. AMD, or ANY company for that matter, didn't come for my support. Then why should I support any of them?
  • GPU2016follower - Thursday, June 30, 2016 - link

    I don't even know why I come here maybe just by curiosity but I don't trust anandtech and their biased reviews always in favor of Nvidia cards. In the majority of other websites' reviwers just to name few: Techspot, Forbes, Polygon, arstechnica, PCgamer, ... the RX 480 easily dominates the GTX 970 in 95 % gaming benchmarks by an average from 5 and up to 10fps and the RX 480 manages in very few cases to trail the GTX 980 by only 2 or 3 fps below.

    I think I will wait for the custom versions to see if they can offer better performance, maybe we will see the Sapphire, ASUS, XFX RX 480 beefed with their more powerful OC versions to compete against the GTX 980.
  • AntDX316 - Thursday, June 30, 2016 - link

    The microstutter is way higher than nvidias offerings..

    Unfortunately certain things aren't made common like adaptive vsync/gsync, micro stutter, frame draw response time. FPS is what most of the gamers look for and soley chase. I believe because they are too busy thinking about what they were taught before and/or too busy with whatever else they are doing in life other than keeping up-to-date of what does matter for the best gaming experience and why. It took a while for people to move away from the you can only see 24/30 fps and no more.
