The Polaris Architecture: In Brief

For today’s preview I’m going to quickly hit the highlights of the Polaris architecture.

In their announcement of the architecture this year, AMD laid out a basic overview of what components of the GPU would see major updates with Polaris. Polaris is not a complete overhaul of past AMD designs, but AMD has combined targeted performance upgrades with a chip-wide energy efficiency upgrade. As a result Polaris is a mix of old and new, and a lot more efficient in the process.

At its heart, Polaris is based on AMD’s 4th generation Graphics Core Next architecture (GCN 4). GCN 4 is not significantly different from GCN 1.2 (Tonga/Fiji), and in fact GCN 4’s ISA is identical to that of GCN 1.2. So everything we see here today comes not from broad architectural changes, but from low-level microarchitectural changes that improve how instructions execute under the hood.

Overall AMD is claiming that GCN 4 (via RX 480) offers a 15% improvement in shader efficiency over GCN 1.1 (R9 290). This comes from two changes: instruction prefetching and a larger instruction buffer. In the case of the former, GCN 4 can, with the driver’s assistance, attempt to prefetch future instructions, something GCN 1.x could not do. When done correctly, this reduces or eliminates the need for a wave to stall while waiting on an instruction fetch, keeping the CU fed and active more often. Meanwhile the per-wave instruction buffer (which is separate from the register file) has been increased from 12 DWORDs to 16 DWORDs, allowing more instructions to be buffered and, according to AMD, improving single-threaded performance.
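To get a feel for what the jump from 12 to 16 DWORDs buys, here is a minimal sketch that counts how many whole instructions fit in a buffer of a given size. It assumes instructions occupy 1 or 2 DWORDs each (GCN uses 32-bit and 64-bit encodings); the function name and the sample instruction stream are made up for illustration and do not model how the hardware actually fills its buffer.

```python
def instructions_that_fit(sizes_in_dwords, buffer_dwords):
    """Count how many leading instructions fit entirely in the buffer.

    sizes_in_dwords: per-instruction size in DWORDs (1 or 2 in this sketch).
    buffer_dwords:   buffer capacity in DWORDs.
    """
    used = 0
    count = 0
    for size in sizes_in_dwords:
        if used + size > buffer_dwords:
            break  # the next instruction would not fit whole
        used += size
        count += 1
    return count


# Hypothetical instruction stream mixing 1-DWORD and 2-DWORD encodings.
stream = [1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 2, 1]

old_capacity = instructions_that_fit(stream, 12)  # GCN 1.x buffer size
new_capacity = instructions_that_fit(stream, 16)  # GCN 4 buffer size
print(old_capacity, new_capacity)
```

For this particular stream the larger buffer holds a few more instructions at once; the real-world gain depends on the instruction mix and on how well prefetching keeps the buffer full.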

Outside of the shader cores themselves, AMD has also made enhancements to the graphics front-end for Polaris. AMD’s latest architecture integrates what AMD calls a Primitive Discard Accelerator. True to its name, the job of the discard accelerator is to remove (cull) triangles that are too small to be used, and to do so early enough in the rendering pipeline that the rest of the GPU is spared from having to deal with these unnecessary triangles. Degenerate triangles are culled before they even hit the vertex shader, while small triangles are culled a bit later, after the vertex shader but before they hit the rasterizer. There’s no visual quality impact to this (only triangles that can’t be seen/rendered are culled), and as claimed by AMD, the benefits of the discard accelerator increase with MSAA levels, as MSAA otherwise exacerbates the small triangle problem.
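The two kinds of culling described above can be sketched in a few lines. This is a generic software illustration, not AMD's hardware logic: it assumes one sample per pixel located at the pixel center (x + 0.5, y + 0.5), treats a zero-area triangle as degenerate, and uses a conservative bounding-box test to decide whether a triangle can touch any sample at all. All function names are hypothetical.

```python
import math


def signed_area(a, b, c):
    """Signed area of triangle abc in screen space (2D points as tuples)."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def spans_sample(lo, hi):
    """True if some pixel-center coordinate k + 0.5 lies within [lo, hi]."""
    return math.floor(hi - 0.5) >= math.ceil(lo - 0.5)


def classify(a, b, c):
    """Return 'degenerate', 'too small', or 'keep' for a screen-space triangle."""
    if signed_area(a, b, c) == 0.0:
        return "degenerate"  # zero area: can never produce pixels
    xs = (a[0], b[0], c[0])
    ys = (a[1], b[1], c[1])
    # If the bounding box slips between sample points on either axis,
    # the triangle cannot cover any sample and is safe to discard.
    if not (spans_sample(min(xs), max(xs)) and spans_sample(min(ys), max(ys))):
        return "too small"
    return "keep"  # conservative: may still miss every sample


print(classify((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)))  # collinear points
print(classify((0.1, 0.1), (0.4, 0.1), (0.1, 0.4)))  # fits between pixel centers
print(classify((0.0, 0.0), (2.0, 0.0), (0.0, 2.0)))  # covers pixel centers
```

Because only triangles that cannot cover a sample are discarded, the output image is unchanged, which matches the "no visual quality impact" point above.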

Along these lines, Polaris also implements a new index cache, again meant to improve geometry performance. The index cache is designed specifically to accelerate geometry instancing performance, allowing small instanced geometry to stay close by in the cache, avoiding the power and bandwidth costs of shuffling this data around to other caches and VRAM.

Finally, at the back-end of the GPU, the ROP/L2/Memory controller partitions have also received their own updates. Chief among these is that Polaris implements the next generation of AMD’s delta color compression technology, which uses pattern matching to reduce the size and resulting memory bandwidth needs of frame buffers and render targets. As a result, color compression delivers a de facto increase in available memory bandwidth and a decrease in power consumption, at least so long as the buffer is compressible. With Polaris, AMD supports a larger pattern library to better compress more buffers more often, improving on GCN 1.2 color compression by around 17%.
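AMD has not published the details of its pattern library, so the following is only a toy illustration of the general idea behind delta compression: store one base value per block plus the differences between neighbors, and fall back to the raw data when the differences are too large to save space. The block layout, the 8-bit channel width, and all function names are assumptions, and the sketch ignores the header bits a real format would need.

```python
def delta_encode(block):
    """Split a block of values into a base value and neighbor-to-neighbor deltas."""
    base = block[0]
    deltas = [b - a for a, b in zip(block, block[1:])]
    return base, deltas


def delta_decode(base, deltas):
    """Rebuild the original block from the base value and deltas (lossless)."""
    out = [base]
    for d in deltas:
        out.append(out[-1] + d)
    return out


def bits_needed(deltas):
    """Smallest two's-complement width that can hold every delta."""
    if not deltas:
        return 0
    lo, hi = min(deltas), max(deltas)
    width = 1
    while not (-(1 << (width - 1)) <= lo and hi <= (1 << (width - 1)) - 1):
        width += 1
    return width


def stored_size_bits(block, value_bits=8):
    """Size of the block as stored: delta-packed if that is smaller, else raw."""
    _, deltas = delta_encode(block)
    packed = value_bits + bits_needed(deltas) * len(deltas)
    raw = value_bits * len(block)
    return min(packed, raw)


smooth = [100, 101, 102, 101, 100, 100, 101, 102]  # gentle gradient
noisy = [0, 255, 0, 255]                           # worst case for deltas

print(stored_size_bits(smooth), "bits vs", 8 * len(smooth), "raw")
print(stored_size_bits(noisy), "bits vs", 8 * len(noisy), "raw")
```

The smooth block shrinks considerably while the noisy block stays at its raw size, which is why the bandwidth savings only apply so long as the buffer is compressible.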

Otherwise we’ve already covered the increased L2 cache size, which is now at 2MB. Paired with this is AMD’s latest generation memory controller, which can now officially go to 8Gbps, and even a bit more than that when overclocking.
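For reference, the per-pin data rate translates into peak bandwidth with simple arithmetic. The sketch below assumes a 256-bit memory bus, which is the RX 480's configuration but is not stated in this section; the function name is made up for illustration.

```python
def peak_bandwidth_gb_per_s(data_rate_gbps, bus_width_bits):
    """Theoretical peak memory bandwidth in GB/s.

    data_rate_gbps: per-pin data rate in gigabits per second.
    bus_width_bits: total memory bus width in bits (assumed, see above).
    """
    return data_rate_gbps * bus_width_bits / 8  # 8 bits per byte


print(peak_bandwidth_gb_per_s(8.0, 256))  # 8Gbps on an assumed 256-bit bus
print(peak_bandwidth_gb_per_s(7.0, 256))  # same bus at 7Gbps, for comparison
```

These are theoretical peaks; effective bandwidth also depends on how much of the traffic color compression manages to eliminate.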

449 Comments

  • Yojimbo - Thursday, June 30, 2016 - link

    What monopoly concerns? It's not illegal to have a monopoly, it's illegal to abuse a monopoly. Other than that, a monopoly position can affect regulatory rulings concerning mergers and acquisitions, but I doubt NVIDIA has any ambition to make a purchase of a GPU maker, so I doubt they would have any regulatory concerns. The important reason NVIDIA won't try to drive AMD out of the market is because they are interested primarily in increasing their profits and not with driving AMD out of the market. If NVIDIA can get 95% market share and maintain their profit margins they would be very happy, unconcerned with having "too much market share", providing they could achieve it without engaging in uncompetitive practices.
  • cocochanel - Thursday, June 30, 2016 - link

    I never knew Nvidia to be much concerned about monopolies. Over the years, my impression was that they only care about profit margins. And they are good at it.
  • Yojimbo - Thursday, June 30, 2016 - link

    Yes the RX 480 may have a cost advantage over the GTX 970, but the point is that AMD doesn't have the market completely to themselves since the RX 480's advantage over the GTX 970 is dubious. NVIDIA may have smaller profit margins in the space but it's not like they are uncompetitive in the space. The GTX 1060 will arrive soon enough to restore NVIDIA's profit margins. In the mean time inventories of the GTX 970 can be flushed out of the system for a profit.
  • Questor - Wednesday, June 29, 2016 - link

    "When can we get downvote buttons on AT comments?"

    I get your point, really I do! Be careful what you ask for. Heaven forbid you should say anything of merit over at TH. The foundations of civilization shake when you question a review(er) and the fanboys rule supreme with their mouse cursor over those little clickable arrows. You can say something that is completely true, accurate, responsible and even polite, but beware should you offend a minion! They and their brethren will pounce upon your words of wit and wisdom with the fury of the scorned. Your post, feelings, opinions, facts, questions and whatever else you said, will be buried so deep, not even Hades will be able to dig it up!
  • JoeyJoJo123 - Wednesday, June 29, 2016 - link

    You can go back to reddit and enjoy your inner-circle and upboat each other to make yourselves feel good.

    Proper internet forums of speech aren't saddled by prominently displaying the most popular opinion. Everyone's post should be equally as worthless.
  • AntDX316 - Thursday, June 30, 2016 - link

    How do you get a blue post?
  • pashhtk27 - Thursday, June 30, 2016 - link

    "Everyone's post should be equally as worthless."
    Nice. ;)
  • ddriver - Wednesday, June 29, 2016 - link

    The rx480 is targeted at the market niche that has the best sales to profit margins ratio. It is about as fast as the gtx970, but is more efficient and better performing at new and upcoming games (vulkan, actual dx12 (not dumb ports)). In properly optimized games (I mean not games nvidia pays to be left unoptimized for radeons) it is as fast as the r9 nano.

    I'd say job well done. A very efficient and well targeted launch. It would not be possible to do any better given amd's lack of resources, any higher expectations would be unrealistic and the product of genuine cluelessness or fanboyism.
  • smilingcrow - Wednesday, June 29, 2016 - link

    More efficient by a negligible margin but it is good value; a Radeon Lidl 480. :)
  • sonicmerlin - Friday, July 1, 2016 - link

    Really? AMD advertised a 2.8x increase in performance per watt with Polaris. The card massively failed AMD's own expectations.
