Tonga’s Microarchitecture - What We’re Calling GCN 1.2

As we alluded to in our introduction, Tonga brings with it the next revision of AMD’s GCN architecture. This is the second such revision to the architecture, the last revision (GCN 1.1) having been rolled out in March of 2013 with the launch of the Bonaire-based Radeon HD 7790. In the case of Bonaire, AMD chose to keep the details of GCN 1.1 close to its chest, only finally going in-depth for the launch of the high-end Hawaii GPU later in the year. The launch of GCN 1.2, on the other hand, is going to see AMD meeting enthusiasts halfway: we aren’t getting Hawaii-level details on the architectural changes, but we are getting an itemized list of the new features (or at least the features AMD is willing to talk about) along with a short description of what each feature does. So while Tonga may be a lateral product from a performance standpoint, it is going to be very important to AMD’s future.

But before we begin, we do want to quickly remind everyone that the GCN 1.2 name, like GCN 1.1 before it, is unofficial. AMD does not publicly name these microarchitectures outside of development, preferring instead to treat the entire Radeon 200 series as relatively homogeneous and to call out feature differences where it makes sense. In lieu of an official name, and based on the iterative nature of these enhancements, we’re going to use GCN 1.2 to summarize the feature set.


AMD's 2012 APU Feature Roadmap. AKA: A Brief Guide To GCN

To kick things off we’ll pull out this old chestnut one last time: AMD’s HSA feature roadmap from their 2012 financial analysts’ day. Given HSA’s tight dependence on GPUs, this roadmap has offered a useful high-level overview of some of the features each successive generation of AMD GPU architectures will bring with it, and with the launch of the GCN 1.2 architecture we have finally reached what we believe is the last step in AMD’s roadmap: System Integration.

It’s no surprise then that one of the first things we find on AMD’s list of features for the GCN 1.2 instruction set is “improved compute task scheduling”. One of AMD’s major goals for their post-Kaveri APU was to improve the performance of HSA through various forms of overhead reduction, including faster context switching (something GPUs have always been poor at) and even GPU pre-emption. All of this would fit under the umbrella of “improved compute task scheduling” in AMD’s roadmap, though to be clear, AMD meeting us halfway on the architecture side means that they aren’t getting this detailed this soon.

Meanwhile GCN 1.2’s other instruction set improvements are quite interesting. AMD’s description of 16-bit FP and integer operations is actually quite detailed, and includes a very important keyword: low power. Briefly, PC GPUs have been centered around 32-bit mathematical operations for some number of years now, ever since desktop technology and transistor density eliminated the need for 16-bit/24-bit partial precision operations. All things considered, 32-bit operations are preferred from a quality standpoint as they are accurate enough for many compute tasks and virtually all graphics tasks, which is why PC GPUs were limited to (or at least optimized for) partial precision operations for only a relatively short period of time.

However 16-bit operations are still alive and well on the SoC (mobile) side. SoC GPUs are in many ways a 5-10 year old echo of PC GPUs in features and performance, while in other ways they’re outright unique. SoC GPUs are extremely sensitive to power consumption in a way that PC GPUs have never been, so while SoC GPUs can use 32-bit operations, they will in some circumstances favor 16-bit operations for power efficiency purposes. Despite the accuracy limitations of a lower precision, if a developer knows they don’t need the greater accuracy, then falling back to 16-bit means saving power and, depending on the architecture, also improving performance if multiple 16-bit operations can be scheduled alongside each other.
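To put some numbers on that accuracy trade-off, here is a minimal sketch of our own (it is not anything AMD has published, and it says nothing about how GCN 1.2 executes these operations). It uses Python's standard `struct` module to round values to IEEE 754 half precision and single precision; the helper names `to_fp16` and `to_fp32` are ours.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Two FP16 values fit in the same 4 bytes as a single FP32 value.
packed_pair = struct.pack("<2e", 1.0, 2.0)
packed_single = struct.pack("<f", 1.0)

if __name__ == "__main__":
    print("bytes for two FP16 values:", len(packed_pair))
    print("bytes for one FP32 value: ", len(packed_single))
    for value in (0.5, 0.1, 2049.0):
        err16 = abs(to_fp16(value) - value)
        err32 = abs(to_fp32(value) - value)
        print(f"{value}: fp16 error = {err16:.3e}, fp32 error = {err32:.3e}")
```

Values like 0.5 survive the trip to 16 bits untouched, while 0.1 picks up a noticeably larger rounding error than it does at 32 bits, and 2049 cannot be represented at all and lands on 2048. Whether that matters is exactly the judgment call a developer has to make before falling back to 16-bit.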


Imagination's PowerVR Series 6XT: An Example of An SoC GPU With FP16 Hardware

To that end, the fact that AMD is taking the time to focus on 16-bit operations within the GCN instruction set is interesting, but not unexpected. If AMD were to develop SoC-class processors and wanted to use their own GPUs, then natively supporting 16-bit operations would be a logical addition to the instruction set for such a product. The power savings would be helpful for getting GCN into even smaller form factors, and with so many other GPUs supporting special 16-bit execution modes it would help to make GCN competitive with those other products.

Finally, data parallel instructions are the feature we have the least knowledge about. SIMDs can already be described as data parallel – it’s 1 instruction operating on multiple data elements in parallel – but obviously AMD intends to go past that. Our best guess would be that AMD has both a means and a need to have 2 SIMD lanes operate on the same piece of data, though why they would want to do this and what the benefits may be are not clear at this time.
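For readers who want the distinction spelled out, the sketch below is a toy model of our own and should not be read as a description of AMD's hardware. `simd_apply` models the ordinary case of one instruction applied across every lane, while `read_neighbor_lane` is a purely hypothetical stand-in for a lane reading data that belongs to another lane, which is one way our guess above could play out. The 4-lane width and all names are assumptions chosen for illustration.

```python
from typing import Callable, List

LANES = 4  # toy width for illustration; real SIMDs are wider


def simd_apply(op: Callable[[int, int], int],
               a: List[int], b: List[int]) -> List[int]:
    """Apply one operation to every lane: lane i only sees a[i] and b[i]."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    return [op(x, y) for x, y in zip(a, b)]


def read_neighbor_lane(values: List[int], offset: int) -> List[int]:
    """Hypothetical cross-lane read: lane i receives the value held by
    lane (i + offset), wrapping around at the end of the vector."""
    n = len(values)
    return [values[(i + offset) % n] for i in range(n)]


def add(x: int, y: int) -> int:
    return x + y


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

lane_sums = simd_apply(add, a, b)           # classic SIMD: [11, 22, 33, 44]
neighbors = read_neighbor_lane(a, 1)        # each lane sees its neighbor: [2, 3, 4, 1]
pair_sums = simd_apply(add, a, neighbors)   # two lanes' data in one op: [3, 5, 7, 5]

if __name__ == "__main__":
    print(lane_sums, neighbors, pair_sums)
```

In the first call each lane works in isolation; in the last, every result depends on data from two lanes. If AMD is doing something along these lines, the appeal would presumably be avoiding a round trip through memory to share data between lanes, but that is speculation on our part.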

86 Comments

  • CrazyElf - Wednesday, September 10, 2014

    All in all, this doesn't really change the market all that much.

    I still very firmly feel that the R9 290 right now (Q3 2014) remains the best price:performance of the mid to high end cards. That and the 4GB VRAM which may make it more future proof.

    What really is interesting at this point is what AMD has to respond on Nvidia's Maxwell.
  • MrSpadge - Wednesday, September 10, 2014

    I agree - Tonga is not bad, but on the other hand it does not change anything substantially compared to Tahiti. This would have been a nice result 1 - 1.5 years after the introduction of Tahiti. But that was almost 3 years ago! The last time a GPU company showed no real progress after 3 years they went out of business shortly afterwards...

    And seeing how AMD brags to beat GTX760 almost makes me cry. That's the double cut-down version of a 2.5 years old chip which is significantly smaller than Tonga! This is only a comparison because nVidia kept this card at a far too high price because there was no competitive pressure from AMD.

    If this is all they have their next generation will get stomped by Maxwell.
  • iLovefloss - Wednesday, September 10, 2014

    So all you got from this review is that Tonga is a cut down version of Tahiti? After reading this review, this is the impression you were left with?
  • MrSpadge - Thursday, September 11, 2014

    Nope. But in the end the result performs just the same at even almost the same power consumption. Sure, there are some new features.. but so far and I expect for the foreseeable future they don't matter.
  • Demiurge - Wednesday, September 10, 2014

    This is the first mid-range card to have all the value add features of the high-end cards. I wish AMD would leverage TrueAudio better, but the other features and the nice TDP drop.

    The color compression enhancement is a very interesting feature. I think that in itself deserves a little applause because of its significance in the design and compared to the 280's. I think this is more significant, not as a performance feature, but similar to what Maxwell represented for NV in terms of efficiency. Both are respectable design improvements, in different areas. It's a shame they don't cross-license... seems like such a waste.
  • MrSpadge - Thursday, September 11, 2014

    Well, the TDP-drop is real, but mostly saves virtual power. By this I mean that 280 / 7950 never come close to using 250 W, and hence the savings from Tonga are far less than the TDP difference makes it seem. The average between different articles seems to be ~20 W saving at the wall and establishes about a power-efficiency parity with cards like GTX670.

    The color compression could be Tonga's best feature. But I still wonder: if Pitcairn on 270X comes so close to 285 and 280 performance with 256 bit memory bus and without color compression.. how much does it really matter (for 285)? To me it seems that Tahiti most often didn't need that large bus rather than color compression working wonders for Tonga. Besides, GTX770 and GTX680 also hold up fine at that performance level with a 256 bit bus.
  • Demiurge - Thursday, September 11, 2014

    The TDP drop is something I did not think about being a paper launch value. You make a good point about the color compression too. It will be interesting to see how both fare. That may be an interesting topic to follow up during the driver refresh.

    As an owner of GTX 260 with a 448-bit bus, I can tell you that with anti-aliasing, it matters quite a bit as that becomes the limiter. The shader count is definitely not the limiter usually in the low-end and mid-range displays that these cards will typically be paired with. My GTX 260 and 1280x1024 monitor kind of illustrate that with 216 Shaders/896MB. :-)

    It isn't pretty, but I don't see anything that forces me to upgrade yet. Think I've got two more generations or so to wait on before performance is significant enough, or a groundbreaking feature would do it. I'm actually considering upgrading out of boredom and interest in gimmicky features more than anything else at this point.
  • TiGr1982 - Thursday, September 11, 2014

    GTX 260 is like 6 years old now. It's lacking DX11, having less than 1 GB of (relatively slow) GDDR3 VRAM, and overall should be 3-4 times slower than R9 285 or R9 290, I guess.

    I really didn't think anybody still uses these old gen cards (e.g. I have HD 7950 Boost Dual-X which is essentially identical to R9 280).
  • P39Airacobra - Friday, January 09, 2015

    Because they would lose money! LOL. And they are both about the same anyway, except AMD goes for brute force to get performance (like using a V8), and Nvidia uses efficiency with power (like a turbocharged 4cyl or 6cyl).
  • bwat47 - Thursday, September 11, 2014

    "And seeing how AMD brags to beat GTX760 almost makes me cry. That's the double cut-down version of a 2.5 years old chip which is significantly smaller than Tonga! This is only a comparison because nVidia kept this card at a far too high price because there was no competitive pressure from AMD."

    You are being pretty silly here. Both AMD and Nvidia were rebranding a lot of cards these last few gens. You can't go after AMD for rebranding a 2-3 year old chip, and then say it's fine if nvidia does it and blame AMD's 'lack of competitive pressure'. If lack of competitive pressure was the reason for rebranding, then there was lack of competitive pressure on both sides.

    And I highly doubt the 285 is 'all amd has'. This was just a small update to their product line, to bring some missing features (freesync, true audio etc...) and reduced power consumption to the 28x series. I'm sure there is a 3xx series coming down the road (or whatever they will call it). Both AMD and nvidia have been squeezing all they can out of older architectures for the past few years; you can't really put the blame on one or the other without being hypocritical.
