Graphics

The big graphics upgrade for Carrizo is that the maximum number of compute units in a 15W mobile APU moves up from six (384 SPs) to eight (512 SPs), affording a potential 33% improvement. This means that the high-end A10 Carrizo mobile APUs will align with the A10 Kaveri desktop APUs in compute unit count, although the desktop APUs use around 6x the power. Carrizo also moves to AMD’s third generation of Graphics Core Next, GCN 1.2, which is similar to the Tonga-based retail graphics cards (the R9 285).
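The compute unit arithmetic above is easy to check. The following is a minimal sketch that assumes 64 stream processors per compute unit, which is what the 384 and 512 figures imply; the function names are purely illustrative.

```python
SPS_PER_CU = 64  # assumed: 384 SPs / 6 CUs = 64 stream processors per compute unit


def stream_processors(compute_units: int) -> int:
    """Total stream processors for a given number of compute units."""
    return compute_units * SPS_PER_CU


def relative_gain(old: int, new: int) -> float:
    """Fractional increase going from old to new."""
    return (new - old) / old


kaveri_sps = stream_processors(6)   # 15W Kaveri maximum
carrizo_sps = stream_processors(8)  # 15W Carrizo maximum
gain = relative_gain(kaveri_sps, carrizo_sps)

print(f"{kaveri_sps} -> {carrizo_sps} SPs: {gain:.0%} more shader hardware")
# prints "384 -> 512 SPs: 33% more shader hardware"
```

This is an upper bound on shader resources only; real performance also depends on clocks, memory bandwidth and the power budget.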

This gives DirectX 12 support, but one of AMD’s aims with Carrizo is full HSA 1.0 support. When AMD first released proper Carrizo details earlier this year, we were told that Carrizo will support the full HSA 1.0 draft as it currently stands. The specification has not yet been ratified, and AMD will not push back the launch of Carrizo until that happens. So there is a chance that Carrizo will not be certified as a fully HSA 1.0 compliant APU, but very few people are predicting changes to the specification before ratification that are major enough to require hardware adjustments.

The difference between Kaveri’s ‘HSA Ready’ and Carrizo’s ‘HSA Final’ nomenclature comes down to one main feature: context switching. Kaveri can do everything Carrizo can do apart from this. Context switching allows the HSA device to switch to other work asynchronously while it waits on another part of the workload to finish. I would imagine that if Kaveri came across work that required this, it would sit idle until the outstanding work finished before continuing, which means that Carrizo should be faster in this regard.

One of the key parts of HSA is pointer translation, allowing both the CPU and GPU to access the same memory despite their different interpretations of how the memory in the system is configured. One of the features on Carrizo will be the use of address translation caches (ATCs) inside the GPU, which essentially keep a record of which address points to which data; when an address is held in a lower-level cache, that data can be accessed more quickly. These ATC L1/L2 caches will sit inside the compute units themselves as well as in the GPU memory controller, with an overarching ATC L2 beyond the regular L2 per compute unit.
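To make the idea of tiered address translation caches more concrete, here is a minimal software sketch of a two-level lookup that falls back to a page table walk. It is not a description of AMD's hardware: the 4 KiB page size, the LRU replacement policy, the cache capacities and every name in it (`TranslationCache`, `translate`, `page_table`) are assumptions made for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages


class TranslationCache:
    """Tiny LRU cache mapping virtual page numbers to physical page numbers."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, vpn):
        """Return the cached physical page number, or None on a miss."""
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None

    def fill(self, vpn, ppn):
        """Insert a translation, evicting the least recently used if full."""
        self.entries[vpn] = ppn
        self.entries.move_to_end(vpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def translate(vaddr, l1, l2, page_table):
    """Return (physical address, which level served the translation)."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    level = "L1"
    ppn = l1.lookup(vpn)
    if ppn is None:
        level = "L2"
        ppn = l2.lookup(vpn)
        if ppn is None:
            level = "walk"
            ppn = page_table[vpn]  # slow path: full page table walk
            l2.fill(vpn, ppn)
        l1.fill(vpn, ppn)
    return ppn * PAGE_SIZE + offset, level


page_table = {0: 7, 1: 3, 2: 9}      # hypothetical virtual -> physical pages
l1 = TranslationCache(capacity=1)    # small cache close to the compute unit
l2 = TranslationCache(capacity=4)    # larger shared cache behind it

first = translate(0x0010, l1, l2, page_table)   # cold: page table walk
second = translate(0x0020, l1, l2, page_table)  # same page: L1 hit
third = translate(0x1000, l1, l2, page_table)   # new page: walk, evicts page 0 from L1
fourth = translate(0x0030, l1, l2, page_table)  # page 0 again: L1 miss, L2 hit
print(first[1], second[1], third[1], fourth[1])  # prints "walk L1 walk L2"
```

The point of the hierarchy is the same as with data caches: the closer to the compute unit a translation is found, the less time is spent before the actual memory access can start.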

Use of GCN 1.2 means that AMD can use their latest color compression algorithms with little effort. It takes a little more die area to implement (of which Excavator has more to play with than Kaveri), but affords performance improvements, particularly in gaming. The texture data is stored losslessly to maintain visual fidelity, and moves between graphics cores in this compressed state.
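AMD's exact compression scheme is not described here, so the following is only a sketch of the general principle behind lossless color compression, using simple delta encoding as a stand-in: neighbouring pixels tend to be similar, so storing one base value plus small differences takes fewer bits than storing every pixel in full, and the original values can be rebuilt exactly. The tile contents, the 8-bit channel width and all function names are assumptions for illustration, and the few header bits needed to record the delta width are ignored.

```python
def compress_tile(pixels):
    """Encode a tile as (base value, bits per delta, list of deltas)."""
    base = pixels[0]
    deltas = [p - base for p in pixels[1:]]
    span = max((abs(d) for d in deltas), default=0)
    # Signed deltas need one sign bit on top of the magnitude bits.
    bits = span.bit_length() + 1 if span else 0
    return base, bits, deltas


def decompress_tile(base, bits, deltas):
    """Rebuild the original pixel values exactly (lossless)."""
    return [base] + [base + d for d in deltas]


def compressed_bits(bits, deltas, channel_bits=8):
    """Storage cost: one full-width base value plus narrow deltas."""
    return channel_bits + len(deltas) * bits


# One channel of a hypothetical 8-pixel tile from a smooth gradient.
tile = [120, 121, 119, 120, 122, 120, 121, 120]

base, bits, deltas = compress_tile(tile)
restored = decompress_tile(base, bits, deltas)

raw_size = len(tile) * 8                 # 64 bits uncompressed
packed_size = compressed_bits(bits, deltas)  # 8 + 7 * 3 = 29 bits

assert restored == tile  # nothing was lost
print(f"{raw_size} bits raw, {packed_size} bits compressed")
```

Smooth regions compress well under a scheme like this, while noisy tiles may not compress at all, which is why the benefit shows up as reduced memory bandwidth on typical game content rather than as a fixed ratio.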

In yet more effort to suction power out of the system, the GPU will have its own dedicated voltage plane as part of the system, rather than a separate voltage island requiring its own power delivery mechanism as before. AMD’s latest numbers on the improvements here only date back to June 2013 via internal simulations, rather than an actual direct comparison.

With all the performance metrics rolled in, AMD is quoting a 65% performance improvement at 15W compared to Kaveri. The design changes allow higher frequency at the same power, which combines with the additional compute units and other enhancements to produce the overall score. At 35W the gain is less pronounced, and more akin to regular generational improvements. What we see at 35W is what we would normally expect, and it pales in comparison to the 15W numbers.


137 Comments

  • FlushedBubblyJock - Tuesday, June 9, 2015 - link

    amazing how a critically correct comment turns into an angry ranting conspiracy from you
  • BillyONeal - Wednesday, June 3, 2015 - link

    This is a preview piece. They don't have empirical data because the hardware isn't in actual devices yet. Look at any of AT's IDF coverage and you'll see basically the exact same thing.
  • Refuge - Wednesday, June 3, 2015 - link

    nothing has been released yet. but it was announced. This is a news site, you think they are just going to ignore AMD's product announcement? That would be considered "Not doing their job"

    They go through the claims, explain them, try to see if they are plausible with what little information they have. I like these articles, it gives me something to digest while I wait for a in depth review, and when I go to read said review I know exactly what information I'm most interested in.
  • KaarlisK - Wednesday, June 3, 2015 - link

    About adaptive clocking.
    Power is not saved by reducing frequency by 5% for 1% of the time.
    Power is saved by reducing the voltage margin (increasing frequency at the same voltage) _all_ the time.
    Also, when the voltage instability occurs, only frequency is reduced. The requested voltage, IMHO, does not change.
  • ingwe - Wednesday, June 3, 2015 - link

    Interesting. That makes more sense for sure.
  • name99 - Monday, June 8, 2015 - link

    It seems like a variant of this should be widely applicable (especially if AMD have patents on exactly what they do). What I have in mind is that when you detect droop rather than dynamically change the frequency (which is hard and requires at least some cycles) you simply freeze the entire chip's clock at the central distribution point --- for one cycle you just hold everything at zero rather than transitioning to one and back. This will give the capacitors time to recover from the droop (and obviously the principle can be extended to freeze the clock for two cycles or even more if that's how long it takes for the capacitors to recover).

    This seems like it should allow you to run pretty damn close to the minimum necessary voltage --- basically all you now need is enough margin to ensure that you don't overdraw within a worst case single-cycle. But you don't need to provision for 3+ worst-case cycles, and you don't need the alternative of fancy check-point and recovery mechanisms.
  • KaarlisK - Wednesday, June 3, 2015 - link

    About that power plane.
    "In yet more effort to suction power out of the system, the GPU will have its own dedicated voltage plane as part of the system, rather than a separate voltage island requiring its own power delivery mechanism as before"
    As I understand it, "before" = same power plane/island as other parts of the SoC.
  • Gadgety - Wednesday, June 3, 2015 - link

    Great read and analysis given the fact that actual units are not available for testing.

    As a consumer looking for use of Carrizo beyond laptops, provided AMD releases it for consumers, it could be a nice living room HTPC/light gaming unit.
  • Laxaa - Wednesday, June 3, 2015 - link

    I would buy a Dell XPS13-esque machine with this(i.e. high quality materials, good design and a high res screen)
  • Will Robinson - Wednesday, June 3, 2015 - link

    According to ShintelDK and Chizow...the above article results are from an Intel chip and AT have been paid to lie and say its Carrizo because their lives would have no meaning if it is a good product from AMD.
