The GPU

AMD's move from VLIW4 to the newer GCN architecture makes a lot of sense. Rather than being behind the curve, Kaveri now shares the same GPU architecture as the Hawaii based GCN parts, specifically the GCN 1.1 based R9 290X and R7 260X from AMD's discrete GPU lineup. By synchronizing the architecture of their APUs and discrete GPUs, AMD is finally in a position where any performance gains or optimizations made for their discrete GPUs will feed back into their APUs, meaning Kaveri benefits as well. We have already discussed TrueAudio and the UVD/VCE enhancements, and the other major feature coming to the fore is Mantle.

The difference between the Kaveri implementation of GCN and Hawaii, aside from sharing silicon with the CPU, is the addition of coherent shared unified memory, as Rahul discussed on the previous page.

AMD makes some rather interesting claims about GPU performance in the gaming market – as shown in the slide above, ‘approximately 1/3 of all Steam gamers use slower graphics than the A10-7850K’. Given that this SKU has 512 SPs, it makes me wonder just how many gamers are actually using laptops or netbook/notebook graphics. A quick look at the Steam survey shows the top choices for graphics are mainly integrated solutions from Intel, followed by midrange discrete cards from NVIDIA. There are a fair number of integrated graphics solutions, coming from either CPUs with integrated graphics or laptop gaming, e.g. ‘Mobility Radeon HD4200’. With the Kaveri APU, AMD are clearly trying to jump over all of those, and with the unification of architectures, the updates from here on out will benefit both sides of the equation.

A small bit more about the GPU architecture:

Ryan covered the GCN Hawaii segment of the architecture in his R9 290X review, such as the IEEE 754-2008 compliance, texture fetch units, registers and precision improvements, so I will not dwell on them here. The GCN 1.1 implementations on discrete graphics cards will still rule the roost in terms of sheer compute power – the TDP scaling of APUs will never reach the lofty heights of full blown discrete graphics unless there is a significant shift in the way these APUs are developed, meaning that features such as HSA, hUMA and hQ still have a way to go to be the dominant force. The effect of low copying overhead on the APU should be a big break for graphics computing, especially gaming and texture manipulation that requires CPU callbacks.

An added benefit for gamers is that each of the GCN 1.1 compute units is asynchronous and can implement independent scheduling of different work. Essentially the high end A10-7850K SKU, with its eight compute units, acts as eight mini-GPU blocks for work to be carried out on.
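As a back-of-the-envelope sketch of what eight compute units add up to: GCN groups 64 stream processors into each compute unit, which matches the 512 SPs quoted earlier. The 720 MHz GPU clock used below is an assumption that does not come from this page, and the two-FLOPs-per-clock figure assumes one fused multiply-add per SP per cycle, so the result is a theoretical single-precision peak rather than a measured number.

```python
# Rough theoretical peak for a GCN-based integrated GPU.
# Assumptions (not from the article): the clock speed passed in by the
# caller, and one FMA (2 FLOPs) per stream processor per clock.

SPS_PER_CU = 64  # stream processors in one GCN compute unit


def stream_processors(compute_units: int) -> int:
    """Total stream processors for a given number of compute units."""
    return compute_units * SPS_PER_CU


def peak_gflops(compute_units: int, clock_mhz: float) -> float:
    """Theoretical single-precision peak in GFLOPS."""
    flops_per_clock = stream_processors(compute_units) * 2
    return flops_per_clock * clock_mhz / 1000.0


if __name__ == "__main__":
    cus = 8  # A10-7850K, per the article
    print(f"{cus} CUs -> {stream_processors(cus)} SPs")
    print(f"Peak at an assumed 720 MHz: {peak_gflops(cus, 720.0):.1f} GFLOPS")
```

The same two functions can be pointed at a discrete card's compute unit count and clock to see how far apart the theoretical peaks sit.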

Despite AMD's improvements to their GPU compute frontend, they are still ultimately bound by the limited amount of memory bandwidth offered by dual-channel DDR3. Consequently there is still scope to increase performance by increasing memory bandwidth – I would not be surprised if AMD started looking at some sort of intermediary L3 or eDRAM to increase the capabilities here.
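To put that ceiling into rough numbers: each DDR3 channel is 64 bits wide, so theoretical peak bandwidth is channels × 8 bytes × transfer rate. The memory speeds below (DDR3-1600 and DDR3-2133) are illustrative assumptions rather than figures from this page, and sustained throughput in practice will fall short of the theoretical peak.

```python
# Theoretical peak bandwidth of a DDR3 memory configuration.
# The memory speeds in the example are assumed for illustration only.

BYTES_PER_TRANSFER = 8  # one 64-bit DDR3 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(channels: int, megatransfers_per_s: float) -> float:
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_s = channels * BYTES_PER_TRANSFER * megatransfers_per_s * 1e6
    return bytes_per_s / 1e9


if __name__ == "__main__":
    for speed in (1600, 2133):
        bw = peak_bandwidth_gbs(2, speed)
        print(f"Dual-channel DDR3-{speed}: {bw:.1f} GB/s")
```

Whatever the exact memory speed, that bandwidth is shared between the CPU cores and all eight compute units, which is why extra on-package memory is an obvious avenue for improvement.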

Details on Mantle are Few and Far Between

AMD’s big thing with GCN is meant to be Mantle – AMD's low level API for game engine designers intended to improve GPU performance and reduce the at-times heavy CPU overhead in submitting GPU draw calls. We're effectively talking about scenarios bound by single threaded performance, an area where AMD can definitely use the help. Although I fully expect AMD to eventually address its single threaded performance deficit vs. Intel, Mantle adoption could help Kaveri tremendously. The downside, obviously, is that Mantle's adoption at this point is limited at best.

Despite the release of Mantle being held back by the delay in the release of the Mantle patch for Battlefield 4 (Frostbite 3 engine), AMD was happy to claim a 2x boost in an API call limited scenario benchmark and 45% better frame rates with pre-release versions of Battlefield 4. We were told this number may rise by the time it reaches a public release.
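Why an API call limited benchmark can show a much larger gain than a real game is easier to see with a toy model. The sketch below is not a description of how Mantle or any driver works: it simply assumes a frame takes as long as the slower of CPU draw call submission and GPU rendering, and every number in it is hypothetical.

```python
# Toy model of a frame limited by either CPU draw-call submission or GPU work.
# All figures are hypothetical; this is not how any real API is implemented.


def frame_time_us(draw_calls: int, cpu_cost_per_call_us: int, gpu_time_us: int) -> int:
    """Frame time in microseconds: the slower of CPU submission and GPU rendering."""
    cpu_time_us = draw_calls * cpu_cost_per_call_us
    return max(cpu_time_us, gpu_time_us)


def fps(draw_calls: int, cpu_cost_per_call_us: int, gpu_time_us: int) -> float:
    """Frames per second for the given workload."""
    return 1_000_000 / frame_time_us(draw_calls, cpu_cost_per_call_us, gpu_time_us)


if __name__ == "__main__":
    calls, gpu_us = 10_000, 15_000  # hypothetical scene
    for cost in (4, 2, 1):  # hypothetical CPU cost per draw call, in microseconds
        print(f"{cost} us per call -> {fps(calls, cost, gpu_us):.1f} fps")
```

In this model, halving the per-call cost doubles the frame rate only while the CPU is the bottleneck; once the GPU becomes the limit, further reductions in API overhead stop helping, which is consistent with a smaller uplift in a full game than in a synthetic test.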

Unfortunately we still don't have any further details on when Mantle will be deployed for end users, or what effect it will have. Since Battlefield 4 is intended to be the launch vehicle for Mantle - being by far the highest profile game of the initial titles that will support it - AMD is essentially in a holding pattern waiting on EA/DICE to hammer out Battlefield 4's issues and then get the Mantle patch out. AMD's best estimate is currently this month, but that's something that clearly can't be set in stone. Hopefully we'll be taking an in-depth look at real-world Mantle performance on Kaveri and other GCN based products in the near future.

Dual Graphics

AMD has been coy regarding Dual Graphics, especially when frame pacing gets plunged into the mix. I am struggling to recall whether dual graphics, the pairing of the APU with a small discrete GPU for better performance, actually made an appearance at any point during their media presentations. During the UK presentations, I specifically asked about this and got little response except for ‘AMD is working to provide these solutions’. I pointed out that it would be beneficial if AMD gave an explicit list of paired graphics solutions to help users when building systems, which is what I would like to see anyway.

AMD did address the concept of Dual Graphics in their press deck. In their limited testing scenario, they paired the A10-7850K (which has R7 graphics) with the R7 240 2GB GDDR3. In fact their suggestion is that any R7 based APU can be paired with any G/DDR3 based R7 GPU. Another disclaimer is that AMD recommends testing dual graphics solutions with their 13.350 driver build, which is due out in February. For today's review, however, we were sent their 13.300 beta 14 and RC2 builds (which at this time have yet to be assigned an official Catalyst version number).

The following image shows the results as presented in AMD’s slide deck. We have not verified these results in any way, and they are included here only as a reference from AMD.

It's worth noting that while AMD's performance with dual graphics thus far has been inconsistent, we do have some hope that it will improve with Kaveri if AMD is serious about continuing to support it. With Trinity/Richland AMD's iGPU was in an odd place, being based on an architecture (VLIW4) that wasn't used in the cards it was paired with (VLIW5). Never mind the fact that both were a generation behind GCN, where the bulk of AMD's focus was. But with Kaveri and AMD's discrete GPUs now both based on GCN, and with AMD having significantly improved their frame pacing situation in the last year, dual graphics is in a better place as an entry level solution to improving gaming performance. Though like Crossfire on the high-end, there are inevitably going to be limits to what AMD can do in a multi-GPU setup versus a single, more powerful GPU.

AMD Fluid Motion Video

Another aspect that AMD did not expand on much is their Fluid Motion Video technology on the A10-7850K. This is essentially using frame interpolation (from 24 Hz to 50 Hz / 60 Hz) to ensure a smoother experience when watching video. AMD’s explanation of the feature, especially to present the concept to our reader base, is minimal at best: a single page offering the following:

380 Comments

  • eanazag - Wednesday, January 15, 2014 - link

    In reference to the no FX versions, I don't think that will change. I think we are stuck with it indefinitely. From the AMD server roadmap and info in this article related to process, I believe that the Warsaw procs will be a die shrink to 12/16 because the GF 28nm process doesn't help clocks. The current clocks on the 12/16 procs already suck so they might stay the same or better because of the TDP reduction at that core count, but it doesn't benefit in the 8 core or less pile driver series. Since AMD has needed to drive CPU clock way higher to compensate for a lack of IPC and the 28 nm process hurts clocks, I am expecting to not see anything for FX at all. Only thing that could change that is if a process at other than GF would make a good fit for a die shrink. I still doubt they will be doing any more changes to the FX series at the high end.

    So to me, this might force me to consider only Intel for my next build because I am still running discrete GPUs in desktop and I want at least 8 core (AMD equivalent in Intel) performance CPUs in my main system. I will likely go with a #2 Haswell chip. I am not crazy about paying $300 for a CPU, but $200-300 is okay.

    I would not be surprised to see an FX system with 2P like the original FX. The server roadmap is showing that. This would essentially be two Kaveri's and maybe crossfire between the two procs. That sounds slightly interesting if I could ratchet up the TDP for the CPU. It does sound like a Bitcoin beast.
  • britjh22 - Wednesday, January 15, 2014 - link

    I think there are some interesting points to be made about Kaveri, but I think the benchmarks really fall short of pointing to some possibly interesting data. Some of the things I got from this:

    1. The 7850k is too expensive for the performance it currently offers (no proliferation of HSA), and the people comparing it to cheaper CPU/dGPU are correct. However to say Kaveri fails based on that particular price comparison is a failure to see what else is here, and the article does point that out somewhat.

    2. The 45W part does seem to be the best spot at the moment for price to performance, possibly indicating that more iGPU resources don't give much benefit without onboard cache like crystalwell/Iris Pro. However, putting the 4770R in amongst the benches is not super useful due to the price and lack of availability, not to mention it not being socketed.

    3. The gaming benchmarks may be the standard for AT, but they really don't do an effective job to either prove or disprove AMD's claims for gaming performance. Plenty of people will (and have looking at the comments) say they have failed at 1080p gaming scores based on 1080p extreme settings. Even some casual experimentation to see what is actually achievable at 1080p would be helpful and informative.

    4. I think the main target for these systems isn't really being addressed by the review, which may be difficult to do in a score/objective way, but I think it would be useful. I think of systems like this, and more based off the 65W/45W parts as great mainstream parts. For that price ($100-130ish) you would be looking at an i3 with iGP, or a lower feature pentium part with a low end dGPU. I think at this level you get a lot more from your money with AMD. You have a system which one aspect will not become inadequate before the other (CPU vs GPU), how many relatives do we know where they have an older computer with enough CPU grunt, but not enough GPU grunt. I've seen quite a few where the Intel integrated was just good enough at the time of launch, but a few years down the road would need a dGPU or more major system upgrade. A system with the A8-7600 would be well rounded for a long time, and down the road could add a mid grade dGPU for good gaming performance. I believe it was an article on here that recently showed even just an A8 was quite sufficient for high detail 1080p when paired with a mid to high range card.

    5. As was referenced in another review and in the comments, a large chunk of steam users are currently being served by iGPU's which are worse than this. These are the people who play MMO's, free to play games, source games, gMod games, DOTA2/LoL, indie games, and things like Hearthstone. For them, and most users that these should be aimed at, the A10-7850K (at current pricing) is not a winner, and they would probably be better (value) or equally (performance) served by the A8-7600. This is a problem with review sites, including AT, which tend to really look at the high end of the market. This is because the readership (myself included) is interested for personal decision making, and the manufacturers provide these products as, performance wise, they are the most flattering. However, I think some of the most interesting and prolific advances are happening in the middle market. The review does a good job of pointing that out with the performance charts at 45W, however I think some exploration into what was mentioned in point #3 would really help to flesh this out. Anand's evaluation for CPU advances slowing down in his Mac Pro is a great example of this, and really points out how HSA could be a major advancement. I upgraded from a Q6600 to a 3570K, and don't see any reasons coming up to make a change any time soon, CPU's have really become somewhat stagnant at the high end of performance. Hopefully AMD's gains at the 45W level can pan out into some great APU's in laptops for AMD, for all the users for games like the above mentioned.
  • fteoath64 - Sunday, January 19, 2014 - link

    As consumers, our problem with the prices inching upwards in the mid-range is that Intel is not supplying enough models of the i3 range within the price point of AMD APU (mid to highest models). This means the prices are well segmented in the market such that they will not change giving excuse for slight increases as we have seen with Richland parts. It seems like lack of competition in the segment ranges indicate a cartel like behaviour in the x86 market.
    AMD is providing the best deal on a per transistor basis while consumers expect their CPU performance to run on par with Intel. That is not going to happen as Intel's GPU improvement inches closer to AMD. With HSA, the tables have turned for AMD, and Intel, along with Nvidia, will certainly have to respond some time in the future. This will come when the software changes for HSA make a significant improvement in overall performance for AMD APUs. We shall see but I am hopeful.
  • woogitboogity - Wednesday, January 15, 2014 - link

    Ah AMD... to think that in the day of thunderbird they were once the under-appreciated underdog where the performance was. The rebel against the P4 and its unbelievably impractical pipeline architecture.

    Bottom line is Intel still needs them as anti-trust suit insurance... with this SoC finally getting off the ground is anyone else wondering whether Intel was less aggressive with their own SoC stuff as an "AMD doggy/gimp treat"? Still nice to be able to recommend a processor without worrying about the onboard graphics when they are on chip.
  • Hrel - Wednesday, January 15, 2014 - link

    "do any AnandTech readers have an interest in an even higher end APU with substantially more graphics horsepower? Memory bandwidth obviously becomes an issue, but the real question is how valuable an Xbox One/PS4-like APU would be to the community."

    I think as a low end Steam Box that'd be GREAT! I'm not sure the approach Valve is looking to take with steam boxes, but if there's no "build your own" option then it doesn't make sense to sell it to us. Makes a lot more sense for them to do that and just sell the entire "console" directly to consumers. Or, through a reseller, but then I become concerned with additional markup from middlemen.
  • tanishalfelven - Wednesday, January 15, 2014 - link

    You can install SteamOS on whatever computer you want... even one you built yourself or one you already own. I'd personally think a pc based on something like this processor would be significantly less expensive (i can imagine 300 bucks) and maybe even faster. And more importantly with things like humble bundle it'd be much much cheaper in the games department...
  • tanishalfelven - Wednesday, January 15, 2014 - link

    i am wrong on faster than ps4 however, point stands
  • JBVertexx - Wednesday, January 15, 2014 - link

    As always, very good writeup, although I must confess that it took me a few attempts to get thru the HSA deep dive! Still, it was a much needed education, so I appreciate that.

    I have had to digest this, as I was initially really disappointed at the lack of progress on the CPU front, but after reading through all the writeups I could find, I think the real story here is about the A8-7600 and opening up new markets for advanced PC based gaming.

    If you think about it, that is where the incentive is for game developers to develop for Mantle. Providing the capability for someone who already has or would purchase an advanced discrete GPU to play with equal performance on an APU provides zero economic incentive for game developers.

    However, if AMD can successfully open up advanced gaming to the mass, low cost PC market, even if that performance is substandard by "enthusiast" standards, then that does provide huge economic incentive for developers, because the cost of entry to play your game has just gone down significantly, potentially opening up a vast new customer base.

    With Steam really picking up "steam", with the consoles on PC tech, and with the innovative thinking going on at AMD, I have come around to thinking this is all really good stuff for PC gaming. And it's really the only path to adoption that AMD can take. I for one am hoping they're successful.
  • captianpicard - Wednesday, January 15, 2014 - link

    I doubt Kaveri was ever intended for us, the enthusiast community. The people whom Kaveri was intended for are not the type that would read a dozen CPU/GPU reviews and then log on to newegg to price out an optimal FPS/$ rig. Instead, they would be more inclined to buy reasonably priced prebuilt PCs with the hope that they'd be able to do some light gaming in addition to the primary tasks of web browsing, checking email, watching videos on youtube/netflix, running office, etc.

    Nothing really up till now has actually fulfilled that niche, and done it well, IMO. Lots of machines from dell, HP, etc. have vast CPU power but horrendous GPU performance. Kaveri offers a balanced solution at an affordable price, in a small footprint. So you could put it into a laptop or a smart tv or all in one pc and be able to get decent gaming performance. Relatively speaking, of course.
  • izmanq - Wednesday, January 15, 2014 - link

    why put i7 4770 with discrete HD 6750 in the integrated GPU performance charts ? :|
