Kaveri: Aiming for 1080p30 and Compute

The numerical differences between Kaveri and Richland are easy enough to rattle off (we discuss them in depth later in the review), but at a high level AMD is aiming for a middle ground between the desktop model (CPU + discrete graphics) and Apple’s Mac Pro dream (offloading compute onto different discrete graphics cards) by delivering that dream on a single processor. At AMD’s Kaveri tech day the following graph was thrown in front of journalists worldwide:

With Intel now on board, processor graphics is a big deal. You can argue whether or not AMD should continue to use the acronym APU instead of SoC, but the fact remains that it's tough to buy a CPU without an integrated GPU.

In the absence of vertical integration, software optimization always trails hardware availability. If you look at 2011 as the crossover year when APUs/SoCs took over the market, it's not much of a surprise that we haven't seen aggressive moves by software developers to truly leverage GPU compute. Part of the problem has been the programming model, which AMD hopes to address with Kaveri and HSA. Kaveri enables a full heterogeneous unified memory architecture (hUMA), such that the integrated graphics can access the full breadth of memory that the CPU can, putting a 32GB-enabled compute device into the hands of developers.

One of the complexities of compute is also time: without HSA and hUMA, getting the CPU and GPU to communicate with each other carries a non-trivial amount of overhead. HSA and hUMA address this by allowing the CPU and GPU to work on the same data set at the same time, effectively opening up all of the compute resources to the same task without asynchronous memory copies and expensive coherency checks.
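The difference between a copy-based offload and a shared-memory one can be sketched in plain Python. This is only a conceptual model, not HSA or OpenCL code: the function names `offload_with_copies` and `offload_shared` are hypothetical, the "kernel" is a stand-in loop that doubles each byte, and the byte counter simply tallies how much data would have to move between separate memory pools.

```python
def offload_with_copies(host_buffer):
    """Model a discrete-style offload: copy in, compute, copy back."""
    bytes_copied = 0

    # Host -> device copy (a separate memory pool in this model).
    device_buffer = bytearray(host_buffer)
    bytes_copied += len(device_buffer)

    # Stand-in "GPU kernel": double every byte, wrapping at 256.
    for i in range(len(device_buffer)):
        device_buffer[i] = (device_buffer[i] * 2) % 256

    # Device -> host copy of the result.
    result = bytearray(device_buffer)
    bytes_copied += len(result)
    return result, bytes_copied


def offload_shared(shared_buffer):
    """Model a shared-memory offload: both sides see the same bytes."""
    # A memoryview references the caller's memory without copying it.
    with memoryview(shared_buffer) as view:
        for i in range(len(view)):
            view[i] = (view[i] * 2) % 256
    return shared_buffer, 0


if __name__ == "__main__":
    data = bytearray(range(8))
    copied_result, copied_bytes = offload_with_copies(data)
    shared_result, shared_bytes = offload_shared(data)
    print("copy-based offload moved", copied_bytes, "bytes")
    print("shared offload moved", shared_bytes, "bytes")
    print("results match:", copied_result == shared_result)
```

In the copy-based path the cost scales with twice the buffer size before any useful work is counted; in the shared path the data never moves, which is the overhead the paragraph above describes being removed.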

The issue AMD has with its HSA ecosystem is the need for developers to jump on board. The analogy oft cited is that on Day 1, iOS had very few apps, yet today has millions. Perhaps a small false equivalence comes in here: Apple is able to manage its OS and system in its entirety, whereas AMD has to compete in the same space as non-HSA enabled products and lacks that control. Nevertheless, AMD is attempting to integrate programming tools for HSA (and OpenCL 2.0) as seamlessly as possible into all modern platforms via the HSA Intermediate Language (HSAIL). The goal is for programming languages like Java, C++ and C++ AMP, as well as common acceleration API libraries and toolkits, to provide these features at little or no coding cost. This is something our resident compute guru Rahul will be looking at in further detail later on in the review.

On the gaming side, 30 FPS has been a goal for AMD’s integrated graphics solutions for a couple of generations now.

Arguably we could say that any game should be able to do 30 FPS if we turn down the settings far enough, but AMD has put at least one restriction on that: resolution. 1080p is a lofty goal to hold at 30 FPS with some of the more challenging titles of today. In our testing for this review, it was clear that users had a choice: start with a high resolution and turn the settings down, or keep the settings on medium-high and adjust the resolution. Games like BF4 and Crysis 3 are going to tax any graphics card, especially when additional DirectX 11 features come into play (ambient occlusion, depth of field, global illumination, and bilateral filtering are some that AMD mentions).
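To put the resolution side of that trade-off in rough numbers, the short sketch below computes how many pixels must be produced per second at 1080p30. The 720p comparison point is an assumption added for illustration, and raw pixel count is only a crude proxy for GPU load, since the settings mentioned above change the cost of each pixel.

```python
def pixels_per_second(width, height, fps):
    """Raw pixel throughput needed to sustain a resolution at a frame rate."""
    return width * height * fps


TARGET_FPS = 30

# 1080p is 1920x1080; 720p (1280x720) is a hypothetical fallback resolution.
full_hd = pixels_per_second(1920, 1080, TARGET_FPS)
hd = pixels_per_second(1280, 720, TARGET_FPS)

# How much more pixel work 1080p30 demands compared with 720p30.
ratio = full_hd / hd

if __name__ == "__main__":
    print("1080p30 pixels per second:", full_hd)
    print("720p30 pixels per second:", hd)
    print("ratio:", ratio)
```

Dropping from 1080p to 720p cuts the pixel count by more than half, which is why lowering resolution frees up so much headroom for higher quality settings.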


380 Comments


  • DanNeely - Tuesday, January 14, 2014

    This is the first I've heard that Excavator will be the end of the line for the current AMD core. Is there any information about what's coming next publicly available yet?
  • JDG1980 - Tuesday, January 14, 2014

    It's all speculation because AMD hasn't released any roadmaps that far in advance. If I had to guess, I'd say they will probably beef up the "cat" cores (Bobcat -> Jaguar, etc.) and use that as their mainstream line. That would be similar to what Intel did when they were faced with a situation like this - they scaled up the mobile Pentium M to become the Core 2 Duo.
  • jabber - Tuesday, January 14, 2014

    The great shame of these chips is the real market they should be selling in will never take off. These are perfect all round chips for those folks that buy a family PC in the usual PC mega store. That family PC would be your usual Compaq/Acer desktop with a decent enough Intel chip in it but the crappy Intel IGP only.

    But as AMD never advertises to these people (the people who should be buying this stuff) they will never buy them. The demand will never appear. They have heard of Intel, they hear the Intel jingle on the TV several times a week. But AMD? Never heard of them, they can't be any good.

    Has anyone at Anandtech ever got round to interview the lazy idiot in the AMD marketing dept? Does AMD really have a marketing dept?

    AMD, sometimes you do have to push the boat out and make the effort. Really stick it under ordinary people's noses. Don't bother to keep brown-nosing the tech review sites cos most of their readers don't buy your stuff anyway.
  • UtilityMax - Tuesday, January 14, 2014

    AMD can't market the APUs directly to the average consumers. They just buy what the PC mega-store sells to them. AMD should convince the OEMs, and that is _really_ hard. First is the issue of Intel's quasi-monopoly. Intel always browbeat the major OEMs to ignore AMD. Even after losing the lawsuit, I think this effect still exists. And then the next issue is that your typical average consumer does not play on PC. They play on consoles. In fact, hardly anyone buys a PC box these days. Everyone buys laptops, and AMD's strategy there is just as weak.
  • ThreeDee912 - Tuesday, January 14, 2014

    They tried to get OEMs to put Llano chips into "thin and light" laptops, but Intel kind of beat them with their Ultrabook marketing.

    At least AMD kind of "won" the console wars by getting their CPUs into both the PS4 and XBone.
  • xdesire - Tuesday, January 14, 2014

    Sorry but i read it like this: this is another piece of sht hardware which is YET another disappointment for their fans. I owned many of their CPUs GPUs and stuff but enough is enough. They have been laying their a**es off for SO long and couldn't even make an improvement on their crap stuff. So, is this THE Kaveri we were promised for so long? I supported them in their worst days by buying their products, hoping to see them come back in the game BUT no, they are being lazy and don't improve sht..
  • jabber - Tuesday, January 14, 2014

    Dear AMD marketing Dept, the above post signifies what I said in the last part of my last post.

    This is not the market/customer you are looking for!
  • jnad32 - Tuesday, January 14, 2014

    Actually a 30% performance improvement seems pretty amazing to me. Also please try and remember that all these tests are done with very early drivers. We all know AMD takes forever to get their drivers in line. I wouldn't personally worry about numbers for the next couple of months. BTW, what were you expecting from an APU? Core i5? HA! I am a massive AMD fan, but we all know that wasn't even possible. What I really want to know is where is my 8 core Steamroller chip.
  • JDG1980 - Tuesday, January 14, 2014

    I was hoping for IPC in line with at least Nehalem. The low IPC is really killing the "construction equipment" cores, and it's increasingly looking like an unfixable problem. If Steamroller could have brought ~30% IPC gains as was initially rumored, then that would have been a good sign, but at this point it seems they'd be better off taking their "cat" cores and scaling them up to desktop levels, and dropping the module architecture as a failed experiment.
  • silverblue - Tuesday, January 14, 2014

    A "construction equipment" (thanks) module actually gets an impressive amount of work done when taxed. The consensus has been to make software think a module is a single core with HT. I imagine that the cores will be fed better in single threaded workloads in that circumstance.

    I also imagine that a heavily threaded workload will extract the very best from the architecture now the MT penalty is gone.

    One question about the review scores - all the testing was done on Windows 7 64-bit SP1 with the Core Parking updates applied. Would using Windows 8 or 8.1 make any real difference to the results or would it just benefit both AMD and Intel?
