Compute & Synthetics

One of the major promises of AMD's APUs is the ability to harness the incredible on-die graphics power for general-purpose compute. While we're still waiting for the holy grail of heterogeneous computing applications to show up, we can still evaluate just how strong Trinity's GPU is at non-rendering workloads.

Our first compute benchmark comes from Civilization V, which uses DirectCompute 5 to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game's leader scenes. And while games that use GPU compute functionality for texture decompression are still rare, the technique is becoming increasingly common, as it's a practical way to pack textures in the most suitable manner for shipping rather than being limited to DX texture compression.

Compute: Civilization V

Similar to what we've already seen, Trinity offers a 15% increase in performance here compared to Llano. The compute advantage here over Intel's HD 4000 is solid as well.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We're now using a development build from the version 2.0 branch, and we've moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU 2.0d4

Intel significantly shrinks the gap between itself and Trinity in this test, and AMD doesn't really move performance forward that much compared to Llano either.

For our next benchmark we're looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cipher. Note that this test fails on all Intel processor graphics, so the results below only include AMD APUs and discrete GPUs.
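As a rough illustration of how an "average time over a number of iterations" figure is produced, here is a minimal timing sketch in Python. It is not the benchmark's actual code: `average_time` and `make_placeholder_workload` are hypothetical names, and the workload is a simple XOR pass standing in for the OpenCL AES kernel, which is not reproduced here.

```python
import time


def average_time(work, iterations=10):
    """Return the mean wall-clock seconds per call of `work`."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    start = time.perf_counter()
    for _ in range(iterations):
        work()
    return (time.perf_counter() - start) / iterations


def make_placeholder_workload(size=1 << 16, key=0x5A):
    """Build a stand-in workload that XORs a buffer in place. This is NOT AES."""
    buffer = bytearray(size)

    def work():
        for i in range(len(buffer)):
            buffer[i] ^= key

    return work


if __name__ == "__main__":
    seconds = average_time(make_placeholder_workload(), iterations=20)
    print(f"average per iteration: {seconds * 1000:.3f} ms")
```

Averaging over several iterations smooths out one-off costs such as kernel compilation or cache warm-up, which is presumably why the benchmark reports a mean rather than a single run.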

AESEncryptDecrypt

We see a pretty hefty increase in performance over Llano in our AES benchmark. The on-die Radeon HD 7660D even manages to outperform NVIDIA's GeForce GT 640, a $100+ discrete GPU.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we're using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
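To make the O(n^2) approach concrete, here is a minimal CPU-side sketch in Python. It is not the SDK sample's shader. It counts, for each particle, how many other particles lie within a given radius, and it walks the particle list in fixed-size tiles, roughly the way a compute shader would stage chunks of data in shared memory. The names `count_neighbors`, `radius` and `tile_size` are illustrative assumptions.

```python
import random


def count_neighbors(points, radius, tile_size=64):
    """Brute-force O(n^2) neighbor count over 2D points.

    The point list is processed one tile at a time. On a GPU each tile
    would be copied into shared memory once and reused by every thread in
    the group; here the tile is just a list slice.
    """
    r2 = radius * radius
    counts = [0] * len(points)
    for start in range(0, len(points), tile_size):
        tile = points[start:start + tile_size]  # stand-in for a shared-memory load
        for i, (xi, yi) in enumerate(points):
            for j, (xj, yj) in enumerate(tile, start):
                if i == j:
                    continue  # a particle is not its own neighbor
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy <= r2:
                    counts[i] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [(rng.random(), rng.random()) for _ in range(512)]
    counts = count_neighbors(particles, radius=0.05)
    print(sum(counts) / len(counts))  # average neighbors per particle
```

Every particle is still compared against every other particle, so the work grows with the square of the particle count. Tiling does not change that; it only reduces how often the same data has to be fetched from slower memory.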

DirectX11 Compute Shader Fluid Simulation - Nearest Neighbor

In our last compute test, Trinity does a reasonable job improving performance over Llano. If you're in need of a lot of GPU computing horsepower you're going to be best served by a discrete GPU, but it's good to see the processor-based GPUs inch their way up the charts.

Synthetic Performance

Moving on, we'll take a few moments to look at synthetic performance. Synthetic performance is a poor tool to rank GPUs—what really matters is the games—but by breaking down workloads into discrete tasks it can sometimes tell us things that we don't see in games.

Our first synthetic test is 3DMark Vantage's pixel fill test. Typically this test is memory bandwidth bound as the nature of the test has the ROPs pushing as many pixels as possible with as little overhead as possible, which in turn shifts the bottleneck to memory bandwidth so long as there's enough ROP throughput in the first place.

3DMark Vantage Pixel Fill

Since our Llano and Trinity numbers were both run at DDR3-1866, there's no real performance improvement here. Ivy Bridge, or at least the HD 4000, actually does quite well in this test.
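A quick back-of-the-envelope calculation shows why identical memory speeds imply near-identical results in a bandwidth-bound test. The Python sketch below assumes a dual-channel configuration with 64-bit channels and 32-bit pixels; none of these are stated in the test setup here, and the function names are hypothetical. It also ignores the fact that the GPU shares memory bandwidth with the CPU cores, so real fill rates will land below these ceilings.

```python
def peak_bandwidth_bytes(transfer_rate_mts, channels=2, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in bytes per second.

    transfer_rate_mts: memory speed in megatransfers/s (1866 for DDR3-1866).
    channels and bus_bytes are assumptions: dual channel, 64 bits per channel.
    """
    return transfer_rate_mts * 1_000_000 * bus_bytes * channels


def fill_rate_ceiling(bandwidth_bytes, bytes_per_pixel):
    """Upper bound on pixels per second if memory traffic is the only limit."""
    return bandwidth_bytes / bytes_per_pixel


if __name__ == "__main__":
    bw = peak_bandwidth_bytes(1866)
    print(f"peak bandwidth: {bw / 1e9:.1f} GB/s")
    print(f"write-only 32-bit pixels: {fill_rate_ceiling(bw, 4) / 1e9:.2f} Gpixels/s")
    print(f"blended (read + write): {fill_rate_ceiling(bw, 8) / 1e9:.2f} Gpixels/s")
```

Under these assumptions the ceiling is just under 30 GB/s for both APUs, so any extra ROP or shader throughput in Trinity has nowhere to show up in this particular test.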

Moving on, our second synthetic test is 3DMark Vantage's texture fill test, which provides a simple FP16 texture throughput test. FP16 textures are still fairly rare, but it's a good look at worst case scenario texturing performance.

3DMark Vantage Texture Fill

Trinity is able to outperform Llano here by over 30%, although NVIDIA's GeForce GT 640 shows you what a $100+ discrete GPU can offer beyond processor graphics.

Our final synthetic test is the set of settings we use with Microsoft's Detail Tessellation sample program out of the DX11 SDK. Since IVB is the first Intel iGPU with tessellation capabilities, it will be interesting to see how well IVB does here, as IVB is going to be the de facto baseline for DX11+ games in the future. Ideally we want to have enough tessellation performance here so that tessellation can be used on a global level, allowing developers to efficiently simulate their worlds with fewer polygons while still using many polygons on the final render.

DirectX11 Detail Tessellation Sample - Normal

DirectX11 Detail Tessellation Sample - Max

The tessellation results here were a bit surprising given the 8th gen tessellator in Trinity's GPU. AMD tells us it sees much larger gains internally (up to 2x), but using different test parameters. Trinity should be significantly faster than Llano when it comes to tessellation performance, depending on the workload, that is.

139 Comments

  • parkerm35 - Monday, October 1, 2012 - link

    First of all you have never owned a 3870k, as you're just an Intel fan boy wanting some attention. The simple fact is you have chosen to look for an AMD review to feed us this rubbish, this just shows how obsessed you are. If you don't like AMD parts, that's fine, but it's because of people like you that AMD is in this kind of mess to start off with. I bet you're one of these people who went out and bought a P4 as well?

    This review has just shown you this APU competing with discrete graphics cards, and doing a damn good job at it too. How much was your G620? Add the price of a discrete card that is capable of matching the Trinity, maybe looking at a GT630 (which I think will be slightly slower), $70? + $65 for the CPU, $135 for a dual core, slower CPU and in all a more power hungry setup. Do me a favor.

    " A G620 can compete generally with a 3870K on the CPU side. That is just embarrassing. The 5800K isn't much of an improvement."

    How do you know the 5800k isn't much of an improvement? This whole review is about GPUs, no CPU data whatsoever.

    Could you please list these HD1000 parts with quicksync.
  • kpo6969 - Thursday, September 27, 2012 - link

    Anand, if you went along with this, your stock as one of the review sites to trust (if not the best) has gone way down. Just my opinion.
  • rhx123 - Thursday, September 27, 2012 - link

    I agree. They should have done the same as TechReport and called AMD out on this.
    I have been a long time lurker, and I nearly posted about Anandtech's spin on the Enduro Update, but now it really feels like there's something going on between the two.

    It's obvious that in making AMD hold this information back, it's confirmed to everyone in the know that Piledriver is going to be rubbish, and has probably done AMD more damage than just letting people release the benchmarks.

    Just hoping a Chinese reviewer somewhere can get his hands on the parts and release some real CPU benchmarks.
  • jaydee - Thursday, September 27, 2012 - link

    Fortunately, Anand has more class than to be a blatant hypocrite like "Tech Report" in happily previewing Intel's chips under certain parameters, but complaining about it when AMD does it.

    http://techreport.com/review/9538/intel-conroe-per...
  • cobalt42 - Thursday, September 27, 2012 - link

    You're simply pointing out the difference between a PRE-view and a RE-view, not pointing out any supposed hypocrisy.

    A preview is often done on the manufacturer's terms. Compare to what is often done in gaming; you get to see what they show you, and you're careful not to draw conclusions. (To quote TR's conclusions in that article you cite, they start with "Clearly, it's way too early to call this race.") Previews are also often done when you're offsite and in their controlled conditions. Plus, the article you write about it is called a "preview" in the title, not a "review". Look at the title of these articles versus the one you cite.

    What AMD is trying to do here is control the output of REviews.
  • Visual - Thursday, September 27, 2012 - link

    The high-end GPU version seems nice, it's disappointing there are weaker versions though. Especially the mobile version, with not nearly enough performance to distinguish itself from the Intel offering.
  • Jamahl - Thursday, September 27, 2012 - link

    Can you point out that the GT 640 in this review is in an Ivy bridge powered system? It would have been nice to have it running in the 5800K system, just to see how close the graphics portion of Trinity really is to it.
  • Rick83 - Thursday, September 27, 2012 - link

    "Note that this test fails on all Intel processor graphics, so the results below only include AMD APUs and discrete GPUs."

    Well, down to the i5s they all have AES acceleration in the CPU pipeline.
    Would be interesting to see a direct comparison of that to the results in the table.

    Of course, for the i3s and below, this is a bit of a let-down.
  • DanNeely - Thursday, September 27, 2012 - link

    What's with the pair of USB1 ports that AMD still puts on all their chipsets?
  • jasomill - Thursday, September 27, 2012 - link

    A cost-saving measure, perhaps, intended for use with integrated devices? Many devices don't benefit from speeds in excess of 12Mbps: keyboards, pointing devices, digitizer tablets, Bluetooth adapters, infrared ports, fingerprint readers, GPS receivers, accelerometers, ambient light sensors, switches, buttons, blinkenlights, fax modems, floppy drives, . . .