Compute & Synthetics

One of the major promises of AMD's APUs is the ability to harness the incredible on-die graphics power for general purpose compute. While we're still waiting for the holy grail of heterogeneous computing applications to show up, we can still evaluate just how strong Trinity's GPU is at non-rendering workloads.

Our first compute benchmark comes from Civilization V, which uses DirectCompute 5 to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game's leader scenes. And while games that use GPU compute functionality for texture decompression are still rare, the practice is becoming increasingly common, as it's a practical way to pack textures in the most suitable manner for shipping rather than being limited to DX texture compression.

Compute: Civilization V

Similar to what we've already seen, Trinity offers a 15% increase in performance here compared to Llano. The compute advantage here over Intel's HD 4000 is solid as well.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We're now using a development build from the version 2.0 branch, and we've moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU 2.0d4

Intel significantly shrinks the gap between itself and Trinity in this test, and AMD doesn't really move performance forward that much compared to Llano either.

For our next benchmark we're looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher. Note that this test fails on all Intel processor graphics, so the results below only include AMD APUs and discrete GPUs.

AESEncryptDecrypt

We see a pretty hefty increase in performance over Llano in our AES benchmark. The on-die Radeon HD 7660D even manages to outperform NVIDIA's GeForce GT 640, a $100+ discrete GPU.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we're using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
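To make the O(n^2) nearest neighbor idea concrete, here is a minimal CPU-side sketch in Python. It is not the DirectX SDK sample's code: the function names, the 2D particle layout, the cubic weight and the tile size are all assumptions chosen for illustration. The first function checks every particle against every other particle; the second produces the same sums but walks the particle list one tile at a time, which loosely mirrors how a compute shader thread group would copy a tile into shared memory and reuse it.

```python
import math
import random


def make_particles(n, seed=0):
    """Hypothetical helper: n random 2D particles in the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def density_brute_force(points, radius):
    """O(n^2): every particle tests every particle (including itself)."""
    r2 = radius * radius
    out = []
    for xi, yi in points:
        total = 0.0
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            if d2 < r2:
                # Illustrative smoothing weight; an assumption, not the sample's kernel.
                total += (r2 - d2) ** 3
        out.append(total)
    return out


def density_tiled(points, radius, tile_size=64):
    """Same O(n^2) work, but neighbors are visited one tile at a time.

    The `tile` slice stands in for a block of particles cached in shared
    memory and reused by all threads of a group.
    """
    r2 = radius * radius
    out = [0.0] * len(points)
    for start in range(0, len(points), tile_size):
        tile = points[start:start + tile_size]
        for i, (xi, yi) in enumerate(points):
            partial = 0.0
            for xj, yj in tile:
                d2 = (xi - xj) ** 2 + (yi - yj) ** 2
                if d2 < r2:
                    partial += (r2 - d2) ** 3
            out[i] += partial
    return out
```

Both functions do the same number of distance tests, so the tiling changes where the data is read from rather than how much work is done. The two results should agree to within floating-point rounding, since the tiled version adds the terms in a different order.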

DirectX11 Compute Shader Fluid Simulation - Nearest Neighbor

For our last compute test, Trinity does a reasonable job improving performance over Llano. If you're in need of a lot of GPU computing horsepower you're going to be best served by a discrete GPU, but it's good to see the processor based GPUs inch their way up the charts.

Synthetic Performance

Moving on, we'll take a few moments to look at synthetic performance. Synthetic performance is a poor tool to rank GPUs—what really matters is the games—but by breaking down workloads into discrete tasks it can sometimes tell us things that we don't see in games.

Our first synthetic test is 3DMark Vantage's pixel fill test. Typically this test is memory bandwidth bound as the nature of the test has the ROPs pushing as many pixels as possible with as little overhead as possible, which in turn shifts the bottleneck to memory bandwidth so long as there's enough ROP throughput in the first place.

3DMark Vantage Pixel Fill

Since our Llano and Trinity numbers were both run at DDR3-1866, there's no real performance improvement here. Ivy Bridge, or at least the HD 4000, actually does quite well in this test.
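As a rough illustration of why a bandwidth-bound test would not move between two chips running the same memory, the sketch below works out the nominal peak bandwidth of DDR3-1866 and the pixel rate that bandwidth could sustain. The dual-channel, 64-bit-per-channel configuration and the bytes-per-pixel figure are assumptions for the sake of the arithmetic, not measured properties of the test; real throughput will be lower than these ceilings.

```python
def peak_bandwidth_gbps(transfer_rate_mts, bus_width_bits=64, channels=2):
    """Nominal peak memory bandwidth in GB/s.

    transfer_rate_mts: transfers per second in millions (1866 for DDR3-1866).
    bus_width_bits and channels are assumed values (dual channel, 64-bit each).
    """
    bytes_per_transfer = bus_width_bits // 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9


def fill_rate_ceiling_gpixels(bandwidth_gbps, bytes_per_pixel):
    """Upper bound on pixels per second (in billions) if every pixel
    costs bytes_per_pixel of memory traffic. bytes_per_pixel is an assumption."""
    return bandwidth_gbps / bytes_per_pixel


bandwidth = peak_bandwidth_gbps(1866)              # roughly 29.9 GB/s
ceiling = fill_rate_ceiling_gpixels(bandwidth, 4)  # assumes 4 bytes written per pixel
print(f"{bandwidth:.1f} GB/s, at most {ceiling:.2f} Gpixels/s")
```

Under these assumptions both APUs share the same ceiling, so a faster GPU alone would not be expected to raise the score.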

Moving on, our second synthetic test is 3DMark Vantage's texture fill test, which provides a simple FP16 texture throughput test. FP16 textures are still fairly rare, but it's a good look at worst case scenario texturing performance.

3DMark Vantage Texture Fill

Trinity is able to outperform Llano here by over 30%, although NVIDIA's GeForce GT 640 shows you what a $100+ discrete GPU can offer beyond processor graphics.

Our final synthetic test is the set of settings we use with Microsoft's Detail Tessellation sample program out of the DX11 SDK. Since IVB is the first Intel iGPU with tessellation capabilities, it will be interesting to see how well IVB does here, as IVB is going to be the de facto baseline for DX11+ games in the future. Ideally we want to have enough tessellation performance here so that tessellation can be used on a global level, allowing developers to efficiently simulate their worlds with fewer polygons while still using many polygons on the final render.

DirectX11 Detail Tessellation Sample - Normal

DirectX11 Detail Tessellation Sample - Max

The tessellation results here were a bit surprising given the 8th gen tessellator in Trinity's GPU. AMD tells us it sees much larger gains internally (up to 2x), but using different test parameters. Trinity should be significantly faster than Llano when it comes to tessellation performance, depending on the workload, that is.


139 Comments


  • mczak - Thursday, September 27, 2012 - link

    If early listings at merchants are any indication, they should be available. I think though the problem is that the top 65W parts seem to cost as much as the top 100W parts (so a A10-5800k costs the same as A10-5700, same story for the A8), which probably makes them a hard sell at retail (quite similar to intel, and I don't think the low power parts exactly fly off the retail desks there neither).
    But I agree the 65W parts are nice. On the cpu side you lose around 10% but the gpu actually has the same clocks. Of course if you tinker with it manually it should be easily possible to undervolt/underclock the 100W parts to the same level as the 65W parts.
  • medi01 - Thursday, September 27, 2012 - link

    Oh yeah. Comparing it to nVidia 640 makes so much more sense...

    Of course nobody would think about Anand finding yet another way to piss on AMDs cookies...
  • dawp - Friday, September 28, 2012 - link

    I believe that was included for a comparison to a low end discrete card. it could have just as easily been an hd7750 or hd7770.
  • Aone - Friday, September 28, 2012 - link

    Good point. I've reached the same conclusion.
  • shin0bi272 - Tuesday, October 2, 2012 - link

    its a 100 dollar gpu beating the pants off of amd's latest and greatest APU.

    You could buy an intel celeron g530 for 48 dollars (with free shipping) and an asus gt640 (or galaxy 650 with MiR for the same price) and beat the living snot out of AMD's amazing new APU that everyone just has to love because its brand new from AMD and all the fan boys have to fall all over themselves to get it... sounds like apple. Hell if you hate nvidia so much you can get an amd 7750 for 99 bucks on newegg.

    Either way you go the price is between 30 and 50 dollars more than the APU and it will get about twice the FPS... who's going to buy an APU with stats like that? Oh yeah fanboys...
  • CeriseCogburn - Thursday, October 11, 2012 - link

    They never respond when the truth hurts too much.

    Just think, without them, amd is double epic fail and already gone. I bet that statement made their hearts so warm, and feel so heady, they "saved competition".

    But you know what they really saved ? Saved the world from real innovation and forward movement, as all those resources and programmers and engineers were wasted on amd crap. Saved us all from the truth and reality. Saved us from sanity and believing fellow human beings had a clue.

    I'm going to go find a trinity w discrete bench so I can LMAO as soon as the overlord control freak in fear of their own life amd releases the death grip on the nda bench rules.

    You just know all the little pliable as rubber amd fanboys gonna get their new squeeze trinity - they're looking in the mirror now saying "My name is not Mr. Anderson!"

    Watch we'll get an HTPC article now, or maybe it's already posted. Hope I don't have to laugh and shake my head about how cracked and crap and functionless and problematic it turns out to be. Flash will probably rip it a new one. LMAO
  • Devo2007 - Thursday, September 27, 2012 - link

    That article was dated yesterday....just like the Anandtech one.
  • Wolfpup - Friday, September 28, 2012 - link

    Yeah, I have no idea what that's supposed to mean. They're giving performance data for a product that hasn't launched yet and they're three months late?
  • chowmanga - Thursday, September 27, 2012 - link

    Same could have been said about Tom's when Anandtech had the Sandy Bridge preview before anyone else.
  • r3loaded - Thursday, September 27, 2012 - link

    Really? No other site does such in-depth analysis of new chip architectures and such rigorous testing and benchmarking (though Ars comes close). This is the wrong site if you want tech tabloid journalism.
