Compute & Synthetics

One of the major promises of AMD's APUs is the ability to harness the incredible on-die graphics power for general purpose compute. While we're still waiting for the holy grail of heterogeneous computing applications to show up, we can still evaluate just how strong Trinity's GPU is at non-rendering workloads.

Our first compute benchmark comes from Civilization V, which uses DirectCompute 5 to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game's leader scenes. Games that use GPU compute for texture decompression are still rare, but the technique is becoming increasingly common, as it's a practical way to pack textures in the most suitable manner for shipping rather than being limited to DX texture compression.

Compute: Civilization V

Similar to what we've already seen, Trinity offers a 15% increase in performance here compared to Llano. The compute advantage here over Intel's HD 4000 is solid as well.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We're now using a development build from the version 2.0 branch, and we've moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU 2.0d4

Intel significantly shrinks the gap between itself and Trinity in this test, and AMD doesn't really move performance forward that much compared to Llano either.

For our next benchmark we're looking at AESEncryptDecrypt, an OpenCL AES routine that encrypts and decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher. Note that this test fails on all Intel processor graphics, so the results below only include AMD APUs and discrete GPUs.
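As a rough illustration of how an "average time over a number of iterations" figure is produced, here is a minimal timing sketch. It is not the AESEncryptDecrypt code: the function name `average_seconds` and the XOR stand-in workload are assumptions made purely for illustration, and no real AES or OpenCL work is performed.

```python
import time
from typing import Callable


def average_seconds(work: Callable[[], None], iterations: int) -> float:
    """Run `work` `iterations` times and return the mean wall-clock time per run."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    start = time.perf_counter()
    for _ in range(iterations):
        work()
    return (time.perf_counter() - start) / iterations


# Hypothetical stand-in workload: XOR a small buffer. This is NOT AES; it only
# gives the timer something to measure.
buffer = bytes(range(256)) * 64


def fake_encrypt() -> None:
    bytes(b ^ 0x5A for b in buffer)


mean_time = average_seconds(fake_encrypt, iterations=10)
print(f"average time per iteration: {mean_time * 1000:.3f} ms")
```

Averaging over several iterations smooths out one-off costs such as kernel compilation or cache warm-up, which is presumably why the benchmark reports a mean rather than a single run.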

AESEncryptDecrypt

We see a pretty hefty increase in performance over Llano in our AES benchmark. The on-die Radeon HD 7660D even manages to outperform NVIDIA's GeForce GT 640, a $100+ discrete GPU.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we're using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
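To make the O(n^2) nearest neighbor idea concrete, here is a small CPU-side sketch in Python. It is not the DirectX SDK sample: the function name `count_neighbors_tiled`, the 2D points, and the `tile_size` parameter are assumptions for illustration. The `tile` slice only stands in for the block of particle data a compute shader would stage in shared memory before every thread in a group reads from it.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def count_neighbors_tiled(points: List[Point], radius: float,
                          tile_size: int = 4) -> List[int]:
    """Brute-force O(n^2) neighbor count, walking the particle list tile by tile."""
    r2 = radius * radius
    counts = [0] * len(points)
    for start in range(0, len(points), tile_size):
        # On a GPU, this slice would be loaded once into shared memory and
        # then reused by every thread in the group.
        tile = points[start:start + tile_size]
        for i, (xi, yi) in enumerate(points):
            for j, (xj, yj) in enumerate(tile, start):
                if i == j:
                    continue  # a particle is not its own neighbor
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy < r2:
                    counts[i] += 1
    return counts


particles = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.5, 10.0)]
print(count_neighbors_tiled(particles, radius=1.5, tile_size=2))
```

Every particle is still compared against every other particle, so the work remains quadratic; tiling only changes where the data is read from, which is what makes the shared memory optimization worthwhile on a GPU.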

DirectX11 Compute Shader Fluid Simulation - Nearest Neighbor

For our last compute test, Trinity does a reasonable job improving performance over Llano. If you need a lot of GPU computing horsepower you're going to be best served by a discrete GPU, but it's good to see the processor-based GPUs inch their way up the charts.

Synthetic Performance

Moving on, we'll take a few moments to look at synthetic performance. Synthetic performance is a poor tool for ranking GPUs (what really matters is the games), but by breaking workloads down into discrete tasks it can sometimes tell us things that we don't see in games.

Our first synthetic test is 3DMark Vantage's pixel fill test. Typically this test is memory bandwidth bound as the nature of the test has the ROPs pushing as many pixels as possible with as little overhead as possible, which in turn shifts the bottleneck to memory bandwidth so long as there's enough ROP throughput in the first place.

3DMark Vantage Pixel Fill

Since this test is memory bandwidth bound and our Llano and Trinity numbers were both run at DDR3-1866, there's no real performance improvement here. Ivy Bridge, or at least its HD 4000 configuration, actually does quite well in this test.
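A back-of-the-envelope calculation shows why identical memory speeds lead to near-identical pixel fill results. The sketch below assumes a dual-channel configuration with 64-bit (8-byte) channels and 4 bytes written per pixel; none of those figures are stated in this article, so treat them as illustrative assumptions rather than the test system's actual setup.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: float,
                       bus_bytes: int = 8,
                       channels: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (assumed 64-bit channels)."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


def fill_rate_ceiling_gpix(bandwidth_gbs: float,
                           bytes_per_pixel: int = 4) -> float:
    """Upper bound on pixel fill rate in GPixels/s if every byte went to pixel writes."""
    return bandwidth_gbs / bytes_per_pixel


bandwidth = peak_bandwidth_gbs(1866)  # DDR3-1866, assumed dual channel
print(f"peak bandwidth: {bandwidth:.3f} GB/s")
print(f"fill ceiling:   {fill_rate_ceiling_gpix(bandwidth):.3f} GPixels/s")
```

Under these assumptions both APUs share the same theoretical ceiling, and real results will land well below it because the CPU cores draw from the same memory pool and blending adds reads on top of writes.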

Moving on, our second synthetic test is 3DMark Vantage's texture fill test, which provides a simple FP16 texture throughput test. FP16 textures are still fairly rare, but it's a good look at worst case scenario texturing performance.

3DMark Vantage Texture Fill

Trinity is able to outperform Llano here by over 30%, although NVIDIA's GeForce GT 640 shows you what a $100+ discrete GPU can offer beyond processor graphics.

Our final synthetic test is the set of settings we use with Microsoft's Detail Tessellation sample program out of the DX11 SDK. Since IVB is the first Intel iGPU with tessellation capabilities, it will be interesting to see how well IVB does here, as IVB is going to be the de facto baseline for DX11+ games in the future. Ideally we want to have enough tessellation performance here so that tessellation can be used on a global level, allowing developers to efficiently simulate their worlds with fewer polygons while still using many polygons on the final render.

DirectX11 Detail Tessellation Sample - Normal

DirectX11 Detail Tessellation Sample - Max

The tessellation results here were a bit surprising given the 8th gen tessellator in Trinity's GPU. AMD tells us it sees much larger gains internally (up to 2x), but with different test parameters. Trinity should be significantly faster than Llano in tessellation performance, depending on the workload.


139 Comments


  • Kougar - Friday, September 28, 2012 - link

    Given no mention of a "preview" was made in the title, it would have been nice if The Terms of Engagement section was at the very top of the "review" to be completely forthright with your readership.

    I read down to that section and stopped, then went looking through the review for CPU benchmarks which didn't exist. Can thank The Tech Report for posting an editorial on AMD's "preview" clause before I realized what was going on.
  • Omkar Narkar - Friday, September 28, 2012 - link

    would you guys review 5880k crossfired with HD 6670 ???
    because I've heard that when you pair it with high end GPU like HD7870 then integrated graphics cores doesn't work.
  • TheJian - Friday, September 28, 2012 - link

    Why was this benchmark used in the two reviews before the 660TI launch, and here today, but not in the 660TI article Ryan Smith wrote? This is just more stuff showing bias. He could have easily run it with the same patch as the two reviews before the 660TI launch article. Is it because in both of those two articles the 600 series dominated the 7970ghz edition and the 7950 Boost? This is, at the very least, hard to explain.
  • plonk420 - Monday, October 1, 2012 - link

    are those discrete GPUs on the charts being run on the AMD board? or a Sandy/Ivy?
  • seniordady - Monday, October 1, 2012 - link

    Please,can you make some test to the CPU vs... not only to the GPU?
  • ericore - Monday, October 1, 2012 - link

    http://news.softpedia.com/news/GlobalFoundries-28n...

    Power leaking reduced by up to 550%; wow.
    What an unintended coup for AMD haha all because of Global Foundries.
    Take that Intel.

    AMD is also first one working on Java GPU acceleration.
  • shin0bi272 - Tuesday, October 2, 2012 - link

    This is cool if you want to game at 13x7 at low res... but who does that anymore? When you bump up games like BF3 or Crysis2 (which you didnt test but toms did) the FPS falls into the single digits. This cpus is fine if you dont really play video games or have a 17" CRT monitor. The thing that I think is funny about this is that in all the games a 100 dollar nvidia gpu beat the living snot out of this apu. Other than HTPC people who want video output without having to buy a video card or someone who doesnt play FPS games but wants to play farmville or minecraft no one will buy this thing. Yet people are still trying to make this thing out to be a gaming cpu/gpu combo and its just not going to satisfy anyone who buys it to play games on and thats disingenuous.
  • Shadowmaster625 - Tuesday, October 2, 2012 - link

    When you tested your GT440, you didnt do it on this hardware right? If you were to disable the trinity gpu and put a GT640 in its place, do you think it would still do better? Or would its score be pretty close to that of the iGPU??
  • skgiven - Sunday, October 7, 2012 - link

    No idea what the NVidia GT440 is doing there; where are the old AMD alternatives?

    Given the all to limited review I don't see the point in comparing this to NVidia's discrete GT640.
    Firstly, it's not clear if you are comparing the APU's to a DDR3 GT640 version (of which there are two; April 797MHz and June 900MHz) or the GDDR5 version (all 65W TDP).
    Secondly, the GT640 has largely been superseded by the GTX650 (64W TDP).
    So was your comparison the 612GFlops model, the 691, or 729 GFlops version?
    Anyway, the GTX650 is basically the same card but has is rated as 812GFlops (30% faster than the April DDR3 model). Who knows, maybe you intended to add these details along with the GTX650Ti, in a couple of days?

    If you are going to compare these APU to discrete entry level cards, you need to add a few more cards. Clearly the A10-5800k falls short against Intels more recent processors for most things (nothing new there), but totally destroys anything Intel has when it comes to gaming, so there is no point in over-analysing that. It wins that battle hands down, so the real question is, how does it perform compared to other gaming APU's and discrete entry level cards?

    I'm not sure why you stuck to the same 1366 screen resolution? Can this card not operate at other frequencies, or can the opposition not compete at higher resolutions?
    1366 is common for laptops. I don't think these 100W chips are really intended for that market. They are for small desktops, home theatre, entry level (inexpensive) gaming systems.

    These look good for basic gaming systems and in terms of performance per $ and Watt, even for some office systems, but their niche is very limited. If you want a good home theatre/desktop/gaming system, throw in a proper discrete GPU and operate at a more sensible 1680 or 1920 for real HD quality.
