AMD Radeon HD 7750 & Radeon HD 7770 GHz Edition Review: Evading The Price/Performance Curve
by Ryan Smith & Ganesh T S on February 15, 2012 12:01 AM EST
Moving on from our look at gaming performance, we have our customary look at compute performance. With GCN, AMD significantly overhauled its architecture in order to improve compute performance, as its long-run initiatives rely on GPU compute performance becoming far more important than it is today.
With such a move, however, AMD has to solve the chicken-and-egg problem on its own, in this case by improving compute performance before there is a large variety of applications ready to take advantage of it. As we’ll see, AMD has certainly achieved that goal, but it raises the question of what was traded away to get there. We have some evidence that GCN is more efficient than VLIW5 on a per-shader basis even in games, but at the same time we can’t forget that AMD has gone from 800 SPs to 640 SPs in the move from Juniper to Cape Verde, in spite of a full node jump in fabrication technology. In the long run AMD will be better off, but I suspect we’re looking at that tradeoff today with the 7700 series.
Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.
Theoretically the 5770 has a 5% compute performance advantage over the 7770. In practice the 5770 doesn’t stand a chance. Even the 7750, which is much, much slower on paper, is ahead by 12%, while the 7770 is in a class of its own, competing with the likes of the 6870. The 7700 series still trails the GTX 560 to some degree, but once again we’re looking at proof of just how much the GCN architecture has improved AMD’s compute performance.
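For readers who want to see where a paper comparison like this comes from, here is a minimal sketch of the usual peak-throughput arithmetic: shader count times two FLOPs per clock (one fused multiply-add) times core clock. The SP counts for the 5770 and 7770 are the ones quoted earlier; the clock speeds and the 7750's SP count are reference specifications not stated on this page, so treat them as assumptions. With these inputs the 5770's lead works out to a little over 6%, in the same ballpark as the roughly 5% quoted above.

```python
# Peak single-precision throughput, assuming 2 FLOPs per SP per clock (one FMA).
# Clock speeds (and the 7750's SP count) are assumed reference specs,
# not figures taken from this page.

def peak_gflops(stream_processors: int, core_clock_mhz: float) -> float:
    """Return theoretical peak GFLOPS for a given SP count and core clock."""
    return stream_processors * 2 * core_clock_mhz / 1000.0

CARDS = {
    "Radeon HD 5770": (800, 850),    # Juniper, VLIW5
    "Radeon HD 7770": (640, 1000),   # Cape Verde, GCN
    "Radeon HD 7750": (512, 800),    # Cape Verde, GCN
}

for name, (sps, mhz) in CARDS.items():
    print(f"{name}: {peak_gflops(sps, mhz):.1f} GFLOPS")

# Relative paper advantage of the 5770 over the 7770.
advantage = peak_gflops(*CARDS["Radeon HD 5770"]) / peak_gflops(*CARDS["Radeon HD 7770"]) - 1
print(f"5770 vs. 7770 on paper: {advantage:.2%}")
```

The point of the exercise is that peak numbers only describe what the shader array could do if it were always fed; the benchmark results measure how often that actually happens.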
Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.
SmallLuxGPU is another good showing for the GCN-based 7700 series, with the 7770 once again moving well up the charts. This time it lands between the 6850 and 6870, and well ahead of the GTX 560 or any other NVIDIA video card. Throwing in an overclock pushes things even further, leading to the XFX BESDD tying the 6870 in this benchmark.
For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
Under our AESEncryptDecrypt benchmark the 7770 does better yet, this time taking the #2 spot and only losing to its overclocked self. PCIe 3.0 helps here, but as we’ve seen with the 7900 series there’s no replacement for a good compute architecture.
Finally, our last benchmark once again looks at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
It would appear we’ve saved the best for last, as in our fluid simulation benchmark the top three cards are all 7700 series cards. This benchmark strongly favors a well-organized cache, leading to the 7700 series blowing past the 6800 series and never looking back. Even NVIDIA’s Fermi-based video cards can’t keep up.
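To make the O(n^2) shared memory method in the fluid benchmark a little more concrete, here is a rough CPU-side sketch of the access pattern in Python. It is not the SDK sample's shader code: the function names, the tile size, and the simple density weight are all hypothetical stand-ins. The idea it illustrates is that every particle still visits every other particle, but neighbors are pulled in one tile at a time, so a group of threads can load a tile into fast on-chip memory once and then reuse it many times.

```python
TILE = 4  # hypothetical thread-group size; real GPU kernels use much larger groups

def weight(xi, yi, xj, yj, h2):
    """Illustrative smoothing weight (an assumption, not the SDK's exact kernel)."""
    r2 = (xi - xj) ** 2 + (yi - yj) ** 2
    return (h2 - r2) ** 3 if r2 < h2 else 0.0

def density_naive(positions, h):
    """Reference O(n^2) pass: each particle reads every other particle directly."""
    h2 = h * h
    return [sum(weight(xi, yi, xj, yj, h2) for (xj, yj) in positions)
            for (xi, yi) in positions]

def density_tiled(positions, h, tile=TILE):
    """Same O(n^2) work, with neighbors consumed one cached tile at a time."""
    h2 = h * h
    n = len(positions)
    out = [0.0] * n
    for group_start in range(0, n, tile):            # one "thread group"
        group = range(group_start, min(group_start + tile, n))
        for tile_start in range(0, n, tile):
            # Stand-in for the group cooperatively loading a tile into shared memory.
            shared = positions[tile_start:tile_start + tile]
            for i in group:                          # each "thread" in the group
                xi, yi = positions[i]
                for (xj, yj) in shared:              # every read hits the cached tile
                    out[i] += weight(xi, yi, xj, yj, h2)
    return out

if __name__ == "__main__":
    pts = [(0.1 * k, 0.05 * (k % 3)) for k in range(10)]
    print(density_tiled(pts, h=0.25))
```

In the tiled version each tile is fetched once per group rather than once per particle, which is why a benchmark built this way rewards hardware with fast, well-organized on-chip storage.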