Original Link: http://www.anandtech.com/show/5625/amd-radeon-hd-7870-ghz-edition-radeon-hd-7850-review-rounding-out-southern-islands
AMD Radeon HD 7870 GHz Edition & Radeon HD 7850 Review: Rounding Out Southern Islands
by Ryan Smith on March 5, 2012 12:01 AM EST
In 2009-2010, AMD launched the entire 4 chip Evergreen series in 6 months. By previous standards this was a quick pace for a new design, especially since AMD had not previously attempted a 4 chip launch in such a manner. Now in 2012 AMD’s Southern Islands team is hard at work at wrapping up their own launch with new aspirations on quickness. Evergreen may have launched 4 chips in 6 months, but this month AMD will be completing the 3 chip Southern Islands launch in half the time – 3 chips in a mere 3 months.
To that end today AMD is taking the wraps off the final piece of the Southern Islands puzzle: Pitcairn. The middle child of the family, it will be the basis of AMD’s $250+ enthusiast segment Radeon HD 7800 series. We’ve seen AMD capture the high-end with the 7900 series and struggle to control the mainstream market with the 7700 series, but how does the 7800 series fare amidst AMD’s lead in deploying 28nm GPUs? Let’s find out.
|AMD GPU Specification Comparison|
| ||AMD Radeon HD 7870||AMD Radeon HD 7850||AMD Radeon HD 6970||AMD Radeon HD 6950||AMD Radeon HD 5870|
|Memory Clock||4.8GHz GDDR5||4.8GHz GDDR5||5.5GHz GDDR5||5.0GHz GDDR5||4.8GHz GDDR5|
|Memory Bus Width||256-bit||256-bit||256-bit||256-bit||256-bit|
|Manufacturing Process||TSMC 28nm||TSMC 28nm||TSMC 40nm||TSMC 40nm||TSMC 40nm|
So what exactly is Pitcairn? In a nutshell, take Cape Verde (7700) and double it, and you have Pitcairn. Pitcairn has twice the number of CUs, twice the number of ROPs, twice the memory bandwidth, and of particular importance twice as many geometry engines on the frontend. This works out to 1280 SPs among 20 CUs (organized as a doubling of Cape Verde’s interesting 4/3/3 configuration), 80 texture units, 32 ROPs, 512KB L2 cache, and a 256-bit memory bus. Compared to Tahiti, Pitcairn still has 12 fewer CUs and as a result less shader and texturing performance along with the narrower memory bus, but it has the same number of ROPs and the same frontend as its bigger brother, which as we’ll see creates some very interesting situations.
On the functionality side of things, the Cape Verde comparisons continue. As with all Southern Islands family parts, Pitcairn supports things such as DX10+ SSAA, PowerTune, Fast HDMI support, partially resident textures, D3D 11.1 support, and the still-AWOL Video Codec Engine (VCE). FP64 support is once again present, and like Cape Verde it’s a performance-limited implementation for compatibility and software development purposes, with FP64 performance limited to 1/16th FP32 performance.
AMD’s Pitcairn cards will be the Radeon HD 7870 GHz Edition and the Radeon HD 7850. The 7870 is a full Pitcairn, clocked at 1000MHz core and paired with 2GB of GDDR5 running at 4.8GHz. It has a PowerTune limit of 190W, while AMD puts its typical board power draw closer to 175W; idle power consumption is around 10W, with a long idle of 3W like the rest of Southern Islands. As for the 7850, it’s the typical lower tier part, featuring 16 active CUs (1024 SPs), an 860MHz core clock, and the same 2GB of GDDR5 running at 4.8GHz as its counterpart, giving it roughly 68% of the shading/texturing performance and 86% of the ROP & frontend performance of the 7870. The PowerTune limit is 150W with a typical board power of 130W, and the same 10W/3W idle power consumption as the 7870.
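Those relative performance figures fall out of simple arithmetic on the specs quoted above. As a rough sketch, assuming theoretical throughput scales linearly with unit count times core clock (the helper name is ours, and the identical 32 ROP count on both cards is implied by the text rather than stated for the 7850):

```python
# Relative theoretical throughput of the 7850 vs. the 7870.
# Assumption: throughput scales linearly with (unit count x core clock).

def relative_throughput(units_a, clock_a_mhz, units_b, clock_b_mhz):
    """Return card A's theoretical throughput as a fraction of card B's."""
    return (units_a * clock_a_mhz) / (units_b * clock_b_mhz)

# Shading/texturing scales with active CUs: 16 CUs @ 860MHz vs. 20 CUs @ 1000MHz.
shader_ratio = relative_throughput(16, 860, 20, 1000)

# ROPs and frontend are assumed identical on both cards, so only the clock differs.
rop_ratio = relative_throughput(32, 860, 32, 1000)

print(f"shading/texturing: {shader_ratio:.1%}")  # 68.8%
print(f"ROP/frontend:      {rop_ratio:.1%}")     # 86.0%
```

Real games will rarely track either ratio exactly, since they are bound by different parts of the pipeline at different times.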
Altogether the 7800 series isn’t just the successor to the Barts based 6800 series in name but also the successor to the 6800 in design. This includes not only power consumption, with one card being a sub-150W part, but also things such as CrossFire, where it features a single CF connector. Interestingly enough even though Barts was already a fairly small chip for its performance, Pitcairn takes this one step further with a die size of 212mm², which in turn contains 2.8B transistors, only 160M more than Cayman. As we’ll see when we get to our benchmarks, this makes Pitcairn a surprisingly small chip given its 6970+ performance.
Speaking of the 6970, let’s talk about the 7800 series’ competition. As AMD began winding down Cayman (6900 series) almost immediately with the launch of the 7900 series, at this point the 6900 market has effectively dried up. Having taken themselves out of competition with themselves, AMD’s only competition is NVIDIA’s lineup. From a performance and price basis the 7870 and 7850 don’t map particularly well to any specific NVIDIA products, but generally speaking they’re targeted against the GTX 570 and GTX 560 Ti respectively.
With AMD targeting the ~$320 570 and ~$210 560 Ti and given their conservative pricing on the rest of Southern Islands, it should come as no surprise that the 7800 series is priced equally conservatively. The 7870 will have an MSRP of $350, while the 7850 will have an MSRP of $250. With the 7800 series completing the launch of Southern Islands, this gives AMD a consistent price structure for the entire family: $550, $450, $350, $250, $159, and $109.
Finally, as far as availability goes this will be a delayed launch. AMD is formally unveiling the 7800 series today, but it will not go on sale until the 19th, 2 weeks from now. AMD has said that this is due to both CeBIT and the Game Developers Conference; AMD and their partners want to be able to show off the 7800 series to their respective attendees at those events, with both events being far too large to keep the 7800 under wraps. This delayed launch also means that partner cards aren’t quite ready yet, so we only have AMD’s reference cards on hand. We’ll be taking a look at partner cards later this month.
|Spring 2012 GPU Pricing Comparison|
|Radeon HD 7950||$450||GeForce GTX 580|
|Radeon HD 7870||$350|| |
| ||$330||GeForce GTX 570|
|Radeon HD 7850||$250|| |
| ||$200||GeForce GTX 560 Ti|
| ||$179||GeForce GTX 560|
|Radeon HD 7770||$159|| |
Meet The Radeon HD 7870 & Radeon HD 7850
For today’s review AMD sent over a 7870 and a 7850. Both are built on the 7870 reference design, so the cards are functionally identical except for the configuration of their respective GPU and the number of PCIe power sockets present.
For retail cards this will be very similar to the 7700 series launch, with partners doing semi-custom cards right away. In fact among the list of cards AMD sent us only Club3D will be using the complete 7870 reference design, while everyone else will be using the reference PCB along with their customary open air coolers. The 7850 will be even more divergent since AMD actually has a different, shorter reference PCB for these cards. Consequently our 7850 has very little in common with retail 7850s when it comes to their construction.
Starting as always with the cooler, the 7870 reference design is effectively a smaller version of the 7970 reference design. Here AMD is once again using a blower design with a slightly smaller blower, shrouded in the same hard red & black plastic as with the 7900. Underneath the shroud we find AMD’s heatsink, which utilizes a copper baseplate attached to 3 copper heatpipes, which in turn run into an aluminum heatsink that runs roughly half the length of the card. This is fairly typical for a blower design for a sub-200W card, but again almost all of the retail cards will be using a completely different open air design.
The 7870 PCB itself runs 9.5” long, with an additional .25” of shroud overhang bringing the total to 9.75”. Our card is equipped with 8 5GHz 256MB Hynix GDDR5 memory chips, the same 5GHz chips that we saw on the 7700 series. For the 7870 power is provided by a pair of 6pin PCIe power sockets, while the sub-150W 7850 uses a single socket. Both cards feature a single CrossFire connector, allowing them to be paired up in a 2-way CrossFire configuration.
Meanwhile for display connectivity AMD is using the same configuration as we’ve seen on the 7900 series: 1 DL-DVI port, 1 HDMI port, and 2 miniDP ports. Interestingly, unlike the 7900 series and 7700 series there is a set of pads for a second DVI port on the card, and while AMD doesn’t make use of them at least one XFX card will. The 7800 series has the same display configuration options as the 7900 series though, so while it can drive up to 6 monitors it can only drive 2 TMDS type displays at once, and if you want to drive a full 6 monitors you’ll need an MST hub.
Finally, I wanted to touch on marketing for a bit. We typically don’t go into any detail on marketing, but with the 7800 something AMD did caught my eye. One of AMD’s marketing angles will be to pitch the 7800 series as an upgrade for the 5800 series; AMD doesn’t typically pitch cards as upgrades in this manner, and the 5800 comparison is especially odd.
At 2.5 years old the 5800 series is no longer the video card king but it’s also not particularly outdated; other than tessellation performance it has held up well relative to newer cards. More specifically, the 7800 series performance is roughly equal to the 6900 series, and while the 6900 series was a step up from the 5800 series it was not a massive leap. With its $350/$250 MSRP the 7800 series has common pricing with the 5800 series, but at only 20-40% faster than the 5800 it’s not the kind of step up in performance that typically justifies such a large purchase. Of course AMD’s conservative pricing has a lot to do with this, but at the end of the day it’s odd to call the 7800 series the upgrade for the 5800 series when the 7950 is the more natural upgrade from a performance perspective.
Further Image Quality Improvements: SSAA LOD Bias and MLAA 2.0
The Southern Islands launch has been a bit atypical in that AMD has been continuing to introduce new AA features well after the hardware itself has shipped. The first major update to the 7900 series drivers brought with it super sample anti-aliasing (SSAA) support for DX10+, and starting with the Catalyst 12.3 beta later this month AMD is turning their eye towards further improvements for both SSAA and Morphological AA (MLAA).
On the SSAA side of things, since Catalyst 9.11 AMD has implemented an automatic negative Level Of Detail (LOD) bias in their drivers that gets triggered when using SSAA. As SSAA oversamples every aspect of a scene – including textures – it can filter out high frequency details in the process. By using a negative LOD bias, you can in turn cause the renderer to use higher resolution textures closer to the viewer, which is how AMD combats this effect.
With AMD’s initial release of DX10+ SSAA support for the 7900 series they enabled SSAA in DX10+ games, but they did not completely port over every aspect of their DX9 SSAA implementation. In this case while there was a negative LOD bias for DX9 there was no such bias in place for DX10+. Starting with Catalyst 12.3 AMD’s drivers have a similar negative LOD bias for DX10+ SSAA, which will bring it fully on par with their DX9 SSAA implementation.
As far as performance and image quality goes, the impact to both is generally minimal. The negative LOD bias slightly increases the use of higher resolution textures, and thereby increases the amount of texels to be fetched, but in our tests the performance difference was non-existent. For that matter in our tests image quality didn’t significantly change due to the LOD bias. It definitely makes textures a bit sharper, but it’s a very subtle effect.
|4x SSAA||4x SSAA w/LOD Bias|
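To make the LOD bias mechanism concrete, here is a minimal sketch of standard mip level selection with a bias applied. The `-0.5 * log2(samples)` value is a common rule of thumb for supersampling and is our assumption here; AMD has not told us what bias their driver actually uses, and the function names are purely illustrative.

```python
import math

def mip_level(texels_per_pixel, lod_bias=0.0, max_level=10):
    """Standard mip selection: level = log2(footprint) + bias, clamped.
    Level 0 is the full-resolution texture."""
    level = math.log2(max(texels_per_pixel, 1e-9)) + lod_bias
    return min(max(level, 0.0), float(max_level))

def ssaa_lod_bias(samples):
    """Rule-of-thumb bias for N-sample supersampling (assumed heuristic,
    not AMD's documented driver value)."""
    return -0.5 * math.log2(samples)

footprint = 4.0                    # surface covers 4 texels per pixel along one axis
bias = ssaa_lod_bias(4)            # 4x SSAA gives a bias of -1.0

print(mip_level(footprint))        # 2.0: mip 2 is chosen without a bias
print(mip_level(footprint, bias))  # 1.0: one level sharper with the bias
```

A more negative bias pulls in higher resolution mips sooner, which is why texel fetches go up slightly even when the visible change is subtle.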
Moving on, AMD’s other AA change is to Morphological AA, their post-process pseudo-AA method. AMD first introduced MLAA back in 2010 with the 6800 series, and while they were breaking ground in the PC space with a post-process AA filter, game developers quickly took the initiative in 2011 to implement post-process AA directly into their games, which allowed it to be applied before HUD elements were drawn and avoided the blurring of those elements.
Since then AMD has been working on refining their MLAA implementation, which will be replacing MLAA 1.0 and is being launched as MLAA 2.0. In short, MLAA 2.0 is supposed to be faster and have better image quality than MLAA 1.0, reflecting the very rapid pace of development for post-process AA over the last year and a half.
As far as performance goes the performance claims are definitely true. We ran a quick selection of our benchmarks with MLAA 1.0 and MLAA 2.0, and the performance difference between the two is staggering at times. Whereas MLAA 1.0 had a significant (20%+) performance hit in all 3 games we tested, MLAA 2.0 has virtually no performance hit (<5%) in 2 of the 3 games we tested, and in the 3rd game (Portal 2) the performance hit is still reduced by some. This largely reflects the advancements we’ve seen with games that implement their own post-process AA methods, which is that post-process AA is nearly free in most games.
|Radeon HD 7970 MLAA Performance|
|4x MSAA||4x MSAA + MLAA 1.0||4x MSAA + MLAA 2.0|
As for image quality, that’s not quite as straightforward. Since MLAA does not have access to any depth data and operates solely on the rendered image, it’s effectively a smart blur filter. Consequently like any post-process AA method there is a need to balance the blurring of aliased edges with the unintentional blurring of textures and other objects, so quality is largely a product of how much blurring you’re willing to put up with for any given amount of de-aliasing. In other words, it’s largely subjective.
|Batman AC #1||Batman AC #2||Crysis: Warhead||Portal 2|
|MLAA 1.0||Old MLAA||Old MLAA||Old MLAA||Old MLAA|
|MLAA 2.0||New MLAA||New MLAA||New MLAA||New MLAA|
From our tests, the one thing that MLAA 2.0 is clearly better at is identifying HUD elements in order to avoid blurring them – Portal 2 in particular showcases this well. Otherwise it’s a tossup; overall MLAA 2.0 appears to be less overbearing, but looking at Portal 2 again it ends up leaving aliasing that MLAA 1.0 resolved. Again this is purely subjective, but MLAA 2.0 appears to cause less image blurring at a cost of less de-aliasing of obvious aliasing artifacts. Whether that’s an improvement or not is left as an exercise to the reader.
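The blur versus de-aliasing trade-off is easy to demonstrate with a toy filter. The sketch below is not AMD’s MLAA algorithm, which classifies edge shapes before blending; it is a much simpler contrast-thresholded neighbor blend of our own, working only on final pixel values the way any post-process method must.

```python
def post_process_aa(image, threshold):
    """Blend a pixel with its 4 neighbours when local contrast exceeds
    `threshold`. Uses only the rendered image, with no depth data."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            neighbours = [image[ny][nx]
                          for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                          if 0 <= ny < h and 0 <= nx < w]
            contrast = max(abs(image[y][x] - n) for n in neighbours)
            if contrast > threshold:
                out[y][x] = (image[y][x] + sum(neighbours)) / (len(neighbours) + 1)
    return out

# A hard edge (0 -> 1) sitting next to faint texture detail (0.0 vs. 0.2).
image = [
    [0.0, 0.2, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.2, 0.0, 1.0, 1.0],
]

conservative = post_process_aa(image, threshold=0.5)  # smooths only the hard edge
aggressive = post_process_aa(image, threshold=0.1)    # also blurs the faint detail
```

With the higher threshold the faint detail survives untouched while the hard edge is softened; with the lower threshold the detail gets smeared as well. That mirrors the choice between the two MLAA versions: less blurring at the cost of leaving some aliasing behind.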
For the Radeon HD 7800 series launch AMD provided an early version of Catalyst 12.3, version number 8.95.5-120224a. Along with adding support for the 7800 series this adds support for MLAA 2.0 and LOD biasing for DX10+ SSAA. Game performance is largely unchanged, although we did see an increase in SmallLuxGPU performance across Cayman and Southern Islands.
Unfortunately these drivers still do not enable support for the Video Codec Engine (VCE), AMD’s fixed function H.264 encoder. At this point VCE has been absent for over 2 months into what’s likely a 12 month lifecycle for the 7900 series, which is moving the feature into chronically late territory. AMD is telling us they’ll have more news on VCE later this month, but it’s still not clear when we’ll actually be able to use it.
|CPU:||Intel Core i7-3960X @ 4.3GHz|
|Motherboard:||EVGA X79 SLI|
|Chipset Drivers:||Intel 184.108.40.2062|
|Power Supply:||Antec True Power Quattro 1200|
|Hard Disk:||Samsung 470 (256GB)|
|Memory:||G.Skill Ripjaws DDR3-1867 4 x 4GB (8-10-9-26)|
|Case:||Thermaltake Spedo Advance|
|Video Cards:||AMD Radeon HD 7950, AMD Radeon HD 7870, AMD Radeon HD 7850, AMD Radeon HD 7770, AMD Radeon HD 7750, AMD Radeon HD 6970, AMD Radeon HD 6950, AMD Radeon HD 6870, AMD Radeon HD 6850, AMD Radeon HD 5870, AMD Radeon HD 4870, NVIDIA GeForce GTX 580, NVIDIA GeForce GTX 570, NVIDIA GeForce GTX 560 Ti, NVIDIA GeForce GTX 285|
|Video Drivers:||NVIDIA ForceWare 295.73, AMD Catalyst Beta 8.932.2, AMD Catalyst Beta 8.95.5|
|OS:||Windows 7 Ultimate 64-bit|
Kicking things off as always is Crysis: Warhead. It’s no longer the toughest game in our benchmark suite, but it’s still a technically complex game that has proven to be a very consistent benchmark. Thus even four years since the release of the original Crysis, “but can it run Crysis?” is still an important question, and the answer continues to be “no.” While we’re closer than ever, full Enthusiast settings at 60fps is still beyond the grasp of a single-GPU card.
As we’ll see throughout today’s benchmarks, Crysis ends up being a good proxy for the 7800 series’ performance, especially compared to the outgoing 6900 series. Ahead of the Southern Islands launch there was some doubt that AMD could deliver 6900 series performance with the 7800 series, and this doubt increased after the 7700 series underperformed the 6800 series. Results like what we're seeing with Crysis should make it clear that the 7800 series is more than a competitor for the 6900 series, with both the 7870 and 7850 equaling or beating the 6970 and 6950 respectively in almost all tests.
Overall at 1920x1200 the 7870 gets 39.9fps, which isn’t quite enough to smoothly handle enthusiast quality and 4x MSAA. Meanwhile the 7850 is farther down the line at 35.4fps; both cards would need Crysis’s settings turned down to reach 60fps here. Compared to the 7950 the 7870 trails it by 17%, giving AMD’s next card up a fairly wide lead in this game.
Meanwhile compared to NVIDIA’s lineup the 7800 series does quite well here, reflecting the fact that the 7800 series doesn’t have a true equal in NVIDIA’s existing lineup. At 1920 the 7870 leads the GTX 570 by 12% and is within spitting distance of the GTX 580, while the 7850 is virtually tied with the more expensive GTX 570 and leads the GTX 560 Ti by 19%. Elsewhere at 2560 the 7870 has a similar lead, while the 7850 has a 41% lead on the GTX 560 Ti; while 2560 is not the ideal resolution for either card, it’s something to keep in mind when we begin discussing the impacts of the 7800’s 2GB of RAM.
When it comes to minimum framerates in Crysis the relative rankings are nearly identical. The 7800 series extends its lead over the 6900 series by a slight degree, while the lead over NVIDIA’s cards shrinks slightly.
Paired with Crysis as our second behemoth FPS is Metro: 2033. Metro gives up Crysis’ lush tropics and frozen wastelands for an underground experience, but even underground it can be quite brutal on GPUs, which is why it’s also our new benchmark of choice for looking at power/temperature/noise during a game. If its sequel due this year is anywhere near as GPU intensive then a single GPU may not be enough to run the game with every quality feature turned up.
Metro ends up being an even better test for the 7800 series than Crysis was. At 1920 the 7870 ties the GTX 580 while taking a 17% lead over the GTX 570 and a much smaller lead over the 6970. The 7850 on the other hand is slightly behind the 6950, but is itself tied with the GTX 570 and well ahead of the GTX 560 Ti.
Interestingly, in spite of being built from the same architecture the 7950’s lead decreases some here. At 1920 it’s now only 13% ahead of the 7870, though in all likelihood it’s just enough of a difference that the 7870 isn’t going to be fully playable at these settings. As it turns out the gap between the 7870 and 7850 ends up being larger, with the 7870 enjoying a 17% advantage.
For racing games our racer of choice continues to be DiRT, which is now in its 3rd iteration. Codemasters uses the same EGO engine between its DiRT, F1, and GRID series, so the performance of EGO has been relevant for a number of racing games over the years.
DiRT3 is a game that greatly benefitted from AMD’s GCN architecture, rocketing the 7900 series cards to the top of our charts and giving a similar boost to the 7800 series. Compared to the 6970 the 7870 is an incredible 30% faster at 1920, an impressive showing when we are generally only expecting the 7870 to match the 6970’s performance. The 7850 brings us back down to earth however, with a much more modest 10% lead.
Meanwhile compared to NVIDIA’s cards, the 7800 series lead isn’t nearly as large here as NVIDIA has historically done well at DiRT 3. The 7870 enjoys a 9% lead over the GTX 570 here, while the 7850 is virtually tied with the GTX 560 Ti.
Total War: Shogun 2
Total War: Shogun 2 is the latest installment of the long-running Total War series of turn based strategy games, and alongside Civilization V is notable for just how many units it can put on a screen at once. As it also turns out, it’s the single most punishing game in our benchmark suite (on higher end hardware at least).
Shogun 2 ends up being an interesting benchmark for the 7800 series today for a number of different reasons. First and foremost of course is a strong performance lead for the 7800 compared to both the 6900 series and NVIDIA’s lineup. The 7870 leads the GTX 570 by 26%, and even the GTX 580 is over 10% slower. At the same time the 7850 ties the GTX 570, while taking a smaller 14% lead over the GTX 560 Ti.
More importantly however, it’s the first test in our suite where even the 1.25GB of VRAM on the GTX 570 isn’t enough. One of AMD’s planks for marketing the 7800 series will be that they have 2GB of VRAM versus 1.25GB on the GTX 570 or 1GB on the GTX 560 Ti, and this is a showcase of that difference. Shogun 2 knows how much VRAM it needs for any given setting configuration and won’t run on cards that don’t meet the requirements – as a result the GTX 570 and GTX 560 Ti can’t even compete at 2560. This is admittedly a higher resolution than most of the cards were designed for, but it showcases the importance of moving beyond 1GB of VRAM going forward. Between Shogun, BF3, and Skyrim, we’re seeing modern games that need 1.5GB of VRAM or more to fully spread their wings.
Batman: Arkham City
Batman: Arkham City is loosely based on Unreal Engine 3, while the DirectX 11 functionality was apparently developed in-house. With the addition of these features Batman is a far more GPU-demanding game than its predecessor was, particularly with tessellation cranked up to high.
Batman ends up being the one and only test the 7870 loses at when compared to the 6970 at 1920. Most likely exacerbated due to the lack of significant digits from the built-in benchmark, the 7870 hits 59fps while the 6970 hits 60fps. For all practical purposes this is a tie, but it serves as a showcase of the lower bound of the 7870’s performance relative to the 6970: equal to, and no slower.
Otherwise Batman isn’t the strongest game for AMD’s cards. The GTX 570 takes a 2fps (~3%) lead over the 7870 here, while the 7850 does much better versus the GTX 560 Ti. Interestingly the 7870’s lead over the 7850 is extremely small here, with only a tiny 7% gap separating the two cards. This is less than the core clock difference, never mind the shader difference, so it would appear we’re looking at a memory bandwidth bottleneck for the 7800 series here, which is reinforced by the 7950’s lead. If so, there’s an interesting opportunity for AMD’s partners if they equip their cards with 6GHz GDDR5 chips in order to go for a strong factory memory overclock.
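To put rough numbers on that bandwidth argument, peak memory bandwidth is just bus width times effective data rate. The sketch below uses the 256-bit bus and 4.8GHz memory clock from the spec table; the 6GHz configuration is hypothetical, and the helper name is ours.

```python
def gddr5_bandwidth_gb_s(bus_width_bits, data_rate_ghz):
    """Peak bandwidth in GB/s = bus width in bytes x effective data rate in GT/s."""
    return bus_width_bits / 8 * data_rate_ghz

stock = gddr5_bandwidth_gb_s(256, 4.8)        # 7870 and 7850 as shipped
overclocked = gddr5_bandwidth_gb_s(256, 6.0)  # hypothetical 6GHz partner card

print(f"stock: {stock:.1f} GB/s")                # 153.6 GB/s
print(f"6GHz:  {overclocked:.1f} GB/s")          # 192.0 GB/s
print(f"gain:  {overclocked / stock - 1:.0%}")   # 25%
print(f"core clock gap: {1000 / 860 - 1:.0%}")   # 16%
```

Both 7800 cards share the same 153.6GB/s, so a bandwidth-bound game would show a gap between them smaller than the 16% core clock difference, which fits the 7% we measured.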
Portal 2 continues the long and proud tradition of Valve’s in-house Source engine. While Source continues to be a DX9 engine, Valve has continued to upgrade it over the years to improve its quality, and combined with their choice of style you’d have a hard time telling it’s over 7 years old at this point. Consequently Portal 2’s performance does get rather high on high-end cards, but we have ways of fixing that…
Portal 2 ends up being at about the middle of the road as far as performance goes. The 7870 enjoys a smaller lead over both the 6970 and the GTX 570, while the 7850 takes one of its biggest losses versus the 6950 and GTX 560 Ti at around 3-5%. The good news is that at these performance levels, performance is good enough for the SSAA bonus round, which sees the 7870 hitting just shy of 60fps. The 7850 on the other hand only falls further behind the GTX 560 Ti with SSAA, and at 45fps probably isn’t going to be quite fast enough to make the cut.
Its popularity aside, Battlefield 3 may be the most interesting game in our benchmark suite for a single reason: it’s the first AAA DX10+ game. It’s been 5 years since the launch of the first DX10 GPUs, and 3 whole process node shrinks later we’re finally to the point where games are using DX10’s functionality as a baseline rather than an addition. Not surprisingly BF3 is one of the best looking games in our suite, but as with past Battlefield games that beauty comes with a high performance cost.
For anyone keeping score, we reran all of our numbers after the recent Battlefield 3 Radeon HD 7000 series performance patch. The results are virtually identical. While we don’t have official confirmation, we believe that DICE switched to a different FXAA codepath; however doing this doesn’t seem to have impacted the performance of the 7000 series, which is either a testament to AMD’s shader compiler, or proof that the overhead from FXAA is very low in the first place.
In any case while AMD’s BF3 performance has improved since the 7970 launch, it’s still one of their weaker games. The 7870 can hang with the GTX 570 at 1920 but the 7850 once more falls behind the GTX 560 Ti. The 7850 in particular just isn’t doing very well here, and it would be necessary to drop down in settings or in resolution to get fluid gameplay out of BF3. However at the same time we do see some further evidence of the impact of having 2GB of VRAM, as both 7800 cards improve relative to NVIDIA’s cards at 2560.
Meanwhile compared to the 6900 series, the 7800 series takes another small lead. At 19x12 without MSAA the 7870 has 5% on the 6970, while the 7850 is effectively tied with the 6950. It’s interesting to note however that relative to the 7950, the 7870 is doing very well here, trailing AMD’s faster card by less than 4%. The fact of the matter is that with the same basic frontend and the same number of ROPs, the 7870’s 1GHz core clock can significantly eat into the performance lead of the 7950 if the 7950 is primarily performance bound by either of those two rendering stages.
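The clockspeed effect can be sketched with the same units-times-clock arithmetic. The 7870 figures and the 7950’s 800MHz core clock are quoted in this review; the 7950’s 28 CUs is an outside spec that we are assuming here, and linear scaling is a simplification.

```python
# Theoretical throughput = units x core clock (MHz).
# The 7950's 28 CUs is an outside spec, not stated in this review.
cards = {
    "7870": {"cus": 20, "rops": 32, "clock_mhz": 1000},
    "7950": {"cus": 28, "rops": 32, "clock_mhz": 800},
}

def ratio(metric):
    """7870 throughput relative to the 7950 for the given unit type."""
    a, b = cards["7870"], cards["7950"]
    return (a[metric] * a["clock_mhz"]) / (b[metric] * b["clock_mhz"])

print(f"shader/texture: {ratio('cus'):.2f}x")   # 0.89x, the 7950 leads
print(f"ROP/frontend:   {ratio('rops'):.2f}x")  # 1.25x, the 7870 leads
```

Whenever a game leans on the ROPs or the frontend rather than the shaders, the 7870’s deficit can shrink to nearly nothing or even flip in its favor.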
Our next game is Starcraft II, Blizzard’s 2010 RTS megahit. Much like Portal 2 it’s a DX9 game designed to run on a wide range of hardware so performance is quite peppy with most high-end cards, but it can still challenge a GPU when it needs to.
Once again the 7870 and 7850 place quite close to each other, particularly at 1920. Here the gap between the two is only 6%, the smallest lead the 7870 will ever have over the 7850. The good news is that this means the 7850 ties the GTX 570; the bad news is that it means the 7870 doesn’t get much farther ahead. The 7800 series ends up looking quite good compared to the 6900 series though, with leads in the 10-15% range.
The Elder Scrolls V: Skyrim
Prior to the launch of our new benchmark suite, we wanted to include The Elder Scrolls V: Skyrim, which is easily the most popular RPG of 2011. However as any Skyrim player can tell you, Skyrim’s performance is CPU-bound to a ridiculous degree. With the release of the 1.4 patch and the high resolution texture pack this has finally been relieved to the point where GPUs once again matter, particularly when we’re working with high resolutions and less than high-end GPUs. As such, we're now including it in our test suite.
At 1920 we seem to be more CPU limited than GPU limited, but at 2560 we do see some greater differentiation between all of our video cards. Here the 7870 can edge out even the GTX 580, and the 7850 beats out even the 6970. What’s interesting to see is where the cards with less VRAM collapse due to the use of high resolution textures: the 1GB GTX 560 Ti collapses after just 1680, and the 1.25GB GTX 570 collapses beyond 1920. Going forward we expect more games to be like Skyrim, which will make additional VRAM all the more important.
Our final game, Civilization 5, gives us an interesting look at things that other strategy games cannot match, with a much weaker focus on shading in the game world, and a much greater focus on creating the geometry needed to bring such a world to life. In doing so it uses a slew of DirectX 11 technologies, including tessellation for said geometry, driver command lists for reducing CPU overhead, and compute shaders for on-the-fly texture decompression.
CivV has something interesting going on at 1920; can you spot it? For the first and only time, the 7870 ends up leading over the 7950, if only by 2%. Even though AMD’s performance improvements in CivV seem to largely be driven by compute shader performance improvements, there’s apparently still something going on with the frontend or the ROPs that makes the 7870’s higher core clockspeed matter.
In any case this is another game where the 7800 comes out looking quite good. Relative to the 6900 series there is no competition: the 7800 series is 40-50% faster. The lead against NVIDIA’s cards isn’t nearly as large, but it’s still 8% for the 7870 versus the GTX 570, and 9% for the 7850 versus the GTX 560 Ti.
Moving on from our look at gaming performance, we have our customary look at compute performance. With GCN AMD significantly overhauled their architecture in order to improve compute performance, as their long-run initiatives rely on GPU compute performance becoming far more important than it is today.
Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.
The Civ V compute shader benchmark once again shows off just how much the compute shader performance of the 7800 series has improved relative to the 6900 series, with both 7800 cards coming in well, well ahead of any previous generation AMD cards. Compared to NVIDIA’s lineup the 7800 series does fairly well for itself too, although not quite as well as the commanding lead the 7900 series took.
Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.
SmallLuxGPU continues to showcase the 7800 series’ improvements over past AMD architectures, and while it’s not the same kind of massive leap we saw with CivV, it’s still enough to bring the 7850 up to near the performance of the 6970, and pushing the 7870 well beyond that. The only real competition here for AMD is AMD.
For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL AES encryption routine that AES encrypts/decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cypher.
On the one hand, the 7870 gets quite close to the 7950 here in our AESEncryptDecrypt benchmark, in spite of the latter’s higher number of shaders. On the other hand, it’s still not enough to dethrone the GTX 570; the only NVIDIA cards the 7800 series can beat start at the GTX 560 Ti.
Finally, our last benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
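For readers unfamiliar with the approach, the sketch below shows the idea behind an O(n^2) neighbor search and how tiling restructures it. This is our own CPU-side illustration in 2D with 512 particles, not the SDK sample’s shader code; the tile list only stands in for what a GPU thread group would hold in shared memory.

```python
import random

def neighbor_counts_naive(points, radius):
    """O(n^2): test every particle against every other particle."""
    r2 = radius * radius
    return [sum(1 for j, q in enumerate(points)
                if j != i and (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= r2)
            for i, p in enumerate(points)]

def neighbor_counts_tiled(points, radius, tile_size=64):
    """Same O(n^2) work, walked tile by tile. On a GPU each tile would be
    loaded into shared memory once and reused by every thread in the group."""
    r2 = radius * radius
    counts = [0] * len(points)
    for start in range(0, len(points), tile_size):
        tile = points[start:start + tile_size]  # stand-in for the shared-memory cache
        for i, p in enumerate(points):
            for k, q in enumerate(tile):
                if start + k != i and (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= r2:
                    counts[i] += 1
    return counts

random.seed(0)
particles = [(random.random(), random.random()) for _ in range(512)]
assert neighbor_counts_naive(particles, 0.05) == neighbor_counts_tiled(particles, 0.05)
```

The tiling does not reduce the amount of arithmetic; the benefit on a GPU comes from reading each particle out of slow global memory far less often.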
In our final compute test the 7800 series once again makes a run at the top, with both cards rising past the GTX 570, although they can’t quite match the GTX 580. In an interesting turn of events the 7870 ends up being some 6% faster than the 7950, in spite of the fact that in a compute benchmark the 7950 should have a solid lead. This just goes to show that core clockspeeds do matter, and that adding more shaders alone can’t conquer all benchmarks.
Before moving on from compute performance, we wanted to quickly take a look at theoretical performance. This will be particularly helpful for highlighting the importance of core clockspeeds in AMD's GCN architecture.
We’ll start with a quick look at tessellation performance with the DX11 Detail Tessellation sample program. Because the 7900 series and the 7800 series share a common dual geometry engine frontend, geometry performance is almost entirely dictated by the core clock. As a result the 7870 with its 1GHz core clock just edges out the 7950 with its 800MHz core clock when it comes to tessellation performance. The rest of the difference comes down to shaders, where the 7950 has more shader resources to throw at the hull and domain shading parts of the tessellation process.
Of course that tessellation performance lead doesn’t always translate into great performance in tessellation heavy benchmarks. Unigine Heaven, in spite of its heavy use of tessellation, still has the 7950 well ahead.
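As a rough back-of-the-envelope illustration of why core clock dominates here, peak geometry throughput can be modeled as geometry engines times primitives per clock times core clock. The sketch below assumes each geometry engine handles one primitive per clock, which is a simplification for illustration; the function name is hypothetical, and the clocks are the ones quoted above.

```python
def peak_prim_rate_mprims(geometry_engines, core_clock_mhz, prims_per_clock=1):
    """Theoretical peak primitive rate in millions of primitives per second.

    prims_per_clock=1 per engine is an assumption made for this sketch.
    """
    return geometry_engines * prims_per_clock * core_clock_mhz

# Both cards share the same dual geometry engine frontend.
hd7870_rate = peak_prim_rate_mprims(geometry_engines=2, core_clock_mhz=1000)
hd7950_rate = peak_prim_rate_mprims(geometry_engines=2, core_clock_mhz=800)

# With identical frontends the ratio reduces to the ratio of core clocks.
frontend_advantage = hd7870_rate / hd7950_rate
print(f"7870 vs 7950 peak geometry rate: {frontend_advantage:.2f}x")
```

On paper this gives the 7870 a 1.25x frontend advantage. The measured lead is smaller because the hull and domain shading stages run on the shaders, where the 7950 has more resources.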
Finally, a quick look at 3DMark Vantage theoretical performance largely confirms what we’ve already seen. Pixel fill is heavily bandwidth limited, leading to the 7950 taking a large lead and even the 6970 edging out the 7800 series, though you’d never know it from the gaming benchmarks. Otherwise when it comes to texture fillrate, the 6970 and 7870 are in a dead heat.
Power, Temperature, & Noise
As always, we wrap up our look at a new video card with a look at the physical performance attributes: power consumption, temperatures, and noise. Thanks to TSMC’s 28nm process AMD has been able to offer 6900 series performance on a much smaller chip, but what has that done to power consumption and all of its related properties? Let’s find out.
Please note that we’re including our 7870-based 7850 in these charts, even though none of AMD’s partners will be shipping a card in this exact configuration. Power consumption should be nearly identical to shipping cards, but temperatures and noise readings are going to be significantly different since most of those cards will be using open air coolers.
|Radeon HD 7800 Series Voltages|
|Ref 7870 Load||Ref 7850 Load||Ref 7870 Idle|
When getting a voltage reading on our 7800 cards through GPU-Z, it was interesting to note that the load voltage was almost identical between the two cards: 1.219V versus 1.213V. While we believe GPU-Z is giving us the right readings, we’re not sure whether the 7850 voltages are the same ones we’ll be seeing on shipping cards because of the PCB differences.
Idle power consumption looks quite good, as you’d expect from GCN. Idle power consumption is virtually identical to the 7900 series at the wall, and only the 7700 series can beat 112W. This further goes to show just how much progress has been made with idle power consumption – the Cayman based 6900 series had good idle power consumption for its time, and yet the 7800 series beats it by 5W+ at the wall.
Long idle power consumption is virtually identical with the rest of the Southern Islands cards thanks to AMD’s ZeroCore Power technology. The next closest card is the GTX 560 Ti, and that’s at nearly 10W higher.
Moving on to load power testing, we have Metro 2033. Load power consumption here is about where you’d expect it to be, with the 7800 setups drawing more at the wall than the 6800 setups, but less than the 7900 and 6900 series. This is largely a consequence of performance, as the higher rendering performance of the 7800 series versus the 6800 series drives up CPU power consumption in order to generate more frames.
OCCT on the other hand gives us a more purified look at power consumption, and as you’d expect for 28nm it looks good. The 7870 ends up drawing only a few more watts at the wall compared to the 6870, showcasing the fact that the 7800 series is a drop-in replacement for the 6800 series from a power consumption perspective. The 7850 looks even better, capping out at 15W below the 6850, most likely as a result of PowerTune keeping the card firmly at 150W. Though it’s interesting to note that the measurements at the wall don’t perfectly align with the differences in PowerTune limits, with the 7850 drawing 30W more than the 7770 at the wall compared to a 50W PT difference, while the 7950 draws 30W more at the wall over the 7870 even though there’s only supposed to be a 10W PT difference.
AMD’s latest generation blowers do quite well with idle temperatures and we can see it here. At 30C for the 7850 it’s every bit as cool as the GTX 560 Ti, while the entire 7800 series is around 5-8C cooler than the 6900.
Under load, Metro temperatures are also quite good. At 62C the 7850 is the coolest card in this performance class, but keep in mind that it’s basically using an oversized cooler; retail cards will use open air coolers with much different characteristics. Otherwise at 68C the 7870 is still among the coolest cards, coming ahead of even the historically cool GTX 560 Ti, never mind the much hotter 6900 series.
Load temperatures climb under OCCT, but again the 7800 series is among the coolest temperatures we see. Here we see the 7870 peak at 73C, whereas its last generation counterpart would be at 80C and the GTX 570 at a toasty 87C.
Moving on to noise testing, there are no major surprises at idle, with the 7870 hugging 40dB. For whatever reason the 7850’s minimum fan state is roughly 200RPM higher than the 7870’s, but since no one will be using this cooler it’s not a significant result.
Consistent with AMD’s other 7000 series cards, we’re once again seeing the consequences of AMD’s aggressive cooling policies coupled with the use of a blower. At 48.8dB the 7870 is still quieter than the blower-based 6870, but it’s significantly louder than the open air cooled GTX 560 Ti, even though the latter consumes far more power and generates far more heat. This doesn’t make the use of a blower the wrong choice, but combined with aggressive cooling policies it does hurt AMD. The GTX 570, which also uses a blower, is less than 2dB louder than the 7870 in spite of using much more power.
Last, but not least we have our OCCT noise results. Unlike Metro the 7800 series does better on a relative basis here, but this is mostly because NVIDIA doesn’t have a power throttling system quite like PowerTune. At 51.9dB the 7870 is not the quietest card, but it still manages to beat the 6970 and the PowerTune-less 6870.
All things considered there are no great surprises here on a relative basis, as the 7800 series performs like we’d expect for a blower based sub-200W video card. Due to TSMC's 28nm process AMD greatly improves on their performance/power and performance/noise ratios with the 7800 series compared to the 6800 and 6900 series, while for their power class the 7800 series is slightly ahead of the pack on both power consumption and noise.
With that said, keep in mind that since most of AMD’s partners will be using open air coolers these results won’t be applicable to most retail cards. So for the temp/noise characteristics of retail cards you’ll want to look at individual card reviews when those start appearing later this month. This is particularly true for the 7850, where all of the retail cards will be using a different design than our sample.
Overclocking: Power, Temp, & Noise
As with the rest of Southern Islands, AMD is making sure to promote the overclockability of their cards. And why not? So far we’ve seen every 7700 and 7900 card overclock by at least 12% on stock voltage, indicating there’s a surprising amount of headroom in these cards. The fact that performance has been scaling so well with overclocking only makes overclocking even more enticing. Who doesn’t want free performance?
So how do Pitcairn and the 7800 series stack up compared to the 7700 and 7900 series when it comes to overclocking? Quite well actually; they easily live up to the standards set by AMD’s previous Southern Islands cards.
|Radeon HD 7800 Series Overclocking|
|AMD Radeon HD 7870||AMD Radeon HD 7850|
|Shipping Core Clock||1000MHz||860MHz|
|Shipping Memory Clock||4.8GHz||4.8GHz|
|Overclock Core Clock||1150MHz||1050MHz|
|Overclock Memory Clock||5.4GHz||5.4GHz|
Overall we were able to push our 7870 from 1000MHz to 1150MHz, representing a sizable 15% core overclock. This is now the 3rd SI card we’ve pushed to 1125MHz or 1150MHz – the other two being the 7970 and the 7770 – so AMD’s overclocking headroom has been extremely consistent for their upper tier cards.
As for memory overclocking, we hit 5.4GHz on both cards before general performance started to plateau, representing a 12.5% memory overclock. Considering that both cards use the same RAM on the same PCB, and the performance limitation is the memory bus itself, this is consistent with what we would have expected. With that said, we are a bit surprised that we got so far over 5GHz on 2Gb GDDR5 memory chips only rated for 5GHz in the first place; it indicates that Hynix’s GDDR5 production is very mature.
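For reference, the overclock percentages follow directly from the clocks in the table above. The short Python sketch below just redoes that arithmetic; the helper name is ours, and memory clocks are expressed in effective MHz to keep the math exact.

```python
def overclock_pct(stock_mhz, overclock_mhz):
    """Percentage increase of an overclocked frequency over the stock one."""
    return (overclock_mhz - stock_mhz) / stock_mhz * 100.0

# Clocks taken from the overclocking table (memory in effective MHz).
core_7870 = overclock_pct(1000, 1150)
core_7850 = overclock_pct(860, 1050)
memory_both = overclock_pct(4800, 5400)

print(f"7870 core:   {core_7870:.1f}%")
print(f"7850 core:   {core_7850:.1f}%")
print(f"memory both: {memory_both:.1f}%")
```

This works out to 15% on the 7870's core, roughly 22% on the 7850's core, and 12.5% on the memory of both cards.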
With that said, because of the unique and non-retail nature of the 7850 AMD supplied us, the 7850 overclocking results should be considered low-confidence. The retail 7850 cards will be using simpler and no doubt cheaper coolers, PCBs, and VRMs; all of these can reduce the amount of overclocking headroom a card has. It’s by no means impossible that a 7850 could hit 1050MHz/5.4GHz, but it’s far more likely on a 7870 PCB than it is on a 7850 PCB.
Anyhow we’ll take a look at gaming performance in a moment, but in the meantime let’s take a look at what our overclocks do to power, temperature, and noise.
Even without a voltage increase overclocking does cause power consumption to go up, but not by a great deal. Under Metro the total difference is roughly 21W for the 7850 and 25W for the 7870, at least some of which can be traced back to the increased load on the CPU. Whereas on OCCT there’s a difference of nearly 40W on both cards, thanks to the increased PowerTune limits we’re using to avoid any kind of throttling when overclocked. All things considered with our overclocks power consumption for the 7850 approaches that of the 7870 and the 7870 approaches the GTX 560 Ti, which as we’ll see is a fairly small power consumption increase for the performance increase we’re getting.
Of course when power consumption goes up so does temperature. For both cards under Metro and for the 7870 under OCCT this amounts to a 5C increase, while the 7850 rises 8C under OCCT. However as with our regular temperature readings we would not suggest putting too much consideration into the 7850 numbers since it’s using a non-retail design.
AMD’s conservative fan profiles mean that what are already somewhat loud cards get a bit louder, but in spite of what the earlier power draw differences would imply the increase in noise is rather limited. Paying particular attention to Metro 2033 here, the 7870 is just shy of 3dB louder at 51.9dB, while the 7850 increases by 2.7dB to 51.5dB. OCCT does end up being worse at 2.8dB and 3.7dB louder respectively, but keep in mind this is our pathological case with a much higher PowerTune limit.
Overclocking: Gaming & Compute Performance
As always we’ll keep the commentary thin here, but overall the overclocked performance of the 7800s looks very good, which is what you’d expect with a 15%+ core overclock and a 12.5% memory overclock. With the exception of Skyrim performance at 19x12, the overclocking performance increase for both cards is roughly split between the core and memory overclocks, meaning we’re seeing a 13.5% average performance increase for the 7870 and a 17% average performance increase for the 7850. As the 7800 series has the same number of ROPs as the 7900 series, it looks like we’re running headlong into the memory bandwidth bottleneck that makes the 384-bit memory bus on the 7900 series sing.
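One way to read "roughly split" is that the average gain lands near the midpoint of the core and memory overclock percentages. The Python sketch below checks that reading against the figures quoted here; treating the split as a simple midpoint is our assumption rather than a model of how the games actually scale, and the names are illustrative.

```python
def overclock_pct(stock_mhz, overclock_mhz):
    """Percentage increase of an overclocked frequency over the stock one."""
    return (overclock_mhz - stock_mhz) / stock_mhz * 100.0

# Both cards share the same 4.8GHz -> 5.4GHz memory overclock.
MEMORY_GAIN = overclock_pct(4800, 5400)

# (stock core MHz, overclocked core MHz, average gaming gain from the text)
CARDS = {
    "7870": (1000, 1150, 13.5),
    "7850": (860, 1050, 17.0),
}

def midpoint_gain(stock_core_mhz, oc_core_mhz):
    """Midpoint between the core and memory overclock percentages."""
    return (overclock_pct(stock_core_mhz, oc_core_mhz) + MEMORY_GAIN) / 2.0

for name, (stock, oc, observed) in CARDS.items():
    print(f"{name}: midpoint {midpoint_gain(stock, oc):.2f}%, observed {observed:.1f}%")
```

The midpoints come out to about 13.8% for the 7870 and 17.3% for the 7850, both within half a percentage point of the measured averages.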
All things considered among our tests we have everything from even the 7850OC beating the GTX 580, to the GTX 580 still holding onto its lead versus the 7870OC. But even if you can’t beat a GTX 580 with a 7870OC you can get very close on a card that costs a good bit less.
Meanwhile for compute performance the gains are similar: 12.5% on the 7870, and 19.5% on the 7850.
With 3 major launches in under 3 months it seems like I’ve written the same thing time and time again, and that wouldn’t be an incorrect observation. By being the first to deploy 28nm GPUs AMD has been enjoying a multi-month lead on NVIDIA that has allowed them to set their own pace, and there’s little NVIDIA can do but sit back and watch. Consequently we’re seeing AMD roll out a well-orchestrated launch plan unhindered, with AMD launching each new Southern Islands card at exactly the place they’ve intended to from the beginning.
At each launch AMD has undercut NVIDIA at critical points, allowing them to push NVIDIA out of the picture, and the launch of the Radeon HD 7800 series is no different. AMD’s decision to launch the 7870 and 7850 at roughly $25 to $50 over the GTX 570 and GTX 560 Ti respectively means that NVIDIA’s cards still have a niche between AMD’s price points for the time being, but this is effectively a temporary situation as NVIDIA starts drawing down inventory for the eventual Kepler launch.
Starting with the Radeon HD 7870 GHz Edition, AMD is effectively in the clear for the time being. At roughly 9% faster than the GTX 570 there’s little reason to get the GTX 570 even with the 7870’s price premium; it’s that much faster, cooler, and quieter. With the launch of Pitcairn and the 7870 in particular, GF110 has effectively been removed from competition after a run of nearly a year and a half.
As for the Radeon HD 7850, things are not so clearly in AMD’s favor. From a power perspective it's by far the fastest 150W card you can buy, and that alone will earn AMD some major OEM wins along with some fans in the SFF PC space. Otherwise from a price perspective it’s certainly the best $250 card you can buy, but then that’s the catch: it’s a $250 card. With GTX 560 Ti prices starting to drop below $200 after rebate, the 7850 is nearly $50 more expensive than the GTX 560 Ti. At the same time its performance is only ahead of the GTX 560 Ti by about 9% on average, and in the process it loses to the GTX 560 Ti in a couple of games, most importantly Battlefield 3 by about 8%. AMD has a power consumption lead to go along with that performance lead, but without retail cards to test it’s not clear whether that translates into any kind of noise improvements over the GTX 560 Ti. In the long run the 7850 is going to be the better buy – in particular because of its additional RAM in the face of increasingly VRAM-hungry games – but $199 for a GTX 560 Ti is going to be hard to pass up while it lasts.
Of course by being in the driver’s seat overall when it comes to setting video card prices AMD has continued to stick to their conservative pricing, both to their benefit and detriment. The 7800 series isn’t really any cheaper than the 6900 series it replaces; in fact it’s probably a bit more expensive after you factor in the rebates that have been running on the 6900 series since last summer. But these prices stop the bleeding from what has been an aggressive price war between the two companies over the last 3 years, which is going to be of great importance to AMD in the long run.
Nevertheless we’re largely in the same situation now as where we were with the 7700 series: AMD has only moved a small distance along the price/performance curve with the 7800 series, and they’re in no particular hurry to change that. But if nothing else, on the product execution side of things AMD has done a much better job, getting their old cards out of the market well ahead of time in order to keep from having to compete with themselves. As a result your choices right now at $200+ are the 7800 and 7900 series, or last-generation Fermi cards. Otherwise we’re in a holding pattern until AMD brings prices down, which considering Pitcairn is the replacement for the Barts-based 6800, could potentially be quite a reduction in the long run.
Wrapping things up, at this point in time AMD has taken firm control of the $200+ video card market. The only real question is this: for how long? AMD enjoyed a nearly 6 month lead over NVIDIA when rolling out the first generation of 40nm DX11 cards, but will they enjoy a similarly long lead with the first generation of 28nm cards? Only time will tell.