Original Link: http://www.anandtech.com/show/6837/amd-radeon-7790-review-feat-sapphire-the-first-desktop-sea-islands
AMD Radeon HD 7790 Review Feat. Sapphire: The First Desktop Sea Islands
by Ryan Smith on March 22, 2013 12:01 AM EST
In an industry that has long grown accustomed to annual product updates, the video card industry is one where the flip of a calendar to a new year brings a lot of excitement, anticipation, speculation, and maybe even a bit of dread for consumers and manufacturers alike. It’s no secret then that with AMD launching most of their Radeon HD 7000 series parts in Q1 of 2012 that the company would be looking to refresh their product lineup this year. Indeed, they removed doubt before 2012 even came to a close when they laid out their 8000M plans for the first half of 2013, revealing their first 2013 GPU and giving us a mobile roadmap with clear spots for further GPUs. So we have known for months that new GPUs would be on their way; the questions being what would they be and when would they arrive?
The answer to that, as it turns out, is a lot more complex than anyone was expecting. It’s been something of an epic journey getting to AMD’s 2013 GPU launches, and not all for good reasons. A PR attempt to explain that the existing Radeon HD 7000 series parts would not be going away backfired in a big way, with AMD’s calling their existing product stack “stable through 2013” being incorrectly interpreted as their intention to not release any new products in 2013. This in turn led to AMD going one step further to rectify the problem by publicly laying out their 2013 plans in greater (but not complete) detail, which thankfully cleared a lot of confusion. Though not all confusion and doubt has been erased – after all, AMD has to save something for the GPU introductions – we learned that AMD would be launching new retail desktop 7000 series cards in the first half of this year, and that brings us to today.
Launching today is AMD’s second new GPU for 2013 and the first GPU to make it to the retail desktop market: Bonaire. Bonaire in turn will be powering AMD’s first new retail desktop card for 2013, the Radeon HD 7790. With the 7790 AMD intends to fill the sometimes wide chasm in price and performance between their existing 7770 (Cape Verde) and 7850 (Pitcairn) products, and as a result today we’ll see just how Bonaire and the 7790 fit into the big picture for AMD’s 2013 plans.
|AMD GPU Specification Comparison|
| ||AMD Radeon HD 7790||AMD Radeon HD 7850||AMD Radeon HD 7770||AMD Radeon HD 6870|
|Memory Clock||6GHz GDDR5||4.8GHz GDDR5||4.5GHz GDDR5||4.2GHz GDDR5|
|Memory Bus Width||128-bit||256-bit||128-bit||256-bit|
|Target Board Power||~85W||150W (TDP)||~80W||151W (TDP)|
|Manufacturing Process||TSMC 28nm||TSMC 28nm||TSMC 28nm||TSMC 40nm|
|Architecture||GCN 1.1*||GCN 1.0||GCN 1.0||VLIW5|
Diving right into things like always, Bonaire is designed to be an in-between GPU; something to go between the 10 Compute Unit Cape Verde GPU, and the 20 CU Pitcairn GPU. Pitcairn, as we might recall, is almost entirely twice the GPU that Cape Verde is. It has twice as many shaders, twice as many ROPs, twice as many geometry processors, and twice as wide a memory bus. Not surprisingly then, the performance gap between the two GPUs at similar clockspeeds approaches that two-fold difference, and even with binning and releasing products like the 7850 this leaves a fairly large gap in performance.
As AMD intends to carry the existing Southern Islands family forward into 2013, their strategy for the mid-to-low end of the desktop market has become one of filling in that gap. This is a move made particularly important for AMD due to the fact that NVIDIA’s GK106-powered GeForce GTX 650 Ti sits rather comfortably between AMD’s 7770 and 7850 in price and performance, robbing AMD of that market segment. Bonaire in turn will fill that gap, and the 7790 will be the flagship desktop Bonaire video card.
So what are we looking at for Bonaire and the 7790? As the 7790 will be a fully enabled Bonaire part, what we’ll be seeing with the 7790 today will be everything that Bonaire can offer. On the specification front we’re looking at 14 CUs, which breaks down to 896 stream processors paired with 56 texture units, giving Bonaire 40% more shading and texturing performance than Cape Verde. As a further change to the frontend, the number of geometry engines and command processors (ACEs) has been doubled compared to Cape Verde from 1 to 2 each, giving Bonaire the ability to process up to 2 primitives per clock instead of 1, bringing it up to parity with Pitcairn and Tahiti. Finally, the backend remains unchanged; like Cape Verde, Bonaire has 16 ROPs attached to a 128bit memory bus, giving it equal memory bandwidth and equal ROP throughput at equivalent clockspeeds.
Moving on to the 7790 in particular, the 7790 will be shipping at a familiar 1GHz, the same core clockspeed as the 7770. So all of those performance improvements due to increases in functional units translate straight through – compared to the 7770, the 7790 has 40% more theoretical compute/shading performance, 40% more texturing performance, 100% more geometry throughput, and no change in ROP throughput. Meanwhile in a move mirroring what AMD did with the 7970 GHz Edition last year, AMD has bumped up their memory clocks. 7790 will ship with a 6GHz memory clock thanks to a higher performing (i.e. not from Cape Verde) memory interface, which compared to the 7770’s very conservative 4.5GHz memory clock means that the 7790 will have 33% more memory bandwidth compared to 7770, despite the fact that the memory bus itself is no wider.
Putting it all together, so long as the 7790 is not ROP bottlenecked, it stands to be 33%-100% faster than the 7770. Or relative to the 7850, the 7790 offers virtually all of the 7850’s texturing and shading performance (it’s actually 2% faster), while offering only around 60% of the memory bandwidth and ROP throughput.
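For readers who want to check these ratios, the arithmetic can be sketched in a few lines of Python. The 7790 and 7770 figures are the ones quoted above. The 7850's 1024 stream processors, 32 ROPs and 860MHz core clock are not given in this article and are taken from AMD's public specifications. The two-FLOPs-per-clock figure assumes one fused multiply-add per stream processor per cycle. Treat it as a back-of-the-envelope sketch of peak rates, not a performance model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    stream_processors: int
    rops: int
    core_mhz: int
    mem_gbps: float  # effective GDDR5 data rate per pin (the "6GHz" figure)
    bus_bits: int

    def gflops(self) -> float:
        # Assumption: 2 FLOPs per stream processor per clock (one FMA).
        return self.stream_processors * 2 * self.core_mhz / 1000

    def bandwidth_gbs(self) -> float:
        # Per-pin data rate times bus width, converted from bits to bytes.
        return self.mem_gbps * self.bus_bits / 8

    def rop_gpixels(self) -> float:
        # One pixel per ROP per clock.
        return self.rops * self.core_mhz / 1000


HD7790 = Card("HD 7790", 896, 16, 1000, 6.0, 128)
HD7770 = Card("HD 7770", 640, 16, 1000, 4.5, 128)
# Assumption: 7850 unit counts and core clock come from public specs,
# not from this article.
HD7850 = Card("HD 7850", 1024, 32, 860, 4.8, 256)


def percent_of(a: float, b: float) -> float:
    """Return a expressed as a percentage of b."""
    return 100 * a / b


if __name__ == "__main__":
    for other in (HD7770, HD7850):
        print(f"7790 relative to {other.name}:")
        print(f"  shading   {percent_of(HD7790.gflops(), other.gflops()):.1f}%")
        print(f"  bandwidth {percent_of(HD7790.bandwidth_gbs(), other.bandwidth_gbs()):.1f}%")
        print(f"  ROPs      {percent_of(HD7790.rop_gpixels(), other.rop_gpixels()):.1f}%")
```

Under those assumptions the 7790 lands at 140% of the 7770's shading rate and roughly 133% of its memory bandwidth, and at about 102% of the 7850's shading rate with 62.5% of its bandwidth and about 58% of its ROP throughput.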
On the power front, unsurprisingly power consumption has gone up a bit. As a reminder, AMD does not quote TDPs, but rather “typical board power”, which is AMD’s estimate for what power consumption will be like under an average workload. 7770’s official TBP is 80W, while 7790’s is 85W. We’ll have our own breakdown on this in our look at power, temperature, and noise, but it’s fair to say that 7790 draws only a small amount of additional power over the 7770. Ultimately this can be attributed to the fact that while Bonaire is a larger chip, it’s not extremely so, with only the addition of the CUs and additional geometry/ACE pipeline separating the two. Combine that with gradual improvements over the last year on TSMC’s 28nm process and better power management from AMD, and it’s possible to make these kinds of small improvements while not pushing load power too much higher.
On the note of Bonaire versus Cape Verde, let’s also talk a bit about transistor count and die sizes. Unsurprisingly, Bonaire sits between Cape Verde and Pitcairn in transistor count and die size. Altogether Bonaire comes in at 2.08B transistors, occupying a 160mm2 die. This is as compared to Cape Verde’s 1.5B transistors and 123mm2 die size, or Pitcairn’s 2.8B transistors and 212mm2 die size. For AMD their closest chip in terms of die size in recent history would be Juniper, the workhorse of the Evergreen family and the Radeon HD 5770, which came in at 166mm2.
Moving on, as is consistent with AMD’s previous announcements, the 7790 is being launched as just that: the 7790. AMD has told us that they intend to keep the HD 7000 brand in retail this year due to the success of the brand, and to that end our first Bonaire card is a 7700 series card. The namespace collision is unfortunate – sticking with the 7000 series means AMD is facing the pigeonhole principle and has to put new GPUs in existing sub-series – but ultimately this is something AMD shouldn’t have any real problems executing on. We’ll get into the microarchitecture of Bonaire on our next page, but for gamers and other consumers Bonaire may as well be another member of the Southern Islands GPU family, so it fits in nicely in the 7000 series despite being from a new wave of GPUs.
With that in mind, let’s talk about product positioning and pricing. The 7790 will launch at $149, roughly in between the 7770 and the 7850. AMD will be positioning it as an entry-level 1080p graphics card, and though it’s a 7700 series part its closest competition in AMD’s product stack is more likely to be the 7850, which it’s closer to on the basis of both price and performance.
Against the competition, the 7790’s closest competition will be the GeForce GTX 650 Ti. However with the price of that card regularly falling to $130 and lower, the 7790 is effectively carving out a small niche for itself where it will be a bit ahead of the GTX 650 Ti in both performance and in price. NVIDIA’s next card up is the GTX 660, at more than $200.
For anyone looking to pick up a 7790 today, this is being launched ahead of actual product availability (likely to coincide with GDC 2013 next week). Cards will start showing up in the market on April 2nd, which is about a week and a half from now. Notably, AMD and their partners will be launching stock clocked and factory overclocked parts right away, and from what we’re being told factory overclocked cards will be prolific from day one. Overall we’re expecting this launch to be a lot like the launch of the GTX 560, where NVIDIA did something very similar. In which case we should see both stock and factory overclocked parts right away with more factory overclocked parts than stock parts, and if it does play out like the 560 then stock clocked cards would become a larger piece of the 7790 inventory later in the lifetime of the 7790.
Finally, AMD is wasting no time in extending their Never Settle Reloaded bundle to the 7790. As the 7790 is a cheaper card it won’t come with as many games as the more expensive Radeon cards, but 7790 buyers will receive a voucher for Bioshock Infinite with their cards. MSRPs/values are usually a poor way to look at the significance of game bundles, but it goes without saying that it’s not too often that $150 cards come with brand-new AAA games.
|Spring 2013 GPU Pricing Comparison|
|AMD||Price||NVIDIA|
| ||$219||GeForce GTX 660|
|Radeon HD 7850||$179|| |
|Radeon HD 7790||$149|| |
| ||$134||GeForce GTX 650 Ti|
|Radeon HD 7770||$109||GeForce GTX 650|
|Radeon HD 7750||$99||GeForce GT 640|
Bonaire’s Microarchitecture - What We’re Calling GCN 1.1
With our introduction out of the way, before looking at the cards and our performance results we would like to dive into a technical discussion and a bit of nitpicking. Specifically we would like to spend some time talking about architectures and product naming, as it’s going to be a bit confusing at first. As AMD has stated numerous times in the past, Graphics Core Next is a long-term architecture for the company. AMD intends to evolve GCN over the years, releasing multiple microarchitectures based on GCN that improve the architecture and add features while still being rooted in the design principles of GCN. GCN is after all the other half of AMD’s upcoming HSA-capable APUs, the culmination of years of AMD’s efforts with HSA/Fusion.
So where does Bonaire fit in? Bonaire is of course a GCN part; it’s a new microarchitecture that’s technically different from Southern Islands, but on the whole it’s a microarchitecture that’s extremely close in design to Southern Islands. In this new microarchitecture there are some changes – among other things the new microarchitecture implements some new instructions that will be useful for HSA, support for a larger number of compute work queues (also good for HSA) and it also implements a new version of AMD’s PowerTune technology (which we’ll get to in a bit) – but otherwise the differences from Southern Islands are very few. There are no notable changes in shader/CU efficiency, ROP efficiency, graphics features, etc. Unless you’re writing compute code for AMD GPUs, from what we know about this microarchitecture it’s likely you’d never notice a difference.
Unfortunately AMD has chosen to more-or-less gloss over the microarchitectural differences altogether, which is not wholly surprising since they will be selling Bonaire and previous generation products side-by-side. Bonaire’s microarchitecture has no official name (at least not one AMD wants to give us) and no version number. The Sea Islands name we’ve been seeing thrown around is not the microarchitecture name. Sea Islands is in fact the name for all of the GPUs in this wave – or perhaps it would be better to say all of the products created in this development cycle – including both Bonaire and its new microarchitecture, and Oland, AMD’s other new GPU primarily for mobile that is purely Southern Islands in microarchitecture.
In fact, had AMD not released (and then retracted) an ISA document called “AMD Sea Islands Instruction Set Architecture” last month, we would likely know even less about Bonaire’s microarchitecture. The document has been retracted at least in part due to the name (since AMD will not be calling the microarchitecture Sea Islands after all), so as a whole AMD isn’t particularly keen on talking about their microarchitecture at this time. But at the same time from a product standpoint it gives you an idea of how AMD is intending to smoothly offer both Southern Islands and Bonaire microarchitecture parts together as one product family.
Anyhow, for the sake of our sanity and for our discussions, in lieu of an official name from AMD we’re going to be retroactively renaming AMD’s GCN microarchitectures in order to quickly tell them apart. For the rest of this article and in future articles we will be referring to Southern Islands as GCN 1.0, while Bonaire’s microarchitecture will be GCN 1.1, to reflect the small changes between it and the first rendition of GCN.
Ultimately the differences between GCN 1.0 and GCN 1.1 are extremely minor, but they are real. Despite our general annoyance with how this has been handled, for consumers the difference between a GCN 1.0 card like the 7770 and a GCN 1.1 card like the 7790 should be limited to their innate performance differences, and of course PowerTune. GCN 1.1 or not, Bonaire fits in nicely in AMD’s current product stack and is in a position where it’s reasonable for it to be lumped together with GCN 1.0 parts as a single family. It’s really only the technical enthusiasts (like ourselves) and programmers that should have any significant reason to care about GCN 1.0 versus GCN 1.1. For everyone else this may just as well be another Southern Islands part.
The New PowerTune: Adding Further States
In 2010 AMD introduced their PowerTune technology alongside their Cayman GPU. PowerTune was a new, advanced method of managing GPU voltages and clockspeeds, with the goal of offering better control over power consumption at all times so that AMD could be more aggressive with their clockspeeds. PowerTune’s primary task was to rein in programs like FurMark – power viruses as AMD calls them – so that these programs would not push a card past its thermal/electrical limits. Consequently, with PowerTune in place AMD would not need to set their maximum GPU clocks as conservatively merely to handle the power virus scenario.
This technology was brought forward for the entire Southern Islands family of GPUs, and remained virtually unchanged. PowerTune as implemented on SI cards without Boost had 3 states – idle, intermediate (low-3D), and high (full-3D). When for whatever reason PowerTune needed to clamp down on power usage to stay within the designated limits, it could either jump states or merely turn down the clockspeed, depending on how far over the limit the card was trying to go. In practice state jumps were rare – it’s a big gap between high and intermediate – so for non-boost cards it would merely turn down the GPU clockspeed until power consumption was where it needed to be.
Modulating clockspeeds in such a manner is a relatively easy thing to implement, but it’s not without its drawbacks. The main one is that semiconductor power consumption scales at a far greater rate with voltage than it does with clockspeed. So although turning down clockspeeds does reduce power consumption, it doesn’t do so by a large degree. If you want big power savings, you need to turn down the voltage too.
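To put rough numbers on that, the usual first-order approximation for CMOS dynamic power is that it scales linearly with clockspeed and with the square of voltage. The sketch below applies that textbook rule to a hypothetical 10% throttle. The 10% figures are ours for illustration, leakage power is ignored, and none of this reflects AMD's actual voltage/frequency tables.

```python
def relative_dynamic_power(clock_scale: float, voltage_scale: float = 1.0) -> float:
    """First-order CMOS dynamic power, P ~ C * V^2 * f, relative to stock.

    Switched capacitance C is fixed for a given chip, so only the clock
    and voltage scaling factors matter. Leakage is ignored.
    """
    return clock_scale * voltage_scale ** 2


# Hypothetical throttle: 10% off the clock, with and without 10% off the voltage.
clock_only = relative_dynamic_power(0.90)
clock_and_voltage = relative_dynamic_power(0.90, 0.90)

if __name__ == "__main__":
    print(f"clock only:        {clock_only:.3f} of stock power")
    print(f"clock and voltage: {clock_and_voltage:.3f} of stock power")
```

Under this approximation, dropping the clock alone by 10% trims power by about 10%, while pairing it with a 10% voltage drop trims power by about 27%.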
Starting with 7790 and Bonaire, this is exactly what AMD is doing. Gone is pure clockspeed modulation – inferred states in AMD’s nomenclature – and instead AMD is moving to using a larger number of full states. GCN 1.1 has 8 states altogether, with no inferred states between them. With this change, when PowerTune needs to reduce clockspeeds it can drop to a nearby state, reducing power consumption through both clockspeed and voltage reductions at the same time.
With this change state jumping will also be a far more frequent occurrence. The lack of intermediate states and the lack of granularity (8 states over 700MHz is not fine-grained) effectively makes fast state jumping a requirement, as there’s a very good chance dropping down a state will leave some power/performance on the table. So if it’s throttling, 7790 will be able to state jump as quickly as every 10ms (that’s 100 jumps a second), typically bouncing between two or more states in order to keep the card within its limits.
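The reason bouncing between states recovers that lost performance is easiest to see as a duty cycle: if the power limit sits between two adjacent states, spending the right fraction of 10ms slots in each makes average power land on the limit. The sketch below works that out. The only facts taken from the article are the 8 states, the roughly 700MHz span and the 10ms interval. Every clock and power value in the table is invented for illustration, and AMD's real governor is certainly more involved than this.

```python
from bisect import bisect_right

TICK_MS = 10  # from the article: a state change at most every 10ms

# Hypothetical table of (clock in MHz, board power in W), lowest state first.
# Only "8 states over roughly 700MHz" comes from the article.
STATES = [(300, 30), (400, 36), (500, 43), (600, 51),
          (700, 60), (800, 70), (900, 81), (1000, 93)]


def duty_cycle(power_limit_w):
    """Return (low_state, high_state, fraction_of_time_in_high_state)
    such that the time-averaged power equals the limit."""
    powers = [power for _, power in STATES]
    if power_limit_w >= powers[-1]:
        return STATES[-1], STATES[-1], 1.0   # never needs to throttle
    if power_limit_w <= powers[0]:
        return STATES[0], STATES[0], 0.0     # pinned to the lowest state
    hi = bisect_right(powers, power_limit_w)  # first state above the limit
    lo = hi - 1
    fraction = (power_limit_w - powers[lo]) / (powers[hi] - powers[lo])
    return STATES[lo], STATES[hi], fraction


def effective_clock(power_limit_w):
    """Time-averaged clockspeed when alternating between the two states."""
    (clock_lo, _), (clock_hi, _), fraction = duty_cycle(power_limit_w)
    return clock_lo + fraction * (clock_hi - clock_lo)


if __name__ == "__main__":
    low, high, fraction = duty_cycle(85)
    slots_per_second = 1000 // TICK_MS
    print(f"alternate {low[0]}MHz and {high[0]}MHz, "
          f"about {fraction * slots_per_second:.0f} of every "
          f"{slots_per_second} slots in the higher state")
    print(f"effective clock: {effective_clock(85):.0f}MHz")
```

With this made-up table and an 85W limit, staying in the 900MHz state would leave 4W unused. Alternating with the 1000MHz state for about a third of the slots uses the full budget and averages out to roughly 933MHz.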
At the same time, AMD’s formula for picking states on non-boost cards has changed. In a move similar to what AMD has done with Richland, AMD’s temperature-agnostic state selection system has been ditched in favor of one that factors temperature into the calculation, making it a system that is now based on power, temperature, and load. There are some minor benefits to being temperature-agnostic that AMD is giving up – mainly that performance is going to vary a bit with temperature now – but at the end of the day this allows AMD to better min-max their GPUs to hit higher frequencies more often. This also brings them to parity with Intel and NVIDIA, who have long taken temperature into account.
The fact that this is a very boost-like system is not lost on us, and with these changes the line between PowerTune with and without boost starts to become foggy. Both are ultimately going to be doing the same thing – switching states based on power and temperature considerations – the only difference being whether a card adjusts down, or if it adjusts both up and down. In practice we rarely see cards adjust down outside of FurMark, so while PowerTune doesn’t dictate a clockspeed floor, base clocks are still base clocks. In which case the practical difference between whether an AMD card has boost or not is whether it can access some higher voltage, higher clockspeed states that it may not be able to maintain for long periods of time across all workloads. The 7790 isn’t a boost part of course, but AMD’s own presentation neatly lays out where boost would fit in, so if we do see future GCN 1.1 products with boost we have a good idea of what to expect.
Moving on, with the changes to PowerTune will also come changes to AMD’s API for 3rd party utilities, and what information is reported. First and foremost, due to the frequency of state changes with the new PowerTune, AMD will no longer be reporting the instantaneous state. Instead they will be reporting an average of the states used. We don’t know how big the averaging window is – we suspect it’s no more than 2 seconds – but the end result will be that MSI Afterburner, GPU-Z, and other utilities will now see those averages reported as the clockspeed. This will give most users a better idea of what the effective clockspeed (and thereby effective performance) is, but it does mean that it’s going to be virtually impossible to infer the clockspeeds/voltages of AMD’s new states.
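To illustrate why averaged reporting hides the underlying states, here is a minimal sketch of a rolling-average reporter. The class and method names are hypothetical and have nothing to do with AMD's actual API. The 2 second window is only our guess from above, and the 900MHz and 1000MHz clocks are invented.

```python
from collections import deque


class AveragedClockReporter:
    """Hypothetical stand-in for a driver that reports a rolling mean of
    the clockspeeds used, rather than the instantaneous state."""

    def __init__(self, window_ms=2000, tick_ms=10):
        # A bounded deque silently discards the oldest sample once full.
        self._samples = deque(maxlen=window_ms // tick_ms)

    def record(self, clock_mhz):
        """Store the clock the GPU ran at during one tick."""
        self._samples.append(clock_mhz)

    def reported_mhz(self):
        """Return the mean clock over the current window."""
        if not self._samples:
            raise ValueError("no samples recorded yet")
        return sum(self._samples) / len(self._samples)


if __name__ == "__main__":
    reporter = AveragedClockReporter()
    # A card bouncing evenly between two invented states every 10ms.
    for tick in range(200):
        reporter.record(1000 if tick % 2 == 0 else 900)
    print(f"utility sees: {reporter.reported_mhz():.0f}MHz")
```

A card alternating evenly between those two states would be reported at 950MHz, a clockspeed that matches neither state, which is why a monitoring utility can no longer be used to read the state table back out.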
The other change is that with the new PowerTune AMD will be exposing new tweaking options to 3rd parties. The current PowerTune (TDP) setting is going to be joined by a separate setting for adjusting a limit called Total Design Current (TDC), which as the name implies is how much current is allowed to be passed into the GPU. AMD limits cards by both TDP and TDC to keep total power, temperatures, and total currents in check, so this will open up the latter to tweakers. Unfortunately utilities with TDC controls were not ready in time for our 7790 review, so we can’t really comment on TDC at this time. With AMD’s changes to PowerTune however (and their insistence on calling TDP thermal management), TDP may be turning into a temperature control while TDC becomes the new power control.
Finally, since these controls are going to be user-accessible, this will spill over to AMD’s partners. Partners will be able to set their own TDP and TDC limits if they wish, which will help them fine-tune their factory overclocked cards. This will give partners more headroom for such cards as opposed to being stuck shipping cards at AMD’s reference limits, but it means that different cards from different vendors may have different base TDP and TDC limits, along with different clockspeeds. This also means that in the future equalizing clockspeeds may not be enough to equalize two cards.
Meet The Radeon HD 7790 & Sapphire HD 7790 Dual-X OC
Today we’ll be looking at two cards, AMD’s reference card and Sapphire’s customized HD 7790 Dual-X OC. As is typical for cards in this price segment, the designs are relatively simple and as such only a few partners will be using the reference design as opposed to rolling their own designs. At the same time AMD is pushing their partners’ factory overclocked cards hard – we have the Sapphire and then 3 more on their way – but it’s important to keep in mind that not every last 7790 will be factory overclocked. So AMD’s spartan reference card is a good example of what baseline 7790 performance will be like, including how well it performs with a simple, single-fan open air cooler.
As we alluded to a moment ago, AMD’s reference 7790 is a spartan card. At under 7” long it’s actually shorter than the 7770 and has more in common with the 7750 as far as board length goes. Cooling is provided by a small open air cooler, composed of a circular heatsink with copper heatpipes running up from the base of the card and into the heatsink fins. At the center is a single, small fan responsible for providing the airflow for the card. Meanwhile towards the front of the card we find a small upright heatsink, providing the minimal cooling necessary for the MOSFETs regulating power for the card.
As we’re looking at a 128bit card, memory is provided by four 6GHz Hynix GDDR5 memory modules, placed on the front of the card underneath the heatsink. A lot of AMD’s partners will be shipping their cards with the memory overclocked to 6.4GHz, which is a fairly common overclock for Hynix’s GDDR5 modules these days.
Elsewhere on the card we can see the sole 6pin PCIe power socket, pointing towards the rear of the card. The 7790 does draw more power than the 7770, and while total power consumption is fairly low, it’s still over 75W and hence requires external power. Meanwhile at the top of the card we can see a single CrossFire connector. AMD believes offering CF here when NVIDIA’s closest product doesn’t (the GTX 650 Ti) is a marketable advantage, but CFing a 1GB card in 2013 strikes us as a poor idea.
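The connector math here is simple enough to sketch. The 75W slot limit is the figure mentioned above. The 75W rating for a 6pin connector is not stated in this article and is assumed from the PCI Express specification. The function name is ours.

```python
import math

PCIE_SLOT_W = 75  # the slot limit referred to in the text
SIX_PIN_W = 75    # assumption: 6pin connector rating per the PCIe spec


def six_pin_connectors_needed(board_power_w):
    """Minimum number of 6pin connectors for a given board power,
    assuming the slot supplies its full share first."""
    excess = board_power_w - PCIE_SLOT_W
    return max(0, math.ceil(excess / SIX_PIN_W))


if __name__ == "__main__":
    for name, watts in (("HD 7770", 80), ("HD 7790", 85), ("HD 7850", 150)):
        print(f"{name} at {watts}W: {six_pin_connectors_needed(watts)} x 6pin")
```

By this estimate the 7790's 85W board power needs one connector, the same as the 80W 7770, and a single 6pin would cover anything up to 150W.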
Finally, for display connectivity AMD has deviated from the rest of the 7000 series a bit. The 7700 and 7800 series used a single row of display connectors, typically composed of an HDMI port, a DL-DVI-I port, and 2 miniDPs. With 7790 however AMD is dropping the miniDPs in favor of one full-size DisplayPort, and at the same time they’re bringing back the stacked DVI connector.
Taking up space on the 2nd slot of the card’s bracket is a DL-DVI-D port, giving us the first AMD card with two DVI ports in this price range in some time. Note that while Bonaire can drive up to 6 displays it can only drive 2 TMDS-type displays (DVI/HDMI), so the second DVI port can only be used if the HDMI port is not in use.
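The TMDS restriction amounts to a simple rule that can be checked mechanically. The sketch below encodes the two limits stated above (6 displays in total, 2 of them TMDS). The output labels and the function name are ours, and it treats the DisplayPort output as driving a native DisplayPort monitor rather than a passive DVI/HDMI adapter.

```python
# Outputs that use TMDS signalling, per the text (DVI and HDMI).
TMDS_OUTPUTS = {"DVI-I", "DVI-D", "HDMI"}
MAX_TMDS_DISPLAYS = 2
MAX_DISPLAYS = 6


def is_valid_configuration(outputs_in_use):
    """Check a list of active outputs against Bonaire's stated limits."""
    tmds_count = sum(1 for output in outputs_in_use if output in TMDS_OUTPUTS)
    return len(outputs_in_use) <= MAX_DISPLAYS and tmds_count <= MAX_TMDS_DISPLAYS


if __name__ == "__main__":
    for config in (["DVI-I", "DVI-D", "DisplayPort"],
                   ["DVI-I", "HDMI", "DisplayPort"],
                   ["DVI-I", "DVI-D", "HDMI"]):
        verdict = "ok" if is_valid_configuration(config) else "too many TMDS displays"
        print(", ".join(config), "->", verdict)
```

So both DVI ports plus DisplayPort works, as does one DVI port plus HDMI plus DisplayPort, but both DVI ports and HDMI together do not.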
|Radeon HD 7790 Specification Comparison|
| ||Radeon HD 7790 (Ref)||Sapphire HD 7790 OC|
|Width||Double Slot||Double Slot|
Meet The Sapphire HD 7790 Dual-X OC
Moving on, we’re also taking a look at a partner card today, Sapphire’s HD 7790 Dual-X OC. Virtually every partner is releasing a factory overclocked card of some kind with their own take on the design, but Sapphire’s 7790 should be a good representation of what to expect given how similar many of the 7790 designs are.
To that end Sapphire’s Dual-X cooler is your fairly standard twin fan design, utilizing a pair of shallow fans mounted over an aluminum heatsink that runs over the length of the card. A pair of copper heatpipes run from the baseplate over the GPU to the heatsink, with the entire solution serving as an open-air cooler. Note that while Sapphire is using AMD’s reference PCB design here, they have lengthened their PCB to match the length of their heatsink, and to allow them to turn the PCIe power socket 90 degrees so that it now is against the top of the card rather than the rear.
As given away by the OC name, Sapphire will be shipping their card with a decent factory overclock. Shipping speeds will be 1075MHz for the core and 6.4GHz for the memory, a 7.5% core overclock and 6.7% memory overclock respectively. This will be the most common factory overclock, with several other partners shipping their top-end cards with the same overclock.
Other than the custom cooler and factory overclock, Sapphire’s card is otherwise functionally identical to AMD’s reference card. We’re looking at the same display output configuration of 1x DP, 1x HDMI, and 2x DL-DVI, with the same CrossFire capabilities. Sapphire is putting the MSRP of the card at $159, putting a $10 premium on their cooler and factory overclock.
For today’s review we will be using the latest rendition of our game benchmark suite, first introduced in our review of the GeForce GTX Titan. We still expect to add another 1-2 games to this suite in April after the last of the major Spring game releases hit next week. As a reminder, our 2013 benchmark suite is much more 1080p centric on the low-end, as 1080p sales have eclipsed even cheaper, lower resolution monitors. As AMD is promoting the 7790 as an entry-level 1080p card anyhow, this ends up working well.
On the driver side of things we are using AMD’s 12.101.2 press drivers for the 7790, and their Catalyst 13.2 beta 7 drivers for the rest of our AMD cards. For our NVIDIA cards we are using 314.21.
Unfortunately we only had a very short period of time to spend with this card due to AMD’s launch schedule conflicting with NVIDIA’s GPU Technology Conference this week. As a result while we’ve been able to put together our usual analysis and data collections, we’ve only been able to compare it to around half a dozen other cards – the relevant AMD and NVIDIA cards above and below the 7790, and for a historical perspective we’ve thrown in the Radeon HD 6870.
Similarly, because we had only a short period of time to write this article, our performance commentary will be lighter than usual, so our apologies on that. But the fact of the matter is that the 7790 results will speak for themselves as we’ll see in our charts. Against AMD’s lineup the 7790 is comfortably in between the 7770 and 7850, offering 130% of the former’s performance and 84% of the latter’s on average. Against NVIDIA’s lineup, meanwhile, the 7790 is 11% faster than the GTX 650 Ti, beating the 650 Ti – sometimes by quite a bit – in everything but Battlefield 3. The question, as is often the case, is not performance but price.
|CPU:||Intel Core i7-3960X @ 4.3GHz|
|Motherboard:||EVGA X79 SLI|
|Power Supply:||Antec True Power Quattro 1200|
|Hard Disk:||Samsung 470 (256GB)|
|Memory:||G.Skill Ripjaws DDR3-1867 4 x 4GB (8-10-9-26)|
|Case:||Thermaltake Spedo Advance|
|Video Cards:||AMD Radeon HD 7850, AMD Radeon HD 7790, AMD Radeon HD 7770, AMD Radeon HD 6870, Sapphire HD 7790 Dual-X OC, NVIDIA GeForce GTX 660, NVIDIA GeForce GTX 650 Ti|
|Video Drivers:||NVIDIA ForceWare 314.21, AMD 12.101.2 7790 Press Beta, AMD Catalyst 13.2 Beta 7|
|OS:||Windows 8 Pro|
Racing to the front of our benchmark suite is our racing benchmark, DiRT: Showdown. DiRT: Showdown is based on the latest iteration of Codemasters’ EGO engine, which has continually evolved over the years to add more advanced rendering features. It was one of the first games to implement tessellation, and also one of the first games to implement a DirectCompute based forward-rendering compatible lighting system. At the same time, as Codemasters is by far the most prevalent PC racing developer, it’s also a good proxy for some of the other racing games on the market like F1 and GRID.
Total War: Shogun 2
Our next benchmark is Shogun 2, which is a continuing favorite in our benchmark suite. Total War: Shogun 2 is the latest installment of the long-running Total War series of turn-based strategy games, and alongside Civilization V is notable for just how many units it can put on a screen at once. Even 2 years after its release it’s still a very punishing game at its highest settings due to the amount of shading and memory those units require.
The third game in our lineup is Hitman: Absolution. The latest game in Square Enix’s stealth-action series, Hitman: Absolution is a DirectX 11 based title that, though a bit heavy on the CPU, can give most GPUs a run for their money. Furthermore it has a built-in benchmark, which gives it a level of standardization that fewer and fewer benchmarks possess.
Another Square Enix game, Sleeping Dogs is one of the few open world games to be released with any kind of benchmark, giving us a unique opportunity to benchmark an open world game. Like most console ports, Sleeping Dogs’ base assets are not extremely demanding, but it makes up for it with its interesting anti-aliasing implementation, a mix of FXAA and SSAA that at its highest settings does an impeccable job of removing jaggies. However by effectively rendering the game world multiple times over, it can also require a very powerful video card to drive these high AA modes.
Up next is our legacy title for 2013, Crysis: Warhead. The stand-alone expansion to 2007’s Crysis, at over 4 years old Crysis: Warhead can still beat most systems down. Crysis was intended to be future-looking as far as performance and visual quality goes, and it has clearly achieved that. We’ve only finally reached the point where single-GPU cards have come out that can hit 60fps at 1920 with 4xAA.
Far Cry 3
The next game in our benchmark suite is Far Cry 3, Ubisoft’s island-jungle action game. A lot like our other jungle game Crysis, Far Cry 3 can be quite tough on GPUs, especially with MSAA and improved alpha-to-coverage checking thrown into the mix. On the other hand it’s still a bit of a pig on the CPU side, and seemingly inexplicably we’ve found that it doesn’t play well with HyperThreading on our testbed, making this the only game we’ve ever had to disable HT for to maximize our framerates.
Our final action game of our benchmark suite is Battlefield 3, DICE’s 2011 multiplayer military shooter. Its ability to pose a significant challenge to GPUs has been dulled some by time and drivers, but it’s still a challenge if you want to hit the highest settings at the highest resolutions at the highest anti-aliasing levels. Furthermore while we can crack 60fps in single player mode, our rule of thumb here is that multiplayer framerates will dip to half our single player framerates, so hitting high framerates here may not be high enough.
Our final game, Civilization V, gives us an interesting look at things that other RTSes cannot match, with a much weaker focus on shading in the game world and a much greater focus on creating the geometry needed to bring such a world to life. In doing so it uses a slew of DirectX 11 technologies, including tessellation for said geometry, driver command lists for reducing CPU overhead, and compute shaders for on-the-fly texture decompression.
As always we'll start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the only games with a benchmark that can isolate the use of DirectCompute and its resulting performance.
Our next benchmark is LuxMark 2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.
Our 3rd benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former is a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while fluid simulations are common in professional graphics work and games alike.
Moving on, our 4th compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, as Folding @ Home is moving exclusively to OpenCL this year with FAHCore 17.
Our 5th compute benchmark is Sony Vegas Pro 12, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.
Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, as described in this previous article, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.
As always we’ll also take a quick look at synthetic performance to get a better look at our video cards' underpinnings. These tests are mostly for comparing cards from within a manufacturer, as opposed to directly comparing AMD and NVIDIA cards.
We’ll start with 3DMark Vantage’s Pixel Fill test, a mix of a ROP test and a bandwidth test to see if you have enough bandwidth to feed those ROPs.
Moving on, we have our 3DMark Vantage texture fillrate test, which does for texels and texture mapping units what the previous test does for ROPs.
Finally we’ll take a quick look at tessellation performance with TessMark. We have everything turned up to maximum here, which means we're looking at roughly 11 million polygons per frame.
Power, Temperature, & Noise
Last but not least of course is our look at power, temperature, and noise. AMD designed the Bonaire GPU in the same vein as Cape Verde, making it slightly bigger and slightly more power hungry, but not immensely so. As a result of this and other considerations the 7790’s typical board power is rated at 85W, just 5W over the 7770’s. Now we’ll see how that translates to real world conditions.
Stopping quickly to take a look at voltages, while AMD’s clock averaging algorithm means that we cannot see the individual states, we can still see the idle and top states just as before due to the fact that these are easily sustained states. So we know what kind of voltage our 7790 reference card is pulling at idle and at 1GHz. To that end we’re seeing an idle voltage of 0.894v and a load voltage of 1.263v according to GPU-Z.
|Radeon HD 7790 Voltages|
|Ref 7790 Idle||Ref 7790 Load||Sapphire 7790 Load|
For anyone curious about the impact of the latest rendition of PowerTune, we also recorded the average clockspeed of the 7790 in all of our games. Since PowerTune in this case is being used purely as a throttling mechanism and not as a boost mechanism, there’s actually very little to report here. Our 7790 sustained 1GHz in each and every game. Even in FurMark the average clockspeed was only slightly reduced, to 975MHz. Given what we’re seeing, we don’t see any reason that any other 7790 card shouldn’t be doing the full 1GHz the vast majority of the time.
|Radeon HD 7790 Average Clockspeeds|
|Far Cry 3||1000MHz|
Kicking things off as always is our look at power consumption. We are measuring power consumption at the wall for the entire system, so these numbers are almost entirely composed of everything but the video card, but it still is sensitive enough to identify any cards that may be significantly worse than today’s low-idling cards.
For reasons we’re not entirely sure of, we’re seeing 7790’s idle power consumption come in at just a bit higher than both the 7770 and 7850. As Bonaire is a larger GPU the former isn’t unusual, but it’s not clear why 7790 is pulling more power at idle than the larger yet 7850 and its 2GB of GDDR5. The difference at the wall is only a couple of watts, but it is repeatable.
Looking at our first load power test with Battlefield 3, we can see that our test results are very close to AMD’s typical board power rating. Both the reference and Sapphire 7790 cards are pulling 7W more than the 7770, a hair more than the 5W difference in AMD’s TBP. Considering the fact that the 7790 is nearly 30% faster than the 7770 on average, this is quite an accomplishment on a power/performance basis for AMD in the span of a year. It’s a very reasonable increase in performance for very little of an increase in power consumption. This also means that the 7790 looks good against the 7850; the 7850 is faster, but it draws 22W more at the wall.
But perhaps more interesting is the 7790 against the GTX 650 Ti. The 650 Ti is around 5% faster in this test, but it’s also leading to our testbed drawing 12W more at the wall. As BF3 is a game test, power consumption does scale slightly with performance due to the extra work put on the CPU to generate the additional frames, but a 5% performance difference in this case would not explain a 12W increase in power consumption. NVIDIA has for a long time set the bar on efficiency, but with the 7790 it looks like AMD will finally edge out NVIDIA.
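To put the 7770 comparison in rough numbers, the sketch below works from AMD’s rated typical board power rather than our at-the-wall readings, since wall power includes the rest of the system. It assumes the 7770’s TBP is 80W (85W minus the stated 5W difference) and rounds "nearly 30%" up to 1.30; the function name is hypothetical.

```python
# Figures from the review: 85 W typical board power for the 7790,
# 5 W more than the 7770, and a "nearly 30%" average performance lead.
TBP_7790_W = 85.0
TBP_7770_W = TBP_7790_W - 5.0   # 80 W, derived from the stated 5 W gap
PERF_7790_VS_7770 = 1.30        # assumption: "nearly 30%" rounded up


def relative_perf_per_watt(perf_ratio, power_a_w, power_b_w):
    """Performance per watt of card A relative to card B.

    perf_ratio is A's performance divided by B's performance.
    """
    power_ratio = power_a_w / power_b_w
    return perf_ratio / power_ratio


gain = relative_perf_per_watt(PERF_7790_VS_7770, TBP_7790_W, TBP_7770_W)
print(f"7790 perf/W relative to 7770 (rated TBP): {gain:.2f}x")  # 1.22x
```

By this estimate the 7790 delivers on the order of 20% more performance per rated watt than the 7770. Real-world efficiency will differ somewhat, as rated TBP is not the same thing as measured power draw.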
Moving on to FurMark the picture changes a bit. Against the 7770 the 7790 once again looks great, with the difference at the wall being a single watt, while drawing 14W less than the 7850. On the other hand the 7790 is now drawing slightly more than the GTX 650 Ti, but as we’ve seen before game tests and FurMark can sometimes be at odds since FurMark is an explicit test of a card’s power throttling systems. NVIDIA looks to be throttling a bit harder than AMD, which would lead to what we’re seeing.
At the same time our 7790 cards show a bit of variance. The Sapphire card, despite being overclocked, draws 6W less than our reference 7790. There are some temperature differences between the cards due to their different coolers, so with the new PowerTune this may be a manifestation of the impact of temperature on power consumption. Or it could just be standard variation for 7790, taking into consideration that the GPU in the Sapphire card would be binned. In FurMark this is mostly an intellectual curiosity. Once we have more 7790s we may be able to break it down further.
Looking at idle temperatures there are no great surprises here. Sapphire’s Dual-X cooler can bring their card down to 28C, while our reference card hits 30C. Idle temperatures on these low-wattage GPUs aren’t too far above room temperature.
Jumping to load temperatures, the reference 7790 with its admittedly average cooler is particularly average in the temperature department. 69C is warmer than the 7850, 7770, and the GTX 650 Ti, but as it’s still under 70C these are by all considerations still cool temperatures. Sapphire’s cooler in the meantime looks very good here. 53C when running BF3 is quite an accomplishment, especially since our reference GTX 650 Ti does so well in this test.
Cranking things up to the max for FurMark, we can see that temperatures are up across the board. The reference 7790 is now at 74C, which is warm but not exceptional. It’s still a low enough power card that even a half-decent cooler is enough to render temperatures meaningless. Meanwhile Sapphire’s card once again takes top honors at a relatively chilly 62C, 3C cooler than the otherwise overbuilt 7850, and 6C cooler than the reference GTX 650 Ti.
Both of our 7790 cards look good when it comes to idle noise. The reference 7790 with its one fan runs up against the noise floor at 37.5 dB(A), while the Sapphire card with its two fans is just a bit louder at 38 dB(A). In lieu of passive cooling this looks like a decent option for silent PC enthusiasts looking for a quiet idle, though really anything at the GTX 650 Ti level or lower is doing rather well here.
Jumping into load noise, we can see that the reference 7790 and its simple open air cooler thrives with our BF3 testing. Despite the jump in heat dissipation the amount of noise generated by its one fan has jumped by less than 1 dB(A), an almost imperceptible increase. At the same time this paints a picture of the reference 7790 where it looks like it has been optimized for noise over temperature, giving us reasonably acceptable temperatures for fantastic acoustics.
Sapphire’s card on the other hand isn’t set to favor acoustics over temperature, so those great temperatures we saw earlier do come at the expense of some noise. 41.2 dB(A) leaves it effectively tied with the 7770 and quite a bit quieter than the 7850, but over 3 dB(A) louder than the reference 7790, and 2 dB(A) louder than the GTX 650 Ti. If you’re looking for near-absolute silence here there is a clear benefit to the reference 7790 over Sapphire’s card, but 41.2 dB(A) is nothing to sneeze at.
Finally we have our look at noise under FurMark. Again the reference 7790 does very well here, coming in at 38.5 dB(A), amazingly quiet for a >75W card under FurMark. This is over 8 dB(A) quieter than the 7770, 12 dB(A) quieter than the 7850, and over 2 dB(A) quieter than the GTX 650 Ti. AMD is more interested in pushing partner cards than their own design, but right now right here the reference design is looking very good.
As for Sapphire’s card, once more we see it favoring temperatures over noise. 45.5 dB(A) is still quieter than the reference 7770, so Sapphire isn’t doing badly here. But the reference 7790 and even the GTX 650 Ti back it into a corner. If you care about noise then the reference 7790 is a better configuration, but if you care about temperature the Sapphire is going to be a tough act to beat.
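Because decibels are logarithmic, gaps of a few dB(A) are larger than they look. The sketch below converts the FurMark difference between our two 7790s into a sound power ratio using the standard 10^(dB/10) relationship. Note that this is a ratio of sound power, not of perceived loudness, which grows more slowly; the function name is hypothetical.

```python
# FurMark noise readings quoted in the review, in dB(A).
REF_7790_FURMARK_DBA = 38.5
SAPPHIRE_7790_FURMARK_DBA = 45.5


def sound_power_ratio(delta_db):
    """Convert a level difference in decibels into a sound power ratio."""
    return 10.0 ** (delta_db / 10.0)


delta = SAPPHIRE_7790_FURMARK_DBA - REF_7790_FURMARK_DBA  # 7.0 dB
ratio = sound_power_ratio(delta)
print(f"{delta:.1f} dB difference is about {ratio:.1f}x the sound power")
```

A 7 dB gap works out to roughly five times the sound power, which is why the reference card’s advantage is easy to hear even though both readings sit in the low-to-mid 40s or below.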
It goes without saying of course that these noise improvements come despite a 30%+ increase in performance over the 7770, and contrast against a marginal increase in power consumption. The price difference between the 7770 and 7790 put them in distinctly different categories, but the 7790s we’re seeing are clear improvements over the 7770 in practically every way. These gains aren’t quite as remarkable when placed against the GTX 650 Ti due to the latter’s high efficiency out of the gate, however that doesn’t mean AMD hasn’t managed to surpass NVIDIA here. BF3 power consumption and noise testing on our reference 7790 tells us that AMD can edge out NVIDIA in the efficiency department here, especially since this comes with a 10% performance improvement.
Bringing our review to a close, the launch of the Radeon HD 7790 is another precisely targeted launch by AMD. The 7790 is intended to fill AMD’s price and performance gaps between the 7770 and the 7850, and it does this very well, offering 84% of the 7850’s performance – or 130% of the 7770’s performance – for around $30 less than the 7850. In the world of sub-$200 video cards where every $10 matters, this is exactly what AMD needs to fill in their product lineup.
Meanwhile as the first GCN 1.1 GPU, Bonaire doesn’t greet us with any great surprises, and if not for the new PowerTune implementation it would be indistinguishable from Southern Islands (GCN 1.0). With that said AMD already had a strong architecture in GCN 1.0, so even minor changes such as PowerTune and a new GPU configuration serve to make a good architecture better. The new PowerTune will probably take enthusiasts a bit of time to get used to, but ultimately we’re happy to see AMD moving to using just full clock/voltage states and not relying on their clockspeed-only inferred states, as the former is going to offer more power savings. As for AMD’s functional unit layout for Bonaire – 14 CUs, 2 geometry pipelines, and 16 ROPs – it looks to have paid off handsomely for them. They’ve improved performance by quite a bit without having to add too many transistors or a larger memory bus, making it a great way to iterate on GCN midway between new process nodes.
The big question of course is whether the 7790 is worth its $149 price tag, and whether factory overclocked models like the Sapphire are worth their $159 price tag. From a pure price/performance perspective, right now things look pretty good for AMD and their partners. Against the rest of the 7000 series it has a very clear niche to fill, which it does without being so good as to make the 7850 redundant. Meanwhile against NVIDIA’s GeForce GTX 650 Ti things are still in AMD’s favor but it’s a bit murkier. A 12% performance advantage is distinct, but AMD’s also asking for nearly $20 more than most cheap GTX 650 Tis. At these prices there’s really no concept of a sweet spot since consumers often have fixed budgets, so instead we’ll point out that NVIDIA simply doesn’t have a suitable $150 video card right now; all they can offer are factory overclocked GTX 650 Ti cards.
Speaking of factory overclocked cards, our Sapphire HD 7790 Dual-X OC was exactly what we expected it to be. A 6-7% increase in clockspeeds leads to a 6% performance increase, showing that 7790 achieves the performance scaling necessary to make these cards viable. In this case overclocked cards are a very straightforward proposition: $10-$20 more for 6% more performance and typically a better cooler. This is all rather normal for factory overclocked cards, though we would point out that we have no reason to believe these overclocks aren’t achievable on stock-clocked cards.
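For readers who want to see the price/performance picture as numbers, here is a rough sketch built only from the averages and prices quoted in this conclusion. Two assumptions are ours: the GTX 650 Ti is priced at $130 (our reading of "nearly $20 more"), and the 84%, 12%, and 6% figures are treated as if they came from the same set of tests. The helper name is hypothetical.

```python
# Performance is normalized so that the stock 7790 equals 1.0.
CARDS = {
    # name: (relative performance, price in USD)
    "Radeon HD 7790": (1.0, 149.0),
    "Sapphire 7790 OC": (1.06, 159.0),        # ~6% faster than stock
    "Radeon HD 7850 2GB": (1.0 / 0.84, 180.0),  # 7790 offers 84% of its performance
    "GeForce GTX 650 Ti": (1.0 / 1.12, 130.0),  # assumption: ~$130 street price
}


def perf_per_100_dollars(relative_perf, price_usd):
    """Relative performance bought by each $100 spent."""
    return relative_perf / price_usd * 100.0


value = {name: perf_per_100_dollars(perf, price) for name, (perf, price) in CARDS.items()}

# Print the cards from best to worst value under these assumptions.
for name, score in sorted(value.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:20s} {score:.3f}")
```

Under these assumptions all four cards land within a few percent of each other, which fits the view that there is no real sweet spot at these prices and that the choice mostly comes down to budget.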
Our one concern with the 7790 right now is one of memory size. Adding another 1GB of GDDR5 would definitely have a price impact, and having 2GB of GDDR5 on a 128bit bus would be a bit odd. But on the other hand we now know what the future of PC gaming holds: a lot of ports coming from a console with 8GB of GDDR5 memory. 1GB is going to look very small in a year’s time as those ports start arriving.
Ultimately we’re reminded of a discussion we had with the launch of the GTX 650 Ti last year, when we had the time to look at 2GB vs. 1GB on the 650 Ti and the 7850. Our conclusion at the time was such: “We have reached that point where if you’re going to be spending $150 or more that you shouldn’t be settling for a 1GB card; this is the time where 2GB cards are going to become the minimum for performance gaming video cards.” That conclusion has not changed. The 7790 looks good among the current crop of cards, but the 2GB 7850 is going to be so much more future-proof, at least in as much as a video card can be. At these prices consumer budgets are typically fixed and for good reason, but with 2GB 7850s available at around $180, it’s a very compelling upgrade for the extra $30. In 2013 it’s something worth considering if you want to keep a video card for at least a couple of years.