Original Link: http://www.anandtech.com/show/5818/nvidia-geforce-gtx-670-review-feat-evga
NVIDIA GeForce GTX 670 Review Feat. EVGA: Bringing GK104 Down To $400
by Ryan Smith on May 10, 2012 9:00 AM EST
In a typical high-end GPU launch we’ll see the process take place in phases over a couple of months if not longer. The new GPU will be launched in the form of one or two single-GPU cards, with additional cards coming to market in the following months and culminating in the launch of a dual-GPU behemoth. This is the typical process as it allows manufacturers and board partners time to increase production, stockpile chips, and work on custom designs.
But this year things aren’t so typical. GK104 wasn’t the typical high-end GPU from NVIDIA, and it seems there is nothing typical about its launch either.
NVIDIA has not been wasting any time in getting their complete GK104 based product lineup out the door. Just 6 weeks after the launch of the GeForce GTX 680, NVIDIA launched the GeForce GTX 690, their dual-GK104 monster. Now only a week after that NVIDIA is at it again, launching the GK104 based GeForce GTX 670 this morning.
Like its predecessors, GTX 670 will fill in the obligatory role as a cheaper, slower, and less power-hungry version of NVIDIA’s leading video card. This is a process that allows NVIDIA to not only put otherwise underperforming GPUs to use, but to satisfy buyers at lower price points at the same time. Throughout this entire process the trick to successfully launching any second-tier card is to try to balance performance, prices, and yields, and as we’ll see NVIDIA has managed to turn all of the knobs just right to launch a very strong product.
|GTX 680||GTX 670||GTX 580||GTX 570|
|Memory Clock||6.008GHz GDDR5||6.008GHz GDDR5||4.008GHz GDDR5||3.8GHz GDDR5|
|Memory Bus Width||256-bit||256-bit||384-bit||320-bit|
|FP64||1/24 FP32||1/24 FP32||1/8 FP32||1/8 FP32|
|Manufacturing Process||TSMC 28nm||TSMC 28nm||TSMC 40nm||TSMC 40nm|
Like GeForce GTX 680, GeForce GTX 670 is based on NVIDIA’s GK104 GPU. So we’re looking at the same Kepler design and the same Kepler features, just at a lower level of performance. As always the difference is that since this is a second-tier card, NVIDIA is achieving that by harvesting otherwise defective GPUs.
In a very unusual move for NVIDIA, for GTX 670 they’re disabling one of the eight SMXes on GK104 and lowering the core clock a bit, and that’s it. GTX 670 will ship with 7 active SMXes, all 32 of GK104’s ROPs, and all 4 GDDR5 memory controllers. Typically we’d see NVIDIA hit every aspect of the GPU at once in order to create a larger performance gap and to maximize the number of GPUs they can harvest – such as with the GTX 570 and its 15 SMs & 40 ROPs – but not in this case.
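To put the missing SMX in perspective, here is a rough sketch of theoretical peak throughput for the GTX 670 configuration. The SMX count, base clock, and the 1/24 FP64 rate come from this article; the figure of 192 CUDA cores per SMX and the 2 FLOPs per core per clock (one fused multiply-add) are assumptions taken from NVIDIA's published Kepler specifications, not from the text above. The function name is purely illustrative.

```python
# Assumptions not stated in the article: 192 CUDA cores per Kepler SMX,
# and one fused multiply-add (2 FLOPs) per core per clock.
CORES_PER_SMX = 192
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_gflops(active_smx: int, clock_mhz: float, fp64_ratio: float = 1.0) -> float:
    """Theoretical peak GFLOPS for a GK104 configuration at a given clock."""
    cores = active_smx * CORES_PER_SMX
    return cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz / 1000.0 * fp64_ratio


# GTX 670: 7 active SMXes at the 915MHz base clock (figures from the article).
gtx670_fp32 = peak_gflops(active_smx=7, clock_mhz=915)
# FP64 runs at 1/24 the FP32 rate per the spec table.
gtx670_fp64 = peak_gflops(active_smx=7, clock_mhz=915, fp64_ratio=1 / 24)

print(f"GTX 670 FP32 peak at base clock: {gtx670_fp32:.0f} GFLOPS")
print(f"GTX 670 FP64 peak at base clock: {gtx670_fp64:.0f} GFLOPS")
```

These are paper numbers at the base clock only; since GK104 spends most of its time boosting, real-world throughput will usually sit somewhat higher.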
Meanwhile clockspeeds turn out to be equally interesting. Officially, both the base clock and the boost clock are a fair bit lower than GTX 680. GTX 670 will ship at 915MHz for the base clock and 980MHz for the boost clock, which is 91MHz (9%) and 78MHz (7%) lower than the GTX 680 respectively. However as we’ve seen with GTX 680 GK104 will spend most of its time boosting and not necessarily just at the official boost clock. Taken altogether, depending on the game and the specific GPU GTX 670 has the capability to boost within 40MHz or so of GTX 680, or about 3.5% of the clockspeed of its more powerful sibling.
As for the memory subsystem, like the ROPs it has not been touched at all. GTX 670 will ship at the same 6.008GHz memory clockspeed as GTX 680 with the same 256-bit memory bus, giving it the same 192GB/sec of memory bandwidth. This is particularly interesting as in the past NVIDIA has always turned down their memory clocks, and has typically taken out a memory controller/ROP combination as well. Given that GK104 is an xx4 GPU rather than a full successor to GF110 and its 48 ROPs, it would seem that NVIDIA is concerned about their ROP and memory performance and will not sacrifice performance there for GTX 670.
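The 192GB/sec figure falls straight out of the spec table. As a quick sketch, treating the listed memory clock as the effective GDDR5 data rate per pin (an interpretation on our part, though it is the usual convention), peak bandwidth is simply data rate times bus width in bytes. The helper name is hypothetical.

```python
def memory_bandwidth_gbs(effective_clock_ghz: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/sec: data rate per pin x bus width in bytes."""
    return effective_clock_ghz * bus_width_bits / 8


# (effective memory clock in GHz, bus width in bits), from the spec table
cards = {
    "GTX 680": (6.008, 256),
    "GTX 670": (6.008, 256),
    "GTX 580": (4.008, 384),
    "GTX 570": (3.8, 320),
}

for name, (clock_ghz, width_bits) in cards.items():
    print(f"{name}: {memory_bandwidth_gbs(clock_ghz, width_bits):.1f} GB/sec")
```

Run against the table, this also shows why the narrower 256-bit bus isn't a step back from GTX 580: the higher memory clock almost exactly cancels it out.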
Taken altogether, this means at base clocks GTX 670 has 100% of the memory bandwidth, 91% of the ROP performance, and 80% of the shader performance of GTX 680. This puts GTX 670’s specs notably closer to GTX 680 than GTX 570 was to GTX 580, or GTX 470 before it. In other words the GTX 670 won’t trail the GTX 680 by as much as the GTX 570 trailed the GTX 580 – or conversely the GTX 680 won’t have quite the same lead as the GTX 580 did.
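For anyone who wants to check the math, the percentages above can be sketched as follows. The GTX 680 base clock is not quoted directly in this article, so it is reconstructed here from the 91MHz delta given earlier; the simple "units times clock" scaling model is an approximation of theoretical throughput, not a performance prediction.

```python
GTX670_BASE_MHZ = 915
GTX680_BASE_MHZ = GTX670_BASE_MHZ + 91  # 1006MHz, derived from the quoted delta

GTX670_SMX, GTX680_SMX = 7, 8      # active SMXes
GTX670_ROPS, GTX680_ROPS = 32, 32  # ROPs are untouched on GTX 670

clock_ratio = GTX670_BASE_MHZ / GTX680_BASE_MHZ

# Theoretical throughput scales with (number of units) x (clock).
rop_ratio = (GTX670_ROPS / GTX680_ROPS) * clock_ratio
shader_ratio = (GTX670_SMX / GTX680_SMX) * clock_ratio
bandwidth_ratio = 1.0  # same memory clock and bus width as GTX 680

print(f"Memory bandwidth: {bandwidth_ratio:.0%}")
print(f"ROP performance:  {rop_ratio:.0%}")
print(f"Shader performance: {shader_ratio:.0%}")
```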
As for power consumption, the gap between the two is going to be about the same as we saw between the GTX 580 and GTX 570. The official TDP of the GTX 670 is 170W, 25W lower than the GTX 680. Unofficially, NVIDIA’s GPU Boost power target for GTX 670 is 141W, 29W lower than the GTX 680. Thus like the GTX 680 the GTX 670 has the lowest TDP for a part of its class that we’ve seen out of NVIDIA in quite some time.
Moving on, unlike the GTX 680 launch NVIDIA is letting their partners customize right off the bat. GTX 670 will launch with a mix of reference, semi-custom, and fully custom designs with a range of coolers, clockspeeds, and prices. There are a number of cards to cover over the coming weeks, but today we’ll be looking at EVGA’s GeForce GTX 670 Superclocked alongside our reference GTX 670.
As we’ve typically seen in the past, custom cards tend to appear when GPU manufacturers and their board partners feel more comfortable about GPU availability and this launch is no different. The GTX 670 launch is being helped by the fact that NVIDIA has had an additional 7 weeks to collect suitable GPUs compared to the GTX 680 launch, on top of the fact that these are harvested GPUs. With that said NVIDIA is still in the same situation they were in last week with the launch of the GTX 690: they already can’t keep GK104 in stock.
Due to binning GTX 670 isn’t drawn from GTX 680 inventory, so it’s not a matter of these parts coming out of the same pool, but realistically we don’t expect NVIDIA to be able to keep GTX 670 in stock any better than they can GTX 680. The best case scenario is that GTX 680 supplies improve as some demand shifts down to the GTX 670. In other words Auto-Notify is going to continue to be the best way to get a GTX 600 series card.
Finally, let’s talk pricing. If you were expecting GTX 570 pricing for GTX 670 you’re going to come away disappointed. Because NVIDIA is designing GTX 670 to perform closer to GTX 680 than with past video cards they’re also setting the prices higher. GTX 670 will have an MSRP of $399 ($50 higher than GTX 570 at launch), with custom cards going for higher yet. This should dampen demand some, but we don’t expect it will be enough.
Given its $399 MSRP, the GTX 670 will primarily be competing with the $399 Radeon HD 7950. However from a performance perspective the $479 7970 will also be close competition depending on the game at hand. AMD’s Three For Free promo has finally gone live, so they’re countering NVIDIA in part based on the inclusion of Deus Ex, Nexuiz, and DiRT Showdown with most 7900 series cards.
Below that we have AMD’s Radeon HD 7870 at $350, while the GTX 570 will be NVIDIA’s next card down at around $299. The fact that NVIDIA is even bothering to mention the GTX 570 is an interesting move, since it means they expect it to remain as part of their product stack for some time yet.
Update 5/11: NVIDIA said GTX 670 supply would be better than GTX 680 and it looks like they were right. As of this writing Newegg still has 5 of 7 models in stock, which is far better than the GTX 680 and GTX 690 launches. We're glad to see that NVIDIA is finally able to keep a GTX 600 series card in stock, particularly a higher volume part like GTX 670.
|Spring 2012 GPU Pricing Comparison|
|AMD||Price||NVIDIA|
| ||$999||GeForce GTX 690|
| ||$499||GeForce GTX 680|
|Radeon HD 7970||$479|| |
|Radeon HD 7950||$399||GeForce GTX 670|
|Radeon HD 7870||$349|| |
| ||$299||GeForce GTX 570|
|Radeon HD 7850||$249|| |
| ||$199||GeForce GTX 560 Ti|
| ||$169||GeForce GTX 560|
|Radeon HD 7770||$139|| |
Meet The GeForce GTX 670
Because of the low power consumption of GK104 relative to past high-end NVIDIA GPUs, NVIDIA has developed a penchant for small cards. While the GTX 680 was a rather standard 10” long, NVIDIA also managed to cram the GTX 690 into the same amount of space. Meanwhile the GTX 670 takes this to a whole new level.
We’ll start at the back as this is really where NVIDIA’s fascination with small size makes itself apparent. The complete card is 9.5” long, however the actual PCB is far shorter at only 6.75” long, 3.25” shorter than the GTX 680’s PCB. In fact it would be fair to say that rather than strapping a cooler onto a card, NVIDIA strapped a card onto a cooler. NVIDIA has certainly done short PCBs before – such as with one of the latest GTX 560 Ti designs – but never on a GTX x70 part. But given the similarities between GK104 and GF114, this isn’t wholly surprising, if not to be expected.
In any case this odd pairing of a small PCB with a large cooler is no accident. With a TDP of only 170W NVIDIA doesn’t necessarily need a huge PCB, but because they wanted a blower for a cooler they needed a large cooler. The positioning of the GPU and various electronic components meant that the only place to put a blower fan was off of the PCB entirely, as the GK104 GPU is already fairly close to the rear of the card. Meanwhile the choice of a blower seems largely driven by the fact that this is an x70 card – NVIDIA did an excellent job with the GTX 560 Ti’s open air cooler, which was designed for the same 170W TDP, so the choice is effectively arbitrary from a technical standpoint (there’s no reason to believe $400 customers are any less likely to have a well-ventilated case than $250 buyers). Accordingly, it will be NVIDIA’s partners that will be stepping in with open air coolers of their own designs.
Starting as always at the top, as we previously mentioned the reference GTX 670 is outfitted with a 9.5” long fully shrouded blower. NVIDIA tells us that the GTX 670 uses the same fan as the GTX 680, and while they’re nearly identical in design, based on our noise tests they’re likely not identical. On that note unlike the GTX 680 the fan is no longer placed high to line up with the exhaust vent, so the GTX 670 is a bit more symmetrical in design than the GTX 680 was.
Lifting the cooler we can see that NVIDIA has gone with a fairly simple design here. The fan vents into a block-style aluminum heatsink with a copper baseplate, providing cooling for the GPU. Elsewhere we’ll see a moderately sized aluminum heatsink clamped down on top of the VRMs towards the front of the card. There is no cooling provided for the GDDR5 RAM.
As for the PCB, as we mentioned previously due to the lower TDP of the GTX 670 NVIDIA has been able to save some space. The VRM circuitry has been moved to the front of the card, leaving the GPU and the RAM towards the rear and allowing NVIDIA to simply omit a fair bit of PCB space. Of course with such small VRM circuitry the reference GTX 670 isn’t built for heavy overclocking – like the other GTX 600 cards NVIDIA isn’t even allowing overvolting on reference GTX 670 PCBs – so it will be up to partners with custom PCBs to enable that kind of functionality. Curiously only 4 of the 8 Hynix R0C GDDR5 RAM chips are on the front side of the PCB; the other 4 are on the rear. We typically only see rear-mounted RAM in cards with 16/24 chips, as 8/12 will easily fit on the same side.
Elsewhere at the top of the card we’ll find the PCIe power sockets and SLI connectors. Since NVIDIA isn’t scrambling to save space like they were with the GTX 680, the GTX 670’s PCIe power sockets are laid out in a traditional side-by-side manner. As for the SLI connectors, since this is a high-end GeForce card NVIDIA provides 2 connectors, allowing for the card to be used in 3-way SLI.
Finally at the front of the card NVIDIA is using the same I/O port configuration and bracket that we first saw with the GTX 680. This means 1 DL-DVI-D port, 1 DL-DVI-I port, 1 full size HDMI 1.4 port, and 1 full size DisplayPort 1.2. This also means the GTX 670 follows the same rules as the GTX 680 when it comes to being able to idle with multiple monitors.
Meet The EVGA GeForce GTX 670 Superclocked
Our second card of the day is EVGA’s GeForce GTX 670 Superclocked, which in EVGA’s hierarchy is their first tier of factory overclocked cards. EVGA is binning GTX 670s and in turn promoting some of them to this tier, which means the GTX 670 Superclocked are equipped with generally better performing chips than the average reference card.
|GeForce GTX 670 Partner Card Specification Comparison|
|EVGA GeForce GTX 670 Superclocked||GeForce GTX 670 (Ref)|
|Memory Bus Width||256-bit||256-bit|
|Manufacturing Process||TSMC 28nm||TSMC 28nm|
|Width||Double Slot||Double Slot|
For the GTX 670 SC, EVGA has given both the core clock and memory clock a moderate boost. The core clock has been increased by 52MHz (6%) to 967MHz base and 66MHz (7%) boost to 1046MHz. Meanwhile the memory clock has been increased by 202MHz (3%) to 6210MHz.
Other than the clockspeed changes, the GTX 670 SC is an almost-reference card utilizing a reference PCB with a slightly modified cooler. EVGA is fabricating their own shroud, but they’ve copied NVIDIA’s reference shroud down to almost the last detail. The only functional difference is that the diameter of the fan intake is about 5mm less; otherwise the only difference is that EVGA has detailed it differently than NVIDIA and used some rounded corners in place of square corners.
The only other change you’ll notice is that EVGA is using their own high flow bracket in place of NVIDIA’s bracket. The high flow bracket cuts away as much metal as possible, maximizing the area of the vents. Though based on our power and temperature readings, this doesn’t seem to have notably impacted the GTX 670 SC.
While we’re on the matter of customized cards and factory overclocks, it’s worth reiterating NVIDIA’s position on factory overclocked cards. Reference and semi-custom cards (that is, cards using the reference PCB) must adhere to NVIDIA’s power target limits. For GTX 670 this is a 141W power target, with a maximum power target of 122% (170W). Fully custom cards with better power delivery circuitry can go higher, but not semi-custom cards. As a result the flexibility in building semi-custom cards comes down to binning. EVGA can bin better chips and use them in cards such as the Superclocked – such as our sample which can go 17 boost bins over the base clock versus 13 bins for our reference GTX 670 – but at the end of the day for stock performance they’re at the mercy of what can be accomplished within 141W/170W.
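The relationship between boost bins, clocks, and the power target can be sketched in a few lines. One assumption needs flagging: the 13MHz size of a boost bin is not stated anywhere in this article. It is the commonly cited GPU Boost step size for GK104, and it is consistent with the 1188MHz peak we report for the EVGA card in our compute tests (967MHz + 17 bins). The function name is hypothetical.

```python
# Assumption: one GPU Boost bin is a 13MHz step (not stated in the article).
BOOST_BIN_MHZ = 13


def max_boost_clock(base_mhz: int, bins_over_base: int) -> int:
    """Highest clock a card reaches when it climbs the given number of boost bins."""
    return base_mhz + bins_over_base * BOOST_BIN_MHZ


evga_sc_peak_mhz = max_boost_clock(base_mhz=967, bins_over_base=17)
reference_peak_mhz = max_boost_clock(base_mhz=915, bins_over_base=13)

# Power target: 141W default, adjustable up to 122% on reference PCBs.
POWER_TARGET_W = 141
MAX_POWER_TARGET_PCT = 122
max_power_w = POWER_TARGET_W * MAX_POWER_TARGET_PCT / 100

print(f"EVGA GTX 670 SC peak boost: {evga_sc_peak_mhz}MHz")
print(f"Reference GTX 670 peak boost: {reference_peak_mhz}MHz")
print(f"Maximum power target: {max_power_w:.0f}W")
```

Note that 122% of 141W works out to roughly 172W, which lines up with the card's 170W TDP once rounding is accounted for.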
In any case, as the card is otherwise a reference GTX 670 EVGA is relying on the combination of their factory overclock, their toolset, and their strong reputation for support to carry the card. EVGA has priced the card at $419, $20 over the GTX 670 MSRP, in-line with other factory overclocked cards.
On the subject of pricing and warranties, since this is the first EVGA card we’ve reviewed since April 1st, this is a good time to go over the recent warranty changes EVGA has made.
Starting April 1st, EVGA has implemented what they’re calling their new Global Warranty Policy. Starting July 1st, 2011 (the policy is being backdated), all new EVGA cards ship with at least a 3 year warranty. And for the GTX 600 series specifically, so far EVGA has only offered models with a 3 year warranty in North America, which simplifies their product lineup.
To complement the 3 year warranty and replace the lack of longer term warranties, EVGA is now directly selling 2 and 7 year warranty extensions, for a total of 5 and 10 years respectively. So instead of buying a card with a 3 year warranty or a longer warranty, you’ll simply buy the 3 year card and then buy a warranty extension to go with it. However the extended warranty requires that the card be registered and the warranty purchased within 30 days.
The second change is that the base 3 year warranty no longer requires product registration. EVGA has other ways to entice buyers into registering, but they’ll now honor all applicable cards for 3 years regardless of the registration status. At the same time the base 3 year warranty is now a per-product warranty (e.g. a transferable warranty) rather than per-user warranty, so the base warranty will transfer to 2nd hand buyers. The extended warranties however will not.
The third change is how EVGA is actually going to handle the warranty process. First and foremost, EVGA is now allowing cards to be sent to the nearest EVGA RMA office rather than the office for the region the card was purchased from. For example a buyer moving from Europe to North America can send the card to EVGA’s North American offices rather than sending it overseas.
Finally, EVGA is now doing free cross shipping, alongside their existing Advanced RMA program. EVGA will now cross-ship replacement cards for free to buyers. The buyer meanwhile is responsible for paying to ship the faulty card back and putting up collateral on the new card until EVGA receives the old card.
There’s also one quick change to the step-up program that will impact some customers. With the move to purchasing extended warranties, the step-up program is only available to customers who either purchase an extended warranty or purchase an older generation card that comes with a lifetime warranty. Step-up is not available to cards with only the base 3 year warranty.
Moving on, along with EVGA’s new warranty EVGA is bundling the latest version of their GPU utilities, Precision X and OC Scanner X.
Precision X, as we touched upon quickly in our GTX 680 review, is the latest iteration of EVGA’s Precision overclocking & monitoring utility. It’s still based on RivaTuner and along with adding support for the GTX 600 series features (power targets, framerate caps, etc), it also introduces a new UI. Functionality wise it’s still at the top of the pack along with the similarly RivaTuner powered MSI Afterburner. Personally I’m not a fan of the new UI – circular UIs and sliders aren’t particularly easy to read – but it gets the job done.
OC Scanner X has also received a facelift and functionality upgrade of its own. Along with its basic FurMark-ish stress testing and error checking, it now also offers a basic CPU stress test and GPU benchmark.
The official launch drivers for the GTX 670 are 301.34, which are also the official launch drivers for the GTX 690. These drivers do not support any other cards but are otherwise virtually identical to the 301.33 press drivers used for the GTX 690 launch. For all other NVIDIA cards we’re using 301.24.
|CPU:||Intel Core i7-3960X @ 4.3GHz|
|Motherboard:||EVGA X79 SLI|
|Chipset Drivers:||Intel 18.104.22.1682|
|Power Supply:||Antec True Power Quattro 1200|
|Hard Disk:||Samsung 470 (256GB)|
|Memory:||G.Skill Ripjaws DDR3-1867 4 x 4GB (8-10-9-26)|
|Case:||Thermaltake Spedo Advance|
|Video Cards:||AMD Radeon HD 5870, AMD Radeon HD 6950, AMD Radeon HD 6970, AMD Radeon HD 7870, AMD Radeon HD 7950, AMD Radeon HD 7970, NVIDIA GeForce GTX 470, NVIDIA GeForce GTX 570, NVIDIA GeForce GTX 580, NVIDIA GeForce GTX 670, NVIDIA GeForce GTX 680|
|Video Drivers:||NVIDIA ForceWare 301.24, NVIDIA ForceWare 301.33, NVIDIA ForceWare 301.34, AMD Catalyst 12.4|
|OS:||Windows 7 Ultimate 64-bit|
Kicking things off as always is Crysis: Warhead. It’s no longer the toughest game in our benchmark suite, but it’s still a technically complex game that has proven to be a very consistent benchmark. Thus even four years since the release of the original Crysis, “but can it run Crysis?” is still an important question, and the answer continues to be “no.” While we’re closer than ever, full Enthusiast settings at 60fps is still beyond the grasp of a single-GPU card.
If GTX 680 had one weakness in particular it was Crysis, and that certainly hasn’t changed with GTX 670. The good news is that the GTX 670 does relatively well compared to the GTX 680 because of its memory bandwidth – GK104 in general seems to be memory bandwidth constrained here – but that’s where the good news ends. GTX 670 can’t otherwise tie the Radeon HD 7950, let alone beat it or threaten the 7970.
Overall performance isn’t particularly strong either. Given the price tag of the GTX 670 the most useful resolution is likely going to be 2560x1600, where the GTX 670 can’t even cross 30fps at our enthusiast settings. Even 1920x1200 isn’t looking particularly good. This is without a doubt the legitimate lowpoint of the GTX 670.
As for gamers looking to upgrade, the GTX 670 looks decent here compared to the GTX 570, but nothing fantastic. The memory bandwidth limitations mean that performance has only gained 33%, which isn’t particularly great for an 18 month span.
Finally, EVGA’s first showing here is decent, but nothing spectacular. Thanks to a combination of being TDP limited and Crysis’s memory bandwidth limits, the GTX 670 Superclocked is at best 3% faster here.
The story with minimum framerates is much the same. The GTX 670 can closely trail the GTX 680, but it’s still not up to the caliber of the 7950 let alone the 7970.
Paired with Crysis as our second behemoth FPS is Metro: 2033. Metro gives up Crysis’ lush tropics and frozen wastelands for an underground experience, but even underground it can be quite brutal on GPUs, which is why it’s also our new benchmark of choice for looking at power/temperature/noise during a game. If its sequel due this year is anywhere near as GPU intensive then a single GPU may not be enough to run the game with every quality feature turned up.
With our second game the GTX 670 already begins to dig itself out of its hole from Crysis. Like the GTX 680 it doesn’t do particularly well here compared to AMD’s best, but it’s enough for a very slight lead on the 7950. At the same time however the GTX 670 falls farther behind the GTX 680, hitting an 8% gap at 2560.
For racing games our racer of choice continues to be DiRT, which is now in its 3rd iteration. Codemasters uses the same EGO engine between its DiRT, F1, and GRID series, so the performance of EGO has been relevant for a number of racing games over the years.
As was the case with the GTX 680, NVIDIA’s prospects improve significantly with DiRT 3. At 2560 the GTX 670 has now launched ahead of the 7950 by a comfortable margin and is within a few FPS of the 7970, still leaving it short but giving us our first sign that the GTX 670 can compete with AMD’s top card. What’s interesting to note here is that we seem to be particularly shader limited here even though DiRT 3 isn’t a highly intensive game, as the GTX 670 falls a bit further behind the GTX 680, leaving a 10% gap. Unfortunately EVGA’s factory overclock isn’t doing a ton here, once again improving performance by just 3% over our reference card.
Total War: Shogun 2
Total War: Shogun 2 is the latest installment of the long-running Total War series of turn based strategy games, and alongside Civilization V is notable for just how many units it can put on a screen at once. As it also turns out, it’s the single most punishing game in our benchmark suite (on higher end hardware at least).
Unfortunately for NVIDIA the Kepler performance bug persists with Shogun 2, which makes things particularly wretched for NVIDIA at Ultra settings. At 2560 the GTX 670 can’t beat a Radeon HD 6970 let alone a 7950. The fact that the GTX 670 does so well at 1920 where this issue doesn’t come up does give the GTX 670 a great deal of hope however, as at this point it’s fast enough to climb past both the 7950 and 7970. It will be interesting to see just where the GTX 670 fits in once this bug is fixed, though we’re expecting it to be on the low side of the performance curve relative to the GTX 680.
On that note this is one of a couple of games that really drives a wedge between the GTX 670 and the GTX 570, even with the Kepler issues. The GTX 570 and its increasingly puny 1.25GB of RAM can’t even run this game with our 2560 benchmark settings; meanwhile at 1920 it has enough RAM but the GTX 670 still beats it by 68%.
Batman: Arkham City
Batman: Arkham City is loosely based on Unreal Engine 3, while the DirectX 11 functionality was apparently developed in-house. With the addition of these features Batman is far more a GPU demanding game than its predecessor was, particularly with tessellation cranked up to high.
Arkham City is another good showing for the GTX 670 all around. At 2560 performance is within 3% (2fps) of both the GTX 680 and the Radeon HD 7970. EVGA’s overclock, even if it’s once again only around 3%, is just enough to close that gap and to bring the GTX 670 to parity with the GTX 680 and the 7970. For reasons that aren’t entirely clear Batman isn’t as shader performance bottlenecked as we would have expected, leading to it doing so well compared to the GTX 680 here.
Portal 2 continues the long and proud tradition of Valve’s in-house Source engine. While Source continues to be a DX9 engine, Valve has continued to upgrade it over the years to improve its quality, and combined with their choice of style you’d have a hard time telling it’s over 7 years old at this point. Consequently Portal 2’s performance does get rather high on high-end cards, but we have ways of fixing that…
With this latest generation of high-end cards Portal 2 performance is so high that it’s more than practical to play with SSAA, and that’s where we’re going to focus today. At all resolutions and anti-aliasing options the GTX 670 can surpass the 7970, but this is especially the case with SSAA. At 2560 the GTX 670 is just shy of 60fps, while the 7970 manages only 48fps and the 7950 39fps. SSAA is the ultimate lavish feature, and although Portal 2 is perfectly playable on AMD’s cards without SSAA, SSAA is also the ultimate option for image quality and there’s no good reason not to use it with cards that perform this well.
At this point it’s not entirely clear why the GTX 600 series does so well here (both AMD and NV use SGSSAA), especially given the fact that the Radeons have a memory bandwidth advantage. But regardless of the reason at 2560 the GTX 670 is looking pretty good. For that matter this also happens to be one of the stronger games for EVGA’s GTX 670 Superclocked; the performance gain at 2560 pushes past 4%.
Its popularity aside, Battlefield 3 may be the most interesting game in our benchmark suite for a single reason: it’s the first AAA DX10+ game. It’s been 5 years since the launch of the first DX10 GPUs, and 3 whole process node shrinks later we’re finally to the point where games are using DX10’s functionality as a baseline rather than an addition. Not surprisingly BF3 is one of the best looking games in our suite, but as with past Battlefield games that beauty comes with a high performance cost.
Battlefield 3 has been NVIDIA’s crown jewel; a widely played multiplayer game with a clear lead for NVIDIA hardware. As a result the GTX 670 has another great showing here, easily outperforming AMD’s best. At 2560 with FXAA (and 1920 with MSAA) the GTX 670 has just enough performance to crack 60fps, which means it should be able to keep above 30fps even in larger firefights.
Interestingly enough however this is another game that the GTX 670 does very well at compared to the GTX 680. At 1920 with MSAA in particular the GTX 680 only leads by 3%, reinforcing the fact that because the GTX 670 has all of the GTX 680’s memory bandwidth, the GTX 680 doesn’t have very many tricks up its sleeve to lead with. This also means that the GTX 670 does particularly well here compared to the GTX 570, leading by 55% or more at every resolution and setting.
Our next game is Starcraft II, Blizzard’s 2010 RTS megahit. Much like Portal 2 it’s a DX9 game designed to run on a wide range of hardware so performance is quite peppy with most high-end cards, but it can still challenge a GPU when it needs to.
Starcraft II is another game that just doesn’t seem to be pushing shading or texturing very hard, and as a result the GTX 670 does well for itself here. Against the GTX 680 it’s within 3%, while at 2560 it has a 10% lead over the 7970.
The Elder Scrolls V: Skyrim
Bethesda's epic sword & magic game The Elder Scrolls V: Skyrim is our RPG of choice for benchmarking. It's altogether a good CPU benchmark thanks to its complex scripting and AI, but it also can end up pushing a large number of fairly complex models and effects at once, especially with the addition of the high resolution texture pack.
Skyrim is a game that for inexplicable reasons AMD just has some trouble with that NVIDIA doesn’t, possibly driver overhead. In any case it’s another game where the GTX 670 can take the lead over the 7970, but only at 2560. At 1920 we’re clearly CPU limited even with all of Skyrim’s graphical features turned up.
One interesting thing we do see however is that the GTX 670 is greatly improving on the GTX 570 due to the latter’s lack of memory. 1.25GB is cutting it close here with the high resolution texture pack, which gives the GTX 670 a distinct advantage. Nearly double, in fact.
Our final game, Civilization V, gives us an interesting look at things that other RTSes cannot match, with a much weaker focus on shading in the game world, and a much greater focus on creating the geometry needed to bring such a world to life. In doing so it uses a slew of DirectX 11 technologies, including tessellation for said geometry, driver command lists for reducing CPU overhead, and compute shaders for on-the-fly texture decompression.
On our final test the 7970 sees a slight resurgence compared to the past few games, preventing NVIDIA from sweeping the whole back half of our tests. In any case it’s just enough to leave the GTX 670 trailing the 7970 by 3%, or about 2fps.
It’s interesting to note however that this is one of a couple of games that GTX 670 doesn’t do particularly well at compared to the GTX 500 series. At 2560 it has a 29% lead on the GTX 570, but that’s still the smallest lead out of any game we have tested. More than anything else it seems Civ V really needs more shader performance.
Shifting gears, as always our final set of benchmarks is a look at compute performance. As we have seen with GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others.
Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.
It’s quite shocking to see the GTX 670 do so well here. For sure it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but compared to the GTX 680 it’s only trailing by 4%. This is a test that should cause the gap between the two cards to open up due to the lack of shader performance, but clearly that is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If that’s the case, given AMD’s significant memory bandwidth advantage it certainly helps to cement the 7970’s lead.
Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.
SmallLuxGPU on the other hand finally shows us that larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s larger number of SMXes and higher clockspeed cause the GTX 670 to fall behind by 10%, performing worse than the GTX 570 or even the GTX 470. More so than any other test, this is the test that drives home the point that GK104 isn’t a strong compute GPU while AMD offers nothing short of incredible compute performance.
For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL AES encryption routine that encrypts and decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cypher.
Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. Still, it’s enough to fall behind the GTX 570; but at least it’s enough to beat the 7950. Clockspeeds help as showcased by the EVGA GTX 670SC but nothing really makes up for the missing SMX.
Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
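To make the O(n^2) nearest neighbor idea concrete, here is a minimal CPU-side sketch in Python. It is not the DirectX SDK sample's code (that is an HLSL compute shader); the function name, the 2D positions, and the `tile_size` parameter are our own illustrative assumptions. The tile loop only stands in for the way a GPU thread group would stage a block of particles in shared memory and reuse it for every particle in the group.

```python
def count_neighbors_tiled(positions, radius, tile_size=4):
    """Brute-force O(n^2) neighbor count, walking the particle list in tiles.

    Hypothetical illustration: `tile` plays the role of the shared-memory
    cache that a GPU thread group would load once and reuse.
    """
    n = len(positions)
    counts = [0] * n
    radius_sq = radius * radius
    for tile_start in range(0, n, tile_size):
        # Stand-in for copying one block of particles into shared memory.
        tile = positions[tile_start:tile_start + tile_size]
        for i, (xi, yi) in enumerate(positions):
            for j, (xj, yj) in enumerate(tile, start=tile_start):
                if i == j:
                    continue  # a particle is not its own neighbor
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy < radius_sq:
                    counts[i] += 1
    return counts


particles = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
neighbor_counts = count_neighbors_tiled(particles, radius=1.5, tile_size=2)
```

Every particle is still compared against every other particle, so the work grows with the square of the particle count; tiling changes only how often the data has to be fetched, not how many comparisons are made.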
For reasons we’ve yet to determine, this benchmark strongly dislikes GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and there’s not an incredible gap due to TDP; it just struggles on the GTX 670. As a result performance of the GTX 670 only hits 42% of the GTX 680, which is well below what the GTX 670 should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.
Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.
Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can surpass the GTX 580. At 970 nanoseconds per day the GTX 670 can tie the GTX 580, while the GTX 680 can pull ahead by 6%. Interestingly this benchmark appears to be far more constrained by clockspeed than the number of shaders, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to stick to the entire time.
We’ll also take a quick look at synthetic performance to see if NVIDIA’s choice of clockspeeds and disabling a SMX had any kind of other impact we haven’t anticipated. We’ll start with 3DMark Vantage’s Pixel Fill test.
Pixel Fill is said to be memory bandwidth limited, and for good reason. Even with the lower clockspeed of the GTX 670, the fact that it has memory bandwidth equal to the GTX 680 means that it achieves equal performance in this test, confirming that GTX 670 can utilize its memory bandwidth just as well as GTX 680 can.
Our second test is 3DMark’s Texel Fill test, which as expected shows a moderate gap between the GTX 670 and GTX 680. Thanks to the loss of an SMX and the lower clockspeed the GTX 670 only achieves 85% of the performance of the GTX 680, which thanks to the GTX 670’s more aggressive boost clock is a bit better than what we’d expect from the specifications.
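As a rough back-of-the-envelope check on that 85% figure, the sketch below scales throughput by SMX count times clockspeed. This assumes texturing throughput scales linearly with both, which is a simplification on our part; the helper name is hypothetical, and the clocks are the shipping base and maximum boost clocks listed for our reference cards.

```python
def relative_throughput(smx_a, clock_a_mhz, smx_b, clock_b_mhz):
    """Ratio of card A to card B, assuming throughput ~ SMX count x clock."""
    return (smx_a * clock_a_mhz) / (smx_b * clock_b_mhz)


# GTX 670: 7 SMXes, GTX 680: 8 SMXes.
ratio_at_base = relative_throughput(7, 915, 8, 1006)        # shipping base clocks
ratio_at_max_boost = relative_throughput(7, 1084, 8, 1110)  # shipping max boost clocks

print(f"At base clocks:      {ratio_at_base:.1%}")
print(f"At max boost clocks: {ratio_at_max_boost:.1%}")
```

Under these assumptions the base clocks alone would put the GTX 670 at roughly 80% of the GTX 680, while the maximum boost clocks land much closer to the measured result.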
Our third theoretical test is the set of settings we use with Microsoft’s Detail Tessellation sample program out of the DX11 SDK.
As expected GTX 670 falls behind GTX 680, but again not by nearly as much as we’d expect. The loss of the SMX means GTX 670 lost a Polymorph Engine, but with only a 4% gap with maximum tessellation you’d never be able to tell.
Our final theoretical test is Unigine Heaven 2.5, a benchmark that straddles the line between a synthetic benchmark and a real-world benchmark as the engine is licensed but no notable DX11 games have been produced using it yet.
With Heaven the gap between the GTX 680 and GTX 670 once again opens up similarly to what we saw in our compute benchmarks with SmallLuxGPU. This means the GTX 670 falls behind by about 10%, reflecting the shader-heavy nature of this benchmark.
As always, we’re wrapping up our look at a video card’s stock performance with a look at power, temperature, and noise. As GTX 670 is a lower clocked, lower power, harvested version of GK104, it should do even better than GTX 680 here. Remember, the power target for GTX 670 is only 141W.
|GeForce GTX 600 Series Voltages|
|Ref GTX 670 Boost Load||EVGA GTX 670SC Boost Load||Ref GTX 680 Boost Load||Ref GTX 670 Idle|
Stopping to take a quick look at voltages, we find that NVIDIA hasn’t really adjusted the voltages of GTX 670 compared to GTX 680. Because GTX 670 has a lower maximum boost bin than GTX 680 it ramps up to 1.175v a bit sooner, but otherwise it’s still covering the same range of voltages as opposed to having a lower max voltage to further improve the power consumption. EVGA however does just that, topping out at 1.162v. They’ll need all the power savings they can get since power consumption is inversely proportional to performance under NVIDIA’s power target system.
On that note, before we jump into our graphs we wanted to try something new: a look at the average core clockspeed during our benchmarks. Because of GPU Boost, the boost clock alone doesn’t give us the whole picture, particularly when also looking at factory overclocked cards. So we’ve recorded the clockspeed of our GPU during each of our benchmarks when running them at 2560x1600 and computed the average clockspeed over the duration of the benchmark.
|GeForce GTX 670 Average Clockspeeds|
|GTX 670||EVGA GTX 670SC|
|Max Boost Clock||1084MHz||1188MHz|
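For readers who want to reproduce this kind of measurement, the averaging itself is simple. The sketch below assumes a clockspeed log sampled at a fixed interval; the sample values are hypothetical and are not taken from our runs, and the function names are our own.

```python
def average_clock_mhz(samples):
    """Mean core clock over a benchmark run, given evenly spaced samples."""
    if not samples:
        raise ValueError("need at least one clockspeed sample")
    return sum(samples) / len(samples)


def fraction_above(samples, threshold_mhz):
    """Share of samples strictly above a given clock, e.g. the rated boost clock."""
    if not samples:
        raise ValueError("need at least one clockspeed sample")
    return sum(1 for s in samples if s > threshold_mhz) / len(samples)


# Hypothetical log for illustration only (MHz, one sample per polling interval).
example_log = [1084, 1071, 1058, 1071, 1045, 980]

avg = average_clock_mhz(example_log)
share_above_boost = fraction_above(example_log, threshold_mhz=980)
```

If the polling interval is not constant, each sample would need to be weighted by the time it covers rather than averaged directly.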
In spite of the GTX 670’s boost clock of only 980MHz, we see that it spends almost its entire time above that, and indeed spends its whole time above the base clock. As far as games go, with the exception of Portal 2 it’s always in the mid-1000s, whereas the GTX 680 was only a bit higher at the high 1000s. This is a big part of why the GTX 670 has performed so well in our tests: it may be rated lower, but in fact it can reach clockspeeds very close to the GTX 680.
At the same time this is why we see the EVGA GTX 670 Superclocked struggle to separate itself from the reference GTX 670, in spite of a 6%+ factory overclock. Too much of the time it’s simply not boosting much higher than the reference GTX 670, which limits the performance gains. With GPU Boost in effect this means there is a range of cards and we could be looking at a particularly good reference GTX 670, but whether that’s the case or not the end result is that EVGA’s card can’t do too much amidst the 141W power target limit.
Starting with idle power, there aren’t any major surprises compared to the GTX 680. NVIDIA and AMD have both done such a good job managing their idle power consumption that even with the disabled SMX there’s no measurable difference between video cards. GTX 680 and GTX 670 effectively have the same idle and long idle power consumption.
Moving on to load power consumption we can immediately see the GTX 670’s lower power target come into play. At 317W from the wall it’s 45W less than the GTX 680 for roughly 90% of the gaming performance, and this is in fact lower power consumption than anything except the Radeon HD 7870. Even the GTX 560 Ti (which isn’t in this chart) is higher at 333W, reflecting the fact that GK104 is the true successor to GF114, and which would make GTX 670 the successor to GTX 560 from a design perspective.
Of course this also means that GTX 670 does very well for itself compared to AMD’s cards. We saw GTX 670 slightly outperform 7950 in Metro and here we see it drawing 36W less at the wall at the same time. Or compared to the 7970, 7970 can outperform GTX 670 by about 10%, but we’re drawing 74W more at the wall as a result.
Relative power consumption shifts slightly under OCCT, but the story is otherwise similar. It’s not immediately clear why GTX 670’s power consumption is slightly higher than 7950 even after what we saw with Metro (PowerTune should be capping at 200W), but at least in this test the GTX 670 does end up doing a bit worse. On the other hand it’s still 60W less than the 7970.
Otherwise if we compare the GTX 670 to the GTX 680 we see that the GTX 680 ends up drawing 36W more at the wall, only a bit less than the difference we saw under Metro and quite close to what we’d expect based on the cards’ specifications.
On a related note, pay close attention to NVIDIA’s power target system in action here. In spite of the fact that the GTX 670SC is a binned part running at a lower voltage and higher clockspeeds, it’s drawing 1W less than the reference GTX 670. It would appear that NVIDIA has very good control over their power consumption overall, even if they can only adjust clocks for it a few times per second.
Seeing as how NVIDIA is using roughly the same cooler design with GTX 670 as they are with GTX 680 and they have the same idle power consumption, it should come as no surprise that temperatures are so similar. 32C is among the lowest of the temperatures we’ll see for a blower unless it’s running fast & loud.
Thanks to the lower power consumption of the GTX 670 temperatures have gone down as well. 76C is still 2C warmer than the Radeon HD 7900 series due to AMD’s more aggressive cooler but it’s 2C cooler than the GTX 680. The thermal threshold for GK104 is 98C, which means there’s still over 20C of thermal headroom to play with here.
OCCT pushes temperatures up some, but not by much thanks to NVIDIA’s power target system. Since the GTX 680 tied the 7970 at 79C here, this means the GTX 670 is 2-3C cooler than both the GTX 680 and the 7970, and 0-1C warmer than the 7950.
Unfortunately the impressiveness of the GTX 670 begins to wane some here at noise. NVIDIA tells us that the GTX 670 uses the same fan as the GTX 680 but this seems questionable. 41.7dB is not significantly louder than the GTX 680’s 40.5dB, but there’s a definite difference in pitch compared to the GTX 680. The GTX 670 seems to have a low pitch hum/grind to it that GTX 680 didn’t have.
Meanwhile EVGA’s card fares the worst here, at the precipice of falling out of the quiet zone. Since it’s using what’s fundamentally the same fan and cooler as the reference card, the only practical explanation is that EVGA’s smaller diameter fan intake has a negative impact on fan noise.
Finally, when looking at load noise we see that the GTX 670 doesn’t fare significantly better or worse than the GTX 680 here. At 0.8dB quieter than the GTX 680 the GTX 670 is a smidge quieter, but it’s nothing that’s particularly appreciable. Perhaps the more important fact is that this is 3.8dB quieter than the 7950 and 4.1dB quieter than the 7970, making the GTX 670 notably quieter than either 7900 series card. In practice the only place the GTX 670 loses is oddly enough to the GTX 570. The GTX 570 was 0.7dB quieter than the GTX 670 despite the former’s much higher power consumption. NVIDIA did let the GTX 570 get hotter than the GTX 670 so it looks like the GTX 670’s fan curve is a bit more aggressive than the GTX 570’s.
As for the EVGA GTX 670 Superclocked we’re seeing the same thing that we saw at idle: it’s just a bit louder than the reference GTX 670. This is in spite of the fact that the GTX 670SC’s fan actually went to the same speed as the reference GTX 670’s fan in this test.
NVIDIA’s power target once more makes itself known here, thanks to which power consumption and thereby heat generation increases very little compared to what we saw under Metro. This widens the gap between the GTX 670 and 7900 series, which is now at 5.2dB and 6.2dB compared to the 7950 and 7970 respectively. Or compared to NVIDIA cards this is 1.4dB quieter than the GTX 680 and nearly 6dB quieter than the unthrottled GTX 570.
Meanwhile there’s something very interesting going on with the GTX 670SC that’s a wider reflection of the GTX 670’s reference fan. The fan speed went up but objective noise (A-weighted) went down. Why? That low-pitch hum we mentioned diminishes with a higher fan speed, and as a result the fan gets quieter once it passes a certain threshold. Subjectively we agree with our sound meter: the GTX 670SC sounds quieter here than it does at the lower fan speed it uses for cooling during Metro. We haven’t experienced anything like this with the GTX 680, which makes us further doubt that the fans are identical between the GTX 680 and GTX 670. Close no doubt, but not the same.
Wrapping things up this entire scenario is very similar to how we saw the launch of the GTX 680 play out. NVIDIA has a strong performing part with less noise and less power consumption than either its 500 series predecessors or AMD’s closest competition. This in turn was a big part of what made the GTX 680 so easy to recommend, as better performance with less noise is exactly the kind of direction we like to see video cards move in.
OC: Power, Temperature, & Noise
Our final task is our look at GTX 670’s overclocking capabilities. Based on what we’ve seen thus far with GTX 670, it looks like NVIDIA is binning chips based on functional units rather than clockspeeds. As a result GTX 670 could have quite a bit of overclocking potential, albeit one still limited by the lack of voltage control.
|GeForce 600 Series Overclocking|
|GTX 670||EVGA GTX 670SC||GTX 680|
|Shipping Core Clock||915MHz||967MHz||1006MHz|
|Shipping Max Boost Clock||1084MHz||1188MHz||1110MHz|
|Shipping Memory Clock||6GHz||6GHz||6GHz|
|Shipping Max Boost Voltage||1.175v||1.162v||1.175v|
|Overclock Core Clock||1065MHz||1042MHz||1106MHz|
|Overclock Max Boost Clock||1234MHz||1263MHz||1210MHz|
|Overclock Memory Clock||6.9GHz||6.6GHz||6.5GHz|
|Overclock Max Boost Voltage||1.175v||1.162v||1.175v|
Because of the wider gap between base clock and boost clock on the GTX 670 we see that it doesn’t overclock quite as far as GTX 680 from a base clock perspective, but from the perspective of the maximum boost clock we’ve slightly exceeded the GTX 680. Depending on where a game lands against NVIDIA’s power targets this can either mean that an overclocked GTX 670 is faster or slower than an overclocked GTX 680, but at the same time it means that overclocking potential is clearly there.
We’re also seeing another strong memory overclock out of a GK104 card here. GTX 680 only hit 6.5GHz while GTX 690 could hit 7GHz. GTX 670 is only a bit weaker at 6.9GHz, indicating that even with the relatively small PCB NVIDIA can still exceed the high memory clocks they were shooting for. At the same time however this is a luck of the draw matter.
The EVGA card meanwhile fares both worse and better. Its gap between the base clock and maximum boost clock is even larger than the reference GTX 670, leading to it having an even lower overclocked base clock but a higher overclocked maximum boost clock. The real limiting factor however is that it couldn’t reach a memory overclock quite as high as the reference GTX 670 – again, luck of the draw – which means it can’t match the overclocked reference GTX 670 as it’s going to be more memory bandwidth starved more often.
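The relationships described above are easier to see once the offsets are worked out from the overclocking table. The short sketch below does that arithmetic using the shipping and overclocked clocks we listed; the dictionary layout and helper names are our own.

```python
# (shipping, overclocked) values taken from the overclocking table above.
cards = {
    "GTX 670":        {"base_mhz": (915, 1065),  "boost_mhz": (1084, 1234), "mem_ghz": (6.0, 6.9)},
    "EVGA GTX 670SC": {"base_mhz": (967, 1042),  "boost_mhz": (1188, 1263), "mem_ghz": (6.0, 6.6)},
    "GTX 680":        {"base_mhz": (1006, 1106), "boost_mhz": (1110, 1210), "mem_ghz": (6.0, 6.5)},
}


def offset(pair):
    """Absolute gain from shipping to overclocked."""
    shipping, overclocked = pair
    return overclocked - shipping


def gain_pct(pair):
    """Relative gain from shipping to overclocked, in percent."""
    shipping, overclocked = pair
    return 100.0 * (overclocked - shipping) / shipping


for name, clocks in cards.items():
    print(
        f"{name}: core offset +{offset(clocks['base_mhz'])}MHz "
        f"({gain_pct(clocks['base_mhz']):.1f}% base, "
        f"{gain_pct(clocks['boost_mhz']):.1f}% max boost), "
        f"memory +{gain_pct(clocks['mem_ghz']):.1f}%"
    )
```

The same core offset applies to both the base and maximum boost clock on each card, which is why the card with the largest shipping gap between the two keeps that gap after overclocking.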
Moving on to our performance charts, we’re going to once again start with power, temperature, and noise, before moving on to gaming performance. We’ll be testing our GTX 670 cards at both stock clocks with the maximum power target of 122% (170W) to showcase what is possible at validated clockspeeds with a higher power cap, and a true overclock with a maximum power target along with the largest clock offsets we can achieve.
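As a quick sanity check on the numbers above, raising a 141W power target to 122% works out as follows. This is plain arithmetic on the figures quoted in this article, not anything read from NVIDIA's tools, and the function name is our own.

```python
def scaled_power_target_w(base_target_w, percent):
    """Power target in watts after applying a percentage setting."""
    return base_target_w * percent / 100.0


stock_target_w = 141.0   # GTX 670 power target
max_setting_pct = 122.0  # maximum power target setting

max_target_w = scaled_power_target_w(stock_target_w, max_setting_pct)
headroom_w = max_target_w - stock_target_w

print(f"Maximum power target: about {max_target_w:.0f}W (+{headroom_w:.0f}W over stock)")
```

The result lands at about 172W, in line with the roughly 170W figure quoted above, and the roughly 30W of extra headroom is all an overclock has to work with.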
Not surprisingly, since we’re almost always operating within the realm of the power target as opposed to the TDP on the GTX 600 series, our power consumption closely follows our chosen power target. Cranking up the power target on the GTX 670 for example to 170W puts us within 6W of the GTX 680, which itself had a 170W power target in the first place. This is true for both Metro and OCCT, which means power consumption is very predictable when doing any kind of overclocking.
This also means that power consumption is still 18W-30W below the 7970, which in turn means that if these overclocks can close the performance gap, then the GTX 670 still has a power consumption advantage.
As to be expected, with an increase in power consumption comes an increase in load temperatures. However the fact that we’re only able to increase power consumption by about 30W means the temperature rise is limited to 4-5C, pushing temperatures into the low 80s. This does end up being warmer than the equivalent GTX 680 however due to the 680’s superior heatsink.
Finally, when it comes to noise we’re also seeing the expected increase, but again it’s rather small. Under Metro the amount of noise from the reference GTX 670 rises by under 3dB when pushing the power target higher on its own, while it rises 3dB when adding in our full overclock. Again the smaller cooler means that the GTX 670’s fan has to work harder here, which means our gaming performance may be able to reach the GTX 680, but our noise is going to slightly exceed it. As a point of reference, in the process we’ll also exceed the GTX 580’s noise levels under Metro. Still, in both OCCT and Metro none of our GTX 670 cards exceed the Radeon HD 7900 series, which means we've managed to increase our performance relative to those cards without breaching the level of noise they generate in the first place.
OC: Gaming Performance
We’ll keep the running commentary short here, but depending on how shader bottlenecked any individual game is, it’s possible for GTX 670 to beat a stock GTX 680 with just an increase of the power target. Without that pesky 8th SMX drawing power this leaves more power for increasing clockspeeds, which helps games that are more bottlenecked by the ROPs and/or GPCs.
Conversely, if a game is extremely shader bound (such as Portal 2), then only a full overclock can make up for that 8th SMX on GTX 680.
Looking at this data I’m reminded a great deal of the Radeon HD 6900 series launch. AMD launched the 6900 series after the GTX 500 series, but launch order aside the end result was very similar. NVIDIA’s second tier GTX 570 and AMD’s first tier Radeon HD 6970 were tied on average but were anything but equals. This is almost exactly what we’re seeing with the GTX 670 and the Radeon HD 7970.
Depending on the game and resolution we’re looking at the GTX 670 reaches anywhere between 80% and 120% of the 7970’s performance. AMD sails by the GTX 670 in Crysis and to a lesser extent Metro, only for the GTX 670 to shoot ahead in BF3 and Portal 2 (w/SSAA). Officially NVIDIA’s positioning on the GTX 670 is that it’s to go against the 7950 and not the 7970, and that’s a wise move on NVIDIA’s behalf; but the GTX 670 is surely nipping at the 7970’s heels.
With that said, there are a couple of differences from the 6900 series launch which are equally important. The first is that unlike last time the GTX 670 and Radeon HD 7970 are not equally priced. At MSRP the GTX 670 is $80 cheaper, while at cheapest retail it’s closer to $60. The second difference is that this time the competing cards are not nearly as close in power consumption or noise, and thanks to GK104 NVIDIA has a notable advantage there.
Much like the GTX 570 and the Radeon HD 6970, if you’re in the market for cards at these performance levels you need to take a look at both cards and see what kind of performance each card gets on the games you want to play. From our results the GTX 670 is doing better at contemporary games and is cheaper to boot, but the Radeon HD 7970 can hold its own here at multi-monitor resolutions and games like Crysis or Metro. Or for that matter it can still run circles around the GTX 670 in GK104's real weakness: compute tasks.
On the other hand if you’re buying a gaming card on price then this isn’t a contest. For the Radeon HD 7950 this is the GTX 680 all over again. NVIDIA can’t quite beat the 7950 in every game (e.g. Crysis), but when it loses it’s close, and when it wins it’s 15%, 25%, even 50% faster. At the same time gaming power consumption is also lower as is noise. As it stands the worst case scenario for the GTX 670 is that it performs like a 7950 while the best case scenario is that it performs like a 7970. And it does this priced like a 7950, which means that something is going to have to give the moment NVIDIA’s product supply is no longer in question.
Outside of the obligatory AMD matchup, interestingly enough NVIDIA has put themselves in harm’s way here in the process. At 2560x1600 the GTX 680 only beats the GTX 670 by 7% on average. NVIDIA has always charged a premium for their top card but the performance gap has also been greater. In games that aren’t shader bound the GTX 670 does very well for itself thanks to the fact that it has equal memory bandwidth and only a slight ROP performance deficit, which means the GTX 680 is only particularly strong in Metro, Portal 2, and DiRT 3. The 7% performance lead certainly doesn’t justify the 25% price difference, and if you will give up that performance NVIDIA will shave $100 off of the price of a card, but if you do want that top performance NVIDIA intends to make you pay for it. Of course this is also why the GTX 670 is only priced $100 cheaper rather than $150. Potential buyers looking for a $350 GK104 card are going to be left out in the cold for now, particularly buyers looking for a meaningful GTX 570 upgrade.
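To put the value argument in numbers, the sketch below compares performance per dollar. It assumes a $400 GTX 670 and a $500 GTX 680 (the latter inferred from the $100 and 25% figures above) and uses the 7% average lead at 2560x1600; the helper names are our own.

```python
def price_premium_pct(price_high, price_low):
    """How much more the pricier card costs, in percent."""
    return 100.0 * (price_high - price_low) / price_low


def perf_per_dollar(relative_perf, price):
    """Relative performance delivered per dollar spent."""
    return relative_perf / price


gtx670_price, gtx680_price = 400.0, 500.0  # assumed prices, see lead-in
gtx670_perf, gtx680_perf = 1.00, 1.07      # GTX 680 leads by 7% on average

premium = price_premium_pct(gtx680_price, gtx670_price)
value_ratio = perf_per_dollar(gtx670_perf, gtx670_price) / perf_per_dollar(gtx680_perf, gtx680_price)

print(f"GTX 680 price premium: {premium:.0f}%")
print(f"GTX 670 performance per dollar vs GTX 680: {value_ratio:.2f}x")
```

Under these assumptions the GTX 670 delivers roughly 17% more performance per dollar than the GTX 680 at 2560x1600.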
Finally, the nature of NVIDIA’s power target technology has put partners like EVGA in an odd place. Even with a moderate 6%+ factory overclock the GTX 670 Superclocked just isn’t all that much faster than the reference GTX 670, averaging only a 3% gain at 2560. Since the GTX 670 virtually always operates above its base clock the culprit is NVIDIA’s power target, which keeps the GTX 670SC from boosting much higher than our reference GTX 670. Once you increase the power target the GTX 670SC can easily make an interesting niche for itself, but while this isn’t true overclocking it isn’t stock performance either. In any case it’s clear that for factory overclocked cards to really push the limit they’re going to need to go fully custom, which is what a number of partners are going to do in the coming months.