Original Link: http://www.anandtech.com/show/2778
Overclocking Extravaganza: GTX 275's Complex Characteristics
by Derek Wilson on June 4, 2009 5:35 AM EST
After our in-depth look at overclocking with AMD's Radeon HD 4890, many of our readers wanted to see the same thing done with NVIDIA's GTX 275. We had planned on looking at both parts from the beginning, but we knew each review would take a bit of time and effort to design and put together. Our goal has been to design tests that best show the particular overclocking characteristics of the different hardware, and shoehorning all that into one review would be difficult. Different approaches are needed to evaluate overclocking with AMD and NVIDIA hardware.
For our AMD tests, we only needed to worry about memory and core clock speed. This gave us some freedom to look at clock scaling in order to better understand the hardware. On the other hand, NVIDIA divides their GPU up a bit more and has another, higher speed, clock domain for shader hardware. Throwing another variable in there has a multiplicative impact on our testing, and we had a hard time deciding what tests really mattered. If we had simply used the same approach we did with the 4890 article, we would have ended up with way too much data to easily present or meaningfully analyze.
We've kept a few key test points, as we will look at each clock at the highest speed we could achieve on its own (all other clocks set at stock speeds). We will also look at performance with all clocks set to the maximum we could hit. Beyond this, rather than looking at how performance scales over clock speed with memory and shader at their maximum and looking at how performance scales over shader speed with memory and core at their maximum, we decided it would be cleaner to look at just one more configuration. For this test, we chose core and shader speed at maximum with memory at stock.
As with the previous look at overclocking, we present our analysis based on percent increases in performance but provide the raw data as well. It's all pretty straightforward with the raw data, and we do include our highly overclocked 4890 as well as the 900MHz core clocked 4890 that can be picked up pre-overclocked from the manufacturer. For the bulk of the article, we will just be considering the impact of overclocking on the GTX 275, but our conclusion will compare AMD and NVIDIA on overclocking in this segment.
The clock speeds we were able to pull out of our GTX 275 were not too shabby as far as overclocks go. Our core clock speed could have been better, but otherwise we did pretty well. Here is what we will be looking at today:
Core: 702MHz (vs. 633MHz stock)
Memory: 1296MHz (vs. 1134MHz stock)
Shader: 1656MHz (vs. 1404MHz stock)
These are 10.9, 14.3, and 17.9 percent increases respectively. First up, we'll look at the impact of overclocking the memory, then we'll move on to core and shader. After that it's on to fully overclocked and our core/shader combined overclock.
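For readers who want to check the arithmetic on their own cards, here is a minimal Python sketch of how those percentages fall out of the clocks listed above. The dictionary and function names are ours, chosen purely for illustration.

```python
# Stock and overclocked clocks in MHz, taken from the list above.
CLOCKS_MHZ = {
    "core": (633, 702),
    "memory": (1134, 1296),
    "shader": (1404, 1656),
}


def percent_increase(stock, overclocked):
    """Return the clock increase over stock, in percent."""
    return (overclocked / stock - 1.0) * 100.0


# Percent increase for each clock domain.
gains = {name: percent_increase(*pair) for name, pair in CLOCKS_MHZ.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}%")
```

Swapping in the clocks from a different sample gives the equivalent figures for that card.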
Data availability is important in the performance of GPUs, and AMD and NVIDIA pack huge amounts of bandwidth into their designs in order to accommodate this need. While AMD's high end parts have moved over to the newer, less tested, GDDR5, NVIDIA will stick with GDDR3 until at least their next architecture revision (though it is still unclear exactly what memory technologies NVIDIA will support beyond the current generation). This does mean that NVIDIA needs twice the number of pins to achieve the same bandwidth (at the same clock speed), but this isn't a huge problem for the already monolithic G80 and GT200 based GPUs.
With the 448-bit wide connection to GDDR3 memory, NVIDIA's GTX 275 needs to run its RAM at a higher clock speed in order to achieve the same data rate the Radeon 4890 can hit with its 256-bit GDDR5 bus. Certainly fast GDDR3 has had time to mature and is highly available. This and the fact that demand is still much higher for GDDR3 mean that NVIDIA is saving some money on competitive memory subsystems. But needing a higher baseline clock speed to compete with AMD's solution could mean less overclockability overall.
We were able to get a greater than 23% clock speed increase out of our 4890, but the best we could manage between a couple of GTX 275 samples was a little more than 14%. Starting out with very nearly the same memory bandwidth, our overclocked AMD part comes out ahead in absolute terms.
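To put rough numbers on that comparison, the sketch below estimates peak memory bandwidth from bus width and memory clock. It rests on a few assumptions that are not stated in this article: the Radeon HD 4890's stock memory clock of 975MHz comes from its public specifications, GDDR5 is treated as moving four transfers per quoted memory clock against two for GDDR3, and the 4890 overclock is taken as exactly 23% even though we measured slightly more. The function name is ours.

```python
def bandwidth_gb_s(bus_width_bits, memory_clock_mhz, transfers_per_clock):
    """Peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


GDDR3_TRANSFERS = 2  # double data rate
GDDR5_TRANSFERS = 4  # assumption: four transfers per quoted memory clock

# GTX 275: 448-bit GDDR3, clocks from this article.
gtx275_stock = bandwidth_gb_s(448, 1134, GDDR3_TRANSFERS)
gtx275_oc = bandwidth_gb_s(448, 1296, GDDR3_TRANSFERS)

# Radeon HD 4890: 256-bit GDDR5.
# Assumption: 975MHz stock memory clock (public spec, not from this article).
hd4890_stock = bandwidth_gb_s(256, 975, GDDR5_TRANSFERS)
# Assumption: treat the "greater than 23%" overclock as exactly 23%.
hd4890_oc = hd4890_stock * 1.23

for label, value in [
    ("GTX 275 stock", gtx275_stock),
    ("GTX 275 overclocked", gtx275_oc),
    ("HD 4890 stock", hd4890_stock),
    ("HD 4890 overclocked (at least)", hd4890_oc),
]:
    print(f"{label}: {value:.1f} GB/s")
```

Under those assumptions the two cards start within a couple of GB/s of each other, and the larger percentage overclock on the 4890 is what puts it ahead in absolute bandwidth.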
It is important to remember, however, that absolute bandwidth doesn't matter as much as how well the bandwidth matches the demand of the GPU. This isn't something we can easily ascertain, but our look at the impact of only overclocking memory certainly shows that the bandwidth NVIDIA chose for the GTX 275 is a good match for the core and shader clock speeds with which it is paired.
[Charts at 1680x1050, 1920x1200, and 2560x1600]
We will be digging deeper into how memory speed impacts performance after we look at the rest of our scaling tests, but without any other assistance, just overclocking memory is not going to gain a lot for the GTX 275.
After G80 hit (the first NVIDIA GPU to employ a separate clock domain for shaders), silent shader clock speed increases were made with any core clock speed increase. At first this made sense because NVIDIA only exposed the ability to adjust core and memory clock and the shader clock was not directly adjustable by the end user. Of course, we went to some trouble back then to try our hand at BIOS flashing for shader overclocking. After NVIDIA finally exposed a separate shader adjustment, they still tied core clock and shader clock to some degree.
Since the middle of last year, NVIDIA's driver based clock speed adjustments have been "unlinked," meaning that the shader clock is not affected by the core clock as it used to be. This certainly makes things a lot easier for us, and we'll start by testing out core clock speed adjustment.
The maximum core clock we could hit on our reference GTX 275 was 702MHz. Try as we might, we just could not get it stable beyond that speed. But it's still a good enough overclock for us to get a good handle on scaling. We know some people have GTX 275 parts that will get up toward 750MHz, so it is possible to get more speed out of this. Still, we have an 11% increase in core clock speed which should net us some decent results.
[Charts at 1680x1050, 1920x1200, and 2560x1600]
Call of Duty edges up toward the theoretical maximum but drops off at 2560x1600, which is much more resource intensive. Interestingly, most of the other games see more benefit at the highest resolution we test, hitting over 5% there but generally between 2 and 5 percent at lower resolutions. FarCry 2 and Fallout 3 seem not to gain as much benefit from core overclocking as our other tests.
It could be that we aren't seeing numbers closer to the theoretical maximums because there is a bottleneck either in memory or in the shader hardware. This makes analysis a little more complex than with the AMD part, as there are more interrelated factors. Some aspects of a game could be accelerated, but if a significant amount of work is going on elsewhere, we'll still be waiting on one of the other subsystems.
Let's move on to the last independent aspect of the chip and then bring it all together.
We were able to really crank up the shader core on our GTX 275, hitting almost an 18% increase in clock speed with our 1656MHz shader clock. This is a pretty huge overclock for an already highly clocked aspect of the hardware. While our core clock speed couldn't quite reach what other overclockers have managed, our shader overclock made up for this and was pretty high compared to what else we've seen out there on stock cooling.
In shader heavy applications, we should see a significant benefit from this, but the reality of the situation is a little bit disappointing.
[Charts at 1680x1050, 1920x1200, and 2560x1600]
At best we see about a 7.75 percent improvement in Age of Conan at 2560x1600. This certainly doesn't come near our 18% theoretical maximum. Most of our other tests don't even see the type of performance they got from a much more modest boost in core clock speed. In fact, in a couple cases it makes more sense to overclock the memory than the shader core.
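One way to express how far short of the theoretical maximum a result falls is to divide the realized performance gain by the clock gain. The short Python sketch below does that for our best shader-only result, using only figures from this page. "Scaling efficiency" is our own label for the ratio, not a standard metric.

```python
def scaling_efficiency(performance_gain_pct, clock_gain_pct):
    """Fraction of a clock increase that showed up as extra performance."""
    return performance_gain_pct / clock_gain_pct


# Shader clock gain from the clocks in this article: 1404MHz -> 1656MHz.
shader_clock_gain_pct = (1656 / 1404 - 1.0) * 100.0

# Best shader-only result: roughly 7.75% in Age of Conan at 2560x1600.
best_case = scaling_efficiency(7.75, shader_clock_gain_pct)

print(f"Best-case shader scaling efficiency: {best_case:.0%}")
```

A value of 1.0 would mean performance scaled perfectly with the clock; our best shader-only case recovers well under half of the clock increase.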
If we take everything separately, the prospects for getting good performance improvement out of the GTX 275 don't look that great. But, even more so than with the 4890, putting overclocking together right can make huge gains in realized performance.
Bringing it All Together: Everything OC'd
Core, shader and memory overclocking didn't produce dramatic results on their own, but when we put them all together we get quite a different picture. It is hard to set an upper limit on the maximum performance improvement when several performance-limiting factors can interact, and adding more factors complicates things further. I'm not a statistician or mathematician, but it is logical that we could never see a performance improvement greater than the product of the separate percent improvements to each subsystem (i.e. overclocked performance must be less than (stock performance) * 1.11 * 1.143 * 1.179).
The actual limit is lower than the 50% potential gain this implies. Getting the maximum benefit from a subsystem requires that it be the sole significant bottleneck, so there is no way to get the maximum benefit from every subsystem simultaneously. I'm not sure how to model anything this complex, especially considering the fact that the performance of any one subsystem affects the efficiency of the other two. Please feel free to school me in the comments on this one.
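The product bound is easy to reproduce. This small Python sketch multiplies the three clock ratios from our overclock. It only computes the loose ceiling described above and makes no attempt to model how the subsystems actually interact.

```python
import math

# Overclocked / stock ratio for each clock domain (MHz values from this article).
clock_ratios = {
    "core": 702 / 633,
    "memory": 1296 / 1134,
    "shader": 1656 / 1404,
}

# Loose ceiling: every subsystem contributes its full clock gain at once.
upper_bound = math.prod(clock_ratios.values())

print(f"Product of clock ratios: {upper_bound:.3f}")
print(f"Implied ceiling on performance gain: {(upper_bound - 1.0) * 100.0:.1f}%")
```

With unrounded ratios the ceiling comes out just under 50%, in line with the rounded figures above.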
But the proof that you can get huge returns on overclocking is in the pudding.
[Charts at 1680x1050, 1920x1200, and 2560x1600]
Call of Duty and Race Driver GRID get over a 30% boost at 1680x1050 when everything is overclocked simultaneously. Everything else sees respectable gains at 1680x1050, while these huge boosts go away at higher resolution. An overall gain of 10% to 15% at 2560x1600 isn't too shabby at all, but it doesn't live up to the potential we clearly see in some of our other tests.
The complexity of the factors that go into these performance differences deserves a little more investigation. So we'll look at a few more tests before we throw out our raw numbers.
Pulling it Back Apart: Performance Interactions
Rather than test every combination of clock speeds and look at scaling as we did in our Radeon HD 4890 overclocking article, we wanted a streamlined way to get a better idea of how combinations of clock domain overclocking could help. Our solution was to add only one test configuration and use multiple comparison points to get a better idea of the overall impact of changing multiple clocks at a time.
Testing our hardware while overclocking both the core clock and the shader clock gives us four more key comparisons that fill in the gaps between what we've already seen and how the different aspects of the hardware interact with each other. First, and most obviously, we can see how much performance improvement we get beyond stock when overclocking both core and shader clocks.
[Charts at 1680x1050, 1920x1200, and 2560x1600]
These results are certainly interesting, showing, in general, less benefit from moving to 2560x1600 when the GPU is overclocked. We also see less improvement at lower resolution where memory performance isn't as large an issue in the first place (it seems to become even less important). But at 1920x1200, overclocking memory has a higher impact when the GPU is fully overclocked. So at lower resolutions, memory speed isn't as important anyway and the GPU overclock has the prevailing benefit on overall speed. This makes sense. So does the increasing performance at 1920x1200. But the fact that the performance improvement we can attribute to faster memory at 2560x1600 is lower with faster core and shader clocks is a bit of an enigma.
While we can get a better feel for the effects of tweaking different aspects of the chip through these glimpses into scaling, it's still not possible from this data to definitively pin down the interactions between core, shader and memory clock speed. The benefit to different games is dependent on their demand for resources, and there's no real formula for knowing what you will get out.
But the thing to take away is that overclocking the GTX 275 should be done with balance between the three clocks in mind. No single aspect is a magic bullet, and NVIDIA has balanced things pretty well already. Maintaining the balance is the key to extracting good performance improvement when overclocking the GTX 275.
That sums up our analysis of overclocking the GTX 275. The following pages are our raw data for those more interested in direct/absolute comparisons.
Raw Performance Data
Here are the graphs with raw data for all seven of the games we tested. Once again we've got our standard bar graphs at each resolution. We opted not to include the scaling graphs this time around, as we didn't feel they added that much and they can get a little cluttered. All the numbers are here, and as we can see, the GTX 275 does a good job of staying at or near the top when it is fully overclocked. The Radeon HD 4890 doesn't always lead the GTX 275, but it does come in on top a fair amount as well. Often these cards trade blows at different resolutions.
[Charts at 1680x1050, 1920x1200, and 2560x1600]
At idle, the GTX 275 does very well, drawing less power than all the other boards we tested, even when overclocked.
Crank up the load, however, and the GTX 275 becomes a bit of a power hog. Overclocked power draw is higher than everything else we tested.
When it comes to overclocking, NVIDIA's GeForce GTX 275 is a capable part. Other people out there have gotten higher core speeds than we did but we achieved a very high shader clock. Our memory speed was also decent. And with this test the overclocked GTX 275 shows it can compete with an overclocked Radeon HD 4890: the GTX 275 never trailed NVIDIA's own GTX 285 while the Radeon HD 4890 fell behind in a couple instances.
At the same time, while the 4890 doesn't always lead the GTX 285, when the 4890 does lead it typically leads the fully overclocked GTX 275 as well. These parts trade blows, and it is difficult to say that one is hands down better than the other. It really depends on the game and what you need out of the hardware.
We did find the AMD hardware easier to quickly and efficiently tweak. Some of that may have to do with the fact that we only had two sliders to worry about, but the built-in stress test is a nice plus. Of course, NVIDIA automatically tests the clocks when you try to set them, but it seemed like it would sometimes arbitrarily decide not to let us set a clock speed we had set before. Setting three different clocks can give us more control over the overclocking process, but that can be tough when we still don't know exactly the best way to balance these.
Our suggestion is to start with the core and get it as high as you can. After that, crank up your shader. Last would be memory. But you will definitely want to make sure you've got all three turned up at least a little to get the amplifying benefit these hardware resources have on each other.
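The order suggested above can be written down as a simple search loop. The Python sketch below is only an illustration of the procedure and does not talk to any real driver or overclocking tool. `is_stable` is a hypothetical callback standing in for whatever stress test you trust, and the step size, search ceilings and `fake_is_stable` stand-in are made up for the example (the stand-in simply treats the maximums our sample reached as the stability limit).

```python
from typing import Callable, Dict, Sequence

Clocks = Dict[str, int]  # clock domain name -> clock in MHz


def raise_clock(clocks: Clocks, domain: str, limit_mhz: int, step_mhz: int,
                is_stable: Callable[[Clocks], bool]) -> Clocks:
    """Step one clock domain up until the next step fails or hits the limit."""
    best = dict(clocks)
    while best[domain] + step_mhz <= limit_mhz:
        trial = dict(best)
        trial[domain] += step_mhz
        if not is_stable(trial):
            break  # keep the last setting that passed
        best = trial
    return best


def tune(stock: Clocks, limits: Clocks, step_mhz: int,
         is_stable: Callable[[Clocks], bool],
         order: Sequence[str] = ("core", "shader", "memory")) -> Clocks:
    """Raise each domain in turn: core first, then shader, then memory."""
    clocks = dict(stock)
    for domain in order:
        clocks = raise_clock(clocks, domain, limits[domain], step_mhz, is_stable)
    return clocks


# Hypothetical stand-in for a real stress test: it accepts any setting at or
# below the maximums our sample reached in this article.
SAMPLE_MAX = {"core": 702, "shader": 1656, "memory": 1296}


def fake_is_stable(clocks: Clocks) -> bool:
    return all(clocks[name] <= SAMPLE_MAX[name] for name in clocks)


stock = {"core": 633, "shader": 1404, "memory": 1134}
search_ceiling = {"core": 800, "shader": 1800, "memory": 1400}  # arbitrary

tuned = tune(stock, search_ceiling, step_mhz=3, is_stable=fake_is_stable)
print(tuned)
```

In practice a setting that passes on its own may stop being stable once the other clocks go up, so it is worth re-running the stress test on the final combination and backing off if needed.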
Those hoping we would definitively recommend either NVIDIA or AMD for overclocking will be disappointed, as this just isn't the blowout necessary for that. Both camps have pluses and minuses, and readers should take that into account when making a purchasing decision. Additionally, we do want to stress that every retail card is different; there may be some AMD parts that can overclock higher than the one we tested, and there may be NVIDIA hardware that can achieve better results as well.
Regardless of what else comes into play, overclocking looks good on both the Radeon HD 4890 and the GeForce GTX 275.