That Darn Memory Bus

Among the entire GTX 600 family, the GTX 660 Ti’s one unique feature is its memory controller layout. NVIDIA built GK104 with 4 memory controllers, each 64 bits wide, giving the entire GPU a combined memory bus width of 256 bits. These memory controllers are tied into the ROPs and L2 cache, with each controller forming part of a ROP partition containing 8 ROPs (or rather 1 ROP unit capable of processing 8 operations), 128KB of L2 cache, and the memory controller. Disabling any one of those elements means taking out a whole ROP partition, which is exactly what NVIDIA has done.
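
To put the subtraction in concrete terms, here is a trivial sketch that just multiplies out the per-partition figures above; the Python grouping is purely illustrative and nothing in it goes beyond the numbers already stated.

```python
# Per-partition resources on GK104, as described above: each ROP partition
# pairs 8 ROPs and 128KB of L2 with a 64-bit memory controller.
PARTITION = {"rops": 8, "l2_kb": 128, "bus_bits": 64}

def gpu_totals(active_partitions):
    # Scale the per-partition figures by the number of enabled partitions.
    return {k: v * active_partitions for k, v in PARTITION.items()}

print(gpu_totals(4))  # full GK104 / GTX 680: {'rops': 32, 'l2_kb': 512, 'bus_bits': 256}
print(gpu_totals(3))  # GTX 660 Ti, one partition disabled: {'rops': 24, 'l2_kb': 384, 'bus_bits': 192}
```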

The impact on the ROPs and the L2 cache is rather straightforward – render operation throughput is reduced by 25% and there’s 25% less L2 cache to store data in – but the loss of the memory controller is a much tougher concept to deal with. This goes both for NVIDIA on the design end and for consumers on the usage end.

256 is a nice power-of-two number. For video cards with power-of-two memory bus widths, it’s very easy to equip them with a similarly power-of-two memory capacity such as 1GB, 2GB, or 4GB of memory. For various minor technical reasons (mostly the sanity of the engineers), GPU manufacturers like sticking to power-of-two memory buses. And while this is by no means a true design constraint in video card manufacturing, there are ramifications for deviating from it.

The biggest consequence of deviating from a power-of-two memory bus is that under normal circumstances this leads to a card’s memory capacity not lining up with the bulk of the cards on the market. To use the GTX 500 series as an example, NVIDIA had 1.5GB of memory on the GTX 580 at a time when the common Radeon HD 5870 had 1GB, giving NVIDIA a 512MB advantage. Later on, however, the common Radeon HD 6970 had 2GB of memory, leaving NVIDIA behind by 512MB. This had one additional consequence for NVIDIA: they needed 12 memory chips where AMD needed 8, which generally inflates the bill of materials more than the price of higher speed memory on a narrower bus does. This ended up not being a problem for the GTX 580, since 1.5GB was still plenty of memory for 2010/2011 and the high price tag could easily absorb the BoM hit, but this is not always the case.

Because NVIDIA has disabled a ROP partition on GK104 in order to make the GTX 660 Ti, they’re dropping from a power-of-two 256bit bus to an off-size 192bit bus. Under normal circumstances this means that they’d need to either reduce the amount of memory on the card from 2GB to 1.5GB, or double it to 3GB. The former is undesirable for competitive reasons (AMD has 2GB cards below the 660 Ti and 3GB cards above), not to mention the fact that 1.5GB is too small for a $300 card in 2012. The latter, on the other hand, incurs the BoM hit as NVIDIA moves from 8 memory chips to 12 memory chips, a scenario that the lower-margin GTX 660 Ti can’t as easily absorb, not to mention how silly it would be for a GTX 680 to have less memory than a GTX 660 Ti.
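
For a rough sense of why those are the "normal" options, here’s a quick back-of-the-envelope calculation. It assumes the standard 32-bit GDDR5 device interface – 2 chips per 64-bit controller, or 4 in clamshell mode – along with the 2Gb (256MB) chips NVIDIA uses elsewhere in the GTX 600 series.

```python
# Capacity math for the usual symmetric configurations, assuming 32-bit GDDR5
# devices (2 per 64-bit controller, or 4 in clamshell mode) and 2Gb (256MB) chips.
CHIP_MB = 256  # one 2Gb GDDR5 chip

def capacity_gb(bus_bits, chips_per_controller):
    controllers = bus_bits // 64
    return controllers * chips_per_controller * CHIP_MB / 1024

print(capacity_gb(256, 2))  # GTX 680:  8 chips -> 2.0GB
print(capacity_gb(192, 2))  # 192bit:   6 chips -> 1.5GB
print(capacity_gb(192, 4))  # 192bit:  12 chips -> 3.0GB
```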

Rather than take either of the usual routes, NVIDIA is going to take a third route of their own: put 2GB of memory on the GTX 660 Ti anyhow. By putting more memory on one controller than the other two – in effect breaking the symmetry of the memory banks – NVIDIA can have 2GB of memory attached to a 192bit memory bus. This is a technique that NVIDIA has had available to them for quite some time, but it’s also something they rarely pull out, and only when necessary.

We were first introduced to this technique with the GTX 550 Ti in 2011, which used a similarly off-size 192bit memory bus. By using a mix of 2Gb and 1Gb modules, NVIDIA could outfit the card with 1GB of memory rather than the 1.5GB/768MB that a 192bit memory bus would typically dictate.

For the GTX 660 Ti in 2012, NVIDIA is once again going to use their asymmetrical memory technique in order to outfit the card with 2GB of memory on a 192bit bus, but they’re going to be implementing it slightly differently. Whereas the GTX 550 Ti mixed memory chip densities in order to get 1GB out of 6 chips, the GTX 660 Ti will mix up the number of chips attached to each controller in order to get 2GB out of 8 chips. Specifically, there will be 4 chips attached to one of the memory controllers, while the other two controllers will have the usual 2 chips each. Doing it this way allows NVIDIA to use the same Hynix 2Gb chips they already use in the rest of the GTX 600 series, with the only high-level difference being the width of the bus connecting them.
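
As a minimal sketch of what that arrangement works out to per controller, using only the chip counts and 2Gb density described above:

```python
# The asymmetric arrangement described above: the same 2Gb (256MB) chips,
# but one 64-bit controller carries 4 of them while the other two carry 2.
CHIP_MB = 256
chips_per_controller = [2, 2, 4]   # the three remaining controllers on the cut-down GK104

per_controller_mb = [n * CHIP_MB for n in chips_per_controller]
print(per_controller_mb)        # [512, 512, 1024]
print(sum(per_controller_mb))   # 2048MB -> 2GB total on a 192bit bus
```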

Of course at a low level it’s more complex than that. In a symmetrical design with an equal amount of RAM on each controller, it’s rather easy to interleave memory operations across all of the controllers, which maximizes the performance of the memory subsystem as a whole. However, complete interleaving requires that kind of symmetrical design, which means it’s not suitable for NVIDIA’s asymmetrical memory configuration. Instead NVIDIA must start playing tricks. And when tricks are involved, there’s always a downside.

The best case scenario is always going to be that the entire 192bit bus is in use, interleaving a memory operation across all 3 controllers and giving the card 144GB/sec of memory bandwidth (192bit * 6GHz / 8). But that can only be done for up to 1.5GB of memory; the final 512MB is attached to a single memory controller. This invokes the worst case scenario, where only one 64-bit memory controller is in use, reducing memory bandwidth to a much more modest 48GB/sec.
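
For reference, the same arithmetic worked out in a couple of lines, using nothing beyond the 6GHz data rate and bus widths already quoted:

```python
# Peak memory bandwidth in the two regimes, from the figures in the text:
# a 6GHz effective GDDR5 data rate, divided by 8 bits per byte.
def bandwidth_gb_per_sec(bus_bits, data_rate_ghz=6):
    return bus_bits * data_rate_ghz / 8

print(bandwidth_gb_per_sec(192))  # all three controllers interleaved: 144.0 GB/sec
print(bandwidth_gb_per_sec(64))   # only the heavier controller in use: 48.0 GB/sec
```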

How NVIDIA spreads out memory accesses will have a great deal of impact on when we hit these scenarios. In the past we’ve tried to divine how NVIDIA is accomplishing this, but even with the compute capability of CUDA, memory appears to be too far abstracted for us to test any specific theories. And because NVIDIA is continuing to label the internal details of their memory bus a competitive advantage, they’re unwilling to share the details of its operation with us. Thus we’re largely dealing with a black box here, one where poking and prodding doesn’t produce much in the way of meaningful results.

As with the GTX 550 Ti, all we can really say at this time is that the performance we get in our benchmarks is the performance we get. Our best guess remains that NVIDIA is interleaving the lower 1.5GB of address space while pushing the last 512MB of address space into the larger memory bank, but we don’t have any hard data to back it up. For most users this shouldn’t be a problem (especially since GK104 is so wishy-washy at compute), but it remains that there’s always a downside to an asymmetrical memory design. With any luck one day we’ll find that downside and be able to better understand the GTX 660 Ti’s performance in the process.
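
Purely for illustration, here’s what that guess would look like expressed as an address-to-controller mapping. Every detail of it is assumed – the 256-byte interleave stride, the round-robin order, and which controller absorbs the spillover – since NVIDIA hasn’t disclosed the real scheme.

```python
# Hypothetical mapping that matches our guess: interleave the first 1.5GB across
# all three controllers, then push the last 512MB onto the controller with the
# extra chips. The stride and ordering are assumptions, not NVIDIA's scheme.
STRIDE = 256                     # bytes per interleave step (assumed)
INTERLEAVED_LIMIT = 1536 << 20   # first 1.5GB of address space

def controller_for(address):
    if address < INTERLEAVED_LIMIT:
        return (address // STRIDE) % 3   # round-robin across controllers 0, 1, 2
    return 2                             # spillover lands on the 4-chip controller

print(controller_for(64 << 20))     # 64MB in: one of the interleaved controllers
print(controller_for(1664 << 20))   # 1664MB in: the 4-chip controller
```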

313 Comments

  • CeriseCogburn - Thursday, August 23, 2012 - link

    660Ti is hitting over 1300 core.
    amd loses in oc this time

    Get used to changing your whistling in the dark tune
  • TheJian - Sunday, August 19, 2012 - link

    http://www.guru3d.com/article/radeon-hd-7950-overc...
    "We do need to warn you, increasing GPU voltages remains dangerous."
    From page 2. Also note he says you can't go over 1ghz or so without raising volts (he hit 1020 default). Raise your hand if you like to spend $300 only to blow it up a few days or months later. Never mind the heat, noise, watts this condition will FORCE you into. His card ran a full 10c hotter and 6db noisier than defaults. He hit 1150/1250boost max. Only 50mhz more than the 660ti. Nice try. From page 11 of the 7950 BOOST review at guru3d:
    http://www.guru3d.com/article/radeon-hd-7950-with-...

    "In AMD's briefing we notice that the R7950 BOOST cards will be available from August 16 and onwards, what an incredibly coincidental date that is. It's now one day later August 17, just one AIC partner has 'announced' this product and there is NIL availability. Well, at least you now have an idea of where the competition will be in terms of performance. But that's all we can say about that really."

    Note you can BUY a 660TI for $299 or $309 that is clocked by default only 14mhz less than the zotac AMP in this review. If Ryan is to note what you want in the article, he should also note it will possibly light on fire, or drive you out the room due to heat or noise while doing it. AMD isn't willing to BACK your speeds. Heck the DEFAULT noise/heat alone would drive me out of the room, never mind the extra cost of running it at your amazing numbers...LOL. A quick look at the 7950B already tells the story above it's ref speeds/volts. RIDICULOUS NOISE/HEAT/COST.

    From guru3d article above:
    Measured power consumption default card=138 Watts
    Measured power consumption OC at 1150 MHz (1.25 Volts)=217w!! Note the one in Ryan's review is clocked at 850mhz and already 6.5db's higher than 660TI AMP. I want a silent (or as close as possible) PC that won't heat the room or every component in my PC.
    http://benchmarkreviews.com/index.php?option=com_c...
    1122mhz CORE on 660TI. The gpu boost hit 1200! That's 31% above stock (and I only googled one oc review), and I don't think this is as HOT as your 1150 would be, nor as noisy. Scratch that, I KNOW yours will be worse in BOTH cases. Just look at this review at anandtech with zotac at 1033mhz already. The zotac Amp is also 5c cooler already and you haven't got over 850mhz on the 7950 boost here. Try as you might, you can't make AMD better than they are here. Sorry. Even Anand's 7950 boost review 4 days ago says it's hard to argue for the heat/noise problem added to the already worse 7950 regular vs 660 ti. Not to mention both 7950's are more expensive than 660ti. It's all a loss, whether or not I like AMD. Heat/temp/watts are WORSE this time around on AMD. Raising to higher clocks/volts only makes it worse.

    I already pointed out in another post Ryan should have posted 900mhz scores, but not to help AMD, rather that's what I'd buy if I was looking for a card from AMD for the going market cards on newegg. You just wouldn't purchase an 800mhz version (or even 850mhz), but AMD would have paid the price in the heat/watts/noise scores if Ryan did it. I would still rather have had it in there. Anandtech reviews seem to always reflect "suggested retail prices and speeds" rather than reality for buyers. That still doesn't help your case though.

    It's not ridiculously easy to OC a 7950boost to 35-45% higher...Which loses a lot at 1920x1200, by huge margins, and warhead is useless as shown in my other posts, it's a loser in crysis2 now for boost vs. 660ti's of any flavor above ref.
    http://www.guru3d.com/article/radeon-hd-7950-with-...
    Crysis 2 ultra uber dx 11 patches everything on high. WASH for 7950 boost vs. REF 660ti! Why did Anandtech choose 2008 version?
    Ref 660TI which nobody would buy given pricing of high clocked versions at $299/309 for 660 TI, default no fiddling necessary and no voiding warranty or early deaths of hardware. You seem to ignore what happens when you OC things past reasonable specs (already done by AMD with heat/noise/watts above zotac Amp here). I suspect AMD didn't want their chip to look even worse in reviews.

    Argument over. I win... :)...So does your wallet if you put your fanboyism away for a bit. Note I provided a google search to my RADEON 5850 XFX purchased card complaints (regarding backorder) at amazon in another post here. I love AMD but, c'mon...They lost this round, get over it. You may have had an argument for a 7950 boost at $299 that was actually COOLER than the 7950 regular and less noisy. But with both being worse, & price being higher...It's over this round. Note the cool features of the 600 series cards in the above oc article. It's safe at 1122/1200! It's safe no matter the card, though they vary you can't hurt them (per card settings are different...Ultimate OC without damage). Nice feature.

    I'd argue blow by blow over 2560x1600 (as you can prove NV victories depending on games) but I think it's pointless as I already proved in other posts, only 2% actually use that or above. Meaning 98% are using a res where the 660TI pretty much TRASHES the 7950 in all but a few games I could find (1920x1200 and below).
    (hit post but didn't post...sorry if I'm about to double post this).
  • Galidou - Sunday, August 19, 2012 - link

    It's fun to see that Nvidia has reached a very good power consumption and heat level compared to the generation before. Now they mention it, but when AMD fanboys were mentioning it, comparing 6xxx against gtx5xx, they were just denied and being told it wasn't important.

    ''Measured power consumption OC at 1150 MHz (1.25 Volts)=217w''

    Wow it's amazing, 217 watts, almost as much as a gtx 580 stock.....

    Comparing the 7950b noise and temperature with the very bad reference cooler against a very quiet aftermarket cooler on the 660 ti, very nice apples to apples comparison. The 7950b is for the average users, we all know the 7950 models that are overclocked and got VERY nice coolers already, thanks for the refresh.
  • TheJian - Monday, August 20, 2012 - link

    This AMD fanboy bought a radeon 5850. Not sure what your point is?

    The ref design was in there too...Check the green bar card.
  • Galidou - Monday, August 20, 2012 - link

    Well, the reference design works wonders on 660 ti because it has a lot better power consumption and temperatures, the 7xxx reference coolers are just plain crap, good thing there's not much around, else the opinion of the 7xxx series would be uber bad.

    Overclockers tend to love the radeons and I'm an overclocker, not an AMD fanboy, I just can't support all the hate when there's no reason for it.
  • CeriseCogburn - Thursday, August 23, 2012 - link

    radeon 6000 series was losing the benches as it "saved power"

    580 was referred to as housefire and worse, nVidia was attacked for abandoning gamers w/compute
    ROFL - abandoning the gamers

    green earth became more important than gaming

    losing frame rates was a-okay because you saved power and money

    Compare that to now - nVidia is faster, quieter, smoother, and uses less power

    amd loses frames, and sucks down the juice, and choppier

    The 580 had a HUGE lead at the top of the charts....

    So, that's the same how ?

    It would be the same if amd hadn't completely failed on frame rates and had a giant lead stretching out in disbelief at the top of the charts - then one can say "the power doesn't matter" because you get something for it

    It's really simple. So simple, simpletons should be able to understand. I don't think fanboys will though.
  • RussianSensation - Thursday, August 16, 2012 - link

    Well in fairness AnandTech did test reference clocked 660Ti cards, which is a fair review. They also could have included factory pre-overclocked 660Ti cards and just commented on the price difference (i.e., up to $339). This was also mentioned in the review.

    But what I find the most amusing is that after how much talk was around the amazing overclocking capabilities of GTX460, NV users want to ignore that HD7950 can overclock to 1.1-1.15ghz and match a $500 GTX680. Can a GTX660Ti do that? At the end of the day an overclocked 7950 will beat an overclocked 660Ti with AA. Overclockers will go for the 7950 and people who want a quiet and efficient card will pick the 660Ti.
  • just4U - Thursday, August 16, 2012 - link

    Does a 1.1-1.15GHz 7950 actually match up well against a GTX680? While AMD and NVidia perform better on different games I'd still think the 680 would be somewhat ahead...
  • CeriseCogburn - Sunday, August 19, 2012 - link

    No the 7950 does not, it takes a 1200-1250 core 7970 to "match up".
    Even then, it can only match up in just "fps".
    It still doesn't have: PhysX, adaptive v-sync, automatic OC, target frame rate, TXAA, good 3D, Surround center taskbar by default without having driver addons, STABILITY, smoothest gaming.
    I could go on.
    Hey here's a theory worthy of what we hear here against nVidia, but we'll make it against the real loser amd.
    It appears amd has problems with smooth gameplay because they added a strange and unusable extra GB of ram on their card. Their mem controller has to try to manage access to more ram chips, more ram, and winds up stuttering and jittering in game, because even though the extra ram is there it can't make use of it, and winds up glitching trying to manage it.
    There we go! A great theory worthy of the reviewer's kind he so often lofts solely toward nVidia.
  • Galidou - Sunday, August 19, 2012 - link

    Look at all the big words: ''PhysX, adaptive v-sync, automatic OC, target frame rate, TXAA, good 3D, Surround center''. Stability is my preferred, I owned so many video cards and had so few problems with them, Nvidia, ATI or AMD, but Nvidia fanboys still have to make us feel that every time you buy a video card from AMD, you gotta face the ''inevitable'' hangups, drivers problems, the hulk is gonna come at your home and destroy everything you OWN!!!! Beware if you buy an AMD video card, you might even catch.... ''CANCER''. oohhh cancer, beware....

    I had none of that and still have none of that and ALL my games played very well, memory is the problem now, not the lack of adaptive crapsync, physixx and such. You just made me remember why I do not listen to TV anymore, the ads always try to make you feel like everything you own should be changed for the new stuff, but then you change it and you feel almost nothing has been gained.

    I call for ''planned obsolescence'' for the last message.
