That Darn Memory Bus

Among the entire GTX 600 family, the GTX 660 Ti’s one unique feature is its memory controller layout. NVIDIA built GK104 with 4 memory controllers, each 64 bits wide, giving the entire GPU a combined memory bus width of 256 bits. These memory controllers are tied into the ROPs and L2 cache, with each controller forming part of a ROP partition containing 8 ROPs (or rather 1 ROP unit capable of processing 8 operations), 128KB of L2 cache, and the memory controller itself. Disabling any one of those elements means taking out the whole ROP partition, which is exactly what NVIDIA has done.
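As a back-of-the-envelope sketch (our own model of the figures above, not anything from NVIDIA), the partition math works out as follows:

```python
# Rough model of GK104's ROP partition resources.
# Each ROP partition pairs one 64-bit memory controller with
# 8 ROPs and 128KB of L2 cache; the full GK104 has 4 partitions.

def gk104_totals(partitions):
    """Return (bus width in bits, ROP count, L2 cache in KB)."""
    return (partitions * 64, partitions * 8, partitions * 128)

full = gk104_totals(4)  # GTX 680: 256-bit bus, 32 ROPs, 512KB L2
cut = gk104_totals(3)   # GTX 660 Ti: 192-bit bus, 24 ROPs, 384KB L2
```

The 3-partition configuration is exactly the 25% reduction in ROP throughput and L2 cache described above, plus the loss of one 64-bit memory controller.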

The impact on the ROPs and the L2 cache is rather straightforward – render operation throughput is reduced by 25% and there’s 25% less L2 cache to store data in – but the loss of the memory controller is a much tougher concept to deal with. This goes for both NVIDIA on the design end and for consumers on the usage end.

256 is a nice power-of-two number. For video cards with power-of-two memory bus widths, it’s very easy to equip them with a similarly power-of-two memory capacity such as 1GB, 2GB, or 4GB of memory. For various minor technical reasons (mostly the sanity of the engineers), GPU manufacturers like sticking to power-of-two memory buses. And while this is by no means a hard design constraint in video card manufacturing, there are ramifications for deviating from it.

The biggest consequence of deviating from a power-of-two memory bus is that under normal circumstances this leads to a card’s memory capacity not lining up with the bulk of the cards on the market. To use the GTX 500 series as an example, NVIDIA had 1.5GB of memory on the GTX 580 at a time when the common Radeon HD 5870 had 1GB, giving NVIDIA a 512MB advantage. Later on, however, the common Radeon HD 6970 had 2GB of memory, leaving NVIDIA behind by 512MB. This also had one additional consequence for NVIDIA: they needed 12 memory chips where AMD needed 8, which generally inflates the bill of materials more than the price of higher speed memory in a narrower design does. This ended up not being a problem for the GTX 580 since 1.5GB was still plenty of memory for 2010/2011 and the high price tag could easily absorb the BoM hit, but this is not always the case.

Because NVIDIA has disabled a ROP partition on GK104 in order to make the GTX 660 Ti, they’re dropping from a power-of-two 256bit bus to an off-size 192bit bus. Under normal circumstances this means that they’d need to either reduce the amount of memory on the card from 2GB to 1.5GB, or double it to 3GB. The former is undesirable for competitive reasons (AMD has 2GB cards below the 660 Ti and 3GB cards above), not to mention the fact that 1.5GB is too small for a $300 card in 2012. The latter on the other hand incurs the BoM hit as NVIDIA moves from 8 memory chips to 12 memory chips, a scenario that the lower-margin GTX 660 Ti can’t as easily absorb, to say nothing of how silly it would look for a GTX 680 to have less memory than a GTX 660 Ti.
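The two conventional options can be sketched with some simple arithmetic. This is a minimal model assuming the usual symmetric layout of 32-bit-wide GDDR5 chips (one or two chips per 32 bits of bus); the function name and structure are our own illustration:

```python
# Symmetric memory capacities for a given bus width using 32-bit-wide
# GDDR5 chips. chip_gbit is the per-chip density in gigabits
# (a 2Gb chip holds 256MB).

def symmetric_capacity_mb(bus_bits, chip_gbit, chips_per_channel=1):
    chips = (bus_bits // 32) * chips_per_channel  # 32-bit chips fill the bus
    return chips * chip_gbit * 1024 // 8          # gigabits -> megabytes

symmetric_capacity_mb(192, 2)     # 6 chips  -> 1536 MB (1.5GB)
symmetric_capacity_mb(192, 2, 2)  # 12 chips -> 3072 MB (3GB)
```

A 192bit bus thus naturally lands on 1.5GB or 3GB with 2Gb chips, which is exactly the dilemma described above.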

Rather than take either of the usual routes, NVIDIA is going to take their own third route: put 2GB of memory on the GTX 660 Ti anyhow. By putting more memory on one controller than on the other two – in effect breaking the symmetry of the memory banks – NVIDIA can have 2GB of memory attached to a 192bit memory bus. This is a technique that NVIDIA has had available to them for quite some time, but one they rarely pull out, using it only when necessary.

We were first introduced to this technique with the GTX 550 Ti in 2011, which had the same off-size 192bit memory bus. By using a mix of 2Gb and 1Gb modules, NVIDIA could outfit the card with 1GB of memory rather than the 1.5GB/768MB that a 192bit memory bus would typically dictate.

For the GTX 660 Ti in 2012 NVIDIA is once again going to use their asymmetrical memory technique to outfit the card with 2GB of memory on a 192bit bus, but they’re going to be implementing it slightly differently. Whereas the GTX 550 Ti mixed memory chip density in order to get 1GB out of 6 chips, the GTX 660 Ti will mix up the number of chips attached to each controller in order to get 2GB out of 8 chips. Specifically, there will be 4 chips instead of 2 attached to one of the memory controllers, while the other controllers will continue to have 2 chips. Doing it in this manner allows NVIDIA to use the same Hynix 2Gb chips they already use in the rest of the GTX 600 series, with the only high-level difference being the width of the bus connecting them.
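The chip arrangement described above works out as follows (a simple tally of what the configuration implies, not an NVIDIA datasheet):

```python
# GTX 660 Ti memory layout as described: three 64-bit controllers,
# one carrying 4 Hynix 2Gb chips and the other two carrying 2 each.
CHIP_MB = 256                     # one 2Gb GDDR5 chip = 256MB
chips_per_controller = [4, 2, 2]  # asymmetric banks

total_chips = sum(chips_per_controller)                    # 8 chips
total_mb = sum(n * CHIP_MB for n in chips_per_controller)  # 2048 MB
mb_per_controller = [n * CHIP_MB for n in chips_per_controller]
# -> [1024, 512, 512]: one controller holds 1GB, the others 512MB each
```

The asymmetry is plain in the last line: one controller holds twice the memory of either of its siblings.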

Of course at a low level it’s more complex than that. In a symmetrical design with an equal amount of RAM on each controller it’s rather easy to interleave memory operations across all of the controllers, which maximizes performance of the memory subsystem as a whole. However, complete interleaving requires exactly that kind of symmetry, which makes it unsuitable for NVIDIA’s asymmetrical memory design. Instead NVIDIA must start playing tricks. And when tricks are involved, there’s always a downside.

The best case scenario is always going to be that the entire 192bit bus is in use by interleaving a memory operation across all 3 controllers, giving the card 144GB/sec of memory bandwidth (192bit * 6GHz / 8). But that can only be done for the first 1.5GB of memory; the final 512MB of memory is attached to a single memory controller. This invokes the worst case scenario, where only one 64-bit memory controller is in use, reducing memory bandwidth to a much more modest 48GB/sec.
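The two bandwidth figures fall out of the standard peak-bandwidth formula:

```python
# Peak memory bandwidth: bus width (bits) x data rate (GT/s) / 8 bits-per-byte.
def bandwidth_gbps(bus_bits, data_rate_gtps):
    return bus_bits * data_rate_gtps / 8

best = bandwidth_gbps(192, 6)   # all 3 controllers interleaved: 144 GB/s
worst = bandwidth_gbps(64, 6)   # last 512MB on a single controller: 48 GB/s
```

In other words, accesses that land entirely in the final 512MB see only a third of the card's peak bandwidth.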

How NVIDIA spreads out memory accesses will have a great deal of impact on when we hit these scenarios. In the past we’ve tried to divine how NVIDIA is accomplishing this, but even with the compute capability of CUDA, memory appears to be too far abstracted for us to test any specific theories. And because NVIDIA continues to label the internal details of their memory bus a competitive advantage, they’re unwilling to share the details of its operation with us. Thus we’re largely dealing with a black box here, one where poking and prodding doesn’t produce much in the way of meaningful results.

As with the GTX 550 Ti, all we can really say at this time is that the performance we get in our benchmarks is the performance we get. Our best guess remains that NVIDIA is interleaving the lower 1.5GB of address space while pushing the last 512MB of address space into the larger memory bank, but we don’t have any hard data to back it up. For most users this shouldn’t be a problem (especially since GK104 is so wishy-washy at compute), but it remains that there’s always a downside to an asymmetrical memory design. With any luck one day we’ll find that downside and be able to better understand the GTX 660 Ti’s performance in the process.
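Our guess could be expressed as a mapping like the one below. To be clear, this is purely hypothetical: NVIDIA has not disclosed the actual scheme, and both the 256-byte interleave stride and the choice of which controller catches the top 512MB are our own assumptions for illustration.

```python
# Hypothetical address-to-controller mapping for the guessed scheme:
# the lower 1.5GB round-robins across all 3 controllers at some stride
# (256 bytes assumed here purely for illustration), while the top 512MB
# falls entirely on controller 0, the one with the extra chips.
GB = 1024 ** 3
STRIDE = 256  # assumed interleave granularity, not a confirmed figure

def controller_for(addr):
    """Map a byte address in [0, 2GB) to a controller index 0-2."""
    if addr < 3 * GB // 2:             # lower 1.5GB: interleaved
        return (addr // STRIDE) % 3
    return 0                           # upper 512MB: single controller
```

Under this sketch, any working set confined to the top 512MB would be serviced by one controller, which is the worst-case bandwidth scenario described earlier.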
