More GDDR5 Technologies: Memory Error Detection & Temperature Compensation

As we previously mentioned, for Cypress AMD has implemented a greater portion of the GDDR5 specification in its memory controllers. Beyond gaining the use of GDDR5’s power-saving features, AMD has also been working on features that allow its cards to reach higher memory clock speeds. Chief among these is support for GDDR5’s error detection capabilities.

One of the biggest problems in using a high-speed memory device like GDDR5 is that it requires a bus that’s both fast and fairly wide - properties that generally run counter to each other in device bus design. A single GDDR5 memory chip on the 5870 needs to connect to a bus that’s 32 bits wide and runs at a base speed of 1.2GHz, which requires a bus built to exceedingly precise tolerances. Adding to the challenge, a card like the 5870 with a 256-bit total memory bus needs eight of these buses, which means more noise from adjoining buses and less room to work in.

Because of the difficulty in building such a bus, the memory bus has become the weak point for video cards using GDDR5. The GPU’s memory controller can do more and the memory chips themselves can do more, but the bus can’t keep up.

To combat this, GDDR5 memory controllers can perform basic error detection on both reads and writes by implementing a CRC-8 hash function. With this feature enabled, for each 64-bit data burst an 8-bit cyclic redundancy check hash (CRC-8) is transmitted via a set of four dedicated EDC pins. This CRC is then used to check the contents of the data burst, to determine whether any errors were introduced during transmission.
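To make the arithmetic concrete, here is a minimal Python sketch of a CRC-8 check over a single 8-byte (64-bit) burst. It assumes the common CRC-8-ATM polynomial (x^8 + x^2 + x + 1) and a zero initial value; the exact parameters GDDR5 uses are set by the JEDEC specification, so treat these as illustrative placeholders rather than the real thing.

```python
def crc8(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
    """Bitwise CRC-8 over a byte string (polynomial and init value are assumptions)."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

# One 64-bit (8-byte) data burst and the 8-bit check value that would accompany it.
burst = bytes([0xDE, 0xAD, 0xBE, 0xEF, 0x01, 0x23, 0x45, 0x67])
print(f"CRC-8 of burst: 0x{crc8(burst):02X}")
```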

The specific CRC function used in GDDR5 detects 1-bit and 2-bit errors with 100% accuracy, with that accuracy falling as more bits are in error. This is because the CRC function can generate collisions: in unlikely cases, the CRC of an erroneous data burst can match the CRC of the correct burst. But since the odds of a burst picking up that many flipped bits are low to begin with, the vast majority of errors should be 1-bit and 2-bit errors, which are always caught.
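As a quick sanity check of that 100% claim, the following sketch exhaustively flips every 1-bit and 2-bit combination in a sample burst and confirms that the CRC always changes. It reuses the same assumed polynomial as above, so again this is illustrative rather than a reproduction of the actual GDDR5 parameters.

```python
from itertools import combinations

def crc8(data: bytes, poly: int = 0x07) -> int:
    """Same assumed CRC-8 as above, repeated so this snippet runs on its own."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

def flip(data: bytes, *bits: int) -> bytes:
    """Return a copy of the burst with the given bit positions inverted."""
    out = bytearray(data)
    for b in bits:
        out[b // 8] ^= 1 << (b % 8)
    return bytes(out)

burst = bytes(range(8))   # an arbitrary 64-bit burst
good = crc8(burst)

one_bit = all(crc8(flip(burst, i)) != good for i in range(64))
two_bit = all(crc8(flip(burst, i, j)) != good for i, j in combinations(range(64), 2))
print(f"all 1-bit errors detected: {one_bit}, all 2-bit errors detected: {two_bit}")
```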

Should an error be found, the GDDR5 controller will request a retransmission of the faulty data burst, and it will keep doing so until the data burst finally goes through correctly. A retransmission request is also used to re-train the GDDR5 link (once again taking advantage of fast link re-training) to correct any potential link problems brought about by changing environmental conditions. Note that this does not involve changing the clock speed of the GDDR5 (i.e. it does not step down in speed); it merely reinitializes the link. If the errors are due to the bus being outright unable to handle the requested clock speed, errors will continue to occur and be caught. Keep this in mind, as it will be important when we get to overclocking.
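The retransmit-and-retrain behavior can be summed up in a few lines. This is only a toy model: send_burst(), retrain_link(), and the 5% failure rate are hypothetical stand-ins, not AMD's implementation; the point is simply that the loop retries at the same clock speed until the CRC check passes.

```python
import random

def retrain_link() -> None:
    """Stand-in for GDDR5 fast link re-training (hypothetical helper)."""
    print("  link re-trained")

def send_burst(burst: bytes) -> bool:
    """Hypothetical transfer: True when the receiver's CRC check passes.
    A fixed 5% failure rate stands in for a marginal memory bus."""
    return random.random() > 0.05

def transfer_with_edc(burst: bytes) -> int:
    """Retransmit (and re-train the link) until the burst goes through cleanly.
    Returns the number of attempts; note the clock speed is never lowered."""
    attempts = 1
    while not send_burst(burst):
        retrain_link()   # a retransmission request also re-trains the link
        attempts += 1
    return attempts

print(transfer_with_edc(bytes(8)), "attempt(s) needed")
```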

Finally, we should also note that this error detection scheme is only for detecting bus errors. Errors in the GDDR5 memory modules or errors in the memory controller will not be detected, so it’s still possible to end up with bad data should either of those two devices malfunction. By the same token this is solely a detection scheme, so there are no error correction abilities. The only way to correct a transmission error is to keep trying until the bus gets it right.

Now in spite of the difficulties in building and operating such a high-speed bus, error detection is not necessary for its operation. As AMD was quick to point out to us, cards still need to ship defect-free and not produce any errors. In other words, the error detection mechanism is a failsafe rather than a tool specifically meant to attain higher memory speeds. Memory supplier Qimonda’s own whitepaper on GDDR5 pitches error detection as a necessary precaution due to the increasing amount of code stored in graphics memory, where a failure can lead to a crash rather than just a bad pixel.

In any case, for normal use the ramifications of using GDDR5’s error detection capabilities should be effectively non-existent. In practice this should lead to slightly more stable cards, since the occasional memory bus error will now be caught and resolved through retransmission, though we don’t know to what degree. After all, regularly relying on the system to retransmit data bursts would itself be a catch-22 – it would mean errors are occurring when they shouldn’t be.

Like the changes to VRM monitoring, the significant ramifications of this will be felt with overclocking. Overclocking attempts that previously would have pushed the bus too hard and produced errors will no longer do so, making higher overclocks possible. However this is something of an illusion, as the resulting retransmissions reduce performance. The scenario laid out to us by AMD is that overclockers who have reached the limits of their card’s memory bus will now see the impact as a drop in performance due to retransmissions, rather than crashing or graphical corruption. This means assessing an overclock will require monitoring the card’s performance, along with continuing to look for the traditional signs, as those will still indicate problems in the memory chips and the memory controller itself.
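For overclockers, this changes what finding the limit looks like. The sketch below shows one way the process could be automated under the assumptions above: set_memory_clock() and measure_bandwidth() are hypothetical callables standing in for a vendor overclocking tool and a memory bandwidth benchmark, and the scaling test is a rough heuristic rather than anything AMD provides.

```python
def assess_memory_overclock(clocks_mhz, set_memory_clock, measure_bandwidth,
                            tolerance=0.98):
    """Step through candidate memory clocks and stop when measured bandwidth
    stops scaling with clock - the telltale sign of EDC retransmissions."""
    best_clock, best_bw = None, 0.0
    for clock in clocks_mhz:
        set_memory_clock(clock)
        bw = measure_bandwidth()
        if best_clock is not None:
            expected = best_bw * clock / best_clock   # naive linear scaling
            if bw < tolerance * expected:
                return best_clock                     # last clock that scaled cleanly
        best_clock, best_bw = clock, bw
    return best_clock
```

The idea is simply that if raising the memory clock no longer raises (or actually lowers) measured bandwidth, the bus is retransmitting and the previous clock was the real limit.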

Ideally there would be a more absolute and expedient way to check for errors than watching overall performance, but at this time AMD doesn’t have a way to deliver error notices to the user. Maybe in the future they will.

Wrapping things up, we have previously discussed fast link re-training as a tool that allows AMD to clock down GDDR5 during idle periods, and as part of the failsafe method used with error detection. It also serves to enable higher memory speeds through its use in temperature compensation.

Once again due to its high speeds, GDDR5 is more sensitive to memory chip temperatures than previous memory technologies were. Under normal circumstances this sensitivity would limit memory speeds, as temperature swings would change the behavior of the memory chips enough to make it difficult to maintain a stable link with the memory controller. By monitoring the temperature of the chips and re-training the link whenever there is a significant shift in temperature, link failures are prevented and higher memory speeds become possible.
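A simple way to picture the mechanism is a monitoring loop that re-trains the link whenever the DRAM temperature has drifted by more than some threshold since the last training. Everything here is a hypothetical stand-in - the threshold, polling interval, and both callables - since the real logic lives in the memory controller, not in software.

```python
import time

TEMP_DELTA_C = 5.0   # hypothetical retrain threshold; the real trigger is vendor-defined

def temperature_compensation(read_dram_temp, retrain_link, poll_s=1.0):
    """Re-train the memory link whenever the GDDR5 chip temperature has drifted
    far enough from the temperature at which the link was last trained."""
    trained_at = read_dram_temp()
    while True:
        now = read_dram_temp()
        if abs(now - trained_at) >= TEMP_DELTA_C:
            retrain_link()
            trained_at = now          # the link is now trained at this temperature
        time.sleep(poll_s)
```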

And while temperature compensation may not sound complex, that doesn’t mean it’s not important. As we have mentioned a few times now, the biggest bottleneck in memory performance is the bus. The memory chips can go faster; it’s the bus that can’t. So anything that can help maintain a link along these fragile buses becomes an important tool in achieving higher memory speeds.

Comments

  • SiliconDoc - Wednesday, September 30, 2009 - link

    I was here before this site was even on the map let alone on your radar, and have NEVER had any other acct name.
    I will wait for your APOLOGY.
  • ol1bit - Friday, September 25, 2009 - link

    Goodbye 8800gt SLI... nothing has given me the bang for the buck upgrade that this card does!

    I paid $490 for my SLI 8800Gt's in 11/07

    $379 Sweetness!
  • Brazos - Thursday, September 24, 2009 - link

    I always get nostalgic for Tech TV when a new gen of video cards comes out. Watching Leo, Patrick, et al. discuss the latest greatest was like watching kids on Christmas morning. And of course there was Morgan.
  • totenkopf - Thursday, September 24, 2009 - link

    SiliconDoc, this is pathetic. Why are you so upset? No one cares about arguing the semantics of hard or paper launches. Besides, where the F is Nvidia's GT300 thingy? You post here more than amd fanboys, yet you hate amd... just hibernate until the gt300 launches and then you can come back and spew hatred again.

    Seriously... the fact that you can't even formulate a cogent argument based on anything performance related tells me that you have already ceded the performance crown to amd. Instead, you've latched onto this red herring, the paper launch crap. stop it. just stop it. You're like a crying child. Please just be thankful that amd is now allowing you to obtain more of your nvidia panacea for even less money!

    Hooray competition! EVERYONE WINS! ...Except silicon doc. He would rather pay $650 for a 280 than see ati sell one card. Ati is the best thing that ever happened to nvidia (and vice versa). Grow the F up and don't talk about bias unless you have none yourself. Hope you don't electrocute yourself tonight while making love to your nvidia card.
  • SiliconDoc - Thursday, September 24, 2009 - link

    " Hooray competition! EVERYONE WINS! ...Except silicon doc. He would rather pay $650 for a 280 than see ati sell one card."
    And thus you have revealed your deep seated hatred of nvidia, in the common parlance seen.
    Frankly my friend, I still have archived web pages with $500 HD2900XT cards from not that long back, that would easily be $700 now with the inflation we've seen.
    So really, what is your red raving rooster point other than you totally excuse ATI that does exactly the same thing, and make your raging hate nvidia whine, as if "they are standalone guilty".
    You're ANOTHER ONE, that repeats the same old red fan clichés, and WON'T OWN UP TO ATI'S EXACT SAME BEHAVIOR ! Will you ? I WANT TO SEE IT IN TEXT !
    In other words, your whole complaint is INVALID, because you apply it exclusively, in a BIASED fashion.
    Now tell me about the hundreds of dollars overpriced ati cards, won't you ? No, you won't. See that is the problem.
  • silverblue - Friday, September 25, 2009 - link

    If you think companies are going to survive without copying what other companies do, you're sadly mistaken.

    Yes, nVidia has made advances, but so has ATI. When nVidia brought out the GF4 Ti series, it supported Pixel Shader 1.3 whereas ATI's R200-powered 8500 came out earlier with the more advanced Pixel Shader 1.4. ATI were the first of the two companies to introduce a 256-bit memory bus on their graphics cards (following Matrox). nVidia developed Quincunx, which I still hold in high regard. nVidia were the first to bring out Shader Model 3. I still don't know of any commercially available nVidia cards with GDDR5.

    We could go on comparing the two but it's essential that you realise that both companies have developed technologies that have been adopted by the other. However, we wouldn't be so far down this path without an element of copying.

    The 2900XT may be overpriced because it has GDDR4. I'm not interested in it and most people won't be.

    "In other words, your whole complaint is INVALID, because you apply it exclusively, in a BIASED fashion. " Funny, I thought we were seeing that an nauseum from you?

    Why did I buy my 4830? Because it was cheaper than the 9800GT and performed at about the same level. Not because I'm a "red rooster".

    ATI may have priced the 5870 a little high, but in terms of its pure performance, it doesn't come too far off the 295 - a card we know to have two GPUs and costs more. In the end, perhaps AMD crippled it with the 256-bit interface, but until they implement one you'll be convinced that it's a limitation. Maybe, maybe not. GT300 may just prove AMD wrong.
  • SiliconDoc - Wednesday, September 30, 2009 - link

    You have absolutely zero proof that we wouldn't be further down this path without the "competition".
    Without a second company or third or fourth or tenth, the monopoly implements DIVISIONS that compete internally, and without other companies, all the intellectual creativity winds up with the same name on their paycheck.
    You cannot prove what you say has merit, even if you show me a stagnant monopoly, and good luck doing that.
    As ATI stagnated for YEARS, Nvidia moved AHEAD. Nvidia is still ahead.
    In fact, it appears they have always been ahead, much like INTEL.
    You can compare all you want but "it seems ati is the only one interested in new technology..." won't be something you'll be blabbing out again soon.
    Now you try to pass a lesson, and JARED the censor deletes responses, because you two tools think you have a point this time, but only with your deleting and lying assumptions.
    NEXT TIME DON'T WAIL ATI IS THE ONLY ONE THAT SEEMS INTERESTED IN IMPLEMENTING NEW TECHNOLOGY.
    DON'T SAY IT THEN BACKTRACK 10,000 % WHILE TRYING TO "TEACH ME A LESSON".
    You're the one whose big fat red piehole spewed out the lie to begin with.

  • Finally - Friday, September 25, 2009 - link

    The term "Nvidiot" somehow sprung to my mind. How come?
  • silverblue - Thursday, September 24, 2009 - link

    You're spot on about his bias. Every single post consists of trash-talking pretty much every ATI card and bigging up the comparative nVidia offering. I think the only product he's not complained about is the 4770, though oddly enough that suffered horrific shortage issues due to (surprise) TSMC.

    Even if there were 58x0 cards everywhere, he'd moan about the temperature or the fact it should have a wider bus or that AMD are finally interested in physics acceleration in a proper sense. I'll concede the last point but in my opinion, what we have here is a very good piece of technology that will (like CPUs) only get better in various aspects due to improving manufacturing processes. It beats every other single GPU card with little effort and, when idle, consumes very little juice. The technology is far beyond what RV770 offers and at least, unlike nVidia, ATI seems more interested in driving standards forward. If not for ATI, who's to say we'd have progressed anywhere near this far?

    No company is perfect. No product is perfect. However, to completely slander a company or division just because he buys a competitor's products is misguided to say the least. Just because I own a PC with an AMD CPU, doesn't mean I'm going to berate Intel to high heaven, even if their anti-competitive practices have legitimised such criticism. nVidia makes very good products, and so does ATI. They each have their own strengths and weaknesses, and I'd certainly not be using my 4830 without the continued competition between the two big performance GPU manufacturers; likewise, SiliconDoc's beloved nVidia-powered rig would be a fair bit weaker (without competition, would it even have PhysX? I doubt it).
  • SiliconDoc - Thursday, September 24, 2009 - link

    Well, that was just amazing, and you're wrong about me not complaining about the 4770 paper launch, you missed it.
    I didn't moan about the temperature, I moaned about the deceptive lies in the review concerning temperatures, that gave ATI a complete pass, and failed to GIVE THE CREDIT DUE THAT NVIDIA DESERVES because of the FACTS, nothing else.
    The article SPUN the facts into a lying cobweb of BS. Just like so many red fans do in the posts, and all over the net, and you've done here. Is it so hard to MAN UP and admit the ATI cards run hotter ? Is it that bad for you, that you cannot do it ? Certainly the article FAILED to do so, and spun away instead.
    Next, you have this gem " at least, unlike nVidia, ATI seems more interested in driving standards forward."
    ROFLMAO - THIS IS WHAT I'M TALKING ABOUT.
    Here, let me help you, another "banned" secret that the red roosters keep to their chest so their minions can spew crap like you just did: ATI STOLE THE NVIDIA BRIDGE TECHNOLOGY, ATI HAD ONLY A DONGLE OUTSIDE THE CASE, WHILE NVIDIA PROGRESSED TO INTERNAL BRIDGE. AFTER ATI SAW HOW STUPID IT WAS, IT COPIED NVIDIA.
    See, now there's one I'll bet a thousand bucks you never had a clue about.
    I for one, would NEVER CLAIM that either company had the lock on "forwarding technology", and I IN FACT HAVE NEVER DONE SO, EVER !
    But you red fans spew it all the time. You spew your fanboyisms, in fact you just did, that are absolutely outrageous and outright red leaning lies, period!
    you: " at least, unlike nVidia, ATI seems more interested in driving standards forward...."
    I would like to ask you, how do you explain the never before done MIMD core Nvidia has, and will soon release ? How can you possibly say what you just said ?
    If you'd like to give credit to ATI going with GDDR4 and GDDR5 first, I would have no problem, but you people DON'T DO THAT. You take it MUCH FURTHER, and claim, as you just did, ATI moves forward and nvidia does not. It's a CONSTANT REFRAIN from you people.
    Did you read the article and actually absorb the OpenCL information ? Did you see Nvidia has an implementation, is "ahead" of ati ? Did you even dare notice that ? If not, how the hell not, other than the biased wording the article has, that speaks to your emotionally charged hate Nvidia mindset :
    "However, to completely slander a company or division just because he buys a competitor's products is misguided to say the least."
    That is NOT TRUE for me, as you stated it, but IT IS TRUE FOR YOU, isn't it ?
    ---
    You in fact SLANDERED Nvidia, by claiming only ATI drives forward tech, or so it seems to you...
    I've merely been pointing out the many statements all about like you just made, and their inherent falsehood!
    ---
    Here next, you pull the ol' switcharoo, and do what you say you won't do, by pointing out you won't do it! roflmao: " doesn't mean I'm going to berate Intel to high heaven, even if their anti-competitive practices have legitimised such criticism.."
    Well, you just did berate them, and just claimed it was justified, cinching home the trashing quickly after you claimed you wouldn't, but have utterly failed to point out a single instance, unlike myself - I INCLUDE the issues and instances, pointing them out intimately and often in detail, like now.
    LOL you: " I'd certainly not be using my 4830 without ...."
    Well, that shows where you are coming from, but you're still WRONG. If either company dies, the other can move on, and there's very little chance that the company will remain stagnant, since then they won't sell anything, and will die, too.
    The real truth about ATI, which I HAVE pointed out before, is IT FELL OFF THE MAP A FEW YEARS BACK AND ALTHOUGH PRIOR TO THAT TIME WAS COMPETITIVE AND PERHAPS THE VERY BEST, IT CAVED IN...
    After it had it's "dark period" of failure and depair, where Nvidia had the lone top spot, and even produced the still useful and amazing GTX8800 ultimate (with no competition of any note in sight, you failed to notice, even to this day - and claim the EXACT OPPOSITE- because you, a dead brained red, bought the "rebrand whine" lock stock and barrel), ATI "re-emerged", and in fact, doesn't rteally deserve praise for falling off the wagon for a year or two.
    See, that's the truth. The big fat red fib, you liars can stop lying about is the "stagnant technology without competition" whine.
    ATI had all the competition it could ever ask for, and it EPIC FAILED for how many years ? A couple, let's say, or one if you just can't stand the truth, and NVIDIA, not stagnated whatsoever, FLEW AHEAD AND RELEASED THE MASSIVE GTX8800 ULTIMATE.
    So really friend, just stop the lying. That's all I ask. Quit repeating the trashy and easily disproved ati clichés.
    Ok ?
