The Truth About Processor "Degradation"

Degradation - the process by which a CPU loses the ability to maintain an equivalent overclock, often sustainable through the use of increased core voltage levels - is usually regarded as a form of ongoing failure. This is much like saying your life is nothing more than your continual march towards death. While some might find this analogy rather poignant philosophically speaking, technically speaking it's a horrible way of modeling the life-cycle of a CPU. Consider this: silicon quality is often measured as a CPU's ability to reach and maintain a desired stable switching frequency all while requiring no more than the maximum specified process voltage (plus margin). If the voltage required to reach those speeds is a function of the CPU's remaining useful life, then why would each processor come with the same three-year warranty?

The answer is quite simple really. Each processor, regardless of silicon quality, is capable of sustained error-free operation while functioning within the bounds of the specified environmental tolerances (temperature, voltage, etc.), for a period of no less than the warranted lifetime when no more performance is demanded of it than its rated frequency will allow. In other words, rather than limit the useful lifetime of each processor, and to allow for a consistent warranty policy, processors are binned based on the highest achievable speed while applying no more than the process's maximum allowable voltage. When we get right down to it, this is the key to overclocking - running CPUs in excess of their rated specifications regardless of reliability guidelines.

As soon as you concede that overclocking by definition reduces the useful lifetime of any CPU, it becomes easier to justify its more extreme application. It also goes a long way to understanding why Intel has a strict "no overclocking" policy when it comes to retaining the product warranty. Too many people believe overclocking is "safe" as long as they don't increase their processor core voltage - not true. Frequency increases drive higher load temperatures, which reduces useful life. Conversely, better cooling may be a sound investment for those that are looking for longer, unfailing operation as this should provide more positive margin for an extended period of time.



The graph above shows three curves. The middle line models the minimum voltage needed for a processor to run continuously at 100% load for the period shown along the x-axis. During this time, the processor is subjected to its specified maximum core voltage and is never overclocked. Additionally, all of the worst-case considerations come together and our E8500 operates at its absolute maximum sustained Tcase temperature of 72.4°C. Three years later, we would expect the CPU to have "degraded" to the point where slightly more core voltage is needed for stable operation - as shown above, a little less than 1.15V, up from 1.125V.

Including Vdroop and Voffset, an average 45nm dual-core processor with a VID of 1.2500V should see a final load voltage of about 1.21V. Shown as the dashed green line near the middle of the graph, this represents the actual CPU supply voltage (Vcore). Keep in mind that the trend line represents the minimum voltage required for continued stable operation, so as long as it stays below the actual supply voltage line (middle green line) the CPU will function properly. The lower green line is approximately 5% below the actual supply voltage, and represents an example of an offset that might be used to ensure a positive voltage margin is maintained.
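To make the arithmetic behind those two green lines concrete, here is a minimal sketch using only the figures quoted above (a 1.2500V VID, roughly 1.21V under load, and a guard band of about 5%). The function and constant names are our own and purely illustrative; treating the margin as exactly 5% is an assumption, since the text only says "approximately."

```python
# Figures quoted in the text for an average 45nm dual-core processor.
VID = 1.2500            # volts, the programmed voltage ID
V_LOAD = 1.21           # volts, approximate supply under load after Vdroop and Voffset
MARGIN_FRACTION = 0.05  # assumption: the "approximately 5%" margin taken as exactly 5%


def margin_line(v_supply, fraction=MARGIN_FRACTION):
    """Return the voltage sitting `fraction` below the actual supply voltage."""
    return v_supply * (1.0 - fraction)


def has_positive_margin(v_min_required, v_supply):
    """The CPU stays stable while its minimum required voltage is below the supply."""
    return v_min_required < v_supply


total_drop = VID - V_LOAD
print(f"Vdroop + Voffset: {total_drop * 1000:.0f} mV")
print(f"Margin line:      {margin_line(V_LOAD):.4f} V")
print(f"Stable at 1.125V required: {has_positive_margin(1.125, V_LOAD)}")
```

With these inputs the margin line lands at about 1.1495V, which agrees with the "a little less than 1.15V" that the minimum required voltage is expected to reach at the three-year mark.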

The intersection point of the middle line (minimum required voltage) and the middle green line (actual supply voltage) predicts the point in time when the CPU should "fail," although an increase in supply voltage should allow for longer operation. Also, note how the middle line passes through the lower green line, representing the desired margin to stability at the three-year point, marking the end of warranty. The red line demonstrates the effect running the processor above the maximum thermal specification has on rated product lifetime - we can see the accelerated degradation caused by the higher operating temperatures. The blue line is an example of how lowering the average CPU temperature can lead to increased product longevity.
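As a rough illustration of how that intersection point could be estimated, the sketch below extrapolates the minimum required voltage in a straight line through the two values quoted earlier (1.125V when new, about 1.1495V after three years) and asks when it meets the 1.21V supply. The straight-line trend is purely our assumption for the sake of the example; the real curve may bend, and the function name is hypothetical.

```python
def years_until_crossing(v_start, v_later, years_elapsed, v_supply):
    """Linearly extrapolate when the minimum required voltage reaches the supply.

    Assumption: degradation follows a straight line through the two known
    points. This is a simplification for illustration, not measured behavior.
    """
    slope = (v_later - v_start) / years_elapsed  # volts per year
    if slope <= 0:
        return float("inf")  # no upward trend, so the lines never cross
    return (v_supply - v_start) / slope


V_START = 1.125   # volts, minimum required voltage when new
V_AT_3Y = 1.1495  # volts, assumed value at three years (5% below 1.21V)
V_SUPPLY = 1.21   # volts, actual supply voltage under load

baseline = years_until_crossing(V_START, V_AT_3Y, 3.0, V_SUPPLY)
raised = years_until_crossing(V_START, V_AT_3Y, 3.0, 1.25)
print(f"Crossing at 1.21V supply: {baseline:.1f} years")
print(f"Crossing at 1.25V supply: {raised:.1f} years")
```

Under the linear assumption the crossing falls at a little over ten years, and raising the supply voltage pushes it further out. Note that this sketch deliberately ignores any extra wear the higher supply voltage would itself cause, so the second figure should be read as an upper bound at best.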



Because end-of-life failures are usually caused by a loss of positive voltage margin (excessive wear/degradation) we can establish a very real correlation between the increased/decreased probability of these types of failures and the operating environment experienced by the processor(s) in question. Here we see the effect a harsher operating environment has on observed failure rate due to the new end-of-life failure rate curve. By running the CPU outside of prescribed operating limits, we are no longer able to positively attribute any failure near the end of warranty to any known cause. Furthermore, because Intel is unable to make a distinction in failure type for each individual case of warranty failure when overclocking or improper use is suspected, policy is established which prohibits overclocking of any kind if warranty coverage is desired.

So what does all of this mean? So far we have learned that of the three basic failure types, failures due to degradation (i.e. wearing out) are in most cases directly influenced by the means and manner in which the processor is operated. Clearly, the user plays a considerable role in the creation and maintenance of a suitable operating environment. This includes the use of high-quality cooling solutions and pastes, the liberal use of fans to provide adequate case ventilation, and finally proper climate control of the surrounding areas. We have also learned that Intel has established easy-to-follow guidelines when it comes to ensuring the longevity of your investment.

Those that choose to ignore these recommendations and/or exceed any specification do so at their own peril. This is not meant to insinuate that doing so will necessarily cause immediate, irreparable damage or product failure. Rather, every decision made during the course of overclocking has a real and measurable "consequence." For some, there may be little reason to worry as concern for product life may not be a priority. On the other hand, perhaps precautions will be taken in order to accommodate the higher voltages like the use of water-cooling or phase-change cooling. In any case, the underlying principles are the same - overclocking is never without risk. And just like life, taking calculated risks can sometimes be the right choice.


45 Comments


  • mdma35 - Friday, October 9, 2009

    Epic Article was pleasure to read thnx for such informative stuff
  • jamstan - Sunday, July 13, 2008

    I just did a build with an E8500. The temp always shows 30 degrees no matter how high I overclock it or what speed I have my Vantec Tornado at. Being an overclocker it stinks that I bought a cpu with a temp sensor that doesn't work. I guess it's a common problem with this cpu and I hear Intel won't RMA a cpu with a bad sensor. I'm gonna be giving them a call.
  • Johnbear007 - Saturday, March 8, 2008

    I'd still like to know (other than microcenter) what retailer(S) are carrying the q6600 for "under 200$". I would much rather have a sub 200$ q6600 than a 260$ e8400 from mwave
  • MrSpadge - Thursday, March 6, 2008

    I do not agree with much of mindless1's critique on page 3, but we arrive at a somewhat similar conclusion: the section " The Truth About Processor "Degradation" " is lacking. Rather than addressing my issues with mindless1's post I'll just explain my point.

    Showing the influence of temperature on reliability is nice and well, but you neglect the factor which is by far the most important: voltage. Its effect on reliability / expected lifetime / MTTF is much higher than temperature (within sane limits).

    How did you generate the curves in the first plot on that page? Is it just a guess or do you have exact data? Since you mention the 8500 specifically I can imagine that you got the data (or formula) from some insider. If so I'd be curious about how these curves look like if you apply e.g. 1.45 V. There should be a drastic reduction in lifetime.

    If you don't think voltage is that important and you have no ways to adjust the calculations, you could pm dmens here at AT. I'd say he's expert enough in this field.

    MrS
  • Toferman - Thursday, March 6, 2008

    Another great article, thanks for your work on this Kris. :)
  • xkon - Thursday, March 6, 2008

    where are the sub $200 q6600's? i know microcenter had some for $200, but they are no where near me. any other ones? stating it in the article like that makes me think they are available at almost any retailer for that price. maybe if it was rephrased to something like they have been known to be priced as low as $200 or something like that. then again. maybe i'm not in the know, and am just not looking hard enough.
  • TheJian - Thursday, March 6, 2008

    Yet another example of lies. The cheapest Q6600 on pricewatch is $243. And that doesn't come with a 3yr warranty OR a heatsink. So really the cheapest is $253 for retail box with heatsink/fan and 3yr. That's a FAR cry from $200. Cheapest on Cnet.com is $255. Where did they search to find these magical $200 Q6600 chips? I want one. I suspect pricegrabber etc would show the same. I'm too lazy to check now...LOL
  • MaulSidious - Thursday, March 6, 2008

    dunno about america but in britain you can get a q6600 anywhere for 130-150 pounds
  • Johnbear007 - Thursday, March 6, 2008

    150 pounds is about 250-300$ american which is nowhere near what the articles author is claiming. One microcenter deal doesnt really constitute claiming you can bag one from retailer(S) for under 200$. Also, another poster pointed to what he called a q6700 for 80$. That is not true, it was an e6700 which is dual core not quad.
  • Karaktu - Wednesday, March 5, 2008

    I would just like to point out that it has been possible to run a sub-90-watt maximum HTPC for nearly two years. In fact, I've been doing it.

    It DOES require a Core Duo or Core 2 Duo mobile chip, but MoD isn't a new concept.

    ASUS N4L-VM DH
    - Using onboard Intel graphics, Realtek SPDIF and Gigabit network
    Core Duo T2500 (2.0GHz)
    - Cooled by a Noctua NC-U6 northbridge cooler and 60mm fan set to low
    2 x 1GB DDR2 667
    Vista View D1N1-E NTSC/ATSC PCI-E tuner
    Vista View D1N1-I NTSC/ATSC PCI tuner
    - (That's two analog and two HDTV tuners)
    1TB WDC GP 5400rpm hard drive
    750GB Samsung Spinpoint F1 7200rpm hard drive
    Antec Fusion case (rev 1)
    - VFD
    - 430-watt 80 Plus power supply
    - 2 x 120mm TriCool fans set to low
    - External IR for remote and keyboard
    Running MCE 2005

    Idles at 68 watts AT THE WALL and draws a maximum of 90 watts at full load (recording 4 shows and watching a fifth show/movie).

    If I ever get around to dropping the PSU to an EA-380, I'm sure the efficiency would go up a little since I would be closer to that magic 20 - 80% range on the power supply.

    Joe
