It was literally a week before we received our Phenom samples that many within AMD learned of a serious erratum in the processor that could potentially have a significant impact on system stability or performance, depending on how it was handled. Microprocessor errata are quite common - no CPU is perfect, and many are patched with fixes for these errata through BIOS updates throughout the life of the CPU.

However, every now and then an erratum comes along that is a little more dangerous, its impact a little more serious, and that's when microprocessors either get recalled or tackled by a software workaround immediately. Phenom hardly had a smooth launch and its traction in the marketplace has been nearly nonexistent, partially because of the TLB issue but also because of a relative inability to compete, even with AMD's own dual-core products in many cases.

AMD is looking to relaunch Phenom this year with a new revision of the core and higher clock speeds. This new core was designed specifically to address the TLB erratum that cropped up late last year, and we managed to get our hands on a pre-release sample from one of AMD's partners before final production samples shipped. What follows is a quick explanation of the erratum and a look at whether, and how, the B3 stepping core does indeed fix things.

Phenom needs help, and B3 would at least be the first step towards giving it some much-needed aid.

The "TLB Bug" Explained

29 Comments

  • eok - Thursday, March 20, 2008 - link

    The article, while great news, still leaves me guessing at what they actually tested and how.

    They say they got their hands on a 2.2GHz B3. The CPU-Z data confirms that. But in the final "extreme" test, they show the B3 @ 2.3GHz. So, they overclocked?? Via unlocked multiplier or by increasing the FSB???
  • eye smite - Saturday, March 15, 2008 - link

    I don't typically read your site anymore, as the articles since the Phenom launch, particularly the Phenom launch article, read more like a rant than a review. Every AMD article reads the same: it's a good cpu but........and then this laundry list of issues and why you see them as critical or distasteful and so on. Why the hell can't you just do a straightforward review based on the facts and leave the colorful commentary for the political videos on youtube? You lot suffer from some idiotic perception issues.
  • Narg - Friday, March 14, 2008 - link

    Personally I think the problem lies within the 3 levels used. When I first read the Phenom specs, I was sad to see there were 3 levels of cache. The complexity that adds to a processor is exponential! I would have been far more happy to see a single L2 cache between all processors of much larger size and better access. I can't imagine why they opted for 3 levels. Seems to be a hurried solution to get the Phenom to market, since so many of the single core chips AMD has built in the past have 3 levels of cache, which of course helped those chips. They need to design an effective 2 level cache only chip.
  • BernardP - Friday, March 14, 2008 - link

    About the article, I am thankful for the most complete and understandable explanation of the TLB error and fixes I have seen since Phenom was released.
  • bradley - Wednesday, March 12, 2008 - link

    We finally have more concrete instances of the bug being induced and documented. Though maybe we have different opinions of what constitutes a rare bug. If it took AMD to inform us, perhaps they are fairly rare instances. In hindsight, the coverage on the TLB issue does appear vastly disproportionate to the actual threat itself.
  • DigitalFreak - Wednesday, March 12, 2008 - link

    On the desktop, yes. My understanding is that it's a huge issue with the Opterons, which are more likely to be used in situations where the bug crops up (VMWare, etc.) It also explains why you still can't buy a quad-core AMD server from HP, Dell, etc.
  • Griswold - Thursday, March 13, 2008 - link

    It's not so much of an issue with Opterons if you use a unix/linux derivative and apply AMD's own kernel patch, which solves the problem with almost zero performance loss (it doesn't just disable the TLB like the BIOS option does). Windows Servers on the other hand...

    Still, understandable that some vendors just stay away from it until B3.
  • JumpingJack - Friday, March 14, 2008 - link

    http://www.amd.com/us-en/assets/content_type/white...

    As most have stated, errata are a fact of life for every CPU. When Intel or AMD publish errata, it is because they found something so obscure that the typical quality assurance testing (which must be rigorous and thorough) never expressed the bug. The occurrence is so rare that the problems it may cause would likely go unnoticed by the average user. This is fine for DT; a crash or lock up would probably result in a few choice, colorful words with regard to Microsoft, and they reboot and are on their way.

    In the enterprise space, though, uptime is everything, and more importantly the sanctity of the data.... from AMD's own publication on Erratum 298 ...

    "One or more of the following events may occur:
    • Machine check for an L3 protocol error. The MC4 status register (MSR 0000_0410) will be equal to B2000000_000B0C0F or BA000000_000B0C0F. The MC4 address register (MSR 0000_0412) will be equal to 26h.
    • Loss of coherency on a cache line containing a page translation table entry.
    • Data corruption."

    It is the last possibility that most likely resulted in AMD's decision to hold off on servicing the mainstream server market, and they made the absolute right decision doing so...
  • bradley - Wednesday, March 12, 2008 - link

    Yes, that goes without saying. I guess most assumed the same held true for Phenom as Barcelona, when taking the TLB errata into consideration. And there wasn't one clear voice to dissent or say otherwise, which is unfortunate. For whatever reason Intel's own C2D TLB bug, which could also cause system instability, didn't receive nearly as much press. Every chip has bugs, but only when documented does a bug become an erratum and get revised.
  • larson0699 - Wednesday, March 12, 2008 - link

    SPEC CPU 2006 in Vista x64 may be real-world enough to warrant the fix (though IMO it should have been right the first time), but it's really not that common.

    Only labrats and enthusiasts run benchmarks (but at least they have my respect), and only complete tools run the version of an already heavy OS that further bottlenecks most of today's apps. As a tech, I have no sympathy for anyone who chooses to run down MS's path and patronize their every mistake--yes, it may be a hasty opinion, but it is backed by common sense. There is nothing XP or an Xbox 360 cannot do better.

    Anyway... *sigh*

    High fives to Anand for another awesome in-depth review, for making me one article smarter, and to AMD for more practical results as of late.

    P.S. These guys run their own show--they don't plagiarize others. Please cite your evidence to them directly instead of spreading FUD in the forums.
