• What
    is this?

    You've landed on the AMD Portal on AnandTech. This section is sponsored by AMD. It features a collection of all of our independent AMD content, as well as Tweets & News from AMD directly. AMD will also be running a couple of huge giveaways here so check back for those.

    PRESENTED BY

It has been months since AMD's Bulldozer architecture surprised the hardware enthusiast community with performance all over the place. The opinions vary wildly from “server benchmarks are here, and they're a catastrophe” to “Best Server Processor of 2011”. The least you can say is that the idiosyncrasies of AMD's latest CPU architecture have stirred up a lot of dust.

Now that the dust has settled, the Bulldozer chips now account for more than half of Opteron shipments and revenues. Since AMD's Financial Analyst Day (February 2, 2012), we have new code names: the improved Bulldozer architecture "Piledriver" will power the "Abu Dhabi" chip, a replacement for the current top server chip "Interlagos". AMD is clearly committed to the new "Bulldozer" direction: fitting as many cores as possible into a certain power envelope to improve thread throughput, while trying to "hold the line" on single-threaded performance.

In theory, the new 16-core Interlagos should have offered somewhere around a 33% boost in most highly-threaded applications. The reality is unfortunately not that rosy: in many highly-threaded server applications such as OLAP databases and virtualization, the new Opteron 6200 fails to impress and is only a few percent faster than it's older brother the 12-core Magny-Cours. There are even times where the older Opteron is faster.

Some, including sources inside AMD, have blamed Global Foundries for not delivering higher clocked SKUs. Sure, the clock speed targets for Interlagos were probably closer to 3GHz instead of 2.3GHz. But that does not explain why the extra integer cores do not deliver. We were promised up to 50% higher performance thanks to the 33% extra cores, but we got 20% at the most.

The combination of low single-threaded performance, the failure to really outperform the previous generation in highly-threaded applications, the relatively high power consumption at full load, and the fact that the CPU is designed for high clock speeds gives a lot of people a certain sense of Déjà vu: is this AMD's version of the Pentum 4 ?

One of our readers, "Iketh", spoke up and voiced the opinion of many of our readers:

" Unfortunately, the thought still in the back of my mind while reading was why did AMD reinvent the Pentium 4? I just don't get it."

Another reader nicknamed "Clagmaster" commented:

"A core this complex in my opinion has not been optimized to its fullest potential. Expect better performance when AMD introduces later steppings of this core with regard to power consumption and higher clock frequencies."

Although there have already been quite a few attempts to understand what Bulldozer is all about, we cannot help but not feel that many questions are still unanswered. Since this architecture is the foundation of AMD's server, workstation, and notebook future (Trinity is based on the improved Bulldozer core with the codename "Piledriver"), it is interesting enough to dig a little deeper. Did AMD take a wrong turn with this architecture? And if not, can the first implementation "Bulldozer" be fixed relatively easily?

We decided to delve deeper into the SAP and SPEC CPU2006 results, as well as profiling our own benchmarks. Using the profiling data and correlating it with what we know about AMD's Bulldozer and Intel's Sandy Bridge, we attempt to solve the puzzle.

Setting Expectations: the Front End
POST A COMMENT

82 Comments

View All Comments

  • Taft12 - Wednesday, May 30, 2012 - link

    Johan, this is the best article I've read on Anandtech in quite some time, even better than Jarred, Ryan and Anand have come up with lately.

    The level of analysis goes far, far beyond just what the benchmarks show.

    Bravo!
    Reply
  • JohanAnandtech - Thursday, May 31, 2012 - link

    Great! Good to read there are still people that like these kinds of analysis!

    :-)
    Reply
  • ct760ster - Wednesday, May 30, 2012 - link

    Would be interesting if they could test the aforementioned benchmark in an OS with a customizable kernel like GNU-Linux since code optimization is not possible in most of the proprietary format benchmark. Reply
  • alpha754293 - Wednesday, May 30, 2012 - link

    What about the lacklustre FPU performance?

    The very fact that the FP has to be shared between two integer cores and as far as I know, it cannot run two FP threads at the same time, so for a lot of HPC/computationally heavy workloads - Bulldozer takes a HUGE performance hit. (almost regardless of anything/everything else; although yes, it counts, but remembering that CPUs are glorified calculators, when you take out one of the lanes of the highway and two-lane traffic is now squeezed down to one lane, it's bound to get slower.)
    Reply
  • The_Countess - Wednesday, May 30, 2012 - link

    except the FP CAN run 2 threads at the same time.
    only for the as yet pretty much unused 256bit instructions does it need the whole FP unit per clock.

    in fact the FP can run 2 threads of 128bit, or 4 even of 64bit.
    and a single CPU can use 2x128bit or both can use 1x128.
    intel and AMD previously had only 1x128bit capability per core.
    so there is no regression in FP performance per core. its just much more flexible.
    Reply
  • Zoomer - Wednesday, May 30, 2012 - link

    FPU throughput is much more irrelevant nowadays, as many FP intensive HPC computations have already been ported to GPUs. Yes, there may be instances where there might be FP heavy and branchy, not easily parallelization or otherwise unsuitable, but such beasts are few and far between. I can't think of any, to be honest. Reply
  • Iger - Wednesday, May 30, 2012 - link

    Thanks a lot, that was a very interesting read! Reply
  • Rael - Wednesday, May 30, 2012 - link

    AMD should fire all its marketing department, because these guys accustomed to lie at every announcement they make. The performance gains are multiplied by five or ten, and the per-core advancement, which is close to zero, is presented as 'significant'.
    I don't believe these announcements anymore.
    Reply
  • jabber - Wednesday, May 30, 2012 - link

    What the whole of the AMD Marketing team?

    Thats Tim the caretaker and Trisha on the front desk isnt it?

    I thought AMD's marketing budget was around $42.
    Reply
  • kyuu - Wednesday, May 30, 2012 - link

    Oh hai. You must be new to the human race. Marketing and "stretching the truth" have been synonymous since... forever. AMD is hardly exceptional in this regard. Stop believing anything any marketing department sells you, period. Reply

Log in

Don't have an account? Sign up now