The Impact of Bulldozer's Pipeline

With a new branch prediction architecture and an unknown, but presumably significantly deeper pipeline, I was eager to find out just how much of a burden AMD's quest for frequency had placed on Bulldozer. To do so I turned to the trusty N-Queens solver, now baked into the AIDA64 benchmark suite.

The N-Queens problem is simple. On an N x N chessboard, how do you place N queens so they cannot attack one another? Solving the problem is incredibly branch intensive, and as a result it serves as a great measure of the impact of a deeper pipeline.
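
To make the branch-heavy nature of the problem concrete, here is a minimal backtracking sketch in Python. It is not the AIDA64 implementation, whose internals and optimizations I'm not privy to; the function name `count_queens` and the set-based bookkeeping are my own illustrative choices. The point is the `if` inside the inner loop: whether a square is safe depends entirely on the queens already placed, so the outcome is hard for a branch predictor to guess, and every wrong guess costs a pipeline flush.

```python
def count_queens(n: int) -> int:
    """Count the ways to place n mutually non-attacking queens on an n x n board.

    Illustrative backtracking sketch only; not the AIDA64 code.
    """
    cols = set()        # columns that already hold a queen
    diag_down = set()   # row - col is constant along each "\" diagonal
    diag_up = set()     # row + col is constant along each "/" diagonal

    def place(row: int) -> int:
        # Every row has a queen: one complete solution found.
        if row == n:
            return 1
        total = 0
        for col in range(n):
            # Data-dependent branch: taken or not depending on earlier placements.
            if col in cols or (row - col) in diag_down or (row + col) in diag_up:
                continue
            cols.add(col)
            diag_down.add(row - col)
            diag_up.add(row + col)
            total += place(row + 1)
            # Backtrack: remove the queen and try the next column.
            cols.remove(col)
            diag_down.remove(row - col)
            diag_up.remove(row + col)
        return total

    return place(0)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, count_queens(n))
```

On a standard 8 x 8 board this counts 92 solutions. A real benchmark would use a much tighter (likely bitmask-based) inner loop, but the control flow has the same shape: try a square, test for conflicts, recurse or back out.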

The AIDA64 implementation of the N-Queens algorithm is heavily threaded, but I wanted to first get a look at single-core performance so I disabled all but a single integer/fp core on Bulldozer, as well as the competing processors. I also looked at constant frequency as well as turbo enabled speeds:

Single Core Branch Predictor Performance—AIDA64 Queens Benchmark

Unfortunately, things don't look good. Even with turbo enabled, the 3.6GHz Bulldozer part would need a 25% higher frequency to equal a 3.6GHz Phenom II X4. Even a 3.3GHz Phenom II X6 does better here. Without being fully aware of the optimizations at work in AIDA64, I wouldn't put too much focus on Sandy Bridge's performance here, but Intel is widely known for focusing on branch prediction performance.

If we let the N-Queens benchmark scale to all available threads, the performance issues are easily masked by throwing more threads at the problem:

SMP Branch Predictor Performance—AIDA64 Queens Benchmark

However, it is quite clear that for single or lightly threaded operations that are branch heavy, Bulldozer will be in for a fight.


430 Comments

  • JumpingJack - Sunday, November 6, 2011 - link

    This is a good point.
  • mianmian - Wednesday, October 12, 2011 - link

    How disappointed I am. I can't believe what AMD will claim later on.
  • Marburg U - Wednesday, October 12, 2011 - link

    Cannot see a reason to wait for Piledriver. Am3+ won't survive that chip, and +15%, even in single thread, won't be enough (for Sandy, I'm not even talking about Ivy).

    If BD had not been so bad i would have hoped for a price drop of the Thuban, and would have gone for it. But now, i fear price spikes of the old Phenom II X6 as it approaches its EOL.
  • Ethaniel - Wednesday, October 12, 2011 - link

    ... using a chainsaw. Newegg sells a 2500k for USD 220. I'm thinking something like 170-180 for the FX-8150. I was expecting a lot from the FX line. And I think that was my mistake, probably. Too bad.
  • Leyawiin - Wednesday, October 12, 2011 - link

    I guess we can take comfort in that some things never change - namely AMD processors are always behind the curve (since before Intel's C2 Duo). Guess I'll hang onto my X4 955 @ 3.6 Ghz for a while longer. It'll be the last AMD processor I'll bother with (and I'm tired of being faithful and waiting on them).
  • richard77aus - Wednesday, October 12, 2011 - link

    "At the same clock speed, Phenom II is almost 7% faster per core than Bulldozer according to our Cinebench results."

    I am far from being an expert in CPUs but isn't the main advantage intel has had since core2 - sandybridge the per core performance? not clock speed and not multi core.

    I've seen some benchmarks showing real world usage of the SB i3 dual core where it out performs a faster clocked quad core phenom 2.
  • richard77aus - Wednesday, October 12, 2011 - link

    Meaning AMD giving first priority to clockspeed and core count was the wrong thing to aim for even if they had achieved a 4ghz+ stock 8 core speed processor, but to actually go backwards compared to such an old arch. is a disaster. (my first post here, is there a way to edit posts?)
  • Kristian Vättö - Wednesday, October 12, 2011 - link

    The thing is that Phenom II, which is AMD's arch, is FASTER clock for clock than their new Bulldozer arch. Intel is far ahead of both CPUs, but it's a bit laughable that AMD's older CPUs actually outperform their new ones.
  • Saxie81 - Wednesday, October 12, 2011 - link

    Hey Anand, did you happen to get the power consumption numbers when you hit 4.7ghz?

    This is... disappointing. I knew the Single thread benchmarks were going to be bad, but you need to be running something that's needing the 8 cores, if not it's of no use. Kinda like using a Magny Cours to run Crysis.
  • Anand Lal Shimpi - Wednesday, October 12, 2011 - link

    I'm going to be doing some more overclocking tomorrow, but I broke 300W at 4.7GHz :-/
