The Impact of Bulldozer's Pipeline

With a new branch prediction architecture and an unknown, but presumably significantly deeper pipeline, I was eager to find out just how much of a burden AMD's quest for frequency had placed on Bulldozer. To do so I turned to the trusty N-Queens solver, now baked into the AIDA64 benchmark suite.

The N-Queens problem is simple. On an N x N chessboard, how do you place N queens so they cannot attack one another? Solving the problem is incredibly branch intensive, and as a result it serves as a great measure of the impact of a deeper pipeline.
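To make the branch-heavy nature of the problem concrete, here is a minimal backtracking sketch in Python. It is not the AIDA64 implementation, whose internals and optimizations I don't know; the function name `count_n_queens` and the bookkeeping arrays are purely illustrative. Every candidate square triggers a data-dependent test of whether a column or diagonal is already attacked, which is exactly the kind of hard-to-predict branch that stresses a deep pipeline.

```python
def count_n_queens(n: int) -> int:
    """Count the ways to place n non-attacking queens on an n x n board.

    Illustrative sketch only; not the algorithm used by AIDA64.
    """
    cols = [False] * n                   # cols[c]: column c already holds a queen
    diag_down = [False] * (2 * n - 1)    # indexed by row - col + n - 1
    diag_up = [False] * (2 * n - 1)      # indexed by row + col

    def place(row: int) -> int:
        # All rows filled: one complete, valid placement found.
        if row == n:
            return 1
        total = 0
        for col in range(n):
            d = row - col + n - 1
            u = row + col
            # The data-dependent branch: skip any attacked square.
            if cols[col] or diag_down[d] or diag_up[u]:
                continue
            cols[col] = diag_down[d] = diag_up[u] = True
            total += place(row + 1)
            # Backtrack: free the square before trying the next column.
            cols[col] = diag_down[d] = diag_up[u] = False
        return total

    return place(0)


if __name__ == "__main__":
    print(count_n_queens(8))  # 92 solutions on a standard chessboard
```

Whether a given square is free depends on every queen placed so far, so the outcome of the `if` changes constantly as the search backtracks. A mispredicted branch costs more cycles the deeper the pipeline is, which is why this kind of workload is a useful probe.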

The AIDA64 implementation of the N-Queens algorithm is heavily threaded, but I wanted to look at single-core performance first, so I disabled all but a single integer/fp core on Bulldozer, as well as on the competing processors. I looked at both constant-frequency and turbo-enabled speeds:

Single Core Branch Predictor Performance—AIDA64 Queens Benchmark

Unfortunately, things don't look good. Even with turbo enabled, the 3.6GHz Bulldozer part would need a 25% higher frequency to equal a 3.6GHz Phenom II X4. Even a 3.3GHz Phenom II X6 does better here. Without being fully aware of the optimizations at work in AIDA64, I wouldn't put too much weight on Sandy Bridge's performance here, but Intel is widely known for focusing on branch prediction performance.

If we let the N-Queens benchmark scale to all available threads, the performance issues are easily masked by throwing more threads at the problem:

SMP Branch Predictor Performance—AIDA64 Queens Benchmark

However, it is quite clear that in single-threaded or lightly threaded workloads that are branch heavy, Bulldozer will be in for a fight.


430 Comments


  • TiGr1982 - Monday, October 17, 2011 - link

    Indeed, much, much better performance was expected from BD. I have been an AMD-focused PC buyer since 2005, during AMD's "golden age", when I purchased an AMD Turion-based laptop. That CPU was actually better than its Intel competitor at the time, the Pentium M Dothan, as some people probably remember.

    We know the rest of the story since then till now...

    But the released BD-based product in its current state seems to be barely concurrent at all on the desktop market. Presumably, its popularity will be much lower, than in case of previous Phenom II lineup...
  • TiGr1982 - Monday, October 17, 2011 - link

    By "concurrent", I actually meant "competitive".
  • psiboy - Tuesday, October 18, 2011 - link

    Why are there no benchmarks with it overclocked... especially gaming? Would be relevant as these processors are shipping unlocked as standard.. all I'm asking for is a reasonable overclock on air to be included...
  • eldemoledor25 - Tuesday, October 18, 2011 - link

    I think everyone rushed, wanting to position their review as the first. If you read the other posts around the web, Bulldozer comes out better positioned than the i7 2600K in many things, more so than was said at first. The problem was in the BIOS: the ASUS and Gigabyte motherboards were released with immature BIOSes, and in fact the overclock would hold up better otherwise. With ASRock and MSI boards, a Bulldozer at 4.6GHz can be better than an overclocked i7 and an i5 at 5.2GHz. If you don't believe me, check this and read it well.
    1. http://www.madboxpc.com/foro/topic/161318-la-verdad-sobre-el-amd-fxo-bulldozer/page__st__20

    Greetings to all!!!
  • Martin281 - Wednesday, October 19, 2011 - link

    Well, the situation among AMD's CPUs is still the same: good ideas, great expectations, and manufacturing delays resulting in disappointing results compared to Intel. Bulldozer would have been far more competitive two years ago, not these days. At this point AMD desperately needs much higher clock speeds and core optimizations to be competitive. The predicted 10-15% performance-per-watt increase each year is really funny when compared to Intel's planned CPU roadmap (the TDP of Ivy Bridge, due to be introduced in 1Q 2012, is reported to drop from 95W to 77W in the top performance class, which is almost 20% in power consumption alone, not to mention the performance boost that also comes from the 22nm manufacturing process). I am worried that the performance gap between Intel and AMD CPUs is going to widen in the near future without "any light in the darkness bringing the true competition in the CPU field".
  • siniranji - Wednesday, October 19, 2011 - link

    I was waiting for BD to come, but now, what a disappointment. Still, AMD should continue to compete with Intel; otherwise there won't be any battle to watch. I would love to see good pricing from AMD.
  • loa567 - Wednesday, October 19, 2011 - link

    I think you are wrong on one point, about the FPU. You claim that one Bulldozer module has the same FP capacity as earlier AMD processors. However, in reality it has twice the (theoretical) capacity. Whereas each K8/K10 core had one 128-bit FP unit, each Bulldozer module has 2 x 128-bit FP units. They can work together as one 256-bit unit when used with the new instructions (AVX and others). See for example this page for details: http://blogs.amd.com/work/2010/10/25/the-new-flex-...

    However, it is strange that this does not show in performance. Could anyone explain this to me?
  • Pipperox - Thursday, October 20, 2011 - link

    It does show on performance.
    In SiSoft Sandra, 4 Bulldozer modules easily beat 6 Thuban cores.

    Same goes for floating point intensive rendering tasks such as Cinebench and 3dsMax.
  • beck2050 - Friday, October 21, 2011 - link

    "in single threaded apps a good 40-50% advantage the i5 2500K enjoys over the FX-8150."
    These are the apps most people use, duh.
    Core for core, Bulldozer is an epic fail. This is not going to be a popular desktop chip at all. As for servers, AMD's share has dropped from 20% to 5.5% in the last few years. I doubt this chip will be the savior.
  • richaron - Friday, October 21, 2011 - link

    They have lost ground in the server market, so a radical new design won't make a difference...? I admire your logic.
    For the record, I specifically look for programs/games which are multithreaded; it often shows good programming on the whole. Unless of course there are other factors limiting the system (like net speed, or GPU). Perhaps I'm just ahead of the curve compared to your average troll, duh.
