The Impact of Bulldozer's Pipeline

With a new branch prediction architecture and an unknown, but presumably significantly deeper, pipeline, I was eager to find out just how much of a burden AMD's quest for frequency had placed on Bulldozer. To do so I turned to the trusty N-Queens solver, now baked into the AIDA64 benchmark suite.

The N-Queens problem is simple. On an N x N chessboard, how do you place N queens so they cannot attack one another? Solving the problem is incredibly branch intensive, and as a result it serves as a great measure of the impact of a deeper pipeline.
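
To give a feel for why the workload is so branch-heavy, here is a minimal backtracking sketch in Python. It is not AIDA64's implementation, whose details aren't public to me; the function name `count_queens` and the bitmask approach are my own assumptions, picked only for illustration. Nearly every step is a conditional whose outcome depends on where earlier queens were placed, which is exactly the kind of branch that hurts when a deep pipeline guesses wrong.

```python
def count_queens(n: int) -> int:
    """Count the ways to place n non-attacking queens on an n x n board.

    Illustrative sketch only (hypothetical helper, not AIDA64's code).
    """
    full = (1 << n) - 1  # bitmask with one bit set per board column

    def place(cols: int, diag_l: int, diag_r: int) -> int:
        # cols: columns already occupied; diag_l / diag_r: squares in the
        # current row attacked along the two diagonal directions.
        if cols == full:  # a queen sits in every row: one full solution
            return 1
        total = 0
        free = full & ~(cols | diag_l | diag_r)  # safe squares in this row
        while free:  # data-dependent loop: hard to predict
            bit = free & -free  # pick the lowest safe square
            free ^= bit
            total += place(
                cols | bit,
                ((diag_l | bit) << 1) & full,  # shift attacks for next row
                (diag_r | bit) >> 1,
            )
        return total

    return place(0, 0, 0)


if __name__ == "__main__":
    print(count_queens(8))  # 92 solutions on a standard chessboard
```

How many squares are safe in each row, and therefore how many times the inner loop runs, changes with every partial placement, so there is little regular pattern to the branches.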

The AIDA64 implementation of the N-Queens algorithm is heavily threaded, but I wanted to get a look at single-core performance first, so I disabled all but a single integer/FP core on Bulldozer as well as on the competing processors. I also looked at both constant-frequency and turbo-enabled speeds:

Single Core Branch Predictor Performance—AIDA64 Queens Benchmark

Unfortunately, things don't look good. Even with turbo enabled, the 3.6GHz Bulldozer part would need roughly 25% more frequency to equal a 3.6GHz Phenom II X4. Even a 3.3GHz Phenom II X6 does better here. Without being fully aware of the optimizations at work in AIDA64 I wouldn't put too much weight on Sandy Bridge's performance here, but Intel is widely known for focusing on branch prediction performance.

If we let the N-Queens benchmark scale to all available threads, the performance issues are easily masked by throwing more threads at the problem:

SMP Branch Predictor Performance—AIDA64 Queens Benchmark

However, it is quite clear that in single or lightly threaded workloads that are branch heavy, Bulldozer will be in for a fight.


430 Comments


  • Mishera - Monday, October 17, 2011

    Good points TekDemon. But I'll add that from what I understand, the GPU might be capable of processing huge amounts of graphic information, but might have to wait for the CPU to process certain information before it's able to continue, hence some games going only so high in graphic tests no matter what kind of GPU is put in.

    Like he said, buying a good CPU will last longer than spending that money on a really good GPU. I personally try to build a balanced system since by the time I upgrade it's a pretty big jump on all ends.
  • Snorkels - Wednesday, October 12, 2011

    This and other benchmark tests are BOGUS. You are comparing apples to oranges. LiarMarks..

    This test shows non-optimized code for AMD vs optimized code for the Intel CPU.

    It does not show the actual performance of the Bulldozer CPU.

    Most software companies compile their software using Intel's compiler, which creates crappy and inefficient codepaths for AMD processors.

    Compile with Open64 compiler and you get a totally different result.
  • actionjksn - Wednesday, October 12, 2011

    @Snorkels If most software companies are using an Intel compiler, why would you want an AMD processor that can't utilize it properly?
  • g101 - Wednesday, October 12, 2011

    Well, I'm happy that gamer children cannot understand the point of this architecture. You obviously have no concept of the architectural advantages, since it's not designed for game-playing children or completely unoptimized synthetic benchmarks.

    Bulldozer is optimized and future-proofed for professional software, rather than entertainment software for children.
  • AssBall - Wednesday, October 12, 2011

    "obviously have no concept of the architectural advantages"

    Enlighten us then, oh wise one.
  • guyjones - Sunday, October 16, 2011

    Who exactly is the child here? Your infantile comment conveniently ignores the fact that AMD has made gigantic marketing pushes that are clearly directed at the gamer community, not to mention gaming-related sponsorship activities and marketing tie-ins. So, on the contrary, the company has made very visible and consciously-directed efforts to appeal to gamers with its products. It is totally unreasonable to now posit that BD is not directed at least in part toward that market segment.
  • Will Robinson - Wednesday, October 12, 2011

    FailDozer....pretty limp performance numbers.
    Intel still rules.
  • etrigan420 - Wednesday, October 12, 2011

    What an unfortunate series of events...maybe my e8400 will hold out a little longer...
  • Malih - Wednesday, October 12, 2011

    before reading (and i've read all other review sites -and disappointed at AMD-, just dying to see your view on the matter) thanks for the review as always
  • xorbit - Wednesday, October 12, 2011

    You are not measuring "branch prediction" performance. You are measuring misspeculation penalty (due to longer pipeline or other reasons). Nothing can "predict" random data-dependent branches.
