Zooming in on SPEC CPU2006: the Bad

The optimized SPEC CPU2006 int binaries allow gains in the range of 30% to 117%. Unfortunately the complete benchmark suite only shows a gain of 21% when we compare the Opteron 6276 with the 6176. Closer inspection shows that four benchmarks regress. The regression appears to be small in most benchmarks (7 to 14%), but remember that we have 33% more cores. Even a small regression of 7% means that we are losing up to 30% of the previous architecture's single-threaded performance!

SPEC Int CPU2006: the Bulldozer unfriendly

Perlbench has high locality in the L1 and L2 caches and rarely accesses the Last Level Cache, let alone the memory. The result is a benchmark that delivers high IPC: 1.67 on a five year old Core 2 Duo ("Merom"), and close to +/- 1.9 IPC on the latest Intel CPUs. The interesting thing to note is that h264ref and Perlbench are among the top IPC performers in the SPEC CPU2006 suite.

Sjeng (chess) and Gobmk are both Artificial Intelligence subroutines. Again, the IPC is relatively high (>1), but their most important performance characteristic is that they contain a very high percentage of hard to predict branches: twice the average of the SPEC CPU integer suite.

Granted, the evidence we've presented is still circumstantial. It would take an extremely long and intensive profiling session on all new processors to really determine what is going on, and that is beyond our time budget: one SPEC CPU run alone consumes a whole day. However, we did get our hands dirty. A short profiling session on three different benchmarks gives us some very interesting results that we want to discuss next.

Zooming in on SPEC CPU 2006: the Good IPC Analysis


View All Comments

  • thunderising - Wednesday, May 30, 2012 - link

    Glad AMD has "Greater Performance" planned sometime in the future. Wow! Reply
  • themossie - Wednesday, May 30, 2012 - link

    "There are also other factors at play, though, as it's already known that StarCraft II doesn't use more than two cores; theinstead, it's likely the..."

    (feel free to remove comment after fixing this)
  • Nightraptor - Wednesday, May 30, 2012 - link

    I am wondering if it would be possible to compare the processor performance of the Trinity A10 with a underclocked FX-4100 set to the same frequencies (I don't know if it is possible to disable the L3 cache on the FX-4100). This might give us a rough idea of how much the improvements of Piledriver have bought us. Just doing rough math in my head it would seem that they have to be pretty significant given how a FX-4100 compared to the Phenom II X4's (it lost alot, if not most of the time t of the time). The new A10 Trinity's on the other hand seem to win most of the time compared to the old architecture. Given that the A10 is a Piledriver based FX-4XXX series equivalent minus the L3 cache it would seem that Piledriver brought very significant enhancements. Either that or the Phenom II era processors responded much more poorly to the lack of L3 than Piledriver does. Reply
  • coder543 - Wednesday, May 30, 2012 - link

    I was hoping they would be doing the same thing, even though it would be challenging to draw real information out of comparing a desktop processor and a mobile processor. Reply
  • SleepyFE - Wednesday, May 30, 2012 - link

    +1 Reply
  • kyuu - Wednesday, May 30, 2012 - link


    If you can figure out some way to do a comparison and analysis of Piledriver's performance vs. Bulldozer, I think a great many of us are interested to see that. From benchmarks, it seems like Piledriver improved a great deal over Bulldozer, but it's difficult to tell without being able to compare two similar processors.
  • Aone - Wednesday, May 30, 2012 - link

    You can compare A10-4600M or A8-4500M versus mobile Llano or Phenom or even Turion to see tweaked BD is nothing of spectacular. For instance, in most cases A8-4500M (2.3GHz base) loses to Llano A8-3500M (1.5GHz base). Reply
  • Nightraptor - Wednesday, May 30, 2012 - link

    Where are you getting the benchmarks from that a Trinity loses to Llano. In almost all the benchmarks I have been able to find (with the exception of a few) it seems that Trinity beats Llano, hence the original post. If the Piledriver enhancements were very minor I would've expected Trinity (a hacked quad core) to lose to Llano most of the time (a true quad core). This didn't appear to happen - at least not in the anandtech review. Reply
  • Aone - Wednesday, May 30, 2012 - link

    Look at "Show comparison chart". Great info!
  • Nightraptor - Thursday, May 31, 2012 - link

    I'm not a big fan of the reliability of that website - They tend to be pretty scant on the test circumstances and configurations. Furthermore I'm curious where they are getting the informaiton for the A8-4500 as to the best of my knowledge the only Trinity in the wild at the moment is the A10 which AMD sent out in a custom made review laptop. All they list is a "K75D Sample". Reply

Log in

Don't have an account? Sign up now