The Bulldozer Aftermath: Delving Even Deeper
by Johan De Gelas on May 30, 2012 1:15 AM EST
Zooming in on SPEC CPU2006: the Bad
The optimized SPEC CPU2006 int binaries allow gains in the range of 30% to 117%. Unfortunately, the complete benchmark suite shows a gain of only 21% when we compare the Opteron 6276 with the 6176. Closer inspection shows that four benchmarks regress. The regression appears to be small in most benchmarks (7 to 14%), but remember that we have 33% more cores. Even a small regression of 7% means that each core delivers only about 70% of what a core of the previous architecture managed (0.93 / 1.33 ≈ 0.70). In other words, we are losing roughly 30% of the previous architecture's per-core performance!
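To make the per-core arithmetic explicit, here is a minimal sketch that converts a chip-level throughput change into a per-core change. The core counts (12 for the Opteron 6176, 16 for the 6276) are not stated in this section; they are assumed here because they match the "33% more cores" figure. The function name `per_core_ratio` is purely illustrative.

```python
# Convert a chip-level throughput change into a per-core change.
# Assumption: 12 cores (Opteron 6176) vs. 16 cores (Opteron 6276),
# which matches the "33% more cores" figure quoted in the text.
OLD_CORES = 12
NEW_CORES = 16


def per_core_ratio(chip_level_change, old_cores=OLD_CORES, new_cores=NEW_CORES):
    """Return new per-core throughput as a fraction of old per-core throughput.

    chip_level_change is the relative change of the whole chip,
    e.g. -0.07 for a 7% regression or 0.21 for a 21% gain.
    """
    chip_ratio = 1.0 + chip_level_change
    # Spread the chip-level result over the larger number of cores.
    return chip_ratio * old_cores / new_cores


if __name__ == "__main__":
    # The 7% and 14% regressions and the 21% suite-wide gain from the text.
    for change in (-0.07, -0.14, 0.21):
        ratio = per_core_ratio(change)
        print(f"{change:+.0%} at chip level -> {ratio - 1.0:+.1%} per core")
```

Under these assumptions, a 7% chip-level regression works out to a per-core loss of roughly 30%, and even the suite-wide 21% gain still corresponds to a per-core deficit, since 1.21 is smaller than the 1.33 core-count ratio.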
Perlbench has high locality in the L1 and L2 caches and rarely accesses the Last Level Cache, let alone the memory. The result is a benchmark that delivers high IPC: 1.67 on a five-year-old Core 2 Duo ("Merom"), and close to 1.9 on the latest Intel CPUs. Interestingly, h264ref and Perlbench are among the top IPC performers in the SPEC CPU2006 suite.
Sjeng (chess) and Gobmk (Go) are both Artificial Intelligence benchmarks. Again, the IPC is relatively high (>1), but their most important performance characteristic is that they contain a very high percentage of hard-to-predict branches: twice the average of the SPEC CPU2006 integer suite.
Granted, the evidence we've presented is still circumstantial. It would take an extremely long and intensive profiling session on all new processors to really determine what is going on, and that is beyond our time budget: one SPEC CPU run alone consumes a whole day. However, we did get our hands dirty. A short profiling session on three different benchmarks gives us some very interesting results that we want to discuss next.