Intel's Haswell Architecture Analyzed: Building a New PC and a New Intel
by Anand Lal Shimpi on October 5, 2012 2:45 AM EST
Intel has held the single-threaded performance crown for years now, and the reason is quite easy to understand: it has prioritized extracting instruction-level parallelism with every generation. Couple that with the fact that every two years we see a "new" microprocessor architecture from Intel and there's a recipe for some good old evolutionary gains. The table below shows the increase in size of some major data structures inside Intel's architectures for every tock since Conroe:
|Intel Core Architecture Buffer Sizes|
| |Conroe|Nehalem|Sandy Bridge|Haswell|
|Integer Register File|N/A|N/A|160|168|
|FP Register File|N/A|N/A|144|168|
Increasing the out-of-order (OoO) window allows the execution units to extract more parallelism and thus improve single-threaded performance. Each generation, Intel is simply dedicating additional transistors to increasing these structures and thus better feeding the beast.
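To make the window argument concrete, here is a deliberately simplified toy model, not a description of how Haswell or any real core is built. It assumes unlimited execution units, in-order retirement, and made-up latencies (a 20-cycle load followed by seven independent 1-cycle operations per block). The `simulate` function and the `program` layout are hypothetical names chosen for illustration.

```python
def simulate(instrs, window):
    """Toy out-of-order core: returns the cycles needed to retire all instructions.

    instrs: list of (latency, dep) tuples in program order, where dep is the
            index of an earlier instruction whose result is needed, or None.
    window: how many of the oldest unretired instructions the core can see.
    Assumptions (not from the article): unlimited execution units,
    in-order retirement, latency >= 1 for every instruction.
    """
    n = len(instrs)
    done_at = [None] * n  # cycle at which each result becomes available
    head = 0              # index of the oldest unretired instruction
    cycle = 0
    while head < n:
        # Start every not-yet-started instruction in the window whose input is ready.
        for i in range(head, min(head + window, n)):
            if done_at[i] is None:
                latency, dep = instrs[i]
                ready = dep is None or (done_at[dep] is not None and done_at[dep] <= cycle)
                if ready:
                    done_at[i] = cycle + latency
        cycle += 1
        # Retire strictly in program order; a slow load at the head blocks the window.
        while head < n and done_at[head] is not None and done_at[head] <= cycle:
            head += 1
    return cycle


# Four blocks, each one long-latency load plus seven independent short ops.
block = [(20, None)] + [(1, None)] * 7
program = block * 4

for size in (8, 16, 32):
    print(f"window={size:2d}: {simulate(program, size)} cycles")
```

In this sketch a window of 8 can only see one load at a time, so the four loads are serviced back to back. Doubling the window lets two loads overlap, and a window of 32 overlaps all four. Real cores are limited by execution ports, memory bandwidth, and branch prediction as well, so the scaling is far less clean in practice.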
This isn't rocket science, but it is enabled by Intel's clockwork fab execution. Designers can count on another 30% die area to work with every two years, so every two years they increase the size of these structures without worrying about ballooning the die. The beauty of evolutionary improvements like this is that when viewed over the long term they look downright revolutionary. Comparing Haswell to Conroe, the OoO scheduling window has doubled, despite generation-to-generation gains of only 14 to 33%.
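The compounding is easy to check with a few lines of arithmetic. The sketch below assumes the commonly cited OoO window (reorder buffer) sizes of 96, 128, 168, and 192 entries for Conroe, Nehalem, Sandy Bridge, and Haswell. Those figures are not listed in the table above, so treat them as an assumption; `ROB_ENTRIES` and the helper names are made up for this example.

```python
# OoO window (reorder buffer) entries per tock.
# Assumed values: commonly cited figures, not taken from the table above.
ROB_ENTRIES = {
    "Conroe": 96,
    "Nehalem": 128,
    "Sandy Bridge": 168,
    "Haswell": 192,
}


def generational_gains(sizes):
    """Fractional growth between each consecutive pair of generations."""
    names = list(sizes)
    return {
        f"{prev} -> {cur}": sizes[cur] / sizes[prev] - 1.0
        for prev, cur in zip(names, names[1:])
    }


def cumulative_growth(sizes):
    """Ratio of the newest generation's size to the oldest one's."""
    values = list(sizes.values())
    return values[-1] / values[0]


gains = generational_gains(ROB_ENTRIES)
for step, gain in gains.items():
    print(f"{step}: +{gain:.0%}")
print(f"Conroe -> Haswell: {cumulative_growth(ROB_ENTRIES):.1f}x")
```

Under these assumed sizes the three steps come out to roughly 33%, 31%, and 14%, and multiplying them together (1.33 × 1.31 × 1.14) gives the 2x overall growth quoted in the text.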