Nehalem - Everything You Need to Know about Intel's New Architecture
by Anand Lal Shimpi on November 3, 2008 1:00 PM EST
Understanding Nehalem’s Server Focus (and Branch Predictors)
I’ve talked about these changes before, so I won’t go into great detail here, but Nehalem makes some moderate improvements to Intel’s already very strong branch predictors.
The processor now has a second-level branch predictor that is slower, but looks at a much larger history of branches and whether or not they were taken. The inclusion of the L2 branch predictor enables applications with very large code sizes (Intel gave the example of database applications) to enjoy improved branch prediction accuracy.
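To see why a bigger prediction structure helps large code footprints, here is a toy sketch in Python. It is not Intel's design: Intel hasn't disclosed table sizes or how the two levels hand off to each other. It only models the capacity effect, using a table of 2-bit saturating counters indexed by a hypothetical branch ID. The entry counts (64 and 8192) and the synthetic workload of 4096 branches are my own assumptions.

```python
class CounterTable:
    """Direct-mapped table of 2-bit saturating counters (toy model)."""

    def __init__(self, entries):
        self.entries = entries
        # Counter values 0..3; start at 1 ("weakly not taken").
        self.counters = [1] * entries

    def predict(self, pc):
        # Predict taken when the counter is in the upper half.
        return self.counters[pc % self.entries] >= 2

    def update(self, pc, taken):
        i = pc % self.entries
        step = 1 if taken else -1
        self.counters[i] = max(0, min(3, self.counters[i] + step))


def accuracy(table, branches, passes=20):
    """Replay the branch list several times and return the hit rate."""
    correct = 0
    total = 0
    for _ in range(passes):
        for pc, taken in branches:
            if table.predict(pc) == taken:
                correct += 1
            total += 1
            table.update(pc, taken)
    return correct / total


# Hypothetical "large code size" workload: 4096 static branches,
# every third one always taken, the rest never taken.
branches = [(pc, pc % 3 == 0) for pc in range(4096)]

small_acc = accuracy(CounterTable(64), branches)    # branches collide and share counters
large_acc = accuracy(CounterTable(8192), branches)  # one counter per branch
print(f"small table: {small_acc:.3f}, large table: {large_acc:.3f}")
```

In the small table, dozens of unrelated branches fight over each counter, so the always-taken branches keep getting mispredicted. The large table gives every branch its own entry and only misses while warming up. A real second-level predictor trades that extra capacity for latency, which this sketch doesn't attempt to model.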
The renamed return stack buffer is also a very important enhancement to Nehalem. Mispredicts in the pipeline can result in incorrect data being populated into Penryn's return stack (a data structure that keeps track of where in memory the CPU should resume executing after it returns from a function). A return stack with renaming support prevents corruption in the stack, so as long as the calls/returns are properly paired, you'll always get the right data out of Nehalem's stack even in the event of a mispredict.
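A small Python sketch shows the failure mode. The hardware mechanism isn't described here, so the snapshot/restore pair below is just a stand-in I'm assuming for whatever renaming does internally. The addresses (0x100, 0x200, 0x999) and all names are hypothetical.

```python
class ReturnStack:
    """Toy return stack: calls push a return address, returns pop one."""

    def __init__(self):
        self.entries = []

    def call(self, return_addr):
        self.entries.append(return_addr)

    def ret(self):
        return self.entries.pop() if self.entries else None

    # Assumed stand-in for renaming: save and restore the stack contents.
    def checkpoint(self):
        return list(self.entries)

    def restore(self, snapshot):
        self.entries = list(snapshot)


def predicted_return_after_mispredict(recover):
    rsb = ReturnStack()
    rsb.call(0x100)  # real call: main -> f
    rsb.call(0x200)  # real call: f -> g

    snapshot = rsb.checkpoint()  # taken just before a predicted branch

    # Wrong-path (speculative) work: a return and a call that never
    # actually happen architecturally.
    rsb.ret()
    rsb.call(0x999)

    # The branch turns out to be mispredicted; the wrong-path work is squashed.
    if recover:
        rsb.restore(snapshot)

    # The real return from g now asks the stack for its target.
    return rsb.ret()


print(hex(predicted_return_after_mispredict(recover=False)))
print(hex(predicted_return_after_mispredict(recover=True)))
```

Without recovery the stack hands back the bogus wrong-path address (0x999), which costs yet another mispredict on the return. With recovery the real return to 0x200 is predicted correctly.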
The targeted applications here are very important: Nehalem is designed to fix Intel’s remaining shortcomings in the server space. Our own Johan de Gelas has been talking about Intel not being as competitive in the server market as on the desktop for quite some time now. He even published a very telling article on Nehalem’s server focus before IDF started. While many of Nehalem’s improvements directly impact the desktop market, it was servers that motivated its design.
This is an important thing to realize because the whole architecture that Nehalem and its predecessors came from started on the mobile side of the business with Banias/Pentium-M and Centrino. We may have just come full circle with Nehalem, where we once again have the server market driving the microprocessor design for the desktop and mobile chips as well.
The key distinction here, and what will hopefully prevent Nehalem’s successors from turning into Pentium 4 redux, is Intel’s performance/power golden rule. Nehalem and Atom were both designed, for the first time in Intel's history, with one major rule on performance versus power: for every feature proposed for Nehalem (and Atom), each 1% increase in power consumption had to provide a corresponding 2% or greater increase in performance. If the feature couldn’t equal or beat this ratio, it wasn’t added, regardless of how desirable it was.
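Written out as arithmetic, the rule is a simple inequality. The function name and the sample numbers below are mine, purely for illustration. The rule as described only talks about features that add power, so what the check returns for a zero or negative power cost is just what the inequality happens to give, not something Intel specified.

```python
def passes_golden_rule(perf_gain_pct, power_cost_pct, ratio=2.0):
    """True if the performance gain is at least `ratio` times the power cost."""
    return perf_gain_pct >= ratio * power_cost_pct


# Hypothetical feature proposals: (performance gain %, power increase %).
print(passes_golden_rule(3.0, 1.0))  # 3% performance for 1% power
print(passes_golden_rule(1.5, 1.0))  # 1.5% performance for 1% power
print(passes_golden_rule(5.0, 3.0))  # 5% performance for 3% power
```

The first proposal clears the bar. The other two don't, even though the third offers the largest raw performance gain: it would need at least 6% to justify a 3% power increase.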