Oooh, Shiny - But Why?
Remember this slide?
How about this one?
I referenced both in the Core i7 review, alluding to the possibility that those fundamental design changes would give the Core i7 much better power efficiency than Core 2. However, in speaking to Intel's Nehalem architects and power engineers, I came to realize that those design changes alone aren't responsible for the sorts of power efficiency gains I showed on the previous page. If you treat maximum power consumption as a hard limit, for example the 130W TDP, Nehalem's designers had to somehow - without the benefits of a die shrink - improve performance without increasing power.
Since Core i7 is a "tock" processor, you get the new architecture but not the benefits of Moore's Law: it's still a 45nm chip. With no help from the manufacturing process, Nehalem's architects had to find ways to save power and then spend the savings on improving performance. Switching to an all static CMOS design and a more power efficient cache are two examples of how the Nehalem architects won themselves a bigger power budget without increasing the total TDP of the chip. The architects then promptly spent their power savings on more performance; since the market has already accepted a 130W TDP part, simply delivering lower power with no additional performance wouldn't make any sense. It's because of this that we're able to see these 20 - 60% increases in performance without correspondingly large increases in power consumption.
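To put that trade in numbers, here is a minimal sketch of how a performance gain at roughly constant power translates into performance per watt. The 20% and 60% figures come from the text above; the function name and the 0% and +5% power changes are illustrative assumptions, not measurements.

```python
def efficiency_gain(perf_gain: float, power_change: float = 0.0) -> float:
    """Relative change in performance per watt.

    Both arguments are fractions: 0.20 means +20%.
    """
    if power_change <= -1.0:
        raise ValueError("power_change must be greater than -1.0")
    return (1.0 + perf_gain) / (1.0 + power_change) - 1.0


# Performance gains of 20% and 60% are quoted in the article.
# The power changes (flat, and a hypothetical +5%) are assumptions.
for perf in (0.20, 0.60):
    for power in (0.0, 0.05):
        gain = efficiency_gain(perf, power)
        print(f"perf +{perf:.0%}, power +{power:.0%} -> perf/W {gain:+.1%}")
```

When power stays flat, the efficiency gain equals the performance gain, which is why spending the saved watts on performance still shows up as a better power story.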
So why, then, is the Core i7-965 so much more power efficient than the QX9770? The answer actually boils down to the architecture-level decisions made in Nehalem. Remember the power gate transistors?
With these transistors Intel can effectively shut off an entire core if it is idle, cutting it off completely from being a power drain. At the same TDP, for applications that don't use all four cores, Intel's Core i7 should draw less power than any Core 2 Duo before it, and we see this in the single-threaded Cinebench test as well as the gaming tests:
|CPU||Intel Core 2 Extreme QX9770 (3.2GHz)||Intel Core i7-965 (3.2GHz)|
Cinebench (1 thread)
Age of Conan
Race Driver GRID
The Cinebench test is single-threaded, so only one core is active at any time, and only a few of the gaming tests can keep all four cores busy. With the idle cores power gated, the Core i7 can be far more power efficient than Intel's Core 2 Extreme QX9770 in these tests.
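The effect of power gating on lightly threaded workloads can be sketched with a toy model. Every wattage below is a hypothetical placeholder chosen only to show the shape of the result, and treating a gated core as drawing exactly zero is a simplification; none of this is Intel data or a measurement from this review.

```python
def package_power(active_cores: int,
                  total_cores: int = 4,
                  core_active_w: float = 20.0,    # hypothetical
                  core_idle_leak_w: float = 4.0,  # hypothetical
                  uncore_w: float = 15.0,         # hypothetical
                  power_gated: bool = True) -> float:
    """Toy estimate of CPU package power in watts."""
    if not 0 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 0 and total_cores")
    idle_cores = total_cores - active_cores
    # Simplification: a gated idle core draws nothing, while an
    # ungated idle core keeps leaking core_idle_leak_w.
    idle_w = 0.0 if power_gated else core_idle_leak_w
    return uncore_w + active_cores * core_active_w + idle_cores * idle_w


for n in range(1, 5):
    gated = package_power(n, power_gated=True)
    ungated = package_power(n, power_gated=False)
    print(f"{n} active core(s): gated {gated:.0f} W, ungated {ungated:.0f} W")
```

In this model the gap is largest with one active core and disappears once all four are busy, which matches the pattern described here: big savings in single-threaded tests, roughly equal power when every core is loaded.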
But what about the multi-threaded tests (or gaming tests like Far Cry 2 that actually stress all four cores)? Here, at worst, the Core i7 draws about the same amount of power as the Core 2 despite offering much better performance. In these situations a combination of things benefits Nehalem. The memory controller is on-die and built on a 45nm process, instead of 90nm like on the QX9770's X48 chipset, which gives Nehalem an edge. The transistor design decisions, while mostly spent on increasing performance, can have an impact on power consumption here as well. Nehalem also has fewer transistors and a smaller cache, the majority of which runs slower than the cache in Penryn.
The sum of all of this is that at the same TDP value, with fewer than four cores fully active, Intel's Core i7 is capable of drawing a good 10 - 20% less total system power than the previous generation 45nm Core 2. With all cores pegged at 100%, the Core i7 tends to draw the same amount of power or a bit more, but performance is improved significantly in those cases thanks to Hyper-Threading.
It's interesting but not surprising that the Core i7's power story mimics its performance one: well-threaded applications show huge improvements in power efficiency, but the unexpected benefit is that not-so-well-threaded applications can also showcase Core i7's more efficient power usage.