Nehalem - Everything You Need to Know about Intel's New Architecture
by Anand Lal Shimpi on November 3, 2008 1:00 PM EST
Not Another Conroe
Comparing Conroe to the Pentium 4 was night and day: the former was such a radical departure from the NetBurst micro-architecture that seemingly everything was done differently. The Pentium 4 needed a tremendous amount of software optimization to actually extract performance from the chip. Intel has since learned its lesson and no longer expects the software community to re-compile and re-optimize code for every new architecture. Nehalem had to be fast out of the box, so it was designed that way.
Conroe was the first Intel processor to introduce a 4-issue front end. The processor could decode, rename and retire up to four micro-ops at the same time. Conroe’s width actually went underutilized a great deal of the time, something that Nehalem does address, but fundamentally there was no reason to go wider.
Intel introduced macro-ops fusion in Conroe, a feature where two coupled x86 instructions could be “fused” and treated as one. They would decode, execute and retire as a single instruction instead of two, effectively widening the hardware in certain situations.
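To make the "effectively widening the hardware" point concrete, here is a toy model of fusion in a 4-wide front end. It is only a sketch: the `fuse` and `decode_cycles` helpers and the sets of fusible mnemonics are assumptions chosen for illustration, not Intel's actual decoder rules, and real hardware has many more constraints.

```python
from math import ceil

# Hypothetical fusible pairs, for illustration only: a flag-setting
# compare/test followed immediately by a conditional jump.
FUSIBLE_FIRST = {"cmp", "test"}
FUSIBLE_SECOND = {"je", "jne", "jz", "jnz"}  # assumed subset


def fuse(instructions):
    """Group an instruction stream into decode slots.

    An adjacent (compare, conditional jump) pair shares one slot;
    every other instruction takes a slot of its own.
    """
    slots = []
    i = 0
    while i < len(instructions):
        cur = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if cur in FUSIBLE_FIRST and nxt in FUSIBLE_SECOND:
            slots.append((cur, nxt))  # the pair is treated as one op
            i += 2
        else:
            slots.append((cur,))
            i += 1
    return slots


def decode_cycles(slot_count, width=4):
    """Cycles needed to push slot_count ops through a width-wide front end."""
    return ceil(slot_count / width)


# A five-instruction loop body ending in a compare-and-branch.
loop_body = ["mov", "add", "inc", "cmp", "jne"]
fused = fuse(loop_body)

print(len(loop_body), len(fused))  # 5 4
print(decode_cycles(len(loop_body)), decode_cycles(len(fused)))  # 2 1
```

In this simplified model the unfused loop body needs two passes through the 4-wide front end, while the fused version fits in one. That is the sense in which fusion makes the machine behave as if it were wider.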
Nehalem adds more instruction pairs that can be fused together, on top of all of the cases supported in existing Core 2 chips.
The other macro-ops fusion enhancement is that 64-bit instructions can now be fused together, whereas in the past only 32-bit instructions could be. The gain is slight, but 64-bit code could see a performance improvement on Nehalem.