Single Cycle SSE and Macro-Fusion in Core

Wasting no time, Justin Rattner jumped straight into the five key innovations in Intel's new Core micro-architecture.

The 4-issue core and 14-stage pipeline were both disclosed at the last IDF, and we already knew that the Pentium M's micro-ops fusion would make its way into Conroe. What's new here is support for macro-fusion. Micro-ops fusion allows decoded instructions to be sent down the pipe together (as "fused" instructions), whereas macro-fusion allows x86 instructions to be fused together before the decode stage and sent down as a single instruction. The example Rattner gave was that a Compare and a Jump instruction now become a single instruction in the pipeline thanks to macro-fusion.
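To make the Compare and Jump example concrete, here is a minimal toy model of the idea, not a description of Intel's actual decoder. It walks a list of instruction mnemonics and merges each compare that is immediately followed by a conditional jump into one fused entry. The function name `fuse_macro_ops`, the mnemonic sets, and the sample loop body are all hypothetical and chosen only for illustration; the real hardware conditions for fusion are not covered here.

```python
# Toy model of macro-fusion: an adjacent compare + conditional-jump pair
# is sent down the pipeline as a single fused instruction.
# The mnemonic sets below are illustrative assumptions, not Intel's rules.
COMPARES = {"cmp"}
COND_JUMPS = {"je", "jne", "jl", "jle", "jg", "jge"}


def fuse_macro_ops(instructions):
    """Return the instruction stream after fusing adjacent cmp+jcc pairs."""
    fused = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if current in COMPARES and following in COND_JUMPS:
            # Two x86 instructions become one entry in the pipeline.
            fused.append(f"{current}+{following}")
            i += 2
        else:
            fused.append(current)
            i += 1
    return fused


# Hypothetical loop body: four x86 instructions, three pipeline entries.
loop_body = ["mov", "add", "cmp", "jne"]
print(fuse_macro_ops(loop_body))  # ['mov', 'add', 'cmp+jne']
```

In this sketch the pair only fuses when the two instructions are back to back; a compare separated from its jump by another instruction is left alone.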

The next major feature of the Core micro-architecture is that all 128-bit SSE instructions will now execute in a single cycle. We've confirmed that this applies to all SSEn instructions (SSE1/SSE2/SSE3). Updated: Intel clarified the single-cycle SSE item for us. The throughput (not latency) of all SSE instructions is now 1 cycle, whereas in the past it was generally 2 cycles. The doubled throughput should result in some pretty hefty performance gains in applications that make extensive use of SSE, such as SSE-optimized encoding applications.
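As a rough back-of-the-envelope sketch of what the throughput change means, the snippet below compares the cycles needed to issue a stream of independent 128-bit SSE instructions at a 2-cycle versus a 1-cycle throughput. The instruction count is a hypothetical figure, and the model deliberately ignores latency, memory stalls, and every other bottleneck, so the result is an upper bound on the gain rather than an expected benchmark number.

```python
def issue_cycles(num_ops, cycles_per_op):
    """Cycles to issue num_ops independent instructions at a given throughput."""
    return num_ops * cycles_per_op


num_sse_ops = 1000  # hypothetical count of independent 128-bit SSE instructions

old_cycles = issue_cycles(num_sse_ops, 2)  # 2-cycle throughput (previous designs)
new_cycles = issue_cycles(num_sse_ops, 1)  # 1-cycle throughput (Core)

print(old_cycles, new_cycles, old_cycles / new_cycles)  # 2000 1000 2.0
```

Real applications mix SSE with other work, so the overall speedup would be smaller than the 2x this idealized stream shows.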

  • Brunnis - Tuesday, March 7, 2006

    So, Conroe won't be that cool running then. Not exactly shocking, though. A 40% reduction in power consumption versus the P-D 950 makes Conroe land at like 70-75W, correct? I guess it's quite impressive if the performance claims hold true.
  • JackPack - Tuesday, March 7, 2006

    65/80/95W - desktop, server, XE.
  • BigLan - Tuesday, March 7, 2006

    FYI - Most of the images are showing up as red x's here. Not sure if the server is getting hammered or if they're just broken though.
  • creathir - Tuesday, March 7, 2006

    Looks like the Conroe should be quite a bit more powerful (according to Intel) than their current top of the line. I wonder what this will do to the AMD v. Intel debate... I suppose just up the ante.

    - Creathir
  • creathir - Tuesday, March 7, 2006

    Also, the single-cycle execution of the SSE extensions should really increase the benefits gained from SSE applications. This will do wonders for video encoding.
    - Creathir

