Immediately following the slightly disappointing keynote, Intel revealed a few more details about its next-generation microprocessor architecture. The new architecture doesn't have a name yet, but we now know some of its features.

Intel has said that the next-gen microarchitecture will be a unified architecture, combining the lessons learned from the Pentium 4's NetBurst and the Pentium M's Banias architectures. To put it bluntly, the next-generation microprocessor architecture borrows the FSB and 64-bit capabilities of NetBurst and combines them with the power-saving features of the Pentium M platform. Features like virtualization and security will also be a part of the new architecture.

Contrary to wild speculation, Intel's new architecture will continue to feature an out-of-order execution core, a direct descendant of its Pentium M and Pentium 4 predecessors. The core will be a wider 4-issue design (4-issue decode, execute and retire) with deeper buffers, presumably keeping more instructions in flight than the Pentium 4 courtesy of the wider core.

The basic integer pipeline appears to be 14 stages long, making it a significant decrease from the 31+ stage pipeline in Prescott and a slight increase from the 12 stage pipeline in the Athlon 64. Intel's move to a much shorter pipeline will definitely decrease power consumption (as well as clock speed), but hopefully improve performance considerably.
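
To see why a shorter pipeline can help per-clock performance, here is a rough back-of-the-envelope model. It assumes that a mispredicted branch costs roughly one pipeline's worth of cycles and that all three designs share the same base cycles per instruction; the branch frequency and misprediction rate are illustrative placeholders, not measured figures for any of these chips.

```python
def effective_cpi(base_cpi, pipeline_stages, branch_fraction, mispredict_rate):
    """Average cycles per instruction once misprediction stalls are included.

    Assumption: each mispredicted branch flushes the pipeline and costs
    about `pipeline_stages` cycles. Real penalties differ per design.
    """
    penalty_cycles = pipeline_stages
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Illustrative inputs only (not measurements): 20% of instructions are
# branches, and 5% of those branches are mispredicted.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

# Pipeline depths are the ones quoted in the article.
DESIGNS = [
    ("31-stage (Prescott)", 31),
    ("14-stage (next-gen)", 14),
    ("12-stage (Athlon 64)", 12),
]

for name, stages in DESIGNS:
    cpi = effective_cpi(1.0, stages, BRANCH_FRACTION, MISPREDICT_RATE)
    print(f"{name}: {cpi:.2f} cycles per instruction")
```

Under these assumed inputs the deeper pipeline loses more cycles per instruction to flushes, which is the cost it has to win back through clock speed.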

Note that with a 4-issue core, the new processors will actually have a higher degree of ILP than AMD's Athlon 64, and with a slightly deeper pipeline the CPU should be able to reach higher clock speeds than what AMD has been able to achieve. We'd expect that at 65nm these new cores could run as high as 3GHz in clock speed, but definitely not at the 4GHz+ levels that we currently have with the Pentium 4.

Given the significant reduction in pipeline stages, Intel's claim of a 5x improvement in performance per watt over the Pentium 4 architecture seems very realistic.
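
It is worth spelling out what a performance-per-watt figure implies for raw speed, since the two are linked by power draw. The sketch below shows only the arithmetic; the wattages are hypothetical inputs, as Intel has not tied the 5x claim to specific power figures here.

```python
def speedup(perf_per_watt_ratio, new_watts, old_watts):
    """Raw performance ratio implied by a performance-per-watt ratio.

    performance = (performance per watt) * watts, so
    new_perf / old_perf = perf_per_watt_ratio * (new_watts / old_watts).
    """
    return perf_per_watt_ratio * (new_watts / old_watts)


# Hypothetical power figures, chosen only to show the shape of the math.
print(speedup(5.0, new_watts=65.0, old_watts=130.0))   # half the power: 2.5x
print(speedup(5.0, new_watts=130.0, old_watts=130.0))  # same power: 5.0x
```

In other words, a 5x gain in performance per watt only becomes a 5x gain in speed if the new chip draws as much power as the old one.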

The new architecture will feature a shared L2 cache between the cores, much like what we've seen from Yonah already. Intel also said that there would be a higher "relative" increase in L2 cache bandwidth. The new processors will also apparently feature a direct L1-to-L1 cache transfer system in order to improve the currently very poor cache-to-cache transfer performance of Intel's dual core processors.

There are also a number of new prefetching algorithms, allowing data to be prefetched from L1 to L1 (one core to another), L1 to L2, etc. Intel is also introducing speculative data loads with the new architecture, which allow loads to be executed ahead of stores if no dependency is predicted to exist between the two. We are waiting for more details before we can be exact about how the feature works.
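
Until those details arrive, the general idea of speculative loads can be illustrated with a toy model. This is a generic sketch of predicting load/store independence, not Intel's mechanism; the function name, the per-load predictor set and the replay bookkeeping are all assumptions made for illustration.

```python
def execute_pair(memory, store, load, conflicted_loads, stats):
    """Run one store followed (in program order) by one load.

    store: (address, value); load: (load_id, address).
    conflicted_loads is a toy predictor: the set of loads that were
    previously caught depending on the store ahead of them.
    """
    store_addr, store_value = store
    load_id, load_addr = load

    if load_id in conflicted_loads:
        # Predicted dependent: wait for the store, then load.
        memory[store_addr] = store_value
        return memory.get(load_addr, 0)

    # Predicted independent: the load runs ahead of the store.
    speculative_value = memory.get(load_addr, 0)
    stats["hoisted"] += 1
    memory[store_addr] = store_value

    if store_addr == load_addr:
        # Prediction was wrong: the load read stale data, so replay it
        # and remember this load as one that conflicts.
        stats["replayed"] += 1
        conflicted_loads.add(load_id)
        return memory[load_addr]
    return speculative_value


memory = {}
conflicted_loads = set()
stats = {"hoisted": 0, "replayed": 0}
results_b = []

# Load "A" never touches the address its preceding store writes;
# load "B" always reads the address that was just stored to.
for i in range(3):
    execute_pair(memory, (100, i), ("A", 200), conflicted_loads, stats)
    results_b.append(execute_pair(memory, (300, i), ("B", 300), conflicted_loads, stats))

print(stats, results_b)
```

In this model, load "A" is hoisted every time at no cost, while load "B" is caught once, replayed, and then made to wait on later iterations, so the results stay correct either way.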

Both Conroe and Merom (desktop and mobile) will feature 2 cores. Intel says that Conroe will be available in multiple L2 cache sizes, while Merom will not. We'd assume that the multiple L2 cache sizes would be to accommodate and differentiate products like the Extreme Edition.

On the server side, the first next-gen architecture CPU will be the dual core Woodcrest, followed by the quad-core Whitefield processor.

More info as we get it.

26 Comments

  • IntelUser2000 - Wednesday, August 24, 2005

    quote:

    Where does it say they lack an on-die memory controller? Methinks direct L1-to-L1 transfer and unified L2 cache means something 'intelligent' sits between the cores?


    Well, that 'intelligent' thing is called the Arbitration Logic, which acts to manage data between two cores.
  • haelduksf - Tuesday, August 23, 2005

    If they want 5x the performance/watt, and these things are going to be about 65W, that means they will be 2-3x faster. Exciting stuff.
  • Den - Tuesday, August 23, 2005

    Well, if you compare new dual core to old single core, the speed increase is more like 20-50% depending on which old CPU you look at...
  • Furen - Tuesday, August 23, 2005

    Sounds like BS to me. If they were truly 5x faster per watt we'd be seeing benchmarks left and right. CPUs are different from video cards. You don't see performance increases of that magnitude.
  • neogodless - Tuesday, August 23, 2005

    Integer Performance...

    Which doesn't really mean overall system performance and certainly not real world performance...

    Let us not forget the "amazing" integer performance of the upcoming game consoles... which does not equate to amazing real world physics...
  • Furen - Tuesday, August 23, 2005

    How is vector math going to be handled by the conroe? Is it going to be done like it was done on the p4 (using dedicated vector hardware) or like AMD does it (using the x87 fpu)?
  • bart - Tuesday, August 23, 2005

    The basic integer pipeline appears to be 14 stages long, making it a significant decrease from the 31+ stage pipeline in Prescott and the 12 stage pipeline in the Athlon 64.

    Is 14 less than 12???
  • IntelUser2000 - Wednesday, August 24, 2005

    quote:

    The basic integer pipeline appears to be 14 stages long, making it a significant decrease from the 31+ stage pipeline in Prescott and the 12 stage pipeline in the Athlon 64.


    NO you are wrong it says this: The basic integer pipeline appears to be 14 stages long, making it a significant decrease from the 31+ stage pipeline in Prescott and a slight increase from the 12 stage pipeline in the Athlon 64.
  • mino - Wednesday, August 24, 2005

    Heh, you forgot that most writers at AT have tendency to correct mistakes as soon as they surface... I like this approach.
  • neogodless - Tuesday, August 23, 2005

    Heh yup... I was wondering about the wording chosen for that too!
