Core Duo - A High Level Architectural Overview

We have talked extensively about Intel’s Core Duo processor since back when it was called Yonah. While we would have liked to bring you an extremely detailed report on exactly what was done architecturally in Yonah to make it what it is today, the fact of the matter is that Intel just isn’t very forthcoming with this sort of information.

Intel has been very protective of its Centrino processor architectures ever since the platform’s introduction.  We’ve always been given bits and pieces of information, but never the full disclosure we’ve hoped for.  Even to this day Intel has not disclosed the exact number of pipeline stages in the Pentium M or Core Duo processors.  Intel was most forthcoming with its first Pentium M processor, Banias.  With every successor, the flow of information became far more marketing and far less technical.  We do hope that at some point this won’t be the case, but until then we will have to do the best with what we’re given.  What follows is a brief architectural overview of Intel’s Core Duo to help you understand where some of the performance advantages and, more importantly, the improvements in power consumption come from at an architectural level.

Intel’s Smart Cache

The very first thing you can take for granted about the Core Duo processor is that all of the advancements in performance and power efficiency seen in previous Pentium M processors are rolled into the Core Duo.  That means the power saving cache in Banias and Dothan is still here today, as is the unique method of architecting the CPU for a fixed clock frequency rather than maximum performance.

If you take for granted the years of work that Intel’s Israel team put into Banias and Dothan (which I’m sure they would love for you to do), it makes understanding the improvements of Core Duo all that much easier. 

While it’s easy to assume that the biggest change in Yonah is the fact that it is a dual core processor, the largest impacts actually come from how those two cores interact and not the fact that there are two of them to begin with.  One such example is what Intel is calling their “Smart Cache”, which is just a really terrible way of saying that the two cores share a common 2MB L2 cache. 

In AMD’s Athlon 64 X2 design there is a System Request Queue (SRQ) that queues up memory requests from each core and sends the requests off to either main memory or one of the on-die caches. 


How core-to-core communication works on the Athlon 64 X2  

The beauty of the SRQ is that the two cores can communicate with one another at core speed.  This is in stark contrast to the way the Pentium D works, where all cache-to-cache requests must actually leave one core and go out onto the external bus before finally being sent to the other core.


How core-to-core communication works on the Pentium D  

Why do you need to communicate between two cores?  If both cores are working on the same data, they must be in communication to determine which cache actually contains the latest and most correct copy of the data before using it (either for further operations or for writing it to main memory).  In order to ensure that only one core has the “correct” copy of the data, the individual cores may put requests on the bus (or SRQ in the case of the Athlon 64 X2) asking to invalidate the copy of the data that should not be used.  This sort of traffic does take up bus bandwidth and is much better handled on-die than over a higher latency external bus. 
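To make the invalidation traffic described above a little more concrete, here is a minimal sketch of a write-invalidate scheme. It is not Intel's or AMD's actual protocol: the `Core` and `Bus` classes, the write-through policy, and the addresses are all assumptions chosen to keep the illustration small. The `Bus` object stands in for either the external bus or the SRQ, and simply counts how many invalidate messages it has to deliver.

```python
class Core:
    """One core with a private cache mapping address -> value (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}


class Bus:
    """Stands in for the external bus (or the SRQ); counts invalidate messages."""

    def __init__(self, memory):
        self.memory = memory
        self.cores = []
        self.invalidations = 0

    def attach(self, core):
        self.cores.append(core)

    def read(self, core, addr):
        # On a miss, fetch the line from main memory into this core's cache.
        if addr not in core.cache:
            core.cache[addr] = self.memory[addr]
        return core.cache[addr]

    def write(self, core, addr, value):
        # Before writing, tell every other core to drop its stale copy.
        for other in self.cores:
            if other is not core and addr in other.cache:
                del other.cache[addr]
                self.invalidations += 1
        core.cache[addr] = value
        # Write-through is an assumption that keeps the sketch short.
        self.memory[addr] = value


memory = {0x10: 1}
bus = Bus(memory)
core0, core1 = Core("core0"), Core("core1")
bus.attach(core0)
bus.attach(core1)

bus.read(core0, 0x10)            # both cores now hold a copy of the line
bus.read(core1, 0x10)
bus.write(core0, 0x10, 2)        # core1's copy must be invalidated
value_seen_by_core1 = bus.read(core1, 0x10)
print(value_seen_by_core1, bus.invalidations)
```

Every message counted in `invalidations` is a trip across whatever connects the two cores, which is why the latency of that connection matters.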

With a shared L2 cache, core-to-core communication can happen much faster than on the Pentium D, since the cache runs at core clock speed - making the Core Duo a lot more like the Athlon 64 X2 in that regard.  It still lacks an on-die memory controller, but communication between the two cores is improved.  It is worth noting that even when comparing AMD’s Athlon 64 X2 with its SRQ to Intel’s Pentium D, which lacked any low latency core-to-core communication, the real world impacts in desktop applications were tough to find.  That being said, we would rather have the benefit on paper and have it hard to prove in the real world than not have it at all.


Core-to-core communication on the Core Duo

The other benefit of Intel’s Smart Cache is that it can be dynamically resized depending on the needs of the individual cores.  So if one core is running idle, the other core can get full access to the 2MB L2 cache.  If both are active, they are able to split the 2MB of cache depending on their needs, which means that as long as neither core would benefit from a full 2MB cache of its own, the overall efficiency of the chip is better than that of a similar design with two separate 2MB caches.
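A toy simulation can illustrate why letting the cores divide one cache according to need helps. Note that the framing here is slightly different from the comparison above: it pits one shared cache against two fixed private caches of the same total capacity. Everything in it is an assumption made for illustration: the `LRUCache` class, the LRU replacement policy, the four-line capacity, and the access trace have nothing to do with the real 2MB cache or its actual replacement logic.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache that counts misses (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, tag):
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return
        self.misses += 1
        self.lines[tag] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used line


def run(trace, cache_for_core):
    """Replay (core, line) accesses and return total misses over distinct caches."""
    for core, line in trace:
        cache_for_core[core].access((core, line))
    return sum(cache.misses for cache in set(cache_for_core.values()))


# Core 0 loops over three lines while core 1 keeps touching a single line.
trace = []
for _ in range(10):
    for line in range(3):
        trace.append((0, line))
        trace.append((1, 0))

# Two private caches of 2 lines each versus one shared cache of 4 lines.
split_misses = run(trace, {0: LRUCache(2), 1: LRUCache(2)})
shared = LRUCache(4)
shared_misses = run(trace, {0: shared, 1: shared})
print(split_misses, shared_misses)
```

In the split case, core 0's three-line working set thrashes its two-line cache while core 1's second line sits unused. In the shared case, the busy core simply takes the space the other core does not need.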

As the size of the L2 cache is changed, its usage is also monitored.  If it is determined that the cache can safely be flushed to main memory and powered down, the cache controller will do so to keep CPU power consumption down.  The idea here is that refreshing main memory will eat up less power than keeping the on-die cache running and active.
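The flush-then-power-down sequence can be sketched as follows. Intel has not disclosed how the cache controller decides that a flush is safe, so this sketch leaves that decision to the caller. The `PowerManagedCache` class, its method names, and the dirty-bit bookkeeping are all hypothetical.

```python
class PowerManagedCache:
    """Write-back cache that can be flushed to memory and switched off (hypothetical)."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}          # addr -> (value, dirty flag)
        self.powered = True

    def write(self, addr, value):
        # Any access wakes the cache back up; the line is now newer than memory.
        self.powered = True
        self.lines[addr] = (value, True)

    def flush_and_power_down(self):
        # Write modified lines back to main memory so nothing is lost,
        # then drop the contents and cut power to the array.
        for addr, (value, dirty) in self.lines.items():
            if dirty:
                self.memory[addr] = value
        self.lines.clear()
        self.powered = False


memory = {}
cache = PowerManagedCache(memory)
cache.write(0x40, 7)
cache.flush_and_power_down()
print(memory[0x40], cache.powered, len(cache.lines))
```

The important ordering is that dirty data reaches main memory before the cache loses power; after that, main memory holds the only copy.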

Intel’s Digital Media Boost

Nothing says an increase in decoder throughput and instruction level parallelism better than Digital Media Boost, which is what the next group of Core Duo’s enhancements is called.

We’ve known about Digital Media Boost for a while now, and Intel actually publicly disclosed information about the Boost at the last IDF.  Unfortunately since then there has been no new information, so all we can report on is what we’ve already talked about:

Making Pentium M more "Media Friendly"

All of the major performance improvements to each of Yonah's cores seem to revolve around SIMD FP and FP performance, two of the Pentium M's present day weaknesses in comparison to the Pentium 4.

The first improvement is that now all three of Yonah's decoders can decode SSE instructions, regardless of the type of instruction. Improving the decode width of the processor is a quick way to improve performance.

Next, SSE/SSE2 operations (not sure if all can be, but at least some) can now be fused using the Micro Ops Fusion engine of Yonah. At a high level, the benefit here is increased performance and lower power consumption; we'll get into architectural details of why that is when we eventually sink our teeth into Yonah next year.

Each of the two cores in Yonah has also received support for SSE3 instructions, much like the Pentium 4 E [Prescott].

And finally there have been some improvements to Yonah's floating point performance, although Mooly would not say exactly what's been done. Curiously, Mooly referred to the floating point performance improvements as specifically made to improve gaming performance. Intel may have grander plans for Yonah than once thought...

The SSE/FP optimizations are all being grouped into what Intel is calling their Digital Media Boost technology. Yes, the names seem to get worse and worse as time goes on, but at least the functionality should be good.

 


29 Comments


  • OvErHeAtInG - Saturday, January 7, 2006 - link

    You hit the nail on the head. The increased power consumption would not be worth it. And IIRC was pointed out in the article, higher memory freq would provide a really minimal performance increase since the FSB is already lower bandwidth than that.
  • psychobriggsy - Friday, January 6, 2006 - link

    Did anyone else notice the strange mention of three compaq laptops on page 13 IIRC of the review?

    Anyway, this looks like a good product from Intel which will keep them ahead in mobile areas for the foreseeable future. AMD may catch up of course, but we will see what they offer later this year. I'm sure that revision F will be good though, and DDR2 will reduce power consumption on AMD notebooks a bit more.
  • Stolichnaya - Friday, January 6, 2006 - link

    Looks like the 'i' is going to crash on it's left side any time...
  • nserra - Friday, January 6, 2006 - link

    You are all dreaming here, thinking that amd can release a processor (platform) as good as this for the notebook area. The only extra is the 64 bit.

    They lack all the others, and primary ones:
    -Good platform from one of their partners.
    -Low power chipset to couple with the processor.
    -Brand recognition....
  • nidomus - Monday, January 9, 2006 - link

    coughfanboycough
  • Brucmack - Friday, January 6, 2006 - link

    I'm normally not a spelling nazi, but this is the second time I've seen this on Anandtech, and it's really annoying...

    On page 5, the word you're looking for is "segue", not "segway".
  • Shark Tek - Thursday, January 5, 2006 - link

    Great package but I don't have money for it :(

    Dell Inspiron E1705: http://www.pcmag.com/article2/0,1759,1908402,00.as...


    Type: Gaming, General Purpose, Media
    Operating System: MS Windows XP Media Center
    Processor Name: Intel Pentium M T2500
    Processor Speed: 2 GHz
    RAM: 1024 MB
    Hard Drive Capacity: 80 GB
    Graphics: nVidia GeForce Go 7800GTX
    Primary Optical Drive: Dual-Layer DVD+/-RW
    Wireless: 802.11a/g
    Screen Size: 17 inches
    Screen Size Type: widescreen
    System Weight: 8.2 lbs
  • Calin - Friday, January 6, 2006 - link

    But that isn't a portable laptop, is a towable one :(
  • Shark Tek - Thursday, January 5, 2006 - link

    That power consumption will be equal or better than previous Pentium-M generation. Now lets wait for AMD what they have to offer when they launch the Turion64 X2.

    They wont be sufficient to compete with "Core Duo" the only real advantages over intel are 64 bit support and cheaper cpu prices, nothing else.

    Intel will leap forward a few more years in the mobile market.
  • Viditor - Thursday, January 5, 2006 - link

    quote:

    They wont be sufficient to compete with "Core Duo" the only real advantages over intel are 64 bit support and cheaper cpu prices, nothing else

    Keep in mind that you're just making an "enthusiastic guess" here...
    AMD has started a new process of strained silicon on their 90nm chips which is specifically targeted at reducing power and increasing effeciency.
    These are released in new steppings rather than new architectures (remember Rev E cut power requirements in half compared to previous generations of 90nm chips).

    Even more important is the platforms...remember that the Turion isn't even 1 year old, and the platform designs are still minimal at best. It would be foolish to discount AMD at this point.

    That said, Intel deserves hearty congratulations on the duo and it's platform! 2006 is going to be an interesting year...!
