Core Duo - A High Level Architectural Overview

We have talked extensively about Intel’s Core Duo processor since back when it was called Yonah. While we would have liked to bring you an extremely detailed report on exactly what was done architecturally in Yonah to make it what it is today, the fact of the matter is that Intel just isn’t very forthcoming with this sort of information.

Intel has been very protective of their Centrino processor architectures ever since the platform’s introduction.  We’ve always been given bits and pieces of information, but never the full disclosure we’ve hoped for.  Even to this day Intel has not disclosed the exact number of pipeline stages in the Pentium M or Core Duo processors.  In many cases it was Intel’s first Pentium M processor, Banias, that they were most forthcoming with.  With every successor, the flow of information became far more marketing and far less technical.  We do hope that at some point this won’t be the case, but until then we will have to do the best we can with what we’re given.  What follows is a brief architectural overview of Intel’s Core Duo to help you understand where some of the performance advantages and, more importantly, the improvements in power consumption come from at an architectural level.

Intel’s Smart Cache

The very first thing you can take for granted about the Core Duo processor is that all of the advancements in performance and power efficiency seen in previous Pentium M processors are rolled into the Core Duo.  That means the power-saving cache in Banias and Dothan is still here today, as is the unique method of architecting the CPU for a fixed clock frequency rather than maximum performance.

If you take for granted the years of work that Intel’s Israel team put into Banias and Dothan (which I’m sure they would love for you to do), it makes understanding the improvements of Core Duo all that much easier. 

While it’s easy to assume that the biggest change in Yonah is the fact that it is a dual core processor, the largest impacts actually come from how those two cores interact and not the fact that there are two of them to begin with.  One such example is what Intel is calling their “Smart Cache”, which is just a really terrible way of saying that the two cores share a common 2MB L2 cache. 

In AMD’s Athlon 64 X2 design there is a System Request Queue (SRQ) that queues up memory requests from each core and sends the requests off to either main memory or one of the on-die caches. 


How core-to-core communication works on the Athlon 64 X2  

The beauty of the SRQ is that the two cores can communicate with one another at core speed.  This is in stark contrast to the way the Pentium D works, where all cache-to-cache requests must leave one core and go out onto the external bus before finally being sent to the other core.


How core-to-core communication works on the Pentium D  

Why do you need to communicate between two cores?  If both cores are working on the same data, they must be in communication to determine which cache actually contains the latest and most correct copy of the data before using it (either for further operations or for writing it to main memory).  In order to ensure that only one core has the “correct” copy of the data, the individual cores may put requests on the bus (or SRQ in the case of the Athlon 64 X2) asking to invalidate the copy of the data that should not be used.  This sort of traffic does take up bus bandwidth and is much better handled on-die than over a higher latency external bus. 
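To make the invalidation traffic described above a little more concrete, here is a minimal sketch of a write-invalidate scheme between two cores with private caches. It is a toy model under stated assumptions, not a description of how any of these processors actually implement coherence: the class names (Core, Interconnect), the write-through policy, and the address 0x100 are all made up for illustration.

```python
# Toy write-invalidate model: two cores, each with a private cache.
# Every name here is hypothetical; real coherence protocols are far richer.

class Core:
    def __init__(self, name):
        self.name = name
        self.cache = {}  # address -> value held in this core's private cache


class Interconnect:
    """Stands in for the external bus (Pentium D) or the SRQ (Athlon 64 X2)."""

    def __init__(self, cores, main_memory):
        self.cores = cores
        self.memory = main_memory
        self.invalidations = 0  # coherence messages sent between cores

    def read(self, core, addr):
        if addr not in core.cache:
            # Miss: fetch the value from main memory.
            core.cache[addr] = self.memory[addr]
        return core.cache[addr]

    def write(self, core, addr, value):
        # Before writing, ask every other core to drop its stale copy.
        for other in self.cores:
            if other is not core and addr in other.cache:
                del other.cache[addr]
                self.invalidations += 1
        core.cache[addr] = value
        # Write-through keeps the toy simple; real caches usually write back.
        self.memory[addr] = value


memory = {0x100: 1}
core0, core1 = Core("core0"), Core("core1")
bus = Interconnect([core0, core1], memory)

bus.read(core0, 0x100)            # both cores now cache the same data
bus.read(core1, 0x100)
bus.write(core0, 0x100, 42)       # core0's write invalidates core1's copy
latest = bus.read(core1, 0x100)   # core1 misses and re-fetches the new value
```

Every increment of the invalidations counter is a message that has to travel between the cores, which is why the speed of the path it travels over matters.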

With a shared L2 cache, core-to-core communication can happen much faster than on the Pentium D, since it runs at core clock speed, making the Core Duo a lot more like the Athlon 64 X2 in that regard.  It still lacks an on-die memory controller, but communication between the two cores is improved.  It is worth noting that even when comparing AMD’s Athlon 64 X2 with its SRQ to Intel’s Pentium D, which lacked any low latency core-to-core communication, the real world impacts in desktop applications were tough to find.  That being said, we would rather have the benefit on paper and have it hard to prove in the real world than not have it at all.


Core-to-core communication on the Core Duo

The other benefit of Intel’s Smart Cache is that it can be dynamically resized depending on the needs of the individual cores.  So if one core is idle, the other core can get full access to the 2MB L2 cache.  If both are active, they split the 2MB of cache according to their needs. As long as the two cores don’t both need a full 2MB, the overall efficiency of the chip is better than that of a similar design with two separate 2MB caches.
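A rough way to see why sharing helps is to compare one shared cache against two private caches of the same total capacity when only one core is busy. The sketch below is an illustration only: it assumes a simple LRU replacement policy, counts capacity in cache lines rather than megabytes, and uses a made-up access trace. Intel has not disclosed how the Smart Cache actually allocates space between the cores.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity      # capacity in lines, not bytes
        self.lines = OrderedDict()    # address -> present, ordered by recency
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)   # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used


def run(trace, caches):
    """trace is a list of (core_id, address); caches maps core_id -> cache."""
    for core_id, addr in trace:
        caches[core_id].access(addr)


# Hypothetical workload: core 0 loops over three lines, core 1 is idle.
trace = [(0, addr) for _ in range(10) for addr in (0, 1, 2)]

# Two private caches of 2 lines each (4 lines of storage in total).
private0, private1 = LRUCache(2), LRUCache(2)
run(trace, {0: private0, 1: private1})

# One shared cache with the same 4 lines of total storage.
shared = LRUCache(4)
run(trace, {0: shared, 1: shared})

split_hits = private0.hits + private1.hits
shared_hits = shared.hits
```

In this contrived trace the busy core's working set does not fit in its private half, so the split design misses on every access while the idle core's half sits unused. With the shared cache, the busy core simply uses the space it needs.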

As the size of the L2 cache is changed, its usage is also monitored.  If it is determined that the cache can safely be flushed to main memory and powered down, the cache controller will do so to keep CPU power consumption down.  The idea here is that refreshing main memory will eat up less power than keeping the on-die cache running and active.

Intel’s Digital Media Boost

Nothing says an increase in decoder throughput and instruction level parallelism better than Digital Media Boost, which is what the next group of Core Duo’s enhancements are called.

We’ve known about Digital Media Boost for a while now, and Intel actually publicly disclosed information about the Boost at the last IDF.  Unfortunately, since then there has been no new information, so all we can report on is what we’ve already talked about:

Making Pentium M more "Media Friendly"

All of the major performance improvements to each of Yonah's cores seem to revolve around SIMD FP and FP performance, two of the Pentium M's present day weaknesses in comparison to the Pentium 4.

The first improvement is that now all three of Yonah's decoders can decode SSE instructions, regardless of the type of instruction. Improving the decode width of the processor is a quick way to improve performance.
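To get a feel for why decode width matters, here is a toy model of an in-order, three-wide decoder. It is a sketch under loud assumptions: the rule that only the first decoder slot can take SSE instructions in the "restricted" case is a simplification chosen for contrast, the instruction stream is made up, and the function name decode_cycles is hypothetical. Intel has not published this level of detail.

```python
def decode_cycles(stream, sse_on_any_decoder, width=3):
    """Count cycles needed to decode `stream` in order with `width` decoders.

    stream is a sequence of "sse" / "int" tags. Slot 0 accepts anything;
    the other slots accept SSE only when sse_on_any_decoder is True.
    This is an assumed, simplified rule, not a documented one.
    """
    cycles = 0
    i = 0
    while i < len(stream):
        cycles += 1
        slot = 0
        while slot < width and i < len(stream):
            if stream[i] == "sse" and slot > 0 and not sse_on_any_decoder:
                break  # this SSE instruction waits for slot 0 next cycle
            i += 1
            slot += 1
    return cycles


# Hypothetical stream: a burst of SSE work followed by integer work.
stream = ["sse"] * 6 + ["int"] * 6

restricted = decode_cycles(stream, sse_on_any_decoder=False)
unrestricted = decode_cycles(stream, sse_on_any_decoder=True)
```

In this model the restricted decoder crawls through the SSE burst one instruction per cycle, while the unrestricted one keeps all three slots busy. Real code mixes instruction types far less neatly, so the actual gain would depend heavily on the workload.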

Next, SSE/SSE2 operations (not sure if all can be, but at least some) can now be fused using the Micro Ops Fusion engine of Yonah. At a high level, the benefit here is increased performance and lower power consumption. We'll get into architectural details of why that is when we eventually sink our teeth into Yonah next year.

Each of the two cores in Yonah has also received support for SSE3 instructions, much like the Pentium 4 E [Prescott].

And finally there have been some improvements to Yonah's floating point performance, although Mooly would not say exactly what's been done. Curiously, Mooly referred to the floating point performance improvements as specifically made to improve gaming performance. Intel may have grander plans for Yonah than once thought...

The SSE/FP optimizations are all being grouped into what Intel is calling their Digital Media Boost technology. Yes, the names seem to get worse and worse as time goes on, but at least the functionality should be good.

 


29 Comments


  • stmok - Saturday, January 7, 2006

    I admit it, I have no use for the Weener (Windows) keys. It's a pointless feature to have if you use other OSs or are migrating AWAY from Windows. It's like Nvidia's chipset firewall solution... Another pointless feature for "Windows Only" users. (Which causes more trouble than it's really worth).

    With Lenovo adopting all these "everyone else has it" features, it's not the same ThinkPad anymore. They don't stand out technologically, like they used to.

    Granted, the fingerprint scanner and keyboard light are interesting, but that's all there is. My old R40 ThinkPad has a keyboard light as well. So I guess the only thing is the fingerprint scanner.

    As for ThinkVantage, that is useful...To some extent.

    I tried to "clean restore" WinXP from the hidden partition (as Windows requires a clean installation after 2 or more years of use), and I got a crapload of errors. The Trackpoint and Touchpad seemed to be no longer detected, and there were other error messages. I couldn't get past finishing the install. So I unhid that WinXP partition, formatted the sucker clean, and gained back 8GB of HDD space. Which is enough for a quadruple boot... Win2k, Slackware, FreeBSD and Solaris. (And they all work fine with the Trackpoint/Touchpad).
  • Scarceas - Saturday, January 7, 2006

    I think Apple will focus their Intel support on the Yonah designs. I wouldn't be surprised to see a Mac Mini or something that was essentially a Yonah desktop.

    And I am quite glad that IBM/Lenovo are finally putting a Windows key on their Thinkpads!

    Hope that carries over to their rack-mount KVMs, as well. Drives me nuts....
  • littlebitstrouds - Friday, January 6, 2006

    I wanna see a desktop board with this chip in it... then overclock the heck out of it. I bet that thing would scream.
  • raskren - Friday, January 6, 2006

    Hmmm...

    Looks like an extremely competitive if not flat-out better Intel solution.

    So where is Beenthere's a.k.a. CRAMITPAL's canned comment?
  • stateofbeasley - Sunday, January 8, 2006

    The fanboi is probably too demoralized to come out and troll. The numbers don't lie -- Core Duo is fast and efficient, and the Centrino Duo stuff is going to make Intel a pile of money.

    Beenthere tried to claim the opposite in his comments re the AnandTech preview, and he got run over like a Prescott in the way of an Athlon 64. Come to think of it, Beenthere's claims about Core Duo were about as stupid as claiming Prescott >>> Athlon 64.
  • uly - Friday, January 6, 2006

    "Intel 3945ABG Wireless solution"
    "starting to look at platforms and solutions"
    "the 3945ABG wireless solution is what is known as"
    "915 chipset and 2915ABG wireless solution"
    "wireless solutions have both been undergoing reductions"
    "Pricing (with 945GM chipset and wireless solution)"
    "it did give us a nice solution"

    Another definition of 'solution' is something that is diluted or watered down. Wonder if Intel appreciates having their products looked upon from that perspective. (cred: buzzkiller dot net)

    Anand, whenever you find yourself about to type 'solution' in the future, please think, do I really want to sound like I'm copying from the presskit?

    Other than that, nice review.
  • raskren - Friday, January 6, 2006

    You read this hunting for the word "solution." Please, this is part of everyday speech, not a buzzword.
  • uly - Friday, January 6, 2006

    It's part of everyday speech - for PR guys. It's also pretentious - the customer should decide the solution for himself.

    > You read this hunting for the word "solution."

    No, I read it and buzzwords like solution kept popping out at me, so I used grep to do a quick wordcount. Seven times repeating mindless marketing drivel! C'mon Anand, I know you can write better than this.
  • sprockkets - Friday, January 6, 2006

    Way back in 1993, the "inside" meant that this computer had an Intel chip inside, meaning better performance than those other people, not that Intel focused on the insides of the computer.

    Watch it and this will actually be bad for them. All those people won't even recognize the Intel they knew with the new logo. "Leap Ahead"? How original.
  • henroldus - Friday, January 6, 2006

    The only mistake in this excellent article is that they use the wrong memory with DDR2-533.
    The new Core Duo supports DDR2-667.
    Am I wrong in thinking that this could be a bottleneck?
    Maybe the performance will rise with this memory, but so will the power consumption because of the higher frequency.
