A Look at AMD's Dual Core Architecture

Even Intel will admit that the architecture of the Pentium D is not the most desirable, as it is two Pentium 4 cores literally glued together.  The two cores can barely be managed independently from a power consumption standpoint (they still share the same voltage and must run in the same power state) and all communication between cores must go over the external FSB.  The diagram below should illustrate the latter point pretty well:


Intel's Pentium D dual core architecture

Any communication between the two cores has to be done over the external FSB, and obviously, core-to-core communication over an external bus is slow.  It particularly doesn't make sense, since the two cores are on the same die.  Even the 65nm successor to the Pentium D (Presler) will have this same limitation.

AMD's architecture is much more sophisticated, thanks to the K8 architecture's on-die North Bridge.  While we normally only discuss the benefits of the K8's on-die memory controller, the on-die North Bridge is extremely important for dual core.  Instead of having all communication between the cores go over an external FSB, each core will put its request on the System Request Queue (SRQ) and when resources are available, the request will be sent to the appropriate execution core - all without leaving the confines of the CPU's die.  There are numerous benefits to AMD's implementation, and in heavily multithreaded/multitasking scenarios, it is possible for AMD to have a performance advantage over Intel just because of this implementation detail alone. 

The one limitation that both AMD and Intel have is bandwidth.  In order to maintain compatibility with present day Socket-940 and Socket-939 motherboards, AMD could not increase the pincount of their dual core processors.  The benefit is that AMD's dual core CPUs will work in almost all Socket-940 and Socket-939 motherboards (more on this later), but the downside is that the memory bus remains unchanged at 128 bits wide and supports a maximum memory speed of DDR400.  So, while single core Athlon 64 and Opteron CPUs get a full 6.4GB/s of memory bandwidth to themselves, today's dual core CPUs are given the same 6.4GB/s to share between two cores.

AMD's solution to the problem will come in the form of DDR2 and a new socket down the road, but for now there's no getting around the memory bandwidth limitations.  Intel is actually in a better position from a memory bandwidth standpoint.  At this point, their chipsets, with a dual channel DDR2-667 controller, provide more memory bandwidth than a single core needs.  The problem is that the Intel dual core CPUs still run on a 64-bit wide 800MHz FSB (6.4GB/s peak), which makes Intel's problem more of an FSB bandwidth limitation than a memory bandwidth limitation.
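
To make the bandwidth figures above concrete, here is a minimal Python sketch of the peak-bandwidth arithmetic.  The function name and the even per-core split are our own illustrative assumptions; the results are theoretical peaks that ignore protocol overhead and real-world efficiency.

```python
def peak_bandwidth_gb_s(bus_width_bits, mega_transfers_per_s, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10^9 bytes).

    Hypothetical helper for illustration: bytes per transfer multiplied
    by transfers per second, multiplied by the number of channels.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 * channels / 1e9


# AMD: 128-bit wide memory bus at DDR400 (400 million transfers/s)
amd_memory = peak_bandwidth_gb_s(128, 400)

# Intel: dual channel DDR2-667, each channel 64 bits wide
intel_memory = peak_bandwidth_gb_s(64, 667, channels=2)

# Intel: 64-bit wide FSB at an effective 800MHz
intel_fsb = peak_bandwidth_gb_s(64, 800)

# Simplifying assumption: two equally busy cores split the bus evenly
amd_per_core = amd_memory / 2

print(f"AMD memory bus:      {amd_memory:.1f} GB/s")
print(f"AMD per core (even): {amd_per_core:.1f} GB/s")
print(f"Intel memory:        {intel_memory:.1f} GB/s")
print(f"Intel FSB:           {intel_fsb:.1f} GB/s")
```

Under these assumptions, AMD's memory bus and Intel's FSB both peak at 6.4GB/s, while Intel's dual channel DDR2-667 memory peaks at roughly 10.7GB/s.  That is why the FSB, not the memory, is the narrower pipe on the Intel side.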

Backwards Compatibility

Intel's dual core Pentium D and Extreme Edition won't work in any previous motherboards, but as we mentioned at the start of this article, AMD has more bang.  Here, the additional bang comes from the almost 100% backwards compatibility with single-core motherboards.  We say "almost" because it's not totally perfect; here's the breakdown:
- On the desktop, the Athlon 64 X2 series is fully compatible with all Socket-939 motherboards.  All you need is a BIOS update and you're good to go.

- For workstations/servers, if you have a motherboard that supports the 90nm Opterons, then all you need is a BIOS update for dual core Opteron support.  If the motherboard does not support 90nm Opterons then you are, unfortunately, out of luck.

For desktop users, the ability to upgrade your current Socket-939 motherboards to support dual core in the future is a huge plus from AMD.  While it may not please motherboard manufacturers to lengthen upgrade cycles like this, we have never seen a CPU manufacturer take care of their users like this before.  Even during the Socket-A days when you didn't have to upgrade your motherboard, most users still did because of better chipsets.  AMD's architectural decisions have made those days obsolete.  The next generation of dual core processors will most likely need a new motherboard, but rest assured that you have a solid upgrade path if you have recently invested in a new Socket-939 or Socket-940 system.


144 Comments


  • MDme - Tuesday, April 26, 2005 - link

    #133

    i think what #130 was saying was that: from top to bottom, AMD's offerings are really good...if you want the best "bang for the buck" the 3400+ or whatever, or a 3000+ winnie OC'd will provide you with the best performance per dollar you spend...EVEN against the X2's.

    On the other hand if cost is not an issue, an X2 4400+ provides extremely good performance for people willing to pay the $500 premium.

    Zebo's point is in direct response to your point, which is AMD "STILL" has the best bang for the buck, not intel.

    or maybe YOU missed the logic? LOL
  • MPE - Tuesday, April 26, 2005 - link

    "Intel is just lucky a 3400+ new castle wasn't in that test suite. It would win the majority of tests over an 830!! and it's still cheaper. Or did you miss this chart? LOL"

    Why not just admit it. AMD's DC is about 10-20% faster while costing 80-100% more.

    Even if the 3400+ is added, that comparison is moot since if you compare the score of that to the price of AMD's own DC - the price performance ratio is staggering? Or did you miss that logic?

    Anyways did you miss the part that even AMD DC was being beaten by their own single core.

    Next.
  • nserra - Tuesday, April 26, 2005 - link

    "The Athlon 64 4000+ was the last single core member of the Athlon 64 line.
    The Athlon 64 FX will continue as a single core CPU line, with the FX-57 (2.8GHz) due out later this year."

    Where did you get this info anand, i am not sure if an Athlon64 X2 4400+ could not coexist with a Athlon64 4400+. If this is the last 4000+ then i must say gee that's too bad....
  • Zebo - Tuesday, April 26, 2005 - link

    #125

    Techreports review is better for you. 64-bit OS, 64-bit apps when possible, no mystery unreproducible benchmarks like Anand's database stuff.
  • Zebo - Tuesday, April 26, 2005 - link

    MPE BS, Intel is just lucky a 3400+ new castle wasn't in that test suite. It would win the majority of tests over an 830!! and it's still cheaper. Or did you miss this chart? LOL
    http://images.anandtech.com/reviews/cpu/amd/athlon...

    Intel's DC chips can hardly compete with AMD's single core offerings. Side by side both DC it's a joke.

    So ya, AMD still has the "best bang for the buck" top end to bottom end. And they are far on top of the mountain.
  • MPE - Monday, April 25, 2005 - link

    Isn't the shoe on the other foot?

    For several years now, so many touted AMD's cheaper price and competitive pricing.

    Now with Pentium4 D, especially with the 3GHz model, you get half the price of the cheapest X2 while probably at best 20% lower performance?

    What happened here?

    Now P4D 3GHz model is the best bang for the buck and not the AMD offering. This is a complete reversal of what a lot of AMD supporters have been touting?
  • ceefka - Monday, April 25, 2005 - link

    #125 Yeah, good point.

    Compare:
    A. single-threaded 32-bit app on a singlecore
    B. multi-threaded 64-bit app on a dualcore
    Considering that multithreaded apps already see such large gains on dualcores, going 64-bit too could well mean a more than 100% improvement from A to B.

    But of course, NO ONE needs dual core, 64-bit and +4GB memory in the next 5-10 years :P

    The ball now lies with MS and (Linux) app developers to write more stuff in multithreaded 64-bit code. From what I hear and read it is not so much the 64-bit part as it is the threading that is a real challenge, even for veterans.
  • Ross Whitehead - Sunday, April 24, 2005 - link

    Visual, On P.12 I was referring to the closest Xeon competitor to the 252s which is the Quad Xeon 3.6 GHz 667 MHz FSB.

    Does that make any more sense?
  • Ross Whitehead - Sunday, April 24, 2005 - link

    jvarszegi, the actual stored procs are not prefixed with "sp_", we just used that as part of the "analogy" to the real system.

    One could also argue that we did not prefix the analogy example with the object owner either which also incurs a cache miss.

    Honestly, I have never quantified the expense of the sp_ prefix or the object owner.
  • Binji7 - Sunday, April 24, 2005 - link

    Where are the dual-core Windows x64 and Linux x64 benchmarks?? That's what I really want to see.
