A Look at AMD's Dual Core Architecture

Even Intel will admit that the architecture of the Pentium D is not the most desirable, as it is two Pentium 4 cores literally glued together.  The two cores can barely be managed independently from a power consumption standpoint (they still share the same voltage and must run in the same power state), and all communication between cores must go over the external FSB.  The diagram below should illustrate the latter point pretty well:


Intel's Pentium D dual core architecture

Any communication between the two cores has to be done over the external FSB, and obviously, core-to-core communication over an external bus is slow.  It particularly doesn't make sense, since the two cores are on the same die.  Even the 65nm successor to the Pentium D (Presler) will have this same limitation.

AMD's architecture is much more sophisticated, thanks to the K8 architecture's on-die North Bridge.  While we normally only discuss the benefits of the K8's on-die memory controller, the on-die North Bridge is extremely important for dual core.  Instead of having all communication between the cores go over an external FSB, each core puts its requests on the System Request Queue (SRQ), and when resources are available, each request is sent to the appropriate execution core - all without leaving the confines of the CPU's die.  There are numerous benefits to AMD's implementation, and in heavily multithreaded/multitasking scenarios, it is possible for AMD to have a performance advantage over Intel because of this implementation detail alone.

The one limitation that both AMD and Intel have is bandwidth.  In order to maintain compatibility with present day Socket-940 and Socket-939 motherboards, AMD could not increase the pin count of their dual core processors.  The benefit is that AMD's dual core CPUs will work in almost all Socket-940 and Socket-939 motherboards (more on this later), but the downside is that the memory bus remains unchanged at 128 bits wide and supports a maximum memory speed of DDR400.  So, while single core Athlon 64 and Opteron CPUs get a full 6.4GB/s of memory bandwidth to themselves, today's dual core CPUs must share that same memory bandwidth between two cores.
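
The 6.4GB/s figure follows directly from the bus width and the transfer rate.  As a rough sketch of the arithmetic (the function name is ours, and the even split between cores is an assumption for illustration; real sharing depends on what each core is doing):

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * transfers_per_second / 10**9

# Figures from the text: 128-bit memory bus, DDR400 (400 million transfers/s).
total = peak_bandwidth_gb_s(128, 400 * 10**6)

# Assumption: both cores demand memory equally, so each gets half.
per_core = total / 2

print(f"available to a single core CPU: {total:.1f} GB/s")
print(f"even split on a dual core CPU:  {per_core:.1f} GB/s per core")
```

These are theoretical peaks only; sustained bandwidth in practice is lower.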

AMD's solution to the problem will come in the form of DDR2 and a new socket down the road, but for now there's no getting around the memory bandwidth limitations.  Intel is actually in a better position from a memory bandwidth standpoint.  At this point, their chipsets, with a dual channel DDR2-667 memory controller, provide more memory bandwidth than a single core needs.  The problem is that the Intel dual core CPUs still run on a 64-bit wide 800MHz FSB, which makes Intel's problem more of an FSB bandwidth limitation than a memory bandwidth limitation.
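
To see why the FSB rather than memory is the limit on the Intel side, the same peak-bandwidth arithmetic can be applied to both links.  This is a sketch under stated assumptions: we treat the "800MHz" FSB as 800 million transfers per second, and dual channel DDR2-667 as two 64-bit channels at a nominal 667 million transfers per second.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * transfers_per_second / 10**9

# 64-bit wide FSB, assumed 800 million transfers/s.
fsb = peak_bandwidth_gb_s(64, 800 * 10**6)

# Dual channel DDR2-667, assumed 2 x 64-bit channels at nominal 667 MT/s.
memory = peak_bandwidth_gb_s(2 * 64, 667 * 10**6)

# The CPU can never pull data faster than the slower of the two links.
bottleneck = "FSB" if fsb < memory else "memory"

print(f"FSB peak:    {fsb:.2f} GB/s")
print(f"memory peak: {memory:.2f} GB/s")
print(f"bottleneck:  {bottleneck}")
```

Since core-to-core traffic also travels over the FSB, the share of it left for memory accesses is smaller still.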

Backwards Compatibility

Intel's dual core Pentium D and Extreme Edition won't work in any previous motherboards, but as we mentioned at the start of this article, AMD has more bang.  Here, the additional bang comes from the almost 100% backwards compatibility with single-core motherboards.  We say "almost" because it's not totally perfect; here's the breakdown:

- On the desktop, the Athlon 64 X2 series is fully compatible with all Socket-939 motherboards.  All you need is a BIOS update and you're good to go.

- For workstations/servers, if you have a motherboard that supports the 90nm Opterons, then all you need is a BIOS update for dual core Opteron support.  If the motherboard does not support 90nm Opterons then you are, unfortunately, out of luck.

For desktop users, the ability to upgrade your current Socket-939 motherboard to support dual core in the future is a huge offering from AMD.  While it may not please motherboard manufacturers to lengthen upgrade cycles like this, we have never seen a CPU manufacturer take care of their users like this before.  Even during the Socket-A days, when you didn't have to upgrade your motherboard, most users still did because of better chipsets.  AMD's architectural decisions have made those days a thing of the past.  The next generation of dual core processors will most likely need a new motherboard, but rest assured that you have a solid upgrade path if you have recently invested in a new Socket-939 desktop system or a Socket-940 workstation or server.


144 Comments


  • KillerBob - Friday, April 22, 2005

    Griswold,

    MT Test 1: PEE 1 - X2 0 Very likely scenario
    MT Test 2: PEE 2 - X2 0 Likely scenario
    MT Test 3: PEE 2 - X2 1 So-so scenario
    MT Test 4: PEE 3 - X2 1 Likely scenario
    MT Test 5: PEE 3 - X2 2 Likely scenario
    MT Test 6: PEE 3 - X2 3 Unlikely scenario

    I play a lot of games, but I never have things in the background. As a matter of fact, I don't want to have anything in the background, except for perhaps a big NewsPro download.
  • MrEMan - Friday, April 22, 2005

    102,

    Artificial stupidity run rampant?

    or

    Natural deselection (survival of the twitest)?
  • Quanticles - Friday, April 22, 2005

    I vote that 90% of the people on here have no idea what they're talking about... lol
  • erwos - Friday, April 22, 2005

    "It's odd that some picture game developers immediately supporting the PhysX chip as soon as it's available, but think they'll drag their feet to take advantage of another whole CPU core at their disposal."

    It's basically about the implementation differences of the two. You can be relatively certain that PhysX is going to be shipping their chips/cards with libraries that allow game devs to just speed up certain processing with special function calls (ie, calculate_particle_spread()). Multi-threading requires that you design your application from the very start to take advantage of it (mostly - I would wager splitting off the background music to its own thread is reasonably straightforward).

    Game logic doesn't always lend itself to multi-threading, either. If I shoot my gun, I want to hear the sound next. I don't want it to be thrown at the sound thread, where it may or may not execute next. Threading introduces latency, in other words, unless you so tightly bind your threads together that you may as well not use multi-threading.

    -Erwos
  • Griswold - Friday, April 22, 2005

    KillerBob, so that makes you a brilliant illiterate, since it's not what the benchmarks say. :)
  • cHodAXUK - Friday, April 22, 2005

    #83 Get a clue, a single core 3500+ is faster than the equivalent Opteron at the same speed. Why? Unregistered memory and tighter memory timings. ECC memory comes with a 2-4% performance penalty, but the big difference comes with the command rate: 2T for the Opteron and 1T for the 3500+. The AMD64 thrives on lower latencies, which can make as much as a 10% performance difference, and that is BEFORE we even start to think about raising the FSB speed, which makes a significant difference to overall system performance. 15% is in no way unrealistic with a mild overclock and lower latencies; if you don't believe me, then email Anand and ask him.
  • Zebo - Friday, April 22, 2005

    Jep4444 (#89) What do you mean X2's "aren't nearly as good as the dual core Opterons"??

    Coming from XS, I suspect you mean they don't OC very well?

    But they are the same cores as the Opterons are, and with ram should run significantly faster.

    Or do you mean buggy? That's easily attributed to BIOS, i.e. none released yet, so no working BIOS.

    How about a link please.
  • Umbra55 - Friday, April 22, 2005

    The benchmark overviews show "dual opteron 252 (2.6 GHz)" all over the review. I suppose this is single 252 instead of dual?

    Please correct accordingly
  • emboss - Friday, April 22, 2005

    #40 (Doormat):
    You're forgetting that the size of a dual-core is (roughly) double that of a single-core. So, assuming 1000 cores/wafer and a 70% yield per core, a single-core wafer (with an ASP of $500) will net AMD 700*500 = $350K.

    The same wafer with dual-cores will produce (approximately) 1000/2 * (0.7)^2 = 245 CPUs. So, to get the same amount of cash per wafer, AMD needs an ASP of $1429, or the second core costing 85% more than the first core.

    Of course, it's not quite this simple ("bad" chips running OK at lower speeds, etc) but it's not entirely unreasonable to see dual-cores with prices ~3 times that of a single core at the same speed grade. Intel is almost dumping (in the economic sense of the word) dual-core chips.
  • saratoga - Friday, April 22, 2005

    "saratoga, waah? There are similarities between C# and C++. While I agree it's java'ish as well, it definitely has similarities to c++. One could say c# shares similarities with c/c/c++.

    read away:

    http://www.mastercsharp.com/article.aspx?ArticleID...

    http://www.csharphelp.com/archives/archive138.html

    "

    I'm guessing you're not a c++ programmer ;)

    Anyway, yes they both use c syntax, however that's pretty much irrelevant given that Java also uses c syntax (as does Managed c++, which incidentally IS the .net language directly based on c++) and I've never heard anyone call it related to c++. Beyond (some) syntax heritage and the fact that they're both OO languages, they're very different beasts.

    ""C# is directly related to C and C++. This is not just an idea, this is real. As you recall C is a root for C++ and C++ is a superset of C. C and C++ shares several syntax, library and functionality." Quoted from above.

    L8r."

    Err yeah c++ is mostly a superset of c. That's neither here nor there. Just try and use the c/c++ preprocessor in c# and you'll see very quickly what the difference is. Or try using c++ multiple inheritance. You'll find that just because you took java and added operator overloading and made binding static by default, it's not c++.
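
The wafer arithmetic in emboss's comment above can be checked with a short sketch.  It uses the comment's own simplified model (1000 core sites per wafer, 70% yield per core, $500 single-core ASP) and assumes, as the comment implicitly does, that the two cores of a dual-core CPU pass or fail independently.  The function and variable names are ours.

```python
def good_cpus_per_wafer(core_sites, yield_per_core, cores_per_cpu):
    """Expected number of working CPUs from one wafer.

    Assumes each core fails independently, so a CPU works only if
    every one of its cores works.
    """
    cpu_sites = core_sites / cores_per_cpu
    return cpu_sites * yield_per_core ** cores_per_cpu

CORE_SITES = 1000        # cores per wafer, from the comment
YIELD_PER_CORE = 0.7     # fraction of cores that work, from the comment
SINGLE_CORE_ASP = 500    # dollars, from the comment

single_good = good_cpus_per_wafer(CORE_SITES, YIELD_PER_CORE, 1)
dual_good = good_cpus_per_wafer(CORE_SITES, YIELD_PER_CORE, 2)

single_revenue = single_good * SINGLE_CORE_ASP

# Dual-core ASP needed to earn the same revenue from the wafer.
break_even_asp = single_revenue / dual_good

# How much more the "second core" costs compared with the first.
second_core_premium = (break_even_asp - SINGLE_CORE_ASP) / SINGLE_CORE_ASP - 1

print(f"good single-core CPUs: {single_good:.0f}")
print(f"good dual-core CPUs:   {dual_good:.0f}")
print(f"break-even dual ASP:   ${break_even_asp:.0f}")
print(f"second-core premium:   {second_core_premium:.1%}")
```

As the comment itself notes, the model ignores partially working chips that can be sold at lower speed grades, so the result is only a rough bound.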
