A Look at AMD's Dual Core Architecture

Even Intel will admit that the architecture of the Pentium D is not the most desirable, as it is two Pentium 4 cores literally glued together.  The two cores can barely be managed independently from a power consumption standpoint (they still share the same voltage and must run in the same power state), and all communication between cores must go over the external FSB.  The diagram below should illustrate the latter point pretty well:


Intel's Pentium D dual core architecture

Any communication between the two cores has to be done over the external FSB, and obviously, core-to-core communication over an external bus is slow.  It particularly doesn't make sense, since the two cores are on the same die.  Even the 65nm successor to the Pentium D (Presler) will have this same limitation.

AMD's architecture is much more sophisticated, thanks to the K8 architecture's on-die North Bridge.  While we normally only discuss the benefits of the K8's on-die memory controller, the on-die North Bridge is extremely important for dual core.  Instead of having all communication between the cores go over an external FSB, each core will put its request on the System Request Queue (SRQ) and when resources are available, the request will be sent to the appropriate execution core - all without leaving the confines of the CPU's die.  There are numerous benefits to AMD's implementation, and in heavily multithreaded/multitasking scenarios, it is possible for AMD to have a performance advantage over Intel just because of this implementation detail alone. 

The one limitation that both AMD and Intel have is bandwidth.  In order to maintain compatibility with present day Socket-940 and Socket-939 motherboards, AMD could not increase the pin count of their dual core processors.  The benefit is that AMD's dual core CPUs will work in almost all Socket-940 and Socket-939 motherboards (more on this later), but the downside is that the memory bus remains unchanged at 128 bits wide and supports a maximum memory speed of DDR400.  So, while single core Athlon 64 and Opteron CPUs get a full 6.4GB/s of memory bandwidth to themselves, today's dual core CPUs are given the same 6.4GB/s to share between two cores instead of one.

AMD's solution to the problem will come in the form of DDR2 and a new socket down the road, but for now there's no getting around the memory bandwidth limitations.  Intel is actually in a better position from a memory bandwidth standpoint.  At this point, their chipsets, with their dual channel DDR2-667 controller, provide more memory bandwidth than a single core needs.  The problem is that the Intel dual core CPUs still run on a 64-bit wide 800MHz FSB, which makes Intel's problem more of an FSB bandwidth limitation than a memory bandwidth limitation.
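
For readers who want to check the arithmetic behind these bandwidth figures, here is a minimal sketch in Python.  It only computes theoretical peak numbers (bus width times transfer rate).  The function name, the even two-way split between cores, and the use of the nominal 400, 800 and 667 million transfers per second are our own simplifying assumptions for illustration; real-world sharing between cores is dynamic and sustained bandwidth is lower than peak.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, mega_transfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10^9 bytes).

    Hypothetical helper for illustration: bytes per transfer multiplied
    by the number of transfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


# AMD: 128-bit (dual channel) memory bus at DDR400.
amd_memory = peak_bandwidth_gb_s(128, 400)

# Intel: 64-bit FSB at an effective 800 million transfers per second.
intel_fsb = peak_bandwidth_gb_s(64, 800)

# Intel chipset: dual channel DDR2-667, treated as 128 bits at a nominal 667 MT/s.
intel_memory = peak_bandwidth_gb_s(128, 667)

# Assumption: an even split between two cores, purely to show the scale.
cores = 2
amd_per_core = amd_memory / cores
intel_fsb_per_core = intel_fsb / cores

print(f"AMD memory bus:       {amd_memory:.1f} GB/s ({amd_per_core:.1f} GB/s per core)")
print(f"Intel FSB:            {intel_fsb:.1f} GB/s ({intel_fsb_per_core:.1f} GB/s per core)")
print(f"Intel chipset memory: {intel_memory:.1f} GB/s")
```

Under these assumptions, both platforms end up with the same 6.4GB/s ceiling in front of the two cores.  On the AMD side that ceiling is the memory bus itself, while on the Intel side the chipset's memory controller could supply more than the FSB is able to carry to the CPU.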

Backwards Compatibility

Intel's dual core Pentium D and Extreme Edition won't work in any previous motherboards, but as we mentioned at the start of this article, AMD has more bang.  Here, the additional bang comes from the almost 100% backwards compatibility with single-core motherboards.  We say "almost" because it's not totally perfect; here's the breakdown:
- On the desktop, the Athlon 64 X2 series is fully compatible with all Socket-939 motherboards.  All you need is a BIOS update and you're good to go.

- For workstations/servers, if you have a motherboard that supports the 90nm Opterons, then all you need is a BIOS update for dual core Opteron support.  If the motherboard does not support 90nm Opterons then you are, unfortunately, out of luck. 

For desktop users, the ability to upgrade your current Socket-939 motherboard to support dual core in the future is a huge benefit from AMD.  While it may not please motherboard manufacturers to lengthen upgrade cycles like this, we have never seen a CPU manufacturer take care of their users like this before.  Even during the Socket-A days when you didn't have to upgrade your motherboard, most users still did because of better chipsets.  AMD's architectural decisions have made those days obsolete.  The next generation of dual core processors will most likely need a new motherboard, but rest assured that you have a solid upgrade path if you have recently invested in a new Socket-939 desktop system or a Socket-940 system.


144 Comments


  • MDme - Friday, April 22, 2005 - link

    #92

    the difference between an opteron and an a64/fx is as follows:

    opteron - needs ECC memory, cache is 1mb, multiplier locked, COHERENT HT LINKS

    a64 - uses non-registered (non-ecc) memory (which is faster), cache 512k-1mb, multiplier locked

    a64fx - non-ecc memory, 1mb cache, unlocked.

    so opteron's are not a64/fx's but are quite similar. the main difference is the memory type and the COHERENT HT links

    therefore the X2 4400+ is really an opteron (dual core) running at 2.2 with dual 1mb cache but with the COHERENT HT link disabled that uses non-ECC ram.

    performance should therefore be almost identical between one DC opteron 2.2ghz and one A64 X2 4400+ (possibly the X2 will be 5% faster) due to the non-ECC memory which is faster.


  • tygrus - Friday, April 22, 2005 - link

    If you use an Opteron 875 then label it as such in all diagrams. You can make a note that the Athlon64 X2 4400+ will perform similarly to the Opteron 875. The differences in MB and RAM will affect results and so a direct re-labelling should not be made.
    Good database, multimedia, and data analysis software should make good use of multi-core/multi-CPU systems. When I mention data analysis I'm talking about software like SAS 9.1.3 and SAP. Even SAS is only threaded for a few tasks and it is a big hassle to pipeline one step into another.
  • Some1ne - Thursday, April 21, 2005 - link

    Good article overall, although I question the validity of declaring that an Opteron 875 is roughly equivalent to an Athlon 64 4400+. I could be wrong, but surely there must be significant architectural differences between the server-class chip (top of the line server-class chip no less) and the desktop Athlon 64? If not, then why the price premium for Opterons, and why don't manufacturers just find a way to kludge the Athlon64 to work in MP configurations? In theory, if they are really equivalent when run at the same clock speed, it would be much more cost effective to use kludged Athlon 64's, and it would also allow higher performance levels to be reached, as the dual-core Athlon64's are slated to run at one clock increment higher than the fastest dual-core Opterons. So anyways, is it *really* valid to treat an Opteron as being essentially equivalent to a similarly clocked Athlon64? As much as I love finally seeing Intel chips trounced pretty much across the board, it seems to me like the results could potentially be inaccurate given that an Opteron 875 was used and simply "labeled" as an Athlon64 4400+.
  • Cygni - Thursday, April 21, 2005 - link

    #89... seeing as how the Opty x75 and A64 X2 are based on functionally identical cores, that's not too likely at all. What DOES seem likely to me reading this article is that BIOS updates, and X2 support on 939 boards, are going to be a very interesting story to follow. It doesn't look like it's too easy to get a solid AMD Dual Core BIOS if even Tyan is struggling, of all board mfrs. May give a feisty smaller board mfr a chance to slam the big boys and grab marketshare (such as ECS with the K7S5A).
  • Jason Clark - Thursday, April 21, 2005 - link

    saratoga, waah? There are similarities between C# and C++. While I agree it's java'ish as well, it definitely has similarities to c++. One could say c# shares similarities with c/c/c++.

    read away:

    http://www.mastercsharp.com/article.aspx?ArticleID...

    http://www.csharphelp.com/archives/archive138.html

    "C# is directly related to C and C++. This is not just an idea, this is real. As you recall C is a root for C++ and C++ is a superset of C. C and C++ shares several syntax, library and functionality." Quoted from above.

    L8r.




  • Jep4444 - Thursday, April 21, 2005 - link

    I've spoken to a few people from XS who have Engineering Samples of the Athlon X2s and all I'm hearing is that they aren't nearly as good as the dual core Opterons; they were apparently rushed.
  • xtknight - Thursday, April 21, 2005 - link

    #86 - the r_smp cvar was disabled in quake3 in a patch, for a reason i don't know. i confirmed this by having quake3 crash on my p4 HT CPU with that setting enabled. as for doom3, i'm not sure. i'm guessing it's not implemented well enough yet...
  • Chuckles - Thursday, April 21, 2005 - link

    #83:
    "Real gamers" may use a single core, but I have been hankering for dualies since I compared an older dual G4 to my newer single G4. Even on the crappy MaxBus, I could browse the web, chat, do "real work" and game, without having everything go to pot when a bolus of e-mail came in.
    When you buy a dualie of any type, you buy the ability to do other stuff while your computer is working on its latest task. Remember that when you get lagged while Outlook downloads your latest spam.

  • Googer - Thursday, April 21, 2005 - link

    Why weren't there any SMP tests done on the Quake 3 engine? After all, it is said to be multithreaded.

    Also, Carmack said during the development of DOOM3 that the engine was going to support multiple processors. Did this ever happen? Does anyone know what the command might be for the D3 console to enable SMP, like its cousin? How much truth is there to this?
  • Nighteye2 - Thursday, April 21, 2005 - link

    Add to all the arguments that we can potentially see programs taking advantage of this quite soon...without the effort required to implement full multi-threading, game functions could be assigned to use the other processor if it's available. For example, AI can be done by one core, while the other core does the rest of running the game.
