A Look at AMD's Dual Core Architecture

Even Intel will admit that the architecture of the Pentium D is not the most desirable, as it is two Pentium 4 cores literally glued together.  The two cores can barely be managed independently from a power consumption standpoint (they still share the same voltage and must run in the same power state), and all communication between cores must go over the external FSB.  The diagram below should illustrate the latter point pretty well:


Intel's Pentium D dual core architecture

Any communication between the two cores has to be done over the external FSB, and obviously, core-to-core communication over an external bus is slow.  It particularly doesn't make sense, since the two cores are on the same die.  Even the 65nm successor to the Pentium D (Presler) will have this same limitation.

AMD's architecture is much more sophisticated, thanks to the K8 architecture's on-die North Bridge.  While we normally only discuss the benefits of the K8's on-die memory controller, the on-die North Bridge is extremely important for dual core.  Instead of having all communication between the cores go over an external FSB, each core will put its request on the System Request Queue (SRQ) and when resources are available, the request will be sent to the appropriate execution core - all without leaving the confines of the CPU's die.  There are numerous benefits to AMD's implementation, and in heavily multithreaded/multitasking scenarios, it is possible for AMD to have a performance advantage over Intel just because of this implementation detail alone. 

The one limitation that both AMD and Intel have is bandwidth.  In order to maintain compatibility with present-day Socket-940 and Socket-939 motherboards, AMD could not increase the pin count of their dual core processors.  The benefit is that AMD's dual core CPUs will work in almost all Socket-940 and Socket-939 motherboards (more on this later), but the downside is that the memory bus remains unchanged at 128 bits wide and supports a maximum memory speed of DDR400.  So, while single core Athlon 64 and Opteron CPUs get a full 6.4GB/s of memory bandwidth, today's dual core CPUs have to share that same 6.4GB/s between two cores instead of one.

AMD's solution to the problem will come in the form of DDR2 and a new socket down the road, but for now there's no getting around the memory bandwidth limitations.  Intel is actually in a better position from a memory bandwidth standpoint.  At this point, their chipsets, with a dual channel DDR2-667 controller, provide more memory bandwidth than a single core needs.  The problem is that the Intel dual core CPUs still run on a 64-bit wide 800MHz FSB, which makes Intel's problem more of an FSB bandwidth limitation than a memory bandwidth limitation.
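To make the bandwidth figures above concrete, here is a minimal sketch of the arithmetic in Python.  The function name `peak_bandwidth_gb_s` is purely illustrative.  The sketch assumes that "800MHz" means 800 million transfers per second on the FSB, that each DDR2 channel is 64 bits wide, and that two cores split the available bandwidth evenly.  None of these details are spelled out above, and real-world throughput will be lower than these theoretical peaks.

```python
def peak_bandwidth_gb_s(bus_width_bits, mega_transfers_per_s, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 * channels / 1e9


# AMD: 128-bit wide memory bus at DDR400 (400 million transfers per second).
amd_memory = peak_bandwidth_gb_s(128, 400)

# Intel: 64-bit wide FSB; "800MHz" is treated as 800 MT/s (assumption).
intel_fsb = peak_bandwidth_gb_s(64, 800)

# Intel chipset: dual channel DDR2-667, assuming 64 bits per channel.
intel_memory = peak_bandwidth_gb_s(64, 667, channels=2)

# Idealized even split between the two cores (assumption, not a measurement).
cores = 2

print(f"AMD memory bus:       {amd_memory:.1f} GB/s total, "
      f"{amd_memory / cores:.1f} GB/s per core")
print(f"Intel FSB:            {intel_fsb:.1f} GB/s total, "
      f"{intel_fsb / cores:.1f} GB/s per core")
print(f"Intel chipset memory: {intel_memory:.1f} GB/s")
```

Under these assumptions, both the AMD memory bus and the Intel FSB come out at 6.4GB/s, while the Intel chipset's memory controller offers roughly 10.7GB/s.  That gap is why Intel's bottleneck sits at the FSB rather than at memory.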

Backwards Compatibility

Intel's dual core Pentium D and Extreme Edition won't work in any previous motherboards, but as we mentioned at the start of this article, AMD has more bang.  Here, the additional bang comes from the almost 100% backwards compatibility with single-core motherboards.  We say "almost" because it's not totally perfect; here's the breakdown:
- On the desktop, the Athlon 64 X2 series is fully compatible with all Socket-939 motherboards.  All you need is a BIOS update and you're good to go.

- For workstations/servers, if you have a motherboard that supports the 90nm Opterons, then all you need is a BIOS update for dual core Opteron support.  If the motherboard does not support 90nm Opterons, then you are, unfortunately, out of luck.

For desktop users, the ability to upgrade your current Socket-939 motherboard to support dual core in the future is a huge benefit from AMD.  While it may not please motherboard manufacturers to lengthen upgrade cycles like this, we have never seen a CPU manufacturer take care of their users like this before.  Even during the Socket-A days when you didn't have to upgrade your motherboard, most users still did because of better chipsets.  AMD's architectural decisions have made those days obsolete.  The next generation of dual core processors will most likely need a new motherboard, but rest assured that you have a solid upgrade path if you have recently invested in a new Socket-939 desktop or Socket-940 system.


144 Comments


  • liebremx - Thursday, April 21, 2005


    Anand, great reading as always.

    I have an observation:

    On the 'Development Performance - Compiling Firefox' section you write
    "This particular test is only single threaded, ..."

    Why not launch a multithreaded build?

    "make -j3 -f client.mk build_all"
  • Jalf - Thursday, April 21, 2005

    Makes good sense for AMD to keep their (server) dualcore chips pricey. AMD has limited manufacturing capacity, and they have the best singlecore solution. In other words, they might as well keep the dualcore prices high, to a) make more money in cases where people are willing to fork over lots of money, and b) keep people who are on a budget interested in their singlecore offerings, at least until their new fab goes online.
  • GentleStream - Thursday, April 21, 2005

    I have some comments about the Firefox compile test. First, thanks a lot for including it. You are using GNU make, and it supports parallel compiles. So, you should be able to replace the line:

    make -f client.mk build_all

    with the line:

    make -j 2 -f client.mk build_all

    to perform a parallel compile using 2 processors. The -j option specifies how many jobs make may run at the same time. You can do parallel compiles on a single processor machine as well as on multi-processor or multi-core machines. It is often the case that using -j 2 or -j 3 on a single processor machine will give the best results because it allows CPU computation and I/O to overlap.

    You don't say whether you did a debug or optimized build. I would recommend doing both the debug and optimized builds and reporting the results of both. When doing parallel optimized compiles, you may want to make sure you are not swapping, although for the server tests it looks like you have plenty of memory - 4 GBytes. I did not immediately see how much memory you were using for the X2 tests. Anyway, I would recommend doing both debug and optimized compiles with -j n where n is 1, 2, 3, and 4 or perhaps just 1, 2, and 4. Since compiles are essential to development work and also embarrassingly parallel, this should provide a really good comparison of the multitasking capabilities of these systems.

    Hope you can do this or at least some of it, and thanks a lot for adding a really good compile test to your test suite.

    Dave
  • michaelpatrick33 - Thursday, April 21, 2005

    The server market is where AMD is headed to get large margins on their chips. With Supermicro joining the AMD camp (they must have seen the performance of the Opteron dualcore, blinked their eyes and said, "we're in"), Dell is left alone holding Intel-only product lines. Intel will not have a response on the server front until Q1 2006. That is troubling for Intel because it gives AMD six months of market buildup and Fab36 time to come online and increase volume tremendously. It should be interesting.

    Imagine a 4800+ on a 939 DFI board running at 2-2-2-8 1t timings versus the P4 Extreme dualcore. Drooling just thinking about having either processor, but especially the AMD
  • erwos - Thursday, April 21, 2005

    "AMD would probably have problems delivering a lower cost dual core in quantities."

    This is exactly it. Why should AMD let demand outstrip supply? Just jack up the price until you've got just enough demand to consume your supply.

    I mean, yes, I'd love an Athlon64 X2 5000+ with 1mb of cache for ~$250, but that's life. AMD stockholders should be pleased with this decision.

    There's also the impending move to socket M2 to consider... the Athlon64 X2 makes sense for people with very low-end A64's, but M2 is going to be the better upgrade path for FX and/or 3800+ users. I would be surprised to see any 939 Athlon64's past 5200+.
  • eetnoyer - Thursday, April 21, 2005

    While our desires as desktop users are for high volumes of X2s at low prices, we have to balance that with what AMD as a company needs to survive...money. AMD is currently capacity constrained with regard to dual-core CPUs with only Fab30. They have entered into agreements with both IBM and Chartered for additional capacity (probably on the lower end chips), but that won't come online until late this year, just before production starts to ramp at Fab36.

    In the meantime, AMD has stated that their order of priority goes Server -> Mobile -> Desktop with the profitability motive in mind. For most users that will be heavily into the multi-tasking benefits of dual-core CPUs, spending $5xx for the low-end X2 vs $1000 for the PEE 840 will be a no-brainer. Seeing how that is a small minority of users, AMD can reasonably supply the demand for them while still maintaining the highest level of availability of dual-core Opterons at much better ASPs. Remember that AMD wants to capture as much market share in the server market as possible while Intel has no response.

    As a share-holder, I hope that the demand for dual-core Opteron is deafening based on the incredible price/performance ratio (thus limiting their ability to produce X2 in high quantity). As a middle-of-the-road desktop user, I'm quite content with my mildly OC'd A64 for the next year or two.
  • ksherman - Thursday, April 21, 2005

    w00t! I'll have to read it later tho...
  • MrHaze - Thursday, April 21, 2005

    Certainly impressive.

    I think it is important to remember that the "Athlon64 X2" was actually an Opteron running ECC RAM at 2T on a less-than-stable motherboard. I think it is best to think of this as a comparison of Intel's dual cores, AMD's single cores, and a hog-tied Athlon64 X2.
    Makes you wonder how an actual X2 with fast memory on a fast motherboard will perform.

    Regardless, I'm really excited about the upgrade potential, and I hope that AMD sticks with socket 939 for a long while.

    Mr.Haze
  • kirbalo - Thursday, April 21, 2005

    Great review Anand...Thanks for fixing your gaming bar charts...they were wacked before!

  • Tapout1511 - Thursday, April 21, 2005

    Sure would have been nice if they had included a single core A64 at 2.2GHz w/ 1MB cache (3500+ right?) to illustrate instances where the extra core was useful and when it wasn't.

    Oh well.