More Sockets, but Lower Performance?

When AMD briefed us on Quad FX, the performance focus was on heavy multitasking (AMD calls this "Megatasking") or heavily multi-threaded tests. We figured it was an innocent attempt to make sure we didn't run a bunch of single-threaded benchmarks on Quad FX and proclaim it a failure. Given that the vast majority of our CPU test suite is multi-threaded to begin with, we didn't think there would be any problem showcasing where four cores are better than two, much like we did in our Kentsfield review.

However, when running our SYSMark 2004SE tests, we encountered a result that didn't make sense to us at first, and that somewhat explained AMD's desire for us to focus strongly on megatasking/multithreaded tests. If we pulled one of the CPUs out of the Quad FX system, we actually got higher performance in SYSMark than with both CPUs in place. In other words, four cores were slower than two.

CPU | SYSMark 2004SE | Internet Content Creation | Office Productivity
2 Sockets (4 cores) | 261 | 373 | 182
1 Socket (2 cores) | 288 | 393 | 211

You'll see that in some of the individual tests there is an advantage to having both CPUs installed, but in the vast majority of them performance goes down with four cores. It turns out that there are two explanations for the anomaly.

CPU | Internet Content Creation | 3D Creation | 2D Creation | Web Publication
2 Sockets (4 cores) | 373 | 245 | 514 | 411
1 Socket (2 cores) | 393 | 364 | 453 | 369

First, in the Internet Content Creation suite of SYSMark 2004SE, there appears to be an issue with having two physical CPUs in the system: the 3dsmax rendering test spawns only a single thread, which lowers performance below that of a normal dual-core processor. This problem may be caused by a licensing violation within the benchmark, where it expects to see one physical CPU with multiple cores and isn't prepared to deal with multiple CPUs. Regardless of the exact cause, it doesn't appear to be anything more than a benchmark issue. The performance in the Office Productivity suite is far more worrisome, because no benchmark issue is causing the problem there.

CPU | Office Productivity | Communication | Document Creation | Data Analysis
2 Sockets (4 cores) | 182 | 171 | 259 | 137
1 Socket (2 cores) | 211 | 187 | 285 | 176
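
To put the SYSMark deltas in perspective, here is a minimal Python sketch that computes the relative change of the four-core result against the two-core result for every score in the tables above. The names `SCORES` and `percent_change` are our own illustrative choices and are not part of SYSMark; the numbers are copied directly from the tables.

```python
# Scores copied from the SYSMark 2004SE tables above.
# Each entry is (2 sockets / 4 cores, 1 socket / 2 cores).
SCORES = {
    "SYSMark 2004SE overall": (261, 288),
    "Internet Content Creation": (373, 393),
    "3D Creation": (245, 364),
    "2D Creation": (514, 453),
    "Web Publication": (411, 369),
    "Office Productivity": (182, 211),
    "Communication": (171, 187),
    "Document Creation": (259, 285),
    "Data Analysis": (137, 176),
}


def percent_change(four_core, two_core):
    """Relative change of the 4-core score versus the 2-core score, in percent."""
    return (four_core - two_core) / two_core * 100.0


for name, (four, two) in SCORES.items():
    # Negative values mean the second socket made things slower.
    print(f"{name:28s} {percent_change(four, two):+6.1f}%")
```

Run as is, this shows 3D Creation losing roughly a third of its score with both sockets populated, 2D Creation and Web Publication gaining, and every Office Productivity sub-score dropping.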

The Office Productivity suite of SYSMark 2004SE wasn't the only place where we saw lower performance on Quad FX than on a single dual-core setup. 3D games seemed to suffer the most; take a look at what happens in our Oblivion and Half Life 2: Episode One tests:

CPU | Oblivion - Bruma | Oblivion - Dungeon | Half Life 2: Episode One
2 Sockets (4 cores) | 67.3 | 78.3 | 155.8
1 Socket (2 cores) | 75.2 | 90.9 | 165.7

Once again, populate both sockets in the Quad FX system and performance goes down. The explanation for these anomalies lies in the result of one more benchmark, CPU-Z's memory latency test:

CPU | CPU-Z Latency (8192KB, 128-byte)
2 Sockets (4 cores) | 55.3 ns
1 Socket (2 cores) | 43.3 ns

With both sockets populated, memory latency goes up by around 28%, so applications that are latency sensitive and don't necessarily need all four cores perform worse than they do on a single dual-core CPU. The added latency comes from the additional coherency probing over the HT bus that's done whenever a memory request is made, to see where the latest copy of the data resides.

It's a problem that will go away if you have a single quad-core CPU with one memory controller, but one that makes Quad FX a tougher pill to swallow compared to Intel's quad-core offerings.
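
The latency penalty is simple arithmetic on the CPU-Z numbers. The short Python sketch below, in which the helper name `percent_increase` and the `games` dictionary are our own illustrative choices, sets the latency increase beside the frame-rate drops from the game tests; all figures are copied from the tables above.

```python
def percent_increase(new, old):
    """Percentage change going from `old` to `new`."""
    return (new - old) / old * 100.0


# CPU-Z memory latency in ns: both sockets populated vs. a single socket.
latency = percent_increase(55.3, 43.3)

# Game results from the table above: (2 sockets / 4 cores, 1 socket / 2 cores).
games = {
    "Oblivion - Bruma": (67.3, 75.2),
    "Oblivion - Dungeon": (78.3, 90.9),
    "Half Life 2: Episode One": (155.8, 165.7),
}

print(f"Memory latency: {latency:+.1f}%")
for name, (two_sockets, one_socket) in games.items():
    # Negative values mean populating the second socket cost frame rate.
    print(f"{name}: {percent_increase(two_sockets, one_socket):+.1f}%")
```

The latency increase works out to roughly 27.7%, while all three game results come out negative, which is consistent with latency-sensitive workloads paying for the second socket without using the extra cores.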

88 Comments

  • Neosis - Friday, December 1, 2006

    quote:

    the Kentsfield has exactly the same latency as a 2 socket dual core because the 2 dual cores on-board don't talk directly with each other.


    However (in my opinion), since all four of these cores share the same 8MB L2 cache and Intel's memory disambiguation forces all cores to use the L2 cache more, the additional latencies are not as significant as on AMD's 4x4 platform. But you are right again that connecting the dies through the FSB requires all die-to-die communication to go back to the Northbridge and into system memory. That can be a serious performance issue once AMD has competing processors.
  • mino - Friday, December 1, 2006

    Kentsfield == 2 Conroes stuck on 1 FSB. They have _separate_ 4M L2 cache. No 8M L2 on the horizon..
  • Neosis - Thursday, November 30, 2006

    Where is edit button?

    The first sentence should be "I think ..."
  • Neosis - Thursday, November 30, 2006

    I don't think AMD can compete with Kentsfield even with this platform. Enthusiasts usually don't care about power consumption and heat problems. A water cooling system (with a large radiator and a strong pump) will do just fine. The main concern is neither the power consumption nor the heat problems. When you install two dual-core processors, performance goes down due to the increased latency. Intel is leading in nearly all benchmarks. No surprise that only one motherboard manufacturer was in on it.

    Even though I'm an AMD user, I don't see any particular reason people will buy this. But I can say why not:
    - no one knows how long AMD will support this platform. In the past years AMD has been changing sockets almost every year and a half. We know Socket AM3 will use DDR3.
    - pricing
  • Griswold - Saturday, December 2, 2006

    quote:

    - no one knows how long AMD will support this platform. In the past years AMD has been changing sockets almost every year and a half. We know Socket AM3 will use DDR3.


    Well, the first part isn't quite true or very precise. As for the second part, we also know that AM3 CPUs will run in AM2 sockets (but not vice versa). On top of that, we're talking about Socket F here and not AMx.

    If you want to name a good reason to not buy this: The other option is just that much better. End of story. If you want quad AMD, wait 6 months.
  • Gigahertz19 - Thursday, November 30, 2006

    On Black Friday I was at Circuit City and some store employee near me was telling this woman who was looking for a computer to make sure she buys a computer with an AMD processor because they're faster and all-around better. I couldn't stand there and let him lie to that woman, so I went over there and told her she needs to buy a comp with a Core 2 Duo and gave my reasons. Then the Circuit City guy went into this rant about AMD and the 5000+ processor and how it's the best, haha, apparently he hasn't updated his knowledge for quite some time. I could have stood there and argued it but I just said okay and walked away, didn't want to make a scene...plus how geeky would that be, arguing over processors in the middle of a store where customers are.

    Anyways, looks like Intel Core 2 Duo tech is the thing to get. I'm still running an old XP-M overclocked with a DFI Socket A mobo. I want to upgrade to Core 2 Duo sometime soon, probably get the Core 2 E6600 only because it has 4MB cache and the slower speed ones don't. Overclock that baby to 3GHz, which should be a given with the right mobo like the Evga one, and I'll have an awesome system, probably buy an X1950 XT or Pro for around $250 then upgrade to DX10 when it gets cheaper.
  • madnod - Thursday, November 30, 2006

    I am really into AMD and I have been buying AMD for the last 4 years, but this time Intel is really pushing ahead.
    There is a major thing that Intel is doing these days and it's really funny to see the way AMD is responding to it. It kind of reminds me of the 3DFX approach: start stacking more of the things that you already have and wish that things will get better.
    AMD should expedite their transition to the newer CPU design; the current K8 architecture can't keep up with the Core technology.
  • THX - Thursday, November 30, 2006

    Very nice tests. I can't believe the power draw AMD is dealing with here.
  • Ecmaster76 - Thursday, November 30, 2006

    The pin count of AM2 probably isn't an issue. It has as many pins as 940, which can handle multiple HT links and dual-channel memory.

    AMD just moved it to the other socket to keep people from buying the bundled CPUs and selling them individually for a profit. The 2.6 GHz model, for example, runs about $100 less per chip than the standard X2 does.
  • punjabiplaya - Thursday, November 30, 2006

    Are we going to see updated benchmarks with 64-bit performance and/or Vista, and once there is a BIOS fix for the NUMA issues on the board (not the WinXP shortfalls as far as NUMA is concerned; Vista should take care of that)?
    Just curious.