The last time we had a turnaround like this was when NVIDIA launched the GeForce FX. Back in 2003, NVIDIA gave us a weekend, Super Bowl weekend to be exact, to review its latest GPU. History was bound to repeat itself, and this time it was AMD keeping us occupied all weekend.

We got a call earlier in the week asking if we'd be able to turn around a review of AMD's Barcelona processor for Monday if we received hardware on Saturday. Naturally we didn't decline, and as we were secretly working on a Barcelona preview already, AMD's timing was impeccable.

What we've been waiting for

AMD shipped us a pair of 2U servers a day early; we actually got them on Friday, but being in Denver at CEDIA we couldn't begin testing until Saturday. Luckily, Johan had had Barcelona in Europe for over a week by this point and was already hard at work on server benchmarks. I augmented Johan's numbers with some additional results on these servers, but I had other plans in mind for the Barcelona system that AMD was sending me.

We went from no Barcelona to fistfuls of Barcelona in one weekend

You see, we've known for a while that Barcelona was going to do well for AMD on the server side. AMD is far more competitive there than in the desktop market, mostly thanks to its Direct Connect architecture, something Intel won't be able to duplicate until the end of 2008 with Nehalem. Barcelona will improve clock-for-clock performance over Opteron and is a drop-in replacement for Socket-1207 servers with nothing more than a BIOS update; the enterprise world couldn't be happier.

Things are different on the desktop; AMD hasn't been competitive since the launch of Core 2 in the summer of 2006, and we're very worried that even after Phenom's late-year launch, the market still won't be competitive. While that's great for consumers today, the concern is that a non-competitive AMD will bring about a more complacent Intel, which we do not want. We want the hungry Intel that we've enjoyed for the past year, we want ridiculous performance and aggressive pricing, and we won't get that without an AMD that can fight.

But AMD won't tell us anything about how Phenom will perform, other than that it will be competitive with Conroe/Kentsfield. So the goal here today is to get an idea of exactly how much faster Barcelona (the same core that'll be in Phenom X4) will be compared to the Athlon 64 X2.

We'll have more Barcelona server content coming as we spend more time with the system, but be sure to check out Johan's coverage to get a good idea of how Barcelona will compete in its intended market. If you're not familiar with the Barcelona/Phenom architecture, or if you're confused as to exactly what Phenom is, here's some required reading before proceeding.

2.0GHz Today, 2.5GHz Tomorrow



  • MadBoris - Monday, September 10, 2007 - link

    hmm, especially if it is only @cas5, as mentioned above.
    It will be interesting to see if it yields anything more than just a few percent, as to scaling, and if benefits compound per socket.
    As to one socket and 4 cores I don't really envision it being that much more than a few percent, but then again, I'm not investing any thought or speculation to try and figure out what will be answered when it actually matters and HW is available.

    Major point for me is, being able to OC a q6600($280) to 3.2GHz - 3.4GHz on air is going to be real stiff competition for AMD's Phenom, as to my purchasing decisions, which is all I am concerned about mainly.

    Also I believe all the talk about "true" quad is going to fall a bit flat for the majority of applications/games in real world comparisons with Kentsfield. Because already anyone that is interested to research it can see that the cache/bus penalties in scaling from 2 to 4 cores are basically nonexistent on applications that actually 'fully leverage' all 4 cores. Some apps will benefit, but I expect this to come to light before long and people will see that the penalty of 2 cores in one (Intel Quad), was more speculation, than actual reality, for 'most' consumer applications and games.

    I do like AMD's advances but we seriously need more frequency, CPI cannot be overlooked.
  • duploxxx - Monday, September 10, 2007 - link

    if AMD is already able to show multiple phenom systems on 3.0GHz without additional cooling (just boxed heatpipe cooler) then i wouldn't be too worried about oc performance of k10
  • ilkhan - Monday, September 10, 2007 - link

    15% over K8 is not going to be enough if it launches at (or at least doesn't overclock easily to) 3.2GHz. At the 2.5 indicated here, yorkfield@3.2+ is going to eat agena for lunch, while being more profitable for intel than agena can hope to be for AMD.
  • JackPack - Monday, September 10, 2007 - link

    Based on these numbers, consumers are likely going to stick with Intel quads.

    Clock for clock, Kentsfield was often >30% faster than Quad FX. Barcelona being 15% faster than K8 is reasonable but it's clearly not going to touch Penryn/Yorkfield.
  • duploxxx - Monday, September 10, 2007 - link

    Although it is nice to see what anand tried to put here on electronic paper, it can't be compared to the real phenom in a few months.

    If you want to know why, check Anand's memory review of a year ago and check how well k8 and also k10 is scaling with better/faster memory.

    in a barcelona rig you have reg 667@cas5.

    so people who are already making conclusions on these benches, one reply: too early.
  • JackPack - Monday, September 10, 2007 - link

    This isn't K8 though. The L3 in Barcelona is going to make it less sensitive to memory bandwidth and latency.
  • Regs - Monday, September 10, 2007 - link

    Memory hits and misses (latency) have nothing to do with the L3. The L3 is there as a buffer for the information being apportioned to the 4 cores.
  • JackPack - Monday, September 10, 2007 - link

    Look up the term "memory hierarchy."
  • Regs - Tuesday, September 11, 2007 - link

    what do you think pulls the data into the L3? God?
  • JackPack - Tuesday, September 11, 2007 - link

    It's called prefetching. The data is in the L3 before the CPU needs it, reducing memory traffic and latency.

    Not only that, but Barcelona has a L3 latency of 20ns. To get data from the main memory, it has to go through all levels of cache. When you look at the cumulative latency of the memory hierarchy, the one or two cycle penalty of RDDR2 is trivial.
