The Secret Boost of the Opteron 2224

Socket F Opterons have a small secret weapon: a speed bump offers more than just a faster CPU. To understand this, take a look at the table below. We measured the L2 cache's bandwidth with Lavalys Everest 3.51.

Lavalys Everest 3.51 L2 Bandwidth
CPU                        Read (MB/s)   Write (MB/s)   Copy (MB/s)
Dual Xeon 5160 3.0 GHz     22019         17751          23628
Xeon E5345 2.33 GHz        17610         14878          18291
Opteron 2224 SE 3.2 GHz    14636         12636          14630
Opteron 8218HE 2.6 GHz     11891         10266          11891

The L2 cache of the Opteron 8218 at 2.6GHz is slower than the L2 cache of the Core 2-based Xeon E5345 at 2.33GHz. At about 10-12 GB/s it barely matches the theoretical peak bandwidth that dual-channel DDR2 at 667MHz can deliver (10.6 GB/s), while its exclusive nature also forces it to exchange quite a bit of data with the L1 cache. Now combine this table with the following one, where we measured memory bandwidth.

Lavalys Everest 3.51 Memory Bandwidth
CPU                        Read (MB/s)   Write (MB/s)   Copy (MB/s)   Latency (ns)
Dual Xeon 5160 3.0 GHz     3656          2771           3800          112.2
Xeon E5345 2.33 GHz        3578          2793           3665          114.9
Opteron 2224 SE 3.2 GHz    7466          6980           6863          58.9
Opteron 8218HE 2.6 GHz     6944          6186           5895          64
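
The 10.6 GB/s peak quoted for DDR2-667 can be sanity-checked with a few lines of Python. This is only a rough sketch: the 8-byte bus width, the two channels per socket, and the decimal interpretation of Everest's MB/s are assumptions that are not stated in the article, and the function name is made up for illustration.

```python
# Sanity check of the DDR2-667 peak bandwidth quoted in the text.
# Assumptions (not from the article): 64-bit (8-byte) data bus per channel,
# two channels per socket, and 1 MB/s = 10**6 bytes/s in Everest's output.

def peak_bandwidth_gb_s(mega_transfers_per_s, bus_bytes=8, channels=2):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9

ddr2_667_peak = peak_bandwidth_gb_s(666.67)

# Measured values for the Opteron 8218HE, copied from the two tables (MB/s).
l2_mb_s = {"read": 11891, "write": 10266, "copy": 11891}
mem_mb_s = {"read": 6944, "write": 6186, "copy": 5895}

print(f"Dual-channel DDR2-667 peak: {ddr2_667_peak:.2f} GB/s")
for name, table in (("L2", l2_mb_s), ("Memory", mem_mb_s)):
    for op, mb_s in table.items():
        share = mb_s / 1000 / ddr2_667_peak
        print(f"{name} {op}: {mb_s / 1000:.1f} GB/s ({share:.0%} of peak)")
```

Under these assumptions the peak works out to a little under 10.7 GB/s, in line with the figure in the text, and the 8218HE's L2 numbers sit right around that value while its measured memory bandwidth stays well below it.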

It is no secret that a higher clocked integrated memory controller can increase the actual delivered bandwidth of the same DDR2 modules. But it also helps that the L2 cache is able to swallow the bandwidth that the memory is capable of delivering. Also notice that without the use of SSE2 instructions, the memory subsystem of the 5000P chipset delivers relatively disappointing amounts of bandwidth: about 3.6 GB/s of read bandwidth versus roughly 7 GB/s for the Opterons. As most applications do not use carefully tuned SSE2 code to get data from memory, this should reflect the real world situation most of the time. And of course, until Intel introduces the Nehalem family, memory latency will continue to be one of the strong points of AMD.
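
To see how much of the clock increase shows up in the cache and memory numbers, the ratios between the two Opterons can be computed directly from the tables. The sketch below only uses the published figures; the dictionary keys are illustrative names, and it ignores that the two chips sit in different platforms.

```python
# Compare the clock-speed ratio with the measured bandwidth ratios.
# All figures are copied from the two Everest tables above.
fast = {"clock_ghz": 3.2, "l2_read": 14636, "mem_read": 7466, "mem_latency_ns": 58.9}  # Opteron 2224 SE
slow = {"clock_ghz": 2.6, "l2_read": 11891, "mem_read": 6944, "mem_latency_ns": 64.0}  # Opteron 8218HE

clock_ratio = fast["clock_ghz"] / slow["clock_ghz"]
l2_ratio = fast["l2_read"] / slow["l2_read"]
mem_ratio = fast["mem_read"] / slow["mem_read"]
# For latency, lower is better, so divide the slower chip's value by the faster one's.
latency_ratio = slow["mem_latency_ns"] / fast["mem_latency_ns"]

print(f"Clock speed:        {clock_ratio:.3f}x")
print(f"L2 read bandwidth:  {l2_ratio:.3f}x")
print(f"Memory read:        {mem_ratio:.3f}x")
print(f"Memory latency:     {latency_ratio:.3f}x better")
```

The L2 read bandwidth scales almost exactly with the core clock (about 1.23x in both cases), while memory read bandwidth and latency improve by a more modest 7 to 9 percent on the same DDR2-667 modules.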

Processor Latency Comparison (L1, L2, L3, min mem and max mem in clock cycles)
CPU                                L1   L2   L3    min mem   max mem   Absolute latency (ns)
Xeon 5160 3.0 - DDR2 533           3    14   -     69        380       127
Xeon 5160 3.0 - DDR2 667           3    14   -     67        338       113
Core 2 Duo 2.933 - DDR2 533        3    14   -     67        180       61
Quad Xeon E5345 2.33 - DDR2 533    3    14   -     80        280       120
Quad Xeon E5345 2.33 - DDR2 667    3    14   -     80        271       116
Xeon 7130M 3.2 - DDR2 400          4    29   109   245       624       195
Opteron 880 2.4 - DDR333           3    12   -     84        228       95
Opteron 2224 SE - DDR2 667         3    12   -     72        189       59
Opteron 2218 HE - DDR2 667         3    12   -     62        157       60

The latency penalty that FB-DIMM introduces is huge. To get an idea, we added the latency measured with a Core 2 Duo 2.933 using 2x 2GB 533MHz DDR2. The staggering conclusion is that registered FB-DIMMs add, in the worst case, about 200 cycles or 66ns of latency. Sure, some of that latency can be attributed to the buffering which is necessary for server memory. Buffered memory contains registers which hold data for one full clock cycle before it's passed on, so registered memory should add about 8ns (2 clock cycles at 266MHz base clock, DDR2-533). That still leaves the bulk of the extra latency to FB-DIMM itself.
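
The cycle and nanosecond figures in this comparison can be reproduced from the latency table. The following is a minimal sketch that assumes the "max mem" column is counted in core clock cycles, which matches the absolute latencies listed; the helper name is hypothetical.

```python
def cycles_to_ns(cycles, clock_ghz):
    """Convert clock cycles to nanoseconds at the given clock in GHz."""
    return cycles / clock_ghz

# "max mem" cycles and core clocks from the latency table (DDR2-533 rows).
xeon_5160_ns = cycles_to_ns(380, 3.0)     # Xeon 5160 with FB-DIMMs
core2_duo_ns = cycles_to_ns(180, 2.933)   # Core 2 Duo with plain DDR2

fbdimm_penalty_ns = xeon_5160_ns - core2_duo_ns

# Two cycles at the 266 MHz base clock of DDR2-533, as described in the text.
register_delay_ns = cycles_to_ns(2, 0.26667)

print(f"Xeon 5160 worst case:   {xeon_5160_ns:.1f} ns")
print(f"Core 2 Duo worst case:  {core2_duo_ns:.1f} ns")
print(f"Extra latency:          {fbdimm_penalty_ns:.1f} ns")
print(f"Register delay:         {register_delay_ns:.1f} ns")
print(f"Left unexplained:       {fbdimm_penalty_ns - register_delay_ns:.1f} ns")
```

Computed this way the extra latency comes out at roughly 65 ns, close to the 66 ns obtained from the rounded table values (127 ns minus 61 ns), and the register delay of about 7.5 ns accounts for only a small part of it.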

The secondary benefit of FB-DIMMs is that motherboards can use more DIMMs per channel, potentially increasing total memory capacity. However, AMD already gets around this quite easily with up to eight DIMM sockets per CPU socket, so this benefit really doesn't materialize in any reasonable form. The bottom line is that while FB-DIMMs were a potentially good idea from a purely theoretical point of view, it is rather obvious that in practice they have some pretty bad consequences.

30 Comments

  • Spoelie - Monday, August 6, 2007

    Thanks for the clarification, I was under the impression the only real states were idle (1ghz) and full tilt (3.2ghz). Never seen any other states but all I ever use are the desktop chips, I wasn't aware CnQ could be more dynamic than that.
  • yuchai - Monday, August 6, 2007

    I believe all A64 chips including the desktop ones have the different power states. For example my X2 4200+ has 4 states. 1.0, 1.8, 2.0 and 2.2 Ghz.
  • ButterFlyEffect78 - Monday, August 6, 2007

    Are they talking about the Barcelona?

    If not, then this is old news.

    I'm sure everyone by now knows that Intel's new CPUs are better than the current AMD Opterons.
  • KingofFah - Monday, August 6, 2007

    It really isn't. They were demonstrating the new 3.2GHz Opteron. Also, this was a dual socket setup, and Anand said, and everyone who monitors the server world knows, that the Opterons come out ahead overall in the 4S environment.

    The more sockets, the more performance advantage Opterons have over Intel in the server space. This is well known. The purpose of this was to show it in the dual socket environment.
  • duploxxx - Monday, August 6, 2007

    confused, no it is the stupidity of people like you that think that all Intel offerings are better than the ones from AMD.

    @anand, your conclusion of the database world that the quadcore still rules..... where are the benchmarks?

    now it is nice to see all these benches next to each other, but when are you going to combine benches? Servers are no longer used for one application, they are more combined these days with more apps. Maybe it's time you also have a look at VMware ESX etc.... it will probably give you a different look at the offerings of AMD these days.
  • clairvoyant129 - Monday, August 6, 2007

    You don't have to get hostile because he does have a point. In the desktop market, Intel is clearly better unless we're talking about low end. Server market, it's still a toss up but Intel still has a lead.
  • yyrkoon - Monday, August 6, 2007

    Um, you guys obviously have not been paying much attention have you ?

    1) AMD CPUs=cheaper
    2) AMD CPUs of comparable speed perform nearly as good if not as good or better than their Intel counterparts. ie: I think you better check the last benchmarks AnandTech posted 'homie', because I saw a lot of AMD on top of the game benches. (6000+ vs e6600).
    3) Yes, a C2D *may* overclock better, and if it is your intention to overclock, it makes perfect sense to buy one, just be prepared to pay more for the CPU.
    4) Up until recently, or possibly still happening into the near future, AMD system boards available often offered more features for less cost. It does seem however with the P35 chipset, vendors are starting to come around.
    5) last, but not least, THIS article IS NOT about desktop hardware now IS IT ?! why bring some stupid lame ass comment into some place that it does not even fit ? God, and I thought I needed a new life . . .
  • Final Hamlet - Monday, August 6, 2007

    It is these "but"s, that make the difference.
    If they exist, you can't state "all Intel CPUs" anymore, because there are exceptions.
  • ButterFlyEffect78 - Monday, August 6, 2007

    I'm sorry everybody.

    English is my 2nd language so I sometimes can't express what I want to say.

    What I meant to say is that Intel's new line of CPUs based on Core 2 Duo tech are better (more advanced) than those based on K8 technology. If this is not true then there should not be a reason to introduce the K10 later this year to counterattack Core 2 Duo/Quad.

    But again, I could be wrong.
  • Calin - Monday, August 6, 2007

    Core2Duo technology from Intel is better overall than the K8 technology from AMD - this includes basic architecture, current improvements on the initial architecture (K8 is older and has more of those small improvements), and process/production technology.
    However, Intel lagged in introduction of Core2 based server processors, and even now their FBDIMM technology is slower and hotter (power hungry) than AMD's Opteron/DDR. Until this changes, AMD still has a market in servers, albeit not as good as before the Core2Duo Xeon processors.
