Architecture and Memory Performance

When Johan did his No More Mysteries article, he found that as a processor, the G5 is quite competitive with modern day x86 CPUs.  In fact, he found that it offered floating point performance on par with that of the fastest x86 processor - the Athlon 64/Opteron. 

Separately, I looked at Core Duo performance and found that clock-for-clock, it was a pretty solid competitor to AMD's offerings.  Intel had effectively created a chip with performance equal to AMD's Athlon 64 at lower clock speeds, without the use of an on-die memory controller. 

But now, it's time for judgment day. How does the Core Duo stack up to the G5?  Let's start at one of the G5's weakest points - memory speed.

I turned to lmbench, compiled it for both the G5 and Intel x86 architectures, and used it to get some hints as to how memory speed has changed with the new platform.

I organized the data in terms of distance from the CPU. So first, we have the performance of the on-die L2 cache of these two chips.  The Core Duo's L2 cache took 7.649ns to access, which at 1.83GHz translates into 14 clock cycles, a number that agrees with my ScienceMark results from previous articles. 

L2 Cache Latency - lmbench 2.5

The G5's L2 cache took 6.329ns to access, which at 1.9GHz, translates into a 12 cycle L2 - a slight performance advantage over the Core Duo.  Remember that the Core Duo's predecessor originally had a 10 cycle L2 cache, but thanks to the new power saving technology and some other unmentionable (for now) changes to the cache, Core Duo's L2 now takes 14 cycles to access.  Despite the greater access time, it's important to note that Core Duo's L2 cache is four times as large as the G5's. 
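The cycle counts quoted above follow directly from the measured latencies and the clock speeds of the two chips. As a rough sketch of that arithmetic (the function name is just for illustration and is not part of lmbench):

```python
def latency_in_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Convert an access latency in nanoseconds to CPU clock cycles."""
    # A clock running at f GHz completes f cycles every nanosecond,
    # so cycles = nanoseconds * GHz.
    return latency_ns * clock_ghz


# Latencies and clock speeds as quoted in the text above.
core_duo_cycles = latency_in_cycles(7.649, 1.83)
g5_cycles = latency_in_cycles(6.329, 1.9)

print(f"Core Duo L2: {core_duo_cycles:.1f} cycles")
print(f"G5 L2:       {g5_cycles:.1f} cycles")
```

Rounded to whole cycles, this gives 14 for the Core Duo and 12 for the G5, matching the figures in the text.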

The 1.83GHz Core Duo features a 64-bit wide 667MHz FSB, offering 5.336GB/s of bandwidth.  The FSB connects the chip to a 945 Express MCH with a dual channel DDR2-667 memory controller, which is capable of 10.6GB/s of memory bandwidth. However, the Intel based iMac ships with only a single SO-DIMM installed, meaning that it operates in single-channel mode and delivers just 5.336GB/s of memory bandwidth. I didn't have a DDR2 SO-DIMM on hand to test whether or not installing a second one would actually enable dual channel mode. 

The 1.9GHz G5 features a bi-directional 64-bit wide 633MHz FSB, offering a total of 5.06GB/s of bandwidth.  The chip connects to a North Bridge that appears to have a dual channel DDR2-533 memory controller. With only one channel active, it provides just 4.264GB/s of memory bandwidth, which is less than the FSB can carry. This means that the iMac G5 is slightly memory bandwidth starved.
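The bandwidth figures for both platforms come from multiplying the bus width by the transfer rate. Here is a minimal sketch of that calculation, assuming the quoted MHz figures are effective transfer rates and that each memory channel is 64 bits wide (the function name is my own, for illustration only):

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfers_mhz: float, channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB here)."""
    bytes_per_transfer = bus_width_bits / 8
    # MB/s = bytes per transfer * millions of transfers per second
    return bytes_per_transfer * transfers_mhz * channels / 1000


core_duo_fsb = peak_bandwidth_gbs(64, 667)               # 667MHz FSB
core_duo_mem_dual = peak_bandwidth_gbs(64, 667, 2)       # DDR2-667, both channels
core_duo_mem_single = peak_bandwidth_gbs(64, 667)        # DDR2-667, one SO-DIMM
g5_fsb = peak_bandwidth_gbs(64, 633)                     # 633MHz FSB
g5_mem_single = peak_bandwidth_gbs(64, 533)              # DDR2-533, one channel

print(f"Core Duo FSB:            {core_duo_fsb:.3f} GB/s")
print(f"Core Duo memory (dual):  {core_duo_mem_dual:.3f} GB/s")
print(f"Core Duo memory (single):{core_duo_mem_single:.3f} GB/s")
print(f"G5 FSB:                  {g5_fsb:.3f} GB/s")
print(f"G5 memory (single):      {g5_mem_single:.3f} GB/s")
```

The dual channel result works out to about 10.67GB/s, which the text quotes as 10.6GB/s. These are theoretical peaks, not what lmbench actually measures.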

In Johan's article, he uncovered that the G5 is in terrible need of a lower latency memory controller, with memory requests taking almost twice as long as on an Intel platform!  Whether it is the G5's FSB or its chipset's memory controller that is at fault is difficult to isolate, but needless to say, the comparison to the Core Duo isn't pretty:

Memory Access Latency - lmbench 2.5

It takes the G5 almost twice as long just to get data back from memory as the Core Duo. That means that the CPU has to waste around twice as many clock cycles as the Core Duo, which leads to higher power consumption and lower performance.  To make matters worse, the G5 only has a 512KB L2 cache, so it has to go to main memory more often than the Core Duo with its massive 2MB L2 cache.  While sticking with a 512KB L2 cache may have kept the CPU small, the G5 really needed a larger cache much earlier in its lifetime (either that or a better FSB/memory controller). 

The high latency memory access and slower memory bus are why the G5 suffers tremendously when it comes to memory bandwidth:

Memory Read Speed - lmbench 2.5

Memory Write Speed - lmbench 2.5

It is worth noting, though, that the G5 actually posts a higher memory write speed here than the Core Duo.  It's not easy to explain why, as it could very well be a compiler issue. Remember that here, we are relying on gcc 4.0 and not Intel's C compiler to extract the best performance out of the Intel platform.  Over time, you can expect that to change, but for a first showing, it's not terrible. 


35 Comments


  • Illissius - Tuesday, January 31, 2006 - link

    Compared to native applications, obviously, it's less than ideal; on the other hand, compared to, say, PearPC, it's pretty amazing. (I don't have any data and haven't tried it myself, but from what I've heard I'd suspect it runs at 5%-ish performance; compared to that, 30-70% is a minor miracle.)
    I know it won't interest the end user any whether it could've been even worse, but wanted to point it out, nonetheless ;).
  • yacoub - Tuesday, January 31, 2006 - link

    I wonder how it compares in game- oh, right, Mac. Hehehe ;)
  • DrZoidberg - Tuesday, January 31, 2006 - link

    there is one very popular game on mac.

    World of warcraft....could anandtech pls include a benchie comparing mac with intel core duo vs g5 in wow? It would be interesting to see if apple switching to intel means macs are better at games (or not).
  • fitten - Tuesday, January 31, 2006 - link

    Is the Universal Binary out for WoW yet?
  • Cusqueno - Tuesday, January 31, 2006 - link

    I have a 20" iMac Core Duo and with the default 512 RAM it was bad performance. About 5-10 fps in IronForge and 20-25 elsewhere. When I upgraded to 2 GB RAM it has improved greatly, maybe 10 - 20 in IF and 30 - 40 on the road. I guess this is due to Rosetta using lots of RAM.

    As of last night, there was no Universal binary. But today is patch/reboot day so might be pushed when I get off work. It is supposed to be included with version 1.9.3 according to the WoW forums.
  • fitten - Thursday, February 2, 2006 - link

    That's pretty awesome considering that you're running WoW in emulation (Rosetta).
  • vortmax - Tuesday, January 31, 2006 - link

    Seeing that Rosetta is needed for all MS and Adobe apps. and since using Rosetta seems to take lots of memory, it would be nice to see how it runs with 1gb. Also, some benchmarks from Photoshop would be nice :)

    Thanks Anand!
  • Lifted - Tuesday, January 31, 2006 - link

    "... but those are the ones we want to measure anyways so they have to be there."
  • Eug - Tuesday, January 31, 2006 - link

    Does turning off one core turn off half the cache?

    ie. Is it really Yonah Core Solo, or is it Yonah Celeron M?
  • maconlysource - Wednesday, February 1, 2006 - link

    Where did you get the toolbar single proc- dual proc utility.
    I installed the developer pkg on my Intel iMac but can't find it?
    Can you email me it?

    Thanks.

    Pete.
