Architecture and Memory Performance

In his No More Mysteries article, Johan found that as a processor, the G5 is quite competitive with modern-day x86 CPUs.  In fact, he found that it offered floating point performance on par with that of the fastest x86 processor - the Athlon 64/Opteron. 

Separately, I looked at Core Duo performance and found that clock-for-clock, it was a pretty solid competitor to AMD's offerings.  Intel had effectively built a chip that matches the performance of AMD's Athlon 64 at lower clock speeds, without the use of an on-die memory controller. 

But now, it's time for judgment day. How does the Core Duo stack up to the G5?  Let's start at one of the G5's weakest points - memory speed.

I turned to lmbench, compiled it for both the G5 and Intel x86 architectures, and used it to give me some hints as to how memory speed has changed with the new platform.

I organized the data in terms of distance from the CPU. So first, we have the performance of the on-die L2 cache of these two chips.  The Core Duo's L2 cache took 7.649ns to access, which at 1.83GHz translates into 14 clock cycles, a number that agrees with my ScienceMark results from previous articles. 

L2 Cache Latency - lmbench 2.5

The G5's L2 cache took 6.329ns to access, which, at 1.9GHz, translates into a 12-cycle L2 - a slight performance advantage over the Core Duo.  Remember that the Core Duo's predecessor originally had a 10-cycle L2 cache, but thanks to the new power saving technology and some other unmentionable (for now) changes to the cache, Core Duo's L2 now takes 14 cycles to access.  Despite the greater access time, it's important to note that Core Duo's L2 cache is four times as large as the G5's. 
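The cycle counts above follow directly from the measured latencies and the clock speeds. As a quick sanity check, here is a minimal Python sketch of that conversion; the function name is my own, and the inputs are simply the lmbench figures and clock speeds quoted in this article:

```python
def ns_to_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Convert an access latency in nanoseconds to CPU clock cycles."""
    # A 1GHz clock ticks once per nanosecond, so cycles = ns * GHz.
    return latency_ns * clock_ghz


# Latencies and clock speeds as quoted in the text above.
core_duo_cycles = ns_to_cycles(7.649, 1.83)
g5_cycles = ns_to_cycles(6.329, 1.9)

print(f"Core Duo L2: {core_duo_cycles:.1f} cycles")
print(f"G5 L2:       {g5_cycles:.1f} cycles")
```

Rounded to whole cycles, this gives the 14-cycle and 12-cycle figures quoted for the two chips.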

The 1.83GHz Core Duo features a 64-bit wide 667MHz FSB, offering 5.336GB/s of bandwidth.  The FSB connects the chip to a 945 Express MCH with a dual channel DDR2-667 memory controller, providing it with 10.6GB/s of memory bandwidth. However, the Intel-based iMac only ships with a single SO-DIMM installed, meaning that it is only operating in single-channel mode - delivering 5.336GB/s of memory bandwidth. I didn't have a DDR2 SO-DIMM on hand to test whether or not installing a second one would actually enable dual channel mode. 

The 1.9GHz G5 features a bi-directional 64-bit wide 633MHz FSB, offering a total of 5.06GB/s of bandwidth.  The chip connects to a North Bridge that appears to have a dual channel DDR2-533 memory controller, but with only one channel active it provides just 4.264GB/s of memory bandwidth. Since that is less than the FSB can carry, the iMac G5 is slightly memory bandwidth starved.
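The bandwidth figures for both platforms are theoretical peaks: bus width in bytes multiplied by the transfer rate. A rough sketch of that arithmetic follows. The helper name is hypothetical, and I am assuming that each DDR2 channel is 64 bits wide and that 1GB/s means 1000MB/s, which is what the quoted numbers imply:

```python
def peak_bandwidth_gbs(width_bits: int, mega_transfers: float, channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (decimal units, 1GB = 1000MB)."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * mega_transfers * channels / 1000


# Core Duo iMac: 64-bit 667MHz FSB, DDR2-667 memory.
core_duo_fsb = peak_bandwidth_gbs(64, 667)
core_duo_single = peak_bandwidth_gbs(64, 667, channels=1)
core_duo_dual = peak_bandwidth_gbs(64, 667, channels=2)

# iMac G5: 64-bit 633MHz FSB, DDR2-533 memory with one channel active.
g5_fsb = peak_bandwidth_gbs(64, 633)
g5_single = peak_bandwidth_gbs(64, 533)

for name, value in [
    ("Core Duo FSB", core_duo_fsb),
    ("Core Duo memory, single channel", core_duo_single),
    ("Core Duo memory, dual channel", core_duo_dual),
    ("G5 FSB", g5_fsb),
    ("G5 memory, single channel", g5_single),
]:
    print(f"{name}: {value:.3f}GB/s")
```

The dual channel result works out to a little under 10.7GB/s, which is the 10.6GB/s figure quoted for the 945 Express. Real-world throughput will always be lower than these peaks.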

In Johan's article, he uncovered that the G5 is in terrible need of a lower latency memory controller, with memory requests taking almost twice as long as on an Intel platform!  Whether it is the G5's FSB or its chipset's memory controller that is at fault is difficult to isolate, but needless to say, the comparison to the Core Duo isn't pretty:

Memory Access Latency - lmbench 2.5

It takes the G5 almost twice as long just to get data back from memory as the Core Duo. That means that the CPU has to waste around twice as many clock cycles as the Core Duo, which leads to higher power consumption and lower performance.  To make matters worse, the G5 only has a 512KB L2 cache, so it has to go to main memory more often than the Core Duo with its massive 2MB L2 cache.  While sticking with a 512KB L2 cache may have kept the CPU small, the G5 really needed a larger cache much earlier in its lifetime (either that or a better FSB/memory controller). 
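To see why a small cache and a slow memory path compound each other, it helps to look at a simple average access time model: every access pays the L2 latency, and the fraction that misses pays the memory latency on top. The sketch below is only an illustration. The L2 latencies are the lmbench numbers from earlier, but the memory latencies and miss rates are hypothetical values I picked to show the shape of the effect, not measurements:

```python
def average_access_ns(l2_hit_ns: float, memory_ns: float, l2_miss_rate: float) -> float:
    """Simplified average access time: L2 latency plus the miss penalty weighted by miss rate."""
    return l2_hit_ns + l2_miss_rate * memory_ns


# L2 latencies: lmbench figures quoted earlier in the article.
# HYPOTHETICAL: the memory latencies and miss rates are made-up illustration
# values (G5 memory roughly twice as slow, smaller cache missing more often).
core_duo = average_access_ns(7.649, memory_ns=100.0, l2_miss_rate=0.02)
g5 = average_access_ns(6.329, memory_ns=190.0, l2_miss_rate=0.04)

print(f"Core Duo average access: {core_duo:.2f}ns")
print(f"G5 average access:       {g5:.2f}ns")
```

Under these assumed inputs, the G5's faster L2 is more than cancelled out by the extra and slower trips to main memory. The model ignores the L1 caches, prefetching and overlapping requests, so treat it as a way to reason about the trade-off rather than a prediction.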

The high latency memory access and slower memory bus are why the G5 suffers tremendously when it comes to memory bandwidth:

Memory Read Speed - lmbench 2.5

Memory Write Speed - lmbench 2.5

It is worth noting, though, that the G5 actually posts a higher memory write speed here than the Core Duo.  It's not easy to explain why; it could very well be a compiler issue. Remember that here, we are relying on gcc 4.0 and not Intel's C compiler to extract the best performance out of the Intel platform.  Over time, you can expect that to change, but for a first showing, it's not terrible. 


35 Comments


  • snookie - Friday, February 3, 2006

    The article is very good but surprisingly makes the same mistake as so many other reviews, which is to test with only 512MB of RAM. The Intel iMac is a much better machine with more RAM and it doesn't make sense to test it with the minimum amount. Also, Universal apps are coming fast and furious on a daily basis. I've got 1.5GB of RAM in mine and lots of the little apps I use every day are already UB and are nice and fast, as are the OS and iLife apps. It won't be long before Windows runs on these as well as Linux, with Red Hat promising support. Check out Bare Feats for some pretty nice benchmarks including games. Yes, Quake 4 will actually run at a decent speed as well as COD 2.
    http://www.barefeats.com/imcd.html
  • csoto - Friday, February 3, 2006

    Your only complaints stem from poor choice of models/configurations. The 20" unit will provide the added resolution, and BTO options allow up to 2GB on the Core Duo and 2.5GB on the G5 (although a 2GB SO-DIMM is listed at >$1K!). This is like me complaining that my minivan doesn't have a navigation system, because I was too cheap to buy the model that came with it :)

    Also, your assertion that the Core Duo is a "public beta" is absurd. You had zero problems running applications. Word from those around me that are testing Core Duos is that for most applications, you don't even notice Rosetta. Pro Apps users would complain, but they're never early adopters, because their apps always lag at least a few months behind the latest platform (remember the "multiprocessor plug-in" that allowed Photoshop to limp along for so long before a "MP-native" version was released?). This is a solid platform transition, likely exceeding the fairly solid (albeit far more daunting for the day) transition from 680x0 to PPC.

    Now if only VMWare would ship Workstation for Mac OS X, then I could ditch the Dell...

    Charles
  • Furen - Sunday, February 5, 2006

    He says he already had an iMac so in order to compare the two I'm guessing he bought the closest-matching one possible. It would hardly do to have a 20" iMac compared with a 17" one in power consumption or running at a different native resolution. I do agree that the RAM limits the system insanely but he went for default specs rather than start improving all the drawbacks each system has.

    The reason why he says this is like a public beta is not because Rosetta sucks or anything of the sort but because there are almost no universal binaries besides those shipped by Apple. Apple chose to bring these systems forwards (at first they had said the systems would come out mid '06, I believe) without having enough of a software base and that's a pretty big drawback.
  • jepapac - Wednesday, February 1, 2006

    I was just wondering if the graphics adapter on the iMac is upgradeable since it is using PCI Express. Does anyone know?
  • aliasfox - Thursday, February 2, 2006

    I'm guessing it's actually the laptop X1600 in the iMac, soldered onto the motherboard. Unfortunate, yes, but given the primary audience that the iMac is targeted at, I'm not surprised.

    Your average home user would rather buy a new $600-1000 box instead of dropping ~$500 for more RAM, a bigger hard drive, new graphics, and a faster processor.
  • Eug - Thursday, February 2, 2006

    quote:

    I'm guessing it's actually the laptop X1600 in the iMac

    Why? Previous iMacs used desktop GPU parts.
  • aliasfox - Thursday, February 2, 2006

    I read somewhere that the 9600 in the second generation iMac G5 was a laptop part, and I therefore assumed that since Apple used the same GPUs in the iMac that it used in PowerBooks (GeForce FX5200, Radeon 9600, X1600), it was sourcing the same parts for both lines.

    Also, I've never read about an integrated 9600 or FX5200 as a desktop part. I might be mistaken though.
  • nizzki - Tuesday, January 31, 2006

    Any idea which compilers Apple has used for their apps? For example, for the PPC apps I assume Apple uses the IBM compiler heavily optimized for PPC instead of GCC.
    If that is the case, with the Intel compiler for OS X still in beta, the current somewhat lackluster performance of the Core Duo might be skewed in PPC's favor. This would be further exacerbated if Apple used GCC to compile the Macintel apps, since it is unlikely to be heavily optimized for the Core Duo architecture.
  • Commodus - Tuesday, January 31, 2006

    Just a heads-up, Anand: the Core Duo iMac is the first iMac model to support desktop spanning, not just mirroring. So if you want, you can hook up even a 23" Cinema Display and get a huge amount of extra workspace. I'd probably only do that with a 20" iMac and the 256 MB video memory option, though.
  • ingoldsby - Tuesday, January 31, 2006

    Perhaps it's just me, but the non-native apps I run seem to run at about the same speed as they natively ran on my G5, while the universal binaries run much faster.

    I would love to see this comparison revisited with a realistic amount of memory in the machine (i.e. 1GB+) instead of limiting the machine to 512MB.
