Next up, we'll look at floating point performance.

Flops, programmed by Al Aburto, is a very floating-point intensive benchmark. Analyses show that this benchmark contains:

70% floating point instructions;
only 4% branches; and
only 34% memory instructions.
Note that the categories overlap: some of those 70% FP instructions are also memory instructions, which is why the figures add up to more than 100%. Benchmarking with Flops is not a real-world test, but it isolates raw FPU power.

Al Aburto, about Flops:
"Flops.c is a 'C' program which attempts to estimate your systems floating-point 'MFLOPS' rating for the FADD, FSUB, FMUL, and FDIV operations based on specific 'instruction mixes' (see table below). The program provides an estimate of PEAK MFLOPS performance by making maximal use of register variables with minimal interaction with main memory. The execution loops are all small so that they will fit in any cache."
Flops shows the maximum double-precision performance the core can deliver by making sure that the program fits in the L1 cache. Flops consists of 8 tests (the modules in the table below), and each test has a different but well-known instruction mix. The most frequently used instructions are FADD (addition), FSUB (subtraction) and FMUL (multiplication).

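| MOD | FADD | FSUB | FMUL | FDIV | iMac G5 1.9GHz (MFLOPS) | iMac Core Duo 1.83GHz (MFLOPS) |
|---|---|---|---|---|---|---|
| 1 | 50% | 0% | 43% | 7% | 705 | 876 |
| 2 | 43% | 29% | 14% | 14% | 490 | 366 |
| 3 | 35% | 12% | 53% | 0% | 2213 | 1216 |
| 4 | 47% | 0% | 53% | 0% | 1349 | 1178 |
| 5 | 45% | 0% | 52% | 3% | 868 | 1109 |
| 6 | 45% | 0% | 55% | 0% | 1509 | 1291 |
| 7 | 25% | 25% | 25% | 25% | 341 | 235 |
| 8 | 43% | 0% | 57% | 0% | 1440 | 1264 |
| Average | | | | | 1114 | 942 |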

One of the G5's strengths is its floating point performance, and we see an example of that here: on average it holds an 18% performance advantage over the Core Duo, although the Core Duo is faster in two of the eight modules (1 and 5). This does complicate the performance picture, as the move to Core Duo isn't necessarily going to be a clean victory for Apple today.
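For readers who want to check the averages and the 18% figure, here is a minimal Python sketch. The scores are copied from the table above; the variable and function names are ours and purely illustrative.

```python
# MFLOPS per Flops module (1-8), copied from the results table.
g5 = [705, 490, 2213, 1349, 868, 1509, 341, 1440]
core_duo = [876, 366, 1216, 1178, 1109, 1291, 235, 1264]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


g5_avg = mean(g5)
cd_avg = mean(core_duo)

# Relative advantage of the G5 average over the Core Duo average, in percent.
advantage_pct = (g5_avg / cd_avg - 1) * 100

# Modules (numbered from 1) where the Core Duo posts the higher score.
core_duo_wins = [i + 1 for i, (g, c) in enumerate(zip(g5, core_duo)) if c > g]

print(round(g5_avg), round(cd_avg), round(advantage_pct))  # 1114 942 18
print(core_duo_wins)  # [1, 5]
```

Note that this is a plain average of the per-module scores; weighting the modules differently would give a different headline number.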

The last architectural performance test was the Queens benchmark, which does a great job of measuring the performance of a CPU's branch predictor.

To test the branch prediction, we used the benchmark "Queens". Queens is a very well-known problem where you have to place n chess queens on an n x n board. The catch is that no queen may be able to attack any other. The algorithm behind this benchmark is an exhaustive search for such a placement, and it contains some very branch-intensive code.
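To give a feel for why this kind of search is so branch-heavy, here is a minimal backtracking sketch in Python. It is not the benchmark's actual source: the function name `count_queens` is ours, and we assume a search that counts every valid placement rather than stopping at the first one.

```python
def count_queens(n):
    """Count the ways to place n non-attacking queens on an n x n board."""
    cols = set()   # columns that already hold a queen
    diag1 = set()  # occupied "row - col" diagonals
    diag2 = set()  # occupied "row + col" diagonals

    def place(row):
        # All rows filled: one complete, valid placement found.
        if row == n:
            return 1
        total = 0
        for col in range(n):
            # Data-dependent test: this is the kind of conditional that
            # stresses a branch predictor in compiled versions of the search.
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            cols.add(col)
            diag1.add(row - col)
            diag2.add(row + col)
            total += place(row + 1)
            # Backtrack: remove the queen before trying the next column.
            cols.remove(col)
            diag1.remove(row - col)
            diag2.remove(row + col)
        return total

    return place(0)


print(count_queens(8))  # 92
```

Almost every step of the inner loop is a conditional whose outcome depends on the queens placed so far, which is what makes the workload hard to predict. The N=16 board used for the results below would be far too slow for an interpreted sketch like this one.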

Queens has about:

23% branches
45% memory instructions
No FP operations

On a PIII, the branch misprediction rate is up to 19% (typical: 9%). Queens runs entirely in the L1 cache.

As Johan mentioned in his article, it seemed as if a good branch predictor was very important to the chip's designers. The need for a good branch predictor is also evident when you look at how long it takes the G5 to access main memory. For this test, we looked at Queens performance with 16 queens on the chessboard:

Branch Predictor Performance - Queens (N=16)

The G5 completely dominates the Core Duo here. On a chip with a relatively short pipeline, branch prediction usually gets less attention than on a chip with a longer pipeline.

35 Comments

  • Illissius - Tuesday, January 31, 2006

    Compared to native applications, obviously, it's less than ideal; on the other hand, compared to, say, PearPC, it's pretty amazing. (I don't have any data and haven't tried it myself, but from what I've heard I'd suspect it runs at 5%-ish performance; compared to that, 30-70% is a minor miracle.)
    I know it won't interest the end user any whether it could've been even worse, but wanted to point it out, nonetheless ;).
  • yacoub - Tuesday, January 31, 2006

    I wonder how it compares in game- oh, right, Mac. Hehehe ;)
  • DrZoidberg - Tuesday, January 31, 2006

    there is one very popular game on mac.

    World of warcraft....could anandtech pls include a benchie comparing mac with intel core duo vs g5 in wow? It would be interesting to see if apple switching to intel means macs are better at games (or not).
  • fitten - Tuesday, January 31, 2006

    Is the Universal Binary out for WoW yet?
  • Cusqueno - Tuesday, January 31, 2006

    I have a 20" iMac Core Duo and with the default 512 RAM it was bad performance. About 5-10 fps in IronForge and 20-25 elsewhere. When I upgraded to 2 GB RAM it has improved greatly, maybe 10 - 20 in IF and 30 - 40 on the road. I guess this is due to Rosetta using lots of RAM.

    As of last night, there was no Universal binary. But today is patch/reboot day so might be pushed when I get off work. It is supposed to be included with version 1.9.3 according to the WoW forums.
  • fitten - Thursday, February 2, 2006

    That's pretty awesome considering that you're running WoW in emulation (Rosetta).
  • vortmax - Tuesday, January 31, 2006

    Seeing that Rosetta is needed for all MS and Adobe apps. and since using Rosetta seems to take lots of memory, it would be nice to see how it runs with 1gb. Also, some benchmarks from Photoshop would be nice :)

    Thanks Anand!
  • Lifted - Tuesday, January 31, 2006

    "... but those are the ones we want to measure anyways so they have to be there."
  • Eug - Tuesday, January 31, 2006

    Does turning off one core turn off half the cache?

    ie. Is it really Yonah Core Solo, or is it Yonah Celeron M?
  • maconlysource - Wednesday, February 1, 2006

    Where did you get the toolbar single proc- dual proc utility.
    I installed the developer pkg on my Intel iMac but can't find it?
    Can you email me it?

    Thanks.

    Pete.
