Next up, we'll look at floating point performance.

Flops, programmed by Al Aburto, is a very floating-point intensive benchmark. Analyses show that this benchmark contains:

70% floating point instructions;
only 4% branches; and
only 34% memory instructions.

These percentages add up to more than 100% because some of those 70% FP instructions are also memory instructions. Flops is not a real-world workload, but it isolates raw FPU power.

Al Aburto, about Flops:
"Flops.c is a 'C' program which attempts to estimate your systems floating-point 'MFLOPS' rating for the FADD, FSUB, FMUL, and FDIV operations based on specific 'instruction mixes' (see table below). The program provides an estimate of PEAK MFLOPS performance by making maximal use of register variables with minimal interaction with main memory. The execution loops are all small so that they will fit in any cache."
Flops shows the maximum double-precision throughput of the core by making sure that the program fits in the L1 cache. Flops consists of 8 tests, and each test has a different but well-known instruction mix. The most frequently used instructions are FADD (addition), FSUB (subtraction) and FMUL (multiplication).
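
To make the terms "instruction mix" and "MFLOPS" concrete, here is a minimal Python sketch. It is not Flops.c: the kernel, the 25/25/25/25 mix and the names `instruction_mix`, `mflops` and `run_kernel` are assumptions made up for this illustration, and Python's interpreter overhead means the figure it prints says nothing about a CPU's peak rate. It only shows the bookkeeping: count the floating-point operations, time the loop, divide.

```python
import time

# Hypothetical kernel: one add, one subtract, one multiply and one divide per
# iteration, i.e. a 25/25/25/25 mix. This is an assumption, not code from Flops.c.
OPS_PER_ITERATION = {"FADD": 1, "FSUB": 1, "FMUL": 1, "FDIV": 1}


def instruction_mix(ops_per_iteration):
    """Return each operation's share of the total, in percent."""
    total = sum(ops_per_iteration.values())
    return {name: 100.0 * count / total for name, count in ops_per_iteration.items()}


def mflops(operation_count, seconds):
    """Millions of floating-point operations per second."""
    return operation_count / seconds / 1e6


def run_kernel(iterations):
    """Run the kernel on local variables only; return (result, operation count)."""
    x, a, b, c, d = 1.0, 0.5, 0.25, 1.000001, 1.000002
    for _ in range(iterations):
        x = ((x + a) - b) * c / d  # one FADD, one FSUB, one FMUL, one FDIV
    return x, iterations * sum(OPS_PER_ITERATION.values())


if __name__ == "__main__":
    start = time.perf_counter()
    _, ops = run_kernel(1_000_000)
    elapsed = time.perf_counter() - start
    print(instruction_mix(OPS_PER_ITERATION))
    print(f"{mflops(ops, elapsed):.1f} MFLOPS (interpreter overhead included)")
```

The loop touches only a handful of local variables, which loosely mirrors the "register variables with minimal interaction with main memory" idea in the quote above; a compiled C loop would be needed to approach real hardware numbers.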

Module   FADD  FSUB  FMUL  FDIV  iMac G5 1.9GHz (MFLOPS)  iMac Core Duo 1.83GHz (MFLOPS)
1        50%   0%    43%   7%    705                      876
2        43%   29%   14%   14%   490                      366
3        35%   12%   53%   0%    2213                     1216
4        47%   0%    53%   0%    1349                     1178
5        45%   0%    52%   3%    868                      1109
6        45%   0%    55%   0%    1509                     1291
7        25%   25%   25%   25%   341                      235
8        43%   0%    57%   0%    1440                     1264
Average                          1114                     942

One of the G5's strengths is its floating point performance, and we see an example of that here: on average, it holds an 18% performance advantage over the Core Duo, although the Core Duo is ahead in tests 1 and 5. This does complicate the performance picture, as the move to Core Duo isn't necessarily going to be a clean victory for Apple today.
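
The average row and the 18% figure can be re-derived from the eight per-test results. The short Python sketch below does that arithmetic; the scores are copied from the table above, while the helper names (`mean`, `advantage_percent`) are ours and purely illustrative.

```python
# Flops results copied from the table above, tests 1 through 8.
G5 = [705, 490, 2213, 1349, 868, 1509, 341, 1440]
CORE_DUO = [876, 366, 1216, 1178, 1109, 1291, 235, 1264]


def mean(values):
    """Arithmetic mean of a list of scores."""
    return sum(values) / len(values)


def advantage_percent(a, b):
    """How much larger a is than b, in percent."""
    return 100.0 * (a / b - 1.0)


# Tests (numbered from 1) in which the Core Duo score is the higher one.
core_duo_wins = [i + 1 for i, (g5, cd) in enumerate(zip(G5, CORE_DUO)) if cd > g5]

if __name__ == "__main__":
    print("G5 average:", round(mean(G5)))
    print("Core Duo average:", round(mean(CORE_DUO)))
    print(f"G5 advantage: {advantage_percent(mean(G5), mean(CORE_DUO)):.0f}%")
    print("Core Duo ahead in tests:", core_duo_wins)
```

Note that this is a plain arithmetic mean, so the large gap in test 3 weighs heavily on the overall result.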

The last architectural performance test was the Queens benchmark, which does a great job of measuring the performance of a CPU's branch predictor. 

To test the branch prediction, we used the benchmark "Queens". Queens is a very well-known problem in which you have to place n chess queens on an n x n board so that no queen is able to attack any other. The algorithm behind this benchmark is an exhaustive search for such a placement, and it contains some very branch-intensive code.
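
To show why this kind of search is so heavy on branches, here is a minimal backtracking sketch in Python. It is not the benchmark's source code: the function name `count_solutions` and the choice to count every solution (rather than stop at the first one) are assumptions made for illustration.

```python
def count_solutions(n):
    """Count the ways to place n non-attacking queens on an n x n board."""
    columns = [False] * n                # columns already holding a queen
    diag_down = [False] * (2 * n - 1)    # diagonals indexed by row - col + n - 1
    diag_up = [False] * (2 * n - 1)      # diagonals indexed by row + col

    def place(row):
        if row == n:                     # every row has a queen: one solution
            return 1
        found = 0
        for col in range(n):
            d, u = row - col + n - 1, row + col
            # Data-dependent test: whether it passes depends on all the
            # queens placed so far, which makes it hard to predict.
            if columns[col] or diag_down[d] or diag_up[u]:
                continue
            columns[col] = diag_down[d] = diag_up[u] = True
            found += place(row + 1)
            columns[col] = diag_down[d] = diag_up[u] = False  # backtrack
        return found

    return place(0)


if __name__ == "__main__":
    print(count_solutions(8))  # the classic 8-queens board has 92 solutions
```

Almost all of the work is the attack test inside the loop, with no floating point at all. An interpreted sketch like this is far too slow for a 16 x 16 board, so it is only meant to show the shape of the search.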

Queens has about:

23% branches
45% memory instructions
No FP operations

On a PIII, the branch misprediction rate is up to 19% (typical: 9%). Queens runs entirely in the L1 cache.

As Johan mentioned in his article, it seemed as if a good branch predictor was very important to the G5's designers. The necessity for a good branch predictor is also evident when you look at how long it takes the G5 to access main memory. For this test, we looked at Queens performance with 16 queens on the chessboard:

Branch Predictor Performance - Queens (N=16)

The G5 completely dominates the Core Duo here. On a chip with a relatively short pipeline, designers usually pay less attention to branch prediction than on a chip with a longer pipe.

35 Comments

  • ohnnyj - Tuesday, January 31, 2006

    I have already preordered one (did so on the day they were announced), but now I am having serious doubts about keeping the order (does not ship until the 15th). The only thing that really worries me is if Apple will release new MacBooks when Intel releases the Conroe processor. I would think by that time (fall?) they would have most of the programs ported (i.e. Photoshop) and then an even better processor to run it with. I have been waiting so long for a laptop,...decisions, decisions.
  • Furen - Tuesday, January 31, 2006

    I would say you should tough it out for a bit. Like Anand said, this is basically a Public Beta test. Kind of sucks that Apple brought out a 32bit version of the OS considering that it could've been x86-64 native if Apple had waited for a couple of quarters. Then again, it makes no difference if the OS is not 64 bits yet, since a 64 bit version would be able to run 32 bit apps anyway.
  • IntelUser2000 - Tuesday, January 31, 2006

    I wonder if Rosetta itself doesn't take advantage of multi-thread...
  • IntelUser2000 - Tuesday, January 31, 2006

    Wait, doesn't X1600 use H.264 decoding on hardware??
  • smitty3268 - Tuesday, January 31, 2006

    It does if the drivers are set up to use it properly. Given that Windows users only got this about a month ago I'd say it probably isn't doing that yet on Macs. Could be, though.
