We have already discussed a couple of other aspects of the Banias architecture, based on tidbits of information revealed at previous Intel Developer Forum conferences. The name of the game with Banias is efficiency, and to that end the Israeli design team introduced a technology called micro-ops fusion into the Banias core.

The idea behind micro-ops fusion is to bundle micro-ops (decoded instructions) together before sending them down the pipeline to the execution units. The pipeline is not used unless a fixed number of micro-ops are ready to be sent down the pipe, which improves the overall efficiency of the pipeline. The obvious downside to this approach is increased latency, but as you will see with a number of the design decisions behind Banias, the power savings enable higher overall performance at the end of the day.
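
To make the bundling idea concrete, here is a minimal toy model in Python. It only illustrates the description above (micro-ops are held until a fixed-size group is ready, so fewer pipeline slots are used); it is not Intel's hardware mechanism, and the function name, the bundle size of 2 and the micro-op labels are all assumptions made up for the example.

```python
from typing import Iterable, List


def bundle_micro_ops(micro_ops: Iterable[str], bundle_size: int) -> List[List[str]]:
    """Group micro-ops into fixed-size bundles; a bundle is issued only when full.

    Toy model only: the bundle size and the flush-at-end rule are assumptions.
    """
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    bundles: List[List[str]] = []
    pending: List[str] = []
    for op in micro_ops:
        pending.append(op)
        if len(pending) == bundle_size:
            # The group is complete, so it is sent down the pipe as one unit.
            bundles.append(pending)
            pending = []
    if pending:
        # Leftover micro-ops are flushed as a partial bundle at the end of the stream.
        bundles.append(pending)
    return bundles


# Hypothetical micro-op stream, used purely for illustration.
ops = ["load", "add", "store_addr", "store_data", "cmp", "jmp"]
issued = bundle_micro_ops(ops, bundle_size=2)
print(len(ops), "micro-ops issued in", len(issued), "pipeline slots")
```

In this sketch the first micro-op of each bundle waits for the second one, which is the latency cost mentioned above, while the number of pipeline slots used is cut in half.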

Banias' dedicated stack manager is another power-saving tool integrated into the architecture; it is designed to manage stack pointers and other stack-related data. Remember that stacks are used to store information about the current state of the CPU, including data that cannot be kept in registers because of the limited number of registers available, so a dedicated manager can help performance considerably. As usual, whenever efficiency improves, power consumption improves with it, and that is the case with Banias here as well.

The combination of a very advanced branch predictor, micro-ops fusion and a dedicated stack manager makes Banias a very interesting architecture. Despite having a pipeline that is 20 - 50% longer, Banias still maintains a significantly higher IPC than the Pentium III, which is no small achievement. Remember from our discussions about the Pentium 4 that IPC (Instructions executed Per Clock) is generally reduced by moving to a longer pipeline, but this is made up for by the fact that longer pipeline architectures can reach higher clock speeds. With Banias, we have an architecture that already has a longer pipeline than the Pentium III, thus enabling higher clock speeds, all while boasting a higher IPC; you are in fact getting the best of both worlds with Banias.
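
The trade-off is easier to see with a little arithmetic: instruction throughput is roughly IPC multiplied by clock speed. The Python sketch below uses hypothetical IPC and clock figures chosen only to show the shape of the argument; they are not measurements of the Pentium III, Pentium 4 or Banias.

```python
def million_instructions_per_second(ipc: float, clock_mhz: float) -> float:
    """Approximate throughput: instructions per clock times clocks per second (in MHz)."""
    return ipc * clock_mhz


# Hypothetical figures for illustration only; none of these are measured values.
short_pipeline = million_instructions_per_second(ipc=1.0, clock_mhz=1000)
# A longer pipeline usually gives up some IPC but clocks higher.
long_pipeline = million_instructions_per_second(ipc=0.8, clock_mhz=1500)
# The case described above: a longer pipeline and a higher IPC at the same time.
best_of_both = million_instructions_per_second(ipc=1.1, clock_mhz=1500)

for name, value in [
    ("short pipeline", short_pipeline),
    ("long pipeline", long_pipeline),
    ("longer pipeline, higher IPC", best_of_both),
]:
    print(f"{name}: about {value:.0f} million instructions per second")
```

With these made-up numbers, the longer pipeline wins on clock speed alone, and a design that raises IPC as well comes out ahead of both.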

In order to feed the higher-IPC execution core, Intel outfitted Banias with a 64-bit, 100MHz quad-pumped FSB, identical in design to the Pentium 4's FSB. The Banias FSB is even electrically compatible with the Pentium 4's FSB, which is why any Pentium 4 chipset is able to interface with the chip, as we saw at IDF with an E7501/Banias setup.
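
As a quick check on what those FSB figures mean, the peak bandwidth works out to the bus width in bytes, times the base clock, times four transfers per clock for a quad-pumped bus. The short Python sketch below does that arithmetic; the function and variable names are our own and purely illustrative.

```python
def fsb_peak_bytes_per_second(bus_width_bits: int, base_clock_hz: int,
                              transfers_per_clock: int) -> int:
    """Peak bus bandwidth: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * base_clock_hz * transfers_per_clock


# 64-bit bus, 100MHz base clock, quad-pumped (4 transfers per clock).
banias_fsb = fsb_peak_bytes_per_second(64, 100_000_000, 4)
print(banias_fsb / 1e9, "GB/s peak")  # 3.2 GB/s peak
```

This is a theoretical peak (8 bytes x 400 million transfers per second); sustained bandwidth in practice is lower.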

If you're picking up on the fact that Banias is significantly different from the Pentium III, then you're on the right track…


8 Comments


  • zigCorsair - Wednesday, July 14, 2004 - link

    I thought it was a very informative article. Of course, I'll be upset if it's biased, but being a master's student in CS, many of the exact details I was looking for were in here, and for that I say thank you.
  • Zebo - Monday, May 10, 2004 - link

    I don't see what's so impressive. An Athlon mobile 2600/2800 XP 35W version, which runs at ~2000MHz, will kill these. Too little, too late.
  • Anonymous User - Wednesday, September 10, 2003 - link

    how the hell could this be a balanced and informative article when in their own analysis they ignored their own data?

    There is no mention of the anomalous nature of the BAPCO test... absolutely NOTHING...

    It's enough for me to question the competency of this site... even to the point where I suspect that certain unethical compromises have been made.
  • Anonymous User - Wednesday, September 10, 2003 - link

    Yeah, I agree with Sprockkets... same reason the Athlon XP loses to the P4 in this benchmark... someone was trying to make the P4 look better, and everything else look worse. Now all of a sudden, this great new CPU is getting its butt kicked because of all the P4 optimizations (and probably non-P4 deoptimizations).
  • sprockkets - Tuesday, September 9, 2003 - link

    I wonder why the P4 trashes the PM on Content Creation Performance and nothing else? Maybe it's the stupid skewing toward the P4. Why else would it lose here and kick butt everywhere else? www.theinquirer.net has an article which brought this to readers' attention.
  • Anonymous User - Thursday, August 21, 2003 - link

    "Without a trace cache, the design team was forced to develop a more accurate branch predictor unit for the Banias core. Although beyond the scope of this article, Banias was outfitted with a branch predictor significantly superior to what was in the Pentium III. The end result was a reduction of mispredicted branches by around 20%."

    Wouldn't he mean that the branch predictor was superior to the P4?
  • Anonymous User - Tuesday, August 19, 2003 - link

    looks good
  • Anonymous User - Friday, August 8, 2003 - link

    An outstanding, well-balanced article; after this read I feel I really know about Centrino. Thanks
