Execution Core Improvements

Intel lengthened the pipeline on Prescott but did not give the CPU any new execution units. The chip can therefore be clocked higher to crunch more data, but at the same clock speed there are no additional units to help it work any faster.

Despite the lack of any new execution units (this is nothing to complain about; remember, the Athlon 64 has the same number of execution units as the Athlon XP), Intel did make two very important changes to the Prescott core that were made possible by the move to 90nm.

Both of these changes can positively impact integer multiply operations; with one being a bit more positive than the other. Let us explain:

The Pentium 4 has three Arithmetic and Logic Units (ALUs) that handle integer code (code that operates on integer values - the vast majority of code you run on your PC). Two of these ALUs can crank out operations twice every clock cycle, and thus Intel marketing calls them "double pumped" and says that they operate at twice the CPU's clock speed. These ALUs are used for simple instructions that are easily executed within 1/2 of a clock cycle. This helps the Pentium 4 reach very high clock speeds (the principle of doing less work per cycle).
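To put rough numbers on the "double pumped" idea, here is a minimal Python sketch. The 3.2GHz clock is only an assumed example figure and the function names are made up for illustration; none of it is taken from Intel documentation.

```python
def fast_alu_effective_ghz(core_clock_ghz):
    """Effective rate of a double-pumped ALU: two simple operations per core clock."""
    return 2 * core_clock_ghz


def half_cycle_ns(core_clock_ghz):
    """Time budget, in nanoseconds, for an operation that must finish in half a cycle."""
    cycle_ns = 1.0 / core_clock_ghz  # one full cycle at N GHz lasts 1/N ns
    return cycle_ns / 2


if __name__ == "__main__":
    clock = 3.2  # assumed example clock speed in GHz
    print(fast_alu_effective_ghz(clock))
    print(half_cycle_ns(clock))
```

At the assumed 3.2GHz, the fast ALUs behave as if they ran at 6.4GHz, which leaves each simple operation a little over 0.15ns to complete.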

More complicated instructions are sent to a separate ALU that runs at the core frequency, so that instead of complex instructions slowing down the entire CPU, the Pentium 4 can run at its high clock speeds without being bogged down by these complex instructions.

Before Prescott, one type of operation that would run on the slow ALU was a shift/rotate. One place where shifts are used is when multiplying by 2; if you want to multiply a number in binary by 2 you can simply shift the bits of the number to the left by 1 bit - the resulting value is the original number multiplied by 2.
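The shift-as-multiply trick can be sketched in a few lines of Python. This is only an illustration: the 32-bit mask is an assumption standing in for a fixed-width register, and `shl32` and `rotl32` are hypothetical helper names, not anything defined by Intel.

```python
MASK32 = 0xFFFFFFFF  # assumption: model a 32-bit register


def shl32(value, count=1):
    """Logical left shift; bits pushed past bit 31 are discarded."""
    return (value << count) & MASK32


def rotl32(value, count=1):
    """Left rotate; bits pushed past bit 31 re-enter at bit 0."""
    count %= 32
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


if __name__ == "__main__":
    print(shl32(21))           # shifting left by 1 doubles the value
    print(shl32(5, 3))         # shifting left by 3 multiplies by 2**3
    print(rotl32(0x80000000))  # the top bit wraps around to the bottom
```

Note that the shift only matches a true multiply as long as no set bits fall off the top of the register; `shl32(0x80000000)` gives 0 rather than a doubled value.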

In Prescott, a shift/rotate block has been added to one of the fast ALUs so that simple shifts/rotates may execute quickly.

The next improvement comes with actual integer multiplies; before Prescott, all integer multiplies were actually done on the floating point multiply unit and then sent back to the ALUs. Intel finally included a dedicated integer multiplier in Prescott, thanks to the ability to cram more 90nm transistors into a die smaller than before. The inclusion of a dedicated integer multiplier is what lies behind Prescott's "reduced integer multiply" latency claim.

Integer multiplies are quite common in all types of code, especially where array traversal is involved.
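To see why array traversal leans on integer multiplies, consider how the position of an element is typically computed. The sketch below is a simplified model in Python, assuming a row-major two-dimensional array; `element_offset` and its parameters are hypothetical names chosen for illustration.

```python
def element_offset(row, col, num_cols, elem_size):
    """Byte offset of element [row][col] in a row-major 2D array.

    Two integer multiplies are needed: one to skip whole rows and one to
    scale the element index by the element size.
    """
    index = row * num_cols + col
    return index * elem_size


def row_offsets(row, num_cols, elem_size):
    """Offsets visited when walking across one row of the array."""
    return [element_offset(row, col, num_cols, elem_size) for col in range(num_cols)]


if __name__ == "__main__":
    # Assumed example: 10 columns of 4-byte integers
    print(element_offset(2, 3, 10, 4))
    print(row_offsets(1, 3, 12))
```

When the element size or row length is a power of two, a compiler can usually replace the multiply with a shift; with other sizes, such as the 12-byte elements in the second call, a real multiply (or an equivalent sequence of simpler operations) is required.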
