Intel's 90nm Pentium M 755: Dothan Investigated
by Anand Lal Shimpi on July 21, 2004 12:05 AM EST
The 5 Things that Comprise Dothan
There are five basic parts of Dothan that differentiate it from Banias, but unfortunately (just as was the case with Banias), Intel is not very forthcoming with details about Dothan out of a desire to guard its intellectual property. Even a year after the Pentium M's release, we have yet to see any serious competition for it, and Intel wants it to remain that way for as long as possible.
That being said, we will try to be as specific about the details of Dothan as possible, and we'll start with the most obvious: its 90nm process.
90nm process and 2MB L2
Banias was built on Intel's 0.13-micron manufacturing process when that process was at its peak. Because the process was tried and true, Banias faced no manufacturing delays and could hit its target clock speeds without a problem.
Dothan gets its most noticeable improvements over Banias from the move to Intel's smaller 90nm manufacturing process. This is the same process that's used in the manufacturing of Prescott, which means a couple of things. For starters, it explains why availability of Dothan hasn't been great since its launch, as 90nm production is still ramping. The availability problem aside, 90nm gives Dothan the ability to cram almost twice as many transistors onto the chip without increasing the overall die size compared to Banias.
Dothan is now a 140 million transistor chip (up from 77 million in Banias), with those 140 million transistors occupying almost the same die area as Banias: 84 mm^2, with Banias about 1 mm^2 smaller. Almost twice the transistors with no meaningful increase in die size? It's a chip manufacturer's dream. Because the die size is essentially unchanged, yields should not differ between Banias and Dothan (once Intel's 90nm process has truly matured), and it shouldn't cost Intel any more to produce Dothan than it did Banias.
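To put numbers on "almost twice the transistors", here is a quick back-of-the-envelope calculation. It uses only the figures quoted above, and treats "about 1 mm^2 smaller" as exactly 83 mm^2 for Banias, which is an assumption. The ideal-shrink figure is a textbook simplification (area scaling with the square of the feature size), not something Intel has stated.

```python
# Figures quoted in the article.
banias_transistors = 77e6
dothan_transistors = 140e6
dothan_area_mm2 = 84.0
# Assumption: "about 1 mm^2 smaller" is taken as exactly 1 mm^2.
banias_area_mm2 = dothan_area_mm2 - 1.0


def density(transistors, area_mm2):
    """Return transistors per square millimetre."""
    return transistors / area_mm2


transistor_ratio = dothan_transistors / banias_transistors
density_ratio = (density(dothan_transistors, dothan_area_mm2)
                 / density(banias_transistors, banias_area_mm2))

# Simplified ideal shrink from 130nm to 90nm: area scales with the
# square of the feature size. Real designs rarely scale this cleanly.
ideal_density_ratio = (130 / 90) ** 2

print(f"Transistor count ratio: {transistor_ratio:.2f}x")
print(f"Density ratio:          {density_ratio:.2f}x")
print(f"Ideal shrink ratio:     {ideal_density_ratio:.2f}x")
```

Under these assumptions, Dothan carries roughly 1.8 times the transistors of Banias at roughly 1.8 times the density, a bit short of the roughly 2.1 times that a perfect linear shrink would give.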
The majority of the increase in transistor count is thanks to Dothan's 2MB L2 cache, twice the size of Banias' 1MB cache. The 64KB L1 cache is the same as the one in Banias.
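As a rough sanity check on that claim, the sketch below estimates how many transistors an extra 1MB of cache would need. It assumes a conventional six-transistor (6T) SRAM cell per bit and counts only the data array, ignoring tags, error correction and control logic. Intel has not disclosed Dothan's cell design, so the 6T figure is purely an assumption.

```python
# Assumption: a conventional 6T SRAM cell; Intel has not confirmed
# the cell design used in Dothan's L2.
TRANSISTORS_PER_BIT = 6

extra_cache_bytes = 1 * 1024 * 1024   # 2MB L2 minus Banias' 1MB L2
extra_cache_bits = extra_cache_bytes * 8

# Data array only: tags, ECC and control logic are not counted.
extra_cache_transistors = extra_cache_bits * TRANSISTORS_PER_BIT

# Transistor counts quoted in the article.
total_increase = 140e6 - 77e6

share = extra_cache_transistors / total_increase
print(f"Extra L2 data array: {extra_cache_transistors / 1e6:.1f}M transistors")
print(f"Share of the total increase: {share:.0%}")
```

Under the 6T assumption, the extra megabyte of data array alone comes to about 50 million transistors, or roughly 80% of the 63 million transistor increase, which is consistent with the cache accounting for the majority of the growth.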
We believe that Intel is using the same 90nm SRAM cells from Prescott in Dothan. If so, then the extremely small 84 mm^2 die is further enabled by the significantly smaller 90nm SRAM cells that Intel developed. However, it is not clear to us how independent the SRAM cell designs of Banias and Dothan are from those of the desktop chips, given the mobile parts' unique power requirements.
Along with a larger L2 cache, Intel has increased how aggressively Dothan prefetches data into its cache in order to take advantage of the extra on-die L2. This is a fairly common practice: when an architecture stays the same but its cache grows, microprocessor designers prefetch more aggressively to help improve performance.
The 90nm process will also allow Dothan to scale up in clock speed, thanks in part to Intel's strained silicon technology. We're already seeing the fruits of this today with Dothan's introductory 2GHz clock speed (up from Banias' 1.6GHz introductory speed), and Dothan will move beyond 2GHz by the end of 2004. Remember that Intel's design philosophy with Dothan, just like Banias, is to design the chip for a specific power consumption and to leave clock speed scaling mostly up to the manufacturing process.
Dothan's 90nm manufacturing process, in the end, gives it the higher clock speeds and larger L2 cache, which offer some of the more tangible advantages over Banias. Another very important fact to keep in mind is that these are the only major changes to Banias that make up Dothan; unlike Prescott, the pipeline has not been changed at all. Even Intel's Dothan design team views Prescott as a bit of a risky move, since it tries out significant modifications to the architecture alongside a brand new manufacturing process. Thus, it's no surprise that Dothan remains relatively unchanged architecturally outside of the move to 90nm; the pipeline and L1 cache are identical to those of Banias.
Micro Ops Fusion
Intel has been deliberately vague about Banias' micro ops fusion, and it continues to be so about the modifications to the micro ops fusion engine in Dothan. All that we are allowed to publish is that Dothan now allows more types of micro ops to be fused. That isn't a bad thing, but it would be nice to know which ones, and what enables Dothan to support the fusing of more micro ops.
Local Branch Prediction Improvements
With Dothan, there have been some improvements to branch prediction in order to reduce power consumption and increase performance. Remember that the fewer branch mispredicts you have, the less power is wasted on refilling the pipeline after a flush.
One of the biggest improvements to Dothan's branch predictors is in its loop detector. Although most don't think of a loop as a branch, all loops either end or begin with some sort of comparison that determines whether the loop should continue to execute (e.g. if i < 10, then keep looping). Loops are normally handled by a static branch predictor that always predicts taken once a loop is detected, and usually the only mispredicts that occur once a loop is detected are at the end of the loop. While this works fine for larger loops (100+ iterations), it does not work so well for extremely small loops (e.g. 5 iterations). What ends up happening is that the 5th, 6th and 7th time around, the predictor will predict a taken branch when the loop has actually finished. Mispredicting 3 times for a loop that only runs for 5 iterations does not help branch prediction accuracy, so we have a problem on our hands.
Dothan includes a more sophisticated algorithm in its detection and prediction of branches involving small loops; once again, Intel was purposely vague about exactly what Dothan does that Banias did not, but just know that Dothan has better overall branch predictor performance, thanks to modifications like improved detection of short loops.
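To illustrate why short loops hurt an always-taken predictor, here is a minimal simulation. It is not Intel's algorithm, which remains undisclosed; the trip-count predictor below is a generic textbook-style loop predictor used purely for comparison. The model is deliberately simplified: it counts only the mispredict at loop exit and ignores detection latency and pipeline depth, and all function names are hypothetical.

```python
def loop_branch_outcomes(trip_count, runs):
    """Yield outcomes of the loop-closing branch: True means taken."""
    for _ in range(runs):
        for i in range(trip_count):
            # Taken on every iteration except the last one of each run.
            yield i < trip_count - 1


def always_taken_mispredicts(trip_count, runs):
    """Static predictor: always predicts taken, so it misses every exit."""
    return sum(1 for taken in loop_branch_outcomes(trip_count, runs)
               if not taken)


def trip_count_mispredicts(trip_count, runs):
    """Generic loop predictor: remembers the previous run's trip count."""
    learned = None  # trip count observed on the previous run, if any
    seen = 0        # branch executions so far in the current run
    misses = 0
    for taken in loop_branch_outcomes(trip_count, runs):
        seen += 1
        # Predict taken until the learned trip count is reached.
        predicted_taken = learned is None or seen < learned
        if predicted_taken != taken:
            misses += 1
        if not taken:  # loop exit: record the trip count and reset
            learned = seen
            seen = 0
    return misses


if __name__ == "__main__":
    runs = 1000
    for trip_count in (5, 100):
        branches = trip_count * runs
        static = always_taken_mispredicts(trip_count, runs)
        learned = trip_count_mispredicts(trip_count, runs)
        print(f"{trip_count:>3} iterations: always-taken misses "
              f"{static / branches:.1%}, trip-count misses "
              f"{learned / branches:.3%}")
```

In this model, the always-taken predictor mispredicts one branch in five for a 5-iteration loop but only one in a hundred for a 100-iteration loop, while a predictor that remembers the trip count misses only on the first run as long as the trip count stays constant.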