Understanding Pentium M Architecture

There is no doubt that the Pentium M performs very well as a low power, high performance mobile processor. We published two articles comparing the performance of the Pentium M, Athlon 64 and Pentium 4, and in both cases the Pentium M did exceptionally well.

The problem is that until recently, the only mobile platforms were single channel DDR solutions, making it difficult to extrapolate how the Pentium M would fare against its competition in the desktop world. The desktop Pentium 4 and Athlon 64s aren't equipped with a single channel memory controller, and they come in larger cache, higher performance models than the ones you find in thin and light systems on the mobile side.

Before we get to the actual performance comparison, there's a lot that needs to be understood about the Pentium M architecture.

While the underlying architecture of the Pentium M is far more complex than this, the real world application performance of the CPU can be summarized and understood when looking at four points:
  1. High IPC Core
  2. Low Latency L2 Cache
  3. Memory Latency and Bandwidth, and
  4. FPU Performance
The high IPC core has already been explained in previous articles on the Pentium M, as well as briefly recapped in this article. With a shorter pipeline than the Pentium 4, but one longer than the Pentium III, the Pentium M can do more per clock than its more popular desktop cousin - which is why it is able to remain competitive despite its lower clock speeds (much like the Athlon 64).

Through the use of technologies like micro-ops fusion and its sophisticated branch prediction unit, the Pentium M ends up being even more efficient per clock than a Pentium III - despite having a longer pipeline. Based on its SPEC CPU2000 scores, the Pentium M features a 20% higher IPC than the Pentium III at an identical clock speed. The Pentium M vs. Pentium III comparison is similar to the Prescott vs. Northwood comparison, where the deeper pipelined Prescott was still able to make up for the loss in IPC through increases in efficiency and new branch prediction algorithms.
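
The relationship between IPC and clock speed described above can be sketched with a first-order model in which performance is simply IPC multiplied by clock speed. This is a rough illustration, not a benchmark: the function names are hypothetical, the Pentium III is normalized to an IPC of 1.0, the 2.0GHz clock is an arbitrary example value, and the only figure taken from the text is the 20% IPC advantage. Real application performance also depends on cache and memory behavior, which this model ignores.

```python
def relative_performance(ipc: float, clock_ghz: float) -> float:
    """First-order model: work per second = instructions per clock * clock speed."""
    return ipc * clock_ghz


def equivalent_clock(ipc_a: float, clock_a_ghz: float, ipc_b: float) -> float:
    """Clock speed chip B would need to match chip A under the same model."""
    return relative_performance(ipc_a, clock_a_ghz) / ipc_b


# Assumption: Pentium III IPC normalized to 1.0 as the baseline.
PENTIUM_III_IPC = 1.0
# From the text: the Pentium M has a 20% higher IPC at an identical clock speed.
PENTIUM_M_IPC = 1.2

# Assumption: 2.0GHz is an illustrative clock speed, not a measured part.
pentium_m_clock_ghz = 2.0
pentium_iii_clock_needed = equivalent_clock(
    PENTIUM_M_IPC, pentium_m_clock_ghz, PENTIUM_III_IPC
)

print(f"Pentium III clock needed to match: {pentium_iii_clock_needed:.2f}GHz")
```

Under these assumptions, a Pentium III would need roughly 20% more clock speed to keep up, which is the same reasoning behind the Pentium M staying competitive with higher clocked chips.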

77 Comments

  • Lupine - Wednesday, February 16, 2005 - link

    I'm surprised at these results. I'm setting up a new Dell Inspiron 9200 (M 725 @ 1.6GHz/400MHz FSB) and it is schooling both my Barton 2500+ @ 2.2GHz and TBred B 1700+ @ 2.2GHz running Stanford's Folding@Home project (600 point proteins: ~37min per frame for the XP boxes compared to ~34min per frame w/ the laptop).

    So, if it is so weak, what is allowing it to process WUs at such a competitive rate? Sure, that is slower than an A64, but competitive w/ most P4 procs.
  • fitten - Thursday, February 10, 2005 - link

    Something else to remember about the Banias/Dothan line of chips... Aggressive power reduction was the #1 goal of the design process. In a 'normal' chip design, not all pipeline stages are the same length, and the clock speed it runs at is the speed of the slowest part of the CPU. Since power usage is directly related to the frequency of the switching gates, the Intel engineers actually deliberately slowed down some parts of the chip to match the target release speeds (or get close to them) to reduce power consumption. This is, perhaps, the main reason why the frequencies don't scale so well as some would want them to scale.
  • Visual - Thursday, February 10, 2005 - link

    here's another thought... when the opterons launched initially at ECC DDR266, there were similar comments like "give it unbuffered DDR400 or higher and stay out of its way" :) well, now that we have that, ok it did improve performance a bit. but not hugely. shouldn't help the dothan significantly more too.
  • Visual - Thursday, February 10, 2005 - link

    I like how AMD got beaten by the P-M :) not because im intel fan, just because this will make things more interesting now.

    don't catch flame from this comment :p its my opinion

    Funny how you picked the game benchmarks btw, its almost as if you wanted to show the P-M lagging behind the A64... from what I've seen it beats A64 in HL2 and CSS, and that's a game you don't skip usually :) so why now?

    Also looks suspicious how in lots of tests where P-M performs well with the A64 clock-for-clock or beats it, there is almost no difference in the 3800+ and 4000+ results... like if L2 isnt all that important, yet L2 is exactly how everyone explains the P-M success

    Maybe we'll see some 2MB L2 A64 "emergency edition" once Dothan gets a decent desktop chipset, just like what intel did to (try to) save P4 from the A64 :)
    actually i'd be happy if Dothan motivates AMD to develop faster L2 cache or something.

    Knowing Intel, i dont expect they'd even try to match AMD's prices with the P-M... and there's a lot of room for AMD to decrease prices, as they're selling with quite a margin now. So for sure the P-M won't be cost-effective compared to A64, not if you don't care for ultra-low power consumption at least.

    also it doesn't look likely Dothan could scale beyond 2.6GHz on current 90nm tech. by the time it gets there, AMD should've launched the 2.8 FX and most likely 3GHz too. so I have no doubts AMD will keep the lead for quite a while... maybe the race to 65nm will be the next turning point, as it seems its going smooth for intel (at least for P-M)

    anyway, even if AMD is better in absolute performance, pricepoint and (arguably) clock-for-clock, you gotta admit it to the P-M, it does quite a punch. fun times are coming :)
  • Zebo - Wednesday, February 9, 2005 - link

    dobwal buy intel if you want mhz, AMD is for performance.
  • dobwal - Wednesday, February 9, 2005 - link

    i wasn't referring to the FX series. Plus you are not understanding the point i was trying to make. Lets take a look at the FX series.

    OPN            Model  Operating Freq.  Package
    ADAFX55DEI5AS  FX55   2600MHz          939-Pin
    ADAFX53DEP5AS  FX53   2400MHz          939-Pin
    ADAFX53CEP5AT  FX53   2400MHz          940-Pin
    ADAFX51CEP5AT  FX51   2200MHz          940-Pin
    ADAFX51CEP5AK  FX51   2200MHz          940-Pin

    the first FX51 was released around late third quarter 2003. So in a little over a year the FX series has only increased 400 Mhz. Can you automatically assume that the FX has poor scalability in terms of cpu speed? NO. You know why, because the EE is underperforming and can't touch the FX. AMD has no need to push large scale speed increases out of the FX line, which would do nothing but increase cost with each new stepping it used to boost performance.

    The same goes for the Dothan at 2.26Ghz by the end of 2005. What other cpu offers the same level of performance vs. battery life. So why push for performance except to push sales.

    You simply can't determine the scalability of a cpu based on its roadmap especially when its the performance leader in its market segment and has no current viable competitor or one in the near future.
  • Aileur - Wednesday, February 9, 2005 - link

    Oh and, superpi relies on the fpu to do its calculations, so so much for this fpu is crap trend we have going here.

    http://mod.vr-zone.com.sg/Aopen_i855_review/25sPIm...
  • Aileur - Wednesday, February 9, 2005 - link

    Oh and before you start bragging about the better superpi1mb result of the a64
    http://www.akiba-pc.com/DFI_855/d17g_2608_spi1m.gi...

    this is 1 sec better, with 100mhz less, and single channel ram.
  • Aileur - Wednesday, February 9, 2005 - link

    Since you seem to like xtremesystems
    http://www.akiba-pc.com/DFI_855/d15g_2435_spi1m.PN...
    also a 1ghz overclock, also on default voltage

    Id like to see how an a64 would perform on a kt266 (if that were possible)

    Give the pentium m time to mature and all those "OMG HAHA YOU CPUZ IS SO HOT LOLOL!!!1111" will be obsolete.
  • Zebo - Wednesday, February 9, 2005 - link

    58 "How long has A64 been stuck on 2.4Ghz."
    ----------------------------------

    They're not. 2.6 FX-55 been out for months. More importantly AMD doesn't have to release new chips the way they dominate the benchmarks now. Could they? Hell ya. They got a nice buffer going, New FX's hit 3.0 on stock air. Cheap 90nm's are now hitting 2.7 on default Vcore and air. And by air I mean AMD's cheap all aluminum HS with a itty bitty 15mmx70mm fan, not Prescotts copper core screamers.

    T8000- You're clueless. Maybe it's the heat generated by your prescott making your head woozy, I dunno, but have a look here..1800 Mhz to 2800 Mhz on default Vcore stock fan.
    http://www.xtremesystems.org/forums/showthread.php...
