Floating Point Performance

Just about a year ago, our own Johan De Gelas made an extremely interesting point about one of the weaknesses of the Pentium M - floating point performance. The theory is this - the Pentium 4, Athlon 64 and Pentium M all have very different platforms, with equally different characteristics. Unfortunately, as we've already shown, the Pentium M is quite possibly the worst off with only a single channel 333MHz DDR memory bus. It's also widely known that most floating point intensive applications are highly memory bandwidth limited, meaning that the Pentium M already has an excuse for poor floating point performance - it doesn't have enough memory bandwidth.

But what if we are able to take memory bandwidth out of the equation? This is where a little benchmark called "flops" comes into play. The beauty of flops is that it executes entirely within the L1 cache of the Pentium M, meaning that the benchmark is limited by two things: the performance of the Pentium M's L1 cache, and more importantly, the performance of the Pentium M's floating point and SSE units.

The actual tests that flops runs are a mixture of floating point add, subtract, multiply and divide operations. The mix of ADD/SUB, MUL and DIV operations is listed next to each test in the table below.

We compiled flops with the latest Intel C compiler under Visual Studio .NET, using /O3 and architecture-specific flags, to give the Pentium M as solid a foundation as possible. All of the results are expressed in MFLOPS; higher scores are better:

Test (% ADD, SUB, MUL, DIV)  AMD Athlon 64 3200+ (2.0GHz)  AMD Athlon 64 FX-55 (2.6GHz)  Intel Pentium 4 3.2GHz  Intel Pentium M 755 (2.0GHz)
1 (50,0,43,7)                1576                          2057                          1274                    899
2 (43,29,14,14)              856                           1118                          790                     492
3 (35,12,53,0)               1388                          1802                          2476                    1470
4 (47,0,53,0)                1244                          1622                          2792                    1601
5 (45,0,52,3)                1477                          1923                          2351                    1019
6 (45,0,55,0)                1466                          1908                          2762                    1607
7 (25,25,25,25)              458                           595                           365                     252
8 (43,0,57,0)                1585                          2065                          2566                    1572
Average                      1256                          1636                          1922                    1114
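As a quick sanity check on the table, the Average row can be recomputed from the eight per-test scores. The sketch below assumes the averages are plain arithmetic means; the dictionary layout and the function name `average_mflops` are illustrative choices of ours, not part of flops itself.

```python
# Per-test MFLOPS copied from the table above, in test order 1 through 8.
scores = {
    "Athlon 64 3200+ (2.0GHz)": [1576, 856, 1388, 1244, 1477, 1466, 458, 1585],
    "Athlon 64 FX-55 (2.6GHz)": [2057, 1118, 1802, 1622, 1923, 1908, 595, 2065],
    "Pentium 4 3.2GHz": [1274, 790, 2476, 2792, 2351, 2762, 365, 2566],
    "Pentium M 755 (2.0GHz)": [899, 492, 1470, 1601, 1019, 1607, 252, 1572],
}


def average_mflops(results):
    """Arithmetic mean of the per-test scores (assumed method for the Average row)."""
    return sum(results) / len(results)


averages = {cpu: average_mflops(results) for cpu, results in scores.items()}

for cpu, avg in averages.items():
    print(f"{cpu}: {avg:.0f} MFLOPS")
```

Under that assumption, the recomputed means round to the figures shown in the Average row.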

The first comparison to look at is the Athlon 64 3200+ vs the Pentium M 755, since both CPUs run at the same 2.0GHz clock speed. Despite the Pentium M's improvements to enhance IPC, the Athlon 64 is still able to outperform it at a core level (without the aid of its memory controller) by almost 13%. But here's where the next Athlon 64 score comes into play - while the Pentium M will hit 2.26GHz by the end of this year, the Athlon 64 will be at or above 3.0GHz. So, the headroom of the Athlon 64's architecture gives it a huge performance advantage here in flops, as you can see by the Athlon 64 FX-55 results (remember that the larger L2 cache of the FX-55 has no effect on the flops results, as the program runs entirely out of L1).
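The arithmetic behind that comparison can be sketched as follows, using only the average scores and clock speeds from the table. The helper `percent_lead` and the variable names are our own, and the per-GHz figure is a rough normalization rather than a measured IPC value.

```python
# Average MFLOPS and clock speeds as listed in the table.
avg_mflops = {"Athlon 64 3200+": 1256, "Athlon 64 FX-55": 1636, "Pentium M 755": 1114}
clock_ghz = {"Athlon 64 3200+": 2.0, "Athlon 64 FX-55": 2.6, "Pentium M 755": 2.0}


def percent_lead(score, baseline):
    """How far 'score' is ahead of 'baseline', in percent."""
    return (score / baseline - 1.0) * 100.0


# Same-clock comparison: both chips run at 2.0GHz.
same_clock_lead = percent_lead(avg_mflops["Athlon 64 3200+"], avg_mflops["Pentium M 755"])

# Rough per-clock throughput: average MFLOPS divided by clock speed in GHz.
mflops_per_ghz = {cpu: avg_mflops[cpu] / clock_ghz[cpu] for cpu in avg_mflops}

# If flops scales with clock alone, these two ratios should be nearly equal.
fx55_score_ratio = avg_mflops["Athlon 64 FX-55"] / avg_mflops["Athlon 64 3200+"]
fx55_clock_ratio = clock_ghz["Athlon 64 FX-55"] / clock_ghz["Athlon 64 3200+"]

print(f"Athlon 64 lead at 2.0GHz: {same_clock_lead:.1f}%")
print(f"FX-55 score ratio {fx55_score_ratio:.3f} vs clock ratio {fx55_clock_ratio:.3f}")
```

The two Athlon 64 ratios land within a fraction of a percent of each other, which is consistent with the benchmark running out of L1 and scaling with core clock rather than with cache size or memory bandwidth.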

Next, we have one of the slower Pentium 4s vs. the Pentium M 755. Why not compare to a 3.6GHz or the new 3.8GHz Pentium 4? Well, look at how much the Pentium 4 3.2GHz outperforms the Pentium M 755 - 72% using Intel's 8.1 C++ compiler. When running optimized SSE2/3 code, the Pentium 4 is a much stronger FP performer than the Pentium M could ever be, which is very important for the following reason: the future of desktop applications is in very floating-point intensive media transcoding tasks, and for those applications, the Pentium M just won't cut it. So, to those who feel that Intel will soon ditch NetBurst in favor of the Pentium M's architecture, the results speak for themselves. While elements of the Pentium M architecture will undoubtedly make an appearance in the Pentium 4's successor, its dated P6 execution core will not.
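To see where the 72% figure comes from and how it varies by workload, the Pentium 4's lead can be broken out per test. This is a small sketch over the table values only; the list layout and names such as `per_test_lead` are our own, and the overall figure is taken as the ratio of the two Average rows.

```python
# Instruction mixes and MFLOPS for tests 1 through 8, copied from the table.
mixes = ["50,0,43,7", "43,29,14,14", "35,12,53,0", "47,0,53,0",
         "45,0,52,3", "45,0,55,0", "25,25,25,25", "43,0,57,0"]
pentium4 = [1274, 790, 2476, 2792, 2351, 2762, 365, 2566]
pentium_m = [899, 492, 1470, 1601, 1019, 1607, 252, 1572]

# Percentage by which the Pentium 4 3.2GHz leads the Pentium M 755 on each test.
per_test_lead = [(p4 / pm - 1.0) * 100.0 for p4, pm in zip(pentium4, pentium_m)]

# Overall lead computed from the averages of the two columns.
overall_lead = (sum(pentium4) / sum(pentium_m) - 1.0) * 100.0

for number, (mix, lead) in enumerate(zip(mixes, per_test_lead), start=1):
    print(f"Test {number} ({mix}): Pentium 4 ahead by {lead:.0f}%")
print(f"Overall: Pentium 4 ahead by {overall_lead:.1f}%")
```

On these numbers the Pentium 4 is ahead in every test, with the smallest gap on test 1 and the largest on test 5.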


77 Comments


  • bob661 - Tuesday, February 8, 2005 - link

    The only problem with this chip is that the marketing is oriented towards the mobile market and therefore not a direct competitor to the A64. It would be nice if it was. It might bring some cats out of the bag on the AMD side. Competition in the marketplace is good for us all.
  • jvrobert - Tuesday, February 8, 2005 - link

    Really, AMDroids, get a grip. You're all excited because the AMD chips beat a mobile processor pretty handily, and because you are making some silly assumption that the Pentium-M in its current form is Intel's "last chance".

    First, Intel doesn't need a last chance. They make enough money to make AMD look like a Mexico City taco stand. So enough of those delusions of grandeur.

    But on a technical front, if Intel ramps the clockspeed up to the 2.8 range (easy), and releases a desktop class chipset for the Pentium M it would match or exceed any current chip. And these are _basic_ steps. What if they made more improvements?
  • bob661 - Tuesday, February 8, 2005 - link

    #45
    You are a rock. The point of the article was to compare the P-M to desktop CPUs because most of us here wanted to know how it will perform. And you know what? It performed very nicely.
  • classy - Tuesday, February 8, 2005 - link

    I just can't help but laugh at some folks. It's a nice chip but clearly not in the A64 ballpark. It's that simple. As far as the 2.8 oc, that was only accomplished in one review. All the reviews show the same thing: you have to oc it so it can compete. What's interesting though is most of these Intel fanboys don't want to see a comparison of an oc'ed A64 vs a Dothan. Smoke city :)
  • FrostAWOL - Tuesday, February 8, 2005 - link

    IF the Pentium-M and P4 are electrically incompatible then someone please explain this:

    HP Blade system Pentium-M with Serverworks GC-SL chipset
    http://h18000.www1.hp.com/products/servers/prolian...

    FrostAWOL
  • jae63 - Tuesday, February 8, 2005 - link

    Great review & of interest to those of us with HTPCs. Too bad the price point is so steep.

    One minor correction on page 11:
    "The Pentium M does a bit better in the document creation tests, as they are mostly using applications that will fit within the CPU's cache. However, the introduction of a voice recognition program into the test stresses the Pentium M's floating point performance, which does hamper its abilities here."

    Actually NaturallySpeaking uses almost no floating point but is very memory intensive. The performance hit that you are seeing is because it uses a lot of memory bandwidth and its dataset doesn't fit in the L2 cache.

    Here's some support for my statement, by the main architect of NaturallySpeaking, Joel Gould:
    http://tinyurl.com/6s4mh
  • segagenesis - Tuesday, February 8, 2005 - link

    #43 - I think you have the right idea here. This processor is not meant to be performance busting but rather a low energy alternative to current heat factories present inside every P4 machine. I would love to have this in a HTPC machine myself but the cost is still too damn high. Hopefully higher production will bring the cost down.
  • Aileur - Tuesday, February 8, 2005 - link

    I guess the Pentium M isn't ready (yet) for a full featured gaming machine, but with that kind of power, passively cooled, it would make for one hell of an HTPC.
  • PrinceGaz - Tuesday, February 8, 2005 - link

    #45- It was not an unfair review, on the contrary it seemed very well done. The reason the P-M was compared with fast P4 and A64's is because they cost about the same.

    Maybe someone else buys your computers for you, but most of us here have to spend our own money on them so cost is the best way to decide what to compare it with.
