32-bit vs. 64-bit Performance

Our entire benchmark suite to this point has run 32-bit applications under a 32-bit OS, mostly because there are no good 64-bit desktop applications yet for a popular 64-bit OS (not to mention the issues with 64-bit Windows XP we described earlier).

Under Linux, however, we don't have to wait for applications to be released in a 64-bit version; we can simply recompile them. Linux thus gives us an excellent venue for measuring the tangible performance gains from the additional general purpose registers exposed in 64-bit mode.

We ran all benchmarks on Red Hat Enterprise 2.9.5WS (Taroon), a beta release, booted in single-user mode to keep system services from interfering with benchmark results. Neither Red Hat 9 nor 9.0.93 Beta (Severn) supplies a 64-bit compiler or libraries, which is why we used Taroon.

The Taroon kernel initially had issues on startup, requiring us to disable APIC and ACPI support to get it to install. Once running, the OS was quite stable; however, DMA disk access was disabled for some reason.

We used the following compiler that came with Taroon:

gcc 3.2.3 20030502 (Red Hat Linux 3.2.3-16)

And the following kernel:

2.4.21-1.1931.2.393.ent

With this compiler and kernel we ran the following tests:

Whetstone

A simple C loop measuring floating point performance, configured to do double precision calculations.

Compiled with:
-O3 -msse2 -mfpmath=sse (and -m32 for 32-bit, -m64 for 64-bit)

The performance improvements due to 64-bit are in the 10 - 20% range we mentioned earlier.
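The two Whetstone builds differ only in the -m32/-m64 switch, which a small script can make explicit. The following is a minimal sketch in Python, not the build script used for this review: the source file name whetstone.c, the output naming scheme, and the helper build_command are all assumptions for illustration, and link flags (such as a math library) are omitted because the article does not list them. The script only prints the command lines; it does not invoke the compiler.

```python
import shlex

# Optimization flags quoted in the article for the Whetstone build.
COMMON_FLAGS = ["-O3", "-msse2", "-mfpmath=sse"]


def build_command(source, bits, compiler="gcc"):
    """Return the argv for one build; bits selects -m32 or -m64."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    # Hypothetical naming scheme: whetstone.c -> whetstone_32 / whetstone_64.
    output = f"{source.rsplit('.', 1)[0]}_{bits}"
    return [compiler, *COMMON_FLAGS, f"-m{bits}", source, "-o", output]


# Print both command lines so the single differing flag is easy to spot.
for bits in (32, 64):
    print(shlex.join(build_command("whetstone.c", bits)))
```

Keeping every other flag identical between the two builds is what lets the difference in results be attributed to 64-bit mode rather than to optimization settings.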

Bytemark

An old integer CPU benchmark (FP results were discarded) - for more information on the tests visit this site.

Compiled with:
-O3 -msse2 -mfpmath=sse (and -m32 for 32-bit, -m64 for 64-bit)

Here we do see a small 2% drop in performance in one test when moving to 64-bit; however, the rest of the tests show a 0 - 15% improvement.

Lame 3.93

An MP3 encoder; we encoded a 40-minute .wav file (403MB).
Lame args: -b 192 -m s -h --quiet <file> - >/dev/null
(192kbps, simple stereo, high quality, output to nothing to avoid disk hits)
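For readers who want to reproduce this kind of measurement, the invocation above can be timed with output discarded. This is a rough sketch rather than the harness used for the review: the helper time_command and the file name input.wav are assumptions, and the encoder arguments are copied from the line above. Discarding stdout inside the script plays the same role as the >/dev/null redirect.

```python
import subprocess
import time

# Arguments quoted in the article; "input.wav" is a placeholder for the test file.
# The trailing "-" tells lame to write the encoded stream to stdout.
LAME_ARGV = ["lame", "-b", "192", "-m", "s", "-h", "--quiet", "input.wav", "-"]


def time_command(argv):
    """Run argv with stdout discarded and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start
```

Calling time_command(LAME_ARGV) once against a 32-bit build of the encoder and once against a 64-bit build would give the two wall-clock times to compare.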

Compiled with:
-O3 -fomit-frame-pointer -fno-strength-reduce -malign-functions=4 -funroll-loops -ffast-math -msse2 -mfpmath=sse (again, -m32 for 32-bit, -m64 for 64-bit)

The performance improvement here is astounding: in 64-bit mode the Athlon 64 FX finished the encode 34% quicker than in 32-bit mode. If these results are any hint of what could be in store for Windows users, there's a lot of promise behind the Athlon 64, assuming we get software support in time.
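A figure such as "34% quicker" can be read two ways, and the text does not say which is meant: as a reduction in encode time, or as an increase in throughput. The sketch below computes both from a pair of wall-clock times. The timings in it are hypothetical numbers chosen only so that the time reduction comes out to 34%; they are not measurements from this review.

```python
def time_reduction(t32, t64):
    """Fraction of the 32-bit run time saved by the 64-bit build."""
    return (t32 - t64) / t32


def throughput_gain(t32, t64):
    """Fractional increase in work done per second for the 64-bit build."""
    return t32 / t64 - 1.0


# Hypothetical timings in seconds, not results from the article.
t32, t64 = 100.0, 66.0
print(f"time reduction:  {time_reduction(t32, t64):.1%}")
print(f"throughput gain: {throughput_gain(t32, t64):.1%}")
```

With these illustrative inputs, a 34% shorter run corresponds to roughly 52% more throughput, so the two readings are far enough apart that it matters which one a benchmark write-up reports.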

We wanted to do a transcode benchmark but that didn't work out: one library hit a bug in gcc, and transcode itself refused to compile. It deliberately forces a compile error when a structure comes out padded, which suggests its authors didn't expect anyone to build it on a 64-bit machine just yet.
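To illustrate the kind of padding that can trip such a check, the sketch below uses Python's ctypes to lay out a C-style structure and report how many padding bytes the platform's alignment rules insert. The structure Record and the helper check_layout are hypothetical; the article does not say which transcode structure or library was involved.

```python
import ctypes


class Record(ctypes.Structure):
    # Hypothetical layout: a 1-byte field followed by a pointer-sized field.
    _fields_ = [("tag", ctypes.c_char), ("payload", ctypes.c_void_p)]


# Pointer size follows the build: typically 4 bytes for 32-bit, 8 for 64-bit.
pointer_size = ctypes.sizeof(ctypes.c_void_p)
packed_size = ctypes.sizeof(ctypes.c_char) + pointer_size
# Padding is whatever the compiler-style layout adds beyond the raw field sizes.
padding = ctypes.sizeof(Record) - packed_size

print(f"pointer size: {pointer_size} bytes")
print(f"struct size:  {ctypes.sizeof(Record)} bytes ({padding} bytes of padding)")


def check_layout(expected_size):
    """Mimic a build-time size check: fail loudly if the layout differs."""
    actual = ctypes.sizeof(Record)
    if actual != expected_size:
        raise AssertionError(f"Record is {actual} bytes, expected {expected_size}")
```

Code that hard-codes the size seen on a 32-bit build would fail a check like check_layout once pointers widen, which is one plausible way a 64-bit compile can be rejected outright.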


122 Comments


  • Anonymous User - Wednesday, September 24, 2003 - link

    Nice review anand, however I am missing the P4 EE in a number of the tests, as previous post (#67) suggested.

    The Athlon 64/A64 FX appears to be a nice processor, for a shiny new design cpu the advantages were expectable.

    Some more 64bit tests, maybe a divx codec pre-compiled for 64bit in a test?

    As for the amd vs intel combat:
    The A64 and A64FX match up a lot better against the latest P4/P4 EE. I wouldn't have expected anything else.
    While the Prescott still lurks in the dark and I have a feeling Intel has something up their sleeve, I wouldn't call the Prescott a failure yet.
    If Intel plays nicely along, maybe they can create a cpu that beats the A64/A64FX in 32 (and just maybe in 64bit http://www.theinquirer.net/?article=11668).

    Either way, the more AMD and Intel compete each other, the happier I am, after all I end up paying less for either cpu.
  • Anonymous User - Wednesday, September 24, 2003 - link

    There is already a 64 bit port of America's Army available that doesn't need a 64 bit OS! http://www.amd.com/us-en/Corporate/VirtualPressRoo...
  • Anonymous User - Wednesday, September 24, 2003 - link

    Guys, if the AMD Athlon 64s don't succeed, I believe AMD would go under or be in financial trouble. So these new processors MUST sell well.
  • Anonymous User - Wednesday, September 24, 2003 - link

    I'm curious why RedHat Taroon, which is an enterprise-focused Linux distribution, was used for the 64-bit benchmarks and not RedHat GinGin64, which is more consumer-focused. Both are available from the RedHat FTP site.
  • Anonymous User - Wednesday, September 24, 2003 - link

    Several of the benchmarks left out the P4 Extreme scores (memory bandwidth, Content Creation 2003, and Virtual Studio 6.0 Compile) - was that a mistake or benchmarking per AMD's new guidelines? It's also funny that this is one of the first Anandtech CPU reviews without the full system specs documented (i.e. ECC memory vs non ECC, Intel branded motherboard vs. ASUS enthusiast motherboard, memory latency settings, etc.) - more AMD review guidelines?

    The way I see it, you can either spend the money on a whole new Athlon 64/FX CPU, motherboard, and memory outfit, or buy a P4 EE and stick it into any motherboard today that accepts a 3.2 GHz CPU - that sure beats having to buy a new motherboard and memory to get about the same level of performance average across the board.

    Even when 64-bit Windows comes out, does everyone really think that Microsoft and Bill Gates will really make it priced at mainstream levels and reduce the cost of the current 32-bit Windows XP so soon? I have my doubts but I guess we'll just have to wait and see.

    Another interesting thing to note from Tom's Hardware review is that the 64-bit code for AMD64 does run faster on 64-bit OS but if you read carefully, he says that the same program optimized for the P4 runs even faster on 32-bit OS. So, software companies will probably have to make a choice (unless they are big enough and make enough money to serve all markets): A - optimize 32-bit software to take advantage of the P4/Prescott and Hyperthreading using compilers that Intel provides, or B - compile 64-bit software for which there is still no mainstream OS and there are hardly any standard compilers available for and market them to the 500,000 or so people who will have the opportunity to own AMD64 desktop chips this year.

    Sure Intel has a problem with the Prescott heat dissipation right now but I don't think they will be sitting idle. Thermal interface technology is getting better all of the time and I wouldn't doubt if Intel isn't already making process improvements and/or implementing newer cooling methods. After all, it was Intel who came up with the heatspreader design for the current generation P4 that is now being used by Hammer chips.

    Once the Prescott on .09-micron technology hits the streets it will continue to be refined and improved upon so the clock speed will continue to increase. Imagine a Prescott EE CPU with 1MB L2 and 2MB L3 or more. What would be a real thorn in AMD's side would be if Intel makes a shrink of the current P4 onto the new .09-micron technology and increases the clock speed to the 4 GHz level (already achievable by some CPUs on the current .13-micron process) to keep pace with the Athlon 64/FX which is supposed to be AMD's next generation CPU. They could put a whole bunch of P4 die (even P4 EE die) on a 300 mm wafer and put a hurting on AMD until they can get their 90nm process and 300mm wafer process going. It is a scary possibility for AMD but could be reality for Intel - meanwhile, AMD still has to face the daunting task of converting to 300mm wafers and 90nm process at the same time to keep up. AMD says that they will start 90nm production in the first half of 2004, but then again, they've been promising hammer since 2001. But they have to do something because with their current situation of roughly 192 square millimeters per Athlon 64/FX die on a 200mm wafer yields a theoretical 73 die per wafer (per Tom's Hardware review). And I believe that AMD wants to put all of their products on the same line and differentiate them at the end - similar to the way Intel does with their Northwood/Celeron products (same die with certain cache and other things disabled) - so even the 256K L2 cache mainstream Athlon 64 comes out, it may still be the same size as all of the other Athlon 64/FX/Opterons.

    Hector Ruiz, Jerry Sanders and AMD as a whole have a very steep mountain in front of them to climb. Time will tell if they have what it takes to get up and over it. The first checkpoint for them will come in about 3 weeks in the form of Q3 earnings. By then we'll see how sales of their new CPUs are going and if their joint venture in FLASH pays off. (It didn't really make sense to me for them to lay off 2000 people at the beginning of the year to reduce costs and then turn around 2 quarters later and pick up 7000 people in the FLASH venture with Fujitsu which comes with more debt than earnings.) I'm not a betting man, but if I were, my money would be on AMD making 9 straight quarters of losses in a row. When Hector Ruiz came to office he vowed that AMD restructuring would make them hit break-even sometime around Q2 2003 but that never happened. There seems to be a pattern with promises made by AMD. I guess it's why his 3 million stock options which were granted at $16 are still under water.
  • Anonymous User - Wednesday, September 24, 2003 - link

    Apparently the FX series is unlocked multiplier, and mobos will be coming shortly that have multiplier selection options (read Anand's "weblog" entry)... I can't wait to see the results of a 13x 220 FX-51; now, THAT I might part with $800 to play with... Talk about insanely fast processors, the higher FX goes, the smaller Intel's lead gets.
    Whoever said that throwing more cache on the P4 core beat Hammer is just deluded; P4 is at the end of its line, AMD64 is just beginning. And once again, Intel supporters seem to grow rather silent when you point out that the on-die memory controller becomes significantly more powerful when the clock speeds ramp up; a 3.3GHz FX chip would be more than a match for a 3.4GHz Prescott, I'd think. Memory bandwidth advantage is a thing of the past for Intel, now it's up to AMD to shore up their lacking SSE/SSE2 performance and work on speeding up the core, as well as meeting the demand for such upgraded processors.

    I'm not normally so pro-AMD (though I support their products more than Intel's, just from a cost efficiency standpoint), but it's kind of hard to not be wowed by the muscle this chip can flex. I mean, this is the Day One marketed "prototype" and it's capable of matching its most recent and mature rivals, can you imagine what next year is going to look like?
  • sprockkets - Wednesday, September 24, 2003 - link

    The fact that Intel HAD to release an EE edition shows desperation at looking behind. Yeah, so the Prescott does look good. That and the 103w dissipation, unconfirmed if it is on the 90 process, if it is then that's pathetic.

    I can buy a Athlon 64 or FX, where is the EE like others said? And why was the NDA lifted on the same day as the AMD Athlon 64?

    Complaining about price? The 1.5ghz P4 cost around $1000 when it came out and was slower than a P3 1.0ghz, while the new Athlon 64 always is faster than the XP.
  • Anonymous User - Wednesday, September 24, 2003 - link

    lmfao

    AMD is still the underdog :P Always will be. You get what you pay for.

    AMD is for the guys who love to root for the underdog; in other words, fanboys. If you want solid, no hassle performance with top support .. you know where to put your money.

    I mean christ, Intel doesn't even have to design a next generation core to outmatch AMD's next core evolution -K8- they just simply tack on more cache ;) How sad is that?

    AMD fans, take a hint ... AMD's STILL playing catch-up with Intel. Read the article closely, and you'll see what I mean -the author hints at it so clearly as well- It's so easy to see. AMD has never had the advantage ;) Only clever marketing which most people pin as bad marketing on AMD's part. Quite the contrary, Stupid kids!
  • Anonymous User - Tuesday, September 23, 2003 - link

    #61 ya I know how they compare...the g5 is a mac no software support and the cartoons might pop out of the screen and eat you.....no comparison....
  • Anonymous User - Tuesday, September 23, 2003 - link

    Which chip is faster at Divx encoding?

    http://www.anandtech.com/cpu/showdoc.html?i=1884&a...
    OR
    http://www.hardocp.com/article.html?art=NTI0LDM=
    OR
    http://www.aceshardware.com/read.jsp?id=60000256
    OR
    http://www4.tomshardware.com/cpu/20030923/athlon_6...
