32-bit vs. 64-bit Performance

Our entire benchmark suite up to this point has run 32-bit applications under a 32-bit OS, mostly because there are no good 64-bit desktop applications yet for a popular 64-bit OS (not to mention the issues with 64-bit Windows XP we described earlier).

Under Linux, however, we don't have to wait for applications to be released in a 64-bit version; we can simply recompile them. Linux thus provides us with an excellent venue to see the tangible performance increases from exposing the additional general purpose registers in 64-bit mode.

We ran all benchmarks on Red Hat Enterprise 2.9.5WS (Taroon), a beta release, booted in single user mode to keep system services from interfering with benchmark results. Neither Red Hat 9 nor 9.0.93 Beta (Severn) supplies a 64-bit compiler or libraries, which is why we used Taroon.

The Taroon kernel initially had issues on startup, requiring us to disable APIC and ACPI support to get it to install. Once running, the OS was quite stable; however, DMA disk access was disabled for some reason.

We used the following compiler that came with Taroon:

gcc 3.2.3 20030502 (Red Hat Linux 3.2.3-16)

And the following kernel:

2.4.21-1.1931.2.393.ent

With this compiler and kernel we ran the following tests:

Whetstone

A simple C loop measuring floating point performance, configured to do double precision calculations.

Compiled with:
-O3 -msse2 -mfpmath=sse (and -m32 for 32-bit, -m64 for 64-bit)

The performance improvements due to 64-bit are in the 10 - 20% range we mentioned earlier.
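As a rough illustration of the 32-bit and 64-bit builds described above, the following sketch assembles the two gcc command lines from the flags we listed. The source file name, the output names, and the `-lm` link flag are assumptions made for the example, not details of our actual build.

```python
import shlex

# Flags quoted in the article for Whetstone and Bytemark.
COMMON_FLAGS = ["-O3", "-msse2", "-mfpmath=sse"]


def build_commands(source, flags=COMMON_FLAGS, libs=("-lm",), compiler="gcc"):
    """Return gcc command lines for a 32-bit and a 64-bit build of one source file.

    Assumptions (not from the article): output names of the form
    <stem>_32 / <stem>_64, and linking the math library with -lm.
    """
    stem = source.rsplit(".", 1)[0]
    commands = {}
    for bits in ("32", "64"):
        commands[bits] = [
            compiler, *flags, f"-m{bits}", source, "-o", f"{stem}_{bits}", *libs
        ]
    return commands


if __name__ == "__main__":
    # "whetstone.c" is a hypothetical file name used only for illustration.
    for bits, cmd in build_commands("whetstone.c").items():
        print(f"{bits}-bit:", shlex.join(cmd))
```

The only difference between the two commands is the `-m32` or `-m64` switch, which is what lets the same source and the same optimization flags be compared across both modes.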

Bytemark

An old integer CPU benchmark (FP results were discarded) - for more information on the tests visit this site.

Compiled with:
-O3 -msse2 -mfpmath=sse (and -m32 for 32-bit, -m64 for 64-bit)

Here we do see a small 2% drop in performance in one test when moving to 64-bit; however, the rest of the tests show a 0 - 15% improvement.

Lame 3.93

An MP3 encoder; we encoded a 40-minute .wav file (403MB).
Lame args: -b 192 -m s -h --quiet <file> - >/dev/null
(192kbps, simple stereo, high quality, output to nothing to avoid disk hits)

Compiled with:
-O3 -fomit-frame-pointer -fno-strength-reduce -malign-functions=4 -funroll-loops -ffast-math -msse2 -mfpmath=sse (again, -m32 for 32-bit, -m64 for 64-bit)

The performance improvement here is astounding: in 64-bit mode the Athlon 64 FX managed to finish the encode 34% quicker than in 32-bit mode. If these results are any hint of what could be in store for Windows users, there's a lot of promise behind the Athlon 64...assuming we get software support in time.
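A figure like "34% quicker" can be read two ways, and the two readings are not interchangeable. The sketch below converts between them; the function names are made up for this example, and it makes no claim about which reading applies to the Lame result.

```python
def speedup_from_time_reduction(pct_less_time):
    """Throughput ratio implied by a run that takes pct_less_time percent less time."""
    return 1.0 / (1.0 - pct_less_time / 100.0)


def time_reduction_from_speedup(pct_more_throughput):
    """Percent of run time saved when throughput rises by pct_more_throughput percent."""
    return 100.0 * (1.0 - 1.0 / (1.0 + pct_more_throughput / 100.0))


if __name__ == "__main__":
    # Reading 1: the encode took 34% less time.
    print(f"{speedup_from_time_reduction(34):.2f}x throughput")
    # Reading 2: the encoder processed 34% more audio per second.
    print(f"{time_reduction_from_speedup(34):.1f}% less time")
```

If the encode took 34% less time, throughput rose by roughly half; if throughput rose by 34%, the encode finished in about a quarter less time.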

We wanted to run a transcode benchmark, but that didn't work out: one library tripped a bug in gcc, and transcode itself refused to compile. It deliberately forces a compile error when a structure comes out padded, which suggests its developers didn't expect anyone to build it on a 64-bit machine just yet.
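To show why a structure's padding can change when a field grows from 32 to 64 bits, here is a small sketch using Python's ctypes. The structure names and fields are hypothetical and are not taken from transcode; fixed-width integers stand in for a field (such as a pointer or long) that is 4 bytes on a 32-bit target and 8 bytes on a 64-bit one.

```python
import ctypes


class Header32(ctypes.Structure):
    # One-byte tag followed by a 4-byte field, as on a 32-bit target.
    _fields_ = [("tag", ctypes.c_uint8), ("payload", ctypes.c_uint32)]


class Header64(ctypes.Structure):
    # Same layout, but the second field is 8 bytes, as on a 64-bit target.
    _fields_ = [("tag", ctypes.c_uint8), ("payload", ctypes.c_uint64)]


def padding_bytes(struct_type):
    """Bytes in the structure that are not occupied by any declared field."""
    used = sum(ctypes.sizeof(field_type) for _, field_type in struct_type._fields_)
    return ctypes.sizeof(struct_type) - used


if __name__ == "__main__":
    for struct_type in (Header32, Header64):
        print(struct_type.__name__,
              "size:", ctypes.sizeof(struct_type),
              "padding:", padding_bytes(struct_type))
```

Code that assumes a fixed on-disk or in-memory layout often checks the structure size at build time, so a change like this can turn into a hard compile error rather than silent corruption.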

  • AgaBooga - Tuesday, September 23, 2003 - link

    Where is the P4EE in the memory tests?
  • Anonymous User - Tuesday, September 23, 2003 - link

    Personally, this was rather anticlimactic for me. It's certainly not the Intel killer that all the hype proclaimed. AMD for business, Intel for content, and a toss-up for gaming. Same as it has been for a while.
  • Anonymous User - Tuesday, September 23, 2003 - link

    #27 & #28 (amd fanboy double post)

    It SHOULD be up there with the P4EE because the PRESCOTT will be coming right around the corner! Face it, AMD did not put out a killer and Intel is sitting pretty in 2004.
  • Anonymous User - Tuesday, September 23, 2003 - link

    #20 are you serious? Did you just comment in the forum without looking at the review, or did you actually look at the review? AMD is not "lagging" behind Intel. They are right up there with them. Look at the benchmarks and you will see the CURRENTLY AVAILABLE Athlon64 easily matches a NOT CURRENTLY AVAILABLE P4EE.
  • Anonymous User - Tuesday, September 23, 2003 - link

    AMD, Pamela Anderson called. She wants to know how she can get a bust as big as yours. I have two words for AMD- "Segway" and "Scooter."
  • Anonymous User - Tuesday, September 23, 2003 - link

    nForce3 performance bug

    Time to re-do the benchmarks, Anand.

    Your FX-51 benchmarks are inaccurate.

    http://www20.tomshardware.com/cpu/20030923/athlon_...

    Nvidia: NForce-3 Bug

    The extremely low AGP performance of the NForce3 can be clearly attributed to problems with the HyperTransport channel interface to the Northbridge. That is proven by the benchmark results and the performance differences of up to 33.2 percent. Details about this can be found in the benchmark section of this article.

    Originally, Nvidia had planned to also integrate a SATA RAID controller in the Southbridge. Although the controller is included in the current NForce 3, Nvidia deactivated this feature. The reason was that error-free operation was not possible. For this reason, we decided to use additional boards based on the VIA K8T800 chipset.

    Nvidia (Athlon 64 FX, or alternatively GeForce FX - related names) may be a more high-profile partner for AMD than VIA. However, we would point out that VIA, with the K8T800 chipset, currently offers a clearly better solution for the Athlon 64.

  • Anonymous User - Tuesday, September 23, 2003 - link

    What is that smell?

    AMD just let loose with a huge turd!
  • Anonymous User - Tuesday, September 23, 2003 - link

    #4 You may be right (I don't think so, but let's say you are), but then ask yourself - where is the Pentium 4 Extreme Edition? There is no mention of this CPU on Intel's web site at all; there is no datasheet and there are no batch numbers. Today it is only a prototype CPU, just as Prescott is. They managed to build a few Gallatin B1 cores that are able to work at this frequency and then re-marked them. This CPU is not reality; only OEMs can buy it, in very limited quantities, but end users can't. I think a prototype 3 GHz Athlon 64 FX on 90nm would perform by far the best in this review... and it would be the same policy as with this Pentium 4 Extreme Edition.
  • Anonymous User - Tuesday, September 23, 2003 - link

    #17 Answers:

    1. Athlon 64's memory controller is very fast, as you can see from the benchmarks. Dual channel is only needed in some situations to give decent performance. HT operates at 800 MHz with DDR and 16 bits, thus giving 3.2 GB/s each way (6.4 GB/s total). Not so bad for a bus that only serves as an I/O and AGP interface.

    3. S754 is a lower end platform while S940 is an Opteron platform. AMD will introduce S939 early next year and will continue to produce CPUs for all those sockets. S940 A64 FX will, however, disappear at the end of next year.

    6. The HyperTransport "Tunnel" system allows for a practically unlimited number of chipset combinations, so PCI Express will only require adding another Tunnel or integrating it into current chipsets.
