32-bit vs. 64-bit Performance

Our entire benchmark suite to this point has been run on 32-bit applications under a 32-bit OS, mostly because there are no good 64-bit desktop applications available yet for a popular 64-bit OS (not to mention the issues with 64-bit Windows XP we described earlier).

Under Linux, however, we don't have to wait for applications to be released in a 64-bit version; we can simply recompile them. Linux thus provides us with an excellent venue to see the tangible performance increases from exposing the additional general purpose registers in 64-bit mode.
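When the same source is recompiled for both modes, it is easy to lose track of which binary is which. The following is a minimal sketch, not something used in our testing, of how a C program could report the mode it was built for. The function names are illustrative assumptions; `__x86_64__` is the macro gcc predefines when targeting x86-64.

```c
#include <stdio.h>

/* Pointer width, in bits, that this binary was compiled for:
 * 32 when built with -m32, 64 when built with -m64. */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * 8);
}

/* Returns 1 if the compiler targeted x86-64 (gcc predefines __x86_64__
 * in that case), 0 otherwise. */
int built_for_x86_64(void)
{
#if defined(__x86_64__)
    return 1;
#else
    return 0;
#endif
}

/* Prints a one-line summary of the build mode. */
void report_build_mode(void)
{
    printf("pointers: %d-bit, long: %d-bit, x86-64 target: %s\n",
           pointer_bits(), (int)(sizeof(long) * 8),
           built_for_x86_64() ? "yes" : "no");
}
```

Calling `report_build_mode()` at the start of `main` in each build gives a quick sanity check that the -m32 and -m64 binaries really differ.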

We ran all benchmarks on Red Hat Enterprise 2.9.5WS (Taroon), a beta release, booted in single user mode to keep system services from interfering with benchmark results. Neither Red Hat 9 nor 9.0.93 Beta (Severn) supplies a 64-bit compiler or libraries, which is why we used Taroon.

The Taroon kernel initially had issues on startup, requiring us to disable APIC and ACPI support to get it to install. Once it was actually running, the OS was quite stable; however, DMA disk access was disabled for some reason.

We used the following compiler that came with Taroon:

gcc 3.2.3 20030502 (Red Hat Linux 3.2.3-16)

And the following kernel:

2.4.21-1.1931.2.393.ent

With this compiler and kernel we ran the following tests:

Whetstone

A simple C loop measuring floating point performance, configured to do double precision calculations.

Compiled with:
-O3 -msse2 -mfpmath=sse (and -m32 for 32-bit, -m64 for 64-bit)

The performance improvements due to 64-bit are in the 10 - 20% range we mentioned earlier.
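To give a feel for what a double-precision loop benchmark looks like, here is a minimal sketch. It is not the Whetstone source; the kernel, the constants and the function names are illustrative assumptions. The same file could be built twice with the flags listed above, once with -m32 and once with -m64, and the two timings compared.

```c
#include <time.h>

/* Runs `iters` iterations of a small double-precision kernel and returns
 * the accumulator, so the compiler cannot discard the work as dead code. */
double fp_kernel(long iters)
{
    double x = 1.0, acc = 0.0;
    long i;

    for (i = 0; i < iters; i++) {
        x = x * 1.0000001 + 0.0000001; /* multiply and add */
        acc += x / (x + 1.0);          /* add and divide */
    }
    return acc;
}

/* Times fp_kernel with clock() and returns the CPU seconds consumed.
 * The kernel's result is stored through `result`. */
double time_fp_kernel(long iters, double *result)
{
    clock_t start = clock();

    *result = fp_kernel(iters);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A real run would use an iteration count large enough that the loop takes several seconds, since `clock()` has coarse resolution.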

Bytemark

An old integer CPU benchmark (FP results were discarded) - for more information on the tests visit this site.

Compiled with:
-O3 -msse2 -mfpmath=sse (and -m32 for 32-bit, -m64 for 64-bit)

Here we do see a small 2% drop in performance when moving to 64-bit in one test; the rest of the tests, however, show a 0 - 15% improvement.

Lame 3.93

An MP3 encoder; we encoded a 40-minute .wav file (403MB).
Lame args: -b 192 -m s -h --quiet <file> - >/dev/null
(192kbps, simple stereo, high quality, output to nothing to avoid disk hits)

Compiled with:
-O3 -fomit-frame-pointer -fno-strength-reduce -malign-functions=4 -funroll-loops -ffast-math -msse2 -mfpmath=sse (again, -m32 for 32-bit, -m64 for 64-bit)

The performance improvement here is astounding: in 64-bit mode the Athlon 64 FX managed to finish the encode 34% quicker than in 32-bit mode. If these results are any hint of what could be in store for Windows users, there's a lot of promise behind the Athlon 64...assuming we get software support in time.
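A figure like "34% quicker" can be derived from two encode times in more than one way, and the two common readings give different numbers. The sketch below shows both calculations; the function names are illustrative assumptions, and any times passed in are placeholders rather than measured results from this review.

```c
/* Percentage reduction in elapsed time when a job that took t_old
 * seconds now takes t_new seconds. */
double time_reduction_pct(double t_old, double t_new)
{
    return (t_old - t_new) / t_old * 100.0;
}

/* Percentage increase in throughput (work done per second) for the
 * same job and the same two times. */
double throughput_gain_pct(double t_old, double t_new)
{
    return (t_old / t_new - 1.0) * 100.0;
}
```

For hypothetical times of 100 seconds and 66 seconds, the first function gives a 34% reduction in time, while the second gives a throughput gain of roughly 51.5%, so it is worth stating which of the two a percentage refers to.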

We wanted to run a transcode benchmark as well, but that didn't work out: one library tripped a bug in gcc, and transcode itself refused to compile. It deliberately forces a compile error when a structure comes out padded, which suggests its authors didn't expect anyone to build it on a 64-bit machine just yet.
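For readers wondering how a structure can "come out padded" in 64-bit mode and how code can turn that into a compile error, here is a minimal sketch. The structure and macro below are hypothetical and are not taken from transcode; we don't know how its actual check is written.

```c
#include <stddef.h>

/* Hypothetical record (not from transcode): a 32-bit field followed by
 * a pointer. With -m32 both members are 4 bytes and the struct is 8 bytes.
 * With -m64 the pointer is 8 bytes and must be 8-byte aligned, so the
 * compiler inserts 4 bytes of padding after `length`. */
struct frame_header {
    int   length;
    void *payload;
};

/* Classic pre-C11 compile-time assertion: when `cond` is false the array
 * size becomes -1, which the compiler rejects. */
#define COMPILE_TIME_ASSERT(name, cond) typedef char name[(cond) ? 1 : -1]

/* Bytes of padding the compiler added to struct frame_header. */
size_t frame_header_padding(void)
{
    return sizeof(struct frame_header) - sizeof(int) - sizeof(void *);
}

/* A strict "no padding" check like this one builds with -m32 on x86 but
 * stops the build with -m64, so it is left commented out here:
 *
 * COMPILE_TIME_ASSERT(header_is_packed,
 *     sizeof(struct frame_header) == sizeof(int) + sizeof(void *));
 */

/* A check that holds in both modes. */
COMPILE_TIME_ASSERT(header_fits_fields,
    sizeof(struct frame_header) >= sizeof(int) + sizeof(void *));
```

Code that writes such a structure straight to disk or over the network has good reason to insist on an exact layout, which is one plausible motive for a check of this kind.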


122 Comments


  • Anonymous User - Tuesday, September 23, 2003 - link

    That's all ?
  • wecv - Monday, August 14, 2017 - link

    Hello!

    I am from the future and I am here to tell you that AMD failed so hard in 2012 with bulldozer yet intel made a huge success with Sandybridge

    Where is your god now amdrones?

    But don't worry guys... in 2017 AMD made a huge success with Ryzen which is a cheap awesome 8 core with SMT which is kinda similar to the HT you know with the Pentium 4 HT but don't you worry guys! it's much better so you get 8 lovely cores and 16 threads

    Where is your god now shintel boys?
