TSCP

We apologize for the broken TSCP Makefile in the previous review, which rendered our initial results inaccurate.  Fortunately, we posted the file, so others were able to detect the error rather than find fault with the processors.  The main issue many of our readers have brought to our attention is the severe difference in performance between various optimizations.  Below you can see how various compile flags affected our benchmark scores.

The first benchmark is run with the optimization flags:

-O2 -funroll-loops -frerun-cse-after-loop
TSCP 1.8.1 -O2

The next benchmark is run with the optimization flags:

-O3 -funroll-loops -frerun-cse-after-loop
TSCP 1.8.1 -O3

Finally, we have the architecture optimized flags as well:

(Intel) -O3 -march=nocona -funroll-loops -frerun-cse-after-loop
(AMD) -O3 -march=k8 -funroll-loops -frerun-cse-after-loop
TSCP 1.8.1 -O3 -march

You are reading these charts correctly: the -O3 flag actually penalizes the AMD CPU.  We also compiled the program with -O2 -march=k8, but we got virtually the same score with or without the -march flag.
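For readers who want to repeat the comparison, the flag sets above can be written down as a small build matrix. The following is only a sketch: the variant labels, the `tscp-<variant>` output names, and the source file list are our own placeholders, not part of the TSCP distribution, and the script only prints the compiler command lines rather than running them.

```python
# Flags shared by every TSCP build described in this section.
COMMON = ["-funroll-loops", "-frerun-cse-after-loop"]

# Variant labels (hypothetical) mapped to the flags that differ per build.
VARIANTS = {
    "O2": ["-O2"],
    "O3": ["-O3"],
    "O3-nocona": ["-O3", "-march=nocona"],  # Intel build
    "O3-k8": ["-O3", "-march=k8"],          # AMD build
}


def build_command(variant, sources, cc="gcc"):
    """Return the compiler argument list for one variant; nothing is executed."""
    flags = VARIANTS[variant] + COMMON
    # The output name is an assumption made for this sketch.
    return [cc, *flags, "-o", "tscp-" + variant, *sources]


if __name__ == "__main__":
    # Placeholder source list; substitute the real TSCP .c files.
    sources = ["main.c"]
    for name in VARIANTS:
        print(" ".join(build_command(name, sources)))
```

Keeping the shared flags in one place makes it harder to drop a flag from a single variant, which is the kind of slip that skews a comparison like this one.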

We were informed that others have achieved much higher nodes-per-second scores using GCC 3.4.1 and the flagset:

-O3 -march=athlon-xp -funroll-loops -fomit-frame-pointer -ffast-math -fbranch-probabilities

We did not have time to fully test GCC 3.4.1, although there is a strong likelihood that the 3.4 series produces better-optimized code (particularly on x86_64 platforms).
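One detail of the reader-supplied flagset is worth spelling out. According to the GCC documentation, -fbranch-probabilities relies on profile data written by an earlier build compiled with -fprofile-arcs and then run on a representative workload, so it implies a two-pass build. The sketch below only constructs the two command lines; the function name, the output name `tscp-pgo`, and the source list are our assumptions, and we have not verified that this matches how those readers actually built TSCP.

```python
# Optimization flags taken from the reader-supplied flagset above,
# minus -fbranch-probabilities, which is only added in the second pass.
BASE_FLAGS = [
    "-O3",
    "-march=athlon-xp",
    "-funroll-loops",
    "-fomit-frame-pointer",
    "-ffast-math",
]


def profile_guided_commands(sources, output="tscp-pgo", cc="gcc"):
    """Return (instrumented_build, optimized_build) argument lists.

    Pass 1 builds with -fprofile-arcs; the resulting binary must then be
    run on a representative workload so that it writes its profile data.
    Pass 2 rebuilds the same sources with -fbranch-probabilities, which
    reads that data. Nothing is executed here.
    """
    instrumented = [cc, *BASE_FLAGS, "-fprofile-arcs", "-o", output, *sources]
    optimized = [cc, *BASE_FLAGS, "-fbranch-probabilities", "-o", output, *sources]
    return instrumented, optimized


if __name__ == "__main__":
    # Placeholder source list; substitute the real TSCP .c files.
    first, second = profile_guided_commands(["main.c"])
    print(" ".join(first))
    print("# run the instrumented binary on a benchmark workload here")
    print(" ".join(second))
```

If the profile data is missing, the second pass has nothing to work from, so a score obtained with this flagset depends on the workload used for the profiling run as well as on the flags themselves.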

Crafty

For good measure, we have included Crafty in our chess benchmarks section.  Crafty was built using only the "make linux-amd64" target.  Judging from the Makefile, the "AMD64" moniker seems slightly inappropriate.  The target claims:

#   -INLINE_AMD       Compiles with the Intel assembly code for FirstOne(),
#                     LastOne() and PopCnt() for the AMD opteron, only tested
#                     with the 64-bit opteron GCC compiler.

The benchmark was generated by running the "bench" command inside the program.

Crafty v19.15

It is clear that the difference between the two processors is quite severe in this instance.  Although it is difficult to pin down an exact culprit, it is likely that multiple architecture optimizations were left untapped, hence our reasoning for discouraging overuse of optimizations in general.


92 Comments

  • adiposity - Thursday, August 12, 2004 - link

    > From looking at the graphs, it becomes easy to
    > see why JTR makes a difficult program to use as
    > a benchmark. Had we left the default -O2
    > compilation, Blowfish hashing would have been
    > faster on the Xeon processor than the Opteron.
    > However, as soon as we use -O3, the Opteron
    > outperforms the Xeon processor.

    Um, no it doesn't. The opteron continues to lose, even with the -O3 optimization. In fact, -O3 doesn't seem to help either significantly in any of the JTR benchmarks.

    -O2: Xeon wins 481 to 419
    -O3: Xeon wins 478 to 420

    Of course, Opteron wins the rest, and -O3 doesn't seem to matter there, either.

    -Dan
  • Dranzerk - Thursday, August 12, 2004 - link

    Kris glad you included Crafty chess program into review. If you want to address anything dealing with the Program (like test wise) you can contact Dr. Robert Hyatt directly through ICC (Internet chess club, free to use if you log on as a Guest) he is online there as the name Hyatt.

    He is very easy to talk to about crafty if you need help.
  • douglar - Thursday, August 12, 2004 - link

    The one thing that I really see in these benchmarks is how much Intel is suffering when they are trying to run AMD64 optimized code. Normally intel makes the spec for new instructions before AMD implements them, the compiler writers and software coders take advantage of Intel's peculiarities and then AMD has to build the functionality with the same peculiarities as the intel implementation if they want to compete. Look at most SSE2 benchmarks. I think AMD is at a disadvantage having had to back fit the instructions using the existing CPU op units.

    This time it looks like intel is getting a taste of their own bitter medicine, trying to make 32bit integer units look and perform like 64bit units to the outside world, if the rumors about intel's 64bit implementation are true. Now it will be interesting to see if compiler writers and software coders will (or are able to) go back to the drawing board and make this round of intel chips perform up to the strong initial AMD 64 bit performance baseline.

    I'm guessing that there will be a 64 bit performance gap (larger than the 32 bit gap) until intel respins the silicon a couple times. I look at this round of 64bit intel chips as like an em64sx, in reference to the 386sx, even though the 386sx was 32bit internally and 16bit externally, kind of the opposite of having 64bit registers and 32bit ops units, but perhaps still an appropriate analogy.
  • Macro2 - Thursday, August 12, 2004 - link

    Kris,
    I have to hand it to you, you took a lickin' and kept on tickin'. I realize it's pathetic that in order to do a review you have to literally dissect every benchmark for "fuzzy" code but that's the way it is. Got to do the homework and if it's out of the realm of your expertise you have to ask others. Remember, benchmarks are for liars.
    That said, you'll probably turn out to be the best comparative benchmark reviewer on the internet.

    Mac
  • mrdoubleb - Thursday, August 12, 2004 - link

    Good job, Kris!

    1, Now I'd just like to point out that it's not like nobody is complaining now because this time AMD wins. This time around the processor choice was ok and, as much as I see from the opinions of readers who are much better informed in the server/linux world than me, the benchmark choice/execution was great, too.

    2, I know you're off to your vacation now (by the way, have a good time!), but later it would be interesting if you managed to do a 2 and/or 4 processor setup with Noconas and opteron 250/850s as well. As far as i know, Opteron's biggest advantage was its excellent scalability. Opteron's advantage used to grow immensely. Does this change with the new Noconas?

    3, I saw this at another HW site: on the final page of their reviews they have a chart where they list all the benchmarks once more and show with a percentage number how much faster/slower a processor is compared to its rival. E.g. you take the nocona as 100% (or zero) and list for every benchmark how much faster/slower Opteron was. (Say, +25% or -37%).

    4, As for post #36 by kaoman. I think that if they had compared 2 desktop processors, then we wouldn't have seen any of these benchmarks, save for Lame. We would have seen office, gaming, AV, and rendering benchmarks. And about the price: let's wait for it, shall we?! By the time Prescott 3.6F is available, 90nm A64 is out, which might also be tweaked, if the rumor mill is right, and it will also be cheaper for AMD to produce, so it might be cheaper for us as well.

    Have a great holiday, Kris!
  • kresek - Thursday, August 12, 2004 - link

    Thanks for the SSL benches.
    Especially useful are 1024+ bits RSA/DSA sign/verify figures (at the bottom), digests: MD5 or SHA-1, and popular ciphers, like RC4, blowfish. Take 1024 bytes or bigger blocks, and you have valuable, easy to visualize comparison information.
  • Pjotr - Thursday, August 12, 2004 - link

    ""Now will all of you A-Holes get off KrizK's & AT editorial staff's back!!"

    HAHHAHAHAHAHA I'm laughing my ass off.
    Great Job getting in the first post, and a good first post at that."

    I don't think it was a good post. Calling people with valid views, that even the author acknowledged, "A-holes" is just blunt and irrelevant.
  • skiboysteve - Thursday, August 12, 2004 - link

    great review, glad to see you take the readers to heart.
  • Pjotr - Thursday, August 12, 2004 - link

    Great job, Kris, and thanks to Super Micro! It's good to see when people don't just hide, but both listen and respond. I was ready to remove Anand from my bookmarks like I did with Tom's years ago, but it's staying now.
  • Soultrap - Thursday, August 12, 2004 - link

    Kris,

    Awesome! You took it like a man, and I think that all of us learned from your hard work including the errors in it. I feel that you did your absolute best to accommodate & listen to your readers, correct your errors, and produce an unbiased evaluation.

    There is nothing anybody can do about their mistakes except do their best to correct them and learn from them.

    After reading these posts to your new article I am sure that your blood pressure has dropped greatly. Harsh (un)constructive criticism can be very difficult to take.

    Good work!
