TSCP

We apologize for the broken TSCP Makefile in the previous review, which rendered our initial results inaccurate.  Fortunately, we posted the file, so others were able to detect the error rather than find fault with the processors.  The largest issue our readers have brought to our attention is the severe difference in performance between various optimizations.  Below you can see how various compile flags affected our benchmark scores.

The first benchmark is run with the optimization flags:

-O2 -funroll-loops -frerun-cse-after-loop
TSCP 1.8.1 -O2

The next benchmark is run with the optimization flags:

-O3 -funroll-loops -frerun-cse-after-loop
TSCP 1.8.1 -O3

Finally, we have the architecture optimized flags as well:

(Intel) -O3 -march=nocona -funroll-loops -frerun-cse-after-loop
(AMD) -O3 -march=k8 -funroll-loops -frerun-cse-after-loop
TSCP 1.8.1 -O3 -march

You are reading these charts correctly: the -O3 flag actually penalizes the AMD CPU.  We also compiled the program with -O2 -march=k8, but we got virtually the same score with or without the -march flag.
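All three flag sets above share -funroll-loops.  As a rough illustration of what that flag asks the compiler to do, the C sketch below compares a plain loop with a version unrolled four ways by hand.  The function names are hypothetical and the unroll factor of four is an assumption; GCC chooses its own factor, and none of this is taken from the TSCP source.

```c
#include <stddef.h>

/* Plain loop: one addition and one loop test per element. */
long sum_rolled(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Hand-unrolled by four (an assumed factor, for illustration only):
 * one loop test per four elements, at the cost of a larger loop body. */
long sum_unrolled(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;

    /* Main body: process four elements per iteration. */
    for (; i + 4 <= n; i += 4)
        s += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];

    /* Remainder: handle the last 0 to 3 elements one at a time. */
    for (; i < n; i++)
        s += v[i];

    return s;
}
```

Both functions return the same sum; the unrolled one trades code size for fewer branches, and whether that trade pays off depends on the processor, which may be one reason the same flags score differently on the two CPUs.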

We were informed that others have achieved much higher nodes-per-second figures using GCC 3.4.1 and the flag set:

-O3 -march=athlon-xp -funroll-loops -fomit-frame-pointer -ffast-math -fbranch-probabilities

We did not have time to fully test GCC 3.4.1, although there is a strong likelihood that the 3.4 series optimizes better (particularly on x86_64 platforms).

Crafty

For good measure, we have included Crafty in our chess benchmarks section.  Crafty was built using only the "make linux-amd64" target.  From the Makefile, it seems as though the "AMD64" moniker is slightly inappropriate.  The target claims:

#   -INLINE_AMD       Compiles with the Intel assembly code for FirstOne(),
#                     LastOne() and PopCnt() for the AMD opteron, only tested
#                     with the 64-bit opteron GCC compiler.
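For readers unfamiliar with the three routines named in that Makefile comment, here is a portable C sketch of what bitboard helpers of this kind typically compute.  The names pop_count, first_one and last_one are hypothetical, and both the bit numbering (bit 0 is the least significant bit) and the return value of 64 for an empty board are assumptions that may not match Crafty's own conventions.

```c
#include <stdint.h>

/* Number of set bits in a 64-bit bitboard. */
int pop_count(uint64_t b)
{
    int n = 0;
    while (b) {
        b &= b - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Index of the lowest set bit (bit 0 = least significant).
 * Returns 64 for an empty board (an assumed convention). */
int first_one(uint64_t b)
{
    int i = 0;
    if (b == 0)
        return 64;
    while ((b & 1) == 0) {
        b >>= 1;
        i++;
    }
    return i;
}

/* Index of the highest set bit (bit 63 = most significant).
 * Returns 64 for an empty board (an assumed convention). */
int last_one(uint64_t b)
{
    int i = 63;
    if (b == 0)
        return 64;
    while ((b & (UINT64_C(1) << 63)) == 0) {
        b <<= 1;
        i--;
    }
    return i;
}
```

The inline assembly the comment refers to presumably replaces loops like these with a few machine instructions, which is why the choice of build target can matter for a bitboard engine.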

The benchmark was generated by running the "bench" command inside the program.

Crafty v19.15

It is clear that the difference between the two processors is quite severe in this instance.  Although it is difficult to pin down an exact culprit, it is likely that multiple architecture optimizations were left untapped, which is part of our reasoning for discouraging overuse of optimizations in general.

92 Comments

  • Adul - Friday, August 13, 2004 - link

    prd00

    My understanding is that the Nocona 3.6 is the same core as the P4 3.6F, cache size and all. There is no L3 cache on this chip as far as I know.

    from page 1

    AMD Opteron 150 (130nm, 2.4GHz, 1MB L2 Cache)
    Intel Xeon 3.6GHz (90nm, 1MB L2 Cache)

  • prd00 - Friday, August 13, 2004 - link

    Hmm... #68, that's nice.. I would also like Apache server performance as well, as this one is a server CPU shootout. Try to reconstruct a page request response benchmark, kind of like the one used in the Opteron review or on AcesHardware. Also, please check the scalability when adding a second and 3rd/4th processor: how many percent can we gain over the single one?
    BTW, Kris.. in my opinion, Nocona 3.6 is not comparable to P4 3.6F, because Nocona 3.6 is way faster than 3.6F. It is much more comparable to P4 3.6EE than 3.6F vanilla. I think Intel will release a new 3.6F with vanilla P4 flavor when it is re-released as a plain one. So, I guess, 3.6EE is not comparable to A64 3500+ in many ways, as P3 3.4EE is comparable to AFX-53.
  • DrMrLordX - Friday, August 13, 2004 - link

    Good review. This one was definitely thoroughly explained and well-thought-out.

    I'd like to see the 3800+ put through the paces next, but eh, you guys deserve a rest *)
  • Locutus4657 - Friday, August 13, 2004 - link

    Good job Kris, you did a much better job this time! I guess the first article must really have been a learning experience! But that's what life is all about. So keep up the good work, and I look forward to reading more of your reviews.
  • Arias74 - Friday, August 13, 2004 - link

    One last comment as well, and then I'll shut up.

    I was wondering what the possibility is of using Apache for benchmarking in future reviews? Just a thought...

    Salvador
  • Arias74 - Friday, August 13, 2004 - link

    I also posted after reading the first article, so I figure I may as well do so again.

    I'm still a little confused why KK still thinks that a 3.6GHz P4 will be marketed against a 3500+ A64. They do not occupy the same space, price-wise. If you're looking at a 3500 in the name, even Intel realizes that you can't judge by numbers alone, based on the fact that they are moving to an arbitrary naming convention for their processors. The only way to compare the two different product lines is by price, because that is the only constant. So, if the 3.6GHz P4 is the highest priced desktop cpu, then you would have to compare that to AMD's highest price.

    For example, if you have 2 systems in the store side-by-side, and one was priced $500 more than the other, wouldn't you assume that the higher priced item would be that much better, hence the higher price tag? Especially nowadays when $1000 can get you a very capable system, $500 is a huge price difference.

    Basically, all I'm saying is that it is faulty reasoning to assume that a 3.6GHz Intel cpu will be marketed against a 3500+ Amd A64 cpu. Heck, even AMD doesn't know how to market their own Semprons... apparently, a Sempron 2800+ is only equal to an AXP 2400+ in terms of performance... very weird and wacky stuff!

    Salvador
  • Zebo - Friday, August 13, 2004 - link

    Good job, Kris!

    As far as the chips used... These are very much in competition with one another. Around the same price and "the best workstation processor" of the respective competing companies. Best vs. best and price. What else is there?

    You still need to dump the old article... or at least get rid of the hyperbole in the conclusion.
  • AnnoyedGrunt - Thursday, August 12, 2004 - link

    T8000, it is interesting how you consider this second review biased, even though many people had pointed out legitimate problems with the first one.

    You bring up a good point about HT though. Even though it helps in some cases, it hurts in many others. How much of a selling point is it in that case?

    I was impressed with both the constructive criticism of the readers and the professionalism of Kris's responses in this whole affair. Very nice job on this follow-up review. Hopefully, more updates will follow as more 64-bit programs become available.

    -D'oh!
  • Topnikko - Thursday, August 12, 2004 - link

    Nice job on this piece, Kris. I like the way you handle criticism. I think it's fairly obvious to you and everyone else that if the Opteron is capable of such a showing in a UP configuration, the Xeon doesn't have a snowball's chance in hell of outperforming the Opteron in multiprocessor configs. If you do decide to write an article comparing multiprocessor machines it'll be down-right ugly for Intel.
  • drewintheav - Thursday, August 12, 2004 - link

    Kris you are a w e s o m e !!!!!!
