Opstone

Since our use of Ubench in the previous article clearly infuriated many people, we are going to set that benchmark aside for the time being until we can decide on a better way to implement it.

In the meantime, a reader suggested we give Blue Sail Software's Opstone benchmarks a try.  In this portion of the review, we will use their precompiled, optimized binaries of the Scalar Product (SP) and Sparse Scalar Product (SSP) benchmarks.  The SP benchmark is explained by the author:

"The 'SP' benchmark calculates the scalar product (dot product) of 2 vectors ranging in size from 16 elements to 1048576 elements for both single and double-precision floats.  Although the Gflops/sec. for every vector length is recorded (in the resulting output log file), the average of all these values is reported. This benchmark is indicative of the performance of many raw floating-point data processing apps (movie format conversion, MP3 extraction, etc.)"
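
As a rough illustration of the procedure the author describes, here is a minimal Python sketch; it is not Opstone's code. The function names, the doubling of the vector length between runs, and the best-of-three timing are all our assumptions; the quote only gives the range of lengths and says the per-length results are averaged. Python floats are always double precision, so the single-precision pass is left out.

```python
import time


def scalar_product(a, b):
    """Return the dot product of two equal-length sequences."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def gflops_for_length(n, repeats=3):
    """Time the dot product of two n-element vectors and return Gflops.

    Best-of-`repeats` timing is an assumption, not Opstone's method.
    """
    a = [1.0] * n
    b = [2.0] * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        scalar_product(a, b)
        best = min(best, time.perf_counter() - start)
    flops = 2 * n  # one multiply and one add per element
    return flops / max(best, 1e-12) / 1e9  # guard against a zero timing


def average_gflops(min_len=16, max_len=1048576):
    """Average the per-length rates; lengths double each step (assumed)."""
    rates = []
    n = min_len
    while n <= max_len:
        rates.append(gflops_for_length(n))
        n *= 2
    return sum(rates) / len(rates)
```

Calling `average_gflops()` walks the full 16 to 1048576 range. An interpreted loop like this says little about the hardware itself; the sketch is only meant to show what is being timed and averaged.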

Note that we ran the P4-optimized binaries on the Nocona; these binaries do not provide x86-64 enhancements.  Running the AMD64 binaries on the Xeon yielded poor results. The P4 Opstone binaries are the only 32-bit binaries used in this analysis.

Opstone 04q2: Scalar Product

Below is the SSP benchmark, as explained by the author:

"The 'ssp' benchmark also calculates the scalar product of 2 vectors, except that these vectors are sparsely populated (only the non-zero value elements are stored) ranging from a 'loading factor' (non-zero/zero elements) of 0.000001 to 0.01 for both single and double-precision floats.  Since the data is not contiguous in memory, the performance is much lower than regular 'sp' and is measured in Mflops/sec.  There is not much difference in performance between different loading factors as this benchmark really challenges the ability of the processor to perform short bursts of calculations coupled with lots of conditional testing.  It is this reason that the P4 with its longer pipeline does not generally perform as well as the Athlon64.  This benchmark is indicative of the performance of many 3D games as the processing is similar (short bursts of calculations with numerous conditional testing)"
Opstone 04q2: Sparse Scalar Product
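
To make the "conditional testing" in the SSP description concrete, here is a minimal Python sketch of a sparse scalar product. The storage format, a list of (index, value) pairs sorted by index, and the names `to_sparse` and `sparse_scalar_product` are our assumptions; the quote says only that the non-zero elements are stored, not how Opstone stores them.

```python
def to_sparse(dense):
    """Keep only the non-zero elements as (index, value) pairs."""
    return [(i, v) for i, v in enumerate(dense) if v != 0.0]


def sparse_scalar_product(a, b):
    """Dot product of two sparse vectors stored as sorted (index, value) lists.

    Only indices present in both vectors contribute, so most of the
    work is comparing indices rather than multiplying.
    """
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        index_a, value_a = a[i]
        index_b, value_b = b[j]
        if index_a == index_b:
            total += value_a * value_b  # short burst of arithmetic
            i += 1
            j += 1
        elif index_a < index_b:
            i += 1  # no matching element in b; skip ahead in a
        else:
            j += 1  # no matching element in a; skip ahead in b
    return total
```

In this layout each loop iteration takes one of three branches depending on the data, which is the kind of hard-to-predict branching the author says favors a shorter pipeline.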

There is a general distrust of synthetic benchmarks, so take this portion of the analysis with a grain of salt.  We see a tale of two processors in these graphs; generally, the Xeon performs better in the raw-operation SP benchmark, while the Opteron performs better in the conditional-testing SSP benchmark. We would be led to believe the Intel processor handles integer content creation better than the Opteron, and vice versa with floating point applications. However, as we see in the rest of the review, this is not always the case.


92 Comments


  • Decoder - Thursday, August 12, 2004

    Kris,

    Great review! I wish someone would benchmark AMD 64 and EM64T in 64 bit mode with MORE THAN 4 Gigs of RAM. I heard EM64T takes a hit with more than 4 gigs.

  • offtangent - Thursday, August 12, 2004

    Kris,

    This was a great followup article, and certainly cleared a lot of things up. I was just wondering if it's possible to use the SPEC benchmarks in addition to the ones you've used, so we can get the SPECint & SPECfp values to go with it. There are some published values for these on the SPEC website, but the setups for each of those published results are not the same, so it's difficult to put them in perspective. Since you usually have access to very similar setups, I was wondering if you could add those two tests to your set of benchmarks. Thanks!

    OT
  • Viditor - Thursday, August 12, 2004

    Kris - "To be honest i wouldnt have known some of the mistakes i made had people not been so critical. I am not upset with the final outcome, it happens to everyone"

    And that is why AT is the first site I come to for information...
    Great job, and thanks!

    Cheers,
    Charles
  • T8000 - Thursday, August 12, 2004

    When I compare this review to the previous one, I see two interesting points:

    1. Most benchmarks ran a lot faster without hyperthreading, a scenario that was not tested here.

    2. When enough users (or a user with a lot of names) complain about their favorite product not winning the benchmarks, their product will come out better soon thereafter. I wonder if the Celeron 335 would have outperformed the Athlon 64 3800+ as well when this was required in the comments of the Celeron 3xx review by enough user names.
  • trooper11 - Thursday, August 12, 2004

    I just wanted to say I'm a long-time AnandTech reader and I applaud the work done with this review to clear up the problems with the previous one.

    It takes some guts to come out and admit things were done badly, and I can respect the reviews more knowing you are all willing to admit those things and work with the readers to solve the problem; some sites have a problem with that. I have been a fan of the site for several years and I was very surprised at the first review, but now I see you trying to make up for that and go forward. I just want to thank you for the work done on this review.

    It may still not be perfect in answering all the questions, but it certainly goes a long way versus the first article. I look forward to follow-ups.
  • KristopherKubicki - Thursday, August 12, 2004

    #55: Had a typo when i moved the table back over to make it readable :) They are both registered C3.

    I will work on the color issue more in the future, i just picked the default colors this time around.

    There are new Xeon processors, dubbed Irwindale, that use 2MB L2 cache. However, the Xeons you see now with large cache are L3.

    Kristopher
  • Anemone - Thursday, August 12, 2004

    It's a small word, but means a lot...

    Thank you.

  • 2002cbr600f4i - Thursday, August 12, 2004

    Kris,

    First off, MUCH better.... At least this seemed like a more fair fight.

    2 concerns/gripes/comments though...

    1) In the hardware config I noticed that one machine had Unregistered memory with CAS 2, the other had Registered CAS 3 memory. Since I know that Opteron requires registered, I'd assume that made the Opteron run the CAS 3 stuff. I really would have preferred to see a CAS 2 to CAS 2 fight (just to keep the apples to apples as much as possible.)

    2) (and this is a personal gripe against most benchmarking sites) Either pick a color code for each brand's processor and use that color for ALL charts showing that processor, or always list them in the same order. Showing the "best one first" can be rather confusing when they're changing order from one chart to the next.

    One other thing... Doesn't the Xeon have more than 1MB L2 cache? I thought the newer ones were all using 2MB or more of that or L3???

    Anyhow, thanks for going back and redoing this work. I don't think any of us hates you personally, we just want to see FAIR and EVEN reporting in general across the board. This review has gone a long way towards restoring my faith in this site.

    --Mike
  • Pumpkinierre - Thursday, August 12, 2004

    Agree with 42 and 52: something is wrong with your statement on Blowfish. Also agree with 50 on the power of different optimisations (and it's early days for the Nocona). Thanks also for waking up my interest in Linux.
  • adiposity - Thursday, August 12, 2004

    Hey snore...

    I noticed the unreadable table, too. I think it's some IE specific code, because I could view it in IE, just not firefox. You'd think linux benchmarks would have mozilla-compliant html :)

    Now, I don't know if it's just me, but I couldn't bring up the forum popup in firefox, either. Why not?

    -Dan
