Synthetic Benchmarks

Our Nocona server was set up in a remote location with little physical access, so we did not have time to run as many real-world benchmarks as we typically would. Fortunately, there are plenty of synthetic benchmarks that we can use to gather useful information quickly.

Sieve of Atkin (primegen)

Primegen is an older, but still useful, library for generating prime numbers in order using the Sieve of Atkin. We compiled the Bernstein implementation by simply running "make", and then ran the program as follows:

# time ./primes 1 100000000000 > /dev/null

primegen 0.97 (results graph)

We found the benchmark to be extremely reliable: repeated runs reproduced our figures with less than 1% variation.
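
Primegen itself is Bernstein's heavily optimized, segmented implementation, so the sketch below is not its code. Purely as a rough illustration of the underlying technique, here is a minimal, unsegmented Sieve of Atkin in C that counts primes up to a configurable limit; the LIMIT value and the clock() timing are our own additions for illustration, and an unsegmented sieve like this cannot reach the 10^11 range used in the run above.

/* Minimal, unsegmented Sieve of Atkin sketch (illustration only; not primegen). */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

#define LIMIT 10000000UL   /* far smaller than the 10^11 used in the article */

int main(void)
{
    unsigned long n, x, y, r, k, count;
    unsigned long sqrt_limit = (unsigned long)sqrt((double)LIMIT);
    char *sieve = calloc(LIMIT + 1, 1);
    clock_t start = clock();

    if (!sieve)
        return 1;

    /* Toggle candidates according to the three Atkin quadratic forms. */
    for (x = 1; x <= sqrt_limit; x++) {
        for (y = 1; y <= sqrt_limit; y++) {
            n = 4 * x * x + y * y;
            if (n <= LIMIT && (n % 12 == 1 || n % 12 == 5))
                sieve[n] ^= 1;
            n = 3 * x * x + y * y;
            if (n <= LIMIT && n % 12 == 7)
                sieve[n] ^= 1;
            if (x > y) {
                n = 3 * x * x - y * y;
                if (n <= LIMIT && n % 12 == 11)
                    sieve[n] ^= 1;
            }
        }
    }

    /* Eliminate multiples of squares of primes. */
    for (r = 5; r <= sqrt_limit; r++)
        if (sieve[r])
            for (k = r * r; k <= LIMIT; k += r * r)
                sieve[k] = 0;

    /* 2 and 3 are prime; everything still marked from 5 upward is prime.
       pi(10^7) is 664579, a handy sanity check. */
    count = 2;
    for (n = 5; n <= LIMIT; n++)
        count += sieve[n];

    printf("%lu primes up to %lu (%.2f s)\n", count, LIMIT,
           (double)(clock() - start) / CLOCKS_PER_SEC);
    free(sieve);
    return 0;
}

Build with something like gcc -O2 -o atkin atkin.c -lm. The point of the sketch is simply that the workload is dominated by tight integer arithmetic and sequential passes over the sieve array.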

Super Pi

We ran the Linux build of Super Pi 2.0, which is a closed-source application. We do not know which optimizations the program was compiled with, and we are prohibited from redistributing the binaries; please download the latest binaries from ftp://pi.super-computing.org/Linux. We ran the command:

# ./super_pi 20

Below is the program's reported calculation time, in seconds.

Super Pi 2.0 (results graph)

After re-running the program several times, our results never deviated by more than 1%. In these purely mathematical workloads, the Intel processor has now outpaced the AMD offering twice.
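
Because Super Pi is closed source, we cannot show what it computes internally. Purely as a hedged illustration of the kind of arbitrary-precision arithmetic a pi benchmark exercises, the sketch below runs the classic Gauss-Legendre (AGM) iteration using the GMP library; the precision, iteration count, and the use of GMP are our own assumptions, not Super Pi's algorithm or code.

/* Gauss-Legendre (AGM) pi sketch using GMP - illustration only, not Super Pi. */
#include <stdio.h>
#include <gmp.h>

int main(void)
{
    const unsigned long prec_bits = 350000;  /* roughly 100,000 decimal digits,
                                                kept small so the sketch runs fast */
    mpf_t a, b, t, p, tmp, pi;
    int i;

    mpf_set_default_prec(prec_bits);
    mpf_init(a); mpf_init(b); mpf_init(t);
    mpf_init(p); mpf_init(tmp); mpf_init(pi);

    mpf_set_ui(a, 1);                 /* a0 = 1         */
    mpf_sqrt_ui(b, 2);
    mpf_ui_div(b, 1, b);              /* b0 = 1/sqrt(2) */
    mpf_set_d(t, 0.25);               /* t0 = 1/4       */
    mpf_set_ui(p, 1);                 /* p0 = 1         */

    /* Each iteration roughly doubles the number of correct digits. */
    for (i = 0; i < 20; i++) {
        mpf_add(tmp, a, b);
        mpf_div_ui(tmp, tmp, 2);      /* a_next = (a + b) / 2          */
        mpf_mul(b, a, b);
        mpf_sqrt(b, b);               /* b_next = sqrt(a * b)          */
        mpf_sub(a, a, tmp);
        mpf_mul(a, a, a);
        mpf_mul(a, a, p);
        mpf_sub(t, t, a);             /* t_next = t - p*(a - a_next)^2 */
        mpf_set(a, tmp);              /* a = a_next                    */
        mpf_mul_ui(p, p, 2);          /* p_next = 2p                   */
    }

    mpf_add(pi, a, b);
    mpf_mul(pi, pi, pi);
    mpf_div(pi, pi, t);
    mpf_div_ui(pi, pi, 4);            /* pi ~= (a + b)^2 / (4t)        */

    gmp_printf("pi ~= %.50Ff...\n", pi);
    mpf_clear(a); mpf_clear(b); mpf_clear(t);
    mpf_clear(p); mpf_clear(tmp); mpf_clear(pi);
    return 0;
}

Build with gcc -O2 -o agm_pi agm_pi.c -lgmp. A handful of iterations at a few hundred thousand bits of precision already produces a workload dominated by big-number multiplications and square roots, which is the flavor of math such pi benchmarks stress.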

Comments

  • intelpen - Tuesday, August 10, 2004 - link

    Did Intel pay you to put such questionable, unknown tests in your article? :)
  • Floffer - Tuesday, August 10, 2004 - link

    KK: You might be excused a bit if this were your first CPU review, but I find it a bit drastic to draw any conclusions from some of these tests.
    You should know that a test like this will be looked at like the Win/Linux server tests.
    Give a guy money from MS and let him test the two platforms - the Linux guys will eat that man raw for failing to show what Linux can do.
    This is the exact same thing with AMD/Intel.
    Doing such a fast preview containing errors starts the trench war - and that is funny for maybe 5 minutes.
    Ask yourself: what do I want to test - Linux workstations, Linux servers, Linux, or CPUs?
    This looks more like "ohh, I've seen this test - I don't know what my results actually mean", with no standards, references, or tweaks, yet still drawing conclusions. To me this test looks a bit like an F1 car and a 24h Le Mans car showing up at a brand-new racetrack without allowing the teams to change the setup, then concluding from one test lap with two different drivers which car will be the winner. The chess test looks a bit like that to me. A 64-bit AMD vs. Intel test is so new that you might conclude this is a test where Intel is a lot better than AMD, and the next day AMD is 25% faster than Intel.
    Firstly, have a reason to run a test, and be sure how it works. A new test with a program intended to be optimized for a CPU should be so optimized. And have enough results to see that this maybe isn't "correct" and may need further testing.


    And to me it doesn't look that good that you, a senior Linux editor, are having problems compiling.

    Here is a test idea I would like to see, à la the FPS/$ figures seen in CPU or GFX tests:

    What I would like to see is kWh per year per PC/server. If a company wants to upgrade 100 or 1000 PCs, it could see whether it would save money while being environmentally friendly.
    It is like buying a fuel-efficient car that (maybe) costs more but has lower running costs and pollutes less.

    E.g.: calculate the kWh for standard use of a desktop/gaming PC at 2-6 hours a day (with a load/idle factor), or for a server running 24/7.
    It could be for one component only where possible and/or for the whole computer.
    But I guess the general concern in the US is more about FPS/$ - like getting an SUV instead of a fuel-efficient car and complaining about high oil prices.

    Greetings from Denmark (hoping not to start a Bush/Kerry, Win/*nix, etc. flame war)
  • plus - Tuesday, August 10, 2004 - link

    Kris,

    You just don't get it. You correct two erroneous benchmarks, which now favor the Athlon 64, yet you don't change your conclusion.

    Also, you need to clarify the point about the Nocona being in a remote location - somehow justifying only running synthetic benchmarks. Did you have your hands on the CPU, or not?

    Who set up the Nocona box?

    Plus
  • snorre - Tuesday, August 10, 2004 - link

    "Update: We have addressed the issue with the -02 compile options in TSCP, the miscopy from previous benchmarks of the MySQL benchmark, and various other issues here and there in the testing of this processor."

    That's not enough; you should have pulled the bad review altogether and reposted it once you had updated it with Opteron 150 results and a proper conclusion. There are still, without a doubt, many things wrong with your conclusion and test setup.
  • peter79 - Tuesday, August 10, 2004 - link

    I think people here are maybe exaggerating a little. By this I mean that different errors were made in this review, and together they make the final result disgusting and worth deleting, but the mistakes taken separately are understandable. If a 3500+ had been compared to this Xeon in a correct benchmark run, with correct results, I would not have had a problem with this. If the benchmark choice was unfair but everything else was OK, I would also not have complained. The problem is that all these things together make for a very bad review. And I do think this might permanently damage Anand's reputation. I hope that's not the case. You will be getting more Intel fanboys like this. You might even get some THG people.
  • 4lpha0ne - Tuesday, August 10, 2004 - link

    The test loop of ubench:

    double x,y;
    unsigned i,j,k=0;
    i=pmin;
    for (j=0;j<i;j++)
        if ( j%67 )
            k+=j%(i-j+1);
        else
        {
            x=i-j;
            y=log(1.0+x);
            x=abs(sqrt(y/(2.0+x))*(x*cos(atan(y/(3.0+x)))+y*sin(atan(y/(4.0+x)))));
            if ( x > 10.0 )
                y=pow(1.0+j/(5.0+x),y/(6.0+x));
            else
                y=pow(1.0+y/(1+j),x);
            x=x*exp(1.0/(1.0+y));
            k+=x;
        }
    i=k%99;
    if ( i==0 ) i++;
    return i;


    I think that says enough - it is useful neither for testing 64-bit mode capabilities nor for comparing different CPUs. Divisions, modulos, sin, cos, atan, pow, exp, log... It is a good test of the ability to execute some of the less often used microcode routines, nothing more.
  • WiZzACkR - Tuesday, August 10, 2004 - link

    I cannot believe it. After having read most of the posts I'm still so pissed I just have to say it: I find it incredible just how irresponsibly this review treats its duty as a source of reliable benchmarks and reviews - people actually base their opinions on this stuff! You can't just put a freaking update behind shit you dribbled and think it'll all be good, not to mention the serious damage you've done to AnandTech's image as an unbiased source of info. I mean, man: look at ANY hardware forum on the bloody planet and see how annoyed and disappointed people are!
    Saying "Not a big deal on the choice of hardware; it was just in there for reference anyway" is plainly the dumbest and most ignorant statement I've ever read from someone who wants to be a journalist.

    Way to go, guys! I'll quote some random lines from a few threads on [H] for you (it could be ANY other site, though, mind you):

    "Infact, I was getting fed up with all the bullshit articles Anandtech came out lately. So much so, I've just signed up here, and left Anandtech to sell out."

    "Dissappointing review at best, frighteningly incompetent review better describes it though"

    "This was the WORST review ever. Maybe tomorrow he will review the celeron vs the FX-53? Since that's basicly what he did here. Idiot."

    And finally: "This article looks like it was written by a forum troll, actually. It is almost like he wrote it in the hope to start a huge flame war. I mean, i have always trusted Anandtech.com almost as much as HardOCP.com, but that time is now over." - YES, indeed, for crying out loud! Now take that dumb review down and admit it was shit; otherwise you cannot even measure the damage that article inflicted on the site's image, and no tiny little update in the world will help you out of this one, believe me.

    Man, am I disappointed...
  • Burbot - Tuesday, August 10, 2004 - link

    My major problem with the article is that despite the Athlon winning or achieving a draw in all real-life tests, the author does not pay much attention to those results. Instead, he turns to synthetic tests and bases his conclusion on their results, ignoring the real-world apps. Does the author actually expect AnandTech readers to run some obscure pi or prime-number calculator more often than lame, gzip, or mysql?
    It is unfortunate to see yet another reviewer falling into the "I'll just quote GarbageMark's results" trap. Synthetic benchmarks may be useful, but correctly measuring the results, then understanding and interpreting them, takes a lot of time and knowledge of what happens under the hood. The author lacked both - and have a look at the deceptive conclusions he arrived at. Benchmarking is *hard*. Going the easy and fast way (a quick and dirty pi or prime JunkMark instead of carefully picked benchmark suites or a respectable selection of real-world apps) gives you irrelevant numbers and not a single useful result.

    Sincerely, Scientific Approach Fanboy.
  • fifi - Tuesday, August 10, 2004 - link

    I for one don't think that the choice of CPU is all that inappropriate.

    I think Kristopher has a valid point that when the P4 3.6F comes out, it's going to be much like the Xeon 3.6 is now. The L2 cache is not going to change, and the supported instructions should be pretty similar too.

    If the 3500+ is what he has his hands on, then that's all right; it's not as if a 200MHz upgrade is going to make ALL the difference. The L2 difference is harder to call, depending on the application, but 128+512 KB is what the 3500+ has, and nothing is going to change that.

    My main critique is with the choice of benchmarks and the incompleteness.

    I apologise if my earlier comments came out a bit too strong, but I can't stand people who throw around accusations of "fanboy" or conspiracy at the drop of a hat and are unable to listen to any reasoning. It seems like the only way to get their attention is to insult them loudly.

  • Dennis Travis - Tuesday, August 10, 2004 - link

    Kris, on your new test with Derek, please test in both 32-bit and 64-bit. I feel that what caused most of this flaming is that just weeks ago, in 32-bit, the new Intel got BEATEN down by the AMD 64. What changed? I would like to see the 32-bit benchmarks, because what changed in just a week that makes the Intel walk all over the AMD 64? Anand even said that, given the performance of the new Intel, AMD was still the CPU to buy. Putting AMD's 64-bit code on the Intel CPU can't change the 32-bit performance. Something is going on somewhere. I will not flame you nor put you down; that is not fair or right. I just feel, having read all the 32-bit tests just a week or so ago, that something is wrong somewhere - something with the CPUs, Linux, and the benchmarks.

    Will be awaiting the new round of testing. Thanks in advance for the 32-bit tests.

    It will help figure this mess out!

    Take care Kris and hang in there.

    ...Dennis
