Multiple Load Points

For AnandTech database benchmarks, we have always focused on "real world" tests. To achieve this, we have used real applications with loads such that CPU utilization was 80-90%. Recently we discussed how most enterprise database servers do not average 80-90% CPU utilization, but rather something closer to the 30-60% range. We thought it would make more sense to show performance where it is most likely to be experienced, as well as the saturation numbers for situations where the CPU is maxed out. We feel this is consistent with how GPUs are reviewed, and with how you might test drive a car. GPUs are tested at varying resolutions and anti-aliasing levels; with a car, you don't just hit the highway and see what the top end is.

We settled on six load points for testing. These load points are consistent across all platforms and are throttled from the client, independent of the platform being measured. We chose them because they split the load range into six roughly equal parts and allow us to interpolate data between the points. The last/highest load point is a "saturation plus" load point to verify that we tested up to the full capability of the CPUs.

For any given load point, there is a defined number of threads. Each test is 20 minutes in duration, comprising an 8 minute warm-up period followed by a 12 minute measured period. For a given load point, the client submits requests to the DB server as fast as the DB server will respond. The rate at which the client is able to submit requests is measured during the final 12 minutes of the test and averaged to determine the Orders/Minute for Dell and Transactions/Minute for Forums. After much blood, sweat, and almost tears, we were able to produce repeatable loads with an average deviation of 1.6%.
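The arithmetic behind the measured window can be sketched as follows. This is purely an illustrative sketch, not AnandTech's actual client harness; the helper names and the sample numbers are ours:

```python
# Illustrative sketch only: mirrors the arithmetic described above,
# not the real (client-side) benchmark harness.

WARMUP_S = 8 * 60    # 8-minute warm-up, discarded from the results
MEASURE_S = 12 * 60  # 12-minute measured window

def orders_per_minute(completed_in_window: int, window_s: float = MEASURE_S) -> float:
    """Average rate over the measured window only (warm-up excluded)."""
    return completed_in_window * 60.0 / window_s

def average_deviation_pct(samples: list[float]) -> float:
    """Mean absolute deviation from the mean, as a percentage of the mean."""
    mean = sum(samples) / len(samples)
    return 100.0 * sum(abs(s - mean) for s in samples) / (len(samples) * mean)

# Example: 24,000 orders completed during the 12-minute window.
print(orders_per_minute(24000))  # 2000.0 orders/minute
```

Note that the quoted 1.6% figure is the average deviation of the repeated runs, i.e. how far individual runs stray from their mean, not a margin of error on the absolute throughput.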

For each platform we ran the test 5 times for each load point and then averaged the 5 results. This was repeated for all loads, all tests, on all platforms... that is 300 test executions!!! (We won't even get into the debugging issues we had to deal with prior to the final results.) Thankfully, we managed to automate the process as much as possible when implementing the throttling mechanism for the load points.
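The bookkeeping above can be sketched in a few lines. The 6 load points and 5 runs per point are stated in the text; the two-benchmark, five-platform split is our assumption, chosen only because it is consistent with the stated 300-execution total:

```python
# Hypothetical illustration of the test matrix. Only LOAD_POINTS and RUNS
# are stated in the article; BENCHMARKS and PLATFORMS are assumed.

LOAD_POINTS = 6   # throttled load points per benchmark
RUNS = 5          # repetitions averaged per load point
BENCHMARKS = 2    # Dell DVD Store and Forums (assumed)
PLATFORMS = 5     # assumed platform count consistent with the 300 total

def average_runs(per_run: dict[int, list[float]]) -> dict[int, float]:
    """Collapse the per-run throughputs for each load point into one mean."""
    return {lp: sum(r) / len(r) for lp, r in per_run.items()}

total_executions = LOAD_POINTS * RUNS * BENCHMARKS * PLATFORMS
print(total_executions)  # 300
```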

59 Comments

  • peternelson - Thursday, July 13, 2006 - link

    Agreed!

    I'm not interested in 32 bit performance.

    If you're gonna be spending this much money on an upmarket system you better be running it in 64 bit mode. I know I will.

    So if Opteron benches better than Woodcrest in 64 bit mode that changes the equation for me.

    Also, isn't the Opteron 290 out any time now? That would close the % gap a little because of the higher clock speed.

    Also S1207 Opterons will be here 1st August. The new nforce5 based pro/server chipsets might give a little boost over existing ones too, as could the bandwidth boost and lower power of DDR2.
  • defter - Thursday, July 13, 2006 - link

    quote: "I'm not interested in 32 bit performance."
    Check the review; this one (and the previous Linux review) uses only 64-bit software.
  • Kiijibari - Thursday, July 13, 2006 - link

    Oh indeed, x64 Windows was used.

    Good to see, then, that Woodcrest doesn't take a 64-bit penalty. Maybe the Linux application in my source uses FP calculations and no SSE, or it is down to the compiler, which may favour Intel on Windows. Anyway, 64-bit is a big playing field for benchmarks; looking forward to reading more ;-)

    cheers

    Kiijibari
  • Calin - Thursday, July 13, 2006 - link

    Slightly better performance and slightly lower power consumption. Looks like you have a winner for new servers.
    However, for a Fortune 500 company, there are other things much more important than slightly better performance and slightly lower power consumption.
  • JarredWalton - Thursday, July 13, 2006 - link

    After the poor showing of NetBurst Xeons against Opteron, I'd think any Dell shops would be thrilled to regain the performance crown. Also, frankly, a 5-10% lead is about as much as anything gets you these days, especially once I/O and everything else come into play. The Woodcrest systems have better overall CPU performance, but that often isn't the deciding factor when working on massive databases.

    Incidentally, from what we've seen of Conroe, it seems like Intel could release Core chips at up to 3.4-3.6 GHz without difficulty right now. Rather surprising, given the 14 stage pipeline vs. 31 for Prescott.
  • FesterOZ - Thursday, July 13, 2006 - link

    Actually, it's not a big thrill at all. One of the major pushes at the firm is to consolidate into VMware-based servers or larger raw servers, but in all cases to stop the traditional one-server-per-application sprawl that seems to afflict most firms. Therefore we are more focused on 4-socket, 8-core servers, i.e. HP BL45 blades, than on 2-socket blades. We had all the top-level Dell executives coming in trying to convince us to stay with Dell because, at this time, they have no answer for the larger server (the Oct/Nov timeframe for the Dell 4-socket AMD server is too far out). So in the short term we will be a hybrid Dell/HP shop. Maybe we will shift back if Dell's commitment to AMD indeed ramps as expected.
  • Dubb - Thursday, July 13, 2006 - link

    I doubt this has much practical use, but I am nonetheless curious... could you "pinmod" a Clovertown to run a 1333 FSB?

    It might make for some speedy rendering if it were stable.
  • Kiijibari - Thursday, July 13, 2006 - link

    quote: "I doubt this has much practical use, but I am nonetheless curious... could you 'pinmod' a Clovertown to run a 1333 FSB?"

    No, that is not possible; don't you think Intel would release it if it were possible? Just think about it: lowering the FSB bandwidth on the quad-core part doesn't make sense, does it? Four cores are much more bandwidth hungry than two.

    The reason is Intel's "bolt-together" architecture. The quad-core part is just two dies in one package, so it puts twice the bus load of a single (dual-core) CPU on the FSB. Intel did the same with its NetBurst dual cores, hence the same FSB limitations there.

    All in all it is a little bit odd: Clovertown/Kentsfield performance increases will be much less than linear, but Intel has the time-to-market advantage over AMD's K8L quad core. Though AMD's quad-core design looks much more sound, I expect Intel to be first to release a quad-core CPU.

    cheers

    Kiijibari
  • Dubb - Thursday, July 13, 2006 - link

    okaaaayyy...

    Clovertown's platform supports 1333, and the Kentsfield engineering samples run 1333 easily. Most Clovertowns probably CAN; it's just a question of whether the 1066-to-1333 pinmod some have suggested for Dempsey or 1066 Woodcrests actually works, and if so, Clovertowns might be an interesting application of it.

    I'm just curious, is'all.
  • Kiijibari - Thursday, July 13, 2006 - link

    It may work; however, that kind of overclocking is riskier than normal overclocking. It is easy to overclock a chip that runs at, let's say, 2 GHz when there is, e.g., a 3 GHz top model: chances are good that yields are healthy, so your 2 GHz model may well run faster, since most chips pass the 2.6 GHz tests and yours was simply binned down to 2 GHz.

    With FSB1066 vs. FSB1333, however, I assume you are playing around at the absolute maximum. Intel would do everything it could to raise FSB speeds, especially with quad cores; it is nonsense, from a performance point of view, to decrease the available bandwidth while the number of bandwidth consumers (i.e. cores) increases.

    It might boot and work with a 1333 FSB, but Intel can't and won't guarantee that. It may be good enough for Super Pi or other "fun stuff", but if you run I/O-intensive applications, cross your fingers and be prepared for data corruption.

    bb

    Alex
