AMD Socket-F Opteron vs. Intel Woodcrest
by Jason Clark & Ross Whitehead on December 18, 2006 12:05 AM EST - Posted in IT Computing
Benchmark Methodology
For AnandTech database benchmarks, we have always focused on "real world" tests. To achieve this, we have used real applications with loads such that CPU utilization was 80-90%. Recently we discussed how most enterprise database servers do not average 80-90% CPU utilization, but rather something closer to the 30-60% range. We thought it would make more sense to show performance where it is most likely to be used, as well as the saturation numbers for situations where the CPU is maxed out.
We feel this is consistent with how GPUs are reviewed, and how you might test drive a car. GPUs are tested at varying resolutions and anti-aliasing levels; with a car, you don't just hit the highway and see what the top end is.
We settled on six load points: 20%, 40%, 60%, 80%, 100%, and 120+%, covering the full range of load. These load points are consistent across all platforms and are throttled from the client, independent of the platform being measured. We chose them because they split the load range into six roughly equal parts and allow us to interpolate data between the points. The 120+% load point was included to verify that our 100% load point really was 100%.
The 100% load point was determined by starting an execution of the client and adding threads until CPU utilization was between 95% and 100%. The other load points were determined by altering the number of threads on the client, thus adjusting the rate of client requests per second, until the appropriate ratio of Orders/Minute (Dell) or Transactions/Minute (Forums) was obtained relative to the 100% load point. These thread counts were recorded and kept consistent across all platforms.
For any given load point, there is a defined number of threads. Each test is 20 minutes in duration: an 8-minute warm-up period followed by a 12-minute measured period. For a given load point, the client submits requests to the DB server as fast as the DB server will respond. The rate at which the client is able to submit requests is measured during the final 12 minutes of the test and averaged to determine the Orders/Minute for Dell and Transactions/Minute for Forums.
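To make the procedure concrete, here is a minimal sketch of a load driver along the lines described above. It is illustrative only: the submit_request callback, the per-thread counters, and the run_load_point name are assumptions, not the actual Dell DVD Store or Forums client code.

```python
import threading
import time

WARMUP_SECONDS = 8 * 60     # 8-minute warm-up: requests run but are not counted
MEASURE_SECONDS = 12 * 60   # 12-minute measured period

def run_load_point(thread_count, submit_request):
    """Drive the DB server with a fixed number of client threads and return
    throughput (requests/minute) over the measured period."""
    stop = threading.Event()
    measuring = threading.Event()
    counts = [0] * thread_count          # per-thread request counters

    def worker(idx):
        # Each thread submits requests as fast as the server responds.
        while not stop.is_set():
            submit_request()
            if measuring.is_set():
                counts[idx] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(thread_count)]
    for t in threads:
        t.start()

    time.sleep(WARMUP_SECONDS)           # warm-up window, not measured
    measuring.set()
    time.sleep(MEASURE_SECONDS)          # measured window
    stop.set()
    for t in threads:
        t.join()

    # e.g. Orders/Minute for Dell, Transactions/Minute for Forums
    return sum(counts) / (MEASURE_SECONDS / 60.0)
```

In this sketch, the 100% load point would be found by calling run_load_point with increasing thread counts until server CPU utilization sits between 95% and 100%; the lower load points then reuse the recorded, smaller thread counts.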
After much blood, sweat, and almost tears we were able to produce repeatable loads with an average deviation of 1.6%.
For each platform we ran the test 5 times for each load point and then averaged the 5 results. This was repeated for all loads, all tests, on all platforms... that is 300 test executions!
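As an aside, here is one way the run-to-run figures could be reduced to the numbers quoted above, assuming "average deviation" means the mean absolute deviation expressed as a percentage of the mean; the Orders/Minute values are made up purely for illustration.

```python
def average_deviation(results):
    """Mean of the runs plus their mean absolute deviation as a percentage of the mean."""
    mean = sum(results) / len(results)
    avg_dev = sum(abs(r - mean) for r in results) / len(results)
    return mean, 100.0 * avg_dev / mean

# Five hypothetical Orders/Minute results for a single load point:
runs = [10250, 10410, 10180, 10330, 10290]
avg, dev_pct = average_deviation(runs)
print(f"average: {avg:.0f} Orders/Minute, average deviation: {dev_pct:.1f}%")
```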
Dell & Forum SQL Trace Analysis
The Dell and Forum benchmarks are quite different workloads, which you will see in the benchmark results. Dell executes approximately 10 times more queries during the test, and its query durations are approximately one quarter of the Forum benchmark's. To summarize, Dell is a workload with a high transaction volume where each query executes in a very short amount of time. The Forum workload has a medium transaction volume, and its queries execute in a reasonable amount of time but are much more read intensive (larger datasets are returned).
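For readers who want to reproduce this kind of trace analysis, a rough sketch follows. The file names, the 'Duration' column, and its microsecond units are assumptions about a SQL trace exported to CSV, not the exact format we used.

```python
import csv
from statistics import mean

def summarize_trace(path):
    """Return total query count and average duration (ms) from a trace CSV.
    Assumes a 'Duration' column recorded in microseconds."""
    durations = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            durations.append(int(row["Duration"]))
    return len(durations), mean(durations) / 1000.0

# Hypothetical trace exports for the two workloads:
for name in ("dell_trace.csv", "forums_trace.csv"):
    count, avg_ms = summarize_trace(name)
    print(f"{name}: {count} queries, {avg_ms:.1f} ms average duration")
```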
Dell DVD Store Information
38 Comments
JarredWalton - Monday, December 18, 2006 - link
Fixed 3 as well, thanks.
MartinT - Monday, December 18, 2006 - link
I don't like tests that rely on one tested party to supply both their own and their competitor's systems. Those situations are prone to favorable choice of components and outright manipulation well beyond the BIOS settings you claim to have checked. The very least you could do would be to ask the competitor to supply their own system for comparison.
Also, while I realize that AMD is kinda desperate to find any advantage, their current "Best CPU at doing nothing."-push seems rather convoluted, IMHO.
JarredWalton - Monday, December 18, 2006 - link
If you ask each vendor to supply a system, you will never get anywhere near "equivalent" configurations. The purpose of this article is to show that there are a lot of companies that will be fine with their current Opteron systems, and if you are more interested in saving power (because you know your server won't be run at capacity) Opteron does very well. Obviously, there are plenty of areas where Woodcrest (and now Clovertown) are better, and we've covered some of those areas in the past.
What server is best? That depends largely on the intended use, which is hardly surprising. I've heard that Opterons do especially well in virtual server environments, for example, easily surpassing Intel's current best. I'd love to see some concrete, independent testing of that sort of thing, but obviously figuring out exactly how to benchmark such servers is difficult at best.
MartinT - Monday, December 18, 2006 - link
I'm not sure you understood my point, which was that by sourcing an Intel system from AMD, AMD had full control over not just their own system but its competitor's too, down to even the specific CPUs they sent.
Now that wouldn't be too bad if it were just a performance test, since those hardly vary much amongst samples from the same product lines, but as power consumption enters the mix, and in fact takes center stage here, system choices become paramount to the outcome.
Maybe I'm too big a cynic, and maybe what I allege is far from true, but under the specific circumstances of this review, I suspect that AMD's competitive performance analysis team played a major role in what hardware actually ended up in your hands.
(i.e. Not just are the memory configs and motherboards probably carefully chosen to support the intended message, the Opteron and Xeon CPUs might also have been sampled accordingly. And from your conclusion, they've done their job well, apparently!)
Would an off-the-shelf Opteron system produce the same results your review unit did? I don't know. Would the outcome have changed if the Intel Xeon system wasn't built to the specs of its main competitor? I don't know. But I'd be much more willing to accept the conclusion if either (a) both competitors supplied their entries themselves or (b) both units were anonymously bought from a respected OEM.
PS: Kudos to the AMD marketing team, too, as they managed to seed at least two of these articles and so far got their message across, and only a couple of days before Christmas, too, virtually ensuring full frontpage exposure for the better part of three weeks.
mino - Monday, December 18, 2006 - link
I cannot say for AT, but these numbers are reasonable and pretty much correspond to our own observations.
Overall, the review says the two, Opteron 2000 and Xeon 5100, are pretty evenly matched. And AFAIK this is the opinion of pretty much every serious IT magazine or professional.
BTW, we had IBM tech guys on a visit and they had a similar view of the situation: the 5100 is slightly better clock-for-clock than the 2000 in most general tasks, while Opteron rules the roost on heavily loaded virtualized machines.
From the long-term perspective, IMHO Opteron is the far better choice, if only for the possible upgrade to K8L. The Woodcrest platform has no such option available. And no, Clovertown is NOT a serious contender for most workloads. It would yield even to a hypothetical Quad-K8, not to mention K8L.
WarpNine - Monday, December 18, 2006 - link
Please read this review (I think it is the same comparison as this one): http://www.techreport.com/reviews/2006q4/xeon-vs-o...
Very different conclusion??
JarredWalton - Monday, December 18, 2006 - link
I think the reviews basically say the same thing in different ways. We are not saying Opteron is 100% the clear winner here, merely that it can still be very useful and fulfills a market need. For a lot of companies, service and support will be at least as important as power and performance, though - which is why plenty of businesses ran NetBurst servers even when Opteron was clearly faster and more power efficient. For companies that switched to Opterons, it's going to take more than a minor performance advantage (in some cases) to get them to change back. At least, that would make sense to me.
Companies that need absolute maximum performance will of course be looking at Clovertown configurations (or perhaps even waiting for 4S quad core - is that out yet?)
photoguy99 - Monday, December 18, 2006 - link
The conclusion of this article seems slanted - Did AMD suggest specifically that you look into "low end performance per watt"? Be honest, they planted the seed, right?
1) Please post a link to the last article where AT's conclusion was overall this favorable to the top end performance loser. Please, we're waiting...
2) Why should the Intel system not be quad-core? Just because AMD doesn't have it yet? They even work in the same socket!
3) How can you justify saying AMD did "very well", and there's no Intel upgrade benefit unless you "routinely run your servers near capacity", when Intel quad core would have completely invalidated the results for performance per watt at nearly all levels?
Full disclosure, no axe to grind: I have praised previous AT articles because they are usually great. I currently own AMD as my primary system.
This article just doesn't smell right - too much vendor influence.
Jason Clark - Monday, December 18, 2006 - link
Have you not read anything we've posted in the last few months? See the Woodcrest article: http://www.anandtech.com/IT/showdoc.aspx?i=2793&am...
We've been touting performance / watt for months. You most certainly don't compare a quad core (8-way) setup to a 4-way and call that fair :) We have a Clovertown article on the way, it's going to include an 8-way socket-F system.
Cheers
Lifted - Monday, December 18, 2006 - link
I agree, to an extent. I just ordered some DL380 G5s, and the fastest quad-core CPU currently available for them is 1.86GHz. Comparing 8 cores at 1.86 vs 4 cores at 3.0 starts to get difficult, as it really depends on the application in use on that system. Since this article seems to be more of a comparison of CPU architectures, the systems and CPUs used in the test seem appropriate. I think it's smart to wait until AMD has quad core out and compare apples to apples. The folks currently ordering quad core Intel systems (like me) would likely not be interested in a dual core Intel or AMD system, as the task dictates the hardware, and with the systems I'm using quad cores in I simply don't need the speed, just more cores.