Original Link: http://www.anandtech.com/show/1099
AMD Opteron Coverage - Part 2: Enterprise Performance
by Anand Lal Shimpi on April 23, 2003 3:07 AM EST
Almost a full year before AMD launched the Athlon MP platform, we were already using AMD based servers on AnandTech. Back then there were no 1U chassis solutions for Athlons, there weren't even any server-class chipsets, much less server-class motherboards to take advantage of them.
We had to go out and find the most stable desktop Socket-A motherboards available and build them into oversized 4U/5U ATX rackmount cases in order to realize our goal of using Athlons as servers. We didn't make the move to AMD-based servers to be "cool" or to try something new; we made the move because all performance indicators stated that we'd be better off with AMD than we were with our older Pentium II/III Xeon platforms.
We documented our entire migration process, and it turned out that we made the right bet. We were sold on the usefulness (and cost effectiveness) of the Athlon as a server platform, although even to this day there isn't much support for the Athlon MP.
For a CPU that wasn't designed as an enterprise class processor to begin with, the Athlon did an excellent job when it actually made it to the server market. The lack of Tier 1 OEM support for the Athlon MP platform as well as relatively lackluster Tier 3 OEM solutions with very poor manageability options left the Athlon MP a diamond in the rough, never to be exposed to the majority of the enterprise market.
Fast-forward to today and you'll see that the Athlon MP is no match for the higher-speed Xeons, and the lack of a 4-way+ solution hampers the platform's success in database environments.
Although the Athlon MP has lost its performance advantage, AMD has finally brought the Opteron to market - and just in time. We've already thoroughly explained the architecture behind the Opteron; if you are not intimately familiar with the K8 core then we strongly suggest you read Part 1 of our Opteron coverage before proceeding here.
The Contenders: Opteron 244 vs. Xeon DP 2.80
You should already be familiar with the three flavors of Opterons that were just recently launched, but as a quick recap here they are in all of their confusing-model-number glory:
AMD Opteron Model Numbers
Note that all three of these CPUs are 1MB L2 parts; there haven't been any announcements for smaller cache versions. AMD did mention a while ago that the K8 core would not be paired up with any more than a 1MB L2 cache, presumably because of the low-latency memory access that is made possible by the on-die memory controller.
Keep in mind that mass-market availability of the Opteron 244 isn't planned until June, with the 242/240 available today.
The Opteron chips were paired up with an AMD-8000 based motherboard made by Rioworks.
In the opposing corner we have Intel's Xeon DP running at 2.80GHz; Intel could not provide us with any of the newer 3.06GHz samples in time for this review, but we will make an effort to explain where the 3.06GHz would fall in the performance spectrum when possible. These are 512KB L2 parts, although Intel has accelerated the introduction of their 1MB L2 Xeon DP to next month in order to compete with AMD's Opteron. The first 1MB L2 Xeon DP part will debut at 3.06GHz and will make for a very interesting comparison to the Opteron 244.
The Xeon DPs were installed on an Intel E7501 based motherboard; for those not familiar with the E7501 chipset (Plumas-533), its major features are as follows:
- Dual Processor Xeon support (400/533MHz FSB)
- 2 x 64-bit DDR266 memory channels
We tested all processors in dual processor mode, with Hyper-Threading enabled on the Xeon platform.
With the test platforms clearly defined, let's get a taste for Opteron's performance.
The same ol' SQL Tests
At AnandTech we've always advocated real-world benchmarks, especially with enterprise-class CPUs. When it comes to real-world enterprise benchmarks there are the massive TPC benchmarks available online, but there's very little offered for the small-to-medium enterprise applications. Luckily, at AnandTech, we're sitting on a goldmine of real-world test data since every day our servers are hit by hundreds of thousands of users looking to get the latest info on the greatest hardware.
We harnessed the power of this resource by recording a trace of every access to each one of our three database servers: the Website, Advertising and the Forums database servers. With these trace files we can then play back those accesses in a highly repeatable fashion on any machine we choose, so we can gauge the performance of the platforms we're reviewing today in our own server environment.
Here's a description of the types of accesses that occur in each one of the three databases:
The Web DB is where all of our content is stored; everything from news and reviews to our own internal article rankings is stored in this database. By far the majority of the transactions on this database are selects (reads). Remember that the web site only really offers one-way interaction: readers come to the site and read articles, which are contained in this database. The articles are selected from the database and fed to one of the 6 web servers for assembly into a page for your browser. Internally, some update queries are also run, but they were not recorded in the test trace we ran. This database is the smallest out of the three; the DB was only 300MB when we ran the test.
The Ad DB is very similar to the web database in that quite a few selects are running. The select queries are used to pull the ads from the database for display in the user's browser. There are also a number of stored procedures that run along with the selects, but to keep things as simple as possible (at least for this comparison), we omitted them from the test trace. The Ad DB is noticeably larger than the web database, at 2.1GB at the time of publication.
The final database is the Forums DB, which is by far the most transaction intensive database in the AnandTech Network. While the vast majority of the requests to the DB are in the form of selects (users reading categories and threads), there are significantly more inserts and updates (posting, thread/post counts, etc.) than in either of the other DBs. This database is also our largest, weighing in at just under 3GB during the testing and close to 12GB today (we used an older version of the DB from over a year ago).
In the past, when we used database server testing, it was done using a single trace run on the AnandTech Forums. While we're using two additional databases, the test methodology remains the same. We recorded a trace of transactions on each one of these databases for a set period of time. These were live recordings while the website and forums were being accessed just like they would on any normal day. The traces were then played back at full speed (as fast as the server test bed could replay them) and their playback times recorded. We divided the number of transactions replayed by the playback time and reported all scores in numbers of database transactions per second: the higher the better.
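To make the scoring concrete, here is a minimal sketch of a full-speed trace replay and the transactions-per-second calculation described above. It is not our actual test harness: the in-memory SQLite database, the `posts` table, the generated trace and the function names are all assumptions made purely for illustration.

```python
import sqlite3
import time


def replay_trace(conn, trace):
    """Replay every recorded statement back-to-back; return (count, seconds)."""
    start = time.perf_counter()
    for sql, params in trace:
        # No think time between statements: replay as fast as the server allows.
        conn.execute(sql, params).fetchall()
    conn.commit()
    elapsed = time.perf_counter() - start
    return len(trace), elapsed


def transactions_per_second(count, seconds):
    """Score: transactions replayed divided by playback time (higher is better)."""
    if seconds <= 0:
        raise ValueError("playback time must be positive")
    return count / seconds


if __name__ == "__main__":
    # Hypothetical stand-in database and trace, not the real AnandTech data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
    trace = [("INSERT INTO posts (body) VALUES (?)", (f"post {i}",)) for i in range(1000)]
    trace += [("SELECT body FROM posts WHERE id = ?", (i + 1,)) for i in range(1000)]
    count, seconds = replay_trace(conn, trace)
    print(f"{transactions_per_second(count, seconds):.0f} transactions/s")
```

Because the same trace is replayed on every machine, the transaction count stays fixed and only the playback time varies between platforms.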
What's unique about this round of server benchmarks is that we're actually forming the next set of tests that we're going to be running on the Opteron from the database accesses that occur while you all are reading this article. It has been well over a year since we updated our database tests and our server load has grown tremendously since then, so updated benchmarks are necessary. At the same time we've also got a new set of web tests based on our web architecture that we will be debuting along with the new DB tests once they're ready, so keep hitting these servers while you read the article - the more load we put on the DBs, the more stressful our tests will be for next time. How does it feel to be a part of a benchmark?
AnandTech Forums Database Performance
With hundreds of users always logged on and thousands more just browsing the forums, the AnandTech Forums database gets a full workout. Now with over 111,000 registered users it has become a difficult task making sure our upgrade cycles keep up with the growth of the forums.
All of the systems featured in the chart below are 2-way setups, using identical hardware where possible.
The Athlon MP has historically done extremely well in our Forums DB performance test and the trend continues here, giving even the Opteron 244 a run for the money. The Opteron 244 does manage to come out on top, by a very small performance margin. A 3.06GHz Xeon would most likely tie the Opteron 244 for the performance lead; however, everything above the Opteron 242 pretty much exhibits identical levels of performance.
The AnandTech Forums DB trace is a prime example of a well balanced database, where the performance isn't overly dependent on any one aspect of the server's architecture. Here we see that CPU speed does help but is far from the determining factor in the overall performance of the platform.
AnandTech Website Database Performance
The AnandTech Website DB is much less I/O bound than the Forums DB, simply because we're dealing with a much smaller database whose access pattern is mostly composed of selects (reads). Here it's quite easy to find yourself CPU bound:
The Opteron does extremely well here, not only outperforming its predecessor but also the 2.80GHz Xeon (with Hyper-Threading enabled) by a good 13%.
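For readers who want to reproduce this kind of comparison from their own transactions-per-second scores, a percentage lead can be computed as in the short sketch below. The two scores are hypothetical placeholders chosen only to illustrate the arithmetic; they are not the measured results.

```python
def percent_lead(score, baseline):
    """How far `score` is ahead of `baseline`, as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (score / baseline - 1.0) * 100.0


# Hypothetical scores in transactions per second, for illustration only.
xeon_tps = 400.0
opteron_tps = 452.0
print(f"{percent_lead(opteron_tps, xeon_tps):.0f}% lead")
```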
AnandTech Ad Database Performance
Next up on the test list is our Ad Database Server test bench. All of the ads across AnandTech and the AnandTech Forums are served using e-Zone Media's FuseAds advertising management software. The software was custom tailored to fit our needs and is tightly integrated with our internal content management system. The end result is that the ad placement, rotation, statistics and customization is very flexible and easy to use for our sales staff, however its tight integration with the entire network means that this server is just as important as the content feeding servers. Should the Ad Database be slow to respond or not respond at all because of an overwhelming load, the entire site and forums would slow to a crawl.
Although all of the Opterons manage to outperform Intel's Xeon, the 244 (1.80GHz) extends a healthy 12% lead over Intel's offering. Even a 3.06GHz Xeon would not be able to close up that gap, although a 1MB L2 part might help...
Database Scaling Performance
Now we've given you an idea of how well these platforms compare in normal circumstances, but the life of a database server is rarely limited to one set level of load. As a website grows, the accesses to the database server become more frequent and the load on the server grows tremendously.
If left unchecked, a poorly planned server architecture could become bottlenecked by a single database server over the course of several months. You could have the fastest webservers in the world, but if they can't get the data they need because you're CPU-bound on the database server, then their power is useless.
For this test we took the two fastest platforms, the dual Opteron 244 and the dual Xeon 2.80GHz, and increased the size and load of the test database (for simplicity's sake we stuck with the AnandTech Ad DB test) from 1x up to 24x.
Here you can see that although the Opteron 244 and Xeon 2.80GHz both start out at about the same level of performance, as the database grows in size and in load, the Opteron handles things much better.
This is a perfect real-world example of the Opteron's strengths as a multiprocessor system. Thanks to the high-bandwidth HyperTransport (HT) interconnect between the processors and thanks to the fact that each CPU gets its own memory controller, the scaling here is incredible on the Opteron 244 platform.
The black lines on the chart above are trend lines showing the overall performance curve of the two platforms; as you can see, the Xeon begins to level off much quicker than the Opteron, although both run into the I/O limitations of our testbed at around the same point and begin offering diminishing returns.
To make sure that these performance results aren't just due to the larger cache of the Opteron, we did a quick 1P vs. 2P comparison to see if the MP architecture of the Opteron platform actually does grant it an advantage over the Xeon. We chose a medium-sized database setting of 8x the original size and load of the Ad DB and came up with the following results:
As you can see, the highly scalable architecture of the Opteron is helping it out considerably in this test. Not only will the Opteron offer higher performance immediately, it will provide for longer hardware lifetimes courtesy of its ability to handle more load than the Xeon.
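A 1P vs. 2P comparison like this one boils down to a speedup and a scaling efficiency figure. The sketch below shows one common way to compute them, under our own assumptions: the function name is hypothetical and the throughput numbers are placeholders, not the results from the chart.

```python
def scaling(tps_1p, tps_2p, cpus=2):
    """Return (speedup, efficiency) when going from one CPU to `cpus` CPUs."""
    if tps_1p <= 0 or cpus <= 0:
        raise ValueError("single-CPU throughput and CPU count must be positive")
    speedup = tps_2p / tps_1p      # 2.0 would be perfect scaling for two CPUs
    efficiency = speedup / cpus    # 1.0 means every added CPU is fully used
    return speedup, efficiency


# Hypothetical transactions-per-second scores, for illustration only.
speedup, efficiency = scaling(tps_1p=100.0, tps_2p=180.0)
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

A platform whose efficiency stays closer to 100% as load grows has more headroom left before the second CPU stops paying for itself.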
Web Server Performance under ColdFusion MX
We have been running database server tests ever since the NetBurst based Xeon was released at a mere 1.7GHz, but to this date we have never really done any webserver tests. Considering that Athlon MP systems power all of our webservers that are a part of our 22-server farm, we thought it would be appropriate to devise a test that would stress our web applications as well.
Our own in-house developer, Jason Clark, put together the test that simulates accesses to discussion forums running FuseTalk Community Edition. FuseTalk Community Edition takes full advantage of ColdFusion MX and is developed by FuseTalk, Inc.; the software is an enterprise-class version of the forums software we use at AnandTech.
As was the case with our database server tests, in order to create load we resorted to recording a sort of trace of usage patterns and played it back on the test system. The trace was recorded using Microsoft's Web Application Stress Tool; the tool allowed us to record all actions in a web browser, and then play them back with a multiplier in order to simulate a realistic number of clients. We instructed the tool to record the actions without any user delays so that we could truly stress the hardware.
The results reported were in the form of average request time (how long it takes a single page to become ready for download by the user) and total number of page requests (the number of pages actually served). Both metrics are useful and thus we report both below:
The Athlon MP has served us well as a high-performance, low-cost webserver platform - and it looks like the Opteron is a worthy replacement. Even the 1.40GHz Opteron 240 (not pictured here) can outperform the 2.80GHz Xeon. The integrated memory controller helps performance here tremendously, not to mention the highly efficient multiprocessor architecture. The end result is that the average page request time is 36% faster on the Opteron 244 than on the 2.80GHz Xeon.
These sorts of page request times are reasonable for a server under heavy load; on the Opteron 244 server under load the results show that, on average, a page is ready for you to download 171ms after you request it by clicking on a link or typing in a URL - not too shabby.
Quicker page request times result in more pages that are able to be served, as you can see in the graph above.
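The two web server metrics in this section can be derived from a simple list of per-request timings. The sketch below is only an assumption about how such a summary could be computed; it does not reflect the output format of Microsoft's Web Application Stress Tool, and the function name and sample timings are hypothetical.

```python
from statistics import mean


def summarize_requests(durations_ms):
    """Summarize one stress run from per-request durations in milliseconds."""
    if not durations_ms:
        raise ValueError("at least one request is required")
    return {
        "average_request_ms": mean(durations_ms),  # lower is better
        "total_page_requests": len(durations_ms),  # higher is better
    }


# Hypothetical timings for a handful of requests, for illustration only.
summary = summarize_requests([150, 180, 165, 190, 170])
print(summary)
```

With a fixed test duration and a fixed number of simulated clients, a lower average request time leaves room for more requests to complete, which is why the two metrics move together.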
Without even touching the 64-bit capabilities of Opteron, and without exploring the performance benefits of a NUMA-aware OS, AMD has an extremely capable enterprise microprocessor on their hands with Opteron.
The CPU is not only highly scalable, but also offers extremely high performance, as is evident from our real-world database and web serving tests. Although Intel's 3.06GHz Xeon DP with a 1MB L2 cache will definitely eat into the Opteron's performance lead (especially on the DB side of things), it looks like AMD has done their job well in terms of threatening Xeon's throne.