AMD Opteron 248 vs. Intel Xeon 2.8: 2-way Web Servers go Head to Head

Author: Anand Lal Shimpi & Jason Clark

by Anand Lal Shimpi & Jason Clark on December 17, 2003 9:15 AM EST

Posted in
IT Computing


First Round K.O.

We measured performance using two metrics: the average time it took to fulfill a request to the web server, and the total number of templates (pages) served by the web server during the 30-minute test period. The two numbers are related, but both are useful to look at in order to get an idea of the real world difference in performance between the platforms.
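As a rough illustration of how the two metrics relate, here is a minimal Python sketch. It assumes the load generator records one duration per template served; the sample durations, the client count, and the function names are hypothetical and not taken from the test setup. With a fixed number of clients that request pages back to back, a lower average request time translates directly into more templates served over the same 30-minute window.

```python
from statistics import mean


def summarize(request_times_ms):
    """Return (average request time in ms, total templates served)."""
    if not request_times_ms:
        raise ValueError("no requests recorded")
    return mean(request_times_ms), len(request_times_ms)


def expected_templates(clients, duration_s, avg_request_ms):
    """Estimate templates served by `clients` simulated users that issue
    requests back to back (no think time) for `duration_s` seconds.
    This is an assumption about the load pattern, not the article's method."""
    return clients * duration_s * 1000.0 / avg_request_ms


# Hypothetical sample: one entry per template served during a run.
sample = [120.0, 150.0, 135.0, 160.0]
avg_ms, total = summarize(sample)
print(f"average request time: {avg_ms:.2f} ms, templates served: {total}")

# Hypothetical 10 clients over a 30-minute (1800 s) run.
print(f"estimated templates in 30 minutes: {expected_templates(10, 1800, avg_ms):.0f}")
```

The estimate only holds while the server stays saturated for the whole run; idle gaps between requests would lower the total without changing the average request time.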

All of our tests were done on dual processor configurations. So, to make the charts easier to read, we omitted any 2-way labeling on the CPU names themselves.

Frankly, we were shocked when we saw the first performance results, and we ran and re-ran them to make sure our numbers were correct. In the end, they were.

The Opteron 248 setup managed to outperform Intel’s fastest, largest-cache Xeon MP by a whopping 45%. Boasting 141 ms request times, the Opteron 248 system was 12% faster than the Opteron 244 setup, indicating very good scaling with clock speed: roughly a 50% increase in performance for every 100% increase in clock speed.
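The scaling figure can be checked with a few lines of arithmetic. The sketch below is only an illustration: the 12% performance gain comes from the text, while the clock speeds of the Opteron 248 and 244 (2.2GHz and 1.8GHz) are not stated in this section and are filled in here as assumptions based on AMD's model numbering. The function name is made up for the example.

```python
def scaling_efficiency(perf_gain, clock_low_ghz, clock_high_ghz):
    """Fraction of a relative clock speed increase that shows up as a
    relative performance increase (1.0 would be perfect scaling)."""
    clock_gain = clock_high_ghz / clock_low_ghz - 1.0
    return perf_gain / clock_gain


# 12% faster (from the article); 1.8GHz -> 2.2GHz is an assumption.
eff = scaling_efficiency(0.12, 1.8, 2.2)
print(f"clock increase: {2.2 / 1.8 - 1.0:.1%}")
print(f"scaling efficiency: {eff:.0%}")
```

Under those assumed clock speeds the clock increase is about 22%, so a 12% gain works out to a little over half of the clock increase, in line with the "50% per 100%" figure quoted above.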

It is widely known that the Opteron and Xeon should not be compared on a clock for clock basis, but with the 2.0GHz Xeon MP being the fastest Xeon MP available just about a year ago, it is interesting to note the performance advantage AMD can offer over aging Intel systems.

You don’t even have to go for the top-of-the-line Opteron system to outperform an Intel Xeon platform; although not depicted here, even the Opteron 240 should be at least as fast as the 2.8GHz Xeon MP.

Intel’s 533MHz FSB Xeon 3.20GHz with its 1MB L2 cache may be a better match for the Opteron, but it is going to take much more than a 400MHz increase in clock speed to close the 45% performance gap that exists here. These Xeon parts are hard to come by, but we’d love to re-run the tests with the new 3.2GHz parts to see how they stack up (although they have smaller caches, the extra clock speed and faster FSB should help performance a bit).

Here, you can see the real-world performance advantages from another angle. Instead of looking at it as how much more responsive the Opteron server was, look at it from a standpoint of how many more people were able to access the site being hosted.

The performance, once again, speaks for itself. Just as the Athlon MP was a leader in web and database serving performance, the Opteron carries the torch for AMD this time around.

Keep in mind that web and database server applications are very sensitive to memory performance. So, although the Xeon attempts to hide larger memory access latencies with its 2MB L3 cache, the Opteron’s on-die memory controller helps improve performance significantly. The Opteron’s TLB optimizations work alongside the on-die memory controller to ensure that accesses to main memory (which will happen more frequently on the Opteron than on the Xeon because of the absence of any L3 cache) occur as quickly as possible.
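The trade-off described here, a large cache that avoids trips to memory versus a memory path that makes each trip cheaper, can be sketched with the standard average memory access time formula (hit time plus miss rate times miss penalty). The numbers below are purely illustrative placeholders, not measurements of either platform, and the function name is hypothetical.

```python
def average_access_time_ns(cache_hit_ns, miss_rate, memory_latency_ns):
    """Average memory access time: every access pays the cache hit time,
    and the fraction that misses also pays the main memory latency."""
    return cache_hit_ns + miss_rate * memory_latency_ns


# Illustrative only: a design with a large last-level cache (fewer misses)
# but a slower path to memory, versus a design with no L3 (more misses)
# but an on-die memory controller (cheaper misses).
large_cache = average_access_time_ns(5.0, miss_rate=0.02, memory_latency_ns=150.0)
fast_memory = average_access_time_ns(5.0, miss_rate=0.03, memory_latency_ns=90.0)

print(f"large cache, slower memory path: {large_cache:.1f} ns")
print(f"no L3, faster memory path:       {fast_memory:.1f} ns")
```

Which side wins depends entirely on the real miss rates and latencies of the workload; the sketch only shows why a lower memory latency can offset a higher miss rate.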

43 Comments


Zuni - Friday, December 19, 2003 - link
"That entirely depends on the application being tested, and the load on the server, and the memory on the server. If you're calling the same 10 files over and over again then they'll likely be cached. If you have a true application where someone hits a page and creates a 600 megabyte recordset"

Sure, I/O usage would depend on the type of web application being tested. However, most web applications feeding off a database backend would not pull 600 MB recordsets :) Pulling such recordsets would be a fairly silly thing to do in almost any circumstance. Our test was absolutely not hitting I/O on the web servers; the template cache is more than enough to hold the templates being tested. In fact, as I'm writing this I pulled the disk time on one of Anand's busiest web servers and it's showing less than 1% disk time :) If an application is well written, and the web server has a decent amount of memory (most do), then I/O should not be a factor in a run-of-the-mill application such as FuseTalk.

"This smacks of rhetoric. ASP and ASP.NET are also 'very good reference points' as enterprise level application environments. "

Not at all, although I can see how you could draw that conclusion. FuseTalk is currently being ported to .NET; once we have a stable working version, or another app we deem worthy of a real-world test, we’d be glad to run tests on .NET as well. I quite like the platform. This is just the beginning of our server testing and of developing real-world tests. It takes time, and a lot of it, to develop and run these tests.

Cheers
Zuni - Friday, December 19, 2003 - link
Trog, again RAMDISK was used so that the application server was free to tax the cpus as much as possible instead of waiting for data requests from the database. This has nothing to do with normal or abnormal. We aren’t testing the database here we’re testing the application server, and we wanted no bottlenecks on the backend to cause any skewing on the front end. We don’t have fibre channel backend database hardware just laying around, thus a RAMDISK was used to remove the I/O bottleneck.

Again, the database is not in question here or being tested, so it's irrelevant.

Cheers.
Falco. - Friday, December 19, 2003 - link
" 1) disk subsystem is irrelevant in a web application driven by a database backend. Most any web application server caches the templates in memory so I/O is not a factor. However, each web machine had a 36gb u320 scsi drive. "

That entirely depends on the application being tested, and the load on the server, and the memory on the server. If you're calling the same 10 files over and over again then they'll likely be cached. If you have a true application where someone hits a page and creates a 600 megabyte recordset, your disk IO is gonna come in handy right quick. Saying "I/O is not a factor" is misleading.

However, I'm glad they responded the way they did, because if you have a testbed that starts to swap or spin out to disk, that doesn't test the CPU as much. I was expecting they'd come back and say they used a test set of cached, compiled templates (which is a good thing). My question was aimed at eliminating it as a possible variable.

" 2) Not sure I understand your friends point really... Of course ColdFusion tests performance, we used FuseTalk as the web application and ColdFusion as the web application server. "

That's not entirely what I meant. Cold Fusion is a middleware layer. It's not a "web server" as per the title of the review (Cold Fusion actually can include its own webserver if you want it to).

I found the title of the review to be somewhat misleading, since it implied they were testing a web server and not an application server.

" ColdFusion in its current version is a J2EE server which allows you to write in the ColdFusion language. After its compiled its the same compiled byte code as any J2EE app. "

Correct; it's an interpreter (CF) running on an interpreted language (Java). It's a good test in that it has multiple subsystems communicating together on top of having to handle actual web serving such as setting up sessions and opening and closing connections.

" We may include .NET in the future, but J2EE is a very good reference point as its a enterprise level application environment. "

This smacks of rhetoric. ASP and ASP.NET are also 'very good reference points' as enterprise level application environments. A whole huge great pile of systems run ASP. It struck me that it would have been worthwhile to round out the review by testing a number of common web application server configurations:

* Pure J2EE
* ColdFusion
* ASP
* ASP.NET
* PHP (although this would not test IIS, as most PHP systems drive it with Apache -- mind you, a test with Apache would be equally interesting)

" Sustained load throughout the test keeping the CPU working in the 80-100% bracket. We measure performance by how long the cpu took to run each code template from start to end. (this was all detailed in the review btw.) "

This part I understood and took for granted. Interestingly if they had sustained 80-100% CPU load, this is an added shine for the AMD parts, because past 70% load on your webserver you should be adding another box or upgrading your CPUs.
TrogdorJW - Thursday, December 18, 2003 - link
Hey Zuni... who are you? Anand? Or "just" an Anandtech employee (or reviewer or whatever)? Anyway, let me clarify:

"The entire reason we used a RAMDISK was to ISOLATE the cpu not to hold it back by something else."

Exactly. Ergo, without the RAMDISK holding the database, file I/O can be a much bigger factor than CPU speed. Yes, you can get bigger and faster disk subsystems, but that's not always the way it's done. Anyway, it *sounds* to me like a more "normal" setup on the database side would end up being a major bottleneck on the overall throughput, so that either of these systems is overkill. You need a DB system with a very high I/O rate first, and then looking at these faster CPUs comes second.

On a separate thought, are there going to be some benchmarks *without* using a separate DB server (which used a RAMDISK)? I understand that your site may not run that way, but the vast majority of web servers do, I believe. If you have one server running both the database backend as well as the application services, what do the benchmarks look like? That would be something to look at, surely?
Falco. - Thursday, December 18, 2003 - link
thanks Zuni, i'll let him know and post back any replies he may have
Zuni - Thursday, December 18, 2003 - link
Falco,

1) disk subsystem is irrelevant in a web application driven by a database backend. Most any web application server caches the templates in memory so I/O is not a factor. However, each web machine had a 36gb u320 scsi drive.

2) Not sure I understand your friends point really... Of course ColdFusion tests performance, we used FuseTalk as the web application and ColdFusion as the web application server. ColdFusion in its current version is a J2EE server which allows you to write in the ColdFusion language. After its compiled its the same compiled byte code as any J2EE app. We may include .NET in the future, but J2EE is a very good reference point as its a enterprise level application environment.

3) Sustained load throughout the test keeping the CPU working in the 80-100% bracket. We measure performance by how long the cpu took to run each code template from start to end. (this was all detailed in the review btw.)
Falco. - Thursday, December 18, 2003 - link
a friend has some questions about this review :

Did they get into details about the disk subsystems on each of the web servers? I know they tried to keep the database in ramdisk (even though SQL server will do that anyway if the database is small enough).

What sort of web application were they running? I find it curious they were using a Coldfusion application as well, since that doesn't test web response as much as the cold fusion runtime services; they should have used an ASP or ASP.NET application test, at least as an additional data point.

What did they do to test web response time? Sustained load? stressed load? burst load?

that's what he's asking in a forum i frequent...
Icewind - Thursday, December 18, 2003 - link
Man, how the hell do you learn and keep up with all this server technology, both hardware and software? I'm having a hard time just keeping up with desktop technology and changes....
Zuni - Thursday, December 18, 2003 - link
skiboysteve, because that's what was sent with the systems :) 8GB of it.
Zuni - Thursday, December 18, 2003 - link
Visual, the db server hardly needed any RAM; the database was barely 50 MB :) Again, the point of the article was to eliminate the db as anything but a boundless data resource so the web application servers could process pages as fast as possible, keeping the CPU taxed. This was a 32-bit test, OS etc. We're trying to get ahold of a 64-bit JVM and we're looking at some UNIX-based 64-bit tests for another article.

Cheers!
