Original Link: http://www.anandtech.com/show/1419

Doom 3: CPU Battlegrounds

by Anand Lal Shimpi on August 4, 2004 2:14 AM EST

In a continuation of Doom 3 Week, we're bringing you the next installment of our coverage, this time focusing on CPU performance. If you haven't already, be sure to read our guide to graphics performance under Doom 3 before proceeding with this guide.

When does Doom 3 Need a Fast CPU?

We know by now that the GPU requirements of Doom 3 are quite high. The days of ultra high resolutions bringing us triple digit frame rates on mid range cards are gone with Doom 3; even cards like the Radeon 9800 Pro are best played at resolutions as low as 800x600. But is a fast GPU all you need to get the most out of Doom 3?

Remember that while your GPU will handle all of the rendering of the scenes in Doom 3, it is the CPU that handles all of the physics, artificial intelligence and 3D setup for sending vertex data to your GPU. So in order to get the most out of a fast GPU, you will also need to pair it up with a fast CPU - but how fast? The basic rule of thumb is this: the faster your GPU is, the faster your CPU will have to be to keep up with it.

Let's take the GeForce 6800 Ultra for example; as you've already seen, the GeForce 6 series is the fastest set of GPUs for running Doom 3, making it an ideal reference point for our discussion.

Below we have a graph of frame rate vs. resolution taken on a Pentium 4 Extreme Edition running at 3.4GHz with a GeForce 6800 Ultra running at 400MHz core/1.1GHz mem. The curve on the graph is what you'll want to pay attention to. If the graph were perfectly flat, as in there was no drop from 640x480 up to 1600x1200, our test system would be completely CPU limited (or bound by something other than the GPU). On the flip side, if the graph showed a clearly negative slope then we would have a much more GPU limited scenario, where the burden of rendering more pixels was not masked by an overly slow CPU.
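The flat-versus-sloped reading described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 5% cutoff and the frame rates are assumptions made for the example, not values taken from our testing.

```python
def relative_drop(fps_by_resolution):
    """Fractional frame-rate drop from the lowest to the highest resolution."""
    # Sort by pixel count so the first entry is the lowest resolution.
    ordered = sorted(fps_by_resolution.items(),
                     key=lambda item: item[0][0] * item[0][1])
    lowest_res_fps = ordered[0][1]
    highest_res_fps = ordered[-1][1]
    return (lowest_res_fps - highest_res_fps) / lowest_res_fps


def classify(fps_by_resolution, flat_threshold=0.05):
    """Call a nearly flat curve 'CPU limited' and a sloped one 'GPU limited'.

    flat_threshold is an arbitrary illustrative cutoff (an assumption).
    """
    drop = relative_drop(fps_by_resolution)
    return "CPU limited" if drop <= flat_threshold else "GPU limited"


# Placeholder frame rates for illustration only; these are NOT measured results.
flat_curve = {(640, 480): 100.0, (800, 600): 100.0, (1024, 768): 99.0}
sloped_curve = {(640, 480): 100.0, (1024, 768): 70.0, (1600, 1200): 35.0}

print(classify(flat_curve))
print(classify(sloped_curve))
```

A real curve usually sits somewhere in between, so a single cutoff like this is a simplification; looking at where the curve starts to bend is more informative than one overall number.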

In this particular case we see that at resolutions below 1280x1024 the GeForce 6800 Ultra is primarily CPU limited, making all of the lower resolutions perform identically to one another. It isn't until we hit 1280x1024 and 1600x1200 that there is a significant performance drop off. So it's clear that if you have a GeForce 6800 Ultra, pairing the chip up with a fast CPU is quite important. But what about on a slower card like a Radeon 9800 Pro?

Here we have a completely different graph, where not even at 800x600 is the card CPU bound. With a Radeon 9800 Pro, having a fast CPU doesn't help as much since you are mostly GPU limited, especially at higher resolutions. This doesn't mean that you can pair up a Radeon 9800 Pro with a 1.4GHz Celeron and be fine, but it does mean that a Pentium 4 Extreme Edition is going to be overkill for your 9800 Pro.

When investigating CPU performance under Doom 3 it's clear that we'll want to use a GeForce 6800 Ultra to put as much stress on the CPU as possible, but we've also looked at how slower cards like the Radeon 9800 Pro react to CPU speed improvements as you'll see on the next page.

How does CPU Speed Impact Graphics Performance?

For the most part, we ultimately make our purchasing decisions based on price. If we have $200 to spend on a processor, it doesn't matter how fast an Extreme Edition runs our apps and games unless it sells for $200. It's the price point that determines what our options are, and then we look at the best performer at that price point to make our final decision. But when upgrading, it's sometimes difficult to know when to upgrade various components - especially a CPU.

If you have a 2.4GHz Pentium 4, is it worth it to upgrade to a 3.4GHz P4 in order to get greater performance in Doom 3? Or is your 2.4GHz P4 paired up just fine with an NVIDIA 6800GT? This next set of graphs is designed to help you see the type of CPU scaling you can expect out of a high end card like the GeForce 6800 Ultra, or a slower card like the Radeon 9800 Pro.

The graphs below are frame rate vs. clock speed graphs, taken using a Pentium 4 C and varying the clock speed on a single platform. Once again it's the curve of the graph that you want to look at; the steeper the slope of the curve is, the more benefit you'll get from having a faster CPU. The flatter the curve is, the less benefit you'll get from having a faster CPU.

Although we're only showing two cards here, you can extrapolate performance of faster and/or slower cards pretty easily. Using our Doom 3 Graphics Guide you should know what cards are faster or slower than the two we're representing here; then just remember that a faster card will have a steeper (more CPU dependent) curve, while a slower card will have a flatter (less CPU dependent) curve.

Also keep in mind that the scaling will be relatively similar for both AMD and Intel platforms; we chose to stick with only a single platform here in the interest of time as well as keeping these pages simple.

First up, we have the 6800 Ultra at 800x600:

Here we have a decently steep curve, probably the steepest it will get since we're dealing with one of the fastest GPUs at a relatively low resolution. The move from a 2.4GHz to a 3.2GHz processor resulted in a 21% increase in performance. Considering that this came from a 33% increase in clock speed, it is safe to say that at lower resolutions the more money you put into a faster CPU, the higher your Doom 3 performance will be on a 6800 Ultra.

At higher resolutions the burden shifts to the GPU as is evident by the change in the slope of the curve. Now we have a distinctly more flat curve, with only a 13% difference between the fastest and the slowest CPUs - it's not insignificant, but definitely not huge.

Looking at the 9800 Pro at 800x600 we see a curve that closely resembles the 6800 Ultra's curve at 1280x1024, once again with a 13% gain seen from the 2.4GHz processor to the 3.2GHz part.
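One way to put these three results on a common footing is to divide the performance gain by the clock speed gain. The short Python sketch below does that arithmetic with the percentages quoted above; the helper names are made up for the example, and it assumes the 13% figure at 1280x1024 covers the same 2.4GHz to 3.2GHz range.

```python
def percent_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


def scaling_efficiency(perf_gain_pct, clock_gain_pct):
    """Fraction of the clock speed increase that shows up as frame rate."""
    return perf_gain_pct / clock_gain_pct


clock_gain = percent_increase(2.4, 3.2)  # 2.4GHz -> 3.2GHz

# Performance gains quoted in the text, in percent.
gains = {
    "6800 Ultra @ 800x600": 21.0,
    "6800 Ultra @ 1280x1024": 13.0,
    "9800 Pro @ 800x600": 13.0,
}

print(f"clock speed increase: {clock_gain:.1f}%")
for setup, gain in gains.items():
    print(f"{setup}: {scaling_efficiency(gain, clock_gain):.2f}")
```

By this measure roughly 63% of the extra clock speed turns into frame rate in the most CPU bound case, and about 39% in the other two.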

Although the rest of our CPU tests use the 6800 Ultra, the standings and degrees of performance improvement will apply to other graphics cards as well. As we've just seen, at 1280x1024 the GeForce 6800 Ultra scales much like a Radeon 9800 Pro at 800x600 - keep that in mind as we compare CPUs under id's latest and most impressive 3D game to date.

The Battlegrounds

For all of our benchmarks we used Doom 3's built in timedemo functionality. To benchmark Doom 3 yourself simply do the following while in Doom 3:

Bring up the console by hitting: CTRL + ALT + ~
Type: timedemo demo1

Then hit return and Doom 3's timedemo will run. The average frame rate for the demo will be reported after the run is complete. We ran all of our tests three times, disregarding the first score and taking the higher of the remaining two scores. We disregarded the first score because the first time the demo runs there is a lot of pausing as the demo gets cached; the remaining two runs are generally within 1% of one another.
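If you record your own runs, the scoring rule described above is easy to apply by hand or with a few lines of Python. The sketch below is only an illustration: the function names are hypothetical and the frame rates are placeholders, not results from our testing.

```python
def reported_score(runs):
    """Drop the first (caching) run and keep the higher of the other two."""
    if len(runs) != 3:
        raise ValueError("expected exactly three timedemo runs")
    _first, second, third = runs
    return max(second, third)


def spread_percent(a, b):
    """Difference between two scores as a percentage of the larger one."""
    return abs(a - b) / max(a, b) * 100.0


# Placeholder frame rates for illustration only; NOT results from the article.
example_runs = [48.0, 60.0, 60.3]

print(reported_score(example_runs))
print(spread_percent(example_runs[1], example_runs[2]) < 1.0)
```

The second check mirrors the "within 1%" observation: if your second and third runs differ by much more than that, something in the background is probably interfering with the benchmark.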

Performance Test Configuration

Processor(s):
  Socket-939 & Socket-754 Athlon 64/64 FX CPUs
  Socket-754 & Socket-A Sempron CPUs
  Socket-A Athlon XP CPUs
  Socket-478 Pentium 4 & Celeron CPUs
RAM: 2 x 512MB OCZ 3500 Platinum Ltd (2:3:3:7)
Hard Drive(s): Seagate 120GB 7200 RPM (8MB Buffer)
Video AGP & IDE Bus Master Drivers:
  Intel Chipset Driver
  NVIDIA nForce Drivers: 4.27
Video Card(s):
  NVIDIA GeForce 6800 Ultra
  ATI Radeon 9800 Pro
Video Drivers:
  ATI Catalyst 4.7
  NVIDIA ForceWare 61.77
Operating System(s): Windows XP Professional SP1
Motherboards:
  Intel 875P
  NVIDIA nForce3
  NVIDIA nForce2 Ultra

Battle 1: Prescott vs. Northwood

The first battle of our Doom 3 CPU Comparison occurs between the two Pentium 4 cores: Prescott and Northwood.

You may remember from our review of Prescott that the new 90nm core was hard pressed to outperform its 130nm Northwood predecessor; its longer pipeline and similar clock speeds held it back in most performance tests. However, Prescott does have one major advantage over Northwood: twice the L1-D and L2 cache. In other games the added cache has not been able to do much for Prescott, but let's see how that changes under Doom 3:

How the tables have turned - Prescott is actually faster than Northwood for a change, and at the same clock speed. A 7% performance advantage over the regular Pentium 4 3.2C is not too shabby for Prescott, but how can we be sure that the performance advantage is solely due to the cache size advantage? Look at the Extreme Edition.

The 3.2GHz Extreme Edition shares the same core as Northwood, but features a 2MB on-die L3 cache, and manages to outperform Northwood and Prescott by 15% and 7% respectively. These first benchmarks foreshadow what is soon to come and bring about a realization that Doom 3 is quite possibly the most memory/cache dependent game we've ever benchmarked.
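As a quick sanity check, the two Extreme Edition margins should agree with the Prescott vs. Northwood margin quoted earlier. The Python sketch below works that out from the percentages in the text alone; the variable names are made up for the example and no frame rates are assumed.

```python
# Extreme Edition leads quoted in the text, expressed as ratios.
ee_over_northwood = 1.15  # 15% faster than Northwood
ee_over_prescott = 1.07   # 7% faster than Prescott

# If EE = 1.15 * Northwood and EE = 1.07 * Prescott,
# then Prescott = (1.15 / 1.07) * Northwood.
prescott_over_northwood = ee_over_northwood / ee_over_prescott
implied_lead_pct = (prescott_over_northwood - 1.0) * 100.0

print(f"implied Prescott lead over Northwood: {implied_lead_pct:.1f}%")
```

The implied lead of about 7.5% lines up with the roughly 7% advantage we measured directly, allowing for rounding in the quoted percentages.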

The standings remain the same at higher resolutions, but as we've seen the 6800 Ultra becomes mostly GPU limited at 1280x1024, reducing the impact of these processors. The Extreme Edition still manages to be 10% faster than Northwood, and Prescott continues to hold a lead over Northwood, just not as much at the higher resolution.

The last thing we wanted to look at in the Northwood vs. Prescott battle was how the two CPUs scaled - as we mentioned in our original Prescott review, we expected Prescott to do a better job scaling with clock speed than Northwood and we are beginning to see examples of that here in Doom 3:

Although it's ever-so-slight, Prescott's performance does seem to scale with clock speed better than Northwood.

The winner of this battle is clearly Prescott; we're sure Intel's happy that there's finally a situation where Northwood isn't in the limelight.

Battle 2: AMD vs. AMD

Next up on the fight list for today is AMD, competing against themselves. AMD has gained quite a bit of popularity over the past year and needless to say it is because of their extremely strong showing with the Athlon 64. That being said, with three different flavors of Athlon 64s (Socket-754, Socket-939 and FX) and a lot of users still hanging onto their Athlon XPs, AMD's performance breakdown is an important one to look at.

We know by now that Doom 3 is very cache intensive, which in turn means it's very memory intensive - bringing us to our first evaluation: Athlon 64 vs. Athlon 64 FX. The Athlon 64 FX once held two advantages over the Athlon 64: a larger 1MB L2 cache and a dual channel memory controller. Now with the introduction of Socket-939, the Athlon 64 also has dual channel capabilities, but only on newer chips, not the older Socket-754 offerings. As you can guess, there are two comparisons we'd like to make here: Dual Channel vs Single Channel as well as the impact of cache size on performance.

First we'll tackle dual vs single channel memory interfaces; for this test we used a Socket-939 Athlon 64 FX-51 (2.2GHz/1MB L2) as our Dual Channel platform, and a Socket-754 Athlon 64 3400+ (2.2GHz/1MB L2) as our Single Channel platform. You can see that other than the sockets, the two chips are identical, making this the perfect single vs dual channel memory comparison:

Memory bandwidth doesn't seem to be something that the regular Athlon 64 needs much more of, as the move to dual channel DDR400 only offered a 3% increase in performance. At higher resolutions, the performance advantage would become even smaller. We didn't really expect anything different here, as the dual channel memory interface never really helped the Athlon 64 - definitely not as much as it did the Pentium 4.

Next, let's see how cache size influences Athlon 64 performance under Doom 3. For this comparison we have four chips to compare in two separate sets. We use an Athlon 64 2800+ and a Sempron 3100+, both clocked at 1.8GHz but featuring a 512KB and a 256KB L2 cache respectively. We also have an Athlon 64 FX-53 and an Athlon 64 3800+, both clocked at 2.4GHz but featuring a 1MB and a 512KB L2 cache respectively. While the four numbers are not directly comparable to one another, the two comparisons do give us an idea of improvements due to cache size varying from 256KB up to 1MB on the Athlon 64:

Looking at the Athlon 64 vs Sempron we see that there's barely a 5% performance difference between the two identically clocked chips, indicating that although a 256KB L2 cache isn't big enough for Doom 3, a 512KB L2 cache doesn't help out that much more. The on-die memory controller helps ensure that despite the small cache size, performance remains very competitive, as we will soon see in our fourth battle.

Our 512KB vs. 1MB L2 cache size comparison reveals something interesting: it's not that a 512KB L2 cache isn't big enough for Doom 3 (which is the case with the Pentium 4), it's that the Athlon 64's on-die memory controller effectively masks the need for a large L2 cache in Doom 3. Going to a 1MB L2 cache results in less than a 4% performance improvement, much less than what we saw with Prescott vs. Northwood.

Bottom line: cache size is far less important on the Athlon 64 than on the Pentium 4, as you would expect, thanks to the on-die memory controller.
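The reasoning here can be illustrated with the textbook average memory access time model: hit time plus miss rate times miss penalty. The Python sketch below is a simplified illustration, not a model of either processor; every number in it (hit time, miss rates, miss penalties) is a hypothetical value chosen only to show the direction of the effect.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Textbook average memory access time, in arbitrary cycles."""
    return hit_time + miss_rate * miss_penalty


# All values below are hypothetical and chosen purely for illustration.
HIT_TIME = 5.0
MISS_RATE_SMALL_CACHE = 0.04  # assumed miss rate with the smaller L2
MISS_RATE_LARGE_CACHE = 0.02  # assumed miss rate with the larger L2


def saving_from_larger_cache(miss_penalty):
    """Cycles saved per access by moving to the larger cache."""
    small = average_access_time(HIT_TIME, MISS_RATE_SMALL_CACHE, miss_penalty)
    large = average_access_time(HIT_TIME, MISS_RATE_LARGE_CACHE, miss_penalty)
    return small - large


# Assumed miss penalties: slow path to memory vs. a low latency on-die controller.
high_latency_saving = saving_from_larger_cache(200.0)
low_latency_saving = saving_from_larger_cache(60.0)

print(high_latency_saving > low_latency_saving)
```

With these made-up inputs the larger cache saves several times more per access when memory is slow to reach than when it is close by, which is the same pattern we see between the Pentium 4 and the Athlon 64.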

Battle 3: Celeron D vs. Sempron

AMD just recently introduced their new low-end branded CPU: Sempron. As we've already seen, it does a wonderful job of outperforming Intel's Celeron D; however, the margin of improvement is far less than what we're used to seeing thanks to a much improved Celeron D. How does the Sempron fare under Doom 3? Let's find out:

Remember that there are two flavors of Sempron, a K7 and a K8 version. The K7 version performs just like an Athlon XP since it's basically a Thoroughbred core with its 256KB L2 cache. The biggest performance limiter to the K7 based Sempron 2800+ is that it has no on-die memory controller, bringing its performance down pretty far.

But the K8 based Sempron 3100+ does some serious damage, outperforming the Celeron D 335 by an incredible 53%. For a budget Doom 3 system, you will want to steer far away from a Celeron D and towards the Sempron. As we've seen before, the cache size dependency of Doom 3 on the Pentium 4 is significant and even though the Celeron D and the Sempron both only have a 256KB L2 cache, the Sempron's on-die memory controller helps reduce the impact of such a small cache on Doom 3 performance.

The winner here is Sempron.

Battle 4: AMD vs. Intel

Coming to our fourth and final battle of this Doom 3 comparison, we have the comparison we've all been waiting for: AMD vs. Intel. For this comparison we benchmarked quite a few different CPUs. In the graphs, all K7 based processors (e.g. Athlon XP, Sempron 2800+) are colored green, all K8 based processors (e.g. Athlon 64, Athlon 64 FX, Sempron 3100+) are colored orange and all Intel processors are colored blue:

The first thing you'll notice is that the top of the chart is dominated almost exclusively by AMD Athlon 64 and Athlon 64 FX processors. Even the Athlon 64 3400+ manages to outperform the almighty Pentium 4 Extreme Edition 3.4, not to mention that the FX-53 distances itself from Intel's fastest by no less than 18%.

Making our way further down the chart we see that the Athlon 64 3000+ is quite possibly the best buy for excellent Doom 3 performance, weighing in right between the two Extreme Edition processors at less than 20% of the cost of those chips.

Next we have all of the Pentium 4s that manage to offer middle of the road performance under Doom 3, although we do see the K8 based Sempron 3100+ wedged in between the Prescott 3.2 and Prescott 3.0GHz CPUs.

Finally at the bottom we have the Athlon XPs as well as the lonely Celeron D, which is barely saved from a disappointing last place showing.

The standings remain the same at 1280x1024 as you can see below:

Final Words

It can be argued that as much of a GPU hog as Doom 3 is, it is just as demanding on your CPU. The recipe for success is much simpler on the CPU side, however: Doom 3 needs cache and lots of it.

On the Pentium 4 side of things, if you've got anything with less than 512KB of cache it's time for you to upgrade. Prescott owners will be happy that their chips are finally faster than Northwood in something thanks to larger caches.

AMD owners have much more of a reason to rejoice: the Athlon 64 runs Doom 3 perfectly. It's almost as if the game was built to run best on an Athlon 64; maybe AMD should invest some marketing dollars in their own "The way it's meant to be played" campaign. And to make things even better, you don't even have to have the fastest Athlon 64 to get great performance; even the meager 3000+ manages to offer performance equal to that of Intel's Extreme Edition Pentium 4 at a much lower cost. The key to AMD's success is the on-die memory controller; with lower latency memory accesses than the competing Intel solutions, Doom 3 sees system memory as one big cache and drives performance up considerably. It is also the on-die memory controller that makes cache size less of an issue on the Athlon 64, while too small of a cache seems to make or break performance with the Pentium 4.

The Athlon XP is much less impressive under Doom 3 thanks to its lack of an on-die memory controller; unless you have a Barton based Athlon XP, it may be time to bite the bullet and upgrade to an Athlon 64. That being said, the entry level Sempron 3100+ offers very competitive performance at a price point that's low enough to make the transition to a Socket-754 platform relatively painless.

If you are lucky enough to own any of the GeForce 6 series cards and play at resolutions lower than 1280x1024, rest assured that money spent on a faster CPU is money well spent. If you happen to have a slower card, something along the lines of a Radeon 9800 Pro or even a regular X800, your system is far less CPU bound and you may want to go with a more middle-of-the-road CPU in order to maximize performance without spending needlessly.

In the end, the winner of the final battle is clear: the AMD Athlon 64 is the processor for Doom 3.
