UT3 Teaches us about CPU Architecture

For our first real look at Epic's Unreal Engine 3 on the PC, we've got a number of questions to answer. First and foremost we want to know what sort of CPU requirements Epic's most impressive engine to date commands.

Obviously the GPU side will be more important, but it's rare that we get a brand-new engine with which to evaluate CPU architecture, so we took this opportunity to do just that. While we've had other UE3-based games in the past (e.g. Rainbow Six: Vegas, Bioshock), this is the first Epic-created title at our disposal.

The limited benchmarking support of the UT3 Demo beta unfortunately doesn't make it the best CPU test. The built-in flybys don't have much in the way of real-world physics: the CPU spends its extra time calculating spinning weapons and the position of the camera flying around, but there are no explosions or damage to take into account. The final game may have a different impact on CPU usage, but we'd expect things to get more CPU-intensive, not less, in real-world scenarios. We'll do the best we can with what we have, so let's get to it.

Cache Scaling: 1MB, 2MB, 4MB

One thing we noticed about the latest version of Valve's Source engine is that it is very sensitive to cache sizes and memory speed in general, which is important to realize given that there are large differences in cache size between Intel's three processor tiers (E6000, E4000 and E2000).

The Pentium Dual-Core chips are quite attractive these days, especially thanks to how overclockable they are. If you look back at our Midrange CPU Roundup you'll see that we fondly recommend them, especially when mild overclocking gives you the performance of a $160 chip out of a $70 one. The problem is that if newer titles are more dependent on larger caches, then these smaller-L2 CPUs become less attractive; you can always overclock them, but you can't add more cache.

To see how dependent Unreal Engine 3 and the UT3 demo are on low-latency memory accesses, we ran Core 2 processors with 4MB, 2MB and 1MB of L2 cache at 1.8GHz to compare performance scaling.

[Charts: L2 Cache Comparison - DM-ShangriLa, DM-HeatRay, vCTF-Suspense]

From 1MB to 2MB there's a pretty hefty 12 - 13% increase in performance at 1.8GHz, but the difference from 2MB to 4MB is slightly more muted at 4 - 8.5%. An overall 20% increase in performance simply due to L2 cache size on Intel CPUs at 1.8GHz is impressive. We note the clock speed simply because the gap will only widen at higher clock speeds; faster CPUs are more data hungry and thus need larger caches to keep their execution units adequately fed.
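As a quick sanity check on the "overall 20%" figure, the two cache steps can be compounded multiplicatively. This is only a sketch: the function name is ours, and pairing the low ends and the high ends of the two quoted ranges is an assumption, since the per-map numbers live in the charts rather than the text.

```python
def compound_gain(*step_gains_pct):
    """Combine successive percentage gains multiplicatively.

    Each argument is a gain in percent (e.g. 12.0 for +12%).
    Returns the overall gain in percent.
    """
    total = 1.0
    for gain in step_gains_pct:
        total *= 1.0 + gain / 100.0
    return (total - 1.0) * 100.0


# Assumed pairings of the ranges quoted in the text:
# 1MB -> 2MB: 12-13%, 2MB -> 4MB: 4-8.5%
low_end = compound_gain(12.0, 4.0)
high_end = compound_gain(13.0, 8.5)

print(f"1MB -> 4MB overall gain: {low_end:.1f}% to {high_end:.1f}%")
```

Under those assumptions the overall 1MB-to-4MB gain falls in a band that brackets the roughly 20% figure quoted above.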

In order to close the performance deficit, you'd have to run a Pentium Dual-Core at almost a 20% higher frequency than a Core 2 Duo E4000, and around a 35% higher frequency than a Core 2 Duo E6000 series processor.
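One rough way to read these numbers: the frequency increase needed is larger than the performance gap itself, which implies that frame rate does not scale one-for-one with clock speed. The sketch below assumes a simple power-law model (performance proportional to frequency raised to some exponent); that model, the function names and the 12.5% midpoint are our own illustrative assumptions, not something measured in this article.

```python
import math


def implied_clock_scaling(perf_gap_pct, freq_gain_pct):
    """Exponent s in perf ~ freq**s implied by one data point.

    perf_gap_pct: performance deficit to close, in percent.
    freq_gain_pct: clock increase said to close it, in percent.
    """
    return math.log(1 + perf_gap_pct / 100) / math.log(1 + freq_gain_pct / 100)


def clock_gain_needed(perf_gap_pct, scaling):
    """Clock increase (percent) needed to close a gap under perf ~ freq**scaling."""
    return ((1 + perf_gap_pct / 100) ** (1 / scaling) - 1) * 100


# Assumed inputs: ~12.5% gap vs. 2MB parts (midpoint of 12-13%),
# ~20% gap vs. 4MB parts, with the clock increases quoted in the text.
vs_e4000 = implied_clock_scaling(12.5, 20.0)
vs_e6000 = implied_clock_scaling(20.0, 35.0)

print(f"Implied scaling exponent vs. E4000: {vs_e4000:.2f}")
print(f"Implied scaling exponent vs. E6000: {vs_e6000:.2f}")
```

An exponent of 1.0 would mean perfect scaling with clock speed, in which case the required overclock would simply equal the performance gap.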


72 Comments

  • decalpha - Wednesday, October 17, 2007

    Why not compare the CPUs with similar cache size, since Athlon 64 X2 6000 has 2MB cache whereas the Core 2 Duo E6850 has 4MB and cache size does seem to matter.
  • drebo - Wednesday, October 17, 2007

    I think it's even more relevant to point out that clock-for-clock comparisons have been worthless for a very long time, and only seem to have come back on this site now that Intel has a more efficient pipeline.
  • PrinceGaz - Wednesday, October 17, 2007

    The X2 6000+ actually has 2x 1MB cache, which in most cases is worse than 2MB shared, so the cache situation is even worse for AMD in the comparison that was performed.
  • drebo - Thursday, October 18, 2007

    Well, cache size in general is less important for AMD processors, as the path from CPU to RAM is much, much quicker. It would be interesting (and very, very difficult to gauge) what the difference would be. This is most likely why they left AMD off of the cache comparison charts. It's impossible, due to far too dissimilar architectures, to isolate ONLY the memory subsystems, which is what a cache comparison would be attempting to do.

    Cache misses on an Intel architecture are far more expensive than on AMD's architecture. But, without otherwise identical chips, there's simply no way to make a comparison.
  • bloc - Wednesday, October 17, 2007

    I think if you compared the 8600 gts and x2600 xt, the perf would be pretty close, with the x2600 xt being $50 cheaper.

    The architecture is there. Some games like cod4 hasn't taken advantage of it yet.
  • ImmortalZ - Wednesday, October 17, 2007

    The second set of graphs on page 3 seem to be all confused. Mixed up title text?

    Also, regarding the ATI midrange part, surely you guys have heard about the 2900PRO?
  • JarredWalton - Wednesday, October 17, 2007

    P3 graphs fixed. I'd imagine trying to get a 2900 Pro for testing is proving more difficult than anticipated. I know looking online that the few places I've seen that list them are out of stock.
  • ImmortalZ - Wednesday, October 17, 2007

    Well, it's easy to test a 2900PRO. Underclock a 2900XT to 600Mhz core and 1600Mhz memory and test away! :D (there are 512MB GDDR3 and 1GB GDDR4 versions, so...). Just change the price from 389.99 to 249.99 for the 512MB and 319.99 for the 1GB.

    Of course, I'd personally wait for the 2950s to show up - single slot coolers are teh win :P
  • Bremen7000 - Wednesday, October 17, 2007

    What about the page 6 graphs? Am I missing something or are they lacking something?
  • RobberBaron - Wednesday, October 17, 2007

    Second that. The second set of charts on page 6 is Intel CPu's only. Little confusing
