Original Link: http://www.anandtech.com/show/872
Next-Generation Game Performance with the Unreal Engine: 15-way GPU Shootout
by Anand Lal Shimpi on January 24, 2002 3:09 AM EST
The problem with gaming benchmarks is that we're almost always measuring the performance of today's cards with yesterday's games. For the longest time, Quake III Arena was a very popular benchmark as very few configurations could run it with all of the visual options turned on at high frame rates. Today, the situation is much different; it isn't uncommon to see 200+ frame rates under Quake III Arena.
There have been a number of other gaming benchmarks that have risen to the occasion in recent history. Games such as Serious Sam introduced a very configurable engine to the world of benchmarking, effectively giving us a very flexible tool to measure performance with. There were other titles that didn't fare as well, such as Unreal Tournament which, in the vast majority of cases, ended up being more CPU limited than graphics limited.
The one thing that all of these benchmarks have in common is that they are of currently available or previously popular games. We can extrapolate from their results how a particular card or family of GPUs will perform in future games but we never really know until those games become available. It's already widely known that you can't even begin to treat a video card upgrade as an investment; with 6-month product cycles you have to play a guessing game as to whether you'll be buying adequate power for the future.
Case in point would be the release of the GeForce3 almost 12 months ago. The card was more than enough for the games that were out at the time, but many bought it on the premise that it would give them superior performance in forthcoming DirectX 8 titles. Fast forwarding to the present day, there are still no major titles that require the DX8 features of the GeForce3, and those who purchased the card early on saw much cheaper and sometimes higher performing alternatives arrive just 6 months later.
But we're simply talking about things from the standpoint of the end-user. The situation is even more frustrating from the standpoint of the developer. The developers want to make their games as incredible as possible, but they need the hardware, driver and API support to do so. At the same time, hardware manufacturers such as ATI and NVIDIA aren't going to waste precious die-space implementing features that won't be used for another 2 years. It's the classic chicken and the egg syndrome; luckily, in this case, both ATI and NVIDIA have supplied the eggs with their DX8-compliant GPUs.
Then there's the issue of drivers. Is a graphics vendor going to spend their time optimizing for features that won't be used in games for another 6 - 12 months or will they focus on the benchmarks and games that are currently being played? The answer is obvious; but where does this leave the developers? These are the people that are using currently available cards to test and build their next-generation game engines. If currently available drivers won't run their next-generation engines then they are forced to either wait for the hardware manufacturers to fix their drivers (an effort that doesn't provide immediate results to the hardware vendor) or to highly optimize for a very small subset of cards or even one particular vendor. With only two major manufacturers left, ATI and NVIDIA, there is an unspoken understanding that the developers deserve as much, if not more, attention than the end-users. Although this hasn't always been the case, we're now hearing that both ATI and NVIDIA are equally responsive to developer driver issues.
We've just outlined a number of problems that currently exist in the way graphics is handled from both a reviewer's standpoint and from a developer-relations standpoint. But what to do about it?
Luckily one of the most prominent game developers happens to be in AnandTech's back yard and we've been working with them on addressing some of these very issues.
Epic Games is a name that any gamer should be familiar with. Titles ranging from Epic Pinball all the way through Unreal Tournament have been the products of a very talented group of developers who are hard at work every day on building bigger and better games for the industry.
Epic's current brainchild is of course the Unreal Engine. This engine has come a long way since the debut of Unreal as it is a constantly evolving entity. Most of you are probably also familiar with Unreal Tournament, an equally popular evolution of the Unreal Engine. And more recently there have been a number of other games announced such as Unreal Tournament II, Unreal II and Unreal Championship (Xbox) that are to use the current build of the Unreal Engine.
You should keep in mind that the current build of the Unreal Engine has actually come a long way since the days of Unreal Tournament. While Unreal Tournament left off at build 436, Epic is now up to build 848 on the Unreal Engine.
One of the things that has changed considerably -- mostly because we have powerful enough hardware to allow this -- is the number of polygons that the engine now pushes in any given scene. With the introduction of ATI's Radeon 8500 and NVIDIA's GeForce3 Ti 500 and the promise of even more powerful cards just months away, developers such as Epic are more than encouraged to experiment with even more detailed scenes.
It is the latest build of the Unreal Engine that Epic has used to spin off what is being tentatively called the Unreal Performance Test 2002. This benchmark is designed to stress today's systems, from CPU to GPU, using a game engine that will be widely used in the very near future. With Epic's help, we'll be able to provide you all with an idea of exactly how your systems will perform in games that will be coming out in the near future. It's time to put the manufacturers' claims to the test and find out what platforms are truly built for the next-generation games.
If you'll remember, the original Unreal engine was geared for software rendering, but it has evolved into something that is entirely geared for hardware accelerated 3D rendering, making it a much better benchmark of GPUs. Eventually Epic will have a version of the Unreal Performance Test 2002 that encompasses every engine feature destined for Unreal Tournament II, among other titles.
Project: GPU Benchmark
First and foremost, it should be mentioned that this will be an evolving project. Epic's next-generation game engine is constantly evolving and so will this series of articles. We will provide follow-ups if/when Epic adds new features to their engine that introduce new performance challenges, or if manufacturers release new drivers that are further optimized for the features that we're testing. We'll also entertain reader requests, as always, for other avenues to continue this investigation. Our primary goal is to provide you with the most accurate and thorough information possible, and thus with enough demand (and time permitting), we will explore other performance questions that can be answered through the use of this benchmark.
The benchmark is currently based on build 848 of the Unreal Engine. Although the engine doesn't yet take advantage of any custom pixel/vertex shader programs, it is considerably more GPU limited than any previous test we've run.
To give you an idea of the complexity of the engine and the benchmark, we asked two of Epic's finest, Daniel Vogel and Tim Sweeney, to give us a brief overview of the Unreal Performance Test 2002:
"The Unreal Performance Test 2002 currently consists of a flyby through an outdoor terrain map with as many as 100,000 triangles. Due to the nature of the flyby and extended visibility in outdoor areas the flyby is quite memory bandwidth intensive on the GPU. To achieve a realistic CPU load 14 bots walking around pseudo-randomly have been added to the map.
For statistics gathering 2510 frames are rendered as fast as possible at a locked game framerate of 30 fps with the first 10 frames being disregarded."
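As a rough illustration of the statistics gathering described above, the sketch below computes an average frame rate after discarding the first 10 frames. It is not Epic's code; the function name `average_fps` and the input format (a list of per-frame render times in seconds) are assumptions made for this example.

```python
WARMUP_FRAMES = 10  # frames disregarded at the start, per the quoted description


def average_fps(frame_times_s, warmup=WARMUP_FRAMES):
    """Return the average frames per second over the frames after warm-up.

    frame_times_s: per-frame render times in seconds (assumed input format).
    """
    measured = frame_times_s[warmup:]
    if not measured:
        raise ValueError("no frames left after discarding warm-up frames")
    # Average fps is frames rendered divided by total time taken, which is
    # not the same as averaging each frame's instantaneous fps.
    return len(measured) / sum(measured)
```

With 2510 frames in total, this leaves 2500 frames contributing to the reported average.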
To put things into perspective, the average polygon counts are 50 to 100 times those of Unreal Tournament. We were also told that texture usage has increased approximately 8 fold. Courtesy of DXT texture compression, the overall memory bandwidth usage has only gone up by a quarter of that.
The benchmark is not a synthetic test; rather, it's designed to simulate performance on the latest builds of the Unreal Engine. This is not your average flyby: it has sharp turns simulating what every hardcore first person shooter player will be doing, and it has a number of other features to give you a real flavor of a game benchmark.
The benchmark is not publicly available. We received it courtesy of Epic Games and cannot redistribute it.
This first article is merely designed to set the stage for the rest to come, and thus we tested under a single configuration. The Athlon XP 2000+ was chosen to make the bottleneck as much the video card as possible; in future articles we'll look at the role the CPU and the rest of the system will play in this test as well.
We ran the benchmark at five resolutions with v-sync disabled: 640x480x32, 800x600x32, 1024x768x32, 1280x1024x32 and 1600x1200x32.
Windows XP Test System
|CPU(s)|AMD Athlon XP 2000+|
|Motherboard(s)|ASUS A7V266-E (KT266A)|
|Memory|256MB DDR266 Crucial DDR SDRAM|
|Hard Drive|IBM Deskstar 30GB 75GXP 7200 RPM Ultra ATA/100|
|Video Card(s)|Radeon 7500 64MB DDR; GeForce3 Ti 500 64MB DDR; Hercules 3D Prophet 4500 64MB SDR|
|Ethernet|Linksys LNE100TX 100Mbit PCI Ethernet Adapter|
|Operating System|Windows XP Professional|
If you look at a performance comparison chart at 640 x 480 in most of today's games, everything above the GeForce2 Pro performs about the same; that's clearly not the case here.
The card that had potential actually ends up coming out on top here as the Radeon 8500 managed to slightly edge out the GeForce3 Ti 500. We did run into a problem with the Radeon 8500 and that was a flickering fog issue which ATI is aware of. This should be fixed via a driver update.
Next we see that the Radeon 7500 does a lot better than it has done in most of today's games, coming within 4% of the GeForce3. We can also derive from this that early adopters of the GeForce3, although they spent quite a bit, still own one of the top three performers in this benchmark.
There's also not much difference between the GeForce2 Ultra and the GeForce3 Ti 200 because the GeForce2 Ultra has a much higher GPU clock than the GeForce3 Ti 200. The main advantages the Ti 200 offers here are the improved memory controller as well as the programmable pixel and vertex shader support.
Although this is supposed to be a GPU shootout, we included the Kyro II which, as you should be aware, does not have any hardware T&L capabilities and thus relies on the host CPU for all triangle setup and vertex processing. When paired with a fast CPU like the Athlon XP 2000+, the Kyro II actually outperforms all of the lower end cards and comes dangerously close to ATI's original Radeon. With a fast CPU it seems like the Kyro II could be quite the low-end competitor, but remember we're only looking at 640 x 480. Let's crank up the resolution a bit and see what happens.
As the resolution increases, the performance difference between the Radeon 8500 and the GeForce3 Ti 500 increases even more, now to over 14%. The ability to scale better with resolution could be because HyperZ II is more efficient than NVIDIA's visibility subsystem, but it's very difficult to hypothesize at this point.
The standings don't change too much although we do notice that the GeForce3 Ti 200 jumps ahead of the ATI Radeon 7500, now leading it by 15%.
The Kyro II moved up in the standings and is now only 6% away from the GeForce2 Ti 200.
If you look at the benchmarks from a playable performance standpoint, the current GeForce2 MX series is entirely too slow and the original Radeon is on the borderline at 800 x 600.
By far the most popular resolution, 1024 x 768, creates an even more interesting situation. Instead of widening, the performance gap between the Radeon 8500 and the GeForce3 Ti 500 is now closing. The two are separated by only 6% as additional limitations are thrown into the mix with 64% more pixels to be rendered on the screen than at 800 x 600.
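The 64% figure follows directly from the pixel counts of the two resolutions. The short sketch below works through the arithmetic for each step between the five tested resolutions; the helper names `pixel_count` and `percent_more_pixels` are made up for this example.

```python
# The five resolutions used in this article, as (width, height).
RESOLUTIONS = [(640, 480), (800, 600), (1024, 768), (1280, 1024), (1600, 1200)]


def pixel_count(resolution):
    """Number of pixels rendered per frame at the given resolution."""
    width, height = resolution
    return width * height


def percent_more_pixels(lower, higher):
    """Percentage increase in pixels when moving from `lower` to `higher`."""
    return (pixel_count(higher) / pixel_count(lower) - 1) * 100


# Print the increase for each step up in resolution.
for lower, higher in zip(RESOLUTIONS, RESOLUTIONS[1:]):
    print(f"{lower[0]}x{lower[1]} -> {higher[0]}x{higher[1]}: "
          f"{percent_more_pixels(lower, higher):.0f}% more pixels")
```

Going from 800 x 600 (480,000 pixels) to 1024 x 768 (786,432 pixels) works out to roughly 64% more pixels per frame.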
The Kyro II is also just as fast as the GeForce2 Ti 200 now, again courtesy of the extremely fast host CPU. In future articles we will investigate exactly what speed CPU the Kyro II needs in order to remain competitive with these hardware T&L equipped solutions from ATI and NVIDIA.
The one abnormality is that regardless of how many times we ran the benchmark, the Radeon 7500 always dropped significantly in performance. As you'll see from the next chart, the abnormality is limited to this resolution, so it is most likely a quirk with the drivers.
If we set 30 fps as our minimum requirement for playability, everything slower than the GeForce3 Ti 200 fails to make the cut. It's time to squeeze some more performance out of drivers.
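For readers who want to apply the same cutoff to their own numbers, here is a minimal sketch of the two calculations used throughout this article: the percentage lead of one card over another, and the list of cards that meet a 30 fps minimum. The function names and the sample values are placeholders for illustration only; they are not results from this benchmark.

```python
PLAYABLE_FPS = 30.0  # minimum average frame rate treated as playable here


def percent_lead(faster_fps, slower_fps):
    """Percentage by which the faster result leads the slower one."""
    return (faster_fps / slower_fps - 1) * 100


def playable_cards(results, cutoff=PLAYABLE_FPS):
    """Return the names of cards at or above the cutoff, fastest first.

    results: mapping of card name to average fps (assumed input format).
    """
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return [name for name, fps in ranked if fps >= cutoff]


# Placeholder numbers for illustration only; NOT results from this article.
example = {"card_a": 42.0, "card_b": 31.5, "card_c": 24.0}
print(playable_cards(example))
```

Note that the lead is computed relative to the slower card, so a card running at 33 fps leads one running at 30 fps by 10%.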
Once again we have a virtual tie between the Radeon 8500 and the GeForce3 Ti 500. The rest of the results begin to suffer as 1280 x 1024 is entirely too stressful for the vast majority of these cards.
If we continue to use 30 fps as our cutoff mark, 1280 x 1024 isn't a playable resolution for anything slower than a GeForce3 Ti 500 or Radeon 8500 in this test.
This final resolution is more of a novelty than anything else, as you won't find it too practical to run at 1600 x 1200 if you own any of these cards. Granted, there will be faster cards available by the time this engine is introduced in a publicly available game, but 1600 x 1200 is still a bit excessive.
While this is normally where our conclusion goes, it should be said that the results we've shown you today are by no means conclusive. We're dealing with an engine that is still in development and over time, with the aid of the major graphics players, performance should improve. What we have seen here, however, is a good starting point for what we can expect to see from the next generation of games running at the highest possible settings. The performance can and will improve with efforts from ATI and NVIDIA as well as Epic's own efforts to make sure that owners of some of the slower cards will not be hung out to dry.
With that said we have discovered a few interesting things through this initial investigation of performance:
1) The Radeon 8500 does exceptionally well, only losing out to the GeForce3 Ti 500 at the highest resolution. The only question that remains is whether the performance will remain high with the fog issues fixed.
2) The original GeForce3, although very expensive for early adopters, ends up being one of the top performers out of today's GPUs. It's good to know that not all year-old technology is obsolete.
3) If these results are any indication, moving forward, GPU clock will actually play a much more important role than it has in the past. A delicate balance between GPU clock and memory clock, such as what was made possible on the GeForce3, will be ideal to obtain.
4) The low-end ATI and NVIDIA solutions don't perform very well at all, thus making it worthwhile to upgrade to one of the higher end cards.
There are still some questions that remain unanswered, including how effective hardware T&L actually is on slower systems. This is just one of many topics that we will be covering as our investigation continues.