Original Link: http://www.anandtech.com/show/2076
PhysX Performance Update: City of Villains
by Ryan Smith on September 7, 2006 6:00 AM EST
The last time we took a look at physics performance under City of Villains utilizing a PhysX card, the code was still in beta, and the results for the PhysX card were lackluster at best. What we found was that while utilizing a PhysX card did indeed enable CoV to generate a good deal of extra eye-candy, the cost came at an immense performance drop, especially to minimum framerates. At the time, AGEIA had promised that performance under CoV would improve between then and when the improvements were actually put into the Issue 7 patch. That patch has since shipped, and we are back to take a look at just what the PhysX card is capable of doing given more refined code.
Since the beta version of Issue 7 was in testing, Cryptic has also made several changes to CoV to better allow benchmarking; previously, FRAPS was the only way to do a repeatable test using higher physics settings. CoV's benchmark mode now properly renders additional physics based on the game's settings, allowing for more controlled testing and, more importantly, testing against the live server where we can no longer copy characters. Also, the highest physics mode, previously only allowed with PhysX hardware installed, can now run completely in software, giving true apples-to-apples testing where all physics effects are the same.
This latter point is particularly important, since AGEIA was able to confirm a few earlier theories we had on the game. In the highest quality mode (various physics modes have replaced the items slider), both the hardware and software physics engines end up doing the same physics routines with the same precise formulas. Meanwhile, in any of the lower quality modes the software engine is allowed to make approximations in return for faster calculations, and at the lowest quality settings the software mode renders fewer overall effects. We'll look at how the various settings affect performance and visual quality under CoV, so keep in mind that only the maximum quality mode is identical for both software and hardware physics.
Dual core performance is also something that has seen some changes since the Issue 7 beta, as the game is now capable of doing physics calculations on a separate core. As we'll see in a second, the game benefits tremendously from a second core, which may not be in AGEIA's best interests.
What does all of this mean for physics performance then under CoV? Let's take a look.
When Things Go Wrong & The Test
It's worth noting that at one point this article had a very different tone to it, based on the benchmark results we had gotten. We recently replaced the test rig we run the PhysX articles on with a newer machine, and as the component of note, threw in an ATI Radeon X1900XTX as it's generally the fastest single-slot card we have that isn't an SLI card (i.e. the GeForce 7950 GX2). City of Heroes/Villains is a game we long ago established was CPU limited, so the choice in video cards is largely academic, or so we thought.
It turns out that ATI's latest drivers have a problem with City of Heroes/Villains where the performance of the game chokes when using some of the advanced rendering features. Fortunately, we caught this issue, but for a while we were wondering why the PhysX card wasn't helping as much as expected. It's always interesting to discover where the bottlenecks are in different benchmarks. City of Heroes is great for testing components since it's an OpenGL title that isn't built on the Doom3 engine. Unfortunately, this is bad timing for ATI, given their recent OpenGL improvements on games that do use the Doom3 engine.
Due to the issues with the ATI card, we switched to testing with a 7950 GX2 instead. Here are the details of our test setup.
|PhysX Testbed Configuration|
|CPU:|Intel Core 2 Extreme X6800 (2.93GHz/4MB)|
|Motherboard:|Intel D975XBX (LGA-775)|
|Chipset Drivers:|Intel 188.8.131.527 (Intel)|
|Hard Disk:|Seagate 7200.7 160GB SATA|
|Memory:|Corsair XMS2 DDR2-800 4-4-4-12 (1GB x 2)|
|Video Card:|NVIDIA GeForce 7950GX2|
|Video Drivers:|NVIDIA ForceWare 91.33|
|OS:|Windows XP Professional SP2|
As we mentioned previously, even with demo recording, City of Villains does not recreate the exact same scenario every time (similar to Oblivion), so any benchmarks under CoV will still have a higher variance than normal. In order to get a better approximation of performance, every benchmark has been run multiple times, but a 1-2 FPS difference is still within the normal variance. As we noted in our previous article, CoV is a CPU-limited application, so we'll stick with the highest resolution.
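As a rough illustration of why repeated runs matter here, the sketch below averages several runs per configuration and only treats a gap as meaningful when it exceeds the 1-2 FPS noise floor mentioned above. The framerate samples and the helper names (`summarize`, `is_meaningful`) are hypothetical and are not measurements or tooling from this article.

```python
from statistics import mean

# Hypothetical per-run average framerates (FPS); NOT data from this article.
runs_a = [41.2, 40.1, 42.0, 40.7]
runs_b = [43.9, 44.8, 43.1, 44.6]

# Run-to-run variance the text treats as normal (upper end of 1-2 FPS).
NOISE_FPS = 2.0


def summarize(runs):
    """Return (mean, spread), where spread is max - min across the runs."""
    return mean(runs), max(runs) - min(runs)


def is_meaningful(runs_x, runs_y, noise=NOISE_FPS):
    """True if the gap between the mean framerates exceeds the noise floor."""
    return abs(mean(runs_x) - mean(runs_y)) > noise


print(summarize(runs_a))
print(is_meaningful(runs_a, runs_b))
```

A simple threshold like this is deliberately crude; with only a handful of runs it is a sanity check, not a statistical test.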
We ran these tests with one core and with both cores of our Core 2 Extreme X6800 enabled in order to illustrate the impact of the second core.
With the changes offered by Cryptic, the tables have turned significantly on the performance results. Now that we can use the same physics quality with both software and hardware processing, it's easy to see just how loaded down the processor is when trying to perform the exact same physics calculations as the PhysX card. For our dual-core X6800 at full speed, using the PhysX card offers an average framerate about 25% higher than without, and on our X6800 running with just one active core, that jumps up to a 60% difference.
It's interesting to note the impact of the second core in relation to the PhysX card. While the card clearly offers a performance boost in all situations, the disparity between the boost on a dual-core system and on a single-core system is rather remarkable. Against a single-core system, the PhysX card is increasing performance by about 60%, but adding a second core instead of a PhysX card adds an even larger 77%. Granted, our single-core setup is slightly contrived, as Intel won't be selling single-core Core architecture CPUs at this speed grade, but there are numerous systems out there currently running single-core processors where a dual-core processor could simply be dropped in.
Switching to the minimum framerate, the results are nearly identical. The physics calculations being done for CoV have a large impact on performance, resulting in the PhysX card boosting performance by 35% on the dual-core setup. However, the overhead from other parts of the engine starts to catch up with the single-core setup, where the performance boost is only 25%. Again, relative to the single-core configuration running software PhysX algorithms, adding the second core offers a larger boost than adding a PhysX card. The dual-core software PhysX calculations are 11% faster in average frame rates and 20% faster in minimum frame rates than a single-core processor with hardware PhysX.
Given the performance boost offered by the second core, we thought that perhaps the PhysX card could be more competitive on a dual-core system with a slower clock speed. We downclocked our X6800 to 1.86GHz to find out.
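The average-framerate percentages quoted above for the 2.93GHz runs can be chained together to see how the four configurations relate. This is a small sketch that uses only the rounded percentages from the text, with single-core software physics normalized to 1.0; the variable and function names are made up for illustration, and the derived figures inherit the rounding of the inputs.

```python
# Average-framerate gains at 2.93GHz, taken from the rounded figures in the text.
SINGLE_SW = 1.00               # single core, software physics (baseline)
SINGLE_HW = SINGLE_SW * 1.60   # +60% from adding the PhysX card
DUAL_SW = SINGLE_SW * 1.77     # +77% from adding the second core
DUAL_HW = DUAL_SW * 1.25       # +25% from the PhysX card on two cores


def pct_faster(a, b):
    """Percentage by which configuration a outruns configuration b."""
    return (a / b - 1.0) * 100.0


# Dual-core software physics versus single-core hardware physics.
print(round(pct_faster(DUAL_SW, SINGLE_HW)))  # 11, matching the quoted 11%

# Both upgrades together, relative to the single-core software baseline.
print(round(pct_faster(DUAL_HW, SINGLE_SW)))  # 121
```

The second figure is an extrapolation from rounded inputs, so it should be read as approximate rather than as a measured result.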
Since reducing the clockspeed of our CPU doesn't directly reduce the performance of the PhysX card, we find that our earlier theory is correct and that the PhysX card adds a hearty 50% with two cores functioning, and 73% with one core. Both results are greater than what we saw with a 2.93 GHz CPU. However, adding the second core also has a larger effect, this time at about 85% faster than a single core, although the dual-core setup is now only 6% faster than a single core with hardware PhysX.
Returning to the minimum framerates, the numbers are somewhat similar to our earlier minimums, with a boost of 35% against a dual-core setup and 41% against the single-core, indicating that the speedup offered by the PhysX card is somewhat greater even with the additional overhead encountered by the slower single-core setup. Adding the second core without the PhysX card comes in at a 57% performance increase relative to a single core, which is 17% faster than the single core with a PhysX card.
Last but not least, how about the performance when the CPU is allowed to cheat a bit and use approximate physics calculations? As we mentioned before, only the highest physics mode pushes software and hardware to the same level of precision. The next-highest mode allows the software to run faster by using estimations rather than doing the full calculations, as well as generating a lower amount of debris.
Even given the chance to cheat some, we still can't match the performance of using a PhysX card with a dual-core processor. The spread does drop to only 20%, but the setup using the PhysX card clearly takes the lead. It is worth noting however that the quality between these two modes is extremely close, even with the CPU using only physics approximations.
The highest quality mode still offers a superior amount of "stuff", and in turn the PhysX card does a better job of running that mode. The two modes are not easy to tell apart, though, and even less so in the middle of combat where such minor differences can easily be missed. If you compare the high-quality software PhysX scores on the dual core processor to the maximum quality hardware PhysX scores on the same setup, high-quality software PhysX ends up being slightly faster. Dropping the physics quality down a notch instead of buying a PhysX card is a perfectly viable alternative if extra performance is desired without spending any more money.
These numbers also show that even with our Core 2 Extreme processor, CoV is CPU limited both with and without the PhysX card. (You can also see this by referring back to the resolution scaling with ATI and NVIDIA cards on the previous page, for the 7900 GTX performance remains flat despite increasing resolutions.) What this means is that in these situations, the PhysX card is only useful up to the point where it's offloaded all the physics processing it can. Past that the bottleneck remains the CPU.
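One way to picture this ceiling is an Amdahl's-law-style estimate: if offloading physics yields a given speedup, it implies an upper bound on how much of the frame time physics occupied in the first place. The sketch below is a simplification and not a model used in this article. It assumes a purely CPU-bound frame, that the card removes all physics work from the CPU, and that offloading adds no overhead; the function names are hypothetical.

```python
def physics_share(speedup_from_offload):
    """Upper-bound fraction of CPU frame time spent on physics.

    Assumes the card takes ALL physics work off the CPU with no added
    overhead, so new_time = old_time * (1 - share).
    """
    return 1.0 - 1.0 / speedup_from_offload


def best_case_speedup(share):
    """Largest speedup available from offloading a given share of frame time."""
    return 1.0 / (1.0 - share)


# Single core at 2.93GHz: the text quotes roughly a 60% average gain.
print(physics_share(1.60))

# Dual core at 2.93GHz: roughly a 25% average gain.
print(physics_share(1.25))
```

Under these assumptions the single-core figure implies physics took at most about three eighths of the frame time, and no PPU, however fast, could push the gain past `best_case_speedup` of that share. The dual-core case is coarser still, since physics there already runs on its own core.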
While we still think that on paper, AGEIA's PhysX technology has promise, we find ourselves in a situation similar to where we were a few months ago with Ghost Recon and the City of Villains beta. On the positive side, AGEIA and Cryptic have fixed many of our earlier complaints about using PhysX hardware acceleration under City of Villains. The game no longer stutters, and installing a PhysX card doesn't immediately result in a drop in performance (though this has much to do with the new way of adjusting physics settings and other optimizations Cryptic has made in how the game handles large quantities of debris).
However, what AGEIA has failed to fix, and what ultimately ends up counting the most, is value. There's no question that a PhysX card will give better performance in City of Villains at the highest settings, and at times that difference can be pretty sizable. But as we found out, using a slightly lower quality physics mode will result in graphics similar to the highest mode where the PhysX card shines, but at performance levels nearly equal to the PhysX card just by using a dual-core CPU. When we're talking about adding a $250 PPU to a system that's already using a $1000 CPU and a $500 GPU, the PhysX card is a sensible way to boost performance by a good measure without spending all that much more. Under a tighter budget, that's a much harder thing to recommend.
For someone currently using a single-core CPU and working with a limited budget, an upgrade to a dual-core CPU is going to be superior to adding the PhysX card in City of Villains, and it's going to be much more useful in games and applications where the PhysX card can't be used. Similarly, someone with a slower dual-core CPU may not see gains as great going to a faster CPU as they would with a PhysX card, but unless the extra eye-candy and a few frames is what you desire, the faster CPU will still be more useful overall. Ultimately, since City of Villains is CPU limited, the PhysX card is only the best upgrade when a system's CPU performance can't be improved much; otherwise, the effect of the CPU holding back performance is just too great to ignore.
In the end, we still must question the usefulness of a product like the PhysX card on a game like City of Villains. Physics processing is an embarrassingly parallel problem, the kind of problem that the hardware industry has gotten extremely good at solving, first with video and GPUs, and now with physics and PPUs. But this technology must be put to a better use if AGEIA wants to drive more adoption and influence an era of video games that can make a massive jump in the number of physics interactions used. Adding more particles to games like City of Villains -- and then only to certain segments of the game -- is really demeaning for the hardware; it's not changing gameplay and it's not something at which a PPU can universally excel versus other options such as additional CPU cores, even given the sheer advantage of hardware optimized for these calculations over a general-purpose processor.
We still believe that PPUs can influence and improve gaming, but it must be done in ways that make sense in improving gameplay, or at the very least improve things in ways not related to gameplay such that there's a clear benefit over the alternatives. City of Villains and similar games won't be able to sell the PPU (with the exception of wealthy die-hard fans); that will have to come in the following years as games like CellFactor take root and implement the PPU in a more pervasive manner to create an undeniably more immersive experience.
If AGEIA could even promise a consistent 25% performance boost over software mode in several games, more people would be interested in the technology. The problem is, many games are completely GPU limited, so faster physics processing doesn't necessarily help. What we end up with is the classic chicken vs. egg problem: without a large installed base of PPUs, how many developers will even bother to try and take advantage of the technology, and without software that takes advantage of the technology, who will want to buy the hardware? ATI and NVIDIA are also working on trying to accelerate physics with their GPUs, and every gamer will already have that technology available. GPU-based physics calculations might not be a good solution in games that are already GPU limited, but faster processors and PPUs won't help such games either.