PhysX Performance

The first program we tested is AGEIA's test application. It's a small scene with a pyramid of boxes stacked up. The only thing it does is shoot a ball at the boxes. We used FRAPS to get the framerate of the test app with and without hardware support.

AGEIA Test Application


With the hardware, we were able to get a better minimum and average framerate after shooting the boxes. Obviously this case is a little contrived. The scene is only CPU limited with no fancy graphics going on to clutter up the GPU: just a bunch of solid colored boxes bouncing around after being shaken up a bit. Clearly the PhysX hardware is able to take the burden off the CPU when physics calculations are the only bottleneck in performance. This is to be expected, and doing the same amount of work will give higher performance under PhysX hardware, but we still don't have any idea of how much more the hardware will really allow.

Maybe in the future AGEIA will give us the ability to increase the number of boxes. For now, we get 16% higher minimum frame rates and 14% higher average frame rates by using the AGEIA PhysX card over just the FX-57 CPU. Honestly, that's a little underwhelming, considering that the AGEIA test application ought to be providing more of a best case scenario.

Moving to the slower Opteron 144 processor, the PhysX card does seem to be a bit more helpful. Average frame rates are up 36% and minimum frame rates are up 47%. The problem is, the target audience of the PhysX card is far more likely to have a high-end processor than a low-end "chump" processor -- or at the very least, they would have an overclocked Opteron/Athlon 64.
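For readers who want to check figures like these against their own runs, the percentage gains quoted above are plain relative differences against the CPU-only result. Here is a minimal sketch in Python; the function name and the sample numbers are our own illustration and are not the measured results from the charts.

```python
def percent_gain(baseline_fps: float, accelerated_fps: float) -> float:
    """Return the frame rate gain over the baseline, in percent."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return (accelerated_fps - baseline_fps) / baseline_fps * 100.0


# Hypothetical frame rates for illustration only (not our measurements):
# 50 FPS on the CPU alone versus 58 FPS with the add-in card.
print(round(percent_gain(50.0, 58.0), 1))  # 16.0
```

A negative result simply means the accelerated configuration was slower than the baseline.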

Let's take a look at Ghost Recon and see if the story changes any.

Ghost Recon Advanced Warfighter

This next test will be a bit different. Rather than testing the same level of physics with hardware and software, we are only able to test the software at a low physics level and the hardware at a high physics level. We haven't been able to find any way to enable hardware quality physics without the board, nor have we discovered how to enable lower quality physics effects with the board installed. These numbers are still useful as they reflect what people will actually see.

For this test, we looked at a low quality setting (800x600 with low quality textures and no AF) and a high quality setting (1600x1200 with high quality textures and 8x AF). We recorded both the minimum and the average framerate. Here are a couple screenshots with (top) and without (bottom) PhysX, along with the results:



Ghost Recon Advanced Warfighter


Ghost Recon Advanced Warfighter


The graphs show some interesting results. We see a lower framerate in all cases when using the PhysX hardware. As we said before, installing the hardware automatically enables higher quality physics. We can't get a good idea of how much better the PhysX hardware would perform than the CPU, but we can see a couple facts very clearly.

Looking at the average framerate comparisons shows us that when the game is GPU limited there is relatively little impact for enabling the higher quality physics. This is the most likely case we'll see in the near term, as the only people buying PhysX hardware initially will probably also be buying high end graphics solutions and pushing them to their limit. The lower end CPU does still have a relatively large impact on minimum frame rates, however, so the PPU doesn't appear to be offloading a lot of work from the CPU core.

The average framerates under low quality graphics settings (i.e. shifting the bottleneck from the GPU to another part of the system) show that high quality physics has a large impact on performance behind the scenes. The game has either become limited by the PhysX card itself or by the CPU, depending on how much extra physics is going on and where different aspects of the game are being processed. It's very likely that this is more of a bottleneck on the PhysX hardware, as the difference between the 1.8 and 2.6 GHz CPUs with PhysX is less than the difference between the two CPUs using software PhysX calculations.
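The reasoning here is a simple scaling comparison: if a faster CPU buys less extra performance with the card installed than without it, something other than the CPU is probably holding things back. The sketch below illustrates that check in Python. The helper names and the frame rates are hypothetical, and this is a rough heuristic rather than a proper profiling method.

```python
def cpu_scaling(slow_cpu_fps: float, fast_cpu_fps: float) -> float:
    """Fractional frame rate gain from moving to the faster CPU."""
    return (fast_cpu_fps - slow_cpu_fps) / slow_cpu_fps


def scales_less_with_ppu(software_pair, hardware_pair) -> bool:
    """True if CPU speed matters less with the card than without it."""
    return cpu_scaling(*hardware_pair) < cpu_scaling(*software_pair)


# Hypothetical average frame rates as (slow CPU, fast CPU) pairs.
# These are made-up values for illustration, not our measured results.
software = (40.0, 56.0)   # software physics
hardware = (30.0, 33.0)   # hardware physics

print(f"software scaling: {cpu_scaling(*software):.0%}")  # 40%
print(f"hardware scaling: {cpu_scaling(*hardware):.0%}")  # 10%
print(scales_less_with_ppu(software, hardware))           # True
```

Keep in mind that in our Ghost Recon test the two configurations also run different physics workloads, so a check like this can only hint at where the bottleneck sits.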

If we shift our focus to the minimum framerates, we notice that when physics is accelerated by hardware our minimum framerate is very low at 17 frames per second regardless of the graphical quality - 12 FPS with the slower CPU. Our test is mostly that of an explosion. We record slightly before and slightly after a grenade blowing up some scenery, and the minimum framerate happens right after the explosion goes off.
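To make the distinction between minimum and average framerate concrete, here is a rough sketch of how both can be derived from a log of frame timestamps. It assumes a list of cumulative timestamps in milliseconds, similar to what a frame-time logging tool might produce; the function name and the synthetic data are ours and do not come from the benchmark run.

```python
from collections import Counter


def fps_stats(timestamps_ms):
    """Return (minimum, average) frames per second over whole seconds.

    Frames are counted per one-second bucket measured from the first
    timestamp. The trailing partial second is ignored.
    """
    if not timestamps_ms:
        raise ValueError("no frames recorded")
    start = timestamps_ms[0]
    per_second = Counter(int((t - start) // 1000) for t in timestamps_ms)
    full_seconds = int((timestamps_ms[-1] - start) // 1000)
    if full_seconds == 0:
        raise ValueError("need at least one full second of data")
    counts = [per_second.get(s, 0) for s in range(full_seconds)]
    return min(counts), sum(counts) / len(counts)


# Synthetic three-second run (illustrative only): smooth at 50 FPS,
# one slow second at 20 FPS (think "right after the explosion"),
# then back to 50 FPS.
timestamps = (
    list(range(0, 1000, 20))
    + list(range(1000, 2000, 50))
    + list(range(2000, 3000, 20))
    + [3000]
)
print(fps_stats(timestamps))  # (20, 40.0)
```

One slow second is enough to drag the minimum down to 20 while the average stays at 40, which is why a brief dip after an explosion shows up so strongly in the minimum numbers.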

Our working theory is that when the explosion starts, the debris that goes flying everywhere needs to be created on the fly. This can be done on the CPU, on the PhysX card, or in both places, depending on exactly how the situation is handled by the software. It seems most likely that the slowdown is the cost of instancing all these objects on the PhysX card and then moving them back and forth over the PCI bus and eventually to the GPU. It would certainly be interesting to see if a faster connection for the PhysX card - like PCIe x1 - could smooth things out, but that will most likely have to wait for a future generation of the hardware.

We don't feel the drop in frame rates really affects playability as it's only a couple frames with lower framerates (and the framerate isn't low enough to really "feel" the stutter). However, we'll leave it to the reader to judge whether the quality gain is worth the performance loss. In order to help in that endeavor, we are providing two short videos (3.3MB Zip) of the benchmark sequence with and without hardware acceleration. Enjoy!

One final note is that judging by the average and minimum frame rates, the quality of the physics calculations running on the CPU is substantially lower than it needs to be, at least with a fast processor. Another way of putting it is that the high quality physics may be a little too high quality right now. The reason we say this is that our frame rates are lower -- both minimum and average rates -- when using the PPU. Ideally, we want better physics quality at equal or higher frame rates. Having more objects on screen at once isn't bad, but we would definitely like to have some control over the number of additional objects.


101 Comments


  • Walter Williams - Friday, May 5, 2006 - link

    Too bad not even quadcores will be able to outperform the PPU when it comes to physics calculations.

    You all need to wait for another game that uses the PPU to be reviewed before jumping to any conclusions.

    The developers of GRAW did a very poor job compared to the developers of CellFactor. This will come to light soon.
  • saratoga - Friday, May 5, 2006 - link

    quote:

    Too bad not even quadcores will be able to outperform the PPU when it comes to physics calculations.


    quote:

    jumping to any conclusions.


    Haha.
  • DerekWilson - Friday, May 5, 2006 - link

    just because something is true about the hardware doesn't mean it will ever come to fruition in the software. it isn't jumping to a conclusion to say that the PPU is *capable* of outperforming a quadcore cpu when it comes to physics calculations -- that is a fact, not an opinion due to the architecture.

    had the first quote said something about games that use physics performing better on one rather than the other, that would have been jumping to conclusions.

    the key here is the developers and how the problem of video game physics maps to hardware that is good at doing physics calculations. there are a lot of factors.
  • saratoga - Saturday, May 6, 2006 - link

    quote:

    it isn't jumping to a conclusion to say that the PPU is *capable* of outperforming a quadcore cpu when it comes to physics calculations -- that is a fact, not an opinion due to the architecture.


    It's clearly an opinion. For it to be a fact, it would have to be verifiable. However, no one has made a quad core x86 processor, and no game engine has been written to use one.

    The poster simply stated his opinion and then blasted other people for having their own opinions, all without realizing how stupid it sounded which is why it was such a funny post.

  • Walter Williams - Saturday, May 6, 2006 - link

    I did not blast anybody...

    It is a simple fact that a dedicated processor for X will always outperform a general purpose processor when doing X from a hardware perspective.

    Whether or not the software yields the same results is another question. Assuming that the PCI bus is not holding back performance of the PPU, it is incredibly unlikely that quad core CPUs will be able to outperform the PPU.
  • saratoga - Saturday, May 6, 2006 - link

    quote:

    It is a simple fact that a dedicated processor for X will always outperform a general purpose processor when doing X from a hardware perspective.


    Clearly false. General purpose processors sometimes beat specialized units. It depends on resources available to each device, and the specifics of the problem. Specialization is a trade off. If your calculation has some very specific and predictable quality, you might design a custom processor that exploits some property of your problem effectively enough to overcome the billions Intel and AMD poured into developing a general purpose core. But you may also end up with an expensive processor that's left behind by off the shelf components :)

    Furthermore, this statement is hopelessly general. What if X is running Linux? Or any other application that x86 CPUs are already specialized for? Can you really conceive of an even specialized processor for this task that didn't resemble a general purpose CPU? Doubtful.

    quote:

    Assuming that the PCI bus is not holding back performance of the PPU, it is incredibly unlikely that quad core CPUs will be able to outperform the PPU.


    You're backpedaling. You said:

    "Too bad not even quadcores will be able to outperform the PPU when it comes to physics calculations."

    Now you're saying they might be able to do it. So much for jumping to conclusions?
  • JarredWalton - Friday, May 5, 2006 - link

    People keep mentioning Cell Factor. Well and good that it uses more physics calculations as well as the PhysX card. Unfortunately, right now it requires the PhysX card and it's looking like 18 MONTHS (!) before the game ships - if it ever gets done. We might as well discuss how much better Havok FX is going to be in The Elder Scrolls V. :p

    For the first generation, we're far more likely to see a lot of the "tacked on" approach as companies add rudimentary support to existing designs. We also don't have a way to even compare Cell Factor with and without PhysX. Are they hiding something? I mean, 15% faster under the AGEIA test demo using a high-end CPU isn't looking like much. If they allow CellFactor to run on software (CPU) PhysX calculations, get that to support SMP systems for the calculations, and we get 2 FPS in Cell Factor, that's great. It shows the PhysX card does something. If they allow all that and the dual core chips end up coming very close to the same performance, we've got a problem.

    Basically, right now we're missing real world (i.e. gaming) apples-to-apples comparisons. It's like comparing X800 to 6800 cards under games that only supported SM3.0 or SM1.1 - better shaders or faster performance, but X800 could have come *much* closer with proper SM2.0 support.
  • NastyPope - Friday, May 5, 2006 - link

    AMD & Intel could license the PhysX technology and include a dedicated PhysX (or generic multi-API) core on their processors and market them as game processors. Although some science and technology applications could make use of it as well. Being on-die would reduce latency and provide a huge amount of bandwidth between cores. Accessing system memory could slow things down but still be much faster than data transfers across a PCI bus.
  • Woodchuck2000 - Friday, May 5, 2006 - link

    The reason that framerates drop with the PhysX card installed is simply that the graphics card is given more complex effects to render.

    At some point in the future, games will be coded with a physics API in mind. Interactions between the player and the game environment will be through this API, regardless of whether there is dedicated hardware available.

    It's a truth universally acknowledged that graphics are better left to the graphics card - I don't hear anyone suggesting that the second core in a duallie system should perform all the graphics calculations. I think that in time, this will be true of physics too.

    Once the first generation of games built from the ground up with a physics API in mind come out, this will sell like hot cakes.
  • Calin - Friday, May 5, 2006 - link

    The reason frame rates drop is that with the physics engine, the video card has more to render - in the grenade explosion images, the "with physics" image has tens of dumpster bits flying, while in the "non physics" image there are hardly a couple.
    If the scenes had been of the same complexity, I wonder how much faster the AGEIA card would be.
