PhysX Performance

The first program we tested is AGEIA's test application: a small scene with a pyramid of stacked boxes, where the only interaction is shooting a ball at them. We used FRAPS to record the framerate of the test app with and without hardware support.
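As an aside on methodology, deriving the average and minimum numbers from a frame time log is simple enough. Below is a minimal sketch in C++, assuming a two-column log of frame index and cumulative milliseconds with any header line already stripped; the file name and parsing details are our own illustration, not anything FRAPS-specific. (FRAPS' benchmark summary reports these figures directly; the sketch just shows where they come from.)

    #include <cstdio>
    #include <cstddef>
    #include <vector>
    #include <algorithm>

    // Sketch: derive average and minimum fps from a frame time log.
    // Assumes comma-separated "frame index, cumulative ms" rows.
    int main()
    {
        std::FILE* f = std::fopen("frametimes.csv", "r"); // hypothetical name
        if (!f) return 1;

        std::vector<double> stamps; // cumulative timestamps in ms
        int frame;
        double ms;
        while (std::fscanf(f, "%d,%lf", &frame, &ms) == 2)
            stamps.push_back(ms);
        std::fclose(f);
        if (stamps.size() < 2) return 1;

        // Average fps: frames rendered divided by total elapsed time.
        double totalSec = (stamps.back() - stamps.front()) / 1000.0;
        double avgFps = (stamps.size() - 1) / totalSec;

        // Minimum fps: the single worst frame-to-frame gap.
        double worstGapMs = 0.0;
        for (std::size_t i = 1; i < stamps.size(); ++i)
            worstGapMs = std::max(worstGapMs, stamps[i] - stamps[i - 1]);
        double minFps = (worstGapMs > 0.0) ? 1000.0 / worstGapMs : 0.0;

        std::printf("avg %.1f fps, min %.1f fps\n", avgFps, minFps);
        return 0;
    }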

AGEIA Test Application


With the hardware, we were able to get a better minimum and average framerate after shooting the boxes. Obviously, this case is a little contrived. The scene is only CPU limited, with no fancy graphics going on to clutter up the GPU: just a bunch of solid colored boxes bouncing around after being shaken up a bit. Clearly the PhysX hardware is able to take the burden off the CPU when physics calculations are the only bottleneck in performance. This is to be expected, and doing the same amount of work will give higher performance with the PhysX hardware, but we still don't have any idea of how much more work the hardware will really allow.
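For context on what "taking the burden off the CPU" means at the API level: in the AGEIA PhysX 2.x SDK, the same rigid body scene can be created either as a software scene (simulated on the CPU) or as a hardware scene (simulated on the PPU). The sketch below uses the public SDK's names - NX_SIMULATION_SW and NX_SIMULATION_HW are real scene types - but whether AGEIA's demo toggles them exactly this way is an assumption on our part.

    #include <NxPhysics.h>

    // Sketch: the same boxes-and-ball scene, simulated in software (CPU)
    // or in hardware (PPU). Obtain the SDK object first with
    // NxCreatePhysicsSDK(NX_PHYSICS_SDK_VERSION).
    NxScene* createBoxScene(NxPhysicsSDK* sdk, bool usePPU)
    {
        NxSceneDesc sceneDesc;
        sceneDesc.gravity = NxVec3(0.0f, -9.81f, 0.0f);
        sceneDesc.simType = usePPU ? NX_SIMULATION_HW : NX_SIMULATION_SW;
        NxScene* scene = sdk->createScene(sceneDesc);
        if (!scene)
            return 0; // e.g. a hardware scene was requested but no PPU is present

        // One box of the pyramid; the demo stacks many of these.
        NxBodyDesc body;
        NxBoxShapeDesc shape;
        shape.dimensions = NxVec3(0.5f, 0.5f, 0.5f); // half-extents, in meters

        NxActorDesc actor;
        actor.shapes.pushBack(&shape);
        actor.body = &body;
        actor.density = 10.0f;
        actor.globalPose.t = NxVec3(0.0f, 0.5f, 0.0f);
        scene->createActor(actor);
        return scene;
    }

Each frame, the application then calls scene->simulate(dt) followed by scene->fetchResults(). With a hardware scene, that step runs on the PPU and only the resulting transforms come back to the CPU - exactly the work we see disappearing from the FX-57 in the numbers above.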

Maybe in the future AGEIA will give us the ability to increase the number of boxes. For now, we get 16% higher minimum frame rates and 14% higher average frame rates by using the AGEIA PhysX card over just the FX-57 CPU. Honestly, that's a little underwhelming, considering that the AGEIA test application ought to be providing more of a best case scenario.

Moving to the slower Opteron 144 processor, the PhysX card does seem to be a bit more helpful. Average frame rates are up 36% and minimum frame rates are up 47%. The problem is, the target audience of the PhysX card is far more likely to have a high-end processor than a low-end "chump" processor -- or at the very least, they would have an overclocked Opteron/Athlon 64.

Let's take a look at Ghost Recon and see if the story changes any.

Ghost Recon Advanced Warfighter

This next test will be a bit different. Rather than testing the same level of physics with hardware and software, we are only able to test the software at a low physics level and the hardware at a high physics level. We haven't been able to find any way to enable hardware quality physics without the board, nor have we discovered how to enable lower quality physics effects with the board installed. These numbers are still useful as they reflect what people will actually see.

For this test, we looked at a low quality setting (800x600 with low quality textures and no AF) and a high quality setting (1600x1200 with high quality textures and 8x AF). We recorded both the minimum and the average framerate. Here are a couple of screenshots with (top) and without (bottom) PhysX, along with the results:



Ghost Recon Advanced Warfighter (Average Frame Rates)

Ghost Recon Advanced Warfighter (Minimum Frame Rates)


The graphs show some interesting results. We see a lower framerate in all cases when using the PhysX hardware. As we said before, installing the hardware automatically enables higher quality physics. We can't get a good idea of how much better the PhysX hardware performs relative to the CPU, but we can see a couple of facts very clearly.

Looking at the average framerate comparisons shows us that when the game is GPU limited there is relatively little impact for enabling the higher quality physics. This is the most likely case we'll see in the near term, as the only people buying PhysX hardware initially will probably also be buying high end graphics solutions and pushing them to their limit. The lower end CPU does still have a relatively large impact on minimum frame rates, however, so the PPU doesn't appear to be offloading a lot of work from the CPU core.

The average framerates under low quality graphics settings (i.e. shifting the bottleneck from the GPU to another part of the system) show that high quality physics has a large impact on performance behind the scenes. The game has either become limited by the PhysX card itself or by the CPU, depending on how much extra physics is going on and where different aspects of the game are being processed. It's very likely that this is more of a bottleneck on the PhysX hardware, as the difference between the 1.8 and 2.6 GHz CPUs with PhysX is less than the difference between the two CPUs using software PhysX calculations.

If we shift our focus to the minimum framerates, we notice that when physics is accelerated in hardware our minimum framerate is very low at 17 frames per second regardless of graphical quality - 12 FPS with the slower CPU. Our test centers on an explosion: we record slightly before and slightly after a grenade blows up some scenery, and the minimum framerate occurs right after the explosion goes off.

Our working theory is that when the explosion starts, the debris that goes flying everywhere needs to be created on the fly. This can be done on the CPU, on the PhysX card, or in both places, depending on exactly how the situation is handled by the software. It seems most likely that the slowdown is the cost of instancing all of these objects on the PhysX card and then moving them back and forth over the PCI bus and eventually to the GPU. It would certainly be interesting to see if a faster connection for the PhysX card - like PCIe x1 - could smooth things out, but that will most likely have to wait for a future generation of the hardware.
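To make that theory concrete, here is a rough sketch of what a debris burst might look like through the PhysX 2.x SDK. GRAW's actual code is not public, so spawnDebris and its parameters are hypothetical; only the Nx* types and calls are real SDK interfaces.

    #include <NxPhysics.h>
    #include <cstdlib>

    // Hypothetical debris burst: creating many actors in a single frame.
    // On a hardware scene, each createActor() call also has to instance
    // state on the PPU, and every chunk's transform then crosses the PCI
    // bus each frame so the GPU can draw it.
    void spawnDebris(NxScene& scene, const NxVec3& origin, int debrisCount)
    {
        for (int i = 0; i < debrisCount; ++i)
        {
            NxBodyDesc body;
            NxBoxShapeDesc shape;
            shape.dimensions = NxVec3(0.05f, 0.05f, 0.05f); // small chunk

            NxActorDesc actor;
            actor.shapes.pushBack(&shape);
            actor.body = &body;
            actor.density = 500.0f;
            actor.globalPose.t = origin;
            NxActor* chunk = scene.createActor(actor);
            if (!chunk)
                continue; // hardware scenes can run out of resources

            // Fling the chunk outward in a semi-random direction.
            NxVec3 dir((float)std::rand() / RAND_MAX - 0.5f,
                       (float)std::rand() / RAND_MAX,
                       (float)std::rand() / RAND_MAX - 0.5f);
            dir.normalize();
            chunk->addForce(dir * 2000.0f, NX_IMPULSE);
        }
    }

A single grenade plausibly triggers dozens or hundreds of these creations in one frame, and plain PCI tops out at a theoretical 133MB/s shared across every device on the bus, which fits the momentary stall we measured right after the explosion.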

We don't feel the drop in frame rates really affects playability, as it's only a couple of frames with lower framerates (and the framerate isn't low enough to really "feel" the stutter). However, we'll leave it to the reader to judge whether the quality gain is worth the performance loss. In order to help in that endeavor, we are providing two short videos (3.3MB Zip) of the benchmark sequence with and without hardware acceleration. Enjoy!

One final note: judging by the average and minimum frame rates, the quality of the physics calculations running on the CPU is substantially lower than it needs to be, at least with a fast processor. Another way of putting it is that the high quality physics may be a little too high quality right now. The reason we say this is that our frame rates are lower -- both minimum and average -- when using the PPU. Ideally, we want better physics quality at equal or higher frame rates. Having more objects on screen at once isn't bad, but we would definitely like some control over the number of additional objects.
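What that control might look like is trivial from the developer's side; a hypothetical sketch, reusing the hypothetical spawnDebris helper from earlier (GRAW exposes no such setting today):

    // Hypothetical "physics detail" setting: scale the number of extra
    // debris objects rather than hard-coding high quality whenever a PPU
    // is detected. The values are invented for illustration.
    enum PhysicsDetail { PHYS_LOW = 50, PHYS_MEDIUM = 200, PHYS_HIGH = 800 };

    void onGrenadeExplosion(NxScene& scene, const NxVec3& origin,
                            PhysicsDetail detail)
    {
        spawnDebris(scene, origin, static_cast<int>(detail));
    }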

Comments

  • iNsuRRecTiON - Saturday, May 6, 2006 - link

    Hey,

    the ASUS PhysX card already has 256MB of RAM instead of 128MB, compared to the BFG Tech card.

    best regards,

    iNsuRRecTiON
  • fishbits - Friday, May 5, 2006 - link

    I want the physics card to be equivalent to a sound card in terms of standardization and how often I feel compelled to upgrade it. In other words, it would be upgraded far less often than graphics cards are. Putting the physics hardware on a graphics card means you would throw away (or sell at a loss) perfectly good physics capability just to get a faster GPU, or get a second card to go to SLI/Crossfire. This is a bad idea for all the same reasons you'd say putting sound card functionality on a graphics card is a bad idea.
  • Calin - Friday, May 5, 2006 - link

    Yes, you could do all kinds of nice calculations on the physics board. However, moving geometry data from the video card to the physics board for calculation and then moving it back to the video card would be shooting yourself in both feet.
    I think this could run well as an accelerator for rendering images or for 3D applications... how soon until 3DStudio, PhotoShop and so on take advantage?
  • tonjohn - Friday, May 5, 2006 - link

    quote:

    I hope that they won't need a respin to add pcie functionality but fear this may be the case.

    The pre-production cards had both PCI and PCIe support at the same time. You simply flipped the card depending on which interface you wanted to use. So I believe that the PPU offers native PCIe support and that BFG and ASUS could produce PCIe boards today if Ageia would give them permission to.
    quote:

    I agree with the post that in volume, this kind of chip could find its way onto 3d graphics cards for gaming.

    Bad idea. Putting the PPU onboard with a GPU means higher costs all around (longer PCBs, possibly more layers, more RAM). Also, the two chips would be fighting for bandwidth, which is never a good thing.

    Higher costs and lower performance = a bad idea.

    FYI: I have a BFG PhysX card.
  • saratoga - Friday, May 5, 2006 - link

    Actually, putting this on the GPU core would be much cheaper. You'd save by getting rid of all the duplicated hardware: DRAMs, memory controller, power circuitry, PCI bridge, cooling, PCB, etc.

    Not to mention you'd likely gain a lot of performance by having a PCIe x16 slot and an on-die link to the GPU.
  • Calin - Monday, May 8, 2006 - link

    I wonder how much of the 2TB/s internal bandwidth is actually used on the Ageia card... if it's a large fraction, then a video card hosting the PPU would have very little bandwidth remaining for its own graphics rendering. And if the cooling really needs that heat sink/fan combo, and the card really needs that power connector, you won't be able to put one on the highest-end video cards (for power and heat reasons).
  • kilkennycat - Friday, May 5, 2006 - link

    "I have a BFG PhysX card"

    Use it as a door-stop ?

    Pray tell me where you plug one of these if you have the following:-

    Dual 7900GTX512 (or dual 1900XTX)
    and
    Creative X-Fi

    already in your system.
  • Walter Williams - Friday, May 5, 2006 - link

    quote:

    Use it as a door-stop ?

    Actually, I use it to play CellFactor. You're missing out.
    quote:

    Pray tell me where you plug one of these if you have the following:

    SLI and CrossFire are the biggest waste of money unless you are doing intense rendering work.

    I hope people with that setup enjoy their little fps improvement per dollar while I'm playing CellFactor, which requires the PPU to run.
  • kilkennycat - Friday, May 5, 2006 - link

    Cellfactor MP tech demo....

    Cellfactor to be released in Q4 2007... maybe... Your PhysX card is going to be a little old by the time the full game is released... We should be up to quad-core CPUs, with plenty of cycles available for physics calculations, by that time.

    I have recently been playing Oblivion a lot, like several million others. The Havok software physics are just great --- and you NEED the highest-end graphics for optimum visual experience in that game --- see the current Anandtech article. Sorry, I care little about (er) "better particle effects" or "more realistic explosions", even when I play Far Cry. In fact, from my experiences with BF2 and BF1942 I find them more than adequately immersive with their great scenery graphics and their CURRENT physics effects -- even the old and noble BF1942.

    On single-player games, I would far prefer to see additional hardware, or compute cycles, directed at advanced AI rather than physics. What's the point of fancy physics effects if the AI enemy has about as much built-in intelligence as a lump of Swiss cheese? It sure does not help the game's immersive experience at all. And tightly-scripted AI just does not work in open-area scenarios (cf. Unreal 2 and the dumb enemies easily snuck up on from behind -- somebody forgot to script that eventuality, amongst many others that can occur in an open play-area). The successful tightly-scripted single-player shooters like Doom3, HL2, FEAR etc. all have overt or disguised "corridors". So the developers of open-area games like Far Cry or Oblivion chose an algorithmic *intelligent-agent AI* approach, with a simple overlay of scripting to set some broad behavioral and/or location boundaries. That's a distinct move in the right direction, but there are some problems with the AI implementation in both games. More sophisticated AI algorithms will require more compute power, which, if performed on the CPU, will need to be traded off against cycles available for graphics. Dual-core will help, but a general-purpose DSP might help even more... they are not expensive and are easily integrated onto a motherboard.

    Back to the immediate subject of the Ageia PPU and physics effects:-

    I am far more intrigued by Havok's exercises with Havok FX, harnessing both dual-core CPU power and GPU power in the service of physics emulation. It would be great to have action games with an adjustable physics slider so that one can trade off graphics against physics effects in a seamless manner, just as one can trade off advanced graphics elements in games today... which is exactly where Havok is heading. No need to support marginal added hardware like the PhysX. Now, if the PhysX engine were an option on every high-end motherboard, for say not more than $50 extra, or as an optional motherboard plug-in at say $75 (like the 8087 of yore), and did not take up any additional precious peripheral slots, then I would rate its chances of becoming mainstream to be pretty high. Seems as if Ageia should license their hardware DESIGN as soon as possible to nVidia or ATi at (say) not more than $15 a copy and have them incorporate the design into their motherboard chipsets.

    The current Ageia card has three strikes against it: cost, hardware interface (PCI), and software support. The PhysX PPU certainly has NO hope at all as a peripheral device as long as it stays in PCI form. It must migrate to PCIe asap. Remember that an x1 or x4 PCIe card will happily work in a PCIe x16 slot, and there are still several million SLI and Crossfire motherboards with empty second video slots. Plus, even on a dual-SLI system with dual-slot-width video cards and an audio card present, you're more likely to find a vacant PCIe x1 or x4 slot that does not compromise video-card ventilation than a PCI slot that is not either covered up by the dual-width video cards or completely blocking airflow to one or the other of them.

    So if a PCIe version of the PhysX ever becomes available... you will be able to sell your PCI version... at about the price of a doorstop. Few will want the PCI version if a used PCIe version is also available.

    Hard on the wallet being an early adopter at times.....
  • tonjohn - Friday, May 5, 2006 - link

    The developers did a poor job when it came to how they implemented PPU support in GRAW.

    CellFactor is a MUCH better test of what the PhysX card is capable of. The physics in CellFactor are MUCH more intense. When blowing up a load of crap, my fps drop by 2fps at the most, and that is mainly b/c my 9800Pro is struggling to render the actual effects of a grenade explosion.
