AGEIA PhysX Technology and GPU Hardware

First off, here is the lowdown on the hardware as we know it. AGEIA, being the first and only consumer-oriented physics processor designer right now, has not given us as much in-depth technical detail as other hardware designers. We certainly understand the need to protect intellectual property, especially at this stage in the game, but here is what we know.

PhysX Hardware:
125 Million transistors
130nm manufacturing process
128MB 733MHz Data Rate GDDR3 RAM
128-bit memory bus interface
20 giga-instructions per second
2 Tb/sec internal memory bandwidth
"Dozens" of fully independent cores


There are quite a few things to note about this architecture. Even without knowing all the ins and outs, it is quite obvious that this chip will be a force to be reckoned with in the physics realm. A graphics card, even with a 512-bit internal bus running at core speed, has less than 350 Gb/sec internal bandwidth. There are also lots of restrictions on the way data moves around in a GPU. For instance, there is no way for a pixel shader to read a value, change it, and write it back to the same spot in local RAM. There are ways to deal with this when tackling physics, but making highly efficient use of nearly 6 times the internal bandwidth for the task at hand is a huge plus. CPUs aren't able to touch this type of internal bandwidth either. (Of course, we're talking about internal theoretical bandwidth, but the best we can do for now is relay what AGEIA has told us.)
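To put some numbers on that comparison, here is the back-of-the-envelope arithmetic behind the "nearly 6 times" figure. The ~675MHz core clock is our assumption for illustration, not a published spec; the 2 Tb/sec number is simply AGEIA's stated figure:

```cpp
#include <cstdio>

int main() {
    // Assumed figures for illustration only: a 512-bit internal bus clocked
    // at roughly core speed (~675MHz is an assumption, not a published spec),
    // versus the 2 Tb/sec internal figure AGEIA quotes for the PhysX chip.
    const double gpu_internal_bps = 512.0 * 675e6;  // ~345.6 Gb/sec
    const double ppu_internal_bps = 2e12;           // AGEIA's stated 2 Tb/sec

    std::printf("GPU internal: %.0f Gb/sec\n", gpu_internal_bps / 1e9);
    std::printf("PPU internal: %.0f Gb/sec\n", ppu_internal_bps / 1e9);
    std::printf("Ratio: %.1fx\n", ppu_internal_bps / gpu_internal_bps); // ~5.8x
    return 0;
}
```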

Physics, as we noted in last year's article, generally presents itself as sets of small, highly dependent problems. Graphics, by contrast, has become sets of highly independent, mathematically intense problems. It's not that GPUs can't be used to solve problems where the input to one pixel is the output of another (performing multiple passes and making use of render-to-texture functionality is one obvious solution); it's just that much of the power of a GPU is wasted when attacking this type of problem. Making use of a large number of fully independent processing units makes sense as well. In a GPU's SIMD architecture, pixel pipelines execute the same instructions on many different pixels. In physics, it is much more often the case that different things need to be done to every physical object in a scene, and it makes much more sense to attack the problem with an architecture designed for it.
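To make that read-modify-write restriction concrete, here is a minimal sketch of the ping-pong (double-buffered) approach a GPU implementation is pushed toward: since a pass cannot write back to the buffer it reads from, each update reads one buffer, writes the other, and swaps. Plain C++ stands in for the shader passes here, and the "physics" step is a trivial placeholder rather than anyone's actual engine code:

```cpp
#include <algorithm>
#include <vector>

// A pixel shader can't read and write the same location in one pass, so GPU
// physics implementations ping-pong between two buffers: read from 'src',
// write to 'dst', then swap the roles for the next pass.
void integrate_pass(const std::vector<float>& src, std::vector<float>& dst,
                    float dt) {
    for (size_t i = 0; i < src.size(); ++i)
        dst[i] = src[i] + dt * 9.8f;  // trivial stand-in for real physics work
}

int main() {
    std::vector<float> bufA(1024, 0.0f), bufB(1024, 0.0f);
    std::vector<float>* src = &bufA;
    std::vector<float>* dst = &bufB;

    for (int pass = 0; pass < 60; ++pass) {
        integrate_pass(*src, *dst, 1.0f / 60.0f); // "render" into the other buffer
        std::swap(src, dst);                      // ping-pong for the next pass
    }
    return 0;
}
```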

To be fair, NVIDIA and ATI are not arguing that they can compete with the physics processing power AGEIA offers in the PhysX chip. The main selling point of physics on the GPU is that everyone who plays games (and would want a physics card) already has a graphics card. Solutions like Havok FX, which uses SM3.0 to implement physics calculations on the GPU, are a good way to augment existing physics engines. These types of solutions will add a little more punch to what developers can do. This won't create a revolution, but it will get game developers to look harder at physics in the future, and that is a good thing. We have yet to see Havok FX or a competing solution in action, so we can't go into any detail on what to expect. However, it is obvious that a multi-GPU platform will be able to benefit from physics engines that make use of GPUs: there are plenty of cases where games are not able to take 100% advantage of both GPUs. In single GPU cases, there could still be a benefit, but the more graphically intensive a scene, the less room there is for the GPU to worry about anything else. We are already seeing titles like Oblivion that can bring everything we throw at them to a crawl, so balance will certainly be an issue for Havok FX and similar solutions.

DirectX 10 will absolutely benefit AGEIA, NVIDIA, and ATI. For physics-on-GPU implementations, DX10 will decrease overhead significantly. State changes will be more efficient, and many more objects will be able to be sent to the GPU for processing each frame. This will obviously make it easier for GPUs to handle work other than graphics efficiently. A little less obviously, PhysX hardware-accelerated games will also benefit from a graphics standpoint. With the possibility for games to support orders of magnitude more rigid body objects under PhysX, overhead can become an issue when batching these objects to the GPU for rendering. This is a hard thing for us to test explicitly, but it is easy to understand why it will be a problem when developers are already complaining about the overhead issue.
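To see why batching overhead bites, consider this rough sketch of the two submission patterns. The draw_mesh and draw_mesh_instanced calls below are hypothetical stand-ins rather than any real API; the point is the per-call cost pattern, not a specific library:

```cpp
#include <cstdio>
#include <vector>

struct Matrix4 { float m[16]; };

// Hypothetical stand-in: in a real API this would bind per-object constants
// and issue a draw call, paying a fixed driver/state-change cost every time.
void draw_mesh(const Matrix4&) {}

// Hypothetical stand-in: upload all transforms once and issue a single
// instanced draw, paying the fixed cost once for the whole batch.
void draw_mesh_instanced(const std::vector<Matrix4>& batch) {
    std::printf("drew %zu objects in one call\n", batch.size());
}

int main() {
    std::vector<Matrix4> bodies(5000); // thousands of PhysX-driven debris objects

    // Naive path: one call per object -- per-call overhead scales linearly
    // with object count, which is where DX9-era state changes hurt.
    for (size_t i = 0; i < bodies.size(); ++i)
        draw_mesh(bodies[i]);

    // Batched path: the kind of amortized submission DX10-style APIs target.
    draw_mesh_instanced(bodies);
    return 0;
}
```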

While we know the PhysX part can handle 20 GIPS, this figure likely counts simple, independent instructions. We would really like to get a better idea of how much actual "work" this part can handle, but for now we'll have to settle for this ambiguous number and some real world performance. Let's take a look at the ASUS card and then dig into the numbers.

Comments
  • Magnadoodle - Friday, May 5, 2006 - link

    Actually, if you look at this statement by Havok: http://www.firingsquad.com/news/newsarticle.asp?se... (already linked)

    They also arrive at the conclusion that it is not a GPU bottleneck.

    Furthermore, the only thing the PPU seems to do in GRAW is render a couple of particles, while not improving or accelerating *at all* the processing of physics. This particle effect could have been processed very well by a GPU.

    I guess Anandtech didn't notice that the physics were exactly the same, thus pointing out the somewhat illusory nature of "better physics".
  • DerekWilson - Friday, May 5, 2006 - link

    The Havok guys did miss a few things pointed out earlier in the comments. Some destructible objects do break off into real persistent objects under PhysX -- like the dumpster lid and car doors. Also, the debris in the explosions is physically simulated rather than scripted. While I certainly agree that the end effect in these cases has no impact on "goodness", it is actually doing something.

    I'll certainly agree that "better physics" is kind of strange to think about. But it is really similar to how older games used to do 3D with canned animations. More realtime simulation opened up opportunities to do so many amazing things that just couldn't be done otherwise. This should extend well to physics.

    Also, regardless of how (or how efficiently) the developers did it, there's no denying that the game feels better with the hardware accelerated aspects. Whether they could have done the same thing on the CPU or GPU, they didn't.

    I'd still love to find a way to test the performance of this thing running the hardware physics on the CPU.
  • JumpyBL - Saturday, May 6, 2006 - link

    quote:

    Some destructible objects do break off into real persistent objects under PhysX -- like the dumpster lid and car doors.


    I see these same effects without PhysX.
  • DerekWilson - Saturday, May 6, 2006 - link

    When I play it without the PhysX hardware, doors just seem to pop open -- not fly off ... though I haven't exhaustively blown up every object -- there could be some cases where these types of things happen in software as well.
  • JumpyBL - Saturday, May 6, 2006 - link

    Shoot up the tires, car doors, etc enough and they come off. Same with the garbage can lid, throw a nade, it'll blow right off the container, all without a PPU.
  • Fenixgoon - Friday, May 5, 2006 - link

    how is the game going to "feel better" with a PPU when it slams your framerate down from buttery smooth to choppy? sorry, i'll take the FPS over any degree of better simulated physics, ESPECIALLY on a budget PC. i mean, look at the numbers! opteron minimum fps at 8x6 was 46, and with the PPU hardware it dropped to 12 - nearly a 75% decrease!!
  • DerekWilson - Friday, May 5, 2006 - link

    Note that the min framerate is much lower than the average -- with the majority of frames rolling along at average framerates, one or two frames that drop to an instantaneous 12-17fps isn't going to make the game feel choppy. The benchmark was fairly short, so even outliers have an impact on the average -- further going to show that these minimum fps are not anything to worry about. At the same time, they aren't desirable either.

    Also, I would certainly not recommend this part to anyone but the hardcore enthusiast right now. People with slow graphics cards and processors would benefit much more by upgrading one or the other. In these early stages with little software support, the PPU will really only look attractive to people who already have very powerful systems and want something else to expand the capabilities.

    If you watch the videos, there's no noticeable choppiness in the motion of the explosion. And I can say from gameplay experience that there's no noticeable mouse lag when things are exploding either. Thus, with the added visual effects, it feels better. Certainly a subjective analysis, but I hope that explains how I could get that impression.
  • mongo lloyd - Friday, May 5, 2006 - link

    You must be joking. Watching the videos, the PhysX one is WAY choppier compared to the software one. The PhysX video even halts for a split second, in a way that's more than noticeable; it's downright terrible.

    And the graphics/effect of the extra debris? Negligible. I've seen more videos from this game (for example: http://www.pcper.com/article.php?aid=245 ) and the extra stuff with PhysX in this game is just not impressive or a big deal, and in some cases it's actually worse (like the URINATING walls and ground when shooting them). It's not realistic, it's not fun, not particularly cool, and it's slow.
  • Clauzii - Friday, May 5, 2006 - link

    I also think that's why we see such a massive FPS drop. We are trying to render, say, 100 times as many objects now?
  • DerekWilson - Friday, May 5, 2006 - link

    I was hoping we made this clear in the article ...

    While there is certainly more for the GPU to do, under a CPU limited configuration (800x600 with lower quality settings) we see a very low minimum framerate and a much lower average when compared to software physics.

    The drop in performance is much less significant when we look at a GPU limited configuration -- if all those objects were bottlenecking on the graphics card, then giving them high quality textures and rendering them at a much higher resolution would show a bigger impact on performance.

    Tie that in with the fact that both the CPU and GPU limited configurations turn out the same minimum framerate and we really can conclude that the bottleneck is somewhere other than the GPU.

    It becomes more difficult to determine whether the bottleneck is at the CPU (game/driver/API overhead) or the PPU (PCI bus/object generation/actual physics).
