AGEIA PhysX Technology and GPU Hardware

First off, here is the lowdown on the hardware as we know it. AGEIA, being the first and only consumer-oriented physics processor designer right now, has not given us as much in-depth technical detail as other hardware designers. We certainly understand the need to protect intellectual property, especially at this stage in the game, but this is what we know.

PhysX Hardware:
125 Million transistors
130nm manufacturing process
128MB 733MHz Data Rate GDDR3 RAM
128-bit memory bus interface
20 giga-instructions per second
2 Tb/sec internal memory bandwidth
"Dozens" of fully independent cores


There are quite a few things to note about this architecture. Even without knowing all the ins and outs, it is quite obvious that this chip will be a force to be reckoned with in the physics realm. A graphics card, even with a 512-bit internal bus running at core speed, has less than 350 Gb/sec internal bandwidth. There are also lots of restrictions on the way data moves around in a GPU. For instance, there is no way for a pixel shader to read a value, change it, and write it back to the same spot in local RAM. There are ways to deal with this when tackling physics, but making highly efficient use of nearly 6 times the internal bandwidth for the task at hand is a huge plus. CPUs aren't able to touch this type of internal bandwidth either. (Of course, we're talking about internal theoretical bandwidth, but the best we can do for now is relay what AGEIA has told us.)
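To put the bandwidth figures above in perspective, here is a minimal back-of-the-envelope sketch of the arithmetic. The 2 Tb/sec figure is AGEIA's quoted number and the 350 Gb/sec ceiling is the one given above; the 650MHz GPU core clock is purely our assumption for illustration (no specific clock is given here), and the function and constant names are made up for this example.

```python
def peak_bandwidth_gbit(bus_width_bits: int, rate_mhz: float) -> float:
    """Theoretical peak bandwidth in Gb/sec: bus width times transfers per second."""
    return bus_width_bits * rate_mhz / 1000.0


# Figures quoted in the article.
PHYSX_INTERNAL_GBIT = 2000.0   # 2 Tb/sec internal bandwidth (AGEIA's number)
GPU_INTERNAL_CEILING = 350.0   # "less than 350 Gb/sec" for a 512-bit internal bus

# ASSUMPTION: a 650MHz core clock, chosen only to illustrate the calculation.
ASSUMED_GPU_CORE_MHZ = 650.0

# 512-bit internal bus at the assumed core clock: 332.8 Gb/sec.
gpu_internal_gbit = peak_bandwidth_gbit(512, ASSUMED_GPU_CORE_MHZ)

# PhysX external memory: 128-bit bus at 733MHz data rate: 93.824 Gb/sec.
physx_memory_gbit = peak_bandwidth_gbit(128, 733.0)
physx_memory_gbyte = physx_memory_gbit / 8.0  # about 11.7 GB/sec

# Ratio against the 350 Gb/sec ceiling: roughly 5.7, i.e. "nearly 6 times".
ratio_vs_ceiling = PHYSX_INTERNAL_GBIT / GPU_INTERNAL_CEILING

if __name__ == "__main__":
    print(f"GPU internal (assumed clock): {gpu_internal_gbit:.1f} Gb/sec")
    print(f"PhysX on-board memory:        {physx_memory_gbyte:.1f} GB/sec")
    print(f"PhysX internal vs GPU:        {ratio_vs_ceiling:.2f}x")
```

These are all theoretical peaks. The on-board GDDR3 figure is included only to show how far the quoted internal number sits above what the external memory bus can deliver.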

Physics, as we noted in last year's article, generally presents itself in sets of highly dependent small problems. Graphics has become sets of highly independent, mathematically intense problems. It's not that GPUs can't be used to solve problems where the input to one pixel is the output of another (performing multiple passes and making use of render-to-texture functionality is one obvious solution); it's just that much of the power of a GPU is wasted when attempting to solve this type of problem. Making use of a great many independent processing units makes sense as well. In a GPU's SIMD architecture, pixel pipelines execute the same instructions on many different pixels. In physics, it is much more often the case that different things need to be done to every physical object in a scene, so it makes much more sense to attack the problem with hardware built around independent cores.
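The multi-pass, render-to-texture workaround mentioned above can be sketched on the CPU as a "ping-pong" between two buffers. This is only an illustration of the data flow, not how Havok FX or any real engine is implemented: each pass reads exclusively from a source buffer and writes exclusively to a destination buffer (standing in for the render target), and the two swap roles between passes. The relaxation rule (averaging neighbors with fixed endpoints) and the names `relax_pass` and `simulate` are our own stand-ins for a real physics step.

```python
from typing import List, Sequence


def relax_pass(src: Sequence[float], dst: List[float]) -> None:
    """One 'render pass': every output is computed from src only, never from dst."""
    n = len(src)
    if n == 0:
        return
    # Endpoints are held fixed, like pinned ends of a rope.
    dst[0] = src[0]
    dst[n - 1] = src[n - 1]
    # Interior points move to the average of their neighbors' previous values.
    for i in range(1, n - 1):
        dst[i] = 0.5 * (src[i - 1] + src[i + 1])


def simulate(initial: Sequence[float], passes: int) -> List[float]:
    """Run several passes, swapping source and destination after each one."""
    src = list(initial)          # copy so the caller's data is untouched
    dst = [0.0] * len(src)
    for _ in range(passes):
        relax_pass(src, dst)
        src, dst = dst, src      # last pass's output becomes the next input
    return src


if __name__ == "__main__":
    print(simulate([0.0, 0.0, 0.0, 8.0], 2))
```

Because no element is ever read and written in the same pass, the restriction on in-place read-modify-write is respected, but a dependency that spans many objects needs many passes to propagate. That is the inefficiency described above.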

To be fair, NVIDIA and ATI are not arguing that they can compete with the physics processing power AGEIA is able to offer in the PhysX chip. The main selling point of physics on the GPU is that everyone who plays games (and would want a physics card) already has a graphics card. Solutions like Havok FX, which use SM3.0 to implement physics calculations on the GPU, are good ways to augment existing physics engines. These types of solutions will add a little more punch to what developers can do. This won't create a revolution, but it will get game developers to look harder at physics in the future, and that is a good thing. We have yet to see Havok FX or a competing solution in action, so we can't go into any detail on what to expect. However, it is obvious that a multi-GPU platform will be able to benefit from physics engines that make use of GPUs: there are plenty of cases where games are not able to take 100% advantage of both GPUs. In single GPU cases, there could still be a benefit, but the more graphically intensive a scene, the less room there is for the GPU to worry about anything else. We are certainly seeing titles like Oblivion that are able to bring everything we throw at them to a crawl, so balance will certainly be an issue for Havok FX and similar solutions.

DirectX 10 will absolutely benefit AGEIA, NVIDIA, and ATI. For physics on GPU implementations, DX10 will decrease overhead significantly. State changes will be more efficient, and it will be possible to send many more objects to the GPU for processing every frame. This will obviously make it easier for GPUs to handle work other than graphics efficiently. A little less obviously, PhysX hardware accelerated games will also benefit from a graphics standpoint. With the possibility for games to support orders of magnitude more rigid body objects under PhysX, overhead can become an issue when batching these objects to the GPU for rendering. This is a hard thing for us to test for explicitly, but it is easy to understand why it will be a problem when we have developers already complaining about the overhead issue.

While we know the PhysX part can handle 20 GIPS, this figure most likely counts simple, independent instructions. We would really like to get a better idea of how much actual "work" this part can handle, but for now we'll have to settle for this ambiguous number and some real world performance. Let's take a look at the ASUS card and then take a look at the numbers.


101 Comments

  • Clauzii - Monday, May 8, 2006 - link

    You're right... have to wait and see what happens with drivers/PCIe then...
  • Hypernova - Friday, May 5, 2006 - link

    They should have waited for CellFactor's release as a launch lineup for AGEIA. First impressions are important, and the glued-on approach of GRAW is nothing but negative publicity for AGEIA. Not everyone knows that PhysX was nothing more than a patched add-on to Havok in GRAW, and they will think that this is how the cards will turn out.

    If you can't do it right then don't do it at all. GRAW's implementation was a complete failure, even for a first generation product.
  • segagenesis - Friday, May 5, 2006 - link

    As much as I would want to give them the benefit of the doubt... With what you say, it's that simple really. Most gamers (at least ones I know?) want worry-free performance, and if spending extra money on hardware results in worse performance then this product will be short-lived.

    I watched the non-game PhysX demos and they looked really damn cool, but they really should have worked on making a PCI-E version from the start... boards with PCI slots are already becoming dated, and those that have 16x slots for graphics have at least the small 1x slot!
  • nullpointerus - Friday, May 5, 2006 - link

    I wouldn't say it was a complete failure. IMO the benchmarks were quite disappointing, and this was compounded by the lack of effects configurability in the game, but the videos were quite compelling. If you look at the dumpster (?), you can see that not only does the lid blow off but it bends and crumples. If we see more than just canned animations in better games (CellFactor?), then this $300 should be worth its cost to high-end gamers. I'd say AGEIA is off to a rough start, not an implosion.
  • DerekWilson - Friday, May 5, 2006 - link

    There are quite a few other things PhysX does in the game as well -- though they all are kinda "tacked on" as well. You noticed the lid of the dumpster, which only pops open under software. In hardware it is a separate object that can go flying around depending on the explosion. The same is true of car doors and other similar parts -- under software they'll just pop open, but with hardware they go flying if hit right.

    It is also interesting to add that the explosions and such are scripted under software, but much of it becomes physically simulated under hardware. While this fact is kinda interesting, it really doesn't matter to the gamer in this title. But for other games that make more extensive use of the hardware, it could be quite useful.
  • nullpointerus - Friday, May 5, 2006 - link

    That should be very cool. I was playing F.E.A.R. recently and decided to rake my SMG over a bunch of glass panes in an "office" level. Initially, it looked and sounded good because I have the settings cranked up reasonably high, but then I noticed all the glass panes were breaking in exactly the same way. Rather disappointing...
  • DigitalFreak - Friday, May 5, 2006 - link

    ... may use the Gamebryo engine, but it uses Havok for physics.
  • Bull Dog - Friday, May 5, 2006 - link

    You are correct and I noticed that the first time I read the article.
  • DerekWilson - Friday, May 5, 2006 - link

    Sorry if what I said wasn't clear enough -- I'll add that Oblivion uses Havok.
  • peternelson - Friday, May 5, 2006 - link


    In your introduction you might have mentioned there are TWO partners for bringing this technology to market: Asus and BFG Tech.

    Both have shipping boards. I wonder if they perform identically or if there is some difference.

    I agree that PCI could be a bottleneck, but I'm more concerned that putting a lot of traffic on the pci bus will impair my OTHER pci devices.

    PCIE x1 would have been much more sensible. I hope that they won't need a respin to add pcie functionality but fear this may be the case.

    I agree CellFactor looks to make heavier use of physics, so it may make the difference with/without the PPU more noticeable/measurable.

    I also wonder how much the memory size on the Physx board matters? Maybe a second gen board could double that to 256. I'm also interested in whether PPU could be given some abstraction layer and programmed to do non-physics useful calculations as is being done on graphics cards now. This might speed its adoption.

    I agree with the post that in volume, this kind of chip could find its way onto 3d graphics cards for gaming. As BFG make GPU cards, they might move that direction.
