Tessellation: Because The GS Isn't Fast Enough

Microsoft and AMD tend to get the most excited about tessellation whenever the topic of DX11 comes up. AMD jumped on the tessellation bandwagon long ago, and perhaps it does make sense for consoles like the XBox 360. Adding fixed function hardware to quickly and efficiently handle a task that improves memory footprint has major advantages in the living room. We still aren't sold on the need for a tessellator on the desktop, but who's to argue with progress?

Or is it really progressive? The tessellator itself is fixed function rather than programmable. Sure, the input to and output of the tessellator can be manipulated a bit through the Hull Shader and Domain Shader, but the heart of the beast is just not that flexible. The Geometry Shader is the programmable block in the pipeline that is capable of tessellation as well as much more, but it just doesn't have the power to do tessellation on any useful scale. So while most everything has been moving towards programmability in the rendering pipe, we have sort of a step backward here. But why?

The argument between fixed function and programmable hardware is always one of performance versus flexibility and usefulness. In the beginning, fixed function was necessary to get the desired performance. As time went on, it became clear that adding in more fixed function hardware to graphics chips just wasn't feasible. The transistors put into specialized hardware just go unused if developers don't program to take advantage of it. This made a shift toward architectures where expanding the pool of compute resources that could be shared and used for many different tasks became a much more attractive way to go. In the general case anyway. But that doesn't mean that fixed function hardware doesn't have it's place.

We do still have the problem that all the transistors put into the tessellator are worthless unless developers take advantage of the hardware. But the reason it makes sense is that the ROI (return on investment: what you get for what you put in) on those transistors is huge if developers do take advantage of the hardware: it's much easier to get huge tessellation performance out of a fixed function tessellator than to put the necessary resources into the Geometry Shader to allow it to be capable of the same tessellation performance programmatically. This doesn't mean we'll start to see a renaissance of fixed function blocks in our graphics hardware; just that significantly advanced features going forward may still require the sacrifice of programability in favor of early adoption of a feature. The majority of tasks will continue to be enabled in a flexible programmable way, and in the future we may see more flexibility introduced into the tessellator until it becomes fully programmable as well (or ends up just being merged into some future version of the Geometry Shader).

Now don't let this technical assessment of fixed function tessellation make you think we aren't interested in reaping the benefits of the tessellator. Currently, artists need to create different versions of their objects for different LODs (Level of Detail -- reducing or increasing complexity as the object moves further or nearer the viewer), and geometry simulation through texturing at each LOD needs to be done by pixel shaders. This requires extra work from both artists and programmers and costs a good bit in terms of performance. There are also some effects than can only be done with more geometry.

Tessellation is a great way to get that geometry in there for more detail, shadowing, and smooth edges. High geometry also allows really cool displacement mapping effects. Currently, much geometry is simulated through textures and techniques like bump mapping or parallax occlusion mapping or some other technique. Even with high geometry, we will want to have large normal maps for our lighting algorithms to use, but we won't need to do so much work to make things like cracks, bumps, ridges, and small detail geometry appear to be there when it isn't because we can just tessellate and displace in a single pass through the pipeline. This is fast, efficient, and can produce very detailed effects while freeing up pixel shader resources for other uses. With tessellation, artists can create one sub division surface that can have a dynamic LOD free of charge; a simple hull shader and a displacement map applied in the domain shader will save a lot of work, increase quality, and improve performance quite a bit.

If developers adopt tessellation, we could see cool things, and with the move to DX11 class hardware both NVIDIA and AMD will be making parts with tessellation capability. But we may not see developers just start using tessellation (or the compute shader for that matter) right away. Because DirectX 11 will run on down level hardware and at the release of DX11 we will already have a huge number cards on the market capable of running a subset of DX11 bringing with it a better, more refined, programming language in the new version of HLSL and seamless parallelization optimizations, we will very likely see the first DX11 games only implementing features that can run completely on DX10 hardware.

Of course, at that point developers can be fully confident of exploiting all the aspects of DX10 hardware, which they still aren't completely taking advantage of. Many people still want and need a DX9 path because of Vista's failure, which means DX10 code tends to be more or less an enhanced DX9 path rather than something fundamentally different. So when DirectX 11 finally debuts, we will start to see what developers could really do with DX10.

Certainly there will be developers experimenting with tessellation, but these will probably just be simple amplification to get rid of those jagged edges around curved surfaces at first. It will take time for the real advanced tessellation techniques everyone is excited about to come to fruition.

So What's a Tessellator? One Last Thing and Closing Thoughts
Comments Locked

109 Comments

View All Comments

  • DerekWilson - Saturday, January 31, 2009 - link

    Hi, thanks for the feedback ... I've already talked the vista issue to death elsewhere in these comments, so I'll skip that, but ...

    3) You are right that to get the most out of DX10 you need renderer designed for DX10 not ported from DX9. At the same time, there are things that can be done to make DX9 stuff faster by using DX10 capabilities that don't require an engine rewrite. Yes this also requires development time, but it seems developers have opted to put time into adding effects with DX10 rather than increasing framerate. I'm not disappointed with that direction, but performance was an option.

    I agree with what you said about OGL.

    4) I know it's not going to happen, but it didn't need to not be possible. MS could have designed the API to expose new hardware featuers without requiring the driver model change. DX9 can run on XP or Vista's new driver model for example. They chose to make it so that it was impossible to back-port rather than designing DX10 (and subsequent versions) to be tied to the driver model. OpenGL exposes most of the featuers of DX10 to WinXP and all of the things that are interesting to graphics programmers).

    5) You are right -- it's not required to support multithreading but it is required if you want a performance benefit from that multithreading.
  • bobvodka - Saturday, January 31, 2009 - link

    Well, to be fair, many of my comments were directed at others in the thread :)

    4) The thing is, OpenGL and DX9, specifically D3D9, live in different places. D3D9 had alot more kernel side code which caused expensive switches when issuing certain commands; this is why D3D9 got all that instance draw stuff and OpenGL didnt, because on small batch sizes OpenGL could be between 2.3x and 1.4x quicker at executing the draw call then D3D9. OpenGL sits the otherside of the kernel calls and gives the implimenters more control on when that switch occures. D3D10 also sits the other side, again not wasting that time.

    There were also changes in the resource model, the driver model and various other areas; some of which were to make the Vista windowing system possible. This resource model and everything about it was very much tied to D3D10 and how it does things. OpenGL's resource model was also fundamentally different to the D3D9 model, with again the implimenter having alot more control; you never suffered a 'lost device' in OpenGL for example and the runtime automatically controlled allocated memory unlike in D3D9. There are some things OpenGL could do which D3D9 couldn't as well; such as on-card async memory copies and render-to-vertex buffer (well, ATI had a hack for it but the OpenGL method was cross-hardware).

    Could they have done it? Well, of course they could have done but unlike previous DX updates it would have taken alot more effort, time and money. All for a, at the time, 5 year old OS they were hoping to begin getting rid of.

    Personally, I think this was a good move all in all, the problem was with how it was presented to the general masses who suddenly saw they had to pay a few 100 USD for a new DX version.
  • Dribble - Saturday, January 31, 2009 - link

    The reason all our games are DX9c are because that's what the consoles support (yes I know PS3 uses open GL, but it has 9c feature support - basically has 7800GTX in it).
    Most games are made for console as well as PC, the majority use some cross platform renderer (e.g. unreal 3 engine) and that will support DX9c. Cross platform DX support won't change until the Xbox 720 and PS4 arrive.
    I don't see how DX11 will change this?
  • DerekWilson - Saturday, January 31, 2009 - link

    That's a really good point and something I should have considered.

    I think the gap in performance between the PC and consoles will so heavily favor PCs that it will inspire developers to once again shift their focus to the PC. I could be wrong though.
  • ssj4Gogeta - Sunday, February 1, 2009 - link

    That will change if Microsoft choose Larrabee for XBox 720. Cause then it will support all the DirectX (even unreleased) versions and all the OpenGL versions. If MS chooses Larrabee, and it also becomes popular for PC gaming, we PC gamers may see a huge benefit because then the games will be built for the latest DX version.
  • bobvodka - Sunday, February 1, 2009 - link

    It's all well and good saying that but Larrabee is currently utterly unproven technology.

    Don't get me wrong, I'd like to see it do well if only because 3 players in the GPU race will be better than 2 from a technology and consumer stand point, however everyone seems to be pinning their hopes on this technology when there hasn't even been a working demo of DX9 at the same speed as NV/AMD, never mind DX10 or DX11.
  • ssj4Gogeta - Sunday, February 1, 2009 - link

    You're right. But I'm really excited. It would be so nice if Intel can really pull off such feat. we'll have 2 TFLOPS of general purpose parallel processing power!
  • bobvodka - Saturday, January 31, 2009 - link

    The problem is, the consoles is where the money is. Combine that with a fixed hardware platform (well, hard drives and different screen sizes not withstanding) it makes for a much easier time for devs.
  • piroroadkill - Saturday, January 31, 2009 - link

    That's definitely a good point. Any game engine these days has to support the major consoles for it to be successful - even if that only means the 360, you're still hamstrung by DirectX9, regardless of whether XP has it or not.

    From a personal point of view I'm still running XP because I run a clusterfuck of graphics cards that vista would shit the bed thinking about.
  • William Gaatjes - Saturday, January 31, 2009 - link

    "Many under-the-hood enhancements mean higher performance for features available but less used under DX10. "


    I must be remembering it wrong but the same thing was sad about dx10. However it turned out to be nothing more then getting people to buy vista. I wonder how much will come true this time.

Log in

Don't have an account? Sign up now