NVIDIA Works: ANSEL & VRWorks Audio

Along with the various hardware aspects of Pascal, NVIDIA’s software teams have also been working on new projects to coincide with the Pascal launch. These are a new screenshot tool, and a new audio simulation package based on path traced audio.

We’ll start with NVIDIA’s new screenshot utility. Dubbed ANSEL, after famous American environmental photographer Ansel Adams, ANSEL is a very different take on screenshots. Rather than taking screenshots from the player’s perspective at the game rendering resolution, ANSEL allows for an entire scene to be captured at a far higher resolution than with standard screenshots. NVIDIA is pitching this as an art tool rather than a gaming tool, and I get the impression that this is one of those pie-in-the-sky kind of ideas that NVIDIA’s software group decided to run with in order to best show off Pascal’s various capabilities.

At its core, ANSEL is a means to decouple taking screenshots from the limitations of the player’s view. In an ANSEL-enabled application, ANSEL can freeze the state of the game, move the camera around, and then generate a large number of viewports to take screenshots. The end result is that ANSEL makes it possible to generate an ultra-high resolution 360 degree stereo 3D image of a game scene. The analogy NVIDIA is working towards is dropping a high quality 360 degree camera into a game, and letting users play with it as they see fit.

But even this isn’t really a great description of ANSEL, as there isn’t anything else like it to compare it to. Some games have offered 360 degree capture, but they haven’t done so at any kind of resolution approaching what ANSEL can do. And this still doesn’t touch features such as HDR (FP16) scene capture or the free camera.

Under the hood, ANSEL is at times a checklist for Pascal technologies (though it does work with Maxwell 2 as well). In order to capture scenes at a super high resolution, it forces a scene to its maximum LOD and breaks it down into a number of viewports, implemented efficiently using SMP. To demonstrate this technology NVIDIA put together a 4.5 Gpix image rendered out of The Witcher 3, which was composed of 3600 such viewport tiles. Meanwhile stitching together the individual tiles is a CUDA based rendering process, which uses overlapping tiles to resolve any tone mapping conflicts. Finally, ANSEL captures images before they’re actually sent to a display, grabbing HDR images (in EXR format) in games that support HDR.
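To put the tile count in perspective, here is a minimal Python sketch of how a large capture could be split into overlapping tiles. It is not NVIDIA's implementation: the function name `tile_grid`, the tile size and the overlap width are all hypothetical, chosen only to illustrate the idea of neighbouring tiles sharing a band of pixels that a stitcher can blend across.

```python
import math

def tile_grid(width, height, tile_w, tile_h, overlap):
    """Return (x, y, w, h) rectangles that cover a width x height image.

    Neighbouring tiles share at least `overlap` pixels so that a stitching
    pass has a common band to blend across. All parameters are illustrative
    assumptions, not values used by ANSEL.
    """
    if overlap >= min(tile_w, tile_h):
        raise ValueError("overlap must be smaller than the tile size")
    step_x = tile_w - overlap
    step_y = tile_h - overlap
    cols = max(1, math.ceil((width - overlap) / step_x))
    rows = max(1, math.ceil((height - overlap) / step_y))
    tiles = []
    for r in range(rows):
        for c in range(cols):
            # Clamp the last row/column so tiles never run past the image edge.
            x = min(c * step_x, max(0, width - tile_w))
            y = min(r * step_y, max(0, height - tile_h))
            tiles.append((x, y, min(tile_w, width), min(tile_h, height)))
    return tiles

# Average pixels per tile for the figures quoted above, ignoring overlap.
average_pixels_per_tile = 4.5e9 / 3600
```

With the numbers from the demo, `average_pixels_per_tile` works out to 1.25 million pixels, so each tile is smaller than a 1080p frame; the scale of the final image comes from the number of tiles rather than the size of any one of them.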

Meanwhile given its level of deep interaction with games, ANSEL does require individual game support to work. This is in the form of a library provided by NVIDIA, which helps ANSEL and NVIDIA’s driver make sense of a scene and pause the simulation when necessary. Unsurprisingly, NVIDIA is eager to get ANSEL into more games – it just launched on Mirror’s Edge: Catalyst – and as a result is touting to developers that ANSEL is easy to implement, having taken only 150 lines of code on The Witcher 3.

Ultimately NVIDIA seems to be throwing ANSEL at the wall here to see what sticks. But it should be neat to see what users end up doing with the technology.

VRWorks Audio

Not to be outdone by the ANSEL team, other parts of NVIDIA’s software group have been working on a slightly different kind of project for NVIDIA: audio. As a GPU company, NVIDIA has never been deeply involved with audio (not since getting out of the chipset business, at least), but with the current focus on VR, they are taking a crack at it in a new way.

VRWorks Audio is the latest library in NVIDIA’s larger VRWorks suite. As given away by the name, this library is focused on audio, specifically for VR. In a nutshell, VRWorks Audio is a full audio simulation library, using path tracing to power the simulation. The goal of VRWorks Audio is to provide a realistic sound simulation for VR, to further increase the apparent realism.

Under the hood, VRWorks Audio leverages NVIDIA’s existing OptiX path tracing technology, only rather than tracing light it’s used to trace sound waves. Along with simulating audio propagation itself – including occlusion and reverb – VRWorks Audio is also able to run the necessary Head Related Transfer Functions (HRTFs) to reduce the simulation down to binaural audio for headphones.
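The two stages described above can be sketched in a few lines of Python. This is a simplified illustration and not the VRWorks Audio API: the function names, the `(delay, gain)` path representation and the impulse responses are all hypothetical. It assumes the tracer has already reduced each propagation path (the direct path plus any reflections) to a delay and an attenuation, and it applies a single pair of head-related impulse responses (the time-domain form of an HRTF) to the whole mix, whereas a real renderer would select a response per arrival direction.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def mix_paths(source, paths):
    """Sum delayed, attenuated copies of `source`, one per propagation path.

    Each path is a hypothetical (delay_in_samples, gain) pair standing in
    for what a sound tracer would report for the direct path and each
    reflection.
    """
    length = len(source) + max(delay for delay, _ in paths)
    out = [0.0] * length
    for delay, gain in paths:
        for n, s in enumerate(source):
            out[n + delay] += gain * s
    return out

def binauralize(mono, hrir_left, hrir_right):
    """Filter a mono signal with left/right impulse responses for headphones."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy example: a one-sample click, a direct path and one quieter echo.
click = [1.0, 0.0]
mixed = mix_paths(click, [(0, 1.0), (2, 0.5)])
left, right = binauralize(mixed, [1.0], [0.5, 0.5])
```

Every additional path adds another pass over the source signal in `mix_paths`, which gives a rough sense of why the number of simulated reflections drives the processing cost.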

All of this is, of course, executed on Pascal’s CUDA cores in a manner similar to path tracing or PhysX, running alongside the main graphics rendering thread. The amount of processing power required for VRWorks Audio can vary considerably depending on the detail desired (particularly the number of reflections); for NVIDIA’s VR Funhouse demo, VRWorks Audio can occupy most of a GPU on its own.

Ultimately, unlike some of the other technologies presented by NVIDIA, VRWorks Audio is in a relatively early stage. As a result, while NVIDIA is shipping the SDK, no games have been announced as using it at this time, and if it gets any traction it will be further into the future before we see the first games using it. That said, NVIDIA is already reaching out to the all-important middleware vendors on the subject, and to that end their own VR Funhouse demo is using FMOD with a VRWorks Audio plugin to handle the sound, demonstrating that they already have VRWorks Audio working with the popular audio middleware.


200 Comments


  • Ryan Smith - Friday, July 22, 2016 - link

    2) I suspect the v-sync comparison is a 3 deep buffer at a very high framerate.
  • lagittaja - Sunday, July 24, 2016 - link

    1) It is a big part of it. Remember how bad 20nm was?
    The leakage was really high so Nvidia/AMD decided to skip it. FinFET's helped reduce the leakage for the "14/16"nm node.

    That's apples to oranges. CPU's are already 3-4Ghz out of the box.

    RX480 isn't showing it because the 14nm LPP node is a lemon for GPU's.
    You know what's the optimal frequency for Polaris 10? 1Ghz. After that the required voltage shoots up.
    You know, LPP where the LP stands for Low Power. Great for SoC's but GPU's? Not so much.
    "But the SoC's clock higher than 2Ghz blabla". Yeah, well a) that's the CPU and b) it's freaking tiny.

    How are we getting 2Ghz+ frequencies with Pascal which so closely resembles Maxwell?
    Because of the smaller manufacturing node. How's that possible? It's because of FinFET's which reduced the leakage of the 20nm node.
    Why couldn't we have higher clockspeeds without FinFET's at 28nm? Because power.
    28nm GPU's capped around the 1.2-1.4Ghz mark.
    20nm was no go, too high leakage current.
    16nm gives you FinFET's which reduced the leakage current dramatically.
    What does that enable you to do? Increase the clockspeed..
    Here's a good article
    http://www.anandtech.com/show/8223/an-introduction...
  • lagittaja - Sunday, July 24, 2016 - link

    As an addition to the RX 480 / Polaris 10 clockspeed
    GCN2-GCN4 VDD vs Fmax at avg ASIC
    http://i.imgur.com/Hdgkv0F.png
  • timchen - Thursday, July 21, 2016 - link

    Another question is about boost 3.0: given that we see 150-200 Mhz gpu offset very common across boards, wouldn't it be beneficial to undervolt (i.e. disallow the highest voltage bins corresponding to this extra 150-200 Mhz) and offset at the same time to maintain performance at lower power consumption? Why did Nvidia not do this in the first place? (This is coming from reading Tom's saying that 1060 can be a 60w card having 80% of its performance...)
  • AnnonymousCoward - Thursday, July 21, 2016 - link

    NVIDIA, get with the program and support VESA Adaptive-Sync already!!! When your $700 card can't support the VESA standard that's in my monitor, and as a result I have to live with more lag and lower framerate, something is seriously wrong. And why wouldn't you want to make your product more flexible?? I'm looking squarely at you, Tom Petersen. Don't get hung up on your G-sync patent and support VESA!
  • AnnonymousCoward - Thursday, July 21, 2016 - link

    If the stock cards reach the 83C throttle point, I don't see what benefit an OC gives (won't you just reach that sooner?). It seems like raising the TDP or under-voltaging would boost continuous performance. Your thoughts?
  • modeless - Friday, July 22, 2016 - link

    Thanks for the in depth FP16 section! I've been looking forward to the full review. I have to say this is puzzling. Why put it on there at all? Emulation would be faster. But anyway, NVIDIA announced a new Titan X just now! Does this one have FP16 for $1200? Instant buy for me if so.
  • Ryan Smith - Friday, July 22, 2016 - link

    Emulation would be faster, but it would not be the same as running it on a real FP16x2 unit. It's the same purpose as FP64 units: for binary compatibility so that developers can write and debug Tesla applications on their GeForce GPU.
  • hoohoo - Friday, July 22, 2016 - link

    Excellent article, Ryan, thank you!

    Especially the info on preemption and async/scheduling.

    I expected the preemption might be expensive in some circumstances, but I didn't quite expect it to push the L2 cache though! Still this is a marked improvement for nVidia.
  • hoohoo - Friday, July 22, 2016 - link

    It seems like the preemption is implemented in the driver though? Are there actual h/w instructions to as it were "swap stack pointer", "push LDT", "swap instruction pointer"?
