Turing Tensor Cores: Leveraging Deep Learning Inference for Gaming

Though RT Cores are Turing’s poster child feature, the tensor cores were very much Volta’s. In Turing, they’ve been updated to reflect their positioning as a gaming/consumer feature via inferencing. The main changes for the 2nd generation tensor cores are INT8 and INT4 precision modes for inferencing, which are enabled by new hardware data paths and perform dot products that accumulate into an INT32 result. INT8 mode operates at double the FP16 rate, or 2048 integer operations per clock per SM. INT4 mode operates at quadruple the FP16 rate, or 4096 integer operations per clock per SM.
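To make the INT8 path concrete, here is a minimal Python sketch of a dot product whose INT8 operands accumulate into an INT32 value, along with the rate arithmetic quoted above. It emulates the arithmetic only, not the hardware; the function name `int8_dot_int32` and the constant names are illustrative assumptions, and the FP16 figure is simply what "double the FP16 rate" implies.

```python
# Illustrative emulation of the arithmetic only; names here are hypothetical.
INT8_MIN, INT8_MAX = -128, 127
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def int8_dot_int32(a, b, acc=0):
    """Dot product of two INT8 vectors, accumulated into an INT32 value."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    for x, y in zip(a, b):
        if not (INT8_MIN <= x <= INT8_MAX and INT8_MIN <= y <= INT8_MAX):
            raise ValueError("operands must fit in INT8")
        acc += x * y  # each product has magnitude of at most 128 * 128 = 16384
    if not (INT32_MIN <= acc <= INT32_MAX):
        raise OverflowError("accumulator exceeded the INT32 range")
    return acc


# Rates quoted in the text, in operations per clock.
INT8_OPS_PER_CLOCK = 2048
INT4_OPS_PER_CLOCK = 4096
# Implied by "INT8 mode operates at double the FP16 rate".
FP16_OPS_PER_CLOCK = INT8_OPS_PER_CLOCK // 2

result = int8_dot_int32([127, -128], [127, -128])
```

The wide accumulator is the point of the design: individual INT8 products are small, so a 32-bit sum can absorb a long run of them before overflow becomes a concern.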

Naturally, only some networks tolerate these lower precisions and the quantization they require, meaning the storage and calculation of data in a compacted format. INT4 is firmly in the research area, whereas INT8’s practical applicability is much more developed. Regardless, the 2nd generation tensor cores retain FP16 mode, and now also support a pure FP16 mode without an FP32 accumulator. While CUDA 10 is not yet out, the enhanced WMMA operations should shed light on any other differences, such as additional accepted matrix sizes for operands.
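As a rough illustration of what quantization involves, the Python sketch below maps floating point values to INT8 codes using a single shared scale factor, then maps them back to measure the error. This is one common symmetric scheme chosen for simplicity, not necessarily what any NVIDIA tool does; the function names and the sample weights are hypothetical.

```python
# A minimal symmetric quantization sketch; all names are hypothetical.
def quantize_int8(values):
    """Map floats to INT8 codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest code, then clamp as a guard against float error.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from INT8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]  # made-up sample values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Rounding to the nearest code bounds the error by about half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Whether a network tolerates that rounding error is exactly the question the text raises: the narrower the format, the larger each step becomes, which is why INT4 is harder to apply than INT8.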

Inasmuch as deep learning is involved, NVIDIA is pushing what was a purely compute/professional feature into consumer territory, and we will go over the full picture in a later section. For Turing, the tensor cores can accelerate the features under the NGX umbrella, which includes DLSS. They can also accelerate certain AI-based denoisers that clean up and correct real time raytraced rendering, though most developers seem to be opting for non-tensor core accelerated denoisers at the moment.


113 Comments


  • willis936 - Sunday, September 16, 2018 - link

    Also in case there's anyone else in the signal integrity business reading this: does it bug anyone else that eye diagrams are always heatmaps without a colorbar legend? When I make eye diagrams I put a colorbar legend in to signify hits per mV*ps area. The only thing I see T&M companies do is specify how many samples are in the eye diagram but I don't think that's enough for easy apples to apples comparisons. Reply
  • Manch - Sunday, September 16, 2018 - link

    No, bc heatmaps are std. Reply
  • willis936 - Monday, September 17, 2018 - link

    A heatmap would still be used. The color alone has no meaning unless you know how many hits there are total. Even that is useless if you want to build a bathtub. The colorbar would actually tell you how many hits are in a region. This applies to all heatmaps. Reply
  • casperes1996 - Sunday, September 16, 2018 - link

    Wow... I just started a computer science education recently, and I was recently tasked with implementing an efficient search algorithm that works on infinitely long data streams. I made it so it first checks for an upper boundary in the array, (and updates the lower boundary based on the upper one) and then does a binary search on that subarray. I feel like there's no better time to read this article since it talks about the BVH. I felt so clever when I read it and thought "That sounds a lot like a binary search" before the article then mentioned it itself!
  • ballsystemlord - Sunday, September 16, 2018 - link

    You made only 1 typo! Great job!

    "In any case, as most silicon design firms hvae leapfrogging design teams,"
    Should be "have":
    "In any case, as most silicon design firms have leapfrogging design teams,"

    There is one more problem (stray 2 letter word), in your article, but I forgot where it was. Sorry.
  • Sherlock - Monday, September 17, 2018 - link

    The fact that Microsoft has released a Ray Tracing specific API seems to suggest that the next XBox will support it. And considering AMD is the CPU/GPU partner for the next gen XBox - it seems highly likely that the next gen AMD GPUs will have dedicated Ray Tracing hardware as well. I expect meaningful use of these hardware features only once the next gen console hardware is released - which is due in the next 2-3 years. RTX seems a wasteful expenditure for the end-consumer now. The only motivation for NVidia to release this now is so that consumers don't feel as if they are behind the curve against AMD. This gives some credence to the rumors that Nvidia will release a "GTX" line and expect it to be their volume selling product - with the RTX as proof-of-concept for early adopters.
  • bebby - Monday, September 17, 2018 - link

    Very good point from Sherlock. I also believe that Sony and Microsoft will be the ones defining what kind of hardware features will be used and which not.
    In general, with Moore's Law slowing down, progress gets slower and the incremental improvements are minimal. The result is that there is less competition, prices go up and there is no longer any "wow" effect coming with a new GPU. (last time I had this was with the 470gtx)
    My disappointment lies with the power consumption. Nvidia should focus more on power consumption rather than performance if they ever want to have a decent market share in tablets/phablets.
  • levizx - Monday, September 17, 2018 - link

    Actually the efficiency increased only 18%, not 23%. 150% / 127% - 1 = 18.11%. You can't just take 50% - 27% = 23%; the efficiency increase is relative to "without optimization", i.e. 127%.
  • rrinker - Monday, September 17, 2018 - link

    91 comments (as I type this) and most of them are arguing over "boo hoo, it's too expensive" Well, if it's too expensive - don't buy it. Why complain? Oh yeah, because Internet. This is NOT just the same old GPU, just a little faster - this is something completely different, or at least, something with significant differences to the current crop of traditional GPUs. There's no surprise that it's going to be more expensive - if you are shocked at the price then you really MUST be living under a rock. The first new ANYTHING is always premium priced - there is no competition, it's a unique product, and there are a lot of development costs involved. CAN they sell it for less? Most likely, but their job is not to sell it for the lowest possible profit, it's to sell it for what the market will bear. Simple as that. Don't like it, don't buy it. Absolutely NO ONE needs the latest and greatest on launch day. I won't be buying one of these, I do nothing that would benefit from the new features. Maybe in a few years, when everyone has raytracing, and the games I want to play require it - then I'll buy a card like this. Griping about pricing on something you don't need - priceless.
  • eddman - Tuesday, September 18, 2018 - link

    ... except this is the first time we've had such a massive price jump in the past 18 years. Even 8800 series, which according to jensen was the biggest technology jump before 20 series, launched at about the same MSRP as the last gen.

    It does HYBRID, partial, limited ray-tracing, and? How does that justify such a massive price jump? If these cards are supposed to be the replacements for pascals, then they are damn overpriced COMPARED to them. This is not how generational pricing is supposed to be.

    If these are supposed to be a new category, then why name them like that? Why not go with 2090 Ti, 2090, or something along those lines.

    Since they haven't done that and considering they left and right compare these cards to pascal cards even in regular rasterized games, then I have to conclude they consider them generational replacements.
