Alongside a sneak peek at their forthcoming Xe-HPG architecture, the other big reveal today from Intel’s consumer graphics group comes from the software side of the business. Along with preparing Intel’s software stack for the 2022 launch of the first Arc products, the group has also been hard at work at their own take on modern, neural net-driven image upscaling techniques. The product of that research is Xe Super Sampling, or XeSS, which Intel is pitching as the best solution yet for high image quality and low processing cost image upscaling.

As briefly hinted at by Intel at the start of this week with the announcement of their Arc video card brand, the company has been developing their own take on image upscaling. As it turns out, they’re actually quite far along, so for today they’re not just announcing XeSS, but they are showing off footage of the technology as well. Even better, the initial version of the SDK will be shipping to game developers later this month.

XeSS (pronounced “ex-ee-ess-ess”) is, at a high level, a combination spatial and temporal AI image upscaling technique, which uses trained neural networks to integrate both image and motion data in order to produce a superior, higher resolution image. This is a field that has seen a great deal of research in the last half-decade, and was brought to the forefront of the consumer space a couple of years ago by NVIDIA with their DLSS technology. Intel’s XeSS technology, in turn, is designed to address similar use cases, and from a technical perspective ends up looking a lot like NVIDIA’s current DLSS 2.x technology.

As with NVIDIA and AMD, Intel is looking to have their cake and eat it too with respect to graphics rendering performance. 4K monitors are increasingly cheap and plentiful, but the kind of performance needed to natively render modern AAA games at 4K is outside the reach of all but the most expensive discrete video cards. The desire to drive these 4K monitors with more modest video cards, and without the traditional drop in image quality, has spurred recent research into smart image upscaling techniques – and ultimately produced DLSS, FSR, and now XeSS.

In choosing their approach, Intel seems to have gone in a similar direction as NVIDIA’s second attempt at DLSS. Which is to say, they’re using a combination of spatial data (neighboring pixels) and temporal data (motion vectors from previous frames) to feed a (seemingly generic) neural network that has been pre-trained to upscale frames from video games. Like many other aspects of today’s GPU-related announcements, Intel isn’t going into too much detail here. So there are plenty of outstanding questions about how XeSS handles ghosting, aliasing, and other artifacts that can arise from these upscaling solutions. With that said, what Intel is promising isn’t something that’s out of their reach if they’ve really done their homework.
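The general shape of such a spatial-plus-temporal upscaler can be sketched in a few lines. This is a hypothetical illustration of the technique as a class, not Intel's implementation; in XeSS the fixed blend factor below would effectively be replaced by the trained network's per-pixel decisions about how much history to trust (which is also where ghosting suppression happens).

```python
# Sketch of the temporal-feedback loop underlying XeSS/DLSS 2.x-style
# upscalers (illustrative only): each output pixel blends the freshly
# rendered sample with a history value re-projected via its motion vector.

def reproject(history, mv, x, y, w, h):
    """Fetch last frame's value at the position this pixel came from."""
    px, py = x - mv[0], y - mv[1]
    if 0 <= px < w and 0 <= py < h:
        return history[py][px]
    return None  # off-screen: no usable history for this pixel

def accumulate(current, history, motion, w, h, alpha=0.1):
    """Blend current samples with reprojected history (exponential average)."""
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            prev = reproject(history, motion[y][x], x, y, w, h)
            if prev is None:
                out[y][x] = current[y][x]  # fall back to the raw sample
            else:
                out[y][x] = alpha * current[y][x] + (1 - alpha) * prev
    return out
```

A fixed `alpha` like this is exactly what produces the ghosting and aliasing artifacts mentioned above, which is why the neural network's job is, in large part, deciding when the history is stale.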

Meanwhile, given the use of a neural network to handle parts of the upscaling process, it should come as no surprise that XeSS is designed to leverage Intel’s new XMX matrix math units, which are making their debut in the Xe-HPG graphics architecture. As we saw in our sneak peek there, Intel is baking quite a bit of matrix math performance into their hardware, and the company is no doubt interested in putting it to good use. Neural network-based image upscaling techniques remain one of the best ways to use that hardware in a gaming context, as the workload maps well to these systolic arrays, and their high performance keeps the overall hit to frame rendering times small.

With that said, Intel has gone one step further and is also developing a version of XeSS that doesn’t require dedicated matrix math hardware. Owing to the fact that the installation base for their matrix hardware is starting from 0, that they’d like to be able to use XeSS on Xe-LP integrated graphics, and that they want to do everything possible to encourage game developers to adopt their XeSS technology, the company is developing a version of XeSS that instead uses the 4-element vector dot product (DP4a) instruction. DP4a support is found in Xe-LP along with the past few generations of discrete GPUs, making its presence near-ubiquitous. And while DP4a still doesn’t offer the kind of performance that a dedicated systolic array does – or the same range of precisions, for that matter – it’s a faster way to do math that’s good enough for a somewhat slower (and likely somewhat duller) version of XeSS.
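For those unfamiliar with the instruction, DP4a's semantics are simple enough to sketch in a few lines. This shows the signed-integer form (unsigned and mixed-sign variants also exist on various GPUs); one instruction does the work of four multiplies and four adds, which is why it's a useful, if slower-than-XMX, substrate for int8 neural network inference.

```python
# Emulation of the DP4a instruction: a dot product of two vectors of four
# signed 8-bit integers, accumulated into a 32-bit integer.

def dp4a(a, b, acc):
    """Return acc + dot(a, b) for 4-wide int8 vectors (signed variant)."""
    assert len(a) == len(b) == 4
    for lane in list(a) + list(b):
        assert -128 <= lane <= 127, "DP4a operates on 8-bit lanes"
    return acc + sum(ai * bi for ai, bi in zip(a, b))

# e.g. dp4a([1, 2, 3, 4], [5, 6, 7, 8], 0) -> 5 + 12 + 21 + 32 = 70
```

In effect, the DP4a path trades the XMX units' wide systolic throughput for an operation that nearly every recent GPU can already execute in its standard ALUs.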

By offering a DP4a version of XeSS, game developers will be able to use XeSS on virtually all modern hardware, including competing hardware. In that respect Intel is taking a page from AMD’s playbook, targeting their own hardware while also letting customers of competitors benefit from this technology – even if by not quite as much. Ideally, that will be a powerful carrot to entice game developers to implement XeSS in addition to (or even in place of) other upscaling techniques. And while we won’t put the cart before the horse, should XeSS live up to all of Intel’s performance and image quality claims, then Intel would be in the unique position of being able to offer the best of both worlds: an upscaling technology with wide compatibility like AMD’s FSR and the image quality of NVIDIA’s DLSS.

As an added kicker, Intel is also planning on eventually open sourcing the XeSS SDK and tools. At this juncture there are no further details on their commitment – presumably, they want to finish and refine XeSS before releasing their tech to the world – but this would be a further feather in Intel’s cap if they can deliver on that promise as well.

In the meantime, game developers will be able to get their first look at the technology later this month, when Intel releases the initial, XMX-only version of the XeSS SDK. This will be followed by the DP4a version, which will be released later this year.

Finally, along with today’s technology disclosure Intel has also posted some videos of XeSS in action, using an early version of the technology baked into a custom Unreal Engine demo. The minute or so of footage shows several image quality comparisons between native 4K rendering and XeSS, which is upscaling from a native 1080p image.

As with all vendor demos, Intel’s should be taken with a suitable grain of salt. We don’t have any specific framerate data to go with, and Intel’s demo is fairly limited. In particular, I would have liked to see something with more object motion – which tends to be harder on these upscalers – but for now, it is what it is.

With all of that said, at first glance the image quality with XeSS is quite good. In some respects it’s almost suspiciously good; as Ian quickly picked up on, the clarity of the “ventilation” text in the demo footage nearly rivals the native 4K render, making it massively clearer than the illegible mess on the original 1080p frame. This is solid evidence that as part of XeSS Intel is also doing something outside the scope of image upscaling to improve texture clarity, possibly by enforcing a negative LOD bias on the game engine.
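To illustrate why a negative LOD bias would produce exactly this effect – and to be clear, whether XeSS actually does this is our speculation above, not something Intel has confirmed – here is a simplified model of how a GPU selects a texture mip level, and how a bias shifts that selection toward a sharper mip.

```python
import math

# Simplified mip selection: the GPU derives a level-of-detail from the
# screen-space texel footprint of a pixel; each mip level halves texture
# resolution. A negative bias pushes selection toward higher-detail mips.

def mip_level(texels_per_pixel, lod_bias=0.0, max_mip=10):
    """Pick a (fractional) mip level for a given texel footprint."""
    level = math.log2(max(texels_per_pixel, 1.0)) + lod_bias
    return min(max(level, 0.0), max_mip)

# Rendering at 1080p for a 4K output doubles the texel footprint per pixel
# in each axis versus native 4K, i.e. +1 mip level; a bias of -1 restores
# the mip the native 4K render would have sampled.
```

Under this model, an upscaler that knows its output resolution can bias the render so texture detail matches the target resolution rather than the internal one, which would explain text that looks sharper than the 1080p input should allow.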

In any case, like the rest of Intel’s forthcoming slate of GPU technologies, this won’t be the last we hear of XeSS. What Intel is demonstrating so far certainly looks promising, but it’s going to be their ability to deliver on those promises to both game developers and gamers that will matter in the end. And if Intel can indeed deliver, then they’re set to become a very welcome third player in the image upscaling technology race.

Performance Improvements For Intel’s Core Graphics Driver

Last but not least, while XeSS was the star of the show for Intel’s graphics software group, the company also delivered a brief update on the state of their core graphics driver that included a few interesting tidbits.

As a quick refresher, Intel these days is using a unified core graphics driver for their entire slate of modern GPUs. As a result, the work that has gone into the driver to prepare it for the launch of Xe-HPG can benefit existing Intel products (e.g. Xe-LP), and improvements made for current products get fed into the driver that will underpin future Xe-HPG products. While this is no different than how rival AMD operates, Intel’s expansion into discrete graphics has meant that the company has needed to re-focus on the state of their graphics driver. What was good enough for an integrated product in terms of performance and features will not cut it in the discrete graphics space, where customers spending hundreds of dollars on a video card will have higher expectations on both fronts.

Of recent note, Intel has completed a significant overhaul of both its GPU memory manager and its shader compiler. The net impact of these changes includes improving game loading times by up to 25%, and improving the throughput of CPU-bound games by up to 18%. In the case of the former, Intel got there by getting smarter about how and where they compile shaders – including eliminating redundant compilations and doing a better job of scheduling compiler threads. As well, Intel has also refactored parts of their memory management code to better optimize the VRAM utilization of their discrete graphics products. Intel of course just launched their first discrete product earlier this year with DG1, so this is a good example of the kind of additional optimization work facing Intel as they branch out into discrete graphics.

Finally, for features and functionality, the software group is also planning on releasing a suite of new driver features. Chief among these will be integrating all of their performance and overclocking controls directly into the company’s Graphics Command Center application. Intel will also be taking a page from NVIDIA and AMD’s current feature sets by adding new features for game streamers, including a fast stream capture path using Intel’s QuickSync encoder, automatic game highlights, and support for AI-assisted cameras. These features should be ready in time for the Intel Arc launch in Q1 of next year.


  • Mat3 - Wednesday, September 1, 2021 - link

    His comment doesn't imply that he believes the 6700 has tensor cores. He is saying that low and midrange cards are more expensive than they used to be (from current supply not meeting demand, and ever more complex manufacturing), and making the cores bigger to accommodate tensor cores is not going to help.

    The extra die space to have the ALUs do packed math is trivial compared to tensor cores.
    Reply
  • mikeztm - Friday, August 20, 2021 - link

    It is mathematically better compared to native resolutions.
    It's a proven fact and you are just ignoring it.

    It's no magic here, just by rendering jittered 4 frames and combine them into 1.
    For a 1440p render this is literally rendering 1 5120x2880 frame across 4 frames and use motion vector to compensate the moving parts.
    If you stand still it is literally a 5120x2880 5k resolution and no doubt this will be better than a 4k native.
    Reply
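The sample-count arithmetic in the comment above can be checked with a short sketch. This is an idealized illustration (static scene, perfect reprojection, and a simple 2x2 grid jitter; real temporal upscalers use longer, non-grid jitter sequences such as Halton patterns):

```python
# Over a 4-frame jitter cycle, each source pixel is sampled at 4 distinct
# subpixel offsets, supplying exactly the sample positions of a grid with
# twice the resolution in each axis.

JITTER = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]  # in source pixels

def sample_positions(width, height, jitter):
    """All unique sample positions accumulated over one jitter cycle."""
    return {(x + jx, y + jy)
            for jx, jy in jitter
            for y in range(height)
            for x in range(width)}

# A tiny 4x4 example: four jittered frames cover an 8x8 grid of positions.
assert len(sample_positions(4, 4, JITTER)) == 8 * 8

# Scaled up: four 2560x1440 frames supply 5120x2880 worth of samples.
assert 4 * (2560 * 1440) == 5120 * 2880
```

As the replies below note, this equivalence only holds for static content; once things move, the quality of the motion-vector reprojection decides how much of that effective resolution survives.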
  • mode_13h - Saturday, August 21, 2021 - link

    > It is mathematically better compared to native resolutions.

    This can only be true, in general, if the native 4k render was done at a lower quality setting, like no-AA.

    > It's no magic here, just by rendering jittered 4 frames and combine them into 1.

    If they rendered the same frame 4 times, that would be *slower* than native rendering. So, I gather you mean using TAA-like techniques to combine a new frame with the previous 3.

    > For a 1440p render this is literally rendering 1 5120x2880 frame
    > across 4 frames and use motion vector to compensate the moving parts.

    Not exactly, because the samples don't line up quite that neatly.

    > If you stand still it is literally a 5120x2880

    Well, that is the simplest case. The real test is going to be how it handles something like walking through a field of swaying grass.

    Flashing and rapidly-moving light sources are going to be another challenge.
    Reply
  • Bik - Saturday, August 21, 2021 - link

    Yep, it won't surpass native resolution. But in our case, if a model is trained on 16k frames on servers, the inference can produce 4k images that go beyond 4k native. Reply
  • flyingpants265 - Wednesday, September 15, 2021 - link

    I think you're right, these things might be here to stay. Higher resolutions are pretty taxing, 4k@60hz and especially 4k*144hz, which is 9.6 TIMES the pixels per second, versus a 1080p*60hz. 9.6 TIMES, that's not gonna work unless you want to bring back quad SLI or something (no thanks). Plus we have consoles to deal with.

    All this stuff will look good on screens. It's just a way to achieve better performance per dollar (which won't necessarily benefit us because they'll just raise prices on cards that can upscale), or per watt. And a way to render on 4k screens. I like 1440p screens but TVs use 4k, and anyway for PC the extra resolution is nice for a lot of things besides gaming.

    The scaling hardware might end up useless in 5 years though, as more of these newer scaling things come out.

    I would like to see a demo that switches a game, while playing, between native 4k and each resolution (from 240p native, to 4k, with every single resolution/upscaling option in between), and sorted by framerate. And then allow the user to switch on-the-fly with F1 and F2 or something. Kind of like the GamersNexus demos of DLSS, except instead of zooming in on images you'd just do it in game.

    I'm mostly concerned with the input lag from the scaling.
    Reply
  • mode_13h - Friday, August 20, 2021 - link

    > quality degenerative scaling

    As far as scaling goes, it's a lot better than you'd get with naive methods. So, I reject the construction "quality degenerative scaling", unless you mean that any scaled image is degenerative by comparison with native rendering @ the same quality level.

    > image tearing techniques?

    Image-tearing has a precise definition, and it's not what you mean. I think you mean "image-degrading".

    And the answer is simple. As the article states, 4k monitors and TVs are pretty cheap and kinda hard to resist. Games are using ever more sophisticated rendering techniques to improve realism and graphical sophistication. So, it's difficult for GPUs ever to get far enough ahead of developers that a mid/low-end GPU could be used to render 4k. For those folks, high-quality scaling is simply the least bad option.
    Reply
  • edzieba - Friday, August 20, 2021 - link

    Once you dip your toes into even the basics of sampling theorem and 3D image rendering (or 2D, with the exception of unscaled sprites), it becomes very clear that pixels are a very poor measure of image fidelity. The demand for 'integer scaling' in drivers is a demonstration that this misapprehension is widespread, despite the confusion, among those aware of how rendering works, as to why people are demanding the worst possible scaling algorithm.
    The idea that the pixels that pop out the end of the rendering pipeline are somehow sacrosanct compared to pixels that pass through postprocess filtering is absurd, given how much filtering and transformation they went through to reach the end of that pipeline in the first place. Anything other than a fixed-scale billboard with no anisotropic filtering will mean every single texture pixel on your screen is filtered before reaching any postprocessor, but even if you ignore textures altogether (and demand running in some mythical engine that only applies lighting to solid colour primitives, I guess?) you still have MSAA and SSAA causing an output pixel to be a result of combining multiple non-aligned (i.e. not on the 'pixel grid') samples, temporal back- and forward-propagation meaning frames are dependent on previous frames (and occasionally predictive of future frames), screen-space shadows and reflections being at vastly different sample rates than the 'main' render (and being resamples of that render anyway), etc.

    AI-based scalers are not mincing up your precious pixels. Those were ground down much earlier in the sausage-making render pipeline.
    Reply
  • mode_13h - Saturday, August 21, 2021 - link

    > The demand for 'integer scaling' in drivers is a demonstration
    > that this misapprehension is widespread

    I don't think there's any misapprehension. Integer scaling is mostly for old games that run at very low, fixed resolutions. It's just easier to look at big, chunky pixels than the blurry mess you get from using even a high-quality conventional scaler, at such magnification.

    However, some people have gone one better. There are both heuristic-based and neural-based techniques in use, that are far superior to simple integer scaling of old games.

    > AI-based scalers are not mincing up your precious pixels.

    Um, you mean @Kurosaki's precious pixels. I always thought the idea of DLSS had potential, even during the era of its problematic 1.0 version.
    Reply
  • edzieba - Sunday, August 22, 2021 - link

    >Integer scaling is mostly for old games that run at very low, fixed resolutions.

    For those games, where display on a CRT was the design intent, nearest-neighbour scaling is simply making the image worse than it needs to be for no good reason. Given that every game made in that period (anachronistic 'retro games' aside) intentionally took advantage of the intra-line, inter-line, and temporal blending effects of the CRTs they were viewed on, it is preferable to use one of the many CRT-emulation scalers available (not just 'scanline emulation', that's another anachronism that misses the point through misunderstanding the problem) than generating massive square 'pixels' that were intended by nobody actually creating them.

    >Um, you mean @Kurosaki's precious pixels. I always thought the idea of DLSS had potential, even during the era of its problematic 1.0 version.

    Yep, hit reply in the wrong nested comment
    Reply
  • mode_13h - Monday, August 23, 2021 - link

    > where display on a CRT was the design intent

    Well, more like "design reality", rather than some kind of self-imposed artistic limitation. And don't forget that the CRTs of that era were also smaller than the displays we use today.

    I'm no retro gamer, however. I'm just making the case as I understand it. I'm just not so quick to write off what some enthusiasts say they prefer.
    Reply
