As someone who analyzes GPUs for a living, one of the more vexing things in my life has been NVIDIA’s Maxwell architecture. The company’s 28nm refresh offered a huge performance-per-watt increase for only a modest die size increase, essentially allowing NVIDIA to offer a full generation’s performance improvement without a corresponding manufacturing improvement. We’ve had architectural updates on the same node before, but never anything quite like Maxwell.

The vexing aspect to me has been that while NVIDIA shared some details about how they improved Maxwell’s efficiency over Kepler, they have never disclosed all of the major improvements under the hood. We know, for example, that Maxwell implemented a significantly altered SM structure that was easier to reach peak utilization on, and thanks to its partitioning wasted much less power on interconnects. We also know that NVIDIA significantly increased the L2 cache size and did a number of low-level (transistor level) optimizations to the design. But NVIDIA has also held back information – the technical advantages that are their secret sauce – so I’ve never had a complete picture of how Maxwell compares to Kepler.

For a while now, a number of people have suspected that one of the ingredients of that secret sauce was that NVIDIA had applied some mobile power efficiency technologies to Maxwell. It was, after all, their first mobile-first GPU architecture, and now we have some data to back that up. Friend of AnandTech and all-around tech guru David Kanter of Real World Tech has gone digging through Maxwell/Pascal, and in an article & video published this morning, he outlines how he has uncovered very convincing evidence that NVIDIA implemented a tile based rendering system with Maxwell.

In short, by playing around with some DirectX code specifically designed to look at triangle rasterization, he has come up with some solid evidence that NVIDIA’s handling of triangles has changed significantly since Kepler, and that their current method of triangle handling is consistent with a tile based renderer.
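
To make the probing idea concrete, here’s a rough CPU-side toy of how such a test can work (my own sketch for illustration, not Kanter’s actual DirectX code): every shaded pixel takes a ticket from a global atomic counter, much as a DirectX pixel shader can do with an atomic counter in a UAV, and the resulting per-pixel order map reveals whether the hardware walked the screen scanline-by-scanline or tile-by-tile. The 32×32 grid and 8×8 tile size are arbitrary illustrative values.

```cpp
// Toy rasterization-order probe. A real test renders a full-screen triangle
// and has the pixel shader record its visit order; here we just simulate the
// two traversal orders on the CPU and print the resulting order map.
#include <atomic>
#include <cstdio>
#include <vector>

constexpr int W = 32, H = 32, TILE = 8;

int main() {
    // Global ticket counter, atomic to mirror the GPU-side atomic counter.
    std::atomic<int> ticket{0};
    std::vector<int> order(W * H, -1);

    // Toggle between an immediate-mode style scanline walk and a tiled walk.
    const bool tiled = true;
    if (!tiled) {
        // Scanline order: visit pixels row by row across the whole screen.
        for (int y = 0; y < H; ++y)
            for (int x = 0; x < W; ++x)
                order[y * W + x] = ticket.fetch_add(1);
    } else {
        // Tiled order: finish each TILE x TILE block before moving on.
        for (int ty = 0; ty < H; ty += TILE)
            for (int tx = 0; tx < W; tx += TILE)
                for (int y = ty; y < ty + TILE; ++y)
                    for (int x = tx; x < tx + TILE; ++x)
                        order[y * W + x] = ticket.fetch_add(1);
    }

    // Coarse view of the order map: pixels sharing a digit were visited
    // close together in time. Tiling shows up as solid square blocks.
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x)
            printf("%d", order[y * W + x] * 10 / (W * H));
        printf("\n");
    }
    return 0;
}
```

With tiled traversal the printed digits cluster into solid square blocks rather than horizontal bands, and a block-shaped pattern like that in real captured output is exactly the kind of signature that gives a tile based rasterizer away.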


NVIDIA Maxwell Architecture Rasterization Tiling Pattern (Image Courtesy: Real World Tech)

Tile based rendering is something we’ve seen for some time in the mobile space, with both Imagination’s PowerVR and ARM’s Mali GPUs implementing it. The significance of tiling is that by splitting a scene up into tiles, the GPU can rasterize it piece by piece almost entirely on die, as opposed to the more memory (and power) intensive process of rasterizing the entire frame at once via immediate mode rendering. The trade-off with tiling, and why it’s a bit surprising to see it here, is that the PC’s legacy is immediate mode rendering, and this is still how most applications expect PC GPUs to work. So to implement tile based rasterization on Maxwell means that NVIDIA has found a practical means to overcome the drawbacks of the method and the potential compatibility issues.
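
As a back-of-the-envelope illustration of the bandwidth argument (the resolution, overdraw factor, and pixel size below are assumed figures for the sake of the example, not measurements of any real GPU), consider color writes alone:

```cpp
// Minimal sketch of why tiling cuts external memory traffic: with overdraw,
// an immediate mode renderer touches DRAM once per shaded fragment, while a
// tiler absorbs the overdraw in a small on-die tile buffer and writes each
// pixel out only once when the finished tile is flushed.
#include <cstdio>

int main() {
    const long   pixels     = 1920L * 1080;  // one frame
    const double overdraw   = 3.0;           // assumed fragments shaded per pixel
    const int    bytesPerPx = 4;             // RGBA8 color

    // Immediate mode: every fragment's color write goes out to the
    // framebuffer in DRAM (ignoring caches, blending reads, and depth).
    const double immediateBytes = pixels * overdraw * bytesPerPx;

    // Tile based: fragments hit the on-die tile buffer; DRAM sees only one
    // final write per pixel.
    const double tiledBytes = pixels * 1.0 * bytesPerPx;

    printf("immediate: %.1f MB of color writes\n", immediateBytes / 1e6);
    printf("tiled:     %.1f MB of color writes\n", tiledBytes / 1e6);
    return 0;
}
```

The higher the overdraw, the more the on-die tile buffer saves, which is the external-bandwidth (and thus power) reduction that makes tiling attractive in the first place.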

In any case, Real World Tech’s article goes into greater detail about what’s going on, so I won’t spoil it further. But with this information in hand, we now have a more complete picture of how Maxwell (and Pascal) work, and consequently how NVIDIA was able to improve over Kepler by so much. Finally, at this point in time Real World Tech believes that NVIDIA is the only PC GPU manufacturer to use tile based rasterization, which also helps to explain some of NVIDIA’s current advantages over Intel’s and AMD’s GPU architectures, and gives us an idea of what we may see them do in the future.

Source: Real World Tech

Comments

  • Jedi2155 - Monday, August 1, 2016 - link

    The funny thing is that this same technology was used to whoop the GeForce 2 GTS back in the day :)

    I recall the Hercules 3D Prophet 4500 was ~$150 versus ~$250 for the GeForce 2 GTS, yet the GTS still got spanked thanks to the tile based rendering.
    http://www.anandtech.com/show/735/10
  • Alexvrb - Monday, August 1, 2016 - link

    I had one. It was a great card for the money, though you needed a decent CPU to go with it. Unfortunately they fell behind and had to withdraw. It's a shame, as the 5500 had hardware T&L and DDR, along with a wider design and higher clocks. It nearly tripled the Kyro II / 4500 in fillrate and memory bandwidth. There were rumors of DX8 support and 128MB variants. But desktop graphics was a tough market to break into even back then.

    Interesting tidbit, they developed drivers with fairly efficient software T&L for the 4800 (Kyro II SE) which was delayed and eventually cancelled. However they did release said drivers for free to users of existing models as a nice going away present.
  • Cygni - Tuesday, August 2, 2016 - link

    I believe quite a few Kyro II SE boards still exist, occasionally popping up in collector circles, and they're actually functional. Gotta be up there with the various Voodoo 5 6000 dev boards as far as collectible cards go, since they actually work.

    I remember reading the Kyro II review on this very website and thinking that tile based was the future. Well, it's 15 years later, and Nvidia has finally made the move. Intel has been on board for a long time, as have all the cellphone and mobile designs, so I guess all that's really left is AMD/ATI.
  • asendra - Monday, August 1, 2016 - link

    Well, this explains why Kepler cards have been tanking so much in performance lately.
    They obviously weren't being specifically optimised for any more, but now it seems they weren't even getting the more "general" optimisations either.
  • Scali - Monday, August 1, 2016 - link

    What are you talking about? This article is about how the *hardware* in Maxwell differs. You can't expect NVidia to optimize the hardware of Kepler, which is already in the hands of customers, can you?
  • asendra - Monday, August 1, 2016 - link

    Mm, drivers? I thought that was clear enough. Drivers are in constant flux, with general optimizations and specific optimizations for new games.
    Lately, Kepler cards have been losing relative performance, and AMD cards that were trailing Kepler when released have been performing much better in newer games.

    Also, the gap between Kepler and Maxwell cards has kept increasing.
  • Scali - Monday, August 1, 2016 - link

    Yes, I know you are talking about drivers. That's exactly my point: what do drivers have to do with this article?
  • Yojimbo - Tuesday, August 2, 2016 - link

    http://www.hardwarecanucks.com/forum/hardware-canu...

    Kepler performance isn't tanking.
  • milli - Monday, August 1, 2016 - link

    So PowerVR was right all along? No surprise there.
    That the Kyro, a 12M transistor chip with less than half the bandwidth of the GeForce 256 DDR (a 23M transistor chip), could get so close to it should have made the company more successful. No such luck.
  • jabber - Monday, August 1, 2016 - link

    I've been pushing and holding a candle for tile based rendering since I had a PowerVR M3D card back around 1998. I remember MS stating big time, back before launch when the Xbox One was found to be quite a bit down on power compared to the PS4, that it would use tile rendering. Seems that didn't work out... much like the power of the cloud/off-site processing that was supposed to make the One many times more powerful.
