As someone who analyzes GPUs for a living, one of the more vexing things in my life has been NVIDIA’s Maxwell architecture. The company’s 28nm refresh offered a huge performance-per-watt increase for only a modest die size increase, essentially allowing NVIDIA to offer a full generation’s performance improvement without a corresponding manufacturing improvement. We’ve had architectural updates on the same node before, but never anything quite like Maxwell.

The vexing aspect to me has been that while NVIDIA shared some details about how they improved Maxwell’s efficiency over Kepler, they have never disclosed all of the major improvements under the hood. We know, for example, that Maxwell implemented a significantly altered SM structure that was easier to reach peak utilization on, and thanks to its partitioning wasted much less power on interconnects. We also know that NVIDIA significantly increased the L2 cache size and did a number of low-level (transistor level) optimizations to the design. But NVIDIA has also held back information – the technical advantages that are their secret sauce – so I’ve never had a complete picture of how Maxwell compares to Kepler.

For a while now, a number of people have suspected that one of the ingredients of that secret sauce was that NVIDIA had applied some mobile power efficiency technologies to Maxwell. It was, after all, NVIDIA's first mobile-first GPU architecture, and now we have some data to back that up. Friend of AnandTech and all around tech guru David Kanter of Real World Tech has gone digging through Maxwell/Pascal, and in an article & video published this morning, he outlines how he has uncovered very convincing evidence that NVIDIA implemented a tile based rendering system with Maxwell.

In short, by playing around with some DirectX code specifically designed to look at triangle rasterization, he has come up with some solid evidence that NVIDIA’s handling of triangles has significantly changed since Kepler, and that their current method of triangle handling is consistent with a tile based renderer.
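Kanter’s actual DirectX test isn’t reproduced here, but the gist of it is easy to sketch: draw some large triangles with a shader that records the order in which pixels get filled, then look at the resulting pattern. The toy Python below is purely illustrative (a made-up software “rasterizer”, not the Real World Tech code); it shows why that pattern is so revealing, since an immediate-mode GPU sweeps each whole primitive across the render target in submission order, while a tiler visits the screen in small blocks.

```python
# Toy illustration only -- not Real World Tech's DirectX test. We "shade" a
# full-screen primitive and stamp each pixel with the order in which it was
# filled, which is essentially what a rasterization-order probe visualizes
# by mapping that order to a colour gradient.

W, H, TILE = 16, 8, 4  # tiny framebuffer and tile size so the output stays readable

def immediate_order():
    """Immediate-mode style: the primitive is swept scanline by scanline
    across the whole render target."""
    order = [[0] * W for _ in range(H)]
    n = 0
    for y in range(H):
        for x in range(W):
            order[y][x] = n
            n += 1
    return order

def tiled_order():
    """Tile-based style: the screen is carved into TILE x TILE blocks and
    each block is filled completely before moving on to the next."""
    order = [[0] * W for _ in range(H)]
    n = 0
    for ty in range(0, H, TILE):
        for tx in range(0, W, TILE):
            for y in range(ty, ty + TILE):
                for x in range(tx, tx + TILE):
                    order[y][x] = n
                    n += 1
    return order

if __name__ == "__main__":
    for label, grid in (("immediate", immediate_order()), ("tiled", tiled_order())):
        print(label)
        for row in grid:
            print(" ".join(f"{v:3d}" for v in row))
```

Printed out, the first grid counts up row by row across the full width, while the second counts up in 4x4 blocks; rendered as a colour gradient, the latter is exactly the kind of blocky pattern visible in the image below.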


NVIDIA Maxwell Architecture Rasterization Tiling Pattern (Image Courtesy: Real World Tech)

Tile based rendering is something we’ve seen for some time in the mobile space, with both Imagination’s PowerVR and ARM’s Mali implementing it. The significance of tiling is that by splitting a scene up into tiles, the GPU can rasterize it piece by piece almost entirely on die, as opposed to the more memory (and power) intensive process of rasterizing the entire frame at once via immediate mode rendering. The trade-off with tiling, and why it’s a bit surprising to see it here, is that the PC legacy is immediate mode rendering, and this is still how most applications expect PC GPUs to work. So implementing tile based rasterization on Maxwell means that NVIDIA has found a practical way to overcome the method’s drawbacks and its potential compatibility issues.
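To make the on-die angle concrete, here is a minimal sketch of the binning step at the heart of a tile-based approach. It is written in Python with a deliberately simplified, hypothetical Triangle type; real hardware binners test actual triangle edges and manage bin memory far more carefully. The point is simply that once triangles are sorted into the screen tiles they touch, each tile can be shaded entirely from that small working set, and its colour and depth values only need to be written out to memory once.

```python
from dataclasses import dataclass

TILE = 16  # tile edge in pixels; hardware picks this to fit its on-die storage

@dataclass
class Triangle:
    # Hypothetical minimal triangle: just its screen-space bounding box.
    min_x: int
    min_y: int
    max_x: int
    max_y: int

def bin_triangles(triangles):
    """Assign each triangle to every tile its bounding box overlaps.
    (A real binner tests the triangle's edges, not just its box.)"""
    bins = {}
    for tri in triangles:
        for ty in range(tri.min_y // TILE, tri.max_y // TILE + 1):
            for tx in range(tri.min_x // TILE, tri.max_x // TILE + 1):
                bins.setdefault((tx, ty), []).append(tri)
    return bins

def render_tiled(triangles):
    for (tx, ty), tile_tris in sorted(bin_triangles(triangles).items()):
        # Only this tile's colour/depth values need to sit in fast on-die
        # memory while its triangles are shaded; the finished tile goes out
        # to DRAM once, instead of being re-read and re-written every time
        # an overlapping triangle arrives, as immediate mode rendering does.
        for tri in tile_tris:
            pass  # rasterize tri clipped to tile (tx, ty) -- omitted here

# Two overlapping triangles on a 64x64 target: the tiles in the middle
# receive both, the corners only one.
tris = [Triangle(0, 0, 40, 40), Triangle(24, 24, 63, 63)]
print({tile: len(t) for tile, t in sorted(bin_triangles(tris).items())})
```

None of this says anything about how NVIDIA’s hardware actually schedules its tiles, of course; it is only meant to show why keeping the working set to a tile at a time saves so much memory traffic.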

In any case, Real World Tech’s article goes into greater detail about what’s going on, so I won’t spoil it further. But with this information in hand, we now have a more complete picture of how Maxwell (and Pascal) work, and consequently how NVIDIA was able to improve over Kepler by so much. Finally, at this point in time Real World Tech believes that NVIDIA is the only PC GPU manufacturer to use tile based rasterization, which also helps to explain some of NVIDIA’s current advantages over Intel’s and AMD’s GPU architectures, and gives us an idea of what we may see them do in the future.

Source: Real World Tech

Comments

  • piroroadkill - Monday, August 1, 2016 - link

    It's funny, because ATI practically introduced tessellation to consumer graphics cards, but no game developers took advantage of it at the time, so work on it was stopped.

    Look up TruForm.
  • extide - Monday, August 1, 2016 - link

    Yeah, I think the original Radeon (R100) had the earliest implementation of TruForm, ATI's tessellator
  • Alexvrb - Monday, August 1, 2016 - link

    Very true. They were ahead of the times... and it bit them in the behind. Oh well. They're doing well enough with the 480 I think, which puts them on solid footing for the foreseeable future.
  • mr_tawan - Tuesday, August 2, 2016 - link

    ATi were bad at developer relations. That's the reason why no one made use of ATi's shiny tech (it's not only TruForm that suffered from this). AMD is now trying to recover from that by open sourcing their libraries and such. I am excited to see more from them.

    Game coding is hard, and if you're throwing new tech into it, it becomes much harder. That's one of the reasons why game developers stick with things tried and true (AFAIK many games' codebases are still C99 or even C89). If they want those devs to move, they have to lobby them one way or another. That's what Nvidia has been doing very well for a long time. Having only a shiny tech demo doesn't generate any excitement among those devs....
  • Scali - Tuesday, August 2, 2016 - link

    "ATi were bad with developer relationship. That's the reason why no one made use of ATi shiny tech (it's not only TruForm that suffer from this)."

    TruForm was a standard feature of the D3D API: N-patches. So were the GeForce 3's competing RT-patches, which were heard of about as little as TruForm, perhaps even less.
    See: https://msdn.microsoft.com/en-us/library/windows/d...
    The reason is that both technologies were extremely limited, and it was difficult enough to design game content so that it looked good with N-patches or RT-patches, let alone content that looked good with N-patches, RT-patches and no patches at all. And the market your game targets will contain a combination of all three types of hardware.
  • wumpus - Tuesday, August 2, 2016 - link

    It isn't just that, it's market share as well (although if you are talking about Mantle, at least all modern consoles are driven by Mantle-derived systems). If the market is overwhelmingly Nvidia, you don't program for AMD.
  • Scali - Tuesday, August 2, 2016 - link

    "all modern consoles are driven by Mantle-derived systems"

    The opposite actually: both Microsoft and Sony developed their own APIs with low-level features, long before Mantle arrived.
  • KillBoY_UK - Monday, August 1, 2016 - link

    and when the tables turned, AMD fans suddenly said they didn't care about PPW and general power usage lol
  • Mr.AMD - Monday, August 1, 2016 - link

    I truly didn't, and you NV fans could also not care.....
    Because all GPUs use more power when you OC them; NVIDIA cards also used more than 300 watts in full OC. Not that I care, true performance lies in stock cards. If stock performance is good, why the need for OC? Cards become more power hungry, run hotter, have a shorter life....
  • Budburnicus - Tuesday, May 2, 2017 - link

    "Not that i care, true performance lay in stock cards. If stock performance is good why the need for OC? Cards become more power hungry, run hotter, shorter life...."
    LMAO OC does NOT shorten life significantly for any card. Only stupid noobs say that shit. Because A) Whatever amount your OC does shorten the lifespan by is insignificant! I still have a GTX 460 in my basement that I had OCed and running for 6 years in one PC or another, similar deal with a GTX 560 Ti - both of which are MASSIVELY hot and power hungry chips - and even with ~10%+ OCs running on both of them for YEARS, they both still run at those same OCs. Point being - would YOU want to still be using a GTX 460 or 560 Ti today?? Because I sure as hell would not! Even with my CURRENT G1 Gaming GTX 980 Ti at 1505 core, 8 GHz VRAM (8.5 TFLOPS, 145 Gpixel/sec, and 265 Gtexel/sec at this frequency, which it runs 100% of the time under full graphics load) - this GPU will last me FAR LONGER than I will ever need it to! By the time it is likely to have ANY FAILURE it will be nearly obsolete! And B) If you are SMART ABOUT OVERCLOCKING then you will KNOW the safe operating temperatures of your GPU and all of its components! As long as you are NEVER going higher than the approximately 85 C upper safe range for ANY GTX 900 series GPU - you are not doing ANY SIGNIFICANT DAMAGE!

    As for "Not that I care, true performance lies in stock cards" - BAHAHAHAHAHAH what are you smoking m8? Can I have some?

    980 Ti stock = 6 TFLOPS at a reference boost of just 1076 MHz, with nearly exactly 96 Gpixel/sec, 176 Gtexel/sec and just 336 GB/sec of VRAM bandwidth

    G1 Gaming 980 Ti @ 1505 MHz core and 8 GHz VRAM = 8.476 TFLOPS, 144.5 Gpixel/sec, 264.9 Gtexel/sec, 384.4 GB/sec!

    ALL while it NEVER gets hotter than just 67 C! Thermocouples showed not more than 78 C on the VRM, not more than 70 on the VRAM, and never more than 80 C at any single heat point on the ENTIRE card! Now you probably don't know ANY of this, but the safe temps for the VRM FETs used on the G1 Gaming are in excess of 125 C - and this is rather standard for FETs.

    Even WITH a 475 watt system power draw under gaming, the cost per hour of my PC is minuscule in my state.

    Yes my GPU sucks A TON of power when I am GAMING! But when just watching Netflix, surfing etc, my ENTIRE PC never draws more than 110 watts at the wall. Under a HEAVY load like Prime95 and Furmark that does jump as high as 830 watts, but in gaming it never exceeds about 475 watts of total system power draw. And when you take into consideration the massive OC on my i7-3770K at 4.7 GHz, the four 2.5" SSDs (3x RAID 0), one 4 TB HDD, 8 case fans and 2 CPU fans, as well as MANY USB peripherals, all of this is rather normal.

    Also, I have an i7-2600K that is and has been running at 4.5 GHz since the DAY it was bought in its RELEASE month, Jan 2011 - it still runs flawlessly. And your argument TOTALLY DIES with the fact that Pentium 4 CPUs still work today!

    As long as heat and power are REASONABLE - you are NOT DEGRADING THE LIFESPAN OF ANY PC PART BY ANY SIGNIFICANT AMOUNT OF TIME! The Part you are OCing will be LONG old and nearly obsolete before the effects of Overclocking take ANY TOLL!

    AND electronics just sometimes break. But modern parts will NOT BREAK from excessive heat or power, they will shut down before that becomes an issue. You can download MSI Afterburner, and just MAX out EVERY SLIDER - it will NOT KILL your GPU unless it had a weak or failing component to begin with - but what it WILL do is shut down the MOMENT you put it under stress! Same goes for CPU frequencies and BIOS OC settings!
