Software, Cont: ShadowPlay and "Reason Flags"

Alongside the game optimization service and SHIELD’s PC client, GeForce Experience has another service that’s scheduled to be added this summer. That service is called ShadowPlay, and not unlike SHIELD it’s intended to serve as a novel software application of some of the functionality present in NVIDIA’s latest hardware.

ShadowPlay will be NVIDIA’s take on video recording, the novel aspect being that NVIDIA is basing the utility around Kepler’s hardware H.264 encoder. To be clear, video recording software is nothing new; FRAPS, Afterburner, Precision X, and other utilities all do basically the same thing. However, all of those utilities work entirely in software, fetching frames from the GPU and then encoding them on the CPU. The overhead from this is not insignificant, especially due to the CPU time required for video encoding.

With ShadowPlay NVIDIA is looking to spur on software developers by getting into video recording themselves, and to provide superior performance by using hardware encoding. Notably, none of this was impossible prior to ShadowPlay, but for some reason recording utilities that use NVIDIA’s hardware H.264 encoder have been few and far between. Regardless, the end result should be that most of the overhead is removed by relying on the hardware encoder: the GPU is minimally affected, the CPU is freed up, less time is spent on data transit back to the CPU, and the resulting recordings are much smaller.
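To put the data transit point in rough numbers, here is a back-of-the-envelope sketch. The resolution, frame rate, pixel format, and encoded bitrate are all our own assumptions for illustration, not figures from NVIDIA.

```python
# Rough comparison of raw frame readback vs. an encoded stream.
# Every constant below is an assumed value chosen for illustration.
WIDTH, HEIGHT = 1920, 1080   # assumed capture resolution
BYTES_PER_PIXEL = 4          # assumed 32-bit frames
FPS = 60                     # assumed capture rate
ENCODED_MBPS = 50            # hypothetical H.264 bitrate, megabits per second


def raw_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Bytes per second that leave the GPU when raw frames are read back."""
    return width * height * bytes_per_pixel * fps


def encoded_bytes_per_second(megabits_per_second):
    """Bytes per second that leave the GPU when only the encoded stream does."""
    return megabits_per_second * 1_000_000 // 8


raw = raw_bytes_per_second(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS)
enc = encoded_bytes_per_second(ENCODED_MBPS)

print(f"raw readback:   {raw / 1e6:.1f} MB/s")
print(f"encoded stream: {enc / 1e6:.2f} MB/s")
print(f"reduction:      about {raw / enc:.0f}x")
```

Under these assumptions the encoded stream is a small fraction of the raw readback traffic, which is the effect described above. Actual numbers will depend on the capture settings and the bitrate ShadowPlay ends up using.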

ShadowPlay will feature multiple modes. Its manual mode will be analogous to FRAPS, recording whenever the user desires it. The second mode, shadow mode, is perhaps the more peculiar mode. Because the overhead of recording with the hardware H.264 encoder is so low, NVIDIA wants to simply record everything in a very DVR-like fashion. In shadow mode the utility keeps a rolling window of the last 20 minutes of footage, with the goal being that should something happen that the user decides they want to record after the fact, they can simply pull it out of the ShadowPlay buffer and save it. It’s perhaps a bit odd from the perspective of someone who doesn’t regularly record their gaming sessions, but it’s definitely a novel use of NVIDIA’s hardware H.264 encoder.
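The rolling window behind shadow mode can be pictured as a fixed-size ring buffer. The sketch below is only an illustration of that idea and not NVIDIA’s implementation; the `ShadowBuffer` class, its method names, and the one-second chunk size are all hypothetical.

```python
from collections import deque


class ShadowBuffer:
    """Rolling window of encoded chunks (hypothetical DVR-style buffer)."""

    def __init__(self, window_seconds=20 * 60, chunk_seconds=1):
        self.chunk_seconds = chunk_seconds
        # A deque with maxlen discards the oldest entry once it is full.
        self._chunks = deque(maxlen=window_seconds // chunk_seconds)

    def push(self, chunk):
        """Append the newest encoded chunk to the window."""
        self._chunks.append(chunk)

    def save_last(self, seconds):
        """Return the chunks covering up to the last `seconds` of footage."""
        count = min(len(self._chunks), seconds // self.chunk_seconds)
        if count == 0:
            return []
        return list(self._chunks)[-count:]


# Usage: a tiny 5-second window fed 10 one-second chunks.
buf = ShadowBuffer(window_seconds=5)
for second in range(10):
    buf.push(second)
print(buf.save_last(3))
```

The point of the structure is that recording cost stays constant no matter how long the session runs: old footage simply falls off the front of the buffer until the user asks to save something.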

NVIDIA hasn’t begun external beta testing of ShadowPlay yet, so for the moment all we have to work from is screenshots and descriptions. The big question right now is what the resulting quality will be like. NVIDIA’s hardware encoder does have some limitations that are necessary for real-time encoding, and as we’ve seen in past qualitative comparisons between NVIDIA’s encoder and offline H.264 encoders like x264, there is a quality tradeoff when everything has to be done in hardware in real time. As such ShadowPlay may not be the best tool for reference quality productions, but for the YouTube/Twitch.tv generation it should be more than enough.

Anyhow, ShadowPlay is expected to be released sometime this summer. But since 95% of the software ShadowPlay requires is also required for the SHIELD client, we wouldn’t be surprised if ShadowPlay were released shortly after a release-quality version of the SHIELD client is pushed out, which may come as early as June alongside the SHIELD release.

Reasons: Why NVIDIA Cards Throttle

The final software announcement from NVIDIA to coincide with the launch of the GTX 780 isn’t a software product in and of itself, but rather an expansion of NVIDIA’s 3rd party hardware monitoring API.

One of the common questions/complaints about GPU Boost that NVIDIA has received over the last year is about why a card isn’t boosting as high as it should be, or why it suddenly drops down a boost bin or two for no apparent reason. For technically minded users who know the various cards’ throttle points and specifications this isn’t too complex – just look at the power consumption, GPU load, and temperature – but that’s a bit much to ask of most users. So starting with the recently released 320.14 drivers, NVIDIA is exposing a selection of flags through their API that indicate what throttle point is causing throttling or otherwise holding back the card’s clockspeed. There isn’t an official name for these flags, but “reasons” is as good as anything else, so that’s what we’re going with.

The reason flags are a simple set of five binary flags that NVIDIA’s driver uses to indicate why it isn’t increasing the clockspeed of the card further. These flags are:

  • Temperature Limit – The card is at its temperature throttle point
  • Power Limit – The card is at its global power/TDP limit
  • Voltage Limit – The card is at its highest boost bin
  • Overvoltage Max Limit – The card is at its absolute maximum voltage limit (“if this were to occur, you’d be at risk of frying your GPU”)
  • Utilization Limit – The current workload is not high enough that boosting is necessary

As these are simple flags, it’s up to 3rd party utilities to decide how they want to present these flags. EVGA’s Precision X, which is NVIDIA’s utility of choice for sampling new features to the press, simply records the flags like it does the rest of the hardware monitoring data, and this is likely what most programs will do.
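As a sketch of how a monitoring utility might decode and log such flags, consider the following. NVIDIA has not given us the actual encoding, so the bit positions, the `ThrottleReason` name, and the `describe` helper are all hypothetical; only the five flag names come from the list above.

```python
from enum import IntFlag


class ThrottleReason(IntFlag):
    # Bit positions are hypothetical; only the flag names come from the article.
    TEMPERATURE_LIMIT = 1
    POWER_LIMIT = 2
    VOLTAGE_LIMIT = 4
    OVERVOLTAGE_MAX_LIMIT = 8
    UTILIZATION_LIMIT = 16


def describe(raw_value):
    """Return the names of all flags set in a raw sample, in declaration order."""
    return [flag.name for flag in ThrottleReason if flag & raw_value]


# Usage: log each (timestamp, raw flags) sample next to other monitoring data.
samples = [(0, 0b00010), (1, 0b00011), (2, 0b10000)]
for timestamp, raw_value in samples:
    print(timestamp, describe(raw_value) or ["none"])
```

Because the flags are independent bits, more than one can be set in the same sample, so a utility that logs them should record each flag separately rather than assume a single cause.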

With the reason flags, NVIDIA is hoping to help users better understand why their card isn’t boosting as high as they’d like it to. At the same time, the prevalence of GPU Boost 2.0 and its much higher reliance on temperature makes exposing this data all the more helpful, especially for overclockers who would like to know what attribute they need to turn up to unlock more performance.


155 Comments


  • aidivn - Thursday, May 23, 2013 - link

    so, how many Double Precision units are there in each SMX unit of gtx780? titan had 64 dp units in each of their SMX units which totaled to 896 dp units

    And can u turn them on or off from the forcewre driver menu like “CUDA – Double Precision” for gtx780?
  • Ryan Smith - Thursday, May 23, 2013 - link

    Hardware wise this is GK110, so the 64 DP units are there. But most of them would be disabled to get the 1/24 FP64 rate.
  • aidivn - Friday, May 24, 2013 - link

    so how many are disabled and how many are enabled (numbers please)?
  • Ryan Smith - Friday, May 24, 2013 - link

    You would have only 1/8th enabled. So 8 per SMX are enabled, while the other 56 are disabled.
  • aidivn - Saturday, May 25, 2013 - link

    so, the GTX780 only has 96 DP units enabled while the GTX TITAN has 896 DP units enabled...thats a huge cut on double precision
  • DanNeely - Sunday, May 26, 2013 - link

    That surprised me too. Previously the cards based on the G*100/110 cards were 1/8; this is a major hit vs the 580/480/280 series cards.
  • Old_Fogie_Late_Bloomer - Thursday, May 23, 2013 - link

    "GTX 780 on the other hand is a pure gaming/consumer part like the rest of the GeForce lineup, meaning NVIDIA has stripped it of Titan’s marquee compute feature: uncapped double precision (FP64) performance. As a result GTX 780 can offer 90% of GTX Titan’s gaming performance, but it can only offer a fraction of GTX Titan’s FP64 compute performance, topping out at 1/24th FP32 performance rather than 1/3rd like Titan."

    Seriously, this is just...it's asinine. Utterly asinine.
  • tipoo - Thursday, May 23, 2013 - link

    Market segmentation is nothing new. The Titan really is a steal if you need DP, the next card up is 2400 dollars.
  • Old_Fogie_Late_Bloomer - Thursday, May 23, 2013 - link

    I'm well aware of the existence of market segmentation, but this is just ridiculous. Putting ECC RAM on professional cards is segmentation. Disabling otherwise functional features of hardware, most likely in the software drivers...that's just...ugh.
  • SymphonyX7 - Thursday, May 23, 2013 - link

    I just noticed that the Radeon HD 7970 Ghz Edition has been trouncing the GTX 680 in most of the benchmarks and trailing the GTX 680 in those benchmarks that traditionally favored Kepler. What the heck just happened? Didn't the review of the Radeon HD 7970 Ghz Edition say that it was basically tied with the GTX 680?
