LucidLogix Virtu MVP Technology and HyperFormance

While not specifically a feature of the chipset, Z77 will be one of the first chipsets to use this remarkable new technology. LucidLogix was the brains behind the Hydra chip—a hardware/software combination solution to allow GPUs from different manufacturers to work together (as we reviewed the last iteration on the ECS P62H2-A). Lucid was also behind the original Virtu software, designed to allow a discrete GPU to remain idle until needed, and let the integrated GPU deal with the video output (as we reviewed with the ASUS P8Z68-V Pro). This time, we get to see Virtu MVP, a new technology designed to increase gaming performance.

To explain how Virtu MVP works, I am going to liberally condense what the Lucid whitepaper says about Virtu MVP; however, everyone is free to read what is a rather interesting ten pages.

The basic concept behind Virtu MVP is the relationship between how many frames per second the discrete GPU can calculate, against what is shown on the screen to the user, in an effort to increase the 'immersive experience'.

Each screen/monitor the user has comes with a refresh rate, typically 60 Hz, 75 Hz, or 120 Hz for 3D monitors (Hz = hertz, or times 'per second'). This means that, 60 times per second, the system will pull what is in the frame buffer (the part of the output that holds what the GPU has computed) and display it on the screen.

With standard V-Sync, the system will only pull out what is in the buffer at certain intervals, namely at integer divisors of the base refresh rate (e.g. 60, 30, 20, 15, 12, 10, 6, 5, 4, 3, 2, or 1 for 60 Hz), depending on the monitor being used. The issue is with what happens when the GPU is much faster (or slower) than the refresh rate.
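The divisor behaviour can be sketched in a few lines of Python; `vsync_rate` below is a hypothetical helper for illustration only, not part of any real driver or API:

```python
# Illustration only: classic (non-adaptive) V-Sync flips the buffer at
# integer divisors of the display's refresh rate.
REFRESH_HZ = 60

def vsync_rate(gpu_fps, refresh_hz=REFRESH_HZ):
    """Effective displayed rate under classic V-Sync: the largest divisor
    of the refresh rate that the GPU can still keep up with."""
    divisors = [refresh_hz // n for n in range(1, refresh_hz + 1)
                if refresh_hz % n == 0]  # 60, 30, 20, 15, 12, 10, ...
    for rate in divisors:  # descending order
        if gpu_fps >= rate:
            return rate
    return divisors[-1]  # GPU slower than 1 fps: pinned to the floor

print(vsync_rate(87))  # a GPU at 87 fps is locked to 60
print(vsync_rate(47))  # a GPU at 47 fps drops to 30
```

So a card rendering 47 fps is displayed at only 30 fps under strict V-Sync, which is exactly the gap adaptive schemes try to close.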

The key tenet of Lucid’s new technology is the term responsiveness. Responsiveness is a wide-ranging term which could mean many things. Lucid distils it into two key areas:

a) How many frames per second can the human eye see?
b) How many frames per second can the human hand respond to?

To clarify, these are NOT the same questions as:

i) How many frames per second do I need to make the motion look fluid?
ii) How many frames per second makes a movie stop flickering?
iii) What is the fastest frame (shortest time) a human eye would notice?

If the display refreshes at 60 Hz, and the game runs at 50 fps, would this need to be synchronized? Would a divisor of 60 Hz be better? Alternatively, if you were at 100 fps, would 60 fps be better? The other part of responsiveness is how a person deals with hand-eye coordination, and whether the human mind can correctly interpolate between a screen's refresh rate and the output of the GPU. While a ~25 Hz rate may be adequate for the human eye, the human hand can be sensitive to as much as 1000 Hz, so the correlation between hand movement and the eye is all-important for 'immersive' gaming.

Take the following scenarios:

Scenario 1: GPU is faster than Refresh Rate, VSync Off

Refresh rate: 60 Hz
GPU: 87 fps
Mouse/Keyboard responsiveness is 1-2 frames, or ~11.5 to 23 milliseconds
Effective responsiveness makes the game feel like it is between 42 and 85 FPS

In this case, the GPU is 45% faster than the screen. This means that as the GPU fills the frame buffer, it will continuously be between frames when the display dumps the buffer contents on screen, such that the buffer holds the tail of the old frame and the start of the new one.

This is a phenomenon known as tearing, which many of you are likely familiar with. Depending on the scenario you are in, tearing may be something you ignore, notice occasionally, or find rather annoying.

So the question becomes, was it worth computing that small amount of frame N+1 or N+3?
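As a toy model of the arithmetic above (using the assumed numbers from Scenario 1; `buffer_state` is a hypothetical helper, not Lucid's code), we can ask how much of the next frame is already sitting in the buffer at each refresh:

```python
# Toy model: a GPU at 87 fps feeding a 60 Hz display. At each refresh the
# buffer holds one complete frame plus a partial slice of the next, which
# is what shows up on screen as a tear line.
REFRESH_HZ, GPU_FPS = 60, 87

def buffer_state(tick, refresh_hz=REFRESH_HZ, gpu_fps=GPU_FPS):
    """Return (last complete frame index, fraction of the next frame
    already written into the buffer) at screen refresh number `tick`."""
    frames_done = tick * gpu_fps / refresh_hz
    return int(frames_done), frames_done - int(frames_done)

for tick in range(1, 5):
    whole, partial = buffer_state(tick)
    print(f"refresh {tick}: frame {whole} complete, "
          f"{partial:.0%} of frame {whole + 1} in the buffer")
```

The tear point drifts every refresh (45%, 90%, 35%, 80% of the next frame), which is why the tear line appears to crawl up or down the screen.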

Scenario 2: GPU is slower than Refresh Rate, VSync Off

Refresh rate: 60 Hz
GPU: 47 fps
Mouse/Keyboard responsiveness is 1-2 frames, or ~21.3 to 43 milliseconds
Effective responsiveness makes the game feel like it is between 25 and 47 FPS

In this case, the GPU is roughly 22% slower than the screen. This means that, as the GPU fills the frame buffer more slowly than the screen requests it, the display will continuously catch the buffer between frames when it dumps the contents on screen, such that the buffer holds parts of both the old frame and the new frame.

So does this mean that for a better experience, computing frame N+1 was not needed, and N+2 should have been the focus of computation?
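The millisecond figures quoted in both scenarios are simply one to two GPU frame times converted into milliseconds; a minimal sketch (with a hypothetical `responsiveness_ms` helper) reproduces them:

```python
# Minimal sketch: input responsiveness quoted as 1-2 GPU frame times,
# converted from frames per second into milliseconds.

def responsiveness_ms(gpu_fps, frames=(1, 2)):
    """Input-latency window in milliseconds at the given GPU frame rate."""
    frame_time_ms = 1000.0 / gpu_fps
    return tuple(round(n * frame_time_ms, 1) for n in frames)

print(responsiveness_ms(87))  # Scenario 1: (11.5, 23.0) ms
print(responsiveness_ms(47))  # Scenario 2: (21.3, 42.6) ms
```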

Scenario 3: GPU can handle the refresh rate, V-Sync On

This setting allows the GPU to synchronize to every frame. Now all elements of the system are synchronized to 60 Hz—CPU, application, GPU and display will aim for 60 Hz, but also at lower intervals (30, 20, etc.) as required.

While this produces the best visual experience with clean images, input responsiveness is limited to the V-Sync rate. So while the GPU could deliver more performance, this artificial cap limits all input and output.

Result:

If the GPU is slower or faster than the display, there is no guarantee that the frame buffer drawn on the display holds a complete frame. A GPU has multiple frames in its pipeline, but only a few are ever displayed when it runs fast, and the display catches frames in-between when it runs slow. When the system is set to a software limit, responsiveness decreases. Is there a way to take advantage of the increased power of systems while working with a limited refresh rate? Is there a way to skip these redundant tasks to provide a more 'immersive' experience?

LucidLogix apparently has the answer…

The answer from Lucid is Virtu MVP. Back in September 2011, Ryan gave his analysis of the principles behind the solution. We are still restricted to the same high-level (due to patents) explanation as Ryan was back then. Nevertheless, it all boils down to two situations:

Situation (A) determines whether a rendering task/frame should be processed by the GPU, and situation (B) decides which frames should go to the display. (B) helps with tearing, while (A) better utilizes the GPU. Nevertheless, the GPU is doing multiple tasks—snooping to determine which frames are required, rendering the desired frame, and outputting to a display. Lucid is using hybrid systems (those with an integrated GPU and a discrete GPU) to overcome this.

Situation (B) is what Lucid calls its Virtual V-Sync, an adaptive V-Sync technology currently in Virtu. Situation (A) is an extension of this, called HyperFormance, designed to reduce input lag by only sending required work to the GPU rather than redundant tasks.

Within the hybrid system, the integrated GPU takes over two of the tasks from the discrete GPU: snooping for required frames, and display output. This requires the system to run in i-Mode, where the display is connected to the integrated GPU. Users of Virtu on Z68 may remember this; back then it caused a 10% decrease in output FPS. This generation of drivers and tools should alleviate some of this decrease.

What this means for Joe Public

Lucid’s goal is to improve the 'immersive experience' by removing redundant rendering tasks, synchronizing the GPU with the refresh rate of the connected display, and reducing input lag.

By introducing a level of middleware that intercepts rendering calls, Virtual V-Sync and HyperFormance both decide whether a frame should be rendered and then delivered to the display. However, the FPS counter within a title counts frame calls, not completed frames. So as the software intercepts a call, the frame-rate counter increases whether the frame is rendered or not. This can lead to many unrendered frames and an artificially high FPS number, when in reality the software is merely optimizing the sequence of rendering tasks rather than increasing FPS.
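A toy counter (hypothetical class and method names, purely illustrative of the counting mismatch described above, not Lucid's middleware) shows how the reported number can outrun the rendered one:

```python
# Toy model: the in-game FPS counter ticks on every rendering call,
# while intercepting middleware may skip actually rendering some of them.

class InterceptingCounter:
    def __init__(self):
        self.calls = 0     # what the in-game FPS counter reports
        self.rendered = 0  # frames actually completed

    def present(self, worth_rendering):
        self.calls += 1          # counted whether or not it is drawn
        if worth_rendering:
            self.rendered += 1   # only 'useful' frames reach the display

counter = InterceptingCounter()
for i in range(90):  # one second of frame calls...
    counter.present(worth_rendering=(i % 3 != 0))  # ...every third one skipped

print(counter.calls)     # reported "FPS": 90
print(counter.rendered)  # frames actually rendered: 60
```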

If it helps the 'immersion factor' of a game (less tearing, more responsiveness), then it could be highly beneficial to gamers. Currently Lucid has validated around 100 titles to work as intended. We spoke to Lucid (see next page), and they say that the technology should work with most, if not all, titles. Users will have to add programs manually to take advantage of the technology if the software is not on the list. The reason only 100 titles have been validated is that each game has to be tested with many settings on many different hardware configurations, making the validation matrix huge (for example, 100 games x 12 different settings x 48 different system hardware configurations = 57,600 combinations, which takes time and lots of it).

Virtu MVP also causes many issues when it comes to benchmarking and comparing systems. Performance between systems has typically been told apart by FPS values; with this new technology, the FPS value is almost meaningless, as it counts frames that are never rendered. This has consequences for benchmarking companies like Futuremark and for overclockers who like to compare systems (Futuremark have released a statement about this). Technically, all you would need to do to increase your score/FPS (if we understand the software correctly) would be to reduce the refresh rate of your monitor.

Since this article was started, we have had an opportunity to speak to Lucid regarding these technologies, and they have pointed out several usage scenarios that have perhaps been neglected in other earlier reviews regarding this technology. In the next page, we will discuss what Lucid considers ‘normal’ usage.


145 Comments


  • DanNeely - Monday, April 09, 2012 - link

    This is similar to what happened with the USB 1 to 2 transition. The newer controller is significantly bigger (read: more expensive) and very few people have more than one or two devices using it per computer. I suspect the 8-series (Haswell) chipset will be mixed as well, simply because the total number of ports on the chipset is so much higher than it was a decade ago (whereas on older boards all but the lowest-end models added more USB from 3rd-party controllers).
  • ASUSTechMKT - Monday, April 09, 2012 - link

    mSATA currently has very little market penetration, and a larger cache SSD can be purchased for the same or lower cost. We would prefer to focus on bringing implementations that offer immediate value to users.

    As for the Intel NICs: all our launch boards at standard ATX size and above feature Intel LAN; we have been leading in this regard for a couple of generations.

    In regard to USB 3, we offer more than the standard on many boards, but keep in mind many users only have one USB 3 device.
  • jimnicoloff - Sunday, April 08, 2012 - link

    Maybe I missed something from an earlier post, but could someone please tell me why these don't have Light Peak? Are they waiting to go optical and it is not ready yet? Having my USB 3 controlled by Intel instead of another chip is not enough to make me want to upgrade my Z68 board...
  • repoman27 - Sunday, April 08, 2012 - link

    Thunderbolt controllers are relatively expensive ($20-30) and their value is fairly limited on a system using a full-size ATX motherboard that has multiple PCIe slots. Including two digital display outputs, an x4 slot, and a couple of x1 PCIe slots on a motherboard provides essentially all the same functionality as Thunderbolt, but at a far lower cost.
  • ASUSTechMKT - Monday, April 09, 2012 - link

    Almost all of our boards feature a special TB header which allows you to easily equip them with a Thunderbolt add-on card, which we will release at the end of the month. Expect an approximate cost of $40; the card connects to the TB header and installs in an x4 slot, providing you with Thunderbolt should you want it. A great option for those who want it, and those who do not, do not pay for it.
  • DanNeely - Tuesday, April 10, 2012 - link

    Sounds like a reasonable choice for something that's still rather expensive and a very niche product.

    Am I correct in thinking that the mobo header is to bring in the DisplayPort out channel without impacting bandwidth available for devices?
  • jimwatkins - Sunday, April 08, 2012 - link

    I've made it this far on my venerable overclocked Q6600, but I can't wait any longer. I do wish they weren't so stingy on the 6-core, as I could use it, but I just can't justify the price differential (with 3 kids, that is).
  • androticus - Sunday, April 08, 2012 - link

    USB 3.0 descriptions and depictions are contradictory. The platform summary table says there are 4. The Intel diagram shows up to 4 on front and back (and the diagram is itself very confusing, because there are 4 USB 3.0 ports indicated on the chipset, and then they show 2 going to hubs, and 2 going directly to the jacks.) The text of the article says there can only be 2 USB 3.0 ports.

    What is the correct answer?
  • mariush - Sunday, April 08, 2012 - link

    I think there are 2 real (full-bandwidth) ports, and the Intel solution uses 2 additional chips that act like "hubs", splitting each real port into 4 separate ports.

    Basically the bandwidth of each real port gets split if there are several devices connected to the same hub.

    As far as I know, a hub sends whatever it receives to all of its ports (and the devices at the end of each port ignore the data if it is not for them).
    This would be different from a switch, which has the brains to send data packets only to the proper port.
  • plamengv - Sunday, April 08, 2012 - link

    DZ77GA-70K makes DX79SI look like a bad joke (which it really is).

    LGA 2011 has turned into an epic fail and DZ77GA-70K is the proof. I have a 1366 system and I have zero will to get an LGA 2011 system, thanks to the poor tech decisions somebody made there. Six cores is the top? Again? An old 32nm process? Really? A chipset with nothing new inside but trouble? Since 1366 something strange has been going on and Intel fails to see it. The end user can get a better manufacturing process for the video card than for the CPU. First it was a 45nm CPU with a 40nm GPU, and now a 28nm GPU with a 32nm CPU, and Intel calls that high end? Really?

    Everything that DX79SI should have been you can find inside DZ77GA-70K.

    1. DZ77GA-70K has a high-quality TI 1394 FireWire controller, while DX79SI has a cheap VIA one that no audio pro would ever want to deal with.
    2. DZ77GA-70K has the next-best SATA controller after Intel's, from Marvell, adding 2 more SATA 6 Gbps ports plus eSATA, versus zero extra SATA and, hard to believe, no eSATA at all on DX79SI.
    3. Intel USB 3.0 versus the flaky Renesas controller.

    DZ77GA-70K has everything to impress, including the two Intel LANs vs the Realtek that everyone else is using.

    DZ77GA-70K fails in only one thing: it should have been LGA 2011, not 1155, which will be stuck at 4 cores forever and has zero future.

    Wake up INTEL!
