How AMD’s Dynamic Switchable Graphics Works

One of the things we discussed with AMD was the technical details of their dynamic switchable graphics. At a high level, things might appear similar to NVIDIA’s Optimus, but dig a little deeper and you start to find differences. To recap how switchable graphics works, let’s start at the top.

The original switchable graphics technologies treated the IGP and dedicated GPU as discrete devices. Both were connected to the necessary display outputs, with hardware muxes selecting the active device. This added cost to the motherboard, and switching blanked the display as one device deactivated and the other came online. In the earliest implementations you had to reboot when switching, and the system would start with either the IGP or dGPU active. Later implementations moved to software-controlled muxes and dynamic switching, which required Windows Vista to work properly (since the IGP driver would unload, the GPU driver would start, and then the display content would activate on the GPU).
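As a rough illustration of that sequence, here's a minimal sketch in C. The function names and three-step flow are our own invention for illustration; no vendor's actual switching code is this simple:

```c
/* Hypothetical sketch of the old mux-based switching sequence.
 * Names and steps are illustrative, not any vendor's driver code. */
#include <stdio.h>

typedef enum { GPU_IGP, GPU_DISCRETE } gpu_t;

static void unload_driver(gpu_t gpu) { printf("unloading %s driver\n", gpu == GPU_IGP ? "IGP" : "dGPU"); }
static void load_driver(gpu_t gpu)   { printf("loading %s driver\n",   gpu == GPU_IGP ? "IGP" : "dGPU"); }
static void set_mux(gpu_t gpu)       { printf("mux -> %s outputs\n",   gpu == GPU_IGP ? "IGP" : "dGPU"); }

/* The display blanks between the driver unload and the new device
 * re-driving the panel -- the visible downside of muxed designs. */
static void switch_active_gpu(gpu_t from, gpu_t to)
{
    unload_driver(from);   /* panel goes blank here */
    set_mux(to);           /* hardware mux reroutes the display wiring */
    load_driver(to);       /* new device lights the panel back up */
}

int main(void)
{
    switch_active_gpu(GPU_IGP, GPU_DISCRETE);
    return 0;
}
```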

NVIDIA’s Optimus changes things quite a bit, as there are no longer any muxes. The display outputs are always driven by the IGP, and NVIDIA’s drivers simply watch for calls from applications that the dedicated GPU can accelerate. When they detect such an application (and the user can add their own custom apps), the drivers wake up the GPU and send it the rendering commands. The GPU does all of the necessary work, and the result is copied directly into the IGP framebuffer, avoiding flickering and other undesirable effects since the IGP remains connected to the display output at all times. The GPU can wake up in a fraction of a second, and when it’s no longer needed it powers down completely. NVIDIA even demonstrated this by removing the dGPU from a test system while it was powered on. The only catch is that the drivers need some knowledge of the applications/games in order to know when to use the GPU.
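Conceptually, the driver-side logic boils down to a profile lookup. Here's a minimal C sketch of that idea; the profile list and function names are invented for illustration and aren't NVIDIA's actual implementation:

```c
/* Minimal sketch of profile-driven GPU selection in the Optimus style.
 * The profile list and function names are invented for illustration. */
#include <stdio.h>
#include <string.h>

/* stand-in for the driver's application profile database */
static const char *accelerated_apps[] = { "game.exe", "encoder.exe" };

static int needs_dgpu(const char *exe)
{
    for (size_t i = 0; i < sizeof(accelerated_apps) / sizeof(accelerated_apps[0]); i++)
        if (strcmp(exe, accelerated_apps[i]) == 0)
            return 1;
    return 0;
}

static void render_frame(const char *exe)
{
    if (needs_dgpu(exe)) {
        /* wake the dGPU, render on it, then copy the finished frame
         * across PCI-E into the IGP framebuffer -- the IGP never lets
         * go of the display outputs, so there is no blanking */
        printf("%s: dGPU renders, frame copied to IGP framebuffer\n", exe);
    } else {
        printf("%s: IGP renders directly\n", exe); /* dGPU stays powered off */
    }
}

int main(void)
{
    render_frame("notepad.exe");
    render_frame("game.exe");
    return 0;
}
```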

The details of AMD’s Dynamic Switchable Graphics are similar in practice to Optimus, but with a few differences. First, AMD always has both the IGP and dGPU drivers loaded, with a proxy driver funneling commands to the appropriate GPU. Where NVIDIA can completely power off the GPU under Optimus, AMD has modified their GPUs so that the PCI-E interface is isolated from the rest of the chip. When the GPU isn’t needed, everything powers down except that PCI-E block, so Windows doesn’t try to load/unload the GPU driver. The PCI-E link state is retained at a cost of around 50 mW, but as far as Windows knows the GPU is still ready and waiting for input. AMD also informed us that their new GPUs use linked adapter mode instead of multi-adapter mode, and that this plays a role in their dynamic switchable graphics, but we didn’t receive any additional details on this subject.
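Here's a rough sketch of the proxy arrangement as we understand it, with invented names; the "PCI-E only" state stands in for AMD's isolated link block, and the real driver is of course far more involved:

```c
/* Hedged sketch of AMD's approach as described above: both drivers stay
 * loaded, a proxy routes commands, and the idle dGPU powers down
 * everything except an isolated PCI-E block so Windows still sees it. */
#include <stdio.h>

typedef enum { DGPU_ON, DGPU_PCIE_ONLY } dgpu_power_t; /* ~50 mW in PCIE_ONLY */

static dgpu_power_t dgpu_state = DGPU_PCIE_ONLY;

static void proxy_submit(const char *cmd, int wants_dgpu)
{
    if (wants_dgpu) {
        if (dgpu_state == DGPU_PCIE_ONLY) {
            dgpu_state = DGPU_ON;           /* core powers up; link state was retained */
            printf("dGPU core powered up\n");
        }
        printf("dGPU executes: %s\n", cmd);
    } else {
        printf("IGP executes: %s\n", cmd);  /* dGPU driver stays loaded but idle */
    }
}

int main(void)
{
    proxy_submit("compose desktop", 0);
    proxy_submit("draw game frame", 1);
    return 0;
}
```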

As far as getting content from the dGPU to the display, the IGP always maintains a connection to the display ports, and it appears AMD’s drivers copy data over the PCI-E bus into the IGP framebuffer, similar to Optimus. Where things get interesting is that while there are no muxes in AMD’s dynamic switchable graphics implementations, there is still an option to fall back to manual switching. For this mode, AMD uses the display output ports of the Intel IGP, so their GPU doesn’t need its own output ports (or the muxes to route them). With the VAIO C, both dynamic and manual switching are supported, and you can set the mode as appropriate. Here are some static shots of the relevant AMD Catalyst Control Center screens.
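To illustrate how the two modes differ on a mux-less design, here's a small sketch; the enum and policy function are ours, under the assumption that manual mode simply pins all rendering to one device while the IGP keeps driving the outputs:

```c
/* Illustrative sketch only: dynamic mode decides per application,
 * manual mode pins everything to one device. Either way the IGP
 * retains the display outputs, so no muxes are needed. */
#include <stdio.h>

typedef enum { MODE_DYNAMIC, MODE_MANUAL_IGP, MODE_MANUAL_DGPU } switch_mode_t;

static int use_dgpu(switch_mode_t mode, int app_has_profile)
{
    switch (mode) {
    case MODE_MANUAL_IGP:  return 0;               /* everything on the IGP */
    case MODE_MANUAL_DGPU: return 1;               /* everything on the dGPU, output still via IGP */
    default:               return app_has_profile; /* per-app decision in dynamic mode */
    }
}

int main(void)
{
    printf("dynamic, no profile: %s\n", use_dgpu(MODE_DYNAMIC, 0)     ? "dGPU" : "IGP");
    printf("manual dGPU:         %s\n", use_dgpu(MODE_MANUAL_DGPU, 0) ? "dGPU" : "IGP");
    return 0;
}
```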

In terms of drivers, right now you get a single large driver package that includes a proxy driver, an Intel IGP driver, and AMD’s GPU driver all rolled into one. Long-term, AMD says they plan to make their GPU driver fully independent of Intel’s IGP driver. They say this will only require some packaging updates and that the change should come sometime in 2012, but for now they continue to offer a monolithic driver package. OEMs apparently get this driver on a monthly basis (or can at least request it), but it’s up to the OEMs to validate the driver for their platform and release it to the public.

For non-switchable graphics, AMD offers a publicly available monthly driver update that we refer to as the “reference drivers”. At present, you download a utility that checks your laptop GPU ID to see if the laptop is officially supported by the reference driver. Certain OEMs prefer to maintain control of the drivers, so the AMD utility will refuse to download the full driver suite for their systems; in such cases, users have to wait for the manufacturers to roll out updates (Sony, Toshiba, and Panasonic all fall into this category). In the past, we were able to download the reference driver using a “sanctioned” laptop (e.g. something from Acer) and then install it on a non-sanctioned laptop. However, this does not work with switchable graphics laptops; you need the monolithic driver package for such systems.
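As an illustration of how such an eligibility check might work, here's a hypothetical C sketch. The vendor IDs are the standard PCI-SIG codes for the companies involved, but the device ID and the policy logic are invented; AMD's actual utility is certainly more involved:

```c
/* Hypothetical sketch of a reference-driver eligibility check: read the
 * GPU's PCI subsystem vendor ID and refuse OEMs that keep driver control.
 * Policy logic and device ID are invented for illustration. */
#include <stdio.h>

struct gpu_id { unsigned short vendor, device, subsys_vendor; };

static int oem_allows_reference_driver(unsigned short subsys_vendor)
{
    /* Sony (0x104D), Toshiba (0x1179), Panasonic (0x10F7) opt out */
    switch (subsys_vendor) {
    case 0x104D: case 0x1179: case 0x10F7: return 0;
    default:                               return 1;
    }
}

int main(void)
{
    struct gpu_id gpu = { 0x1002, 0x6760, 0x104D }; /* AMD GPU in a Sony laptop */
    if (oem_allows_reference_driver(gpu.subsys_vendor))
        printf("reference driver download permitted\n");
    else
        printf("blocked: wait for the OEM's driver release\n");
    return 0;
}
```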

That takes care of the high-level overview of how AMD’s Dynamic Switchable Graphics works, as well as a few other related items. The details are a little light, but that at least gives us an introduction to AMD’s current switchable graphics solutions. With the hardware and software discussions out of the way, let’s turn to our gaming results first and see how the two solutions and GPUs compare in performance as well as compatibility.

Comments

  • JarredWalton - Tuesday, September 20, 2011 - link

    I can't see any drivers for that laptop other than the original Sept. 2010 driver and a new Sept. 2011 driver. Got a link to the other previous drivers to confirm? Anyway, AMD says they make a driver build available for switchable graphics on a monthly basis, so it sounds like Acer is actually using that. Kudos to them if that's the case.
  • overseer - Tuesday, September 20, 2011 - link

    Perhaps Acer just omitted older driver versions they deemed unnecessary. My 4745G was manufactured in Jan. 2010 and the initial CCC was dated late 2009. I recall taking two updates: one in summer 2010 (Catalyst 10.3?) and one last week (the latest one). So it's safe to say there have been at least four traceable versions of AMD GPU drivers for my model.

    While I can't really convince you that it's a bi-monthly or quarterly update cycle from Acer with the limited evidence, this OEM nonetheless has been keeping an eye on new graphics drivers - something I'd never expected in the first place as an early adopter of AMD switchable.
  • MJEvans - Tuesday, September 20, 2011 - link

    You toyed with several ideas for power throttling graphics chips. The obvious ones, like turning off cores and operating at a different point on the voltage/frequency curve to slash power use, are solid.

    Where things turn silly is suggesting the use of only 64 of 256 bits of memory interface. This simply won't work, for a few reasons. However, let's assume that for price and performance, four 64-bit chips had been selected. The application assets would probably be littered across the four slabs of memory, either for wider page accesses to the same content (faster transfer) or for parallel access to unrelated content (say, for two isolated tasks). In any event, the cost of consolidating them, both in time and energy expended for the moves, would only be worthwhile if it were for a long duration.

    Instead, a better approach would be to follow a similar voltage/frequency curve for medium power modes. For low power modes the obvious solution is to keep core assets on one bank of memory and use any other enabled banks as disposable framebuffers. This would allow you to operate at lower clock speeds without impacting performance. Further, if the display output were disabled, you could then deactivate all but the asset bank of memory.

    A patent thicket of some kind probably exists for this stuff, but really I consider these to be things that must be obvious to someone skilled in the art, or even to anyone with logic and a basic knowledge of the physics driving current transistors; and that's coming from someone whose college degree is getting stale and who has never been employed in this field.
  • Old_Fogie_Late_Bloomer - Tuesday, September 20, 2011 - link

    This might not be feasible, but perhaps AMD and/or Nvidia could do something like what Nvidia is doing with Kal-El: have low-power circuitry that can do the bare minimum of what's needed for Windows Aero, etc. (basically what the integrated graphics chip does in Optimus) and switch it out for the more powerful circuitry when needed. As with Kal-El, this switch could be more or less invisible to Windows, or it could be handled at the driver level.

    Of course, that basically wastes the cost of the unused integrated graphics. Perhaps AMD's APUs could take better advantage of this idea: basically, put a CPU and two GPUs on one die, flipping between the slow, power-sipping graphics and the fast, powerful graphics.
  • MJEvans - Tuesday, September 20, 2011 - link

    Actually, the Kal-El article explained the key point rather well. The two common options are transistors that are fast but have a high cost of simply being 'on', and transistors that are slower but more efficient per operation at the same speed. Given the highly parallel nature of a graphics solution, it makes much more sense to keep the operating parts running at a faster speed and switch off more of the ones that aren't needed at all. The main barrier to doing that effectively would be the granularity of the unit blocks; however, if development occurs with this in mind it might be economically viable. That's a question that would require actual industry experience and knowledge of current design trade-offs to answer.
  • Old_Fogie_Late_Bloomer - Wednesday, September 21, 2011 - link

    Well, the really interesting thing I got from the Kal-El article, which I think COULD be relevant to mobile GPU applications, is this: the other form of transistor, which is low-leakage but cannot run faster than a certain speed (half a GHz or so), is apparently so much more efficient in terms of power usage that Nvidia's engineers felt the additional cost was worth it, both in silicon area and in the increased manufacturing cost per part, which of course trickles down to the price of the tablet or whatever it's used in. That sounds to me like they feel pretty confident about the idea.

    That being said, I haven't the slightest clue what kind of transistors are used in current mobile chips. It might be that GPU manufacturers are already reaping the benefits of low-leakage transistors, in which case there might not be anywhere to go. If they're not, however, why not have just enough low-power cores to run Windows 8's flashy graphical effects, and then switch over to the more powerful, higher-clocked circuitry for gaming or GPGPU applications? I don't know how much it would add to the consumer price, but I'm betting someone would pay $25-$50 more for something that "just works."
  • MJEvans - Friday, September 23, 2011 - link

    It seems that either your comment missed my primary point or I didn't state it clearly enough.

    Likely the engineering trade-off favors powering just a single graphics core (out of the hundreds even mainstream systems now have, versus merely 4 vastly more complex CPU cores on 'mid-high' end systems) rather than increasing hardware and software complexity by adding an entirely different manufacturing technology and tying up valuable die area, which could be used for other things, on a custom low-power version of the same thing.

    I find it much more likely that normal use states favor these scenarios:
    a) Display is off,
    a.a) entire gpu is off (Ram might be trickle refreshing).
    b) Display is frozen,
    b.a) entire gpu is off (Ram might be trickle refreshing).
    c) Display is on,
    c.a) gpu has one core active at medium or higher speed
    (This would not be /as/ efficient as an ultra-low power core or two, but I seriously have to wonder if they would even be sufficient; or what fraction of a single core is required for 'basic' features these days).
    c.b) mid-power; some cores are active, possibly dynamically scaled based on load (similar to CPU frequency governing but a bit more basic here)
    c.c) full power; everything is on and cycles are not even wasted on profiling (this would be a special state requested by intensive games/applications).
  • danjw - Tuesday, September 20, 2011 - link

    When I worked for a small game developer, getting ATI to give you the time of day was pretty much impossible, whereas Nvidia was more than happy to help us out with some free graphics cards and support. If AMD continues on this path, they will never be real competition for Nvidia.
  • fynamo - Tuesday, September 20, 2011 - link

    The WHOLE POINT of having switchable graphics is to reduce power consumption and thereby extend battery life, and at the same time provide any necessary 2D acceleration capabilities for the OS and Web browsing.

    I'm disappointed that this review totally misses the mark.

    I've been testing numerous Optimus configurations myself lately and have discovered some SERIOUS issues with [at least the Optimus version of] switchable graphics technology: Web browsing.

    Web browsers today increasingly accelerate CSS3, SVG, and Flash; however, GPUs have yet to catch up to this trend. As a result, rendering performance is abysmal on a Dell XPS 15 with an Intel 2720QM CPU + NVIDIA GeForce 540, and not just with web acceleration: changing window sizes and basic desktop/Aero UI stuff is like a slideshow. I upgraded from a Dell XPS 16 with the Radeon 3670, and the overall experience has been reduced from a liquid-smooth Windows environment to a slideshow.

    Granted, gaming performance is not bad, but that's not the issue.

    I'm running the latest drivers for everything.

    I was hoping to see this topic researched better in this article.
  • JarredWalton - Tuesday, September 20, 2011 - link

    There's really not much to say in regards to power and battery life, assuming the switchable graphics works right. When the dGPU isn't needed, both AMD and NVIDIA let the IGP do all the work, so then we're looking at Sony vs. Acer, both using the Intel i5-2410M. BIOS and power optimizations come into play, but the two are close enough that it doesn't make much of a difference. (I posted the battery life results above if you're interested, and I still plan on doing a full review of the individual laptops.)

    I'm curious what sites/tests/content you're using that create problems with CSS3/SVG/Flash. My experience on an Intel IGP alone has been fine for everything outside of running games, though admittedly I might not be pushing things too hard in my regular use. Even so, the latest NVIDIA drivers should let you run your browser on the dGPU if you need to -- have you tried that? Maybe your global setting defaults the laptop to the IGP everywhere, which might cause some issues. But like I said, the specific sites and interactions that cause the sluggishness would be useful, since otherwise I can't rule out other software interactions as the culprit.
