Medium Detail Gaming Comparison

We’re going to focus on the main gaming target for this level of hardware: our Medium presets. The Low preset tends to look pretty lousy in several of the games (StarCraft II being the worst), while our High settings are too much for the GT 540M/HD 6630M in about half of the titles, even at 1366x768. Medium detail, on the other hand, should give us 30+ FPS in nearly all of the games we’re testing while still providing decent visuals. Here are the results.

[Benchmark charts: Battlefield: Bad Company 2, Civilization V, DiRT 2, Left 4 Dead 2, Mafia II, Mass Effect 2, Metro 2033, STALKER: Call of Pripyat, StarCraft II: Wings of Liberty, Total War: Shogun 2]

In terms of performance, the 6630M matches up nicely against the GT 540M. It wins a few games by a small margin, loses a few by a similarly small margin, and ties in the remainder. If we were just looking at gaming performance, there’s not much to discuss here. NVIDIA still has CUDA and PhysX, and AMD has their Accelerated Parallel Processing SDK, but given the level of performance we’re talking about, you won’t need or use these extras most of the time. Unfortunately for AMD, it’s not just about gaming performance.

We tested ten games (though we don’t have results from the Alienware M11x R3 in Civilization V and Total War: Shogun 2), and the newest titles are Civ5 (13 months old) and TWS2 (6 months old). You’d hope that with relatively well-known games, compatibility wouldn’t be an issue, but we ran into two problems during testing.

The first wasn’t a consistent problem, but we did have a couple of crash-to-desktop issues when benchmarking DiRT 2 in Dynamic Switchable Graphics mode. The other issue was far worse, and it showed up in StarCraft II at anything above the Low default settings. As noted earlier, SC2 looks pretty lousy at its minimum settings, and Medium detail (and possibly even High) should be more than playable on the HD 6630M. In dynamic mode, however, not only was performance quite a bit lower than expected at our Low settings, but at Medium and above there were tons of missing textures. The solution was to switch to manually controlled switching, which fixed both the performance and the rendering errors, but then why even have dynamic switching in the first place?

While we’re on the subject, let’s also take a look at a few other comparisons. Running “stock” on the CPU, the Acer 3830TG is noticeably slower overall than when using ThrottleStop to limit the CPU to 2.1GHz: with ThrottleStop, the average score across the ten games improves by 20%, so the CPU throttling has a major impact in games. Civ5, ME2, and TWS2 show the least difference, with Mass Effect 2 being the only game that ran slightly faster at the stock CPU settings; Mafia 2 runs 10% faster, Metro 2033 and STALKER are 13% faster, L4D2 jumps up 28%, DiRT 2 is 49% faster, and the ever-CPU-limited StarCraft II runs 58% faster. It’s worth pointing out that 2.1GHz is 75% higher than 1.2GHz, so with a 58% gain it looks like the 3830TG averages around 1.3GHz in StarCraft II at stock settings.
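As a rough sanity check, you can back out that implied clock from the 58% gain. The quick calculation below assumes StarCraft II performance scales roughly linearly with CPU clock when CPU limited, so treat the result as an estimate rather than a measured value:

    locked_clock_ghz = 2.1              # ThrottleStop holds the CPU at 2.1GHz
    sc2_gain = 0.58                     # SC2 ran 58% faster with ThrottleStop
    implied_stock_clock = locked_clock_ghz / (1 + sc2_gain)
    print(f"{implied_stock_clock:.2f}GHz")  # ~1.33GHz average clock at stock settings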

Flipping over to the other GT 540M-equipped laptops and using the best result from the Acer 3830TG, the Alienware M11x R3 averages out to 3.5% faster, but it only has a clear lead in two games (DiRT 2 and SC2) while it’s slightly slower in L4D2 and Mafia 2. The Dell XPS 15 comes in 9% faster than the 3830TG on average, with the biggest lead again coming from SC2; Mafia 2 once again gives the 3830TG the lead, so the latest NVIDIA drivers appear to be playing a role.

Finally, since we now have results from the HD 6630M with a Llano A8-3500M as well as with the i5-2410M, we thought it would be interesting to see just how much performance you give up by gaming with the slower Llano CPU. If you’ll recall, in our first look at a Llano notebook we expressed concern that the CPU would hold back GPU performance. Out of our ten tested titles, only Mass Effect 2 ran faster on Llano with a dGPU. Most games are close at our Medium settings, where the GPU becomes more of a bottleneck: BFBC2 and Civ5 deliver exactly the same performance, Mafia 2 is 2% faster with the Intel CPU, and DiRT 2, L4D2, Metro, and STALKER are all within 5-10%. Not surprisingly, StarCraft II is the big outlier, running over twice as fast on the Intel i5-2410M. We’re still not looking at similar driver builds, as the Llano laptop is running a Catalyst 11.6 derivative (8.862.110607a-120249E) compared to 11.1 on the Sony, but even so the average performance increase with an Intel CPU comes in at 12% (mostly thanks to SC2).

Since there were two games where we didn’t have a perfect experience, we decided to look at some more recent titles.

Comments

  • JarredWalton - Tuesday, September 20, 2011 - link

    I can't see any drivers for that laptop other than the original Sept. 2010 driver and a new Sept. 2011 driver. Got a link for the other previous drivers to confirm? Anyway, AMD says they make a driver build available for switchable graphics on a monthly basis, so it sounds like Acer is actually using that. Kudos to them if that's the case.
  • overseer - Tuesday, September 20, 2011 - link

    Perhaps Acer just omitted older versions of drivers they deemed unnecessary. My 4745G was manufactured in Jan. 2010 and the initial CCC was dated late 2009. I can recall taking 2 updates: one in summer 2010 (Catalyst 10.3?) and one last week (the latest one). So it's safe to say there have been at least 4 traceable versions of AMD GPU drivers for my model.

    While I can't really convince you that it's a bi-monthly or quarterly update cycle from Acer with the limited evidence, this OEM nonetheless has been keeping an eye on new graphics drivers - something I'd never expected in the first place as an early adopter of AMD switchable.
  • MJEvans - Tuesday, September 20, 2011 - link

    You toyed with several ideas for power throttling graphics chips. The obvious ones, like turning off cores and working at a different point of the voltage/frequency curve to slash power use, are solid.

    Where things turn silly is suggesting the use of only 64 of 256 bits of memory interface. This simply won't work, for a few reasons. However, let's assume that for price and performance reasons four 64-bit chips had been selected. Probably the application assets would be littered across the four slabs of memory, either for wider page accesses to the same content (faster transfers) or for parallel access to unrelated content (say, for two isolated tasks). In any event, the cost of consolidating them, both in time and energy expended for the moves, would only be worthwhile if it were for a long duration.

    Instead, a better approach would be to follow a similar voltage/frequency curve for medium power modes. For low power modes the obvious solution is to keep core assets on one bank of memory and use any other enabled banks as disposable framebuffers. This would allow you to operate at lower clock speeds without impacting performance. Further, if the display output were disabled you would then be able to deactivate all but the asset bank of memory.

    Probably a patent thicket of some kind exists for this stuff, but really I consider these to be things that must be obvious to someone skilled in the art, or even to anyone with basic logic and knowledge of the physics driving current transistors, since my college degree is getting stale and I've never been employed in this field.
  • Old_Fogie_Late_Bloomer - Tuesday, September 20, 2011 - link

    This might not be feasible, but perhaps AMD and/or Nvidia could do something like what Nvidia is doing with Kal-El: have low-power circuitry that can do the bare minimum of what's needed for Windows Aero, etc. (basically what the integrated graphics chip does in Optimus) and switch it out for the more powerful circuitry when needed. As with Kal-El, this switch could be more or less invisible to Windows, or it could be handled at the driver level.

    Of course, that basically wastes the cost of the unused integrated graphics. Perhaps AMD's APUs could take better advantage of this idea: basically, put a CPU and two GPUs on one die, flipping between the slow, power-sipping graphics and the fast, powerful graphics.
  • MJEvans - Tuesday, September 20, 2011 - link

    Actually, the Kal-El article explained the key point rather well. The two common options are transistors that are fast but carry a high cost simply for being 'on', and transistors that are slower but more efficient per operation. Given the highly parallel nature of a graphics solution, it makes much more sense to keep the parts that are operating running at a faster speed and switch off more of the ones that aren't needed at all. The main barrier to doing that effectively enough would be the blocks of units used; however, if development is occurring with this in mind, it might be economically viable. That's a question that would require actual industry experience and knowledge of current design trade-offs to answer.
  • Old_Fogie_Late_Bloomer - Wednesday, September 21, 2011 - link

    Well, the thing I got from the Kal-El article that was really interesting, and which I think COULD be relevant to mobile GPUs, is that this other form of transistor--low-leakage, but unable to run faster than a certain speed (half a GHz or so)--is sufficiently more efficient in terms of power usage that Nvidia's engineers felt the additional cost was worth it, both in silicon area and in manufacturing cost per part, which of course trickles down to the price of the tablet or whatever it's used in. That sounds to me like they feel pretty confident about the idea.

    That being said, I have not the slightest clue what kind of transistors are used in current mobile chips. It might be that GPU manufacturers are already reaping the benefits of low-leakage transistors, in which case there might not be anywhere to go. If they are not, however, why not have just enough low-power cores to run Windows 8's flashy graphical effects, and then switch over to the more powerful, higher-clocked circuitry for gaming or GPGPU applications? I don't know how much it would cost the consumer, but I'm betting someone would pay $25-$50 more for something that "just works."
  • MJEvans - Friday, September 23, 2011 - link

    It seems that either your comment missed my primary point or I didn't state it clearly enough.

    Likely the engineering trade-off favors powering just a single graphics core (out of the hundreds that even mainstream systems now have, relative to merely 4 vastly more complex CPU cores on 'mid-high' end systems) rather than increasing hardware and software complexity by adding in an entirely different manufacturing technology and tying up valuable area, which could be used for other things, with a custom low-power version of the same thing.

    I find it much more likely that normal use states favor these scenarios:
    a) Display is off,
    a.a) entire gpu is off (Ram might be trickle refreshing).
    b) Display is frozen,
    b.a) entire gpu is off (Ram might be trickle refreshing).
    c) Display is on,
    c.a) gpu has one core active at medium or higher speed
    (This would not be /as/ efficient as an ultra-low power core or two, but I seriously have to wonder if they would even be sufficient; or what fraction of a single core is required for 'basic' features these days).
    c.b) mid-power; some cores are active, possibly dynamically scaled based on load (similar to CPU frequency governing but a bit more basic here)
    c.c) full power; everything is on and cycles are not even wasted on profiling (this would be a special state requested by intensive games/applications).
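    In rough code form, the state selection above might look something like this (state names, the load metric, and thresholds are invented for illustration; the real logic would live in the driver/firmware, not application code):

        # Illustrative only: selects one of the power states sketched above.
        def select_gpu_state(display_on, display_frozen, intensive_app, load):
            if not display_on or display_frozen:
                return "gpu_off"                # (a)/(b): cores off, RAM trickle-refreshing
            if intensive_app:
                return "full_power"             # (c.c): everything on, no profiling overhead
            if load < 0.10:
                return "one_core_medium_clock"  # (c.a): a single core handles desktop/2D work
            return "scale_cores_with_load"      # (c.b): enable more cores as load rises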
  • danjw - Tuesday, September 20, 2011 - link

    When I worked for a small game developer, getting ATI to give you the time of day was pretty much impossible, whereas Nvidia was more than happy to help us out with some free graphics cards and support. If AMD is continuing on this path, they will never be real competition for Nvidia.
  • fynamo - Tuesday, September 20, 2011 - link

    The WHOLE POINT of having switchable graphics is to reduce power consumption and thereby extend battery life, and at the same time provide any necessary 2D acceleration capabilities for the OS and Web browsing.

    I'm disappointed that this review totally misses the mark.

    I've been testing numerous Optimus configurations myself lately and have discovered some SERIOUS issues with [at least the Optimus version of] switchable graphics technology: Web browsing.

    Web browsers today increasingly accelerate CSS3, SVG, and Flash; however, GPUs have yet to catch up to this trend. As a result, rendering performance is abysmal on a Dell XPS 15 with an Intel 2720QM CPU + NVIDIA GeForce 540, and not just with web acceleration: changing window sizes and basic desktop/Aero UI stuff is like a slideshow. I upgraded from a Dell XPS 16 with the Radeon 3670, and the overall experience has been reduced from a liquid-smooth Windows environment to a slideshow.

    Granted, gaming performance is not bad but that's not the issue.

    Running the latest drivers for everything.

    I was hoping to see this topic researched better in this article.
  • JarredWalton - Tuesday, September 20, 2011 - link

    There's really not much to say in regards to power and battery life, assuming the switchable graphics works right. When the dGPU isn't needed, both AMD and NVIDIA let the IGP do all the work, so then we're looking at Sony vs. Acer using Intel i5-2410M. BIOS and power optimizations come into play, but the two are close enough that it doesn't make much of a difference. (I posted the battery life results above if you're interested, and I still plan on doing a full review of the individual laptops.)

    I'm curious what sites/tests/content you're using that create problems with CSS3/SVG/Flash. My experience on an Intel IGP only is fine for everything I've done outside of running games, but admittedly I might not be pushing things too hard in my regular use. Even so, the latest NVIDIA drivers should allow you to run your browser on dGPU if you need to -- have you tried that? Maybe your global setting has the laptop set to default to IGP everywhere, which might cause some issues. But like I said, specific sites and interactions that cause the sluggishness would be useful, since I can't rule out other software interactions from being the culprit otherwise.
