Original Link: http://www.anandtech.com/show/5311/introducing-the-2012-mobile-benchmark-matrix
Introducing the 2012 Mobile Benchmark Matrix
by Jarred Walton on January 6, 2012 8:02 PM EST
2012: Meet Our New Mobile Benchmark Suite
Testing computer hardware can be a difficult process. On the one hand there’s a desire for more information and benchmarks, and on the other hand there’s a desire for timely reviews. Our goal at AnandTech has always been to deliver the most comprehensive reviews possible, and while we strive for timeliness there are occasions where additional testing or questions may delay a review. Ultimately, there’s a balancing act that needs to be maintained, and we periodically refresh our review suite and testing methodologies.
With 2012 now here, we’re launching a new suite of benchmarks for our laptop reviews. We'll also have the results from our first laptop using the new tests, courtesy of ASUS' G74SX. Some of the tests have already been in use for a while and others are brand new. In order to provide a single location with a list of our benchmarks and testing procedures, we have put together this short overview. We plan on using the following test suite throughout 2012, and while it’s possible we will add some benchmarks, we don’t have any plans to stop using any of the following at least for the next year.
General Performance Tests
Starting with our general tests, all of these have been in use for several months at least, with many tests dating back to 2010 and earlier. We’ll continue to use the full PCMark 7 suite, PCMark Vantage (x64), Graysky’s x264 HD encoding test, Cinebench 11.5, 3DMark06, 3DMark Vantage (Entry-Level and Performance defaults), and 3DMark 11. We’ll also continue with our battery life tests (now with Internet Explorer 9 in place of IE8) and LCD tests. So for most areas, our test suite remains largely unchanged—we’re finally dropping Cinebench 10, but that’s about it.
As we’ll mention in the conclusion, we’re willing to add some additional general performance benchmarks if there are any specific requests. One of the difficult things to quantify with modern PCs is how fast they are in the things most people do on a regular basis. Part of the problem is that most PCs from the past three or four years are all “fast enough” for generic tasks like surfing the web. If you’re actually reading the content of web pages rather than just repeatedly loading a complex page, I’m not sure most users would notice the difference between a 2GHz Core 2 Duo or Athlon X2 laptop and a quad-core i7-2760QM. This is why battery life is such an important element: where many wouldn’t notice the difference between a web page loading in two seconds and a web page loading in one second, they’re far more likely to notice two hours of battery life versus four or eight hours. Anyway, let us know if you have other mobile benchmarks you’d like us to consider.
With that out of the way, we’ll save the next page for the major changes: our updated gaming suite.
All New Gaming Test Suite
On the gaming side, the changes are rather more substantial. We’ve decided to wipe the slate clean and select all new titles for 2012. Actually, that’s not entirely true—I’ve been running tests with Civilization V and Total War: Shogun 2 for a while, but we’re now updating the settings for the benchmarks and all laptop editors will test with the following games and settings. Note that we tried to get a good selection of game genres in the list; depending on how you want to classify the games, we have four games representing first person/third person action games, two strategy games, one RPG, and one simulation/driving game. We also have representatives of several major engines—Unreal Engine 3, Frostbite 2, and Source being the most noteworthy. We’ve tried to overlap our desktop gaming suite, and while we won’t use identical test suites, we do overlap on six of the titles.
The other big change is that we’re ramping up the quality settings this year. Previously, we had both Low/Minimum and Medium settings at 1366x768. Unfortunately, we often ran into games where minimum quality looked, frankly, awful; in other games the difference between our Low and Medium settings ended up being negligible. For 2012, then, we’ve decided to skip the “minimum” detail testing and select settings that we feel most gamers will actually like using. [Update: we're changing the naming convention to avoid name space conflicts.]
Our Value setting for the test suite loosely corresponds to last year’s “Medium” settings, all run at 1366x768; our new Mainstream settings bump the resolution to 1600x900 and increase quality to roughly match last year’s “High” settings; finally, our Enthusiast settings enable 4x MSAA in all seven titles and increase the resolution to 1920x1080, basically matching the “Ultra” settings of 2011. We’ve tested each game on several setups with the goal of choosing settings that will result in reasonable quality and performance differences between the three settings. With that out of the way, here’s a rundown of the games.
Batman: Arkham City: The sequel to 2009’s Batman: Arkham Asylum, Arkham City continues the story of the Dark Knight with a free roaming playground to explore at your leisure. Graphically, the game is similar to the original title, only now the PC version has (properly working) DX11 support. The DX11 features come at a serious performance cost, however, so our Value test setting will leave them off. Note that even in the game settings, DX11 features are disabled unless you choose the maximum “Extreme” preset—and you’ll really need a beefy PC to handle the workload at that point!
We’re using the built-in benchmark so that readers can compare our results with their own hardware. Our Value settings use the Medium defaults; for Mainstream we switch to the High defaults and enable DX11 features; finally, for Enthusiast we use the Extreme preset and enable 4x MSAA. If you’re wondering, leaving DX11 disabled largely removes performance differences between the various settings, at least on moderate hardware. We tested at 1366x768 with Low, Medium, High, and Very High settings and found the average frame rates only dropped around 20%; enabling DX11 on any of the other modes results in a drop of around 40%.
Also worth noting is that Batman supports PhysX, but we won’t be testing with PhysX as that’s only available on NVIDIA hardware. That said, we do want to mention that PhysX definitely improves the gaming experience, with enhanced fog effects, more debris, cloth effects, and certain weapons (e.g. the freeze gun in one scene) that fire polygons instead of sprites/textures. My own impression is that Batman with PhysX enabled and DX11 disabled generally looks better than Batman with DX11 enabled and PhysX disabled. If you want all of the goodies enabled, you’ll need very high-end hardware, beyond what most laptops can support; we’d suggest GTX 560M SLI as a bare minimum for the Extreme preset with PhysX enabled at 1080p.
Battlefield 3: We’re switching our Battlefield choice from Bad Company 2 to Battlefield 3, though in practice performance is frequently similar. For this title, we’re using FRAPS to measure frame rates during a two-minute tank “on rails” sequence from the Thunder Run single-player mission. Performance in the single-player missions is highly variable depending on the level, and multiplayer is even more so, but we need something that provides consistency between test runs. The Frostbite 2 engine puts quite a hefty load on your GPU, and if you want all the eye candy enabled you’ll need more than your typical mobile GPU. For BF3, our Value settings use the Medium preset; Mainstream uses the High preset; and Enthusiast uses the Ultra preset. BF3 also supports DX10 and DX11, and we leave the DirectX version support set to “Auto”; outside of Intel’s HD 2000/3000 hardware, that means all laptops will run in DX11 mode.
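As a rough illustration of how an average frame rate can be derived from a timed run like this one, the Python sketch below assumes a log of cumulative frame timestamps in milliseconds, one entry per rendered frame. The function names, the sample numbers, and the idea of averaging several runs are all assumptions made for illustration; this is not the tooling used for the reviews.

```python
from statistics import mean
from typing import Sequence


def average_fps(timestamps_ms: Sequence[float]) -> float:
    """Average frame rate over one run.

    Assumption: timestamps_ms[i] is the cumulative time, in milliseconds,
    at which frame i was presented.
    """
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frames to measure a rate")
    elapsed_s = (timestamps_ms[-1] - timestamps_ms[0]) / 1000.0
    if elapsed_s <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps span N - 1 frame intervals.
    return (len(timestamps_ms) - 1) / elapsed_s


def average_over_runs(runs: Sequence[Sequence[float]]) -> float:
    """Mean of the per-run averages (hypothetical way to smooth run-to-run noise)."""
    return mean(average_fps(run) for run in runs)


if __name__ == "__main__":
    # Made-up capture: five frames, each 25 ms apart.
    sample_run = [0.0, 25.0, 50.0, 75.0, 100.0]
    print(f"{average_fps(sample_run):.1f} FPS")
```

Measuring from first to last timestamp, rather than averaging per-frame rates, keeps a handful of very short frames from inflating the result.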
Civilization V: Civ5 is an interesting title in that the use of driver command lists allowed NVIDIA to optimize performance and get a healthy boost in frame rates not long after it launched. AMD has yet to implement command lists (AFAIK), but as we showed in our HD 7970 review, there may be other factors at play. We continue to use the built-in LateGameView benchmark, and it’s worth noting that the turn-based nature of Civ5 makes lower frame rates more palatable than in shooters. For our detail settings, Value has all of the video settings at Low; Mainstream uses High settings on everything (with the High detail strategic view enabled); Enthusiast is the same, only with 4x MSAA enabled. We use the DX10/11 executable and set the configuration file to allow the use of both SM41 and SM50 (Shader Model 4.1/DX10.1 and Shader Model 5.0/DX11).
DiRT 3: Our replacement for DiRT 2 is a simple update to the latest title in the series. As with BF3, this time we’re letting all DX11-capable systems run in DX11 mode; early indications are that DX11 improves performance at Low to High presets, but it creates a pretty massive performance drop at the Ultra preset. We run the in-game benchmark. Our Value setting will use the Medium preset; Mainstream will use the High preset, and Enthusiast will use the Ultra preset with 4x MSAA. (Note that just moving the detail slider from High to Ultra results in a ~40% drop in frame rate while adding 4xAA accounts for another 10-15%, so there’s a pretty sizeable gap between our Mainstream and Enthusiast results.)
Elder Scrolls: Skyrim: Skyrim is one of two titles in our updated list that don’t (currently) support DX11. There may be a patch at some point to improve the situation (there are some old conflicting statements from earlier this year where Skyrim was claimed to support DX11), but for now we’re using whatever the game has in the latest patch (i.e. as of early January 2012). Texture quality is not one of the strong points of Skyrim, with frequently blurry textures (thanks to the console cross-platform nature of development), but at least dragons are now properly attacking with the latest updates.
As far as benchmarking goes, Skyrim appears to be far more taxing on the CPU side of things than on graphics, particularly for desktop gamers, but mobile graphics hardware is several rungs down the performance ladder so we’re going to use it. Our Value setting uses the Medium preset with antialiasing off, anisotropic filtering set to 4x, and texture quality set to medium with FXAA disabled—the latter basically uses a full screen blur filter to remove jaggies while increasing blurriness. For Mainstream, we use the High preset and turn antialiasing off. Last, for Enthusiast, we use the Ultra preset but drop antialiasing to 4xAA. Note that enabling antialiasing, at least on a GTX 560M, appears to have a minimal impact on performance; however, that may not always be the case so we’re sticking with our standard of no-AA at Value and Mainstream settings and 4xAA at the Enthusiast settings.
Update: The 1.4 patch of Skyrim dramatically improved performance, and Bethesda also released a high resolution texture pack for the PC. We will use the high resolution texture pack at the Mainstream and Enthusiast settings going forward.
Portal 2: Portal 2 is our representative of the Source engine, and like the other Source games released so far from Valve, that means no DX10 or DX11 support. That doesn’t mean the game isn’t graphically demanding, though it may have different bottlenecks than many of the other titles. We use an in-house demo file where the player combines speed gel with portals, switches, and an Excursion Funnel to advance through the map. Like most Source engine games, frame rates tend to be quite a bit higher than other titles. Our Value settings use trilinear filtering with multicore rendering enabled, and Shader/Effect/Model/Texture detail are all at Medium (with paged pool memory available set to High). Mainstream maxes out all of the settings with the exception of antialiasing, which remains off, and Enthusiast adds 4x MSAA to the mix.
Total War: Shogun 2: Wrapping up our gaming list is Total War: Shogun 2, a game which holds the dubious honor of being the slowest loading title in our test suite—by a large margin. Initially launched as a DX9 title, a patch later added DX10/11 support. Graphically, it’s difficult to tell what differences the various rendering modes have, but DX11/SM5.0 does appear to have substantially better SSAO. The patch that added DX11 features also added a built-in benchmark, the introduction to one of the scenarios, which we use for our testing. Our Value settings use the Medium preset, Mainstream will use the High preset, and Enthusiast uses the Very High preset with 4x MSAA enabled. In addition to enabling the DX11 engine, all of our settings files are set to use SM5.0 code where applicable.
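To pull the rundown together, here is one possible way to write the whole matrix down as data. This Python sketch only restates the resolutions and per-title settings described above; the structure and the names (`Tier`, `GAME_SETTINGS`, `build_matrix`) are hypothetical, and the settings strings are abbreviated summaries rather than actual configuration values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    width: int
    height: int


# Resolutions for the three quality tiers, as stated in the text.
TIERS = (
    Tier("Value", 1366, 768),
    Tier("Mainstream", 1600, 900),
    Tier("Enthusiast", 1920, 1080),
)

# Abbreviated per-title settings in (Value, Mainstream, Enthusiast) order.
GAME_SETTINGS = {
    "Batman: Arkham City": ("Medium, DX11 off", "High + DX11", "Extreme + 4x MSAA"),
    "Battlefield 3": ("Medium preset", "High preset", "Ultra preset"),
    "Civilization V": ("Low", "High", "High + 4x MSAA"),
    "DiRT 3": ("Medium preset", "High preset", "Ultra preset + 4x MSAA"),
    "Elder Scrolls: Skyrim": (
        "Medium, no AA, 4x AF",
        "High, no AA, hi-res textures",
        "Ultra, 4x AA, hi-res textures",
    ),
    "Portal 2": ("Medium detail", "Max detail, no AA", "Max detail + 4x MSAA"),
    "Total War: Shogun 2": ("Medium preset", "High preset", "Very High + 4x MSAA"),
}


def build_matrix():
    """Return one (title, tier, resolution, settings) row per test run."""
    rows = []
    for title, per_tier in GAME_SETTINGS.items():
        for tier, settings in zip(TIERS, per_tier):
            rows.append((title, tier.name, f"{tier.width}x{tier.height}", settings))
    return rows


if __name__ == "__main__":
    for row in build_matrix():
        print(" | ".join(row))
```

Seven titles at three tiers gives 21 gaming runs per laptop, which is part of why load times like Shogun 2's matter to the testing schedule.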
Benchmarking the Matrix
With the updated test suite, we’re also losing some points of reference to our back catalog of laptops. Obviously, the biggest change is in the gaming results, and we decided to take one of our recently reviewed laptops for a spin using the new benchmark suite. (We may look at adding a couple more lower end laptops from late 2011 to the charts as well in the near future.) ASUS was kind enough to let us hang onto the G74SX until the new suite was complete, and given the reasonably high-end hardware and continued availability, it makes for a good starting point for our 2012 laptop results. We updated to the latest NVIDIA drivers (290.56 at the time of testing) and ran through all of our gaming tests. You can find the complete results in Mobile Bench, and the games are all grouped under the Mobile Gaming 2012 category; since we only have one laptop tested right now, we’ve summarized the gaming scores below.
In our 2011 gaming suite, the ASUS G74SX (and NVIDIA’s GTX 560M) proved capable of handling the majority of games at our Enthusiast settings and 1080p while still breaking 30 FPS. With some of the latest titles at similar “maxed out” settings, frame rates now drop below 30 FPS in five of the seven titles, but remember that our new Enthusiast is equivalent to last year’s “Ultra”. There are certainly other games that will tax the GTX 560M, and our recommendation is that you consider disabling antialiasing or dropping the quality down a notch if you want higher frame rates, but in general the GTX 560M is still a good solution for notebook gamers.
As a sci-fi buff, it’s pretty exciting to see the rapid pace of advancement over the last few years. Today’s smartphones pack about as much power in a small portable device as the PCs we used less than a decade ago. If you’ve ever dreamed of real-world tricorders and holodecks—or maybe cyberspace and Ono-Sendai decks—they’re getting tantalizingly close. Maybe we won’t have exactly what the sci-fi writers of 20 or 30 years ago envisioned, but we’re definitely shedding the wires and I look forward to seeing where we will be in another ten years!
Back on topic, no benchmark suite can ever (reasonably) contain every performance metric, and we do understand that mobile gaming is still a small piece of the larger mobility pie. Even so, it’s still important to consider mobile GPU performance, and with the improving nature of integrated graphics we felt it was time to finally ditch the 2006-era graphics quality settings and shoot for something more visually appealing. Our mobile gaming suite now represents some of the latest DX11 titles, and even at our Value settings all of the games look quite good. If you’re looking for basic gaming capabilities, all you really need is a mobile GPU that can hit 30 FPS at our Value settings in all seven titles and you should be set. If you’re after higher quality and higher resolutions, you’ll want something more than midrange GPUs, but be prepared to pay the price—both in terms of cost as well as in terms of notebook size.
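The 30 FPS rule of thumb above is easy to express as a check. The Python sketch below is only an illustration: the function names are hypothetical and the frame rates in the example are made up, not measured results.

```python
from typing import List, Mapping

PLAYABLE_FPS = 30.0  # threshold from the rule of thumb above

TITLES = (
    "Batman: Arkham City",
    "Battlefield 3",
    "Civilization V",
    "DiRT 3",
    "Elder Scrolls: Skyrim",
    "Portal 2",
    "Total War: Shogun 2",
)


def titles_below(value_fps: Mapping[str, float],
                 threshold: float = PLAYABLE_FPS) -> List[str]:
    """List the titles whose Value-setting frame rate misses the threshold."""
    missing = [t for t in TITLES if t not in value_fps]
    if missing:
        raise ValueError(f"no result for: {', '.join(missing)}")
    return [t for t in TITLES if value_fps[t] < threshold]


def meets_value_baseline(value_fps: Mapping[str, float],
                         threshold: float = PLAYABLE_FPS) -> bool:
    """True only if every one of the seven titles reaches the threshold."""
    return not titles_below(value_fps, threshold)


if __name__ == "__main__":
    # Made-up numbers for a hypothetical midrange laptop GPU.
    example = {title: 45.0 for title in TITLES}
    example["Battlefield 3"] = 28.5
    print(meets_value_baseline(example), titles_below(example))
```

Requiring a result for every title keeps a partially tested laptop from passing by accident.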
With the updated laptop benchmarks now in place, we’re still early enough in 2012 that if you can make a good case for other benchmarks that we haven’t included we’re willing to consider adding a couple more tests. Remember that the goal is to provide a reasonable test suite from which you can estimate performance in other similar benchmarks, so adding three more video encoding tests isn’t really going to add much; on the other hand, if there’s a class of application you don’t feel our test suite adequately covers, sound off in the comments.
As a final thought, I’ve been the head laptop tester at AnandTech since early 2006. While we have frequently heard about the increasing importance of laptops in the overall computer market, the past two years have really shown tremendous growth. We had seven mobile articles on AnandTech in 2006, 15 in 2007 and 2008, and 32 in 2009. That’s pretty reasonable, but then in 2010 we had a whopping 107 mobile articles and 2011 eclipsed that with 166 articles. Wow! Granted not all of the articles in the past two years are about laptops, and we've had a lot of shorter articles in the past two years, but however you want to view it one thing is eminently clear: mobile devices are now well and truly established and our increased coverage reflects that. It’s also worth noting that Intel’s Sandy Bridge and AMD’s Llano launches were both more about the mobile sector than about desktops, and the upcoming Ivy Bridge, Trinity, and Haswell appear to continue that trend.
Here's looking forward to another awesome year in the mobile space, kicking off with CES next week. Hint: besides the usual plethora of large displays and 3D demonstrations, CES is all about smartphones, tablets, and laptops. (I almost feel sorry for Brian...almost.)