Choosing a Testbed CPU

Although I was glad I could put some of these old GPUs to use (somewhat justifying the years they spent occupying space in my parts closet), there was the question of what CPU to pair them with. Go too insane on the CPU and I might unfairly tilt performance in favor of these cards. What I decided to do was simulate the performance of the Core i5-3317U in Microsoft's Surface Pro. That part is a dual-core Ivy Bridge with Hyper-Threading enabled (4 threads). Its max turbo is 2.6GHz for a single core and 2.4GHz with both cores active. I grabbed a desktop Core i3-2100, disabled turbo and forced its clock speed to 2.4GHz. In many cases these mobile CPUs spend a lot of time at or near their max turbo until things get a little too toasty in the chassis, which is why I targeted the two-core turbo frequency rather than the base clock.

To verify that I had picked correctly, I ran the 3DMark Physics test to see how close I came to the performance of the Surface Pro. As the Physics test is multithreaded and should be completely CPU bound, it shouldn't matter which GPU I paired with my testbed - they should all perform the same as the Surface Pro:

3DMark - Physics Test

3DMark - Physics

Great success! With the exception of the 8500 GT, which for some reason is a bit of an overachiever here (7% faster than the Surface Pro), the rest of the NVIDIA cards all score within 3% of the Surface Pro's performance - despite being run on an open-air desktop testbed.
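The comparison itself is nothing more exotic than a percentage delta against the Surface Pro's Physics score. A minimal sketch of that check - using placeholder scores rather than the measured results - looks like this:

```python
# Sanity check: percentage delta of each testbed configuration's 3DMark
# Physics score against the Surface Pro baseline.
# All scores below are placeholders, not the measured results.

def delta_vs_baseline(score, baseline):
    """Percentage difference of score relative to baseline."""
    return (score - baseline) / baseline * 100.0

surface_pro_physics = 10_000            # hypothetical baseline score
testbed_scores = {
    "GeForce 8500 GT": 10_700,          # hypothetical (~7% ahead of baseline)
    "GeForce 7900 GTX": 10_150,         # hypothetical (within 3% of baseline)
}

for gpu, score in testbed_scores.items():
    print(f"{gpu}: {delta_vs_baseline(score, surface_pro_physics):+.1f}% vs Surface Pro")
```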

With these results we also get a quick look at how AMD's Bobcat cores compare against the ARM competitors they may eventually do battle with. With only two Bobcat cores running at 1.6GHz in the E-350, AMD actually does really well here. The E-350's performance is 18% better than the dual-core Cortex A15-based Nexus 10, but it's still not quite good enough to top some of the quad-core competitors here. We could be seeing differences in drivers and/or thermal management with some of these devices, since they are far more thermally constrained than the E-350. Bobcat won't surface as a competitor to anything you see here, but its faster derivative (Jaguar) will. If AMD can get Temash's power under control, it could have a very compelling tablet platform on its hands.

The sad part in all of this is that AMD seems to have the right CPU (and possibly GPU) architectures to be quite competitive in the ultra mobile space today. If AMD had the capital and relationships with smartphone/tablet vendors, it could be a force to be reckoned with. As we've seen from watching Intel struggle, however, it takes more than just a good architecture to break into the new mobile world. You need a good baseband strategy and you need the ability to get key design wins.

Enough about what could be; let's look at how these mobile devices stack up against some of the best GPUs from 2004 - 2007.

We'll start with 3DMark. Here we're looking at performance at 720p, which immediately stops some of the cards with 256-bit memory interfaces from flexing their muscles. Never fear, we will have GL/DXBenchmark's 1080p offscreen mode for that in a moment.
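As a rough illustration of what 720p takes off the table, peak theoretical memory bandwidth is simply bus width times effective memory data rate. A quick back-of-the-envelope sketch, using the commonly published specs for two of the cards (treat the figures as approximate):

```python
# Peak theoretical memory bandwidth in GB/s:
# (bus width in bits / 8 bytes) * effective memory data rate in MT/s / 1000
def peak_bandwidth_gb_s(bus_width_bits, effective_clock_mts):
    return bus_width_bits / 8 * effective_clock_mts / 1000

# Commonly published specs - approximate, for illustration only.
print(peak_bandwidth_gb_s(256, 1600))  # GeForce 7900 GTX (256-bit GDDR3): ~51.2 GB/s
print(peak_bandwidth_gb_s(128, 800))   # GeForce 8500 GT (128-bit DDR2):   ~12.8 GB/s
```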

Graphics Test 1

Ice Storm Graphics test 1 stresses the hardware’s ability to process lots of vertices while keeping the pixel load relatively light. Hardware on this level may have dedicated capacity for separate vertex and pixel processing. Stressing both capacities individually reveals the hardware’s limitations in both aspects.

In an average frame, 530,000 vertices are processed leading to 180,000 triangles rasterized either to the shadow map or to the screen. At the same time, 4.7 million pixels are processed per frame.

Pixel load is kept low by excluding expensive post processing steps, and by not rendering particle effects.
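To put those per-frame numbers in context at the test's 720p render resolution, here's a quick bit of arithmetic (my framing, not Futuremark's):

```python
# Rough per-frame arithmetic for Ice Storm Graphics test 1 at 720p.
vertices_per_frame = 530_000
triangles_per_frame = 180_000
pixels_per_frame = 4_700_000          # includes shadow map work and overdraw
screen_pixels = 1280 * 720            # 921,600 pixels at 720p

print(pixels_per_frame / screen_pixels)          # ~5.1 shaded pixels per screen pixel
print(vertices_per_frame / triangles_per_frame)  # ~2.9 vertices per rasterized triangle
```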

3DMark - Graphics Test 1

Right off the bat you should notice something wonky. All of NVIDIA's G70 and earlier architectures do very poorly here. This test is very heavy on the vertex shaders, but the 7900 GTX and friends should do a lot better than they do. These workloads, however, were designed for a very different set of architectures. Looking at the unified 8500 GT, we get some perspective. The fastest mobile platform here (Adreno 320) delivers a little over half the vertex processing performance of the GeForce 8500 GT. The Radeon HD 6310 featured in AMD's E-350 is remarkably competitive as well.

The praise goes both ways, of course. The fact that these mobile GPUs can do as well as they do right now is very impressive.

Graphics Test 2

Graphics test 2 stresses the hardware’s ability to process lots of pixels. It tests the ability to read textures, do per pixel computations and write to render targets.

On average, 12.6 million pixels are processed per frame. The additional pixel processing compared to Graphics test 1 comes from including particles and post processing effects such as bloom, streaks and motion blur.

In each frame, an average 75,000 vertices are processed. This number is considerably lower than in Graphics test 1 because shadows are not drawn and the processed geometry has a lower number of polygons.
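Relative to Graphics test 1, the shift in workload is easy to quantify (again, my arithmetic based on Futuremark's numbers above):

```python
# How Graphics test 2 shifts the load compared to Graphics test 1 (both at 720p).
gt1 = {"vertices": 530_000, "pixels": 4_700_000}
gt2 = {"vertices": 75_000, "pixels": 12_600_000}

print(gt2["pixels"] / gt1["pixels"])      # ~2.7x the pixel work of test 1
print(gt2["vertices"] / gt1["vertices"])  # ~0.14, roughly 1/7 the vertex work
print(gt2["pixels"] / (1280 * 720))       # ~13.7 shaded pixels per screen pixel
```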

3DMark - Graphics Test 2

The data starts making a lot more sense when we look at the pixel shader bound Graphics test 2. In this benchmark, Adreno 320 appears to deliver better performance than the GeForce 6600 and, once again, roughly half the performance of the GeForce 8500 GT. Compared to the 7800 GT (or perhaps the 6800 Ultra), we're looking at a bit under a third of the performance of those cards. The Radeon HD 6310 in AMD's E-350 appears to deliver performance competitive with the Adreno 320.
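Those relative positions are just frame-rate ratios against a reference card; a minimal sketch with placeholder numbers (not the measured results) shows the framing:

```python
# Expressing relative GPU performance as a ratio against a reference card.
# The frame rates below are placeholders, not the measured results.
fps = {
    "Adreno 320": 25.0,       # hypothetical
    "GeForce 8500 GT": 50.0,  # hypothetical
    "GeForce 7800 GT": 80.0,  # hypothetical
}

for reference in ("GeForce 8500 GT", "GeForce 7800 GT"):
    ratio = fps["Adreno 320"] / fps[reference]
    print(f"Adreno 320 delivers {ratio:.0%} of the {reference}")
```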

3DMark - Graphics

The overall graphics score is a bit misleading given how poorly the G7x and NV4x architectures did on the first graphics test. We can conclude that the E-350 delivers roughly the same graphics performance as Qualcomm's Snapdragon 600, while the 8500 GT appears to deliver roughly twice that. The overall Ice Storm scores pretty much repeat what we've already seen:

3DMark - Ice Storm

Again, the new 3DMark appears to unfairly penalize the older non-unified NVIDIA GPU architectures. Keep in mind that the last NVIDIA driver drop for DX9 hardware (G7x and NV4x) is about a month older than the latest driver available for the 8500 GT.

It's also worth pointing out that Ice Storm makes Intel's HD 4000 look very good, when in reality we've seen varying degrees of competitiveness with discrete GPUs depending on the workload. If 3DMark's Ice Storm test mapped directly to real world gaming performance, it would mean that devices like the Nexus 4 or HTC One could run BioShock 2-like titles at 10x7 in the 20 fps range. As impressive as that would be, this is ultimately the downside of relying on these types of benchmarks for comparisons - they fundamentally tell us how well these platforms run the benchmark itself, not other games.

At a high level, it looks like we're narrowing down the level of performance that today's high-end ultra mobile GPUs deliver when put in discrete GPU terms. Let's see what GL/DXBenchmark 2.7 tells us.

Comments

  • pSupaNova - Sunday, April 7, 2013 - link

    You're not listening to what Wilco1 is saying.

    Microsoft used a poor Tegra 3 part; the HTC One X+ ships with a Tegra 3 clocked at 1.7GHz.

    So by comparing the Atom-based tablets to the Surface RT, Anand puts the Intel chip in a much better light.
  • zeo - Tuesday, April 16, 2013 - link

    Incorrect: Wilco1 is ignoring the differences in the SoCs. The Tegra 3 is a quad core, and that means it can have up to 50% more performance than an equivalent dual core.

    The Clover Trail, meanwhile, is only a dual core... so while the clock speed may favor the ATOM, the number of cores favors the Tegra 3.

    It doesn't help that the ATOM still wins the run time tests as well, so overall efficiency is clearly in the ATOM's favor. And needing a quad core to beat a dual core still means the ATOM has better performance per core!

    Not that it matters much, as Intel is set to upgrade the ATOM to Bay Trail by the end of the year, which promises up to double the CPU performance (along with going up to quad cores) and triple the GPU performance compared to the present Clover Trail.

    It will also go full 64-bit and offer up to 8GB of RAM... something ARM won't do till about the latter half of 2014 at the earliest, and Nvidia specifically won't do until the Tegra 6... with the Tegra 4 yet to appear in actual products...
  • nofumble62 - Friday, April 5, 2013 - link

    LTE is not available on the Intel platform yet; that is why they don't offer it in the US. But I heard the new Intel LTE chip is pretty good (it won an award), so next year will be interesting.
    The ARM big cores suck up a lot of power when they are running. That is the reason the Qualcomm Snapdragon is winning the latest Samsung S4 (over Samsung's own Exynos chip) and the Nexus 7 (over Nvidia's Tegra).
  • Spunjji - Friday, April 5, 2013 - link

    Nvidia's Tegra isn't really ready for the new Nexus 7, so it's not entirely fair to say it's out because of power issues. When you consider that the S4 situation you described isn't strictly true either (if I buy an S4 here in the UK it's going to have the Exynos chip in it), it tends to harm your conclusion a bit.
  • zeo - Tuesday, April 16, 2013 - link

    LTE will be introduced with the XMM 7160, which will be an optional addition to the Clover Trail+ series that's starting to come out now... Lenovo's K900 being one of the first design wins that has already been announced.

    MWC 2013 showed the K900 off with the 2GHz Z2580, which ups the graphics to dual PowerVR SGX 544 cores at 533MHz... They showcased it running some games and demos like Epic Citadel at full 1080p and the max FPS that demo allows.

    Only issue is the LTE is not integrated into the SoC... so it won't be as power efficient as the other ARM solutions that are coming out with integrated LTE... at least for the LTE part...
  • WaltC - Friday, April 5, 2013 - link

    Unfortunately, that's not what this article delivers. It doesn't tell you a thing about current desktop gpu performance versus current ARM performance. What it does is tell you how obsolete cpus & gpus from roughly TEN YEARS AGO look against state-of-the-art cell-phone and iPad ARM running a few isolated 3DMark graphics tests. What a disappointment. Nobody's even using these desktop cpus & gpus anymore. All this article does is show you how poorly ARM-powered mobile devices do when stacked up against common PC technology from a decade ago! (That's assuming the 3DMark tests used here, such as they are, are actually representative of anything.) Ah, if only he had simply used state-of-the-art desktop cpus & gpus to compare with state-of-the-art ARM devices--well, the ARM stuff would have been crushed by such a wide margin it would astound most people. Why *would you* compare current ARM tech with decade-old desktop cpus & gpus? Beats me. Trying to make ARM look better than it has any right to look? Maybe in the future Anand will use a current desktop for his comparison, such as it is. Right now, the article provides no useful information--unless you like learning about really old x86 desktop technology that's been hobbled...;)

    To be fair, in the end Anand does admit that current ARM horsepower is roughly on a par with ~10-year-old desktop technology IF you don't talk about bandwidth or add it into the equation--in which case the ARMs don't even do well enough to stand up to 10-year-old commonplace cpu & gpu technology. So what was the point of this article? Again, beats me, as the comparisons aren't relevant because nobody is using that old desktop stuff anymore--they're running newer technology from ~5 years old to brand new--and it runs rings around the old desktop nVidia gpus Anand used for this article.

    BTW, and I'm sure Anand is aware of this, you can take DX11.1 gpus and run DX9-level software on them just fine (or OpenGL 3.x-level software, too.) Comments like this are baffling: "While compute power has definitely kept up (as has memory capacity), memory bandwidth is no where near as good as it was on even low end to mainstream cards from that time period." What's "kept up" with what? It sure isn't ARM technology as deployed in mobile devices--unless you want to count reaching ~decade-old x86 "compute power" levels (sans real gpu bandwidth) as "keeping up." I sure wouldn't say that.

    Neither Intel nor AMD will be sitting still on the x86 desktop, so I'd imagine the current (huge) performance advantage of x86 over ARM will continue to hold, if not grow even wider, as time moves on. I think the biggest flaw in this entire article is that it pretends you can make some kind of meaningful comparison between current x86 desktop performance and current ARM performance as deployed in the devices mentioned. You just can't do that--the disparity would be far too large--it would be embarrassing for ARM. There's no need for that because in mobile ARM cpu/gpu technology, performance is *not* king by a long shot--power conservation for long battery life is king in ARM. x86 performance desktops, especially those set up for 3d gaming, are engineered for raw horsepower first and every other consideration, including power conservation, second. That's why Apple doesn't use ARM cpus in Macs and why you cannot buy a desktop today powered by an ARM cpu--the compute power just isn't there, and no one wants to retreat 10-15 years in performance just to run an ARM cpu on the desktop. The forte for ARM is mobile-device use, and the forte for x86 power cpus is on the desktop (and no, I don't count Atom as a powerful cpu...;))
  • pSupaNova - Sunday, April 7, 2013 - link

    How is it embarrassing for ARM? 90% of consumers don't need the power of a desktop CPU for most of their computing.

    Mobile devices have taken the world by storm and have been able to increase their pixel pushing ability exponentially.

    No one is suggesting that mobile chips will suddenly catch their desktop brethren, but it is interesting to see that they are only three times slower than a typical CPU/discrete GPU combo from 2004!
  • zeo - Tuesday, April 16, 2013 - link

    That percentage would be much higher if you eliminated cloud support... the only reason they get away with not needing a lot of performance for the average person is that a lot of the work is offloaded to the cloud instead of running on the device.

    Apple's Siri, for example, runs primarily on Apple servers!

    Some applications like augmented reality, voice control, and other developing features aren't widespread or mature enough to be a factor yet, but when they are, performance requirements will skyrocket!

    People's needs may be small now, but they were even smaller before... so they're steadily increasing, though maybe not as quickly as they have historically. Never underestimate what people may need even just a few years from now.
  • Wolfpup - Friday, April 5, 2013 - link

    Yeah, I've been wanting to know more about these architectures and how they compare to PC components for ages! Nice article.
  • robredz - Sunday, April 7, 2013 - link

    It certainly puts things in perspective in terms of gaming on mobile platforms.
