CPU Performance

While Denver’s architecture is fascinating to study, it’s important to see how well it translates to the real world. Denver on paper is a beast, but in the real world there are a number of factors to consider, not the least of which is the effectiveness of NVIDIA’s DCO. We’ve laid out that Denver’s best and worst case scenarios ride heavily on the DCO, and for NVIDIA to achieve their best-case performance they need to be able to generate and feed Denver with lots and lots of well optimized code. If Denver spends too much time working directly off of ARM code, or can’t do a good job optimizing the recurring code it finds, then Denver will struggle. Meanwhile other important factors are in play as well, including the benefits and drawbacks of Denver’s two cores versus competing SoCs’ quad A15/A57 configurations, and, in thermally constrained scenarios, Denver’s ability to deliver good performance while keeping its power consumption in check.

In order to test this and general system performance, we turn to our suite of benchmarks, which includes browser performance tests, general system tests, and game-type benchmarks. As Denver relies on code-morphing to enable out of order execution and speculative execution, most of these benchmarks should be able to show ideal performance, as loop performance in Denver is basically second to none. While most of these benchmarks are showing their age, they should be usable for valid comparisons until we move to our new test suite.

SunSpider 1.0.2 Benchmark  (Chrome/Safari/IE)

Kraken 1.1 (Chrome/Safari/IE)

Google Octane v2  (Chrome/Safari/IE)

WebXPRT (Chrome/Safari/IE)

Basemark OS II 2.0 - Overall

Basemark OS II 2.0 - System

Basemark OS II 2.0 - Memory

The Basemark System test seems to contribute quite strongly to how the Nexus 9 performs in the overall score. Given that this is a storage performance benchmark, it's likely either that Basemark OS II has issues similar to Androbench on 5.0 Lollipop or that random I/O is heavily prioritized in this test.

Basemark OS II 2.0 - Graphics

There's a noticeable performance uplift in the graphics test. Although graphics is not strictly a CPU matter, the improvement seems at least somewhat plausible, as GPU driver updates can improve performance over time.

Basemark OS II 2.0 - Web

Overall, performance seems to be quite checkered, although improved from our initial evaluation of the Nexus 9. Unfortunately, even in benchmarks where the DCO should be able to easily unroll loops to achieve massive amounts of performance, we see inconsistent performance in Denver. This may come down to an issue with the DCO, or even more simply the fact that Denver is spending more time than it would like to directly executing ARM code as opposed to going through the DCO.

In this case, looking at the SunSpider and Kraken JavaScript benchmarks offers an interesting proxy case for exactly that scenario. SunSpider on modern CPUs executes extremely quickly, so quickly that the individual tests are often over in only a couple dozen milliseconds. This is a particularly rough scenario for Denver, as it doesn’t provide Denver with much time to optimize, even if the code is run multiple times. Meanwhile Kraken pushes many similar buttons, but its tests are longer, and that gives Denver more time to optimize. Consequently we find that Denver’s SunSpider performance is quite poor, underperforming even the A15-based Tegra K1-32, while Denver passes even the iPad Air 2 in Kraken.
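To see why test length matters so much here, consider a toy model of the trade-off. This is only a sketch under stated assumptions: it is not NVIDIA's algorithm, and none of its numbers are measurements of Denver. The function name `effective_speedup`, the hotness threshold, the one-time optimization cost, and the 2x speed of optimized code are all hypothetical values picked for illustration. The model assumes a hot loop runs unoptimized until it crosses the threshold, pays a one-time cost to be optimized, and runs faster from then on.

```python
def effective_speedup(iterations, threshold=1000, optimize_cost=500.0,
                      slow_cost=1.0, fast_cost=0.5):
    """Toy model of a dynamic optimizer; every parameter is hypothetical.

    iterations    -- how many times the hot loop body runs in total
    threshold     -- iterations executed unoptimized before the loop is
                     considered hot enough to optimize
    optimize_cost -- one-time cost (in time units) of optimizing the loop
    slow_cost     -- time units per unoptimized iteration
    fast_cost     -- time units per optimized iteration

    Returns baseline time divided by actual time, so values above 1.0
    mean the optimizer paid off and values below 1.0 mean it lost time.
    """
    baseline = iterations * slow_cost
    if iterations <= threshold:
        # The loop never became hot: nothing was optimized, nothing was paid.
        return 1.0
    actual = (threshold * slow_cost                      # warm-up, unoptimized
              + optimize_cost                            # one-time optimization
              + (iterations - threshold) * fast_cost)    # remainder, optimized
    return baseline / actual


if __name__ == "__main__":
    # Short, medium, and long runs of the same hypothetical loop.
    for n in (100, 1_500, 1_000_000):
        print(f"{n:>9} iterations: speedup {effective_speedup(n):.2f}x")
```

Under these made-up parameters, a run that ends shortly after the threshold comes out slower than if nothing had been optimized at all, while a very long run approaches the full 2x of the optimized code. That is the same shape as the SunSpider versus Kraken split, even though the real thresholds and costs inside Denver are not something this model claims to know.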

Ultimately this kind of inconsistent performance is a risk and a challenge for Denver. While no single SoC tops every last CPU benchmark, we also don’t typically see the kind of large variations that are occurring with Denver. If Denver’s lows are too low, then it definitely impacts the suitability of the SoC for high-end devices, as users have come to expect peppy performance at all times.

In practice, I didn't really notice any issues with the Nexus 9's performance, although there were odd moments during intense multitasking where I experienced extended pauses or freezes. These were likely due to the DCO getting stuck somewhere in execution, as the DCO can often have unexpected bugs, such as repeated FP64 multiplication causing crashes. In general, I also noticed that the device tended to get hot even on relatively simple tasks, which doesn't bode well for battery life. The heat is localized to the top of the tablet, which should help with user comfort, although this comes at the cost of worse sustained performance.


  • seanleeforever - Wednesday, February 4, 2015 - link

    2nd that.
    I am not here to read about how fast the tablet is or how nice it looks. i am here for in depth content about the chip. would it be nice that this content was available since the release of the product? absolutely, but given the resource it would either be a brief review that is going to be the same as review you can find from hundred of other websites, or late but in depth.
    honestly i think anand should be targeting at more tech oriented contents that's few but in depth, and leave the quick/dirty review for other websites.

    superb job.
  • WaitingForNehalem - Wednesday, February 4, 2015 - link

    Yeah but who cares about tablets??!! I don't come to Anandtech to read about budget tablets, or SFF PCs, or more smartphones. The Denver coverage was not even that in depth TBH, just commentary on the NVidia slides. I have a EE degree and some of the previous write ups were so in depth they could be class material. This one isn't which is fine but I don't think it excuses how late it came out. The enthusiast market is growing and you should be targeting that demographic as you previously have, not catering to the mainstream like hundreds of sites already do.
  • retrospooty - Wednesday, February 4, 2015 - link

    The enthusiast market is growing? With CPUs not really getting, or needing to be, any faster for several years now, and a standard mid range quad core i5 (non-overclocked) being WAY more than powerful enough to run 99.9% of anything out there, how is the enthusiast market growing? Most enthusiasts I know don't even bother any more... There just isn't a need. Any basic PC is great these days.
  • WaitingForNehalem - Wednesday, February 4, 2015 - link

    I totally agree with you. That doesn't change the fact that the market is growing as more users are adopting gaming PCs. Enthusiasts now actually command a sizable portion of desktops sold. Intel's Devil's Canyon was in response to that.
  • retrospooty - Thursday, February 5, 2015 - link

    OK, I get what you mean.

    I guess I am still in a mindset where a PC "enthusiast" is your overclocker, tweaker, buying the latest and fastest of everything to eke out that extra few frames per second.

    Today, a mid range quad core i5 from 3 years ago and a decent mid-high range card runs any game quite nicely.
  • FunBunny2 - Thursday, February 5, 2015 - link

    There was a time, readers may be too young to have been there, when there was a Wintel monopoly: M$ needed faster chips to run ever more bloated Windoze and Intel needed a cycle-sink to soak up the increase in cycles that evolving chips provided. Now, we're near (or at?) the limits of single-threaded performance, and still haven't found a way to use multi-processor/core chips in individual applications. There just aren't a) many embarrassingly parallel problems and b) algorithms to turn single-threaded problems into parallel code. I mean, the big deal these days is 4K displays? It looks prettier, to some eyes, but doesn't change the functionality of an application (medical and such excepted, possibly).

    Does anyone really need an i7 to surf the innterTubes for neater porn?
  • nico_mach - Friday, February 6, 2015 - link

    I think the chip coverage was superb, I don't have an EE degree and I'm pretty sure that's what the website is steered towards. And I still think I got it.

    It's fascinating the number of layers involved in this Android tablet, and speaks to why Apple can optimize so much better. There's the chip->NVIDIA chip optimizer->executable code->Dalvik compiler/runtime->dalvik code. I mean, when the lags are encountered, that's twice as many suspects to investigate.

    I still think that the review is a little harsh on Denver. It's hitting the right performance envelope at the right price. While it's a mildly inefficient design, clearly NVIDIA is pricing it accordingly, and that might be a function of moving some of the optimization work to software. And that's work that Apple and MS do all the time - Apple much more successfully, obviously. There's a real gap in knowledge of how efficient Apple's chips are vs how optimized the software/hardware pairing is.
  • dakishimesan - Wednesday, February 4, 2015 - link

    I have no interest in tablets, but the deep dive on Denver was a fascinating read, and still completely relevant even if the product is a few months old. Thanks for the great review.
  • Sindarin - Wednesday, February 4, 2015 - link

    ...can I offer you a cup of hot chicken soup laddy? .....maybe some vicks vapor rub? lol! c'mon dude! we're all sick(vaca) in December!
  • hahmed330 - Wednesday, February 4, 2015 - link

    Hi, outstanding article with incredible attention to detail... Do you think it's possible to run the Dynamic Code Optimizer on, say, 2 or maybe even 4 small CPU cores dedicated to doing all the software OoOE functions instead of using time slicing? (A53s or just some XYZ narrow cores for a potential 2+2 or 4+4 or maybe even 8+8)

    Also, what's the die size of a Denver core in comparison to an enhanced Cyclone core?? That is where a lot of gains are possible, potentially 30%-50%..
