SPECing Denver's Performance

Finally, before diving into our look at Denver in the real world on the Nexus 9, let’s take a look at a few performance considerations.

With so much of Denver’s performance riding on the DCO, we’ll start with a slide from NVIDIA profiling the execution of SPECint2000 on Denver. In it, NVIDIA showcases how much time Denver spends on each type of code execution (native ARM code, the optimizer, and finally optimized code) along with an idea of the IPC it achieves on this benchmark.

What we find is that as expected, it takes a bit of time for Denver’s DCO to kick in and produce optimized native code. At the start of the benchmark execution with little optimized code to work with, Denver initially executes ARM code via its ARM decoder, taking a bit of time to find recurring code. Once it finds that recurring code Denver’s DCO kicks in – taking up CPU time itself – as the DCO begins replacing recurring code segments with optimized, native code.

In this case the DCO never consumes too great a percentage of CPU time; however, NVIDIA’s example has the DCO noticeably running for quite some time before it finally settles down to an imperceptible fraction. Initially a much larger fraction of the time is spent executing ARM code on Denver due to the time it takes for the optimizer to find recurring code and optimize it. Similarly, another spike in ARM code is found roughly mid-run, when Denver encounters new code segments that it needs to execute as ARM code before optimizing them and replacing them with native code.

Meanwhile there’s a clear hit to IPC whenever Denver is executing ARM code, with Denver’s IPC dropping below 1.0 whenever it’s executing large amounts of such code. This in a nutshell is why Denver’s DCO is so important and why Denver needs recurring code, as it’s going to achieve its best results with code it can optimize and then frequently re-use those results.
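
To make the reuse argument concrete, here is a toy amortization sketch. It is not NVIDIA's actual heuristic: the function name `break_even_runs` and all cycle counts are hypothetical, chosen only to illustrate why a one-time optimization cost pays off only for code that recurs often enough.

```python
import math

def break_even_runs(optimize_cost_cycles, arm_cycles_per_run, optimized_cycles_per_run):
    """Return how many executions of a code segment are needed before a
    one-time optimization cost is repaid by the per-run cycle savings.

    All inputs are hypothetical illustration values, not measured Denver data.
    Returns None if the optimized version saves nothing per run.
    """
    saving_per_run = arm_cycles_per_run - optimized_cycles_per_run
    if saving_per_run <= 0:
        return None  # optimization never pays for itself
    return math.ceil(optimize_cost_cycles / saving_per_run)

# Hypothetical segment: 50,000 cycles to optimize, 1,000 cycles per run as
# decoded ARM code, 600 cycles per run once optimized.
print(break_even_runs(50_000, 1_000, 600))  # 125
```

Under these made-up numbers, a segment executed fewer than 125 times would have been cheaper to leave as ARM code, which is the intuition behind Denver favoring hot, recurring code.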

Also of note though, Denver’s IPC per slice of time never gets above 2.0, even with full optimization and significant code recurrence in effect. The specific IPC of any program is going to depend on the nature of the code, but this serves as a good example of the fact that even with a bag full of tricks in the DCO, Denver is not going to sustain anything near its theoretical maximum IPC of 7. Individual VLIW instructions may hit 7, but over any period of time if a lack of ILP in the code itself doesn’t become the bottleneck, then other issues such as VLIW density limits, cache flushes, and unavoidable memory stalls will. The important question is ultimately whether Denver’s IPC is enough of an improvement over Cortex A15/A57 to justify both the power consumption costs and the die space costs of its very wide design.

NVIDIA's example also neatly highlights the fact that due to Denver’s favoritism for code reuse, it is in a position to do very well in certain types of benchmarks. CPU benchmarks in particular are known for their extended runs of similar code to let the CPU settle and get a better sustained measurement of CPU performance, all of which plays into Denver’s hands. Which is not to say that it can’t also do well in real-world code, but in these specific situations Denver is well set to be a benchmark behemoth.

To that end, we have also run our standard copy of SPECint2000 to profile Denver’s performance.

SPECint2000 - Estimated Scores

Benchmark      K1-32 (A15)   K1-64 (Denver)   % Advantage
164.gzip       869           1269             46%
175.vpr        909           1312             44%
176.gcc        1617          1884             17%
181.mcf        1304          1746             34%
186.crafty     1030          1470             43%
197.parser     909           1192             31%
252.eon        1940          2342             20%
253.perlbmk    1395          1818             30%
254.gap        1486          1844             24%
255.vortex     1535          2567             67%
256.bzip2      1119          1468             31%
300.twolf      1339          1785             33%

Given Denver’s obvious affinity for benchmarks such as SPEC we won’t dwell on the results too much here. But the results do show that Denver is a very strong CPU under SPEC, and by extension under conditions where it can take advantage of significant code reuse. Similarly, because these benchmarks aren’t heavily threaded, they’re all the happier with any improvements in single-threaded performance that Denver can offer.

Coming from the K1-32 and its Cortex-A15 CPU to K1-64 and its Denver CPU, the actual gains are unsurprisingly dependent on the benchmark. The worst case scenario of 176.gcc still has Denver ahead by 17%, while the best case scenario of 255.vortex has Denver besting A15 by 67%, coming closer than one would expect to doubling A15's performance entirely. The best case scenario is of course unlikely to occur in real code, though I’m not sure the same can be said for the worst case scenario. At the same time we find that there aren’t any performance regressions, which is a good start for Denver.
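
For readers who want to check the arithmetic, the advantage column can be recomputed from the two score columns. The sketch below uses the estimated scores from the table; the names `scores` and `advantage_pct` are ours, and the calculation is the plain ratio of Denver's score to A15's score.

```python
# Estimated SPECint2000 scores from the table: (K1-32 / A15, K1-64 / Denver)
scores = {
    "164.gzip": (869, 1269),
    "175.vpr": (909, 1312),
    "176.gcc": (1617, 1884),
    "181.mcf": (1304, 1746),
    "186.crafty": (1030, 1470),
    "197.parser": (909, 1192),
    "252.eon": (1940, 2342),
    "253.perlbmk": (1395, 1818),
    "254.gap": (1486, 1844),
    "255.vortex": (1535, 2567),
    "256.bzip2": (1119, 1468),
    "300.twolf": (1339, 1785),
}

def advantage_pct(a15_score, denver_score):
    """Percentage by which the Denver score exceeds the A15 score."""
    return (denver_score / a15_score - 1.0) * 100.0

advantages = {name: advantage_pct(a15, denver) for name, (a15, denver) in scores.items()}

# Identify the smallest and largest gains.
worst = min(advantages, key=advantages.get)
best = max(advantages, key=advantages.get)

for name, pct in advantages.items():
    print(f"{name:12s} {pct:5.1f}%")
print("worst:", worst, "best:", best)
```

The recomputed values agree with the table to within rounding; 252.eon works out to roughly 20.7%, so the table's 20% appears to be truncated rather than rounded.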

If nothing else it's clear that Denver is a benchmark monster. Now let's see what it can do in the real world.


169 Comments

  • melgross - Wednesday, February 4, 2015 - link

    So, people only buy devices during the first three months?
  • Impulses - Wednesday, February 4, 2015 - link

    Apparently... Although getting the review in before February would've shut all these people up, cheapest place to get the Nexus 9 all thru the holidays was Amazon ($350 for 16GB) and they gave you until January 31 to return it regardless of when you bought it.

    Only reason I'm so keenly aware is I bought one as a February birthday gift, opened it last weekend just to check it was fine before the return window closed... Not much backlight bleed at all even tho it was manufactured in October (bought in late December), some back flex but it's going in a case anyway.
  • blzd - Friday, February 6, 2015 - link

    What does the month of manufacture have to do with the back light bleed? You don't actually believe those "revision" rumors, do you?

    If you do, consider how practical it is for a hardware revision to come out 1 month after release. Then consider how one set of pictures on a Reddit post proves anything other than that their RMA worked as intended.
  • ToTTenTranz - Wednesday, February 4, 2015 - link

    I wish more smartphone/tablet makers put as much thought into their external speakers as HTC does.

    Once having a HTC One M7, I simply can't go back to mono speakers at the back of devices.
  • Dribble - Wednesday, February 4, 2015 - link

    Glad the review is here at last, next one a little bit quicker please :)
  • UpSpin - Wednesday, February 4, 2015 - link

    I have the following issues with your review:
    1. You run web browser tests and derive CPU performance from them. That's nonsense! It's a web browser test, and it won't be a CPU test whatever you do. If you want to test raw CPU performance you have to run native CPU test applications.

    2. Your battery life analysis is based on false assumptions and you derive doubtful claims from it.
    The error is quite evident on the iPad Air test. In your newly introduced white display test, with airplane on, CPU/GPU idling, etc. the iPad Air 2 has a battery life of 10.18 hours. Now in your web-browsing battery test with WiFi on and the CPU busy, the iPad Air 2 has a battery life of 9.76 hours. That's a difference of 4%. The Nexus 9 has a difference of 30%, the Note 4 15%, the Shield Tablet 25%.
    You conclude: The Tegra K1 is inefficient. But I could also conclude that the A8 is inefficient and the Tegra K1 very efficient. The Tegra K1 needs significantly less power while idling, compared to the A8, which always consumes the same, mostly independent of the load. So finally, the A8 lacks any kind of power saving mode.
    That's abstruse, but the consequence of your test. Or maybe your test is flawed from the beginning on.

    3. " I suspect we’re looking at the direct result of the large battery, combined with an efficient display as the Nexus 9 can last as long as 15 hours in this test compared to the iPad Air 2’s 10 hours."
    Sorry, but I don't get this either. The Nexus 9 has a 25.46 WHr battery, the iPad Air 2 a 27.3 WHr battery (+7%). The Nexus 9 has an 8.9" display, the iPad Air 2 a 9.7" (+19% area). The resolution is the same, thus the DPI on the Nexus 9 is higher. The display technology is the same, as you said in your analysis. So the difference must be related to something else, like a highly efficient idle SoC in the Nexus 9.
  • Andrei Frumusanu - Wednesday, February 4, 2015 - link

    The battery life tests analysis is based on true facts on the technical workings of the SoC and its idle power states and we are confident in the resulting conclusions.
  • JarredWalton - Wednesday, February 4, 2015 - link

    Going along with what Andrei said, an SoC isn't "efficient" if it's doing no work -- the A8 may not have idle power as low as the K1-64, but when you're actually doing anything more with the tablet in question is when efficiency matters. It's clear that the Air 2 wins out over the Nexus 9 in some of those tests (GFX in particular). Doing more (or equivalent) work while using less power is efficient.

    Imagine this as an example of why idle power only matters so far: if you were to start comparing cars on how long they could idle instead of actual gas mileage, would anyone care? "Car XYZ can run for 20 hours off a tank while idle while Car ZYX only lasts 15 hours!" Except, neither car is actually doing what a car is supposed to do, which is take you from point A to point B.

    The white screen test is merely a way to look at the idle power draw for a device, and by that we can get an idea of how much additional power is needed when the device is actually in use. Also note that it's possible due to the difference in OS that Android simply better disables certain services in the test scenario and iOS might be wasting power -- the fact that the battery life hardly changes in our Internet WiFi test even suggests that's the case.

    To that end, the battery life of the N9 is still quite good. Get rid of the smartphones in the charts and it's actually pretty much class leading. But it's still odd that the NVIDIA SHIELD Tablet and iPad Air 2 only show a small drop between idle and Internet, while N9 loses 33% of its battery life.
  • ABR - Thursday, February 5, 2015 - link

    Idle power is pretty important for real world use for tablets, for example where you are reading something and the system is just sitting there. Those "load web page then pause for xx time" tests would probably be really good for measuring.
  • JarredWalton - Thursday, February 5, 2015 - link

    That's exactly what our Internet test does, which is why the 33% drop in battery life is so alarming. What exactly is going on that N9 loading a generally not too complex web page every 15 seconds or so kills battery life?
