Power Consumption: Big Improvements to Video Playback

It was teased earlier in the review, but it makes sense at this stage to talk about power consumption.

With a system as complex as a modern APU or SoC, our initial plan for this review was to get a development system with the right shunts and hooks to measure core and graphics power separately, in a thermally unconstrained environment, for both Kaveri and Carrizo. Unfortunately the parts didn’t come together at the time they were needed. Instead we had access to a Watts Up PRO, a power-outlet-based monitor with some recording capabilities. While the hardware was not ideal for what we wanted to test, it provided a large chunk of interesting data.

We ran a number of tests with data monitoring enabled on both HP Elitebooks. When AMD released Carrizo, a lot of fuss was made about video playback, for several reasons. Firstly, Carrizo implements an adjusted playback pathway for data: instead of moving data from the decoder to the GPU to the display controller, it moves data directly from the decoder to the display, saving power in the process.

AMD also listed the video playback power of Carrizo (as compared to Kaveri) as significantly reduced. In the example above in the top right, the Kaveri APU is consuming nearly 5W, whereas Carrizo will consume only 1.9W for 1080p content.

The other video playback optimization in Carrizo is in the Unified Video Decoder (UVD). The bandwidth and capability of the UVD are increased four-fold, allowing the system to ‘sleep’ between completed frames, saving power.

Video Playback, 1080p30 h264

For the first test, we took a 1920x1080 resolution h264 video at 30 FPS (specifically Big Buck Bunny) and recorded the power consumption for playback.

The difference here is striking. The Carrizo system in this instance has lower sustained power consumption than the Kaveri system. Overall, the Kaveri system draws 11W over idle to play back our test video, while the Carrizo system draws only 6.8W over idle for the same task. Put another way, the load power cost at the wall for watching 1080p video is about 4W lower on Carrizo than on Kaveri, which is close to what AMD claimed in the first slide above (and note we’re measuring at the wall, so chances are there are other chipset optimizations being done under the hood).
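As a rough sketch of how a "watts over idle" figure like the ones above can be derived from a wall-meter log: average the samples in an idle window, average the samples in a playback window, and subtract. The function name, the window indices and the sample values below are hypothetical and chosen only for illustration; they are not the logged data from this review.

```python
from statistics import mean


def load_over_idle(samples_w, idle_window, load_window):
    """Return (idle_avg, load_avg, delta) in watts.

    samples_w   -- wall-power readings in watts, evenly spaced in time
    idle_window -- (start, end) sample indices for the idle period, end exclusive
    load_window -- (start, end) sample indices for the playback period, end exclusive
    """
    idle_avg = mean(samples_w[idle_window[0]:idle_window[1]])
    load_avg = mean(samples_w[load_window[0]:load_window[1]])
    return idle_avg, load_avg, load_avg - idle_avg


# Hypothetical log: four idle samples followed by four playback samples.
example_log = [8.0, 8.0, 8.0, 8.0, 14.0, 15.0, 15.0, 16.0]
idle_avg, load_avg, delta = load_over_idle(example_log, (0, 4), (4, 8))
print(f"idle {idle_avg:.1f} W, load {load_avg:.1f} W, delta {delta:.1f} W")
```

Because the measurement is taken at the wall, a delta computed this way covers the whole platform (display, memory, storage, power conversion losses), not just the APU.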

Video Playback, 2160p30 h264

The same video but in 4K format was also tested on both systems. It is at this point I should say that the Kaveri system was unable to play the 4K video properly (Kaveri doesn't officially support 4K decoding to begin with), and would only show about 20% of the frames. Audio was also affected.

In this case power consumption is above that of the 1080p video, and both systems require around 11 watts from idle to sustained performance. The added benefit with Carrizo though is that you can actually watch the video.

Other Power Benchmarks

We also ran power tests on a set of our regular benchmarks to see the results.

Three-Dimensional Particle Movement

In our 3DPM test, we typically script up a batch of six runs and take the average score. For the power tests we did this in both the single threaded and multithreaded modes.

In single threaded mode, two interesting things occurred. First, as we expected, the Carrizo system can idle lower than the Kaveri. Second, the Carrizo system actually goes into a higher power state at load, by almost 4W. This means that the delta (load to idle) is 8W higher for Carrizo than Kaveri.

It is easy to take away from this that Carrizo, as an APU, uses more power. But that is not what is happening. Carrizo, unlike Kaveri, integrates the chipset onto the same die as the APU (better integration, saves power), which also means that the chipset is essentially shut off at idle. Part of Carrizo’s optimizations is power management, so the ability to shut something down and fire it back up again gives a larger low-to-high delta automatically. Essentially, more things are turning on. The fact that the Carrizo power numbers are higher than Kaveri during the benchmark is reflected in the performance, despite Kaveri having the higher TDP.

For the multithreaded test, both systems settle to similar power consumption as in the single threaded test, although the Carrizo system has a much more varied power profile and also finishes the benchmark earlier than the Kaveri.

Octane and Kraken

We expect the web tests to be partially threaded, but because they probe a number of real-world and synthetic workloads, there should be some power variation.

Octane is actually relatively flat, producing similar power profiles on both systems. Again, it looks like Carrizo expends more energy to do the same amount of work; however, it is easy to forget that the Carrizo idle power state is lower due to optimizations.

With Kraken we also get a flat profile, although one could argue that we’re seeing a classic case of running quick and finishing the benchmark sooner vs. a more sedate path.

WebXPRT

This graph was shown earlier in the review, but let’s look at it again, as it is a good example of a bursty workload:

With average power numbers only a few watts above the idle numbers, both systems do a good job on overall power, though again it is easy to think that the larger delta of the Carrizo numbers means that the APU is consuming more power. This is where, if you try to calculate the actual energy consumed for each system, you get stupid numbers: 1208.7 joules for the Kaveri and 1932.8 joules for the Carrizo. Without starting from the same platform (or without taking numbers direct from the cores), there are obviously other things at play (such as Carrizo’s capability to control more power planes).
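For reference, an energy figure in joules is obtained by integrating power over time. The sketch below assumes a log of evenly spaced wall-power samples (one per second is assumed here) and uses the trapezoidal rule; an optional baseline lets you subtract idle power to count only the energy above idle. The function name and sample values are hypothetical, and this is not necessarily how the figures above were calculated.

```python
def energy_joules(samples_w, interval_s=1.0, baseline_w=0.0):
    """Integrate power samples (watts) over time using the trapezoidal rule.

    samples_w  -- power readings in watts, evenly spaced in time
    interval_s -- seconds between consecutive readings (assumed 1 s here)
    baseline_w -- power to subtract from every reading, e.g. the idle average
    """
    total = 0.0
    # Each pair of neighbouring samples bounds one trapezoid of width interval_s.
    for a, b in zip(samples_w, samples_w[1:]):
        total += ((a - baseline_w) + (b - baseline_w)) / 2.0 * interval_s
    return total


# Hypothetical bursty trace: idle at 10 W with a short burst to 12 W.
trace = [10.0, 12.0, 12.0, 10.0]
print(energy_joules(trace))                   # total energy at the wall
print(energy_joules(trace, baseline_w=10.0))  # energy above a 10 W idle baseline
```

The choice of baseline matters a great deal for a bursty workload: two systems with different idle levels can swap places depending on whether total energy or energy above idle is compared, which is one reason wall-level energy numbers across different platforms are hard to interpret.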

WinRAR

Our final power test is WinRAR, which is characterized as a variably threaded load involving lots of little compressible web files and a handful of incompressible videos.

In this instance I was surprised to see both systems perform similarly. The HP Elitebook G2 actually has the upper hand here, as it is equipped with dual channel memory. WinRAR is heavily affected by memory bandwidth, so the G2 has the upper hand in performance, but it will also balance between drawing more power for two modules and running in a more efficient mode if there is sufficient data at the CPU.

175 Comments

  • Danvelopment - Monday, February 8, 2016 - link

    Strategy AMD should adopt:

    90% of people don't notice a performance difference above 3000 Super CPU Points, Intel CPUs are usually 4000-8000 Super CPU points, our chips may only range from 3500-4500 Super CPU Points but regular users won't actually notice it, and at the same performance marks we're a hundred dollars cheaper. Make the sensible choice.

    Another way, we've done extensive testing to see what end users want and need, then we targeted those sectors, and where we matched Intel we made sure we were a hundred bucks cheaper on the same devices.

    "We don't hold the performance crown but the price/performance crown"
  • Marcelo Viana - Monday, February 8, 2016 - link

    Dammit, the solution should be simple, but must come from AMD, since can't expect it from oem's and all of them offer let's say 2011 sockets as example, why amd do not develop a socket switch, so a small board with 2011 pins on the bottom and a circuit on this board to give a whatever socket amd choose connections on top of it, in order to accept amd chips.
    But AMD must understand that the memory on their chips must be ddr4(Carrizo do), because the lazy OEMs won't change memory sockets, as example.
    In this case the lazy ones have only to change the chip, and even better if any consumer have a old machine can upgrade to a chip that they choose. simple as that.
    Anyone that sells more creates the standard on the market, the others is that must follow.
    So who control the user experience? I think no one. everyone in the process just looking to explore the users in order to get money nothing more, but if i have to guess, probably the users. Because they are the one that really have the power to say "i won't buy it or that" or even better "until they give to me what i want" just my 2 cents.
  • farmergann - Tuesday, February 9, 2016 - link

    Seems like you missed out on some highlights of the Y700. The memory is dual channel, the IPS screen has Freesync, and the sound is surprisingly awesome. Replaced the HDD with a Samsung 850 Pro and have thoroughly enjoyed it since.
  • bitech - Tuesday, February 9, 2016 - link

    Lol have they never seen a 17" laptop before? The HP Pavilion has a 1600x900 because it's 17". 1600x900 is the minimum resolution on all 17" laptops, not 1366x768.
  • UtilityMax - Wednesday, February 10, 2016 - link

    1600x900 is still a crappy resolution for such a large screen. I had a notebook with 15.5 inch 900p screen, and it was visibly grainy.
  • mosu - Tuesday, February 9, 2016 - link

    Just few words: Sabotage and corruption at high level OEM decision level. Simple as that.
  • Arief Sujadmika - Wednesday, February 10, 2016 - link

    AMD just needs a feature to turn off the chips if it detects single channel memory for Carrizo then the OEM will make dual channel memory for it...
  • thatthing - Wednesday, February 10, 2016 - link

    the y700 r9 385x is a bonaire gpu, amd has no 512sp chips mobile r9 series, http://www.amd.com/en-us/products/graphics/noteboo...
  • silverblue - Wednesday, February 10, 2016 - link

    Articles like these make me want to see how good the unrestricted Athlon X4 845 will be, however as it's probably defective Carrizo silicon, I wouldn't expect it to be massively frugal. I do wonder if there will be any Bristol Ridge Athlons; the top models are rated with a cTDP of 25-45W which is a decent improvement and would reduce/eliminate throttling. Overclocking may not help in terms of power but performance would be more consistent. You also get DDR4 which isn't as big a help for the Athlons but it would be interesting to see the difference.

    A review of the Dell Inspiron I3656-7800BLK would be a good marker, if only to show the maximum performance of the mobile chips.
  • Masospaghetti - Wednesday, February 10, 2016 - link

    Seems like the best configuration of a Carrizo machine would be a 35w TDP A12 with dual channel memory and integrated graphics (or discrete graphics with crossfire enabled).

    It's a shame that all of the machines available are severely compromised with either single channel memory, 15w TDP, lack of crossfire, or a combination of these. Seriously. The machines tested have terrible designs. Looks like AMD made a huge mistake providing a common configuration with Carrizo-L with the single channel memory.
