Benchmark Overview

For our testing we had all of the laptops at the same time for the best part of a workweek, alongside meetings with AMD to discuss the microarchitecture and platform positioning. Each system was reset to a fresh OS state, and we then applied the high performance power profile for benchmarking in an air-conditioned lab.

The benchmarks fall into several areas:

Short Form CPU

Our short form testing script uses a straight run through of a mixture of known apps or workloads, and requires about four hours.

CPU Short Form Benchmarks
Three Dimensional Particle Movement (3DPM): 3DPM is a self-penned benchmark, derived from my academic research years looking at particle movement parallelism. The coding for this tool was rough, but it emulates the real world in being non-CompSci trained code for a scientific endeavor. The code is unoptimized, but the test uses OpenMP to move particles around a field using one of six 3D movement algorithms in turn, each of which is found in the academic literature. This test is performed in single thread and multithreaded workloads, and uses purely floating point numbers. The code was compiled in Visual Studio 2008 in Release mode with all optimizations (including fast math and -Ox) enabled. We take the average of six runs in each instance.
WinRAR 5.01: WinRAR is compression software that reduces file size at the expense of CPU cycles. We use the version that has been a stable part of our benchmark database through 2015, and run the default settings on a 1.52GB directory containing over 2800 files representing a small website with around thirty half-minute videos. We take the average of several runs in this instance.
POV-Ray 3.7 beta: POV-Ray is a common ray-tracing tool used to generate realistic looking scenes. We've used POV-Ray in its various guises over the years as a good benchmark for performance, as well as a tool on the march to ray-tracing limited immersive environments. We use the built-in multithreaded benchmark.
HandBrake: HandBrake is a freeware video conversion tool. We use the tool to process two different videos: first a 'low quality' two hour video at 640x388 resolution to x264, then a 'high quality' ten minute video at 4320x3840. The low quality video scales on lower performance hardware, whereas the buffers required for the high quality video can stretch even the biggest processors. At present, this is a CPU only test.
7-Zip: 7-Zip is a freeware compression/decompression tool that is widely deployed across the world. We run the included benchmark tool using a 50MB library and take the average of a set of fixed-time results.
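
As a rough illustration of the 3DPM-style workload and the six-run averaging described above, here is a minimal Python sketch. It is not the 3DPM source: the single movement rule (a unit step in a uniformly random direction), the function names and the particle/step counts are all assumptions made for the example, and it runs single threaded rather than through OpenMP.

```python
import math
import random
import time
from statistics import mean


def random_unit_step(rng):
    """Return a unit-length step in a uniformly random 3D direction."""
    z = rng.uniform(-1.0, 1.0)              # cosine of the polar angle
    phi = rng.uniform(0.0, 2.0 * math.pi)   # azimuthal angle
    r = math.sqrt(1.0 - z * z)
    return r * math.cos(phi), r * math.sin(phi), z


def move_particles(n_particles, n_steps, seed=0):
    """Move every particle n_steps unit steps from the origin (hypothetical workload)."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        x = y = z = 0.0
        for _ in range(n_steps):
            dx, dy, dz = random_unit_step(rng)
            x, y, z = x + dx, y + dy, z + dz
        positions.append((x, y, z))
    return positions


def average_runtime(fn, runs=6):
    """Time fn() `runs` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return mean(timings)


if __name__ == "__main__":
    avg = average_runtime(lambda: move_particles(1000, 100))
    print(f"mean of 6 runs: {avg:.4f} s")
```

Averaging several runs, as the article does, smooths out one-off stalls from background activity; the fixed seed keeps every run doing identical work.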

Web and Synthetic

The web tests are the usual mix of Octane and Kraken, with WebXPRT added in. Synthetic CPU testing draws on our long term data from Cinebench and x264.

Web and Synthetic Benchmarks
Google Octane 2.0: Lots of factors go into web development, including the tools used and the browser those tools play in. One of the common and widely used benchmarks to judge performance is Google Octane, now in version 2.0. To quote: 'The updated Octane 2.0 benchmark includes four new tests to measure new aspects of JavaScript performance, including garbage collection / compiler latency and asm.js-style JavaScript performance.'
Mozilla Kraken 1.1: Kraken is a similar tool to Google Octane, focusing on web tools and processing power. Kraken's tools include searching algorithms, audio processing, image filtering, flexible database parsing and cryptographic routines.
WebXPRT 2013/2015: WebXPRT aims to be a souped up version of Octane and Kraken, using these tools in real time to display data in photograph enhancement, sorting, stock options, local storage manipulation, graphical interfaces and even filtering algorithms on scientific datasets. We run the 2013 and 2015 versions of the benchmark.
Cinebench: Cinebench is a widely known benchmarking tool for measuring performance relative to MAXON's animation software Cinema 4D. Cinebench has been optimized over a decade and focuses purely on CPU horsepower, meaning if there is a discrepancy in pure throughput characteristics, Cinebench is likely to show that discrepancy. Arguably other software doesn't make use of all the tools available, so the real world relevance might be purely academic, but given our large database of Cinebench data it seems difficult to ignore a small five minute test. We run the modern version 15 in this test, as well as the older 11.5 due to our back data.
x264 HD 3.0: Similarly, the x264 HD 3.0 package we use here is kept for historic regression data. The latest version is 5.0.1, which encodes a 1080p video clip into a high quality x264 file. Version 3.0 performs the same test on a 720p file only, and in most circumstances hits its limit on high end processors, but it still works well for mainstream and low-end parts. This version also takes only a few minutes, whereas the latest can take over 90 minutes to run.

Professional and OpenCL

Our professional tests involve a synthetic workload (PCMark), a 2D to 3D image and model conversion tool used by archivists and modelers (Agisoft in CPU only and OpenCL mode) as well as Linux Bench. Unfortunately Linux Bench only seemed to work on a pair of systems.

Professional and OpenCL Benchmarks
PCMark08: PCMark08, developed by Futuremark, is a simple press play and run benchmarking tool designed to probe how well systems cope with a variety of standard tasks that a professional user might encounter. This includes video conferencing with multiple streams, image/file manipulation, video processing, 3D modelling and other tools. In this case we take the three main benchmark sets, Creative, Home and Work, and run them in OpenCL mode, which aims to take advantage of OpenCL accelerated hardware. For fun we also put in the PCMark08 Storage workset.
Agisoft Photoscan: Photoscan is professional software that takes a series of 2D images (as few as 50, usually 250+) and 'performs calculations' to determine where the pictures were taken and whether it can create a 3D model and textures of what the images show. This model can then be exported to other software for touch-ups or implementation in physics engines/games or, as with the reader who directed me to it, national archiving. The tool has four phases, one of which can be OpenCL accelerated, while the other three are a mix of single thread and variable thread workloads. We ran the tool in CPU only and OpenCL modes.
Linux Bench: Linux Bench is a collection of Linux based benchmarks compiled together by ServeTheHome. The idea is to have some non-Windows based tools that are easy enough to run with a USB key, an internet connection and three lines of code in a terminal. The tests in Linux Bench include standard synthetic compute, compression, matrix manipulation, database tools and key-value storage.

Gaming (3DMark, Rocket League)

Due to timing we were only able to run a couple of gaming tests, namely parts of the 3DMark suite and our Rocket League test.

Gaming Benchmarks
3DMark: 3DMark is Futuremark's premium software, developed to tax systems at various different performance levels. The software contains several benchmarks as a result, ranging from tests aimed at smartphones all the way up to 4K, quad-SLI systems with as many in-game and post processing effects as you can throw at them. The base test, Ice Storm, is actually a good indicator of GPU scaling performance, but we also test Cloud Gate, Sky Diver and Fire Strike to get a measure of all of our systems.
Rocket League: Hilariously simple pick-up-and-play games are great fun. I'm a massive fan of the Katamari franchise for that reason: pressing start on a controller and rolling around, picking up things to get bigger, is extremely simple. Rocket League combines the elements of pick-up-and-play, allowing users to jump into a game with other people (or bots) to play football with cars with zero rules. The title is built on Unreal Engine 3, which allows users to run the game on super-low-end systems while still taxing the big ones.

Power and Performance Testing

A portion of our benchmarks were profiled for their effect on CPU temperature, frequency and usage. Both of the HP EliteBooks, the Kaveri and Carrizo units, were also hooked up to a Watts Up PRO monitor for a full shakedown of power consumption on some of the more popular tests. We will go into both sets of results in detail.

Thermal Effects

As we have seen in previous laptop benchmarking scenarios, the design of the chassis is an important part of understanding how a processor will react to a workload. Some units have their skin temperature limit set unbearably high in order to get the best performance, whereas others are more restrictive. Carrizo promotes the expansion of both of these facets for either better performance or thermals, so we tested this with a FLIR thermal camera during Rocket League on all five systems, as well as with some internal recording scripts during a few benchmarks.

A Side Note Worth Remembering

One intriguing thing to mention in our testing was background processes. Normally every effort is made to minimize these (disabling WiFi when not needed, disabling updates), however when a system comes preinstalled with Intel McAfee anti-virus, it can be an exercise to remove it. Yes, that’s right: for some odd reason, some of the OEMs’ systems had Intel McAfee pre-installed. I assume it is because the OEM gets a small kickback for including it on their OS image, thereby either increasing margins or reducing the price of the system. McAfee AV is an example of a simple piece of software that can provide a negative user experience: it checks for updates when you least expect it and performs mini-scans of everything coming in and out of an I/O port. On systems that have mechanical hard drives and single channel memory, it can be the difference between casually watching a film and having to apologize for why a video is dropping frames. Needless to say, it was obliterated.

The other issue is actually a default Windows problem. Whenever certain I/O devices are plugged in or removed, or even at random times, the system will call Windows Defender to start probing files and memory in use. The issue here is twofold: it eats up a thread with mostly integer/string work, reducing the resources available to the user, and on occasion it will bring disk drive utilization to 100%, causing access delays when the user is in the middle of something. While Defender can be a critical part of a safer environment, it boggles my mind that it comes on so freely and robs a poorly configured system of its user experience. It also drains battery life. This is a disconnect between software developers, who are not writing code suitable for the resources available; OEMs, who decide what hardware is good for a particular price point and believe users are satisfied with such a user experience; and hardware manufacturers, who do not circle back round to test the most relevant use cases. It ends up being a negative loop where no-one works with anyone else, which benefits no-one (more on this later).

Consequently, for our testing I also turned down Windows Defender's activity/sensitivity on all of the test laptops. My personal (insert subjective experience mode) way of ‘delaying’ Windows Defender is to open Task Scheduler, navigate to Microsoft > Windows > Windows Defender, and for each of the four tasks change the conditions to:

- Enable ‘Start the task only if the computer is on AC power’
- Enable ‘Stop if the computer switches to battery power’
- Enable ‘Start the task only if the computer is idle for X minutes’
- Enable ‘Stop if the computer ceases to be idle’

Determining a true in-OS idle state is somewhat difficult, as some usage involves long periods without any input (e.g. watching an online video), so having Defender come in after 30 seconds of idle isn’t usually beneficial. I (personally) set it to 10 minutes on lower end systems where responsiveness matters.
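
The four conditions above amount to a simple gate on power source and idle time. The Python sketch below is only a toy model of that logic, not how Task Scheduler is implemented; `SystemState`, `may_start` and `must_stop` are hypothetical names, and treating "ceases to be idle" as an idle time of zero is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class SystemState:
    on_ac_power: bool     # True when plugged in, False on battery
    idle_minutes: float   # minutes since the last user activity


def may_start(state, idle_threshold_minutes=10.0):
    """Start only on AC power, and only once the idle threshold has been reached."""
    return state.on_ac_power and state.idle_minutes >= idle_threshold_minutes


def must_stop(state):
    """Stop if the machine switches to battery or the user becomes active again."""
    return (not state.on_ac_power) or state.idle_minutes == 0


if __name__ == "__main__":
    watching_video = SystemState(on_ac_power=True, idle_minutes=3.0)
    left_overnight = SystemState(on_ac_power=True, idle_minutes=45.0)
    print(may_start(watching_video), may_start(left_overnight))
```

With a 10 minute threshold, a user who has not touched the machine for a few minutes while watching a video is left alone, which is the point of preferring it over a 30 second threshold.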

The System I Didn’t Get to Benchmark: The Dell Inspiron 3656

As part of my meeting schedule, I was offered an explanation of what goes on behind the scenes in retail marketing from one of the senior account managers. We took a trip to the local Best Buy and I was talked through how most areas of the store are, for lack of a better term, rented out by the retailer to companies, who have strict rules to follow. This applies to store-in-stores, end-caps and focused aisles, and even the location within the store can affect the price. It made sense. We then came across the following AMD system:

This is the Dell Inspiron 3656 which, for lack of a better description, is Carrizo in a desktop form factor. I asked if I could peek inside, but for some reason no-one in Best Buy had a screwdriver (as if)! Inside is a mobile focused Carrizo CPU, presumably in 35W mode, with sufficient cooling as well as a discrete Radeon R9 360 graphics card in a PCIe x16 slot. Add in some other components such as a 2TB HDD and 16GB of DDR3L-1600 SO-DIMMs and you are good to go.

The 3656, as it turns out, can come with three different AMD Carrizo processors (FX-8800P, A10-8700P, A8-8600P) in a thermally unrestrained environment, which would arguably give the best possible scores. The two things I couldn’t confirm both relate to the DRAM. First, I would have liked to know if the design is a true dual channel design for Carrizo only, or if it shares pin compatibility with Carrizo-L, which would limit it to single channel. Second, the memory speed: if the processor is in 35W mode, the system could engage DDR3-2133 if it uses appropriate SO-DIMM modules. However, the specifications sheet only mentions DDR3L, which is limited to DDR3L-1600. In a desktop like this, the difference between DDR3L and DDR3 would be minor, and the higher speed memory would help performance (unless the design was Carrizo-L focused).
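
To put the two DRAM questions in perspective, theoretical peak bandwidth can be estimated from the transfer rate, the standard 64-bit DDR3 channel width and the number of channels. The Python sketch below uses a hypothetical helper name, `peak_bandwidth_gb_s`; these are upper bounds on paper, not measurements from any of the systems tested.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, channels, bus_width_bits=64):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 10^9 bytes).

    mega_transfers_per_s: the number in the DDR3 speed grade, e.g. 1600 or 2133.
    channels: 1 for single channel, 2 for dual channel.
    bus_width_bits: width of one channel; 64 bits for standard DDR3.
    """
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000.0


if __name__ == "__main__":
    configs = [
        ("DDR3L-1600, single channel", 1600, 1),
        ("DDR3L-1600, dual channel", 1600, 2),
        ("DDR3-2133, dual channel", 2133, 2),
    ]
    for label, rate, channels in configs:
        print(f"{label}: {peak_bandwidth_gb_s(rate, channels):.1f} GB/s")
```

On these numbers, a single channel of DDR3L-1600 peaks at 12.8 GB/s and dual channel doubles that, so the channel count matters more than the step from 1600 to 2133.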

175 Comments

  • MonkeyPaw - Friday, February 5, 2016

    The cat cores exist to compete with Atom-level SOCs. Intel takes the Atom design from phones and tablets all the way up to Celeron and Pentium laptops. It makes some business sense due to low cost chips, but if the OEM puts them in a design and asks too much of the SOC, then there you have a bad experience. Such SOCs should not be found in anything bigger than a $300 11" notebook. For 13" and up, the bigger cores should be employed.
  • michael2k - Friday, February 5, 2016

    The cat cores can't compete with Atom level SoC because they don't operate at low enough power levels (ie, 2W to 6W). The cat cores may have been designed to compete with Atom performance and Atom priced parts, but they were poorly suited for mobile designs at launch.
  • Intel999 - Sunday, February 7, 2016

    AMD hasn't updated the cat cores in over three years! It is a dead channel to them. They had a bit of a problem competing in the tablet market against a competitor that was willing to dump over $4 billion pushing inferior bay trail chips. Take a plane to China and you can still find a lot of those Bay Trail chips sitting in warehouses as once users had the misfortune of using tablets being run by them the reviews destroyed any chance that those tablets ever had at being sales successes.

    AMD was forced to stop funding R&D on cat cores as they were in no position to be selling them at negative $5.

    In the time that AMD has stopped development on the cat cores Intel has improved their low end offerings, but still not enough to compete with ARM offerings that have improved as well. And now tablets are dropping at similar rates to laptops so it is actually a good thing for AMD that they suspended research on the cat cores. Sorta dodged a bullet.

    At least they still get decent volume out of them through Sony and Microsoft gaming platforms.

  • testbug00 - Friday, February 5, 2016

    If the cat cores didn't exist AMD likely would have died as we know it a few years ago.
  • BillyONeal - Friday, February 5, 2016

    The "cat cores" are why AMD is not yet bankrupt; it let them get design wins in the PS4 and XBox One which kept the company afloat.
  • mrdude - Friday, February 5, 2016

    YoY Q4 earnings showed a 42% decline in revenue for computing and graphics with less than 2bn in revenue for full-year 2015 and $502m operating loss. You couldn't be more correct. The console wins aren't just keeping the company afloat, they practically define it entirely.
  • Lolimaster - Friday, February 5, 2016

    In that case simply remove the OEM's altogether and sell it at AMD's store or selected physical/online stores.
  • TheinsanegamerN - Thursday, February 11, 2016

    10/10 would pay for an "AMD" branded laptop that does APUs correctly.
  • Hrobertgar - Friday, February 5, 2016

    Since you are talking about user experience, AMD is not the only company with a bad user experience. I purchased an Alienware 15" R2 laptop on Cyber Monday and it is horrible, and support is horrible. I compare my user experience to a Commodore 64 using a cassette drive - it's that bad (I suspect you are old enough to appreciate cassette drives). It arrived in a non-bootable configuration. It cannot stream Netflix to my 2005 Sony over an HDMI cable unless I use Chrome - it took Netflix help to solve that (I took a cell-phone pic of a single Edge browser straddling the two monitors - the native monitor half streaming video and the Sony half dark after passing over the HDMI cable. It only occurs with Netflix). On 50% of bootups it gives me a memory change error despite even the battery being screwed in. On 10% of bootups it fails to recognize the HDD. Once it refused to shut down and required holding the power button for 10 secs. Lately it claims the power brick is incompatible on about 10% of bootups. Yes, I downloaded all the latest drivers, BIOS, chipset, etc. Customer Service has hung up on me once, deleted my review once, and repeatedly asked for my service tag after I already gave it to them. Some of the Netflix issue is probably Microsoft's issue - certainly the MS App was an epic fail, but much of even that must be Dell's issue. I realize it is probably difficult to spot many of these things given the timeframe of the testing you do, and the Netflix issue in particular is bizarre. I am starting to think a Lenovo might not be so bad.
  • tynopik - Friday, February 5, 2016

    "put of their hands"
