Benchmark Overview

For our testing we had all of the laptops on hand at the same time for the best part of a workweek, alongside meetings with AMD to discuss the microarchitecture and platform positioning. Each system was wiped to a fresh OS install, and the High Performance power profile was applied for benchmarking in an air-conditioned lab.

The benchmarks fall into several areas:

Short Form CPU

Our short form testing script runs straight through a mixture of well-known applications and workloads, and requires about four hours.

CPU Short Form Benchmarks
Three Dimensional Particle Movement (3DPM): 3DPM is a self-penned benchmark, derived from my years of academic research into particle movement parallelism. The coding for this tool is rough, but it emulates the real world in being non-CompSci-trained code written for a scientific endeavor. The code is unoptimized, but the test uses OpenMP to move particles around a field using each of six 3D movement algorithms in turn, all of which are found in the academic literature. The test is run in both single-threaded and multi-threaded modes, and uses purely floating point numbers. The code was written in Visual Studio 2008 in Release mode with all optimizations (including fast math and -Ox) enabled. We take the average of six runs in each instance; a rough sketch of the kind of loop involved is shown after this list.
WinRAR 5.01: WinRAR is compression software that reduces file size at the expense of CPU cycles. We use the version that has been a stable part of our benchmark database through 2015, and run the default settings on a 1.52GB directory containing over 2800 files representing a small website with around thirty half-minute videos. We take the average of several runs in this instance.
POV-Ray 3.7 beta: POV-Ray is a common ray-tracing tool used to generate realistic-looking scenes. We have used POV-Ray in its various guises over the years as a good benchmark for performance, as well as a marker on the march toward ray-tracing-limited immersive environments. We use the built-in multithreaded benchmark.
HandBrake: HandBrake is a freeware video conversion tool. We use the tool to process two different videos: first a 'low quality' two-hour video at 640x388 resolution to x264, then a 'high quality' ten-minute video at 4320x3840. The low quality video scales with lower-performance hardware, whereas the buffers required for high quality can stretch even the biggest processors. At present, this is a CPU-only test.
7-Zip: 7-Zip is a freeware compression/decompression tool that is widely deployed across the world. We run the included benchmark tool using a 50MB library and take the average of a set of fixed-time results.
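
As a rough illustration of the kind of loop 3DPM exercises, here is a minimal OpenMP sketch in C++. This is not the actual benchmark code: the real test cycles through six movement algorithms from the academic literature, whereas this sketch uses a single made-up random-walk step and a hypothetical move_particles() helper purely for illustration.

```cpp
// Illustrative sketch only, in the spirit of 3DPM: move each particle in a
// random direction on the unit sphere, parallelized across particles with OpenMP.
#include <cmath>
#include <cstdint>
#include <vector>

struct Particle { double x, y, z; };

// Cheap inline pseudo-random number in [0,1): keeps the loop compute-bound
// and thread-safe without shared RNG state.
static inline double next_rand(std::uint64_t& state) {
    state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (state >> 11) * (1.0 / 9007199254740992.0);   // top 53 bits -> [0,1)
}

void move_particles(std::vector<Particle>& particles, int steps) {
    const double pi = 3.14159265358979323846;
    #pragma omp parallel for
    for (int i = 0; i < (int)particles.size(); ++i) {
        std::uint64_t rng = 0x9E3779B97F4A7C15ULL ^ (std::uint64_t)i;  // per-particle seed
        for (int s = 0; s < steps; ++s) {
            // Pick a random direction (a stand-in for one of the six published
            // movement algorithms) and take a unit-length step.
            double cos_t = 2.0 * next_rand(rng) - 1.0;        // cos(polar angle)
            double sin_t = std::sqrt(1.0 - cos_t * cos_t);
            double phi   = 2.0 * pi * next_rand(rng);         // azimuthal angle
            particles[i].x += sin_t * std::cos(phi);
            particles[i].y += sin_t * std::sin(phi);
            particles[i].z += cos_t;
        }
    }
}
```

Compiled with OpenMP enabled (for example /openmp in Visual Studio or -fopenmp with GCC), the outer loop spreads the particles across all available threads, which is why the multi-threaded result scales with core count while the single-threaded run exposes per-core floating point throughput.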

Web and Synthetic

The web tests are our usual mix of Octane and Kraken, with WebXPRT added in. Synthetic CPU testing relates to our long-term data under Cinebench and x264.

Web and Synthetic Benchmarks
Google Octane 2.0: Lots of factors go into web development, including the tools used and the browser those tools run in. One of the most common and widely used benchmarks for judging performance is Google Octane, now in version 2.0. To quote: 'The updated Octane 2.0 benchmark includes four new tests to measure new aspects of JavaScript performance, including garbage collection / compiler latency and asm.js-style JavaScript performance.'
Mozilla Kraken 1.1: Kraken is a similar tool to Octane, focusing on web tools and processing power. Kraken's tests include searching algorithms, audio processing, image filtering, flexible database parsing and cryptographic routines.
WebXPRT 2013/2015: WebXPRT aims to be a souped-up version of Octane and Kraken, using these tools in real time to display data in photograph enhancement, sorting, stock options, local storage manipulation, graphical interfaces and even filtering algorithms on scientific datasets. We run the 2013 and 2015 versions of the benchmark.
Cinebench: Cinebench is a widely known benchmarking tool that measures performance relative to MAXON's animation software Cinema 4D. Cinebench has been optimized over a decade and focuses purely on CPU horsepower, meaning that if there is a discrepancy in raw throughput, Cinebench is likely to show it. Arguably other software doesn't make use of all the tools available, so the real-world relevance might be purely academic, but given our large database of Cinebench results it seems difficult to ignore a small five-minute test. We run the modern version 15 in this test, as well as the older 11.5 for our back data.
x264 HD 3.0: Similarly, the x264 HD 3.0 package we use here is also kept for historical regression data. The latest version is 5.0.1 and encodes a 1080p video clip into a high quality x264 file; version 3.0 performs the same test on a 720p file, and in most circumstances hits its limit on high-end processors, but it still works well for mainstream and low-end parts. This version also only takes a few minutes, whereas the latest can take over 90 minutes to run.

Professional and OpenCL

Our professional tests involve a synthetic workload (PCMark), a 2D-to-3D image and model conversion tool used by archivists and modelers (Agisoft, in CPU-only and OpenCL modes), as well as Linux Bench. Unfortunately Linux Bench only seemed to work on a pair of the systems.

Professional and OpenCL Benchmarks
PCMark08: PCMark08, developed by Futuremark, is a simple press-play-and-run benchmarking tool designed to probe how well systems cope with a variety of standard tasks that a professional user might encounter. This includes video conferencing with multiple streams, image/file manipulation, video processing, 3D modelling and other tools. In this case we take the three main benchmark sets, Creative, Home and Work, and run them in OpenCL mode, which aims to take advantage of OpenCL-accelerated hardware. For fun we also put in the PCMark08 Storage workset.
Agisoft Photoscan: Photoscan is professional software that takes a series of 2D images (as few as 50, usually 250+) and performs the calculations to determine where each picture was taken, building a 3D model and textures of the subject if it can. This model can then be exported to other software for touch-ups or implementation in physics engines/games or, as with the reader who directed me to it, national archiving. The tool has four phases, one of which can be OpenCL accelerated, while the other three are a mix of single-thread and variable-thread workloads. We ran the tool in CPU-only and OpenCL modes.
Linux Bench: Linux Bench is a collection of Linux-based benchmarks compiled together by ServeTheHome. The idea is to have some non-Windows tools that are easy enough to run with a USB key, an internet connection and three lines of code in a terminal. The tests in Linux Bench include standard synthetic compute, compression, matrix manipulation, database tools and key-value storage.

Gaming (3DMark, Rocket League)

Due to timing we were only able to run a couple of gaming tests, namely parts of the 3DMark suite and our Rocket League test.

Gaming Benchmarks
3DMark: 3DMark is Futuremark's premium software, developed to tax systems at various performance levels. The software contains several benchmarks as a result, with some focusing on smartphone use and others scaling all the way up to 4K, quad-SLI systems with as many in-game and post-processing effects as you can throw at them. The base test, Ice Storm, is actually a good indicator of GPU scaling performance, but we also test Cloud Gate, Sky Diver and Fire Strike to get a measure of all of our systems.
Rocket League: Hilariously simple pick-up-and-play games are great fun. I'm a massive fan of the Katamari franchise for that reason: pressing start on a controller and rolling around, picking up things to get bigger, is extremely simple. Rocket League combines the elements of pick-up-and-play, allowing users to jump into a game with other people (or bots) to play football with cars with zero rules. The title is built on Unreal Engine 3 and runs on super-low-end systems while still taxing the big ones.

Power and Performance Testing

A portion of our benchmarks were profiled for performance, namely their effect on CPU temperature, frequency and usage, which we will go into in detail. Both of the HP EliteBooks, the Kaveri and Carrizo units, were also hooked up to a Watts Up PRO meter for a full shakedown of power consumption on some of the more popular tests.

Thermal Effects

As we have seen in previous laptop benchmarking scenarios, the design of the chassis plays an important part in how a processor reacts to a workload. Some units have their skin temperature limit set unbearably high in order to get the best performance, whereas others are more restrictive. Carrizo promotes trading between these two facets for either better performance or better thermals, so we tested all five systems with a FLIR thermal camera during Rocket League, as well as with some internal recording scripts during a few benchmarks.

A Side Note Worth Remembering

One intriguing thing to mention from our testing was background processes. Nominally every effort is made to minimize these (disable WiFi when not needed, disable updates), however when a system comes preinstalled with Intel McAfee anti-virus, it can be an exercise to remove it. Yes, that’s right: for some odd reason, some of the OEM systems had Intel McAfee pre-installed. I assume it is because the OEM gets a small kickback for including it on their OS image, thereby either increasing margins or reducing the price of the system. McAfee AV is an example of a simple piece of software that can provide a negative user experience: checking for updates when you least expect it, performing mini-scans of everything coming in and out of an I/O port, and for the systems that have mechanical hard drives with single-channel memory, it can be the difference between casually watching a film and having to apologize for why a video is dropping frames. Needless to say, it was obliterated.

The other issue is actually a default Windows problem. Whenever certain I/O devices are plugged in or removed, or even at random times, the system will call Windows Defender to start probing files and memory in use. The issue here is twofold: it eats up a thread with mostly integer/string work, reducing the resources available to the user, and on occasion it will bring disk utilization to 100%, causing access delays when the user is in the middle of something. While Defender can be a critical part of a safer environment, it boggles my mind that it kicks in so freely and robs a poorly configured system of its user experience. It also drains battery life. This is a disconnect between software developers not writing code suited to the resources available, OEMs deciding what hardware would be good for a particular price point and believing users are satisfied with such an experience, and hardware manufacturers not circling back round to test the most relevant use cases. It ends up being a negative loop where no-one works with anyone else, which benefits no-one (more on this later).

Consequently, for our testing I also turned down Windows Defender's activity/sensitivity on all of the test laptops. My personal (insert subjective experience mode) way of ‘delaying’ Windows Defender is to open Task Scheduler, navigate to Microsoft > Windows > Windows Defender, and on each of the four tasks change the conditions to:

- Enable ‘Start the task only if the computer is on AC power’
- Enable ‘Stop if the computer switches to battery power’
- Enable ‘Start the task only if the computer is idle for X minutes’
- Enable ‘Stop if the computer ceases to be idle’

How the system determines a true in-OS idle state is somewhat tricky, as some software has idle periods before being called upon (e.g. watching an online video), so having Defender come in after 30 seconds of idle isn’t usually beneficial; I personally set it to 10 minutes on lower-end systems where responsiveness matters.
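
For the curious, the user-input half of 'idle' is something any application can query for itself through the Win32 GetLastInputInfo call. The sketch below is illustrative only: it is not how Task Scheduler makes its decision (which also weighs CPU and disk activity), but it shows how long the user has gone without touching the keyboard or mouse.

```cpp
// Illustrative sketch: query how long the user has been idle on Windows.
// Not how Task Scheduler itself decides idleness (it also considers CPU and
// disk activity); this covers only the user-input component.
#include <windows.h>
#include <iostream>

int main() {
    LASTINPUTINFO lii = {};
    lii.cbSize = sizeof(lii);                 // must be set before the call
    if (GetLastInputInfo(&lii)) {
        // dwTime is the tick count (ms since boot) of the last keyboard/mouse event
        DWORD idleMs = GetTickCount() - lii.dwTime;
        std::cout << "User idle for " << idleMs / 1000 << " seconds\n";
    }
    return 0;
}
```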

The System I Didn’t Get to Benchmark: The Dell Inspiron 3656

As part of my meeting schedule, I was offered an explanation of what goes on behind the scenes in retail marketing from one of the senior account managers. We took a trip to the local Best Buy and I was talked through how most areas of the store are, for lack of a better term, rented out by the retailer to the companies, who then have strict rules to follow. This applies to store-in-stores, end-caps and focused aisles, and even the location within the store can affect the price. It made sense, but then we came across the following AMD system:

This is the Dell Inspiron 3656 which, for lack of a better description, is Carrizo in a desktop form factor. I asked if I could peek inside, but for some reason no-one in Best Buy had a screwdriver (as if)! Inside is a mobile-focused Carrizo CPU, presumably in 35W mode, with sufficient cooling, as well as a discrete Radeon R9 360 graphics card in a PCIe x16 slot. Add in some other parts, such as a 2TB HDD and 16GB of DDR3L-1600 SO-DIMMs, and you are good to go.

The 3656, as it turns out, can come with three different AMD Carrizo processors (FX-8800P, A10-8700P, A8-8600P) in a thermally unrestrained environment, which would arguably give the best possible scores. The two things I couldn’t confirm were related to the DRAM. I would have liked to know if the design is a true dual-channel design for Carrizo only, or if it shares pin compatibility with Carrizo-L, which would limit it to single channel. There is also the memory speed: if the APU is in 35W mode, the system could engage DDR3-2133 with appropriate SO-DIMM modules, however the specifications sheet only mentions DDR3L, which is limited to DDR3L-1600. In a desktop like this the difference between DDR3L and DDR3 would be minor, and the higher-speed memory would be a benefit (unless the design is Carrizo-L focused).
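
As a back-of-the-envelope illustration of why this matters (theoretical peak figures, not measured results): one 64-bit channel of DDR3L-1600 provides 1600 MT/s × 8 bytes = 12.8 GB/s, so dual-channel DDR3L-1600 is about 25.6 GB/s, while dual-channel DDR3-2133 would be roughly 34.1 GB/s, and on an APU the integrated graphics share whatever bandwidth is available.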

Comments

  • ImSpartacus - Friday, February 5, 2016

    Holy shit, I haven't seen that many pages in a long time. You don't see this much content very often. Gotta love dat chorizo.
  • close - Friday, February 5, 2016

    ImSpartacus, they're just writing a comprehensive article. I'm sure they put in good work with all of them.
  • ImSpartacus - Friday, February 5, 2016

    I think this article provides a pretty delicate and nuanced treatment of chorizo and its place in the market (both potential & actual). There's no doubt that the circumstances demanded it. This was not business as usual and I'm glad Anandtech recognized the need for that additional effort.

    We're fooling ourselves if we pretend that any journalistic entity puts the same amount of effort into every project. We're talking about living, breathing humans, not robots.
  • fmcjw - Friday, February 5, 2016

    I found the language convoluted, verbose, and difficult to read, compared to, say, Anand's straightforward and logical writing:

    "Nonetheless, Intel’s product line is a sequence of parts that intersect each other, with low end models equipped with dual core Pentiums and Celerons, stretching into some i3 and i5 territory while still south of $1000. In this mix is Core M, Intel’s 4.5W premium dual core parts found in devices north of $600."

    "south of/north of"... can't you just put in "below/above"? And all that "intersecting of parts", can't you just say from the Atom to Pentiums, Celerons, i3's, and i5's....

    The whole thing reads like they're paying you to score a high word count. Lots of information to extract here, but it can be 3 pages shorter and take half as long to read.
  • Cellar Door - Friday, February 5, 2016

    That is why Anandtech has video ads on their main page - designed for people like you, who simply lack reading comprehension past 8th grade and find it hard to understand. Just watch the video on how to lose weight that auto-plays on the side.

    Or... try Tom's Hardware - they cater to your demographic.
  • ImSpartacus - Friday, February 5, 2016

    There's no question that Anand had a powerful way of writing that was uniquely simple yet educated you nevertheless. And for a layman that reads this sort of stuff to learn new information, that's very attractive and I kinda miss it (along with Klug).

    However, I give Ian a pass because he at least attempted to use other means of conveying his ideas. In certain sections he used special table-like formatting to separate "parallel" sections/stances so that the reader would be more apt to compare them. So there's at least some effort, though he surely could do better.
  • 10basetom - Saturday, February 6, 2016

    fmcjw does have a point, but in all fairness it is much harder to explain technical stuff in layman's terms than it is to be long-winded. Carl Sagan was the master of it on TV, and Anand was excellent at it on paper.
  • JMC2000 - Sunday, February 7, 2016

    I didn't find anything wrong with the language Ian used, as this piece is still on a technical level, but it can be understood by a layman who knows a bit more than just what the stickers on the outside tell them.

    To me, the phrase "parts that intersect each other" lays out that there is a myriad of options where configurations overlap, whereas saying "from the Atom to Pentiums, Celerons, i3s and i5s" indicates that there is a pricing structure related to general CPU performance, which there really isn't when it comes to low-end machines.
  • plonk420 - Monday, February 8, 2016

    "south of/north of" sounds better than "greater than/less than," which is more correct than "below/above"
  • Sushisamurai - Thursday, February 11, 2016

    yeah, colorful language is nice. Dumbing down adjectives or descriptions can often construe the true message IMO. This way, it paints a more descriptive/colorful picture.

    Keep up the good work Ian.
