AnandTech 2015 Client SSD Suite

The core of our SSD test suite has remained unchanged for nearly four years now. While we have added new benchmarks, such as performance consistency and Storage Bench 2013, in response to the evolution of the SSD industry, we haven't done a major overhaul to take our testing to the next level. That all changes today with the introduction of our 2015 Client SSD Suite.

Just to be clear, there weren't any flaws in the way we did testing in the past -- there were simply some shortcomings that I've been wanting to fix for a while now, but like any big upgrade it couldn't be done overnight. The 2015 Suite focuses on four key areas: modernizing our testbed, depth of information, readability, and power consumption.

Our old testbed was old, really old. We were using a Sandy Bridge based system with Intel Rapid Storage Technology 10.2 drivers from 2011, so it doesn't take a genius to figure out that our system was desperately in need of a refresh. The 2015 testbed is as current as it gets, with an Intel Haswell CPU and an ASUS Z97 motherboard. For the operating system, we have upgraded from Windows 7 to Windows 8.1 with a native NVMe driver, which ensures that our setup is fully prepared for the wave of PCIe NVMe SSDs arriving in the second half of 2015. We are also using the latest Intel Rapid Storage Technology drivers, which should provide a healthy boost over the old ones we were using before. I've included the full specs of the new system below.

AnandTech 2015 SSD Test System
CPU: Intel Core i7-4770K running at 3.5GHz (Turbo & EIST enabled, C-states disabled)
Motherboard: ASUS Z97 Deluxe (BIOS 2205)
Chipset: Intel Z97
Chipset Drivers: Intel 10.0.24+ / Intel RST 13.2.4.1000
Memory: Corsair Vengeance DDR3-1866 2x8GB (9-10-9-27 2T)
Graphics: Intel HD Graphics 4600
Graphics Drivers: 15.33.8.64.3345
Desktop Resolution: 1920 x 1080
OS: Windows 8.1 x64

The second improvement concerns the depth of information. Every now and then I found myself in a situation where I couldn't explain why one drive was faster than another in our Storage Bench tests, so the 2015 Suite includes additional Iometer tests at various queue depths to help us understand the drive and its performance better. I'm also reporting more data from the Storage Bench traces to better characterize the drives, and providing new metrics that I think are more relevant to client usage than some of the metrics we have used in the past. The goal of the 2015 Suite is to leave no stone unturned when it comes to explaining performance, and I'm confident that the new Suite does an excellent job of that.
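To give an idea of what testing at various queue depths means in practice, here is a minimal Python sketch of the concept: it keeps a fixed number of 4 KB random reads in flight against a pre-filled test file and reports IOPS per queue depth. This is only an illustration of the idea, not the actual Iometer configuration used in the suite, and the file name and parameters are made up; a real tool would also use direct I/O against the raw device to bypass the OS page cache.

    import os, random, time
    from concurrent.futures import ThreadPoolExecutor

    BLOCK = 4096                 # 4 KB random reads
    DURATION = 10                # seconds spent at each queue depth
    TEST_FILE = "testfile.bin"   # hypothetical large, pre-filled test file

    def worker(stop_at, file_size):
        # Each worker keeps one synchronous read outstanding, so N workers
        # approximate a queue depth of N. Reads go through the page cache here;
        # a real benchmark would use direct I/O instead.
        fd = os.open(TEST_FILE, os.O_RDONLY)
        ios = 0
        try:
            while time.time() < stop_at:
                offset = random.randrange(0, file_size // BLOCK) * BLOCK
                os.pread(fd, BLOCK, offset)
                ios += 1
        finally:
            os.close(fd)
        return ios

    def run_at_queue_depth(qd):
        file_size = os.path.getsize(TEST_FILE)
        stop_at = time.time() + DURATION
        with ThreadPoolExecutor(max_workers=qd) as pool:
            completed = pool.map(worker, [stop_at] * qd, [file_size] * qd)
        return sum(completed) / DURATION   # IOPS at this queue depth

    if __name__ == "__main__":
        for qd in (1, 2, 4, 8, 16, 32):
            print(f"QD{qd}: {run_at_queue_depth(qd):,.0f} IOPS")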

However, the increase in depth of information creates a readability problem. I know some of you prefer quick, easy-to-read graphs, but it's hard to present a mountain of data in a format that's convenient to read. To give you the best of both worlds, I'm providing the quick, easy-to-read graphs as well as the full data for those who want to dig in a bit deeper. That way the benchmarks remain comfortable to skim through if you don't have a lot of time on your hands, while also giving you access to far more data than in the past.

Last but not least, I'm taking power testing to a whole new level in our 2015 Suite. In the past, power consumption was merely a few graphs near the end of the article, and to be honest the tests we ran didn't give the full scope of a drive's power behavior. In our 2015 Suite, power is just as important as performance because I'm testing and reporting power consumption in practically every benchmark (though for now this is limited to SATA drives). In the end, the majority of SSDs are employed in laptops, where power consumption can actually be far more critical than performance, so making power consumption testing a first-class citizen makes perfect sense.
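For readers curious how per-benchmark power figures can be derived, the sketch below shows one simplified way to turn externally logged voltage and current samples into average power and energy for a benchmark window. The CSV format, file name and column names are assumptions made purely for illustration; they are not the instrumentation or log format used in our testing.

    import csv

    def average_power(log_path, start, end):
        # Average power (W) and energy (J) over the [start, end] window in seconds,
        # computed from logged samples of the drive's supply rail.
        watts = []
        with open(log_path, newline="") as f:
            for row in csv.DictReader(f):
                t = float(row["timestamp"])
                if start <= t <= end:
                    watts.append(float(row["volts"]) * float(row["amps"]))
        if not watts:
            raise ValueError("no power samples inside the benchmark window")
        avg_watts = sum(watts) / len(watts)
        energy_joules = avg_watts * (end - start)
        return avg_watts, energy_joules

    # Example: average power during a 60-second benchmark phase logged at t = 100..160 s.
    # avg_w, joules = average_power("power_log.csv", 100.0, 160.0)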

A Word About Storage Benches and Real World Tests

While I'm introducing numerous new benchmarks and performance metrics, our Storage Bench traces have remained unchanged. The truth is that workloads rarely undergo a dramatic change, so I had no reason to create a bunch of new traces that would ultimately be more or less the same as the ones we have already used for years. That's also why I dropped the year nomenclature from the Storage Benches: a trace from 2011 is still perfectly relevant today, and keeping the year might have given some readers the impression that our testing is outdated. Basically, the three traces are now called The Destroyer, Heavy and Light, with the first being our old 2013 Storage Bench and the latter two being part of our 2011 Storage Bench.

I know some of you have criticized our benchmarks due to the lack of real world application tests, but the unfortunate truth is that it's close to impossible to build a reliable test suite that can be executed in real time. Especially if you want to test something other than just boot and application launch times, there are simply too many background tasks that cannot be properly controlled to guarantee valid results. I think it has become common knowledge that any modern SSD is good enough for an average user and that the differences in basic web-centric workloads are negligible, so measuring the time it takes to launch Chrome isn't an exciting test, to be honest.

In late 2013, I spent a tremendous amount of time trying to build a real world test suite with a heavier workload, but I kept hitting the same obstacle over and over again: multitasking. One of the most basic principles of benchmarking is reproducibility, meaning that the same test can be run over and over again without significant, unexplainable fluctuation in the results. The issue I faced with multitasking was that once I started adding background operations, such as VMs, large downloads and backups like a heavier user would have running, my results were no longer explainable because I had lost control of what was accessing the drive. The swings were significant enough that the results wouldn't hold up, which is why you never saw any fruit of my endeavors.

As a result, I decided to drop real world testing (at least for now) and go back to traces, which we have used for years and know to be reliable, although not a perfect way to measure performance. Unfortunately there is still no TRIM support in the playback, and to speed up the playback we've cut the idle times to a maximum of 25 milliseconds. Despite the limitations, I do believe that traces are the best way to measure meaningful real world performance, because the IO trace still comes straight from a real world workload, which cannot be properly replicated with any synthetic benchmark tool (like Iometer).
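As a rough illustration of what capping idle times at 25 milliseconds means during playback, here is a minimal Python sketch that replays a simplified trace of (idle delta, offset, length, read/write) records against a target path, truncating any recorded idle period to 25 ms. The trace format and replay details are assumptions made for this example; our actual Storage Bench traces and playback tool work differently and, as noted above, do not pass TRIM commands.

    import os, time

    MAX_IDLE = 0.025   # idle time between IOs capped at 25 ms, as described above

    def replay(trace, target_path):
        # trace: iterable of (idle_seconds_since_previous_io, offset, length, is_write)
        fd = os.open(target_path, os.O_RDWR)
        try:
            for idle, offset, length, is_write in trace:
                time.sleep(min(idle, MAX_IDLE))   # truncate long recorded idle periods
                if is_write:
                    os.pwrite(fd, b"\x00" * length, offset)
                else:
                    os.pread(fd, length, offset)
        finally:
            os.close(fd)

    # Example: a 4 KB read, then an 8 KB write recorded 200 ms later (replayed after 25 ms).
    # replay([(0.0, 0, 4096, False), (0.200, 65536, 8192, True)], "testfile.bin")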

Comments

  • Railgun - Wednesday, February 25, 2015 - link

    Boot times are irrelevant because there, too, several variables are involved. BIOS or UEFI? The HW involved. Other applications involved. And in the grand scheme of things, it's a one and done thing. If someone is so concerned about booting taking 5 seconds versus 30, one can assume they'd leave the thing on. It's an irrelevant metric. Installing an OS is, again, an irrelevant metric due to the HW involved. I've never understood the fascination with boot times.
  • Edgar_in_Indy - Wednesday, February 25, 2015 - link

    Boot times and OS installation times (or game installation times, if it makes you feel better) would be interesting because they would be direct reflections of how a drive's theoretical speed is manifest in real world situations.

    I'm not really sure what your point is, and what you're arguing for. Why would you *not* want a few basic, real world metrics added to the other measurements? *Of course* the test system isn't going to be the same as every user's system. So what? We should still be able to glean some useful information about a drive's relative performance to other drives.

    Besides, I have seen some situations where the synthetic tests didn't look great for a particular drive, but in the real world tests it fared much better. This is what led me to choose the Crucial M4 when I was shopping for a 256GB SSD a couple years ago. It wasn't the darling of the synthetic tests, but in the real world scenarios it was right up there with the best of them. It did particularly well in Anand's "Light Workload" test, which seemed much more typical of real-world use than the stress-test type scenarios.

    And in regards to the "fascination" with boot times, I think that almost everybody prefers a computer that starts more quickly. I've been around since the DOS days, and that was the last time that I had a computer that booted in seconds, until recently. So having SSD's that can do the same is pretty dang cool to me.

    I guess you could also ask a car guy why they have a "fascination" with 0-60 times. I mean, how often will someone really need to get to 60 mph in 5.2 seconds? It's ridiculous to think that somebody would be so worried about 0-60 times, unless they're an Indy Car driver or something.

    And besides, there are too many variables (weather, humidity, altitude, tire conditions, driver skill, etc. etc.) to get a definitive 0-60 time, so we may as well junk the whole idea, and just assume that the $60,000 sports car is faster than the $30,000 sports car just because it puts up better numbers on the dyno or has a larger displacement engine.

    But I really can't believe I'm having to explain this...
  • Railgun - Thursday, February 26, 2015 - link

    I can't believe you had to explain that as well. You said you want real world tests, which already exist, and you referred to them in your selection of the M4. How does loading an OS by itself reflect anything? A plain vanilla install is a case that everyone will have only once during the lifecycle of that particular install. You allude to the light workload being more typical, which in itself is more than fine. If you've looked at what's included in the test suites, in particular The Destroyer, you'll see that what can be considered normal usage tasks are there:
    -Download/install games, play games,
    -Copy and watch movies,
    -Browse the web, manage local email, copy files, encrypt/decrypt files, backup system, download content, virus/malware scan.
    How are those not real world tests? They're not synthetic tests. What was your real world scenario that showed the M4 was better than whatever you were comparing it to? Why is that worse, or different than a boot speed test? I too don't hold a lot of value in the synthetic marks. As you mention, it's more for bragging rights than anything else.

    I too remember the DOS days. Compared to that, there is no comparison: DOS versus Win7 is like comparing a Yugo to a Ferrari. They're both cars and get you from point A to B, but one is so much more than the other. Yes, they're both operating systems, but one has so much more to it and is more complicated than the other. DOS 1.0 was about 4000 lines of code. Win7 is around 40 million. What about a nicely stripped down Linux build? That will load faster. What about Mac OS? Throw in a RAID controller and boot times get tossed out the window.

    I don't think anyone has been missing any boot time metrics in the history of testing drives, whether SSD or otherwise. I've not seen one single review anywhere that shows boot times. The ONLY time I've ever seen it was an initial comparison between an HDD and an SSD. It's a moot point. You know it will be quick. Kristian's point is dead on. We're in the realm of possible milliseconds here. There's no point to the metric.

    BTW, I am a car guy, and while 0-60 is all great, I'm more interested in the whole package. Handling, build quality, design, etc. :)
  • Edgar_in_Indy - Thursday, February 26, 2015 - link

    I agree with most everything you said, but I would go back to my original gripe that there was no stopwatch involved. Data rates are great, and let's definitely keep them coming, but I would simply like to see some timed tests too. And even if the timed tests show little or no difference, then that is also a very valuable piece of information.

    My basic gripe with the article is that it does not clearly answer the question "Should I or shouldn't I?" Sure, some of the graphs are dramatic, but how much will they be manifest in the real world? I think the answer to "Should I buy it?" should be the payoff for reading a big review like this.

    And if we can now say that we've reached the point where the real-world difference between SSDs for 99.99% of users is negligible, then I guess it raises the question of whether these types of in-depth articles are worth writing, and worth reading, very much farther into the future.

    Kind of like how in-depth sound card reviews have mostly gone away, since we've reached the point where they just work without drawing attention to themselves. Unless you are in the tiny percentile where your occupation relies on having the best soundcard with very specific features, then you don't have to worry about it. Like I said, I came from the DOS days, and for many years soundcards were one of the hottest topics in PC hardware. Now they're pretty much a non-issue.

    To draw a non-computer parallel, I'm sure an engineer could also write a 7,000-word review of a particular garbage disposal, going into great detail on every aspect of how the unit is built, but it would be total overkill, because people basically just want to know if it works or not. If SSD's are reaching a level of near-parity, then how many people will want to wade through all the background information in minute detail?

    This has been a very informative discussion for me, and in a way it's refreshing to know that I no longer need to sweat about choosing an SSD in the future. That also means that I will be very unlikely to click articles or visit sites that are focused on SSD performance.
  • Railgun - Thursday, February 26, 2015 - link

    I think you will find, and Kristian, correct me if I'm wrong, that native NVMe drives will increase perceived responsiveness, as the interface allows for fully simultaneous read/write IOPS as opposed to unidirectional operations.

    That should show a nice increase in some scenarios.
  • Kristian Vättö - Wednesday, February 25, 2015 - link

    I find that it's a waste of time to run tests that show the obvious, which in this case is that boot and application launch times are the same for all drives. Like I said, it's starting to become common knowledge that for basic workloads there's no difference between SSDs, and I've never argued against that.

    If I did real world testing, I would like to do it right. This means more than timing the boot time and how many tenths of a second it takes to launch a certain app. Frankly that has no value when you consider a power user's workload with dozens of apps already open, of which some might be rather IO intensive (like running a VM).

    In such scenarios it can be hard to time the absolute difference because we are talking about stuttering and not seconds long wait times, but it's something that many certainly don't want to experience.

    That said, I will probably craft something basic (boot, app and installation times) to show whether PCIe/NVMe has any relevance in basic IO workloads, but it's not something that I'm looking to make a part of our regular test suite since I don't think it gives an accurate picture of actual real world performance under multitasking workloads.
  • Edgar_in_Indy - Wednesday, February 25, 2015 - link

    So would you say we're reaching the point where having the "fastest" SSD is really mostly about bragging rights?

    If that's the case, then it seems like the two most important specifications of an SSD would be size and price (much like it is for platter HDD's now). It would certainly make shopping for an SSD much simpler, if relative speed is no longer a meaningful factor.
  • Kristian Vättö - Wednesday, February 25, 2015 - link

    Yes, and I don't think I've been trying to shovel high-end SSDs down people's throats.

    I think the SSD market mainly consists of two segments now, which are the mainstream and enthusiast/professional segments. For the mainstream segment, any modern SSD is good enough, which is why $/GB has been the dominating factor when I recommend drives for that market (and that's why the MX100 has been my recommendation for quite some time now if you've seen our "Best SSDs" articles).

    The high-end sector is different in the sense that the users tend to want the best performance they can get. In some cases it's just for the bragging rights, but there are also workloads where SSD performance really matters (multiple VMs, photo/video/audio editing, etc). Some of our tests are more geared towards these users and I think we've been pretty clear about that, but as you said the Light workload test does a good job of illustrating average consumer usage and frankly the difference between drives in that test is rather small.

    My goal has never been to push people to buy "faster" drives than they need and if some of my writings have come across as that then please, give me some examples and I'll try to learn from those.
  • Edgar_in_Indy - Wednesday, February 25, 2015 - link

    No, I'm not trying to say that you've been pushing people to faster or more expensive SSD's. And even if you were, I probably wouldn't know, since I don't read all the SSD articles on here and follow all the developments. I mostly just jump in every year or two when I'm shopping for upgrades, and I try to play catch-up at that point in order to make sure I'm spending my money as wisely as possible.

    So for someone like me, who doesn't follow this stuff religiously, it's good to know I don't need to worry about missing out on big speed gains by not getting the hottest SSD of the moment next time I want to upgrade.

    That being said, I'm still a little bit of a performance enthusiast, so I can't help but be curious when something like this comes along and shows the potential for big improvements over previous designs. That's why I was a little disappointed to not find much in the way of real-world results.

    Anyway, it's obvious you put a lot of time and effort into this review, and some of the performance results really were dramatic, so this is some good work.
  • Redstorm - Tuesday, February 24, 2015 - link

    Why, when updating the storage bench system, did you pick a motherboard without an M.2 x4 PCIe 3.0 slot? The ASUS Z97 Deluxe only provides two PCIe 2.0 lanes to the onboard M.2 slot. Seems a bit short sighted with the impending avalanche of x4 PCIe 3.0 SSD controllers coming out. Your new bench system is obsolete before it began. Using PCIe adapters is old school.
