AnandTech Storage Bench 2013

When I built the AnandTech Heavy and Light Storage Bench suites in 2011 I did so because we didn't have any good tools at the time that would begin to stress a drive's garbage collection routines. Once all blocks have a sufficient number of used pages, all further writes will inevitably trigger some sort of garbage collection/block recycling algorithm. Our Heavy 2011 test in particular was designed to do just this. By hitting the test SSD with a large enough and write intensive enough workload, we could ensure that some amount of GC would happen.

There were a couple of issues with our 2011 tests that I've been wanting to rectify, however. First off, all of our 2011 tests were built using Windows 7 x64 pre-SP1, which meant there were potentially some 4K alignment issues that wouldn't exist had we built the trace on a system with SP1. This didn't really impact most SSDs, but it proved to be a problem with some hard drives. Secondly, and more recently, I've shifted focus from simply triggering GC routines to really looking at worst case scenario performance after prolonged random IO. For years I'd felt the negative impacts of inconsistent IO performance with all SSDs, but until the S3700 showed up I didn't think to actually measure and visualize IO consistency. The problem with our IO consistency tests is that they are very focused on 4KB random writes at high queue depths and full LBA spans, not exactly a real world client usage model. The aspects of SSD architecture that those tests stress, however, are very important, and none of our existing tests were doing a good job of quantifying that.

I needed an updated heavy test, one that dealt with an even larger set of data and one that somehow incorporated IO consistency into its metrics. I think I have that test. I've just been calling it The Destroyer (although AnandTech Storage Bench 2013 is likely a better fit for PR reasons).

Everything about this new test is bigger and better. The test platform moves to Windows 8 Pro x64. The workload is far more realistic. Just as before, this is an application trace based test - I record all IO requests made to a test system, then play them back on the drive I'm measuring and run statistical analysis on the drive's responses.
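To make the record-and-replay idea concrete, here is a minimal sketch in Python. It is not the tooling used for the Destroyer: the trace format (operation, byte offset, size), the `replay_trace` helper and the 1 MiB scratch file standing in for the drive are all assumptions made for illustration, and because the IO goes through the operating system's cache the timings it prints say nothing about a real SSD.

```python
import os
import statistics
import tempfile
import time


def replay_trace(path, trace):
    """Replay (op, offset, size) records against the file at `path`.

    Returns the per-request latency in microseconds, in trace order.
    """
    latencies_us = []
    # buffering=0 avoids Python-level buffering; the OS cache is still involved.
    with open(path, "r+b", buffering=0) as target:
        for op, offset, size in trace:
            # Build the write payload before starting the timer.
            payload = b"\0" * size if op == "W" else None
            start = time.perf_counter()
            target.seek(offset)
            if op == "R":
                target.read(size)
            else:
                target.write(payload)
            latencies_us.append((time.perf_counter() - start) * 1e6)
    return latencies_us


# Hypothetical three-request trace: (operation, byte offset, size in bytes).
trace = [("W", 0, 4096), ("R", 0, 4096), ("W", 65536, 131072)]

with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(b"\0" * (1 << 20))  # 1 MiB scratch file stands in for the drive
try:
    latencies = replay_trace(scratch.name, trace)
finally:
    os.remove(scratch.name)

print(f"requests: {len(latencies)}")
print(f"mean latency: {statistics.mean(latencies):.1f} us")
print(f"max latency:  {max(latencies):.1f} us")
```

A real replay tool would also have to preserve the recorded queue depths and bypass the OS cache; this sketch issues one request at a time.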

Imitating most modern benchmarks, I crafted the Destroyer out of a series of scenarios. For this benchmark I focused heavily on Photo Editing, Gaming, Virtualization, General Productivity, Video Playback and Application Development. Rough descriptions of the various scenarios are in the table below:

AnandTech Storage Bench 2013 Preview - The Destroyer
| Workload | Description | Applications Used |
| Photo Sync/Editing | Import images, edit, export | Adobe Photoshop CS6, Adobe Lightroom 4, Dropbox |
| Gaming | Download/install games, play games | Steam, Deus Ex, Skyrim, Starcraft 2, BioShock Infinite |
| Virtualization | Run/manage VM, use general apps inside VM | VirtualBox |
| General Productivity | Browse the web, manage local email, copy files, encrypt/decrypt files, backup system, download content, virus/malware scan | Chrome, IE10, Outlook, Windows 8, AxCrypt, uTorrent, AdAware |
| Video Playback | Copy and watch movies | Windows 8 |
| Application Development | Compile projects, check out code, download code samples | Visual Studio 2012 |

While some tasks remained independent, many were stitched together (e.g. system backups would take place while other scenarios were taking place). The overall stats give some justification to what I've been calling this test internally:

AnandTech Storage Bench 2013 Preview - The Destroyer, Specs
|  | The Destroyer (2013) | Heavy 2011 |
| Reads | 38.83 million | 2.17 million |
| Writes | 10.98 million | 1.78 million |
| Total IO Operations | 49.8 million | 3.99 million |
| Total GB Read | 1583.02 GB | 48.63 GB |
| Total GB Written | 875.62 GB | 106.32 GB |
| Average Queue Depth | ~5.5 | ~4.6 |
| Focus | Worst case multitasking, IO consistency | Peak IO, basic GC routines |

SSDs have grown in their performance abilities over the years, so I wanted a new test that could really push high queue depths at times. The average queue depth is still realistic for a client workload, but the Destroyer has some very demanding peaks. When I first introduced the Heavy 2011 test, some drives would take multiple hours to complete it - today most high performance SSDs can finish the test in under 90 minutes. The Destroyer? So far the fastest I've seen it go is 10 hours. Most high performance SSDs I've tested seem to need around 12 - 13 hours per run, with mainstream drives taking closer to 24 hours. The read/write balance is also a lot more realistic than in the Heavy 2011 test. Back in 2011 I just needed something that had a ton of writes so I could start separating the good from the bad. Now that the drives have matured, I felt a test that was a bit more balanced would be a better idea.
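The shift in read/write balance can be checked directly from the Total GB Read and Total GB Written rows of the specs table. The short Python sketch below does that arithmetic; the `read_share` helper and the dictionary layout are illustrative names, while the figures themselves are copied from the table.

```python
# Figures copied from the specs table (GB read / GB written).
workloads = {
    "The Destroyer (2013)": {"read_gb": 1583.02, "written_gb": 875.62},
    "Heavy 2011": {"read_gb": 48.63, "written_gb": 106.32},
}


def read_share(read_gb, written_gb):
    """Fraction of all transferred data that was read."""
    return read_gb / (read_gb + written_gb)


for name, w in workloads.items():
    share = read_share(w["read_gb"], w["written_gb"])
    print(f"{name}: {share:.1%} reads, {1 - share:.1%} writes by volume")
```

By volume, the Destroyer works out to roughly 64% reads, while Heavy 2011 was only about 31% reads, which is the write-heavy skew described above.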

Despite the balance recalibration, there's just a ton of data moving around in this test. Ultimately the sheer volume of data here and the fact that there's a good amount of random IO courtesy of all of the multitasking (e.g. background VM work, background photo exports/syncs, etc.) makes the Destroyer do a far better job of giving credit for performance consistency than the old Heavy 2011 test. Both tests are valid; they just stress/showcase different things. As the days of begging for better random IO performance and basic GC intelligence are over, I wanted a test that would give me a bit more of what I'm interested in these days. As I mentioned in the S3700 review - having good worst case IO performance and consistency matters just as much to client users as it does to enterprise users.

I'm reporting two primary metrics with the Destroyer: average data rate in MB/s and average service time in microseconds. The former gives you an idea of the throughput of the drive during the time that it was running the Destroyer workload. This can be a very good indication of overall performance. What average data rate doesn't do a good job of is accounting for the response time of very bursty (read: high queue depth) IO. By reporting average service time we heavily weight latency for queued IOs. You'll note that this is a metric I've been reporting in our enterprise benchmarks for a while now. With the client tests maturing, the time was right for a little convergence.
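As a rough illustration of how the two metrics differ, the Python sketch below computes both from a list of completed requests. The record layout (`CompletedIO`), the function names and the sample numbers are assumptions for the example; the article does not spell out the exact formulas used for the charts, so treat this as one plausible reading: data rate as total bytes over the time the drive was busy, and service time as the plain mean of per-request completion times.

```python
from dataclasses import dataclass


@dataclass
class CompletedIO:
    """One replayed IO request (hypothetical record layout)."""
    size_bytes: int          # bytes transferred
    service_time_us: float   # time from issue to completion, in microseconds


def average_data_rate_mbps(ios, busy_time_s):
    """Total bytes moved divided by the time the drive was busy, in MB/s."""
    total_bytes = sum(io.size_bytes for io in ios)
    return total_bytes / 1e6 / busy_time_s


def average_service_time_us(ios):
    """Arithmetic mean of per-request service times, in microseconds."""
    return sum(io.service_time_us for io in ios) / len(ios)


# Made-up sample: three quick 4KB requests and one that sat in a deep queue.
ios = [
    CompletedIO(4096, 100.0),
    CompletedIO(4096, 100.0),
    CompletedIO(4096, 100.0),
    CompletedIO(4096, 5000.0),
]

print(f"average data rate:    {average_data_rate_mbps(ios, busy_time_s=0.01):.4f} MB/s")
print(f"average service time: {average_service_time_us(ios):.1f} us")
```

In this made-up sample the single slow request pulls the average service time up to 1325 us even though three of the four requests completed in 100 us, which is the kind of queued-IO latency that a data rate figure alone tends to hide.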

AT Storage Bench 2013 - The Destroyer

There's simply no comparison between the EVO and Crucial's M500. Even at half the capacity, the EVO does a better job in our consistency test. SanDisk's Extreme II remains the king here, but that's more of a performance tuned part vs. something that offers better cost per GB. Note just how impactful the added spare area is in giving the EVO an advantage over even the 840 Pro. It's so very important that 840 Pro owners keep as much free space on the drive as possible to keep performance high and consistent.



  • verjic - Thursday, February 13, 2014

    I'm talking about the 120 GB version.
  • verjic - Thursday, February 13, 2014

    Also, what are Write/Read IOMeter Bootup and Write/Read IOMeter IOMix, and what do their speeds mean? Thank you.
  • AhDah - Thursday, May 15, 2014

    The TRIM validation graph shows a tremendous performance drop after a few gigs of writes; even after a TRIM pass, the write speed is only 150MBps.
    Does this mean once the drive is 75%-85% filled up, the write speed will always be slow?

    I'm tempted to get the Crucial M550 because of this downfall.
  • njwhite2 - Wednesday, October 15, 2014

    Kudos to Anand Lal Shimpi! This is one of the finest reviews I have ever read! No jargon. No unexplained acronyms. Quantitative testing of compared items instead of reviewer bias. Explanation of why the measured criteria are important to the end user! Just fabulous! I read dozens of reviews each week, so I'm surprised I had not stumbled upon Anandtech before. I'm (for sure) going to check out their smartphone reviews. Most of those on other sites are written by Apple fans or Android fans and really don't tell the potential purchaser what they need to know to make the best choice for them.
  • IT_Architect - Thursday, October 22, 2015

    I would be interested in how reliable they are. The reason I ask is that one time, when Intel's SLC technology was just under two years old and there was no MLC or TLC, I needed speed to load a database from scratch 6 times an hour during incredible traffic times. I was getting requests from users at the rate of 66 times a second per server, each of which required many reads of the database. I couldn't swap databases without breaking sessions, and mirror and unmirror did not work well. I would have to pay a ton to duplicate a redundant array in SSDs. Then I asked the data center how many of these drives they had out there. They (SoftLayer) queried and came back with 700+. Then I asked them how many they'd had go bad. They queried their records and it was none, not so much as a DOA. I reasoned from that I would be just as likely to have a chassis or disk controller go bad. None of them have any moving parts, and the drives are low power. Those were enterprise drives of course, because that's all there was at that time.

    In 2011 I bought a Dell M6600. Dell was shipping them with the Micron SSD. I was concerned about the lifespan because I do a lot of reading and writing with it and work constantly with virtual machines while prototyping, and VM files are huge. It calculated out to 4 years. While researching, I came across that situation where Dell had "cold feet" about OEMing them due to lifespan. Micron/Intel demonstrated to them 10x the rated lifespan, which convinced Dell. There was plenty of other trouble with consumer-level SSDs at the time, which gave the technology a bad name. The Micron/Intel was one of the very few solid citizens at the time. I went with it, although I didn't buy my M6600 with it because Dell had such a premium on them. I had two problems with the drive, which by the way is still in service today. The first was that the drive just stopped doing anything one day. I called Micron and it turned out to be a bug in the firmware. If I had two drives arrayed, it would have stopped both at the same time. I upgraded the firmware and never had that problem again. The second time, I was troubleshooting the laptop, putting the battery in and out, and the computer would no longer boot. I again called Micron. It was by design. They said to disconnect the power, pull the battery, and wait one hour. I did, and it has worked perfectly since. If I had an array, it would have stopped both at the same time.

    Today, the market is much more mature and the technology no longer has a bad name. A redundant array is no substitute for a backup anyway. A redundant array brings business continuity and speed. Are we just as likely or more so to have a motherboard go out? We don't have redundant motherboards without having another entire computer. Unlike power supplies and CPUs, SSDs are low-current devices. I'm considering the possibility that we may be at the point, even for consumer-level drives, where redundant arrays for SSDs are just plain silly.
  • Gothmoth - Sunday, January 8, 2017

    In real life my RAPID test showed no benefits AT ALL!!

    All it does is make low level benchmarks look better.
    You should test with real applications. RAPID is a useless feature.
  • jeyjey - Friday, June 7, 2019

    I have one of these drives. I need to find a small part that is fried and replace it so I can try to access the data on the drive. Please help.
