AnandTech Storage Bench 2013

When I built the AnandTech Heavy and Light Storage Bench suites in 2011 I did so because we didn't have any good tools at the time that would begin to stress a drive's garbage collection routines. Once all blocks have a sufficient number of used pages, all further writes will inevitably trigger some sort of garbage collection/block recycling algorithm. Our Heavy 2011 test in particular was designed to do just this. By hitting the test SSD with a large enough and write intensive enough workload, we could ensure that some amount of GC would happen.

There were a couple of issues with our 2011 tests that I've been wanting to rectify however. First off, all of our 2011 tests were built using Windows 7 x64 pre-SP1, which meant there were potentially some 4K alignment issues that wouldn't exist had we built the trace on a system with SP1. This didn't really impact most SSDs but it proved to be a problem with some hard drives. Secondly, and more recently, I've shifted focus from simply triggering GC routines to really looking at worst case scenario performance after prolonged random IO. For years I'd felt the negative impacts of inconsistent IO performance with all SSDs, but until the S3700 showed up I didn't think to actually measure and visualize IO consistency. The problem with our IO consistency tests are they are very focused on 4KB random writes at high queue depths and full LBA spans, not exactly a real world client usage model. The aspects of SSD architecture that those tests stress however are very important, and none of our existing tests were doing a good job of quantifying that.

I needed an updated heavy test, one that dealt with an even larger set of data and one that somehow incorporated IO consistency into its metrics. I think I have that test. I've just been calling it The Destroyer (although AnandTech Storage Bench 2013 is likely a better fit for PR reasons).

Everything about this new test is bigger and better. The test platform moves to Windows 8 Pro x64. The workload is far more realistic. Just as before, this is an application trace based test - I record all IO requests made to a test system, then play them back on the drive I'm measuring and run statistical analysis on the drive's responses.

Imitating most modern benchmarks I crafted the Destroyer out of a series of scenarios. For this benchmark I focused heavily on Photo editing, Gaming, Virtualization, General Productivity, Video Playback and Application Development. Rough descriptions of the various scenarios are in the table below:

AnandTech Storage Bench 2013 Preview - The Destroyer
Workload Description Applications Used
Photo Sync/Editing Import images, edit, export Adobe Photoshop CS6, Adobe Lightroom 4, Dropbox
Gaming Download/install games, play games Steam, Deus Ex, Skyrim, Starcraft 2, BioShock Infinite
Virtualization Run/manage VM, use general apps inside VM VirtualBox
General Productivity Browse the web, manage local email, copy files, encrypt/decrypt files, backup system, download content, virus/malware scan Chrome, IE10, Outlook, Windows 8, AxCrypt, uTorrent, AdAware
Video Playback Copy and watch movies Windows 8
Application Development Compile projects, check out code, download code samples Visual Studio 2012

While some tasks remained independent, many were stitched together (e.g. system backups would take place while other scenarios were taking place). The overall stats give some justification to what I've been calling this test internally:

AnandTech Storage Bench 2013 Preview - The Destroyer, Specs
  The Destroyer (2013) Heavy 2011
Reads 38.83 million 2.17 million
Writes 10.98 million 1.78 million
Total IO Operations 49.8 million 3.99 million
Total GB Read 1583.02 GB 48.63 GB
Total GB Written 875.62 GB 106.32 GB
Average Queue Depth ~5.5 ~4.6
Focus Worst case multitasking, IO consistency Peak IO, basic GC routines

SSDs have grown in their performance abilities over the years, so I wanted a new test that could really push high queue depths at times. The average queue depth is still realistic for a client workload, but the Destroyer has some very demanding peaks. When I first introduced the Heavy 2011 test, some drives would take multiple hours to complete it - today most high performance SSDs can finish the test in under 90 minutes. The Destroyer? So far the fastest I've seen it go is 10 hours. Most high performance I've tested seem to need around 12 - 13 hours per run, with mainstream drives taking closer to 24 hours. The read/write balance is also a lot more realistic than in the Heavy 2011 test. Back in 2011 I just needed something that had a ton of writes so I could start separating the good from the bad. Now that the drives have matured, I felt a test that was a bit more balanced would be a better idea.

Despite the balance recalibration, there's just a ton of data moving around in this test. Ultimately the sheer volume of data here and the fact that there's a good amount of random IO courtesy of all of the multitasking (e.g. background VM work, background photo exports/syncs, etc...) makes the Destroyer do a far better job of giving credit for performance consistency than the old Heavy 2011 test. Both tests are valid, they just stress/showcase different things. As the days of begging for better random IO performance and basic GC intelligence are over, I wanted a test that would give me a bit more of what I'm interested in these days. As I mentioned in the S3700 review - having good worst case IO performance and consistency matters just as much to client users as it does to enterprise users.

I'm reporting two primary metrics with the Destroyer: average data rate in MB/s and average service time in microseconds. The former gives you an idea of the throughput of the drive during the time that it was running the Destroyer workload. This can be a very good indication of overall performance. What average data rate doesn't do a good job of is taking into account response time of very bursty (read: high queue depth) IO. By reporting average service time we heavily weigh latency for queued IOs. You'll note that this is a metric I've been reporting in our enterprise benchmarks for a while now. With the client tests maturing, the time was right for a little convergence.

AT Storage Bench 2013 - The Destroyer

There's simply no comparison between the EVO and Crucial's M500. Even at half the capacity, the EVO does a better job in our consistency test. SanDisk's Extreme II remains the king here but that's more of a performance tuned part vs. something that offers better cost per GB. Note just how impactful the added spare is on giving the EVO an advantage over even the 840 Pro. It's so very important that 840 Pro owners keep as much free space on the drive as possible to keep performance high and consistent.

AT Storage Bench 2013 - The Destroyer

 

Performance Consistency & Testing TRIM Random & Sequential Performance
Comments Locked

137 Comments

View All Comments

  • Grim0013 - Sunday, July 28, 2013 - link

    I wonder what, if anything, the impact of Turbo Write is on drive endurance, as in, does the SLC buffer have the effect of "shielding" the TLC from some amount of write amplification (WA)? More specifically, I was thinking that in the case of small random writes (high WA), many of them would be going to the SLC first, then when the data is transferred to the TLC, I wonder if the buffering affords the controller the opportunity to write the data is such a way as to reduce WA on the TLC?

    In fact, I wonder if that is something that is done...if the controller is able to characterize certain types of files as being likely the be frequently modified then just keep them in the SLC semi-permanently. Stuff like the page file and other OS stuff that is constantly modified...I'm not very well-versed on this stuff so I'm just guessing. It just seems like taking advantage of SLCs crazy p/e endurance in addition to it's speed could really help make these things bulletproof.
  • shodanshok - Sunday, July 28, 2013 - link

    Yea, I was thinking the same thing. After all, Sandisk already did it on the Ultra Plus and Ultra II SSDs: they have a small pseudo-SLC zone used both for greater performance and reducing WA.
  • shodanshok - Sunday, July 28, 2013 - link

    I am not so exited about RAPID: data integrity is a delicate thing, so I am not so happy to trust Samsung (or others) replacing the key well-tested caching algorithm natively built into the OS.

    Anyway, Windows' write caching is not so quick because the OS, by default, flush its in-memory cache each second. Moreover, it normally issue a barrier event to flush the disk's DRAM cache. This last thing can be disabled, but the flush of the in-memory cache can not be changed, as far I know.

    Linux, on the other side, use much aggressive caching policy: it issue an in-memory cache flush (pagecache) ever 30 seconds, and it aggressively try to coalesce multiple writes into a single transactions. This parameter is configurable using the /proc interface. Moreover, if you have a BBU or power-tolerant disk subsystem, you can even disable the barrier instruction normally issued to the disk's DRAM cache.
  • Timur Born - Sunday, July 28, 2013 - link

    My Windows 8 setup uses quite exactly 1 gb RAM for write caching, regardless of whether it's writing to a 5400 rpm 2.5" HD, 5400 rpm 3.5" HD or Crucial M4 256 gb SSD. That's exactly the size of the RAPID cache. The "flush its cache each second" part becomes a problem when the source and destination are on the same drive, because once Windows starts writing the disk queue starts to climb.

    But even then it should mostly be a problem for spinning HDs that don't really like higher queue numbers. Even more so when you copy multiple files via Windows Explorer, which reads and write files concurrently even on spinning HDs.

    So I wonder if RAPID's only real advantage is its feature to coalesce multiple small writes into single big ones for durations longer than one second?!
  • Timur Born - Sunday, July 28, 2013 - link

    By the way, my personal experience is that CPU power saving features, as set up in both in the default "Balanced" and the "High Performance" power-profiles, have far more of an impact on SSD performance than caching stuff. I can up my M4' 4K random performance by 60% and more just by messing with CPU power savings to be less aggressive (or off).
  • shodanshok - Monday, July 29, 2013 - link

    If I correctly remember, Windows use at most 1/8 of total RAM size for write caching. How much RAM did you have?
  • Timur Born - Tuesday, July 30, 2013 - link

    8 gb, so you may be correct. Or you may mix it up with the 1/8 part of dirty cache that is being flushed by the Windows cache every second. Or both may be 1/8. ;-)
  • zzz777 - Monday, July 29, 2013 - link

    I'm interested in caching writes to a ram disk then to storage. This reminds me if the concept of a write-back cache: for almost everyone The possibility of data corruption is so low that there's no reason not to enable it: can this ssd ramdisk write quickly enough that home users also don't have to worry about using it? Beyond that I'm not a normal home user, I want to see benchmarks for virtualization, I want the quickest way to create, modify and test a vm before putting it on front life hardware
  • Wwhat - Monday, July 29, 2013 - link

    I for me still say: I rather go for the pro version.
  • andreaciri - Thursday, August 1, 2013 - link

    i have to decide if buy an 840 now, or an EVO when it will be available, for my macbook. considering that RAPID technology is only supported under windows, and that i'm more interested in read performance than write, is 840 a good choice?

Log in

Don't have an account? Sign up now