AnandTech Storage Bench 2011: Much Heavier

I didn't expect to have to debut this so soon, but I've been working on updated benchmarks for 2011. Last year we introduced our AnandTech Storage Bench, a suite of benchmarks that took traces of real OS/application usage and played them back in a repeatable manner. I assembled the traces myself out of frustration with the majority of what we have today in terms of SSD benchmarks.

Although the AnandTech Storage Bench tests did a good job of characterizing SSD performance, they weren't stressful enough. All of the tests performed less than 10GB of reads/writes and typically involved only 4GB of writes specifically. That's not even enough to exceed the spare area on most SSDs. Most canned SSD benchmarks don't even come close to writing a single gigabyte of data, but that doesn't mean that simply writing 4GB is acceptable.

Originally I kept the benchmarks short enough that they wouldn't be a burden to run (~30 minutes) but long enough that they were representative of what a power user might do with their system.

Not too long ago I tweeted that I had created what I referred to as the Mother of All SSD Benchmarks (MOASB). Rather than only writing 4GB of data to the drive, this benchmark writes 106.32GB. It's the load you'd put on a drive after nearly two weeks of constant usage. And it takes a *long* time to run.

I'll be sharing the full details of the benchmark in some upcoming SSD articles, but here are the basics:

1) The MOASB, officially called AnandTech Storage Bench 2011—Heavy Workload, mainly focuses on the times when your I/O activity is the highest. There is a lot of downloading and application installing that happens during the course of this test. My thinking was that it's during application installs, file copies, downloading and multitasking with all of this that you can really notice performance differences between drives.

2) I tried to cover as many bases as possible with the software I incorporated into this test. There's a lot of photo editing in Photoshop, HTML editing in Dreamweaver, web browsing, game playing/level loading (Starcraft II & WoW are both a part of the test) as well as general use stuff (application installing, virus scanning). I included a large amount of email downloading, document creation and editing as well. To top it all off I even use Visual Studio 2008 to build Chromium during the test.

Update: As promised, some more details about our Heavy Workload for 2011.

The test has 2,168,893 read operations and 1,783,447 write operations. The IO breakdown is as follows:

AnandTech Storage Bench 2011—Heavy Workload IO Breakdown
IO Size % of Total
4KB 28%
16KB 10%
32KB 10%
64KB 4%

Only 42% of all operations are sequential, the rest range from pseudo to fully random (with most falling in the pseudo-random category). Average queue depth is 4.625 IOs, with 59% of operations taking place in an IO queue of 1.
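
For those curious how figures like these are derived, the short sketch below tallies an IO size breakdown, the average queue depth and the share of operations issued at a queue depth of 1 from a per-IO log. The CSV layout and column names here are assumptions for illustration, not the format our trace tools actually use.

```python
# Sketch: summarize a per-IO trace into a size breakdown and queue depth stats.
# Hypothetical CSV with one row per IO: op (R/W), size_bytes, queue_depth.
import csv
from collections import Counter

def summarize_trace(path):
    size_counts = Counter()
    total_ios = 0
    qd_sum = 0
    qd1_count = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            size_counts[int(row["size_bytes"]) // 1024] += 1
            qd = int(row["queue_depth"])
            qd_sum += qd
            qd1_count += (qd == 1)
            total_ios += 1
    for size_kb, count in size_counts.most_common():
        print(f"{size_kb}KB: {100.0 * count / total_ios:.0f}% of all IOs")
    print(f"Average queue depth: {qd_sum / total_ios:.3f}")
    print(f"Operations at queue depth 1: {100.0 * qd1_count / total_ios:.0f}%")

summarize_trace("heavy_workload_trace.csv")
```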

Many of you have asked for a better way to really characterize performance. Simply looking at IOPS doesn't really say much. As a result I'm going to be presenting Storage Bench 2011 data in a slightly different way. We'll have performance represented as Average MB/s, with higher numbers being better. At the same time I'll be reporting how long the SSD was busy while running this test. These disk busy graphs will show you exactly how much time was shaved off by using a faster drive vs. a slower one during the course of this test. Finally, I will also break out performance into reads, writes and combined. The reason I do this is to help balance out the fact that this test is unusually write intensive, which can often hide the benefits of a drive with good read performance.
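
To make the new presentation concrete, here's a minimal sketch of one plausible way to compute these metrics: average MB/s as bytes transferred divided by disk busy time, where busy time counts only the wall-clock intervals during which at least one IO was outstanding, broken out into reads, writes and combined. The record format and the interval-merging approach are illustrative assumptions, not a description of our actual tooling.

```python
# Sketch: derive average throughput and disk busy time from a per-IO log.
# Hypothetical records: (op, size_bytes, start_s, end_s). At queue depths > 1 the
# intervals overlap, so busy time merges them instead of summing their durations.
def busy_time(intervals):
    """Wall-clock time covered by the (possibly overlapping) IO intervals."""
    total, cur_start, cur_end = 0.0, None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:      # gap: the disk was idle here
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:                                       # overlap/adjacent: extend the busy run
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

def report(ios):
    for label, subset in (("Combined", ios),
                          ("Reads", [io for io in ios if io[0] == "R"]),
                          ("Writes", [io for io in ios if io[0] == "W"])):
        if not subset:
            continue
        busy = busy_time([(io[2], io[3]) for io in subset])
        mb = sum(io[1] for io in subset) / 1e6
        print(f"{label}: {mb / busy:.1f} MB/s average, disk busy for {busy:.3f}s")

# Tiny made-up example: two reads and one write.
report([("R", 4096, 0.000, 0.001), ("W", 65536, 0.001, 0.004), ("R", 16384, 0.003, 0.005)])
```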

There's also a new light workload for 2011. This is a far more reasonable, typical everyday use case benchmark. Lots of web browsing, photo editing (but with a greater focus on photo consumption), video playback as well as some application installs and gaming. This test isn't nearly as write intensive as the MOASB but it's still multiple times more write intensive than what we were running last year.

As always I don't believe that these two benchmarks alone are enough to characterize the performance of a drive, but hopefully along with the rest of our tests they will help provide a better idea.

The testbed for Storage Bench 2011 has changed as well. We're now using a Sandy Bridge platform with full 6Gbps support for these tests. All of the older tests are still run on our X58 platform.

AnandTech Storage Bench 2011—Heavy Workload

We'll start out by looking at average data rate throughout our new heavy workload test:

[Chart: AnandTech Storage Bench 2011—Heavy Workload, Average Data Rate in MB/s]

While we saw a pretty significant difference between 3Gbps and 6Gbps interfaces with the Intel 510 and Vertex 3, the same can't be said about Crucial's m4. There's only a 7% performance improvement from using a 6Gbps port on our Sandy Bridge system. Even more interesting is that performance actually drops a bit compared to the C300. We saw this in some of our synthetic Iometer tests and it's definitely reflected here.

The breakdown of reads vs. writes tells us more of what's going on:

[Chart: AnandTech Storage Bench 2011—Heavy Workload, Average Read Speed]

The drop in sequential and random read performance we noticed earlier appears to be responsible for the m4's lower-than-C300 performance. Looking at write speeds we actually see an improvement over the C300:

[Chart: AnandTech Storage Bench 2011—Heavy Workload, Average Write Speed]

The next three charts represent the same data, just presented in a different manner. Instead of looking at average data rate, we're looking at how long the disk was busy during this entire test. Note that disk busy time excludes any and all idle time; it is simply how long the SSD was busy doing something:

[Charts: AnandTech Storage Bench 2011—Heavy Workload, Disk Busy Time for combined, read and write operations]

Comments

  • Rasterman - Friday, April 1, 2011 - link

    LOL wait a year? You are nuts, a year from now there will be totally new products out at all new high prices. Prices come down? Most of these new drives are not even in full production yet and some aren't even released. Regardless, upgrading from a G2-level drive to any of these you aren't going to see any difference in real world use. Unless you are doing massive file transfers all of the time, or can afford to blow money on minimal performance increases (work system), there is absolutely no point in upgrading speed-wise.
  • eamon - Thursday, March 31, 2011 - link

    The article states that "I had the same complaint about the C300 if you'll remember from last year. If you're running an OS without TRIM support, then the m4 is a definite pass. Even with TRIM enabled and a sufficiently random workload, you'll want to skip the m4 as well." These statements don't really seem backed up by the data presented.

    Take the m4-is-bad-without-TRIM idea: what you show is that if you lack TRIM *and* torture-test your SSD with twenty minutes of random writes, you'll see a significant but temporary loss of performance. That's not ideal, but really, outside of benchmarking, 20 minutes of random write torture is exceedingly unusual. And you don't show a benchmark with TRIM support actually in effect (i.e., not just running on an OS with TRIM support, but on a filesystem that issues TRIM and isn't completely filled up). Does the same performance degradation occur with normal TRIM usage patterns? That seems to be a far more likely usage pattern, but you don't test it.

    This makes the second statement seem even less warranted: you're testing a very unusual access pattern, you're doing it without a common feature (TRIM) designed to avoid this degradation, and you're not checking how long it takes for performance to recover (after all, if performance quickly recovers after torture testing, it may well be reasonable to accept the low risk of a temporary slowdown since the situation will rectify itself anyhow).

    I'm not trying to defend the m4 here - and you might be right, but the data sure seems insufficient to draw these rather relevant conclusions. How quickly does the m4 recover, and how does TRIM impact the degradation in the first place?
  • JNo - Thursday, March 31, 2011 - link

    +1

    I too am not trying to defend the m4, but I think a lot of emphasis is put on sequential read & write performance. Whilst I'm sure everyone will copy/move very large files to their SSD occasionally, the vast majority will still have it as their boot drive, where overall system responsiveness (random reads/writes) is still king. Sequential speed is still a useful metric for those who really want to do video editing etc. on an SSD, but it's generally overstated.

    For most users, like myself, I think the performance benefits of the amazing Vertex 3 will be imperceptible over the m4 99.999% of the time. So the real question, as always, is price - the Vertex 3 does justify a premium but only a small one. Most value-for-money buyers would probably get better real world value from the m4 assuming it is cheaper.
  • tno - Thursday, March 31, 2011 - link

    I think the thing to remember is that this performance drop occurred during a pretty short torture test. But the possibility still exists that if the m4 delays garbage collection until a sequential write comes along, the drive could suffer lots of insults from random writes, drastically decreasing performance, and, because not very many sequential writes are performed, the garbage collection never has a chance to remedy the situation.

    This is a hypothetical but it's not that far fetched for those of us that focus on using SSDs as OS drives. If you put a small OS drive in a desktop and supplement it with a large mechanical drive, your OS drive might not see a decently long sequential write for some time. Particularly if all your downloads and content generation goes to the mechanical drive.
  • Anand Lal Shimpi - Thursday, March 31, 2011 - link

    For most users, over the course of several months, access patterns can begin to mimic portions of our torture test. I'll be addressing this in a future article, but tasks like web browsing, system boot and even application launches are made up of sequential IOs less than 50% of the time.

    I state that I doubt it'll be the case for typical desktop workloads but honestly there's no way to be sure given a short period of testing. Note that every recommended SSD we test ultimately goes into a primary use system and we subject it to a realistic workload for months, noting any issues that do crop up - which eventually gets fed back into our reviews.

    Our data shows that in a perfect world, the m4 does quite well in most of the tests. My concerns are twofold:

    1) Low max latency during random write operations seems to imply very little GC work is being done during typical random writes.

    2) Our torture test shows that delayed garbage collection can result in a pretty poor performance scenario, where the m4 is allowed to drop a bit lower than I'd like.

    How likely is it that you'll encounter this poor performance state?

    1) Without TRIM it's very likely. One of the machines I run daily is an OS X system without the TRIM hack enabled. Indilinx drives, the C300 and even Intel's X25-M all hit this worst case scenario performance level after a few months of use.

    2) With TRIM it'll depend entirely on your workload. Remember that you never TRIM the entire drive like we did (only in the case of a full format). Given a sufficiently random workload without enough consistent sequential writing to bring performance back up, I could see things getting this bad.

    Again my point wasn't to conclude that the m4 was a bad drive, just that these are concerns of mine and I'd rather be cautious about them when recommending something to the public. It's no different than being cautious about recommending the Vertex 3 given unproven reliability and questionable track record.

    Take care,
    Anand
  • kmmatney - Thursday, March 31, 2011 - link

    So, could someone write a tool that does a huge sequential write to restore performance? Sort of like running the Intel SSD Toolbox and manually doing a TRIM? I could live with that. I'm still running Windows XP at work.
  • bobbozzo - Thursday, March 31, 2011 - link

    Just copy a really big file from another drive.
  • bobbozzo - Thursday, March 31, 2011 - link

    Or a bunch of not as big files.
  • 7Enigma - Friday, April 1, 2011 - link

    I'm quite certain I remember there being a program that does this, created by an enthusiast way back during the first gen of SSDs.
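
As an aside, a minimal sketch of such a sequential-fill tool might look like the following: fill the free space with one large sequential write, then delete the file. The path, chunk size and headroom are arbitrary choices, this is not the enthusiast tool mentioned above, and whether a big sequential write actually coaxes a given controller into cleaning up depends entirely on its firmware.

```python
# Sketch: fill free space with a large sequential write, then delete the file.
# The idea (per the comments above) is to hand a drive with lazy garbage collection
# a long run of sequential writes. Path, chunk size and headroom are arbitrary.
import os
import shutil

TARGET = os.path.join(os.getcwd(), "sequential_fill.bin")
CHUNK = 8 * 1024 * 1024        # 8MB writes keep the requests large and sequential
HEADROOM = 2 * 1024**3         # stop while ~2GB is still free so the OS isn't starved

def sequential_fill(path=TARGET, chunk=CHUNK, headroom=HEADROOM):
    buf = b"\xff" * chunk
    try:
        with open(path, "wb") as f:
            while shutil.disk_usage(os.path.dirname(path)).free > headroom:
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())
    finally:
        if os.path.exists(path):
            os.remove(path)    # without TRIM the drive won't know this space is free again

if __name__ == "__main__":
    sequential_fill()
```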
  • lyeoh - Friday, April 1, 2011 - link

    Personally, I'm actually very happy about the low latency during random write (and read) ops.

    Can't there be a way to do garbage collection during idle time and not sacrifice latency?

    Yes I know that the drive could think it's idle and then start garbage collection just at the very moment when the user finally decides to do something. But if you do the garbage collection at a low intensity, should it affect performance that much? I'm assuming that since the drives are fast they can do a fair bit of garbage collection during idle at say 10-20% speed and not affect the user experience much.

    Enterprise drives might be busy all the time and total throughput often matters more than keeping latency in the milliseconds (it's still important but...), so the best time to do garbage collection for those drives would be ASAP.

    But that's not true for Desktop drives. Right now as I'm typing in this post, my HDD isn't busy at all. So an SSD could do a fair bit of GC during that long pause. Same for when you are playing a game (after it has loaded the game assets).

    It seems silly for Desktop SSDs to do GC during the time a user wants to do something (and presumably wants to get it done as fast as possible).

    The Intel SSDs have a max latency of hundreds of milliseconds! That's very noticeable to a human! Do conventional non-faulty HDDs even get that slow?
