The Crucial m4 (Micron C400) SSD Review
by Anand Lal Shimpi on March 31, 2011 3:16 AM EST
Random Read/Write Speed
The four corners of SSD performance are as follows: random read, random write, sequential read and sequential write speed. Random accesses are generally small in size, while sequential accesses tend to be larger and thus we have the four Iometer tests we use in all of our reviews.
Note that we've updated our C300 results on our new Sandy Bridge platform for these Iometer tests. As a result you'll see some higher scores for this drive (mostly with our 6Gbps numbers) for direct comparison to the m4 and other new 6Gbps drives we've tested.
Our first test writes 4KB in a completely random pattern over an 8GB space of the drive to simulate the sort of random access that you'd see on an OS drive (even this is more stressful than a normal desktop user would see). I perform three concurrent IOs and run the test for 3 minutes. The results reported are in average MB/s over the entire time. We use both standard pseudo-randomly generated data for each write and fully random data to show you the maximum and minimum performance offered by SandForce based drives in these tests. The average performance of SF drives will likely be somewhere in between the two values for each drive you see in the graphs. For an understanding of why this matters, read our original SandForce article.
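Not Iometer itself, but to make the mechanics of the test concrete, here's a rough Python sketch of what it does: pick 4KB-aligned offsets at random across an 8GB span, keep a few writes in flight, and report throughput plus average and max latency. The helper name, device path and thread-pool approach are illustrative assumptions, and since the page cache isn't bypassed the absolute numbers won't match a real Iometer run.

```python
import os, random, time
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096                 # 4KB writes
SPAN = 8 * 1024**3           # 8GB LBA span, as in the test described above

def _worker(fd, stop_at, latencies):
    """Issue 4KB writes at random 4KB-aligned offsets until time runs out."""
    buf = os.urandom(BLOCK)  # fully random (incompressible) data
    while time.perf_counter() < stop_at:
        offset = random.randrange(SPAN // BLOCK) * BLOCK
        t0 = time.perf_counter()
        os.pwrite(fd, buf, offset)
        latencies.append(time.perf_counter() - t0)

def run_random_write_test(path, duration_s=180, qd=3):
    """Hypothetical stand-in for the 4KB random write test: qd writes in flight."""
    fd = os.open(path, os.O_WRONLY)   # a preallocated >= 8GB file or a raw device
    latencies = []
    stop_at = time.perf_counter() + duration_s
    with ThreadPoolExecutor(max_workers=qd) as pool:
        for _ in range(qd):
            pool.submit(_worker, fd, stop_at, latencies)
    os.close(fd)
    ops = len(latencies)              # list.append is thread-safe in CPython
    print(f"{ops * BLOCK / duration_s / 1024**2:.1f} MB/s, "
          f"avg {1000 * sum(latencies) / max(ops, 1):.3f} ms, "
          f"max {1000 * max(latencies, default=0):.1f} ms")
```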
If there's one thing Crucial focused on with the m4, it's random write speed. The 256GB m4 is our new king of the hill when it comes to random write performance. It's actually faster than a Vertex 3 when writing highly compressible data. It doesn't matter if I run our random write test for 3 minutes or an hour; the performance over 6Gbps is still over 200MB/s.
Let's look at average write latency during this 3 minute run:
On average it takes Crucial's m4 0.06ms to complete three 4KB writes spread out over an 8GB LBA space. The original C300 was already pretty fast here at 0.07ms; it's clear that these two drives are very closely related. Note that OCZ's Vertex 3 has a similar average latency, but it's not actually writing most of the data to NAND: remember this is highly compressible data, and most of it never hits NAND.
Now let's look at max latency during this same 3 minute period:
You'll notice a huge increase in max latency compared to average latency; that's because this is when a lot of drives do some real-time garbage collection. If you don't periodically clean up your writes you'll end up increasing max latency significantly. You'll notice that even the Vertex 3 with SandForce's controller has a pretty high max latency in comparison to its average latency. This is where the best controllers do their work. However, not all OSes deal with these occasional high-latency blips all that well. I've noticed that OS X in particular doesn't handle unexpectedly high write latencies very well, usually leaving you no choice but to force-quit an application.
Note the extremely low max latency of the m4 here: 4.3ms. Either the m4 is ultra quick at running through its garbage collection routines or it's putting off some of the work until later. I couldn't get a clear answer from Crucial on this one, but I suspect it's the latter. I'm going to break the standard SSD review mold here for a second and take you through our TRIM investigation. Here's what a clean sequential pass looks like on the m4:
Average read speed is nearing 400MB/s and average write speed is 240MB/s. The fluctuating max write speed indicates some cleanup work is being done during the sequential write process. Now let's fill the drive with data, then write randomly across all LBAs at a queue depth of 32 for 20 minutes, and run another HDTach pass:
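In script form, that torture sequence is roughly the sketch below. It reuses the hypothetical run_random_write_test helper from the earlier sketch (with SPAN raised from 8GB to the drive's full capacity so the random writes hit every LBA), the sequential pass is only a crude stand-in for an HDTach run, and the device path is a placeholder.

```python
import os, time

def sequential_fill(path, chunk=1024 * 1024):
    """Rough HDTach-style pass: write large sequential chunks across the whole
    target and report average throughput (hypothetical helper, page cache not
    bypassed, so treat the numbers as illustrative)."""
    fd = os.open(path, os.O_WRONLY)
    size = os.lseek(fd, 0, os.SEEK_END)   # capacity of the file or device
    buf = os.urandom(chunk)
    written, t0 = 0, time.perf_counter()
    while written < size:
        n = min(chunk, size - written)
        written += os.pwrite(fd, buf[:n], written)
    os.close(fd)
    print(f"sequential write: {written / 1024**2 / (time.perf_counter() - t0):.0f} MB/s")

DEV = "/dev/sdX"                                        # placeholder device
# run_random_write_test is the helper defined in the earlier sketch
sequential_fill(DEV)                                    # 1. fill the drive with data
run_random_write_test(DEV, duration_s=20 * 60, qd=32)   # 2. 20 minutes of 4KB random writes at QD32
sequential_fill(DEV)                                    # 3. another sequential pass to see the damage
```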
Ugh. This graph looks a lot like what we saw with the C300. Without TRIM the m4 can degrade to a very, very low performance state. Windows 7's Resource Monitor even reported instantaneous write speeds as low as 2MB/s. The good news is the performance curve trends upward: the m4 is trying to clean up its performance. Write sequentially to the drive and its performance should start to recover. The bad news is that Crucial appears to be putting off this garbage collection work until a bit too late.
Remember that the trick to NAND management is balancing wear leveling with write amplification. Clean blocks too quickly and you burn through program/erase cycles. Clean them too late and you risk high write amplification (and reduced performance). Each controller manufacturer decides the best balance for its SSD. Typically the best controllers do a lot of intelligent write combining and organization early on and delay cleaning as much as possible. The C300 and m4 both appear to push the limits of delayed block cleaning, however. Based on the very low max random write latencies from above, I'd say that Crucial is likely doing most of the heavy block cleaning during sequential writes and not during random writes. Note that in this tortured state, max random write latencies can reach as high as 1.4 seconds.
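To put a rough number on the cost of cleaning too early: a common back-of-the-envelope model says that when the controller reclaims a block whose pages are still partly valid, it has to copy that live data forward before erasing, so write amplification works out to roughly 1/(1 - valid fraction). Clean blocks while they're still 80-90% valid and the drive ends up rewriting 5-10x the data the host actually sent. The figures below come from that generic model, not from measurements of the m4.

```python
def write_amplification(valid_fraction):
    """Simple greedy-GC model: freeing (1 - valid_fraction) of a block for new host
    data means programming a whole block's worth of pages (copied live pages plus
    the new data), so WA = 1 / (1 - valid_fraction)."""
    return 1.0 / (1.0 - valid_fraction)

for u in (0.2, 0.5, 0.8, 0.9):
    print(f"clean blocks while {u:.0%} valid -> write amplification ~{write_amplification(u):.1f}x")
```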
Here's a comparison of the same torture test run on Intel's SSD 320:
The 320 definitely suffers, just not as badly as the m4. Remember the higher max write latencies from above? I'm guessing that's why: Intel seems to be doing more cleanup along the way.
And just to calm all fears, if we do a full TRIM of the entire drive, performance goes back to normal on the m4:
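For reference, a whole-device TRIM like this can be triggered by hand on Linux with the standard util-linux tools; a minimal sketch with placeholder device and mount paths (and note that blkdiscard erases everything on its target):

```python
import subprocess

# Discard every block on a raw device (this destroys its contents); roughly the
# "full TRIM of the entire drive" described above:
subprocess.run(["blkdiscard", "/dev/sdX"], check=True)

# Or, on a mounted filesystem, TRIM only the space the filesystem considers free:
subprocess.run(["fstrim", "-v", "/mnt/ssd"], check=True)
```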
What does all of this mean? It means that it's physically possible for the m4, if hammered with a particularly gruesome workload (or a mostly naughty workload for a longer period of time), to end up in a pretty poor performance state. I had the same complaint about the C300 if you'll remember from last year. If you're running an OS without TRIM support, then the m4 is a definite pass. Even with TRIM enabled and a sufficiently random workload, you'll want to skip the m4 as well.
I suspect for most desktop workloads this worst case scenario won't be a problem and with TRIM the drive's behavior over the long run should be kept in check. Crucial still seems to put off garbage collection longer than most SSDs I've played with, and I'm not sure that's necessarily the best decision.
Forgive the detour, now let's get back to the rest of the data.
Many of you have asked for random write performance at higher queue depths. What I have below is our 4KB random write test performed at a queue depth of 32 instead of 3. While the vast majority of desktop usage models experience queue depths of 0 to 5, higher depths are possible in heavy I/O (and multi-user) workloads:
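In terms of the hypothetical sketch from earlier, the only change for this test is the number of writes kept in flight:

```python
# Same 4KB random write sketch as before, but with 32 outstanding writes instead of 3
run_random_write_test("/dev/sdX", duration_s=180, qd=32)
```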
High queue depth 4KB random write numbers continue to be very impressive, although here the Vertex 3 actually jumps ahead of the m4.
Random read performance is actually lower than on the C300. Crucial indicated that it reduced random read performance in favor of increasing sequential read performance on the m4. We'll see what this does to real world performance shortly.
103 Comments
Rasterman - Friday, April 1, 2011 - link
LOL wait a year? You are nuts, a year from now there will be totally new products out at all new high prices. Prices come down? Most of these new drives are not even in full production yet and some aren't even released. Regardless, upgrading from a G2-level drive to any of these, you aren't going to see any difference in real world use; unless you are doing massive file transfers all of the time or can afford to blow money on minimal performance increases (work system), there is absolutely no point in upgrading speed-wise.
eamon - Thursday, March 31, 2011 - link
The article states that "I had the same complaint about the C300 if you'll remember from last year. If you're running an OS without TRIM support, then the m4 is a definite pass. Even with TRIM enabled and a sufficiently random workload, you'll want to skip the m4 as well." These statements don't really seem backed up by the data presented.
Take the m4-is-bad-without-TRIM idea: if you lack TRIM *and* torture-test your SSD for twenty minutes of random writes, then you'll see a significant but temporary loss of performance, is what you show. That's not ideal, but really, outside of benchmarking, 20 minutes of random write torture is exceedingly unusual. And you don't show a benchmark with TRIM support enabled (i.e., not just running on an OS with TRIM support, but on a filesystem that isn't just completely filled up). Does the same performance degradation occur with normal TRIM usage patterns? That seems to be a far more likely usage pattern, but you don't test it.
This makes the second statement seem even less warranted - first of all, you're testing a very unusual access pattern, and you're doing it without a common feature (TRIM) designed to avoid this degradation, and you're not checking how long it takes for performance to recover (after all, if performance quickly recovers after torture testing, then it may well be reasonable to accept the low risk of low performance since the situation will rectify itself anyhow).
I'm not trying to defend the m4 here - and you might be right, but the data sure seems insufficient to draw these rather relevant conclusions. How quickly does the m4 recover, and how does TRIM impact the degradation in the first place?
JNo - Thursday, March 31, 2011 - link
+1
I too am not trying to defend the m4, but I think a lot of emphasis is put on sequential read & write performance. Whilst I'm sure everyone will copy/move very large files to their SSD occasionally, the vast majority will still have them as their boot drive, where overall system responsiveness (random reads/writes) is still king. It's still a useful metric to know for those who really want to do video editing etc on an SSD, but generally overstated.
For most users, like myself, I think the performance benefits of the amazing Vertex 3 will be imperceptible over the m4 99.999% of the time. So the real question, as always, is price - the Vertex 3 does justify a premium but only a small one. Most value-for-money buyers would probably get better real world value from the m4 assuming it is cheaper.
tno - Thursday, March 31, 2011 - link
I think the thing to remember is that this performance drop occurred during a pretty short torture test. But the possibility still exists that if the m4 delays garbage collection till a sequential write comes along, the drive could suffer lots of insults from random writes, drastically decreasing performance, and, because not very many sequential writes are performed, the garbage collection never has a chance to remedy the situation.
This is a hypothetical but it's not that far-fetched for those of us who focus on using SSDs as OS drives. If you put a small OS drive in a desktop and supplement it with a large mechanical drive, your OS drive might not see a decently long sequential write for some time. Particularly if all your downloads and content generation go to the mechanical drive.
Anand Lal Shimpi - Thursday, March 31, 2011 - link
For most users, over the course of several months, access patterns can begin to mimic portions of our torture test. I'll be addressing this in a future article, but tasks like web browsing, system boot and even application launches involve sequential IOs less than 50% of the time.
I stated that I doubt it'll be the case for typical desktop workloads, but honestly there's no way to be sure given a short period of testing. Note that every recommended SSD we test ultimately goes into a primary use system and we subject it to a realistic workload for months, noting any issues that do crop up - which eventually gets fed back into our reviews.
Our data shows that in a perfect world, the m4 does quite well in most of the tests. My concerns are twofold:
1) Low max latency during random write operations seems to imply very little GC work is being done during typical random writes.
2) Our torture test shows that delayed garbage collection can result in a pretty poor performance scenario, where the m4 is allowed to drop a bit lower than I'd like.
How likely is it that you'll encounter this poor performance state?
1) Without TRIM it's very likely. One of the machines I run daily is an OS X system without the TRIM hack enabled. Indilinx drives, the C300 and even Intel's X25-M all hit this worst-case-scenario performance level after a few months of use.
2) With TRIM it'll depend entirely on your workload. Remember that you never TRIM the entire drive like we did (only in the case of a full format). Given a sufficiently random workload without enough consistent sequential writing to bring up performance, I could see things getting this bad.
Again, my point wasn't to conclude that the m4 was a bad drive, just that these are concerns of mine and I'd rather be cautious about them when recommending something to the public. It's no different than being cautious about recommending the Vertex 3 given its unproven reliability and questionable track record.
Take care,
Anand
kmmatney - Thursday, March 31, 2011 - link
So, could someone write a tool that does a huge sequential write to restore performance? Sort of like running the Intel SSD Toolbox and manually doing a TRIM? I could live with that. I'm still running Windows XP at work.
bobbozzo - Thursday, March 31, 2011 - link
Just copy a really big file from another drive.
bobbozzo - Thursday, March 31, 2011 - link
Or a bunch of not as big files.
7Enigma - Friday, April 1, 2011 - link
I'm quite certain I remember there being a program that does this, created by an enthusiast way back during the first gen of SSDs.
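A tool like that is only a few lines of script. As a rough sketch (Python, with placeholder paths, sizes and helper name), fill most of the free space with one big sequential file, sync it, then delete it:

```python
import os, shutil

def sequential_scrub(target_dir, headroom=2 * 1024**3, chunk=64 * 1024**2):
    """Fill most of the free space with one large sequential file, then delete it,
    giving a TRIM-less drive a long sequential write for its garbage collection.
    Leaves `headroom` bytes free instead of filling the drive completely."""
    path = os.path.join(target_dir, "scrub.tmp")
    to_write = shutil.disk_usage(target_dir).free - headroom
    buf = b"\0" * chunk
    try:
        with open(path, "wb") as f:
            while to_write > 0:
                f.write(buf[:min(chunk, to_write)])
                to_write -= chunk
            f.flush()
            os.fsync(f.fileno())   # make sure the data actually reaches the drive
    finally:
        if os.path.exists(path):
            os.remove(path)

# sequential_scrub("C:\\")   # hypothetical usage on an XP-era system without TRIM
```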
lyeoh - Friday, April 1, 2011 - link
Personally, I'm actually very happy about the low latency during random write (and read) ops.
Can't there be a way to do garbage collection during idle time and not sacrifice latency?
Yes I know that the drive could think it's idle and then start garbage collection just at the very moment when the user finally decides to do something. But if you do the garbage collection at a low intensity, should it affect performance that much? I'm assuming that since the drives are fast they can do a fair bit of garbage collection during idle at say 10-20% speed and not affect the user experience much.
Enterprise drives might be busy all the time and total throughput often matters more than keeping latency in the milliseconds (it's still important but...), so the best time to do garbage collection for those drives would be ASAP.
But that's not true for Desktop drives. Right now as I'm typing in this post, my HDD isn't busy at all. So an SSD could do a fair bit of GC during that long pause. Same for when you are playing a game (after it has loaded the game assets).
It seems silly for Desktop SSDs to do GC during the time a user wants to do something (and presumably wants to get it done as fast as possible).
The Intel SSDs have a max latency of hundreds of milliseconds! That's very human-noticeable! Do conventional, non-faulty HDDs even get that slow?