Delving Deeper

I had suspicions as to the nature of the problem based on my experience with it in my Mac Pro. The SuperTalent MLC drive in my machine would pause at random, most noticeably when I wanted to send an IM. What happens when you send an IM? Your logfile gets updated: a very small, random write to the disk. I turned to Iometer to simulate this behavior.

Iometer is a great tool for simulating disk accesses; you just need to know what sort of behavior you want to simulate. In my case I wanted to write tons of small files to the drive and look at latency, so I told Iometer to write 4KB files to the disk in a completely random pattern (100% random). I left the queue depth at 1 outstanding IO since I wanted to at least somewhat simulate a light desktop workload.
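
To make that workload concrete, here is a minimal Python sketch of the same idea: 4KB writes at random offsets with one IO outstanding at a time, timing each write. It is not Iometer and won't reproduce its numbers; the function names, the 64MB test file and the fsync-per-write approach are all my own assumptions, and the OS and filesystem sit between this script and the drive. `os.pwrite` is only available on Unix-like systems.

```python
import os
import random
import time


def random_write_latencies(path, file_size=64 * 1024 * 1024,
                           block_size=4096, count=1000, seed=0):
    """Write `count` blocks at random aligned offsets; return per-write latency in ms."""
    rng = random.Random(seed)
    blocks = file_size // block_size
    payload = os.urandom(block_size)
    latencies_ms = []

    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        os.ftruncate(fd, file_size)  # set the file size up front (may be sparse)
        for _ in range(count):
            offset = rng.randrange(blocks) * block_size
            start = time.perf_counter()
            os.pwrite(fd, payload, offset)
            # Ask the OS to flush before issuing the next write,
            # so at most one IO is outstanding at a time.
            os.fsync(fd)
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return latencies_ms


def summarize(latencies_ms, block_size=4096):
    """Return IOs per second, MB/s, average latency and max latency."""
    total_s = sum(latencies_ms) / 1000.0
    iops = len(latencies_ms) / total_s
    return {
        "iops": iops,
        "mb_per_s": iops * block_size / (1024 * 1024),
        "avg_ms": sum(latencies_ms) / len(latencies_ms),
        "max_ms": max(latencies_ms),
    }
```

Calling `summarize(random_write_latencies("testfile.bin"))` on a filesystem backed by the drive you care about gives a rough feel for its random write latency. Treat it as an approximation only, since drive and OS caching can hide a lot.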

Iometer reports four results of importance: the number of IOs per second, the average MB/s, the average write latency and the maximum write latency. I looked at the performance of four drives: the OCZ Core (JMicron controller, MLC), OCZ SLC (Samsung controller), Intel X25-M (Intel controller, MLC) and the Seagate Momentus 7200.2 (a 7200RPM 2.5" notebook drive).

The OCZ Core drive is our example, but please remember that this isn't an OCZ-specific issue: the performance problems we see with this drive are apparent on all current MLC drives on the market that use a JMicron controller with Samsung flash.

| 4KB, 100% random writes, IO queue depth 1 | IOs per Second | MB/s | Average Write Latency | Max Write Latency |
| --- | --- | --- | --- | --- |
| OCZ Core (JMicron, MLC) | 4.06 | 0.016MB/s | 244ms | 991ms |
| OCZ (Samsung, SLC) | 109 | 0.43MB/s | 9.17ms | 83.2ms |
| Intel X25-M (Intel, MLC) | 11171 | 43.6MB/s | 0.089ms | 94.2ms |
| Seagate Momentus 7200.2 | 106.9 | 0.42MB/s | 9.4ms | 76.5ms |

Curiouser and curiouser... see a problem? Ignore the absolutely ridiculous performance advantage of the Intel drive for a moment and look at the average latency column. The OCZ MLC drive has an average latency of 244ms; that's over 26x the latency of the OCZ SLC drive and nearly 26x the latency of a quick notebook drive. This isn't an MLC problem, however, because the Intel MLC drive boasts an average latency of 0.09ms. The OCZ MLC drive has over 2700x higher latency!
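
As a sanity check on the table, the columns are tied together by simple arithmetic: with one outstanding IO, IOs per second is roughly 1000 divided by the average latency in milliseconds, and MB/s is IOs per second times the 4KB transfer size. The Python sketch below recomputes those values and the latency ratios from the reported numbers. The names are mine, and treating 1MB as 1024KB is an assumption that happens to match the reported MB/s figures.

```python
# Reported results: (IOs per second, average write latency in ms)
results = {
    "OCZ Core (JMicron, MLC)": (4.06, 244.0),
    "OCZ (Samsung, SLC)": (109.0, 9.17),
    "Intel X25-M (Intel, MLC)": (11171.0, 0.089),
    "Seagate Momentus 7200.2": (106.9, 9.4),
}

BLOCK_KB = 4  # transfer size used in the test


def mb_per_s(iops, block_kb=BLOCK_KB):
    """Throughput implied by an IO rate at a fixed transfer size (1MB = 1024KB)."""
    return iops * block_kb / 1024


def iops_from_latency(avg_ms):
    """IO rate implied by average latency when only one IO is outstanding."""
    return 1000.0 / avg_ms


for name, (iops, avg_ms) in results.items():
    print(f"{name}: {mb_per_s(iops):.3f} MB/s, "
          f"~{iops_from_latency(avg_ms):.1f} IOPS implied by latency")

# How much worse is the OCZ Core's average latency than each drive's?
core_ms = results["OCZ Core (JMicron, MLC)"][1]
for name, (_, avg_ms) in results.items():
    print(f"OCZ Core latency vs {name}: {core_ms / avg_ms:.1f}x")
```

The ratios work out to roughly 26.6x against the OCZ SLC drive, 26.0x against the Seagate and a little over 2,700x against the Intel drive.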

Now look at the max latency column: the worst case scenario latency for the OCZ Core is 991ms! That's nearly a full second! This means that it takes an average of a quarter second to write a 4KB file to the drive and, worst case scenario, a full second. We complain about the ~100 nanosecond trip a CPU has to take to main memory, and here we have a drive that'll take nearly a full second to complete a task. Totally unacceptable.

In order to find out if the latency is at all tied to the size of the write I varied the write size from 4KB all the way up to 128KB, but kept the writes 100% random. I'm only reporting latencies here:

| 100% random writes, IO queue depth 1 | 4KB | 16KB | 32KB | 64KB | 128KB |
| --- | --- | --- | --- | --- | --- |
| OCZ Core (JMicron, MLC) | 244ms | 243ms | 241ms | 243ms | 247ms |
| OCZ (Samsung, SLC) | 9.17ms | 14.5ms | 21.2ms | 28ms | 28.5ms |
| Intel X25-M (Intel, MLC) | 0.089ms | 0.23ms | 0.44ms | 0.84ms | 1.73ms |
| Seagate Momentus 7200.2 | 9.4ms | 8.95ms | 9.14ms | 9.82ms | 12.1ms |

All the way up to 128KB the latency is the same for the OCZ Core and other similar MLC drives: about 0.25s on average and nearly a second worst case. If it's not the file size, perhaps it's the random nature of the writes?
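
One way to read the latency table is to turn each cell into the write throughput it implies. With a single outstanding IO, each write has to finish before the next starts, so throughput is roughly the write size divided by the average latency. The Python sketch below does that conversion; these are derived figures rather than measurements, and the names are my own.

```python
SIZES_KB = [4, 16, 32, 64, 128]

# Average write latency in ms at each size, copied from the table
avg_latency_ms = {
    "OCZ Core (JMicron, MLC)": [244, 243, 241, 243, 247],
    "OCZ (Samsung, SLC)": [9.17, 14.5, 21.2, 28, 28.5],
    "Intel X25-M (Intel, MLC)": [0.089, 0.23, 0.44, 0.84, 1.73],
    "Seagate Momentus 7200.2": [9.4, 8.95, 9.14, 9.82, 12.1],
}


def implied_mb_per_s(size_kb, latency_ms):
    """Throughput if writes of size_kb complete one at a time in latency_ms each."""
    return (size_kb / 1024) / (latency_ms / 1000)


for drive, latencies in avg_latency_ms.items():
    row = ", ".join(
        f"{size}KB: {implied_mb_per_s(size, ms):.2f}"
        for size, ms in zip(SIZES_KB, latencies)
    )
    print(f"{drive} (MB/s): {row}")
```

Because the OCZ Core's latency barely moves with write size, its implied throughput simply scales with the size of the write and is still only about 0.5MB/s at 128KB, against roughly 72MB/s for the Intel drive.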

For this next test I varied the nature of the writes. I ran the 4KB write test with a 100% sequential workload, 90% sequential (10% random) and 50% sequential (50% random), shown alongside the 100% random results from above:

| 4KB writes, IO queue depth 1 | 100% Sequential/0% Random | 90% Sequential/10% Random | 50% Sequential/50% Random | 0% Sequential/100% Random |
| --- | --- | --- | --- | --- |
| OCZ Core (JMicron, MLC) | 0.36ms | 25.8ms | 130ms | 244ms |
| OCZ (Samsung, SLC) | 0.16ms | 1.97ms | 5.19ms | 9.17ms |
| Intel X25-M (Intel, MLC) | 0.09ms | 0.09ms | 0.09ms | 0.089ms |
| Seagate Momentus 7200.2 | 0.16ms | 0.94ms | 4.35ms | 9.4ms |

The average latency was higher on the OCZ Core (MLC) than on the rest of the drives in the 100% sequential test, but still manageable at 0.36ms. Look at what happened in the 90% sequential test, though: with just 10% random writes the average latency jumped to 25.8ms, 13x the latency of the OCZ SLC drive. Again, this isn't an MLC issue, as the Intel drive does just fine. Although I left it out of the table to keep things simpler, the max latency in the 90/10 test was once again nearly a second for the OCZ Core drive, at 983ms. The 90/10 test is particularly useful because it closely mimics a desktop write pattern: most writes are sequential in nature, but a small percentage (10% or less) are random. What this test shows us is that 10% random writes is all it takes to bring the OCZ Core to its knees.
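
A rough way to see why 10% is enough is a naive weighted average: if a fraction p of writes are random, expect about (1 - p) times the sequential latency plus p times the random latency. This linear model is my own simplification, not something Iometer reports, and it ignores any interaction between the two kinds of writes. The Python sketch below compares its prediction to the measured values in the table.

```python
# Average latency in ms: (100% sequential, 90/10, 50/50, 100% random), from the table
measured = {
    "OCZ Core (JMicron, MLC)": (0.36, 25.8, 130.0, 244.0),
    "OCZ (Samsung, SLC)": (0.16, 1.97, 5.19, 9.17),
    "Intel X25-M (Intel, MLC)": (0.09, 0.09, 0.09, 0.089),
    "Seagate Momentus 7200.2": (0.16, 0.94, 4.35, 9.4),
}


def blended_latency_ms(seq_ms, rand_ms, random_fraction):
    """Naive estimate: weighted average of sequential and random latency."""
    return (1.0 - random_fraction) * seq_ms + random_fraction * rand_ms


for drive, (seq, m90, m50, rand) in measured.items():
    p90 = blended_latency_ms(seq, rand, 0.10)
    p50 = blended_latency_ms(seq, rand, 0.50)
    print(f"{drive}: 90/10 predicted {p90:.2f}ms (measured {m90}ms), "
          f"50/50 predicted {p50:.2f}ms (measured {m50}ms)")
```

For the OCZ Core the model gives about 24.7ms and 122ms against the measured 25.8ms and 130ms, so the small share of random writes accounts for nearly all of the average. The fit is looser for the other drives, which is a reminder that this is only a back-of-the-envelope model.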

The problem gets worse as you increase the load on the drive. Most desktop systems have less than 1 outstanding IO during normal operation, but under heavy multitasking you can see the IO queue depth hit 4 or 5 IOs for writes. Go much above that and you pretty much have to be in a multi-user environment, either running your machine as a file server or actually running a highly trafficked server. I ran the same 100% random, 4KB write test but varied the number of outstanding IOs from 1 all the way up to 64. Honestly, I just wanted to see how bad it would get:

This is just ridiculous. Average write latency climbs up to fifteen seconds, while max latency peaked at over thirty seconds for the JMicron-based MLC drives. All this graph tells you is that you shouldn't dare use one of these drives in a server, but even at a queue depth of four, which is completely attainable in a desktop scenario under heavy usage, the max latency is over two seconds. I've seen this sort of behavior firsthand under OS X with the SuperTalent MLC drive: the system will just freeze for anywhere from a fraction of a second to over a full second while a write completes in the background. The write that sets it off will oftentimes be something as simple as writing to my web browser's cache or sending an IM. It's horribly frustrating.
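
The fifteen second average is about what simple queueing arithmetic (Little's law) would suggest: average latency is roughly the number of outstanding IOs divided by the IOs completed per second. The Python sketch below assumes the OCZ Core stays at the ~4.06 IOs per second measured at a queue depth of 1 no matter how deep the queue gets. That is my assumption rather than a measurement, and the estimate says nothing about max latency.

```python
MEASURED_IOPS_QD1 = 4.06  # OCZ Core, 4KB 100% random writes, queue depth 1


def estimated_avg_latency_s(queue_depth, iops=MEASURED_IOPS_QD1):
    """Little's law: time in the queue = IOs in flight / IOs completed per second."""
    return queue_depth / iops


for qd in (1, 2, 4, 8, 16, 32, 64):
    print(f"queue depth {qd:>2}: ~{estimated_avg_latency_s(qd):.2f}s average latency")
```

At a queue depth of 64 this comes to roughly 15.8 seconds, in line with the fifteen second average, and at a queue depth of 4 the estimated average is already close to a full second.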

I did look at read performance, and while max latency was a problem (peaking at 250ms), it was a fairly rare case; average latency was more than respectable and comparable to the SLC drives. This seems to be a write issue. Let's see if we can make it manifest itself in some real world tests.

97 Comments

  • Anand Lal Shimpi - Tuesday, September 09, 2008

    I think the question was: how much more performance is left untapped by current controller designs? The JMicron issues are a limited case, what will truly be telling is what happens when we see Intel vs. Samsung with SLC drives...

    The dominating the charts line was in reference to the Crysis results. If you've ever run the Crysis GPU bench you'll know that it is extremely disk intensive (particularly the first run). As I mentioned in the article, it over emphasizes the importance of disk performance but that's not to say that the results aren't valid.

    I do see your point however, let me see what I can do about clarifying that statement.

    -A
  • yyrkoon - Tuesday, September 09, 2008

    Ok, I guess I missed the JMicron 'thing', but to be perfectly honest I dislike *anything* JMicron and try to avoid them whenever possible. I guess I am just so interested in these Intel drives, I just tuned everything else out. However, I did read what you mentioned about 'trouble-shooting' the JMicron MLC issue.

    Never ran Crysis, and do not plan on running it anytime soon if ever, but I am somewhat of a hardcore gamer.

    Keep up the good work, and PLEASE do keep us informed on at least these Intel SSD drives :)
  • BD2003 - Monday, September 08, 2008

    If the Achilles heel of the JMicron MLC is the random write speed, why couldn't a RAM buffer be used to cache writes? Sure this would cause a serious problem if the power went out, but that's an issue some would be willing to live with.

    I'm fairly sure vista has an option for this in the device manager in the properties tab of a drive - "enable advanced disk performance". I wonder if that would have any effect on the results?
  • DigitalFreak - Monday, September 08, 2008

    Yet more proof that JMicron products are shit.
  • ggordonliddy - Monday, September 08, 2008

    For the love of all humanity: If you are going to write for a living, please learn basic comma usage!

    It is NOT okay to just stick a comma in the middle of a sentence anytime you want. And it gives readers a headache.

    Here is just one of numerous examples of improper comma usage I've seen so far (and I've only gotten to the 3rd page!):

    "Intel certifies its drives in accordance with the JEDEC specs from 0 - 70C, at optimal temperatures your data will last even longer [...]"

    The comma before "at optimal" should be replaced with a semicolon or a period (I prefer the semicolon).

    Did you actually pass your English classes? I'm guessing that you probably did and you are just a product of our miserable public school system that refuses to hold students to any real level of accountability.


    (And BTW, your quoting system is broken. When I enter text in the Quote Text dialog and click OK, nothing new appears in the Comment compose field.)
  • 7Enigma - Friday, September 19, 2008

    Honestly man, you need to seriously relax. My personal rule of thumb for grammar is does the mistake make the understanding of the sentence difficult to comprehend.

    Writing something like, "Intel certifies its drives in accordance with the JEDEC specs from 0 - 70C, at optimal temperatures your data will last even longer [...]", while not grammatically correct is completely readable.

    If it was something like, "Intel certifies drives to accordance with the JEDEC specs from 0 - 70C, at optimal data your temperatures will last even longer [...]", now you have a legitimate beef.

    The former can easily be forgiven, the latter makes my head hurt when I read it. Trust me, whatever you do, do not go to Dailytech.com and read the articles. Those even I get annoyed at frequently and I'm very forgiving.
  • Anand Lal Shimpi - Tuesday, September 09, 2008

    You're quite right, thanks for the heads up :) Some of the article was directly from my notes while I was working on the tests, so that's one source of unpolished bits. I know I'm far from perfect, so I do appreciate your (and anyone else's) assistance.

    Thanks :)

    Anand
  • pkp - Tuesday, September 09, 2008

    Thanks for posting, Anand. I see you're already aware of the problem, but I wanted to throw my two cents in.

    What is the usual editing process? I think a once over by a second set of eyes would have caught the bulk of the grammatical errors.

    Of course, the ultimate issue isn't commas. It's readability. However, the problem was bad enough that I'm making this comment without having even gotten through the first page of this article.
  • JarredWalton - Tuesday, September 09, 2008

    I'm often the content editor for posted articles, but often we skip that stage due to late nights and schedules. Doing a final thorough edit can require a couple hours (edit and then HTML-ize), and when someone finishes an article at 5AM or whatever and it's an NDA type piece, delaying it any further is usually not desired by the readers or us.

    I do read all posted articles, and often I take the time to go through and fix any noteworthy errors. A few misplaced commas don't really detract from a 5000 word article, however, and depending on what else is going on I may or may not edit the text. If anyone takes the time to point out specific errors, i.e. "on page 3 you write "...." they always get corrected - at least if I see it. General complaints are much more difficult to address though, i.e. "You used passive voice and therefore you must DIE!" LOL.

    I know personally that when you write a long article with lots of testing, certain thoughts tend to appear in multiple places and the final result isn't always as coherent as I would like. Trying to "fix" problems relating to flow and readability is difficult at best, and requires more time than we generally spend. If anyone wants to make specific suggestions, though, we're open for input as always.

    Perhaps it's useful to compare the process to print publications. Magazines usually have several editors on staff whose job is solely to edit other authors' work; I can say that we don't have anyone at AnandTech in that position these days. (I edit some of the articles, but not all, and even then I make mistakes.) That's probably why we have more typos than magazines, but then we provide far more thorough coverage as well. Last I saw, most magazine hardware reviews end up being one page and ~1000 words, with a couple charts.

    At the end of the day, I get most of my detailed information from the internet. Magazines might be more grammatically correct, and they make for great toilet reading, but I don't generally depend on them as a source of credible information. I'd say it's safe to say we won't see such an in-depth exploration of SSD performance and issues in any magazine. [Now I have to prepare to have someone point me to an article in some magazine that does exactly that.]

    Cheers,
    Jarred
  • KikassAssassin - Tuesday, September 09, 2008

    Then I guess I should point you to an article in last month's issue of my favorite data storage magazine.

    http://www.solidstatedisksmonthly.com/2008/08/ever...

    Unfortunately, their website seems to be down at the moment, but keep checking it, I'm sure it'll be back up soon (and don't be fooled by the article's title. It's actually only 23 pages without the ads).