The Flash Hierarchy & Data Loss

We've already established that a flash cell can store either one or two bits depending on whether it's an SLC or MLC device. Group a bunch of cells together and you've got a page. A page is the smallest structure you can program (write to) in a NAND flash device. In the case of most MLC NAND flash, each page is 4KB. A block consists of a number of pages; in the Intel MLC SSD a block is 128 pages (128 pages x 4KB per page = 512KB per block = 0.5MB). A block is the smallest structure you can erase. So when you write to an SSD you can write 4KB at a time, but when you erase from an SSD you have to erase 512KB at a time. I'll explore that a bit further in a moment, but first let's look at what happens when you erase data from an SSD.
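
To make the page/block arithmetic concrete, here is a minimal Python sketch using the Intel MLC figures quoted above. The function names are made up for illustration, and the block count assumes a write starts on a block boundary, which real drives don't guarantee.

```python
# Geometry figures quoted in the text for the Intel MLC SSD.
PAGE_SIZE_KB = 4        # smallest unit that can be programmed (written)
PAGES_PER_BLOCK = 128   # a block is the smallest unit that can be erased

BLOCK_SIZE_KB = PAGE_SIZE_KB * PAGES_PER_BLOCK  # 128 x 4KB = 512KB


def pages_needed(size_kb):
    """Number of whole pages a write of size_kb occupies (rounded up)."""
    return -(-size_kb // PAGE_SIZE_KB)


def blocks_spanned(size_kb):
    """Number of whole blocks size_kb occupies, assuming a block-aligned start."""
    return -(-size_kb // BLOCK_SIZE_KB)


if __name__ == "__main__":
    print(BLOCK_SIZE_KB)          # block size in KB
    print(pages_needed(8))        # pages used by an 8KB file
    print(blocks_spanned(2048))   # blocks used by a 2MB file
```

The point of the sketch is the mismatch: writes are counted in 4KB pages, but erases can only be counted in 512KB blocks.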

Whenever you write data to flash we go through the same iterative programming process again. Create an electric field, electrons tunnel through the oxide and the charge is stored. Erasing the data causes the same thing to happen but in the reverse direction. The problem is that the more times you tunnel through that oxide, the weaker it becomes, eventually reaching a point where it will no longer prevent the electrons from doing whatever they want to do.

On MLC flash that point is reached after about 10,000 erase/program cycles. With SLC it's 100,000 thanks to the simplicity of the SLC design. With a finite lifespan, SSDs have to be very careful in how and when they choose to erase/program each cell. Note that you can read from a cell as many times as you want to; that doesn't reduce the cell's ability to store data. It's only the erase/program cycle that reduces life. I refer to it as a cycle because an SSD has no concept of just erasing a block; the only time it erases a block is to write new data. If you delete a file in Windows but don't create a new one, the SSD doesn't actually remove the data from flash until you're ready to write new data.

Now going back to the disparity between how you program and how you erase data on an SSD: you program in pages and you erase in blocks. Say you save an 8KB file and later decide that you want to delete it; it could just be a simple note you wrote for yourself that you no longer need. When you saved the file, it was stored as two pages in the flash memory. When you go to delete it, however, the SSD marks the pages as invalid but won't actually erase the block. The SSD will wait until a certain percentage of pages within a block are marked as invalid before copying any valid data to new pages and erasing the block. The SSD does this to limit the number of times an individual block is erased, and thus prolong the life of your drive.
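
The deferred-erase behavior described above can be sketched in a few lines of Python. This is a toy model, not any vendor's actual controller logic: the `Block` class, the `maybe_reclaim` helper and the 50% threshold are all assumptions for illustration (the text only says "a certain percentage").

```python
FREE, VALID, INVALID = "free", "valid", "invalid"


class Block:
    """Toy model of one erase block; tracks only the state of each page."""

    def __init__(self, pages_per_block=128):
        self.pages = [FREE] * pages_per_block
        self.erase_count = 0

    def program(self, n):
        """Program n free pages and return their indices."""
        free = [i for i, state in enumerate(self.pages) if state == FREE][:n]
        if len(free) < n:
            raise ValueError("not enough free pages in this block")
        for i in free:
            self.pages[i] = VALID
        return free

    def invalidate(self, indices):
        """Deleting data only marks pages invalid; nothing is erased yet."""
        for i in indices:
            self.pages[i] = INVALID

    def invalid_fraction(self):
        return self.pages.count(INVALID) / len(self.pages)

    def erase(self):
        """Erase works on the whole block and costs one erase cycle."""
        self.pages = [FREE] * len(self.pages)
        self.erase_count += 1


def maybe_reclaim(block, spare, threshold=0.5):
    """Reclaim `block` only once enough of its pages are invalid.

    Valid pages are first copied into `spare`, then `block` is erased.
    The threshold value is a hypothetical tuning knob.
    """
    if block.invalid_fraction() < threshold:
        return False
    spare.program(block.pages.count(VALID))
    block.erase()
    return True
```

Deleting the 8KB note from the example invalidates just two of 128 pages, so `maybe_reclaim` returns `False` and the block's erase count stays where it was.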

 

Not all SSDs handle deletion requests the same way. How and when the controller decides to erase a block with invalid pages determines the write amplification of the device. In the case of a poorly made SSD, if you simply wanted to change a 16KB file, the controller could conceivably read the entire block into main memory, change the four pages, erase the block from the SSD and then write the new block with the four changed pages. Using the page/block sizes from the Intel SSD, this would mean that a 16KB write would actually result in 512KB of writes to the SSD: a write amplification factor of 32x.
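
The 32x figure is just a ratio, and a short Python sketch makes it easy to try other sizes. The helper names are hypothetical, and the "naive" model assumes the worst case from the paragraph above, where every touched block is rewritten in full.

```python
BLOCK_SIZE_KB = 512  # Intel MLC block size quoted in the article


def naive_flash_writes_kb(host_kb, block_kb=BLOCK_SIZE_KB):
    """Flash actually written if every touched block is rewritten in full.

    Assumes the host write starts on a block boundary.
    """
    blocks = -(-host_kb // block_kb)  # ceiling division
    return blocks * block_kb


def write_amplification(host_kb, flash_kb):
    """Ratio of data written to flash versus data the host asked to write."""
    return flash_kb / host_kb


if __name__ == "__main__":
    flash_kb = naive_flash_writes_kb(16)
    print(write_amplification(16, flash_kb))
```

Under this model the penalty shrinks as writes get larger: a write that fills a whole block has a factor of 1, which is why small random writes are the painful case.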

At this point we don't have any data from any of the other SSD controller makers on how they handle situations like this, but Intel states that traditional SSD controllers suffer from write amplification in the 20 - 40x range, which reduces the longevity of their drives. Intel states that on typical client workloads its write amplification factor is less than 1.1x; in other words, you're writing less than 10% more data than you need to. The write amplification factor itself doesn't mean much. What matters is the longevity of the drive, and there's one more factor that contributes there.
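
To see why the write amplification factor matters for longevity, here is a rough back-of-the-envelope sketch in Python. It is not Intel's lifetime model: the 80GB capacity is a hypothetical value picked for illustration, and the formula assumes wear is spread perfectly evenly across the drive.

```python
def host_writes_before_wearout_gb(capacity_gb, cycles, write_amplification):
    """Rough upper bound on host data written before the flash wears out.

    Assumes every block reaches its cycle limit at the same time, which
    no real drive achieves exactly.
    """
    return capacity_gb * cycles / write_amplification


if __name__ == "__main__":
    CAPACITY_GB = 80      # hypothetical drive size, not taken from the text
    MLC_CYCLES = 10_000   # MLC erase/program cycles quoted in the article

    for waf in (1.1, 20, 40):  # Intel's claim vs. the "traditional" range
        total = host_writes_before_wearout_gb(CAPACITY_GB, MLC_CYCLES, waf)
        print(f"WAF {waf}: ~{total:,.0f} GB of host writes")
```

Whatever capacity you plug in, the ratio between the results depends only on the amplification factors, so a 20x controller burns through its flash roughly 18 times faster than a 1.1x one.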

We've already established that with flash there are a finite number of times you can write to a block before it loses its ability to store data. SSDs are pretty intelligent and will use wear leveling algorithms to spread out block usage across the entirety of the drive. Remember that unlike mechanical disks, it doesn't matter where on an SSD you write to; the performance will always be the same. SSDs will thus attempt to write data to all blocks of the drive equally. For example, let's say you download a 2MB file to your brand new, never been used SSD, and it gets saved to blocks 10, 11, 12 and 13. You realize you downloaded the wrong file and delete it, then go off to download the right file. Rather than write the new file to blocks 10, 11, 12 and 13, the flash controller will write to blocks 14, 15, 16 and 17. In fact, those first four blocks won't get used again until every other block on the drive has been written to once. So while your MLC SSD may only have a lifespan of 10,000 cycles, it's going to last quite a while thanks to intelligent wear leveling algorithms.
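
The download example above maps onto a simple first-in, first-out free list. The Python sketch below is one minimal way to get that behavior; the `RoundRobinAllocator` class is hypothetical, and real controllers presumably do more (tracking erase counts per block and relocating data that never changes, for instance).

```python
from collections import deque


class RoundRobinAllocator:
    """Toy wear leveler: always hand out the block that has waited longest."""

    def __init__(self, block_ids):
        self.free = deque(block_ids)
        self.write_count = {block: 0 for block in self.free}

    def allocate(self, n):
        """Take n blocks from the front of the free queue."""
        if n > len(self.free):
            raise ValueError("not enough free blocks")
        blocks = [self.free.popleft() for _ in range(n)]
        for block in blocks:
            self.write_count[block] += 1
        return blocks

    def release(self, blocks):
        """Freed blocks go to the back, so they are reused last."""
        self.free.extend(blocks)


if __name__ == "__main__":
    # Block numbers start at 10 only to mirror the example in the text.
    ssd = RoundRobinAllocator(range(10, 30))
    wrong_file = ssd.allocate(4)   # the 2MB download: four 512KB blocks
    ssd.release(wrong_file)        # delete it
    right_file = ssd.allocate(4)   # the replacement download
    print(wrong_file, right_file)
```

Because released blocks rejoin the queue at the back, no block is written a second time until every other free block has been written once, which is the property the paragraph describes.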


Intel's wear leveling efficiency: all blocks get used nearly the same amount


Bad wear leveling, presumably on existing SSDs: some blocks get used more than others

Intel's SSDs carry about a 4% wear leveling inefficiency, meaning that 4% of the blocks on an Intel SSD will be worn at a rate higher than the rest.

97 Comments

  • Anand Lal Shimpi - Tuesday, September 09, 2008 - link

    I think the question was: how much more performance is left untapped by current controller designs? The JMicron issues are a limited case, what will truly be telling is what happens when we see Intel vs. Samsung with SLC drives...

    The dominating the charts line was in reference to the Crysis results. If you've ever run the Crysis GPU bench you'll know that it is extremely disk intensive (particularly the first run). As I mentioned in the article, it over emphasizes the importance of disk performance but that's not to say that the results aren't valid.

    I do see your point however, let me see what I can do about clarifying that statement.

    -A
  • yyrkoon - Tuesday, September 09, 2008 - link

    Ok, I guess I missed the JMicron 'thing', but to be perfectly honest I dislike *anything* JMicron and try to avoid them whenever possible. I guess I am just so interested in these Intel drives that I tuned everything else out. However, I did read what you mentioned about 'trouble-shooting' the JMicron MLC issue.

    Never ran Crysis, and do not plan on running it anytime soon if ever, but I am somewhat of a hardcore gamer.

    Keep up the good work, and PLEASE do keep us informed on at least these Intel SSD drives :)
  • BD2003 - Monday, September 08, 2008 - link

    If the Achilles' heel of the JMicron MLC is the random write speed, why couldn't a RAM buffer be used to cache writes? Sure, this would cause a serious problem if the power went out, but that's an issue some would be willing to live with.

    I'm fairly sure Vista has an option for this in the device manager in the properties tab of a drive - "enable advanced disk performance". I wonder if that would have any effect on the results?
  • DigitalFreak - Monday, September 08, 2008 - link

    Yet more proof that JMicron products are shit.
  • ggordonliddy - Monday, September 08, 2008 - link

    For the love of all humanity: If you are going to write for a living, please learn basic comma usage!

    It is NOT okay to just stick a comma in the middle of a sentence anytime you want. And it gives readers a headache.

    Here is just one of numerous examples of improper comma usage I've seen so far (and I've only gotten to the 3rd page!):

    "Intel certifies its drives in accordance with the JEDEC specs from 0 - 70C, at optimal temperatures your data will last even longer [...]"

    The comma before "at optimal" should be replaced with a semicolon or a period (I prefer the semicolon).

    Did you actually pass your English classes? I'm guessing that you probably did and you are just a product of our miserable public school system that refuses to hold students to any real level of accountability.


    (And BTW, your quoting system is broken. When I enter text in the Quote Text dialog and click OK, nothing new appears in the Comment compose field.)
  • 7Enigma - Friday, September 19, 2008 - link

    Honestly man, you need to seriously relax. My personal rule of thumb for grammar is: does the mistake make the sentence difficult to comprehend?

    Writing something like, "Intel certifies its drives in accordance with the JEDEC specs from 0 - 70C, at optimal temperatures your data will last even longer [...]", while not grammatically correct is completely readable.

    If it was something like, "Intel certifies drives to accordance with the JEDEC specs from 0 - 70C, at optimal data your temperatures will last even longer [...]", now you have a legitimate beef.

    The former can easily be forgiven; the latter makes my head hurt when I read it. Trust me, whatever you do, do not go to Dailytech.com and read the articles. Even I get annoyed at those frequently, and I'm very forgiving.
  • Anand Lal Shimpi - Tuesday, September 09, 2008 - link

    You're quite right, thanks for the heads up :) Some of the article was directly from my notes while I was working on the tests, so that's one source of unpolished bits. I know I'm far from perfect, so I do appreciate your (and anyone else's) assistance.

    Thanks :)

    Anand
  • pkp - Tuesday, September 09, 2008 - link

    Thanks for posting, Anand. I see you're already aware of the problem, but I wanted to throw my two cents in.

    What is the usual editing process? I think a once-over by a second set of eyes would have caught the bulk of the grammatical errors.

    Of course, the ultimate issue isn't commas. It's readability. However, the problem was bad enough that I'm making this comment without having even gotten through the first page of this article.
  • JarredWalton - Tuesday, September 09, 2008 - link

    I'm often the content editor for posted articles, but often we skip that stage due to late nights and schedules. Doing a final thorough edit can require a couple hours (edit and then HTML-ize), and when someone finishes an article at 5AM or whatever and it's an NDA type piece, delaying it any further is usually not desired by the readers or us.

    I do read all posted articles, and often I take the time to go through and fix any noteworthy errors. A few misplaced commas don't really detract from a 5000 word article, however, and depending on what else is going on I may or may not edit the text. If anyone takes the time to point out specific errors, i.e. "on page 3 you write "...." they always get corrected - at least if I see it. General complaints are much more difficult to address though, i.e. "You used passive voice and therefore you must DIE!" LOL.

    I know personally that when you write a long article with lots of testing, certain thoughts tend to appear in multiple places and the final result isn't always as coherent as I would like. Trying to "fix" problems relating to flow and readability is difficult at best, and requires more time than we generally spend. If anyone wants to make specific suggestions, though, we're open for input as always.

    Perhaps it's useful to compare the process to print publications. Magazines usually have several editors on staff whose job is solely to edit other authors' work; I can say that we don't have anyone at AnandTech in that position these days. (I edit some of the articles, but not all, and even then I make mistakes.) That's probably why we have more typos than magazines, but then we provide far more thorough coverage as well. Last I saw, most magazine hardware reviews end up being one page and ~1000 words, with a couple charts.

    At the end of the day, I get most of my detailed information from the internet. Magazines might be more grammatically correct, and they make for great toilet reading, but I don't generally depend on them as a source of credible information. I'd say it's safe to say we won't see such an in-depth exploration of SSD performance and issues in any magazine. [Now I have to prepare to have someone point me to an article in some magazine that does exactly that.]

    Cheers,
    Jarred
  • KikassAssassin - Tuesday, September 09, 2008 - link

    Then I guess I should point you to an article in last month's issue of my favorite data storage magazine.

    http://www.solidstatedisksmonthly.com/2008/08/ever...

    Unfortunately, their website seems to be down at the moment, but keep checking it, I'm sure it'll be back up soon (and don't be fooled by the article's title. It's actually only 23 pages without the ads).