The Flash Hierarchy & Data Loss

We've already established that a flash cell can store either one or two bits depending on whether it's an SLC or MLC device. Group a bunch of cells together and you've got a page. A page is the smallest structure you can program (write to) in a NAND flash device. In the case of most MLC NAND flash each page is 4KB. A block consists of a number of pages; in the Intel MLC SSD a block is 128 pages (128 pages x 4KB per page = 512KB per block = 0.5MB). A block is the smallest structure you can erase. So when you write to an SSD you can write 4KB at a time, but when you erase from an SSD you have to erase 512KB at a time. I'll explore that a bit further in a moment, but first let's look at what happens when you erase data from an SSD.
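To make the page/block arithmetic concrete, here is a minimal sketch in Python. The constants are the 4KB page and 128-page block figures quoted above for the Intel MLC SSD; the function names are made up for illustration and aren't part of any real SSD tool.

```python
# Geometry quoted in the article for the Intel MLC SSD.
PAGE_SIZE_KB = 4        # smallest unit you can program (write)
PAGES_PER_BLOCK = 128   # a block is the smallest unit you can erase


def block_size_kb(page_size_kb=PAGE_SIZE_KB, pages_per_block=PAGES_PER_BLOCK):
    """Size of one erase block in KB."""
    return page_size_kb * pages_per_block


def pages_needed(file_size_kb, page_size_kb=PAGE_SIZE_KB):
    """Number of whole pages a file occupies (rounded up)."""
    return -(-file_size_kb // page_size_kb)  # ceiling division


print(block_size_kb())   # 512
print(pages_needed(8))   # 2
```

A file that doesn't fill its last page still occupies the whole page, which is why pages_needed rounds up.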

Whenever you write data to flash we go through the same iterative programming process again. Create an electric field, electrons tunnel through the oxide and the charge is stored. Erasing the data causes the same thing to happen but in the reverse direction. The problem is that the more times you tunnel through that oxide, the weaker it becomes, eventually reaching a point where it will no longer prevent the electrons from doing whatever they want to do.

On MLC flash that point is reached after about 10,000 erase/program cycles. With SLC it's 100,000 thanks to the simplicity of the SLC design. With a finite lifespan, SSDs have to be very careful in how and when they choose to erase/program each cell. Note that you can read from a cell as many times as you want to; reading doesn't reduce the cell's ability to store data. It's only the erase/program cycle that reduces life. I refer to it as a cycle because an SSD has no concept of just erasing a block; the only time it erases a block is to write new data. If you delete a file in Windows but don't create a new one, the SSD doesn't actually remove the data from flash until you're ready to write new data.

Now going back to the disparity between how you program and how you erase data on an SSD: you program in pages and you erase in blocks. Say you save an 8KB file and later decide that you want to delete it; it could just be a simple note you wrote for yourself that you no longer need. When you saved the file, it was written as two pages in the flash memory. When you go to delete it, however, the SSD marks the pages as invalid but doesn't actually erase the block. The SSD will wait until a certain percentage of pages within a block are marked as invalid before copying any valid data to new pages and erasing the block. The SSD does this to limit the number of times an individual block is erased, and thus prolong the life of your drive.
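The deferred-erase behavior described above can be sketched roughly as follows. This is a toy model, not how any particular controller works: the 50% threshold, the Block class and the collect_if_needed function are all assumptions made up for illustration, since the article only says "a certain percentage."

```python
from dataclasses import dataclass, field

FREE, VALID, INVALID = "free", "valid", "invalid"


@dataclass
class Block:
    # 128 pages per block, as in the Intel MLC SSD described above.
    pages: list = field(default_factory=lambda: [FREE] * 128)
    erase_count: int = 0

    def invalid_fraction(self):
        return self.pages.count(INVALID) / len(self.pages)

    def erase(self):
        """Erasing resets every page in the block and costs one cycle."""
        self.pages = [FREE] * len(self.pages)
        self.erase_count += 1


def collect_if_needed(block, spare, threshold=0.5):
    """Reclaim `block` only once enough of its pages are invalid.

    Valid pages are first copied into free pages of `spare`, then
    `block` is erased. Returns the number of pages copied, or None
    if the block was left alone. The threshold is a hypothetical value.
    """
    if block.invalid_fraction() < threshold:
        return None
    valid = block.pages.count(VALID)
    free_slots = [i for i, state in enumerate(spare.pages) if state == FREE]
    if len(free_slots) < valid:
        raise ValueError("spare block has too few free pages")
    for i in free_slots[:valid]:
        spare.pages[i] = VALID
    block.erase()
    return valid
```

Deleting the 8KB note from the example only flips two pages to invalid, far below the threshold, so the block isn't erased and no erase cycle is spent.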

 

Not all SSDs handle deletion requests the same way. How and when the controller decides to erase a block with invalid pages determines the write amplification of your device. In the case of a poorly made SSD, if you simply wanted to change a 16KB file the controller could conceivably read the entire block into main memory, change the four pages, erase the block from the SSD and then write the new block with the four changed pages. Using the page/block sizes from the Intel SSD, this would mean that a 16KB write would actually result in 512KB of writes to the SSD - a write amplification factor of 32x.
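The 32x figure falls straight out of the page and block sizes. Here is a small sketch of the calculation, assuming the worst-case "rewrite the whole block" controller described above and a write that starts at a block boundary; both function names are invented for illustration.

```python
def naive_rewrite_kb(host_kb, page_kb=4, pages_per_block=128):
    """KB written to flash if every touched block is rewritten in full.

    Assumes the write starts at a block boundary.
    """
    block_kb = page_kb * pages_per_block
    blocks_touched = -(-host_kb // block_kb)  # ceiling division
    return blocks_touched * block_kb


def write_amplification(host_kb, flash_kb):
    """Ratio of data actually written to flash vs. data the host asked for."""
    return flash_kb / host_kb


flash_kb = naive_rewrite_kb(16)
print(flash_kb)                           # 512
print(write_amplification(16, flash_kb))  # 32.0
```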

At this point we don't have any data from any of the other SSD controller makers on how they handle situations like this, but Intel states that traditional SSD controllers suffer from write amplification in the 20 - 40x range, which reduces the longevity of their drives. Intel states that on typical client workloads its write amplification factor is less than 1.1x, in other words you're writing less than 10% more data than you need to. The write amplification factor itself doesn't mean much, what matters is the longevity of the drive and there's one more factor that contributes there.

We've already established that with flash there are a finite number of times you can write to a block before it loses its ability to store data. SSDs are pretty intelligent and will use wear leveling algorithms to spread out block usage across the entirety of the drive. Remember that unlike mechanical disks, it doesn't matter where on an SSD you write to; the performance will always be the same. SSDs will thus attempt to write data to all blocks of the drive equally. For example, let's say you download a 2MB file to your brand new, never been used SSD, which gets saved to blocks 10, 11, 12 and 13. You realize you downloaded the wrong file and delete it, then go off to download the right file. Rather than write the new file to blocks 10, 11, 12 and 13, the flash controller will write to blocks 14, 15, 16 and 17. In fact, those four blocks won't get used again until every other block on the drive has been written to once. So while your MLC SSD may only have a lifespan of 10,000 cycles, it's going to last quite a while thanks to intelligent wear leveling algorithms.
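The block 10-13 vs. 14-17 example can be mimicked with a very simple round-robin allocator. This is only a sketch of the idea; real wear leveling algorithms are more involved, and the RoundRobinAllocator class and the 32-block drive are assumptions made up for illustration.

```python
from collections import deque


class RoundRobinAllocator:
    """Hands out free blocks in order; freed blocks rejoin at the back."""

    def __init__(self, num_blocks):
        self.free = deque(range(num_blocks))
        self.writes = [0] * num_blocks  # per-block write count

    def allocate(self, count):
        if count > len(self.free):
            raise ValueError("not enough free blocks")
        blocks = [self.free.popleft() for _ in range(count)]
        for b in blocks:
            self.writes[b] += 1
        return blocks

    def release(self, blocks):
        # Deleted data frees its blocks, but they go to the back of the
        # queue, so every other free block gets used before they do.
        self.free.extend(blocks)


drive = RoundRobinAllocator(num_blocks=32)
drive.allocate(10)               # pretend blocks 0-9 are already in use
wrong_file = drive.allocate(4)   # 2MB file = 4 x 512KB blocks
drive.release(wrong_file)        # delete it
right_file = drive.allocate(4)
print(wrong_file)                # [10, 11, 12, 13]
print(right_file)                # [14, 15, 16, 17]
```

Because released blocks always go to the back of the queue, no block is written a second time until every other free block has been written once.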


Intel's wear leveling efficiency, all blocks get used nearly the same amount


Bad wear leveling, presumably on existing SSDs, some blocks get used more than others

Intel's SSDs carry about a 4% wear leveling inefficiency, meaning that 4% of the blocks on an Intel SSD will be worn at a rate higher than the rest.


96 Comments


  • Alleniv - Wednesday, August 19, 2009 - link

    Hi all,
    Here is a new review of the X25-M that compares it with other SSDs and with HDDs across several benchmarks: http://www.informaticaeasy.net/le-mi...m-da-80gb.h...
  • Bytales - Saturday, January 3, 2009 - link

    You said this: For example, let's say you download a 2MB file to your brand new, never been used SSD, which gets saved to blocks 10, 11, 12 and 13. You realize you downloaded the wrong file and delete it, then go off to download the right file. Rather than write the new file to blocks 10, 11, 12 and 13, the flash controller will write to blocks 14, 15, 16 and 17. In fact, those four blocks won't get used again until every other block on the drive has been written to once

    By this I understand that a bigger capacity SSD, for instance 320 vs 160, will have more blocks and hence you will need more writes to deplete the number of write cycles the SSD was designed for. So for SSDs, bigger means even longer lasting. Is this true?
  • lpaster - Wednesday, November 26, 2008 - link

    Can you overclock this SSD?
  • Sendou - Wednesday, February 9, 2011 - link

    There are optimization methods available for SSD's which can mitigate performance loss through genuine usage over time.

    One such is Diskeeper's HyperFast Technology.

    There is a white paper regarding HyperFast available at:

    http://downloads.diskeeper.com/pdf/Optimizing-Soli...
  • BludBaut - Thursday, March 31, 2011 - link

    I read the pdf article you linked from Diskeeper.

    Based on the information Anand has given in his articles about Intel's technology, Diskeeper's "whitepaper" sounds like crap advertising by a company who's afraid their technology might be considered not only useless but detrimental to use with SSDs. I'm inclined to agree since Diskeeper's own results show a 4x write loss by just *one* "optimization" while Anand's article clearly suggests that the proper design (which he says Intel has accomplished) eliminates the need for Diskeeper's service.

    Until I find more thorough examination of the facts, Diskeeper's remarks make me distrust them.

    On the other hand, Anand's article definitely sounds not just like a puff piece for Intel, but qualifies in my mind as advertising. Wonder how much money Intel has spent on Anandtech? That's not to suggest that anything is misrepresentative (well, it wasn't meant to sound that way, but keep reading and you'll find the one-sided praise will later be partially retracted and I don't know the end of the story yet), but we all know that advertising always leaves out the negatives.

    (Reviews shouldn't sound like advertisements but anyone who's been reading magazine reviews for 30 years knows that's frequently the case. The reviewer's bills get paid by the manufacturers of the products he's reviewing. But, the reviewer is objective of course. It's a matter of journalistic integrity. Yeah, I believe that. Don't you?)

    One such negative was the promotion of the life of the drive. "20GB a day for five years"? Anand praises Intel for multiplying that by five to "100GB a day for five years" but then tells us that they'll only guarantee the drive for three years and has the audacity to suggest we'll likely have a recourse "if we can prove" ... -- how is anyone going to prove how many GBs a day they put on their computer? The annoyance of trying to keep track is not something 99% of people would do.

    Did you do the math to see how long it takes to write 100GB to a drive with a write speed of 200MB/s? Eight minutes and twenty seconds is all it takes.

    Well, that's great if all you use your computer for is reading articles, checking the news and sales prices and sending email. The drive should last as long as your computer. But if you love video (who loves video???), it's a different story entirely.

    There's another negative that, though first denied, eventually was acknowledged. More than six months later, Anand reports back and says essentially, 'Intel is still the best but the performance does degrade with time and I don't know why.' If he's explained it since then, I've yet to read it.

    So, for those just reading the article, don't get so encouraged that you start drooling. The article has a tendency to make one think, "What am I waiting for? I want one of these puppies!" Unfortunately, Intel's technology isn't as rosy and bulletproof as Anand made it sound.
  • kevonly - Friday, November 21, 2008 - link

    I hope you do some benchmark on Samsung's new 256GB SSD. Hopefully it's as good as Intel's.
  • kevonly - Friday, November 21, 2008 - link

    its read/write speed is 200/160 mb/s. Will it sustain that speed in a multi applications running environment??
  • kevonly - Friday, November 21, 2008 - link

    sorry

    read/write speed is 220/200 mb/s.
  • scotopicvision - Monday, November 10, 2008 - link

    The article was an amazing read, fantastic, and well done thank you.
  • D111 - Saturday, October 25, 2008 - link


    Legacy OS like Windows Vista, XP, and Applications like Microsoft Office 2003, 2007, etc. have built in, inherent flaws with regard to SSDs.

    Specifically, the optimizations these OSes apply for mechanical hard drives, like superfetch, prefetch, etc., tend to hurt rather than help performance. They are unnecessary for speeding up reads on an SSD, and they slow it down with unnecessary writes of small files, at which SSDs are slower than a regular hard drive.

    Things like automatic drive defragmentation in Vista do nothing for SSDs except slow them down.

    Properly optimized, even low cost 2007 generation SSDs test out as equivalent to a 7200 rpm consumer grade drive, and typical SSDs made in 2008 or later tend to outperform mechanical hard drives.

    The tests done here have done nothing to "tweak" the OS to remove design hindrances to SSD performance, and thus, have no validity or technical merit.

    The test, as presented, would be similar to installing a 19th century steam engine on a sailing ship, and observing that it is rather slow ---- without mentioning the drag and performance hits caused by the unused sail rigging, masts, etc.

    See the discussion here for a detailed discussion of SSD performance tweaks and what it takes to make them perform well with legacy OS and Applications.

    http://www.ocztechnologyforum.com/forum/forumdispl...
