The Flash Hierarchy & Data Loss

We've already established that a flash cell can store either one or two bits depending on whether it's an SLC or MLC device. Group a bunch of cells together and you've got a page. A page is the smallest structure you can program (write to) in a NAND flash device. In most MLC NAND flash each page is 4KB. A block consists of a number of pages; in the Intel MLC SSD a block is 128 pages (128 pages x 4KB per page = 512KB per block = 0.5MB). A block is the smallest structure you can erase. So when you write to an SSD you can write 4KB at a time, but when you erase from an SSD you have to erase 512KB at a time. I'll explore that a bit further in a moment, but first let's look at what happens when you erase data from an SSD.
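To make the page/block arithmetic concrete, here is a minimal sketch using the Intel MLC figures quoted above. The constant and function names are purely illustrative and do not come from any real SSD firmware or tool.

```python
# Flash geometry as quoted in the article for the Intel MLC SSD.
PAGE_SIZE_KB = 4        # a page is the smallest unit you can program (write)
PAGES_PER_BLOCK = 128   # a block is the smallest unit you can erase

BLOCK_SIZE_KB = PAGE_SIZE_KB * PAGES_PER_BLOCK


def pages_needed(file_size_kb: int) -> int:
    """Whole pages needed to hold a file of the given size (ceiling division)."""
    return -(-file_size_kb // PAGE_SIZE_KB)


print(BLOCK_SIZE_KB)    # 512
print(pages_needed(8))  # 2
```

The gap between the 4KB write unit and the 512KB erase unit is what the rest of this page is about.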

Whenever you write data to flash we go through the same iterative programming process again. Create an electric field, electrons tunnel through the oxide and the charge is stored. Erasing the data causes the same thing to happen but in the reverse direction. The problem is that the more times you tunnel through that oxide, the weaker it becomes, eventually reaching a point where it will no longer prevent the electrons from doing whatever they want to do.

On MLC flash that point is reached after about 10,000 erase/program cycles. With SLC it's 100,000, thanks to the simplicity of the SLC design. With a finite lifespan, SSDs have to be very careful in how and when they choose to erase/program each cell. Note that you can read from a cell as many times as you want to; that doesn't reduce the cell's ability to store data. It's only the erase/program cycle that reduces life. I refer to it as a cycle because an SSD has no concept of just erasing a block; the only time it erases a block is to write new data. If you delete a file in Windows but don't create a new one, the SSD doesn't actually remove the data from flash until you're ready to write new data.

Now going back to the disparity between how you program and how you erase data on an SSD: you program in pages and you erase in blocks. Say you save an 8KB file and later decide that you want to delete it; it could just be a simple note you wrote for yourself that you no longer need. When you saved the file, it'd be saved as two pages in the flash memory. When you go to delete it, however, the SSD marks the pages as invalid but it won't actually erase the block. The SSD will wait until a certain percentage of pages within a block are marked as invalid before copying any valid data to new pages and erasing the block. The SSD does this to limit the number of times an individual block is erased, and thus prolong the life of your drive.
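The deferred-erase behavior described above can be sketched as a toy model. This is not any vendor's actual algorithm: the `Block` class, the `collect` function and especially the 50% `GC_THRESHOLD` are assumptions made up for illustration, since the text only says "a certain percentage".

```python
from dataclasses import dataclass, field

PAGES_PER_BLOCK = 128
GC_THRESHOLD = 0.5  # hypothetical: the real percentage is not given


@dataclass
class Block:
    # Each page is "free", "valid" or "invalid".
    pages: list = field(default_factory=lambda: ["free"] * PAGES_PER_BLOCK)
    erase_count: int = 0

    def write(self, n: int) -> list:
        """Program n free pages and return their indices."""
        free = [i for i, state in enumerate(self.pages) if state == "free"][:n]
        if len(free) < n:
            raise ValueError("not enough free pages in this block")
        for i in free:
            self.pages[i] = "valid"
        return free

    def delete(self, indices: list) -> None:
        """Deleting only marks pages invalid; nothing is erased yet."""
        for i in indices:
            self.pages[i] = "invalid"

    def invalid_fraction(self) -> float:
        return self.pages.count("invalid") / len(self.pages)


def collect(block: Block, spare: Block) -> bool:
    """Erase the block only once enough of its pages are invalid.

    Valid pages are copied to the spare block first. Returns True if an
    erase happened.
    """
    if block.invalid_fraction() < GC_THRESHOLD:
        return False
    spare.write(block.pages.count("valid"))
    block.pages = ["free"] * PAGES_PER_BLOCK
    block.erase_count += 1
    return True


# The 8KB note from the text: two pages written, then deleted.
block, spare = Block(), Block()
note = block.write(2)
block.delete(note)
print(collect(block, spare))  # False
print(block.erase_count)      # 0
```

Deleting the note costs no erase cycle at all in this model; the block is only erased later, once invalid pages have piled up.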

 

Not all SSDs handle deletion requests the same way; how and when the controller decides to erase a block with invalid pages determines the write amplification of your device. In the case of a poorly made SSD, if you simply wanted to change a 16KB file the controller could conceivably read the entire block into main memory, change the four pages, erase the block from the SSD and then write the new block with the four changed pages. Using the page/block sizes from the Intel SSD, this would mean that a 16KB write would actually result in 512KB of writes to the SSD - a write amplification factor of 32x.
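The 32x figure is just the ratio of data written to flash over data the host asked to write. A minimal sketch of that arithmetic, assuming the worst-case controller described above and assuming the write does not straddle a block boundary (the function names are illustrative):

```python
BLOCK_SIZE_KB = 512  # 128 pages x 4KB, the Intel MLC figures from the text


def naive_flash_writes_kb(host_write_kb: int) -> int:
    """Worst case: every block touched by the write is rewritten in full.

    Assumes the write starts on a block boundary, so it touches
    ceil(host_write_kb / BLOCK_SIZE_KB) blocks.
    """
    blocks_touched = -(-host_write_kb // BLOCK_SIZE_KB)
    return blocks_touched * BLOCK_SIZE_KB


def write_amplification(host_write_kb: int, flash_write_kb: int) -> float:
    """Data physically written to flash divided by data the host requested."""
    return flash_write_kb / host_write_kb


print(write_amplification(16, naive_flash_writes_kb(16)))  # 32.0
```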

At this point we don't have any data from any of the other SSD controller makers on how they handle situations like this, but Intel states that traditional SSD controllers suffer from write amplification in the 20 - 40x range, which reduces the longevity of their drives. Intel states that on typical client workloads its write amplification factor is less than 1.1x; in other words, you're writing less than 10% more data than you need to. The write amplification factor itself doesn't mean much; what matters is the longevity of the drive, and there's one more factor that contributes there.

We've already established that with flash there are a finite number of times you can write to a block before it loses its ability to store data. SSDs are pretty intelligent and will use wear leveling algorithms to spread out block usage across the entirety of the drive. Remember that unlike mechanical disks, it doesn't matter where on an SSD you write to; the performance will always be the same. SSDs will thus attempt to write data to all blocks of the drive equally. For example, let's say you download a 2MB file to your brand new, never been used SSD, which gets saved to blocks 10, 11, 12 and 13. You realize you downloaded the wrong file and delete it, then go off to download the right file. Rather than write the new file to blocks 10, 11, 12 and 13, the flash controller will write to blocks 14, 15, 16 and 17. In fact, those four blocks won't get used again until every other block on the drive has been written to once. So while your MLC SSD may only have a lifespan of 10,000 cycles, it's going to last quite a while thanks to intelligent wear leveling algorithms.
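The 2MB download example can be sketched with the simplest possible wear leveling scheme, a round-robin cursor over the blocks. This is an illustrative assumption, not Intel's algorithm: the `RoundRobinAllocator` class, the 1,000-block drive and the starting block are all made up, and the model ignores blocks that still hold valid data.

```python
BLOCK_SIZE_KB = 512  # 128 pages x 4KB, per the Intel MLC figures


def blocks_for(file_kb: int) -> int:
    """Whole blocks needed to hold a file (ceiling division)."""
    return -(-file_kb // BLOCK_SIZE_KB)


class RoundRobinAllocator:
    """Toy wear leveler: always hand out the next block in sequence.

    A block is not reused until the cursor has gone all the way around
    the drive, so write counts never differ by more than one.
    """

    def __init__(self, num_blocks: int, start: int = 0):
        self.num_blocks = num_blocks
        self.cursor = start
        self.write_counts = [0] * num_blocks

    def allocate(self, count: int) -> list:
        chosen = []
        for _ in range(count):
            chosen.append(self.cursor)
            self.write_counts[self.cursor] += 1
            self.cursor = (self.cursor + 1) % self.num_blocks
        return chosen


drive = RoundRobinAllocator(num_blocks=1000, start=10)
print(drive.allocate(blocks_for(2048)))  # [10, 11, 12, 13]
# The wrong file is deleted, but its blocks are not handed out again yet.
print(drive.allocate(blocks_for(2048)))  # [14, 15, 16, 17]
```

A real controller also has to track which blocks hold live data and move rarely changed data around, which is where inefficiencies like the ones discussed below come from.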


Intel's wear leveling efficiency, all blocks get used nearly the same amount


Bad wear leveling, presumably on existing SSDs, some blocks get used more than others

Intel's SSDs carry about a 4% wear leveling inefficiency, meaning that 4% of the blocks on an Intel SSD will be worn at a rate higher than the rest.


  • mindless1 - Thursday, September 11, 2008

    Sometimes the cure is worse than the problem.
  • Gannon - Tuesday, September 9, 2008

    Don't worry derek I still heart you guys! :P

    Here's some cool software to check out (they have free trial version)

    http://www.whitesmoke.com/landing_flash/free_hotfo...

    Maybe it will help escape complaints from the grammar nazi's, I think a lot of grammar is BS anyway. Language evolves constantly. It's a flexible tool to communicate.
  • Nihility - Monday, September 8, 2008

    An excellent review. The benchmark results were always confusing in the past. No one would try to explain why an SSD with seemingly superior specs can't outperform a 7200 drive in media test. Thanks for putting the time in to resolve this issue.

    As for buying a drive like that, the price is still too steep for me to consider and you definitely made it clear that buying a JMicron SSD is out of the question.
    As for further testing, I'm very interested in seeing how a good SSD performs as an external drive over USB. The robustness and sturdiness of the drive is very important for something you lug around. We all know how bad bandwidth is over USB but I wonder how the latency will fare.

    Keep up the good work.
  • kmmatney - Monday, September 8, 2008

    One of the other reviews I read said this SSD's controller will learn hard drive usage patterns, and get faster over time. Any tests of this feature?
  • leexgx - Monday, September 8, 2008

    not sure how they can learn

    i did wonder why they never put any DRAM buffer on SSD drives, as i was expecting SSDs to suffer badly from lack of buffer. any MLC drives basically suck (16kb buffer per flash chip) unless it's the intel MLC drive lol; an SLC drive seems mostly ok, but an intel SLC is going to rock when they get tested
  • Anand Lal Shimpi - Monday, September 8, 2008

    The Intel drive will learn hard drive usage patterns however it does so over an extremely long period of time, not something I could develop a test for in my time with the drive.

    Take care,
    Anand
  • whatthehey - Monday, September 8, 2008

    ...that doesn't think too much about HDD performance, particularly when we're talking about insane prices. Sure, rebooting and reloading all of your apps will feel much faster. Personally, when I reboot I walk into the kitchen or bathroom, walk back a few minutes later, and I don't notice the delays. Not to mention, I only reboot about once a month (usually when nVidia releases a new driver that I need to install).

    Another major problem I have is the tests as an indication of the "real world". Take the whole antivirus thing. I hate AV software and software firewalls, which is why I don't use Norton, AVG, Avast, McAfee or any other product that kills performance, sucks up memory, and only prevents virii/trojans after an update. AV software is just a BS excuse to pay a $60/year subscription and get nags every time your subscription expires. So there's one "real" scenario I don't ever encounter.

    Archive extraction can be pretty disk intensive as well, but how often do you need to extract a 5GB archive? Okay, so let's say you're a pirate and you do that daily... great. Now you can extract faster, but you have an SSD that can only hold 14 or so large archives. It's a nice illustration of SSDs being faster, but it's completely impractical. I have a 1TB drive just for all the movies, images, music, and disc images I have floating around.

    The tests show that SSDs can help a lot, but I for one use capacity far more. Between several games, my standard apps, and Vista I think I would use most of the 80GB. Then I think of the price and I could grab a couple VelociRaptors or even four 1TB Samsung F1 drives. I'll be truly impressed when I can get at least 320GB of SSD for less than $200. Actually, it's more like I want a good SSD with a reasonable capacity for under $100. Until then I'll just stick with my slower drives and avoid worst-case situations where HDD performance is a problem as much as possible.

    The article was good, and I appreciate the info on the MLC issues with JMicron. That confirms my suspicion that inexpensive flash drives are worse than standard mechanical drives. Intel has addressed the problem, but price is now back to where we were last year it seems. I guess the real problem is that I'm just not enough of an "enthusiast" to spend this much money on 80GB of storage... not counting stuff like that old 4GB hard drive back in the day that set me back over $200. Give it a few more cycles and I think I'll be ready for SSDs.

    PS - Also, who cares about $600 CPUs when you can buy $200 CPUs and overclock to higher performance levels? I don't think we'll ever see overclocked SSDs or HDDs.
  • DerekWilson - Tuesday, September 9, 2008

    i wouldn't be so sure about not seeing overclocked SSD ...

    as this article points out, intel puts a focus on reliability ... but to do so they do sacrifice performance. the voltage applied to the transistors to store data is calibrated to write the cells quickly while maintaining a good life span. a higher voltage could be applied that would allow the cells to be written faster but would reduce the number of writes that a cell could handle.

    if intel says 100gb a day for 5 years ... i don't need that by a long shot. i would be very willing to sacrifice a lot of that for more speed.

    i actually spoke with intel about the possibility of overclocking their ssd drives at idf -- it is something that could be done as it is controlled via the firmware of the drive. if intel doesn't convolute their firmware too much or if they allow enthusiasts to have the necessary control over settings at that level we could very well see overclocked SSDs ...

    which would be very interesting indeed.
  • shabby - Monday, September 8, 2008

    I was so close in buying one of those ocz drives, in fact the reason i didnt buy it was because it was a special order that took 2 weeks.
    Excellent write up, especially about the jmicron/mlc "glitch".
  • OCedHrt - Monday, September 8, 2008

    Any reason why the WD GP drive does so well in the multitasking test? Even better than the VelociRaptor?
