Putting Theory to Practice: Understanding the SSD Performance Degradation Problem

Let’s look at the problem in the real world. You, me and our best friend have decided to start making SSDs. We buy up some NAND-flash and build a controller. The table below summarizes our drive’s characteristics:

  Our Hypothetical SSD
  Page Size      4KB
  Block Size     5 Pages (20KB)
  Drive Size     1 Block (20KB)
  Read Speed     2KB/s
  Write Speed    1KB/s

 

Through impressive marketing and your incredibly good looks we sell a drive. Our customer first goes to save a 4KB text file to his brand new SSD. The request comes down to our controller, which finds that all pages are empty, and allocates the first page to this text file.


Our SSD. The yellow boxes are empty pages

The user then goes and saves an 8KB JPEG. The request, once again, comes down to our controller, and fills the next two pages with the image.


The picture is 8KB and thus occupies two pages, which are thankfully empty

The OS reports that 60% of our drive is now full, which it is. Three of the five pages are occupied with data and the remaining two are empty.

Now let’s say that the user goes back and deletes that original text file. This request never reaches our controller; the OS simply marks the space as available in its own records. As far as our controller is concerned, we’ve still got three valid pages and two empty ones.

For our final write, the user wants to save a 12KB JPEG that requires three 4KB pages to store. The OS knows that the first LBA, the one allocated to the 4KB text file, can be overwritten, so it tells our controller to overwrite that LBA and to store the last 8KB of the image in our last available LBAs.

Now we have a problem once these requests get to our SSD controller. We’ve got three pages worth of write requests incoming, but only two pages free. Remember that the OS thinks we have 12KB free, but on the drive only 8KB is actually free; the other 4KB is occupied by an invalid page. We need to erase that page in order to complete the write request.


Uhoh, problem. We don't have enough empty pages.
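
To make the mismatch concrete, here is a minimal Python sketch of the two views of our hypothetical drive. The names (`controller_pages`, `os_live_files` and the two helper functions) are made up for illustration; a real controller tracks page state very differently.

```python
PAGE_KB = 4  # page size of our hypothetical drive

# Controller's view: None is an empty page, anything else is stored data.
# The controller never heard about the delete, so "text" is still there.
controller_pages = ["text", "jpeg1", "jpeg1", None, None]

# OS's view: the files it still considers live (the text file was deleted).
os_live_files = {"jpeg1"}


def os_free_kb(pages, live_files):
    """Free space as the OS sees it: any page not holding a live file."""
    return sum(PAGE_KB for p in pages if p is None or p not in live_files)


def controller_free_kb(pages):
    """Free space as the controller sees it: only truly empty pages."""
    return sum(PAGE_KB for p in pages if p is None)


incoming_kb = 12  # the 12KB JPEG
print(os_free_kb(controller_pages, os_live_files))      # 12
print(controller_free_kb(controller_pages))             # 8
print(incoming_kb <= controller_free_kb(controller_pages))  # False
```

The OS sees enough room for the 12KB image, while the controller comes up one page short.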

Remember Flash 101: even though we only need to erase one page, we can’t. You can’t erase pages, only blocks. We have to erase all of our data just to get rid of the invalid page, then write it all back again.

To do so we first read the entire block back into memory somewhere. If we’ve got a good controller we’ll just read it into an on-die cache (steps 1 and 2 below); if not, hopefully there’s some off-die memory we can use as a scratch pad. With the block read, we can modify it: remove the invalid page and replace it with good data (steps 3 and 4). But we’ve only done that in memory, so now we need to write it to flash. Since we’ve got all of our data in memory, we can erase the entire block in flash and write the new block (step 5).
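
The read-modify-write sequence just described can be sketched in a few lines of Python. This is a toy model of our one-block drive, not how any real controller firmware works; `read_modify_write` and its parameters are hypothetical names, and the step numbers in the comments follow the steps mentioned above.

```python
PAGE_KB = 4  # page size of our hypothetical drive


def read_modify_write(block, invalid_index, new_pages):
    """Rewrite one block so that new_pages replace an invalid page.

    block: list of page contents, None for an empty page.
    Returns (new_block, kb_read, kb_written).
    """
    # Steps 1 and 2: read every occupied page into a scratch buffer.
    buffer = list(block)
    kb_read = PAGE_KB * sum(1 for page in block if page is not None)

    # Steps 3 and 4: drop the invalid page, then place the new data
    # in whatever slots are now empty.
    buffer[invalid_index] = None
    pending = list(new_pages)
    for i, page in enumerate(buffer):
        if page is None and pending:
            buffer[i] = pending.pop(0)
    if pending:
        raise ValueError("new data does not fit in the block")

    # Step 5: erase the whole block in flash, then write the buffer back.
    flash = [None] * len(block)  # erase
    flash[:] = buffer            # program every page
    kb_written = PAGE_KB * sum(1 for page in flash if page is not None)
    return flash, kb_read, kb_written


block = ["text", "jpeg1", "jpeg1", None, None]
new_block, kb_read, kb_written = read_modify_write(block, 0, ["jpeg2"] * 3)
print(new_block)            # ['jpeg2', 'jpeg1', 'jpeg1', 'jpeg2', 'jpeg2']
print(kb_read, kb_written)  # 12 20
```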

Now let’s think about what’s just happened. As far as the OS is concerned, we needed to write 12KB of data and it got written. Our SSD controller, however, knows what really transpired. In order to write that 12KB of data we first had to read the 12KB already sitting in the block, then write an entire block, or 20KB.

Our SSD is quite slow: it can only write at 1KB/s and read at 2KB/s. Writing 12KB should have taken 12 seconds, but since we had to read 12KB (6 seconds) and then write 20KB (20 seconds), the whole operation took 26 seconds.

To the end user it would look like our write speed dropped from 1KB/s to 0.46KB/s, since it took us 26 seconds to write 12KB.
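
The arithmetic is easy to check with a short Python sketch. The speeds come from our table above; the function name `effective_write_speed` is just an illustrative choice.

```python
READ_KBPS = 2.0   # read speed of our hypothetical drive
WRITE_KBPS = 1.0  # write speed of our hypothetical drive


def effective_write_speed(user_kb, kb_read, kb_written):
    """Return (seconds, KB/s as seen by the user) for one write request."""
    seconds = kb_read / READ_KBPS + kb_written / WRITE_KBPS
    return seconds, user_kb / seconds


# Writing 12KB to empty pages: nothing to read, 12KB to write.
fresh_seconds, fresh_speed = effective_write_speed(12, 0, 12)
print(fresh_seconds, fresh_speed)         # 12.0 1.0

# Writing 12KB to our used drive: read 12KB, then write the full 20KB block.
used_seconds, used_speed = effective_write_speed(12, 12, 20)
print(used_seconds, round(used_speed, 2))  # 26.0 0.46
```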

Are things starting to make sense now? This is why the Intel X25-M and other SSDs get slower the more you use them, and it’s also why the write speeds drop the most while the read speeds stay about the same. When writing to an empty page the SSD can write very quickly, but when writing to a page that already has data in it there’s additional overhead that must be dealt with thus reducing the write speeds.

240 Comments

  • Bytales - Friday, March 27, 2009

    As i read the article, i'm thinking of ways to slow down the degrading process. Intel is gonna ship x-25m 320gb this year. If i buy this drive and use it as an OS drive, i obviously won't need the whole 320GB. Say i would need only 40 to 50 GB. I can make a secure erase (if the drive isn't new), make a partition of 50GB, and leave the remaining space unpartitioned. Will that solve the problem in any way ?
    Another way to solve the problem, would be a method inside the OS. The OS could use a user controlled % of the RAM memory, as a cache for those small 4kb files. Since ram reads and writes are way faster, i think it will also help. Say you got 8GB ram, and use 2gb for this purpose, and then the OS would only have 6gb ram for its use, while 2gb is used for these smaller files. That would increase also the lifespan of the SSD. Can this be possible ?
  • Hellfire26 - Thursday, March 26, 2009

    In reference to SSD's, I have read a lot of articles and comments about improved firmware and operating system support. I hope manufacturers don't forget about the on-board RAID controller.

    From the articles and comments made by users around the web, who have tested SSD's in a Raid 0 configuration, I believe that two Intel X25-M SSD's in a RAID 0 configuration would more than saturate current on-board RAID controllers.

    Intel is doing a die shrink of the NAND memory that is going into their SSD's come this fall. I would expect these new Intel SSD's to show faster read and write times. Other manufacturers will also find ways to increase the speed of their SSD's.

    SSD's scale well in a RAID configuration. It would be a shame if the on-board RAID controller limited our throughput. The alternative would be very expensive add-in RAID cards.
  • FlaTEr1C - Wednesday, March 25, 2009

    Anand, once again you wrote an article that no one else could've written. This is why I'm reading this site since 2004 and will always do. Your articles and reviews are without exception unique and a must-read. Thank you for this thorough background, analysis and review of SSD.

    I was looking a long time for a solution to make my desktop experience faster and I think I'll order a 60GB Vertex. 200€(germany) is still a lot of money but it will be worth it.

    Once again, great work Anand!
  • blackburried - Wednesday, March 25, 2009

    It's referred to as "discard" in the kernel functions.

    It works very well w/ SSD's that support TRIM, like fusion-io's drives.
  • Iger - Wednesday, March 25, 2009

    This is the best review I've read in a very long time.
    Thank you very much!
  • BailoutBenny - Tuesday, March 24, 2009

    Great in depth article on flash based SSDs. I'm waiting for PRAM though.
  • orclordrh - Tuesday, March 24, 2009

    Very illuminating article, very well written and researched. It made me glad that I didn't pull the trigger on an SSD for my I7 machine and regret not buying OCZ memory! I'm interested in adding an SSD as the scratch disk for Photoshop CS4 to use. I don't really launch applications very often, say once a week on the weekly reboot and keep 6-8 apps open at all times. I have 12GB of memory for that. The benchmarks were very interesting, but what sort of activity does Photoshop scratch usage create? Large files or random writes? What type of SSD would be most cost effective here?
    An SSD does sound better than a SSD!
  • semo - Wednesday, March 25, 2009

    wait for ddr3 to enter the mainstream and buy loads of memory.

    use a ramdisk for your adobe scratch area. much faster than ssd and no wear to worry about (not that you would worry that much with modern ssds anyway).

    http://www.ghacks.net/2007/12/14/use-a-ramdisk-to-...

    there is also a paid for and more feature rich ramdisk out there. can't remember the name
  • strikeback03 - Wednesday, March 25, 2009

    I'll have to check when I get home, but I believe the recommended size for the scratch disk is upwards of 10GB. So would need a motherboard that supports a LOT of RAM to give enough to main memory plus a scratch disk.
  • strikeback03 - Wednesday, March 25, 2009

    I was wondering the same thing. I'd guess it would be a lot of writing/erasing, so an SSD might not be the best from a longevity standpoint, but if your system is hitting the scratch disk often then the speed might make it worthwhile.
