Strength in Numbers: What Makes SSDs Fast

Given the way a single NAND-flash IC is organized, one thing should come to mind: parallelism.

Fundamentally, the flash that’s used in SSDs is cut from the same cloth as the flash that’s used in USB drives. And if you’ve ever used a USB flash drive, you know that those things aren’t all that fast. Peak performance of a single NAND-flash IC is going to be somewhere in the 5 - 40MB/s range. You get the faster transfer rates by reading/writing in parallel to multiple dies in the same package.

The real performance comes from accessing multiple NAND ICs concurrently. If each device can give you 20MB/s of bandwidth and you’ve got 10 devices you can access at the same time, that’s 200MB/s of bandwidth. While hard drives like reads/writes to be at the same place on the drive, SSDs don’t mind; some are even architected to prefer that data be spread out all over the drive so it can hit as many flash devices as possible in parallel. Most drives these days have controllers with 4 - 10 channels.
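
To make the arithmetic concrete, here is a minimal Python sketch of the ideal case. The function names and the round-robin page-to-channel mapping are assumptions made for illustration; real controllers use their own layouts and won't scale perfectly.

```python
def aggregate_bandwidth_mb_s(per_device_mb_s, devices):
    """Ideal peak bandwidth when every device is kept busy at once."""
    return per_device_mb_s * devices


def channel_for_page(logical_page, channels):
    """Hypothetical round-robin striping: consecutive pages land on different channels."""
    return logical_page % channels


# The example from the text: 10 devices at 20MB/s each.
print(aggregate_bandwidth_mb_s(20, 10))  # 200

# Twelve consecutive pages on a 10-channel controller:
# pages 10 and 11 wrap back around to channels 0 and 1.
print([channel_for_page(p, 10) for p in range(12)])
```

With this kind of striping, a run of consecutive pages touches every channel, which is why spreading data out helps an SSD rather than hurting it.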

The Recap

I told you I’d mention this again because it’s hugely important, so here it is:

A single NAND flash die is subdivided into blocks. The typical case these days is that each block is 512KB in size. Each block is further subdivided into pages, with the typical page size these days being 4KB.

Now you can read individual pages, and you can write to individual pages so long as they are empty. However, once a page has been written it can’t be overwritten; it must be erased before you can write to it again. And therein lies the problem: the smallest structure you can erase in a NAND flash device today is a block. Once more: you can read/write 4KB at a time, but you can only erase 512KB at a time.
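
Those rules can be sketched as a toy model. This is only an illustration under the sizes quoted above (4KB pages, 512KB blocks); the `Block` class and its method names are hypothetical and don't correspond to any real controller firmware or API.

```python
PAGE_SIZE_KB = 4
BLOCK_SIZE_KB = 512
PAGES_PER_BLOCK = BLOCK_SIZE_KB // PAGE_SIZE_KB  # 128 pages in every block


class Block:
    """Toy NAND block: pages are written one at a time, erased all at once."""

    def __init__(self):
        self.data = [None] * PAGES_PER_BLOCK
        self.written = [False] * PAGES_PER_BLOCK
        self.erase_count = 0

    def read(self, page):
        # Reads work at page granularity.
        return self.data[page]

    def program(self, page, payload):
        # A page can only be written while it is still empty.
        if self.written[page]:
            raise ValueError("page already written; erase the whole block first")
        self.data[page] = payload
        self.written[page] = True

    def erase(self):
        # Erase has block granularity: all 128 pages are wiped together.
        self.data = [None] * PAGES_PER_BLOCK
        self.written = [False] * PAGES_PER_BLOCK
        self.erase_count += 1


block = Block()
block.program(0, b"first")
try:
    block.program(0, b"second")  # overwriting in place is not allowed
except ValueError as err:
    print(err)
block.erase()                    # wipes every page, not just page 0
block.program(0, b"second")      # now the write succeeds
```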

It gets worse. Every time you erase a block, you reduce the lifespan of the flash. Standard MLC NAND flash can only be erased about 10,000 times before it goes bad and stops storing data.
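
A quick back-of-the-envelope calculation shows why erases have to be rationed. It assumes a hypothetical naive controller that erases the whole block every time a single 4KB page is updated, and compares it with one that fills all 128 pages before erasing. The 10,000-cycle limit, 4KB page and 512KB block are the figures quoted above; the variable names are mine.

```python
PAGE_KB = 4
BLOCK_KB = 512
ERASE_LIMIT = 10_000  # erase cycles per block, per the figure above

# Naive case: one erase for every 4KB page update.
naive_kb = PAGE_KB * ERASE_LIMIT
# Best case: every erase is preceded by writing all pages in the block.
full_kb = BLOCK_KB * ERASE_LIMIT

print(naive_kb / 1024)      # 39.0625 (MB written before the block wears out)
print(full_kb / 1024)       # 5000.0  (MB written before the block wears out)
print(full_kb // naive_kb)  # 128
```

Under these assumptions the same block absorbs 128 times more data before wearing out, simply by making every erase count.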

Based on what I’ve just told you there are two things you don’t want to do when writing to flash: 1) you don’t want to overwrite data, and 2) you don’t want to erase data. If flash were used as a replacement for DVD-Rs then we wouldn’t have a problem, but it’s being used as a replacement for conventional HDDs. Who thought that would be a good idea?

It turns out that the benefits are more than worth the inconvenience of dealing with these pesky rules, so we work around them.

Most people don’t fill up their drives, so SSD controller makers get around the problem by writing to every page on the drive before ever erasing a single block.

If you go about using all available pages to write to and never erasing anything from the drive, you’ll eventually run out of available pages. I’m sure there’s a fossil fuel analogy somewhere in there. While your drive won’t technically be full (you may have been diligently deleting files along the way and only using a fraction of your drive’s capacity), eventually every single block on your drive will be full of both valid and invalid pages.

In other words, even if you’re using only 60% of your drive, chances are that 100% of your drive will get written to simply by day-to-day creation/deletion of files.
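
A small simulation illustrates the effect. It is a sketch under simplifying assumptions, not a model of any particular controller: the drive is a flat list of pages, the user only ever touches 60% of the logical space, every write goes to the next never-used physical page, and nothing is ever erased. The `fill_drive` function and its parameters are hypothetical.

```python
import random


def fill_drive(total_pages=1000, logical_pages=600, seed=0):
    """Write random logical pages until no fresh physical page is left."""
    rng = random.Random(seed)
    state = ["free"] * total_pages
    mapping = {}   # logical page -> physical page holding its current data
    next_free = 0
    writes = 0
    while next_free < total_pages:
        logical = rng.randrange(logical_pages)
        old = mapping.get(logical)
        if old is not None:
            state[old] = "invalid"   # the stale copy stays put; nothing is erased
        state[next_free] = "valid"
        mapping[logical] = next_free
        next_free += 1
        writes += 1
    return writes, state.count("valid"), state.count("invalid"), state.count("free")


writes, valid, invalid, free = fill_drive()
print(writes, valid, invalid, free)
```

However the random writes fall, the run ends with zero free pages and no more than 600 valid ones: every physical page has been written even though the user never held more than 60% of the drive's capacity.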

235 Comments

  • siliq - Wednesday, April 01, 2009 - link

    With Anand's excellent article, it's clear that sequential read/write throughput doesn't matter so much - all SSDs, even the notorious JMicron series, can do a good job on that metric. What is relevant to our daily use is the random write rate. Latency and IOs/second are the most important metrics in the realm of SSDs.

    Based on that, I would suggest that Anand (and other tech reporters) include a real-world test that evaluates random write performance for SSDs. The current real-world tests (booting Windows, loading games, rendering 3D, etc.) focus on random reads. However, measuring how long it takes to install Windows, Microsoft Visual Studio, or a 4GB PC game would thoroughly test random write / latency performance. I think this would be a good complement to our current testing methodology.
  • Sabresiberian - Tuesday, March 31, 2009 - link

    Just wanted to add my thanks to Anand for this article in particular and for the quality work he has done over the years; I am so grateful for Anandtech's quality and information and the fact that it has been maintained!
  • Sabresiberian - Tuesday, March 31, 2009 - link

    Oops didn't proof, sorry about the misspell Anand!
  • hongmingc - Saturday, March 28, 2009 - link

    Anand, this is a great article and a good story too.
    The OCZ story caught my attention: a quick firmware upgrade made a big improvement. From my understanding, SSD system designers try to trade off Space, Speed, and Durability (also SSD :)) due to the nature of NAND flash.
    We can clearly see the trade-off between Space and Speed: the fuller the SSD gets, the slower it becomes (this is due to out-of-place writes increasing the write operations and a block reclaim routine). However, Speed is also sacrificed to achieve Durability (by doing wear leveling). Remember that SLC NAND's lifetime is about 100K writes, while MLC NAND has only about 10K writes. Without wear leveling to improve the life cycle of the SSD, the firmware could be much simpler, which would improve write speed quite a bit.
    I echo you that the performance test should reflect the user's daily usage, which can be small file writes and may not be 80% full.
    However, users may be more concerned about Durability, the life cycle of the SSD.
    Is there such a test? How long will the black box OCZ Vertex live?
    How long will the regular OCZ Vertex live? And how long will the X25 live?
  • antcasq - Sunday, April 05, 2009 - link

    This article was excellent, explaining several issues regarding performance.

    It would be great if the next article about SSDs addressed durability and reliability.

    My main concern is the swap partition (Linux) or virtual memory file (Windows). I found a post on another website saying that this is not an issue. Is it true? I find it hard to believe. Maybe in a real-world test/scenario the problem will arise.
    http://robert.penz.name/137/no-swap-partition-jour...

    I hope AnandTech can take my concerns into consideration.

    Best regards
  • stilz - Friday, March 27, 2009 - link

    This is the first hardware review I've read from start to finish, and the time is well worth the information you've provided.

    Thank you for your honest, professional and knowledgeable work. Also kudos to OCZ, I'll definitely consider the Vertex while making purchases.
  • Bytales - Friday, March 27, 2009 - link

    As I read the article, I'm thinking of ways to slow down the degrading process. Intel is going to ship a 320GB X25-M this year. If I buy this drive and use it as an OS drive, I obviously won't need the whole 320GB. Say I would need only 40 to 50GB. I can do a secure erase (if the drive isn't new), make a partition of 50GB, and leave the remaining space unpartitioned. Will that solve the problem in any way?
    Another way to solve the problem would be a method inside the OS. The OS could use a user-controlled percentage of RAM as a cache for those small 4KB files. Since RAM reads and writes are way faster, I think it would also help. Say you've got 8GB of RAM and use 2GB for this purpose; the OS would then have only 6GB for its own use, while 2GB is used for these smaller files. That would also increase the lifespan of the SSD. Is this possible?
  • Hellfire26 - Thursday, March 26, 2009 - link

    In reference to SSD's, I have read a lot of articles and comments about improved firmware and operating system support. I hope manufacturers don't forget about the on-board RAID controller.

    From the articles and comments made by users around the web, who have tested SSD's in a Raid 0 configuration, I believe that two Intel X25-M SSD's in a RAID 0 configuration would more than saturate current on-board RAID controllers.

    Intel is doing a die shrink of the NAND memory that is going into their SSD's come this fall. I would expect these new Intel SSD's to show faster read and write times. Other manufacturers will also find ways to increase the speed of their SSD's.

    SSD's scale well in a RAID configuration. It would be a shame if the on-board RAID controller limited our throughput. The alternative would be very expensive add-in RAID cards.
  • FlaTEr1C - Wednesday, March 25, 2009 - link

    Anand, once again you wrote an article that no one else could've written. This is why I'm reading this site since 2004 and will always do. Your articles and reviews are without exception unique and a must-read. Thank you for this thorough background, analysis and review of SSD.

    I was looking a long time for a solution to make my desktop experience faster and I think I'll order a 60GB Vertex. 200€(germany) is still a lot of money but it will be worth it.

    Once again, great work Anand!
  • blackburried - Wednesday, March 25, 2009 - link

    It's referred to as "discard" in the kernel functions.

    It works very well w/ SSD's that support TRIM, like fusion-io's drives.