The Cleaning Lady and Write Amplification

Imagine you’re running a cafeteria. This is the real world, so your cafeteria has a finite number of plates, say 200 for the entire cafeteria. It is open for dinner, and over the course of the night you may serve a total of 1000 people. Guests outnumber plates 5-to-1; thankfully, they don’t all eat at once.

You’ve got a dishwasher who cleans the dirty dishes as the tables are bussed and then puts them in a pile of clean dishes for the servers to use as new diners arrive.

Pretty basic, right? That’s how an SSD works.

Remember the rules: you can read from and write to pages, but you must erase entire blocks at a time. If a block is full of invalid pages (files that have been overwritten at the file system level for example), it must be erased before it can be written to.

All SSDs have a dishwasher of sorts, except instead of cleaning dishes, its job is to clean NAND blocks and prep them for use. The cleaning algorithms don’t really kick in when the drive is new, but put a few days, weeks or months of use on the drive and cleaning will become a regular part of its routine.

Remember this picture?

It (roughly) describes what happens when you go to write a page of data to a block that’s full of both valid and invalid pages.

In actuality, the write happens more like this: a new block is allocated, valid data is copied to the new block (including the data you wish to write), and the old block is sent for cleaning and emerges completely wiped. The old block is then added to the pool of empty blocks. As the controller needs them, blocks are pulled from this pool and used, and the old blocks are recycled back into it.
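The relocate-and-erase sequence described above can be sketched in a few lines of Python. This is only an illustration, not how any particular controller is implemented: the four-page block size, the `rewrite_page` helper, and the use of `None` to mark an invalid page are all assumptions made for the example.

```python
from collections import deque

PAGES_PER_BLOCK = 4  # assumed for illustration; real NAND blocks hold far more pages


def rewrite_page(old_block, page_index, new_data, free_pool):
    """Write one page by relocating the whole block to a fresh one.

    old_block is a list of page contents, with None marking an invalid page.
    free_pool is a deque of erased blocks (lists of None).
    Returns the new block and the number of pages physically written.
    """
    new_block = free_pool.popleft()      # allocate a fresh block from the pool
    pages_written = 0
    for i, data in enumerate(old_block):
        if i == page_index:
            new_block[i] = new_data      # the page the host asked to write
            pages_written += 1
        elif data is not None:
            new_block[i] = data          # still-valid data is copied along
            pages_written += 1
    # The old block is erased as a unit and returned to the pool.
    for i in range(len(old_block)):
        old_block[i] = None
    free_pool.append(old_block)
    return new_block, pages_written


free_pool = deque([[None] * PAGES_PER_BLOCK for _ in range(2)])
block = ["A", None, "C", None]           # two valid pages, two invalid ones
new_block, written = rewrite_page(block, 1, "B2", free_pool)
print(new_block, written)
```

In this toy case the host asked for one page, but three pages were physically written because two valid neighbors had to come along for the ride.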

IBM's Zurich Research Laboratory actually made a wonderful diagram of how this works, but it's a bit more complicated than I need it to be for my example here today so I've remade the diagram and simplified it a bit:

The diagram explains what I just outlined above. A write request comes in; a new block is allocated, used, and then added to the list of used blocks. The blocks with the least amount of valid data (or the most invalid data) are scheduled for garbage collection, cleaned and added to the free block pool.
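Picking the block with the least valid data is often called a greedy policy. A minimal sketch of that selection step follows, assuming blocks are plain Python lists with `None` for invalid pages; the `valid_pages` and `collect_one` names are made up for this example, and real firmware weighs other factors (wear leveling, for instance) that are ignored here.

```python
def valid_pages(block):
    """Count the pages in a block that still hold live data."""
    return sum(page is not None for page in block)


def collect_one(used_blocks, free_pool):
    """Clean the used block with the least valid data.

    Returns the surviving pages, which the caller must rewrite elsewhere.
    """
    idx = min(range(len(used_blocks)), key=lambda i: valid_pages(used_blocks[i]))
    victim = used_blocks.pop(idx)
    survivors = [page for page in victim if page is not None]
    free_pool.append([None] * len(victim))   # the erased block rejoins the free pool
    return survivors


used = [["A", "B", None, "D"], [None, None, "G", None], ["I", None, None, "L"]]
free = []
moved = collect_one(used, free)
print(moved, len(used), len(free))
```

Choosing the emptiest block keeps the number of pages that have to be copied as small as possible for each block reclaimed.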

We can actually see this in action if we look at write latencies:

Average write latencies for writing to an SSD, even with random data, are extremely low. But take a look at the max latencies:

While average latencies are very low, the max latencies are around 350x higher. They are still low compared to a mechanical hard disk, but what's going on to make the max latency so high? All of the cleaning and reorganization I've been talking about. It rarely makes a noticeable impact on performance (hence the ultra low average latencies), but this is an example of it happening.

And this is where write amplification comes in.

In the diagram above we see another angle on what happens when a write comes in. A free block is used (when available) for the incoming write. That's not the only write that happens, however; eventually you have to perform some garbage collection so you don't run out of free blocks. The block with the most invalid data is selected for cleaning; its valid data is copied to another block, after which the old block is erased and added to the free block pool. In the diagram above you'll see the size of our write request on the left, but on the very right you'll see how much data was actually written when you take into account garbage collection. This inequality is called write amplification.
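To see how garbage collection inflates the amount of data written, here is a toy simulation. Everything about it is an assumption chosen to keep the example small: 16 blocks of 8 pages, 96 logical pages (leaving 25% spare area), uniformly random overwrites, and a greedy cleaner that runs whenever fewer than two erased blocks remain. It is a sketch of the idea, not a model of any shipping controller.

```python
import random

PAGES_PER_BLOCK = 8     # assumed geometry, far smaller than real NAND
NUM_BLOCKS = 16
LOGICAL_PAGES = 96      # 75% of the 128 physical pages; the rest is spare area


def simulate(host_writes, seed=0):
    """Return NAND page writes divided by host page writes."""
    rng = random.Random(seed)
    # blocks[b] lists the logical pages programmed into block b, in order.
    # An entry is valid only if `location` still points back at that slot.
    blocks = [[] for _ in range(NUM_BLOCKS)]
    location = {}                        # logical page -> (block, slot)
    free = list(range(1, NUM_BLOCKS))    # erased blocks
    active = 0                           # block currently being filled
    nand_writes = 0

    def valid(b):
        return [lp for slot, lp in enumerate(blocks[b])
                if location.get(lp) == (b, slot)]

    def program(lp):
        nonlocal active, nand_writes
        if len(blocks[active]) == PAGES_PER_BLOCK:
            active = free.pop()          # active block is full, open a new one
        location[lp] = (active, len(blocks[active]))
        blocks[active].append(lp)
        nand_writes += 1

    for _ in range(host_writes):
        program(rng.randrange(LOGICAL_PAGES))
        while len(free) < 2:             # running low: clean a block
            candidates = [b for b in range(NUM_BLOCKS)
                          if b != active and b not in free]
            victim = min(candidates, key=lambda b: len(valid(b)))
            survivors = valid(victim)
            blocks[victim] = []          # erase the victim
            free.append(victim)
            for lp in survivors:         # relocated data costs extra writes
                program(lp)
    return nand_writes / host_writes


fresh = simulate(50)        # too few writes for cleaning to start
waf = simulate(20000)       # long enough that cleaning runs constantly
print(f"fresh drive: {fresh:.2f}, after heavy use: {waf:.2f}")
```

With only a handful of writes no cleaning is needed and the ratio is exactly 1; once the spare blocks run out, every reclaimed block drags some valid pages along with it and the ratio climbs above 1.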


Intel claims very low write amplification on its drives, although over the lifespan of your drive a factor below 1.1 seems highly unlikely.

The write amplification factor is the amount of data the SSD controller has to write in relation to the amount of data that the host controller wants to write. A write amplification factor of 1 is perfect; it means you wanted to write 1MB and the SSD’s controller wrote 1MB. A write amplification factor greater than 1 isn't desirable, but it is an unfortunate fact of life. The higher your write amplification, the quicker your drive will die and the lower its performance will be. Write amplification, bad.
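As a rough worked example of why the factor matters for lifespan, the sketch below computes the ratio and a back-of-the-envelope endurance estimate. The 80GB capacity and 10,000 program/erase cycles are illustrative assumptions, not figures from any spec sheet, and the estimate assumes perfectly even wear across all blocks.

```python
def write_amplification(nand_mb_written, host_mb_written):
    """Data the controller wrote divided by data the host asked to write."""
    return nand_mb_written / host_mb_written


def host_writes_before_wearout(capacity_gb, pe_cycles, waf):
    """Approximate GB the host can write before the NAND is worn out.

    Assumes every block wears evenly (ideal wear leveling).
    """
    return capacity_gb * pe_cycles / waf


print(write_amplification(1.0, 1.0))                 # the perfect case
print(host_writes_before_wearout(80, 10_000, 1.1))   # low amplification
print(host_writes_before_wearout(80, 10_000, 4.0))   # heavy amplification
```

Under these assumptions, going from a factor of 1.1 to a factor of 4 cuts the amount of data you can write over the drive's life by more than two thirds.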


295 Comments


  • valnar - Wednesday, September 2, 2009 - link

    Anyone?
  • antinah - Tuesday, September 1, 2009 - link

    For another great article on the SSD technology.

    I'm considering an Intel G2 for my brand new macbook pro, and if I understand what I've read correctly, performance should not degrade too much although OSX doesn't support trim yet.

    I also doubt Apple will wait too long before they release an update with trim support for osx.

    I just recently switched to mac after a lifetime with pc/windows. Anything I should be aware of when I install the SSD in a mac compared to pc running windows? (other than voiding the warranty and such). I'm thinking precautions regarding swap usage or such.

    Best regards from norway
    Stein
  • medi01 - Tuesday, September 1, 2009 - link

    So I absolutely need to pay 15 times as much per gigabyte as normal HDDs, so that when I start Photoshop, Firefox and WoW, straight after windows boots, it loads whopping 24 seconds faster?

    That's what one calls "absolutely need" indeed and you also chose amazingly common combination of apps.
  • Anand Lal Shimpi - Tuesday, September 1, 2009 - link

    You can look back at the other two major SSD pieces (X25-M Review and The SSD Anthology) for other examples of application launch performance improvements. The point is that all applications launch as fast as possible, regardless of the state of your machine. Whether you're just firing it up from start (which is a valid use scenario as many users do shut off their PCs entirely) or launching an application after your PC has been on for a while, the apps take the same amount of time to start. The same can't be said for a conventional hard drive.

    Take care,
    Anand
  • Seramics - Tuesday, September 1, 2009 - link

    its not abt the 24seconds but rather the wholly different experience of near instantaneous u get wit ssd tht cannot be replicated by hdds
  • medi01 - Tuesday, September 1, 2009 - link

    Nobody starts mentioned apps together directly after boot.

    I've played WoW for a couple of years, and never had to wait dozen of seconds for it to start.

    Most well written applications start almost instantly.

    And the whole "after fresh boot" is not quite a valid option neither, I don't recall when I last switched off my pc, "hibernate" works just fine.

    The "you get completely different experience" MIGHT be a valid point, but it was destroyed by ridiculous choice of apps to start. And I suspect that it is because NOT starting stuff all together and right after boot, didn't show gap as big.
  • kunedog - Tuesday, September 1, 2009 - link

    Anand, I think your article titled "Intel Forces OCZ's Hand: Indilinx Drives To Drop in Price" (http://www.anandtech.com/storage/showdoc.aspx?i=36...) could also use a follow-up, primarily to explain why the opposite has happened (especially with the Intel drives). Is this *all* attributable to Intel's disaster of a product launch? Maybe not, but in any case it deserves more attention than a brief mention at the end of this article.
  • zero2espect - Tuesday, September 1, 2009 - link

    great work again. it's for this reason that i've been coming here for ages. great analysis, great writing and an understanding about what we're all looking for.

    one thing that you may have overlooked is the difference in user experience due to the lack of hdd "buzz". fortunate enough to find myself in possession of a couple of g2160gb jobbies, one is in my gaming rig and the other in the work notebook. using the notebook the single biggest difference is speed (it makes a 18mo old notebook seems like it performs as fast as a current generation desktop) but the next biggest and very noticeable difference is the lack of "hum", "buzz", "thrash" and "vibrate" as the drive goes about its business.

    thanks anandtech and thanks intel ;-P
  • Mr Perfect - Tuesday, September 1, 2009 - link

    Anand,

    Would you happen to know if there are different revisions of the G2 drives out? Newegg is listing a 80GB Intel drive with model #SSDSA2MH080G2C1 for $499, and another 80GB Intel with model #SSDSA2MH080G2R5 for $599. They are both marked as 2.5" MLC Retail drives, and as far as I can tell they're both G2. What has a R5 got that a C1 doesn't? The updated firmware maybe?

    Thanks!

    PS, dear Newegg, WTF? 100% plus price premiums? I'm thinking I'll just wait until stock returns and buy from another site just to spite you now....
  • gfody - Tuesday, September 1, 2009 - link

    It looks like the R5 is just a different retail package - shiny box, nuts and a bracket instead of just the brown box.
    Why Newegg is charging an extra $100 for it.. just look at what they're doing with the other prices. I am losing so much respect for Newegg right now. disgusting!
