Why SSDs Care About What You Write: Fragmentation & Write Combining

PC Perspective's Allyn Malventano is a smart dude, just read one of his articles to figure that out. He pieced together a big aspect of how the X25-M worked on his own, a major key to how to improve SSD performance.

You'll remember from the Anthology that SSDs get their high performance by being able to write to multiple flash die across multiple channels in parallel. This works very well for very large files since you can easily split the reads and writes across multiple die/channels.

Here we go to write a 128KB file, it's split up and written across multiple channels in our tiny mock SSD:

When we go to read the file, it's read across multiple channels and performance is once again, excellent.

Remember what we talked about before however: small file random read/write performance is actually what ends up being slowest on hard drives. It's what often happens on a PC and thus we run into a problem when performing such an IO. Here we go to write a 4KB file. The smallest size we can write is 4KB and thus it's not split up at all, it can only be written to a single channel:

As Alyn discovered, Intel and other manufacturers get around this issue by combining small writes into larger groups. Random writes rarely happen in a separated manner, they come in bursts with many at a time. A write combining controller will take a group of 4KB writes, arrange them in parallel, and then write them together at the same time.

This does wonders for improving random small file write performance, as everything completes as fast as a larger sequential write would. What it hurts is what happens when you overwrite data.

In the first example where we wrote a 128KB file, look what happens if we delete the file:

Entire blocks are invalidated. Every single LBA in these blocks will come back invalid and can quickly be cleaned.

Look at what happens in the second example. These 4KB fragments are unrelated, so when one is overwritten, the rest aren't. A few deletes and now we're left with this sort of a situation:

Ugh. These fragmented blocks are a pain to deal with. Try to write to it now and you have to do a read-modify-write. Without TRIM support, nearly every write to these blocks will require a read-modify-write and send write amplification through the roof. This is the downside of write combining.

Intel's controller does its best to recover from these situations. That's why its used random write performance is still very good. Samsung's controller isn't very good at recovering from these situations.

Now you can see why performing a sequential write over the span of the drive fixes a fragmented drive. It turns the overly fragmented case into one that's easy to deal with, hooray. You can also see why SSD degradation happens over time. You don't spend all day writing large sequential files to your disk. Instead you write a combination of random and sequential, large and small files to the disk.

The Cleaning Lady and Write Amplification A Wear Leveling Refresher: How Long Will My SSD Last?
POST A COMMENT

296 Comments

View All Comments

  • sotoa - Friday, September 04, 2009 - link

    Another great article. You making me drool over these SSD's!
    I can't wait till Win7 comes to my door so I can finally get an SSD for my laptop.
    Hopefully prices will drop some more by then and Trim firmware will be available.
    Reply
  • lordmetroid - Thursday, September 03, 2009 - link

    I use them both because they are damn good and explanatory suffixes. It is 2009, soon 2010 I think we can at least get the suffixes correct, if someone doesn't know what they mean, wikipedia has answers. Reply
  • AnnonymousCoward - Saturday, September 05, 2009 - link

    As someone who's particular about using SI and being correct, I think it's better to stick to GB for the sake of simplicity and consistency. The tiny inaccuracy is almost always irrelevant, and as long as all storage products advertise in GB, it wouldn't make sense to speak in terms of GiB. Reply
  • Touche - Thursday, September 03, 2009 - link

    Both articles emphasize Intel's performance lead, but, looking at real world tests, the difference between it and Vertex is really small. Not hardly enough to justify the price difference. I feel like the articles are giving an impression that Intel is in a league of its own when in fact it's only marginally faster. Reply
  • smjohns - Tuesday, September 08, 2009 - link

    This is where I struggle. It is all very well quoting lots of stats about all these drives but what I really want to know is if I went for Intel over the OCZ Vertex (non-turbo) where would I really notice the difference in performance on a laptop?

    Would it be slower start up / shut down?
    Slower application response times?
    Speed at opening large zipped files?
    Copying / processing large video files?

    If the difference is that slim then I guess it is down to just a personal preference....
    Reply
  • morrie - Thursday, September 03, 2009 - link

    I've made it a habit of securely deleting files by using "shred" like this: shred -fuvz, and accepting the default number of passes, 25. Looks like this security practice is now out, as the "wear" on the drive would be at least 25x faster, bringing the stated life cycles closer to having an impact on drive longevity. So what's the alternative solution for securely deleting a file? Got to "delete" and forget about security? Or "shred" with a lower number of passes, say 7 or 10, and be sure to purchase a non-Intel drive with the ten year warranty and hope that the company is still in business, and in the hard drive business, should you need warranty service in the outer years... Reply
  • Rasterman - Wednesday, September 16, 2009 - link

    watching too much CSI, there is an article somewhere i read by a data repair tech who works in one of the multi-million dollar data recovery labs, basically he said writing over it once is all you should do and even that is overkill 99% of the time. theoretically it is possible to even recover that _sometimes_, but the expense required is so high that unless you are committing a billion dollar fraud or are the secretary to osama bin laden no one will ever try to recover such data. chances are if you are in such circles you can afford a new drive 25x more often. and if you have such information or knowledge wouldn't be far easier and cheaper to simply beat it out of you than trying to recover a deleted drive? Reply
  • iamezza - Friday, September 04, 2009 - link

    1 pass should be sufficient for most purposes. Unless you happen to be working on some _extremely_ sensitive/important data. Reply
  • derkurt - Thursday, September 03, 2009 - link

    quote:

    So what's the alternative solution for securely deleting a file?


    I may be wrong on this, but I'd assume that once TRIM is enabled, a file is securely deleted if it has been deleted on the filesystem level. However, it might depend on the firmware when exactly the drive is going to actually delete the flash blocks which are marked as deletable by TRIM. For performance reasons the drive should do that as soon as possible after a TRIM command, but also preferably at a time when there is not much "action" going on - after all, the whole point of TRIM is to change the time of block erasing flash cells to a point where the drive is idle.
    Reply
  • morrie - Thursday, September 03, 2009 - link

    That's on a Linux system btw

    As to aligning drives...how about an update to the article on what needs to be done/ensured, if anything, for using the drives with a Linux OS?
    Reply

Log in

Don't have an account? Sign up now