The Performance Degradation Problem

When Intel first released the X25-M, Allyn Malventano discovered a nasty corner case in which the drive could no longer run at its full potential. You had to hammer on the drive with tons of random writes for at least 20 minutes, but eventually the drive would reach a point of no return: performance remained low until you secure erased the drive.

Although the exact scenario shouldn't appear in real world use, the worry was that over time a similar set of conditions could align, leaving the X25-M performing slower than it should. Intel, having had much experience with similar types of problems (e.g. FDIV, the 1.13GHz Pentium III), immediately began working on a fix and released it a couple of months after launch. The fix was nondestructive, although you saw much better performance if you secure erased your drive first.

SandForce has a similar problem, and I have you all and bit-tech to thank for pointing it out. In bit-tech's SandForce SSD reviews they test TRIM functionality by filling a drive with actual data (from a 500GB source including a Windows install, pictures, movies, documents, etc.). The drive is then TRIMed, and performance is measured.

If you look at bit-tech's charts you'll notice that after going through this process, the SandForce drives no longer recover their performance after TRIM. They are stuck in a lower performance state, leaving the drives much slower when writing incompressible data.

You can actually duplicate the bit-tech results without going through all of that trouble. All you need to do is write incompressible data to all pages of a SandForce drive (user accessible LBAs + spare area), TRIM the drive and then measure performance. You'll get virtually the same results as bit-tech:

AS-SSD Incompressible Write Speed
Drive                     | Clean      | Dirty (All Blocks + Spare Area Filled) | After TRIM
SandForce SF-1200 (120GB) | 131.7 MB/s | 70.3 MB/s                              | 71 MB/s
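
If you want to duplicate the torture test yourself, the process is roughly the sketch below (Python on Linux; the device name is hypothetical, blkdiscard is the util-linux whole-device TRIM tool, and it goes without saying that this destroys everything on the drive):

```python
import os
import subprocess

DEV = "/dev/sdX"          # hypothetical device name; change before running
CHUNK = 4 * 1024 * 1024   # write in 4MB chunks

def fill_with_random(dev):
    # os.urandom() output is incompressible, so the controller can't
    # compress or dedupe its way out of actually writing the data
    with open(dev, "wb", buffering=0) as f:
        try:
            while True:
                f.write(os.urandom(CHUNK))
        except OSError:   # no space left: we've hit the end of the device
            pass

# Two full passes: the second forces the controller to dip into its
# spare area, so no clean blocks remain anywhere on the drive.
for _ in range(2):
    fill_with_random(DEV)

# TRIM every LBA on the drive, then re-run your write benchmark.
subprocess.run(["blkdiscard", DEV], check=True)
```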

The question is why.

I spoke with SandForce about the issue late last year. To understand the cause we need to remember how SSDs work. When you go to write to an SSD, the controller must first determine where to write. When a drive is completely empty, this decision is pretty easy to make. When a drive isn't full from the end user's perspective but all NAND pages are occupied (i.e. a very well used drive), the controller must first supply a clean/empty block for you to write to.

When you fill an SF drive with incompressible data, you're filling all user-addressable LBAs as well as all of the drive's spare area. When the SF controller gets a request to overwrite one of these LBAs, the drive first has to clean a block and then write to it. It's this block recycling path that causes the aforementioned problem.

In the SF-1200, SandForce can only clean/recycle blocks at a rate of around 80MB/s. Typically this isn't an issue because you won't be in a situation where you're writing to a completely full drive (all user LBAs + spare area occupied with incompressible data). However, if you do create an environment where all blocks have data in them (which can happen over time) and then attempt to write incompressible data, the SF-1200 will be limited by its block recycling path.
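
A quick back-of-the-envelope model shows why the recycling path becomes the ceiling. To be clear, the function below is my own illustration and not SandForce's actual algorithm; only the numbers come from the measurements above:

```python
# Toy model of the write path: with clean blocks on hand the drive writes
# at full NAND speed; once every block holds data, each write must wait
# on the ~80MB/s recycling path, the slowest stage in the pipeline.
def effective_write_speed(nand_write_mbps, recycle_mbps, clean_blocks_available):
    if clean_blocks_available:
        return nand_write_mbps              # fast path: write directly
    # slow path: every block must be recycled before it can be rewritten,
    # so throughput can't exceed the recycling rate
    return min(nand_write_mbps, recycle_mbps)

print(effective_write_speed(131.7, 80, True))    # 131.7 MB/s on a clean drive
print(effective_write_speed(131.7, 80, False))   # 80 MB/s ceiling on a full drive
```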

So why doesn't TRIMing the entire drive restore performance?

Remember what TRIM does. The TRIM command simply tells the controller which LBAs are no longer needed by the OS. It doesn't physically remove data from the SSD; it just tells the controller that it can remove the aforementioned data at its own convenience and in accordance with its own algorithms.

The best drives clean dirty blocks as late as possible without impacting performance, since aggressive garbage collection only increases write amplification and wear on the NAND, and we've already established that SandForce doesn't garbage collect aggressively. Pair a conservative garbage collection/block recycling algorithm with an attempt to write tons of incompressible data to an already full drive and you back yourself into a corner where the SF-1200 remains bottlenecked by its block recycling path. The only way to restore performance at this point is to secure erase the drive.
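
A toy flash translation layer makes this concrete. This is a deliberate simplification, not SandForce's design, but it captures the corner case: trim() only invalidates mappings, blocks are erased lazily when a write actually needs one, and so a drive with no clean blocks left stays on the slow recycling path even after a full-drive TRIM:

```python
class ToyFTL:
    def __init__(self, num_blocks):
        self.map = {}                         # LBA -> block holding its data
        self.clean = set(range(num_blocks))   # erased blocks, ready for writes
        self.dirty = set()                    # blocks containing (possibly stale) data

    def trim(self, lba):
        # TRIM only invalidates the mapping; no block gets erased here
        self.map.pop(lba, None)

    def write(self, lba):
        if self.clean:
            blk = self.clean.pop()    # fast path: a clean block is on hand
        else:
            blk = self._recycle()     # slow path: the ~80MB/s bottleneck
        self.dirty.add(blk)
        self.map[lba] = blk

    def _recycle(self):
        # lazy erase: reclaim a dirty block only when a write demands it;
        # doing this in the background would cost extra write amplification
        return self.dirty.pop()

ftl = ToyFTL(num_blocks=4)
for lba in range(4):
    ftl.write(lba)      # fill every block with data
for lba in range(4):
    ftl.trim(lba)       # TRIM the whole "drive"...
ftl.write(0)            # ...yet this write still takes the slow recycle path
```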

This is a real world performance issue on SF-1200 drives. Over time you'll find that when you copy a highly compressed file (e.g. H.264 video), performance drops to around 80MB/s. The rest of the drive's performance remains as high as always; the issue only impacts data that can't be further compressed/deduped by the SF controller. While SandForce has attempted to alleviate the problem in the SF-1200, I haven't seen any real improvements with the latest firmware updates. If you're using your SSD primarily to copy and store highly compressed files, you'll want to consider another drive.

Luckily for SandForce, the SF-2500 controller alleviates the problem. Here I'm running the same test as above: filling all blocks of the Vertex 3 Pro with incompressible data and then measuring sequential write speed. There's a performance drop, but it's nowhere near as significant as what we saw with the SF-1200:

AS-SSD Incompressible Write Speed
Drive                     | Clean      | Dirty (All Blocks + Spare Area Filled) | After TRIM
SandForce SF-1200 (120GB) | 131.7 MB/s | 70.3 MB/s                              | 71 MB/s
SandForce SF-2500 (200GB) | 229.5 MB/s | 230.0 MB/s                             | 198.2 MB/s

It looks like SandForce has increased the speed of its block recycling engine, among other things, resulting in a much more respectable worst case scenario of ~200MB/s.

Verifying the Fix

I was concerned that perhaps SandForce simply optimized for the manner in which AS-SSD and Iometer write incompressible data. In order to verify the results I took a 6.6GB 720p H.264 movie and copied it from an Intel X25-M G2 SSD to one of two SF drives. The first was a SF-1200 based Corsair Force F120, and the second was an OCZ Vertex 3 Pro (SF-2500).
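
The copy timing itself is nothing exotic; a sketch like the following (with hypothetical paths) is all it takes, with an fsync so we time the SSD rather than the OS write-back cache:

```python
import os
import shutil
import time

SRC = "/mnt/x25m/movie_720p.mkv"       # hypothetical path on the source SSD
DST = "/mnt/sandforce/movie_720p.mkv"  # hypothetical path on the drive under test

size_mb = os.path.getsize(SRC) / (1024 * 1024)
start = time.time()
with open(SRC, "rb") as src, open(DST, "wb") as dst:
    shutil.copyfileobj(src, dst, length=1024 * 1024)  # copy in 1MB chunks
    dst.flush()
    os.fsync(dst.fileno())  # force the data to the SSD before stopping the clock
elapsed = time.time() - start
print(f"{size_mb / elapsed:.1f} MB/s")
```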

I measured both clean performance as well as performance after I'd filled all blocks on the drive. The results are below:

6.6GB 720p H.264 File Copy (X25-M G2 Source to Destination)
Drive                     | Clean      | Dirty (All Blocks + Spare Area Filled) | After TRIM
SandForce SF-1200 (120GB) | 138.6 MB/s | 78.5 MB/s                              | 81.7 MB/s
SandForce SF-2500 (200GB) | 157.5 MB/s | 158.2 MB/s                             | 157.8 MB/s

As expected, the SF-1200 drive drops from 138.6MB/s down to 78.5MB/s. The drive is bottlenecked by its block recycling path, and even after a TRIM performance never recovers beyond 81.7MB/s.

The SF-2500, however, doesn't drop in performance. Brand new performance is at 157MB/s, and post-torture it's still at 157MB/s. What's interesting is that incompressible file copy performance here is lower than what Iometer and AS-SSD would have you believe. Iometer warns that even its fully random data pattern can be defeated by drives with good data deduplication algorithms, and unless there's another bottleneck at work here, it looks like the SF-2500 is still reducing the data that Iometer writes to the drive. The AS-SSD comparison actually makes a bit more sense: AS-SSD runs at a queue depth of 32 while this simple file copy runs mostly at a queue depth of 1, and higher queue depths make better use of the parallel NAND channels, resulting in better performance.

Comments

  • bigboxes - Thursday, February 17, 2011

    Anand, I know you mentioned read/write cycles and having your data last a year after your final write. Is the future of SSDs going to allow long-term storage on these devices? Will our data last longer than a year in storage or in use as read-only? I figured when cost went down and capacity went up we'd start seeing SSDs truly replace HDDs as the medium of long-term storage. Any insights into the (near) future?
  • marraco - Thursday, February 17, 2011

    We need a roundup of SATA 6Gb controllers on AMD and Intel.

    How do added cards perform against integrated SATA 6Gb?
  • jwilliams4200 - Thursday, February 17, 2011

    Here are the numbers given in the AS-SSD incompressible write speed chart for
    SF-2500 (clean, dirty, after TRIM):

    229.5 MB/s 230.0 MB/s 198.2 MB/s

    Logically, I would expect the dirty number to be less than or equal to the after-TRIM number. Is there a typo here?
  • jwilliams4200 - Thursday, February 17, 2011

    Anand:

    Could you run the data files for your 2011 storage bench (heavy and light cases) through a couple of standard compression programs and report the compressed and uncompressed file sizes? That would be useful information to know when evaluating the performance of Sandforce SSDs on your storage benchmark.
  • Chloiber - Thursday, February 17, 2011

    Indeed, this would be an important piece of information.
  • mstone29 - Thursday, February 17, 2011

    It's been out for a few weeks and the performance is on par w/ the OCZ V3.

    Does OCZ pay better?
  • Anand Lal Shimpi - Sunday, February 20, 2011

    We're still waiting for our Corsair P3 sample, as soon as we get it you'll see a review :)

    Take care,
    Anand
  • gotFrosty - Thursday, February 17, 2011

    I personally will never buy from OCZ ever again... The way they are treating customers (including me) with this shady marketing scandal. Never will I deal with them. Never. Who is to say they won't pull this crap somewhere down the line again.
    They changed the way they manufacture the drives. OK, that's well and fine, but at least change the product number/name so that end users can distinguish between the products. Right now I'm sitting with a drive that they can't tell me whether it's the slower 25nm or the 34nm version. What kind of crap is that? I can't tell either, because my build is waiting on the P67s to get fixed. Oh, and they still market the drive as the same Vertex 2 that got all the great reviews.
    Let's just say I'm a little irritated with the whole scheme. I feel robbed.
  • Mr Perfect - Friday, February 18, 2011

    Just stumbled across the whole Vertex 2 issue myself. Link to an explanation of what Frosty is mad about below:

    http://www.storagereview.com/ocz_issues_mea_culpa_...

    I'm not impressed with OCZ right now. Anand, any way you could talk to OCZ about this issue?
  • db808 - Thursday, February 17, 2011

    Hi Anand,

    Thanks for another great SSD article. I own an OCZ Vertex 2 for my personal use, and I have been doing some testing of SSDs for work use.

    I have a few questions/comments that will probably stir up some additional discussion.

    1) You present a good description of your personal workload's write volume at 7 GB/day, and how even with that heavy amount of activity, the SSD life expectancy is much greater than the warranty period.

    Did you ever try to correlate this with the life expectancy (or read and write activity) reported by the SSD using the SMART attributes?

    In my first 3 weeks using a new Vertex 2 SSD as my boot disk, I averaged over 18 GB/day of write activity ... much greater than your reported 7 GB/day.

    I can't speak for other Sandforce implementations, but the OCZ Vertex 2 does report a wide variety of useful statistics via vendor-specific SMART attributes. These statistics can be displayed using the OCZ Toolbox:
    http://www.ocztechnologyforum.com/forum/showthread...

    I don't know if other SSD vendors have similar information. Crystal Disk Info (http://crystalmark.info/software/CrystalDiskInfo/i... also displays and formats many of the vendor-specific fields, but I don't know if it specifically displays the extended info for specific SSDs.

    Using the OCZ Toolbox (which works with all OCZ Sandforce SSDs), you can display a lot of interesting information. Here are the statistics for the first 3 weeks of usage of my SSD. No real benchmarking, just the initial install of Windows 7 64-bit and then all the apps that I run. My 120 GB SSD is about half full, including an 8 GB page file and an 8 GB hibernate file. I also relocated my Windows search index off the SSD. Temp IS on the SSD (my choice).

    SMART READ DATA
    Revision: 10
    Attributes List
    1: SSD Raw Read Error Rate Normalized Rate: 100 total ECC and RAISE errors
    5: SSD Retired Block Count Reserve blocks remaining: 100%
    9: SSD Power-On Hours Total hours power on: 351
    12: SSD Power Cycle Count Count of power on/off cycles: 84
    171: SSD Program Fail Count Total number of Flash program operation failures: 0
    172: SSD Erase Fail Count Total number of Flash erase operation failures: 0
    174: SSD Unexpected power loss count Total number of unexpected power loss: 19
    177: SSD Wear Range Delta Delta between most-worn and least-worn Flash blocks: 0
    181: SSD Program Fail Count Total number of Flash program operation failures: 0
    182: SSD Erase Fail Count Total number of Flash erase operation failures: 0
    187: SSD Reported Uncorrectable Errors Uncorrectable RAISE errors reported to the host for all data access: 0
    194: SSD Temperature Monitoring Current: 1 High: 129 Low: 127
    195: SSD ECC On-the-fly Count Normalized Rate: 100
    196: SSD Reallocation Event Count Total number of reallocated Flash blocks: 0
    231: SSD Life Left Approximate SSD life remaining: 100%
    241: SSD Lifetime writes from host Number of bytes written to SSD: 384 GB
    242: SSD Lifetime reads from host Number of bytes read from SSD: 832 GB

    For my first 3 weeks, using the PC primarily after work and on weekends, I averaged 18.2 GB/day of write activity ... or 384 GB total.

    You may want to re-assess the classification of your 7 GB/day workload as "heavy". I don't think my 18.2 GB/day workload was extra heavy. My system has 8 GB of memory and typically runs with 2-3 GB used, so I don't believe there is a lot of activity to the page file. I have a hibernate file because I use a UPS, and it allows me to "resume" after a power blip vs. a full shutdown.

    Well ... back to the point .... The OCZ Toolbox reports an estimated remaining life expectancy. I have not run my SSD long enough to register 1% usage yet, but I will be looking at what volume of total write activity finally triggers the disk to report only 99% remaining life.

    I don't know if the OCZ Toolbox SMART reporting will work with non-OCZ Sandforce-based SSDs.

    If you can get a life expectancy value from your Sandforce SSDs, it would be interesting to see how it correlates with your synthetic estimates.

    Thanks again for a great article!
