Measuring How Long Your Intel SSD Will Last

Earlier in this review I talked about Intel's SMART attributes that allow you to accurately measure write amplification for a given workload. If you fire up Intel's SSD Toolbox, or any tool that lets you monitor SMART attributes, you'll notice a few fields of interest. I mentioned these back in our 710 review, but the most important for our investigation here are E2 (226) and E4 (228):

The raw value of attribute E2, when divided by 1024, gives you an accurate report of the amount of wear on your NAND since the last timer reset. In this case we're looking at an Intel X25-M G2 (the earliest drive to support E2 reporting) whose E2 value is 9755. Dividing that by 1024 gives us 9.526% (the field is only accurate to three decimal places).
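
If you want to script the conversion, the math is exactly as described above; here's a minimal Python sketch using the X25-M G2's reading (the variable names are mine):

    e2_raw = 9755             # raw E2 (226) value read from the drive above
    wear_pct = e2_raw / 1024  # Intel defines E2 raw / 1024 as a percentage
    print(f"NAND wear since last timer reset: {wear_pct:.3f}%")  # -> 9.526%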

I mentioned that this data is only accurate since the last timer reset; that's where the value stored in E4 comes into play. By issuing a SMART EXECUTE OFFLINE IMMEDIATE subcommand (40h) to the drive you reset the timer in E4 along with the data stored in E2 and E3. The data in E2/E3 will then reflect the wear incurred since you reset the timer, giving you a great way of measuring write amplification for a specific workload.

How do you reset the E4 timer? I've always used smartmontools to do it. Download the appropriate binary from SourceForge and execute the following command:

    smartctl -t vendor,0x40 /dev/hdX

where X is the drive whose counter you're trying to reset (e.g. hda, hdb, hdc, etc.).

Doing so will reset the E4 counter to 65535; the counter then rolls over to 0 and counts up in minutes. After the first 60 minutes you'll get valid data in E2/E3. While E2 gives you an indication of how much wear your workload puts on the NAND, E3 gives you the percentage of IO operations that are reads since you reset E4. E3 is particularly useful for determining at a quick glance how write-heavy your workload is. Remember, it's the process of programming/erasing a NAND cell that is most destructive; read-heavy workloads are generally fine on consumer grade drives.
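
Once the timer is running you can poll all three attributes programmatically. Here's a rough Python sketch of what that looks like, assuming smartctl is installed and produces its usual attribute-table layout (the device name and the parsing details are my assumptions; verify them against your own drive's output):

    import subprocess

    DEVICE = "/dev/sda"  # hypothetical device node; point this at your drive

    # Grab the SMART attribute table. Parsing assumes smartctl's usual layout
    # (attribute ID in the first column, raw value in the last) and plain
    # decimal raw values for 226/227/228; other versions/drives may differ.
    out = subprocess.run(["smartctl", "-A", DEVICE],
                         capture_output=True, text=True).stdout

    raw = {}
    for line in out.splitlines():
        fields = line.split()
        if fields and fields[0] in ("226", "227", "228"):
            raw[int(fields[0])] = int(fields[-1])

    minutes = raw[228]                 # E4: minutes since the timer was reset
    if minutes < 60 or minutes == 65535:
        print("E4 hasn't run for a full hour yet; E2/E3 aren't valid")
    else:
        print(f"E2 media wear since reset: {raw[226] / 1024:.3f}%")
        print(f"E3 reads since reset: {raw[227]}% of IO operations")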

I reset the workload timer (E4) on all of the Intel SSDs that supported it and ran a loop of our MS SQL Weekly Maintenance benchmark, resulting in 320GB of writes to each drive. I then measured wear on the NAND (E2) and used that to calculate how many TBs we could write to these drives, using this workload, before we'd theoretically wear out their NAND. The results are below:

Drive Lifespan - MS SQL Weekly Maintenance Benchmark
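
The extrapolation itself is simple arithmetic: if 320GB of writes consumed some fraction of the NAND's rated wear, total endurance is 320GB divided by that fraction. A quick Python sketch with a hypothetical E2 reading (chosen to land near the ~800TB figure discussed below, not a measured value from our runs):

    writes_gb = 320                        # host writes during the benchmark loop
    e2_raw = 41                            # hypothetical E2 raw value after the run
    wear_fraction = (e2_raw / 1024) / 100  # E2/1024 is a percentage, so divide again

    lifespan_tb = writes_gb / wear_fraction / 1024  # GB of writes -> TB
    print(f"Estimated endurance under this workload: {lifespan_tb:.0f}TB")  # ~780TB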

There are a few interesting takeaways from this data. For starters, Intel's SSD 710 uses high endurance MLC (aka eMLC, MLC-HET), which is good for a significant increase in p/e cycles. With tens of thousands of p/e cycles per NAND cell, the Intel SSD 710 offers nearly an order of magnitude better endurance than the Intel SSD 320. Part of this endurance advantage is delivered through an incredible amount of spare area. Remember that although the 710 featured here is a 200GB drive, it actually has 320GB of NAND on board. If you set aside a similar amount of spare area on the 320, you'd get a measurable increase in endurance. We actually see an example of that in the gains the 300GB SSD 320 offers over the 160GB drive. Both drives are subjected to the same sized workload (just under 60GB), but the 300GB drive has much more unused area to use for block recycling. The result is an 85% increase in estimated drive lifespan for an 87.5% increase in drive capacity.
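
To put rough numbers on that unused-area point, here's a simplified view of the dynamic spare area in play (this ignores the controller's dedicated spare NAND and is only meant to illustrate the gap):

    dataset_gb = 60  # approximate size of the benchmark data set
    for capacity_gb in (160, 300):
        free_gb = capacity_gb - dataset_gb
        print(f"{capacity_gb}GB SSD 320: ~{free_gb}GB left for block recycling")
    # -> ~100GB vs ~240GB: the larger drive has 2.4x the room to spread writes over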

What does this tell us about how long these drives would last? If all they were doing was running this workload once a week, even the 160GB SSD 320 would be just fine. In reality our SQL server does far more than this, but even then we'd likely be OK with a consumer drive. Note that the 800TB of writes for the 160GB 320 is well above the 15TB Intel rates the drive for. The difference is that Intel calculates lifespan based on 4KB random writes with a very high write amplification. If we work backwards from these numbers for the MLC drives, we end up with around 4000-5000 p/e cycles. In reality even 25nm Intel NAND lasts longer than it's rated for; what you're seeing is that this workload has a very low write amplification on these drives, thanks to its access pattern and its small size relative to the capacity of these drives.
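
For reference, the backwards math is: implied p/e cycles ≈ (total host writes × write amplification) / capacity. A sketch using the 160GB SSD 320's estimate, with write amplification assumed to be ~1.0 (my assumption for illustration, not a measured figure):

    endurance_tb = 800   # estimated lifespan of the 160GB SSD 320
    capacity_gb = 160
    write_amp = 1.0      # assumed, not measured

    pe_cycles = endurance_tb * 1024 / capacity_gb * write_amp
    print(f"Implied p/e cycles: {pe_cycles:.0f}")  # ~5120 at WA = 1.0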

Every workload is going to be different, but what may have been a brutal consumer of IOs in the past may still be right at home on consumer SSDs in your server.

Comments

  • Anand Lal Shimpi - Thursday, February 9, 2012 - link

    Given enough spare area and a good enough SSD controller, TRIM isn't as important. It's still nice to have, but it's more of a concern on a drive where you're running much closer to capacity. Take the Intel SSD 710 in our benchmarks for example. We're putting a ~60GB data set on a 200GB drive with 320GB of NAND. With enough spare area it's possible to maintain low write amplification without TRIM. That's not to say that it's not valuable, but for the discussion today it's not at the top of the list.

    The beauty of covering the enterprise SSD space is that you avoid a lot of the high write amp controllers to begin with and extra spare area isn't unheard of. Try selling a 320GB consumer SSD with only 200GB of capacity and things look quite different :-P

    Take care,
    Anand
  • Stuka87 - Wednesday, February 8, 2012 - link

    Great article Anand, I have been waiting for one like this. It will really come in handy to refer back to myself, and to refer others to when they ask about SSDs in an enterprise environment.
  • Iketh - Thursday, February 9, 2012 - link

    Anand's nickname should be Magnitude or the OOM Guy.
  • wrednys - Thursday, February 9, 2012 - link

    What's going on with the media wear indicator on the first screenshot? 656%?
    Or is the data meaningless before the first E4 reset?
  • Kristian Vättö - Thursday, February 9, 2012 - link

    Great article Anand, very interesting stuff!
  • ssj3gohan - Thursday, February 9, 2012 - link

    So... something I'm missing entirely in the article: what is your estimate of write amplification for the various drives? Like you said in another comment, typical workloads on SandForce usually see WA < 1.0, while in this article it seems to be squarely above 1. Why is that, what is your estimate of the exact value, and can you show us a workload that would actually benefit from SandForce?

    This is very important, because with any reliability qualms out of the way the Intel SSD 520 could be a solid recommendation for certain kinds of workloads. This article does not show any benefit to the 520.
  • Christopher29 - Thursday, February 9, 2012 - link

    Members of the forum linked below are testing (Anvil) SSDs with VERY extreme workloads. An X25-V 40GB (Intel drive) already has 685TB of writes! This is WAY more than the 5TB suggested by Intel. They also fill the drives completely! This means that your 120GB SSDs (even limited to 100GB) could withstand almost 1PB of writes. One of their 40GB Intel 320s failed after writing 400TB!
    A Corsair Force 3 120GB already has 1050TB of writes! You should reconsider your assumptions, because it seems that those drives (and large ones especially) will last much longer.

    Stats for today:
    - Intel 320 40GB – 400TB (dead)
    - Samsung 470 64GB – 490TB (dead)
    - Crucial M4 64GB – 780TB (dead)
    - Crucial M225 60GB – 840TB (dead)
    - Corsair F40A - 210TB (dead)
    - Mushkin Chronos Deluxe 60GB – 480TB (dead)
    - Corsair Force 3 120GB – 1050TB (1 PB! and still going)
    - Kingston SSDNow 40GB (X25-V) (34nm) - 640TB

    SOURCE:
    http://www.xtremesystems.org/forums/showthread.php...
  • Christopher29 - Thursday, February 9, 2012 - link

    PS: Also, interestingly, the Force 3 (which lasted longest) is an SF-2281 drive. So what is it in reality, Anand: does this mean that SandForce drives write less and therefore last longer?
  • Death666Angel - Thursday, February 9, 2012 - link

    In every sentence, he commented how he was being conservative and that real numbers would likely be higher. However, given the sensitive nature of business data/storage needs, I think most of them are conservative and rightly so. The mentioned p/e cycles are also just estimates and likely vary a lot. Without anyone showing 1000 Force 3 drives doing over 1PB, that number is pretty much useless for such an article. :-)
  • Kristian Vättö - Thursday, February 9, 2012 - link

    I agree. In this case, it's better to underestimate than overestimate.
