The Unmentionables: NAND Mortality Rate

When Intel introduced its X25-M based on 50nm NAND technology we presented this slide:

A 50nm MLC NAND cell can be programmed/erased 10,000 times before it's dead. The reality is good MLC NAND will probably last longer than that, but 10,000 program/erase cycles was the spec. Update: Just to clarify, once you exceed the program/erase cycles you don't lose your data, you just stop being able to write to the NAND. On standard MLC NAND your data should be intact for a full year after you hit the maximum number of p/e cycles.

When we transitioned to 34nm, the NAND makers forgot to mention one key fact. MLC NAND no longer lasts 10,000 cycles at 34nm - the number is now down to 5,000 program/erase cycles. The smaller you make these NAND structures, the harder it is to maintain their integrity over thousands of program/erase cycles. While I haven't seen datasheets for the new 25nm IMFT NAND, I've heard the consumer SSD grade stuff is expected to last somewhere between 3,000 and 5,000 cycles. This sounds like a very big problem.

Thankfully, it's not.

My personal desktop sees about 7GB of writes per day. That can be pretty typical for a power user and a bit high for a mainstream user but it's nothing insane.

Here's some math I did not too long ago:

My SSD
NAND Flash Capacity                  256 GB
Formatted Capacity in the OS         238.15 GB
Available Space After OS and Apps    185.55 GB
Spare Area                           17.85 GB

If I never install another application and just go about my business, my drive has 203.4GB of space (185.55GB of free space plus 17.85GB of spare area) to spread out those 7GB of writes per day. That means that in roughly 29 days, assuming my SSD wear levels perfectly, I will have written to every single available flash block on my drive. Tack on another 7 days if the drive is smart enough to move my static data around to wear level even more evenly. So we're at approximately 36 days before I exhaust one out of my ~10,000 write cycles. Multiply that out and it would take 360,000 days of using my machine for all of my NAND to wear out; once again, assuming perfect wear leveling. That's 986 years. Your NAND flash cells will actually lose their charge well before that time comes, in about 10 years.

Now that calculation is based on 50nm 10,000 p/e cycle NAND. What about 34nm NAND with only 5,000 program/erase cycles? Cut the time in half - 180,000 days. If we're talking about 25nm with only 3,000 p/e cycles the number drops to 108,000 days.
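The arithmetic above is easy to reproduce. Here is a minimal Python sketch of it, using only the figures from the table and text; the function names are mine, and the 7-day allowance for static data is simply the estimate used above, not something derived.

```python
# Figures from the table and text above.
FREE_SPACE_GB = 185.55        # available space after OS and apps
SPARE_AREA_GB = 17.85         # spare area
HOST_WRITES_GB_PER_DAY = 7.0  # daily writes on the author's desktop
STATIC_DATA_EXTRA_DAYS = 7    # the article's allowance for rotating static data


def days_per_cycle():
    """Days needed to write once to every block, assuming perfect wear leveling."""
    writable_gb = FREE_SPACE_GB + SPARE_AREA_GB  # about 203.4 GB
    return round(writable_gb / HOST_WRITES_GB_PER_DAY) + STATIC_DATA_EXTRA_DAYS


def lifetime_days(pe_cycles):
    """Days until every cell has used up its rated program/erase cycles."""
    return days_per_cycle() * pe_cycles


for node, cycles in [("50nm", 10_000), ("34nm", 5_000), ("25nm", 3_000)]:
    days = lifetime_days(cycles)
    print(f"{node}: {days:,} days (~{days / 365:.0f} years)")
```

Like the numbers above, this assumes perfect wear leveling and no write amplification.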

All of this assumes perfect wear leveling and no write amplification. The best SSDs don't average anywhere near 10x for write amplification; in fact, they come in considerably lower. But even if you are writing 10x as much to the NAND as you're writing from the host, even the worst 25nm compute NAND will comfortably last you through your drive's warranty.
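To put a rough number on that, here is a small Python sketch that divides the ideal lifetime by a write amplification factor. The 36 days per cycle is the estimate from earlier in this section; the function name and the set of factors tried are my own choices for illustration.

```python
def lifetime_years(pe_cycles, write_amplification, days_per_cycle=36):
    """Estimated NAND lifetime in years.

    days_per_cycle=36 is the article's figure for 7GB/day of host writes
    with perfect wear leveling and a write amplification of 1.
    """
    ideal_days = pe_cycles * days_per_cycle
    # Each host write costs write_amplification times as much NAND wear.
    return ideal_days / write_amplification / 365


# Worst-case 25nm NAND (3,000 cycles) at a few amplification factors.
for wa in (1, 2, 5, 10):
    print(f"WA {wa:>2}x: {lifetime_years(3_000, wa):.1f} years")
```

Even at 10x, the 3,000 cycle case works out to 10,800 days, or just under 30 years.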

For a desktop user running a desktop (non-server) workload, the chances of your drive dying within its warranty period due to you wearing out all of the NAND are basically nothing. Note that this doesn't mean that your drive won't die for other reasons before then (e.g. poor manufacturing, controller/firmware issues, etc.), but you don't really have to worry about your NAND wearing out.

This is all in theory, but what about in practice?

Thankfully one of the unwritten policies at AnandTech is to actually use anything we recommend. If we're going to suggest you spend your money on something, we're going to use it ourselves. Not in testbeds, but in primary systems. Within the company we have 5 SandForce drives deployed in real, everyday systems. The longest-running of them has been in service, without TRIM, for the past eight months at between 90 and 100% of its capacity.

SandForce, like some other vendors, exposes a method of actually measuring write amplification and remaining p/e cycles on its drives. Unfortunately the method of doing so for SandForce is undocumented and under strict NDA. I wish I could share how it's done, but all I'm allowed to share are the results.

Remember that write amplification is the ratio of NAND writes to host writes. On all non-SF architectures that number should be greater than 1 (e.g. you go to write 4KB but you end up writing 128KB). Due to SF's real time compression/dedupe engine, it's possible for SF drives to have write amplification below 1.

So how did our drives fare?

The worst write amplification we saw was around 0.6x. Actually, most of the drives we've deployed in house came in at 0.6x. In this particular drive the user (who happened to be me) wrote 1900GB to the drive (roughly 7.7GB per day over 8 months) and the SF-1200 controller in turn threw away 800GB and only wrote 1100GB to the flash. This includes garbage collection and all of the internal management stuff the controller does.

Over this period of time I used only 10 cycles of flash (it was a 120GB drive) out of a minimum of 3000 available p/e cycles. In eight months I only used 1/300th of the lifespan of the drive.
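For anyone who wants to check the numbers, here is a short Python sketch of the same calculation. The inputs are the figures reported above; the function names are hypothetical, and using the 120GB advertised capacity as the divisor is a simplification on my part.

```python
HOST_WRITES_GB = 1900     # what the user wrote to the drive
NAND_WRITES_GB = 1100     # what the SF-1200 actually wrote to flash
DRIVE_CAPACITY_GB = 120   # advertised capacity, used as the wear-leveling pool
RATED_PE_CYCLES = 3000    # minimum rated program/erase cycles


def write_amplification(host_gb, nand_gb):
    """Ratio of NAND writes to host writes."""
    return nand_gb / host_gb


def cycles_used(nand_gb, capacity_gb):
    """Average p/e cycles consumed per cell, assuming perfect wear leveling."""
    return nand_gb / capacity_gb


wa = write_amplification(HOST_WRITES_GB, NAND_WRITES_GB)
used = cycles_used(NAND_WRITES_GB, DRIVE_CAPACITY_GB)
print(f"write amplification: {wa:.2f}x")
print(f"cycles used: {used:.1f} of {RATED_PE_CYCLES}")
print(f"fraction of rated life used: {used / RATED_PE_CYCLES:.4f}")
```

This gives a little over 9 cycles, which rounds to the 10 cycles and roughly 1/300th of the lifespan quoted above.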

The other drives we had deployed internally are even healthier. It turns out I'm a bit of a write hog.

Paired with a decent SSD controller, write lifespan is a non-issue. Note that I only fold Intel, Crucial/Micron/Marvell and SandForce into this category. Write amplification can rise by as much as an order of magnitude with the cheaper controllers. Characterizing this is what I've been spending much of the past six months doing. I'm still not ready to present my findings, but as long as you stick with one of the aforementioned controllers you'll be safe, at least as far as NAND wear is concerned.

 


144 Comments


  • HangFire - Friday, February 18, 2011 - link

    As a software engineer, I can tell you that temp files are used over in-memory storage because the s/w was originally written that way, and no bug report concerning them will ever reach high priority status because it is ranked as a system configuration issue that can be fixed by the user.

    In other words, inertia of the "good enough" file writing code (written when RAM was scarce) will prevent software from being re-written to more optimal in-memory usage. The long backlog of truly important bugs taking precedence ensures that.

    You have a good point about ramdisks competing with disk caching. What is optimal depends on your application load, and to some extent your storage subsystem.
  • cdillon - Thursday, February 17, 2011 - link

    The idea of moving the page-file to a RAM disk makes my head hurt. That's just retarded. You'd do better to turn off paging entirely, but that's also of questionable benefit because paging isn't really that hard on your SSD.

    Putting the temp directory, along with browser caches and other non-critical frequently-written data, on a RAM disk is not a bad idea as long as you don't over-do it. The only problem with putting the temp directory on a volatile software-based RAM drive is that any software installation that requires a reboot, and that expects intermediate installer files kept in the temp directory to still be there afterwards, is going to fail.
  • Qapa - Saturday, February 19, 2011 - link

    Hi Anand,

    I second this request :)

    A few changes though:
    - DISABLE page file
    --- no matter if you have an SSD or HDD, Windows writes to the page file (even if only using 10% of RAM), so you decrease writes to disk, which does 2 things: increases the life of the disk and increases the speed of the system. Possibly both only marginally, but that's what benchs would show;
    - browser caches
    --- for sure this is one of the most wasteful kinds of disk writing, and it should account for more and more writes since we are ever more on the web
    - temporary folders
    --- as someone else mentions, you could run into problems if you need an install-reboot-finish_install kind of installation
    --- and I agree with the sw engineer - if it works it won't get changed, so programs will put stupid stuff in files just because that was the way they did it at some point in time

    I think a 1-2GB RAM disk is more than enough for browser and temp files, considering an initial starting RAM size of 4-8GB. And yes, I do believe this improves system performance.

    Can you do the benchs?

    Thanks for the site - all reviews - and hope you can add this request as another review.
  • shawkie - Thursday, February 17, 2011 - link

    I notice that the Intel SSD 510 has just started to appear on some retailer websites. It looks like it is SATA 6Gb/s and comes in 120GB and 250GB versions. Pricing looks pretty high at this point.
  • BansheeX - Thursday, February 17, 2011 - link

    Color me unexcited. SSD is fast and reliable enough for people to want it. The price per GB isn't coming down anywhere near as fast as other technologies. I paid $200 1.5 years ago for an 80GB SSD drive that goes for $180 today.
  • chrysrobyn - Friday, February 18, 2011 - link

    Maybe 80GB for $200 is good enough for you, but I need twice that capacity, and I'm unwilling to pay more than $200. The next generation of SSDs that are coming out between now and May are going to come far closer to that price point for me.
  • seapeople - Friday, February 18, 2011 - link

    The point is that 1.5 years ago the OP purchased a SSD for $2.5/GB which had anywhere from a 2x-30x performance improvement over its predecessor (HD's), and here we are in 2011 reading a review about the next generation SSD which uses smaller, cheaper flash with half the available write-cycle life which is going to sell for... $5/GB and get a 1.2x-3x performance improvement over its predecessor (initial SSD's).

    What's next? A solid state drive that reads and writes at 2,000 GB/s and sells for $10,000 for the 1 TB model? Oh I can't wait for that.
  • ABR - Saturday, February 19, 2011 - link

    I have to agree. Year after year we see more and more mind-boggling performance improvements over regular HDDs, but little or no price drop. Perhaps the materials costs are just insurmountable and the replacement of HDDs won't be happening after all. SSDs will be like SLR digital cameras -- premium and professional use only, and pricing a previous generation of amateur users out of a market they used to be in.
  • FunBunny2 - Saturday, February 19, 2011 - link

    From what I see: with each feature size drop in the NAND, the controller has to get increasingly byzantine, needs more cache, and so on just to maintain performance. Word is that IMFT 25nm includes an ECC engine on die!!!
  • Aernout - Saturday, February 19, 2011 - link

    Maybe we will hear more about hybrid disks like the Momentus XT from Seagate in the future; for 'standard' users they can offer a lot.
    Right now they have 4GB of flash with 500GB, but it's 10 months old.
    I think lots of people are hoping they will multiply those specs.
    I'm thinking of getting one for my laptop, but on the other hand I am not sure if I will use 500GB on my laptop; maybe I should buy a 64GB SSD instead.
