It's common knowledge by now that as NAND cells shrink in size (thanks to smaller process nodes), their endurance and program/erase latencies both suffer. Consumers wouldn't really be happy with newer drives dying sooner and performing worse than their predecessors, so controller and NAND makers have to work extra hard to compensate for losses due to the physics of NAND.

In the second half of this year we saw the beginning of a transition from Intel's 25nm MLC NAND to a 20nm process. The smaller process will eventually allow us to have larger SSDs with 16GB NAND die (up from the current 8GB max), but it should also help drive SSD prices down as you can fit more 20nm NAND cells per 300mm wafer than you can at 25nm. The cost motivation alone will move production from 25nm to 20nm in fairly short order. The question remains: how is endurance impacted by the move to 20nm? When Kristian reviewed Intel's SSD 335, he tried to find out. 

The 335 is Intel's first branded drive to ship with 20nm 2-bit-per-cell MLC NAND. Like most modern Intel SSDs, the drive reports total NAND writes as well as the percentage of program/erase cycles remaining on the NAND. By looking at both values you can get a general idea of the expected lifespan of the NAND in number of program/erase cycles.

In his 335 review, Kristian calculated the endurance of Intel's 20nm NAND on the 335 to be below 1,000 program/erase cycles. Intel confirmed that its 20nm MLC NAND is rated at 3,000 p/e cycles and that Kristian's results shouldn't have happened. Others duplicated the results around the web, so we waited for an explanation from Intel. Today we have that explanation.

The Media Wearout Indicator (MWI) on Intel SSDs looks at the total number of times the drive's NAND has been cycled and then divides it by the p/e rating for the NAND to determine its value. The p/e rating is hard-coded into the firmware for the NAND. The 335 we reviewed had the p/e rating set to 1500 cycles, which was accurate for an earlier, non-production version of Intel's 20nm NAND. The 335 should have shipped with firmware that set this value to 3000 cycles, but someone forgot to set the variable to the right value in the firmware code. Whoops. As a result, the MWI value on the 335 decreases at 2x the rate it should. This doesn't mean the drive is wearing out twice as fast as it should, just that the MWI data is inaccurate. Take our numbers from the 335 review and double them to get how long the 335 may last.
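To make the 2x effect concrete, here is a minimal Python sketch of the calculation as described above. The function name, the 100-to-0 scale, and the simple linear model are assumptions for illustration only; Intel's actual firmware logic isn't public, and the 300-cycle input is a made-up example value.

```python
def mwi_remaining(avg_pe_cycles: float, rated_pe_cycles: int) -> float:
    """Hypothetical MWI model: percent of rated p/e cycles still remaining.

    Assumes a linear scale from 100 (new) down to 0 (rating reached).
    """
    if rated_pe_cycles <= 0:
        raise ValueError("rated_pe_cycles must be positive")
    fraction_used = avg_pe_cycles / rated_pe_cycles
    # Clamp at 0: the NAND may keep working past its rating,
    # but the indicator has nowhere lower to go.
    return max(0.0, 100.0 * (1.0 - fraction_used))


cycles_so_far = 300  # example value, not a measurement

# Same wear, two different hard-coded ratings.
shipped = mwi_remaining(cycles_so_far, 1500)   # rating in the shipping firmware
intended = mwi_remaining(cycles_so_far, 3000)  # rating Intel says is correct

# Ratio of reported wear between the two firmware settings.
wear_ratio = (100.0 - shipped) / (100.0 - intended)
print(f"shipped: {shipped:.1f}, intended: {intended:.1f}, ratio: {wear_ratio:.1f}x")
```

Under this model the NAND itself sees identical wear in both cases; only the divisor, and therefore the reported value, changes.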


MWI from our 335 review, incorrectly low

Kristian's review showed that to be roughly 250TB of writes, which means the actual value is around 500TB of NAND writes (incompressible). Doing the math on the 240GB capacity gives us 2083 full drive writes over the life of the drive, or about 5.7 years of useful life if you write 240GB of data to the NAND every day. Even if your workload has a write amplification factor of 10x, you're still talking about 24GB of host writes per day for nearly 6 years.
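The arithmetic above can be reproduced with a short Python sketch. It assumes decimal units (1TB = 1000GB) and a 365-day year; the function and variable names are purely illustrative.

```python
def drive_lifetime(total_nand_writes_tb: float,
                   capacity_gb: float,
                   nand_writes_per_day_gb: float) -> tuple[float, float]:
    """Return (full drive writes, years of life) for a given NAND write budget."""
    total_gb = total_nand_writes_tb * 1000  # assumes 1TB = 1000GB
    full_drive_writes = total_gb / capacity_gb
    years = total_gb / nand_writes_per_day_gb / 365
    return full_drive_writes, years


# 500TB of NAND writes on a 240GB drive, writing 240GB to the NAND per day.
full_writes, years = drive_lifetime(500, 240, 240)

# With 10x write amplification, 240GB of NAND writes per day corresponds
# to this many GB of host writes per day.
write_amplification = 10
host_writes_per_day_gb = 240 / write_amplification

print(f"{full_writes:.0f} full drive writes, {years:.1f} years")
print(f"{host_writes_per_day_gb:.0f}GB of host writes per day at {write_amplification}x WA")
```

Swapping in your own daily write volume and write amplification estimate gives a rough lifetime for other workloads, keeping in mind that MWI reaching 0 is not the point of actual NAND failure.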

The math works out to be around 1500 p/e cycles, but remember that when the MWI hits 0 the NAND isn't truly exhausted. It should last well beyond that point. The MWI hitting 0 on an Intel drive is just a good point to begin looking at replacing the drive.

Intel will have a firmware update for the 335 out before the end of the month that fixes the MWI reporting behavior, but if you're concerned about endurance on the 335 - I wouldn't be. 

There's another issue, however. The 335 should have similar endurance to the 330, but even with our revised numbers the two look very different. It turns out MWI reporting on the 330 is not working at all; the MWI value will never drop, regardless of how much you write to the drive. Intel has committed to fixing this issue in a December firmware update for the 330.

Neither issue fundamentally impacts the functionality of the drive or its endurance. But if you're closely monitoring the MWI values, you should keep all of this in mind.

That's all for the Intel SSD update. I've been hearing more reports of dying Samsung SSD 840 Pros and I believe I know the cause (firmware related, should be fixed in the latest shipping revision) but I'm still waiting for confirmation on one last thing before explaining what's going on there.


  • jwilliams4200 - Sunday, November 18, 2012 - link

    I was referring to people testing new products. Not end users, who I never recommend to buy a new SSD until at least 3 - 6 months after release, for the reason you allude to.
  • sheh - Saturday, November 17, 2012 - link

    If it drops at the same rate as P/E cycles, does this mean we're already approaching 5 years of retention for new cells (conjecture/extrapolation based on very little info on the topic I could find around)? Are manufacturers adhering to the JEDEC standard of 1 year retention at a cell's end of life? Is this at MWI of 0 or afterwards?
  • Anand Lal Shimpi - Saturday, November 17, 2012 - link

    At the end of the useful life of the NAND (somewhere beyond MWI == 0), retention should be 1 year for a consumer drive. Enterprise drives that boast eMLC will instead optimize for 3 month retention at the end of life.

    Take care,
    Anand
  • seapeople - Sunday, November 18, 2012 - link

    Wait, so it was a bad idea to use about 20 Intel 335's for our town's time capsule project last week? We have BILLIONS of photos that were locked away under the new Community Center... and a bootleg version of Hurt Locker.
  • HollyDOL - Monday, November 19, 2012 - link

    In your use case I suspect the drives will be covered in rust before they die on MWI. Bear in mind you actually have to write to the SSD to wear it out.
    I am more worried about little green aliens digging the drives in 1000 years trying to figure out SATA3 :-))
  • sparkuss - Saturday, November 17, 2012 - link

    I look at 5.7 years and really don't worry for my home system, but I do wonder about a Win7 OS with active AV and other running processes left on 24/7.

    How many writes are adding up each day just from the OS and active processes? That's before I actually run other things like games, email, internet, etc.
  • Anand Lal Shimpi - Saturday, November 17, 2012 - link

    I did this experiment on my own machine a while back and came out with something around 10GB of writes per day.

    Take care,
    Anand
  • MrSpadge - Sunday, November 18, 2012 - link

    I've been using an Agility 3 60 GB as SRT cache drive for the HDD in my desktop. It's running 24/7 BOINC, some games etc. I'm seeing 438 days of run time, 5.87 TB written, 5.78 TB read, media wear out at 0% (life left 100%). If this counter works I have nothing to worry about at 13.4 GB/day.
  • LMF5000 - Saturday, November 17, 2012 - link

    So I gather that smaller process nodes lead to worse flash memory performance and longevity, instead of improving the performance as in the case of other kinds of integrated circuits. So the question is, why move to smaller process nodes in the first place? Couldn't the same cost reductions be obtained by optimising and maturing the larger process nodes? Then at least you wouldn't be limited to 1000 write cycles instead of 3000. Incidentally, with these endurance numbers the smaller process node would only be a net improvement if it cost one third the price per GB. Otherwise the price per total endurance (i.e. write cycles x drive capacity) would be greater and we'd be worse off.
  • Anand Lal Shimpi - Saturday, November 17, 2012 - link

    Yields on a known process do improve over time, but at mature yields the ability to cram ~1.5x - 2.0x the number of transistors into the same area will always win in terms of driving pricing down.

    Until we run out of NAND roadmap, Intel claims it will be able to maintain current levels of endurance. We've got a couple more shrinks to go through.

    Take care,
    Anand
