Intel Explains 20nm NAND Endurance Concerns on the SSD 335
by Anand Lal Shimpi on November 17, 2012 1:49 PM EST
It's common knowledge by now that as NAND cells shrink in size (thanks to smaller process nodes), their endurance and program/erase latencies both suffer. Consumers wouldn't really be happy with newer drives dying sooner and performing worse than their predecessors, so controller and NAND makers have to work extra hard to compensate for losses due to the physics of NAND.
In the second half of this year we saw the beginning of a transition from Intel's 25nm MLC NAND to a 20nm process. The smaller process will eventually allow us to have larger SSDs with 16GB NAND die (up from the current 8GB max), but it should also help drive SSD prices down as you can fit more 20nm NAND cells per 300mm wafer than you can at 25nm. The cost motivation alone will move production from 25nm to 20nm in fairly short order. The question remains: how is endurance impacted by the move to 20nm? When Kristian reviewed Intel's SSD 335, he tried to find out.
The 335 is Intel's first branded drive to ship with 20nm 2-bit-per-cell MLC NAND. Like most modern Intel SSDs, the drive reports total NAND writes as well as the percentage of program/erase cycles remaining on the NAND. By looking at both values you can get a general idea of the expected lifespan of the NAND in number of program/erase cycles.
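As a rough sketch of how those two values combine: divide total NAND writes by the raw NAND capacity to get the average number of cycles consumed, then scale by the share of the wear indicator that has been used up. The function name, the 256GB raw capacity, and the sample readings below are hypothetical and not taken from the review; the sketch also assumes wear is spread evenly across the NAND and that the indicator falls linearly.

```python
def implied_pe_rating(nand_writes_gb, raw_nand_gb, mwi_remaining_pct):
    """Estimate the p/e rating implied by two drive-reported values.

    Assumes even wear across the NAND and a linear wear indicator.
    """
    if not 0 <= mwi_remaining_pct < 100:
        raise ValueError("need an indicator value below 100 to extrapolate")
    # Average number of times each cell has been cycled so far
    avg_cycles = nand_writes_gb / raw_nand_gb
    # Share of rated life the drive says has been consumed, in percent
    pct_used = 100 - mwi_remaining_pct
    return avg_cycles * 100 / pct_used


# Hypothetical readings: 3840GB of NAND writes on 256GB of raw NAND,
# with the indicator showing 99% remaining
estimate = implied_pe_rating(nand_writes_gb=3840, raw_nand_gb=256,
                             mwi_remaining_pct=99)
print(estimate)  # 1500.0
```

The more of the indicator that has been consumed, the less the estimate suffers from the drive reporting only whole percentage points.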
In his 335 review, Kristian calculated the endurance of Intel's 20nm NAND on the 335 to be below 1,000 program/erase cycles. Intel confirmed that its 20nm MLC NAND is rated at 3,000 p/e cycles and that Kristian's results shouldn't have happened. Others duplicated the results around the web, so we waited for an explanation from Intel. Today we have that explanation.
The Media Wearout Indicator (MWI) on Intel SSDs looks at the total number of times the drive's NAND has been cycled and then divides it by the p/e rating for the NAND to determine its value. The p/e rating is hard-coded into the firmware for the NAND. The 335 we reviewed had the p/e rating set to 1500 cycles, which was accurate for an earlier, non-production version of Intel's 20nm NAND. The 335 should have shipped with firmware that set this value to 3000 cycles, but someone forgot to set the variable to the right value in the firmware code. Whoops. As a result, the MWI value on the 335 decreases at 2x the rate it should. This doesn't mean the drive is wearing out twice as fast as it should, just that the MWI data is inaccurate. Take our numbers from the 335 review and double them to get how long the 335 may last.
MWI from our 335 review, incorrectly low
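To see why the wrong constant makes the indicator fall twice as fast, here is a minimal sketch of the calculation as described above. Intel's actual firmware logic isn't public, so the linear formula, the function name, and the 150-cycle sample point are assumptions for illustration only; the 1500 and 3000 cycle ratings are the two values Intel gave us.

```python
def mwi_remaining(avg_cycles, rated_cycles):
    """Percent of rated life left, assuming a simple linear model."""
    consumed_pct = 100 * avg_cycles / rated_cycles
    # Clamp at zero: the indicator bottoms out even if the NAND keeps working
    return max(0.0, 100 - consumed_pct)


avg_cycles = 150  # hypothetical wear level

shipped = mwi_remaining(avg_cycles, rated_cycles=1500)   # value in shipping firmware
intended = mwi_remaining(avg_cycles, rated_cycles=3000)  # value Intel meant to ship

print(shipped, intended)  # 90.0 95.0
```

The same wear consumes 10 points with the shipping constant and 5 with the intended one, which is the 2x discrepancy. The NAND itself is in the same state either way.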
Kristian's review showed that to be roughly 250TB of writes, which means the actual value is around 500TB of NAND writes (incompressible). Doing the math on the 240GB capacity gives us 2083 full drive writes over the life of the drive, or about 5.7 years of useful life if you write 240GB of data to the NAND every day. Even if your workload has a write amplification factor of 10x, you're still talking about 24GB of writes per day for nearly 6 years.
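The arithmetic above can be reproduced in a few lines. This is just a worked version of the numbers in the paragraph, assuming decimal units (1TB = 1000GB); the variable names are my own.

```python
GB_PER_TB = 1000  # decimal units, as drive capacities are marketed

reported_tb = 250             # endurance implied by the mis-set MWI
actual_tb = reported_tb * 2   # corrected for the 1500 vs. 3000 cycle rating
drive_gb = 240                # user capacity of the drive

# How many times the full 240GB could be written before the MWI hits 0
full_drive_writes = actual_tb * GB_PER_TB / drive_gb

# Years of life at one full drive write (240GB to the NAND) per day
years = full_drive_writes / 365

# With 10x write amplification, 240GB of NAND writes per day corresponds
# to this much data written by the host
write_amplification = 10
host_gb_per_day = drive_gb / write_amplification

print(int(full_drive_writes), round(years, 1), host_gb_per_day)  # 2083 5.7 24.0
```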
The math works out to be around 1500 p/e cycles, but remember that when the MWI hits 0 the NAND isn't truly exhausted. It should last well beyond that point. The MWI hitting 0 on an Intel drive is just a good point to begin looking at replacing the drive.
Intel will have a firmware update for the 335 out before the end of the month that fixes the MWI reporting behavior, but if you're concerned about endurance on the 335 - I wouldn't be.
There's another issue, however. The 335 should have similar endurance to the 330, but even with our revised numbers the two look very different. It turns out MWI reporting on the 330 is not working at all; the MWI value will never drop, regardless of how much you write to the drive. Intel has committed to fixing this issue in a December firmware update for the 330.
Neither issue fundamentally impacts the functionality of the drive or its endurance. But if you're closely monitoring the MWI values, you should keep all of this in mind.
That's all for the Intel SSD update. I've been hearing more reports of dying Samsung SSD 840 Pros and I believe I know the cause (firmware related, should be fixed in the latest shipping revision) but I'm still waiting for confirmation on one last thing before explaining what's going on there.