Toshiba last week announced its first 3D NAND flash memory chips featuring QLC (quadruple level cell) BiCS architecture. The new components feature 64 layers, and developers of SSDs and SSD controllers have already received samples of the devices, which Toshiba plans to use for various types of storage solutions.

Toshiba’s first 3D QLC NAND chips feature a 768 Gb (96 GB) capacity and use 64 layers, just like the company’s BiCS3 chips with 256 Gb and 512 Gb capacities launched in 2016 and 2017. Toshiba does not share further details about its 3D QLC NAND IC (integrated circuit), such as page size, the number of planes, or the interface data transfer rate, but we expect the latter to be high enough to build competitive SSDs in late 2018 to early 2019 (that’s our assumption). As for the applications Toshiba expects to use its 3D QLC NAND ICs, the maker of flash memory mentions enterprise and consumer SSDs, tablets and memory cards.

Endurance++

Besides the intention to produce 768 Gb 3D QLC NAND flash for the aforementioned devices, the most interesting part of Toshiba’s announcement is the endurance specification for the upcoming components. According to the company, its 3D QLC NAND is targeted for ~1000 program/erase cycles, which is close to TLC NAND flash. This is considerably higher than the number of P/E cycles (100 – 150) that the industry has expected for QLC over the years. At first glance it looks like a typo: didn't they mean 100? But the email we received was quite clear:

- What’s the number of P/E cycles supported by Toshiba’s QLC NAND?
- QLC P/E is targeted for 1K cycles.

It is unclear how Toshiba managed to increase the endurance of its 3D QLC NAND by an order of magnitude versus initial predictions. What we do know is that signal processing is more challenging with QLC than it is with TLC, as sixteen different voltage levels have to be accurately distinguished in each cell (up from 2 in SLC, 4 in MLC, and 8 in TLC).
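The level counts quoted above follow directly from the number of bits stored per cell: a cell holding n bits has to distinguish 2^n charge states. A minimal sketch of that arithmetic is below; the names `CELL_TYPES` and `voltage_levels` are ours, chosen for illustration, and nothing here comes from Toshiba.

```python
# Bits stored per cell for each NAND type mentioned in the article.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def voltage_levels(bits_per_cell: int) -> int:
    """Number of distinct charge states needed to encode bits_per_cell bits."""
    return 2 ** bits_per_cell


# Map each cell type to the number of levels it must distinguish.
levels = {name: voltage_levels(bits) for name, bits in CELL_TYPES.items()}

for name, count in levels.items():
    print(f"{name}: {count} levels")
```

Each extra bit doubles the number of states that have to fit into roughly the same voltage window, which is why the margin between neighbouring states shrinks so quickly from TLC to QLC.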

The easiest way to handle this would be to increase the cell size: with more electrons per logic level, it is easier to retain the data and to read and write it. However, the industry is also in a density race, where bits per mm^2 matter. Also, to deal with read errors from QLC memory, controllers with very advanced ECC capabilities have to be used for QLC-based SSDs. Toshiba has its own QSBC (Quadruple Swing-By Codes) error correction technique, which it claims to be superior to LDPC (low-density parity-check) that is widely used today for TLC-powered drives. However, there are many LDPC implementations and it is unknown which of them Toshiba used for comparison against its QSBC. Moreover, other ECC methods are often discussed at various industry events (such as FMS), so Toshiba could be using any or none of them. The only thing the company says about its ECC for now is that it is stronger than the 120 bits/1 KB used today for TLC. In any case, if Toshiba’s statement about 1000 P/E cycles for QLC is correct, it means that the company knows how to solve both endurance and signal processing challenges.

The main advantage of QLC NAND is increased storage density when compared to TLC and MLC, assuming the same die size. As was perhaps expected, die size numbers were not provided. However, last year Toshiba and Facebook talked about a case-study QLC-powered SSD with 100 TB of capacity for WORM (write once read many) applications, and it looks like large-capacity custom drives and memory cards will be the first to use QLC for cold storage. P/E cycles and re-write endurance aren't a concern for WORM at this stage.
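To put a rough number on the density advantage, the sketch below compares bits per cell only. It assumes the same number of cells on the die and ignores everything else that affects real capacity (ECC overhead, spare area, peripheral circuitry), so treat it as an upper bound on the gain and not as a Toshiba figure. The names `BITS_PER_CELL` and `density_gain` are ours.

```python
from fractions import Fraction

# Bits stored per cell for each NAND type.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def density_gain(new: str, old: str) -> Fraction:
    """Fractional increase in stored bits when the same number of cells
    moves from the `old` cell type to the `new` one."""
    return Fraction(BITS_PER_CELL[new], BITS_PER_CELL[old]) - 1


qlc_vs_tlc = density_gain("QLC", "TLC")
qlc_vs_mlc = density_gain("QLC", "MLC")

print(f"QLC vs TLC: +{float(qlc_vs_tlc):.0%}")  # one third more bits
print(f"QLC vs MLC: +{float(qlc_vs_mlc):.0%}")  # twice the bits
```

Under these assumptions QLC stores one third more data than TLC in the same cells, and twice as much as MLC.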

Toshiba began sampling its 3D QLC NAND memory devices earlier this month to various parties to enable development of SSDs and SSD controllers. Taking into account development and qualification time, Toshiba plans to mass produce its BiCS3 768 Gb 3D QLC NAND chips around the same time it starts to make its next-generation BiCS4 ICs. The latter is set to hit mass production in 2018, but the exact timeframe is yet to be determined.

Source: Toshiba

36 Comments

  • BurntMyBacon - Monday, July 17, 2017

    @Glaring_Mistake
    Never saw an official endurance rating for the 850 EVO, so I was extrapolating from the TBW (note I didn't say P/E cycles here, where I did later). The 840 Pro is a consumer drive, so I had no reason to believe they would be rated differently. I did however come across this article when looking up your 2000 P/E number suggesting that the 850 EVO does in fact have less endurance than the 840 Pro. Good catch there.

    @Glaring_Mistake: "Endurance for planar TLC NAND varies quite a bit too, for example the Kingston UV400 is rated at 400-500 P/E while the Plextor M7V is rated at 2000 P/E, despite using the same NAND and controller! So it is not like 1000 P/E is set in stone for planar TLC NAND."

    Perhaps I should have specified that I was talking about Samsung planar TLC for that 1000 P/E cycles. I thought it was self-evident as my comparison was between Samsung flash, but in hindsight, I did mention other companies as examples of who uses charge trap and floating gate.

    I agree that TLC NAND varies, which is why I stuck with the same manufacturer and as close to the same controller as possible. Design of the cells, size of the cells, process, and even software considerations like error correction method can all contribute to the endurance of the NAND. This is why it is impractical to compare endurance of flash from different manufacturers using different controllers and running different firmware. The Kingston UV400 and Plextor M7V are a perfect example of how drives with the same NAND and controller, and identical physical endurance, can get wildly different ratings depending on the firmware (different error correction methods). This does little to alleviate my fears regarding QLC and even gives me pause regarding TLC.

    @Glaring_Mistake: "Finally the 960 EVO was found by Nordichardware to be rated at around 1200 P/E despite using 3D TLC NAND with a Charge Trap."

    I'm guessing this is another unofficial rating, but it seems logical. I would expect the 960 EVO with Charge Trap based 3D TLC NAND built on a smaller process to have less endurance than the 850 EVO with Charge Trap based 3D TLC NAND built on a larger process. That's before you consider that you have a couple of generations' difference in controller and firmware, which makes it even harder to compare. In short, I'm not sure what you are trying to say here.
  • BurntMyBacon - Wednesday, July 05, 2017

    Unfortunately, despite all the data that suggests TLC has plenty of endurance for consumers, I can't bring myself to put a TLC drive in my personal system. I have no problem putting one in a client system after making sure they are fully aware of the trade-offs, though. However, I can't in good conscience recommend a QLC drive even with the 1000 P/E cycle rating as I know how they are getting there.

    It's like comparing a higher quality recording to a recording with pops, artifacts, and distortion. In this case, they would call them equal because the listener on the junk recording was more adept and was able to understand the same number of words as the listener on the higher quality recording.

    Just because you have better error correction doesn't mean you haven't exceeded the limits of the NAND and caused errors. The more errors you have to contend with, the more you are playing a probability game: can you correct the errors and still retrieve your data, or will you hit a particularly bad sequence of errors that is outside of your error correction algorithm's ability to correct?
  • Alexvrb - Wednesday, July 05, 2017

    What firm do you do engineering work for again? May I see your whitepapers? With the move to 3D NAND I have no issues with a good TLC drive. QLC... fine for storage, at the very least. At 1000 P/E it will likely have a lower average failure rate than HDDs, even for primary storage - not that I would use it for primary storage.
  • BurntMyBacon - Monday, July 17, 2017

    @Alexvrb
    What part of "Unfortunately, despite all the data that suggests TLC has plenty of endurance for consumers, I can't bring myself to put a TLC drive in my personal system." suggests that this is anything more than an opinion?

    @Alexvrb: "What firm do you do engineering work for again? May I see your whitepapers?"

    What exactly do you want me to prove here? I've already stated that there is plenty of data that suggests TLC has plenty of endurance for consumers. I didn't mention enterprise, but if you are talking enterprise, there are some use cases that require eMLC. If regular MLC isn't up to the task, then clearly TLC won't be for that scenario. If you are talking about the error correction method, there are plenty of words dedicated to that in the article above.
    From the article:
    "Also, to deal with read errors from QLC memory, controllers with very advanced ECC capabilities have to be used for QLC-based SSDs. Toshiba has its own QSBC (Quadruple Swing-By Codes) error correction technique, which it claims to be superior to LDPC (low-density parity-check) that is widely used today for TLC-powered drives."

    Are you suggesting that I need to show you a white paper before I can have an opinion of what performance and endurance I am willing to pay for? I tell my customers that there is minimal risk with a good TLC drive and a little more so with cheaper drives like the Kingston UV400, but controller failure often happens before the NAND wears out anyway. Furthermore, for most people, even the cheapest TLC drive lasts much longer than spinning media. If a client wants a QLC drive, I'll give it to them. I will not, however, suggest it to the client as at this point there are too many unknowns. A NAND endurance of 1000 P/E cycles is more than some TLC drives (Kingston UV400 gets half that). That said, it has not been tested and it is unknown whether they can actually pull it off. Also consider that the number relies on advanced ECC capabilities. The effect this has on the controller is an unknown quantity. There were drives in TechReport's SSD Endurance test that failed without a single bad sector, suggesting the controllers were responsible:
    http://techreport.com/review/27909/the-ssd-enduran...

    Full Disclosure: I have designed ASICs fabricated for and in use at Washington University's Radio Chemistry department (nuclear physics), so while I have no experience designing pure memory chips, I do know a few things about silicon charge decay rates. Do you have any credentials to provide or did you think you could give yourself credibility by discrediting me?
  • grant3 - Friday, July 07, 2017

    Consumer SSDs are regularly being endurance tested, and invariably are proven to handle more wear than even a power user would put on a drive before it's hopelessly obsolete. There is no "trade-off" for someone buying a TLC drive unless they're sticking it in a datacenter.

    The odds of the drive being lost to fire, theft, or alien abduction are much greater than some crazy random chain of cell failures overpowering normal error correction.

    A responsible advisor would simply tell his clients to regularly back up their data on PROPER archive media, and use the money they saved on some beers so they can relax.
  • BurntMyBacon - Monday, July 17, 2017

    @grant3: "Consumer SSDs are regularly being endurance tested, and invariably are proven to handle more wear than even a power user would put on a drive before it's hopelessly obsolete."

    Agreed. Quick question: Do you have a line on endurance tests with anything newer than a Samsung 840 EVO? My 60 second GoogleFu has failed me in this regard. Of particular interest to me is offline data retention, which happens to be the number 1 storage failure issue I personally encounter with SSDs.

    @grant3: "The odds of the drive being lost to fire, theft, or alien abduction are much greater than some crazy random chain of cell failures overpowering normal error correction."

    While the odds are great, I submit for you a counterexample:
    http://techreport.com/review/27909/the-ssd-enduran...
    "The 840 Series didn't encounter actual problems until 300TB, when it failed a hash check during the setup for an unpowered data retention test. The drive went on to pass that test and continue writing, but it recorded a rash of uncorrectable errors around the same time. Uncorrectable errors can compromise data integrity and system stability, so we recommend taking drives out of service the moment they appear."

    The workload is not a client workload, so this really isn't something to worry about. However, it proves that it can happen. By virtue of requiring stronger error correction, it stands to reason that QLC-based drives are more likely to see this type of unrecoverable error than TLC drives. The process size the NAND is built on has also shrunk since then. Shrink the process a few more times and the odds may not be as far-fetched as they once were.

    @grant3: "A responsible advisor would simply tell his clients to regularly back up their data on PROPER archive media, and use the money they saved on some beers so they can relax."

    Almost exactly what I tell them, word for word. It almost never happens outside of businesses. They tell me up front they won't do it. I'll still put a TLC drive in as they usually last longer than spinning disk. However, given this situation QLC doesn't inspire as much confidence as I'd like (this is an opinion). One client I had with a proper backup solution in place still lost critical data because he relaxed the period between backups due to perceived performance issues on the network that were completely unrelated.
