Comparison With Other Storage Paradigms

Zoned storage is just one of several efforts to enable software to make its IO more SSD-friendly and to reduce unnecessary overhead from the block storage abstraction. The NVMe specification already has a collection of features that allow software to issue writes with appropriate sizes and alignment for the SSD, and features like Streams and NVM Sets to help ensure unrelated IO doesn't land in the same erase block. When supported by the SSD and host software, these features can provide most of the QoS benefits ZNS can achieve, but they aren't as effective at preventing write amplification. Applications that go out of their way to serialize writes (e.g. log-structured databases) can expect low write amplification, but only if the filesystem (or another layer of the IO stack) doesn't fragment or reorder those writes. Another downside is that these features are individually optional, so applications must be prepared to run on SSDs that support only a subset of the features they want.
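
As a rough illustration of the write pattern these features reward, here is a minimal sketch (not code from the article) of a log-structured writer that stages records in a segment-sized buffer and only ever issues large, aligned writes. The 4 MiB segment size and the struct and function names are hypothetical; real code would query the SSD's reported optimal write size and alignment instead of hard-coding them.

```c
/*
 * Hypothetical sketch of a log-structured writer: records are staged in a
 * segment-sized, aligned buffer and written out with one large, aligned
 * write. SEGMENT_BYTES is a stand-in for whatever optimal write size the
 * SSD advertises; real code would query that instead of hard-coding it.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SEGMENT_BYTES (4u << 20)    /* hypothetical optimal write size */

struct log_writer {
    int      fd;       /* opened with O_DIRECT so writes reach the SSD as issued */
    uint8_t *buf;      /* aligned staging buffer, one segment long */
    size_t   fill;     /* bytes currently staged */
    off_t    offset;   /* next aligned on-disk offset */
};

static int log_open(struct log_writer *w, const char *path)
{
    w->fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (w->fd < 0)
        return -1;
    w->fill = 0;
    w->offset = 0;
    /* O_DIRECT also requires the user buffer itself to be aligned */
    return posix_memalign((void **)&w->buf, 4096, SEGMENT_BYTES) ? -1 : 0;
}

/* Write out one full segment; zero padding keeps every write the same size. */
static int log_flush(struct log_writer *w)
{
    memset(w->buf + w->fill, 0, SEGMENT_BYTES - w->fill);
    if (pwrite(w->fd, w->buf, SEGMENT_BYTES, w->offset) != (ssize_t)SEGMENT_BYTES)
        return -1;
    w->offset += SEGMENT_BYTES;
    w->fill = 0;
    return 0;
}

/* Append a record (assumed smaller than a segment); the device only ever
 * sees whole, aligned segments, never small scattered updates. */
static int log_append(struct log_writer *w, const void *rec, size_t len)
{
    if (w->fill + len > SEGMENT_BYTES && log_flush(w) != 0)
        return -1;
    memcpy(w->buf + w->fill, rec, len);
    w->fill += len;
    return 0;
}
```

The point of the pattern is simply that the drive never receives a write smaller or less aligned than what its media prefers; the same idea is what filesystems and log-structured databases rely on to keep write amplification low.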

The Open Channel SSD concept has been tried in several forms. Compared to ZNS, Open Channel SSDs place even more responsibilities on the host software (such as wear leveling), which has hindered adoption even though capable hardware has been available from several vendors. The LightNVM Open Channel SSD specification and associated projects have now been discontinued in favor of ZNS and other standard NVMe features, which can provide all of the benefits and functionality of the Open Channel 2.0 specification while placing fewer requirements on host software (but slightly more on SSD firmware). The other vendor-specific open channel SSD specifications will probably be retired when the current hardware implementations reach end of life.

ZNS and Open Channel SSDs can both be seen as modifications to the block storage paradigm, in that they continue to use the concept of a linear space of fixed-size Logical Block Addresses. Another recently approved NVMe Technical Proposal adds a command set for Key-Value Namespaces, which abandons the fixed-size LBA concept entirely. Instead, the drive acts as a key-value database, storing objects of potentially variable size, identified by keys of a few bytes. This storage abstraction looks nothing like how the underlying flash memory works, but KV databases are very common in the software world. A KV SSD allows such a database's functionality to be almost completely offloaded from the CPU to the SSD. Implementing a KV database directly in the SSD firmware avoids much of the overhead of implementing a KV database on top of block storage running on top of a traditional Flash Translation Layer, so this is another viable route to making IO more SSD-friendly. KV SSDs don't really have the cost advantages that a ZNS-only SSD can offer, but for some workloads they can provide similar performance and endurance benefits, and save some CPU time and RAM in the process.
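
To make the contrast with block storage concrete, the following is a conceptual sketch of the key-value abstraction such a drive presents to the host. The kv_store/kv_retrieve functions and the in-memory table standing in for the drive are illustrative assumptions only; they are not the NVMe KV command set or any vendor's API, but they show how an application hands whole objects to the drive without ever computing block offsets or alignment.

```c
/*
 * Conceptual sketch only: the host-visible abstraction of a KV namespace.
 * kv_store/kv_retrieve and the in-memory "drive" table are hypothetical
 * stand-ins; on a real KV SSD the key-to-flash-location mapping lives in
 * the drive's firmware rather than in host memory.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OBJECTS 1024

struct kv_object {               /* one variable-size value, addressed by a short key */
    char   key[16];              /* KV namespaces use small keys instead of LBAs */
    void  *value;
    size_t len;
};

static struct kv_object drive[MAX_OBJECTS];   /* stand-in for the SSD's internal index */

/* Store a variable-length value under a key; the "drive" decides placement. */
static int kv_store(const char *key, const void *val, size_t len)
{
    int slot = -1;
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (drive[i].value && strcmp(drive[i].key, key) == 0) {
            slot = i;            /* overwrite the existing object */
            break;
        }
        if (slot < 0 && drive[i].value == NULL)
            slot = i;            /* remember the first free slot */
    }
    if (slot < 0)
        return -1;               /* namespace full */

    free(drive[slot].value);
    drive[slot].value = malloc(len);
    memcpy(drive[slot].value, val, len);
    drive[slot].len = len;
    snprintf(drive[slot].key, sizeof drive[slot].key, "%s", key);
    return 0;
}

/* Retrieve a value by key; the host never deals in sectors or offsets. */
static const void *kv_retrieve(const char *key, size_t *len)
{
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (drive[i].value && strcmp(drive[i].key, key) == 0) {
            *len = drive[i].len;
            return drive[i].value;
        }
    }
    return NULL;
}

int main(void)
{
    kv_store("user:42", "example record", sizeof "example record");

    size_t len;
    const char *val = kv_retrieve("user:42", &len);
    printf("retrieved %zu bytes: %s\n", len, val);
    return 0;
}
```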

Comments

  • WorBlux - Wednesday, December 22, 2021 - link

    > Kinda seems like they introduced a whole new problem, there?

    Sort of; many of these drives are meant to be used with the APIs directly accessible to your application, which means the application now has to solve problems that the OS hand-waved away.

    If your application uses the filesystem API, only the filesystem has to worry about this. But if you want the application to leverage the determinism and parallelism available in ZNS drives, then it should be able to use the zone append command (which is the big advance over the ZBC API of SMR hard drives).
  • Steven Wells - Saturday, August 8, 2020 - link

    So as a DRAM cost play this might save a low single percentage point of the parts cost, which seems like not enough motivation on its own. Clearly most of the savings comes from the reduced overprovisioning of flash needed to get similar write amplification, traded against the extra lift required by the host. Curious if anyone has shared TCO studies on this to validate a clear cost savings for all the heavy lifting required by both the drive and the data center customer.
  • matias.bjorling - Monday, August 10, 2020 - link

    Thanks for the comprehensive write-up, Billy. It's great to see you writing about ZNS on Anandtech - I've followed it for 20 years! I never thought that my work on creating Open-Channel SSDs and Zoned Namespaces would one day be featured on Anandtech. Thanks for making it mainstream!
  • umeshm - Monday, August 24, 2020 - link

    This is the best explanation and analysis I have found on ZNS. Thank you, Billy!

    I have a question about how 4KB LBAs are persisted when the flash page size is 16KB and when 4 pages (QLC) are stored on each wordline.

    You mentioned that a 4KB LBA is partially programmed into a flash page, but is vulnerable to read disturb. But my understanding so far is that only SLC supports partial-page programming. So a 4KB LBA would need to be buffered in NVRAM within the SSD until there is a full page (16KB) worth of data to write to a page. Then, the wordline is still partially programmed, because the three other pages have not been programmed yet, so the wordline still needs to be protected through additional caching or buffering in NVRAM.

    Could you or someone else please either confirm or correct my understanding?

    Again, I really appreciate the effort and thinking that has gone into this article.

    Umesh
  • weilinzhu - Monday, June 20, 2022 - link

    Very helpful article, thanks a lot! As you wrote: "A recent Western Digital demo with a 512GB ZNS prototype SSD showed the drive using a zone size of 256MB (for 2047 zones total) but also supporting 2GB zones." Could you please tell me where it was published? Thanks in advance!
