NetApp: Automatic Tiering and More Flash Goodness

Most vendors did not do much better than NetApp: they advocated automatic tiering, meaning that hot data was moved from the slow magnetic disks to the flash tier. Although it sounds nice, in reality it did not solve several performance bottlenecks. Because the process was not real-time, you could hit the disks many times for a piece of data before it was finally moved to the flash tier. Migrating data around is also not very energy friendly, as it wastes a lot of processing power and storage bandwidth.
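To make that drawback concrete, here is a deliberately naive Python sketch of a scheduled tiering policy of the kind described above. All names and thresholds are invented and real arrays use far more sophisticated heuristics; the point is simply that promotion happens in a periodic batch job, so a hot extent keeps hitting the magnetic disks until the next run, and every promotion or demotion is a physical data move that costs back-end bandwidth.

```python
# Illustrative sketch only: a scheduled (non-real-time) tiering policy.
# All names and thresholds are hypothetical, not any vendor's implementation.
from collections import Counter

HOT_THRESHOLD = 1000        # accesses needed before an extent counts as "hot"

access_counts = Counter()   # extent_id -> accesses since the last tiering run
flash_tier = set()          # extents currently placed on SSD

def read_extent(extent_id):
    """Serve a read. Cold data keeps hitting the HDDs until the next
    scheduled tiering run finally promotes it."""
    access_counts[extent_id] += 1
    tier = "flash" if extent_id in flash_tier else "hdd"
    return f"read extent {extent_id} from {tier}"

def tiering_job():
    """Runs every few hours: promote hot extents, demote the rest.
    Each promotion or demotion is a physical data move that consumes
    controller cycles and back-end bandwidth."""
    for extent_id, count in access_counts.items():
        if count >= HOT_THRESHOLD:
            flash_tier.add(extent_id)       # migrate HDD -> SSD
        else:
            flash_tier.discard(extent_id)   # migrate SSD -> HDD
    access_counts.clear()
```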

To sum it up: NetApp's Flash Cache did better than the "automatic flash tier" of other vendors, but its performance-per-dollar ratio was not exactly something to write home about.

Last year, NetApp went a step further. The storage arrays can be expanded with a “flash pool”: a RAID group of SSDs (100, 200, or 800GB) that caches the random reads and writes of the volumes inside a magnetic hard disk pool. All writes are first written to NVRAM and then flushed to the disks, but when a random write is overwritten, the new version is written to the flash pool instead. This greatly improves performance when you update the same data over and over again in a short period of time, because the update is only propagated to the disks once the data has not changed for a while. Sequential writes and reads still go straight to the disks, which is an intelligent way to make the most of your SSDs. The flash pool is managed as an LRU (Least Recently Used) cache.
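The policy described above can be pictured with a small Python sketch. This is not NetApp's implementation, just a toy model with invented names (the NVRAM stage is left out and only write overwrites and read hits are modeled): sequential I/O bypasses the SSDs, the first random write goes to disk, overwrites of recent random writes are absorbed by an LRU-managed flash pool, and the hard disks only see the final version of a block when it is evicted.

```python
# Toy model of the flash pool behaviour described above; not NetApp code.
from collections import OrderedDict

class FlashPoolSketch:
    def __init__(self, capacity_blocks=1024):
        self.capacity = capacity_blocks
        self.flash = OrderedDict()      # block_id -> data, kept in LRU order
        self.recently_written = set()   # random writes seen recently

    def write(self, block_id, data, sequential=False):
        if sequential:
            self._write_hdd(block_id, data)          # sequential I/O bypasses SSD
            return
        if block_id in self.flash or block_id in self.recently_written:
            # Overwrite of a recent random write: absorb it in the flash pool
            # so the HDDs only see the final version later.
            self.flash[block_id] = data
            self.flash.move_to_end(block_id)
            self._evict_if_needed()
        else:
            self.recently_written.add(block_id)      # first write goes to disk
            self._write_hdd(block_id, data)

    def read(self, block_id, sequential=False):
        if not sequential and block_id in self.flash:
            self.flash.move_to_end(block_id)         # random read hit, LRU touch
            return self.flash[block_id]
        return self._read_hdd(block_id)              # sequential reads use HDDs

    def _evict_if_needed(self):
        while len(self.flash) > self.capacity:
            old_id, old_data = self.flash.popitem(last=False)  # evict the LRU block
            self._write_hdd(old_id, old_data)        # propagate the final version

    def _write_hdd(self, block_id, data):
        pass        # stand-in for the magnetic disk pool

    def _read_hdd(self, block_id):
        return None  # stand-in for the magnetic disk pool
```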

There is some irony in the fact that NetApp's case studies quote customers who reported response times of hundreds of milliseconds for critical requests. While those case studies make the flash-based SAN shine, they also show that only a few years ago SAN arrays were expensive and not delivering. Luckily, those customers now report that flash pools reduced the response time to 5 ms. It is good that the newest NetApp technology has accelerated this, but it is also a clear example that even high-end SANs failed to deliver good performance to customers just a year ago.

But flash pools and flash caches do not give the performance benefits that server-side flash caching with Fusion-IO delivers. So something really interesting happened: NetApp announced Flash Accel, making sure its SANs could work together with server-side flash caches. Even more interesting is that NetApp is not charging anything for this software, probably to make sure that current NetApp customers are not lured away by other server-side storage solutions.

Existing customers can simply download the ESXi 5.0/Windows 2008 agent. An agent has to be installed in each VM and on the ESXi host, so Flash Accel currently works with only a limited number of configurations. Still, it is quite a disruptive change to witness a typical SAN vendor promoting server-side caching; just a year ago, most SAN vendors were downplaying this trend.

Comments

  • jhh - Wednesday, August 7, 2013 - link

    And Advanced/SDDC/Chipkill ECC, not the old-fashioned single-bit correct/multiple bit detect. The RAM on the disk controller might be small enough for this not to matter, but not on the system RAM.
  • tuxRoller - Monday, August 5, 2013 - link

    Amplidata's dss seems like a better, more forward-looking alternative.
  • Sertis - Monday, August 5, 2013 - link

    The Amplistore design seems a bit better than ZFS. ZFS keeps a hash to detect bit rot within a block, while Amplistore stores FEC coding that can potentially recover the data within that block without recalculating it from parity on the other drives in the stripe, and without the I/O that entails. It also seems a bit smarter about how it distributes data: it can spread data across storage devices to provide recovery at the node level, while ZFS is really limited to the current pool. It has various out-of-band data rebalancing features that aren't really present in ZFS. For example, add a second vdev to a zpool when it's 90% full, and there really isn't a process to automatically rebalance the data across the two vdevs as you add more data: the original data stays on the first vdev, and new data basically sits in the second vdev. It seems very interesting, but I certainly can't afford it, so I'll stick with raidz2 for my puny little server until something open source comes out with a similar feature set.
  • Seemone - Tuesday, August 6, 2013 - link

    Are you aware that with ZFS you can specify the number of replicas each data block should have on a per-filesystem basis? ZFS is indeed not very flexible about pool layout and does not rebalance things (as of now), but there's nothing in the on-disk data structures that prevents this. That means it could be implemented and would be applicable to old pools in a non-disruptive way. ZFS, also, is open source; its license is simply not compatible with the GPLv2, hence the separate ZFS on Linux distribution.
  • Brutalizer - Tuesday, August 6, 2013 - link

    If you want to rebalance ZFS, you just copy the data back and forth and the rebalancing is done. Assume you have data on some disks in a ZFS raid, and then you add new empty disks, so all the data sits on the old disks. To spread the data evenly across all disks, you need to rebalance it. Two ways:
    1) Move all the data to another server, and then move it back to your ZFS raid. Now all the data is rebalanced. This requires another server, which is a pain. Instead, do it like this:
    2) Create a new ZFS filesystem on your raid. This filesystem is spread out over all the disks. Move the data to the new ZFS filesystem. Done.
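A minimal sketch of option 2 above, wrapping the standard zfs snapshot, send and recv commands from Python. The pool and dataset names are hypothetical, and this only illustrates the idea of rewriting the data into a new dataset so its blocks are striped across all vdevs; it is not a complete migration procedure.

```python
# Hypothetical illustration of rebalancing by copying into a new dataset.
# Requires a system with ZFS installed and root privileges; names are made up.
import subprocess

OLD_FS = "tank/data"              # dataset living mostly on the old vdev(s)
NEW_FS = "tank/data_rebalanced"   # new dataset; its writes spread over every vdev
SNAP = f"{OLD_FS}@rebalance"

def run(shell_cmd):
    """Print and execute a shell command, aborting on failure."""
    print("+", shell_cmd)
    subprocess.run(shell_cmd, shell=True, check=True)

run(f"zfs snapshot {SNAP}")                  # freeze a consistent copy source
run(f"zfs send {SNAP} | zfs recv {NEW_FS}")  # rewrite the data across all vdevs
# After verifying NEW_FS, the old dataset can be destroyed and the new one
# renamed into place with 'zfs rename' (left out here on purpose).
```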
  • Sertis - Thursday, August 8, 2013 - link

    I'm definitely looking forward to these improvements, if they eventually arrive. I'm aware of the multiple copy solution, but if you read the Intel and Amplistore whitepapers, you will see they have very good arguments that their model works better than creating additional copies by spreading out FEC blocks across nodes. I have used ZFS for years, and while you can work around the issues, it's very clear that it's no longer evolving at the same rate since Oracle took over Sun. Products like this keep things interesting.
  • Brutalizer - Tuesday, August 6, 2013 - link

    Theory is one thing, real life another. There are many bold claims and wonderful theoretical constructs from companies, but do they hold up to scrutiny? Researchers injected artificially constructed errors into different filesystems (NTFS, ext3, etc.), and only ZFS detected all the errors. Researchers have verified that ZFS seems to combat data corruption. Is there any research on Amplistore's ability to combat data corruption, or do they only have bold claims? Until I see research from an independent third party, I will continue with the free, open source ZFS. CERN is now switching to ZFS for tier-1 and tier-2 long-term storage, because vast amounts of data _will_ suffer corruption, CERN says. Here are research papers on data corruption on NTFS, hardware RAID, ZFS, NetApp, CERN, etc.:
    http://en.wikipedia.org/wiki/ZFS#Data_integrity
    For instance, Tegile, Coraid, GreenByte, etc. are all storage vendors that offer petabyte-scale enterprise servers using ZFS.
  • JohanAnandtech - Tuesday, August 6, 2013 - link

    Thanks, very helpful feedback. I will check the paper out.
  • mikato - Thursday, August 8, 2013 - link

    And Isilon OneFS? Care to review one? :)
  • bitpushr - Friday, August 9, 2013 - link

    That's because ZFS has had a minimal impact on the professional storage market.
