Management Granularity

Much of Apple’s marketing on Fusion Drive talks about moving data at the file and application level, but in reality data can be moved between the SSD and HDD portions in 128KB blocks.

Ars actually confirmed this a while ago, but I wanted to see for myself. Using fs_usage I got to see the inner workings of Apple's Fusion Drive. Data is moved between drives in 128KB blocks, likely determined by how frequently those blocks are used. Since client workloads tend to be fairly sequential (or pseudo-random at worst) in nature, it's a safe bet that if you're accessing a single LBA within a 128KB block, you're going to be accessing more LBAs in the same block. The migration process seems to happen mostly during idle periods, although I have seen some movement between drives during light IO.
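To make the granularity argument concrete, here is a minimal Python sketch of how accesses to individual LBAs could be rolled up into 128KB blocks and ranked by frequency of use. This is an illustration only, not Apple's implementation: the 512-byte sector size and the names `block_of` and `hottest_blocks` are assumptions made for the example.

```python
from collections import Counter

SECTOR_BYTES = 512            # assumption: 512-byte logical sectors
BLOCK_BYTES = 128 * 1024      # migration granularity observed in the article
LBAS_PER_BLOCK = BLOCK_BYTES // SECTOR_BYTES  # 256 LBAs per 128KB block


def block_of(lba: int) -> int:
    """Return the index of the 128KB block that contains this LBA."""
    return lba // LBAS_PER_BLOCK


def hottest_blocks(accessed_lbas, top_n):
    """Rank 128KB blocks by how many accesses landed in them (hypothetical policy)."""
    counts = Counter(block_of(lba) for lba in accessed_lbas)
    return [blk for blk, _ in counts.most_common(top_n)]
```

With this model, touching LBAs 0, 1 and 2 counts three times toward the same block, which is why tracking at 128KB granularity captures nearby accesses without per-sector bookkeeping.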

What’s very interesting is just how quickly the migration is triggered after a transfer occurs. As soon as file copy/creation, application launch or other IO activity completes, there’s immediate back and forth between the SSD and HDD. As you fill up the Fusion Drive, the amount of data moved between the SSD and HDD shrinks considerably. Over time I suspect this is what should happen. Infrequently accessed data should settle on the hard drive and what really matters will stay on the SSD. Apple being less aggressive about evicting data from the SSD as the Fusion Drive fills up makes sense.

The migration process itself is pretty simple: data is marked for promotion/demotion, physically copied to the new storage device, and only then is the original removed. In the event of a power failure during migration there shouldn't be any data loss caused by the Fusion Drive; it looks like the source block is removed only after two copies of the 128KB block are in place. Apple told me as much last year, but it's good to see it for myself.
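The copy-then-remove ordering described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Apple's code: the two storage devices are plain dictionaries keyed by block ID, and `migrate` along with its `fail_before_remove` flag (a stand-in for a power failure) are hypothetical names.

```python
def migrate(block_id, source, dest, fail_before_remove=False):
    """Move one block from source to dest without ever holding fewer than one copy."""
    data = source[block_id]          # block has been marked for promotion/demotion
    dest[block_id] = data            # step 1: copy to the destination device
    if dest[block_id] != data:       # step 2: confirm the second copy is intact
        del dest[block_id]
        raise IOError("copy verification failed")
    if fail_before_remove:           # simulated power loss between copy and removal
        raise RuntimeError("power lost before source removal")
    del source[block_id]             # step 3: only now release the source block
```

If the simulated failure hits after the copy, the block is still present on the source device, so an interrupted migration leaves a redundant copy at worst rather than a missing block.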

By moving data in 128KB blocks between the HDD and SSD, Apple enjoys the side benefit of partially defragmenting the SSD with all writes to it. Even though the Fusion Drive will prefer the SSD for all incoming writes (which can include smaller than 128KB, potentially random/pseudo-random writes), any migration from the HDD to the SSD happens as large block sequential writes, which will trigger a garbage collection/block recycling routine in cases of a heavily fragmented drive. Performance of the SSD can definitely degrade over time, but this helps keep it higher than it would otherwise be, given that the SSD is almost always running at full capacity and is the recipient of all sorts of unrelated writes. As I mentioned earlier, I would’ve preferred a controller with more consistent IO latency or for Apple to set aside even more of the PM830’s NAND as spare area. I suspect cost was the deciding factor in sticking with the standard amount of overprovisioning.


127 Comments


  • edlee - Friday, January 18, 2013 - link

    I get the cached solution for fusion. But I would rather just handle the usage myself and have os and applications on SSD and all media on a Raid array.

    SSD for life.
  • Death666Angel - Friday, January 18, 2013 - link

    It looks better than I thought. I'm still not going to use it myself (Windows/Linux user here and I have no trouble managing more than one partition). But it seems better than the usual Windows caching solutions. Still, the non-technical people I know don't need more than a few hundred GB of space on their PC and no one has more than one HDD in their PC anyway. So the easiest way for them (which is what I always recommend) is to have a 256GB SSD and an external 1 to 3TB drive. All their work is on the SSD with daily/weekly backups and photos are on their external HDD (none of those people use the PC to view movies).
  • tipoo - Friday, January 18, 2013 - link

    If you write a huge file, it all gets written to the SSD up to 117GB. But that SSD is filled with other stuff. Won't it be limited by the speed it transfers the old things to the hard drive? How does that work if the files aren't mirrored?
  • name99 - Friday, January 18, 2013 - link

    Read the damn article before posting. ALL those questions are answered there.
  • ltcommanderdata - Friday, January 18, 2013 - link

    Apple still lists the 3TB Fusion drive as incompatible with Boot Camp "at this time". Presumably this is due to how Apple is doing the BIOS emulation with EFI 1.10 and running into the 2.2TB drive size limit. Have you heard any methods to get 3TB Fusion working with Boot Camp or heard whether Apple has a solution in the works?
  • tipoo - Friday, January 18, 2013 - link

    Just curious, I think that's how Readyboost worked. You would have a flash drive immediately start sending data slowly to your computer while the hard drive took its time to seek the larger chunks of data. So I wonder if there is a large queue of data for a Fusion drive to read, it will read from both drives concurrently?
  • tipoo - Friday, January 18, 2013 - link

    "In less than a year Apple could double the size of the NAND used in Fusion Drive at no real change to cost."

    But will they? If the iPods, iPhones, and iPad are any indication, they will more likely pocket the savings. Been a long time since a capacity doubling from them.
  • name99 - Friday, January 18, 2013 - link

    Oh for fscks sake.

    The iPod nano1 came in sizes of 1, 2, 4GB
    The 2nd gen came as 2, 4, 8GB.
    3rd were 4, 8GB
    4th was 4,8, 16GB.
    All at essentially the same retail price.

    Apple has shown a consistent pattern (you also see it in the shuffle, or in iPod Touch), of doubling the storage until they hit a point which seems to cover almost everyone's needs. Then there is a year or two of stasis, then a new product category which requires more storage.

    Next time you want to post blatant nonsense, try to remember that on the internet people WILL call you out when you state bullshit.
  • tipoo - Friday, January 18, 2013 - link

    Feeling self-important today? Yes, that's what I mean, there hasn't been a doubling since the fourth generation Nano. Or does "Been a long time since a capacity doubling from them" mean "they have never ever doubled capacity" in your little world?
  • tipoo - Friday, January 18, 2013 - link

    "Then there is a year or two of stasis, then a new product category which requires more storage."

    Like the iPads, which would be ideal for storing HD video if not for the exorbitant prices of higher capacities, with zero bump for the base price since the first one?
