Management Granularity

Much of Apple’s marketing on Fusion Drive talks about moving data at the file and application level, but in reality data can be moved between the SSD and HDD portions in 128KB blocks.

Ars actually confirmed this a while ago, but I wanted to see for myself. Using fs_usage I got to see the inner workings of Apple's Fusion Drive. Data is moved between drives in 128KB blocks, likely determined by frequency of use of those blocks. Since client workloads tend to be fairly sequential (or pseudo-random at worst) in nature, it's a safe bet that if you're accessing a single LBA within a 128KB block, you're going to be accessing more LBAs in the same space. The migration process seems to happen mostly during idle periods, although I have seen some movement between drives during light IO.
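To make the granularity concrete, here is a minimal sketch of how accesses to individual LBAs could be rolled up into 128KB blocks and ranked by frequency of use. It is an illustration only, not Apple's implementation: the 512-byte sector size and the names `block_of` and `hottest_blocks` are assumptions introduced for this example.

```python
from collections import Counter

SECTOR_BYTES = 512                 # assumption: 512-byte logical sectors
BLOCK_BYTES = 128 * 1024           # migration granularity observed above
LBAS_PER_BLOCK = BLOCK_BYTES // SECTOR_BYTES  # 256 LBAs per 128KB block


def block_of(lba: int) -> int:
    """Return the index of the 128KB block that contains this LBA."""
    return lba // LBAS_PER_BLOCK


def hottest_blocks(accessed_lbas, top_n):
    """Rank 128KB blocks by how many of the given LBA accesses hit them."""
    counts = Counter(block_of(lba) for lba in accessed_lbas)
    return [block for block, _ in counts.most_common(top_n)]
```

Under these assumptions, LBAs 0 through 255 all land in block 0, so a single hot LBA would pull its 255 neighbors along with it when the block is promoted.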

What’s very interesting is just how quickly the migration is triggered after a transfer occurs. As soon as file copy/creation, application launch or other IO activity completes, there’s immediate back and forth between the SSD and HDD. As you fill up the Fusion Drive, the amount of data moved between the SSD and HDD shrinks considerably. Over time I suspect this is what should happen. Infrequently accessed data should settle on the hard drive and what really matters will stay on the SSD. Apple being less aggressive about evicting data from the SSD as the Fusion Drive fills up makes sense.

The migration process itself is pretty simple: data is marked for promotion or demotion, physically copied to the new storage device, and only then removed from its original location. In the event of a power failure during migration there shouldn't be any data loss caused by the Fusion Drive; it looks like the source block is removed only after two copies of the 128KB block are in place. Apple told me as much last year, but it's good to see it for myself.
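The copy-then-remove ordering described above can be sketched as follows. This is a toy model under stated assumptions, not Apple's code: the two drives are plain Python dicts keyed by block number, and `migrate_block`, `PowerFailure` and the `fail_after_copy` switch are hypothetical names used only to simulate an interruption.

```python
class PowerFailure(Exception):
    """Hypothetical stand-in for losing power partway through a migration."""


def migrate_block(block_id, source, destination, fail_after_copy=False):
    """Move one block between two dict-backed 'drives', copy first."""
    data = source[block_id]

    # Step 1: copy the block to the destination device.
    destination[block_id] = data

    # Simulated interruption: the source copy has not been touched yet.
    if fail_after_copy:
        raise PowerFailure("interrupted after copy, before source removal")

    # Step 2: confirm both copies match before giving up the original.
    if destination[block_id] != data:
        del destination[block_id]
        raise IOError("verification failed; source block left in place")

    # Step 3: only now remove the source block.
    del source[block_id]
```

If the simulated failure fires, the block still exists on the source, which is the property that keeps an interrupted migration from losing data.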

By moving data in 128KB blocks between the HDD and SSD, Apple enjoys the side benefit of partially defragmenting the SSD with all writes to it. Even though the Fusion Drive will prefer the SSD for all incoming writes (which can include smaller than 128KB, potentially random/pseudo-random writes), any migration from the HDD to the SSD happens as large block sequential writes, which will trigger a garbage collection/block recycling routine in cases of a heavily fragmented drive. Performance of the SSD can definitely degrade over time, but this helps keep it higher than it would otherwise be, given that the SSD is almost always running at full capacity and is the recipient of all sorts of unrelated writes. As I mentioned earlier, I would’ve preferred a controller with more consistent IO latency or for Apple to set aside even more of the PM830’s NAND as spare area. I suspect cost was the deciding factor in sticking with the standard amount of overprovisioning.


127 Comments


  • name99 - Friday, January 18, 2013 - link

    Yeah, and USING Momentus XT sucks. The experience is horribly uneven.
    Enough stuff comes up fast that you get used to that, but enough stuff comes up slowly that it's REALLY noticeable because you're used to the occasional bursts of speed.

    I've used Momentus, I've used Fusion. There is no comparison.

    In fact (true story) after I replaced the broken HD in a friend's MacBook Pro with a Momentus she told me a week later that she thought the computer was still broken because it seemed to behave so strangely, sometimes feeling really fast, then a little later feeling so slow.

    Now, if Momentus were kitted out with
    - 64GB (maybe even just 32GB) of
    - FAST flash (not the cheap crap used in USB thumb drives) AND
    - cached writes
    it might work well. But that's not the product that Seagate is selling.
  • Death666Angel - Friday, January 18, 2013 - link

    2 of your 3 points are very correct. But they do use SLC which is not the cheap stuff.
  • name99 - Friday, January 18, 2013 - link

    If they do use decent flash, then why don't they cache writes?

    I always assumed it was because their flash (like USB thumb flash) was so crappy that it was slower for random writes than the HD was.
  • kyuu - Saturday, January 19, 2013 - link

    Because there's a lot more to it than just using the right NAND. Also, for the 2nd-gen Momentus XT they were going to release a firmware update that would enable write caching. I'm not sure if that ever happened, haven't followed up on it recently.
  • kyuu - Saturday, January 19, 2013 - link

    That's because the MacBook/MacOS sucks. Not the Momentus XT's fault.

    Been using a Momentus XT in a Windows machine for a long time, had no problems with it being "uneven".

    Also, they sure as hell don't use cheap flash "used in USB thumb drives".
  • ShieTar - Saturday, January 19, 2013 - link

    How would the HDD know what is a file? The OS will just command a drive to write a given data block to Sector X.

    The drive may treat X as a logical address, and reorder data internally, but it has no clue if it is writing a complete file or parts of it, or just writing zeros as ordered by some secure erase software.
  • Subyman - Friday, January 18, 2013 - link

    Any word on how much the migration process increases read/write quantity over a manually managed setup? As for ssd life being longer than hdd life, if we take into account that almost all writes will hit the ssd first and then some will transfer to the hdd this means the hdd is accessed less often. This could level the mean read/write to failure rate to make the hdd even with the ssd, unless migration has an effect that I'm not considering.
  • dimmer - Friday, January 18, 2013 - link

    Did you enable TRIM or not?
  • name99 - Friday, January 18, 2013 - link

    It's a Mac for gods sake. It comes configured correctly (yes, with TRIM enabled) out the box.
  • alanh - Friday, January 18, 2013 - link

    For me, the biggest problem is the added difficulty of doing an upgrade or replacement of storage if it starts getting full or goes bad. From what I've read, the only option is to do a full backup, replace one of the disks, and then do a full restore. I have an '11 MBP with SSD and the DVD replaced with a large HD, so I could, in theory move to a Fusion drive, but it just seems like a risky and annoying proposition.
