Management Granularity

Much of Apple’s marketing on Fusion Drive talks about moving data at the file and application level, but in reality data can be moved between the SSD and HDD portions in 128KB blocks.

Ars actually confirmed this a while ago, but I wanted to see for myself. Using fs_usage I got to see the inner workings of Apple's Fusion Drive. Data is moved between drives in 128KB blocks, likely determined by frequency of use of those blocks. Since client workloads tend to be fairly sequential (or pseudo-random at worst) in nature, it's a safe bet that if you're accessing a single LBA within a 128KB block, you're going to be accessing more LBAs in that same block. The migration process seems to happen mostly during idle periods, although I have seen some movement between drives during light IO.
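To make the granularity concrete, here is a minimal sketch of how accesses to individual LBAs could be grouped into 128KB blocks and ranked by frequency of use. It is not Apple's algorithm, which isn't public. The 512-byte sector size, the function names and the sample trace are all assumptions made for illustration.

```python
from collections import Counter

SECTOR_BYTES = 512                            # assumed logical sector size
BLOCK_BYTES = 128 * 1024                      # migration granularity described above
LBAS_PER_BLOCK = BLOCK_BYTES // SECTOR_BYTES  # 256 LBAs per 128KB block


def block_index(lba: int) -> int:
    """Return the index of the 128KB block that contains this LBA."""
    return lba // LBAS_PER_BLOCK


def hottest_blocks(accessed_lbas, top_n=1):
    """Count accesses per 128KB block and return the most-used blocks."""
    counts = Counter(block_index(lba) for lba in accessed_lbas)
    return counts.most_common(top_n)


# Hypothetical trace: a sequential run inside one block plus two stray reads.
trace = list(range(1024, 1040)) + [5, 70000]
print(hottest_blocks(trace))  # [(4, 16)]
```

The sequential run of 16 LBAs lands entirely in block 4, so that block would be the obvious promotion candidate, while the two stray reads each touch a different block only once.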

What’s very interesting is just how quickly the migration is triggered after a transfer occurs. As soon as file copy/creation, application launch or other IO activity completes, there’s immediate back and forth between the SSD and HDD. As you fill up the Fusion Drive, the amount of data moved between the SSD and HDD shrinks considerably. Over time I suspect this is what should happen. Infrequently accessed data should settle on the hard drive and what really matters will stay on the SSD. Apple being less aggressive about evicting data from the SSD as the Fusion Drive fills up makes sense.

The migration process itself is pretty simple: data is marked for promotion or demotion, physically copied to the new storage device, and only then removed from its original location. In the event of a power failure during migration there shouldn't be any data loss caused by the Fusion Drive; it looks like the source block is removed only after two copies of the 128KB block are in place. Apple told me as much last year, but it's good to see it for myself.
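The copy-then-remove ordering can be sketched in a few lines. This is only a toy model of the behavior observed above, not Apple's implementation: plain dictionaries stand in for the two devices, and `migrate_block`, `PowerFailure` and the `fail_after_copy` flag are hypothetical names introduced for the example.

```python
class PowerFailure(Exception):
    """Simulated power loss in the middle of a migration (hypothetical)."""


def migrate_block(block_id, source, destination, fail_after_copy=False):
    """Copy a block to the destination first, and only then remove the source."""
    data = source[block_id]
    destination[block_id] = data       # step 1: physical copy to the new device
    if fail_after_copy:
        raise PowerFailure(block_id)   # interrupted here: both copies still exist
    del source[block_id]               # step 2: source removed only after the copy


# Dictionaries stand in for the HDD and SSD; block 7 is one 128KB block.
hdd = {7: b"x" * (128 * 1024)}
ssd = {}

try:
    migrate_block(7, hdd, ssd, fail_after_copy=True)
except PowerFailure:
    pass
print(7 in hdd)            # True: the source copy survived the interruption

migrate_block(7, hdd, ssd)
print(7 in ssd, 7 in hdd)  # True False
```

Because the source is deleted last, an interruption at any earlier point leaves at least one intact copy of the block, which matches the no-data-loss behavior described above.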

By moving data in 128KB blocks between the HDD and SSD, Apple enjoys the side benefit of partially defragmenting the SSD with all writes to it. Even though the Fusion Drive will prefer the SSD for all incoming writes (which can include smaller than 128KB, potentially random/pseudo-random writes), any migration from the HDD to the SSD happens as large block sequential writes, which will trigger a garbage collection/block recycling routine in cases of a heavily fragmented drive. Performance of the SSD can definitely degrade over time, but this helps keep it higher than it would otherwise be, given that the SSD is almost always running at full capacity and is the recipient of all sorts of unrelated writes. As I mentioned earlier, I would’ve preferred a controller with more consistent IO latency or for Apple to set aside even more of the PM830’s NAND as spare area. I suspect cost was the deciding factor in sticking with the standard amount of overprovisioning.


127 Comments


  • Richard Fairbanks - Saturday, January 19, 2013 - link

    Thanks, Anand, for yet another timely article!

    I do almost all my work in code (i.e. text) with few graphics. I want to ensure reliability in case of disk failure.

    Thus I am considering getting a 2012 Mac mini, opening it up, and adding a 256GB Samsung 840 Pro, in addition to the default 1TB HDD. (The 256GB capacity would allow me a 25+% spare area.) This is my ideal configuration for many reasons.

    If I partition the HDD to match the 256GB SSD (leaving ~750GB for random, non-critical data), is it possible to create a RAID 1 array between the SSD and the 256GB HDD partition? (Full backups are made daily.)

    In theory, this would allow all the array reads to come from the SSD for fastest response, and still maintain a mirrored HDD that could be booted from should the SSD fail. (If only the HDD partition could be a ZEVO ZFS format! ;-) )

    Thoughts? Thanks!!
  • NCM - Saturday, January 19, 2013 - link

    Richard asks: "If I partition the HDD to match the 256GB SSD (leaving ~750GB for random, non-critical data), is it possible to create a RAID 1 array between the SSD and the 256GB HDD partition?"

    That's an interesting question. I think the problem would be that there is no "master" disk in a RAID 1 array. Each slice is treated equally. You're hoping that read/write activity would be first served by the faster SSD, with the HD slice catching up in the background on its own time. I don't know that there's any evidence it would work like that, or, putting it another way, that anyone has written a RAID controller to make it happen that way.

    It would be interesting to try it out.

    We have some Mac Pro towers that I've set up with SSD boot/application drives, but we rely on conventional Time Machine backups to an internal HD rather than a RAID mirror.
  • name99 - Saturday, January 19, 2013 - link

    It is possible to create a RAID 1 in the way you are thinking using AppleRAID.
    What you want to do is simple enough that you can do it in Disk Utility using the GUI.
    If you really insist on going hardcore, hit Terminal and look at diskutil.
    And you can boot off such an AppleRAID system.

    HOWEVER I suspect you will be very unhappy with the results. A system like that can deliver snappy reads (because they'll mostly come from the SSD) but writes will be gated by the HD, and the system will frequently feel like an HD system.

    It is ALSO possible that you won't even get the read speeds you imagine.
    When I used AppleRAID in this way (mirroring two HDs) a few years ago, it seemed to me that reads were also slower, and my assumption was that the system, assuming you cared primarily about data correctness (that's why you were mirroring rather than striping), performed both reads and compared the results before passing them up to the file system. Which suggests that your reads will ALSO be gated by the HD performance.

    I'm also not sure what problem you believe you are solving with this. SSD failures are simply not that common. You can protect against them using Time Machine. If you REALLY are scared, you can have Time Machine alternate between two (or more) different backup drives.

    It seems like a huge amount of pain to solve a problem that barely exists and that can be protected against much better in other ways.
  • name99 - Saturday, January 19, 2013 - link

    To add to what I said, the AppleRAID mirroring stuff DOES work in terms of reliability, in that if one disk dies, you can just pop it out, replace it, and have the other disk copy to it. But, as I said, you pay a substantial hit in performance for this privilege.
  • cjb110 - Saturday, January 19, 2013 - link

    Gaming would have been an interesting 'use' case for the Fusion. When you're playing you obviously want the fast access of the SSD, but unless it's your favourite game, it might not get used much and could be moved to the HDD.

    Also, games being much larger 'applications' would quickly fill the SSD if the Fusion just had a simple 'If App = On SSD' rule.
  • klaudyuxxx - Saturday, January 19, 2013 - link

    They reinvented the wheel. 128 GB flash + 1-3TB HDD fused into a single volume?! AKA HYBRID SAMSUNG HARD DRIVES
  • NCM - Saturday, January 19, 2013 - link

    You really haven't bothered to read the article, have you? Or perhaps it's a reading comprehension issue.
  • nerd1 - Saturday, January 19, 2013 - link

    Typical Apple - charging $$$$ for non-tech-savvy people.

    It's way better to have a proper SSD (most laptops and desktops now have an mSATA port) in terms of both performance and cost. Yes, I know that swapping the HDD of any Apple device kills the warranty and most Apple customers don't know how to upgrade a single component.....
  • Andhaka - Monday, January 21, 2013 - link

    Nope, swapping the HDD with an SSD does not kill the warranty, and many Apple users do that (I have done it on a 4-year-old MacBook).
    But if other people find it better to pay for the Fusion solution (and a good solution it seems to be), good for them.

    Cheers
  • pichemanu - Saturday, January 19, 2013 - link

    Hi Anand,

    I saw that in order to test a "pure SSD" setup you connected an 830 SSD to the iMac over USB 3. As far as I know the best transfer rate over USB 3 is around 250 MB/s and the worst is, well... terrible.

    Considering the best case scenario for the iMac:
    -USB 3 connected SSD would do 250 MB/s
    -SATA 3 connected SSD would do 322 MB/s (taken from your article)

    The performance would be 6.94 for the Fusion Drive and 10.19 for a "pure SSD". That moves the "pure SSD" result from 114% of the Fusion Drive's performance to 147%.

    If, on the other hand, your USB-connected SSD did not write at its maximum and a SATA 3 connected SSD would (that is 350 MB/s for the Samsung 830 on an Intel Z77 SATA 3 port), that difference would skyrocket.

    Did you check that on your particular workload the USB 3 connection was not a bottleneck?

    Thank you.
