Management Granularity

Much of Apple's marketing around Fusion Drive talks about moving data at the file and application level, but in reality data is moved between the SSD and HDD portions in 128KB blocks.

Ars actually confirmed this a while ago, but I wanted to see for myself. Using fs_usage I got a look at the inner workings of Apple's Fusion Drive. Data is moved between drives in 128KB blocks, likely based on how frequently those blocks are accessed. Since client workloads tend to be fairly sequential (or pseudo-random at worst) in nature, it's a safe bet that if you're accessing a single LBA within a 128KB block, you're going to be accessing more LBAs in the same space. The migration process seems to happen mostly during idle periods, although I have seen some movement between drives during light IO.
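
To make the heuristic concrete, here's a toy sketch (in Python) of frequency-based promotion at 128KB granularity. The threshold, names and policy are my own invention for illustration; Apple hasn't published the actual algorithm:

```python
# Toy model of block-level tiering at 128KB granularity. The threshold
# and policy are invented for illustration; Apple has not published
# Fusion Drive's actual algorithm.
from collections import Counter

SECTOR_BYTES = 512                      # one LBA addresses one 512-byte sector
SECTORS_PER_BLOCK = (128 * 1024) // SECTOR_BYTES
PROMOTE_THRESHOLD = 4                   # accesses before a block is promoted

heat = Counter()                        # access counts per 128KB block
on_ssd = set()                          # blocks currently resident on the SSD

def record_access(lba):
    # A hit on any LBA heats up the whole surrounding 128KB block,
    # matching the observation that neighboring LBAs tend to follow.
    heat[lba // SECTORS_PER_BLOCK] += 1

def idle_migration_pass():
    # Run during idle time: promote hot blocks, demote cooled-off ones.
    for block, count in heat.items():
        if count >= PROMOTE_THRESHOLD:
            on_ssd.add(block)           # the 128KB copy HDD -> SSD goes here
        elif block in on_ssd:
            on_ssd.discard(block)       # and the SSD -> HDD demotion here
    heat.clear()                        # start the next observation window
```

In this toy model a single small read warms the rest of its 128KB block, which is exactly the sequential-locality bet described above.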

What's very interesting is just how quickly migration is triggered after a transfer occurs. As soon as a file copy/creation, application launch or other IO activity completes, there's immediate back and forth between the SSD and HDD. As you fill up the Fusion Drive, the amount of data moved between the SSD and HDD shrinks considerably. I suspect this is what should happen over time: infrequently accessed data settles on the hard drive, while what really matters stays on the SSD. Apple being less aggressive about evicting data from the SSD as the Fusion Drive fills up makes sense.
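
One plausible way to express that fill-dependent behavior, purely as a guess at the shape of the policy rather than Apple's actual logic:

```python
# Invented fill-aware eviction knob: the fuller the volume, the higher
# the bar for shuffling a block off the SSD. Numbers are illustrative.
def eviction_aggressiveness(fill_fraction, base=4.0):
    """Blocks with fewer than the returned number of recent accesses are
    candidates for demotion; near 100% full, churn approaches zero."""
    return base * (1.0 - fill_fraction)

for fill in (0.1, 0.5, 0.9):
    print(f"{fill:.0%} full -> demote blocks with fewer than "
          f"{eviction_aggressiveness(fill):.1f} recent accesses")
```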

The migration process itself is pretty simple: data is marked for promotion/demotion, physically copied to the new storage device, and only then is the source copy invalidated. In the event of a power failure during migration there shouldn't be any data loss caused by the Fusion Drive; it looks like the source block is removed only after two copies of the 128KB block are in place. Apple told me as much last year, but it's good to see it for myself.
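
A minimal sketch of that ordering in Python; the paths and the release_source_copy helper are hypothetical stand-ins for the real block-mapping metadata:

```python
import os

BLOCK = 128 * 1024   # Fusion Drive's reported migration granularity

def migrate_block(src_path, dst_path, offset):
    """Copy one 128KB block, making it durable at the destination before
    the source copy is invalidated. A power failure at any point leaves
    at least one intact copy of the block."""
    with open(src_path, "rb") as src:
        src.seek(offset)
        data = src.read(BLOCK)

    with open(dst_path, "r+b") as dst:
        dst.seek(offset)
        dst.write(data)
        dst.flush()
        os.fsync(dst.fileno())   # second copy is now on stable storage

    # Only now is the source copy released. In a real tiering layer this
    # would be a metadata update to the block map, not a file operation.
    release_source_copy(src_path, offset)

def release_source_copy(path, offset):
    pass  # hypothetical placeholder for the mapping-table update
```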

By moving data in 128KB blocks between the HDD and SSD, Apple enjoys the side benefit of partially defragmenting the SSD with every migration. Even though the Fusion Drive will prefer the SSD for all incoming writes (which can include writes smaller than 128KB, potentially random or pseudo-random in nature), any migration from the HDD to the SSD happens as large-block sequential writes, which will trigger a garbage collection/block recycling routine on a heavily fragmented drive. Performance of the SSD can definitely degrade over time, but this helps keep it higher than it otherwise would be, given that the SSD is almost always running at full capacity and is the recipient of all sorts of unrelated writes. As I mentioned earlier, I would've preferred a controller with more consistent IO latency, or for Apple to set aside even more of the PM830's NAND as spare area. I suspect cost was the deciding factor in sticking with the standard amount of overprovisioning.
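
That "standard amount" works out to roughly 7%, which is just the usual gigabyte-vs-gibibyte gap; a quick back-of-the-envelope check (capacities assumed, not measured from the drive):

```python
# Standard SSD overprovisioning arithmetic: a "128GB" drive advertises
# decimal gigabytes but typically carries 128 GiB of raw NAND.
raw_nand = 128 * 2**30        # 137,438,953,472 bytes of NAND
advertised = 128 * 10**9      # 128,000,000,000 bytes exposed to the user
spare = raw_nand - advertised
print(f"spare area: {spare / raw_nand:.1%}")   # ~6.9% of the NAND
```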

127 Comments

  • EnzoFX - Saturday, January 19, 2013 - link

    Yes, exactly. This is the point of computers. It always bothers me when self-proclaimed experts come on tech sites dismissing anything of the sort. I can imagine them saying "Well, just do RAID" or "just manage the files yourself" and then declaring a solution like this unnecessary, when they clearly don't understand the point. They only work to slow such efforts down.
  • name99 - Saturday, January 19, 2013 - link

    If your friend has a Mac, and if they can borrow enough temporary storage (to copy and hold the files while making the changeover), what I would recommend is that they stripe their 3 HDs together as a single volume. This can be done easily enough using the Disk Utility GUI.
    (Honestly, they should have enough temporary storage anyway, in the form of a Time Machine backup.)

    This will give a single volume (less moving around from one place to another) with 3x the bandwidth (as long as each hard drive is connected to a distinct USB or FW port).

    [If the drives are of different sizes, and you don't want to waste the extra space, it is still possible to use them this way, but you will need to use the command line. Assume you have two drives, one of 300GB and one of 400GB (the extension to more drives is obvious).
    You partition the 400GB drive into a 300GB and a 100GB partition.
    You then
    (a) create a striped RAID from the 300GB drive and the 300GB partition,
    (b) convert the 100GB partition to a (single-drive) concatenated RAID volume [this step is not obviously necessary, but is key], and
    (c) create a concatenated volume from the volume created in (a) and the one created in (b).
    This will give you 600GB of striped storage, plus 100GB at the end of slower non-striped storage. Can't complain. A rough sketch of the corresponding diskutil commands follows this comment.]

    Not a perfect solution, but a substantial improvement on the situation right now.

    I don't know the state of the art for SW RAID built into Windows so I can't comment on that.
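
    A rough, untested sketch of steps (a)-(c) above, driving macOS's diskutil appleRAID interface from Python. The set names, the JHFS+ format choice and the device identifiers (disk2, disk3s1, disk3s2, disk4, disk5) are placeholders; the real identifiers will differ (check diskutil list after each step), and these commands destroy data on the target disks:

    ```python
    # Illustrative sketch of the mixed-size striping recipe above.
    # Device identifiers are hypothetical; each RAID set created shows
    # up as a new virtual disk whose identifier you must look up.
    import subprocess

    def raid_create(kind, name, *devices):
        cmd = ["diskutil", "appleRAID", "create", kind, name, "JHFS+", *devices]
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # (a) stripe the 300GB drive (disk2) with the 300GB partition (disk3s1);
    #     suppose the resulting virtual disk appears as disk4
    raid_create("stripe", "FastSet", "disk2", "disk3s1")

    # (b) wrap the leftover 100GB partition (disk3s2) as a single-member
    #     concatenated set; suppose it appears as disk5
    raid_create("concat", "SlowSet", "disk3s2")

    # (c) concatenate the two virtual disks into one 700GB volume
    raid_create("concat", "BigVolume", "disk4", "disk5")
    ```
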
  • guidryp - Friday, January 18, 2013 - link

    Really this seems like a solution for the lazy or technically naive.

    Manually managing your SSD/HD resources lets you speed things up according to exactly your own priorities, instead of having some software guess and make a bunch of unnecessary copies to/from the SSD/HD.

    You get the full performance of pure SSD where you want it, fewer hiccups from background reorganization, and less unnecessary stress on the SSD.

    Also, it isn't exactly difficult to manage manually: use the SSD as your main OS/application drive and for whatever else you deem important to speed up.
  • zlandar - Friday, January 18, 2013 - link

    "Really this seems like a solution for the lazy or technically naive."

    If everyone were technologically literate, spam wouldn't exist and computer companies wouldn't need customer service for stupid questions.
  • jeffkibuule - Friday, January 18, 2013 - link

    Aren't a lot of solutions built for the technologically naive?
  • NCM - Friday, January 18, 2013 - link

    Apple's principal market, especially for the iMac, is home and small business users. Once again dragging out the familiar, but still applicable, automotive metaphor: most people don't want to work on their cars. They just want to drive reliably to wherever they're going. That's the need Apple's FD addresses, and it seems to do so rather well.

    Sure, the price adder is a bit higher than one might hope, but probably not so much that it'll frighten away prospective buyers.

    Interestingly though, it lost our sale. I was ready to order another iMac with a 256GB SSD and a 1TB HD for the office. We keep most of the files on the server, but a 128GB SSD application/boot drive is a bit tight. However, a 256GB SSD is just right, allowing plenty of free space to maintain SSD performance. The additional 1TB HD is then repurposed for local Time Machine backup.

    But that's not an option for the new iMac, which offers only HD or FD. And I'm not about to make a risky, warranty-busting expedition into its innards in order to roll my own SSD solution (although my own MacBook Pro has a self-installed 512GB SSD).

    Instead I ordered up a 256GB SSD Mac mini, plus what turned out to be a very nice 24" 16:10 IPS monitor from HP. Although I would have preferred the all-in-one iMac solution for a cleaner installation without gratuitously trailing cables, the Mac mini with SSD, i7 and 8GB RAM options is fast and effective.
  • ThreeDee912 - Friday, January 18, 2013 - link

    Wasn't this the kind of thing said about virtual memory in the '60s and '70s? Some people back then thought manually managing the location of everything in memory would be more efficient, until some guys at IBM (or was it Bell Labs?) showed you saved a heck of a lot more time by letting the machine do it instead of trying to move things around yourself.

    This Fusion Drive really does remind me of virtual memory: RAM and HDD mapped in a way so they appear as a single type of memory. Most stuff gets placed into RAM first, some stuff spills over onto the HDD, and data gets copied back and forth depending on how frequently it's used. The fast RAM is the first priority, but the HDD is there as a kind of backup.

    It's a bit different from a caching setup, where the computer has to "guess" a bit more about what should really be on the SSD; there, the HDD is the priority and the SSD is secondary.

    And just like with virtual memory, none of this would matter if you had a huge amount of RAM or a very large SSD.
  • web2dot0 - Saturday, January 19, 2013 - link

    Great comment, ThreeDee912. Someone with a rational mind.

    To all those "experts" who claim that it's better to manage it yourself: you could also write every program in ASM. Yours would be fast and small, but I'd be done with the project in 1/10th the time. The point is, the product is not meant to provide "absolutely the best possible configuration". It's meant to be the best all-around solution.

    If you guys still don't get it, well, I guess all those years of education didn't really help, because logical people think rationally.
  • psyq321 - Monday, January 21, 2013 - link

    Hmm... is it just me who finds it slightly disturbing that we are comparing memory management (and, in some posts here, C vs. assembly coding) with deciding how to organize documents/files?

    I would say the intellectual investment involved isn't really comparable.

    Which does not mean that I have anything against SSD caching solutions - on the contrary, I see nothing wrong with the ability to transparently manage the optimal location for content.
  • TrackSmart - Friday, January 18, 2013 - link

    A month ago I would have said the same thing, but see my other post to understand why more people need this than you'd think. The proportion of people who can handle manually segregating their files is much, much smaller than most of us realize. I have three systems set up with both an SSD and an HDD and have no trouble. But we are a tiny, tiny minority of users.
