The More Complicated (yet predictable) SSD Lottery

Apple continues to use a custom form factor and interface for the SSDs in the MacBook Air. This generation Apple opted for a new connector, so you can't swap drives between 2011 and 2012 models. I'd always heard reports of issues with the old connector from a manufacturing standpoint, so the change makes sense. The new SSD connector looks to be identical to the one used by the Retina Display equipped MacBook Pro, although the rest of the SSD PCB is different.


The Toshiba Branded SandForce SF-2200 controller in the 2012 MacBook Air - iFixit

As always there are two SSD controller vendors populating the drives in the new MacBook Air: Toshiba and Samsung. The Samsung drives use the same PM830 controller found in the 2012 MacBook Pro as well as the MacBook Pro with Retina Display. The Toshiba drives use a rebranded SandForce SF-2200 controller. Both solutions support 6Gbps SATA and both are capable of reaching Apple's advertised 500MB/s sequential access claims.

While in the past we've recommended the Samsung over the Toshiba based drives, things are a bit more complicated this round because of the controller vendor Toshiba decided to partner with.


The write/recycle path in NAND flash based SSD

Samsung's PM830 works just like any other SSD controller. To the OS it presents itself as storage with logical block addresses starting from 0 all the way up to the full capacity of the drive. Reads and writes come in at specific addresses, and the controller maps those addresses to blocks and pages in its array of NAND flash. Every write that comes in results in data written to NAND. Those of you who have read our big SSD articles in the past know that NAND is written to at the page level (these days pages are 8KB in size), but can only be erased at the block level (typically 512 pages, or 4MB). This write/erase mismatch, combined with the fact that each block has a finite number of program/erase cycles it can endure, is what makes building a good SSD controller so difficult. In the best case scenario, the PM830 will maintain a 1:1 ratio of what the OS tells it to write to NAND and what it actually ends up writing. In the event that the controller needs to erase and re-write a block to optimally place data, it will actually end up writing more to NAND than the OS requested of it. This is referred to as write amplification, and is responsible for the performance degradation over time that you may have heard of when it comes to SSDs.
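The page/block arithmetic and the write amplification ratio described above can be sketched in a few lines of Python. This is only an illustration: the page and block sizes are the typical figures quoted in this article, and the function names and the relocation scenario are hypothetical, not a model of how the PM830 actually schedules its writes.

```python
PAGE_SIZE_KB = 8        # page size quoted in the article
PAGES_PER_BLOCK = 512   # typical pages per erase block, per the article


def block_size_mb(page_size_kb=PAGE_SIZE_KB, pages_per_block=PAGES_PER_BLOCK):
    """Size of one erase block in MB."""
    return page_size_kb * pages_per_block / 1024


def write_amplification(host_kb, nand_kb):
    """Ratio of data actually written to NAND to data the host asked to write."""
    if host_kb <= 0:
        raise ValueError("host_kb must be positive")
    return nand_kb / host_kb


def relocation_scenario(host_pages, relocated_pages):
    """Hypothetical case: to place `host_pages` new pages, the controller also
    has to copy `relocated_pages` still-valid pages out of the block it erases."""
    host_kb = host_pages * PAGE_SIZE_KB
    nand_kb = (host_pages + relocated_pages) * PAGE_SIZE_KB
    return write_amplification(host_kb, nand_kb)
```

In this toy model, `write_amplification(8, 8)` is the ideal 1:1 case, while `relocation_scenario(1, 511)` is an extreme case where a single 8KB host write forces a whole 4MB block to be rewritten.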


Write Amplification

For most client workloads, with sufficient free space on your drive, Samsung's PM830 can keep write amplification reasonably low. If you fill the drive and/or throw a fragmented enough workload at it, the PM830 doesn't actually behave all that gracefully. Very few controllers do, but the PM830 isn't one of the best in this regard. My only advice is to try and keep around 20% of your drive free at all times. You can get by with less if you are mostly reading from your drive or if most of your writes are just big sequential blocks (e.g. copying big movies around). I explain the relationship between free space and write amplification here.

Write Amplification vs. Spare Area, courtesy of IBM Zurich Research Laboratory

The Toshiba controller works a bit differently. As I already mentioned, Toshiba's controller is actually a rebranded SandForce controller. SandForce's claim to fame is the ability to commit less data to NAND than your OS writes to the drive. The controller achieves this by using a hardware accelerated compression/data de-duplication engine that sees everything in the IO stream.

The drive still presents itself as traditional storage with an array of logical block addresses. The controller still keeps track of mapping LBAs to NAND pages and blocks. However, because of the compression/dedupe engine, not all data that's written to the controller is actually written to NAND. Anything that's compressible is compressed before being written. It's decompressed on the fly when it's read back. All of the data is still tracked, and the drive still is and appears to be the capacity that is advertised (you don't get any extra space); you just get extra performance. After all, writing nothing is always faster than writing something.

Writing less data to NAND can improve performance over time by keeping write amplification low. There are also impacts on NAND endurance, but as I've shown in the past, endurance isn't a concern for client drives and usage models. Writing less also results in a slight reduction in component count: there's no external DRAM found on SandForce based drives. The PM830 SSD features a 256MB DDR2 device on-board, while the Toshiba based drive has nothing - just NAND and the controller. This doesn't end up making the Toshiba drive substantially cheaper as SandForce instead charges a premium for its controller. In the case of the PM830, both user data and LBA-to-NAND mapping tables are cached in DRAM. In the case of the Toshiba drive, a smaller on-chip cache is used since there's typically less data being written to the NAND itself.

SandForce's approach is also unique in that performance varies depending on the composition of the data written to the drive.

PC users should be well familiar with SandForce's limitations, but this is the first time that Apple has officially supported the controller under OS X. As such I thought I'd highlight some of the limitations so everyone knows exactly what they're getting into.

Any data that's random in composition, or already heavily compressed, isn't further reduced by Toshiba's SandForce controller. As SandForce's architecture is designed around the assumption that most of what we interact with is easily compressible, when a SF controller encounters data that can't be compressed it performs a lot slower.

Apple SSD Comparison - 128KB Sequential Read (QD1)
Special thanks to AnandTech reader KPOM for providing the 256GB Samsung results

Apple SSD Comparison - 4KB Random Read (QD3)

The performance impact is pretty much limited to writing. We typically use Iometer to measure IO performance as it's an incredibly powerful tool. You can define transfer size, transfer locality (from purely sequential all the way to purely random) and even limit your tests to specific portions of the drive, among other features. Later versions of Iometer introduced the ability to customize the composition of each IO transfer. For simplicity, whenever Iometer goes to write anything to disk by default, it writes a series of repeating bytes (all 0s, all 1s, etc...). Prior to SandForce based SSDs this didn't really matter. SandForce's engine will reduce these IOs to their simplest form. A series of repeating bytes can easily be represented in a smaller form (one byte and a record of how many times it repeats). With Iometer left at its default settings, SandForce drives look amazing - even faster than the PM830 based Samsung drive that Apple uses. Even more impressive, since very little data is actually being written to the drive, you can run default Iometer workloads for hours (if not days) on end without any degradation in performance. Doing so only tells us part of the story. While frequently used OS and application files are easily compressed, most files aren't.
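The "one byte and a record of how many times it repeats" idea is run-length encoding. The sketch below shows it in Python purely as an illustration of why a buffer of repeating bytes collapses to almost nothing. SandForce's actual compression and de-duplication algorithms are not public, so nothing here should be read as the controller's real method, and the function names are made up for this example.

```python
from typing import List, Tuple


def rle_encode(data: bytes) -> List[Tuple[int, int]]:
    """Collapse runs of identical bytes into (byte_value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][0] == value:
            # Same byte as the previous one: extend the current run.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # Different byte: start a new run of length 1.
            runs.append((value, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> bytes:
    """Expand (byte_value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in runs)
```

A 4KB write of all zeros, like the default Iometer pattern described above, encodes to the single pair `(0, 4096)`, while data with no repeated neighbours gains nothing from this scheme.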

Thankfully, later versions of Iometer include the ability to use random data in each transfer. There's still room for some further compression or deduplication, but it's significantly reduced. In the write speed charts below you'll see two bars for the Toshiba based SSD: the one marked incompressible uses Iometer's random data setting, while the other uses the default write pattern.

Apple SSD Comparison - 128KB Sequential Write (QD1)

Apple SSD Comparison - 4KB Random Write (8GB LBA Space - QD3)

When fed easily compressible data, the Toshiba/SandForce SSD performs insanely well. Even at low queue depths it's able to hit Apple's advertised "up-to" performance spec of 500MB/s. Random write performance isn't actually as good as Samsung's, but it's more likely to maintain these performance levels over time.

Therein lies the primary motivator behind SandForce's approach to flash controller architecture. Large sequential transfers are more likely to be heavily compressed already (e.g. movies, music, photos), while the small, pseudo-random accesses are more likely to be easily compressible. The former is rather easy for an SSD controller to write at high speeds. Break up the large transfer, stripe it across all available NAND die, write as quickly as possible. The mapping from logical block addresses to pages in NAND flash is also incredibly simple. Fewer entries are needed in mapping tables, making the read and write of these large files incredibly easy to track/manage. It's the small, pseudo-random operations that cause problems. The controller has to combine a bunch of unrelated IOs in order to get good performance, which unfortunately leaves the array of flash in a highly fragmented state - bringing performance down for future IO operations. If SandForce's compression can reduce the number of these small IOs (which it manages to do very well in practice), then the burden really shifts to dealing with large sequential transfers - something even the worst controllers can do well.

It's really a very clever technology, one that has been unfortunately marred by a bunch of really bad firmware problems (mostly limited to PCs it seems).

The downside in practice is the performance when faced with these incompressible workloads. Our 4KB random write test doesn't actually drop in performance, but if we ran it for long enough you'd see a significant decrease in performance. The sequential write test however shows an immediate reduction of more than half. If you've been wondering why your Toshiba SSD benchmarks slower than someone else's Samsung, check to see what sort of data the benchmark tool is writing to the drive. The good news is that even in this state the Toshiba drive is faster than the previous generation Apple SSDs; the bad news is that the new Samsung based drive is significantly quicker.

What about in the real world? I popped two SSDs into a Promise Pegasus R6, created a RAID-0 array, and threw a 1080p transcode of the Bad Boys Blu-ray disc on the drive. I then timed how long it took to copy the movie to the Toshiba and Samsung drives over Thunderbolt:

Real World SSD Performance with Incompressible Data

Copy 13870MB H.264 Movie    128GB Toshiba SSD    512GB Samsung SSD
Transfer Time               59.97 s              31.59 s
Average Transfer Rate       231.3 MB/s           439.1 MB/s

The results almost perfectly mirrored what Iometer's incompressible tests showed us (which is why I use those tests so often: they do a good job of modeling the real world). The Samsung based Apple SSD is able to complete the file copy in about half the time of the Toshiba drive. Pretty much any video you'd have on your machine will be heavily compressed, and as a result will deliver the worst case performance on the Toshiba drive.
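For anyone who wants to check the table's arithmetic, the average transfer rates follow directly from the file size and the measured copy times. The snippet below is a trivial sketch; the helper name is made up, and the only inputs are the figures from the table.

```python
def transfer_rate_mb_s(size_mb, seconds):
    """Average transfer rate in MB/s for a copy of `size_mb` taking `seconds`."""
    return size_mb / seconds


FILE_MB = 13870  # size of the H.264 movie from the table

toshiba = transfer_rate_mb_s(FILE_MB, 59.97)  # 128GB Toshiba SSD
samsung = transfer_rate_mb_s(FILE_MB, 31.59)  # 512GB Samsung SSD

# Fraction of the Toshiba copy time that the Samsung drive needed.
time_ratio = 31.59 / 59.97

print(round(toshiba, 1), round(samsung, 1), round(time_ratio, 2))
```

The rounded rates match the table, and the time ratio of roughly 0.53 is where "about half the time" comes from.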

Keep in mind that to really show this difference I had to have a very, very fast source for the transfer. Unless you've got a 6Gbps SSD over USB 3.0 or Thunderbolt, or a bunch of hard drives you're copying from, you won't see this gap. The difference is also less pronounced if you're copying from and to the same drive. Whether or not this matters to you really depends on how often you move these large compressed files around. If you do a lot of video and photo work with your Mac, it's something to pay attention to.

There's another category of users who will want to be aware of what they're getting into with the Toshiba based drive: anyone who uses FileVault or other full disk encryption software.

Remember, SandForce's technology only works on files that are easily compressed. Good encryption should make every location on your drive look like a random mess, which wreaks havoc on SandForce's technology. With FileVault enabled, all transfers look incompressible - even those small file writes that I mentioned earlier are usually quite compressible.
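The effect of encryption on compressibility is easy to demonstrate in software. The sketch below uses Python's zlib as a stand-in compressor and a toy SHA-256 based keystream as a stand-in cipher. Both are assumptions made for illustration: this is not FileVault's actual encryption, not SandForce's engine, and the toy cipher should never be used to protect real data.

```python
import hashlib
import zlib


def toy_keystream(key: bytes, length: int) -> bytes:
    """Pseudo-random bytes from SHA-256 in counter mode (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    stream = toy_keystream(key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data)) / len(data)


plaintext = b"small, repetitive log line\n" * 2000
ciphertext = toy_encrypt(b"hypothetical key", plaintext)

print(compression_ratio(plaintext))   # far below 1: highly compressible
print(compression_ratio(ciphertext))  # about 1: nothing left to compress
```

The same highly repetitive input that shrinks to a tiny fraction of its size before encryption does not shrink at all afterwards, which is the situation a compressing controller faces on an encrypted volume.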

After enabling FileVault I ran our Iometer write tests on the drives again. Performance is understandably impacted:

Apple SSD Comparison - 128KB Sequential Write (QD1)

Also look at what happens to our 4KB random write test if we repeat it a few times back to back:

Impact of FileVault on SandForce/Toshiba SSD

That trend will continue until the drive's random write performance is really bad. Sequential write passes will restore performance up to ~250MB/s, but it takes several passes to get it there:

Recovering Performance with Sequential Writes after Incompressible Rand Write

If you're going to be using FileVault, stay away from the Toshiba drive.

This brings us to the next problem: how do you tell what drive you have?

As of now Apple has two suppliers for the SSD controllers in all of its 2012 Macs: Toshiba and Samsung. If you run System Information (click the Apple icon in the upper left > About This Mac > System Report) and select Serial ATA you'll see the model of your SSD. Drives that use Toshiba's 6Gbps controller are labeled Apple SSD TSxxxE (where xxx is your capacity, e.g. TS128E for a 128GB drive), while 6Gbps Samsung drives are labeled Apple SSD SMxxxE. Unfortunately this requires you to have already purchased and opened up your system. It's a good thing that Apple stores are good about accepting returns.
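If you'd rather script the check, the naming scheme above is simple enough to parse. The Python sketch below takes the model string as text (the function name and regular expression are made up for this example) and only recognizes the 6Gbps TSxxxE/SMxxxE labels described here, so anything else returns None. On OS X the same Serial ATA section that System Information shows can also be printed from Terminal with `system_profiler SPSerialATADataType`.

```python
import re

# Matches only the 6Gbps naming scheme described above: TSxxxE or SMxxxE.
_MODEL_RE = re.compile(r"APPLE SSD (TS|SM)(\d+)E", re.IGNORECASE)
_VENDORS = {"TS": "Toshiba", "SM": "Samsung"}


def classify_apple_ssd(model: str):
    """Return (controller_vendor, capacity_gb) for a model string, or None
    if the string doesn't follow the TSxxxE/SMxxxE pattern."""
    match = _MODEL_RE.search(model)
    if match is None:
        return None
    prefix, capacity = match.groups()
    return _VENDORS[prefix.upper()], int(capacity)


print(classify_apple_ssd("APPLE SSD TS128E"))
print(classify_apple_ssd("APPLE SSD SM256E"))
```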

There's another option that seems to work, for now at least. It seems as if all 256GB and 512GB Apple SSDs currently use Samsung controllers, while Toshiba is limited to the 64GB and 128GB capacities. There's no telling if this trend will hold indefinitely (even now it's not a guarantee) but if you want a better chance of ending up with a Samsung based drive, seek out a 256GB or larger capacity. Note that this also means that the rMBP exclusively uses Samsung controllers, at least for now.

I can't really blame Toshiba for this as even Intel has resorted to licensing SandForce's controllers for its highest performing drives. I will say that Apple doesn't seem to be fond of inconsistent user experiences across its lineup. I wouldn't be surprised if Apple sought out a third SSD vendor at some point.


190 Comments

  • ShadeZeRO - Monday, July 16, 2012

    I'm curious as to why you haven't held the apple line to the normal scrutiny typically found in your other notebook reviews.

    I've noticed a certain level of bias on most review sites most likely caused by the sudden trend in popularity of Apple products.

    If this was branded differently I'm positive the display for one would have been ridiculed as a poultry offering.

    Overall a fine review, it's not on the level of a gizmodo/engadget/etc apple circle jerk so I still respect it.

    /rant
  • KoolAidMan1 - Tuesday, July 17, 2012

    He criticized the display and compared it unfavorably against other ultrabooks that use IPS panels. It is one of the few points of criticism, but it's there. Is your problem that he doesn't say that the MBA is a complete piece of trash?

    Anand gave a very well balanced review as per usual.
  • Super56K - Tuesday, July 17, 2012

    I would gladly take a 1440x900 TN display like the Air's in a 13" Sandy/Ivy Bridge Windows laptop.
  • notposting - Tuesday, July 17, 2012

    Paltry offering.

    Unless they are offering up some turkeys, ducks, and chickens for sacrifice.
  • Alameda - Tuesday, July 17, 2012

    First of all, Anand, I want to tell you that you've once again written up an excellent review, and I'm very impressed with your thoroughness and clarity. I just purchased a 2011 MBA, so of course I read your review to justify that I made the right decision. While I understand the need for benchmarks and the real-world tests such as Photoshop and so forth, I think most people do not use their computers the way your tests imply.

    In my daily use, the performance bottleneck I experience is when making a Time Machine backup, which the new machines should improve upon -- USB 2.0 is very slow, and Thunderbolt drives are too expensive. USB 3.0 is the right solution for most users. I would also like to see networking tests. This machine needs wifi to work, and there's a lot of performance variance from one 802.11n implementation to another. Last, since the RAM is soldered, adding guidance about memory use would be useful. On my MBA, if I open all of my applications at once (except VMWare), I use less than 2 of my 4 GB of RAM. So it seems to me that 8+ GB only matters if you have a specific power-user need, particularly in a simultaneous Mac/Windows setup, but for most users, 4 GB and the stock CPU exceed what you can use.

    I personally think these sorts of real world tests would make your reviews better reflect what most users actually do. I also feel that many people want to make excuses for taxing a machine to its limits, but we simply don't have such issues in actual use.
  • name99 - Tuesday, July 17, 2012

    Basically 2GiB is tight for pretty much any OSX user nowadays, but swapping on even the 2011 SSDs is fast enough that you don't really notice it. You can live with it, but you will notice occasional pauses.

    You want 4GiB if you tend to keep a lot of browser windows open. Safari is a pig in this respect (hopefully improved somewhat in ML) and Chrome is not that much better.
    My collection of always-running apps is Finder, Mail, iChat, Skype, iTunes and Safari; something like this is, I think, pretty standard across most users. That, with 20 or so browser windows, will fit OK in 4GiB, but will start obviously paging if you add a few more apps, eg throw in Word or Excel or an Adobe app.

    (
    Where does it go?
    Essentially the OS is using about 500MB for misc [process management, memory management, network buffers, that sort of thing] and about 1GB for file buffers of various sorts. Of the 2.5 GB that's left, the worst offenders:
    WebProcess, essentially the guts of Safari, is currently using 1.2GB which is pathetic, the actual Safari process, basically UI, is using 200MB, also pathetic, and iTunes is using 530 MB --- TRULY pathetic, but what else would you expect from iTunes, apparently the team to which Apple retires their most doddering and incompetent programmers.
    )

    For most people, however, I'd recommend 8GiB purely because
    - it's not that much more in cost.
    - you may not need it now, but you're future-proofing yourself since most normal people use their laptops for at least 3yrs before replacing them
    - the extra retail value when you do retire it is probably worth more than the extra cost now --- just think of the comparable desirability today of a 2GB vs a 4GB 2010 MBA.
  • Freakie - Tuesday, July 17, 2012

    I'd definitely agree with you that 8GB really should be what most people get. Even though they won't know or understand the difference, it makes a difference in how much they will enjoy any computer, PC or Mac. You just never know when someone will do something that eats up a ton of RAM, whether it be running a game without closing their browser, or opening up 100+ browser tabs, 4 Word documents, and 4 Excel books (my girlfriend does that a lot, and it makes her 8GB rather unhappy).

    And if you have a Windows computer with a 64bit install, more RAM is even more important for getting the best possible experience. While I can't say why OSX seems to have such RAM hungry programs that don't offer performance gains for so much usage, I can say that I put specific 64bit programs on my Windows laptop knowing that they will perform better and be able to use more RAM and use it more effectively. My 64bit Firefox browser is a great example of that. I don't mind it using 2-3GB of RAM because I have a crap ton of stuff open, multiple flash based videos, a flash game, etc... because it's just a nice experience. Not to mention this 64bit version is way faster than the regular one :P Regular users could be having these nice experiences too if the general consensus on what a good amount of RAM is would increase.

    On a side note... I'm thinking of going up to 16GB on my laptop, maybe 24GB... Just because I'm greedy :P Might even go the full 32GB and put a RAMdisk... who knows! So many possibilities when you have MORE RAM!
  • phillyry - Tuesday, March 26, 2013

    'Future proofing' = illusion

    Here's your future proofing. Buy what you need now or will likely need in the near future and save your pennies (or, in this case, hundreds of dollars worth of upgrades) for your next upgrade. And by upgrade, I mean your next laptop or whatever the hell else you want to do with your money. Why leave an extra $100 lying inside a computer chassis in the shape of some soldered on RAM, if you can keep that money in your bank account and buy whatever else you want or, maybe even, actually need with it?

    Really, the upgrades should only be for the hardcore users and should not be a concern for most people.
  • name99 - Tuesday, July 17, 2012

    What sort of compression does Sandforce use?

    More specifically, I imagine that within a file they use some sort of LZ variant --- it's easy, it's known to work well for this sort of problem, and there's hardware to handle it.

    More interesting is the question of what sort of cross-file de-duplication they use. The obvious thing would be some sort of hash of each 4kiB block to something like a 128b signature, then compare each incoming block to that map. But that would require an in-memory table of hashes that would be of order 256MiB in size for a 64GB drive (and twice that for a 128GB drive), which, while obviously technically possible, doesn't seem to match their actual hardware setup. You could drop the hash to 64b, at the cost of more frequent collisions (and thus having to waste time reading the drive to compare with the incoming block) and that would halve your table size --- again possible, but still looking like it uses more RAM than they have available.

    So what's the deal? They don't actually do cross block de-dup? They do it in some fashion (eg using larger blocks than 4kiB) which, while it works, is not as optimal for small files?
  • desmoboy - Tuesday, July 17, 2012

    Excellent Review Anand. Do you have an explanation why the 13" MBA i7 is 28 sec slower in the iMovie '11 (Import + Optimize) benchmark than the i5 version? Seems a bit strange when the i7 scores higher on every other one of your benchmarks (as one would expect).
