Thunderbolt Performance

Apple’s 2011 Macs were the first to enjoy Thunderbolt, an interface co-developed with Intel that carries PCIe and DisplayPort over a single cable. As it derives most of its revenue from mobile, Apple wasted no time in bringing its Thunderbolt Display to market. A single Thunderbolt cable could bring Gigabit Ethernet, FireWire 800, high-speed mass storage, external audio and display to an otherwise IO-deprived MacBook Air.

At a high level, Thunderbolt is pretty easy to explain. The current implementation of Thunderbolt pairs four PCIe 2.0 lanes with DisplayPort, offering a maximum bandwidth of 2GB/s in either direction in addition to DP bandwidth. The Thunderbolt interface itself can deliver 10Gbps of bandwidth in each direction, per channel. The physical Thunderbolt port is compatible with mini DisplayPort to allow for the use of mini-DP displays as well as Thunderbolt chains. Each Thunderbolt port can carry up to two Thunderbolt channels, although one channel is typically reserved for DisplayPort duties.
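
As a rough sanity check on those figures, here is a minimal Python sketch of the arithmetic. It assumes the standard PCIe 2.0 parameters (5 GT/s per lane with 8b/10b encoding), which the text above does not spell out, and it treats the 10Gbps channel figure as a plain bit rate with no protocol overhead. The function names are illustrative only.

```python
# Back-of-the-envelope check of the bandwidth figures quoted above.
# Assumed (not stated in the article): PCIe 2.0 signals at 5 GT/s per lane
# and uses 8b/10b encoding, so 8 of every 10 bits on the wire are payload.


def pcie_gb_per_s(gt_per_s, payload_bits, total_bits, lanes):
    """Usable bandwidth in one direction, in decimal GB/s."""
    payload_gbit_per_s = gt_per_s * payload_bits / total_bits * lanes
    return payload_gbit_per_s / 8  # bits -> bytes


def gbps_to_gb_per_s(gbps):
    """Convert a bit rate in Gbps to GB/s, ignoring protocol overhead."""
    return gbps / 8


pcie2_x4 = pcie_gb_per_s(5.0, 8, 10, lanes=4)
tb_channel = gbps_to_gb_per_s(10)

print(f"PCIe 2.0 x4:         {pcie2_x4:.2f} GB/s per direction")
print(f"Thunderbolt channel: {tb_channel:.2f} GB/s per direction")
```

Under these assumptions the four PCIe 2.0 lanes work out to the 2GB/s quoted above, while a single 10Gbps channel tops out at 1.25GB/s before any overhead.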

In the past we measured a maximum of 1GB/s of unidirectional bandwidth for a single Thunderbolt channel in addition to video bandwidth over DisplayPort. There’s no shipping device that will deliver this sort of performance; I needed to outfit a Promise Pegasus with a handful of SSDs to truly saturate the bus.

In the 2012 Macs, Apple, like the rest of the PC industry, has switched to Intel’s second-generation Thunderbolt controllers, codenamed Cactus Ridge.

The Retina MacBook Pro uses a four-channel Cactus Ridge controller and drives two Thunderbolt ports with it. Each port can drive a mini-DP display or a Thunderbolt chain that includes a mini-DP display or Thunderbolt Display, either at the end of the chain or in the middle of it. The rMBP can actually drive a fourth panel (counting the integrated Retina Display) via the integrated HDMI port, although that’s not an officially supported configuration.

Unlike most other implementations, Apple hangs the Cactus Ridge controller off of the Ivy Bridge CPU rather than the PCH. The GeForce GT 650M in the system only gets the use of 8 PCIe 3.0 lanes instead of the full 16, but with PCIe 3.0 this is not an issue (it wouldn’t be an issue with PCIe 2.0 either to be honest).
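
To put numbers on that lane trade-off, here is a small Python sketch comparing theoretical link bandwidth. It assumes the standard signaling rates and encodings (5 GT/s with 8b/10b for PCIe 2.0, 8 GT/s with 128b/130b for PCIe 3.0), none of which come from the text above. These are link ceilings only and say nothing about how much of the link the GPU actually uses.

```python
# Theoretical one-direction PCIe link bandwidth for the GPU configurations
# discussed above. Signaling rates and encodings are standard values,
# assumed here rather than taken from the article.

GENERATIONS = {
    "2.0": (5.0, 8, 10),     # GT/s per lane, payload bits, total bits (8b/10b)
    "3.0": (8.0, 128, 130),  # 128b/130b encoding
}


def link_gb_per_s(generation, lanes):
    """Usable bandwidth in one direction, in decimal GB/s."""
    gt_per_s, payload_bits, total_bits = GENERATIONS[generation]
    return gt_per_s * payload_bits / total_bits * lanes / 8


gen3_x16 = link_gb_per_s("3.0", 16)
gen3_x8 = link_gb_per_s("3.0", 8)
gen2_x16 = link_gb_per_s("2.0", 16)
gen2_x8 = link_gb_per_s("2.0", 8)

for label, value in [
    ("PCIe 3.0 x16", gen3_x16),
    ("PCIe 3.0 x8", gen3_x8),
    ("PCIe 2.0 x16", gen2_x16),
    ("PCIe 2.0 x8", gen2_x8),
]:
    print(f"{label}: {value:.2f} GB/s per direction")
```

Under these assumptions, eight PCIe 3.0 lanes provide nearly the same bandwidth as a full sixteen-lane PCIe 2.0 link.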

I ran the same test as before to see whether maximum bandwidth has gone up with the switch to Cactus Ridge. Initial results were unchanged: I was able to get north of 900MB/s to an array of SSDs in the Pegasus connected to a single Thunderbolt port. Now, with two Thunderbolt ports on the rMBP, I was able to create a second chain of devices. I only have a single Pegasus, so I resorted to chaining a LaCie Little Big Disk (SSD) and an Elgato Thunderbolt drive. The combination of the two isn’t anywhere near as fast as the SSD array in the Pegasus, but it allowed me to push the limits of the controller even more:

1380MB/s, over copper, to the rMBP. I suspect if I had another Pegasus SSD array I’d be able to approach 1800MB/s, all while driving video over the ports. Apple may limit the internal storage expansion of the rMBP but you still have a path to expansion for storage of large media files and other archives. And it’s very fast.
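
For context, the short Python sketch below compares the measured and projected figures against the 2GB/s PCIe ceiling quoted earlier. It assumes that the same four-lane PCIe 2.0 limit applies to the Cactus Ridge controller in the rMBP, and it treats "north of 900MB/s" as a floor of 900MB/s. Both are assumptions made for illustration.

```python
# Compare the throughput figures from the text against the PCIe ceiling.
# Assumption: the 2GB/s (four PCIe 2.0 lanes) limit quoted earlier also
# applies to the Cactus Ridge controller in the rMBP.

PCIE2_X4_LIMIT_MB_S = 2000
SINGLE_PORT_MB_S = 900   # "north of 900MB/s", treated as a floor
TWO_CHAIN_MB_S = 1380    # measured across both ports
PROJECTED_MB_S = 1800    # the author's estimate with two Pegasus arrays


def share_of_limit(mb_per_s, limit_mb_per_s=PCIE2_X4_LIMIT_MB_S):
    """Fraction of the assumed PCIe ceiling that a transfer rate uses."""
    return mb_per_s / limit_mb_per_s


# Upper bound on what the second chain contributed, since 900 is a floor.
second_chain_added = TWO_CHAIN_MB_S - SINGLE_PORT_MB_S

print(f"Single port: {share_of_limit(SINGLE_PORT_MB_S):.0%} of the ceiling")
print(f"Two chains:  {share_of_limit(TWO_CHAIN_MB_S):.0%} of the ceiling")
print(f"Projected:   {share_of_limit(PROJECTED_MB_S):.0%} of the ceiling")
print(f"Second chain added at most {second_chain_added} MB/s")
```

Under these assumptions the two-chain result uses about 69% of the ceiling, and the projected 1800MB/s would use about 90% of it.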

Unfortunately Thunderbolt behavior is still not perfect, although it is improved compared to previous Macs. If you write to Promise’s Pegasus for long enough while playing audio through Apple’s Thunderbolt Display you will still drop audio frames. Subjectively it seems to take longer to trigger this phenomenon but it does still happen. On my early 2011 MacBook Pro the problem has gotten so bad that I’ll even drop other USB packets for devices connected to the Thunderbolt Display. If I’m writing to the Pegasus I’ll miss keystrokes and the mouse will jump around until the high-speed write is complete. So far I haven’t had anything this bad happen on the Retina MBP but it took a while for this behavior to manifest on my early 2011 model so we’ll see what happens.

I’m not sure what the fix will be for these types of issues as it seems there’s no good quality of service assurance for PCIe devices residing on Thunderbolt. As Thunderbolt was supposed to be as transparent as possible, it’s not surprising that even QoS overhead is nonexistent but it’s something that is clearly necessary. I’m not sure this is Apple’s fault as I’ve seen similar behavior under Windows. I suspect it’s something that Intel is going to have to figure out a way to address.

 

