Retina Display: Scrolling & UI Performance

With the first Retina MacBook Pro I noticed that UI performance in certain applications degraded considerably compared to previous, non-Retina MBPs. Scrolling through my Facebook news feed in Safari, or through a long email thread in Mail, caused frame rates to drop below 30 fps. The same tests on my 2011 15-inch MBP yielded UI frame rates closer to 60 fps.


13-inch rMBP (left) vs. 15-inch rMBP (right)

The move to Mountain Lion improved UI performance, but it's still an issue. Switching between Retina and non-Retina MacBook Pros results in a very noticeable difference in UI frame rates, especially in problematic applications.

Apple does a lot of CPU and GPU work to make OS X look like OS X. Scaling the workload up from 1.76 million pixels to 4 and 5 million pixels creates additional work for both the CPU and GPU that neither chip vendor had planned on. Apple had to replace some fixed function code with general purpose CPU and GPU code to achieve consistent image quality in enabling Retina, which obviously has performance implications.
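To put the pixel counts in perspective, here is a rough Python sketch of the arithmetic. The panel resolutions are assumptions, since the text above only gives totals: 1680 x 1050 for the 1.76 million pixel baseline, 2560 x 1600 for the 13-inch rMBP and 2880 x 1800 for the 15-inch rMBP.

```python
# Assumed panel resolutions; none of these are stated in the text above.
PANELS = {
    "non-Retina 15-inch MBP": (1680, 1050),
    "13-inch rMBP": (2560, 1600),
    "15-inch rMBP": (2880, 1800),
}


def megapixels(width, height):
    """Return the pixel count in millions of pixels."""
    return width * height / 1_000_000


def scale_vs_baseline(panels, baseline):
    """Return each panel's pixel count relative to the baseline panel."""
    base = megapixels(*panels[baseline])
    return {name: megapixels(*size) / base for name, size in panels.items()}


ratios = scale_vs_baseline(PANELS, "non-Retina 15-inch MBP")
for name, size in PANELS.items():
    print(f"{name}: {megapixels(*size):.2f} MP, {ratios[name]:.2f}x baseline")
```

Under these assumptions the Retina panels push somewhere between two and three times as many pixels as the baseline, which is the extra work the CPU and GPU have to absorb.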

Next-generation GPUs should do a better job of driving these ultra high resolution displays, but today it looks like our biggest bottlenecks are software and single threaded CPU performance. Whenever UI frame rate drops significantly on the rMBP, the offending application usually ends up consuming 100% of a single CPU core. This is true in Safari, Mail and other applications where I notice drops in scrolling frame rate.


The worst case UI perf

The worst case performance I recorded on the 13-inch rMBP was 16 fps when scrolling in Safari with Facebook loaded at the 1440 x 900 scaled resolution setting. Minimum frame rate at the default "Best for Retina" setting ended up being around 18 fps. It's distracting and a clear regression from other, non-Retina Macs. That's the lowest performance you'll see, but not everything falls into that range. Scrolling down the AnandTech front page, for example, happens at around 40 - 50 fps at the 1440 x 900 scaled resolution. Other animations run as high as 60 fps, although you typically notice when things are slow, not when they're performing as expected.
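One way to read these frame rates is as time spent per frame. The sketch below converts fps to milliseconds and compares each figure against a 60 fps target. The 60 fps target is an assumption, chosen because it matches the smoothest animations mentioned above.

```python
TARGET_FPS = 60  # assumed target for smooth scrolling


def frame_time_ms(fps):
    """Return the time available for (or spent on) one frame, in milliseconds."""
    return 1000.0 / fps


def budget_overrun(fps, target_fps=TARGET_FPS):
    """Return how many times longer a frame takes than the target budget."""
    return frame_time_ms(fps) / frame_time_ms(target_fps)


# Frame rates quoted in the text above.
for label, fps in [("worst case, scaled 1440 x 900", 16),
                   ("worst case, Best for Retina", 18),
                   ("AnandTech front page, low end", 40)]:
    print(f"{label}: {frame_time_ms(fps):.1f} ms per frame, "
          f"{budget_overrun(fps):.2f}x the {TARGET_FPS} fps budget")
```

Seen this way, the 16 fps worst case means each frame takes several times longer than a 60 fps budget allows, which is why the stutter is so easy to notice.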


The better case

The 13 I'm testing had demonstrably lower scrolling performance than my 15-inch rMBP, but I believe that has to do with the difference in CPU clocks more than anything else. My 13 uses a 2.5GHz Core i5 that can turbo up to 3.1GHz, while my 15 has a max single threaded turbo of 3.6GHz - an increase of 16%. There's also the fact that the 15-inch model features a quad-core CPU, leaving you with more idle cores in the event that you're actually doing more than just scrolling all day. I suspect the combination of these two things is why a lot of folks perceive the 15-inch rMBP to deliver faster UI performance.
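The 16% figure follows directly from the two peak turbo clocks quoted above. A minimal check in Python (the function and constant names are illustrative only):

```python
def percent_increase(old, new):
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


TURBO_13_INCH_GHZ = 3.1  # peak turbo of the 13-inch model's Core i5, from the text
TURBO_15_INCH_GHZ = 3.6  # max single threaded turbo of the 15-inch model, from the text

gain = percent_increase(TURBO_13_INCH_GHZ, TURBO_15_INCH_GHZ)
print(f"15-inch peak turbo is {gain:.1f}% higher")
```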

The 15-inch model does have a discrete GPU; however, I didn't notice a big difference in UI frame rates between integrated and discrete graphics. I do believe that a lot of the present issues come from Apple not GPU accelerating more of the drawing pipeline, which leaves single threaded CPU performance suffering under the load of 4 and 5MP displays. Intel (and AMD) design their CPUs for the types of workloads most of their customers will be running, and the vast majority of the market isn't running OS X with 4MP+ panels. A lot of this is related to OS X itself, as you don't have the same scrolling issues under Windows. As we saw in our Surface review, simply making an application (or in this case, an OS) look a certain way can eat up a good amount of CPU time.

There's not much you can do here other than wait for faster hardware or buy the fastest CPU available on whatever system you're considering. Single threaded performance should scale roughly linearly with CPU clock speed, so higher clocked CPUs should deliver smoother scrolling. Ultimately, just scaling CPU clock is an inefficient way to solve the current UI frame rate issues. Future revisions of OS X will likely shift even more UI workload to the GPU, and we'll see new microprocessor architectures that perform better with these types of workloads as well. The only issue is I don't know when either of these things will happen. Haswell should bring a good increase in IPC and maybe even a slight increase in frequency, which will definitely help.
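To see why clock scaling alone is an inefficient fix, here is a first-order sketch that takes the linear scaling claim at face value. It assumes the 16 fps worst case was measured while the CPU sat at its 3.1GHz peak turbo, which the text does not confirm, and it ignores every other bottleneck.

```python
def projected_fps(measured_fps, measured_ghz, new_ghz):
    """Project frame rate at a new clock, assuming purely linear scaling."""
    return measured_fps * new_ghz / measured_ghz


def required_ghz(measured_fps, measured_ghz, target_fps):
    """Return the clock needed to hit target_fps under the same linear model."""
    return measured_ghz * target_fps / measured_fps


WORST_CASE_FPS = 16  # from the text
ASSUMED_GHZ = 3.1    # assumption: worst case ran at the 13-inch model's peak turbo

print(f"At 3.6GHz: about {projected_fps(WORST_CASE_FPS, ASSUMED_GHZ, 3.6):.1f} fps")
print(f"Clock for double the frame rate: "
      f"{required_ghz(WORST_CASE_FPS, ASSUMED_GHZ, 2 * WORST_CASE_FPS):.1f}GHz")
```

Under this simple model the jump to 3.6GHz buys only a couple of frames per second, and doubling the frame rate would need a clock far beyond anything shipping, so the gains have to come from software and architecture instead.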

What we'll need, however, without any significant changes to OS X, is an outright doubling of single threaded CPU performance. In the worst case scenario that could mean that we won't see UI frame rate fixed for years. I doubt it'll be that long. If I had to guess, I would say that Haswell will bring a good improvement and that the Broadwell (2014) timeframe is probably when we'll see things really get better. I don't have intimate knowledge of Apple's OS X roadmap and I also don't know the instruction mix that's causing this behavior, so I can't really say anything for certain. I'm just speculating.

79 Comments

  • James5mith - Tuesday, November 13, 2012

    One of the biggest advances the Windows OS made was moving from a strictly CPU driven windows management interface, to a GPU accelerated one. (Vista, Win7, Win8)

    It stopped things like the classic "trail of artifact windows" you could do when your old WindowsXP and earlier machines were bogged down. Since the desktop was drawn by the CPU, it wouldn't refresh properly until some CPU cycles were freed up.

    Seems Apple did not learn from the past, and is now doomed to repeat it.
  • Ryan Smith - Tuesday, November 13, 2012

    Actually Apple introduced GPU accelerated window composition with Quartz Extreme nearly 10 years ago, a few years ahead of Microsoft.

    However there are several layers to GPU acceleration. The earliest solutions could do window composition on the GPU, but the contents of the windows themselves were still generated by the CPU. Since then both MS and Apple have been moving more and more of the workload on to the GPU as it makes sense to do so. But no one is 100% offloaded, so the CPU still plays a part and consequently can still be a bottleneck.
  • michal1980 - Tuesday, November 13, 2012

    The bias is strong
  • solipsism - Saturday, November 17, 2012

    Yes, the bias is strong... in you.

    http://www.anandtech.com/tag/windows-8
  • cjs150 - Tuesday, November 13, 2012

    But not yet. One maybe two generations to go before perfect.

    1. One obvious use of this will be to watch movies on the go (especially on long business trips), but once again Apple ignores 1080p resolution (is there someone at Apple who hates this resolution? They seem to ignore it in every device they produce).

    2. Card reader is flaky - is it because of chassis flex or just a bad reader?

    3. Storage needs to be bigger, I guess next generation will be 256GB.

    4. A bit extra horse power obviously needed, but probably not a lot.

    The real question though is as tablets get better, is there any point in the 13" Mac?
  • xTRICKYxx - Tuesday, November 13, 2012

    What software do you use to test the framerates of the browser?
  • Jorange - Tuesday, November 13, 2012

    $1700 and laggy UI. Come on Apple fans, admit that you buy their products for the image, and for fear of being seen as gauche in the eyes of your vapid clique. The best Apple zealots are those who purchase the things on credit: look how wealthy I am, whilst paying off the monthly installments :)
  • boblozano - Tuesday, November 13, 2012

    This is the first review that I've read that captures the essence of this machine: it is a machine of excellent balance, and in that balance lies its real reason for being.

    Came from a mid-2011 mbair (1.8 i7, 256gb), and before that a mid-2009 mbp 15. Workload is a mix of writing, photo editing (lr, ps, etc.), and some video creation. Lots of travel. Went to the mbair after good friends with a heavy dev emphasis swore it was excellent. It was/is.

    Considered the 15 rMBP since the price is effectively the same, and there is obviously more of just about everything. But size for travel and general mobility was a significant concern (the air allowed me to switch from a full-size backpack to a much smaller messenger bag).

    The reason why I've settled on 13 as just about perfect for my present usage is simple: with the increasing number of full-screen apps, 13" is just about perfect for writing in a full screen, while 15 just feels overwrought. Even better, with retina the photo and video editing remains very usable.

    With that as a background, the 13" rMBP was a real step up in everything that I liked about the air, with hardly any compromise (small bit of weight). Ended up with the 2.9 I7, 512gb. Everything I do is faster, better (though definitely not cheaper), and that screen just makes you smile when opening it up to work. It is good enough that I'm even going back to using the device open on a stand (trying the new twelve south height-adjustable stand) when docked.

    Sure it would be nice to have 4 cores and a discrete gpu (particularly for rendering), but as of now there's no doubt it would have compromised mobility and/or battery life. About the only indisputable criticism is one of value, but such is life.

    Undoubtedly (and always) there'll be something much better down the road, maybe even only a year from now. Good. But as of today, this is the best computing device I've ever used on a daily basis.
  • caleblloyd - Tuesday, November 13, 2012

    Anand - the Primary Storage for the 13in MBA that you have listed on the table on the first page should be 128GB.
  • repoman27 - Tuesday, November 13, 2012

    "In reality USB 3.0 is good for about 400 - 500MB/s (3.2Gbps - 4.0Gbps)..."

    The actual reality is that USB 3.0 provides a physical layer gross bit rate of 5 Gbit/s, and a physical layer net bit rate of 4 Gbit/s due to 8b/10b encoding. The net bit rate delivered to the application layer is unlikely to ever exceed 80% of that, or 400 MB/s, in the real world. Even using UASP, which clearly looks to be the case in these tests, I've never seen peak SuperSpeed USB transfer rates much in excess of 350 MB/s. USB 3.0 is good for 300 - 350 MB/s with the hardware shipping at this point, although we may see the upper bound approach 400 MB/s in the future.

    "This is Intel's most capable Thunderbolt SKU as it takes four PCIe 2.0 lanes combined with DisplayPort and muxes them into four Thunderbolt channels (2 up/2 down) with two DP outputs."

    This sentence has some problems as well. The DSL3510L has connections on the back end for 4 PCIe 2.0 lanes, 2 DisplayPort 1.1a sources and 1 DisplayPort 1.1a sink. On the front side it has four 10 Gbit/s, full-duplex Thunderbolt channels, 2 per port (i.e. 2 up/2 down per port or 4 up/4 down per controller). Each port can also operate in legacy DisplayPort signaling mode when a DisplayPort device is connected directly.

    On another note, it's frustrating that Apple failed again at the SDXC card reader, and it appears to be a mechanical issue once again. iFixit's teardown photos seem to have omitted it, but if Apple used the same controller as in the 15-inch MBPR, then it's a Broadcom controller that supports SD 3.0 features such as SDXC and UHS-I paired with a PCIe 1.1 x1 back end. This should make it far more capable than a USB 2.0 based solution, but no, instead they made it useless because half the time it doesn't read a card at all when inserted.
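The USB 3.0 arithmetic in repoman27's comment above can be checked with a short Python sketch. The 5 Gbit/s gross rate, the 8b/10b encoding and the 80% application layer ceiling all come from that comment; the function and parameter names are illustrative only.

```python
def usb3_rates(gross_gbps=5.0, encoding_efficiency=8 / 10, app_efficiency=0.8):
    """Return (net Gbit/s, net MB/s, application layer MB/s) for a USB 3.0 link.

    encoding_efficiency models 8b/10b (8 data bits per 10 line bits).
    app_efficiency is the commenter's rough ceiling, not a measured value.
    """
    net_gbps = gross_gbps * encoding_efficiency
    net_mb_s = net_gbps * 1000 / 8  # 1 Gbit/s = 125 MB/s
    app_mb_s = net_mb_s * app_efficiency
    return net_gbps, net_mb_s, app_mb_s


net_gbps, net_mb_s, app_mb_s = usb3_rates()
print(f"Net bit rate: {net_gbps:.1f} Gbit/s ({net_mb_s:.0f} MB/s)")
print(f"Application layer ceiling: about {app_mb_s:.0f} MB/s")
```

This reproduces the 4 Gbit/s and 400 MB/s figures in the comment. The 300 - 350 MB/s range quoted there is an observation about shipping hardware and sits below this ceiling.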
