Turbo and the 15-inch MacBook Pro

The 15 and 13 are different enough that I'll address the two separately. Both are huge steps forward compared to their predecessors, but for completely different reasons. Let's start with the 15.

Starting with Sandy Bridge, all 15 and 17-inch MacBook Pros now feature quad-core CPUs. This is a huge deal. Unlike other notebook OEMs, Apple tends to be a one-size-fits-all sort of company. Sure, you get a choice of screen size, but the options dwindle significantly once you've decided how big of a notebook you want. For the 15 and 17-inch MBPs, all you get are quad-core CPUs. Don't need four cores? Doesn't matter, you're getting them anyway.

Evolution of the 15-inch MacBook Pro
Spec | Early 2011 | Mid 2010 | Late 2009
CPU | Intel Core i7 2.0GHz (QC) | Intel Core i5 2.40GHz (DC) | Intel Core 2 Duo 2.53GHz (DC)
Memory | 4GB DDR3-1333 | 4GB DDR3-1066 | 4GB DDR3-1066
HDD | 500GB 5400RPM | 320GB 5400RPM | 250GB 5400RPM
Video | Intel HD 3000 + AMD Radeon HD 6490M (256MB) | Intel HD Graphics + NVIDIA GeForce GT 330M (256MB) | NVIDIA GeForce 9400M (integrated)
Optical Drive | 8X Slot Load DL DVD +/-R | 8X Slot Load DL DVD +/-R | 8X Slot Load DL DVD +/-R
Screen Resolution | 1440 x 900 | 1440 x 900 | 1440 x 900
USB | 2 | 2 | 2
SD Card Reader | Yes | Yes | Yes
FireWire 800 | 1 | 1 | 1
ExpressCard/34 | No | No | No
Battery | 77.5Wh | 77.5Wh | 73Wh
Dimensions (W x D x H) | 14.35" x 9.82" x 0.95" | 14.35" x 9.82" x 0.95" | 14.35" x 9.82" x 0.95"
Weight | 5.6 lbs | 5.6 lbs | 5.5 lbs
Price | $1799 | $1799 | $1699

Apple was able to rationalize this decision because of one feature: Intel Turbo Boost.

In the ramp to 90nm Intel realized that it was expending a great deal of power in the form of leakage current. You may have heard transistors referred to as digital switches. Turn them on and current flows, turn them off and current stops flowing. The reality is that even when transistors are off, some current may still flow. This is known as leakage current and it becomes a bigger problem the smaller your transistors become.

With Nehalem Intel introduced a new type of transistor into its architecture: the power gate transistor. Put one of these babies in front of the source voltage to a large group of transistors and at the flip of a, err, switch you can completely shut off power to those transistors. No current going to the transistors means effectively no leakage current.

Prior to Intel's use of power gating, we had the next best thing: clock gating. Instead of cutting power to a group of transistors, you'd cut the clock signal. With no clock signal, any clocked transistors would effectively be idle. Any blocks that are clock gated consume no active power, but clock gating doesn't address the issue of leakage power. So while clock gating got you some thermal headroom, it became less effective as we moved to smaller and smaller transistors.
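The difference between the two techniques can be sketched with a toy power model. Everything here is an assumption made for illustration: the `Block` class, the `power_draw` function and the wattage figures are hypothetical, and a real power gate still leaks a small amount rather than dropping to exactly zero.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A group of transistors with hypothetical power figures, in watts."""
    dynamic_w: float  # switching power while the block is clocked and busy
    leakage_w: float  # power lost to leakage whenever voltage is applied


def power_draw(block: Block, state: str) -> float:
    """Return the block's power draw in one of three states."""
    if state == "active":
        return block.dynamic_w + block.leakage_w
    if state == "clock_gated":
        # No clock means no switching power, but leakage remains.
        return block.leakage_w
    if state == "power_gated":
        # Supply voltage is cut off; modeled as zero in this sketch.
        return 0.0
    raise ValueError(f"unknown state: {state}")


core = Block(dynamic_w=8.0, leakage_w=3.0)  # illustrative numbers only
for state in ("active", "clock_gated", "power_gated"):
    print(f"{state:12s} {power_draw(core, state):4.1f} W")
```

As leakage grows relative to switching power on smaller processes, the clock-gated figure creeps toward the active one, which is why cutting the supply itself becomes worthwhile.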


All four cores in this case have the same source voltage, but can be turned off individually thanks to the power gate above the core

Power gating gave Intel one very important feature: the ability to truly shut off a core when not in use. Prior to power gating, Intel, like any other microprocessor company, had to make tradeoffs in choosing core count vs. clock speed. The maximum power consumption/thermal output is effectively a fixed value; physics has something to do with that. If you want four cores in the same thermal envelope as two cores, you have to clock them lower. In the pre-Nehalem days you had to choose between two faster cores or four slower cores; there was no option for people who needed both.

Now, with the ability to mostly turn off idle cores, you can get around that problem. A fully loaded four-core CPU will still run at a lower clock than a dual-core version; however, with power gating, if you are only using two cores you have the thermal headroom to ramp up the clock speed of the two active cores (since the idle ones are effectively off).
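To get a feel for how much headroom gating idle cores could free up, here is a deliberately crude sketch. It is not Intel's algorithm: the function name `max_clock_ghz` is hypothetical, and the model assumes per-core power scales with the cube of frequency (voltage rising roughly in step with clock) and that gated cores draw nothing. Real turbo bins are set by Intel per part and will not match these numbers.

```python
def max_clock_ghz(all_core_ghz: float, total_cores: int, active_cores: int) -> float:
    """Clock the active cores could reach within the same power budget.

    Toy assumption: per-core power scales with frequency cubed, and
    power-gated cores contribute nothing to the budget.
    """
    if not 1 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")
    # Budget = total_cores * k * f_all^3 = active_cores * k * f_new^3
    return all_core_ghz * (total_cores / active_cores) ** (1 / 3)


# A hypothetical quad-core part that sustains 2.0GHz with all cores loaded.
for n in (4, 2, 1):
    print(f"{n} active core(s): ~{max_clock_ghz(2.0, 4, n):.2f} GHz")
```

Even under these rough assumptions the shape of the tradeoff is visible: halving the number of active cores buys a meaningful clock bump for the ones that remain.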

Get a little more clever and you can do this power gate and clock up dance for more configurations. Only using one core? Power gate three and run the single active core at a really really high speed. All of this is done by a very complex piece of circuitry on the microprocessor die. Intel introduced it in Nehalem and called it the Power Control Unit (this is why engineers aren't good marketers but great truth tellers). The PCU in Nehalem was about a million transistors, around the complexity of the old Intel 486, and all it did was look at processor load, temperature, power consumption, active cores and clock speed. Based on all of these inputs it would determine what to turn off and what clock speed to run the entire chip at.

Another interesting side effect of the PCU is that if you're using all cores but they're not using the most power hungry parts of their circuitry (e.g. not running a bunch of floating point workloads) the PCU could keep all four active but run them at a slightly higher frequency.

[Figure: TDP tradeoff across single-core, dual-core and quad-core configurations]

The PCU actually works very quickly. Let's say you're running an application that uses only a single core for a very brief period. That's more than enough time for the PCU to turn off all unused cores, turbo up the single core and complete the task quicker.

Intel calls this dynamic frequency scaling Turbo Boost (ah, this is where the marketing folks took over). The reason I went through this lengthy explanation of Turbo is that it allowed Apple to equip the 15-inch MacBook Pro with only quad-core options and not worry about it being slower than the dual-core 13-inch offering, despite having a lower base clock speed (2.0GHz for the 15 vs. 2.3GHz for the 13).


13-inch MacBook Pro (left), 15-inch MacBook Pro with optional high res/anti-glare display (right)

Apple offers three CPU options in the 15-inch MacBook Pro: a 2.0GHz, 2.2GHz or 2.3GHz quad-core Core i7. These actually correspond to the Core i7-2635QM, 2720QM and 2820QM. The main differences are in the table below:

Apple 15-inch 2011 MacBook Pro CPU Comparison
Configuration | 2.0GHz quad-core | 2.2GHz quad-core | 2.3GHz quad-core
Intel Model | Intel Core i7-2635QM | Intel Core i7-2720QM | Intel Core i7-2820QM
Base Clock Speed | 2.0GHz | 2.2GHz | 2.3GHz
Max SC Turbo | 2.9GHz | 3.3GHz | 3.4GHz
Max DC Turbo | 2.8GHz | 3.2GHz | 3.3GHz
Max QC Turbo | 2.6GHz | 3.0GHz | 3.1GHz
L3 Cache | 6MB | 6MB | 8MB
AES-NI | No | Yes | Yes
VT-x | Yes | Yes | Yes
VT-d | No | Yes | Yes
TDP | 45W | 45W | 45W

The most annoying part of all of this is that the base 2635QM doesn't support Intel's AES-NI. Apple still doesn't seem to use AES-NI anywhere in its OS, so until Lion rolls around I guess this won't be an issue. Shame on Apple for not supporting AES-NI and shame on Intel for using it as a differentiating feature between parts. The AES instructions, introduced in Westmere, are particularly useful in accelerating full disk encryption as we've seen under Windows 7.

Note that all of these chips carry a 45W TDP, up from 35W in the 13-inch and last year's 15-inch model. We're talking about nearly a billion transistors fabbed on Intel's 32nm process, almost double the transistor count of the Arrandale chips found in last year's MacBook Pro. These things are going to consume more power.

Despite the fairly low base clock speeds, these CPUs can turbo up to pretty high values depending on how many cores are active. The base 2.0GHz quad-core is only good for up to 2.9GHz on paper, while the 2720QM and 2820QM can hit 3.3GHz and 3.4GHz, respectively.
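The turbo bins from the CPU comparison table can be captured in a small lookup, which makes it easy to see how much headroom each part has over its base clock. This is only a sketch: the names `TURBO_GHZ`, `max_turbo_ghz` and `turbo_headroom_pct` are made up for illustration, and because the table has no three-core column, the code assumes the quad-core bin applies when three cores are active.

```python
# (base clock, {active cores: max turbo}) in GHz, taken from the table above.
TURBO_GHZ = {
    "i7-2635QM": (2.0, {1: 2.9, 2: 2.8, 4: 2.6}),
    "i7-2720QM": (2.2, {1: 3.3, 2: 3.2, 4: 3.0}),
    "i7-2820QM": (2.3, {1: 3.4, 2: 3.3, 4: 3.1}),
}


def max_turbo_ghz(model: str, active_cores: int) -> float:
    """Look up the maximum turbo clock for a given number of active cores."""
    if not 1 <= active_cores <= 4:
        raise ValueError("active_cores must be between 1 and 4")
    _, bins = TURBO_GHZ[model]
    # Assumption: with no three-core figure listed, fall through to the
    # next listed bin (the quad-core one).
    key = min(k for k in bins if k >= active_cores)
    return bins[key]


def turbo_headroom_pct(model: str, active_cores: int) -> float:
    """Percentage by which the max turbo clock exceeds the base clock."""
    base, _ = TURBO_GHZ[model]
    return (max_turbo_ghz(model, active_cores) / base - 1) * 100


for model in TURBO_GHZ:
    print(f"{model}: single-core turbo is "
          f"{turbo_headroom_pct(model, 1):.0f}% above base")
```

These are on-paper ceilings; whether a given notebook reaches them depends on temperature and power draw at the time.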

Given Apple's history of throttling CPUs and not telling anyone, I was extra paranoid in finding out if any funny business was going on with the new MacBook Pros. Unfortunately there are very few ways of measuring turbo frequency under OS X. Ryan Smith pointed me in the direction of MSR Tools which, although not perfect, does give you an indication of what clock speed your CPU is running at.
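How MSR Tools works internally isn't covered here, but one common way to estimate the effective clock on Intel CPUs is to sample the IA32_APERF and IA32_MPERF model-specific registers twice and scale the base clock by the ratio of their deltas. The sketch below assumes Linux rather than OS X, with the `msr` kernel module loaded and root privileges; the function names are hypothetical, and counter wraparound is ignored.

```python
import os
import struct
import time

IA32_MPERF = 0xE7  # ticks at the base (reference) clock while the core is active
IA32_APERF = 0xE8  # ticks at the actual clock while the core is active


def read_msr(cpu: int, address: int) -> int:
    """Read one 64-bit MSR through the Linux msr driver (/dev/cpu/N/msr)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The driver uses the file offset as the MSR address.
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)


def effective_ghz(base_ghz: float, aperf_delta: int, mperf_delta: int) -> float:
    """Scale the base clock by the APERF/MPERF ratio over a sample window."""
    if mperf_delta <= 0:
        raise ValueError("core was idle for the whole window (no MPERF ticks)")
    return base_ghz * aperf_delta / mperf_delta


def sample_ghz(cpu: int, base_ghz: float, interval_s: float = 0.1) -> float:
    """Estimate one core's average clock over a short interval."""
    a0, m0 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    time.sleep(interval_s)
    a1, m1 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    return effective_ghz(base_ghz, a1 - a0, m1 - m0)
```

Calling `sample_ghz(0, 2.3)` as root while a single-threaded benchmark is pinned to core 0 would give an averaged reading for that window. Because it is an average, short turbo spikes get smoothed out, which is one reason different tools can disagree on peak clocks.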


Max single core turbo on the 2.3GHz quad-core

With only a single thread active the 2.3GHz quad-core seemed to peak at ~3.1-3.3GHz. This is slightly lower than what I saw under Windows (3.3-3.4GHz pretty consistently running the Cinebench R10 1CPU test). Apple does do power management differently under OS X; however, I'm not entirely sure that the MSR Tools application is reporting frequency as quickly as Intel's utilities under Windows 7.


Max QC turbo on the 2.3GHz quad-core

With all cores active (once again, Cinebench R10 XCPU) the max I saw on the 2.3 was 2.8GHz. Under Windows running the same test I saw similar results at 2.9GHz.


Max QC turbo on the 2.3GHz quad-core under Windows 7

I'm pretty confident that Apple isn't doing anything dramatic with clock speeds on these new MacBook Pros. Mac OS X may be more aggressive with power management than Windows, but max clock speed remains untouched.

Mac OS X 10.6.6 vs. Windows 7 Performance
15-inch 2011 MBP, 2.0GHz quad-core | Single-Threaded | Multi-Threaded
Mac OS X 10.6.6 | 4060 | 15249
Windows 7 x64 | 4530 | 16931

Note that even though the operating frequencies are similar under OS X and Windows 7, Cinebench performance is still higher under Windows 7. It looks like there's still some software optimization that needs to be done under OS X.
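For a sense of scale, the gap in the table above works out to roughly 11 to 12 percent in Windows 7's favor. The short script below just does that arithmetic on the published scores; the helper name `pct_faster` is made up for illustration.

```python
# Cinebench scores from the table above (15-inch 2011 MBP, 2.0GHz quad-core).
SCORES = {
    "Mac OS X 10.6.6": {"single": 4060, "multi": 15249},
    "Windows 7 x64": {"single": 4530, "multi": 16931},
}


def pct_faster(a: float, b: float) -> float:
    """Percentage by which score a exceeds score b."""
    return (a / b - 1) * 100


osx = SCORES["Mac OS X 10.6.6"]
win = SCORES["Windows 7 x64"]

for test in ("single", "multi"):
    gap = pct_faster(win[test], osx[test])
    print(f"{test}-threaded: Windows 7 ahead by {gap:.1f}%")

# Multi-threaded score relative to single-threaded, per OS.
for name, s in SCORES.items():
    print(f"{name}: multi/single scaling {s['multi'] / s['single']:.2f}x")
```

The multi-threaded to single-threaded ratio comes out nearly identical on both operating systems, which suggests the gap is a fairly uniform per-thread difference rather than a threading problem.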

198 Comments

  • Anand Lal Shimpi - Friday, March 11, 2011 - link

    Thank you for reading them, comments like this really do make it all worthwhile :)

    You wouldn't believe how much time was spent making sure Apple wasn't doing something funny with the max turbo frequencies. At the end of the day it was a non-issue, but we had to be sure.

    Take care,
    Anand
  • Ryan Smith - Friday, March 11, 2011 - link

    Just to add some technical background to this, it's actually quite complex to get a CPU speed reading on modern CPUs. Mac OS X's sysctl reports the base speed of the processor, regardless of whether Turbo Mode is active or not. So on the 15" low-end QC model you will always see 2.0GHz.

    To actually read the instantaneous speed of any given core, you need to peek at the CPU itself and count the cycles - Intel actually has a handy document detailing an algorithm to do this(1). The issue with that is that it requires peeking at the Model-Specific Registers (MSRs), which require Ring 0 access; or in other words you need a broker at the driver level to do it.

    Linux already does this (/dev/cpu/0/msr), and on Windows it's fairly trivial to load a driver alongside an Admin-level application to do this (CPU-Z, etc). Under Mac OS X this requires installing an Extension (at least as far as I know) which gets messy. If you don't go through this process you'll never be able to read the core speeds accurately, which is why there's virtually no Mac software capable of this.

    Fortunately MSR Tools exists, and it has a 32-bit extension to allow it to peek at the MSRs. The right answer of course is always the last answer you try, so this was only after trying several other ways of calculating the CPU speed and a couple of different OS-agnostic benchmarks to try to rule out OS differences.

    1) http://download.intel.com/design/processor/applnot...
  • tno - Friday, March 11, 2011 - link

    +1

    I've been planning to plunge into Mac ownership for some time, especially with grad school looming. I really want something that's more comfortable to work on than my netbook but still fairly portable. This review really helped me gauge whether it was worth putting in the extra cost for a 2011 13" MBP or settling for a discounted 2010.

    So am I all set? Hardly! Now I need to see what the 2011 13" MBA has to offer! I'm praying that cost stays roughly the same and a move to a ULV SNB leads to 12+ hour battery life and a similarly huge leap in performance as the move led to in the MBP. I am a sucker for lightweight form factors.

    This article is also the first one to make me ever consider the 15" MBP. I have been fairly opposed to the bulk but the performance is quite something. If I went that route then I would probably have a C2Q, water-cooled, ATI and SSD driven rig to put up on AT forums. Taking offers!
  • tno - Wednesday, May 04, 2011 - link

    Rezzing a dead thread! I bought the 13" MBP! $999 at MicroCenter, too good to pass up! So . . . who wants my rig?
  • JasperJanssen - Saturday, August 06, 2011 - link

    I, on the other hand, have gone the other way. My MBA13 is being put together in China now.
  • ltcommanderdata - Thursday, March 10, 2011 - link

    A great review. I do have some additional questions though. First, given Apple was the instigator of OpenCL, it'd be great if you could run some OpenCL benchmarks. Are the Sandy Bridge MacBook Pros disproportionately faster than the Arrandale MacBook Pro, which would indicate that OS X has CPU OpenCL drivers that can take advantage of AVX? Probably not, and this will hopefully come with Lion. Given nVidia's GPGPU push can the HD 6490 still keep up with the 330M GT in OpenCL? How does the HD6750 do?

    http://www.bit-tech.net/hardware/graphics/2011/01/...

    "'[Intel] will be releasing OpenCL graphics drivers to developers during the course of 2011. [Intel] continue to evaluate when and where OpenCL will intercept various products"

    And is there secret Sandy Bridge IGP OpenCL support? Bit-tech got a quote from Intel that Sandy Bridge IGP OpenCL support was inbound sometime this year and if anyone would be motivated to get it done it'd be Apple.

    And finally, does Apple now support hardware H.264 decoding on ATI or Intel GPUs? Previously, only a few nVidia GPUs were supported in Snow Leopard, such that the Arrandale MacBook Pro actually had to power up the 330M GT to decode H.264 wasting power compared to the perfectly fine Arrandale IGP if Apple just wrote the drivers. Do the new Sandy Bridge have the ATI GPUs doing H.264 decoding now, is the Intel IGP supported, or in the worst case is no H.264 hardware acceleration available now that nVidia GPUs are gone? Perhaps lack of hardware H.264 decoding is what makes the FaceTime HD CPU usage so high? QuickSync is only accelerating the encoding phase?
  • Anand Lal Shimpi - Friday, March 11, 2011 - link

    Some answers:

    1a) I don't know of any good GPU based OpenCL tests under OS X at this point. I'm not even sure if Apple's Intel HD 3000 driver supports OpenCL.

    1b) Intel mentioned SNB's GPU technically supports OpenCL however there are no plans to release a public driver at this point.

    2) Hardware H.264 decoding is enabled on the 2011s and it is used while FaceTiming, at least according to Apple.

    Take care,
    Anand
  • ltcommanderdata - Friday, March 11, 2011 - link

    Thanks for the reply.

    http://www.macupdate.com/app/mac/33632/smallluxgpu

    In regards to OpenCL testing, most people in OS X seem to use SmallLuxGPU which is an OpenCL raytracing benchmark. I don't have much experience with it, but it might be worth a try.

    In regards to hardware H.264 decode, do you know if the IGP is doing it or does the discrete GPU still have to be powered up as in the 2010 Arrandale MacBook Pros?

    Thanks
  • Anand Lal Shimpi - Friday, March 11, 2011 - link

    It's my understanding that the IGP can do the decoding, although note that while FaceTime is running the dGPU is enabled by default.

    Good call on SLG, I had forgotten about that :)

    Take care,
    Anand
  • secretmanofagent - Thursday, March 10, 2011 - link

    Hello authors,
    On one of the pages, you mentioned this:
    "This isn't Mac specific advice, but if you've got a modern Mac notebook I'd highly recommend upgrading to an SSD before you even consider the new MacBook Pro. I've said this countless times in the past but an SSD is the single best upgrade you can do to your computer."

    Is there an article where you recommend the best update for my model? Should I even bother with the drive? I realize the X3100 is going to still hamper any sort of graphical performance, but wondering if it's worth the effort.

    Out of curiosity as well, would a Time Machine restore be possible if you update the drive?
