Turbo and the 15-inch MacBook Pro

The 15 and 13 are different enough that I'll address the two separately. Both are huge steps forward compared to their predecessors, but for completely different reasons. Let's start with the 15.

Starting with Sandy Bridge, all 15 and 17-inch MacBook Pros now feature quad-core CPUs. This is a huge deal. Unlike other notebook OEMs, Apple tends to be a one-size-fits-all sort of company. Sure, you get a choice of screen size, but the options dwindle significantly once you've decided how big of a notebook you want. For the 15 and 17-inch MBPs, all you get are quad-core CPUs. Don't need four cores? Doesn't matter, you're getting them anyway.

Evolution of the 15-inch MacBook Pro

Specification | Early 2011 | Mid 2010 | Late 2009
CPU | Intel Core i7 2.0GHz (QC) | Intel Core i5 2.40GHz (DC) | Intel Core 2 Duo 2.53GHz (DC)
Memory | 4GB DDR3-1333 | 4GB DDR3-1066 | 4GB DDR3-1066
HDD | 500GB 5400RPM | 320GB 5400RPM | 250GB 5400RPM
Video | Intel HD 3000 + AMD Radeon HD 6490M (256MB) | Intel HD Graphics + NVIDIA GeForce GT 330M (256MB) | NVIDIA GeForce 9400M (integrated)
Optical Drive | 8X Slot Load DL DVD +/-R | 8X Slot Load DL DVD +/-R | 8X Slot Load DL DVD +/-R
Screen Resolution | 1440 x 900 | 1440 x 900 | 1440 x 900
USB | 2 | 2 | 2
SD Card Reader | Yes | Yes | Yes
FireWire 800 | 1 | 1 | 1
ExpressCard/34 | No | No | No
Battery | 77.5Wh | 77.5Wh | 73Wh
Dimensions (W x D x H) | 14.35" x 9.82" x 0.95" | 14.35" x 9.82" x 0.95" | 14.35" x 9.82" x 0.95"
Weight | 5.6 lbs | 5.6 lbs | 5.5 lbs
Price | $1799 | $1799 | $1699

Apple was able to rationalize this decision because of one feature: Intel Turbo Boost.

In the ramp to 90nm Intel realized that it was expending a great deal of power in the form of leakage current. You may have heard transistors referred to as digital switches. Turn them on and current flows, turn them off and current stops flowing. The reality is that even when transistors are off, some current may still flow. This is known as leakage current and it becomes a bigger problem the smaller your transistors become.

With Nehalem Intel introduced a new type of transistor into its architecture: the power gate transistor. Put one of these babies in front of the source voltage to a large group of transistors and at the flip of a, err, switch you can completely shut off power to those transistors. No current going to the transistors means effectively no leakage current.

Prior to Intel's use of power gating, we had the next best thing: clock gating. Instead of cutting power to a group of transistors, you'd cut the clock signal. With no clock signal, any clocked transistors would effectively be idle. Blocks that are clock gated consume no active power, but clock gating does nothing about leakage power. So while clock gating got you some thermal headroom, it became less effective as we moved to smaller and smaller transistors.
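The difference between the two techniques can be put into a toy power model. This is only a sketch: the wattage figures and the function name are made up for illustration and are not Intel's numbers, and power gating is treated as ideal (zero draw).

```python
# Toy model: core power = active (switching) power + leakage power.
# All wattage figures below are hypothetical, chosen only for illustration.

ACTIVE_W = 8.0    # hypothetical active power of one busy core
LEAKAGE_W = 2.0   # hypothetical leakage power of one powered core


def core_power(state: str) -> float:
    """Return the power drawn by one core in the given state."""
    if state == "busy":
        return ACTIVE_W + LEAKAGE_W
    if state == "clock_gated":
        # No clock means no switching, but the core is still powered and leaks.
        return LEAKAGE_W
    if state == "power_gated":
        # Supply is cut, so in this idealized model nothing is drawn.
        return 0.0
    raise ValueError(f"unknown core state: {state}")


for state in ("busy", "clock_gated", "power_gated"):
    print(f"{state:12s} {core_power(state):4.1f} W")
```

As leakage grows relative to active power, the clock-gated figure creeps toward the busy one, which is why clock gating alone stopped being enough.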


All four cores in this case have the same source voltage, but can be turned off individually thanks to the power gate above the core

Power gating gave Intel one very important feature: the ability to truly shut off a core when not in use. Prior to power gating, Intel, like any other microprocessor company, had to make tradeoffs in choosing core count vs. clock speed. The maximum power consumption/thermal output is effectively a fixed value; physics has something to do with that. If you want four cores in the same thermal envelope as two cores, you have to clock them lower. In the pre-Nehalem days you had to choose between two faster cores or four slower cores. There was no option for people who needed both.

Now, with the ability to mostly turn off idle cores, you can get around that problem. A fully loaded four-core CPU will still run at a lower clock than a dual-core version. With power gating, however, if you are only using two cores you have the thermal headroom to ramp up the clock speed of the two active cores (since the idle ones are effectively off).

Get a little more clever and you can do this power gate and clock up dance for more configurations. Only using one core? Power gate three and run the single active core at a really really high speed. All of this is done by a very complex piece of circuitry on the microprocessor die. Intel introduced it in Nehalem and called it the Power Control Unit (this is why engineers aren't good marketers but great truth tellers). The PCU in Nehalem was about a million transistors, around the complexity of the old Intel 486, and all it did was look at processor load, temperature, power consumption, active cores and clock speed. Based on all of these inputs it would determine what to turn off and what clock speed to run the entire chip at.

Another interesting side effect of the PCU is that if you're using all cores but they're not using the most power hungry parts of their circuitry (e.g. not running a bunch of floating point workloads) the PCU could keep all four active but run them at a slightly higher frequency.
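The power gate and clock up dance can be sketched as a fixed power budget shared among whichever cores are active. This is a deliberately crude model, not how the PCU actually decides: it assumes idle cores draw nothing, that four busy cores at a hypothetical 2.0GHz exactly fill a 45W budget, and that per-core power grows with the cube of clock speed (dynamic power scales roughly with C x V^2 x f, and voltage is assumed here to track frequency). The function name is made up for the example.

```python
# Toy power-budget model. Assumptions (not from Intel): a fixed package
# budget, power gated idle cores that draw nothing, and per-core power
# proportional to the cube of clock speed.

TDP_W = 45.0       # package budget used for this sketch
BASE_GHZ = 2.0     # assumed clock at which four busy cores fill the budget
TOTAL_CORES = 4


def max_clock_ghz(active_cores: int) -> float:
    """Highest clock the active cores could share within the budget."""
    if not 1 <= active_cores <= TOTAL_CORES:
        raise ValueError("active_cores must be between 1 and 4")
    # Power each core gets at the base clock when all four are busy.
    per_core_base_w = TDP_W / TOTAL_CORES
    # Budget available to each active core once idle cores are gated off.
    per_core_budget_w = TDP_W / active_cores
    # Invert P = k * f^3 to find the clock that spends the per-core budget.
    return BASE_GHZ * (per_core_budget_w / per_core_base_w) ** (1.0 / 3.0)


for n in (4, 2, 1):
    print(f"{n} active core(s): up to {max_clock_ghz(n):.2f} GHz")
```

Real parts move in fixed frequency steps and are also bound by temperature and current limits, so actual turbo clocks will not match what this idealized model prints. The point is only the shape of the tradeoff: fewer active cores, more clock for each.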

[Figure: TDP tradeoff across single-core, dual-core and quad-core configurations]

The PCU actually works very quickly. Let's say you're running an application that uses only a single core for a very brief period. That's more than enough time for the PCU to turn off all unused cores, turbo up the single core and complete the task quicker.

Intel calls this dynamic frequency scaling Turbo Boost (ah, this is where the marketing folks took over). The reason I went through this lengthy explanation of Turbo is that it allowed Apple to equip the 15-inch MacBook Pro with only quad-core options and not worry about it being slower than the dual-core 13-inch offering, despite its lower base clock speed (2.0GHz for the 15 vs. 2.3GHz for the 13).


13-inch MacBook Pro (left), 15-inch MacBook Pro with optional high res/anti-glare display (right)

Apple offers three CPU options in the 15-inch MacBook Pro: a 2.0GHz, 2.2GHz or 2.3GHz quad-core Core i7. These actually correspond to the Core i7-2635QM, 2720QM and 2820QM. The main differences are in the table below:

Apple 15-inch 2011 MacBook Pro CPU Comparison

Specification | 2.0GHz quad-core | 2.2GHz quad-core | 2.3GHz quad-core
Intel Model | Core i7-2635QM | Core i7-2720QM | Core i7-2820QM
Base Clock Speed | 2.0GHz | 2.2GHz | 2.3GHz
Max SC Turbo | 2.9GHz | 3.3GHz | 3.4GHz
Max DC Turbo | 2.8GHz | 3.2GHz | 3.3GHz
Max QC Turbo | 2.6GHz | 3.0GHz | 3.1GHz
L3 Cache | 6MB | 6MB | 8MB
AES-NI | No | Yes | Yes
VT-x | Yes | Yes | Yes
VT-d | No | Yes | Yes
TDP | 45W | 45W | 45W

The most annoying part of all of this is that the base 2635 doesn't support Intel's AES-NI. Apple still doesn't seem to use AES-NI anywhere in its OS, so until Lion rolls around I guess this won't be an issue. Shame on Apple for not supporting AES-NI and shame on Intel for using it as a differentiating feature between parts. The AES instructions, introduced in Westmere, are particularly useful in accelerating full disk encryption, as we've seen under Windows 7.

Note that all of these chips carry a 45W TDP. That's up from 35W in the 13-inch and last year's 15-inch model. We're talking about nearly a billion transistors fabbed on Intel's 32nm process, almost double the transistor count of the Arrandale chips found in last year's MacBook Pro. These things are going to consume more power.

Despite the fairly low base clock speeds, these CPUs can turbo up to pretty high values depending on how many cores are active. The base 2.0GHz quad-core is only good for up to 2.9GHz on paper, while the 2720QM and 2820QM can hit 3.3GHz and 3.4GHz, respectively.
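To put those turbo figures in perspective, here is a small sketch that turns the clocks from the CPU comparison table into headroom over base clock. The dictionary layout and function name are just for this example. The table lists no three-core figure, so none is included.

```python
# Maximum turbo clocks (GHz) by number of active cores, copied from the
# CPU comparison table in this article.
TURBO_GHZ = {
    "Core i7-2635QM": {"base": 2.0, 1: 2.9, 2: 2.8, 4: 2.6},
    "Core i7-2720QM": {"base": 2.2, 1: 3.3, 2: 3.2, 4: 3.0},
    "Core i7-2820QM": {"base": 2.3, 1: 3.4, 2: 3.3, 4: 3.1},
}


def turbo_headroom(model: str, active_cores: int) -> float:
    """Return max turbo as a fraction above base clock (0.45 means +45%)."""
    clocks = TURBO_GHZ[model]
    return clocks[active_cores] / clocks["base"] - 1.0


for model in TURBO_GHZ:
    gains = ", ".join(
        f"{n} core(s): +{turbo_headroom(model, n):.0%}" for n in (1, 2, 4)
    )
    print(f"{model}: {gains}")
```

These are the on-paper maximums only. What a given machine sustains depends on load, temperature and power, which is what the measurements that follow try to pin down.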

Given Apple's history of throttling CPUs and not telling anyone, I was extra paranoid about finding out whether any funny business was going on with the new MacBook Pros. Unfortunately there are very few ways of measuring turbo frequency under OS X. Ryan Smith pointed me in the direction of MSR Tools which, although not perfect, does give you an indication of what clock speed your CPU is running at.


Max single core turbo on the 2.3GHz quad-core

With only a single thread active, the 2.3GHz quad-core seemed to peak at ~3.1-3.3GHz. This is slightly lower than what I saw under Windows (3.3-3.4GHz pretty consistently running the Cinebench R10 1CPU test). Apple does do power management differently under OS X; however, I'm not entirely sure that the MSR Tools application is reporting frequency as quickly as Intel's utilities under Windows 7.


Max QC turbo on the 2.3GHz quad-core

With all cores active (once again, Cinebench R10 XCPU) the max I saw on the 2.3 was 2.8GHz. Under Windows running the same test I saw similar results at 2.9GHz.


Max QC turbo on the 2.3GHz quad-core under Windows 7

I'm pretty confident that Apple isn't doing anything dramatic with clock speeds on these new MacBook Pros. Mac OS X may be more aggressive with power management than Windows, but max clock speed remains untouched.

Mac OS X 10.6.6 vs. Windows 7 Performance

15-inch 2011 MBP, 2.0GHz quad-core | Single-Threaded | Multi-Threaded
Mac OS X 10.6.6 | 4060 | 15249
Windows 7 x64 | 4530 | 16931

Note that even though the operating frequencies are similar under OS X and Windows 7, Cinebench performance is still higher under Windows 7. It looks like there's still some software optimization that needs to be done under OS X.
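For a sense of scale, the gap can be worked out directly from the scores in the table above. This is a quick sketch with made-up helper names. It only does arithmetic on the published numbers.

```python
# Cinebench scores from the table above (15-inch 2011 MBP, 2.0GHz quad-core).
SCORES = {
    "Mac OS X 10.6.6": {"single": 4060, "multi": 15249},
    "Windows 7 x64": {"single": 4530, "multi": 16931},
}


def advantage(a: int, b: int) -> float:
    """Return how much higher score a is than score b, as a fraction."""
    return a / b - 1.0


osx, win = SCORES["Mac OS X 10.6.6"], SCORES["Windows 7 x64"]
for test in ("single", "multi"):
    lead = advantage(win[test], osx[test])
    print(f"Windows 7 lead, {test}-threaded: {lead:.1%}")

# Multi-threaded scaling: how many times faster the multi-threaded run is.
for name, s in SCORES.items():
    print(f"{name} scaling: {s['multi'] / s['single']:.2f}x")
```

Windows 7 comes out roughly 11 to 12 percent ahead in both tests, while multi-threaded scaling is nearly identical on the two operating systems.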


198 Comments


  • zhill - Friday, March 11, 2011

    Good article. I was thinking about your issue with the high cpu utilization, and could it simply be a reporting issue? Could the cpu performance counters or OSX be reporting QuickSync as part of the cpu rather than the GPU? This would certainly be strange and not accurate, but given that intel seems to list QuickSync and HD3000 separately, maybe the reporting stats aren't accurate. Presumably this would be an issue in both Windows and OSX, but at the driver level there could be differences. Just a thought.

    Have you, or anyone else, noticed heat issues with the MBP lid closed versus open? Aren't the vent ports along the back next to the hinge such that when open they can vent, but when closed airflow could be inhibited?
  • Anand Lal Shimpi - Friday, March 11, 2011

    I thought about that too, but there seems to be a genuine increase in thermal output from the CPU - higher than I'd expect from idle cores and the quick sync engine active.

    I haven't personally noticed any heat issues with the lid open vs. closed, seems to behave similarly (although now that you mention it I feel like open I do get temperatures a couple of degrees cooler than when it's closed - that could just be psychological though as the comparison is completely unscientific).

    Take care,
    Anand
  • Omid.M - Friday, March 11, 2011

    Anand,

    So do the 15-17" MBPs have hardware acceleration support for Flash? I didn't see that explicitly in the review; sorry if I missed it, but I tweeted you asking for this.

    The last MBP update, Anand said the 13" he could highly recommend, but the 15" got way too hot under load.

    This update, Anand said the 13" he could highly recommend, but the 15" gets way too hot under load.

    Hmm. (not insinuating anything, Anand and crew)

    I find that odd. But, maybe it's a good thing: I'm not comfortable buying an MBP until Apple build TRIM support for 3rd party SSDs into OSX. I would not want the Apple SSDs.

    My early 2008 MBP is still running fine, although I'm tempted by the QC models. Maybe waiting until Ivy Bridge, in hopes of a cooler laptop, will be enough time to see if Apple brings TRIM for after-market SSDs.

    I'm disappointed, but I guess this review saved me some money until next year.
  • Anand Lal Shimpi - Friday, March 11, 2011

    Sorry I think I missed your tweet! I measured around 40 - 60% CPU utilization of a single core when viewing a 1080p HD video in YouTube on the new 15-inch MBP (same CPU usage for both the iGPU and dGPU).

    The frame rate was perfectly smooth, but it's unclear to me how much lifting is being done by the GPU here.

    Last year's 15 was pretty warm, but this year's model definitely didn't take a step back in that department - transistor count nearly doubled after all!

    The move to 22nm should bring about marginal updates to architecture so I'm hoping for lower power consumption at similar performance levels.

    Take care,
    Anand
  • Omid.M - Friday, March 11, 2011

    Anand,

    You mentioned in the last MBP refresh/review that the 13" showed support for TRIM in OSX (evidenced in System Profiler, I believe).

    You also said in this refresh/review that Apple supports TRIM for its own SSDs only.

    To my knowledge, the last MBP generation had the SSD option for both 13" and 15-17" models, meaning the same SSD was offered across all models.

    If TRIM is only supported for Apple SSDs, why did we see an evidence of TRIM in last year's 13" model but no evidence for the 15/17, assuming the same SSD was offered across the entire line and assuming the version of OSX shipped with the last models was the same across the line?

    Was that due to different chipset drivers because the 13" had the Core 2 Duo/Nvidia combo, and the older 15/17 had Core i5/i7 (thus, newer chipset) ?

    Does it make sense what I'm asking?
  • tno - Friday, March 11, 2011

    Apple ships different versions (small tweaks) of OSX with different laptops, and there is the key. If you recall, the field in System Profiler was populated indicating that at some level the chipset (Nvidia sourced) supported the instruction, but SSDs that supported the instruction did not.

    So you're correct, Nvidia chipset driver supported TRIM, but the OS did not implement the instruction. The Core i5/i7 integrated chipset driver had no support for TRIM.

    http://www.anandtech.com/show/3762/apples-13inch-m...
  • name99 - Friday, March 11, 2011

    "I saw a number of different MCS (modulation coding scheme) values with the 2011 MBP in the exact same place. Link rates from just below 300 Mbps all the way up to the expected 450. It seems to settle out at the expected 450 Mbps in the same room as the AP, it just takes a while, whereas other 2x2 stacks I've seen always lock onto 300 Mbps and stay there in the same room and position."

    Is the state of the art any better than this?
    The reason I ask is that the simple WiFi problem (1x1 antenna, what is the best modulation + puncturing I should use for this SINR?) is well understood.
    But once MIMO enters the picture there are so many more options available --- for example: should we try to use all receive antennas for different streams, and run those three streams at "robust" modulation, or should we transmit a single "fragile" (64-QAM, 5/6) stream, and rely on receive diversity to be able to detect it without error? If we send a "fragile" stream, should we use the transmit antennas to perform beam shaping to target more power at the target?

    As I understand it, optimal methods for handling the juggling between all the different types of diversity available in the MIMO space still do not really exist (if anyone has a reference stating otherwise, please provide it).
    If this is the case, it would not surprise me if, on either the base station end, the laptop end, or both, you have a huge amount of bouncing around between different possibilities (of course with 3x3:3 the space is larger than with 2x2:2 or 2x3:2), because what is being used to make the choices are simply heuristics, not engineered algorithms, and the heuristics are extremely sensitive to the slightest changes in the SINR covariance matrix.
  • Brian Klug - Friday, March 11, 2011

    I haven't really played around with other 3x3 WiFi stacks enough to say for certain. I agree with you that a lot of this is it making decisions about whether to prioritize connection robustness or throughput rate. At close ranges, it certainly selects the MCS that gives the most throughput, but I'm still shocked to not see more 450 Mbps when in the exact same room as the AP.

    Moving away, you'll quickly fall back to single stream rates (but obviously still get MIMO range extension). You're exactly right that everyone has their own heuristics for how to do this based on SINR. I still haven't figured out how to actually grab SINR out on here, all I can see for the moment is just RSSI. Completely agreed though.

    -Brian
  • MrCromulent - Friday, March 11, 2011

    Once again a very detailed, comprehensive and yet easy to understand article!

    I'd like to inquire once more about the C300: In the initial test, the C300 was criticized for poor garbage collection. Now it's considered an option for Apple notebooks. Has the GC been improved by Marvell in the last few firmware updates?
  • Griswold - Friday, March 11, 2011

    Interesting revenue information right at the start. Apple went from a computer- to a music&player- to a phone company. :P
