Intel's Sandy Bridge Architecture Exposed

Author: Anand Lal Shimpi

by Anand Lal Shimpi on September 14, 2010 4:10 AM EST


New, More Aggressive Turbo

Lynnfield was the first Intel CPU to aggressively pursue the idea of dynamically increasing the core clock of active CPU cores while powering down idle cores. The idea is that if you have a 95W TDP for a quad-core CPU, but three of those four cores are idle, then you can increase the clock speed of the one active core until you hit that TDP limit.

In all current-generation processors, the assumption is that the CPU reaches its TDP immediately upon enabling turbo. In reality, however, the CPU doesn’t heat up immediately; there’s a ramp, a period of time during which the CPU isn’t dissipating its full TDP.

Sandy Bridge takes advantage of this by allowing the PCU to turbo up active cores above TDP for short periods of time (up to 25 seconds). The PCU keeps track of available thermal budget while idle and spends it when CPU demand goes up. The longer the CPU remains idle, the more potential it has to ramp up above TDP later on. When a workload comes around, the CPU can turbo above its TDP and step down as the processor heats up, eventually settling down at its TDP.
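To make the bank-and-spend idea concrete, here is a minimal sketch in Python. It is not Intel's PCU algorithm, which isn't disclosed here; it is a simple energy-bucket model. The class name TurboBudget, the 500 J cap, and the 45 W and 115 W power figures used below are all hypothetical, chosen only so the numbers are easy to follow. The 95W TDP comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class TurboBudget:
    """Toy model of banking thermal headroom while idle and spending it later.

    All numbers are illustrative assumptions, not Intel specifications.
    """
    tdp_watts: float = 95.0            # TDP from the article's quad-core example
    max_budget_joules: float = 500.0   # hypothetical cap on banked energy
    budget_joules: float = 0.0         # energy currently banked

    def step(self, demand_watts: float, dt: float = 1.0) -> float:
        """Advance the model by dt seconds and return the power granted."""
        if demand_watts <= self.tdp_watts:
            # Running below TDP: bank the unused headroom, up to the cap.
            banked = (self.tdp_watts - demand_watts) * dt
            self.budget_joules = min(self.max_budget_joules,
                                     self.budget_joules + banked)
            return demand_watts

        # Running above TDP: spend banked energy, then fall back to TDP.
        extra_wanted = (demand_watts - self.tdp_watts) * dt
        extra_granted = min(extra_wanted, self.budget_joules)
        self.budget_joules -= extra_granted
        return self.tdp_watts + extra_granted / dt


if __name__ == "__main__":
    pcu = TurboBudget()
    for _ in range(10):          # ten seconds of light load at 45 W
        pcu.step(45.0)
    for second in range(30):     # then a heavy load asking for 115 W
        print(second, pcu.step(115.0))
```

With these made-up figures the model holds 115 W until the banked energy runs out and then drops to 95 W. The real chip steps down gradually as it heats up rather than falling off a cliff, so treat this as an illustration of the accounting only.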

While SNB can turbo up beyond its TDP, the PCU won’t allow the chip to exceed any reliability limits (similar to turbo today).

In addition to above-TDP turbo, Sandy Bridge will also support more turbo bins than Nehalem/Westmere. Intel isn’t disclosing how much more turbo headroom we’ll have, but the additional bins are at least visible with multiple cores active. Current designs usually only turbo up one or two bins with all four cores active; I’d expect to see another bin or two there and possibly more in lighter load cases.
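For readers unfamiliar with the term, a turbo bin is one multiplier step above the base clock. The sketch below shows the arithmetic, assuming the 133.33 MHz base clock of Lynnfield-era parts. The bin table and the multiplier of 22 are hypothetical examples, not a published specification for any particular chip, and Sandy Bridge's own bin counts are not disclosed.

```python
BCLK_MHZ = 133.33  # base clock assumed for Lynnfield-era parts

# Hypothetical bin table: number of active cores -> turbo bins allowed.
# Fewer active cores leave more headroom, so more bins are available.
TURBO_BINS = {1: 5, 2: 4, 3: 2, 4: 2}


def turbo_frequency_mhz(base_multiplier: int, active_cores: int) -> float:
    """Return the peak turbo frequency for a given number of active cores."""
    bins = TURBO_BINS[active_cores]
    return (base_multiplier + bins) * BCLK_MHZ


if __name__ == "__main__":
    for cores in sorted(TURBO_BINS):
        print(cores, "active:", round(turbo_frequency_mhz(22, cores)), "MHz")
```

Adding "another bin or two" with all four cores active would simply mean raising the last entries in a table like this one.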

Both CPU and GPU turbo can work in tandem. GPU-bound workloads running on SNB can cause the CPU cores to clock down and the GPU to clock up, while CPU-bound tasks can drop the GPU frequency and increase the CPU frequency.
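One way to picture the trade-off is as a single package power budget shared between the two. The following Python sketch is only an assumption about how such sharing could be modeled, not a description of the PCU's actual policy. The function share_power, its parameters, and every wattage in the usage example are hypothetical.

```python
def share_power(package_watts: float,
                cpu_request: float, gpu_request: float,
                cpu_floor: float, gpu_floor: float,
                gpu_bound: bool) -> tuple[float, float]:
    """Split a shared power budget, favoring whichever unit is the bottleneck.

    The favored unit gets what it asks for, minus a guaranteed floor for the
    other unit; the other unit then gets whatever is left, up to its request.
    Returns (cpu_watts, gpu_watts).
    """
    if gpu_bound:
        gpu = min(gpu_request, package_watts - cpu_floor)
        cpu = min(cpu_request, package_watts - gpu)
    else:
        cpu = min(cpu_request, package_watts - gpu_floor)
        gpu = min(gpu_request, package_watts - cpu)
    return cpu, gpu


if __name__ == "__main__":
    # Hypothetical figures: 95 W package, CPU wants 80 W, GPU wants 40 W.
    print(share_power(95, 80, 40, cpu_floor=30, gpu_floor=10, gpu_bound=True))
    print(share_power(95, 80, 40, cpu_floor=30, gpu_floor=10, gpu_bound=False))
```

In the GPU-bound case the GPU gets its full request and the CPU is trimmed; in the CPU-bound case the roles reverse.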

Sandy Bridge as a whole is a much more dynamic beast than anything that’s come before it.

62 Comments

name99 - Tuesday, September 14, 2010 - link
This is no secret. This is exactly Intel's tick-tock strategy that has been in place for years now.

The one thing you have to keep in mind is that designing these CPUs now takes on the order of SEVEN YEARS (!!!) from conception to ship, which means that slips and mistakes do occur. Intel (and I guess AMD) have to make their best guess as to what the market will look like in seven years and sometimes they do guess incorrectly. Of course there is scope for small changes along the way closer to the release date, but not for changes in the grand strategy.
medi01 - Tuesday, September 14, 2010 - link
Agreed, it was two things: greed and the fact that AMD is currently not in a position to be a threat.
tatertot - Tuesday, September 14, 2010 - link
"The value segments won’t see Sandy Bridge until 2012."

You later show a roadmap slide which indicates Sandy Bridge in the value segment in Q3 2011.

Perhaps you meant "H2 '11" instead of "2012" ?
J_Tarasovic - Thursday, September 16, 2010 - link
I think that the roadmap probably refers to OEM shipments, whereas Anand was probably referring to when consumers would actually be able to buy devices.
iwodo - Tuesday, September 14, 2010 - link
I just realized that my computer will no longer scream when I do WebCam Video Conferencing with Skype! With the Encoder Engine and Decoder Engine, all I am doing is feeding USB 3.0 data and moving it around...
yuhong - Tuesday, September 14, 2010 - link
"Back in the Core Duo days that was 80-bits of data. When Intel implemented SSE, the burden grew to 128-bits. "
"Core Duo" Huh?
NaN42 - Tuesday, September 14, 2010 - link
No, it seems to be right. Core Duo belongs to the Pentium M microarchitecture which implemented the SSE registers as two 64bit registers. So the largest registers were the x87-registers, but I'm not sure whether upon register renaming the registers were really copied.
aka_Warlock - Tuesday, September 14, 2010 - link
New CPU from Intel... and guess what?!! New SOCKET!! Lol.
Intel do know how to milk the stupid cow.
bernpi - Sunday, November 14, 2010 - link
For most people it makes perfect sense to get a new socket. Most people don't buy every new CPU from Intel or AMD because it would be a waste of money. My current CPU is a Core2Duo Quad processor with a 775 socket; I skipped the Nehalem generation and will buy a Sandy Bridge early next year. So why should I keep my motherboard and the old 775 socket? Of course I will buy a new motherboard for the new processor. So I think for most people this is not a real issue.
Sahrin - Tuesday, September 14, 2010 - link
There's a lot of "neato" stuff that does a lot to improve the user experience by making the chip use its design resources more intelligently (smarter turbo - that 'comcast turbo-boost' feature should really make a difference for end users); but in terms of actual throughput it looks like Intel left FP performance the same; and there certainly isn't any new integer hardware.

K11, on the other hand, doubled integer ALUs (though the raw number of execution units is now the same as in a Nehalem core) and added a half-width (compared to Intel) FP unit.

First, I'd be interested to see if the whizz-bangies AMD was talking about for the K11 FPU a year ago make the execution time for 128-bit FP instructions comparable to, better than, or still slower than Intel's FPU.

Second, I'd be quadruple interested to see what impact the way AMD is allocating the new integer hardware is going to have on performance. A monolithic Nehalem core is going to be able to handle more complex (wider) threads better than a K11 core (that's a 2-integer and 1-FPU Bulldozer); but in SMT-mode (or pseudo-SMT mode) what happens? We know Intel experiences a performance hit in HTT mode which they are only able to offset because Nehalem is so wide. AMD thinks it isn't going to get the expected hit in the front end, and they won't have the thread-switching penalty that Intel does. My prediction is that 8-core K11/Bulldozer will crush Sandy Bridge in multithreaded FP-light workloads and be 5-20% slower in everything else (the possible exception being 128-bit floats).

I'm actually kind of disappointed by this update to Nehalem...Intel did a lot of "uncore" stuff and implemented AVX. Where's our wider back-end? More execution hardware drives better single-thread performance...the rest is just undoing the damage from the CISC-RISC transition in the front end and OoO.
