On-package GPU and Graphics Turbo

Arrandale and Clarkdale are two-die packages. There's the 32nm CPU die and next to it is a 45nm DirectX 10 GPU die (no DX11 support until possibly Larrabee).

This isn't Larrabee (yet), it's a direct descendant of the graphics in G45. While G45 was built on a 65nm process, the 'dale graphics is built on a 45nm process.

The smaller transistors enable much higher performance. While G45 had 10 shader cores, the 'dale GPU increases that to 12. A number of performance-limiting issues have now been resolved, so we should see much more competitive performance from Intel's graphics.

The memory controller has been moved off of the CPU die and is on the GPU die instead. It's still on-package so you get decently low latencies, but it shouldn't technically be as low as on Lynnfield. This is a temporary problem that fixes itself once the CPU/GPU are on the same die with Sandy Bridge.


Sandy Bridge brings on-die graphics

I've already explained turbo mode quite a bit so I won't rehash it here. The technology basically allows you to run your CPU at the fastest possible frequency regardless of how many cores are active. Westmere has this.

Arrandale will support graphics turbo modes, while Clarkdale won't. Clarkdale graphics is already running as fast as possible regardless of TDP.

If the GPU demand is higher than the CPU demand, the CPU will allocate more of its TDP to the GPU and vice versa.
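
To make the budget-sharing idea concrete, here's a minimal sketch of how a shared power budget could be divided. This is not Intel's actual power-management algorithm; the function name, the proportional split, the per-unit floors and the 35W package budget are all assumptions for illustration.

```python
def split_tdp(package_tdp_w, cpu_demand, gpu_demand,
              cpu_floor_w=10.0, gpu_floor_w=5.0):
    """Split a shared package power budget between CPU and GPU.

    Hypothetical model: demands are utilization estimates in [0, 1].
    Each side is guaranteed a floor (assumed values), and the remaining
    headroom is divided in proportion to demand.
    """
    if package_tdp_w < cpu_floor_w + gpu_floor_w:
        raise ValueError("package TDP is smaller than the combined floors")
    if cpu_demand < 0 or gpu_demand < 0:
        raise ValueError("demands must be non-negative")

    headroom = package_tdp_w - cpu_floor_w - gpu_floor_w
    total_demand = cpu_demand + gpu_demand
    # With no demand at all, split the headroom evenly.
    cpu_share = 0.5 if total_demand == 0 else cpu_demand / total_demand

    cpu_w = cpu_floor_w + headroom * cpu_share
    gpu_w = package_tdp_w - cpu_w  # the two budgets always sum to the package TDP
    return cpu_w, gpu_w


# GPU-heavy workload (e.g. a game): more of the budget goes to the GPU.
cpu_w, gpu_w = split_tdp(35.0, cpu_demand=0.2, gpu_demand=0.8)
print(f"GPU-heavy: CPU {cpu_w:.1f}W, GPU {gpu_w:.1f}W")

# CPU-heavy workload (e.g. video encoding): the balance swings the other way.
cpu_w, gpu_w = split_tdp(35.0, cpu_demand=0.9, gpu_demand=0.1)
print(f"CPU-heavy: CPU {cpu_w:.1f}W, GPU {gpu_w:.1f}W")
```

The only point of the sketch is that the total stays fixed: whatever one side gains in power (and therefore clock headroom), the other side gives up.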


96 Comments


  • mczak - Thursday, September 24, 2009 - link

    Hmm, clarkdale beating c2q in specfp:
    "You can see that thanks to a competitive clock speed, aggressive turbo modes and Hyper Threading the 3.33GHz Clarkdale outperforms both the Q9400 and the E8500."

    I'll fix this for you:
    "Thanks to the low memory bandwidth available to the c2q due to FSB limitations, c2q scales terribly and is hardly any faster than c2d which allows clarkdale to beat c2q"...

    Still, performance is certainly solid. There's no way however that Clarkdale will beat this core 2 quad in more typical multithreaded applications which aren't as bandwidth limited as specfp, for instance video encoding. But at least it will be somewhat close.
  • CajunArson - Saturday, September 26, 2009 - link

    Wow, I'm so glad you are such a genius and are correcting those dumb Anandtech guys who waste all their time researching and benchmarking CPU technology! [/sarcasm]

    Seriously, the FSB has never been a bottleneck on consumer systems, particularly on notebooks where the CPU is not clocked up the wazoo to begin with. The FSB was a limitation in 2+ socket systems which is why Nehalem came out... as Anand and many others pointed out when Nehalem was new, the primary reason for abandoning the FSB was that it did not scale to multiple CPU sockets. Now the point-to-point architecture is superior, but it's like having 200mph racing tires on a car that can't take advantage of them anyway: nice to have, but they don't make you any faster.

    Just go back and look at the original benchmarks of the supposedly "superior" Barcelona when it came out: Using an on-die L3 cache to transfer data between cores on Barcelona was a blazing 2% (yes two whole percent) faster than the quad-core desktop conroes swapping data over the FSB... not much of a bottleneck.
  • mczak - Sunday, September 27, 2009 - link

    I'm not saying FSB is really that much of a bottleneck on consumer systems, the problem is that IN THIS SPECIFIC CASE with specFP it is (of course, specfp isn't exactly relevant for consumer systems) indeed a problem. Hence clarkdale with its two cores will not, as specfp would indicate, beat c2q in more typical multithreaded workloads.
    Merely pointing out the comment about why clarkdale is faster than c2q in specFP indeed is bogus - sure clock rate, turbo (not much as specfp rate will use 4 threads) etc. help but fact is this would give the impression that clarkdale could achieve c2q performance which it will not (for multithreaded workloads) unless they are heavily memory limited like specfp, which is unlikely. Not that this is really a bad thing as that would be too good to be true anyway (the core of a core2 duo and clarkdale is very similar so this would be very much a miracle indeed).
  • Inkie - Saturday, October 03, 2009 - link

    PCMark Vantage comparative scores are even better than SFP...
  • GeorgeOu - Saturday, September 26, 2009 - link

    "Thanks to the low memory bandwidth available to the c2q due to FSB limitations, c2q scales terribly and is hardly any faster than c2d which allows clarkdale to beat c2q"...

    Even a Core 2 quad no matter the GHz with a single socket isn't going to flood a north bridge controller and the FSB. Even a sub 2 GHz dual-socket Harpertown quad-core isn't really hitting the FSB/NB wall for the most part. Where Intel gets into trouble with the FSB is when they're running two sockets with two high clocked quad-cores.
  • mczak - Saturday, September 26, 2009 - link

    That's generally true but not in all cases. In fact you can easily see some performance degradation even with dual-cores (the pentiums with only 800MHz FSB) with specific apps (or of course any synthetic memory benchmark) so it's not surprising to see this issue come up with specfp. Fact is, if you've got enough memory bandwidth, specFP rate should scale perfectly with core count. According to results published at spec.org, a C2D E8400 scores ~30. A C2Q QX9650 (same clock) scores ~45. Clearly, that's not good scaling, and AFAIK this is solely due to lack of memory (or rather FSB) bandwidth.
    It is incorrect to say the cpu isn't going to saturate the FSB. Even a 2GHz C2D can already do that very easily as any memory benchmark will show, thankfully most applications aren't really in need of that much memory bandwidth, but specFP IS a memory bandwidth hog.
  • mdbusa - Thursday, September 24, 2009 - link

    I don't know about anyone else but I am thoroughly confused by all the different nomenclature used by intel. We have these names Clarkdale, etc... then we have chip names? i5, i7 etc...., then we have 45 nm etc. P55 etc. blah blah
    Now I'll go read the article
  • SenorB - Saturday, September 26, 2009 - link

    Just guessing, but note that Field = 4, Dale = 2. Cloverfield was a quad-core proc too (before it was a monster movie).
  • SenorB - Saturday, September 26, 2009 - link

    My bad, it was Clovertown, not Cloverfield. Still, I always thought it was a sly little joke on Intel's part: quad core, clover... or am I giving them too much credit?
  • the zorro - Friday, September 25, 2009 - link

    intel graphics?
    no thanks.
