Intel's Core 2 Extreme & Core 2 Duo: The Empire Strikes Back

Author: Anand Lal Shimpi

by Anand Lal Shimpi on July 14, 2006 12:00 AM EST

Posted in
CPUs


Gaming with Core 2 and CrossFire on 975X

We were so used to getting excited over AMD processor launches that we almost forgot what an important Intel CPU launch was like. You see, AMD and Intel behave very differently when at a dinner table preparing to eat their meals. AMD will eat when its partners eat; companies like ATI and NVIDIA get to share in the joy of a new AMD product launch as they are busy building chipsets for the new platform. That's why we get a new nForce chipset whenever AMD launches a new CPU. Intel on the other hand isn't as generous; Intel likes to eat first, and then whatever remains after it's nice and full can be scraped off the table and given to its partners. This is why today's launch is taking place pretty much exclusively on Intel chipsets, with retail products based on ATI/NVIDIA chipsets shipping in the coming months.

Intel's table manners aren't as nice as AMD's largely because they don't have to be. Intel has a lot more fabs than AMD; however, they aren't all pumping out 65nm Core 2 Duos on 300mm wafers. Instead, many of them are still using old 90nm or 130nm process technology. It's not exactly economically feasible to keep converting all of the fabs to the latest technology as soon as it's available, so Intel uses up excess capacity in its older fabs by producing chipsets. AMD does not have this luxury, so it depends on companies like ATI, NVIDIA, SiS and VIA for the platform side of things, and thus is much nicer at the dinner table.

Eating habits aside, what this means for us is that our only real options to test Core 2 Duo are with Intel chipsets. NVIDIA's nForce 590 SLI reference board for Core 2 Duo is in our labs, but its BIOS isn't finalized yet, so NVIDIA is asking us to hold off on using it for a couple more weeks. At the same time, we're hearing that we shouldn't expect any retail Core 2 Duo motherboards using ATI chipsets until September at the earliest, once again leaving us with Intel.

Don't get us wrong; Intel chipsets are far from a terrible option. In fact, Intel continues to make extremely trouble-free platforms. It's not stability or performance that we're concerned about, as Intel has got both of those down pat. The issue however is multi-GPU compatibility.

You see, NVIDIA is a lot like Intel in that it wants to eat first or maybe, if the right people are at the table, at the same time as its partners. The problem with two companies that have identical eating habits is that no one ends up eating, and thus we have no SLI support on Intel chipsets. NVIDIA views this as an upper hand because honestly it's the only tangible advantage anyone has ever held over an Intel chipset since the days when Intel and Rambus were inseparable. If you want the best multi-GPU solution on the market you buy NVIDIA graphics cards, but they won't run (together) on Intel chipsets so you've got to buy the NVIDIA chipset as well - sounds like NVIDIA is trying to eat some of Intel's dinner, and this doesn't make Intel very happy.

Luckily for Intel, there's this little agreement it has with NVIDIA's chief competitor - ATI. Among other things, it makes sure that Intel platforms (or platform in this case, since it only officially works on the 975X) can support CrossFire, ATI's multi-GPU technology. Unfortunately, CrossFire isn't nearly as polished as NVIDIA's SLI. Case in point would be benchmarking for this Core 2 Duo article, which used a pair of X1900 XTs running in CrossFire mode. During our testing, CrossFire decided to disable itself after a simple reboot - twice. No warnings, no hardware changes, just lower frame rates after a reboot and a CrossFire enable checkbox that had become unchecked. Needless to say it was annoying, but by now we already know that CrossFire needs work and ATI is on it.

More than anything this is simply a message to ATI and Intel: if CrossFire had been in better shape, the high end gaming enthusiast could have been satisfied today, but instead they will have to wait a little longer for the first nForce 500 motherboards with Core 2 support to arrive (or settle for an nForce 4 board with Core 2 support).

Why does multi-GPU even matter? Given how fast Intel's Core 2 processors are, we needed to pair them with a GPU setup that was well matched - in this case we went with a pair of X1900 XTs running in CrossFire mode. With a pair of X1900 XTs we could run at 1600 x 1200 for all of our gaming tests, achieving a good balance between CPU and GPU loads and adequately characterizing the gaming performance of Intel's Core 2 line.


202 Comments


coldpower27 - Friday, July 14, 2006 - link
Are these supposed to be there? They aren't functioning in Firefox 1.5.0.4.
coldpower27 - Friday, July 14, 2006 - link
You guys fixed it, awesome.
Orbs - Friday, July 14, 2006 - link
On "The Test" page (I think page 2), you write:

please look back at the following articles:

But then there are no links to the articles.

Anyway, Anand, great report! Very detailed with tons of benchmarks using a very interesting gaming configuration, and this review was the second one I read (so it was up pretty quickly). Thanks for not sacrificing quality just to get it online first, and again, great article.

Makes me want a Conroe!
Calin - Friday, July 14, 2006 - link
Great article, and thanks for a job well done. Conroe is everything Intel's marketing machine showed it to be.
stepz - Friday, July 14, 2006 - link
The Core 2 doesn't have smaller memory latency than K8. You're seeing the new advanced prefetcher in action. But don't just believe me, check with the SM2.0 author.
Anand Lal Shimpi - Friday, July 14, 2006 - link
That's what Intel's explanation implied as well: when they are working well, the prefetchers remove the need for an on-die memory controller so long as you have an unsaturated FSB. Inevitably there will be cases where AMD is still faster (from a pure latency perspective), but it's tough to say how frequently that will happen.

Take care,
Anand
stepz - Friday, July 14, 2006 - link
Excuse me. You state "Intel's Core 2 processors now offer even quicker memory access than AMD's Athlon 64 X2, without resorting to an on-die memory controller.". That is COMPLETELY wrong and misleading. (see: http://www.aceshardware.com/forums/read_post.jsp?i... )

It would be really nice, from a journalistic integrity point of view and all that, if you posted a correction or at least silently changed the article so that it doesn't spread incorrect information.

Oh... and you really should have smelt something fishy when a memory controller suddenly halves its latency by changing the requestor.
stepz - Friday, July 14, 2006 - link
To clarify: yes, the prefetching and especially the speculative memory op reordering do wonders for real-world performance. But then let the real-world performance results speak for themselves. Please don't use broken synthetic tests. The advancements help to hide latency from applications that do real work. They don't reduce the actual latency of memory ops that the test was supposed to measure. Given that the prefetcher figures out the access pattern of the latency test, the test is utterly meaningless in any context. The test doesn't do anything close to real-world work, so if its main purpose is broken, it is utterly useless.
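
To make the access-pattern point concrete, here is a minimal sketch of how a pointer-chasing latency test can be laid out in two ways. It is an illustration only, not the method used by the benchmark discussed in this thread: the function names are hypothetical, and Python's interpreter overhead means timing this code would not yield meaningful nanosecond figures (a real test would be written in C or assembly). What it does show is the difference between a stride-1 chain, which a hardware prefetcher can predict, and a randomized single-cycle chain, where the next address is unknown until the current load completes.

```python
import random


def sequential_chain(n):
    """Each slot points to the next one: a stride-1 walk that is easy to predict."""
    return [(i + 1) % n for i in range(n)]


def random_chain(n, seed=0):
    """Random permutation forming ONE cycle of length n (Sattolo's algorithm).

    A single cycle guarantees the walk touches every slot before repeating,
    so the whole buffer is exercised in an unpredictable order.
    """
    rng = random.Random(seed)
    chain = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i is what makes the result a single cycle
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def walk(chain, steps, start=0):
    """Follow the chain; every lookup depends on the result of the previous one."""
    idx = start
    visited = []
    for _ in range(steps):
        visited.append(idx)
        idx = chain[idx]
    return visited
```

Walking `sequential_chain(n)` visits slots in address order, while walking `random_chain(n)` visits every slot exactly once in a shuffled order. On real hardware, a large gap between the two timings would suggest that the sequential figure reflects prefetching more than raw memory latency.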
JarredWalton - Friday, July 14, 2006 - link
Modified comments from a similar thread further down:

"Given that the prefetcher figures out the access pattern of the latency test, the test is utterly meaningless in any context."

That's only true if the prefetcher can't figure out access patterns for all other applications as well, and from the results I'm pretty sure it can. You have to remember, even with the memory latency of approximately 35 ns, that delay means the CPU now has about 100 cycles to go and find other stuff to do. At an instruction fetch rate of 4 instructions per cycle, that's a lot of untapped power. So, while it waits on main memory access one, it can be scanning the next accesses that are likely to take place and start queuing them up and priming the RAM. The net result is that you may never actually be able to measure latency higher than 35-40 ns or whatever.
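
The "about 100 cycles" figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes a 2.93 GHz core clock, which is not stated in this comment, and treats the 4-instructions-per-cycle fetch rate as a theoretical peak; the function names are illustrative.

```python
def stall_cycles(latency_ns, clock_ghz):
    """Core clock cycles that elapse during one memory access.

    1 GHz is one cycle per nanosecond, so cycles = nanoseconds * GHz.
    """
    return latency_ns * clock_ghz


def peak_issue_slots(latency_ns, clock_ghz, width=4):
    """Upper bound on instructions that could be fetched while waiting."""
    return stall_cycles(latency_ns, clock_ghz) * width


# Assumed inputs: 35 ns latency (from the comment), 2.93 GHz clock (assumption).
cycles = stall_cycles(35, 2.93)
slots = peak_issue_slots(35, 2.93)
print(f"{cycles:.0f} cycles, up to {slots:.0f} instruction slots")
```

Under those assumptions a single 35 ns access costs a little over 100 cycles, or roughly 400 instruction slots at peak fetch width, which is the window the prefetcher and reordering logic have to work with.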

The way I think of it is this: pipeline issues aside, a large portion of what allowed Athlon 64 to outperform NetBurst was reduced memory latency. Remember, Pentium 4 was easily able to outperform Athlon XP in the majority of benchmarks -- it just did so at higher clock speeds. (Don't *even* try to tell me that the Athlon XP 3200+ was as fast as a Pentium 4 3.2 GHz! LOL. The Athlon 64 3200+ on the other hand....) AMD boosted performance by about 25% by adding an integrated memory controller. Now Intel is faster at similar clock speeds, and although the 4-wide architectural design helps, not to mention 4MB shared L2, they almost certainly wouldn't be able to improve performance without improving memory latency -- not just in theory, but in actual practice. Looking at the benchmarks, I have to think that our memory latency scores are generally representative of what applications see.

If you have to engineer a synthetic application specifically to fool the advanced prefetcher and op reordering, what's the point? To demonstrate a "worst case" scenario that doesn't actually occur in practical use? In the end, memory latency is only one part of CPU/platform design. The Athlon FX-62 is 61.6% faster than the Pentium XE 965 in terms of latency, but that doesn't translate into a real world performance difference of anywhere near 60%. The X6800 is 19.3% faster in memory latency tests, and it comes out 10-35% faster in real world benchmarks, so again there's not an exact correlation. Latency is important to look at, but so is memory bandwidth and the rest of the architecture.

The proof is in the pudding, and right now the Core 2 pudding tastes very good. Nice design, Intel.
coldpower27 - Friday, July 14, 2006 - link
But why are you posting the Manchester core's die size?

What about the Socket AM2 Windsor 2x512KB model which has a die size of 183mm2?
