Original Link: http://www.anandtech.com/show/1662
Dual Core Intel Platform Shootout - NVIDIA nForce4 vs. Intel 955X
by Anand Lal Shimpi on April 14, 2005 1:01 PM EST
Four years ago, NVIDIA previewed their first ever desktop chipset - the nForce 420 - at Computex. The anticipation of NVIDIA's entry into the Athlon chipset market at the time was astounding. While they didn't get it right the first time around, by the end of nForce2's reign, VIA had relinquished the throne as the most desirable supplier of AMD chipsets. Late last year, when NVIDIA announced that they had finally signed a cross licensing agreement with Intel, we knew it meant that NVIDIA's chipsets would soon be coming to the Intel platform, but honestly, we didn't really care. We hadn't recommended an Intel CPU since the introduction of Prescott and this time around, NVIDIA's biggest competition wasn't VIA, it was Intel - and it's rare that you beat Intel in making chipsets for their own processors.
Honestly, Intel processors and even the platform haven't been interesting since the introduction of Prescott. The processors have run too hot and performed poorly, not to mention that the latest Intel platforms forced a transition to technologies that offered basically no performance benefits (DDR2, PCI Express). A bit of that changed when Intel brought forth their dual core plans - assuming that they can actually guarantee availability, Intel is planning to ship more desktop dual core processors, at lower prices, than AMD this year. As we mentioned in our preview of Intel's dual core Pentium D, the cheapest dual core processors will weigh in at $241 for the 2.8GHz models. While for the same price you can get a much faster single core AMD CPU, the word "faster" applies selectively depending on what sort of usage model you're looking at - whether it's heavy multitasking, or mostly running single applications. We've already had that discussion, and the decision is still in your hands, but needless to say, Intel's processors have all of a sudden become much more interesting given the proposed price point for their entry-level dual core CPUs. Now, all of a sudden, there's some purpose to actually looking at the latest chipsets for the Intel platform.
We have yet to recommend any of Intel's single core Prescott CPUs, and if you are looking for a single core Pentium 4, then you should already have a good idea of what chipsets are out there. But for dual core, the platform support is much more limited. None of Intel's previous chipsets will support dual core; only their most recently announced 955X and 945 chipsets do. On the NVIDIA side, their nForce4 SLI Intel Edition chipset does support dual core, but NVIDIA stipulates that the motherboard manufacturers must implement that support properly on the design side. As long as the motherboard manufacturer states that their nForce4 board supports Intel's dual core, you should be sitting pretty. Chipsets from all manufacturers, including ATI, SiS and VIA, will undoubtedly offer dual core support, but their release is further down the line. What we're looking at today are the two heavyweights that are supposed to be available in the channel by the end of this month.
The Delicate Competition
The NVIDIA/Intel relationship is a very interesting one; as with any of these types of relationships, it is not one born out of love, but rather necessity. At the end of the day, Intel would still be happier if there was no threat from companies like NVIDIA. Because of this fine line between partner and competitor, NVIDIA has to play their role very carefully - they don't want to be viewed as more of a competitor than a partner in the eyes of Intel. By selling a chipset that is significantly more expensive than Intel's most expensive 955X, NVIDIA secures their position as a valuable partner, and not a competitor.
You've already heard that NVIDIA's nForce4 SLI Intel Edition chipset costs about $80, but what about Intel's 955X and 945? For once, Intel is actually the cheaper alternative - their 955X costs motherboard manufacturers $50 ($53 with ICH7R), while the 945P costs a mere $38. For motherboard prices, this means that you can expect at least a $30 price premium for an nForce4 SLI Intel Edition board compared to a 955X board; compared to a 945P, you can expect closer to a $40 price premium. It's not tremendous, but given that motherboards tend to hover in the low $100s, even a $30 difference is significant.
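The premium arithmetic above can be sketched in a few lines of Python. The chipset costs are the figures quoted in this article; the $120 board price is purely a hypothetical stand-in for "the low $100s", and the function names are our own.

```python
# Chipset cost to motherboard manufacturers in USD, as quoted above.
CHIPSET_COST = {
    "nForce4 SLI Intel Edition": 80,  # "about $80"
    "955X": 50,
    "945P": 38,
}

# ASSUMPTION: a hypothetical board price standing in for "the low $100s".
HYPOTHETICAL_BOARD_PRICE = 120


def premium(chipset, baseline):
    """Extra chipset cost of `chipset` over `baseline`, in USD."""
    return CHIPSET_COST[chipset] - CHIPSET_COST[baseline]


def premium_share(chipset, baseline, board_price=HYPOTHETICAL_BOARD_PRICE):
    """Premium expressed as a fraction of a given board price."""
    return premium(chipset, baseline) / board_price


if __name__ == "__main__":
    for base in ("955X", "945P"):
        extra = premium("nForce4 SLI Intel Edition", base)
        share = premium_share("nForce4 SLI Intel Edition", base)
        print(f"vs {base}: +${extra} ({share:.0%} of a ${HYPOTHETICAL_BOARD_PRICE} board)")
```

Against the 955X, the raw chipset difference is $30; against the 945P, it is $42, which is where the "closer to $40" figure comes from. This only covers chipset cost, not whatever else a manufacturer adds to an SLI board.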
At this point, NVIDIA hasn't announced any plans to bring a non-SLI version of the nForce4 to the Intel platform, and the vast majority of motherboard manufacturers are waiting for just that. A lower cost nForce4 chipset would obviously translate into more sales for the motherboard manufacturers. However, it could very well be that NVIDIA doesn't want to try and take on Intel in the same price bracket. At the same time, NVIDIA is a very successful company, so it remains to be seen how far over the line they will tread in the name of expanding their sales.
Intel's 955X Chipset
With pricing out of the way, let's have a look at the 955X chipset itself. The Intel slide below provides a good overview of the chipset:
For the most part, the features are pretty straightforward, but there are a few interesting points. Note that Intel shows support for dual PCI Express x16 slots with the use of an external bridge, meaning that motherboard manufacturers could effectively offer SLI support on 955X platforms. We firmly believe that Intel will introduce support for SLI once both ATI and NVIDIA have introduced their multi-GPU technologies. It has yet to be seen how Intel will implement this bridged dual x16 solution and whether it will be a true dual x16 setup, or two x8 slots like NVIDIA's SLI.
The board that we used for this review actually featured two x16 slots, but only one was a true x16 slot - the other was an x4 slot with an x16 connector.
NVIDIA's nForce4 SLI Intel Edition Chipset
As we've indicated in the past, NVIDIA's first Intel chipset is very similar to their nForce4 AMD chipset, with a couple of exceptions. For starters, the Intel Edition chipset is made up of two chips, compared to the AMD chipset's one. The reasoning is simple: with AMD's architecture, NVIDIA needn't include a memory controller in their chipset, which cuts down on overall die size quite a bit. With the Intel Edition, we see the first new memory controller that NVIDIA has introduced since nForce2.
Remember DASP? NVIDIA's Dynamic Adaptive Speculative Pre-Processor is back in nForce4 Intel Edition, but this time around, the competition is much stronger. DASP is a hardware pre-fetch engine that resides within the memory controller. It attempts to pre-fetch the data that NVIDIA's algorithms determine the CPU will use in the future into a small amount of cache on the chipset. Intel has a similar technology in their 955X chipset, although it's not something they have branded or marketed. Depending on how aggressive NVIDIA's DASP is, it could make good use of the extra memory bandwidth offered by its dual channel DDR2-533/667 memory bus.
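To illustrate the general idea of a pre-fetch engine, here is a toy Python sketch of a stride detector feeding a small buffer. This is not NVIDIA's DASP algorithm, which is not public; the class name, the buffer size of 8 lines and the stride heuristic are all assumptions made for illustration.

```python
from collections import OrderedDict


class StridePrefetcher:
    """Toy pre-fetch buffer. NOT DASP: a generic stride detector for illustration."""

    def __init__(self, buffer_lines=8):
        self.buffer = OrderedDict()   # pre-fetched line addresses, oldest first
        self.buffer_lines = buffer_lines
        self.last_addr = None
        self.last_stride = None
        self.hits = 0
        self.misses = 0

    def access(self, line_addr):
        """Record one CPU request for a cache-line address."""
        if line_addr in self.buffer:
            self.hits += 1            # request served from the pre-fetch buffer
            del self.buffer[line_addr]
        else:
            self.misses += 1          # request has to go out to DRAM
        if self.last_addr is not None:
            stride = line_addr - self.last_addr
            # Two identical non-zero strides in a row: guess the next address.
            if stride != 0 and stride == self.last_stride:
                self._prefetch(line_addr + stride)
            self.last_stride = stride
        self.last_addr = line_addr

    def _prefetch(self, line_addr):
        self.buffer[line_addr] = True
        self.buffer.move_to_end(line_addr)
        while len(self.buffer) > self.buffer_lines:
            self.buffer.popitem(last=False)  # evict the oldest entry


if __name__ == "__main__":
    sequential = StridePrefetcher()
    for addr in range(10):
        sequential.access(addr)
    print("sequential:", sequential.hits, "hits,", sequential.misses, "misses")

    scattered = StridePrefetcher()
    for addr in [5, 100, 7, 42, 300, 9]:
        scattered.access(addr)
    print("scattered:", scattered.hits, "hits,", scattered.misses, "misses")
```

A regular access pattern lets the buffer serve most requests once the stride has been detected, while a scattered pattern gets no benefit. How aggressively a real engine guesses is exactly the kind of tuning that decides whether the extra memory bandwidth gets used.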
NVIDIA also boasts a dedicated address bus per DIMM slot with the nForce4; however, this seems to be a feature also supported by Intel, so there isn't much advantage over the competition here.
NVIDIA's final memory controller optimization, QuickSync, is claimed to reduce memory latency when operating in multiple clock domains (e.g. 800MHz FSB, but 533MHz memory bus). Later in this article, we'll find out exactly how aggressive NVIDIA's memory controller truly is.
The nForce4 SLI Intel Edition chipset supports both 800 and 1066MHz FSBs, just like the 955X - however, NVIDIA also indicated that if Intel were to increase the FSB frequency, they would be ready.
Unlike the 955X, NVIDIA only supports 3 PCI Express x1 slots. However, NVIDIA does offer two PATA channels, compared to Intel's single PATA channel. NVIDIA also offers more USB 2.0 ports (10 vs 8). NVIDIA does not support Intel's HD Audio spec, so you're stuck with AC'97 on the nForce4 SLI.
For this comparison, we used production boards. From Intel, we have the Intel D955XBK and representing NVIDIA, we have the ASUS P5ND2-SLI Deluxe.
While NVIDIA's nForce4 reference board still doesn't seem to have dual core support, ASUS' board does, so it looks like the chipset will have no problem supporting the Pentium D. One problem that we have seen, however, is that neither NVIDIA's reference board nor ASUS' board supports Intel's Thermal Monitor 2 specification at this time. While NVIDIA insists that support for TM2 is coming, we are hearing from motherboard manufacturers that support for TM2 will only be there for single core processors. If that ends up being true, that will be a huge downside for the nForce4 platform - TM2 significantly reduces heat output as well as fan noise on Intel platforms, both features that are much appreciated.
Despite the numerous problems that we had with ASUS' AMD SLI motherboard, the ASUS P5ND2-SLI Deluxe was nearly flawless during our testing. We had one problem with the system not POSTing, but a later BIOS revision fixed that issue.
Note that this is a comparison of Intel platform chipsets. For a comparison of AMD and Intel CPUs, have a look at our latest CPU reviews.
Intel Pentium 4 Configuration
LGA-775 Intel Pentium Extreme Edition 840
2 x 512MB Crucial DDR-II 667 Dual Channel DIMMs 4-4-4-15
Intel D955XBK 955X Motherboard
ASUS P5ND2-SLI nForce4 SLI Intel Edition Motherboard
ATI Radeon X800 XL PCI Express
NVIDIA GeForce 6800GT PCI Express
Seagate Barracuda 7200.7 Plus (with NCQ)
Maxtor MaXLine III (with NCQ) for NCQ tests
The biggest question on our minds when comparing these two heavyweights was: who has the better memory controller? We turned to the final version of ScienceMark 2.0 for the answer.
Amazingly enough, at the same memory timings, NVIDIA drops memory latency by around 13%. This is a worst case scenario for memory latency; in all of our other memory tests, the nForce4's memory controller was equal to Intel's controller. Still, any advantage here is impressive, let alone such a large one.
NVIDIA's latency reduction and DASP algorithms offer a negligible 2% increase in overall memory bandwidth. While you'd be hard pressed to find any noticeable examples of these performance improvements, the important thing here is that NVIDIA's memory controller appears to be just as good as, if not faster than, Intel's best. Kudos to NVIDIA - they have at least started off on the right foot with performance.
DDR2-667 or 533?
When Intel sent us their 955X platform, they configured it with DDR2-667 memory running at 5-5-5-15 timings. NVIDIA sent their nForce4 SLI Intel Edition board paired with some Corsair DIMMs running at 4-4-4-15 timings at DDR2-667. Given that we have lower latency DDR2-533 memory, we decided to find out if there was any real performance difference between DDR2-667 at relatively high timings and DDR2-533 at more aggressive timings. Once again, ScienceMark 2.0 is our tool of choice:
Here, we see that even at 3-2-2-12, DDR2-533 isn't actually any faster than DDR2-667.
...and it offers slightly less memory bandwidth.
It looks like there's not much point in worrying about low latency DDR2-533, as higher latency DDR2-667 seems to work just as well (if not a little better) on the newest Intel platforms.
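A rough back-of-the-envelope sketch helps explain why the two configurations land so close together. The Python below converts CAS latency from clock cycles into nanoseconds and computes theoretical peak bandwidth. It assumes a 64-bit (8-byte) bus per channel and dual channel operation, and it only models the CAS figure, not the full latency that ScienceMark measures; the function names are our own.

```python
def cas_latency_ns(data_rate_mts, cas_cycles):
    """CAS latency in nanoseconds for a DDR2 module.

    data_rate_mts: transfers per second in millions (e.g. 667 for DDR2-667).
    DDR transfers data twice per clock, so the clock is half the data rate.
    """
    clock_mhz = data_rate_mts / 2
    return cas_cycles * 1000.0 / clock_mhz


def peak_bandwidth_gbs(data_rate_mts, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (ASSUMES 64-bit channels)."""
    return data_rate_mts * 1e6 * bus_bytes * channels / 1e9


if __name__ == "__main__":
    configs = [
        ("DDR2-533 @ CL3", 533, 3),
        ("DDR2-667 @ CL4", 667, 4),
        ("DDR2-667 @ CL5", 667, 5),
    ]
    for name, rate, cas in configs:
        print(f"{name}: CAS {cas_latency_ns(rate, cas):.2f} ns, "
              f"peak {peak_bandwidth_gbs(rate):.2f} GB/s")
```

In absolute time, DDR2-533 at CL3 works out to about 11.3ns of CAS latency and DDR2-667 at CL4 to about 12.0ns, so the "aggressive" timings buy well under a nanosecond while giving up roughly 2GB/s of theoretical peak bandwidth.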
Business Application Performance
Business Winstone 2004
Business Winstone 2004 tests the following applications in various usage scenarios:
- Microsoft Access 2002
- Microsoft Excel 2002
- Microsoft FrontPage 2002
- Microsoft Outlook 2002
- Microsoft PowerPoint 2002
- Microsoft Project 2002
- Microsoft Word 2002
- Norton AntiVirus Professional Edition 2003
- WinZip 8.1
NVIDIA is normally the strongest performer in Business Winstone, but here, the nForce4 takes a close backseat to Intel's 955X. The two basically perform the same.
Office Productivity SYSMark 2004
SYSMark's Office Productivity suite consists of three tests, the first of which is the Communication test. The Communication test consists of the following:
"The user receives an email in Outlook 2002 that contains a collection of documents in a zip file. The user reviews his email and updates his calendar while VirusScan 7.0 scans the system. The corporate web site is viewed in Internet Explorer 6.0. Finally, Internet Explorer is used to look at samples of the web pages and documents created during the scenario."
The next test is Document Creation performance:
"The user edits the document using Word 2002. He transcribes an audio file into a document using Dragon NaturallySpeaking 6. Once the document has all the necessary pieces in place, the user changes it into a portable format for easy and secure distribution using Acrobat 5.0.5. The user creates a marketing presentation in PowerPoint 2002 and adds elements to a slide show template."
The final test in our Office Productivity suite is Data Analysis, which BAPCo describes as:
"The user opens a database using Access 2002 and runs some queries. A collection of documents are archived using WinZip 8.1. The queries' results are imported into a spreadsheet using Excel 2002 and are used to generate graphical charts."
NVIDIA is actually slightly stronger than Intel in the Office Productivity suite of SYSMark 2004. In the communication tests, we see that NVIDIA actually holds a 13% performance advantage. Given that the communication suite is particularly disk intensive, we will look at SATA controller performance later on in this article to see if NVIDIA possibly has a stronger SATA controller.
Multimedia Content Creation Performance
MCC Winstone 2004
Multimedia Content Creation Winstone 2004 tests the following applications in various usage scenarios:
- Adobe® Photoshop® 7.0.1
- Adobe® Premiere® 6.50
- Macromedia® Director MX 9.0
- Macromedia® Dreamweaver MX 6.1
- Microsoft® Windows Media™ Encoder 9 Version 9.00.00.2980
- NewTek's LightWave® 3D 7.5b
- Steinberg™ WaveLab™ 4.0f
All chips were tested with Lightwave set to spawn 4 threads.
Once again, we're back to both NVIDIA and Intel offering nearly identical performance.
ICC SYSMark 2004
The first category that we will deal with is 3D Content Creation. The tests that make up this benchmark are described below:
"The user renders a 3D model to a bitmap using 3ds max 5.1, while preparing web pages in Dreamweaver MX. Then the user renders a 3D animation in a vector graphics format."
Next, we have 2D Content Creation performance:
"The user uses Premiere 6.5 to create a movie from several raw input movie cuts and sound cuts and starts exporting it. While waiting on this operation, the user imports the rendered image into Photoshop 7.01, modifies it and saves the results. Once the movie is assembled, the user edits it and creates special effects using After Effects 5.5."
The Internet Content Creation suite is rounded out with a Web Publishing performance test:
"The user extracts content from an archive using WinZip 8.1. Meanwhile, he uses Flash MX to open the exported 3D vector graphics file. He modifies it by including other pictures and optimizes it for faster animation. The final movie with the special effects is then compressed using Windows Media Encoder 9 series in a format that can be broadcast over broadband Internet. The web site is given the final touches in Dreamweaver MX and the system is scanned by VirusScan 7.0."
Content Creation performance is identical across the board.
General Performance - PC WorldBench 5
Although we normally break WorldBench into its multiple categories, we'll just present all of the numbers in a single graph:
The biggest performance differences here are about 5%, with Intel actually coming in slightly faster than NVIDIA in some of the tests. Overall, we just have more proof that the two chipsets perform very similarly, with the exception of disk performance.
For our gaming performance tests, we tested with two cards, ATI's Radeon X800 XL and NVIDIA's GeForce 6800GT, to make sure that no performance advantages exist with only one card and not the other. First, let's look at the ATI equipped systems:
NVIDIA's superior memory controller is accountable for about a 2% performance advantage here.
NVIDIA actually holds a 3.5% performance advantage here, and with an ATI GPU, which means that there are no optimizations at work here.
The performance margins don't seem to change with a NVIDIA GPU. NVIDIA is still around 3% faster.
We ran SPECviewperf 8 as a further test of performance, but the two contenders ended up performing rather similarly again. Intel wins some, NVIDIA wins some, and both win by relatively small margins:
Workstation Performance - ATI GPU
What's interesting is that NVIDIA seems to do better when paired with an ATI GPU. However, that's most likely due to normal variations in the test results:
SATA Controller Performance
Both NVIDIA and Intel offer support for NCQ in their SATA controllers, and given our recently renewed interest in NCQ performance, we decided to find out if there were any performance differences between the two SATA controllers. However, as we've found in the past, coming up with tests that stress NCQ is quite difficult. Luckily, there is a tool that works perfectly for controlling the type of disk accesses that you want to test: iometer.
An Intel developed tool, iometer allows you to control the size, randomness and frequency, among other things, of disk accesses, and measure performance using data generated according to these specifications. Given that NCQ truly optimizes performance when disk accesses are random in nature, we decided to look at how performance varied according to what percentage of the disk accesses were random. At the same time, we wanted the tests to be modeled on a multitasking desktop system, so we did some investigation by setting up a computer and running through some of our multitasking scenarios on it.
What we found is that on modern day hard drives, the number of outstanding IOs (IO Queue Depth) is rarely above 10 on even a moderately taxed system. Only when you approach extremely heavy multitasking loads (heavier than anything that we've ever tested) do you break into queue depths beyond 32. So, we put together two scenarios, one with a queue depth of 8 and one with a queue depth of 32 - the latter being more of an extreme condition.
In each scenario, we sent the drives a series of 64KB requests, 75% of which were reads, 25% were writes; once again, derived from monitoring our own desktop usage patterns.
We then varied the randomness of disk accesses from 0% (i.e. 100% sequential) up to 100% (0% sequential reads/writes). In theory, the stronger NCQ controllers will show better performance as the percentage of random accesses increases. We reported both average IOs per second and average IO response time (how long accesses took to complete on average):
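To make the workload definition concrete, here is a minimal Python sketch that generates a request stream with the parameters described above: 64KB requests, 75% reads, and a configurable percentage of random accesses. It only illustrates what the access specification means; it is not how iometer itself is implemented, and the function name, the 1GB test span and the fixed seed are assumptions.

```python
import random

REQUEST_BYTES = 64 * 1024  # 64KB requests, as described above


def generate_requests(count, random_pct, read_pct=75, span_bytes=1 << 30, seed=0):
    """Return a list of (op, offset, size) tuples.

    random_pct: percentage of requests that jump to a random aligned offset;
                the rest continue sequentially from the previous request.
    read_pct:   percentage of requests that are reads (the rest are writes).
    span_bytes: ASSUMED size of the tested disk region (1GB here).
    """
    rng = random.Random(seed)
    slots = span_bytes // REQUEST_BYTES
    next_slot = 0
    requests = []
    for _ in range(count):
        if rng.random() * 100 < random_pct:
            slot = rng.randrange(slots)      # random access
        else:
            slot = next_slot                 # sequential access
        op = "read" if rng.random() * 100 < read_pct else "write"
        requests.append((op, slot * REQUEST_BYTES, REQUEST_BYTES))
        next_slot = (slot + 1) % slots
    return requests


if __name__ == "__main__":
    for pct in (0, 50, 100):
        reqs = generate_requests(1000, pct)
        reads = sum(1 for op, _, _ in reqs if op == "read")
        print(f"{pct}% random: {reads / len(reqs):.0%} reads, "
              f"first offsets {[off for _, off, _ in reqs[:4]]}")
```

At 0% random, every request follows directly after the previous one, which leaves a controller little to reorder. As the random percentage climbs, the queued requests scatter across the disk, and that is where reordering has room to help.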
With a queue depth of 8, the two SATA controllers offer virtually identical performance.
Looking at latency, Intel actually offers a very slight performance advantage here - nothing huge, but it's definitely there.
The results get much more interesting as we increase the queue depth to 32:
Here, NVIDIA starts to pull away, offering close to a 20% increase in average IOs per second as the access patterns get more random (e.g. as more applications running at the same time start loading down the hard disk).
What's truly impressive, however, is the reduction in average response time - up to a 90ms decrease in response time, thanks to NVIDIA's superior NCQ implementation.
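The throughput gain and the response time drop are two views of the same thing. When a test keeps a fixed number of IOs outstanding, Little's Law says that average response time is roughly the queue depth divided by IOs per second. The Python sketch below shows the relationship; the IOPS figures in it are purely hypothetical and are not our measured results.

```python
def avg_response_ms(queue_depth, iops):
    """Average response time in ms implied by Little's Law (L = lambda * W)."""
    return queue_depth / iops * 1000.0


if __name__ == "__main__":
    # HYPOTHETICAL numbers for illustration only, not measured data.
    baseline_iops = 100.0
    faster_iops = baseline_iops * 1.20  # a 20% throughput advantage

    for depth in (8, 32):
        slow = avg_response_ms(depth, baseline_iops)
        fast = avg_response_ms(depth, faster_iops)
        print(f"queue depth {depth}: {slow:.1f} ms -> {fast:.1f} ms "
              f"({slow - fast:.1f} ms saved)")
```

The same percentage gain in IOs per second saves four times as many milliseconds at a queue depth of 32 as it does at a queue depth of 8, which is why the response time difference only becomes dramatic in the heavier scenario.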
But stepping back into reality, how big of a difference NVIDIA's NCQ implementation makes depends greatly on your usage patterns. Heavy multitaskers that are very IO bound will notice a performance difference, while more casual multitaskers would be hard pressed to find any difference. For example, Intel was actually faster than NVIDIA in our gaming multitasking scenarios from our dual core investigation.
The final performance investigation of this review focuses exclusively on NVIDIA's nForce4 SLI Intel Edition. More specifically, it focuses on SLI performance. The performance benefits of SLI apply just as much to Intel as they do to AMD, but to give you an idea of those performance benefits, we ran tests in Doom 3, Half Life 2 and Splinter Cell: Chaos Theory - three of the most demanding games out today.
Doom 3 Performance
Doom 3 has always been a strong selling point for NVIDIA, and the performance impact of SLI here is extremely strong. At 1600 x 1200, the 6800GT gains 34% from SLI, while the 6600GT gets a nice 56% increase in performance. For Doom 3, a pair of 6600GTs is not only cheaper, but also 22% faster than a single ATI Radeon X850 XT Platinum Edition.
Turning on 4X AA results in even larger performance gains, with the 6800GT getting anywhere from a 23% up to a 70% increase in performance. The 6600GTs fall between 58% and 75%, all very impressive gains as you would expect from a doubling of the number of GPUs in the system.
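One way to read these percentages is as scaling efficiency: how much of the ideal 2x speedup the second GPU actually delivers. The short Python sketch below uses only the gains quoted above; the function names are our own, and "efficiency" here simply means speedup divided by the number of GPUs.

```python
def speedup(gain_pct):
    """Convert a percentage gain into a speedup factor (34% -> 1.34x)."""
    return 1 + gain_pct / 100.0


def scaling_efficiency(gain_pct, gpus=2):
    """Fraction of the ideal N-times speedup that was achieved."""
    return speedup(gain_pct) / gpus


if __name__ == "__main__":
    # Gains quoted in the text above.
    quoted = [
        ("6800GT, 1600 x 1200, no AA", 34),
        ("6600GT, 1600 x 1200, no AA", 56),
        ("6800GT, 4X AA, best case", 70),
        ("6600GT, 4X AA, best case", 75),
    ]
    for label, gain in quoted:
        print(f"{label}: {speedup(gain):.2f}x, "
              f"{scaling_efficiency(gain):.1%} of ideal")
```

A 100% gain would be perfect scaling. By that yardstick, even the best case quoted here (75%) leaves an eighth of the ideal speedup on the table.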
Half Life 2 SLI Performance
For Half Life 2 performance, we used our at_canals_08-rev7 demo from our Half Life 2 performance investigations.
What's interesting here is that because of how strong of a performer ATI is in Half Life 2, even a pair of 6800GTs isn't able to outpace a single X850 XT Platinum Edition.
Even with AA enabled, it isn't until you hit 1600 x 1200 that the 6800GTs' sheer fill rate and memory bandwidth are finally able to outweigh the inherent performance advantages of the X850 XT PE. If Half Life 2 is your game, then SLI is probably not the wisest investment for you.
Splinter Cell: Chaos Theory SLI Performance
For Splinter Cell, we used the built-in timedemo running at the highest possible settings, but without enabling NVIDIA-specific features (e.g. Shader Model 3.0) to enable apples-to-apples comparisons. Here, we also reported min and max frame rates in addition to the average frame rate. Note the extremely positive impact of SLI on improving minimum frame rates.
Both Intel's 955X and NVIDIA's nForce4 SLI Intel Edition chipsets are expected to be available in motherboards by the end of this month. With that being said, which platform do you go for? Both contenders are quite strong. NVIDIA has the benefits of out-of-the-box SLI support, support for more PATA devices and better NCQ performance than Intel's 955X. On the other hand, Intel has HD Audio support, official support for TM2 on dual core processors and performance that's just as good as NVIDIA in most cases, at a lower price.
We'd give the edge to NVIDIA because of their strong NCQ performance and support for SLI, but those two items aren't necessarily on everyone's shopping list. For a lot of people, SLI is too big of an investment and a lower priced motherboard is far more desirable, not to mention support for TM2 on dual core CPUs for a cooler, quieter system.
The fact that this decision is so very hard is a testament to NVIDIA's strength as a chipset manufacturer. In fact, with nearly identical specifications, NVIDIA's nForce4 SLI Intel Edition threatens Intel more than any competing chipset that we've seen.
In the end, the decision will undoubtedly come down to shipping motherboards - and whether or not motherboard manufacturers are able to enable TM2 support on their nForce4 boards. If they are able to, then we'd have no problem giving NVIDIA the clear win here. Their performance is competitive and they have strengths that exceed those of Intel.
Intel's CPU team should be happy. They have found an extremely complementary partner in NVIDIA. However, Intel's chipset team has reason to worry; motherboard manufacturers weren't happy with the 925/915 chipsets, and with a viable alternative in NVIDIA, we may very well have an opportunity for NVIDIA to start eating into Intel's own chipset market share in a way that no other company has in the past.