Original Link: http://www.anandtech.com/show/1656
Intel Dual Core Performance Preview Part I: First Encounter
by Anand Lal Shimpi on April 4, 2005 2:44 PM EST
I was in Austin visiting AMD when I saw the email - Intel was prepping a dual core system to be sent out my way for a preview. That was last Wednesday, the machine arrived on Friday, and today's Monday; needless to say, it's been a busy weekend.
This type of a review is a first for Intel. For the most part, doing an officially sanctioned preview with performance benchmarks isn't in the Intel vocabulary. Don't take this opportunity lightly - this is a huge change in the thinking and execution at Intel.
Make no mistake, Intel isn't officially releasing their dual core desktop processors today; this is merely a preview. Intel's dual core line is still on track to be released sometime in the April - June timeframe. Intel will beat AMD in bringing dual core to the desktop, while AMD will do the same to Intel in the server/workstation world. We still have no idea of actual availability when these chips are officially launched. Remember that all of the first generation dual core chips are basically twice the size of their single core counterparts - meaning that they put twice the strain on manufacturing. Intel, with 11 total fabs, is in a better position to absorb this impact than AMD, but both have paper-launched products in the past, so there's no telling which way the dual core wars will go initially. All we can say at this point is that we've seen dual core parts from both AMD and Intel running at full shipping speeds, and Intel was the first to get us a review sample for this preview.
The clock speed race is over, both AMD and Intel have thrown in their towels, and now it's time to shift to dual core. Intel has been extremely forthcoming with their dual core roadmap, and for those who aren't intimately familiar with it, here's a look at the next 24 months from Intel:
The green bars are dual core, the blue is single core. Enough said.
The Chip: Pentium Extreme Edition
As we mentioned in our IDF coverage, Intel has dropped the number 4 from their naming for their dual core parts. The new dual core desktop CPUs will simply be called the Pentium D and the Pentium Extreme Edition.
Both the Pentium D and Pentium Extreme Edition are nothing more than two 90nm Prescott 1M dies glued together. That means that each core has its own 1MB L2 cache, and that also means that architecturally, these chips are no different than the single core Pentium 4s that are out today - other than the obvious dual core fact.
Contrary to what we've reported earlier, the only difference between the Pentium D and the Pentium Extreme Edition is the presence of Hyper Threading; namely, the Pentium D doesn't have it, while the Extreme Edition does. Both chips will only use an 800MHz FSB and both have the same cache sizes.
Armed with Hyper Threading, the Pentium Extreme Edition allows the execution of 4 concurrent threads and appears as a quad processor CPU to the OS. Without Hyper Threading, the Pentium D only allows for 2 concurrent threads and appears as a dual processor CPU to the OS.
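To make the thread counts above concrete, here is a minimal sketch of how the number of logical processors seen by the OS follows from the core count and Hyper Threading. The function name and the assumption of exactly two threads per HT-enabled core are ours, used for illustration only.

```python
def logical_processors(cores: int, hyper_threading: bool) -> int:
    """Logical CPUs the OS would see, assuming 2 threads per core with HT."""
    threads_per_core = 2 if hyper_threading else 1
    return cores * threads_per_core


# The two dual core parts described above.
chips = {
    "Pentium D": logical_processors(cores=2, hyper_threading=False),
    "Pentium Extreme Edition": logical_processors(cores=2, hyper_threading=True),
}

for name, count in chips.items():
    print(f"{name}: {count} logical processors")
```

On a real system, Python's `os.cpu_count()` reports the number of logical processors, which is the same figure that this sketch computes.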
While the Pentium D will be offered in three speed grades, from 2.8GHz up to 3.2GHz, the Extreme Edition will only be launched at 3.2GHz. Note that the fastest single core Pentium 4s run between 3.6GHz and 3.8GHz, so there is a significant clock speed penalty paid by going to dual core.
The Platform: Intel 955X
AMD's dual core Athlon 64 processors will work in all current Socket-939 motherboards with merely a BIOS update. The same level of compatibility obviously isn't true for Intel's dual core solutions. You'll need a new motherboard to support the Pentium D and Pentium Extreme Edition chips, and thus, Intel shipped us a board based on their soon-to-be released 955X platform.
The platform boasts a dual channel DDR2-667 memory controller, but given that the chips still only support an 800MHz FSB, the added bandwidth of DDR2-667 is useless. Even for bragging rights, running at DDR2-667 doesn't make sense, as the memory that Intel shipped with the system is rated at 5-5-5-15 timings at 667MHz. Wasted bandwidth and higher latency memory is nothing to get excited about in our book. We're not entirely sure what Intel is up to, but they had better plan on increasing the FSB of their chips really soon if they want DDR2-667 (or even 533) to gain any sort of acceptance.
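The bandwidth argument can be checked with some quick arithmetic. The sketch below assumes a 64-bit wide FSB and 64-bit wide memory channels, and treats the quoted "MHz" ratings as millions of transfers per second; neither assumption is spelled out above, and the function name is ours.

```python
def peak_bandwidth_gbs(mega_transfers: float, bus_width_bits: int = 64,
                       channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers * 1e6 * bytes_per_transfer * channels / 1e9


fsb_800 = peak_bandwidth_gbs(800)               # 6.4 GB/s
ddr2_533 = peak_bandwidth_gbs(533, channels=2)  # about 8.5 GB/s
ddr2_667 = peak_bandwidth_gbs(667, channels=2)  # about 10.7 GB/s

for label, value in [("800MHz FSB", fsb_800),
                     ("Dual channel DDR2-533", ddr2_533),
                     ("Dual channel DDR2-667", ddr2_667)]:
    print(f"{label}: {value:.1f} GB/s")
```

Under these assumptions, both DDR2 configurations can supply more data than the FSB can carry to the CPU, which is why the extra memory bandwidth goes unused.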
Other than support for dual core, faster DDR2, RAID 5 and 8GB of ECC memory, the 955X doesn't have any features to boast over the current platforms. It does look like Intel may be planning SLI support for the 955X however:
The 955X board that we received had two physical x16 PCIe connectors, but only one of them was electrically a x16 slot.
Despite Intel's warnings not to make any judgments about final performance or stability, both the 955X and the Pentium Extreme Edition were as rock solid during our testing as any product that we've encountered. This was quite possibly the most stable encounter with a pre-release CPU, chipset and drivers that we've ever had. That being said, we really didn't expect Intel to break tradition with a platform of which they weren't 100% sure.
The Intangible Dual Core
The move to dual core is a bit of a "catch 22". In order to deal with the fact that a dual core die is twice the size of a single core die, AMD and Intel have to use higher yielding transistors. The larger your die, the more defects you have; so, you use higher yielding transistors to balance things out. The problem is that the highest yielding transistors run at the lowest clock speeds, so dual core chips end up running at slower speeds than single core chips. While the Pentium 4 could have hit 4GHz last year, we won't break the 4GHz barrier until late 2006 at the earliest.
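The "larger die, more defects" step can be illustrated with a simple Poisson yield model, a common textbook approximation and not necessarily how AMD or Intel model their own processes. The defect density and die area below are hypothetical values chosen only for illustration.

```python
import math


def poisson_yield(die_area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under a simple Poisson defect model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


DEFECT_DENSITY = 0.5    # hypothetical defects per square centimeter
SINGLE_CORE_AREA = 1.0  # hypothetical single core die area in cm^2

single_yield = poisson_yield(SINGLE_CORE_AREA, DEFECT_DENSITY)    # about 0.61
dual_yield = poisson_yield(2 * SINGLE_CORE_AREA, DEFECT_DENSITY)  # about 0.37

print(f"single core die yield: {single_yield:.2f}")
print(f"dual core die yield:   {dual_yield:.2f}")
```

In this model, doubling the die area squares the yield fraction, so a larger die is hit by defects more than proportionally.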
In Intel's case, we're talking about 2.8GHz - 3.0GHz vs. 3.6GHz - 3.8GHz for the high end single core chips. In order to offset the difference, Intel is pricing their dual core chips within about $80 of their single core counterparts. Short of giving dual and single core chips a price parity, this is by far the best approach to assuring dual core adoption.
Why does Intel want to encourage dual core adoption? To guarantee a large installed user base, of course. The problem today is that the vast majority of desktop systems are single processor systems, meaning that most developers code applications for single processor systems. To encourage a mass migration to develop multithreaded applications, the installed user base has to be there to justify spending the added time and resources in developing such applications. As we just finished mentioning, Intel's approach is the quickest way to ensure that the exodus takes place.
So, with dual core CPUs priced very close to their single core counterparts, the choice is simple, right?
On the Intel side of things, you're basically giving up 200MHz to have a dual core processor at virtually the same price. But things get a lot more complicated when you bring AMD into the situation. AMD hasn't officially released their dual core availability and pricing strategy, but let's just say that given AMD's manufacturing capacity, their dual core offerings won't be as price competitive as Intel's. Now, the decision is no longer that simple; you can either get a lower clocked dual core CPU, or a higher clocked single core AMD CPU for the same price - which one would you choose?
The vast majority of desktop application benchmarks will show the single core AMD CPU as a better buy than the dual core Intel CPU. Why? Because the vast majority of desktop applications are single threaded and thus, will gain no benefit from running on a dual core processor.
Generally speaking, the following types of applications are multi-threaded:
- Video Encoding
- 3D Rendering
- Photo/Video Editing
- most types of "professional" workstation applications
However, the vast majority of other applications are single threaded (or offer no performance gain from dual core processors):
- office suites
- web browsers
- email clients
- media players
- games, etc.
If you spend any of your time working with the first group of applications, then generally speaking, you'll want to go with the dual core CPU. For the rest of you, a faster single core CPU will be the better individual performance pick.
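One rough way to reason about this trade-off is Amdahl's law, which estimates the speedup from extra cores given the fraction of work that can run in parallel. The sketch below is our own illustration, not a measurement: it assumes performance scales linearly with clock speed and compares a 3.2GHz dual core chip against a 3.8GHz single core chip, the clock speeds quoted earlier.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def dual_vs_single(parallel_fraction: float, dual_clock_ghz: float = 3.2,
                   single_clock_ghz: float = 3.8) -> float:
    """Relative performance of the dual core chip (above 1.0 means it wins).

    Assumes performance scales linearly with clock speed.
    """
    clock_ratio = dual_clock_ghz / single_clock_ghz
    return clock_ratio * amdahl_speedup(parallel_fraction, cores=2)


def break_even_fraction(dual_clock_ghz: float = 3.2,
                        single_clock_ghz: float = 3.8) -> float:
    """Parallel fraction at which both chips perform the same.

    Solves clock_ratio / (1 - p / 2) = 1 for p.
    """
    return 2.0 * (1.0 - dual_clock_ghz / single_clock_ghz)


for p in (0.0, 0.25, 0.5, 0.9):
    print(f"parallel fraction {p:.2f}: {dual_vs_single(p):.2f}x")
print(f"break-even parallel fraction: {break_even_fraction():.2f}")
```

Under these assumptions, a purely single threaded application runs slower on the lower clocked dual core chip, and roughly a third of the work has to be parallel before the second core makes up the clock speed deficit.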
But once again, things get more complicated. Individually, single threaded applications will make no use of a CPU able to execute multiple threads. But, run more than one of these applications at the same time and all of a sudden, you're potentially dispatching multiple threads to your processor and thus, potentially, have a need for a multi-core CPU.
Scheduling and Responsiveness
In a single processor system (without Hyper Threading), the OS can only send one instruction thread to the CPU for execution at a time. But, you can run two applications at the same time and they can both be using up CPU time. In order to understand how this is possible, you have to understand a bit about how scheduling works.
As its name implies, the OS' scheduler schedules tasks. It takes the unlimited number of tasks that are requested of the OS, and schedules them to get done in the quickest way possible (in theory) on limited hardware resources.
When running a single application, the job of the scheduler is simple - the single active application gets all of the CPU's time for as long as it needs it. But what happens when you switch away from that active application and try to click on the Start Menu? Your usage experience would be pretty poor if you had to wait until your active application was done with its tasks before the scheduler would take the time to handle your Start Menu request. Imagine that your active application was 3ds max and you were rendering a scene that was going to take hours to complete. Would you be willing to wait hours for your Start Menu to appear?
Modern day OSes understand that this linear approach to scheduling isn't very practical, so they support pre-emptive multitasking, meaning that one task can pre-empt another before it is finished executing, and steal CPU time so that it may get some work done as well. In the previous example, the Start Menu request would pre-empt the 3D rendering process and your menu would pop up and the 3D rendering would resume immediately following that. Given that microprocessors these days are so fast, this rotation through tasks sent to the CPU occurs seamlessly to the end user, or at least it does most of the time.
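The difference between the linear approach and pre-emptive multitasking can be shown with a toy simulation. This is a simplified round-robin model with a fixed time slice on a single CPU, not the actual Windows scheduler; the task names, durations, and the 20 ms slice are hypothetical.

```python
from collections import deque


def run_to_completion(tasks):
    """Each task runs until it finishes. Returns finish time (ms) per task.

    tasks: list of (name, arrival_ms, work_ms).
    """
    clock = 0
    finish = {}
    for name, arrival, work in sorted(tasks, key=lambda t: t[1]):
        clock = max(clock, arrival) + work
        finish[name] = clock
    return finish


def round_robin(tasks, slice_ms=20):
    """Pre-emptive round-robin: a task is interrupted after slice_ms."""
    pending = deque(sorted(tasks, key=lambda t: t[1]))
    ready = deque()
    clock = 0
    finish = {}

    def admit_arrivals():
        # Move every task that has arrived by now into the ready queue.
        while pending and pending[0][1] <= clock:
            name, _, work = pending.popleft()
            ready.append([name, work])

    while pending or ready:
        admit_arrivals()
        if not ready:
            clock = pending[0][1]  # CPU idle until the next arrival
            continue
        name, remaining = ready.popleft()
        ran = min(slice_ms, remaining)
        clock += ran
        remaining -= ran
        admit_arrivals()  # new arrivals queue ahead of the pre-empted task
        if remaining:
            ready.append([name, remaining])
        else:
            finish[name] = clock
    return finish


# A long render already running when a short menu request arrives at 100 ms.
tasks = [("render", 0, 10_000), ("start_menu", 100, 5)]
linear = run_to_completion(tasks)
preemptive = round_robin(tasks)
print("linear start_menu done at", linear["start_menu"], "ms")
print("pre-emptive start_menu done at", preemptive["start_menu"], "ms")
```

In this model, the menu request waits for the whole render under the linear approach, while with pre-emption it is served within one time slice and the render finishes only 5 ms later than it otherwise would.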
There are times when the scheduler's work is not as transparent as it should be. In some cases, especially in Windows, processes will not always be able to pre-empt one another. If you're running two time-consuming, CPU intensive tasks, you may not notice, but if you're running one and trying to open a file or just click on a menu at the same time, then the hiccup is far more noticeable. The end result is usually a significantly delayed reaction to your input, such as a menu taking multiple seconds to appear instead of being an instantaneous response to your clicking. Anyone who runs more than one application at a time has undoubtedly encountered this type of a situation. Luckily, there are solutions.
Intel's Hyper Threading was one way around the problem. By fooling the scheduler into thinking that it can dispatch two threads simultaneously, situations like the one above were usually avoided assuming that the CPU had the appropriate resources free. Dual core is another solution to the problem, a far more robust one, since you literally have twice the processor resources.
The result of using a HT enabled or dual core system is better responsiveness when multitasking, but how do you quantify that? Unfortunately, it is extremely difficult to quantify response time in these situations. Even if we could easily quantify the response time improvements, is a snappier system when multitasking worth more than another 15% more performance in single threaded applications? How about 25%? It's a very different way of looking at the impact of a CPU to overall system performance, but it is an issue that we will have to tackle a lot more moving forward.
Characterizing Dual Core Performance
There are three areas to look at when measuring the performance of a dual core processor:
- Single-threaded application performance
- Multi-threaded application performance
- Multitasking application performance
For the first category, plain-jane single threaded application performance, the Pentium Extreme Edition or the Pentium D will simply perform identically to the equivalently clocked Pentium 4 5xx series CPU. The second core will go unused and the performance of the first core is nothing new. Given the short lead time on hardware for this review, we left out all of our single threaded benchmarks, since we can already tell you what performance is like under those tests. If you're looking for performance under PC WorldBench or any of our Game tests, take a look at our older reviews and look at the performance of the Pentium 4 530 to get an idea of where these dual core CPUs will perform in single threaded apps. There are no surprises here; you could have a 128 core CPU and it would still perform the same in a single threaded application. Closer to its launch, we will have a full review including all of our single and multithreaded benchmarks so that you may have all of the information that will help determine your buying decision in one place.
The next category is pretty easy to benchmark as well. Things like 3ds max, iTunes, and Windows Media Encoder, are all examples of multi-threaded applications that are used rather frequently. We've included a few of these benchmarks as well in this article.
The final category is by far the most interesting as well as the most difficult to truly get a hold on - multitasking performance. The easiest way to measure multitasking performance is to have a number of applications loaded with one or more actively crunching away, and measure the performance of one or more of them. However, an arguably more useful way of looking at multitasking performance is to look at the response time of the system while multitasking. Unfortunately, no real benchmarks exist to measure response time of a system accurately while under a multitasking load, so we're left to do our best to try to develop those benchmarks to help answer the dual vs. single core purchasing debate. We are still working on developing those benchmarks and unfortunately, they didn't make it into this article, but we will keep cranking away and hopefully be able to debut them in one of the upcoming successors to this piece.
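To give a sense of what a response time benchmark might look like, here is a minimal sketch that times a small "probe" operation repeatedly while background workers keep the CPU busy. It is not the benchmark that we are developing; every name in it is hypothetical. Note also that CPython threads share one interpreter lock, so a real harness would more likely use separate processes and a probe that exercises the actual UI.

```python
import threading
import time


def busy_loop(stop_event):
    """Background load: burn CPU until asked to stop."""
    while not stop_event.is_set():
        sum(i * i for i in range(1000))


def measure_response_times(probe, samples=5, background_workers=0,
                           pause_s=0.01):
    """Time probe() several times while background load is running.

    Returns a list of latencies in seconds, one per sample.
    """
    stop = threading.Event()
    workers = [threading.Thread(target=busy_loop, args=(stop,), daemon=True)
               for _ in range(background_workers)]
    for worker in workers:
        worker.start()
    latencies = []
    try:
        for _ in range(samples):
            start = time.perf_counter()
            probe()
            latencies.append(time.perf_counter() - start)
            time.sleep(pause_s)  # simulate the user pausing between actions
    finally:
        stop.set()
        for worker in workers:
            worker.join()
    return latencies


def example_probe():
    """Stand-in for a short interactive task such as opening a menu."""
    sorted(range(10_000), reverse=True)


idle = measure_response_times(example_probe, background_workers=0)
loaded = measure_response_times(example_probe, background_workers=2)
print(f"worst latency idle:   {max(idle) * 1000:.2f} ms")
print(f"worst latency loaded: {max(loaded) * 1000:.2f} ms")
```

Reporting the worst case rather than the average is a deliberate choice here, since a single multi-second stall is what a user actually notices.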
We did, however, string together a few benchmarks that don't explicitly measure response time, but do offer a good look at multitasking performance. Despite the fact that Intel has these types of benchmarks on their own, we went out and built benchmarks ourselves that were based on the feedback that we received from you all - the AnandTech readers.
We will describe these benchmarks later on in this piece, but first, let's take a look at two largely single threaded benchmark suites with a touch of multitasking: Winstone and SYSMark.
Our hardware configurations are similar to what we've used in previous comparisons.
AMD Athlon 64 Configuration
Socket-939 Athlon 64 CPUs
2 x 512MB OCZ PC3200 EL Dual Channel DIMMs 2-2-2-10
NVIDIA nForce4 Reference Motherboard
ATI Radeon X850 XT PCI Express
Intel Pentium 4 Configuration
LGA-775 Intel Pentium 4 and Extreme Edition CPUs
2 x 512MB Crucial DDR-II 533 Dual Channel DIMMs 3-2-2-12
Intel 955X Motherboard
ATI Radeon X850 XT PCI Express
Business Application Performance
Business Winstone 2004
Business Winstone 2004 tests the following applications in various usage scenarios:
- Microsoft Access 2002
- Microsoft Excel 2002
- Microsoft FrontPage 2002
- Microsoft Outlook 2002
- Microsoft PowerPoint 2002
- Microsoft Project 2002
- Microsoft Word 2002
- Norton AntiVirus Professional Edition 2003
- WinZip 8.1
There's no surprise here - your best business application performance is going to come from a very fast single core CPU.
Office Productivity SYSMark 2004
SYSMark's Office Productivity suite consists of three tests, the first of which is the Communication test. The Communication test consists of the following:
"The user receives an email in Outlook 2002 that contains a collection of documents in a zip file. The user reviews his email and updates his calendar while VirusScan 7.0 scans the system. The corporate web site is viewed in Internet Explorer 6.0. Finally, Internet Explorer is used to look at samples of the web pages and documents created during the scenario."
The next test is Document Creation performance:
"The user edits the document using Word 2002. He transcribes an audio file into a document using Dragon NaturallySpeaking 6. Once the document has all the necessary pieces in place, the user changes it into a portable format for easy and secure distribution using Acrobat 5.0.5. The user creates a marketing presentation in PowerPoint 2002 and adds elements to a slide show template."
The final test in our Office Productivity suite is Data Analysis, which BAPCo describes as:
"The user opens a database using Access 2002 and runs some queries. A collection of documents are archived using WinZip 8.1. The queries' results are imported into a spreadsheet using Excel 2002 and are used to generate graphical charts."
The Office Productivity SYSMark 2004 suite shows some benefit to dual core, given that there is quite a bit of multitasking involved in the test suite. Despite the multitasking, the Pentium Extreme Edition running at 3.2GHz isn't able to trounce its single core 3.73GHz relative.
Business Winstone 2004 includes a multitasking test as a part of its suite, which does the following:
"This test uses the same applications as the Business Winstone test, but runs some of them in the background. The test has three segments: in the first, files copy in the background while the script runs Microsoft Outlook and Internet Explorer in the foreground. The script waits for both foreground and background tasks to complete before starting the second segment. In that segment, Excel and Word operations run in the foreground while WinZip archives in the background. The script waits for both foreground and background tasks to complete before starting the third segment. In that segment, Norton AntiVirus runs a virus check in the background while Microsoft Excel, Microsoft Project, Microsoft Access, Microsoft PowerPoint, Microsoft FrontPage, and WinZip operations run in the foreground."
The performance of the dual core Extreme Edition comes within 5% of the 3.73GHz EE, despite the fact that the single core chip has a 16% clock speed advantage, but it is still slower overall.
The second test finally shows something positive for the dual core chip, with a negligible 2% performance lead. This is the perfect example of how multi-core can be a substitute for clock speed when it comes to performance. Note that despite the Pentium Extreme Edition being faster than the 3.73EE, the single core Athlon 64 FX-55 is faster than both.
The third and final test also shows a slight performance advantage for the dual core Extreme Edition, even over the Athlon 64 FX-55.
Multimedia Content Creation Performance
MCC Winstone 2004
Multimedia Content Creation Winstone 2004 tests the following applications in various usage scenarios:
- Adobe® Photoshop® 7.0.1
- Adobe® Premiere® 6.50
- Macromedia® Director MX 9.0
- Macromedia® Dreamweaver MX 6.1
- Microsoft® Windows Media™ Encoder 9 Version 9.00.00.2980
- NewTek's LightWave® 3D 7.5b
- Steinberg™ WaveLab™ 4.0f
All chips were tested with Lightwave set to spawn 4 threads.
ICC SYSMark 2004
The first category that we will deal with is 3D Content Creation. The tests that make up this benchmark are described below:
"The user renders a 3D model to a bitmap using 3ds max 5.1, while preparing web pages in Dreamweaver MX. Then, the user renders a 3D animation in a vector graphics format."
Next, we have 2D Content Creation performance:
"The user uses Premiere 6.5 to create a movie from several raw input movie cuts and sound cuts and starts exporting it. While waiting on this operation, the user imports the rendered image into Photoshop 7.01, modifies it and saves the results. Once the movie is assembled, the user edits it and creates special effects using After Effects 5.5."
The Internet Content Creation suite is rounded out with a Web Publishing performance test:
"The user extracts content from an archive using WinZip 8.1. Meanwhile, he uses Flash MX to open the exported 3D vector graphics file. He modifies it by including other pictures and optimizes it for faster animation. The final movie with the special effects is then compressed using Windows Media Encoder 9 series in a format that can be broadcast over broadband Internet. The web site is given the final touches in Dreamweaver MX and the system is scanned by VirusScan 7.0."
As soon as we throw in more content creation applications, some of which are multithreaded (e.g. 3ds max, Windows Media Encoder 9), the performance advantage of dual core is established. Here, we see that the dual core Pentium Extreme Edition running at 3.2GHz holds a 6% to 20% performance advantage over the higher clocked 3.73EE.
The performance advantages here are nice, but not the sort of order of magnitude in improvement that we'd been hearing about...
These new dual core CPUs are supposed to usher in a new era of media rich application usage models. They are supposed to enable us to do things that we were never able to do before. Let's find out if that's true or not...
First, we start off with iTunes to test MP3 encoding performance. We took a 12MB .wav file of our own creation and encoded it to a 192kbps MP3 file, measuring how long it took to encode the file.
Once again, we see that the Pentium Extreme Edition 840 is able to offer equal performance to the 3.73EE at 29 seconds. What's truly interesting is that the Pentium D running at 3.2GHz actually offers better performance than the Extreme Edition. We can only assume that 4 threads in iTunes begin to reduce performance, with 2 concurrent threads being the optimal point.
But once again, the performance gains aren't impressive. So far, dual core isn't looking too good.
DivX Encoding Performance
Our DivX tests from previous CPU reviews have shown a pretty sizeable improvement due to Hyper Threading, so we expected a similarly impressive gain due to dual core:
...and we were not disappointed. The Pentium Extreme Edition 840 offered more than a 20% increase in performance in our DivX encoding task when compared to the 3.73GHz single core P4 Extreme Edition.
We also see another example of four threads offering no performance improvement over being able to execute two concurrently, as the Pentium D running at 3.2GHz offers equal performance to the 840.
XviD Encoding Performance
The XviD tests show no real improvement due to dual core, but also don't seem to show much of an improvement due to Hyper Threading either. This just goes to show you that not all encoding tasks will show tremendous benefits.
Windows Media Video 9 Encoding Performance
Once again, we see extremely strong performance from the new dual core chips, offering around a 30% performance improvement at 85% of the clock speed of the current king of the hill.
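Put differently, the two figures above can be combined into a throughput-per-clock comparison. The short calculation below assumes that the "king of the hill" is the 3.73GHz Extreme Edition and uses the rounded 30% figure, so treat the result as approximate.

```python
dual_clock_ghz = 3.2     # dual core Pentium Extreme Edition / Pentium D
single_clock_ghz = 3.73  # assumed single core reference chip

clock_ratio = dual_clock_ghz / single_clock_ghz  # about 0.86
speedup = 1.30                                   # "around 30%" faster

# Work done per clock cycle by the dual core chip, relative to single core.
per_clock_gain = speedup / clock_ratio           # about 1.52

print(f"clock ratio: {clock_ratio:.2f}")
print(f"throughput per clock vs. single core: {per_clock_gain:.2f}x")
```

In other words, clock for clock the dual core chips are getting roughly one and a half times as much encoding work done in this test.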
So, overall encoding performance is pretty strong on the dual core chips from Intel. Let's have a look at one more multi-threaded application before we get to the more interesting tests.
3D Rendering Performance
3ds max 7 is quite well known for its multithreaded nature, so it made the perfect test for this preview.
With a rendering composite of 2.62, the Pentium Extreme Edition 840 is over 50% faster than the 3.73EE. Now, we're finally seeing the sorts of gains that have AMD and Intel all excited, but that's only in a single multithreaded application. Is that all we have to look forward to with dual core?
Of course not...
The Real Test - AnandTech's Multitasking Scenarios
Shortly after receiving the dual core system from Intel, I posted a question in my blog asking AnandTech readers to respond with how they multitask. I kept the question pretty open-ended, just wanting to get a feel for all the different types of multitasking that this sample of AnandTech readers did on a daily basis. I then took the data and did my best to, in the limited time that I had, model some real world multitasking benchmarks based on the responses. The results are three real world, multitasking benchmarks with the promise of more to come.
The biggest commonality between responses was that AnandTech reader systems, much like my own, are loaded with applications running in the background. So, the first thing that I did to put together our multitasking testbed was to put a ton of applications on it, the type that we all use. I came up with the following list:
Norton AntiVirus 2004 (with latest updates)
DVD Shrink 3.2
Microsoft AntiSpyware Beta 1.0
Visual Studio .NET 2003
Macromedia Flash Player 7
Adobe Photoshop CS
Microsoft Office 2003
3ds max 7
Norton Ghost 2003
Adobe Reader 7
What's important about that list is that a handful of those programs were running in the background at all times, primarily Microsoft's AntiSpyware Beta and Norton AntiVirus 2004. Both the AntiSpyware Beta and NAV 2004 were running with their real time protection modes enabled, to make things even more real world.
With my system fully configured, I did what anyone else would do with a brand new system - I used it. I used it as an actual system doing real world everyday tasks and made notes of my impressions. Then came the interesting part - I swapped out processors for a single core non-HT enabled Pentium 4 and started making notes of differences. Armed with the single core chip, I went to task on creating benchmarks based on some of the tasks that AnandTech readers did on a regular basis (not too surprising, I use my system in a very similar way to most AT readers). So let's get to the tests...
Multitasking Scenario 1: DVD Shrink
If you've ever tried to backup a DVD, you know that the process can take a long time. Just ripping the disc to your hard drive will eat up a good 20 minutes, and then there's the encoding. The encoding can easily take between 20 - 45 minutes depending on the speed of your CPU, and once you start doing other tasks in the background, you can expect those times to grow even longer.
For this test, we used DVD Shrink, one of the simplest applications available to compress and re-encode a DVD to fit on a single 4.5GB disc. We ran DVD Decrypt on the Star Wars Episode VI DVD so that we had a local copy of the DVD on our test bed hard drive (in a future version of the test, we may try to include DVD Decrypt performance in our benchmark as well). All of the DVD Shrink settings were left at default, including telling the program to assume a low priority, a setting many users check in order to be able to do other things while DVD Shrink is working.
As a single application with no multitasking involved, here's how DVD Shrink performs:
As you can see, the new dual core chips can shrink a DVD in about 70% of the time of the 3.73EE. But what happens to performance when you start doing other things in the background?
In order to find out, we did the following:
1) Open Firefox and load the following web pages in tabs (we used local copies of all of the web pages):
We kept the browser on the AT front page.
2) Open iTunes and start playing the latest album of avid AnandTech reader 50 Cent on repeat all.
3) Open Newsleecher.
4) Open DVD Shrink.
5) Login to our news server and start downloading headers for our subscribed news groups.
6) Start backup of Star Wars Episode VI - Return of the Jedi. All default settings, including low priority.
DVD Shrink was the application in focus; this matters because by default, Windows gives special scheduling priority to the application currently in the foreground (we will test what happens when it's not in the foreground later in this article). We waited until the DVD Shrink operation was complete and recorded its completion time. Below are the results:
Now, we start to see where dual core helps. In this relatively simple multitasking scenario, the DVD Shrink task took more than twice as long on single core CPUs as it did on dual core chips. The Pentium 4 without Hyper Threading took a full 35 minutes to complete the task, compared to the 9.3 minutes of the dual core Pentium Extreme Edition. Even the fastest from AMD couldn't hold a candle to the dual core offerings.
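To compare the standalone and multitasking results on the same footing, completion times can be converted into speedup factors. The sketch below uses only the figures quoted in the text (the roughly 70% standalone time ratio, and the 35 and 9.3 minute multitasking results); the function name is ours.

```python
def speedup_from_times(baseline_minutes: float, new_minutes: float) -> float:
    """How many times faster the new result is than the baseline."""
    return baseline_minutes / new_minutes


# Standalone: dual core finishes in about 70% of the 3.73EE's time.
standalone_speedup = speedup_from_times(1.0, 0.70)   # about 1.43x

# Multitasking: non-HT Pentium 4 at 35 minutes vs. dual core EE at 9.3.
multitasking_speedup = speedup_from_times(35.0, 9.3)  # about 3.76x

print(f"standalone:   {standalone_speedup:.2f}x")
print(f"multitasking: {multitasking_speedup:.2f}x")
```

Note that the two ratios use different single core baselines, so they are not strictly comparable, but they do show how much wider the gap becomes once other tasks compete for the CPU.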
And this was only with a minimal amount of multitasking. Had more applications been running or had actual user interaction taken place during the test, the dual vs. single core gap would've grown even more.
Multitasking Scenario 2: File Compression
For our next test, we simulated what would happen if we performed two disk intensive tasks at the same time: zipping a file while importing a 260MB PST file into Outlook 2003.
We ran the same Firefox and iTunes tasks from the last test again, and then did the following:
1) Open Outlook.
2) Start importing 260MB PST.
3) Start WinRAR.
4) Archive 130MB test file.
WinRAR remained the application in focus during this test.
Here, we looked at two metrics: how long it took WinRAR to compress our test file, and how many emails were imported into Outlook during the time WinRAR was archiving. Let's have a look at the results:
Here, we see that all of the CPUs performed relatively similarly to one another, but now let's talk about how many emails were imported. The non-HT Pentium 4 imported around 500 emails, while the HT P4 EE imported around 1700 emails by the time WinRAR was done. Neither of those is even close to the performance of the dual core chips, which each imported over 3000 emails in the same 40 seconds. The single core Athlon 64 FX-55 also only imported around 400 emails.
Our second test shows us that the performance of a dual core solution comes in all shapes and sizes. In this case, our foreground task took the same amount of time in almost all cases, but what was done in the background varied significantly.
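The background results are easier to compare as throughput. The sketch below converts the approximate email counts quoted above into emails per second over the roughly 40 second WinRAR run; the dual core figure uses 3000 as a lower bound, since those chips imported "over 3000".

```python
WINRAR_SECONDS = 40  # approximate foreground archive time for all chips

# Approximate emails imported while WinRAR was archiving.
emails_imported = {
    "Athlon 64 FX-55": 400,
    "Pentium 4 (no HT)": 500,
    "Pentium 4 EE (HT)": 1700,
    "Dual core (lower bound)": 3000,
}

rates = {cpu: count / WINRAR_SECONDS for cpu, count in emails_imported.items()}
baseline = rates["Pentium 4 (no HT)"]

for cpu, rate in rates.items():
    print(f"{cpu}: {rate:.1f} emails/s ({rate / baseline:.1f}x the non-HT P4)")
```

By this measure, the dual core chips moved at least six times as much background work as the non-HT Pentium 4 while the foreground task finished in about the same time.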
Multitasking Scenario 3: Web Browsing
For our final benchmark, we decided to switch things up a bit and keep Firefox as our foreground application while background tasks ran. To make things even more stressful, we had no fewer than 12 tabs open in Firefox, with our main tab being IGN's PSP website - which happens to be very Flash heavy.
The iTunes and Newsleecher tasks from the first test scenario were also present in this one, plus we did the following:
Open Outlook, immediately import 130MB PST file and immediately switch app focus to Firefox.
We then recorded the total time required to import the new PST while Firefox was our foreground application. The results were very interesting:
The most surprising result is how poorly AMD did in this test. We actually had to exclude them from the graph as it distorted the bar lengths too much. AMD weighed in at over 27 minutes; from actually using the system, it looks like Flash takes a much bigger performance toll on AMD platforms than it does on Intel. The end result is that the scheduler devoted very little time to the Outlook process, resulting in the import taking an extremely long time.
Ignoring the AMD outlier, dual core offered serious performance improvements over single core within the Intel realm alone. The 840 completed the PST import in around 70% of the time of the 3.73EE. Again, the gap would grow if more tasks were running, or if we were actually interacting with Firefox instead of just sitting there and reading one page (we confirmed this by actually doing it, but it is a little too difficult to do in a repeatable fashion for testing purposes).
Dual Core System Impressions
Despite our best efforts, some of the best characterization of the impact of dual core is done with words. The best way to put it is like this: if an application is eating up all of your CPU time, with dual core, you still have one core left to make the rest of your system just as responsive as before. But if you want a more detailed account of such a scenario, take a look at some of our lab notes:
CPU: Pentium 4 Extreme Edition 3.73GHz, Hyper Threading Disabled
So, I was playing around with Outlook, copying a bunch of emails, basically the equivalent of copying a 280MB PST file, which isn't huge by any means. In copying the emails, the CPU utilization skyrocketed to 100% and I was off trying to browse the web to see how responsive that was.
On this HT disabled P4 3.73EE, I could browse the web just fine. I had Firefox open and around 10 tabs and all was fine. I went to minimize Firefox and the animation was very choppy, but it still minimized/restored just fine. I had Photoshop CS running in the background - I tried to switch to it, but all I got was the outline of Photoshop. I couldn't see or interact with the app at all. I switched back to my other apps, Newsleecher, Firefox, iTunes, and they all worked fine, but Photoshop and Outlook were not responding.
I tried to take a screenshot of what was going on, but print screen wouldn't work. I could launch Paint, but I couldn't paste anything into it. So, I went to go get my digital camera to take a picture of it, but my CF card was full. I went and found my CF card adapter, plugged it into my personal machine, copied all of my pictures back to my computer (128MB card), wrote this text and then put the CF card back in my camera and took a picture of what was going on. At least 10 minutes had to have elapsed and Photoshop was still not responding.
The only solution? Kill both Photoshop and Outlook using task manager - at least I had access to task manager.
I wanted to see if it was a fluke, so I tried it again. This time, Photoshop was fine, but Outlook still hung. I closed and restarted Photoshop and got the following: Photoshop was basically hung and slowly made its way into a loaded state. A bit of a pain, especially when the only solution is to kill Outlook and I still can't get my emails copied over.
CPU: Pentium 4 Extreme Edition 3.73GHz, Hyper Threading Enabled
I repeat the same basic test with HT on; the obvious difference is that the UI is a lot faster. Minimizing/restoring windows is no longer super choppy, and application launches are much quicker. Launching Photoshop didn't yield the same almost-dying results as before.
To push things even further, I started the DVD Shrink test and although the performance was obviously impacted, the system still remained quite responsive - other than Outlook, which was taking its sweet time.
I could still browse the web just fine, and overall, the rest of the system was pretty impressive despite Outlook being a rogue process.
CPU: Dual Core Pentium D 3.2GHz
Now, time to try it out on the Pentium D 3.2GHz. On this chip, I went through the same setup. The first thing I noticed was that merely clicking on the Inbox in Outlook didn't pause the system for 7 - 10 seconds as it did on the single core platforms. It only took 1 - 2 seconds; it felt much more responsive.
The next thing was that the Outlook window never turned completely blank. I still couldn't play around with the Outlook interface, but the window was always drawn. I'm not sure if this is necessarily a great thing, but it's a noticeable difference. I could still minimize the window, but I just couldn't interact with anything within the window.
Time to stress the system a bit more. I fired up the DVD Shrink benchmark, and started shrinking a DVD while downloading headers from Newsleecher. I then closed Photoshop and tried to restart it...wow, the application opened as quickly as it normally would have - no delays, nothing.
Outlook did eventually start listing itself as "Not Responding", but I still had full interaction with the rest of my system, even though both CPUs were pegged at 100%. I'm guessing that because of the nature of the other applications, I could still switch between them, interact with them and launch more apps without any noticeable degradation in performance.
The other major change was that Outlook could now be closed using its own X button, instead of me having to kill it via task manager. Speeding up the Outlook task would require faster single cores (and maybe a faster hard disk), but dealing with its impact on the rest of the system is best handled by multiple cores.
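The idea that extra cores absorb the impact of a runaway task can be illustrated with a deliberately simplified model. The sketch below assumes ideal fair sharing among CPU-bound tasks and ignores Hyper Threading, thread priorities, disk I/O and memory contention, all of which matter on a real system; the function name is hypothetical, and the model is not a description of how the Windows scheduler actually behaves.

```python
def cpu_share_per_task(cores, busy_tasks):
    """Fraction of one core each CPU-bound task gets under ideal fair sharing.

    Assumption: every busy task is single-threaded and the scheduler splits
    the available cores evenly. Real schedulers are more complicated.
    """
    if cores <= 0 or busy_tasks <= 0:
        raise ValueError("cores and busy_tasks must be positive")
    # A single-threaded task can never use more than one full core.
    return min(1.0, cores / busy_tasks)


# One runaway task (e.g. a stuck import) plus one task the user is waiting on.
for cores in (1, 2):
    share = cpu_share_per_task(cores, busy_tasks=2)
    print(f"{cores} core(s): each task gets {share:.0%} of a core")
```

Under these assumptions, the task in front of the user drops to half a core on a single core chip but keeps a full core on a dual core chip, which is consistent with the lab notes above, where the rest of the system stayed usable while Outlook hung.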
CPU: Dual Core Pentium Extreme Edition 840
The experience here was pretty much the same as the Pentium D, but just with even better performance in the DVD Shrink task (still taking under 14 minutes to deal with the DVD).
The computer was maybe slightly more responsive, but nothing huge. When compared to the non-HT Pentium D, it is clear that HT does help dual core, although not as much as it helps single core P4s.
The verdict on dual core is far from in, but what we've presented here is a start. We have more coverage coming, including power consumption, overclocking potential and a look at the more economical dual core price points from Intel. We're also hard at work on creating new multitasking benchmarks with the hopes of eventually reaching the holy grail of being able to measure and quantify system response time accurately. To that effect, if you all have any suggestions for usage models that you'd like to see tested or any benchmarking suggestions in general, please let us know.
We're far from being able to make any conclusions about dual core or Intel's Pentium D/Extreme Edition, but there are some things that we can say at this point:
- In general use of the system, the Pentium Extreme Edition 840 felt just as fast as the 3.73GHz Pentium 4 Extreme Edition. In multitasking, there was no substitute for the dual core Pentium Extreme Edition.
- Hyper Threading made a decent impact on our usage, even on the dual core platform. However, the benchmarks show that Hyper Threading on dual core doesn't always result in a performance boost. That being said, we'd still opt for Hyper Threading as it just seems to make things smoother than without on the dual core chip. Although Intel has a desire to separate their Extreme Edition and Pentium D lines, we think that Hyper Threading is the wrong feature to use as a differentiator - all users could benefit from its presence on their dual core platforms.
- Intel's pricing strategy for dual core makes a lot of sense to force market adoption. In the near future, we will be looking at Intel's cheapest dual core offering to see how well it stacks up to AMD's similarly priced single core chips. The only way to make sure that developers crank out multithreaded desktop software is to ensure a large installed user base, and Intel appears to be committed to doing that.
- AMD should get an even larger boost from the move to dual core than Intel has, simply because AMD doesn't presently have the ability to execute more than one thread at a time. Intel's Hyper Threading on their single core chips greatly improves response time as well as multitasking performance. For AMD, the move to dual core will give their users the benefits in response time that their Intel counterparts have enjoyed with Hyper Threading as well as the extra advantage offered by having two identical cores on a chip.
- When it comes to dual core vs. single core with Hyper Threading, there's a huge difference. While both improve system response time, dual core improves it more while also guaranteeing better overall system performance. Hyper Threading lets you multitask, dual core lets you actually get work done while multitasking.
That's all for now - we'll have much more dual core coverage later on this week and the next.