Original Link: http://www.anandtech.com/show/1745
Affordable Dual Core from AMD: Athlon 64 X2 3800+
by Anand Lal Shimpi on August 1, 2005 9:36 AM EST
For the past couple of months, we've asked for it, hoped for it and dreamed of it, and today, AMD is launching it: the $354 Athlon 64 X2 3800+, the first somewhat affordable dual core CPU from AMD.
If necessity is the mother of invention, then the birth of the Athlon 64 X2 3800+ should be no surprise to anyone. In one of their strongest CPU paper-launches ever, AMD put their best foot forward this past May and introduced the Athlon 64 X2 processor. While AMD was late to the desktop dual core game compared to Intel, the Athlon 64 X2 processor had absolutely no problem outperforming Intel's Pentium D. But at the end of the day, despite AMD's clear victory, our recommendations were quite complicated, thanks to one major flaw in AMD's execution: price.
The cheapest dual core Pentium D processor could be had for under $300, yet AMD's cheapest started at $537. Intel was effectively moving the market to dual core, while AMD was only catering to the wealthiest budgets.
The Pentium D 820, running at 2.8GHz and priced at $280, offered the most impressive value that we've seen in a processor in quite some time - if you could properly use the power. Multitaskers and users of multithreaded applications found themselves with the cheapest 2-way workstation processor that they had seen since the SMP Celerons and ABIT's BP6. While Intel satiated our demands for affordable dual core, we knew it wasn't the perfect option. AMD's Athlon 64 X2 was the better overall performer, just at the very wrong price point.
After much pressure from all sides and some very important manufacturing changes, AMD went ahead with the decision to release a cheaper Athlon 64 X2. The decision was made around the time of Computex 2005 and that's when we first heard of the $354 Athlon 64 X2 3800+.
The Athlon 64 X2 3800+ is basically two Athlon 64 3200+ cores stuck together, each running at 2.0GHz and each with its own 512KB L2 cache. This is a full 200MHz lower clock per core than the 4200+, but with the same amount of cache.
Looking at the table above, it is clear that AMD has left room for another SKU - potentially an Athlon 64 X2 4000+ at 2.0GHz, but with a 1MB L2 cache. AMD could also go lower, pairing up a couple of 1.8GHz/512KB cores, but AMD most likely wanted to find a good balance between single threaded performance, price and multithreaded performance with this new "entry level" X2 core.
Note: The 512KB X2s are available in both 154M and 233M transistor versions.
A New Core
AMD didn't sit on the X2 3800+ just because they were greedy and expected everyone to gobble up the $500+ parts. Instead, today's release is the result of a slightly revised core, codenamed Manchester, specifically designed to cut costs.
The original Athlon 64 X2 (Toledo core) processors all had the same exact specifications:
- 233.2M transistors
- 199 mm² die size
- 110W max power
For the Athlon 64 X2 4800+ and the 4400+, the shared transistor count and die size made sense. They both were identical from a transistor standpoint, one chip just ran 200MHz faster than the other. But the 4200+ and the 4600+ weren't identical; unlike the 4800/4400+ X2s, the 4200+ and 4600+ only had a 512KB L2 cache per core, not a 1MB L2.
Update: As many of you have correctly pointed out, the 4200+ and 4600+ were available as both Toledo and Manchester cores.
More than half of the Athlon 64 X2's transistor count is spent on cache, which means that if there are going to be any manufacturing defects on the chip, they will more than likely occur in the processor's cache. Born out of that fact, the Toledo based Athlon 64 X2 4600+ and 4200+ were nothing more than 4800/4400+ X2s with too many manufacturing defects; instead of throwing the bad cores away, AMD simply rebranded them and sold them at lower price points. The problem with this approach is that an Athlon 64 X2 4200+ took the same amount of space on a wafer as an Athlon 64 X2 4800+, despite only having half the cache. Thus we have the Manchester core: a core designed from the ground up to only feature a 512KB L2 cache per core.
As manufacturing ramps up, yields improve and it is now possible to actually create a cost-reduced Athlon 64 X2, using the smaller Manchester die - and that's where the Athlon 64 X2 3800+ gets its cost savings.
The transistor count of the 3800+ goes down to 154 million, and the die shrinks to 147 mm², compared to the 233.2M transistors and 199 mm² of its bigger brothers (4800/4400+). The thermal envelope of the new core also dropped from 110W down to 89W; both figures are still lower than that of Intel's Pentium D or, for that matter, the single-core Pentium 4.
With a smaller die and lower transistor count, the Athlon 64 X2 3800+ is able to support its $354 price tag.
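To put the die-size savings in rough perspective, here is a minimal sketch using a common first-order approximation for gross dies per wafer. The die areas come from the figures above; the wafer diameter is an assumption (the article does not state it), and the estimate ignores yield, scribe lines and edge exclusion. The absolute counts are illustrative only; the ratio between the two dies is the interesting part.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float) -> int:
    """First-order estimate: wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    wafer_area = math.pi * radius ** 2
    # Partial dies around the circumference are lost; this term approximates them.
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


WAFER_DIAMETER_MM = 200  # assumption for illustration, not given in the article

toledo = gross_dies_per_wafer(199, WAFER_DIAMETER_MM)      # 199 mm² die
manchester = gross_dies_per_wafer(147, WAFER_DIAMETER_MM)  # 147 mm² die

print(f"Toledo candidates per wafer:     {toledo}")
print(f"Manchester candidates per wafer: {manchester}")
print(f"Manchester / Toledo ratio:       {manchester / toledo:.2f}")
```

Under these assumptions, the smaller die yields noticeably more candidate chips per wafer than the raw area ratio alone suggests, because less of the wafer edge is wasted on partial dies.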
Power Comparison: Manchester vs. Toledo
In our first review of the Athlon 64 X2, we were astounded by the fact that the fastest Athlon 64 X2, thanks to its cool running 90nm process, consumed less power than any single core Pentium 4 processor, not to mention all of the dual core models.
We also noted that a dual core Athlon 64 X2 processor used less power than a single core 130nm Athlon 64, once again a testament to AMD's transition to 90nm.
This time around, we're interested in the power consumption benefits of the new Manchester core. AMD says the core drops the maximum power consumption from 110W down to 89W, but what is that in the real world?
In order to find out, we performed one simple test; we clocked an Athlon 64 X2 4200+ based on the old Toledo core at 2.0GHz, the same clock speed as the X2 3800+, and measured the total power consumption of the system. We then swapped out the Toledo based X2 for a new Manchester based X2 to see, clock for clock, what the tangible decrease in power consumption was.
Remember, we're only looking at total system power consumption - obviously CPU power consumption will be a lot lower, but with identical system specifications, the CPU's impact on power consumption should be the major variable that we're measuring here.
Clock for clock, there's no tangible reduction in power consumption courtesy of the new Manchester core. But given how cool the Toledo based Athlon 64 X2s were already running, we're not too disappointed that there isn't more to talk about here. After all, the biggest advantage of the Manchester core is the cost reduction...
New Pricing, but Higher Cost per Core?
One thing that we noticed in our first review of the Athlon 64 X2 processor was that AMD was surely getting their money's worth out of each X2 sale, especially compared to Intel. Dating back to the launch of the Pentium D, Intel's entry-level Pentium D 820 only came with an $80 premium over its identical single core counterpart. Back then, AMD's cheapest dual core part, the X2 4200+, commanded a $265 premium for its second core.
With the introduction of the Manchester core in the Athlon 64 X2 3800+, AMD introduces a much more reasonably priced dual-core CPU, where the cost of the second core has finally dropped to $160. It's still not as low as Intel's lowest, but it is fairly competitive with Intel's closest priced dual core competitor - the 3.0GHz Pentium D 830.
It is interesting to note that although AMD has cut both their single core and dual core prices since the X2's launch, the cost per core of the older dual core CPUs has actually gone up a little in some cases. While both of the 512KB L2 parts have decreased their cost per second core relative to today's single core prices, the 1MB parts have gone up. Overall, prices have still gone down; it's just that the gap between buying a single core CPU and a dual core has changed.
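The "cost of the second core" used here is simply the dual core price minus the price of the equivalent single core part. A minimal sketch of that arithmetic follows; the $354 price and $160 premium are from the text, while the single core price is only implied by those two numbers and is not quoted directly in the article.

```python
def second_core_premium(dual_core_price: float, single_core_price: float) -> float:
    """Extra cost of buying the dual core part over its single core equivalent."""
    return dual_core_price - single_core_price


X2_3800_PRICE = 354         # stated in the article
ATHLON_64_3200_PRICE = 194  # implied by the $160 premium; not stated directly

premium = second_core_premium(X2_3800_PRICE, ATHLON_64_3200_PRICE)
share = premium / X2_3800_PRICE

print(f"Second core premium: ${premium}")
print(f"Share of the dual core price: {share:.0%}")
```

The same function applies to any dual core/single core pair, which is how the premiums for the 1MB and 512KB parts can be compared as single core prices change.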
So, what AMD has done is effectively released a price competitor to the Pentium D 830. While it isn't the Pentium D 820 competitor that we were hoping for at a sub-$300 price point, the Athlon 64 X2 3800+ will have to do.
Unfortunately, while AMD announced availability starting today, we have only seen limited availability in the retail channel, with only Monarch and Directron listing the chip as shipping on 8/12/2005.
AMD's Efficiency Advantage?
Before we get to the actual barrage of performance tests, there is one issue that we have been wanting to tackle for quite some time now.
AMD has often argued that their dual core architecture is inherently more efficient than Intel's, primarily because of their System Request Queue (SRQ). All core-to-core transfers occur via this queue instead of over a main, shared FSB, which is the case in the Pentium D.
Johan put AMD's architecture to the test by measuring the latency of cache-to-cache transfers in AMD's dual core chips vs. Intel's. The results were quite impressively in favor of AMD's architecture. Cache-to-cache transfers on Intel's dual core CPUs took over twice as long as on AMD's dual core CPUs, but at that time, we could not find any real world benefit to the architecture.
Armed with a bit more time, we went through all of our benchmarks and specifically focused on those that received the most performance gain from dual core architectures. Using these multithreaded and/or multitasking benchmarks, we looked at the performance improvements that the dual core processors offered over their single core counterparts. For AMD, making this comparison was easy; we took the Athlon 64 X2 3800+ and compared it to its single core equivalent, the Athlon 64 3200+. For Intel, the comparison is a bit more complicated. The inclusion of Hyper Threading makes the single-core to dual-core jump a little less impressive in some cases, thanks to the fact that virtually all single-core Pentium 4 processors these days can execute two threads simultaneously. Thus, for Intel, we had to look at HT enabled, dual core and dual core with HT enabled, all compared to single core performance to get a complete picture of Intel's multithreaded performance scaling.
Remember that all performance increases are with reference to a single core processor, and in the case of Intel, we are talking about a single core Pentium 4 with HT disabled. More specifically, we used a Pentium D 830 (3.0GHz) for the dual core tests and compared it to its single core counterpart - the Pentium 4 530 (3.0GHz).
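The scaling figures in this section are percentage gains relative to the single core baseline. A minimal sketch of that calculation is below; the function name and the sample scores are hypothetical and are not taken from the benchmark results. Timed tests, where lower is better, are handled by inverting the ratio.

```python
def scaling_gain(single_core: float, dual_core: float, lower_is_better: bool = False) -> float:
    """Percentage improvement of the dual core result over the single core baseline."""
    if lower_is_better:
        # For elapsed times, the speedup is baseline time / new time.
        return (single_core / dual_core - 1) * 100
    # For scores, the speedup is new score / baseline score.
    return (dual_core / single_core - 1) * 100


# Hypothetical inputs for illustration only.
score_gain = scaling_gain(single_core=2.0, dual_core=3.0)
time_gain = scaling_gain(single_core=300.0, dual_core=200.0, lower_is_better=True)

print(f"Score-based test: +{score_gain:.1f}%")
print(f"Time-based test:  +{time_gain:.1f}%")
```

Expressing both kinds of test as a gain over the same baseline is what allows score-based and time-based benchmarks to be compared side by side.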
First, we have our Winstone 2004 benchmark suite; we omitted Business Winstone 2004, since it shows virtually no performance boost from dual core CPUs and instead, focused on Multimedia Content Creation Winstone 2004 and the Multitasking Winstone tests.
While AMD scales slightly worse than Intel (comparing the AMD Dual Core to the Intel Dual Core rows) in the MMCC Winstone test and significantly worse in the Multitasking 1 test, AMD scales better in the last two tests. Particularly in the third multitasking test, AMD gets a whopping 68.4% from the move to dual core while Intel only improves by 39.1%.
It is also worth noting that although Hyper Threading improves performance with a single core, enabling HT on the dual core CPU actually yields lower overall performance than if we had left it off (+24.1% vs. +39.1%). Johan explained exactly why situations like this exist on the Pentium D in his "Quest for More Processing Power".
Next up is the SYSMark 2004 suite. In all but two of the tests, AMD scales slightly better than Intel when going to dual core. The scaling advantages aren't huge, but they are tangible in some of the tests.
Once again, while Hyper Threading itself tends to impress, HT + dual core gives us a mixed bag of results, sometimes outperforming dual core alone while falling behind other times.
Finally, we have our application-specific benchmarks; here, we have AMD scaling better than Intel in 3 out of the 5 tests, but then in the remaining 2, Intel scales better.
Out of the 15 tests, 10 of them showed that AMD scaled better from single to dual core than Intel, while the remaining 5 showed the opposite, that Intel scales better. Out of the 10 tests where AMD offered better scaling, only 6 of them showed AMD outscaling Intel by more than a 3% margin (one test had AMD with a 2.9% advantage, but it was close enough, so we counted it). Of the 5 tests where Intel scaled better, 4 of them had Intel at an advantage by more than 3%.
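The tally above can be sketched as a simple count over per-test scaling gains. This is an illustrative sketch, not the method actually used for the article: it assumes the 3% margin is a difference in percentage points, and apart from the 68.4% vs. 39.1% pair quoted earlier, the sample data is hypothetical.

```python
def tally_scaling(results, margin=3.0):
    """Count which vendor scales better per test, and by more than `margin` points.

    `results` is a list of (amd_gain_percent, intel_gain_percent) pairs.
    """
    summary = {"amd_better": 0, "amd_clear": 0, "intel_better": 0, "intel_clear": 0}
    for amd_gain, intel_gain in results:
        diff = amd_gain - intel_gain
        if diff > 0:
            summary["amd_better"] += 1
            if diff > margin:
                summary["amd_clear"] += 1
        elif diff < 0:
            summary["intel_better"] += 1
            if -diff > margin:
                summary["intel_clear"] += 1
        # Exact ties count for neither side.
    return summary


sample = [
    (68.4, 39.1),  # third multitasking test, as quoted in the article
    (10.0, 12.0),  # hypothetical
    (20.0, 15.5),  # hypothetical
    (5.0, 9.0),    # hypothetical
]

summary = tally_scaling(sample)
print(summary)
```

Note that a strict threshold would exclude a 2.9-point lead; counting it as "close enough", as done above, is an editorial judgment rather than something the function captures.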
While the Athlon 64 X2 does have much better cache-to-cache transfer latencies than the Pentium D, it appears as if for the most part, those advantages don't surface in real-world desktop usage. That being the case, the Athlon 64 X2 3800+ must outperform the Pentium D 830 based on the performance advantages of its individual cores in order to win this battle, not based on any dual core architectural efficiencies. So, does it?
Head to Head: Athlon 64 X2 3800+ vs. Pentium D 830
Is the Athlon 64 X2 3800+ worthy of its Pentium D opponent? Not to spoil the surprise, but yes, emphatically yes.
Not only are there significant advantages in single threaded games, but everything from encoding to the multitasking tests put the Athlon 64 X2 3800+ ahead of its Pentium D counterpart.
Note: The iTunes scores are Encoding Times in Minutes, lower numbers are better.
Although they aren't pictured here (for space reasons), you'll see in the coming pages that there is only one benchmark where Intel ends up ahead. The Roxio VideoWave test in PCWorld's WorldBench 5 suite completes 6 seconds quicker on the Pentium D 830 than on the Athlon 64 X2 3800+. That is one loss out of 31 total benchmarks for the Athlon 64 X2 3800+ (once again, not all pictured here, but you'll see them on the coming pages).
The victory is clear and without debate: at the $300 - $400 price point, the Athlon 64 X2 3800+ is the dual core processor to get.
The Test
Our hardware configurations are similar to what we've used in previous comparisons. For this test, we focused on CPUs at or around the Athlon 64 X2 3800+'s $354 price point.
AMD Athlon 64 Configuration
Socket-939 Athlon 64 CPUs
2 x 512MB OCZ PC3200 EL Dual Channel DIMMs 2-2-2-7
ASUS A8N-SLI Deluxe
ATI Radeon X850 XT PCI Express
Intel Pentium 4 Configuration
LGA-775 Intel Pentium 4 and Pentium D CPUs
2 x 512MB Crucial DDR-II 533 Dual Channel DIMMs 3-3-3-12
Intel 925XE Motherboard
ATI Radeon X850 XT PCI Express
Business/General Use Performance
Business Winstone 2004
Business Winstone 2004 tests the following applications in various usage scenarios:
. Microsoft Access 2002
. Microsoft Excel 2002
. Microsoft FrontPage 2002
. Microsoft Outlook 2002
. Microsoft PowerPoint 2002
. Microsoft Project 2002
. Microsoft Word 2002
. Norton AntiVirus Professional Edition 2003
. WinZip 8.1
Office Productivity SYSMark 2004
SYSMark's Office Productivity suite consists of three tests, the first of which is the Communication test. The Communication test consists of the following:
"The user receives an email in Outlook 2002 that contains a collection of documents in a zip file. The user reviews his email and updates his calendar while VirusScan 7.0 scans the system. The corporate web site is viewed in Internet Explorer 6.0. Finally, Internet Explorer is used to look at samples of the web pages and documents created during the scenario."
The next test is Document Creation performance:
"The user edits the document using Word 2002. He transcribes an audio file into a document using Dragon NaturallySpeaking 6. Once the document has all the necessary pieces in place, the user changes it into a portable format for easy and secure distribution using Acrobat 5.0.5. The user creates a marketing presentation in PowerPoint 2002 and adds elements to a slide show template."
The final test in our Office Productivity suite is Data Analysis, an area where the Pentium D typically does well. BAPCo describes it as:
"The user opens a database using Access 2002 and runs some queries. A collection of documents are archived using WinZip 8.1. The queries' results are imported into a spreadsheet using Excel 2002 and are used to generate graphical charts."
Microsoft Office XP SP-2
Here, we see that in the purest of office application tests, performance doesn't vary all too much.
Mozilla 1.4
Quite possibly the most frequently used application on any desktop is the one that we pay the least amount of attention to when it comes to performance. While a bit older than the core that is now used in Firefox, performance in Mozilla is worth looking at as many users are switching from IE to a much more capable browser on the PC - Firefox.
ACD Systems ACDSee PowerPack 5.0
ACDSee is a popular image editing tool that is great for basic image editing options such as batch resizing, rotating, cropping and other such features that are too elementary to justify purchasing something as powerful as Photoshop. There are no extremely complex filters here, just pure batch image processing.
WinZip
Archiving performance ends up being fairly CPU bound as well as I/O limited.
WinRAR 3.40
Pulling the hard disk out of the equation, we can get a much better idea of which processors are truly best suited for file compression.
Multitasking Content Creation
MCC Winstone 2004
Multimedia Content Creation Winstone 2004 tests the following applications in various usage scenarios:
. Adobe® Photoshop® 7.0.1
. Adobe® Premiere® 6.50
. Macromedia® Director MX 9.0
. Macromedia® Dreamweaver MX 6.1
. Microsoft® Windows Media™ Encoder 9 Version 9.00.00.2980
. NewTek's LightWave® 3D 7.5b
. Steinberg™ WaveLab™ 4.0f
As you can see above, Lightwave is part of the MCC Winstone 2004 benchmark suite. As an individual application, Lightwave does manage to get a healthy performance benefit with multithreaded rendering enabled, especially when paired with Hyperthreading enabled CPUs like the Pentium 4s here today. All chips were tested with Lightwave set to spawn 4 threads.
ICC SYSMark 2004
The first category that we will deal with is 3D Content Creation. The tests that make up this benchmark are described below:
"The user renders a 3D model to a bitmap using 3ds max 5.1, while preparing web pages in Dreamweaver MX. Then the user renders a 3D animation in a vector graphics format."
Next, we have 2D Content Creation performance:
"The user uses Premiere 6.5 to create a movie from several raw input movie cuts and sound cuts and starts exporting it. While waiting on this operation, the user imports the rendered image into Photoshop 7.01, modifies it and saves the results. Once the movie is assembled, the user edits it and creates special effects using After Effects 5.5."
The Internet Content Creation suite is rounded out with a Web Publishing performance test:
"The user extracts content from an archive using WinZip 8.1. Meanwhile, he uses Flash MX to open the exported 3D vector graphics file. He modifies it by including other pictures and optimizes it for faster animation. The final movie with the special effects is then compressed using Windows Media Encoder 9 series in a format that can be broadcast over broadband Internet. The web site is given the final touches in Dreamweaver MX and the system is scanned by VirusScan 7.0."
Mozilla + Media Encoder
Video Creation/Photo Editing
Adobe Photoshop 7.0.1
Roxio VideoWave Movie Creator 1.5
MusicMatch Jukebox 7.10
DivX 6 with AutoGK
Armed with DivX 6 and the AutoGK front end for Gordian Knot, we took all of the processors to task at encoding a chapter out of Pirates of the Caribbean. We set AutoGK to give us 75% quality of the original DVD rip and did not encode audio; all of the DivX 6 settings were left at default.
Windows Media Encoder 9
To finish up our look at Video Encoding performance, we have two tests both involving Windows Media Encoder 9. The first test is WorldBench 5's WMV9 encoding test.
Once we crank up the requirements a bit and start doing some HD quality encoding under WMV9, the single core performance drops dramatically:
Half Life 2
Splinter Cell: Chaos Theory
Unreal Tournament 2004
3dsmax 5.1
WorldBench includes two 3dsmax benchmarks using version 5.1 of the popular 3D rendering and animation package: a DirectX and an OpenGL benchmark. We are only using the DirectX benchmark here:
3dsmax 6
For the next 3dsmax test, we used version 6 of the program and ran the SPECapc rendering tests to truly stress these CPUs. Since there's not much new to report here, we're only going to report the Rendering Composite score.
Multitasking Performance
Business Winstone 2004 includes a multitasking test as a part of its suite, which does the following:
"This test uses the same applications as the Business Winstone test, but runs some of them in the background. The test has three segments: in the first, files copy in the background while the script runs Microsoft Outlook and Internet Explorer in the foreground. The script waits for both foreground and background tasks to complete before starting the second segment. In that segment, Excel and Word operations run in the foreground while WinZip archives in the background. The script waits for both foreground and background tasks to complete before starting the third segment. In that segment, Norton AntiVirus runs a virus check in the background while Microsoft Excel, Microsoft Project, Microsoft Access, Microsoft PowerPoint, Microsoft FrontPage, and WinZip operations run in the foreground."
Final Words
There's not much to say here other than that the Athlon 64 X2 3800+ is the clear choice for any user at this price point. What you give up in single threaded performance is more than made up for by the improvements in multitasking and multithreaded application performance.
Bit by bit, AMD is eating away at any possible recommendation that we'd have for the Pentium D. While the Pentium D 820 is still our recommendation at the sub-$300 mark, if your budget can handle it, we'd strongly recommend going for the Athlon 64 X2 3800+.
As for overclocking, we had no problems reaching 2.46GHz with our Athlon 64 X2 3800+ sample using standard air cooling. The overclocking wasn't as impressive as what we saw with the Toledo based Athlon 64 X2 4200+, but we will save a final conclusion on overclocking until we get more Manchester based processors in house.
We really didn't want to see AMD become a more expensive CPU manufacturer, and with the X2 3800+, we finally have a more sensibly priced dual core option. The choice is clear - the Athlon 64 X2 3800+ is better in every way than the Pentium D 830. For Intel's sake in the enthusiast community, Conroe had better be very competitive next year - because ever since Prescott, the Pentium 4 has been an utter disappointment.