Original Link: https://www.anandtech.com/show/2050



Core 2 Duo (Conroe) launched about twelve days ago with a lot of fanfare. With the largest boost in real performance the industry has seen in almost a decade, it is easy to understand the big splash Core 2 Duo has made in a very short time. AnandTech delivered an in-depth analysis of CPU performance in Intel's Core 2 Extreme & Core 2 Duo: The Empire Strikes Back. With so much new and exciting information about Conroe's performance, it is easy to assume that, since Core 2 Duo uses DDR2 just like NetBurst, its memory performance must be very similar to that of the DDR2-based Intel NetBurst architecture.

Actually, nothing could be further from the truth. While the chipsets still include 975X and the new P965 and the CPU is still Socket T, the shorter pipes, 4 MB unified cache, intelligent look-ahead, and more work per clock cycle all contribute to Conroe exhibiting very different DDR2 memory behavior. It would be easy to say that Core 2 Duo is more like the AMD AM2, launched May 23rd, which now supports DDR2 memory as well. That would be a stretch, however, since AM2 uses an efficient on-processor memory controller, and the launch review found Core 2 Duo faster at the same clock speed than the current AM2. This is another way of saying Conroe is capable of doing more work per cycle - something we had been saying for several years about Athlon64 compared to NetBurst.

The move by AMD from Socket 939 to Socket AM2 is pretty straightforward. The new AM2 processors will continue to be built using the same 90nm manufacturing process currently used for Athlon 64 processors until some time in early to mid-2007. According to AMD roadmaps, AMD will then slowly roll out their 65nm process from the bottom of the line to the top. This could include memory controller enhancements and possibly more. Performance of AM2 changed only very slightly with the move to DDR2, generally in the range of 0% to 5%. The only substantive difference with AM2 is the move from DDR memory to official AMD DDR2 memory support.

Our AM2 launch reviews and the article First Look: AM2 DDR2 vs. 939 DDR Performance found that AM2 with DDR2-533 memory performed roughly the same as the older Socket 939 with fast DDR400 memory. Memory faster than DDR2-533, namely DDR2-667 and DDR2-800, brought slightly higher memory performance to AM2.

The Core 2 Duo introduction is quite different. Clock speed moved down and performance moved up. The top Core 2 Duo, the X6800, is almost 1GHz slower than the older top NetBurst chip and performs 35% to 45% faster. With the huge efficiency and performance increases comes different behavior with DDR2 memory.

With the world now united behind DDR2, it is time to take a closer look at how DDR2 behaves on both the new Intel Core 2 Duo and the AMD AM2 platforms. The performance of both new DDR2 platforms will also be compared to NetBurst DDR2 performance, since the DDR2 NetBurst Architecture has been around for a couple of years and is familiar. We specifically want to know the measured latency of each new platform, how they compare in memory bandwidth, and the scaling of both Core 2 Duo and AM2 as we increase memory speed to DDR2-1067 and beyond. With this information and tests of the same memory on each platform, we hope to be able to answer whether memory test results on Conroe, for instance, will tell us how the memory will perform on AM2.

In addition we have an apples-apples comparison of AM2 and Core 2 Duo running at 2.93GHz (11x266) using the same memory at the same timings and voltages with the same GPU, hard drive, and PSU. This allows a direct memory comparison at 2.93GHz at DDR2-1067. It also provides some very revealing performance results for Core 2 Duo and AM2 at the exact same speeds in the same configurations.



DDR/NetBurst Memory Bandwidth and Latency

One of the most talked-about AMD advantages of the last couple of years has been their on-processor memory controller. This has allowed, according to popular theories, the Athlon64 to significantly outperform Intel NetBurst processors. The fact is that NetBurst DDR2 bandwidth has recently been similar to or wider than Athlon64 DDR bandwidth - even when the DDR is overclocked. You can see this clearly when we compare Buffered and Unbuffered Bandwidth of a NetBurst 3.46EE to an AMD 4800+ X2 (2.4GHz, 2x1MB Cache) running DDR400 2-2-2 and running overclocked memory at DDR533 3-3-3.

The green bars represent DDR memory performance, while the beige to red are increasing DDR2 speed on NetBurst. Light green represents DDR400 2-2-2 while Dark Green is overclocked memory at the same CPU speed, DDR533 at 3-3-3.

Standard (Buffered) Memory Test


In buffered performance, Fast DDR400 is only faster than DDR2-400 and slower than DDR2-533, 667 and 800. Overclocked memory at DDR533 3-3-3 is faster than any of the DDR2 bandwidths on NetBurst.

The Sandra Unbuffered Memory Test, which turns off features that tend to artificially boost performance, is generally a better measure of how memory will behave comparatively in gaming. The same green for DDR applies here.

Unbuffered Memory Test


Without Buffering, DDR400 has the smallest bandwidth of tested memory speeds and timings. Even overclocking to DDR533 allows the DDR to barely beat DDR2-400. DDR2-533, 667, and 800 all have greater Unbuffered bandwidth than the DDR overclocked to 533. NetBurst DDR2 memory bandwidth is generally wider than the bandwidth supplied by DDR memory on Athlon64. Despite the wider bandwidth, the deep pipelines and other inefficiencies in the NetBurst design did not allow the NetBurst processors to outperform Athlon64. Keep this in mind later, when we look at AM2 and Core 2 Duo Memory Bandwidth.

Latency

The other area where AMD has had an advantage over NetBurst DDR2 performance is memory latency, the result of the on-processor memory controller. Comparing the AMD DDR memory controller with the Intel DDR2 memory controller in the Intel chipset shows AMD DDR with latency about 35% lower than Intel NetBurst in ScienceMark 2.0.

Memory Latency Comparison - DDR & NetBurst


While memory bandwidth was very similar between AMD and NetBurst, the deep pipes of the NetBurst design still behaved as if they were bandwidth starved. On the other hand the AMD architecture made use of the bandwidth available and the much lower latency to outperform NetBurst across the board.



AM2/Core 2 Duo Latency and Memory Bandwidth

The introduction of AM2 merely increased the AMD latency advantage. AM2 latency was slightly lower than DDR latency on AMD.

Memory Latency Comparison - Conroe & AM2

However, Core 2 Duo did what most believed was impossible in Latency. One of AMD's advantages is the on-processor memory controller, which Intel has avoided. It should not be possible to use a Memory Controller in the chipset on the motherboard instead and achieve lower latency. Intel developed read-ahead technologies that don't really break this rule, but to the system, in some situations, the Intel Core 2 Duo appears to have lower latency than AM2, and the memory controller functions as if it were lower latency.

Memory Bandwidth

The other part of the memory performance equation is memory bandwidth, and here you may be surprised, based on Conroe's performance lead, to see the changes Core 2 Duo has brought. Results are the average of ALU/FPU results on Sandra 2007 Standard (Buffered) memory performance test. We used the same memory on all three systems, and the fastest memory timings possible were used at each memory speed.


The results are not a mistake. In standard memory bandwidth, Core 2 Duo has lower memory bandwidth than either AM2 or Intel NetBurst. It is almost as if the tables have turned around. AMD had lower bandwidth with DDR than Intel NetBurst, and the Athlon64 outperformed Intel NetBurst. Now Conroe has the poorest Memory Bandwidth of any of the three processors, yet Conroe has a very large performance lead. It appears Conroe, with shallower pipes and an optimized read-ahead memory controller to lower apparent latency, makes best use of the memory bandwidth available.

Perhaps the most interesting finding is that the huge increases in memory bandwidth brought by AM2 make almost no difference in AM2 performance compared to the earlier DDR-based Athlon64. With this perspective let's take a closer look at DDR2 memory performance on AM2 and Core 2 Duo. This will include as close to an apples-to-apples comparison of Core 2 Duo and AM2 as we can create.



Memory Test Configuration

The comparison of AM2 and Core 2 Duo Memory Performance used the exact same components wherever possible. Memory, Hard Drive, Video Card, HSF, and Video Drivers were the same on both test platforms.


The motherboards used for benchmarking differed by necessity, but they are both top-line boards from Asus - the P5W-DH Deluxe for Core 2 Duo and the M2N32-SLI Deluxe for AM2. The latest motherboard drivers from Intel (P5W-DH) and nVidia (M2N32-SLI) were used for testing. The hard drive configurations for each test platform only differed in the drivers required for the test motherboard.

Our Corsair CM2x1024-6400C3 modules were set to the following memory timings on each platform: DDR2-400 - 3-2-2-5, DDR2-533 - 3-2-2-6, DDR2-667 - 3-2-3-7, DDR2-800 - 3-3-3-9, DDR2-1067 - 4-3-4-11, and DDR2-1112 - 5-4-5-14.
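To put those timings in perspective, the first number in each set (CAS latency) can be converted from clock cycles to nanoseconds. The sketch below is only an illustration: it assumes the usual DDR convention that the memory clock is half the rated data rate, and the function name and table are ours, not part of any benchmark used in this article.

```python
# Speeds and CAS latencies (first timing value) from the list above.
# Keys are the rated data rates (DDR2-xxx), values are CAS cycles.
CAS_BY_SPEED = {400: 3, 533: 3, 667: 3, 800: 3, 1067: 4, 1112: 5}


def cas_latency_ns(data_rate: float, cas_cycles: int) -> float:
    """Convert CAS latency from clock cycles to nanoseconds.

    Assumption: DDR memory transfers data twice per clock, so the
    memory clock in MHz is half the rated data rate.
    """
    clock_mhz = data_rate / 2.0
    return cas_cycles / clock_mhz * 1000.0


if __name__ == "__main__":
    for speed, cas in CAS_BY_SPEED.items():
        print(f"DDR2-{speed} CL{cas}: {cas_latency_ns(speed, cas):.2f} ns")
```

Under that assumption, CAS 3 at DDR2-400 works out to 15 ns and CAS 3 at DDR2-800 to 7.5 ns, which is why a looser-looking timing at a higher memory speed does not necessarily mean a slower first access.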



A Closer Look at Latency and Scaling

As was explained in the Core 2 Duo launch review, Core 2 Duo has not physically added a memory controller on the processor. The memory controller is still part of the motherboard chipset that drives Core 2 Duo. Intel added features that perform intelligent look-aheads so that the memory controller behaves as if it had lower latency. As the earlier latency comparisons showed, ScienceMark 2.0 finds the "intelligent look-aheads" in Core 2 Duo to be extremely effective, with Core 2 Duo memory now exhibiting lower apparent latency than AM2. However, not all latency benchmarks show the same results. Everest from Lavalys shows latency improvements in the new CPU revisions, but it shows Latency more as we would expect in evaluating Conroe. For that reason, our detailed benchmarks for latency will use both Everest 1.51.195, which fully supports the Core 2 Duo processor, and ScienceMark 2.0.

Latency, or how fast memory is accessed, is not a static measurement. It varies with memory speed and generally improves (goes down) as memory speed increases. To better understand what is happening with memory accesses we first looked at ScienceMark 2.0 Latency on both AM2 and Conroe.


ScienceMark shows Conroe with a latency lead of about 45ns versus 61ns for AM2 at DDR2-400. Latency continues to decrease as memory speed increases with Core 2 Duo, reaching a value of about 30ns at DDR2-1067. The trend line for AM2 is steeper than that of Core 2 Duo, with AM2 latency improving at a rapid rate until the two are virtually the same at DDR2-800.

It is very interesting that ScienceMark shows lower latency on Core 2 Duo than AM2, since we all know the on-chip AM2 controller has to be faster. We thought perhaps it was because all of the tested memory accesses could be contained in the shared 4MB cache of Core 2 Duo, but Alex Goodrich, one of the authors of ScienceMark, states that Version 2 is designed to test up to 16MB of memory, foreseeing the day of larger caches. In addition he states the Core 2 Duo prefetcher is clever enough to pick up all the patterns ScienceMark uses to "fool" hardware prefetchers. ScienceMark plans a revision with an algorithm that is harder to fool, but Alex commented that Conroe fooling their benchmark was "in itself a great indicator of performance".


Everest uses a different algorithm for measuring Latency, and it shows the on-chip AM2 DDR2 controller in the lead at all memory speeds, with Latency almost the same at the Core 2 Duo memory speed range of DDR2-400 to DDR2-533. However, the Everest trend lines are similar to those in ScienceMark, in that AM2 latency improves at a steeper rate than Core 2 Duo as memory speed increases.

The point to the Latency discussion is that, as expected, AMD has much more opportunity for performance improvement with memory speed increases in AM2. Intel will eventually reach the point, if the lines were extended, where they would have to move to an on-chip memory controller to further improve latency. This is not to take anything away from Intel's intelligent design on Core 2 Duo. They have found a solution that fixes a performance issue without requiring an on-chip controller - for now.



Memory Bandwidth and Scaling

Everyone should already know that memory bandwidth improves with increases in memory speed and reductions in memory timings. To better understand the behavior of AM2 and Core 2 Duo memory bandwidth we used SiSoft Sandra 2007 Professional to provide a closer look at memory bandwidth scaling.
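As a reference point for the measured Sandra scores, the theoretical peak bandwidth at each memory speed can be estimated from the data rate alone. This is a rough sketch that assumes a dual-channel configuration with 64-bit (8-byte) channels; the channel count is our assumption for illustration, and the function name is hypothetical. Measured bandwidth will always come in below these ceilings.

```python
def peak_bandwidth_mb_s(data_rate: int, bus_width_bits: int = 64,
                        channels: int = 2) -> int:
    """Theoretical peak bandwidth in MB/s.

    data_rate is the rated speed (e.g. 800 for DDR2-800), in millions
    of transfers per second. Each transfer moves bus_width_bits / 8
    bytes per channel. Dual channel is an assumption, not a measured
    property of the test systems.
    """
    bytes_per_transfer = bus_width_bits // 8
    return data_rate * bytes_per_transfer * channels


if __name__ == "__main__":
    for speed in (400, 533, 667, 800, 1067):
        print(f"DDR2-{speed}: {peak_bandwidth_mb_s(speed)} MB/s peak")
```

With these assumptions DDR2-400 tops out at 6400 MB/s and DDR2-800 at 12800 MB/s, so doubling the memory speed doubles the ceiling even though neither platform comes close to using all of it.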


The most widely reported Sandra score is the Standard or Buffered memory score. This benchmark takes into account the buffering schemes like MMX, SSE, SSE2, SSE3, and other buffering tools that are used to improve memory performance. As you can clearly see in the Buffered result the AM2 on-chip memory controller holds a huge lead in bandwidth over Core 2 Duo. At DDR2-800 the AM2 lead in memory bandwidth is over 40%.

As we have been saying for years, however, the Buffered benchmark does not correlate well with real performance in games on the same computer. For that reason, our memory bandwidth tests have always included an UNBuffered Sandra memory score. The UNBuffered result turns off the buffering schemes, and we have found the results correlate well with real-world performance.


The Intel Core 2 Duo and AMD AM2 behave quite differently in UNBuffered tests. In these results AM2 and Core 2 Duo are very close in memory bandwidth - much closer than in Standard tests. Core 2 Duo shows wider bandwidth below DDR2-800, but this will likely change when the AM2 controller matures and supports values below 3 in memory timings as the Core 2 Duo currently supports.

The Sandra memory score is really made up of both read and write operations. By taking a closer look at the Read and Write components we can get a clearer picture of how the two memory controllers operate. Everest from Lavalys provides benchmarking tools that can individually measure Read and Write operations.


The READ results are particularly interesting, since you can see that the READ component of Core 2 Duo performance is much larger than the WRITE results on Core 2 Duo. This is the result of the intelligent read-aheads in memory which Intel has used to lower the apparent latency of memory on the Core 2 Duo platform. Actual READ performance on Core 2 Duo now looks almost the same as AM2 up to DDR2-533. AM2 starts pulling away in READ at DDR2-667 and has a slightly steeper slope as memory speed increases. Without the read-ahead feature, Core 2 Duo would be much slower in READ operations than AM2.


This is most clearly illustrated by looking at Everest Write scores. Memory read-ahead does not help when you are writing memory, so Core 2 Duo exhibits much lower WRITE performance than AM2, as we would expect. This means if all else were equal (and it isn't) the AM2 would perform much better in Memory Write tasks. Surprisingly, the WRITE component of Core 2 Duo appears as a straight line just below 5000 MB/s. AM2 starts at 5900 MB/s at DDR2-400 and WRITE rises to around 8000 MB/s at DDR2-667. Write then appears to level off, with higher memory speeds having little to no impact on AM2 WRITE performance.
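The size of the WRITE gap is easier to see as a percentage. The short sketch below uses only the approximate figures quoted above; treating the Core 2 Duo line as a flat 5000 MB/s is our simplification (the text says "just below 5000"), so the real AM2 lead would be slightly larger than what this prints.

```python
def percent_lead(faster: float, slower: float) -> float:
    """Return how much larger `faster` is than `slower`, in percent."""
    return (faster / slower - 1.0) * 100.0


# Approximate Everest WRITE figures quoted in the text, in MB/s.
# 5000 is used as an upper bound for the flat Core 2 Duo line.
CORE2_WRITE = 5000.0
AM2_WRITE = {400: 5900.0, 667: 8000.0}

if __name__ == "__main__":
    for speed, am2 in AM2_WRITE.items():
        lead = percent_lead(am2, CORE2_WRITE)
        print(f"DDR2-{speed}: AM2 WRITE lead of at least {lead:.0f}%")
```

On these rounded numbers the AM2 WRITE advantage grows from roughly 18% at DDR2-400 to roughly 60% at DDR2-667, after which both lines are essentially flat.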



Stock Performance Comparison

With a clearer understanding of how memory behaves on the AM2 and Core 2 Duo platforms, benchmarks compared performance of the fastest Core 2 Duo and AM2 processors available. Core 2 Duo X6800 at 2.93 GHz and FX62 at 2.8GHz are both dual-core processors.









It really doesn't matter which DDR2 speed you examine in this direct comparison. Core 2 Duo is faster in every benchmark at every speed evaluated. It is true, however, that different processor and top memory speeds are being compared. This is a necessity at stock speeds. For that reason, the next series of comparisons tried to configure both test platforms as close to each other as possible.


2.93GHz with DDR2-1067 Performance Comparison

It is clear enough that despite the poorer memory bandwidth, Core 2 Duo is the performance leader by a substantial margin at stock speeds. You have seen that in all of the results posted in this article. This conclusion will not satisfy all our readers, however. Many have theorized every incarnation of performance imaginable with AM2 having higher clock speed, higher bandwidth, or higher speed memory than it currently does.

To best answer these questions we put together the fairest comparison we could think of to directly compare Core 2 Duo and AMD AM2. This consists of running both processors at the exact same speed - 2.93GHz - achieved at the same ratios - 11x266. This involves overclocking the AM2 FX62 to 2.93GHz and raising the "bus" speed to 266. That allows an 11x266 ratio to match Core 2 Duo. The desirable side effect is that while AM2 does not really support DDR2-1067, by setting the memory to DDR2-800 we reach DDR2-1067 speed at the 266 speed setting. While this slightly stacks the deck in AMD's favor, it is as close as it is possible to get at running the two systems at the same speed, same memory timings, same memory voltages, same memory, and same video card. We are comparing two identically configured systems with AM2 powering one system and Core 2 Duo powering the other system.

Results are particularly interesting in that the fastest current AM2 processor, the FX62, is overclocked about 5% in CPU speed and 33% in "bus" speed over a stock AM2 system.
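The arithmetic behind this configuration can be checked with a few lines. This is a simplified sketch: it assumes the "266" setting is really 266.67MHz (800/3), that the stock AM2 reference clock is 200MHz, and that memory speed scales in direct proportion to the reference clock. It does not model the divider details of the actual AM2 memory controller, and all names are ours.

```python
STOCK_REF_MHZ = 200.0        # stock AM2 reference ("bus") clock
STOCK_FX62_MHZ = 2800.0      # FX62 stock CPU speed (2.8GHz)
OC_REF_MHZ = 800.0 / 3.0     # the "266" setting, assumed to be 266.67MHz
MULTIPLIER = 11
MEMORY_SETTING = 800.0       # memory set to DDR2-800 in the BIOS


def percent_over(new: float, base: float) -> float:
    """Percentage by which `new` exceeds `base`."""
    return (new / base - 1.0) * 100.0


cpu_mhz = MULTIPLIER * OC_REF_MHZ
cpu_oc_pct = percent_over(cpu_mhz, STOCK_FX62_MHZ)
bus_oc_pct = percent_over(OC_REF_MHZ, STOCK_REF_MHZ)
# Simple proportional model: memory speed rises with the reference clock.
effective_memory = MEMORY_SETTING * OC_REF_MHZ / STOCK_REF_MHZ

if __name__ == "__main__":
    print(f"CPU: {cpu_mhz:.0f} MHz ({cpu_oc_pct:.1f}% over stock)")
    print(f"Bus: {OC_REF_MHZ:.0f} MHz ({bus_oc_pct:.0f}% over stock)")
    print(f"Memory: DDR2-{effective_memory:.0f}")
```

Under these assumptions the numbers line up with the text: 11 x 266.67 gives about 2933MHz, a CPU overclock of a little under 5%, a 33% increase in "bus" speed, and a DDR2-800 setting that lands at DDR2-1067.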

DDR2-1067 (2.93 GHz) Calculation Performance

Frankly the gap that remains in Super Pi results when comparing AM2 and Core 2 Duo at the same speed was something of a shock. Clock for clock, with all other variables the same, Core 2 Duo is still almost 60% faster than AM2. There is nothing complex about calculating the value of Pi to 2 million places, but it does show the true power of Conroe in computation-intensive tasks.

DDR2-1067 (2.93 GHz) Standard (Buffered) Memory Test

DDR2-1067 (2.93 GHz) Unbuffered Memory Test

Despite the improvements Intel has made in intelligent read-ahead for memory, AM2 still has a huge lead in buffered memory bandwidth. This is a result of the superior on-processor memory controller used on AM2. The results become much closer in Unbuffered memory results, which is normally more revealing of performance in real-world applications, but AM2 still has a wider memory bandwidth. The unfortunate reality is AM2 is not starved for memory bandwidth and cannot really make effective use of this advantage. AMD clearly knows how to deliver memory bandwidth, so the task now becomes to modify their core logic to make better use of this advantage.

DDR2-1067 (2.93 GHz) - Far Cry

DDR2-1067 (2.93 GHz) - Half-Life 2

DDR2-1067 (2.93GHz) - Quake 4 1.22

We can now say with authority that Core 2 Duo is the faster performer clock-for-clock across the board. At the same 2.93GHz Far Cry is 27.7% faster, Half-Life 2: Lost Coast is 12.4% faster, and Quake 4 is 22.2% faster on Core 2 Duo. Of course AMD does not currently have a 2.93GHz CPU, so we tested by overclocking FX62. This suggests that FX64, or whatever it will be called, will not help much at 3.0GHz with a 200 clock speed.



Conclusion

While DDR2 Memory does not exhibit the same bandwidth or performance on the AM2 and Conroe platforms, it does run at the same timings and voltages when going from one platform to another. This was clearly demonstrated in benchmarking tests performed on AM2 and Conroe platforms. This means readers can examine test results performed on a Core 2 Duo test bed with XYZ memory, and reasonably expect that XYZ memory to perform at the same speeds and the same memory timings and voltages on an AM2 platform - provided those settings are available.

There are always the variations in chipset and BIOS that can cause problems with a memory on one brand/model of motherboard and no problems on another brand/model, but that is also true even if you are planning to use the DDR2 on the same type of platform. We have sometimes seen where a brand of memory runs very well on an MSI platform, for example, but where it would not run at all on a DFI platform using the same chipset and CPU. Those types of compatibility issues will always happen, but in general if a memory tests well on Conroe it should do just as well on AM2.

This fact will make our memory testing much simpler, and we plan to perform all upcoming memory testing on the currently more flexible Core 2 Duo test platform. AM2 buyers can expect similar results with the same DDR2 memory on their AM2 motherboards.

A few conclusions about AM2 performance compared to Core 2 Duo performance are also inescapable in looking at our test results. First, Intel has done a remarkable job of concealing the issue of not having an on-processor memory controller. The intelligent look-ahead for memory works very well, and it makes the chipset-based Core 2 Duo memory controller appear to be as fast as the on-processor AM2 in many cases. This does not change the fact that the AM2 memory bandwidth is really greater than Core 2 Duo or the fact that AM2 scales better in memory, exhibiting a steeper slope in performance increase as memory speed increases than does Core 2 Duo. That just means as Memory Speed increases AM2 will benefit more and Intel will eventually need to move to an on-processor controller.

Probably the hardest conclusion for many will be the fact that increasing memory speed, increasing clock speed, and increasing CPU speed alone will not be enough for AM2 to catch up to Core 2 Duo in performance. The performance gap that remains when overclocking AM2 to 2.93GHz at 266 clock speed with DDR2-1067 is still huge. A die-shrink from 90 to 65nm and the additional cache that will allow will definitely help, but we are even skeptical there with Core 2 Duo already overclocking to 4GHz and beyond. No doubt AMD will find a solution, but it is now clear this will not be an easy fix for AMD.

The deep price cuts announced by AMD yesterday will definitely help. The new numbers indicate AM2 will be very competitive at the low end to low-mid of the processor food chain - a spot they have held in the past and where they have still managed to survive. The low end looks very competitive, and AMD is positioned close enough to mid-range in performance to keep Intel honest. There is no mistaking, however, that Intel Core 2 Duo owns the mid to high-end of the current processor market.

With this memory analysis, the memory playing field is hopefully a lot clearer for those shopping for DDR2 memory. Our next memory articles will compare memory performance of DDR2 on the Core 2 Duo Memory Test Bed. This began with the 6 high-performance memories and the 7 value memories tested in the Conroe Buyers Guide. It will continue with evaluations of the fastest memories available from both Corsair and OCZ.
