Bandwidth and Memory Scaling

One of the surprises in comparing DDR2 performance on AM2 and Core 2 Duo was the much better memory bandwidth found on the AM2 platform courtesy of the on-chip memory controller. Unfortunately, this did not translate into significant performance improvements compared to a similar AMD processor running DDR. At that point we concluded that Core 2 Duo was not particularly bandwidth sensitive, since it made very good use of the memory bandwidth available.

In our earlier review we were really comparing the DDR2 memory controller on AM2 to the 975X chipset memory controller, since Intel continues to place the memory controller in the chipset. We have speculated since then whether an improved memory controller in a socket 775 chipset would bring with it improved performance.

P965 brought very minor changes, mainly in the straps and overclocking ability of the memory. The NVIDIA 680i/670/650 chipsets actually show decreased buffered bandwidth, but their unbuffered bandwidth is about the same as P965. This reinforced the notion that memory bandwidth didn't matter much with Core 2.

To begin our investigation into DDR3 performance, we compared Standard (Buffered) bandwidth on the P965 running DDR2, the new P35 running DDR2, and the new P35 running DDR3. The results are very interesting.

Standard (Buffered) Sandra XI.SP2 Memory Bandwidth - 2.66GHz

Memory Speed                              P965            P35 DDR2        P35 DDR3
                                          ASUS P5B Dlx    ASUS P5K Dlx    ASUS P5K3 Dlx
DDR2-800 3-3-3-9                          5531            6456            -
DDR2-800 5/6-6-6-15, DDR3-800 6-6-6-15    5207            6143            6156
DDR2-1067 4-4-3-11                        5782            6811            -
DDR2-1067 5/6-6-6-15                      5712            6621            -
DDR3-1067 7-7-7-20                        -               -               6613
DDR3-1333 9-9-9-25                        -               -               6757

While the purpose of this review was to compare DDR3 and DDR2 performance, something completely different emerged from the memory bandwidth tests. Namely, the memory controller on the P35 is definitely an improvement over the P965 memory controller. This is evident whether the P35 is running DDR2 or DDR3 memory.

In those cases where we can run the same or nearly the same timings, as at the 800 memory speed, DDR2 and DDR3 results are virtually identical. At 1067, DDR3 at its current slow 7-7-7-20 timings performs just as well as DDR2 running at 6-6-6-15. The superior DDR2-1067 timings of 4-4-3 still provide the best bandwidth at that speed. Of course, DDR3 is currently alone at the 1333 memory speed, but even with the current slow 9-9-9-25 timings it performs nearly as well as DDR2-1067 at 4-4-3 timings.
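As a rough illustration of why timings such as 9-9-9-25 are described as slow, the sketch below converts CAS latency from clock cycles into nanoseconds. It is not part of our test procedure. It assumes the usual DDR relationship in which the memory clock is half the labeled data rate, it treats the labels 1067 and 1333 as the data rates, and the function name cas_latency_ns is made up for this example.

```python
# Convert CAS latency from clock cycles to nanoseconds.
# Assumption: DDR memory transfers data twice per clock, so the clock in MHz
# is half the labeled data rate (DDR2-800 runs on a 400 MHz clock).

def cas_latency_ns(data_rate: float, cas_cycles: int) -> float:
    """Return the CAS delay in nanoseconds for a labeled data rate."""
    clock_mhz = data_rate / 2.0
    # One cycle lasts 1000 / clock_mhz nanoseconds.
    return cas_cycles * 1000.0 / clock_mhz

# (label, data rate, CAS cycles) for the configurations tested on this page.
configs = [
    ("DDR2-800", 800, 3),
    ("DDR3-800", 800, 6),
    ("DDR2-1067", 1067, 4),
    ("DDR3-1067", 1067, 7),
    ("DDR3-1333", 1333, 9),
]

for label, rate, cas in configs:
    print(f"{label} CL{cas}: {cas_latency_ns(rate, cas):.1f} ns")
```

A higher cycle count at a higher clock can still mean a longer absolute delay, which is one reason faster-clocked DDR3 does not automatically pull ahead of tightly timed DDR2.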

We normally also test memory with buffering schemes like MMX, SSE, SSE2, SSE3, etc. turned off. While these features do provide apparent improved bandwidth, we have found the unbuffered bandwidth to correlate better with real-world application performance. Unbuffered performance does not always follow the patterns of buffered memory performance.

Unbuffered Sandra XI.SP2 Memory Bandwidth - 2.66GHz

Memory Speed                              P965            P35 DDR2        P35 DDR3
                                          ASUS P5B Dlx    ASUS P5K Dlx    ASUS P5K3 Dlx
DDR2-800 3-3-3-9                          4226            4536            -
DDR2-800 5/6-6-6-15, DDR3-800 6-6-6-15    3668            3975            4098
DDR2-1067 4-4-3-11                        4608            4926            -
DDR2-1067 5/6-6-6-15                      4389            4557            -
DDR3-1067 7-7-7-20                        -               -               4547
DDR3-1333 9-9-9-25                        -               -               4702

Unbuffered results show the same basic pattern as buffered results in this case. At the same slow timings at the 800 memory speed, DDR3 is clearly the best performer, about 3% ahead of DDR2 on the P35 and about 12% ahead of DDR2 on the P965. DDR2 is still faster at the better timings available with current DDR2 memory.

In Standard/Buffered memory bandwidth, the P35 (Bearlake) chipset is providing a 16% to 18% improvement in memory bandwidth compared to the P965. This is a significant improvement. The Unbuffered improvement is smaller, in the range of 4% to 8%. These bandwidth improvements may or may not translate into improved system performance. We will examine that in the SuperPi and Gaming benchmarks.
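For readers who want to check the percentages, the short sketch below recomputes the P35-over-P965 gain for each DDR2 configuration from the two Sandra tables. The numbers are copied from the tables; the helper names percent_gain and gain_range are made up for this example and are not part of Sandra or any other tool.

```python
# Sandra XI.SP2 results copied from the tables on this page.
# Each entry is (P965 with DDR2, P35 with DDR2).
buffered = {
    "DDR2-800 3-3-3-9": (5531, 6456),
    "DDR2-800 5/6-6-6-15": (5207, 6143),
    "DDR2-1067 4-4-3-11": (5782, 6811),
    "DDR2-1067 5/6-6-6-15": (5712, 6621),
}
unbuffered = {
    "DDR2-800 3-3-3-9": (4226, 4536),
    "DDR2-800 5/6-6-6-15": (3668, 3975),
    "DDR2-1067 4-4-3-11": (4608, 4926),
    "DDR2-1067 5/6-6-6-15": (4389, 4557),
}

def percent_gain(baseline: float, improved: float) -> float:
    """Percentage increase of improved over baseline."""
    return (improved - baseline) / baseline * 100.0

def gain_range(results: dict) -> tuple:
    """Smallest and largest P35-over-P965 gain across all configurations."""
    gains = [percent_gain(p965, p35) for p965, p35 in results.values()]
    return min(gains), max(gains)

for name, table in (("Buffered", buffered), ("Unbuffered", unbuffered)):
    low, high = gain_range(table)
    print(f"{name}: P35 gain over P965 ranges from {low:.1f}% to {high:.1f}%")

# DDR3 on the P35 at the 800 memory speed, unbuffered, against both DDR2 results.
print(f"DDR3-800 vs P35 DDR2: {percent_gain(3975, 4098):.1f}%")
print(f"DDR3-800 vs P965 DDR2: {percent_gain(3668, 4098):.1f}%")
```

Gains here are expressed relative to the slower result, so "12% ahead" and "12% behind" are not interchangeable; the choice of baseline shifts the figure by a point or two.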

45 Comments

  • Wesley Fink - Tuesday, May 15, 2007

    Yes. The P965 would not boot with a CAS setting of 6 even though it could be selected. So the P965 was tested at 5-6-6 timings. The same DDR2 on the P5K was tested at 6-6-6, which would work and also matched the DDR3 timings. We will clarify this in the article.
  • TA152H - Tuesday, May 15, 2007

    OK, thanks.

    One thing I would suggest when you do the final tests for the Bearlake and DDR3 is to use the 2M processors as well. You'd expect the 4M cache to hide the differences better, obviously, so the 2M cache processors would be pretty interesting to see as well, if for no other reason to see how much the larger cache does mask the difference in the chipset and memory. Since Intel is planning on increasing cache sizes, it would be a pretty useful data point.
  • TA152H - Tuesday, May 15, 2007

    You measured the performance of the memory, but why not take a power measurement of it as well? That is one of the draws of the technology: it uses lower voltage, and therefore should use a little less power and generate less heat. Both are significant.

    Good article though, I just wish that had been included.
  • kalrith - Tuesday, May 15, 2007

    Page 2, second line of second-to-last paragraph says, "which is a 16% reduction form DDR2". "form" should be "from".

    Last page, fourth line of third-to-last paragraph says, "the shift to DDR2 may be further delayed". "DDR2" should be "DDR3".

    BTW, I found the article interesting, informative, enlightening, and unbiased (as usual).
  • Wesley Fink - Tuesday, May 15, 2007

    Mild dyslexia and less-than-smart built-in spell checkers always win :) Both errors are corrected. Thanks.
