Memory Latency Impact on Performance

We just looked at the impact of memory bandwidth on performance, but what about latency? Let's start by adjusting the CAS latency from our default of 2 clocks up to 3 clocks. Almost all DDR400 these days is CAS 2 memory, but older memory may have a higher CAS latency, or you may have to increase your CAS latency when overclocking to gain more memory bandwidth. So what kind of a performance hit is there when going from CAS 2 to CAS 3?
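To put "clocks" into absolute terms, here is a minimal sketch that converts a latency in memory clock cycles to nanoseconds. It assumes the standard DDR400 memory clock of 200 MHz (400 MT/s at double data rate); the function name `cycles_to_ns` is only illustrative and is not something used in the testing itself.

```python
# DDR400 moves data at 400 MT/s on a 200 MHz clock (double data rate),
# so one memory clock cycle lasts 5 ns. The 200 MHz figure is an
# assumption based on the DDR400 standard, not a measured value.
DDR400_CLOCK_MHZ = 200.0


def cycles_to_ns(cycles: int, clock_mhz: float = DDR400_CLOCK_MHZ) -> float:
    """Convert a latency given in memory clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


for cas in (2, 3):
    print(f"CAS {cas}: {cycles_to_ns(cas):.1f} ns")
```

Under that assumption, moving from CAS 2 to CAS 3 adds one 5 ns cycle to the CAS portion of each access.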

| Timing | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
| --- | --- | --- | --- | --- | --- |
| Tcl = 2 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| Tcl = 3 | 115.52 | 137.07 | 121.91 | 113.37 | 79.92 |

At worst, CAS 2 memory seems to be about 4% faster than CAS 3 memory when looking at at_c17_12 (83.15 vs. 79.92), our most CPU intensive test. While 4% alone isn't anything major, combine that with a number of other performance tweaks and they can definitely begin to add up.
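The gap for each test can be checked directly from the CAS latency results. The snippet below is a small sketch of that arithmetic; the helper name `percent_faster` is illustrative, and the figures are copied from the Tcl = 2 and Tcl = 3 rows.

```python
def percent_faster(fast: float, slow: float) -> float:
    """Return how much faster `fast` is than `slow`, in percent."""
    return (fast - slow) / slow * 100.0


# Scores copied from the Tcl = 2 and Tcl = 3 results.
tcl2 = {"at_canals_08": 116.12, "at_coast_05": 140.43, "at_coast_12": 123.37,
        "at_prison_05": 113.69, "at_c17_12": 83.15}
tcl3 = {"at_canals_08": 115.52, "at_coast_05": 137.07, "at_coast_12": 121.91,
        "at_prison_05": 113.37, "at_c17_12": 79.92}

for test, score in tcl2.items():
    print(f"{test}: CAS 2 is {percent_faster(score, tcl3[test]):.2f}% faster")
```

The largest gap is in at_c17_12, at a little over 4%.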

Now let's keep Tcl (CAS latency) fixed at 2 clocks and vary Trcd from 2 up to 6 clocks:

| Timing | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
| --- | --- | --- | --- | --- | --- |
| Trcd = 2 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| Trcd = 3 | 115.71 | 136.99 | 122.46 | 113.08 | 79.97 |
| Trcd = 4 | 113.92 | 134.42 | 120.87 | 112.38 | 79.83 |
| Trcd = 5 | 113.42 | 131.82 | 119.34 | 114.79 | 79.12 |
| Trcd = 6 | 113.23 | 128.26 | 117.56 | 111.15 | 77.4 |

For the most part we saw only small changes when adjusting Trcd. The clearest exception is at_coast_05, which showed a pretty big difference between a Trcd value of 2 and higher latency values, falling from 140.43 at Trcd = 2 to 128.26 at Trcd = 6.

Next we'll look at adjusting Trp:

| Timing | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
| --- | --- | --- | --- | --- | --- |
| Trp = 2 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| Trp = 3 | 115.6 | 139.24 | 123.13 | 116.35 | 82.09 |
| Trp = 4 | 115.85 | 138.88 | 122.98 | 113.16 | 82.05 |
| Trp = 5 | 114.84 | 138 | 122.65 | 112 | 80.98 |
| Trp = 6 | 114.5 | 136.95 | 121.96 | 115.61 | 80.95 |

Here we see very little impact on performance: even at Trp = 6, no test falls more than about 3% below its Trp = 2 score.

Putting them all together, we can see what the overall impact of using fast DDR400, higher latency DDR400 and extremely high latency DDR400 will be:

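| Timings | at_canals_08 | at_coast_05 | at_coast_12 | at_prison_05 | at_c17_12 |
| --- | --- | --- | --- | --- | --- |
| 2-2-2-10 | 116.12 | 140.43 | 123.37 | 113.69 | 83.15 |
| 3-3-3-10 | 114.47 | 134.11 | 120.64 | 112.62 | 80.56 |
| 3-6-6-10 | 110.74 | 123.76 | 114.75 | 112.17 | 73.8 |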

Our standard 2-2-2-10 memory does actually offer reasonable performance benefits in Half Life 2 compared to DDR400 with higher timings such as 3-3-3-10 or the unrealistically high 3-6-6-10. 
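To summarize the combined timing results in one number per configuration, the sketch below computes each test's drop relative to 2-2-2-10 and then a simple arithmetic mean across the five tests. The averaging is an assumption made here for illustration, not something the original testing reports, and the names `percent_drops` and `mean_drop` are hypothetical.

```python
from statistics import mean

TESTS = ["at_canals_08", "at_coast_05", "at_coast_12", "at_prison_05", "at_c17_12"]

# Scores copied from the combined timings results, in TESTS order.
RESULTS = {
    "2-2-2-10": [116.12, 140.43, 123.37, 113.69, 83.15],
    "3-3-3-10": [114.47, 134.11, 120.64, 112.62, 80.56],
    "3-6-6-10": [110.74, 123.76, 114.75, 112.17, 73.80],
}


def percent_drops(timings: str, baseline: str = "2-2-2-10") -> list:
    """Per-test performance lost versus the baseline timings, in percent."""
    return [(b - v) / b * 100.0
            for b, v in zip(RESULTS[baseline], RESULTS[timings])]


def mean_drop(timings: str) -> float:
    """Unweighted average of the per-test drops (an illustrative summary)."""
    return mean(percent_drops(timings))


for timings in ("3-3-3-10", "3-6-6-10"):
    per_test = ", ".join(f"{t}: {d:.1f}%"
                         for t, d in zip(TESTS, percent_drops(timings)))
    print(f"{timings} -> {per_test} (mean {mean_drop(timings):.1f}%)")
```

An unweighted mean treats every demo equally; a different weighting would shift the summary figure but not the per-test drops.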

First and foremost, Half Life 2 does appear to be rather dependent on memory bandwidth, but it is also quite appreciative of low latency memory. If you're wondering whether being able to run memory at low timings and high clock speeds is important, when it comes to Half Life 2 performance, it is.


68 Comments


  • dderidex - Wednesday, February 2, 2005

    Quick question...

    On the 'cache comparison' on page 5, where they compare an A64 with 1mb cache to an A64 with 512k cache...

    What CPUs are they comparing?

    512k Socket 754 (single channel)
    vs
    1mb Socket 754 (single channel)

    or

    512k Socket 939 (dual channel)
    vs
    1mb Socket 939 (dual channel)

    or

    512k Socket 754 (single channel)
    vs
    1mb Socket 939 (dual channel)

    etc.

    No info is provided, so it's hard to really say what the numbers are showing.
  • doughtree - Tuesday, September 6, 2005

    great article, next game you should do is battlefield 2!
  • dsorrent - Monday, January 31, 2005

    How come in all of the CPU comparisons, the AMD FX-53 is left out of the comparisons?
  • PsharkJF - Monday, January 31, 2005

    That has no bearing to half-life. Nice job, fanboy.
  • levicki - Saturday, January 29, 2005

    Btw, I have Pentium 4 520 and 6600 GT card and I prefer that combo over AMD+ATI anytime. I had a chance to work on AMD and I didn't like it -- no hyperthreading = bad feeling when working with few things at once. With my P4 I can compress DVD to DivX and play Need For Speed Underground 2 without a hitch. I had ATI (Sapphire 9600 Pro) and didn't like that crap too especially when OpenGL and drivers are concerned = too much crashing.
    Intel .vs. AMD -- people can argue for ages about that but my 2 cents are that musicians using Pentium 4 with HT get 0.67 ms latency with latest beta kX drivers for Creative cards and AMD owners get close to 5.8 ms. From a developer point of view Intel is much better choice too due to great support, compiler and documentation. So my next CPU will be LGA775 with EM64T (I already have a compatible mainboard) and not AMD which by the way has troubles with Winchester cores failing Prime 95 at stock speed.
  • Carfax - Saturday, January 29, 2005

    Yeah, developers are so lazy that they will still use x87 for FP rather than SSE2, knowing that the latter will give better performance.

    That's why the new 64-bit OS from MSoft will be a good thing. It will force developers to use SSE2/SSE3, because they have access to twice as many registers and the OS itself won't recognize x87 for 64-bit operations.
  • Barneyk - Saturday, January 29, 2005

    I would've liked to see some benchmarks on older CPUs too, kinda disappointed...
  • levicki - Friday, January 28, 2005

    I just wonder how this test would look if it was made with a 6800 Ultra instead of an ATI X850 XT.

    Disabling SSE/SSE2 on Athlon and getting the same results as if they were enabled means that game is NOT OPTIMIZED. Using FPU math instead of SSE/SSE2 today is a sin. It could have been 3-4 times faster if they cared about optimizing the code.
  • Phantronius - Friday, January 28, 2005

    #53

    It's because the Prescotts weren't better than the Northwoods to begin with, hence why you don't see squat performance differences between them.
  • maestroH - Friday, January 28, 2005

    Thx for your reply #56. Apologies for false '@9700pro' statement. Meant to say 'soft-modded with Omega driver to 9700pro'. Cheers.
