Memory Timings and Bandwidth Explained

With that brief overview of the memory subsystem, we are ready to talk about memory timings. There are usually four and sometimes five timings listed with memory. They are expressed as a set of numbers, e.g. 2-3-2-7, corresponding to CAS-tRCD-tRP-tRAS. On modules that list a fifth number, it is usually the CMD value, e.g. 1T. Some might also include a range for the tRAS value. These are really only a small subset of the total number of timing figures that memory companies use, but they tend to be the more important ones and encapsulate the other values. So, what does each setting mean? By referring back to the previous sections on how memory is accessed, we can explain where each value comes into play.
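To make these figures more concrete, here is a minimal sketch (illustrative numbers only, assuming a DDR400 module whose command clock runs at 200 MHz) that converts the 2-3-2-7 example from clock cycles into nanoseconds:

    # Illustrative sketch: convert memory timings from clock cycles to nanoseconds.
    # Assumes a DDR400 module, whose command clock runs at 200 MHz (5 ns per cycle).
    MEM_CLOCK_MHZ = 200
    CYCLE_NS = 1000.0 / MEM_CLOCK_MHZ

    timings = {"CL": 2, "tRCD": 3, "tRP": 2, "tRAS": 7}  # the 2-3-2-7 example above

    for name, cycles in timings.items():
        print(f"{name}: {cycles} cycles = {cycles * CYCLE_NS:.1f} ns")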

The most commonly discussed timing is the CAS Latency, or CL value. CAS stands for Column Address Strobe. CL is the number of memory cycles that elapse between the time a column is requested from an active page and the time the data is ready to begin bursting across the bus. Because this is the most common type of access, CAS Latency generally has the largest impact on overall memory performance for applications that are sensitive to memory latency. Applications that depend on memory bandwidth care less about CAS Latency, though. Of course, other factors come into play as well: our tests with OCZ 3500EB RAM have shown that a well-designed CL2.5 module can keep up with and sometimes even outperform CL2 RAM. Note that purely random memory accesses stress the other timings more than CL, as there is little spatial locality in that case. Random access is not typical of general computing, which explains why theoretical memory benchmarks that use it as a performance metric frequently have little to no correlation with real-world performance.
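As a rough illustration of why CL matters more for latency-bound work than for bandwidth-bound work, the sketch below (hypothetical numbers, again assuming a 200 MHz DDR command clock and an 8-beat burst) compares the time to the first word against the time for the whole burst; CL is the dominant term for the former but only a fraction of the latter:

    # Rough sketch: CL dominates time-to-first-data, but its share shrinks
    # once a long burst is streaming. Assumes a 200 MHz DDR clock (5 ns/cycle)
    # and an 8-beat burst (2 beats per clock on DDR), ignoring other overhead.
    CYCLE_NS = 5.0
    BURST_BEATS = 8
    burst_cycles = BURST_BEATS / 2  # DDR transfers two beats per clock

    for cl in (2.0, 2.5, 3.0):
        first_data_ns = cl * CYCLE_NS
        whole_burst_ns = (cl + burst_cycles) * CYCLE_NS
        print(f"CL{cl}: first data ~{first_data_ns:.1f} ns, "
              f"full burst ~{whole_burst_ns:.1f} ns")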

The next value is tRCD, which is referred to as the RAS to CAS Delay. This is the delay in memory cycles between the time a row is activated and when a column of data within the row can actually be requested. It comes into play when a request arrives for data that is not in an active row, so it occurs less frequently than CL and is generally not as important. As mentioned a moment ago, certain applications and benchmarks can have different memory access patterns, though, which can make tRCD more of a factor.
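A hedged sketch of how tRCD stacks on top of CL when the target row is not yet active (the bank sits idle and precharged); the 2-3 timings and 200 MHz clock are the same illustrative assumptions as before:

    # Sketch: when the target row is closed (bank precharged), the row must
    # first be activated, so time to first data is roughly tRCD + CL cycles.
    CYCLE_NS = 5.0  # 200 MHz command clock assumed
    CL, tRCD = 2, 3

    page_hit_ns = CL * CYCLE_NS
    page_empty_ns = (tRCD + CL) * CYCLE_NS
    print(f"page hit:   ~{page_hit_ns:.1f} ns")
    print(f"page empty: ~{page_empty_ns:.1f} ns")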

The term tRP stands for the time for RAS Precharge, which can be somewhat confusing. Time for a Row Precharge is another interpretation of the term and explains the situation better. tRP is the time in memory cycles that is required to flush an active row out of the sense amp ("cache") before a new row can be requested. As with tRCD, this only comes into play when a request is made to an inactive row.
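Putting CL, tRCD, and tRP together gives a rough picture of the three basic access cases. The sketch below uses the same illustrative 2-3-2 timings at 200 MHz and ignores command rate and controller overhead:

    # Sketch of the three basic access cases, ignoring command rate and
    # controller overhead. 2-3-2 (CL-tRCD-tRP) at a 200 MHz clock assumed.
    CYCLE_NS = 5.0
    CL, tRCD, tRP = 2, 3, 2

    cases = {
        "page hit (row already active)":    CL,
        "page empty (bank precharged)":     tRCD + CL,
        "page conflict (wrong row active)": tRP + tRCD + CL,
    }
    for name, cycles in cases.items():
        print(f"{name}: {cycles} cycles = {cycles * CYCLE_NS:.1f} ns")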

Moving on, we have the tRAS - or more properly tRASmin - which is the minimum time that a row must remain active before a new row within that bank can be activated. In other words, after a row is activated, it cannot be closed and another row in the same bank opened until a minimum amount of time (tRASmin) has elapsed. This is why having more memory banks can help to improve memory performance, provided it does not slow down other areas of the memory: with more banks, there is less chance that a new page/row will need to be activated in a bank for which tRASmin has not yet elapsed. Taken together, tRAS and tRP make up the Row Cycle time (tRC = tRAS + tRP), the minimum interval between activating one row and activating a different row within the same bank.
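A small sketch of the tRASmin constraint and the resulting row cycle time, again using the illustrative 2-3-2-7 timings at a 200 MHz clock (the activate and precharge cycle numbers are made up purely for the example):

    # Sketch: a row cannot be precharged until tRASmin cycles after it was
    # activated, so back-to-back activates to the same bank are limited by
    # tRC = tRAS + tRP. Illustrative 2-3-2-7 timings, 200 MHz clock.
    CYCLE_NS = 5.0
    tRP, tRAS = 2, 7
    tRC = tRAS + tRP

    activate_cycle = 0       # cycle at which the row was activated
    precharge_request = 5    # cycle at which we would like to close the row

    earliest_precharge = activate_cycle + tRAS
    actual_precharge = max(precharge_request, earliest_precharge)
    print(f"precharge starts at cycle {actual_precharge} "
          f"(requested at {precharge_request}, tRASmin = {tRAS})")
    print(f"row cycle time tRC = {tRC} cycles = {tRC * CYCLE_NS:.1f} ns")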

CMD is the command rate of the memory. The command rate specifies for how many consecutive clock cycles commands must be presented to the DRAMs before the DRAMs sample the address and command bus wires. The package of the memory controller, the wires of the address and command buses, and the package of the DRAM all have some electrical capacitance. As the electrical 1's and 0's of a command are sent from the memory controller to the DRAMs, the capacitance of these (and other) elements of the memory system slows the rate at which a transition between a 1 and a 0 (or vice versa) can occur. At ever-increasing memory bus clock speeds, the clock period shrinks, leaving less time for each transition to complete. Because of the way addresses and commands are routed to the DRAMs on memory modules, the total capacitance on these wires may be so high that transitions cannot occur reliably within a single clock cycle. For this reason, commands may need to be presented for 2 consecutive clock cycles so that they are assured of settling to their proper values before the DRAMs act on them.

A 2T command rate means that commands are presented to the DRAMs for 2 consecutive clocks. In some implementations, the command rate is always 1T; in others, it may be either 1T or 2T. On DDR/DDR2, for instance, using high-quality memory modules (which cost a little more) and/or reducing the number of modules on each channel can allow 1T operation. As for how the command rate affects performance, the explanation above should make it clear that CMD can be just as important as CL: every memory access incurs both the CMD and the CL delays, so shaving one memory clock cycle off either benefits every access.
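A hedged sketch of why the command rate matters: under the same illustrative assumptions as before, a 2T command rate adds one clock to every access, and that extra clock is proportionally largest on the fast page-hit case:

    # Sketch: a 2T command rate holds the command for one extra clock, so every
    # access grows by one cycle. Illustrative 2-3-2 timings at 200 MHz.
    CYCLE_NS = 5.0
    CL, tRCD, tRP = 2, 3, 2

    for cmd in (1, 2):  # 1T or 2T command rate
        extra = cmd - 1
        hit = (CL + extra) * CYCLE_NS
        conflict = (tRP + tRCD + CL + extra) * CYCLE_NS
        print(f"{cmd}T: page hit ~{hit:.1f} ns, page conflict ~{conflict:.1f} ns")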

In addition to all of these timings, the question of memory bandwidth still remains. Bandwidth is the rate at which data can be sent from the DRAMs over the memory bus. Lower timings allow faster access to the data, while higher bandwidth allows more data to be transferred in a given amount of time. Applications that access large amounts of data - either sequentially or randomly - usually benefit from increased bandwidth. Bandwidth can be increased either by adding memory channels (i.e. dual-channel) or by increasing the clock speed of the memory. Doubling memory bandwidth will never double actual performance outside of theoretical benchmarks, but it can provide a significant boost. Many games and multimedia benchmarks process large amounts of data that cannot reside within the cache of the CPU, and being able to retrieve that data faster helps. All other things being equal, more bandwidth will never hurt performance.
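For reference, a quick sketch of how theoretical peak bandwidth is derived, assuming DDR400 with a 64-bit (8-byte) data bus per channel; real-world throughput is always lower than this peak:

    # Sketch: theoretical peak bandwidth = clock * 2 (DDR) * bus width * channels.
    # DDR400: 200 MHz command clock, 64-bit (8-byte) data bus per channel.
    MEM_CLOCK_MHZ = 200
    BUS_BYTES = 8

    for channels in (1, 2):
        peak_mb_s = MEM_CLOCK_MHZ * 2 * BUS_BYTES * channels
        print(f"{channels} channel(s): {peak_mb_s} MB/s = {peak_mb_s / 1000:.1f} GB/s")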

It is important to make clear that this is only a very brief overview of common RAM timings. Memory is really quite complex, and stating that lower CAS Latencies and higher bandwidths are better is a generalization, akin to stating that "larger caches and higher clock speeds are better" in the CPU realm. This is often true, but many other factors come into play. For CPUs, we also need to consider pipeline lengths, the number of in-flight instructions, specific instruction latencies, the number and type of execution units, and so forth. RAM likewise has numerous other timings that can come into play, and the memory controller, FSB, and many other influences also affect the resulting performance and efficiency of a system. Some people might think that designing memory is relatively simple compared to working on CPUs, but especially with rising clock speeds, this is not the case.

Comments

  • Lynx516 - Tuesday, September 28, 2004 - link

    Your description of how SDRAM works is wrong. You do not burst down the rows as your article implies; instead it bursts along the columns.

    The whole column is sent immediately, but the other columns in the burst are not and are sent sequentially (ideally; not quite the case if you want to interleave them).

    Comparing banks to set associativity is probably counterproductive, as most of your readers won't fully understand how it works. And in fact, comparing banks to set associativity is a bad analogy. A better one would be just to say that the memory space in the chip is split up into banks.

    On top of this, you have referred to a detailed comparison of DRAM types. Even though there are many different types of DRAM, most are not that interesting or used that much in PCs. I also assume that, as you have said this, you will not be talking about SRAM or RDRAM in forthcoming articles, which would highlight the different approaches that can be taken when designing a memory subsystem (SRAM the low-latency, high-bandwidth but low-density approach; RDRAM the serial approach).

    I assume you are going to talk a bit about how a memory controller works, as they are one of the most complex components in a PC (more complex than the execution core of a CPU), but you have not referred to any plans to talk about memory controllers and how the type of memory you are using affects the design of a memory controller.

    All in all, a pretty confusingly written article. If you want DRAM for beginners, Ars Technica has two good articles (though one is fairly old, it at least correctly and CLEARLY describes how SDRAM works).

  • Resh - Tuesday, September 28, 2004 - link

    I really think that some diagrams would help, especially for novices like #10. Other than that, great article and hope to see the follow-ups soon.
  • Modal - Tuesday, September 28, 2004 - link

    Great article, thanks. I like these "this is how the pieces of your computer work" articles... very interesting stuff, but it's usually written in far too complicated a manner for a relative novice like me. This was quite readable and understandable however; nice work.
  • danidentity - Tuesday, September 28, 2004 - link

    This is one of the best articles I've seen at Anandtech in a long while, keep up the good work.
  • deathwalker - Tuesday, September 28, 2004 - link

    I, for one, would rather have 1 GB of CL 2.5 high-quality memory than 512 MB of CL 2 high-quality memory. I'm convinced that in this instance quantity wins out over speed.
  • AlphaFox - Tuesday, September 28, 2004 - link

    where are the pictures? ;)
  • Pollock - Tuesday, September 28, 2004 - link

    Excellent read!
  • mino - Tuesday, September 28, 2004 - link

    Sry for triple post but some major typpos:
    "1) buy generic memory until your budget could afford no more than 512M DDR400"
    should be:
    "1) buy generic memory until your budget could afford more than 512M DDR400"
    and
    "Goog"(ROFL) should be "Good"
    onother -> another
    Hope that's all ;)
  • mino - Tuesday, September 28, 2004 - link

    OK, 3 rules ;) - I added 3rd after some thought.
  • mino - Tuesday, September 28, 2004 - link

    #2 You are missing one important point. That is, unless You can(want) afford at least 512M high quality RAM, it makes NO SENSE to buy 256M DDR400 CL2 since there are 2 basic rules:

    1) buy generic memory until your budget could afford no more than 512M DDR400
    2) then spend some aditional money for brand memory
    3) then go 1G and only at this point spent all additional money for better latencies and so on.

    Also do remember that at many shops(here in Slovakia) there is 3 or 4 yrs warranty for generic memory(like A-DATA) and also if you have major problems with compatibility they will usually allow you to choose different brand/type for your board for no additional cost except price difference. Also in case the memory works fine with onother board.
    Also Twinmos parts have 99month warranty (for price 10% higher than generic). That speaks for itself.

    Except this little missing part of reality,

    Goog work Jarred.
