Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask

Name: Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask
Item: Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask
Author: Rajinder Gill

by Rajinder Gill on August 15, 2010 10:59 PM EST

Posted in
Memory
Guides
Intel

46 Comments | Add A Comment

46 Comments

We hope you’ve enjoyed reading this article as much as we’ve enjoyed putting it together. If you took the time to thoroughly peruse and digest the information within the intricacies of basic memory operation should no longer be such a baffling subject. With the ground work out of the way, we now have a solid platform from which to build as we more closely begin exploring other avenues for increasing memory performance. We’ve already identified additional topics worth discussing, and provided the time shows up on the books, plan to bring you more.

Assumedly, the one big question that may remain: What are the real world benefits of memory tuning? Technically, we covered the subject in-depth last year in a previous article. We suggest you read through it once again for a refresher before you embark on any overclocking journeys (or before you rush out to over-spend on memory kits). Everything written in that article then is just as valid today. We’ve run tests here on our Gulftown samples and found exactly the same behavior. Undoubtably, Intel have taken steps to ensure their architectures aren't prematurely bottlenecked by giving the memory controller a big, fat bus for communicating with the DIMMs.

ASUS Rampage III Extreme married to 12GB of sweet, sweet DDR3-goodness

From what we can tell, the next generation of performance processors from Intel are going to move over to a 256-bit wide (quad channel) memory controller, leaving little need for ultra-high frequency memory kits. Thus we re-iterate something many have said before: a top priority when it comes to improving memory ICs and their respective architectures should be to focus development on reducing absolute minimum latency requirements for timings such as CAS and tRCD, rather than chasing raw synthetic bandwidth figures or setting outright frequency records at the expense of unduly high random access times.

Stepping away from the performance segment for a moment, something else that's also come to light is rumored news that Intel's Sandy Bridge architecture (due Q1 2011) will, by design, limit reference clock driven overclocking on mainstream parts to 5% past stock operating frequency. If this is indeed the case the consequence will be a very restricted ability to control memory bus frequency with limited granularity to tune the first 50~70 MHz past each step, followed by mandatory minimum jump of 200MHz to the next operating level. Accessing hidden potential will be even more difficult, especially for users of mainstream memory kits. While there is no downside to this from a processing perspective (hey, more speed is always better), this could be another serious nail in the coffin of an already waning overclocking memory industry.

We've Given You the Tools, Now Please Give Us the Help

PRINT THIS ARTICLE

Post Your Comment
Please log in or sign up to comment.

Comments Locked

46 Comments

View All Comments

bowhe - Tuesday, October 26, 2010 - link
Thanks for these great articles!

What I didn't understand yet:
You state "Installing more than one DIMM per channel does not double the Memory Bus bandwidth, as modules co-located in the same channel must compete for access to a shared 64-bit sub-bus; however, adding more modules does have the added benefit of doubling the number of pages that may be open concurrently (twice the ranks for twice the fun!)". This sounds very positive, but:

Some system manufacturers state that with 3 dimms the memory frequency can be for example 1333MHz, but with 6 dimms it needs to drop to 800MHz. Why does the frequency need to drop when using 6 versus 3 dimms? Does this apply to high end boards like the Gigabyte-X58A-UD9?

Some manufacturer states in a small side note of a 24GB kit (6x4GB) that the stated frequency/timing is only guaranteed when using 3 dimm slots. This leads me to think that any 3 dimms of the set can do the stated timing, but when all are used something inherent in the design or interaction of the i7 processor, motherboard and dimm prevents the use of stated frequency/timings? What is it?

Can one overcome these limitations by adjusting voltages in a high end board like the Gigabyte-X58A-UD9? (without use of extreme cooling <32F/0C)

Thanks a lot!
kakfjak - Thursday, May 5, 2011 - link

www.stylishdudes.com

All kinds of shoes + tide bag

Free transport
cochleoid - Tuesday, March 12, 2013 - link
"When associated in groups of two (DDR), four (DDR2) or eight (DDR3), these banks form the next higher logical unit, known as a rank. "

This mislead me. DDR2 may have coincidentally introduced 3 bit banks - allowing for 8 bank chips - but a typical old SDRAM (no DDR) chip had 4 banks.

"We can now see why the DDR3 core has a 8n-prefetch (where n refers to the number of banks per rank) as every read access to the memory requires a minimum of 64 bits (8 bytes) of data to be transferred. This is because each bank, of which there are eight for DDR3, fetches no less than 8 bits (1 byte) of data per read request - the equivalent of one column's worth of data. Whether or not the system actually makes use of all 8 bytes of transferred data is irrelevant. Any delivered data not actually requested can be safely disregarded as it's just a copy of what is still retained in memory."

This threw me off even more. What's happening is that the data at 8 consecutive (or otherwise close, depending on the burst mode) column addresses is being bursted on each read. "n" refers to the width of the memory chip, or the size of the "word" at a particular column address. "n" does not have any relation to the number of banks in a rank.

8 8bit-wide DDR3 chips would make a total module width of 64 bits or 8 bytes at each column address. 8 column addresses would be 64 bytes (not 8 bytes, as the article seems to suggest), which actually corresponds to the cacheline size on most PCs.

SDRAM could burst in sizes of 1,2,4,8
DDR could burst in sizes of only 2,4,8
DDR2 could burst in sizes of only 4,8
DDR3 can burst only in 8.
(All of these could burst in 8, filling the 64 byte cachline in one read operation. The difference with the generations of DDR has been a larger minimum wait in interface clock cycles as the interface got faster and the row accesses remained sluggish.)
The internal clock of SDRAM has been limited by the speed of row accesses. What the 2n,4n,8n prefetches are doing is transferring more of this data available in an open row out at higher interface speeds with the rest of the system. It has nothing to do with the banks.

SDRAM chips were segmented into independently operating banks so that parallel operations on interleaved banks could be synchronized or pipelined. 2n, 4n, and 8n prefetch buffering can be applied without independently operating banks.
ricardo_sa - Saturday, March 26, 2016 - link
Thanks for the detailed explanation. You really saved my day. Ive read this article some time ago to help me understand how a DDR3 worked (theres few detailed explanations on google) and it turned out to be the worst mistake possible. I got the concepts wrong because of the incompetence of the publisher and lost a lot of time dealing with that 8 Bank misconception about the 64 bits.

So it turns out one can only write a burst at 1 bank at a time, am i right? Otherwise you could access all the 8 banks in one single write/read....
Huendli - Friday, March 13, 2015 - link
Thanks for this interesting read with much attention to detail!

"a top priority [...] should be to focus development on reducing absolute minimum latency requirements for timings such as CAS and tRCD, rather than chasing raw synthetic bandwidth figures or setting outright frequency records at the expense of unduly high random access times."

The latter's exactly what happened. DDR3-1600 modules with CL7 timings were widely available at the time this article had been written. Nowadays, you only get ridiculously-named bars with equally-ridiculously monstrous heatspreaders, but more bandwidth and worse timings than ever.
Anuradha - Tuesday, March 9, 2021 - link
Each rank consists of 8 banks, OR, each rank consists of 8 ICs and each IC consists of 8 banks??

Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask

Post Your Comment

46 Comments

View All Comments

bowhe - Tuesday, October 26, 2010 - link

kakfjak - Thursday, May 5, 2011 - link

cochleoid - Tuesday, March 12, 2013 - link

ricardo_sa - Saturday, March 26, 2016 - link

Huendli - Friday, March 13, 2015 - link

Anuradha - Tuesday, March 9, 2021 - link

Log in

Don't have an account? Sign up now