RTL -

A quick note for all you benching fanatics on the 1156/1366 platforms, especially those of you who love Super Pi. The ‘Round Trip Latency’ (RTL) chipset function in the BIOS denotes the number of Uncore clock cycles that pass before data arrives back at the IMC after a read command is issued.

For those of you familiar with socket 775 and the P35/P45/X38/X48 chipsets, this setting is known as tRD, aka ‘Performance Level’.

‘Performance Level’ on socket 775 processor architectures denotes the number of Front Side Bus (FSB) clock cycles that pass before data arrives back from the memory banks, time-aligned with the leading edge of an FSB clock cycle, making data transfer between the two clock domains possible.

For the i5 and i7 architectures, the data read time (measured from the moment the read command is issued) can be calculated with the following formula:

Read time (ns) = RTL (clocks) x Uncore clock period (ns)

Needless to say, the smaller the figure in nanoseconds, the better the performance. What has changed here is that, although we are given control of the RTL parameter for each memory channel, the default value calculated by the memory controller at POST is already almost as fast as the clock-crossing gearing can go, based on aggressive timing values set by Intel.

Manual control of the RTL function has been added by board vendors to P55/X58, primarily to allow looser manual settings or to lock the setting down to a known working value. The latter is sometimes required because the IMC can select a non-ideal or unstable setting for one of the memory channels; locking these values down to a stable setting prevents random crashes and bizarre instability between reboots. Settings 1-2 clocks below the auto-selected RTL value are sometimes possible for light-load benchmarks such as a single thread of Super Pi 32M, giving a small boost in the final time.

Socket 1156 CPUs have their Uncore frequency multiplier locked, so there’s not too much to look out for other than a quick glance at the real RTL time in nanoseconds, to make sure that the clock crossing schedule is at least as fast as at your previously selected overclock.

For those of you playing around with socket 1366 processors, you get some control over the Uncore multiplier ratio (so long as you observe the minimum 2x memory multiplier rule). Bear in mind that as you increase the Uncore frequency, the RTL value will increase because more clock cycles pass over the same time period. As an example, if RTL defaults to 54 clocks at an Uncore frequency of 4GHz (20x Uncore multiplier) and a memory CAS of 8, our effective read turnaround time is:

54 clocks x 0.25ns (one Uncore cycle at 4GHz) = 13.5ns

Assuming all other bus frequencies and memory timings are unchanged, if we decide to increase the Uncore multiplier ratio to 21x, the RTL value should move out to around 57 clocks:

57 clocks x ~0.238ns (one Uncore cycle at 4.2GHz) = ~13.57ns

Any greater than 13.57ns and we have lost system performance; as a double whammy, we will also have to increase memory controller voltage to facilitate the higher switching speed of the associated IMC stages. This is not the way to truly ‘overclock’ a system for better performance.
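The arithmetic in the example above can be sketched in a few lines of Python. This is a minimal sketch that assumes read time is simply the RTL value multiplied by the Uncore clock period, which is consistent with the 13.57ns figure quoted for 57 clocks at a 21x multiplier. The function name `read_time_ns` and the 200MHz BCLK (implied by 4GHz at 20x) are our own labels, not anything exposed by a BIOS.

```python
def read_time_ns(rtl_clocks: int, uncore_mhz: float) -> float:
    """Read turnaround time in ns: RTL clocks multiplied by the Uncore clock period."""
    uncore_period_ns = 1000.0 / uncore_mhz  # one Uncore cycle, in nanoseconds
    return rtl_clocks * uncore_period_ns


BCLK_MHZ = 200.0  # assumption: implied by 4GHz Uncore at a 20x multiplier

# The two operating points from the example: (Uncore multiplier, RTL in clocks)
for multiplier, rtl in ((20, 54), (21, 57)):
    uncore_mhz = BCLK_MHZ * multiplier
    print(f"{multiplier}x Uncore ({uncore_mhz:.0f}MHz), RTL {rtl}: "
          f"{read_time_ns(rtl, uncore_mhz):.2f}ns")
```

Swapping in the RTL value and Uncore frequency read from your own board gives the figure to compare against a previous operating point.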

The reason we’re including this simple formula here is so that users can simply boot the motherboard, read the RTL value and quickly plug the numbers into the formula to work out the read time. This should save users from running repeated benchmarks for every change in CAS, tRCD or memory multiplier ratio. As always, various benchmarks will react to Uncore frequency changes in different ways, but it is handy to know whether your selected operating point allows for tighter memory controller gearing than other available combinations.

QUICK UPDATE
 

We've been toying around with RTL for a few days and believe we've come up with a method of reliably predicting RTL in clocks using the following formula;

 

 

 

tCL denotes the True CAS Latency of the memory modules, giving us the time required to access memory at a given CAS and memory operating frequency. We take the tCL value, add the Uncore period, then add ~670ps (the approximate distance to the DIMM) for each read transfer, multiplied by eight.

Do note that the 0.67 part of the formula may need slight adjustment according to board layout; if a vendor places the DIMM slots closer to the CPU, this figure will need to be reduced. You will also need to reduce this figure to around 0.57 (570ps) for very high memory clock frequencies on some motherboards. This is because the clock skew needs to be advanced as memory clock frequency is increased. A simple Excel-based calc is available if required, send me an email!
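For readers who would rather script it than use a spreadsheet, here is a rough Python sketch of one reading of the description above: true CAS latency, plus one Uncore period, plus eight read transfers at ~0.67ns each, with the total then converted into Uncore clocks. That interpretation, the function name `predicted_rtl` and its parameters are all our assumptions. It does land on the 54 and 57 clock figures from the earlier example if the memory is assumed to run at DDR3-2000 (1000MHz memory clock), which the text does not state.

```python
def predicted_rtl(cas: int, mem_clock_mhz: float, uncore_mhz: float,
                  flight_ns: float = 0.67, transfers: int = 8) -> float:
    """Estimate RTL in Uncore clocks (one interpretation of the prose description)."""
    tcl_ns = cas * 1000.0 / mem_clock_mhz    # true CAS latency in nanoseconds
    uncore_period_ns = 1000.0 / uncore_mhz   # one Uncore cycle in nanoseconds
    # tCL + one Uncore period + per-transfer flight time for eight transfers
    total_ns = tcl_ns + uncore_period_ns + transfers * flight_ns
    return total_ns / uncore_period_ns       # convert the time into Uncore clocks


# Assumed operating points: CAS 8, 1000MHz memory clock, 200MHz BCLK
for multiplier in (20, 21):
    estimate = predicted_rtl(cas=8, mem_clock_mhz=1000.0, uncore_mhz=200.0 * multiplier)
    print(f"{multiplier}x Uncore: ~{estimate:.1f} clocks")
```

Passing `flight_ns=0.57` models the reduced figure suggested for very high memory clocks; treat any result as an estimate to check against what the board actually reports at POST.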

 

 
 
52 Comments

  • Rajinder Gill - Saturday, November 7, 2009

    For max BCLK testing, PCIe speeds were increased (where required) to 115MHz or so (the highest the CPUs I had could run was between 115-118MHz). I tried changes to RTL and memory dividers, and all voltages were also changed. Subtiming ranges were shifted out to near maximums to see if that helped, and were also matched between the best and worst boards in the tests just to make sure something was not creating a hurdle.

    regards
    Raja
  • dingetje - Friday, November 6, 2009

    wow the p55 platform is totally screwed if this problem persists...any overclocker still oc'ing the hell out of their p55 must be either brave, rich or (michael jackson voice on:) ignoraaaant
  • dingetje - Friday, November 6, 2009

    oops, this was supposed to be a comment, not a reply...damn UI :P
  • dingetje - Friday, November 6, 2009

    now if we could only edit our posts I would be so happy
  • petergab - Friday, November 6, 2009

    What about any MSI boards? I know they may not count as "extreme" OC but I think they should have a representative in such reviews.
  • Rajinder Gill - Friday, November 6, 2009

    The MSI board was due to be included but was left out because of CPU damage that occurred during the socket burnouts. This left no real way of cross-comparing the prior results with the MSI board's abilities on the same CPU. At that point I decided to run with what I had at the time rather than starting afresh and delaying the article even further.

    regards
    Raja
  • spiderbutt - Friday, November 6, 2009

    Are you planning to include the MSI boards at a later date? I am curious to see how they compare to the other boards.

    Thanks for your hard work Raja, it is appreciated!
  • Rajinder Gill - Friday, November 6, 2009

    Hi,

    There will be some MSI P55 board reviews coming, although those were planned around more typical usage scenarios. I guess what I can do is use a different CPU in the E657 EVGA board as a cross-comparison for any high-end P55 MSI offering we review, to see how things stack up.

    regards
    Raja
  • 1stguess - Friday, November 6, 2009

    Wow. This is a bold article. Does anyone dare OC their P55 setup? Madness.
  • jav6454 - Friday, November 6, 2009

    I've been looking at these P55 boards and somehow I always thought highly of the EVGA. These results have now proven my gut feeling right.

    The bad thing about the EVGA boards is their sometimes higher price tag.
