Intel X38: Where's the Beef?

by Gary Key on October 10, 2007 3:15 PM EST
X38 Details


We have the familiar Intel block diagram, which outlines essentially the same technology patterns seen in previous MCH/ICH families. The X38 MCH is paired with the ICH9 series of Southbridges that was introduced with the P35 chipset. Intel continues to utilize its Direct Media Interface (DMI) technology for the interconnect link between the MCH and ICH. The 2GB/s DMI capability has not changed with this latest offering and continues to offer more than enough bandwidth for most users.

We say "most users" as installing a six drive RAID 10 array, audio card, Turbo Memory card, and TV Tuner card along with fully utilizing the LAN and USB ports can provide enough interconnect clogging data that even the most hardened air traffic controller (i.e. chipset) would fly into a panic. However, that particular scenario is very rare on the desktop. Intel will be abandoning the DMI interface late next year when the second generation 45nm CPU, Nehalem, is introduced. Intel's CSI (Common System Interconnect) and IMC (Integrated Memory Controller) represent their long-awaited response to AMD's HyperTransport technology and do away with the current front side bus architecture.

The most significant changes in the X38 MCH over the P35 or 975X are the inclusion of two PCI Express x16 links, PCI Express 2.0, and official support for DDR3-1333. One of the features in the X38 MCH that Intel has been fairly quiet about is the revised Snoop feature. It's not what you think, as we can confirm that Snoop Dogg was not involved in this development, nor is it a feature that lets NFL head coaches steal signals or the NSA figure out what you had for breakfast.

While Intel's press slides simply tout the virtues of Intel's "faster memory access", the fact is Intel has revamped the X38's memory controller to include their revised "flexible clock crossing architecture" and improved prefetching circuitry. We have done a little digging into exactly what faster memory access means and it seems to revolve around the snoop cache buffer improvements. The snoop feature can be described very simply as an extra level of pseudo-cache resident in the MCH. It's not really cache, per se, but lines of recently cached memory reads. The purpose of this feature is to reduce memory read latencies in memory-intensive programs that read a lot of data in parallel. The most common scenario is in multiple-core computations where the cores are all manipulating shared data a little differently.

The primary benefit is that a separate memory read isn't needed every time data is accessed; instead the X38 MCH intelligently caches data and provides it when available. This also allows for a larger re-order buffer for read and write operations that further enhances memory performance. Every time the MCH must switch between reading and writing there is a wait period on the data lines, so the X38 MCH will wait until there are enough write requests stored before committing the data to memory.

Essentially, the MCH will "store" write requests and then "burst" them into memory when either the filter is full or the core requests data from memory that has not been written yet. So when the MCH writes this data it is coming straight from the pseudo-cache and this is what provides the "faster memory access" in the X38. It is almost like the chipset engineers followed their CPU brethren in the op re-ordering design in the Core 2 series, only it is applied to memory reads/writes and implemented by the MCH - or maybe it's just a highly refined feature from the Intel 870. In reality, we have noticed slightly improved latencies and read rates when compared to the P35, but nothing that would make us dance in the streets.
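
The store-and-burst behavior described above can be sketched as a toy model in Python. This is only an illustration of the general idea, not Intel's implementation: the class name, the buffer depth of four entries, and the dictionary standing in for DRAM are all assumptions made for the example.

```python
class WriteBurstBuffer:
    """Toy model of a memory controller that stores writes and bursts them out.

    Hypothetical illustration only; names and sizes are assumptions.
    """

    def __init__(self, capacity=4):
        self.capacity = capacity  # assumed buffer depth, not an X38 figure
        self.pending = {}         # address -> value, held in the controller
        self.memory = {}          # stands in for DRAM
        self.bursts = 0           # how many times the bus switched to writing

    def write(self, addr, value):
        # Store the write request instead of committing it immediately.
        self.pending[addr] = value
        if len(self.pending) >= self.capacity:
            self.flush()          # buffer full: burst everything to memory

    def read(self, addr):
        # A read of data that has not been written yet forces a burst first.
        if addr in self.pending:
            self.flush()
        return self.memory.get(addr)

    def flush(self):
        # Commit all stored writes in one burst (one read/write turnaround).
        if self.pending:
            self.memory.update(self.pending)
            self.pending.clear()
            self.bursts += 1


buf = WriteBurstBuffer(capacity=4)
for addr in range(3):
    buf.write(addr, addr * 10)    # three writes are held, nothing committed
value = buf.read(1)               # pending address: triggers a single burst
```

In this sketch three separate writes cost one bus turnaround instead of three, which is the saving the paragraph above describes.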

Along with support for both DDR2 and DDR3, the X38 introduces official support for DDR3-1333 while the P35 only officially supports DDR3-1066 - even though we have not had any issues running P35 boards past DDR3-2000 with the right memory. Intel is also introducing their Extreme Memory Profile (XMP) technology with the X38 roll-out. XMP is just like the Enhanced Performance Profile (EPP) technology launched by NVIDIA last year. It simply is a means of adding additional memory timing and clock speed profiles to the DDR3 SPDs in the same way EPP does for DDR2 memory. These profiles are designed to make it easier for users to basically auto-tune or overclock their memory/system using specific XMP profiles instead of manually changing individual timings and bus speeds in the BIOS.

Fortunately for those wanting to upgrade to the latest Intel chipset and not needing a new home equity loan to purchase DDR3 modules, the X38 MCH also supports DDR2 memory. In early testing with retail boards and BIOS releases, we are not seeing any real improvements over the DDR2 based P35 boards. Since the X38 is not "optimized" for DDR2 memory operations, this is both good and bad news. However, until DDR3 memory prices subside, an upgrade from the P965 or 975X should not be too painful once boards are in plentiful supply.

The last new feature from Intel is the support for the PCI Express 2.0 standard. In fact, Intel has the first desktop chipset on the market that supports this new standard. (AMD's RD790 should appear in the not-too-distant future.) The big news is the PCI Express 2.0 specification doubles the interconnect bit rate from 2.5 GT/s to 5 GT/s per lane and is completely cross-compatible with the 1.0/1.1 specifications.

This means several things. First is that the performance increase to 5 GT/s effectively increases the aggregate bandwidth of a 16-lane link to approximately 16GB/s (maximum theoretical bandwidth of 8GB/s in each direction simultaneously before overhead), double that of the 1.1 spec. Real bandwidth per lane will be up to 4Gb/s (estimated to be around 500MB/s per lane/pin on average in current testing) in each direction given the 8b/10b encoding method used to transmit the data.
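
For readers who want to check the arithmetic, here is a minimal Python sketch of the numbers above. The function names are ours, and the figures cover only the 8b/10b line encoding; protocol overhead such as packet headers is ignored.

```python
def lane_bandwidth(gt_per_s, payload_bits=8, line_bits=10):
    """Usable one-direction bandwidth of a single lane after 8b/10b encoding.

    Returns (Gb/s, MB/s). Packet and protocol overhead is not modeled.
    """
    gbit_per_s = gt_per_s * payload_bits / line_bits
    mbyte_per_s = gbit_per_s * 1000 / 8
    return gbit_per_s, mbyte_per_s


def link_bandwidth_gbytes(gt_per_s, lanes):
    """Return (per-direction GB/s, aggregate GB/s) for a link of `lanes` lanes."""
    _, mbyte_per_s = lane_bandwidth(gt_per_s)
    per_direction = mbyte_per_s * lanes / 1000
    return per_direction, per_direction * 2


gen2_lane = lane_bandwidth(5.0)             # PCI Express 2.0, one lane
gen2_x16 = link_bandwidth_gbytes(5.0, 16)   # PCI Express 2.0, x16 link
gen1_x16 = link_bandwidth_gbytes(2.5, 16)   # PCI Express 1.1, x16 link
print(gen2_lane, gen2_x16, gen1_x16)
```

With these inputs a 5 GT/s lane works out to 4Gb/s, or 500MB/s, in each direction, and a x16 link to 8GB/s per direction and 16GB/s aggregate, exactly double the 1.1 figures.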

This increase in bandwidth comes courtesy of faster signaling rather than wider data paths, which is why a 2.0 card is compatible in a 1.0/1.1 slot and vice versa. However, a 1.0/1.1 card will only work at its rated speed in a 2.0 slot and a 2.0 card is limited to the 1.0/1.1 slot speed. When two 16-lane PCI Express ports are utilized, the second port will support PCIe 1.1 cards at x8, x4, or x1 speeds or PCI Express Graphics cards at x16 or x1 operation.
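
Put another way, a link runs at the slower of the two ends. A minimal sketch, assuming each side simply advertises its maximum per-lane rate (the function name and the rate table are ours, and real link training involves considerably more than this):

```python
# Maximum per-lane signaling rate in GT/s for each spec generation.
MAX_RATE_GT = {"1.0": 2.5, "1.1": 2.5, "2.0": 5.0}


def negotiated_rate(card_spec, slot_spec):
    """Return the per-lane rate (GT/s) a card and slot would settle on."""
    return min(MAX_RATE_GT[card_spec], MAX_RATE_GT[slot_spec])


rates = {
    "2.0 card in 2.0 slot": negotiated_rate("2.0", "2.0"),
    "1.1 card in 2.0 slot": negotiated_rate("1.1", "2.0"),
    "2.0 card in 1.1 slot": negotiated_rate("2.0", "1.1"),
}
print(rates)
```

Only the first combination reaches 5 GT/s; either mixed pairing falls back to 2.5 GT/s.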

While most will concur the reasoning behind the upgrade to the PCI Express standard is to improve graphics bandwidth, we think several other revised features also played a part in the early adoption of this specification. These features include dynamic link speed management, link bandwidth notification, access control services, and the power limit redefinition protocol. Of these, the dynamic link speed management and power limit redefinition are the two most interesting features in our opinion.

The dynamic link speed feature includes support for software controls that can dynamically throttle lane speeds. The power limit redefinition feature allows the system to redefine the slot power limits based upon the device inserted into that slot. The latter feature will work well with the new 300W electro-mechanical (CEM) spec. This new specification, which works on either PCI Express standard, provides full support for the 8-pin auxiliary power connectors seen on video cards like the HD 2900 XT and upcoming NVIDIA G9x offerings. The 8-pin PCIe power connector is capable of delivering up to 150W of power compared to the 75W limit in the 6-pin PCIe power plug. The PCI Express x16 slot on the motherboard is still limited to 75W. In total, up to 300W is available for each x16 PCIe slot on the motherboard, and hopefully we will not reach the day where that capability will need to be increased. (The HD 2900 XT design had us wondering for a while....)
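
The 300W figure is simply the sum of the individual limits quoted above. A small sketch of the budget, assuming a card fed by the slot plus one 6-pin and one 8-pin connector (the connector mix and the function name are our assumptions for the example):

```python
SLOT_W = 75        # PCI Express x16 slot limit
SIX_PIN_W = 75     # 6-pin PCIe power plug limit
EIGHT_PIN_W = 150  # 8-pin PCIe power connector limit


def card_power_budget(six_pin=0, eight_pin=0):
    """Maximum power in watts available to a card from slot plus aux connectors."""
    return SLOT_W + six_pin * SIX_PIN_W + eight_pin * EIGHT_PIN_W


slot_only = card_power_budget()
two_six_pin = card_power_budget(six_pin=2)
six_plus_eight = card_power_budget(six_pin=1, eight_pin=1)
print(slot_only, two_six_pin, six_plus_eight)
```

A card with one 6-pin and one 8-pin connector reaches the 300W ceiling, while a card with two 6-pin plugs tops out at 225W.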


14 Comments


  • Lord Evermore - Sunday, October 14, 2007 - link

    quote:

    Real bandwidth per lane will be up to 4Gb/s (estimated to be around 500MB/s per lane/pin on average in current testing) in each direction


    Estimated? In testing? 4Gbps is exactly 500MBps given the encoding used. If you mean that actual performance and throughput gets close to the theoretical maximum, that's a bit odd a way to put it. You could also say that SATA's real burst bandwidth has been estimated and tested to be "around" 300MBps.
  • IntelUser2000 - Thursday, October 11, 2007 - link

    Now there are other reviews about X38 on the web, and here are two that I find interesting:

    http://www.firingsquad.com/hardware/gigabyte_x38_d...
    http://www.ocworkbench.com/2007/gigabyte/GA-X38-DQ...

    The Crossfire benchmarks seem to offer significant advantage of X38 over P35.
  • DigitalFreak - Thursday, October 11, 2007 - link

    quote:

    The Crossfire benchmarks seem to offer significant advantage of X38 over P35.


    Of course they do. P35 boards run Crossfire in a x16 + x4 config, while X38 is x16 + x16. The only exception is some Asus P35 board hacked to run x8 + x8 via a PCI-E switch.
  • IntelUser2000 - Friday, October 12, 2007 - link

    Now I am thinking Intel shouldn't have bothered to make a faster memory controller on the X38 at all.

    -Average performance increase has been so far less than 2%. In some tests that's considered margin of error!
    -X38 added 10 or so more watts over P35
    -X38 also is rumored to have a 140mm2 die

    A theoretical 9% improvement for going for a motherboard that costs $100 more in some cases. X38 added 10 or so more wasted watts, a new die, for nothing. All they should have done is take a P35 chipset and add the ability to do 2xPCI-E x16.

    In fact, no manufacturer of Intel chipsets should bother with it. There used to be a time when high end chipsets were actually faster. 875P with PAT offered 3-5% real world performance increases over the 865 chipsets.

    Luckily, by the time Nehalem is out with integrated memory controller(hopefully, at least all the mainstream versions), this stupidity should be over. IMC will do far more than what futile advancements made on the external chipsets will ever do.
  • avaughan - Wednesday, October 10, 2007 - link

    On page one you mention improved "4x1GB compatibility". Do you test 4x2GB? 2x1GB+2x2GB?
  • wingless - Wednesday, October 10, 2007 - link

    Under the list of features these chipsets have it states that Intel chipsets ONLY support Crossfire and not SLI. If they support both I would imagine they would list both as supported, not just Crossfire. If Intel only supports AMD's multi-card setup then that's a big win for ATI. Jeez, Nvidia better whore out their SLI technology because I know a lot of people that run Nvidia only.
  • mongo lloyd - Wednesday, October 10, 2007 - link

    Any Intel board sold is an AMD CPU/platform not purchased.
  • microAmp - Wednesday, October 10, 2007 - link

    I can't get to page 2, link looks ok, just dead ends for me and takes me to search.anandtech.com.

    I also tried print page to read, empty. :(
  • microAmp - Wednesday, October 10, 2007 - link

    Ignore me, worky now.
  • 8steve8 - Wednesday, October 10, 2007 - link

    I have a few questions:

    1. Will the Intel X38 desktop board be overclockable in any usable sense?

    2. Is it true the G35 can only talk to the ICH8 (and not ICH9)? This seems odd...

    3. I guess if you were going to buy an uber-high-end P35, this will be a great chipset to look forward to... but who's dying to spend $250+ on a motherboard for a few % improvement in the real-world user experience, when you can get perfectly fine P35/G33 motherboards for ~$120 that overclock to like 425MHz? Especially when Nehalem, with its integrated memory controller (which won't work in the X38), will likely wipe the floor with anything you buy today from Intel.

    For me the choice is clear: save $150 and go with a decent G33/P35/G35 board, then I'll be better able to afford Nehalem next year, which will make this system irrelevant anyway.... Penryn is a nice improvement, but we are talking ~5% per clock improvements, and a smaller power envelope... don't expect anything huge.


    who's spending $300 on a motherboard?
    and who would (in their right mind) build a system with ddr3. (now)





    We are all eagerly awaiting G35 in the channel.
