Reliability Features

Intel claims no fewer than 20 new RAS features for the new Xeon, most of them borrowed from the Itanium. Some of these RAS features are for the most paranoid of IT professionals. Let's face it: who has ever experienced a server crash caused by a bad CPU? For each CPU failure there must be a million failures caused by buggy software. So we are not too concerned if a competing CPU lacks "hot physical CPU board" swapping, and it is reasonable to think that most IT professionals, even those running mission-critical applications, will agree. Still, the most paranoid people usually have the highest budgets: the mission-critical applications they manage could cost them their job if they go down, not to mention the millions of dollars their company might lose. So those people tend to favor a very long list of reliability features.

All ironic remarks about paranoid people aside, most of these RAS features make a lot of sense even for the "down to earth" people, the rest of us. Memory fails a lot more often than CPUs do. According to Google research, about 8% of DIMMs see at least one correctable error per year, and 0.22% suffer an uncorrectable error. These machines can have up to half a terabyte (!) of RAM, and with 32 to 64 DIMMs per server an uncorrectable error is quite conceivable. So it is no surprise that most of the RAS features try to cope with failing DRAM chips. And as the number of VMs you consolidate on one machine increases, so does the risk of a bad VM bringing the complete host machine down.
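
To put those numbers in perspective, here is a back-of-the-envelope estimate (our own, assuming errors strike DIMMs independently, which is a simplification) for a fully populated 64-DIMM server:

```latex
% Expected number of DIMMs with a correctable error per year:
%   64 \times 0.08 \approx 5 \text{ DIMMs}
% Probability of at least one uncorrectable DIMM error per year:
\[
  P_{\text{uncorrectable}} = 1 - (1 - 0.0022)^{64} \approx 0.13
\]
```

In other words, roughly one in eight of these fully loaded servers can expect an uncorrectable memory error in a given year.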

The idea behind the Machine Check Architecture is that errors in memory and the L3 cache are detected before they are actually "used" by the running software. A firmware-based memory scrubber constantly checks ("patrols") for unrecoverable errors: errors that ECC cannot correct. Those errors will make the (ESX) hypervisor throw a purple screen (in most cases much worse than the famous blue screen) to make sure your data does not get corrupted.
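
To make patrol scrubbing concrete, here is a minimal software sketch. This is our own toy illustration: the real scrubber runs in hardware/firmware and uses SECDED ECC over 64-bit words, not the Hamming(7,4) code below. The principle is the same: sweep all of memory in the background, recompute the ECC syndrome for each word, and repair single-bit flips before the running software ever reads them.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy Hamming(7,4): 4 data bits become a 7-bit codeword that can
 * correct any single-bit error. Real DIMM ECC works on 64-bit words
 * plus 8 check bits, but the principle is identical. */
static uint8_t hamming_encode(uint8_t d)
{
    uint8_t d1 = d & 1, d2 = (d >> 1) & 1, d3 = (d >> 2) & 1, d4 = (d >> 3) & 1;
    uint8_t p1 = d1 ^ d2 ^ d4;
    uint8_t p2 = d1 ^ d3 ^ d4;
    uint8_t p3 = d2 ^ d3 ^ d4;
    /* bit positions 1..7 = p1 p2 d1 p3 d2 d3 d4 */
    return p1 | (p2 << 1) | (d1 << 2) | (p3 << 3) | (d2 << 4) | (d3 << 5) | (d4 << 6);
}

/* Returns the corrected codeword; *fixed is set if a bit was repaired. */
static uint8_t hamming_scrub(uint8_t cw, int *fixed)
{
    uint8_t b[8];
    for (int i = 1; i <= 7; i++) b[i] = (cw >> (i - 1)) & 1;
    /* Each parity bit covers the positions whose index contains its bit;
     * the syndrome therefore spells out the position of a flipped bit. */
    uint8_t s1 = b[1] ^ b[3] ^ b[5] ^ b[7];
    uint8_t s2 = b[2] ^ b[3] ^ b[6] ^ b[7];
    uint8_t s3 = b[4] ^ b[5] ^ b[6] ^ b[7];
    uint8_t syndrome = s1 | (s2 << 1) | (s3 << 2);
    *fixed = 0;
    if (syndrome) {                     /* non-zero syndrome = error found */
        cw ^= (uint8_t)(1 << (syndrome - 1));
        *fixed = 1;
    }
    return cw;
}

int main(void)
{
    uint8_t mem[16];
    for (uint8_t i = 0; i < 16; i++) mem[i] = hamming_encode(i);

    mem[5] ^= 1 << 4;                   /* simulate a transient bit flip */

    /* The "patrol": periodically sweep every word, correcting as we go,
     * so single flips never accumulate into uncorrectable errors. */
    for (int i = 0; i < 16; i++) {
        int fixed;
        mem[i] = hamming_scrub(mem[i], &fixed);
        if (fixed) printf("patrol scrub repaired word %d\n", i);
    }
    return 0;
}
```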

With MCA in the hardware and support in both the firmware and the hypervisor, data errors are transmitted to the hypervisor's error handler before they cause havoc. The memory location is placed in quarantine (poisoned data containment) and the CPU will not use that address again. The software handler can then retry the data access, and as a result the hypervisor keeps running. This recovery mechanism can of course only work if the error is caused by an occasional glitch and not by bad hardware.
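
A toy model of that containment flow follows. This is our own sketch with hypothetical names (mce_handler, alloc_frame); the real handshake between the MCA hardware, firmware, and the ESX error handler is far more involved. The point it illustrates: once an uncorrectable error is reported, the affected frame is poisoned so it is never handed out again, while the rest of the system keeps running.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_FRAMES 64

static bool poisoned[NUM_FRAMES];   /* quarantine list: frames never reused */
static bool allocated[NUM_FRAMES];

/* Called by the (hypothetical) machine-check handler when the hardware
 * reports an uncorrectable error in a physical frame. */
static void mce_handler(int frame)
{
    poisoned[frame] = true;         /* poisoned data containment */
    allocated[frame] = false;       /* pull the frame out of service */
    printf("MCE: frame %d quarantined, system keeps running\n", frame);
}

/* The allocator simply skips quarantined frames. */
static int alloc_frame(void)
{
    for (int f = 0; f < NUM_FRAMES; f++) {
        if (!allocated[f] && !poisoned[f]) {
            allocated[f] = true;
            return f;
        }
    }
    return -1;                      /* out of healthy memory */
}

int main(void)
{
    int a = alloc_frame();          /* gets frame 0 */
    mce_handler(a);                 /* a glitch hits frame 0: quarantine it */
    int b = alloc_frame();          /* frame 0 is skipped; we get frame 1 */
    printf("allocated frame %d after quarantine of frame %d\n", b, a);
    return 0;
}
```

Note the limitation the article mentions: a DIMM that keeps failing, rather than glitching once, will steadily shrink the pool of healthy frames.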

So the basic idea behind these increased reliability features is that the more memory you have, the higher the chance that an occasional glitch occurs, and thus the handier features like demand and patrol scrubbing and recovery from a single DRAM device failure become. You will need something better than simple ECC. The same is true for QPI: as the number of Nehalem EX CPUs and the speed of the QPI links increase, the chances of bad addresses or bad data increase as well.
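
QPI guards its transfers with a CRC on each flit plus link-level retry. The sketch below is our own toy model of that mechanism, using a simple CRC-8; the actual QPI flit format and CRC polynomial differ. The receiver verifies the checksum and, on a mismatch, simply requests the transfer again, so a transient glitch on the wire never reaches the software:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simple bitwise CRC-8 (polynomial 0x07); QPI's real per-flit CRC differs. */
static uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07) : (uint8_t)(crc << 1);
    }
    return crc;
}

/* Simulated link: corrupts one bit on the first attempt only. */
static void transmit(const uint8_t *src, uint8_t *dst, size_t len, int attempt)
{
    memcpy(dst, src, len);
    if (attempt == 0)
        dst[2] ^= 0x10;             /* transient glitch on the wire */
}

int main(void)
{
    uint8_t payload[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint8_t sent_crc = crc8(payload, sizeof payload);
    uint8_t rx[8];

    for (int attempt = 0; ; attempt++) {
        transmit(payload, rx, sizeof rx, attempt);
        if (crc8(rx, sizeof rx) == sent_crc) {   /* receiver checks CRC */
            printf("attempt %d: CRC ok, data accepted\n", attempt);
            break;
        }
        printf("attempt %d: CRC mismatch, requesting retry\n", attempt);
    }
    return 0;
}
```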

Comments

  • JohanAnandtech - Tuesday, April 13, 2010 - link

    "Damn, Dell cut half the memory channels from the R810!"

    You read too fast again :-). Only in the quad CPU config. In a dual CPU config, you get four memory controllers, which each connect to two SMBs. So in a dual config, you get the same bandwidth as you would in another server.

    The R810 targets those who are not after the highest CPU processing power, but want the RAS features and 32 DIMM slots. AFAIK,
  • whatever1951 - Tuesday, April 13, 2010 - link

    2 channels of DDR3-1066 per socket in a fully populated R810, and if you populate 2 sockets, you get the flex memory routing penalty... damn!!! R810 sucks.
  • Sindarin - Tuesday, April 13, 2010 - link

    whatever1951 you lost me @ Hello.........................and I thought Sauron was tough!! lol
  • JohanAnandtech - Tuesday, April 13, 2010 - link

    "It is hard to imagine 4 channels of DDR3-1066 to be 1/3 slower than even the westmere-eps."

    On one side you have a parallel, half-duplex DDR3 DIMM. On the other side of the SMB you have a serial, full-duplex SMI. The buffers might not perform this transition fast enough, and there has to be some overhead. I am also still searching for the clock speed of the IMC. The SMIs are on a different (I/O) clock domain than the L3 cache.

    We will test with Intel's/QSSC quad CPU system to see whether the FlexMem bridge has any influence, but I don't think it will do much. It might add a bit of latency, but essentially the R810 works like a dual CPU machine with four IMCs, just like any other (dual CPU) Nehalem EX server system would.
  • whatever1951 - Tuesday, April 13, 2010 - link

    Thanks for the useful info. The R810 then doesn't meet my standards.

    Johan, is there any way you can get your hands on an R910 4-processor system from Dell and bench the memory bandwidth, to see how much that FlexMem chip costs in terms of bandwidth?
  • IntelUser2000 - Tuesday, April 13, 2010 - link

    The Uncore of the X7560 runs at 2.4GHz.
  • JohanAnandtech - Wednesday, April 14, 2010 - link

    Do you have a source for that? Must have missed it.
  • Etern205 - Thursday, April 15, 2010 - link

    I think AT needs to fix this "RE:RE:RE...:" problem?
  • amalinov - Wednesday, April 14, 2010 - link

    Great article! I like the way in which you describe the memory subsystem - I have read the Intel datasheets and many news articles about Xeon 7500, but your description is the best so far.

    You say "So each CPU has two memory interfaces that connect to two SMBs that can each drive two channels with two DIMMS. Thus, each CPU supports eight registered DDR3 DIMMs ...", but if I do the math it seems: 2 SMIs x 2 SMBs x 2 channels x 2 DIMMs = 16 DDR3 DIMMs, not 8 as written in the second sentence. Later in the article I think you mention 16 at different places, so it seems it is realy 16 and not 8.

    What about an Itanium 9300 review (including general background on the plans of OEMs/Intel for the IA-64 platform)? A comparison of the scalability (HT/QPI), memory, and RAS features of Xeon 7500, Itanium 9300, and Opteron 6000 would be welcome. I would also like to see a performance comparison, with appropriate applications for the RISC mainframe market (HPC?), against 4- and 8-socket AMD, Intel Xeon, Intel Itanium, POWER7, and the newest SPARC.
  • jeha - Thursday, April 15, 2010 - link

    I think you really should review the IBM 3850 X5.

    They have some interesting solutions when it comes to handling memory expansion, etc.
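
For readers following the bandwidth debate in the thread above, the theoretical peak numbers (our own back-of-the-envelope figures, assuming a 64-bit channel at DDR3-1066 rates) frame what is at stake:

```latex
\[
  1066\,\mathrm{MT/s} \times 8\,\mathrm{B} \approx 8.5\,\mathrm{GB/s\ per\ channel}
  \;\Rightarrow\;
  2\ \mathrm{channels} \approx 17\,\mathrm{GB/s},\qquad
  4\ \mathrm{channels} \approx 34\,\mathrm{GB/s\ per\ socket}
\]
```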
