Data Corruption - not Political Corruption - with NVIDIA’s Latest Boards

Our performance board roundups ended up delayed for a variety of reasons, but we will be back on track next week. Every conceivable problem has hit us from shoddy BIOS releases to repeated problems getting Crysis to benchmark correctly under 64-bit Vista. We are still not sure about the latter problem, as one image works and another does not on identical hardware and software setups. We finally got to the point of being able to benchmark, but it is not a process we would wish upon our worst enemies.

However, none of that compares to the data corruption problems we are seeing intermittently on the 790i and 780i platforms. We honestly thought NVIDIA had solved these problems back in 2006 on the 680i platform. Since the MCP has not changed, it is disconcerting to us that this problem seems to be rearing its ugly head again. This time, the data corruption problems appear contained to memory overclocking, especially on the 790i boards. We are not talking massive overclocks here, but apparently hitting the right combination of FSB rates around 400MHz and memory speeds above DDR3-1600 seems to trigger our problems. Also, we have been able to reach higher DDR3 speeds with absolute stability on the 790i than on the X48 during extreme overclocking, so this problem is even more perplexing to us.

On the 780i boards, the magical combination is right above 400MHz FSB (1600 QDR) with memory unlinked anywhere from DDR2-900 to DDR2-1200. Our 780i problems have been minor for the most part, but the underlying problem is that after the systems recover from a BSOD, we typically have stability problems or gremlin behaviors until we reload the system. This same problem can occur on Intel or AMD chipset boards, but it is extremely rare in our experience to date unless we push the memory beyond reasonable settings.
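
For readers following the clock math, here is a minimal sketch of the arithmetic behind those settings. It assumes the usual conventions: the Intel front-side bus is quad-pumped (four transfers per base clock) and DDR memory performs two transfers per clock. The function names are ours, purely for illustration.

```python
from fractions import Fraction


def fsb_qdr(fsb_mhz):
    """Effective quad-pumped rate for a given FSB base clock (MHz)."""
    return fsb_mhz * 4


def memory_clock(ddr_rating):
    """Memory I/O clock in MHz for a DDR rating, e.g. 900 for DDR2-900."""
    return Fraction(ddr_rating, 2)


def mem_to_fsb_ratio(fsb_mhz, ddr_rating):
    """Ratio of memory clock to FSB base clock when running unlinked."""
    return memory_clock(ddr_rating) / fsb_mhz


print(fsb_qdr(400))                  # 400MHz FSB as a QDR figure
print(mem_to_fsb_ratio(400, 900))    # DDR2-900 against a 400MHz FSB
print(mem_to_fsb_ratio(400, 1200))   # DDR2-1200 against a 400MHz FSB
print(mem_to_fsb_ratio(400, 1600))   # DDR3-1600 against a 400MHz FSB
```

Under these assumptions, the trouble spots described above sit at memory-to-FSB ratios between 9:8 and 3:2 on the 780i and at 2:1 or higher on the 790i.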


Back to the 790i boards: the data corruption problems have occurred more frequently, as the boards (and their early BIOS revisions) seem more susceptible to faulty behavior when pushing the memory above DDR3-1600 with low latencies. We have not nailed down exact settings at this point, as they tend to fluctuate between test sessions and boards. What we do know is that we are tired of constantly reloading our images after making minor changes to our settings.

It is possibly only a coincidence, but over the past couple of months we have lost two WD Raptors, a couple of Samsung 500GB drives, and a WD 250GB drive while benchmarking the 790i/780i boards. It may have just been time for these drives to meet their maker, as our particular samples have spent significant time running benchmarks almost 24/7 over the past year or so (it might not sound like a long time, but we totally abuse the drives when testing in this manner). We have certainly had hard drive failures when testing other chipsets, ranging from complete mechanical breakdowns to index tables being so corrupted that we could not fully recover the disk. It could just be bad luck on our part.

However, we think it goes deeper than that. After the first roundups this coming week, we plan to delve into it. The reason is that we have not had any data corruption problems testing our 650i/750i, GeForce 6100/6150, or GeForce 7050 boards, none of which utilize the MCP in the 680/780/790i boards. Of course, this could be tied to the fact that we do not push the boards as hard, but knowing about the previous 680i problems makes us think the current BIOS code or Vista drivers need to be revised again.

Other problems

We share test notes on an almost continual basis with each other when testing boards. We thought some of the test notes from our upcoming roundup would be interesting. In all fairness to NVIDIA, we are including our X48 thoughts as we wrap up testing.

790i test notes:

a) CPU multiplier likes to change at will, causing an inability to POST after changing BIOS options. (Problem is likely linked to bad NVIDIA base code).

b) Poor memory read performance above 475FSB unless you enable “P1” and “P2”, settings whose operation NVIDIA refuses to document or explain.

c) EVGA/XFX (NVIDIA reference design) lacks support for tRFC tuning - high density DDR3 configurations often refuse to work unless the module SPDs are tuned from the manufacturer. (This makes them needlessly slow in low-density configurations.)

d) The chipset does not do a very good job of balancing read vs. write priorities with respect to memory access - copy scores lower than X38/X48.

e) Regardless of what NVIDIA says, we think PCI-E 2.0 (and 1.x) implementation is still better on Intel’s Express chipsets - give us SLI on Intel to prove it!!!

f) Possible problem with NVIDIA reference design: sustained overclocked operation at >~1.9V for VDIMM may cause critical failure of 790i (Ultra) SPP. This does not seem to affect ASUS S2E design and is the most critical issue facing the board; we need to verify before making recommendations.

g) Possible HDD corruption issues. (We lost the two 74GB WD Raptors so far…)

X48 test notes:

a) Chipset defaults to tRD values that are excessively loose and are not competitive with NVIDIA’s new 790i. The problem is most MB manufacturers do not allow this to be specifically tuned in the BIOS.

b) DMI interface (x4 PCI-E link) is sloooow….X38/X48 should have been paired with ICH10(R), which will be PCI-E 2.0 compliant on the link interface.

c) Haven’t found an Intel X48 board yet that will handle 8GB of DDR3 properly, even though this is a major bullet for chipset support - board or memory makers? (We need to test this on the Intel DX48BT2 that just arrived.)

d) Chipset runs HOT…might even be hotter than 790i. Intel should have shrunk this thing long ago!
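
To put note (b) in perspective, the following sketch estimates per-direction bandwidth for an x4 link at PCI-E 1.x and 2.0 signaling rates. It assumes DMI behaves like a standard x4 PCI-E link, as the note describes, with 8b/10b encoding. The function name is illustrative, and real-world throughput will be lower because of protocol overhead.

```python
def pcie_bandwidth_mb_s(transfer_rate_mt_s, lanes):
    """Approximate per-direction bandwidth in MB/s for a PCI-E 1.x/2.0 link."""
    # 8b/10b encoding: 8 payload bits for every 10 bits on the wire.
    payload_mbit_s = transfer_rate_mt_s * 8 // 10
    # Convert megabits to megabytes, then scale by lane count.
    return payload_mbit_s // 8 * lanes


gen1_x4 = pcie_bandwidth_mb_s(2500, 4)  # PCI-E 1.x: 2.5 GT/s per lane
gen2_x4 = pcie_bandwidth_mb_s(5000, 4)  # PCI-E 2.0: 5.0 GT/s per lane
print(gen1_x4, gen2_x4)
```

On these assumptions a 2.0-rate link would double the available bandwidth between the two chips, which is the gap the note is complaining about.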

That is it for now and we will have additional information in the first roundup. Now a take on Gigabyte.


81 Comments


  • braddy752 - Tuesday, April 8, 2008

    This Gigabyte case has been flaming like gasoline. The board maker should take some responsibility for publishing mistaken information.

    However, we all knew from the start that this is the chipset maker's issue: delivering overly aggressive assumptions about supported processors that the chip maker had not validated. Not to mention the bugs created by Nvidia.

    Nvidia has been keeping quiet about the issues found by journalists and end-users, and making its so-called partners (the board makers) deal with them.

    So, who should be blamed for the faulty products? The board maker or the chip maker? For me... I'll still trust the board makers who deliver good quality products, and put the root cause on the chip maker.
  • aguilpa1 - Monday, April 7, 2008

    I approached Anandtech a looong time ago way back when Yorkfields first came out about 680iSLI not being compatible as I soon found out. They ignored my posts.

    I had an EVGA 680i SLi and they provided their customers with a 780iSLI step up, which I am now running trouble free. Maybe you guys should have gone EVGA.
  • Tanclearas - Monday, April 7, 2008

    Gary,

    I'm not sure if you recall, but I was the one you helped get in contact with Nvidia regarding the nforce4 corruption issues related to the hardware firewall. I still have all of the email messages associated with that. You definitely tried harder than Nvidia in working with me on identifying the issue. I responded to the suggestions by the Nvidia rep quickly, and provided as detailed information as I could to them. I offered to completely reformat my system, and follow any specific directions they wanted, but their only suggestion for me was to install a driver that I had already tested with.

    You followed up with me, and contacted Nvidia again, but they still never contacted me again. You eventually let me know that Nvidia's solution was to discontinue support of the hardware firewall, even though that was a major "bullet point" on the feature list of nforce4.

    I won't go so far as to say "I'll never buy nvidia again!", but I definitely won't buy until the products have been on the market an extended period and there is reasonable confirmation about which "features" of their products actually work as advertised.
  • chucky2 - Monday, April 7, 2008

    Gary,

    Personally, I think instead of working behind the scenes with the mobo manufacturers, you ought to publish the review and slam them all for dying. The fact seems to be that these manufacturers will just not fix their ways until it blows up in their faces.

    Maybe being embarrassed on the front page of AnandTech in a full out review will serve as sufficient embarrassment for them to put enough engineering into their products so those of us who buy a 125W Phenom and OC it (through the BIOS options the manufacturers themselves provide) the boards won't fry.

    In college teaching the saying goes Publish or Perish...maybe for the manufacturers it should be Engineer it or be Embarrassed....

    Chuck

    P.S. Now your 780G review will be further delayed because of their shoddy underengineering....should give AnandTech time to review a couple of the past 690G boards to see how they compare and if they have the same problems. I just looked, and the Gigabyte 690G boards have the 6000+ and 6400+ listed as supported, and the 95W Phenoms as well. Should make for a good comparison....
  • whatthehey - Monday, April 7, 2008

    I completely support your suggestion: reviewers should put the reviews out there as soon as a product is available on the retail shelves. By all means, go ahead and delay a preview or a first look while you wait for a BIOS respin, but when boards are available to the average Joe shopping at Newegg, it's time for manufacturers to put up or shut up.

    I respect all the hard work you do, Gary, but I'd much rather read reviews of imperfect products than to wait (and wait... and waaaaait.......) for a review of an ephemeral "perfect" motherboard and BIOS.

    As for the motherboards, I'd say you should stick to reviewing whatever CPUs they list as working and put in information about what CPUs *don't* make that list. Then let us know how the board works in practice. If it's flaky and your article ends up killing sales for a board, that's just too bad.

    Finally, less time spent on extreme overclocking and more time spent getting articles out the door would be appreciated. I don't use water, let alone phase change or LN2, and I'm not going to push my system to the ragged edge. I'll take a reasonable overclock if it's easy to achieve. Spending hours/days/months tweaking and adjusting various BIOS settings to get the last .05% performance boost means nothing to 99.999% of people. Let the XS braggarts worry about the ORB charts!
  • nubie - Monday, April 7, 2008

    I don't know if anything I have ever seen on Anandtech qualifies as extreme overclocking (if you want that, go to vr-zone.com and see Shamino or Kingpin), unless we are talking about 3.8+ GHz air-cooled CPUs, and I guess that is a little extreme, but only as far as the retention mechanism goes. I don't recall any voltage modifications or phase-cooling on this site. (If you can buy it for $100 or less, and put it on the CPU in one piece, without soldering, vacuum pumps, or bleeding of coolant, and it fits in the case, I don't really consider it extreme.)

    I think what they point to here is that general mild overclocking will completely destroy a motherboard with certain CPU's, CPU's that are supposed to be supported and should by rights have plenty of headroom for worst-case scenario running, and should thus be exploitable by their "best case" test methodology with a mild overclock.
  • Dsjonz - Monday, April 7, 2008

    I agree with whatthehey. "Warts and all" reviews TODAY are what many of us want. But despite the overclocked CPU limitations of the initial crop of these three 780G motherboards, I won't "waaaaait" for this issue to be resolved with more expensive board respins six months from now. I will be buying a 780G next week for my HTPC hardware refresh. Why?

    I'm buying a 780G board today because they offer precisely what I have been waiting a long time for. All three deliver full-featured low-power MATX boards, all hitting the feature set sweet-spot for HTPC/Windows Home Server/general productivity use -- and all offered at the crucial under-$100 "single-spouse-decision" pricepoint.

    Also consider what these 780G boards are NOT. They are clearly not oriented toward the "addled overclocker/uber-gamer/power-workstation" crowd. Yet, that's how they are being reviewed and judged. Am I the only one who is objecting to a prevailing pattern among many PC reviewers today to evaluate non-gaming/overclocking MATX motherboards only for their overclocking and gaming prowess?

    Let's separate the issues. The beef is with the three motherboard makers who should have prominently listed support limitations for overclocked CPUs. It's hard to believe that all three deliberately under-engineered their products, but it's not at all hard to believe that rushed and inexperienced product marketing staffers at all three companies either ignored engineering caveats or were "out of the loop" about these issues and did a "cut-and-paste" of standard product requirements on the specifications section of the datasheet.

    Unlikely, you say? I'm in the business, and it happens all the time.
  • garydale - Monday, April 7, 2008

    Not quite what the originator of that phrase had in mind, but it makes me very happy I stayed with the lower power (95W) version of the Phenom. I have an inexpensive all-in-one mainboard that doesn't seem to be having any problems with the 9500 (B2). However, until reading this article, I thought I was just being energy efficient.

    I'm also happy that we have sources like this to turn to. I've never paid much attention to CPU compatibility charts before, naively believing that if it was the right socket, at worst a BIOS upgrade would allow the processor to work. Now it seems I have another problem to worry about.
  • Johnniewalker - Monday, April 7, 2008

    Saved me lots of time and headaches!
  • anindrew - Monday, April 7, 2008

    I've been an avid reader of anandtech.com since 2000 or so. I am very impressed by how candid and gutsy your article is. Besides benchmark and real world tests, every user wants to know about issues with products. When issues this big present themselves, you are doing a great service by bringing them to the forefront.

    I've been wanting to build a new system for about a year (since my motherboard had a bit of a heart attack 6 months ago). I've held off for a few reasons (mostly financial). If I built right now (which I'm not), I'd probably go for an X38 board and a Q9450. I haven't heard about any issues with that combination as of yet.
