The SF-2281 BSOD Bug

A few weeks ago I was finally able to reproduce the SF-2281 BSOD bug in house. In working on some new benchmarks for our CPU Bench database I built an updated testbed using OCZ's Agility 3. All of the existing benchmarks in CPU Bench use a first generation Intel X25-M and I felt like now was a good time to update that hardware. My CPU testbeds need to be stable given their importance in my life so if I find a particular hardware combination that works, I tend to stick to it. I've been using Intel's DH67BL motherboard for this particular testbed since I'm not doing any overclocking - just stock Sandy Bridge numbers using Intel's HD 3000 GPU. The platform worked perfectly and it has been crash free for weeks.

A slew of tablet announcements pulled me away from CPUs for a bit, but I wanted to get more testing done while I worked on other things. With my eye off the ball I accidentally continued CPU testing using an ASUS P8Z68-V Pro instead of my Intel board. All of a sudden I couldn't complete a handful of my benchmarks. I never did see a blue screen but I'd get hard locks that required a power cycle/reset to fix. It didn't take me long to realize that I had been testing on the wrong board, but it also hit me that I may have finally reproduced the infamous SandForce BSOD issue. The recent Apple announcements once more kept me away from my CPU/SSD work but with a way to reproduce the issue I vowed to return to the faulty testbed when my schedule allowed.

Even on the latest drive firmware, I still get hard locks on the ASUS P8Z68-V Pro. They aren't as frequent as before with the older firmware revision, but they still happen. What's particularly interesting is that the problem doesn't occur on Intel's DH67BL, only on the ASUS board. To make matters worse, I switched power supplies on the platform and my method for reproducing the bug no longer seems to work. I'm still digging to try and find a good, reproducible test scenario but I'm not quite there yet. It's also not a Sandy Bridge problem as I've seen the hard lock on ASRock's A75 Extreme6 Llano motherboard, although admittedly not as frequently.

Those who have reported issues have done so from a variety of platforms including Alienware, Clevo and Dell notebooks. Clearly the problem isn't limited to a single platform.

At the same time there are those who have no problems at all. I've got a 240GB Vertex 3 in my 2011 MacBook Pro (15-inch) and haven't seen any issues. The same goes for Brian Klug, Vivek Gowri and Jason Inofuentes. I've sent them all SF-2281 drives for use in their primary machines and none of them have come back to me with issues.

I don't believe the issue is entirely due to a lack of testing/validation. SandForce drives are operating at speeds that just a year ago no one even thought of hitting on a single SATA port. Prior to the SF-2281 I'm not sure that a lot of these motherboard manufacturers ever really tested if you could push more than 400MB/s over their SATA ports. I know that type of testing happens during chipset development, but I'd be surprised if every single motherboard manufacturer did the same.

Regardless, the problem does still exist and it's a valid reason to look elsewhere. My best advice is to look around and see if other users with a system setup similar to yours have had issues with these drives. If you do own one of these drives and are having issues, I don't know that there's a good solution out today. Your best bet is to get your money back and try a different drive from a different vendor.

Update: I'm still working on a sort of litmus test to get this problem to appear more consistently. Unfortunately even with the platform and conditions narrowed down, it's still an issue that appears rarely, randomly and without any sort of predictability. SandForce has offered to fly down to my office to do a trace on the system as soon as I can reproduce it regularly. 


90 Comments


  • DarkKnight_Y2K - Thursday, August 11, 2011 - link

    "Bottom line, it seems like sandforce-driven ssds have the biggest number of issues, yet you still recommend them."

    Did you read the last sentence of Anand's review?

    "The safest route without sacrificing significant performance continues to be Intel's SSD 510."
  • Socratic - Thursday, August 11, 2011 - link

    Yeah I don't know what planet you have been living on, but in MULTIPLE articles Anand has basically ended with the phrase, "The only logical choice is Intel."

    How is that being a sandforce fanboy??

    You need to keep YOUR bias in line and re-read the article and past articles!!
  • Anand Lal Shimpi - Thursday, August 11, 2011 - link

    Given the continued issues with SF drives I'm quickly looking at other alternatives. Toshiba and Crucial have never been top end performers, which is why I've focused most of my recommendations on the Intel SSD 510. The biggest advantage SandForce continues to have is in better performance over the long run thanks to its live dedupe/compression. I've been working on a way to quantify that for a while; unfortunately I don't have a good test I'm happy with...yet.

    Going forward I believe Samsung may be a bigger player. Take note of the recently announced PM830, expect full coverage of that drive upon its arrival.

    http://www.anandtech.com/show/4606/samsung-announc...

    Take care,
    Anand
  • melgross - Thursday, August 11, 2011 - link

    Well, dedup itself is subject to a lot of controversy. It isn't necessarily a good thing.
  • Anand Lal Shimpi - Thursday, August 11, 2011 - link

    I'd argue for most mainstream uses it's a very good thing for long term performance. If the SF-2281 had Intel's track record it'd be the best option in my mind.

    Take care,
    Anand
  • name99 - Thursday, August 11, 2011 - link

    Hi Anand,

    Rather than beating up on you for not stressing reliability more in the past, I'm going to ask, AGAIN, that you take power more seriously.

    My experience has been

    - replaced the hard drive in my 2nd gen MacBook Air with a RunCore IV. The thing would crash about once a week, as far as I could tell NOT from logic errors but because its power draw during a long train of writes spiked higher than the interface was specced for. If this coincided with a high power draw elsewhere in the system --- fan, CPU etc, game over

    - an OCZ enyo USB3 drive which works just fine as a READ drive --- and is once again somewhat flaky if too many back-to-back writes occur

    - a Kingston SSDNow V which I have as the boot/VM drive for my iMac running off USB. My original plan for this was to have it running off FW800 (which is in theory 7W of power), but I got the same thing as the two previous drives --- crashes with too many back to back writes. It's now running successfully because I stuck it in a Kingwin USB<->SATA bridge that is for 3.5" drives, and thus has a separate power supply and the ability to provide a lot of juice.

    All this basically mirrors (along a different dimension) what you have said: these drives are ABSOLUTE CRAP for the naive consumer. You buy them, things seem great, and then randomly and with no obvious pattern to the naive user, your system hangs.

    You seem to be trying really hard to have the manufacturers get their act together; my point is to remind you that an IMPORTANT part of getting their act together is that these things are ALWAYS within spec with respect to power. Right now, we seem to have a lot (at least three different brands, in three different market segments) of drives that are simply not within spec --- they can run on the power that the system is specced to deliver for most command sequences, but there are always those few command sequences that over-draw power. Heck, at the very least, it is the responsibility of the drive to recognize this situation and throttle itself, just like any modern x86 CPU.
  • Coup27 - Thursday, August 11, 2011 - link

    +1.

    I have been feeling similar sentiments lately as well.

    I have posted in the forums on what happened to the 470 review but no official comment from anybody. Considering all of the reliability issues flying about, you would think that if the 470 was as reliable as word suggests, it would have had a featured review.

    Some guy actually bought an Agility 3 based off the AT review and forum list of recommended drives and neither mentioned the BSOD. When he got the BSOD, he went into the forums and kicked off. Rightly so.

    Unfortunately issues drag on for sometimes months before AT even update their article to make people aware that the product they might be buying could be seriously flawed.

    No other website offers the depth of detail which AT does and for that the editors are applauded, but unfortunately the playing field does not seem level.
  • Lord 666 - Thursday, August 11, 2011 - link

    Before this article, previous reviews of Vertex problems did not address the issues. This hits it head on.
  • jo-82 - Thursday, August 11, 2011 - link

    The Kingston HyperX clearly stands out with a consistent high performance. Why no words on that? Clearly the drive to buy. And Kingston has imho a much higher reputation on circuitry reliance and better QA in general than the rest of the pack, except Intel.
  • Roland00Address - Thursday, August 11, 2011 - link

    And I ain't sure you can apply the logic of Kingston being rocking when Kingston purposefully makes their SSD line confusing using similar names with completely different controllers

    Kingston E series, Intel X25-E controller
    Kingston M series, Intel X25-M G2 controller
    SSDNow V 100, JMicron JMF618 controller
    SSDNow V+, Samsung S3C29RBB01 controller
    SSDNow V+ 100, Toshiba T6UG1XBG controller
    SSDNow V+ 180, Toshiba T6UG1XBG controller
    SSDNow V Series, Toshiba TC58NCF602GAT controller, which is based off the stuttering JMicron JMF602
    30GB SSDNow V Series Boot Drive, Toshiba T6UG1XBG controller

    I may be forgetting to list a couple models, but as I pointed out above, Kingston has used 2 different controllers from Intel, 1 from Samsung, and 2 different from Toshiba (and all these controllers have similar names), not counting their most recent drive that is a Sandforce controller.
