Let's start with the elephant in the room. A percentage of OCZ Vertex 3/Agility 3 customers have a recurring stuttering/instability issue. The problem primarily manifests itself as regular BSODs under Windows 7, although OCZ tells me that the issue is cross-platform and has been seen on a MacBook Pro running OS X as well.

How many customers are affected? OCZ claims it's less than two thirds of a percent of all Vertex 3/Agility 3 drives sold. OCZ came up with this figure by looking at the total number of tech support enquiries as well as forum posts about the problem and dividing that number by the total number of drives sold through to customers. I tend to believe OCZ's data here given that I've tested eight SF-2281 drives and haven't been able to duplicate the issue on a single drive/configuration thus far.
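To make the arithmetic behind a figure like this concrete, here is a minimal sketch of the method OCZ describes: reported cases (support enquiries plus forum posts) divided by drives sold. The counts below are hypothetical placeholders chosen only to illustrate the calculation; they are not OCZ's data, and the function name is my own.

```python
def affected_rate(support_enquiries: int, forum_posts: int, drives_sold: int) -> float:
    """Return the share of drives with a reported problem, as a percentage."""
    if drives_sold <= 0:
        raise ValueError("drives_sold must be positive")
    reported = support_enquiries + forum_posts
    return 100.0 * reported / drives_sold


# Hypothetical counts, used only to illustrate the arithmetic (not OCZ's numbers).
rate = affected_rate(support_enquiries=400, forum_posts=250, drives_sold=100_000)
print(f"{rate:.2f}% of drives sold")
```

Keep in mind what this kind of estimate can and can't capture: it only counts users who actually report the problem, and it can count the same user twice if they file a ticket and also post on the forums.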

Most of the drives were from OCZ and I've tested them all on four separate platforms - three Windows 7 and one OS X. The latter is my personal system where I have since deployed a 240GB Vertex 3 in place of Intel's SSD 510 for long term evaluation. If you're curious, the 3 months I had the 510 in the MacBook Pro were mostly problem-free. It's always tough narrowing down the cause of system-wide crashes so it's hard to say whether or not the 510 was responsible for any of the hard-resets I had to do on the MacBook Pro while it was deployed. For the most part the 510 worked well in my system although I do know that there have been reports of issues from other MBP owners.

But I digress. There's a BSOD issue with SF-2281 drives and I haven't been able to duplicate it. OCZ has apparently had a very difficult time tracking down the issue as well. OCZ does a lot of its diagnostic work using a SATA bus analyzer, a device that lets you inspect what's actually going over the SATA bus itself rather than relying on the cryptic error messages your OS gives you. Apparently just sticking a SATA bus analyzer in the chain between the host controller and SSD was enough to make the BSOD problem go away, which made diagnosing its source a pain.

OCZ eventually noticed odd behavior involving a particular SATA command. Slowing down timings associated with that command seems to have resolved the problem although it's tough to be completely sure as the issue is apparently very hard to track down.

OCZ's testing also revealed that the problem seems to follow the platform, not the drive itself. If you have a problem, it doesn't matter how many Vertex 3s you go through - you'll likely always have the problem. Note that this doesn't mean your motherboard/SATA controller is at fault; it just means that the interaction between your particular platform and the SF-2281 controller/firmware setup causes this issue. It's likely that either the platform or SSD is operating slightly out of spec, or both are operating at opposite ends of the spec but still technically within it. There's obviously chip-to-chip variance on both sides, and with the right combination you could end up with some unexpected behaviors.

OCZ and SandForce put out a stopgap fix for the problem. For OCZ drives this is firmware revision 2.09 (other vendors haven't released the fix yet as far as I can tell). The firmware update simply slows down the timing of the SATA command OCZ and SF believe to be the cause of these BSOD issues.

In practice the update seems to work. Browsing through OCZ's technical support forums, I don't see any reports of users who had the BSOD issue still seeing it post-update. It is worth mentioning, however, that the problem isn't definitively solved since the true cause is still unknown; it just seems to be addressed given what we know today.

Obviously slowing down the rate of a particular command can impact performance. In practice the impact seems to be minimal, although a small portion of users are reporting huge drops in performance post-update. OCZ mentions that you shouldn't update your drive unless you're impacted by this problem, advice I definitely agree with.

What does this mean? Well, most users are still unaffected by the problem if OCZ's statistics are to be believed. I also don't have reason to believe this is exclusive to OCZ's SF-2281 designs so all SandForce drives could be affected once they start shipping (note that this issue is separate from the Corsair SF-2281 recall that happened earlier this month). If you want the best balance of performance and predictable operation, Intel's SSD 510 is still the right choice from my perspective. If you want the absolute fastest and are willing to deal with the small chance that you could also fall victim to this issue, the SF-2281 drives continue to be very attractive. I've deployed a Vertex 3 in my personal system for long term testing to see what living with one of these drives is like and so far the experience has been good.

With that out of the way, let's get to the next wave of SF-2281 based SSDs: the OCZ Vertex 3 MAX IOPS and the Patriot Wildfire.

The Vertex 3 MAX IOPS Drive

In our first review of the final, shipping Vertex 3, OCZ committed to full disclosure in detailing the NAND configuration of its SSDs to avoid any confusion in the marketplace. Existing Vertex 3 drives use Intel 25nm MLC NAND, as seen below:


A 240GB Vertex 3 using 25nm Intel NAND

 

Not wanting to be completely married to Intel NAND production, OCZ wanted to introduce a version of the Vertex 3 that used 32nm Toshiba Toggle NAND - similar to what was used in the beta Vertex 3 Pro we previewed a few months ago. Rather than call the new drive a Vertex 3 with a slightly different model number, OCZ opted for a more pronounced suffix: MAX IOPS.

Like the regular Vertex 3, the Vertex 3 MAX IOPS drive is available in 120GB and 240GB configurations. These drives have 128GB and 256GB of NAND, respectively, with just under 13% of the NAND set aside for use as a combination of redundant and spare area.
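The "just under 13%" figure falls out if you assume, as is common for SSDs, that the advertised capacity is in decimal gigabytes while the NAND is counted in binary gigabytes. That unit assumption is mine rather than something OCZ states, so treat the following as a rough sketch:

```python
GB = 10**9   # decimal gigabyte, assumed for the advertised capacity
GIB = 2**30  # binary gigabyte, assumed for the raw NAND capacity


def spare_fraction(advertised_gb: int, nand_gib: int) -> float:
    """Fraction of raw NAND that is not exposed as user capacity."""
    return 1.0 - (advertised_gb * GB) / (nand_gib * GIB)


for advertised, nand in [(120, 128), (240, 256)]:
    share = spare_fraction(advertised, nand)
    print(f"{advertised}GB drive: {share:.1%} of NAND set aside")
```

Both capacities work out to the same fraction because the ratio of advertised capacity to raw NAND is identical for the 120GB and 240GB drives.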


OCZ Vertex 3 MI 120GB

The largest NAND die you could ship at 32/34nm was 4GB - the move to 25nm brought us 8GB die. What this means is that for a given capacity, the MAX IOPS edition will have twice as many MLC NAND die under the hood. The table below explains it all:

OCZ SF-2281 NAND Configuration

| Drive | NAND Channels | NAND Packages | NAND Die per Package | Total NAND Die | NAND Die per Channel |
|---|---|---|---|---|---|
| OCZ Vertex 3 120GB | 8 | 16 | 1 | 16 | 2 |
| OCZ Vertex 3 240GB | 8 | 16 | 2 | 32 | 4 |
| OCZ Vertex 3 MI 120GB | 8 | 8 | 4 | 32 | 4 |
| OCZ Vertex 3 MI 240GB | 8 | 16 | 4 | 64 | 8 |

The standard 240GB Vertex 3 has 32 die spread across 16 chips. The MAX IOPS version doubles that to 64 die in 16 chips. The 120GB Vertex 3 has only 16 die across 16 chips, while the MAX IOPS version has 32 die in just 8 chips. The SF-2281 is an 8-channel controller, so with 32 die you get a 4-way interleave per channel and with 64 die an 8-way interleave. There are obviously diminishing returns to how well you can interleave requests to hide command latencies - 4 die per channel seems to be the ideal target for the SF-2281.
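The die counts above follow directly from the raw NAND capacity, the die size at each process node and the package count. The sketch below reproduces them; the `Drive` class and its field names are my own, and it assumes die are spread evenly across the eight channels.

```python
from dataclasses import dataclass

CHANNELS = 8  # the SF-2281 is an 8-channel controller


@dataclass
class Drive:
    name: str
    nand_gb: int   # raw NAND on board
    die_gb: int    # capacity of one die: 8 at 25nm, 4 at 32nm
    packages: int  # number of NAND packages (chips)

    @property
    def total_die(self) -> int:
        return self.nand_gb // self.die_gb

    @property
    def die_per_package(self) -> int:
        return self.total_die // self.packages

    @property
    def die_per_channel(self) -> int:
        # Assumes die are distributed evenly across all channels.
        return self.total_die // CHANNELS


drives = [
    Drive("Vertex 3 120GB", nand_gb=128, die_gb=8, packages=16),
    Drive("Vertex 3 240GB", nand_gb=256, die_gb=8, packages=16),
    Drive("Vertex 3 MI 120GB", nand_gb=128, die_gb=4, packages=8),
    Drive("Vertex 3 MI 240GB", nand_gb=256, die_gb=4, packages=16),
]

for d in drives:
    print(f"{d.name}: {d.total_die} die, {d.die_per_package} per package, "
          f"{d.die_per_channel}-way interleave per channel")
```

Halving the die size doubles the die count for a given capacity, which is why the 120GB MAX IOPS drive ends up with the same 4-way interleave as the standard 240GB Vertex 3.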


OCZ Vertex 3 MI 240GB

112 Comments

  • Anand Lal Shimpi - Thursday, June 23, 2011

    OCZ has been at the forefront of SF-2000 generation SSD releases. OWC and Patriot are the only two other companies that have sent us drives and we've reviewed both of them on the site as well. We try to review every SSD of interest that comes our way, that includes four different Intel SSD controller/NAND configurations in the past couple of months:

    Intel SSD 510 120GB
    Intel SSD 510 250GB
    Intel SSD 320 300GB
    Intel SSD 320 160GB

    I believe we've only done two more OCZ drives by comparison and that's because they have two more products that offer measurably different performance (the MAX IOPS drives).

    Corsair would've been added to the list by now however the recall issue pushed back sampling of their SF-2281 drives a little bit. As soon as we get their drives in they'll be tested as well.

    OCZ is simply first with a lot of these drives, thus there's a rush to test them.

    Take care,
    Anand
  • techinsidr - Thursday, June 23, 2011

    I got a TON of respect for Anand, but I agree.. this review seems somewhat questionable.

    Anand: it's unacceptable to ship products to customers that are defective. It's a bit bothersome to see that you still recommend faulty hardware from OCZ. Believing failure rates based on a company forum seems like a very flawed metric.
  • velis - Thursday, June 23, 2011

    There is no way to test a product for EVERY usage scenario. This goes for HW compatibility as well. It's not for nothing companies issue HW compatibility lists...

    And we're talking about sub 1% of system configurations. Not drives, mind you - the same drive will work like a charm in another computer.

    And if you'll see my other reply, I have 3 Intel X25 G2s that give me BSODs. Does that mean Intel should stop selling them? Just because I whine about it? Get serious, please.
  • techinsidr - Thursday, June 23, 2011

    If a $400 hard-drive doesn't work on the most popular laptops such as a Dell E6400... that is a big problem.

    I get that there are tons of configurations, but it definitely appears that OCZ is cutting major corners in the quality control department.

    If I owned a SSD shop, I would never ship product that wasn't compatible with mainstream/popular notebooks.

    OCZ products may work for enthusiasts, but my data is far too valuable to roll the dice on a cheap drive.
  • Anand Lal Shimpi - Thursday, June 23, 2011

    This is very true, unfortunately it's a tradeoff that you make with any non-Intel or Samsung drive. From what I've seen no one else does the sort of validation testing that those two do. Everything else is a tradeoff. Intel in particular has it down pat, which I fundamentally believe leads to its very low return rates.

    OCZ and SandForce definitely test more now than they did a year ago, but it still pales in comparison to Intel. In the days of Indilinx drives this was a tradeoff you made to get a more affordable drive, however these days we at least have the Intel SSD 510 as an alternative if you want good 6Gbps performance and Intel reliability.

    Your last sentence really encompasses the issue entirely. Some users are clearly ok with being on the bleeding edge if it means they get some sort of an advantage (with SF it's better performance and lower write amp over time). Taking that approach usually requires sacrificing something and in this case there's the chance that you might have an unlucky combination of drive and platform. For everyone else, there's Intel :)

    Take care,
    Anand
  • LTG - Thursday, June 23, 2011

    This comment is what makes AT great - boiling down complexity to the most important points. Many sites can collect data, it's the quality of the interpretation that makes it work.

    Separately, I find it interesting that readers are debating the 0.66% failure rate. That number alone is pretty scary for a system level component being that it's caused by 1 bug - their total RMA/failure rate goes up from there.

    I'm a huge performance geek and don't mind living on the edge a bit - but this seems to be less attractive compared to something like an aggressive overclock which (usually) can be easily rolled back.

    It will be interesting to see how OCZ balances further raising validation costs to get customers like me, versus the realities of having to be profitable in a difficult industry. No easy answers I think.
  • lyeoh - Thursday, June 23, 2011

    Do OCZ/SandForce list a bunch of hardware configurations/chipsets that are known to be compatible with their stuff? A brief check doesn't turn up anything.

    Given the price of these drives you might as well buy a motherboard that suits the drive.

    Of course there would still be problems, but at least a replacement drive would have a far better chance of working.
  • Anand Lal Shimpi - Thursday, June 23, 2011

    The implication seems to be that it's not so much the specific motherboard, but rather the behavior of the particular chipset used on the motherboard. E.g. you may have two identical P67 motherboards, one exhibits the issue and one doesn't.

    Take care,
    Anand
  • lyeoh - Friday, June 24, 2011

    If that's the case then that sounds pretty broken to me.

    I can understand drives having incompatibilities with particular models/brands/types of chipsets, but to me having problems with some chips of the same model chipsets means that either the chipset is broken or the drives are.

    A "compatibility issue" with power supplies would be more forgivable :).
  • JasonInofuentes - Friday, June 24, 2011

    A compatibility issue with power supplies might be more forgivable but it would be much less forgiving.

    As to whether this issue is "broken", remember that the more complex a system is, the more likely it is that these problems will crop up. This is why integrating components (à la SoC) generally improves reliability.

    Let's look at another area where there can be a similar reliability issue due to an I/O interface. HDCP-compliant HDMI has been around for some time. Anyone that has put together a home theater system has encountered a handshake issue at one time or another. The source of the problems can be excruciatingly hard to identify because each component might work perfectly fine when paired in any other way, but when linked in a certain way there's a failure. And the failure could be in one of the sources, in a receiver, in a switch, in a display, or even in a cable. It could be resolved by replacing any of those components with an identical model. So, even though a system is put together from components that are individually fully HDCP compliant, there can still be a failure. And your likelihood of failure goes up the more components you include, the longer your cables are and the more disparate in generation your components are. Is this system broken? Maybe, but for a different reason.

    OCZ and SandForce have in their possession the goose that laid the golden egg: class-leading performance and longevity, right at the starting line of a market that will be absolutely enormous in the coming years. But as in the fairy tale, there's a problem. Should OCZ, as the farmer did, kill their goose because it can't lay two eggs a day (or run without any failures on every platform to ever be graced with a SATA port)?
