The Real Issue

While I was covering MWC a real issue with OCZ's SSDs erupted back home: OCZ aggressively moved to high density 25nm IMFT NAND and as a result was shipping product under the Vertex 2 name that was significantly slower than it used to be. Storage Review did a great job jumping on the issue right away.

Let's look at what caused the issue first.

When IMFT announced the move to 25nm it mentioned a doubling in NAND capacity per die. At 25nm you could now fit 64Gbit of MLC NAND (8GB) on a single die, twice what you could get at 34nm. With twice the density in the same die area, costs could come down considerably.


An IMFT 25nm 64Gbit (8GB) MLC NAND die

Remember, NAND manufacturing is no different from microprocessor manufacturing. Cost savings aren't realized on day one because yields are usually higher on the older process. Newer wafers are usually more expensive as well. So although you get a ~2x density improvement going to 25nm, your yields are lower and wafers are more expensive than they were at 34nm. Even Intel was only able to manage a maximum price decrease of $110 when going from the X25-M G2 to the SSD 320.
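To make the yield argument concrete, here is a minimal sketch of cost per gigabyte of good NAND. Every number below (wafer cost, die count, yields) is hypothetical and chosen only to show the shape of the trade-off; none of it comes from IMFT, Intel or OCZ. The only inputs taken from the text are the die capacities (4GB at 34nm, 8GB at 25nm) and the assumption that die area, and therefore die count per wafer, stays the same.

```python
def cost_per_gb(wafer_cost, dies_per_wafer, yield_fraction, gb_per_die):
    """Cost of one gigabyte of good NAND from a single wafer."""
    good_dies = dies_per_wafer * yield_fraction
    return wafer_cost / (good_dies * gb_per_die)


# Hypothetical inputs: the newer wafer costs more and yields fewer good dies.
old_node = cost_per_gb(wafer_cost=1000.0, dies_per_wafer=400,
                       yield_fraction=0.90, gb_per_die=4)
new_node = cost_per_gb(wafer_cost=1300.0, dies_per_wafer=400,
                       yield_fraction=0.70, gb_per_die=8)

print(f"34nm: ${old_node:.3f}/GB")
print(f"25nm: ${new_node:.3f}/GB")
print(f"25nm cost relative to 34nm: {new_node / old_node:.2f}x")
```

With inputs like these the newer node is cheaper per gigabyte, but by far less than the 2x that the density increase alone would suggest.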

OCZ was eager to shift to 25nm. Last year SandForce was the first company to demonstrate 25nm Intel NAND on an SSD at IDF, so the controller support was clearly there. As soon as it had the opportunity, OCZ began migrating the Vertex 2 to 25nm NAND.

SSDs are a lot like GPUs: they are very wide, parallel beasts. While a GPU has a huge array of parallel cores, SSDs are made up of arrays of NAND die working in parallel. Most controllers have 8 channels they can use to talk to NAND devices in parallel, and each channel can often have multiple NAND die active at once.


A Corsair Force F120 using 34nm IMFT NAND

Double the NAND density per die and you can guess what happened next - performance went down considerably at certain capacity points. The most impacted were the smaller capacity drives, e.g. the 60GB Vertex 2. Remember the SF-1200 is only an 8-channel controller, so it technically needs only eight NAND devices to be fully populated. However, multiple die within a single NAND device can be active concurrently, and the first 25nm 60GB Vertex 2s had only one die per NAND package, leaving the controller nothing to interleave within each package. The end result was significantly reduced performance in some cases, yet OCZ failed to change the speed ratings on the drives themselves.
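The parallelism loss is easy to see with a little arithmetic. The sketch below counts how many die each channel gets at the two process nodes. The 8-channel figure and the die capacities come from the text; the 64GB raw NAND figure for a "60GB" drive is an assumption made for illustration, and the function name is hypothetical.

```python
CHANNELS = 8  # the SF-1200 is an 8-channel controller


def dies_per_channel(raw_capacity_gb, die_capacity_gb, channels=CHANNELS):
    """Average number of NAND die available to each controller channel."""
    total_dies = raw_capacity_gb // die_capacity_gb
    return total_dies / channels


RAW_GB = 64  # assumed raw NAND on a "60GB" drive (not stated in the article)

for node, die_gb in (("34nm", 4), ("25nm", 8)):
    per_channel = dies_per_channel(RAW_GB, die_gb)
    print(f"{node}: {RAW_GB // die_gb} die total, {per_channel:.0f} per channel")
```

Under that assumption the 34nm drive gives each channel two die to keep busy, while the 25nm drive gives each channel only one.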

The matter is complicated by the way SandForce's NAND redundancy works. The SF-1000 series controllers have a feature called RAISE that allows your drive to keep working even if a single NAND die fails. The controller accomplishes this redundancy by writing parity data across all NAND devices in the SSD. Should one die fail, the lost data is reconstructed from the remaining data + parity and mapped to a new location in NAND. As a result, total drive capacity is reduced by the size of a single NAND die. With twice the density per NAND die in these early 25nm drives, usable capacity was also reduced when OCZ made the switch with Vertex 2.
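The article doesn't say how RAISE computes its parity, so the sketch below uses plain byte-wise XOR parity (the scheme RAID 5 is known for) purely as a stand-in to show the idea: any one lost block can be rebuilt from the survivors, and the price is one die's worth of capacity. All function names are hypothetical, and the 64GB raw capacity is again an assumption.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, failed_index):
    """Rebuild the block at failed_index from every other block in the stripe."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


def capacity_after_parity(raw_gb, die_gb):
    """Capacity left once one die's worth of space is given over to parity."""
    return raw_gb - die_gb


# One stripe spread across three data die plus parity; the second die "fails".
data = [b"die0", b"die1", b"die2"]
stripe = make_stripe(data)
recovered = reconstruct(stripe, failed_index=1)
print(recovered)

# Assumed 64GB of raw NAND: the parity cost doubles with the die size.
for node, die_gb in (("34nm", 4), ("25nm", 8)):
    print(f"{node}: {capacity_after_parity(64, die_gb)}GB left after parity")
```

The second loop is the capacity point from the paragraph above: with 8GB die, the space set aside for redundancy is twice what it was with 4GB die.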

The end result was that you could buy a 60GB Vertex 2 with lower performance and less available space without even knowing it.


A 120GB Vertex 2 using 25nm Micron NAND

After a dose of public retribution, OCZ agreed to allow end users to swap 25nm Vertex 2s for 34nm drives; they would simply have to pay the difference in cost. OCZ realized that was yet another mistake and eventually allowed the swap for free (thankfully no one was ever charged), which is what should have been done from the start. OCZ went one step further and stopped using 64Gbit NAND in the 60GB Vertex 2, although drives still exist in the channel since no recall was issued.

OCZ ultimately took care of those users who were left with a drive that was slower (and had less capacity) than they thought they were getting. But the problem was far from over.


153 Comments


  • dagamer34 - Wednesday, April 6, 2011 - link

    Any idea when these are going to ship out into the wild? I've got a 120GB Vertex 2 in my 2011 MacBook Pro that I'd love to stick into my Windows 7 HTPC so it's more responsive.
  • Ethaniel - Wednesday, April 6, 2011 - link

    I just love how Anand puts OCZ on the grill here. It seems they'll just have to step it up. I was expecting some huge numbers coming from the Vertex 3. So far, meh.
  • softdrinkviking - Wednesday, April 6, 2011 - link

    "OCZ insists that there's no difference between the Spectek stuff and standard Micron 25nm NAND"

    Except for the fact that Spectek is 34nm I am assuming?
    There surely must be some significant difference in performance between 25 and 34, right?
  • softdrinkviking - Wednesday, April 6, 2011 - link

    sorry, i think that wasn't clear.
    what i mean is that it seems like you are saying the difference in process nodes is purely related to capacity, but isn't there some performance advantage to going lower as well?
  • softdrinkviking - Wednesday, April 6, 2011 - link

    okay. forget it. i looked back through and found the part where you write about the 25nm being slower.

    that's weird and backwards. i wonder why it gets slower as it get smaller, when cpus are supposedly going to get faster as the process gets smaller?

    are there any semiconductor engineers reading this article who know?
    are the fabs making some obvious choice which trades in performance at a reduced node for cost benefits, in an attempt to increase die capacities and lower end-user costs?
  • lunan - Thursday, April 7, 2011 - link

    i think because the chip get larger but IO interface to the controller remain the same (the inner raid). instead of addressing 4GB of NAND, now one block may consists of 8GB or 16GB NAND.

    in case of 8 interface,
    4x8GB =32GB NAND but 8x8GB=64GB NAND, 8x16GB=128GB NAND

    the smaller the shrink is, the bigger the nand, but i think they still have 8 IO interface to the controller, hence the time takes also increased with every shrinkage.

    CPU or GPU is quite different because they implement different IO controller. the base architecture actually changes to accommodate process shrink.

    they should change the base architecture with every NAND if they wish to achieve the same speed throughput, or add a second controller....

    I think....i may not be right >_<
  • lunan - Thursday, April 7, 2011 - link

    for example the vertex 3 have 8GB NAND with 16(8 front and 8 back) connection to the controller. now imagine if the NAND is 16GB or 32 GB and the interface is only 16 with 1 controller?

    maybe the CPU approach can be done to this problem. if you wish to duplicate performace and storage, you do dual core (which is 1 cpu core beside the other)....

    again...maybe....
  • softdrinkviking - Friday, April 8, 2011 - link

    thanks for your reply. when i read it, i didn't realize that those figures were referring to the capacity of the die.

    as soon as i re-read it, i also had the same reaction about redesigning the controller, it seems the obvious thing to do,
    so i can't believe that the controller manufacturer's haven't thought of it.
    there must be something holding them back, probably $$.
    the major SSD players all appear to be trying to pull down the costs of drives to encourage widespread adoption.

    perhaps this is being done at the expense of obvious performance increases?
  • Ammaross - Thursday, April 7, 2011 - link

    I think if you re-reread (yes, twice), you'll note that with the die shrink, the block size was upped from 4K to 8K. This is twice the space to be programmed or erased per write. This is where the speed performance disappears, regardless of the number of dies in the drive.
  • Anand Lal Shimpi - Wednesday, April 6, 2011 - link

    Sorry I meant Micron 34nm NAND. Corrected :)

    Take care,
    Anand
