Performance Consistency

We've been looking at performance consistency since the Intel SSD DC S3700 review in late 2012 and it has become one of the cornerstones of our SSD reviews. Back in the day, many SSD vendors focused only on high peak performance, which unfortunately came at the cost of sustained performance. In other words, the drives would push high IOPS in certain synthetic scenarios to provide nice marketing numbers, but as soon as you pushed the drive for more than a few minutes you could easily run into hiccups caused by poor performance consistency.

Once we started exploring IO consistency, nearly all SSD manufacturers made a move to improve it, and for the 2015 suite I haven't made any significant changes to the methodology we use to test IO consistency. The biggest change is the move from VDBench to Iometer 1.1.0 as the benchmarking software. I've also extended the test from 2000 seconds to a full hour to ensure that all drives hit steady-state during the test.

For better readability, I now provide two bar graphs: the first shows the average IOPS over the last 400 seconds and the second shows the standard deviation over the same period. Average IOPS provides a quick look into overall performance, but it can easily hide bad consistency, so looking at standard deviation is necessary for a complete picture of consistency.
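As a rough illustration of how those two bar-graph figures can be derived, here is a minimal sketch. It assumes the benchmark log holds one IOPS sample per second; that sampling interval, the function name `steady_state_stats` and the sample data are all assumptions made for illustration, not details of the actual Iometer setup.

```python
from statistics import mean, pstdev

def steady_state_stats(iops_per_second, window=400):
    """Return (average, standard deviation) of the last `window` samples.

    Assumes one IOPS sample per second (an assumption for illustration).
    """
    if len(iops_per_second) < window:
        raise ValueError("need at least `window` samples")
    tail = iops_per_second[-window:]
    # Population standard deviation: the window is treated as the full data set.
    return mean(tail), pstdev(tail)

# Hypothetical hour-long run (3600 samples) that ends up alternating
# between 10K and 30K IOPS during the final 400 seconds.
samples = [80000] * 3200 + [10000, 30000] * 200
avg, dev = steady_state_stats(samples)
print(avg, dev)
```

In this made-up trace the average alone looks respectable, while the standard deviation shows that the drive is swinging by half its average from one second to the next, which is exactly what the second bar graph is meant to expose.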

I'm still providing the same scatter graphs too, of course. However, I decided to dump the logarithmic graphs and go linear-only since logarithmic graphs aren't as accurate and can be hard to interpret for those who aren't familiar with them. I provide two graphs: one that covers the whole duration of the test and another that focuses on the last 400 seconds to give a closer look at steady-state performance.

Steady-State 4KB Random Write Performance

Barefoot 3 has always done well in steady-state performance and the Vector 180 is no exception. It provides the highest average IOPS by far and the advantage is rather significant at ~2x compared to other drives.

Steady-State 4KB Random Write Consistency

But on the downside, the Vector 180 also has the highest variation in performance. While the 850 Pro, MX100 and Extreme Pro are all slower in terms of average IOPS, they are a lot more consistent. What's notable about the Vector 180 is how the consistency decreases as the capacity goes up.

[Scatter graph: OCZ Vector 180 240GB (Default, 25% Over-Provisioning)]

Looking at the scatter graph reveals the source of the poor consistency: IOPS drops to zero or near zero even before we hit any type of steady state. This is known behavior of the Barefoot 3 platform, but what's alarming is how frequently the 480GB and 960GB drives drop to zero IOPS. I don't find that acceptable for a modern high-end SSD, no matter how good the average IOPS is. Increasing the over-provisioning helps a bit by shifting the dots up, but it's still clear that 240GB is the optimal capacity for Barefoot 3 because beyond that the platform starts to run into consistency issues due to metadata handling.
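To put a number on those drops, one option is to count the seconds in which IOPS falls to or near zero, along with the longest uninterrupted stall. A minimal sketch, assuming per-second IOPS samples and an arbitrary cutoff of 100 IOPS for "near zero" (the cutoff, the function name `count_stalls` and the trace are assumptions for illustration, not part of the test methodology):

```python
def count_stalls(iops_per_second, threshold=100):
    """Count samples at or below `threshold` and the longest consecutive run."""
    stalls = 0
    longest = 0
    current = 0
    for iops in iops_per_second:
        if iops <= threshold:
            stalls += 1
            current += 1
            longest = max(longest, current)
        else:
            # A healthy sample ends the current stall.
            current = 0
    return stalls, longest

# Hypothetical trace: healthy throughput interrupted by two stalls.
trace = [25000, 24000, 0, 0, 50, 26000, 0, 25000]
print(count_stalls(trace))  # (4, 3)
```

The longest run matters as much as the total count: a handful of isolated one-second dips reads very differently in the scatter graph than a single multi-second stall.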

[Scatter graph: OCZ Vector 180 240GB (Default, 25% Over-Provisioning)]

89 Comments

  • nils_ - Wednesday, March 25, 2015 - link

    It's an interesting concept (especially when the Datacenter uses a DC Distribution instead of AC), but I don't know if I would be comfortable with batteries in everything. A capacitor holds less of a charge but doesn't deteriorate over time and the only component that really needs to stay on is the drive (or RAID controller if you're into that).
  • nils_ - Wednesday, March 25, 2015 - link

    "I don't think it has been significant enough to warrant physical power loss protection for all client SSDs."

    If a drive reports a flush as complete, the operating system must be confident that the data is already written to the underlying device. Any drive that doesn't deliver this is quite simply defective by design. Back in the day this was already a problem with some IDE and SATA drives: they reported a write operation as complete once the data hit the drive cache. Just because something is rated as consumer grade does not mean that they should ship defective devices.

    Even worse is that instead of losing the last few writes you'll potentially lose all the data stored on the drive.

    If I don't care whether the data makes it to the drive I can solve that in software.
  • shodanshok - Wednesday, March 25, 2015 - link

    If a drive receives an ATA FLUSH command, it _will_ write to stable storage (HDD platters or NAND chips) before returning. For unimportant writes (the ones not marked with FUA or encapsulated into an ATA FLUSH) the drive is allowed to store data into cache and return _before_ the data hits the actual permanent storage.

    SSDs add another problem: by the very nature of MLC and TLC cells, data at rest (already committed to stable storage) is put at risk by the partial page write effect. So, PFM+ and Crucial's consumer drive Power Loss Protection are _required_ for reliable use of the drive. Drives that don't use at least partial power loss protection should use a write-through (read-only cache) approach at least for the NAND mapping table, or very frequent flushes of the mapping table (e.g. SanDisk).
  • mapesdhs - Wednesday, March 25, 2015 - link


    How do the 850 EVO & Pro deal with this scenario atm?

    Ian.
  • Oxford Guy - Wednesday, March 25, 2015 - link

    "That said, while drive bricking due to mapping table corruption has always been a concern, I don't think it has been significant enough to warrant physical power loss protection for all client SSDs."

    I see you never owned 240 GB Vertex 2 drives with 25nm NAND.
  • prasun - Wednesday, March 25, 2015 - link

    "PFM+ will protect data that has already been written to the NAND"

    They should be able to do this by scanning NAND. The capacitor probably makes life easier, but with better firmware design this should not be necessary.

    With the capacitor, the steady state performance should be consistent, as they won't need to flush mapping table to NAND regularly.

    Since this is also not the case, this points to bad firmware design.
  • marraco - Wednesday, March 25, 2015 - link

    I have a bricked Vertex 2 resting a meter away. It was so expensive that I cannot bring myself to throw it in the trash.

    I will never buy another OCZ product, ever.

    OCZ refused to release the software needed to unbrick it. It is just a software problem. OCZ got my money, but refuses to make it work.

    Do NOT EVER buy anything from OCZ.
  • ocztosh - Wednesday, March 25, 2015 - link

    Hello Marraco, thank you for your feedback and sorry to hear that you had an issue with the Vertex 2. That particular drive was SandForce based and there was unfortunately no software to unbrick it, nor did the previous organization have the source code for the firmware. This was actually one of the reasons that drove the company to develop in-house controllers and firmware, so we could control these elements, which ultimately impact product design and support.

    Please do contact our support team and reference this thread. Even though this is a legacy product we would be more than happy to help and provide support. Thank you again for your comments and we look forward to supporting you.
  • mapesdhs - Wednesday, March 25, 2015 - link

    Indeed, the Vertex4 and Vector series are massively more reliable, but the OCZ haters
    ignore them entirely, focusing on the old Vertex2 series, etc. OCZ could have handled
    some of the support issues back then better, but the later products were more reliable
    anyway so it was much less of an issue. With the newer warranty structure, Toshiba
    ownership & NAND, etc., it's a very different company.

    Irony is, I have over two dozen Vertex2E units and they're all working fine (most are
    120s, with a sprinkling of 60s and 240s). One of them is an early 3.5" V2E 120GB,
    used in an SGI Fuel for several years, never a problem (recently replaced with a
    2.5" V2E 240GB).

    Btw ocztosh, I've been talking to some OCZ people recently about why certain models
    force a 3Gbit SAS controller to negotiate only a 1.5Gbit link when connected to a SATA3
    SSD. This occurs with the Vertex3/4, Vector, etc., whereas connecting the SATA2 V2E
    correctly results in a 3Gbit link. Note I've observed similar behaviour with other brands,
    ditto other SATA2 SSDs (eg. SF-based Corsair F60, 3Gbit link selected ok). The OCZ
    people I talked to said there's nothing they can do to fix whatever the issue might be,
    but what I'm interested in is why it happens; if I can find that out then maybe I can
    figure a workaround. I'm using LSI 1030-based PCIe cards, eg. SAS3442, SAS3800,
    SAS3041, etc. I'd welcome your thoughts on the issue. Would be nice to get a Vertex4
    running with a 3Gbit link in a Fuel, Tezro or Origin/Onyx.

    Note I've been using the Vertex4 as a replacement for ancient 1GB SCSI disks in
    Stoll/SIRIX systems used by textile manufacturers, works rather well. Despite the
    low bandwidth limit of FastSCSI2 (10MB/sec), it still cut the time for a full backup
    from 30 mins to just 6 mins (tens of thousands of small pattern files). Alas, with
    the Vertex4 no longer available, I switched to the Crucial M550 (since it does have
    proper PLP). I'd been hoping to use the V180 instead, but its lack of full PLP is an issue.

    Ian.
  • alacard - Wednesday, March 25, 2015 - link

    In my view the performance consistency basically blows the lid off of OCZ and the reliability of their Barefoot controller. Despite reporting from most outlets, for years now drives based off of this technology have suffered massive failure rates due to sudden power loss. Here we have definitive evidence of those flaws and the lengths OCZ is going to in order to work around them (note, i didn't say 'fix' them).

    The fact that they were willing to go to the extra cost of adding the power loss module in addition to crippling the sustained performance of their flagship drive in order to flush the cache out of DRAM speaks VOLUMES about how bad their reliability was before. You don't go to such extreme - potentially kiss of death - measures without a good boot up your ass pushing you headlong toward them. In this case said boot was constructed purely out of OCZ's fear that releasing yet ANOTHER poorly constructed drive would finally put their reputation out of its misery for good and kill any chance of future sales.

    OCZ has cornered themselves in a no win scenario:

    1) They don't bother making the drive reliable and in doing so save the cost of the power loss module and keep the sustained speed of the Vector 180 high. The drive reviews well with no craters in performance and the few customers OCZ has left buy another doomed Barefoot SSD that's practically guaranteed to brick on them within a few months. As a result they lose those customers for good along with their company.

    or

    2) They go to the cost of adding the power loss module and cripple the drive's performance to ensure that the drive is reliable. The drive reviews horribly and no one buys it.

    This is their position. Kiss of death indeed.

    Ultimately, i think it speaks to how complicated controller development is and that if you don't have a huge company with millions of R&D funds at your disposal it's probably best if you don't throw your hat into that ring. It's a shame but it seems to be the way high tech works. (Global oligopoly, here we come.)

    All things considered, it's nice that this is finally all out in the open.
