Performance Consistency

Over the years we've seen SSD makers shift focus from sequential performance to random performance, and more recently toward IO consistency. Prolonged random IO can create a ton of fragmentation, both in the NAND and, if applicable, in an SSD's indirection table. A drive without a powerful enough controller, a smart enough algorithm and/or enough spare area will see tremendous variance between even adjacent IOs. Inconsistent IO latency can result in inconsistent application performance, which for a web-facing server can manifest as inconsistent page load times for your users. Even more visible are situations where your server is delivering visual content to your users. Inconsistent drive latency there can appear, in the worst case, as stutters in playback or even loss of quality (if your application dynamically adjusts streaming quality based on server performance). If you're running a large array of SSDs, the total performance of the array is bound by the slowest drive; in the case of an array of drives with highly variable performance, that minimum level of performance can end up far slower than you're expecting.
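The "bound by the slowest drive" point can be made concrete with a toy model (an illustration of the argument, not from the article): in a striped array every request touches every member, so the array's instantaneous rate is gated by whichever drive is slowest at that moment.

```python
def array_iops(per_drive_iops):
    """Effective IOPS of a striped array: drive count times the
    slowest member's instantaneous rate (a simplified model)."""
    return len(per_drive_iops) * min(per_drive_iops)

# Eight consistent drives at ~30K IOPS each:
consistent = [30_000] * 8
# Eight drives that also average near 30K, but one momentarily drops to 5K:
variable = [30_000] * 7 + [5_000]

print(array_iops(consistent))  # 240000
print(array_iops(variable))    # 40000
```

Even though the variable array's average per-drive performance is close to the consistent one, a single slow drive drags the whole array down by a factor of six in this example.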

Performance consistency matters, but it requires a somewhat different methodology to analyze. We focus on random write latency as that's really the worst case scenario for IO variance. Reads are fairly consistent, as are sequential writes. It's random writes that cause problems.

To generate the data below I took a freshly secure erased SSD and filled it with sequential data. This ensures that all user-accessible LBAs have data associated with them. Next I kicked off a 4KB random write workload at a queue depth of 32 using incompressible data. I ran the test for just over half an hour, nowhere near as long as our steady state tests, but long enough to give me a good look at drive behavior once all the spare area filled up.
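A workload along these lines could be expressed as an fio job file, sketched below (the device path is a placeholder, the exact options are my assumptions rather than the article's actual test script, and running this destroys all data on the target; the secure erase step happens separately, before fio runs):

```ini
[global]
ioengine=libaio
direct=1
filename=/dev/sdX        ; hypothetical target device (destructive!)
randrepeat=0
norandommap=1
refill_buffers           ; incompressible data on every write

[precondition]
rw=write                 ; sequential fill so every LBA holds data
bs=128k
size=100%

[randwrite-consistency]
stonewall                ; wait for the fill to finish first
rw=randwrite
bs=4k
iodepth=32
time_based
runtime=2000
write_iops_log=iops
log_avg_msec=1000        ; one IOPS sample per second
```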

I recorded instantaneous IOPS every second for the duration of the test. I then plotted IOPS vs. time and generated the scatter plots below. Within each set, every graph uses the same scale. The first two sets use a log scale for easy comparison, while the last set uses a linear scale that tops out at 40K IOPS to better visualize differences between drives.
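The per-second bucketing itself is straightforward. A minimal sketch, assuming you have a list of per-IO completion timestamps in seconds (e.g. parsed from a logging tool's output; the function name and input format are mine, not the article's):

```python
from collections import Counter

def iops_per_second(completion_times):
    """Bucket IO completion timestamps (in seconds) into instantaneous
    IOPS samples: one count for each elapsed second of the test."""
    buckets = Counter(int(t) for t in completion_times)
    end = int(max(completion_times)) + 1
    return [buckets.get(s, 0) for s in range(end)]

# Synthetic example: three IOs complete in second 0, one in second 2.
print(iops_per_second([0.1, 0.5, 0.9, 2.2]))  # [3, 0, 1]
```

Plotting that list against its index gives exactly the IOPS-vs-time scatter described above.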

The first set of graphs shows the performance data over the entire 2000 second test period. In these charts you'll notice an early period of very high performance followed by a sharp dropoff. What you're seeing there is the drive allocating new blocks from its spare area, then eventually using up all free blocks and having to perform a read-modify-write for all subsequent writes (write amplification goes up, performance goes down).
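The link between spare area and that steady-state penalty can be ballparked with a rough rule of thumb from the literature for uniform random writes under greedy garbage collection, WA ≈ (1 + ρ) / (2ρ), where ρ is the spare-to-user capacity ratio. This is a simplified model, not the behavior of any particular drive's firmware, and the example ratios below are illustrative rather than actual P400m provisioning figures:

```python
def write_amplification(op_ratio):
    """Rough steady-state write amplification estimate for uniform
    random writes with greedy GC; op_ratio = spare / user capacity.
    A textbook approximation, not a firmware model."""
    return (1 + op_ratio) / (2 * op_ratio)

# Consumer-style ~7% over-provisioning vs. an enterprise drive reserving ~40%:
print(round(write_amplification(0.07), 1))  # ~7.6
print(round(write_amplification(0.40), 1))  # ~1.8
```

Under this model a drive with generous spare area rewrites far less data per host write in steady state, which is consistent with the behavior the charts show.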

The second set of graphs zooms in to the beginning of steady state operation for the drive (t=1400s). The third set also looks at the beginning of steady state operation but on a linear performance scale. Click the buttons below each graph to switch source data.


At a high level, the P400m actually looks really good. Thanks to the large amount of spare area set aside by default, performance until the 600 second mark is incredibly consistent. After that point things get a bit more erratic, but compared to the P400e this drive is in a completely different league of performance consistency. Intel's S3700 is a bit better here, but the P400m is a clear step in the right direction.

The next set of charts looks at the steady state (for most drives) portion of the curve. Here we get better visibility into how each drive will perform over the long run.


Looking closer we see a clear improvement in consistency over the P400e. The P400m is truly an upgrade for customers who have write heavy workloads (e.g. high traffic database servers).

The final set of graphs abandons the log scale entirely and just looks at a linear scale that tops out at 40K IOPS. We're also only looking at steady state (or close to it) performance here:


It's interesting that consistency seems to improve over time after falling behind a bit. The worst inconsistency comes early on here, and even then it's nowhere near as bad as on the P400e. Intel's S3700 maintains a tighter distribution of IOPS, however.


23 Comments


  • philipma1957 - Wednesday, February 13, 2013 - link

    I missed the warranty on this drive; was it 3 or 5 years?
  • Beenthere - Wednesday, February 13, 2013 - link

    The warranty doesn't matter much, because if you need to use it you'll get a repaired unit that isn't likely to last a year, based on many people's experience. The lost data is the real issue for most people...

    For those who don't know, nowadays a longer product warranty does not mean a better quality product. Products with longer warranties cost more to pay for the longer warranty.
  • designerfx - Wednesday, February 13, 2013 - link

    I don't get your point. The point of a warranty is to guarantee performance up to the stated warranty period; it's never been about quality. Whether you get a good unit back from warranty is entirely luck of the draw with any manufacturer.

    It's no different in the car industry; Hyundai's 10-year warranty isn't an indicator of quality either.
  • peterfares - Wednesday, February 13, 2013 - link

    I think his point is that because it's an enterprise drive and because warranty replacements are unreliable, they wouldn't be put into service in an enterprise environment.
  • DuckieHo - Thursday, February 14, 2013 - link

    Why would an enterprise replacement be unreliable? Hardware fails... that's why there's a warranty in the first place.

    Not replacing the drive or using a different drive would absolutely be more risky to uptime.
  • JellyRoll - Wednesday, February 13, 2013 - link

    I googled to find more data on the "XPERT NAND" mentioned in this article. It actually doesn't exist; a PDF explains that XPERT is not NAND, it is actually firmware and algorithms. I can't believe Anand missed the mark so badly.
  • Anand Lal Shimpi - Wednesday, February 13, 2013 - link

    It's a combination of NAND binning/tech, firmware and validation. From Micron's reviewer's guide on the P400m:

    "The P400m is built from the ground up with custom 25nm MLC NAND that uses Micron’s extended performance and enhanced reliability technology (XPERT)."
  • JellyRoll - Wednesday, February 13, 2013 - link

    Well, that states specifically that it uses XPERT technology, but not that the NAND is XPERT NAND.
    For instance, some Marvell SSDs use F.R.A.M.E. technology, but they do not refer to it as F.R.A.M.E. NAND.
  • mmonnin03 - Thursday, February 14, 2013 - link

    All the NAND is pretty much the same. The only thing really different with the NAND is binning during probe: a fuse blown here, a fuse blown there. The fab process is the same.

    If it's 42Gbit chips like the article mentioned, then these are the TLC-designed chips but only storing 2 bits per cell. 2/3 of a 64Gbit chip = 43Gbit.
  • JellyRoll - Wednesday, February 13, 2013 - link

    It's weird, but I noticed when this article posted that there were already 1,394 likes in the icon above. I came back three hours later and there's the exact same number of likes! It really blew my mind that there would be 1,394 likes in the first five minutes of this article being posted, but now I see that the Like button must be broken :)
