Performance Consistency

In our Intel SSD DC S3700 review Anand introduced a new method of characterizing performance: looking at the latency of individual operations over time. The S3700 promised a level of performance consistency that was unmatched in the industry, and as a result needed some additional testing to show that. The reason we don't have consistent IO latency with SSDs is because inevitably all controllers have to do some amount of defragmentation or garbage collection in order to continue operating at high speeds. When and how an SSD decides to run its defrag and cleanup routines directly impacts the user experience. Frequent (borderline aggressive) cleanup generally results in more stable performance, while delaying that can result in higher peak performance at the expense of much lower worst case performance. The graphs below tell us a lot about the architecture of these SSDs and how they handle internal defragmentation.

To generate the data below I took a freshly secure erased SSD and filled it with sequential data. This ensures that all user accessible LBAs have data associated with them. Next I kicked off a 4KB random write workload across all LBAs at a queue depth of 32 using incompressible data. I ran the test for just over half an hour, no where near what we run our steady state tests for but enough to give me a good look at drive behavior once all spare area filled up.

I recorded instantaneous IOPS every second for the duration of the test. I then plotted IOPS vs. time and generated the scatter plots below. Each set of graphs features the same scale. The first two sets use a log scale for easy comparison, while the last set of graphs uses a linear scale that tops out at 40K IOPS for better visualization of differences between drives.

The high level testing methodology remains unchanged from our S3700 review. Unlike in previous reviews however, I did vary the percentage of the drive that I filled/tested depending on the amount of spare area I was trying to simulate. The buttons are labeled with the advertised user capacity had the SSD vendor decided to use that specific amount of spare area.  If you want to replicate this on your own all you need to do is create a partition smaller than the total capacity of the drive and leave the remaining space unused to simulate a larger amount of spare area. The partitioning step isn't absolutely necessary in every case but it's an easy way to make sure you never exceed your allocated spare area. It's a good idea to do this from the start (e.g. secure erase, partition, then install Windows), but if you are working backwards you can always create the spare area partition, format it to TRIM it, then delete the partition. Finally, this method of creating spare area works on the drives we've tested here but not all controllers may behave the same way.

The first set of graphs shows the performance data over the entire 2000 second test period. In these charts you'll notice an early period of very high performance followed by a sharp dropoff. What you're seeing in that case is the drive alllocating new blocks from its spare area, then eventually using up all free blocks and having to perform a read-modify-write for all subsequent writes (write amplification goes up, performance goes down).

The second set of graphs zooms in to the beginning of steady state operation for the drive (t=1400s). The third set also looks at the beginning of steady state operation but on a linear performance scale. Click the buttons below each graph to switch source data.

Impact of Spare Area
  Plextor M5M 256GB Plextor M5 Pro 256GB Intel SSD 525 240GB Corsair Neutron 240GB OCZ Vector 256GB Samsung SSD 840 Pro 256GB
Default
25% Spare Area -

The M5M does a lot better than the M5 Pro but its consistency is still slightly behind OCZ Vector and Samsung SSD 840 Pro. I believe the reason why M5M's graph looks so different is Plextor's garbage collection method. Vector and SSD 840 are doing a lot more active garbage collection, which means they are constantly cleaning blocks and rearranging data. That's why their performance is constantly varying: At one second you're pushing data at 20K IOPS, the next at 5K IOPS and on the third you're back to 20K IOPS.

Plextor's approach is different, their garbage collection isn't triggered until it's an absolute must (or the drive is idling). In this case, after 500 seconds of 4KB random writes, there are no empty blocks left and the firmware must do garbage collection before it can process the next write request. The result? Performance drops to below 100 IOPS. This is the problem with the "clean up later" approach. As you'll soon see in the steady state graphs below, the drive completely stops (zero IOPS) every now and then. The drive is simply in such a dirty state that it must spend possibly seconds doing garbage collection before it can process the next IO. Sure, the IO may then transfer at 10K IOPS but you've already noticed the hiccup when the drive was doing GC.

This can actually be applied to real world very easily. Imagine that you're doing the dishes the old fashioned way (i.e. by hand). If you do the dishes after every meal, you'll have to do the dishes more often but you'll only spend a little time doing them at a time. If you do the dishes once a day, it will take you a longer time to get them all done. The total time spend doing dishes will most likely be around the same, but doing them all at once will stop you from doing other activities for a longer period of time. If a friend calls and asks you out, you can't go because you have a pile of dishes to do, or you may be able to go but it will take you a while. Had you done the dishes after every meal, you would have been free to go. In this analogy, doing the dishes is obviously garbage collection and going out is a write request from the host. 

There's no clear ruling about which is better, active or idle garbage collection, but we have always preferred the active (though not too aggressive) method. The peak performance may be lower but consistency is a lot higher because you won't have sudden drops in the IOPS. 

One quick note about the M5 Pro before we go forward. I asked Plextor about the IO consistency in the M5 Pro after our review of the new 1.02 firmware went live. A few weeks ago Plextor got back to me and told that the 1.02 firmware has a bug that causes the consistency to be as horrible as it is. However, this is only in the old M5 Pro (not in the new Xtreme with slightly different PCB and NAND) and they are working on a new firmware to fix the issue. I should have the new Xtreme here in the next few days so I can test and see if the issue only exists in the old M5 Pro. The M5M definitely doesn't suffer from this issue, although its IO consistency has room for improvement.

Lets move on to steady state performance, shall we?

Impact of Spare Area
  Plextor M5M 256GB Plextor M5 Pro 256GB Intel SSD 525 240GB Corsair Neutron 240GB OCZ Vector 256GB Samsung SSD 840 Pro 256GB
Default
25% Spare Area -

The impact of "clean up later" is even easier to see during steady state. Most of the other SSDs vary between 1K and 10K IOPS but the M5M dips below 100 IOPS every now and then. The majority of IOs are transferring at about 7K IOPS, which is pretty good, but the drops will still affect the performance. The non-logarithmic graph below will do an even better job at showing this:

Impact of Spare Area
  Plextor M5M 256GB Plextor M5 Pro 256GB Intel SSD 525 240GB Corsair Neutron 240GB OCZ Vector 256GB Samsung SSD 840 Pro 256GB
Default
25% Spare Area -

Now, what you're seeing are two main lines: One at ~7K IOPS and the other at 0 IOPS. This really shows how bad the situation can be if you don't clean up the mess from early on. About every third second the M5M completely stops to do garbage collection. It's unlikely for consumers to put the SSDs in a state similar to ours but we still shouldn't see SSDs completely stopping anymore. It was an issue a few years ago and back then it was somewhat acceptable given the immaturity of consumer SSDs; today it should not exist.

Fortunately, giving the M5M 25% over-provisioning helps a lot. It's still not as good as for example OCZ Vector or Corsair Neutron GTX, but the minimum IOPS is now over 20K (no more sudden 0 IOPS drops). You can still see the impact of the "clean up later" approach but the drop is only 5K IOPS, which shouldn't be very noticeable. I strongly recommend having at least 25% free space with the M5M. The more you fill the drive, the more likely it is that you'll face inconsistent performance.

Random & Sequential Performance Performance vs. Transfer Size
Comments Locked

36 Comments

View All Comments

  • JellyRoll - Thursday, April 18, 2013 - link

    The consistency testing and all trace based testing used by this site are tested without partitions or filesystems, and no TRIM functionality. This has been disclosed by the staff in the comment sections of previous reviews.
  • bobsmith1492 - Wednesday, April 17, 2013 - link

    Hi Kristian,
    Let me know the regulator part number and I can calculate the loss in the regulator. The main difference is if it is a switching or linear part. A linear part will waste 100% * (5-3.3)/5 percent of the power, or 34% neglecting the usually small quiescent current. A switcher will waste less, usually 10-20%.
  • Kristian Vättö - Wednesday, April 17, 2013 - link

    It's Micrel 29150 as far as I know. Here's the datasheet http://www.micrel.com/_PDF/mic29150.pdf
  • Ashaw - Wednesday, April 17, 2013 - link

    That is a linear part. Current in = current out + the ground pin current. See the graph on page 10. The ground current is about 1/50 the output current in this part. so the input current is a good approximation of the output current.
  • Ashaw - Wednesday, April 17, 2013 - link

    So the powers in the graphs above should be approx 0.41W, 2.75W and 2.98 W respectively. (Maybe slightly less in le lower digit if I were to include regulator losses).
  • bobsmith1492 - Wednesday, April 17, 2013 - link

    Agreed, the SSD is using approximately 66% of the measured power on the 5V rail.
  • JellyRoll - Wednesday, April 17, 2013 - link

    There are two problems with this statement:
    "In our Intel SSD DC S3700 review Anand introduced a new method of characterizing performance: looking at the latency of individual operations over time."

    1. Anand did not introduce this testing, another website did.
    2. it isnt looking at individual operations, thousands of operations are happening per second, hence the term 'IOPS' (I/O Per Second)
  • JellyRoll - Wednesday, April 17, 2013 - link

    Actually there is a third problem with the statement, it isnt looking at latency either. It is looking at IOPS, which is much different than latency. There are no latency numbers in this test.
  • JPForums - Thursday, April 18, 2013 - link

    There are no latency numbers displayed directly in the results, but latencies are implicit in the IOPS measurement. You may not be getting individual operation latencies, but IOPS is the inverse of average operation latency. So Just divide 1 by the number IOPS and you'll get your average operation latency.

    In general, I give reviewers the benefit of the doubt and try to put aside small slip ups in nomenclature or semantics as long as it is relatively easy to understand the points they are trying to make. That said, you seem to have it out for Kristian (or perhaps Anandtech as a whole), giving no slack and even reading things into statements that I'm not sure are there. I have no vested interest in Anandtech beyond the interest of reading good reviews, but I have to ask, did Kristian kick your dog or something? I'm honestly interested if you have a legitimate grievance.
  • JellyRoll - Thursday, April 18, 2013 - link

    Pointing out numerous problems with methodology is simply that, in particular the consistency tests are wildly misleading for a number of reasons, the least of which is an unreal workload. I will not resort to replying to thinly veiled flamebait attempts.

Log in

Don't have an account? Sign up now