Sequential Read/Write Speed

To measure sequential performance I ran a one-minute, 128KB sequential test over the entire span of the drive at a queue depth of 1. The results reported are the average MB/s over the entire test length.
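For readers who want to approximate this kind of measurement themselves, here is a minimal Python sketch of a queue-depth-1 sequential read test. It is not Iometer and not the harness used for the charts in this review: the function name `sequential_read_mbps` and its parameters are made up for illustration, it reads an ordinary file rather than the raw drive, and it does not bypass the operating system's page cache, so results on a cached file will be inflated.

```python
import os
import tempfile
import time

BLOCK_SIZE = 128 * 1024  # 128KB transfers, matching the test described above


def sequential_read_mbps(path, block_size=BLOCK_SIZE, max_seconds=60.0):
    """Read `path` front to back, one request at a time, and return average MB/s.

    Issuing one blocking read at a time approximates a queue depth of 1.
    Hypothetical helper: the OS page cache is NOT bypassed here.
    """
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering so each read() is one request.
    with open(path, "rb", buffering=0) as f:
        while time.perf_counter() - start < max_seconds:
            chunk = f.read(block_size)
            if not chunk:  # reached the end of the file
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Average throughput over the whole run, in decimal megabytes per second.
    return (total_bytes / 1e6) / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    # Small self-contained demo on a temporary file of random data.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * BLOCK_SIZE))
    try:
        print(f"{sequential_read_mbps(tmp.name):.1f} MB/s")
    finally:
        os.remove(tmp.name)
```

Pointing the function at a large file on the drive under test gives a rough number; a proper benchmark would also need unbuffered I/O and a file much larger than system memory.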

As impressive as the random read/write speeds were, at low queue depths the Vertex 4's sequential read speed is problematic:

Desktop Iometer - 128KB Sequential Read (4K Aligned)

Curious as to what's going on, I ran AS-SSD and came away with much better results:

Incompressible Sequential Read Performance - AS-SSD

Finally I turned to ATTO, which gave me the answer I was looking for. The Vertex 4's sequential read speed is slow at low queue depths with certain workloads; move to larger transfer sizes or higher queue depths and the problem resolves itself:

QD2
QD4
QD8

The problem is that many sequential read operations in client workloads occur at 64KB to 128KB transfer sizes and at queue depths of 1 to 3. Looking at the ATTO data above you'll see that this is exactly the weak point of the Vertex 4.

I went back to Iometer and varied queue depth with our 128KB sequential read test and got a good characterization of the Vertex 4's large block, sequential read performance:

The Vertex 4 performs better with heavier workloads. While other drives extract enough parallelism to deliver fairly high performance with only a single IO in the queue, the Vertex 4 needs two or more for large block sequential reads. Heavier read workloads do wonderfully on the drive; ironically enough, it's the lighter workloads that are a problem. It's the exact opposite of what we're used to seeing. As this seemed like a bit of an oversight, I presented OCZ with my data and got some clarification.

Everest 2 was optimized primarily for non-light workloads where higher queuing is to be expected. Extending performance gains to lower queue depths is indeed possible (the Everest 1 based Octane obviously does fine here) but it wasn't deemed a priority for the initial firmware release. OCZ instead felt it was far more important to have a high-end alternative to SandForce in its lineup. Given that we're still seeing some isolated issues on non-Intel SF-2281 drives, the sense of urgency does make sense.

There are two causes for the lower than expected sequential read performance at low queue depths. First, OCZ doesn't currently enable NCQ streaming for queue depths less than 3. This one is a simple fix. Second, the Everest 2 doesn't currently allow pipelined read access from more than 8 concurrent NAND die. For larger transfers and queue depths this isn't an issue, but smaller transfers and lower queue depths end up delivering much lower than expected performance.
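To get a feel for why the number of outstanding IOs matters so much, here is a rough back-of-the-envelope model based on Little's law: throughput is roughly the amount of data in flight divided by the time each request takes. This is only a sketch. The function name `modeled_throughput_mbps` and the 1 ms latency are hypothetical values chosen for illustration, not measurements of the Vertex 4, and the model knows nothing about NCQ streaming or the 8-die pipelining limit described above. The 600 MB/s ceiling is the theoretical payload rate of a 6Gbps SATA link after 8b/10b encoding.

```python
SATA_6GBPS_CEILING_MBPS = 600.0  # theoretical link limit; real drives land below this


def modeled_throughput_mbps(queue_depth, transfer_kb, latency_ms,
                            ceiling_mbps=SATA_6GBPS_CEILING_MBPS):
    """Little's law estimate: bytes in flight / per-request latency, capped by the link.

    Assumes per-request latency stays constant as queue depth grows, which is
    only true until the drive or the interface saturates.
    """
    bytes_in_flight = queue_depth * transfer_kb * 1024
    uncapped_mbps = bytes_in_flight * 1000.0 / latency_ms / 1e6
    return min(uncapped_mbps, ceiling_mbps)


if __name__ == "__main__":
    HYPOTHETICAL_LATENCY_MS = 1.0  # made-up figure, for illustration only
    for qd in (1, 2, 4, 8):
        mbps = modeled_throughput_mbps(qd, 128, HYPOTHETICAL_LATENCY_MS)
        print(f"QD{qd}: {mbps:.1f} MB/s")
```

Under these assumptions throughput scales linearly with queue depth until it hits the interface ceiling, which is why a drive that can't keep enough NAND busy on a single request looks slow at QD1 and fine at higher queue depths.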

To confirm that I wasn't crazy and the Vertex 4 was capable of high, real-world sequential read speeds I created a simple test. I took a 3GB archive and copied it from the Vertex 4 to a RAM drive (to eliminate any write speed bottlenecks). The Vertex 4's performance was very good:

Sequential Read - 3GB Archive Copy to RAM Disk

Clearly the Vertex 4 is capable of reading at very high rates, particularly when it matters. However, the current firmware doesn't seem tuned for any sort of low queue depth operation.

Both of these issues are apparently being worked on at the time of publication and should be rolled into the next firmware release for the drive (due out sometime in late April). Again, OCZ's aim was to deliver a high-end drive that could be offered as an alternative to the Vertex 3 as quickly as possible.

Update: Many have been reporting that the Vertex 4's performance is dependent on having an active partition on the drive due to its NCQ streaming support. While this is true, it's not the reason you'll see gains in synthetic tests like Iometer. If you don't fill the drive with valid data before conducting read tests, the Vertex 4 returns lower performance numbers. Running Iometer on a live partition requires that the drive is first filled with data before the benchmark runs, similar to what we do for our Iometer read tests anyway. The chart below shows the difference in performance between running an Iometer sequential read test on a physical disk (no partition), an NTFS partition on the same drive and finally the physical disk after all LBAs have been written to:

Notice how the NTFS and RAW+precondition lines are identical. That's because the performance gain here comes not from NCQ streaming but from the presence of valid data that you're reading back. Most SSDs tend to give unrealistically high performance numbers if you read from them immediately following a secure erase, so we always precondition our drives before running Iometer. The Vertex 4 just happens to do the opposite, but this has no bearing on real world performance as you'll always be reading actual files in actual use.
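The idea behind preconditioning can be sketched in a few lines of Python: sequentially write real data across the whole test region before reading any of it back. This is an illustration rather than the procedure used for the chart above. The helper name `precondition` is hypothetical, and it fills an ordinary file, which is not the same as writing every LBA of a raw disk (that requires administrator access and destroys everything on the drive).

```python
import os


def precondition(path, total_bytes, block_size=128 * 1024):
    """Sequentially fill `path` with `total_bytes` of random data; return bytes written.

    Random data is used so that the written blocks are incompressible.
    Hypothetical helper operating on a regular file, not a raw device.
    """
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(block_size, total_bytes - written)  # last block may be short
            f.write(os.urandom(n))
            written += n
        f.flush()
        os.fsync(f.fileno())  # push the data out of the OS cache to the drive
    return written
```

A read test run against the filled file then reads back valid data rather than never-written space.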

Despite the shortcomings with low queue depth sequential read performance, the Vertex 4 dominated our sequential write tests, even at low queue depths. Only the Samsung SSD 830 is able to compete:

Desktop Iometer - 128KB Sequential Write (4K Aligned)

Technically the SF-2281 drives equal the Vertex 4's performance, but that's only with highly compressible data. Large sequential writes are very often composed of already compressed data, which makes the real world performance advantage of the Vertex 4 tangible.
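The gap between compressible and incompressible data is easy to demonstrate. The sketch below uses Python's standard `zlib` module purely as a stand-in; it is not SandForce's compression engine, and the helper name `compression_ratio` is made up for this example. It compares a 128KB block of zeros with a 128KB block of random bytes, the latter behaving much like already-compressed archives or video.

```python
import os
import zlib

BLOCK_SIZE = 128 * 1024  # same transfer size as the sequential tests


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    compressible = b"\x00" * BLOCK_SIZE      # trivially compressible
    incompressible = os.urandom(BLOCK_SIZE)  # random bytes barely compress at all
    print(f"zeros:  {compression_ratio(compressible):.4f}")
    print(f"random: {compression_ratio(incompressible):.4f}")
```

The zero-filled block shrinks to a tiny fraction of its size while the random block does not shrink at all, which is why a controller that relies on compression only posts its best numbers on the first kind of data.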

Incompressible Sequential Write Performance - AS-SSD

AS-SSD gives us another taste of the performance of incompressible data, which again is very good on the Vertex 4. As far as writes are concerned, there's really no beating the Vertex 4.


127 Comments

  • elghosto - Wednesday, April 4, 2012 - link

    linux
    #fstrim
  • Per Hansson - Wednesday, April 4, 2012 - link

    "monitoring port of the SSD"
    Please enlighten me, Google was no help...
    Is it a hardware interface that allows you to see how the drive operates?
  • adamantinepiggy - Thursday, April 5, 2012 - link

    Basically, every SSD has some sort of real-time data port that allows engineers to monitor what is going on with the SSD, even when the drive hangs or has other issues. It is used mainly for development/testing. Consider it sorta like a way to read/access the dump file when Windows BSODs, except in this case it's on the SSD. This monitoring port gets disabled on released drive firmware and the hardware attachment leads are unattached.
  • jonup - Wednesday, April 4, 2012 - link

    Thanks for asking this! I always wanted to know that myself. I actually googled it to no avail while I was reading the article.
  • medys - Wednesday, April 4, 2012 - link

    How long till we are overclocking our SSD processors :-/
  • FunBunny2 - Wednesday, April 4, 2012 - link

    Umm. How you gonna fit that water cooler inside the case?
  • Iketh - Wednesday, April 4, 2012 - link

    hahaha... NEVER!! I've yet to break a Win7 installation from overclocking, but I broke XP many times... I shudder at the thought of overclocking an SSD :)
  • Iketh - Wednesday, April 4, 2012 - link

    Although, I wonder how long until the processors in SSDs reach, say, today's single-core Atom... OR better yet, how long before the SSD controller is built into the CPU much like the memory controller, where we install more storage the same way we install ram... and then later again the nand controller and RAM controller merge, and a computer is nothing more than a SoC with some nand sitting next to it...
  • iwod - Wednesday, April 4, 2012 - link

    We finally have a controller that is able to pump out MB faster than SandForce without using some silly compression engine. Marvell also announced its next-gen SSD controller.

    Again we have reached the limit of SATA 6Gbps, we will need to start thinking about SATA Express, Lower power consumption, reliability. etc...
  • akbo - Wednesday, April 4, 2012 - link

    Though I think the high consumption might be because of the controller, the chip is huge! With thermy sticky!

    Wonder when a die shrink of this is possible.
