Original Link: http://www.anandtech.com/show/6363/ocz-vector-review-256gb
OCZ Vector (256GB) Review
by Anand Lal Shimpi on November 27, 2012 9:10 PM EST
Times are changing at OCZ. There's a new CEO at the helm, and the company is now focused on releasing fewer products that have gone through more validation and testing than in years past. The hallmark aggressive nature that gave OCZ tremendous marketshare in the channel overstayed its welcome. The new OCZ is supposed to sincerely prioritize compatibility, reliability and general validation testing. Only time will tell if things have changed, but right off the bat there's a different aura surrounding my first encounter with OCZ's Vector SSD.
Gone are the handwritten notes that accompanied OCZ SSD samples in years past, replaced by a much more official looking letter:
The drive itself sees a brand new 7mm chassis. The aluminum colored enclosure features a new label. Only the bottom of the SSD looks familiar as the name, part number and other details are laid out in traditional OCZ fashion.
Under the hood the drive is all new. Vector uses the first home-grown SSD controller by OCZ. Although the Octane and Vertex 4 SSDs both used OCZ Indilinx branded silicon, they were both based on Marvell IP - the controller architecture was licensed, not designed in house. Vector on the other hand uses OCZ's brand new Barefoot 3 controller, designed entirely in-house.
Barefoot 3 is the result of three different teams all working together. OCZ's UK design team, staffed with engineers from the PLX acquisition, the Korea design team inherited after the Indilinx acquisition, and folks at OCZ proper in California all came together to bring Barefoot 3 and Vector to life.
The Barefoot 3 controller integrates an unnamed ARM Cortex core as well as an OCZ Aragon co-processor. OCZ isn't going into a lot of detail as to how these two cores interact or what they handle, but multi-core SoCs aren't anything new in the SSD space. A branded co-processor is a bit unusual, and I suspect that whatever is responsible for Vector's distinct performance has to do with this part of the SoC.
Architecturally, Barefoot 3 can talk to NAND across 8 parallel channels. The controller is paired with two DDR3L-1600 DRAMs, although there's a pad for a third DRAM for use in the case where parity is needed for ECC.
Hardware encryption is not presently supported, although OCZ tells us Barefoot 3 is more than fast enough to handle it should a customer demand the feature. Hardware encryption remains mostly unused and poorly executed on client drives, so its absence isn't too big of a deal in my opinion.
OCZ does its own NAND packaging, and as a result Vector is home to a sea of OCZ branded NAND devices. In reality you're looking at 25nm IMFT synchronous 2-bit-per-cell MLC NAND, just with an OCZ silkscreen on it. There's no NAND redundancy built in to the drive as OCZ is fairly comfortable with the error and failure rates at 25nm. The only spare area set aside is the same 6.8% we see on most client drives (e.g. a 256GB Vector offers 238GB usable space in Windows).
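The 6.8% spare area and the 238GB figure both come from the gap between binary and decimal capacity units. Here is a minimal Python sketch of that arithmetic. It assumes, which the review does not state outright, that the drive carries 256GiB of raw NAND and exposes 256GB (decimal) to the host; the function names are purely illustrative.

```python
# Sketch of the capacity arithmetic behind the spare area figure.
# Assumption (not stated in the review): raw NAND is 256 GiB (binary),
# while the user-visible capacity is 256 GB (decimal).
GIB = 2**30   # bytes in a binary gigabyte, which Windows labels "GB"
GB = 10**9    # bytes in a decimal gigabyte, as printed on the box


def spare_area_fraction(raw_nand_bytes, user_bytes):
    """Fraction of the raw NAND that is not exposed to the host."""
    return 1 - user_bytes / raw_nand_bytes


def windows_reported_capacity(user_bytes):
    """Capacity as Windows displays it (binary gigabytes)."""
    return user_bytes / GIB


raw = 256 * GIB
user = 256 * GB

print(f"spare area: {spare_area_fraction(raw, user):.2%}")
print(f"usable space in Windows: {windows_reported_capacity(user):.1f}")
```

Under these assumptions the spare area works out to just under 6.9% (quoted above as 6.8%) and the Windows-reported capacity to roughly 238.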
|Capacity||128GB||256GB||512GB|
|Sequential Read||550 MB/s||550 MB/s||550 MB/s|
|Sequential Write||400 MB/s||530 MB/s||530 MB/s|
|Random Read||90K IOPS||100K IOPS||100K IOPS|
|Random Write||95K IOPS||95K IOPS||95K IOPS|
|Active Power Use||2.25W||2.25W||2.25W|
|Idle Power Use||0.9W||0.9W||0.9W|
Regardless of capacity, OCZ is guaranteeing the Vector for up to 20GB of host writes per day for 5 years. The warranty on the Vector expires after 5 years or 36.5TB of writes, whichever comes first. As with most similar claims, the 20GB value is pretty conservative and based on a 4KB random write workload. With more realistic client workloads you can expect even more life out of the NAND.
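The 36.5TB limit is simply the daily allowance multiplied out over the warranty period. A small Python sketch of that calculation follows; the helper names are hypothetical, and it assumes 365-day years and decimal units (1TB = 1000GB).

```python
# Sketch of the warranty endurance arithmetic.
# Assumptions: 365-day years, decimal units (1 TB = 1000 GB).

def warranty_writes_tb(gb_per_day, years, days_per_year=365):
    """Total host writes covered by the warranty, in TB."""
    return gb_per_day * days_per_year * years / 1000


def years_until_write_cap(cap_tb, gb_per_day, days_per_year=365):
    """How long the write cap lasts at a given daily write volume."""
    return cap_tb * 1000 / (gb_per_day * days_per_year)


cap = warranty_writes_tb(20, 5)
print(cap)                             # 20 GB/day for 5 years
print(years_until_write_cap(cap, 40))  # a hypothetical 40 GB/day user
```

Doubling the daily write volume to a hypothetical 40GB halves the time before the write cap is reached, to 2.5 years, which is why the "whichever comes first" clause matters for heavy writers.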
Despite being built on a brand new SoC, there's a lot of firmware carryover from Vertex 4. Indeed if you look at the behavior of Vector, it is a lot like a much faster Vertex 4. OCZ does continue to use its performance mode that enables faster performance if less than 50% of the drive's capacity is used; in practice, however, OCZ seems to rely on it less than in the Vertex 4.
The design cycle for Vector is the longest OCZ has ever endured. It took OCZ 18 months to bring the Vector SSD to market, compared to less than 12 months for previous designs. The additional time was used not only to coordinate teams across the globe, but also to put Vector through more testing and validation than any previous OCZ SSD. It's impossible to guarantee a flawless drive, but doing considerably more testing can't hurt.
The Vector is available starting today in 128GB, 256GB and 512GB capacities. Pricing is directly comparable to Samsung's 840 Pro:
|OCZ Vector Pricing (MSRP)|
|Samsung SSD 840 Pro||$99.99||$149.99||$269.99||$599.99|
OCZ is a bit more aggressive on its 512GB MSRP, otherwise it's very clear what OCZ views as Vector's immediate competition.
Random Read/Write Speed
The four corners of SSD performance are as follows: random read, random write, sequential read and sequential write speed. Random accesses are generally small in size, while sequential accesses tend to be larger and thus we have the four Iometer tests we use in all of our reviews.
Our first test writes 4KB in a completely random pattern over an 8GB space of the drive to simulate the sort of random access that you'd see on an OS drive (even this is more stressful than a normal desktop user would see). I perform three concurrent IOs and run the test for 3 minutes. The results reported are in average MB/s over the entire time. We use both standard pseudo randomly generated data for each write as well as fully random data to show you both the maximum and minimum performance offered by SandForce based drives in these tests. The average performance of SF drives will likely be somewhere in between the two values for each drive you see in the graphs. For an understanding of why this matters, read our original SandForce article.
Low queue depth random read performance sees a significant regression compared to the Vertex 4. OCZ derives the Vector's specs at a queue depth of 32, at which it'll push 373MB/s of 4KB random reads. As Intel has established in the past, low queue depth random read performance of around 40 - 50MB/s is sufficient for most client workloads as we'll soon see in our trace based storage bench suite.
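Because the spec sheet quotes IOPS while these charts report MB/s, it helps to convert between the two. The Python sketch below does so under two assumptions that the review does not spell out: a 4KB transfer is 4096 bytes, and 1MB is 10^6 bytes. If the benchmark actually reports binary megabytes, the IOPS figures come out about 5% higher.

```python
# Sketch: converting between IOPS and throughput for fixed-size transfers.
# Assumptions: 4KB = 4096 bytes, 1 MB = 10**6 bytes.

def iops_to_mbps(iops, io_bytes=4096):
    """Throughput in MB/s for a given IOPS at a fixed transfer size."""
    return iops * io_bytes / 1e6


def mbps_to_iops(mbps, io_bytes=4096):
    """IOPS needed to sustain a given MB/s at a fixed transfer size."""
    return mbps * 1e6 / io_bytes


print(iops_to_mbps(100_000))     # rated 100K IOPS random read, in MB/s
print(round(mbps_to_iops(373)))  # measured 373MB/s at QD32, in IOPS
print(round(mbps_to_iops(40)), round(mbps_to_iops(50)))  # the 40 - 50MB/s range
```

Under these assumptions the measured 373MB/s corresponds to roughly 91K IOPS, a little short of the 100K rating, and the 40 - 50MB/s range that suffices for client workloads is only about 10K to 12K IOPS.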
Low queue depth random write performance is a very different story: here the Vector pretty much equals the Vertex 4's already excellent score.
Many of you have asked for random write performance at higher queue depths. What I have below is our 4KB random write test performed at a queue depth of 32 instead of 3. While the vast majority of desktop usage models experience queue depths of 0 - 5, higher depths are possible in heavy I/O (and multi-user) workloads:
Crank up the queue depth and the Vector does well, but Samsung's SSD 840 Pro manages a nearly 10% performance advantage here.
Steady State 4KB Random Write Performance
OCZ will surely derive enterprise versions of the Vector and its Barefoot 3 controller, but I was curious to see what steady state 4KB random write performance looked like on the drive. I grabbed some of our Enterprise Iometer results from the S3700 review and trimmed out the non-SATA drives. The results are hugely improved compared to the Vertex 4:
Keep in mind this isn't an enterprise drive, so it's not too surprising to see significantly higher numbers here from the enterprise drives, but the improvement over the Vertex 4 is substantial. Note that Samsung's SSD 840 Pro lands somewhere in between the Vector and Vertex 4.
Sequential Read/Write Speed
To measure sequential performance I ran a 1 minute long 128KB sequential test over the entire span of the drive at a queue depth of 1. The results reported are in average MB/s over the entire test length.
Low queue depth sequential read performance puts the Vector among the better drives, but still slightly behind Samsung.
Write performance continues to be the Vector's strong suit; here only Intel's SSD 520 with easily compressed data pulls ahead.
AS-SSD Incompressible Sequential Performance
The AS-SSD sequential benchmark uses incompressible data for all of its transfers. The result is a pretty big reduction in sequential write speed on SandForce based controllers.
High queue depth sequential IO shows significant clustering at the top of the charts thanks to the limits of 6Gbps SATA. The Vector pushes performance pretty much as fast as possible here.
Switching to writes does shake loose some of the weaker competitors, but the Vector and 840 Pro still emerge as the strongest. Corsair's Neutron GTX does very well here.
Performance vs. Transfer Size
ATTO does a good job of showing us how sequential performance varies with transfer size. Most controllers optimize for commonly seen transfer sizes and neglect the rest. The optimization around 4KB, 8KB and 128KB transfers makes sense given that's what most workloads are bound by, but it's always important to understand how a drive performs across the entire gamut.
Vector clearly attempts to shift the Vertex 4's performance curve up and towards that of Samsung's SSD 840 Pro. There's a fairly repeatable anomaly at 32KB and 64KB where performance drops down to Vertex 4 levels, but generally speaking there's a tangible improvement across the board. My guess is whatever is happening at 32KB and 64KB is a bug though. Barefoot 3 has no issues parallelizing workloads that are smaller.
I would still like to see improved 512B transfer performance, but other than Samsung it doesn't look like anyone is really focusing on smaller-than-4KB performance anymore. Even Intel has pretty much abandoned focusing on it with its S3700 controller. I may just have to give up caring about it. Smaller than 4KB performance really doesn't impact most client workloads, it's really the weird corner cases where it would matter. Just don't go off and use any of these drives under Windows XP and you'll be fine.
When it comes to write performance, OCZ has delivered a solution that seems to be a hair quicker than the 840 Pro in many of the smaller transfer sizes, and a lot faster once we get to the larger block sizes. Performance vs. the Vertex 4 is clearly improved, and there's only a mild indication of whatever weird issue was happening in the read test.
In our Intel SSD DC S3700 review I introduced a new method of characterizing performance: looking at the latency of individual operations over time. The S3700 promised a level of performance consistency that was unmatched in the industry, and as a result needed some additional testing to show that. The reason we don't have consistent IO latency with SSDs is because inevitably all controllers have to do some amount of defragmentation or garbage collection in order to continue operating at high speeds. When and how an SSD decides to run its defrag and cleanup routines directly impacts the user experience. Frequent (borderline aggressive) cleanup generally results in more stable performance, while delaying that can result in higher peak performance at the expense of much lower worst case performance. The graphs below tell us a lot about the architecture of these SSDs and how they handle internal defragmentation.
To generate the data below I took a freshly secure erased SSD and filled it with sequential data. This ensures that all user accessible LBAs have data associated with them. Next I kicked off a 4KB random write workload at a queue depth of 32 using incompressible data. I ran the test for just over half an hour, nowhere near what we run our steady state tests for but enough to give me a good look at drive behavior once all spare area filled up.
I recorded instantaneous IOPS every second for the duration of the test. I then plotted IOPS vs. time and generated the scatter plots below. Each set of graphs features the same scale. The first two sets use a log scale for easy comparison, while the last set of graphs uses a linear scale that tops out at 40K IOPS for better visualization of differences between drives.
The first set of graphs shows the performance data over the entire 2000 second test period. In these charts you'll notice an early period of very high performance followed by a sharp dropoff. What you're seeing in that case is the drive allocating new blocks from its spare area, then eventually using up all free blocks and having to perform a read-modify-write for all subsequent writes (write amplification goes up, performance goes down).
The second set of graphs zooms in to the beginning of steady state operation for the drive (t=1400s). The third set also looks at the beginning of steady state operation but on a linear performance scale.
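For readers who want to put numbers on what the scatter plots show, the per-second IOPS samples can be reduced to a few summary statistics. The Python sketch below is not the tooling used for this review. The function name, the use of the coefficient of variation as a consistency metric, and the sample data are all assumptions for illustration. Only the once-per-second sampling, the 2000 second run and the t=1400s steady state cutoff come from the methodology described above.

```python
# Sketch: summarizing per-second IOPS samples from a consistency run.
# The metric choice (coefficient of variation) and the data are illustrative.
from statistics import mean, pstdev


def summarize_steady_state(iops_per_second, steady_start=1400):
    """Summarize samples from steady_start seconds onward.

    iops_per_second: list where index i holds the IOPS measured at second i.
    Returns min, max, mean and coefficient of variation (stddev / mean);
    a lower coefficient of variation means more consistent performance.
    """
    steady = iops_per_second[steady_start:]
    if not steady:
        raise ValueError("no samples at or after steady_start")
    average = mean(steady)
    return {
        "min": min(steady),
        "max": max(steady),
        "mean": average,
        "cv": pstdev(steady) / average,
    }


# Invented data: 1400 s of fresh-drive performance, then 600 s alternating
# between two levels as garbage collection kicks in.
samples = [90_000] * 1400 + [30_000, 10_000] * 300
summary = summarize_steady_state(samples)
print(summary)
```

A drive whose steady state points form a tight band would show a coefficient of variation near zero, while the invented data here, which swings between two levels, produces 0.5.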
Here we see a lot of the code re-use between the Vector and Vertex 4 firmware. Vector performs like a faster Vertex 4, with all of its datapoints shifted up in the graph. The distribution of performance is a bit tighter than on the Vertex 4 and performance is definitely more consistent than the 840 Pro. The S3700 is obviously in a league of its own here, but I do hope that over time we'll see similarly consistent drives from other vendors.
The next set of charts look at the steady state (for most drives) portion of the curve. Here we'll get some better visibility into how everyone will perform over the long run.
The source data is the same, we're just focusing on a different part of the graph. Here the Vector actually looks pretty good compared to all non-S3700 drives. In this case the Vector's performance distribution looks a lot like SandForce. There's a clear advantage again over the 840 Pro and Vertex 4.
The final set of graphs abandons the log scale entirely and just looks at a linear scale that tops out at 40K IOPS. We're also only looking at steady state (or close to it) performance here:
If we look at the tail end of the graph with a linear scale, we get a taste of just how varied IO latency can be with most of these drives. Vector looks much more spread out than the Vertex 4, but that's largely a function of the fact that its performance is just so much higher without an equivalent increase in aggressive defrag/GC routines. The 840 Pro generally manages lower performance in this worst case scenario. The SandForce based Intel SSD 330 shows a wide range of IO latencies but overall performance is much better. Had SandForce not been plagued by so many poorly handled reliability issues it might have been a better received option today.
From an IO consistency perspective, the Vector looks a lot like a better Vertex 4 or 840 Pro. Architecturally I wouldn't be too surprised if OCZ's method of NAND mapping and flash management were very similar to Samsung's, which isn't a bad thing at all. I would like to see more emphasis placed on S3700-style IO consistency though. I do firmly believe that the first company to deliver IO consistency for the client space will reap serious rewards.
AnandTech Storage Bench 2011
Two years ago we introduced our AnandTech Storage Bench, a suite of benchmarks that took traces of real OS/application usage and played them back in a repeatable manner. I assembled the traces myself out of frustration with the majority of what we have today in terms of SSD benchmarks.
Although the AnandTech Storage Bench tests did a good job of characterizing SSD performance, they weren't stressful enough. All of the tests performed less than 10GB of reads/writes and typically involved only 4GB of writes specifically. That's not even enough to exceed the spare area on most SSDs. Most canned SSD benchmarks don't even come close to writing a single gigabyte of data, but that doesn't mean that simply writing 4GB is acceptable.
Originally I kept the benchmarks short enough that they wouldn't be a burden to run (~30 minutes) but long enough that they were representative of what a power user might do with their system.
Not too long ago I tweeted that I had created what I referred to as the Mother of All SSD Benchmarks (MOASB). Rather than only writing 4GB of data to the drive, this benchmark writes 106.32GB. It's the load you'd put on a drive after nearly two weeks of constant usage. And it takes a *long* time to run.
1) The MOASB, officially called AnandTech Storage Bench 2011 - Heavy Workload, mainly focuses on the times when your I/O activity is the highest. There is a lot of downloading and application installing that happens during the course of this test. My thinking was that it's during application installs, file copies, downloading and multitasking with all of this that you can really notice performance differences between drives.
2) I tried to cover as many bases as possible with the software I incorporated into this test. There's a lot of photo editing in Photoshop, HTML editing in Dreamweaver, web browsing, game playing/level loading (Starcraft II & WoW are both a part of the test) as well as general use stuff (application installing, virus scanning). I included a large amount of email downloading, document creation and editing as well. To top it all off I even use Visual Studio 2008 to build Chromium during the test.
The test has 2,168,893 read operations and 1,783,447 write operations. The IO breakdown is as follows:
|AnandTech Storage Bench 2011 - Heavy Workload IO Breakdown|
|IO Size||% of Total|
Only 42% of all operations are sequential, the rest range from pseudo to fully random (with most falling in the pseudo-random category). Average queue depth is 4.625 IOs, with 59% of operations taking place in an IO queue of 1.
Many of you have asked for a better way to really characterize performance. Simply looking at IOPS doesn't really say much. As a result I'm going to be presenting Storage Bench 2011 data in a slightly different way. We'll have performance represented as Average MB/s, with higher numbers being better. At the same time I'll be reporting how long the SSD was busy while running this test. These disk busy graphs will show you exactly how much time was shaved off by using a faster drive vs. a slower one during the course of this test. Finally, I will also break out performance into reads, writes and combined. The reason I do this is to help balance out the fact that this test is unusually write intensive, which can often hide the benefits of a drive with good read performance.
There's also a new light workload for 2011. This is a far more reasonable, typical every day use case benchmark. Lots of web browsing, photo editing (but with a greater focus on photo consumption), video playback as well as some application installs and gaming. This test isn't nearly as write intensive as the MOASB but it's still multiple times more write intensive than what we were running in 2010.
As always I don't believe that these two benchmarks alone are enough to characterize the performance of a drive, but hopefully along with the rest of our tests they will help provide a better idea.
The testbed for Storage Bench 2011 has changed as well. We're now using a Sandy Bridge platform with full 6Gbps support for these tests.
AnandTech Storage Bench 2011 - Heavy Workload
We'll start out by looking at average data rate throughout our new heavy workload test:
Here it is. OCZ's Vector comes within 4% of Samsung's SSD 840 Pro and manages a 22% increase in performance compared to the Vertex 4. The breakdown shows Vector's strong write performance is really what pushes it over the edge. At the same time, OCZ has finally addressed whatever poor read performance issues plagued the Vertex 4 in our test - the Vector is a different beast here.
The next three charts just represent the same data, but in a different manner. Instead of looking at average data rate, we're looking at how long the disk was busy for during this entire test. Note that disk busy time excludes any and all idles, this is just how long the SSD was busy doing something:
AnandTech Storage Bench 2011 - Light Workload
Our new light workload actually has more write operations than read operations. The split is as follows: 372,630 reads and 459,709 writes. The relatively close read/write ratio does better mimic a typical light workload (although even lighter workloads would be far more read centric).
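To compare the read/write mix of the two workloads, the operation counts quoted for each can be turned into percentages. A short Python sketch follows; the function name is illustrative, and the counts are the ones given for the heavy and light workloads in this review.

```python
# Sketch: read/write mix of the two Storage Bench 2011 workloads,
# computed from the operation counts quoted in the review.

def rw_mix(reads, writes):
    """Return (read_share, write_share) as fractions of all operations."""
    total = reads + writes
    return reads / total, writes / total


heavy = rw_mix(2_168_893, 1_783_447)
light = rw_mix(372_630, 459_709)

print(f"heavy: {heavy[0]:.1%} reads, {heavy[1]:.1%} writes")
print(f"light: {light[0]:.1%} reads, {light[1]:.1%} writes")
```

By operation count the heavy workload is slightly read-dominated (about 55% reads), while the light workload flips to about 55% writes. Note that this counts operations, not bytes transferred.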
The I/O breakdown is similar to the heavy workload at small IOs, however you'll notice that there are far fewer large IO transfers:
|AnandTech Storage Bench 2011 - Light Workload IO Breakdown|
|IO Size||% of Total|
Our light workload remains Samsung's safe haven with the 840 Pro. OCZ's Vector does improve performance considerably over the Vertex 4 (+25%) but Samsung manages a 16% performance advantage here with the 840 Pro.
Over time SSDs can get into a fairly fragmented state, with pages distributed randomly all over the LBA range. TRIM and the naturally sequential nature of much client IO can help clean this up by forcing blocks to be recycled and as a result become less fragmented. Leaving as much free space as possible on your drive helps keep performance high (20% is a good number to shoot for), but it's always good to see how bad things can get before the GC/TRIM routines have a chance to operate. As always I filled all user addressable LBAs with data, wrote enough random data to the drive to fill the spare area and then some, then ran a single HD Tach pass to visualize how slow things got:
As we showed in our enterprise results, Vector's steady state 4KB random write performance is around 33MB/s. The worst case sequential performance here is around 50MB/s, which is in line with what you'd expect. Sequential writes do improve performance, but as with most SSDs you're best operating the Vector with a bit of spare area left on the drive (in addition to what's already set aside by firmware).
TRIM and another sequential pass restore performance to normal, but it also triggers the Vector's performance mode penalty:
At 50% capacity there's an internal reorganization routine that's triggered on Vector, similar to what happens on the Vertex 4. During this time, all performance is impacted, which is why you see a sharp drop in performance just before the 135GB mark. The re-org routine only takes a few minutes. I went back and measured sequential write performance after this test and came back with 380MB/s in Iometer. In other words, don't be startled by the graph above - it's expected behavior, it just looks bad as the drive doesn't get a chance to run its background operations in peace.
With the Vertex 4, OCZ had a fairly power hungry drive on its hands. Thankfully that's well addressed by the Vector and Barefoot 3. Idle power remains higher than I would like; Samsung clearly has the advantage there. Under load however, the Vector is indistinguishable from Samsung's SSD 840 Pro, which is quite remarkable. Only slower drives or SandForce based solutions faced with highly compressible data can offer better load power consumption. In practice, I'd expect OCZ's Vector to be among the most power efficient, high performance drives on the market under load.
With the Vector, OCZ has built a price and performance competitor to Samsung's SSD 840 Pro, which previously remained peerless at the top of our charts. For a company that just weeks ago was considered down and out for the count, this is beyond impressive. Samsung has emerged as one of the strongest players in the consumer SSD space, and OCZ appears ready to challenge it. In our tests, Samsung typically enjoys better peak performance, but OCZ's Vector appears to have the advantage when it comes to worst case performance and IO consistency. The latter tends to be more valuable in improving overall user experience in my opinion. I would still like to see an S3700-class client drive and I'd be willing to give up top-end performance to get there, but I suspect that's a tall order for now.
The Vector's power consumption under load, given the performance it's able to deliver, is excellent. I wish idle power consumption were better; as it stands, the 830/840 Pro is a better fit for ultra mobile applications. But under load the Vector and 840 Pro are indistinguishable from one another.
The only downside to the Vector really is its price, which like the 840 Pro is at a definite premium vs competition from the previous generation. As with all SSDs however, I fully expect Barefoot 3 and maybe even the Vector itself to fall in price over time. If you want the latest and greatest available today, Samsung's 840 Pro now has competition in OCZ's Vector.
The Barefoot 3 controller is quite promising. It certainly seems very capable from a performance standpoint without blowing through its power budget. It's no small feat if OCZ's best in-house silicon can be spoken of in the same sentence as Samsung's. The PLX and Indilinx acquisitions appear to have paid off. I'm curious to see how OCZ's improved validation and reliability testing fare in the long run. This isn't the first time that OCZ has promised to focus more on validation, but with Vector I do get the feeling that things are different. I didn't run into any compatibility issues or reliability problems with the Vector in my testing, but as always the proof is what happens when these drives make their way into the hands of end users.
Overall I'm impressed by the Vector. It's a huge improvement over the already good Vertex 4, and manages to compete in a different league by fixing some lingering performance issues with its predecessor. I had resigned myself to assuming no one would come close to Samsung on the high-end, but it's good to be proven wrong. Should OCZ be able to deliver Samsung-like performance and reliability, then I'll really be impressed.