Original Link: http://www.anandtech.com/show/4159/ocz-vertex-3-pro-preview-the-first-sf2500-ssd
OCZ Vertex 3 Pro Preview: The First SF-2500 SSD
by Anand Lal Shimpi on February 17, 2011 3:01 AM EST
For the past six months I've been working on research and testing for the next major AnandTech SSD article. I figured I had enough time to line up its release with the first samples of the next-generation of high end SSDs. After all, it seems like everyone was taking longer than expected to bring out their next-generation controllers. I should've known better.
At CES this year we had functional next-generation SSDs based on Marvell and SandForce controllers. The latter was actually performing pretty close to expectations from final hardware. Although I was told that drives wouldn't be shipping until mid-Q2, it was clear that preview hardware was imminent. It was the timing that I couldn't predict.
A week ago, two days before I hopped on a flight to Barcelona for MWC, a package arrived at my door. OCZ had sent me a preproduction version of their first SF-2500 based SSD: the Vertex 3 Pro. The sample was so early that it didn't even have a housing, all I got was a PCB and a note:
Two days isn't a lot of time to test an SSD. It's enough to get a good idea of overall performance, but not enough to find bugs and truly investigate behavior. Thankfully the final release of the drive is still at least 1 - 2 months away, so this article can serve as a preview.
I've covered how NAND Flash works numerous times in the past, but I'll boil it all down to a few essentials.
NAND Flash is non-volatile memory: you can write to it and it'll store a charge even if you remove power from the device. Erase the NAND too many times and it will stop being able to hold a charge. There are two types of NAND that we deal with: single-level cell (SLC) and multi-level cell (MLC). Both are physically the same; you just store more data in the latter, which drives costs, performance and reliability down. Two-bit MLC is what's currently used in consumer SSDs; the 3-bit stuff you've seen announced is only suitable for USB sticks, SD cards and other similar media.
Writes to NAND happen at the page level (4KB or 8KB depending on the type of NAND), however you can't erase a single page. You can only erase groups of pages at a time in a structure called a block (usually 128 or 256 pages). Each cell in NAND can only be erased a finite number of times so you want to avoid erasing as much as possible. The way you get around this is by keeping data in NAND as long as possible until you absolutely have to erase it to make room for new data. SSD controllers have to balance the need to optimize performance with the need to write evenly to all NAND pages.
Conventional controllers do this by keeping very large tables that track all data being written to the drive and optimize writes for performance and reliability. The controller will group small random writes together and attempt to turn them into large sequential writes that are easier to burst across all of the NAND devices. Smart controllers will even attempt to reorganize data while writing in order to keep performance high for future writes. All of this requires the controller to keep track of lots of data, which in turn requires the use of large caches and DRAMs to make accessing that data quick. All of this work is done to ensure that the controller only writes data it absolutely needs to write.
SandForce's approach has the same end goal, but takes a very different path to get there. Rather than trying to figure out what to do with the influx of data, SandForce's approach simply writes less data to the NAND. Using realtime compression and data deduplication techniques, SandForce's controllers attempt to reduce the size of what the host is writing to the drive. The host still thinks all of its data is being written to the drive, but once the writes hit the controller, the controller attempts to reduce the data as much as possible.
The compression/deduplication is done in realtime and what results is potentially awesome performance. Writing less data is certainly faster than writing everything. Similar technologies are employed by enterprise SAN solutions, but SandForce's algorithms are easily applicable to the consumer world. With the exception of large, highly compressed multimedia files (think videos, MP3s) most of what you write to your HDD/SSD is pretty easily compressible.
You don't get any extra space with SandForce's approach; the drive still has to accommodate the same number of LBAs as it advertises to the OS. After all, you could write purely random data to the drive, in which case it'd behave like a normal SSD without any of its superpowers.
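To make the compressible versus incompressible distinction concrete, here is a minimal sketch in Python. It uses zlib purely as a stand-in, since SandForce's actual compression and deduplication algorithms are proprietary and not described here. The helper name `stored_fraction` and both sample payloads are assumptions made for illustration.

```python
import random
import zlib


def stored_fraction(payload: bytes) -> float:
    """Return compressed size divided by original size for one host write."""
    return len(zlib.compress(payload)) / len(payload)


# Highly repetitive data, standing in for text, logs or application files.
compressible = b"AnandTech SSD article " * 4096

# Seeded pseudo-random bytes, standing in for video, MP3s or encrypted data.
rng = random.Random(0)
incompressible = bytes(rng.getrandbits(8) for _ in range(len(compressible)))

print(f"compressible:   {stored_fraction(compressible):.3f}")
print(f"incompressible: {stored_fraction(incompressible):.3f}")
```

The repetitive payload shrinks to a small fraction of its size, while the random payload does not shrink at all. That second case is the one where a drive built on this idea has to write everything the host sends.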
Since the drive isn't storing your data bit for bit but rather storing hashes, it's easier for SandForce to do things like encrypt all of the writes to the NAND (which it does by default). By writing less, SandForce also avoids having to use a large external DRAM - its designs don't have any DRAM cache. SandForce also claims that its write-less approach lets it use less reliable NAND. To ensure reliability, the controller actually writes some amount of redundant data: data is written across multiple NAND die in parallel along with additional parity data. The parity data occupies the space of a single NAND die. As a result, SandForce drives set aside more spare area than conventional controllers.
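The idea of spending one die's worth of space on parity can be sketched with simple XOR parity, the same trick RAID 5 uses. This is an assumption for illustration only: how SandForce actually computes its redundant data is not stated here. The function name `xor_parity` and the four tiny chunks are hypothetical.

```python
from functools import reduce


def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR equal-length chunks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# One chunk per data die (four hypothetical die, 8 bytes each for brevity).
die_data = [b"chunk-00", b"chunk-01", b"chunk-02", b"chunk-03"]
parity = xor_parity(die_data)

# Simulate losing die 2, then rebuild it from the survivors plus parity.
survivors = die_data[:2] + die_data[3:]
rebuilt = xor_parity(survivors + [parity])
print(rebuilt == die_data[2])
```

With this kind of scheme the parity always takes as much room as one data chunk, no matter how many die it protects, which is consistent with the parity data occupying the space of a single die.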
Everything I've described up to this point applies to both the previous generation (SF-1200/1500) and the new generation (SF-2200/2500) of SandForce controllers. Now let's go over what's new:
1) Toggle Mode & ONFI 2 NAND support. Higher bandwidth NAND interfaces mean we should see much better performance without any architectural changes.
2) To accommodate the higher bandwidth NAND, SandForce increased the size of on-chip memories and buffers and doubled the number of NAND die that can be active at one time. Finally there's native 6Gbps support to remove any interface bottlenecks. Both 1 & 2 will manifest as much higher read/write speed.
3) Better encryption. This is more of an enterprise feature but the SF-2000 controllers support AES-256 encryption across the drive (and double encryption to support different encryption keys for separate address ranges on the drive).
4) Better ECC. NAND densities and defect rates are going up, program/erase cycles are going down. The SF-2000 as a result has an improved ECC engine.
All of the other features that were present in the SF-1200/1500 are present in the SF-2000 series.
The Unmentionables: NAND Mortality Rate
When Intel introduced its X25-M based on 50nm NAND technology we presented this slide:
A 50nm MLC NAND cell can be programmed/erased 10,000 times before it's dead. The reality is good MLC NAND will probably last longer than that, but 10,000 program/erase cycles was the spec. Update: Just to clarify, once you exceed the program/erase cycles you don't lose your data, you just stop being able to write to the NAND. On standard MLC NAND your data should be intact for a full year after you hit the maximum number of p/e cycles.
When we transitioned to 34nm, the NAND makers forgot to mention one key fact. MLC NAND no longer lasts 10,000 cycles at 34nm - the number is now down to 5,000 program/erase cycles. The smaller you make these NAND structures, the harder it is to maintain their integrity over thousands of program/erase cycles. While I haven't seen datasheets for the new 25nm IMFT NAND, I've heard the consumer SSD grade stuff is expected to last somewhere between 3000 - 5000 cycles. This sounds like a very big problem.
Thankfully, it's not.
My personal desktop sees about 7GB of writes per day. That can be pretty typical for a power user and a bit high for a mainstream user but it's nothing insane.
Here's some math I did not too long ago:
| NAND Flash Capacity | 256 GB |
| Formatted Capacity in the OS | 238.15 GB |
| Available Space After OS and Apps | 185.55 GB |
| Spare Area | 17.85 GB |
If I never install another application and just go about my business, my drive has 203.4GB of space to spread out those 7GB of writes per day. That means that in roughly 29 days, if my SSD wear levels perfectly, I will have written to every single available flash block on my drive. Tack on another 7 days if the drive is smart enough to move my static data around to wear level even more properly. So we're at approximately 36 days before I exhaust one out of my ~10,000 write cycles. Multiply that out and it would take 360,000 days of using my machine for all of my NAND to wear out; once again, assuming perfect wear leveling. That's 986 years. Your NAND flash cells will actually lose their charge well before that time comes, in about 10 years.
Now that calculation is based on 50nm 10,000 p/e cycle NAND. What about 34nm NAND with only 5,000 program/erase cycles? Cut the time in half - 180,000 days. If we're talking about 25nm with only 3,000 p/e cycles the number drops to 108,000 days.
This assumes perfect wear leveling and no write amplification. The best SSDs don't average more than 10x for write amplification; in fact they're considerably lower. But even if the drive writes 10x as much to the NAND as the host writes to the drive, even the worst 25nm compute NAND will last you well throughout your drive's warranty.
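The arithmetic above is easy to reproduce. The sketch below uses the article's own figures (203.4GB of free space plus spare area, 7GB of host writes per day, and a rounded 36 days per full pass over the NAND). The function name `lifetime_days` and its parameters are assumptions made for this example, and the model still assumes perfect wear leveling.

```python
DAYS_PER_YEAR = 365

free_plus_spare_gb = 185.55 + 17.85   # space available to absorb new writes
host_writes_gb_per_day = 7.0

# Days needed to write once to every free block: roughly 29.
days_free_blocks = free_plus_spare_gb / host_writes_gb_per_day


def lifetime_days(pe_cycles: int,
                  days_per_cycle: float = 36.0,
                  write_amplification: float = 1.0) -> float:
    """Days until all p/e cycles are used, assuming perfect wear leveling."""
    return pe_cycles * days_per_cycle / write_amplification


for label, cycles in [("50nm", 10_000), ("34nm", 5_000), ("25nm", 3_000)]:
    days = lifetime_days(cycles)
    print(f"{label}: {days:,.0f} days ({days / DAYS_PER_YEAR:,.0f} years)")

# Worst case discussed above: 3,000 cycles and 10x write amplification.
worst = lifetime_days(3_000, write_amplification=10.0)
print(f"25nm at 10x WA: {worst:,.0f} days ({worst / DAYS_PER_YEAR:.1f} years)")
```

Even the pessimistic last line works out to decades of desktop use, which is why wear is not the thing to worry about within a warranty period.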
For a desktop user running a desktop (non-server) workload, the chances of your drive dying within its warranty period due to you wearing out all of the NAND are basically nothing. Note that this doesn't mean that your drive won't die for other reasons before then (e.g. poor manufacturing, controller/firmware issues, etc...), but you don't really have to worry about your NAND wearing out.
This is all in theory, but what about in practice?
Thankfully one of the unwritten policies at AnandTech is to actually use anything we recommend. If we're going to suggest you spend your money on something, we're going to use it ourselves. Not in testbeds, but in primary systems. Within the company we have 5 SandForce drives deployed in real, every day systems. The longest of which has been running, without TRIM, for the past eight months at between 90 and 100% of its capacity.
SandForce, like some other vendors, exposes a method of actually measuring write amplification and remaining p/e cycles on its drives. Unfortunately the method of doing so for SandForce is undocumented and under strict NDA. I wish I could share how it's done, but all I'm allowed to share are the results.
Remember that write amplification is the ratio of NAND writes to host writes. On all non-SF architectures that number should be greater than 1 (e.g. you go to write 4KB but you end up writing 128KB). Due to SF's real time compression/dedupe engine, it's possible for SF drives to have write amplification below 1.
So how did our drives fare?
The worst write amplification we saw was around 0.6x. Actually, most of the drives we've deployed in house came in at 0.6x. In this particular drive the user (who happened to be me) wrote 1900GB to the drive (roughly 7.7GB per day over 8 months) and the SF-1200 controller in turn threw away 800GB and only wrote 1100GB to the flash. This includes garbage collection and all of the internal management stuff the controller does.
Over this period of time I used only 10 cycles of flash (it was a 120GB drive) out of a minimum of 3000 available p/e cycles. In eight months I only used 1/300th of the lifespan of the drive.
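For reference, the numbers above can be checked with a few lines of Python. The helper name `write_amplification` is made up for this sketch, and estimating cycles used as NAND writes divided by the drive's 120GB capacity is a simplifying assumption, not the undocumented method the drive itself reports.

```python
def write_amplification(nand_writes: float, host_writes: float) -> float:
    """Ratio of data written to the NAND to data written by the host."""
    return nand_writes / host_writes


# The SF-1200 drive described above: 1900GB from the host, 1100GB to flash.
sf_wa = write_amplification(nand_writes=1100, host_writes=1900)

# The conventional example from earlier: a 4KB write that costs 128KB.
conventional_wa = write_amplification(nand_writes=128, host_writes=4)

# Rough cycle estimate: total NAND writes spread evenly over a 120GB drive.
cycles_used = 1100 / 120
lifespan_used = cycles_used / 3000

print(f"SandForce write amplification:    {sf_wa:.2f}x")
print(f"Conventional write amplification: {conventional_wa:.0f}x")
print(f"Estimated cycles used: {cycles_used:.1f} of 3000 ({lifespan_used:.2%})")
```

The simple division lands a little over 9 cycles, in line with the roughly 10 cycles and 1/300th of the drive's lifespan quoted above.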
The other drives we had deployed internally are even healthier. It turns out I'm a bit of a write hog.
Paired with a decent SSD controller, write lifespan is a non-issue. Note that I only fold Intel, Crucial/Micron/Marvell and SandForce into this category. Write amplification goes up by up to an order of magnitude with the cheaper controllers. Characterizing this is what I've been spending much of the past six months doing. I'm still not ready to present my findings but as long as you stick with one of these aforementioned controllers you'll be safe, at least as far as NAND wear is concerned.
Today: Toshiba 32nm Toggle NAND, Tomorrow: IMFT 25nm
The Vertex 3 Pro sample I received is a drive rated at 200GB with 256GB of NAND on-board. The SF-2682 controller is still an 8-channel architecture and OCZ populates all 8 channels with a total of 16 NAND devices. OCZ selected Toshiba 32nm Toggle Mode MLC NAND for these early Vertex 3 Pro samples however final shipping versions might transition to IMFT 25nm. The consumer version (Vertex 3) will use IMFT 25nm for sure.
Each of the 16 NAND devices on board is 16GB in size. Each package is made up of four die (4GB a piece) and two planes per die (2GB per plane). Page sizes have changed. The older 34nm Intel NAND used a 4KB page size and a 1MB block size. For Toshiba's 32nm Toggle NAND pages are now 8KB and block size remains unchanged. The move to 25nm will finally double block size as well.
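The geometry above can be sanity checked with a short sketch. The constant and function names are hypothetical. The 25nm line assumes 8KB pages combined with a doubled 2MB block, which is how I read "double block size as well" rather than a figure taken from a datasheet.

```python
KB = 1024
MB = 1024 * KB


def pages_per_block(page_size: int, block_size: int) -> int:
    """Number of pages that share one erase block."""
    return block_size // page_size


intel_34nm = pages_per_block(4 * KB, 1 * MB)     # 4KB pages, 1MB blocks
toshiba_32nm = pages_per_block(8 * KB, 1 * MB)   # 8KB pages, 1MB blocks
imft_25nm = pages_per_block(8 * KB, 2 * MB)      # assumed: 8KB pages, 2MB blocks

# Capacity of the sample: 16 packages, 4 die per package, 4GB per die.
packages, die_per_package, gb_per_die, planes_per_die = 16, 4, 4, 2
total_nand_gb = packages * die_per_package * gb_per_die
total_planes = packages * die_per_package * planes_per_die

print(intel_34nm, toshiba_32nm, imft_25nm)
print(total_nand_gb, "GB of NAND across", total_planes, "planes")
```

Doubling the page size while leaving the block at 1MB halves the page count per block, which lines up with the "usually 128 or 256 pages" per block mentioned earlier.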
Remember from our earlier description of SandForce's architecture that its data redundancy requires a single die's worth of capacity. In this case 4GB of the 256GB of NAND is reserved for data parity and the remaining 66GB of spare area is used for block replacement (either cleaning or bad block replacement). The 200GB drive has a 186GB formatted capacity in Windows.
This is a drive with an enterprise focus so the 27.2% spare area is not unusual. You can expect the consumer versions to set aside less spare area, likely at little impact to performance.
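Here is a rough sketch of where the spare area figures come from. It assumes the 200GB rating is in decimal gigabytes while the 256GB of NAND and the capacity Windows reports are binary (GiB). That assumption is mine, but it reproduces the 186GB, 66GB and 27.2% figures. The variable names are made up for the example.

```python
GIB = 2 ** 30

nand_gib = 256                      # raw NAND on the PCB
advertised_bytes = 200 * 10 ** 9    # assumed: "200GB" means decimal gigabytes
parity_gib = 4                      # one die's worth of parity

formatted_gib = advertised_bytes / GIB          # what Windows reports
spare_gib = nand_gib - formatted_gib            # everything the user can't see
replacement_gib = spare_gib - parity_gib        # left for block replacement
spare_pct = spare_gib / nand_gib * 100

print(f"Formatted capacity: {formatted_gib:.1f} GB")
print(f"Spare area:         {spare_gib:.1f} GB ({spare_pct:.1f}%)")
print(f"Block replacement:  {replacement_gib:.1f} GB")
```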
The 0.09F supercap, a feature of the enterprise level SF-2500 controller. This won't be present on the client Vertex 3.
The Vertex 3 Pro is still at least a month or two away from shipping so pricing could change, but right now OCZ is estimating sales at between $3.75 and $5.25 per GB. The client focused Vertex 3 will be significantly cheaper - I'd estimate somewhere north (but within range) of what you can currently buy Vertex 2 drives for.
OCZ Vertex 3 Pro Pricing
| Cost per GB | $5.35/GB | $3.875/GB | $3.375/GB |
Both the Vertex 3 and Vertex 3 Pro are expected to be available as early as March, however as always I'd be cautious in jumping on a brand new controller with brand new firmware without giving both some time to mature.
Note that I've pulled out our older results for the Kingston V+100. There were a couple of tests that had unusually high performance which I now believe was due to the drive being run with a newer OS/software image than the rest of the older drives. I will be rerunning those benchmarks in the coming week.
I should also note that this is beta hardware running beta firmware. While the beta nature of the drive isn't really visible in any of our tests, I did attempt to use the Vertex 3 Pro as the primary drive in my 15-inch MacBook Pro on my trip to MWC. I did so with hopes of exposing any errors and bugs quicker than normal, and indeed I did. Under OS X on the MBP with a full image of tons of data/apps, the drive is basically unusable. I get super long read and write latency. I've already informed OCZ of the problem and I'd expect a solution before we get to final firmware. Often times actually using these drives is the only way to unmask issues like this.
| CPU: | Intel Core i7 965 running at 3.2GHz (Turbo & EIST Disabled) |
| | Intel Core i7 2600K running at 3.4GHz (Turbo & EIST Disabled) - for AT SB 2011 |
| Motherboard: | Intel DX58SO (Intel X58) |
| | Intel H67 Motherboard |
| Chipset: | Intel X58 + Marvell SATA 6Gbps PCIe |
| | Intel H67 |
| Chipset Drivers: | Intel 126.96.36.1995 + Intel IMSM 8.9 |
| | Intel 188.8.131.525 + Intel RST 10.2 |
| Memory: | Qimonda DDR3-1333 4 x 1GB (7-7-7-20) |
| Video Card: | eVGA GeForce GTX 285 |
| Video Drivers: | NVIDIA ForceWare 190.38 64-bit |
| Desktop Resolution: | 1920 x 1200 |
| OS: | Windows 7 x64 |
Random Read/Write Speed
The four corners of SSD performance are as follows: random read, random write, sequential read and sequential write speed. Random accesses are generally small in size, while sequential accesses tend to be larger and thus we have the four Iometer tests we use in all of our reviews.
Our first test writes 4KB in a completely random pattern over an 8GB space of the drive to simulate the sort of random access that you'd see on an OS drive (even this is more stressful than a normal desktop user would see). I perform three concurrent IOs and run the test for 3 minutes. The results reported are in average MB/s over the entire time. We use both standard pseudo randomly generated data for each write as well as fully random data to show you both the maximum and minimum performance offered by SandForce based drives in these tests. The average performance of SF drives will likely be somewhere in between the two values for each drive you see in the graphs. For an understanding of why this matters, read our original SandForce article.
Random write performance is much better on the SF-2500, not that it was bad to begin with on the SF-1200. In fact, the closest competitor is the SF-1200, the rest don't stand a chance.
Many of you have asked for random write performance at higher queue depths. What I have below is our 4KB random write test performed at a queue depth of 32 instead of 3. While the vast majority of desktop usage models experience queue depths of 0 - 5, higher depths are possible in heavy I/O (and multi-user) workloads:
Ramp up the queue depth and there's still tons of performance on the table. At 3Gbps the performance of the Vertex 3 Pro is actually no different than the SF-1200 based Corsair Force, the SF-2500 is made for 6Gbps controllers.
Sequential Read/Write Speed
To measure sequential performance I ran a 3 minute long 128KB sequential test over the entire span of the drive at a queue depth of 1. The results reported are in average MB/s over the entire test length.
This is pretty impressive. The new SF-2500 can write incompressible data sequentially at around the speed the SF-1200 could write highly compressible data. In other words, the Vertex 3 Pro at its slowest is as fast as the Vertex 2 is at its fastest. And that's just at 3Gbps.
The Vertex 3 Pro really shines when paired with a 6Gbps controller. At low queue depths you're looking at 381MB/s writes, from a single drive, with highly compressible data. Write incompressible data and you've still got the fastest SSD on the planet.
Micron is aiming for 260MB/s writes for the C400, which is independent of data type. If Micron can manage 260MB/s in sequential writes that will only give it a minor advantage over the worst case performance of the Vertex 3 Pro, and put it at a significant disadvantage compared to OCZ's best case.
Initially, SandForce appears to have significantly improved performance handling in the worst case of incompressible writes. While the old SF-1200 could only deliver 63% of its maximum performance when dealing with incompressible data, the SF-2500 holds on to 92% of it over a 3Gbps SATA interface. Remove the SATA bottleneck however and the performance difference returns to what we're used to. Over 6Gbps SATA the SF-2500 manages 63% of maximum performance if it's writing incompressible data.
Note that the peak 6Gbps sequential write figures jump up to around 500MB/s if you hit the drive with a heavier workload, which we'll see a bit later.
Sequential read performance continues to be dominated by OCZ and SandForce. Over a 3Gbps interface SandForce improved performance by 20 - 40%, but over a 6Gbps interface the jump is just huge. For incompressible data we're talking about nearly 400MB/s from a single drive. I don't believe you'd even be able to generate the workloads necessary to saturate a RAID-0 of two of these drives on a desktop system.
The Performance Degradation Problem
When Intel first released the X25-M, Allyn Malventano discovered a nasty corner case where the drive would no longer be able to run at its full potential. You basically had to hammer on the drive with tons of random writes for at least 20 minutes, but eventually the drive would be stuck at a point of no return. Performance would remain low until you secure erased the drive.
Although it shouldn't appear in real world use, the worry was that over time a similar set of conditions could align resulting in the X25-M performing slower than it should. Intel, having had much experience with similar types of problems (e.g. FDIV, Pentium III 1.13GHz), immediately began working on a fix and released the fix a couple of months after launch. The fix was nondestructive although you saw much better performance if you secure erased your drive first.
SandForce has a similar problem and I have you all and bit-tech to thank for pointing it out. In bit-tech's SandForce SSD reviews they test TRIM functionality by filling a drive with actual data (from a 500GB source including a Windows install, pictures, movies, documents, etc...). The drive is then TRIMed, and performance is measured.
If you look at bit-tech's charts you'll notice that after going through this process, the SandForce drives no longer recover their performance after TRIM. They are stuck in a lower performance state making the drives much slower when writing incompressible data.
You can actually duplicate the bit-tech results without going through all of that trouble. All you need to do is write incompressible data to all pages of a SandForce drive (user accessible LBAs + spare area), TRIM the drive and then measure performance. You'll get virtually the same results as bit-tech:
AS-SSD Incompressible Write Speed
| Drive | Clean Performance | Dirty (All Blocks + Spare Area Filled) | After TRIM |
| SandForce SF-1200 (120GB) | 131.7 MB/s | 70.3 MB/s | 71 MB/s |
The question is why.
I spoke with SandForce about the issue late last year. To understand the cause we need to remember how SSDs work. When you go to write to an SSD, the controller must first determine where to write. When a drive is completely empty, this decision is pretty easy to make. When a drive is not completely full to the end user but all NAND pages are occupied (e.g. in a very well used state), the controller must first supply a clean/empty block for you to write to.
When you fill a SF drive with incompressible data, you're filling all user addressable LBAs as well as all of the drive's spare area. When the SF controller gets a request to overwrite one of these LBAs the drive has to first clean a block and then write to it. It's the block recycling path that causes the aforementioned problem.
In the SF-1200 SandForce can only clean/recycle blocks at a rate of around 80MB/s. Typically this isn't an issue because you won't be in a situation where you're writing to a completely full drive (all user LBAs + spare area occupied with incompressible data). However if you do create an environment where all blocks have data in them (which can happen over time) and then attempt to write incompressible data, the SF-1200 will be limited by its block recycling path.
So why doesn't TRIMing the entire drive restore performance?
Remember what TRIM does. The TRIM command simply tells the controller what LBAs are no longer needed by the OS. It doesn't physically remove data from the SSD, it just tells the controller that it can remove the aforementioned data at its own convenience and in accordance with its own algorithms.
The best drives clean dirty blocks as late as possible without impacting performance. Aggressive garbage collection only increases write amplification and wear on the NAND, which we've already established SandForce doesn't really do. Pair a conservative garbage collection/block recycling algorithm with you attempting to write an already full drive with tons of incompressible data and you'll back yourself into a corner where the SF-1200 continues to be bottlenecked by the block recycling path. The only way to restore performance at this point is to secure erase the drive.
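The bottleneck is easier to see in a toy model. The sketch below is not how the SF-1200 firmware works. It simply assumes the host can only write into clean space and that the recycler frees a fixed amount of clean space every second. The function name `simulate_writes`, the 130MB/s host rate and the 200MB starting pool are made-up inputs, and the 80MB/s recycling rate is the figure quoted above.

```python
def simulate_writes(host_mb_s: int, recycle_mb_s: int,
                    clean_pool_mb: int, seconds: int) -> list[int]:
    """Return the MB written in each second by a toy drive.

    Every second the recycler frees `recycle_mb_s` MB of clean space, and the
    host can only write into space that is already clean.
    """
    written_per_second = []
    pool = clean_pool_mb
    for _ in range(seconds):
        pool += recycle_mb_s                 # block recycling frees space
        written = min(host_mb_s, pool)       # host is limited by clean space
        pool -= written
        written_per_second.append(written)
    return written_per_second


# A drive with some clean blocks left versus one with none at all.
fresh = simulate_writes(host_mb_s=130, recycle_mb_s=80, clean_pool_mb=200, seconds=8)
full = simulate_writes(host_mb_s=130, recycle_mb_s=80, clean_pool_mb=0, seconds=8)
print("some clean blocks:", fresh)
print("no clean blocks:  ", full)
```

Once the clean pool is gone, throughput in this model is pinned at the recycling rate no matter how fast the host or the NAND interface is. TRIM alone doesn't change that picture if the controller still recycles conservatively.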
This is a real world performance issue on SF-1200 drives. Over time you'll find that when you go to copy a highly compressed file (e.g. H.264 video), your performance will drop to around 80MB/s. However, the rest of your performance will remain as high as always. This issue only impacts data that can't be further compressed/deduped by the SF controller. While SandForce has attempted to alleviate it in the SF-1200, I haven't seen any real improvements with the latest firmware updates. If you're using your SSD primarily to copy and store highly compressed files, you'll want to consider another drive.
Luckily for SandForce, the SF-2500 controller alleviates the problem. Here I'm running the same test as above: filling all blocks of the Vertex 3 Pro with incompressible data and then measuring sequential write speed. There's a performance drop, but it's nowhere near as significant as what we saw with the SF-1200:
AS-SSD Incompressible Write Speed
| Drive | Clean Performance | Dirty (All Blocks + Spare Area Filled) | After TRIM |
| SandForce SF-1200 (120GB) | 131.7 MB/s | 70.3 MB/s | 71 MB/s |
| SandForce SF-2500 (200GB) | 229.5 MB/s | 230.0 MB/s | 198.2 MB/s |
It looks like SandForce has increased the speed of its block recycling engine among other things, resulting in a much more respectable worst case scenario of ~200MB/s.
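To put the two rows of the AS-SSD table on the same footing, here is a small sketch that expresses the post-TRIM result as a share of clean performance. The figures come straight from the table, while the dictionary layout and the `retained_pct` helper are just for illustration.

```python
# (clean, dirty, after TRIM) in MB/s, taken from the AS-SSD table above.
as_ssd_results = {
    "SandForce SF-1200 (120GB)": (131.7, 70.3, 71.0),
    "SandForce SF-2500 (200GB)": (229.5, 230.0, 198.2),
}


def retained_pct(clean: float, after_trim: float) -> float:
    """Post-TRIM write speed as a percentage of clean write speed."""
    return after_trim / clean * 100


for drive, (clean, _dirty, after_trim) in as_ssd_results.items():
    print(f"{drive}: {retained_pct(clean, after_trim):.0f}% of clean speed after TRIM")
```

By this measure the SF-1200 keeps a little over half of its clean incompressible write speed, while the SF-2500 keeps well over 80%.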
Verifying the Fix
I was concerned that perhaps SandForce simply optimized for the manner in which AS-SSD and Iometer write incompressible data. In order to verify the results I took a 6.6GB 720p H.264 movie and copied it from an Intel X25-M G2 SSD to one of two SF drives. The first was a SF-1200 based Corsair Force F120, and the second was an OCZ Vertex 3 Pro (SF-2500).
I measured both clean performance as well as performance after I'd filled all blocks on the drive. The results are below:
6.6GB 720p H.264 File Copy (X25-M G2 Source to Destination)
| Drive | Clean Performance | Dirty (All Blocks + Spare Area Filled) | After TRIM |
| SandForce SF-1200 (120GB) | 138.6 MB/s | 78.5 MB/s | 81.7 MB/s |
| SandForce SF-2500 (200GB) | 157.5 MB/s | 158.2 MB/s | 157.8 MB/s |
As expected the SF-1200 drive drops from 138MB/s down to 81MB/s. The drive is bottlenecked by its block recycling path and performance never goes up beyond 81MB/s.
The SF-2000 however doesn't drop in performance. Brand new performance is at 157MB/s and post-torture it's still at 157MB/s. What's interesting however is that the incompressible file copy performance here is lower than what Iometer and AS-SSD would have you believe. Iometer warns that even its fully random data pattern can be defeated by drives with good data deduplication algorithms. Unless there's another bottleneck at work here, it looks like the SF-2000 is still reducing the data that Iometer is writing to the drive. The AS-SSD comparison actually makes a bit more sense since AS-SSD runs at a queue depth of 32 and this simple file copy is mostly at a queue depth of 1. Higher queue depths will make better use of parallel NAND channels and result in better performance.
AnandTech Storage Bench 2011: Much Heavier
I didn't expect to have to debut this so soon, but I've been working on updated benchmarks for 2011. Last year we introduced our AnandTech Storage Bench, a suite of benchmarks that took traces of real OS/application usage and played them back in a repeatable manner. I assembled the traces myself out of frustration with the majority of what we have today in terms of SSD benchmarks.
Although the AnandTech Storage Bench tests did a good job of characterizing SSD performance, they weren't stressful enough. All of the tests performed less than 10GB of reads/writes and typically involved only 4GB of writes specifically. That's not even enough to exceed the spare area on most SSDs. Most canned SSD benchmarks don't even come close to writing a single gigabyte of data, but that doesn't mean that simply writing 4GB is acceptable.
Originally I kept the benchmarks short enough that they wouldn't be a burden to run (~30 minutes) but long enough that they were representative of what a power user might do with their system.
Not too long ago I tweeted that I had created what I referred to as the Mother of All SSD Benchmarks (MOASB). Rather than only writing 4GB of data to the drive, this benchmark writes 106.32GB. It's the load you'd put on a drive after nearly two weeks of constant usage. And it takes a *long* time to run.
I'll be sharing the full details of the benchmark in some upcoming SSD articles (again, I wasn't expecting to have to introduce this today so I'm a bit ill prepared) but here are some details:
1) The MOASB, officially called AnandTech Storage Bench 2011 - Heavy Workload, mainly focuses on the times when your I/O activity is the highest. There is a lot of downloading and application installing that happens during the course of this test. My thinking was that it's during application installs, file copies, downloading and multitasking with all of this that you can really notice performance differences between drives.
2) I tried to cover as many bases as possible with the software I incorporated into this test. There's a lot of photo editing in Photoshop, HTML editing in Dreamweaver, web browsing, game playing/level loading (Starcraft II & WoW are both a part of the test) as well as general use stuff (application installing, virus scanning). I included a large amount of email downloading, document creation and editing as well. To top it all off I even use Visual Studio 2008 to build Chromium during the test.
Many of you have asked for a better way to really characterize performance. Simply looking at IOPS doesn't really say much. As a result I'm going to be presenting Storage Bench 2011 data in a slightly different way. We'll have performance represented as Average MB/s, with higher numbers being better. At the same time I'll be reporting how long the SSD was busy while running this test. These disk busy graphs will show you exactly how much time was shaved off by using a faster drive vs. a slower one during the course of this test. Finally, I will also break out performance into reads, writes and combined. The reason I do this is to help balance out the fact that this test is unusually write intensive, which can often hide the benefits of a drive with good read performance.
There's also a new light workload for 2011. This is a far more reasonable, typical every day use case benchmark. Lots of web browsing, photo editing (but with a greater focus on photo consumption), video playback as well as some application installs and gaming. This test isn't nearly as write intensive as the MOASB but it's still multiple times more write intensive than what we were running last year.
As always I don't believe that these two benchmarks alone are enough to characterize the performance of a drive, but hopefully along with the rest of our tests they will help provide a better idea.
The testbed for Storage Bench 2011 has changed as well. We're now using a Sandy Bridge platform with full 6Gbps support for these tests. All of the older tests are still run on our X58 platform.
AnandTech Storage Bench 2011 - Heavy Workload
We'll start out by looking at average data rate throughout our new heavy workload test:
The Vertex 3 Pro on a 6Gbps interface is around 24% faster than Crucial's RealSSD C300. Note that the old SF-1200 (Corsair Force F120) can only deliver 60% of the speed of the new SF-2500. Over a 3Gbps interface the Vertex 3 Pro is quick, but only 15% faster than the next fastest 3Gbps drive. In order to get the most out of the SF-2500 you need a 6Gbps interface.
If we break out our performance results into average read and write speed we get a better idea for the Vertex 3 Pro's strengths:
The SF-2500 is significantly faster than its predecessor and all other drives in terms of read performance. Good read speed is important as it influences application launch time as well as overall system responsiveness.
Average write speed is still class leading, but this benchmark uses a lot of incompressible data - you'll note that the Vertex 3 Pro only averages 225.9MB/s - barely over its worst case write speed. It's in this test that I'm expecting the new C400 to do better than SandForce.
The next three charts just represent the same data, but in a different manner. Instead of looking at average data rate, we're looking at how long the disk was busy during this entire test. Note that disk busy time excludes any and all idles; this is just how long the SSD was busy doing something:
AnandTech Storage Bench 2011 - Light Workload
Lighter use cases still show a benefit from the SF-2500, but again we see that a 6Gbps interface is necessary to really distance this drive from the pack:
Read performance continues to be a tremendous advantage of the SF-2500. Again, 6Gbps matters a lot here.
Performance vs. Transfer Size
All of our Iometer sequential tests happen at a queue depth of 1, which is indicative of a light desktop workload. It isn't too far-fetched to see much higher queue depths on the desktop. The performance of these SSDs also greatly varies based on the size of the transfer. For this next test we turn to ATTO and run a sequential write over a 2GB span of LBAs at a queue depth of 4 while varying the size of the transfers.
On a 6Gbps SATA port the Vertex 3 Pro is unstoppable. For transfer sizes below 16KB it's actually a bit average, and definitely slower than the RealSSD C300. But once you hit 16KB and above, the performance is earth shattering. The gap at 128KB isn't even as big as it gets; we don't see performance level off until 2048KB transfers.
The 3Gbps performance is pretty unimpressive. In fact, the Vertex 3 Pro actually comes in a bit slower than the SF-1200 based Corsair Force F120. If you're going to get the most out of this drive you had better have a good 6Gbps controller.
ATTO's writes are fully compressible, indicative of the sort of performance you'd get on applications/libraries/user data and not highly compressed multimedia files. Here the advantage is just hilarious. By the 8KB mark the Vertex 3 Pro is already faster than everything else, but by 128KB the gap is more of a chasm separating the 6Gbps Vertex 3 Pro from its competitors.
Over a 3Gbps interface the Vertex 3 Pro once again does well but still doesn't really differentiate itself from the SF-1200 based Force F120. Real world performance is probably a bit higher as most transfers aren't perfectly compressible, but again if you don't have a good 6Gbps interface (think Intel 6-series or AMD 8-series) then you probably should wait and upgrade your motherboard first.
AS-SSD High Queue Depth Incompressible Sequential Performance
While ATTO shows us best case scenario for the SF-2500, AS-SSD shows us the worst case - at least for writes. The AS-SSD sequential benchmark takes place at a very high queue depth of 32 and uses incompressible data for all of its transfers. The result is a pretty big reduction in sequential write speed on SandForce based controllers.
Read speed is minimally impacted by the nature of the data. We see that at high queue depths over a 6Gbps SATA interface the Vertex 3 Pro can break 500MB/s for sequential reads. Over a 3Gbps interface the Vertex 3 Pro is mostly unimpressive, looking a lot like a C300.
The sequential write test is a tough pill to swallow for SandForce. This is truly worst case scenario performance as it's high queue depth transfers of incompressible data. Admittedly the Vertex 3 Pro does much better than drives based on its former controller (SF-1200) but it's no faster than Samsung's SSD 470 and barely faster than the SSDNow V100. Over a 3Gbps interface the controller doesn't look all that great either. This is an important chart to look at if you're doing a lot of file archival on your SSD. However most usage models will see a very different performance breakdown than this. For SandForce, this is truly the worst case scenario.
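If the difference between compressible and incompressible data feels abstract, the short Python sketch below shows the two extremes. It uses `zlib` purely as a stand-in compressor: SandForce's data reduction is proprietary and nothing here reflects how the controller actually works. The all-zero buffer and the `compression_ratio` helper are my own assumptions for illustration, not the data patterns ATTO or AS-SSD generate.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data)) / len(data)


SIZE = 1 << 20  # 1 MiB test buffers

# Best case: a buffer of identical bytes shrinks to almost nothing.
compressible = b"\x00" * SIZE

# Worst case: random bytes have no redundancy for a compressor to remove,
# much like already-compressed video, JPEGs or archives.
incompressible = os.urandom(SIZE)

print(f"compressible:   {compression_ratio(compressible):.4f}")
print(f"incompressible: {compression_ratio(incompressible):.4f}")
```

A controller that writes less to NAND when the data shrinks has far less work to do in the first case than in the second, which is why the two benchmarks bracket SandForce's write performance.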
Overall System Performance using PCMark Vantage
Next up is PCMark Vantage, another system-wide performance suite. For those of you who aren’t familiar with PCMark Vantage, it ends up being the most real-world-like hard drive test I can come up with. It runs things like application launches, file searches, web browsing, contacts searching, video playback, photo editing and other completely mundane but real-world tasks. I’ve described the benchmark in great detail before but if you’d like to read up on what it does in particular, take a look at Futuremark’s whitepaper on the benchmark; it’s not perfect, but it’s good enough to be a member of a comprehensive storage benchmark suite. Any performance impacts here would most likely be reflected in the real world.
Our PCMark Vantage scores echo what we've seen already - the SF-2500 really needs a 6Gbps controller to shine.
SYSMark 2007 isn't nearly as demanding on the storage subsystem, and so we're mostly bottlenecked elsewhere. The SSDs and the Vertex 3 Pro place at the top, but the Raptor 600GB and Seagate Momentus XT 500GB are both within striking distance.
AnandTech Storage Bench 2010
To keep things consistent we've also included our older Storage Bench. Note that the old storage test system doesn't have a SATA 6Gbps controller, so we only have one result for the Vertex 3 Pro (and the C300). The SF-2500 controller does respectably here, but with a 3Gbps controller we're only marginally faster than other SSDs (which is why we've moved to a new storage platform for 2011).
The first in our benchmark suite is a light/typical usage case. The Windows 7 system is loaded with Firefox, Office 2007 and Adobe Reader among other applications. With Firefox we browse web pages like Facebook, AnandTech, Digg and other sites. Outlook is also running and we use it to check emails, create and send a message with a PDF attachment. Adobe Reader is used to view some PDFs. Excel 2007 is used to create a spreadsheet, graphs and save the document. The same goes for Word 2007. We open and step through a presentation in PowerPoint 2007 received as an email attachment before saving it to the desktop. Finally we watch a bit of a Firefly episode in Windows Media Player 11.
There’s some level of multitasking going on here but it’s not unreasonable by any means. Generally the application tasks proceed linearly, with the exception of things like web browsing which may happen in between one of the other tasks.
The recording is played back on all of our drives here today. Remember that we’re isolating disk performance, all we’re doing is playing back every single disk access that happened in that ~5 minute period of usage. The light workload is composed of 37,501 reads and 20,268 writes. Over 30% of the IOs are 4KB, 11% are 16KB, 22% are 32KB and approximately 13% are 64KB in size. Less than 30% of the operations are absolutely sequential in nature. Average queue depth is 6.09 IOs.
The performance results are reported in average I/O Operations per Second (IOPS):
If there’s a light usage case there’s bound to be a heavy one. In this test we have Microsoft Security Essentials running in the background with real time virus scanning enabled. We also perform a quick scan in the middle of the test. Firefox, Outlook, Excel, Word and PowerPoint are all used the same as they were in the light test. We add Photoshop CS4 to the mix, opening a bunch of 12MP images, editing them, then saving them as highly compressed JPGs for web publishing. Windows 7’s picture viewer is used to view a bunch of pictures on the hard drive. We use 7-zip to create and extract .7z archives. Downloading is also prominently featured in our heavy test; we download large files from the Internet during portions of the benchmark, as well as use uTorrent to grab a couple of torrents. Some of the applications in use are installed during the benchmark, and Windows updates are installed as well. Towards the end of the test we launch World of Warcraft, play for a few minutes, then delete the folder. This test also takes into account all of the disk accesses that happen while the OS is booting.
The benchmark is 22 minutes long and it consists of 128,895 read operations and 72,411 write operations. Roughly 44% of all IOs were sequential. Approximately 30% of all accesses were 4KB in size, 12% were 16KB in size, 14% were 32KB and 20% were 64KB. Average queue depth was 3.59.
The gaming workload is made up of 75,206 read operations and only 4,592 write operations. Only 20% of the accesses are 4KB in size, nearly 40% are 64KB and 20% are 32KB. A whopping 69% of the IOs are sequential, meaning this is predominantly a sequential read benchmark. The average queue depth is 7.76 IOs.
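For a quick side-by-side of the three 2010 traces, the Python sketch below works out the total IO count and the share of reads for each, using only the read and write counts quoted above. The `WORKLOADS` table and `read_share` helper are just illustrative names. Keep in mind this counts operations, not bytes transferred.

```python
# (reads, writes) per trace, as quoted in the workload descriptions.
WORKLOADS = {
    "light": (37_501, 20_268),
    "heavy": (128_895, 72_411),
    "gaming": (75_206, 4_592),
}


def read_share(name: str) -> float:
    """Fraction of all IOs in the named trace that are reads."""
    reads, writes = WORKLOADS[name]
    return reads / (reads + writes)


for name, (reads, writes) in WORKLOADS.items():
    total = reads + writes
    print(f"{name:>6}: {total:>7,} IOs, {read_share(name):.1%} reads")
```

By operation count the light and heavy traces are both roughly two-thirds reads, while the gaming trace is about 94% reads.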
I didn't really believe that SandForce could pull it off when I first heard how fast the SF-2000 line would be. Even after CES, I didn't really believe the drives would be this good in real world use cases. Consider me pleasantly surprised.
When connected to a good 6Gbps controller, the Vertex 3 Pro is significantly faster than anything else on the market today. Obviously the V3P itself is an unreleased drive so things could change as its competitors show up as well, but the bar has been set very high. The Vertex 3 Pro is the first SSD to really put 6Gbps SATA to good use. In fact I'd say it's the first drive that really needs a 6Gbps interface. Whenever you Sandy Bridge owners get replacement motherboards, this may be the SSD you'll want to pair with them.
Even writing incompressible data the Vertex 3 Pro is faster than current SandForce drives running full tilt. The performance gains we see here are generational, not a simple evolutionary improvement. SandForce has also successfully addressed the limited shortcomings of the original SF-1200 controller with regard to writing incompressible data.
Clearly performance isn't going to be a problem with this generation. The real unknowns are how well will the Vertex 3 (non-Pro) perform and how reliable will these drives be? Intel is still king of the hill when it comes to drive reliability, however OCZ has been investing heavily in improving its manufacturing. I suspect that this next SSD war will be fought both along performance and reliability lines. Unfortunately for us, the latter is very difficult to quantify without a significant sample of drives.
With new controllers from SandForce, Intel and Marvell due out this year we're going to see SSD performance go through the roof and SSD prices continue to fall. We're still a couple of months away from knowing exactly what to buy, but if you've been putting off that move to an SSD - 2011 may be the year to finally pull the trigger.