Total Bytes Written & Spare Area

90K p/e cycles seems a bit high, and I can't find any Intel documentation that actually quotes that number; it's just what I heard at the 710 briefing in San Francisco. Luckily Intel has another metric it likes to use: total bytes written.

You don't get a TBW rating for client drives, but for enterprise drives Intel will tell you exactly how many terabytes or petabytes of random 4KB or 8KB data you can write to the drive. These values are "up to" figures of course, as actual lifespan will depend on the specific workload.

Intel SSD Endurance Comparison

                     X25-E 32GB   X25-E 64GB   710 100GB   710 200GB   710 300GB
4KB Random Writes    1.0 PB       2.0 PB       500 TB      1.0 PB      1.1 PB
w/ +20% Spare Area   -            -            900 TB      1.5 PB      1.5 PB

Doing the math, these values work out to about 5K writes per cell (~5,243); however, that assumes no write amplification. Performing 100% random writes across all LBAs for a full petabyte of data is going to generate some serious write amplification. The controller in the 710 tends to see write amplification of around 12x for 4KB random writes, which would put the rated cycle count at just under 63,000.
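As a quick sanity check on that math, here's the arithmetic spelled out. Note the unit assumption: the ~5,243 figure only falls out if you treat Intel's 1.0 PB rating as a binary petabyte and the 200GB capacity as 200 GiB.

```python
PIB = 2**50   # bytes in a binary petabyte
GIB = 2**30   # bytes in a binary gigabyte

rated_writes = 1.0 * PIB    # 4KB random write rating for the 200GB 710
capacity = 200 * GIB        # user capacity

# Writes per cell assuming no write amplification (1x)
naive_cycles = rated_writes / capacity

# The 710's controller sees roughly 12x write amplification on
# 4KB random writes, so the NAND itself must be rated for more
rated_cycles = naive_cycles * 12

print(round(naive_cycles))   # ~5243
print(round(rated_cycles))   # just under 63,000
```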

There's just one problem. The 200GB 710 I'm basing these calculations on doesn't actually have 200GB of NAND on-board, it has 320GB.

Opening up the 710 that Intel sent me, I found a total of 20 NAND packages on-board. This isn't surprising, as Intel's controllers have always supported 10 parallel NAND channels; in this case the 710 uses two packages per channel and interleaves requests to them. Each NAND package, however, has 128Gbit (16GB) of NAND inside in the form of two 8GB 25nm MLC-HET die. Multiply all of that out and you get 320GB of NAND inside this 200GB drive.

Of course 200GB here is defined as 200,000,000,000 bytes, so actual binary storage capacity is 186.3GiB. This is absolutely insane: over 41% of the NAND on the 710's PCB is set aside as spare area. We have never reviewed an SSD with anywhere near this much spare area before.

If we redo the p/e math with 320GB as the actual amount of NAND available, it works out to just under 40K p/e cycles per cell; the significant spare area on the 710 increases the drive's projected lifespan by 55%. Intel even recommends setting aside another 20% of the drive if you need a longer lifespan; an extra 20% of spare area will give you another 50% increase in total bytes written. Keep in mind that tinkering with spare area just helps reduce write amplification; it doesn't magically make the NAND cells last longer.
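The spare area and revised cycle-count figures check out with the same back-of-the-envelope arithmetic. The unit choices here are assumptions that make the article's numbers line up: the raw NAND is counted in GiB, the advertised 200GB in decimal bytes, and the 1.0 PB rating as a binary petabyte.

```python
GIB = 2**30
PIB = 2**50

raw_nand = 320 * GIB           # 20 packages x 16GB of 25nm MLC-HET
user_capacity = 200 * 10**9    # advertised 200GB, decimal bytes

# Fraction of on-board NAND held back as spare area
spare = 1 - user_capacity / raw_nand
print(f"{spare:.1%}")          # a bit under 42%

# Cycles per cell if all 320GB participates in wear leveling,
# again assuming ~12x write amplification on the 1.0 PB rating
cycles = (1.0 * PIB) / raw_nand * 12
print(round(cycles))           # just under 40,000
```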

If we believe Intel's specifications, MLC-HET actually sounds pretty decent. You get endurance in the realm of the X25-E but at significantly lower cost and with more reasonable capacity options.

Thankfully we don't need to just take Intel's word, we can measure ourselves. For the past couple of years Intel has included a couple of counters in the SMART data of its SSDs. SMART attribute E2h gives you an accurate count of how much wear your current workload is putting on the drive's NAND. To measure all you need to do is reset the workload timer (E4h) and run your workload on the drive for at least 60 minutes. Afterwards, take the raw value in E2h, divide by 1024 and you get the percentage of wear your workload put on the drive's NAND. I used smartmontools to reset E4h before running a 60 minute loop of our SQL benchmarks on the drive, simulating about a day of our stats DB workload.
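With smartmontools installed, reading the wear counter looks something like the sketch below. The device node is a placeholder for your own SSD; attribute E2h shows up as ID 226 in smartctl's attribute table and E4h as ID 228, though the exact attribute names your drive reports may differ.

```shell
# Dump the full SMART attribute table; on Intel SSDs the timed-workload
# counters appear as ID 226 (E2h, media wear) and ID 228 (E4h, minutes).
smartctl -A /dev/sda

# Pull out attribute 226's raw value and divide by 1024 to get the
# percentage of NAND wear accumulated since the workload timer was reset.
smartctl -A /dev/sda | awk '$1 == 226 { printf "%.4f%% wear\n", $10 / 1024 }'
```

The raw value is in 1/1024ths of a percent, which is why the division by 1024 recovers the wear percentage the article describes.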

Once the workload finished looping I measured 0.0145% wear on the drive for a day of our stats DB workload. That works out to 5.3% of wear per year, or around 18.9 years before the NAND is done for. I'd find more storage in my pocket before the 710 died of NAND wear running our stats DB.
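Projecting from one day's measurement to drive lifespan is simple arithmetic:

```python
daily_wear = 0.0145   # percent of NAND wear for one day's stats DB workload

# Scale a day's wear to a year
yearly_wear = daily_wear * 365
print(f"{yearly_wear:.1f}% per year")    # ~5.3%

# Years until 100% of the rated wear is consumed
years_until_worn = 100 / yearly_wear
print(f"{years_until_worn:.1f} years")   # ~18.9 years
```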

For comparison I ran the same test on an Intel SSD 320 and ended up with a much shorter 4.6 year lifespan. Our stats DB does much more than just these two tasks, however; chances are we'd see failure much sooner than 4.6 years on the 320. An even heavier workload would quickly favor the 710's MLC-HET NAND.

But what about performance? SLC write speeds are much higher than MLC, but Intel's MLC performance has come a long way since the old X25-E. Let's get to the benchmarks.

The Test

We're still building up our Enterprise Bench data so forgive the lack of comparison data here. We included a number of consumer drives simply as a reference point.

CPU: Intel Core i7 2600K running at 3.4GHz (Turbo & EIST Disabled)
Motherboard: Intel H67 Motherboard
Chipset: Intel H67
Chipset Drivers: Intel 9.1.1.1015 + Intel RST 10.2
Memory: Qimonda DDR3-1333 4 x 1GB (7-7-7-20)
Video Card: eVGA GeForce GTX 285
Video Drivers: NVIDIA ForceWare 190.38 64-bit
Desktop Resolution: 1920 x 1200
OS: Windows 7 x64

68 Comments

  • cdillon - Friday, September 30, 2011

    I must be missing some important detail behind their decision to use MLC. MLC holds exactly twice as much information per cell as SLC, which means you can get twice the storage with the same number of chips. However, they are reserving up to 60% of the MLC NAND as spare area while still achieving a LOWER write-life than the SLC-based X25-E, which only needs 20% spare. Why not continue to use lower-density SLC with a smaller spare area? The total capacity would only be slightly lower while achieving at least another 500GB of write life, if not more, and would probably also bring the 4KB Random Write numbers back up to X25-E levels.
  • cdillon - Friday, September 30, 2011

    Oops, I meant to say "at least another 500TB of write life" instead of 500GB.
  • Stahn Aileron - Saturday, October 1, 2011

    More than likely it has to do with production yields. Anand mentions SLC and MLC are physically identical; it's just how you address them. SLC seems to be very high quality NAND while MLC is the low end. MLC-HET (or eMLC) seems to be the middle of the pack in terms of overall quality.

    Unless you can get SLC yields that consistently outpace MLC-HET yields by a factor of 2, it's not very economical in the long run for the same capacity.

    Also, chip manufacturing is a pretty fixed cost at the wafer level from my understanding (at least once you hit mass production on a mature process). For SLC vs MLC, you can either use double the SLC chips to match MLC capacities (higher cost) or use the same number of chips and sacrifice capacity. Intel seems to be trying to get the best of both worlds (higher capacity at the same or lower costs). (All that while maximizing their production capacity and ability to meet demand as needed as a side benefit.)

    Obviously I could be wrong. That's all conjecture based on what little I know of the industry as a consumer.
  • ckryan - Friday, September 30, 2011

    Anand, thanks for the awesome Intel SMART data tip.

    I've joined in the XtremeSystems SSD endurance test. By writing a simulated desktop workload to the drive over and over, for months on end, eventually a drive will become read only. So far, only one drive has become RO, and that was a Samsung 470 with an apparent write amplification of 5+ (this is the only SSD I've ever heard of this happening to outside of a lab). Another drive (a 64GB Crucial M4) has gone through almost 10,000 PE cycles and still doesn't have any reallocated sectors -- but all of the drives have performed well, and many have hundreds of TBs on them. I chose a SF 2281 with Toshiba toggle NAND, but I'm having some issues with it (like it won't stop dropping out, or BSODing if it's the system drive). Though it takes months and months of 24/7 writing, I think the process is both interesting and likely to put many users at ease concerning drive longevity. I don't think consumers should be worrying about the endurance of 25nm NAND, but I do start wondering what will happen with the advent of next generation flash. If you want to worry about NAND, worry about sync vs async or toggle, but don't sweat the conservative PE ratings -- it seems like the controller itself plays a super important role in the preservation of NAND, in addition to the NAND itself and spare area. Obviously, increasing spare area is always a good idea if you have a particularly brutal workload, but it's not a terrible idea in many other settings... it's not just for RAID0 you know.

    The only real SSD endurance test takes place in a user's machine (or server), and I have no doubt that any modern SSD will last the better part of a decade -- at least as far as the flash is concerned (and probably much, much longer). You'll get mad at your SF's BSoDs and throw it out the window before you ever make a dent in the flash's lifespan. The only exception is if your drive isn't aligned (and especially without TRIM); under those conditions, don't expect your drive to last very long, as WA jumps by double-digit factors.

    http://www.xtremesystems.org/forums/showthread.php...
  • Movieman420 - Friday, September 30, 2011

    Yup. And you can expect the current SF BSOD problem to vanish when Intel fixes its drivers and OROMs. What a coincidence, eh?

    http://thessdreview.com/latest-buzz/sandforce-driv...
  • JarredWalton - Saturday, October 1, 2011

    Ha! An educated guess in this case feels more like a pipe dream. If Intel is willing to jump on the SF controller bandwagon, I will be amazed. Then again, they've got the 510 using a non-Intel controller, so anything is possible.
  • ckryan - Saturday, October 1, 2011

    It kinda makes sense though. Intel using a SandForce controller (or possibly something "SandForce-esque") but with their firmware and NAND would be tough to stop. The SF controller (when it works -- in my case, not always that often) yields benefits to consumer and enterprise workloads alike. Further, it could help bridge Intel into smaller-process NAND with about the same overall TBW due to compression (my 60GB has about 85TB of host writes to ~65TB of NAND writes). That's not a small amount over the lifespan of a drive. Along with additional overprovisioning, Intel could conceivably make a drive with sub-25nm NAND last as long as the 34nm stuff with those two advantages.

    There's nothing really stopping SF now except for the not-so-little stability issues. I thought they were much rarer than they actually are (it's rare when it happens to someone else and an epidemic when it happens to you). With that heinous hose-beast no longer lurking in the closet, SandForce could end up being the only contender. Until such time as they get the problems resolved, whether or not you have problems is just a crapshoot... no seeming rhyme or reason, almost -- but not quite -- completely random. If Intel could bring that missing link to SF it would be a boon to consumers, but Intel could just as well buy SandForce to get rid of them. Either is just as likely, and conspiracy theorists would say that Intel is purposely causing issues with SF drives so they don't have to buy them (or don't have to pay as much). In the end, most consumers would just be happy if the 2281-powered drives they already have worked like the drives they were always meant to be.
  • Movieman420 - Saturday, October 1, 2011

    Guess you didn't follow the story over to VR Zone either. It's a done deal. Cherryville is SandForce 2200. It should be announced before long imo.
  • rishidev - Friday, September 30, 2011

    Why even bother to make a 200GB drive at this point in time?
    AMD fanboys get abused for masturbating.
    So what? I'm supposed to buy a $1300 Intel "WOOOW" SSD drive?
  • TheSSDReview - Saturday, October 1, 2011

    Because Intel wants to hit the enthusiast market. The new drives won't be anywhere near the price of these 710 series drives, and will carry the product confidence Intel has always commanded. It is a win-win for SF, since many have taken comfort in speculation about controller troubles rather than examining other possible causes.

    The question then becomes one of an SF purchase, we think.
