AnandTech Storage Bench - Heavy

Our Heavy storage benchmark is proportionally more write-heavy than The Destroyer, but much shorter overall. The total writes in the Heavy test aren't enough to fill the drive, so performance never drops down to steady state. This test is far more representative of a power user's day-to-day usage, and is heavily influenced by the drive's peak performance. The Heavy workload test details can be found here. This test is run twice: once on a freshly erased drive and once after filling the drive with sequential writes.

ATSB - Heavy (Data Rate)

The Toshiba XG6 brings a healthy boost to the full-drive average data rate on the Heavy test, but only improves the empty drive test run performance by about 5% over the XG5. Toshiba is definitely starting to fall behind the fastest high-end drives on this test, but the XG6 is still comfortably ahead of most entry-level NVMe products and more than twice as fast as the Crucial MX500 SATA SSD.

ATSB - Heavy (Average Latency)

ATSB - Heavy (99th Percentile Latency)

The Toshiba XG6 brings very small regressions to the latency scores on the empty-drive test runs, but makes up for it with substantially improved average and 99th percentile latency when the Heavy test is run on a full drive.

ATSB - Heavy (Average Read Latency)

ATSB - Heavy (Average Write Latency)

The slight regression in average latency for the empty drive test runs comes from an increase in average write latency. Read latency has improved substantially and write latency for the full-drive test runs doesn't stand out for the XG6 the way it did for the XG5.

ATSB - Heavy (99th Percentile Read Latency)

ATSB - Heavy (99th Percentile Write Latency)

For 99th percentile latency, both read and write performance are slightly worse on the XG6 than the XG5 when the Heavy test is run on an empty drive. But full-drive latency QoS has improved markedly for both read and write operations.
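As a rough illustration of why average and 99th percentile latency can move in different directions, the sketch below summarizes a list of per-I/O latencies both ways. The review does not state how its percentiles are computed, so the nearest-rank method used here is an assumption, and the sample values are made up for illustration rather than taken from any drive.

```python
import statistics


def summarize_latency(latencies_us):
    """Return (mean, 99th percentile) for per-I/O latencies in microseconds."""
    if not latencies_us:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_us)
    n = len(ordered)
    # Nearest-rank percentile (an assumed method): the smallest sample
    # such that at least 99% of all samples are at or below it.
    rank = (99 * n + 99) // 100  # integer ceiling of 0.99 * n
    return statistics.fmean(ordered), ordered[rank - 1]


# Hypothetical trace: 980 fast I/Os and 20 slow ones.
samples = [100.0] * 980 + [5000.0] * 20
mean_us, p99_us = summarize_latency(samples)
print(f"mean = {mean_us:.1f} us, p99 = {p99_us:.1f} us")
```

In this made-up trace only 2% of the operations are slow, yet they set the 99th percentile entirely while moving the mean far less, which is why the two charts are worth reading separately.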

ATSB - Heavy (Power)

The Toshiba XG6 uses slightly more energy over the course of the Heavy test than the XG5 does, when the test is run on an empty drive. The improved full-drive performance helps the XG6 come out ahead on energy usage for that test run. Either way, the XG6's efficiency is comparable to SATA drives and the WD Black is the only other high-end NVMe that offers this kind of power efficiency.
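The link between performance and energy usage comes down to energy being average power multiplied by time: a drive that finishes the same work sooner can use less total energy even if it draws more power while active. A minimal sketch of that arithmetic follows; the power and duration figures are hypothetical and are not measurements from this review.

```python
def energy_joules(average_power_w, duration_s):
    """Energy used by a test run: average power (W) times duration (s)."""
    return average_power_w * duration_s


# Hypothetical runs of the same workload (illustrative numbers only).
slow_run = energy_joules(average_power_w=2.0, duration_s=600.0)
fast_run = energy_joules(average_power_w=2.5, duration_s=400.0)

print(f"slower drive: {slow_run:.0f} J, faster drive: {fast_run:.0f} J")
```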


31 Comments


  • Valantar - Friday, September 7, 2018 - link

    AFAIK they're very careful which patches are applied to test beds, and if they affect performance, older drives are retested to account for this. Benchmarks like this are never really applicable outside of the system they're tested in, but the system is designed to provide a level playing field and repeatable results. That's really the best you can hope for. Unless the test bed has a consistent >10% performance deficit to most other systems out there, there's no reason to change it unless it's becoming outdated in other significant areas.
  • iwod - Thursday, September 6, 2018 - link

    So we are limited by the PCI-E interface again. Since the birth of the SSD, we have pushed past SATA 3Gbps / 6Gbps, then PCI-E 2.0 x4 at 2GB/s, and now PCI-E 3.0 at 4GB/s.

    When are we going to get PCI-E 4.0, or since 5.0 is only just around the corner may as well wait for it. That is 16GB/s, plenty of room for SSD maker to figure out how to get there.
  • MrSpadge - Thursday, September 6, 2018 - link

    There's no need to rush there. If you need higher performance, use multiple drives. Maybe on a HEDT or Enterprise platform if you need extreme performance.

    But don't be surprised if that won't help your PC as much as you thought. The ultimate limit currently is a RAMdisk. Launch a game from there or install some software - it's still surprisingly slow, because the CPU becomes the bottleneck. And that already applies to modern SSDs, which is obvious in benchmarks which test copying, installing or application launching etc.
  • abufrejoval - Friday, September 7, 2018 - link

    Could also be the OS or the RAMdisk driver. When I finished building my 128GB 18-Core system with a FusionIO 2.4 TB leftover and 10Gbit Ethernet, I obviously wanted to bench it on Windows and Linux. I was rather shocked to see how slow things generally remained and how pretty much all these 36 HT-"CPU"s were just yawning.

    In the end I never found out whether it was the last free version (3.4.8) of SoftPerfect's RAM Disk that didn't seem to make use of all four Xeon E5 memory channels, or some bottleneck in Windows (I've never seen Windows Update use more than a single core), but I never got anywhere near the 70GB/s Johan had me dream of (https://www.anandtech.com/show/8423/intel-xeon-e5-...). I don't think I even saturated the 10Gbase-T network, if I recall correctly.

    It was quite different in many cases on Linux, but I do remember running an entire Oracle database on tmpfs once, and then an OLTP benchmark on that... again earning myself a totally bored system under the most intensive benchmark hammering I could orchestrate.

    There are so many serialization points in all parts of that stack, you never really get the performance you pay for until someone has gone all the way and rewritten the entire software stack from scratch for parallel and in-memory.

    Latency is the killer for performance in storage, not bandwidth. You can saturate all bandwidth capacities with HDDs, even tape. Thing is, with dozens (modern CPUs) or thousands (modern GPGPUs) of cores, SSDs *become tape*, because of the latencies incurred on non-linear access patterns.

    That's why after NVMe, NV-DIMMs or true non-volatile RAM is becoming so important. You might argue that a cache line read from main memory still looks like a tape library change against the register file of an xPU, but it's still way better than PCIe-5-10 with a kernel based block layer abstraction could ever be.

    Linear speed and loops are dead: If you cannot unroll, you'll have to crawl.
  • halcyon - Monday, September 10, 2018 - link

    Thank you for writing this.
  • Quantum Mechanix - Monday, September 10, 2018 - link

    Awesome write-up, my favorite kind of comment, where I walk away just a *tiny* bit less ignorant. Thank you! :)
  • DanNeely - Thursday, September 6, 2018 - link

    We've been 3.0 x4 bottlenecked for a few years.

    From what I've read about implementing 4.0/5.0 on a mobo, I'm not convinced we'll see them on consumer boards, at least not in their current form. The maximum PCB trace length without expensive boosters is too short: AIUI 4.0 is marginal to the top PCIe slot/chipset, and 5.0 would need signal boosters even to go that far. Estimates I've seen were $50-100 (I think for an x16 slot) to make a 4.0 slot and several times that for 5.0. Cables can apparently go several times longer than PCB traces while maintaining signal quality, but I'm skeptical about them getting snaked around consumer mobos.

    And as MrSpadge pointed out, in many applications scaling out wider is an option, and from what I've read that is what Enterprise Storage is looking at. Instead of x4 slots that have 2/4x the bandwidth of current ones, that market is more interested in 5.0 x1 connections that have the same bandwidth as current devices but would allow them to connect 4 times as many drives. That seems plausible to me, since enterprise drive firmware is generally tuned for steady-state performance, not bursts, and most of those drives don't come as close to saturating buses as high-end consumer drives do for shorter/more intermittent workloads.
  • abufrejoval - Friday, September 7, 2018 - link

    I guess that's why they are working on silicon photonics: PCB voltage levels, densities, layers, trace lengths... Wherever you look there are walls of physics rising into mountains. If only PCBs weren't so much cheaper than silicon interposers, photonics and other new and rare things!
  • darwiniandude - Sunday, September 9, 2018 - link

    Any testing under Windows on current MacBook Pro hardware? I would've thought those SSDs are much, much faster, but I'd love to see the same test on them.
  • halcyon - Monday, September 10, 2018 - link

    Thanks for the review. In the future, could you consider segregating the drives into different tiers based on results, e.g. video editing, DB, generic OS/boot/app drive, compilation, whatnot?

    Now it seems that one drive is better in one thing, and another drive in another scenario. But not having your in-depth knowledge makes it harder to assess which drive would be closest to optimal in which scenario.
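For reference, the interface bandwidth figures quoted in the comments above can be checked with a little arithmetic. The sketch below uses the per-lane transfer rates and line encodings of each PCIe generation to estimate theoretical one-direction link bandwidth; it ignores packet and protocol overhead, so real drives land somewhat lower. The table and function names are illustrative choices, not part of any tool.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per generation.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
    "4.0": (16.0, 128 / 130),
    "5.0": (32.0, 128 / 130),
}


def pcie_bandwidth_gb_per_s(generation, lanes):
    """Theoretical one-direction link bandwidth in GB/s, before protocol overhead."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt_s * efficiency * lanes / 8  # 8 bits per byte


for gen in PCIE_GENERATIONS:
    print(f"PCIe {gen} x4: {pcie_bandwidth_gb_per_s(gen, 4):.2f} GB/s")

# A single 5.0 lane carries as much as four 3.0 lanes.
print(f"PCIe 5.0 x1: {pcie_bandwidth_gb_per_s('5.0', 1):.2f} GB/s")
```

Under these assumptions a 3.0 x4 link tops out just under 4GB/s and a 5.0 x4 link just under 16GB/s, while a 5.0 x1 link matches today's 3.0 x4 devices, which is the trade-off described for enterprise storage.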
