Random Read Performance

Our first test of random read performance uses very short bursts of operations issued one at a time with no queuing. The drives are given enough idle time between bursts to yield an overall duty cycle of 20%, so thermal throttling is impossible. Each burst consists of a total of 32MB of 4kB random reads, from a 16GB span of the disk. The total data read is 1GB.
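To make the workload shape concrete, here is a minimal sketch of that burst pattern using plain Linux direct I/O. It is only an illustration under assumed conditions (Linux, Python 3.7+, a scratch device path of your own choosing) and not the tool used to produce the results below.

```python
# Sketch of the QD1 burst random read workload described above (illustrative only).
import mmap
import os
import random
import time

DEV = "/dev/nvme0n1"          # hypothetical target; point at a scratch device/file
BLOCK = 4 * 1024              # 4kB random reads
SPAN = 16 * 1024**3           # offsets fall within a 16GB span
BURST = 32 * 1024**2          # 32MB of reads per burst
TOTAL = 1 * 1024**3           # 1GB read in total
DUTY_CYCLE = 0.20             # idle between bursts to hold a ~20% duty cycle

fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
buf = mmap.mmap(-1, BLOCK)    # page-aligned buffer, required for O_DIRECT

for _ in range(TOTAL // BURST):
    start = time.perf_counter()
    for _ in range(BURST // BLOCK):
        # queue depth 1: each read must complete before the next is issued
        offset = random.randrange(SPAN // BLOCK) * BLOCK
        os.preadv(fd, [buf], offset)
    busy = time.perf_counter() - start
    print(f"burst throughput: {BURST / busy / 1e6:.1f} MB/s")
    time.sleep(busy * (1 - DUTY_CYCLE) / DUTY_CYCLE)   # idle time between bursts

os.close(fd)
```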

Burst 4kB Random Read (Queue Depth 1)

The burst random read performance of the Toshiba XG5 was rather slow, and the XG6 improves on it, but not by enough to bring it up to par. Intel/Micron 3D TLC seems to offer substantially lower read latency, though some other drives have managed to get better random read performance out of BiCS TLC than Toshiba's XG series.

Our sustained random read performance test is similar to the random read test from our 2015 test suite: queue depths from 1 to 32 are tested, and the average performance and power efficiency across QD1, QD2 and QD4 are reported as the primary scores. Each queue depth is tested for one minute or 32GB of data transferred, whichever is shorter. After each queue depth is tested, the drive is given up to one minute to cool off so that the higher queue depths are unlikely to be affected by accumulated heat build-up. The individual read operations are again 4kB, and cover a 64GB span of the drive.
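A rough sketch of how such a queue-depth sweep and the QD1/QD2/QD4 averaging could be structured is below. Threads are used only to approximate queue depth (a real harness would use asynchronous I/O), the device path is a placeholder, and power measurement requires external instrumentation, so only throughput is computed here.

```python
# Illustrative queue-depth sweep: QD 1-32, each capped at one minute or 32GB,
# 4kB reads over a 64GB span, primary score = average of QD1/QD2/QD4.
import concurrent.futures
import mmap
import os
import random
import time

DEV = "/dev/nvme0n1"                      # hypothetical target device
BLOCK, SPAN = 4 * 1024, 64 * 1024**3      # 4kB reads over a 64GB span
LIMIT_BYTES, LIMIT_SECS = 32 * 1024**3, 60

def read_worker(deadline: float, byte_budget: int) -> int:
    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
    buf = mmap.mmap(-1, BLOCK)            # page-aligned buffer for O_DIRECT
    done = 0
    while time.perf_counter() < deadline and done < byte_budget:
        os.preadv(fd, [buf], random.randrange(SPAN // BLOCK) * BLOCK)
        done += BLOCK
    os.close(fd)
    return done

results = {}
for qd in (1, 2, 4, 8, 16, 32):
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=qd) as pool:
        futures = [pool.submit(read_worker, start + LIMIT_SECS, LIMIT_BYTES // qd)
                   for _ in range(qd)]
        total = sum(f.result() for f in futures)
    results[qd] = total / (time.perf_counter() - start) / 1e6   # MB/s
    time.sleep(60)                        # up to a minute of cool-off between QDs

primary = sum(results[qd] for qd in (1, 2, 4)) / 3
print(f"primary score: {primary:.1f} MB/s", results)
```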

Sustained 4kB Random Read

The rankings for sustained random read performance are largely similar to the burst random read test. The XG6 is improved over the XG5 but there's still quite a bit of room for improvement.

Sustained 4kB Random Read (Power Efficiency in MB/s/W)

The XG6 puts Toshiba back into a tie for the best power efficiency from a TLC drive performing random reads, because its middle-of-the-road performance doesn't require all that much power—just over half the power required by the SM2262EN's class-leading performance.
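For reference, the efficiency metric plotted in these charts is simply throughput divided by average power draw; the numbers in this tiny example are placeholders rather than measured results.

```python
# Power efficiency as reported in the charts: MB/s per Watt (placeholder values).
def efficiency_mb_s_per_w(throughput_mb_s: float, avg_power_w: float) -> float:
    return throughput_mb_s / avg_power_w

print(efficiency_mb_s_per_w(400.0, 2.0))   # hypothetical drive: 200.0 MB/s per W
```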

While the low queue depth random read performance from the Toshiba XG6 is nothing special, it does scale up quite well, and by QD32 it has caught up with the SM2262EN and pulled ahead of all other TLC drives.

Random Write Performance

Our test of random write burst performance is structured similarly to the random read burst test, but each burst is only 4MB and the total test length is 128MB. The 4kB random write operations are distributed over a 16GB span of the drive, and the operations are issued one at a time with no queuing.
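The read sketch shown earlier applies here almost unchanged; only the parameters, the open flags, and the I/O call differ. Again, the device path is an assumption and raw writes like this are destructive, so this is purely illustrative.

```python
# Sketch of the QD1 burst random write workload (illustrative only; destructive!).
import mmap
import os
import random

DEV = "/dev/nvme0n1"          # hypothetical scratch device
BLOCK = 4 * 1024              # 4kB random writes
SPAN = 16 * 1024**3           # offsets fall within a 16GB span
BURST = 4 * 1024**2           # 4MB written per burst
TOTAL = 128 * 1024**2         # 128MB written in total

fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT)
buf = mmap.mmap(-1, BLOCK)    # page-aligned buffer for O_DIRECT
buf.write(os.urandom(BLOCK))  # incompressible payload

for _ in range(TOTAL // BURST):
    for _ in range(BURST // BLOCK):
        # queue depth 1, one write at a time; idle time between bursts omitted
        os.pwritev(fd, [buf], random.randrange(SPAN // BLOCK) * BLOCK)

os.close(fd)
```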

Burst 4kB Random Write (Queue Depth 1)

The burst random write performance of the Toshiba XG6 is about 12% faster than the XG5—not enough to catch up to the fastest drives, but enough to stay above average even as the standards for high-end performance climb from year to year.

As with the sustained random read test, our sustained 4kB random write test runs for up to one minute or 32GB per queue depth, covering a 64GB span of the drive and giving the drive up to 1 minute of idle time between queue depths to allow for write caches to be flushed and for the drive to cool down.

Sustained 4kB Random Write

On the longer random write test, the Toshiba XG6 places at the top of the second tier of drives. It can't match the very fastest competitors, but it beats all the more mid-range NVMe drives.

Sustained 4kB Random Write (Power Efficiency in MB/s/W)

The XG6 has leapfrogged the WD Black to retake the lead in power efficiency during random writes, with about a 7% performance-per-Watt lead. Even with the improved performance relative to the XG5, the XG6 is still one of the least power-hungry NVMe drives during this test.

The random write performance of the XG6 scales best from QD2 to QD4, which brings it near saturation. This pattern is similar to the behavior of the XG5 and of drives from Samsung and WD, while the Phison and Silicon Motion controllers seem to pick up the pace a bit earlier, with better QD2 performance.

Comments

  • Valantar - Friday, September 7, 2018 - link

    AFAIK they're very careful which patches are applied to test beds, and if they affect performance, older drives are retested to account for this. Benchmarks like this are never really applicable outside of the system they're tested in, but the system is designed to provide a level playing field and repeatable results. That's really the best you can hope for. Unless the test bed has a consistent >10% performance deficit to most other systems out there, there's no reason to change it unless it's becoming outdated in other significant areas.
  • iwod - Thursday, September 6, 2018 - link

    So we are limited by the PCIe interface again. Since the birth of the SSD, we pushed past SATA 3Gbps / 6Gbps, then PCIe 2.0 x4 at 2GB/s, and now PCIe 3.0 x4 at 4GB/s.

    When are we going to get PCIe 4.0? Or, since 5.0 is only just around the corner, we may as well wait for it. That is 16GB/s, plenty of room for SSD makers to figure out how to get there.
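As a point of reference for the bandwidth figures quoted above, a quick back-of-the-envelope calculation of raw x4 link bandwidth (after line coding, ignoring protocol overhead) looks like this:

```python
# Rough per-generation x4 payload bandwidth: transfer rate times encoding efficiency.
GT_PER_LANE = {"PCIe 2.0": 5.0, "PCIe 3.0": 8.0, "PCIe 4.0": 16.0, "PCIe 5.0": 32.0}
ENCODING = {"PCIe 2.0": 8 / 10, "PCIe 3.0": 128 / 130,
            "PCIe 4.0": 128 / 130, "PCIe 5.0": 128 / 130}

for gen, gt in GT_PER_LANE.items():
    gb_s = gt * ENCODING[gen] / 8 * 4      # GT/s -> GB/s per lane, times 4 lanes
    print(f"{gen} x4 ≈ {gb_s:.1f} GB/s")   # prints roughly 2.0, 3.9, 7.9, 15.8
```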
  • MrSpadge - Thursday, September 6, 2018 - link

    There's no need to rush there. If you need higher performance, use multiple drives. Maybe on a HEDT or Enterprise platform if you need extreme performance.

    But don't be surprised if that won't help your PC as much as you thought. The ultimate limit currently is a RAMdisk. Launch a game from there or install some software - it's still surprisingly slow, because the CPU becomes the bottleneck. And that already applies to modern SSDs, which is obvious in benchmarks which test copying, installing or application launching etc.
  • abufrejoval - Friday, September 7, 2018 - link

    Could also be the OS or the RAMdisk driver. When I finished building my 128GB 18-Core system with a FusionIO 2.4 TB leftover and 10Gbit Ethernet, I obviously wanted to bench it on Windows and Linux. I was rather shocked to see how slow things generally remained and how pretty much all these 36 HT-"CPU"s were just yawning.

    In the end I never found out if it was the last free version (3.4.8) of SoftPerfect's RAM disk that didn't seem to make use of all four Xeon E5 memory channels, or some bottleneck in Windows (I've never seen Windows Update use more than a single core), but I never got anywhere near the 70GB/s Johan had me dream of (https://www.anandtech.com/show/8423/intel-xeon-e5-... Don't think I even saturated the 10Gbase-T network, if I recall correctly.

    It was quite different in many cases on Linux, but I do remember running an entire Oracle database on tmpfs once, and then an OLTP benchmark on that... again earning myself a totally bored system under the most intensive benchmark hammering I could orchestrate.

    There are so many serialization points in all parts of that stack, you never really get the performance you pay for until someone has gone all the way and rewritten the entire software stack from scratch for parallel and in-memory.

    Latency is the killer for performance in storage, not bandwidth. You can saturate all bandwidth capacities with HDDs, even tape. Thing is, with dozens of cores (modern CPUs) or thousands (modern GPGPUs), SSDs *become tape*, because of the latencies incurred on non-linear access patterns.

    That's why, after NVMe, NV-DIMMs or true non-volatile RAM are becoming so important. You might argue that a cache line read from main memory still looks like a tape library change against the register file of an xPU, but it's still way better than PCIe 5-10 with a kernel-based block layer abstraction could ever be.

    Linear speed and loops are dead: If you cannot unroll, you'll have to crawl.
  • halcyon - Monday, September 10, 2018 - link

    Thank you for writing this.
  • Quantum Mechanix - Monday, September 10, 2018 - link

    Awesome write up- my favorite kind of comment, where I walk away just a *tiny* less ignorant. Thank you! :)
  • DanNeely - Thursday, September 6, 2018 - link

    We've been 3.0 x4 bottlenecked for a few years.

    From what I've read about implementing 4.0/5.0 on a mobo, I'm not convinced we'll see them on consumer boards, at least not in their current form. The maximum PCB trace length without expensive boosters is too short; AIUI 4.0 is marginal for reaching the top PCIe slot/chipset, and 5.0 would need signal boosters even to go that far. Estimates I've seen were $50-100 (I think for an x16 slot) to make a 4.0 slot and several times that for 5.0. Cables can apparently go several times longer than PCB traces while maintaining signal quality, but I'm skeptical about them getting snaked around consumer mobos.

    And as MrSpadge pointed out, in many applications scaling out wider is an option, and from what I've read that's what enterprise storage is looking at. Instead of x4 slots that have 2/4x the bandwidth of current ones, that market is more interested in 5.0 x1 connections that have the same bandwidth as current devices but would allow them to connect four times as many drives. That seems plausible to me, since enterprise drive firmware is generally tuned for steady-state performance rather than bursts, and most of those drives don't come as close to saturating their buses as high-end consumer drives do on shorter/more intermittent workloads.
  • abufrejoval - Friday, September 7, 2018 - link

    I guess that's why they are working on silicon photonics: PCB voltage levels, densities, layers, trace lengths... Wherever you look there are walls of physics rising into mountains. If only PCBs weren't so much cheaper than silicon interposers, photonics and other new and rare things!
  • darwiniandude - Sunday, September 9, 2018 - link

    Any testing under Windows on current MacBook Pro hardware? Those SSDs, I would have thought, are much, much faster, but I'd love to see the same tests run on them.
  • halcyon - Monday, September 10, 2018 - link

    Thanks for the review. For the future, could you consider segregating the drives into different tiers based on results, e.g. video editing, DB, generic OS/boot/app drive, compilation, whatnot?

    Now it seems that one drive is better in one scenario, and another drive in another. But not having your in-depth knowledge makes it harder to assess which drive would be closest to optimal in which scenario.
