The New Indirection Table

While the binary tree structure was great for sequential IO performance and for keeping DRAM sizes low, it wasn't good for lowering random IO latency. The S3700 controller completely does away with the old indirection table.

The new controller ditches the binary tree entirely and moves to a completely flat structure with 1:1 mapping. What happens now is there's a giant array with each location in the array mapped to a specific portion of NAND. The array isn't dynamically created and, since it's a 1:1 mapping, searches, inserts and updates are all very fast.

The other benefit of being 1:1 mapped with physical NAND is that there's no need to defragment the table, which immediately cuts down the amount of work the controller has to do. Drives based on this new controller only have to keep the NAND defragmented.

The downside to all of this is the DRAM area required by the new flat indirection table. The old binary tree was very space efficient, while the new array is just huge. It requires a large amount of DRAM depending on the capacity of the drive. In its largest implementation (800GB), Intel needs a full 1GB of DRAM to store the indirection table. By my calculations, the table itself should require roughly 100MB of DRAM per 100GB of storage space on the drive itself. Intel appears to be using DDR3-1333 for its DRAM on-board S3700 drives.

There's a bit of space left over after you account for the new indirection table. That area is reserved for a cache of the controller's firmware so it doesn't have to read from slow flash to access it.

Once again, there's no user data stored in the external DRAM. The indirection table itself is physically stored in NAND (just cached in DRAM), and there are two large capacitors on-board to push any updates to non-volatile storage in the event of power loss.

It sounds like a simple change, but building this new architecture took quite a bit of work. The results, if they are anywhere close to what Intel is promising, are pretty awesome.

Final Words

The Intel SSD DC S3700 appears to be a very promising new architecture from Intel. If it ends up performing as Intel promised, the S3700 controller could be the beginning of a new era in SSD performance - one focused on consistency of performance, not just absolute performance. As soon as we run samples through our test suite you can expect a full review, putting Intel's claims to the test. Stay tuned.

A Brand New Architecture & The Old Indirection Table
Comments Locked


View All Comments

  • DukeN - Monday, November 5, 2012 - link

    Now please give us some results with benchmarks relevant to enterprise users (eg RAID performance, wear levelling vs other enterprise drives).
  • chrone - Monday, November 5, 2012 - link

    finally getting more consistent performance over time. nice writing Anand, as always! :)
  • edlee - Monday, November 5, 2012 - link

    on paper this is very nice, but i am not having any issues with current crop of ssds.

    how about intel helps design a new sata standard that supports more than 6Gbps, like 50Gbps, so its futureproof, and can put a deathblow to thunderbolt.
  • Conficio - Monday, November 5, 2012 - link

    You realize that
    * Thunderbolt is an Intel technology. So they are not looking to kill it
    * That thunderbolt can rout your entire PCI bus across physical locations (6 m now, with optical cables ~100 m [if memory serves me])
    * That said you want SSD interfaces going directly to the PCI bus (not invent another intermediate bus that is built for a technology (spinning disks))
    * That direct PCI interfaces for SSDs is where things are going
  • dananski - Monday, November 5, 2012 - link

    " PCI interfaces for SSDs is where things are going"

    I would like to see this become more common. There's 8Gb/s of spare PCI-E bandwidth on one slot on my machine at the moment.

    But what if SSDs advance faster than even PCI-E? I wonder if they could bring the interface even closer to home by allowing NAND chips to plug into memory-like slots on the motherboard (yay easy upgrade path), with the controller integrated into the CPU? The controller should be relatively inobtrusive - how much die area would it take at 22nm? And could some of the operations run efficiently on the main CPU to cut down that die area overhead some more?
  • JohnWinterburn - Monday, November 5, 2012 - link

    As much as fusion IO et al would like it to direct PCI interfaces are certainly not where it's going for this market.

    You cant replace them easily when they break (as some are always going to when you have enough), you cant fit that many in a box, you have to rely on a single manufacturer and you're then tied into their software.

    None of thats going to change any time soon, so PCI interfaced SSDs will be small scale or for specific projects.
  • ogreslayer - Monday, November 5, 2012 - link

    That is what SATA express and SFF-8639 will be for and was announced a while ago.

    Maybe not 50Gbps but at 4GB/s and providing 32Gbps it isn't a small jump. Even the 2Gbps gen3 connection isn't something to sneeze at.
  • iwod - Monday, November 5, 2012 - link

    I still fail to understand why we need SATA express and SFF-8639. When one could have ruled them all. Since the main difference between SATA and SAS is one being Half Duplex and SAS being Full Duplex. But the under lying PCI-Express protocol is Full Duplex by design, so why make another SATA express and not just use SFF-8639 ?

    And I hope we start with PCI-E 3.0 too, by the time these things arrive there is no point of using the older and slower PCI-E 2.0
  • Kevin G - Monday, November 5, 2012 - link

    Look into SATA-Express. It essentially uses two PCI-E 2.0 lanes for data transfer (16 Gbit/s with 32 Gbit/s when the spec migrates over to PCI-E 3.0). There is some backwards compatibility with SATA too.

    Though SATA-Express will likely coexist with Thunderbolt. SATA Express is aimed as an internal storage solution where as Thunderbolt is aimed toward external peripherals (where storage is just one aspect).
  • Kevin G - Monday, November 5, 2012 - link

    I'm curious about the raw depth of ECC in this device. ECC on the internal SRAM is pretty much expected for enterprise grade equipment nowadays. ECC on the DRAM is also expected but I'm wondering how it is implemented. Chances are that the drive doesn't house 9 DRAM chips for traditional 72 bit wide ECC protected bus. ECC on the NAND could be implemented at the block level (576 bit blocks with 512 bit data + 64 bit ECC) but that'd require some custom NAND chips.

    As for the indirect tables, I suspect that the need to be able to hold the entire table in DRAM stems from the idea of table having to optimize the copy in NAND. Optimizing here can likely be done without the massive DRAM cache but I suspect that the optimization process would require too many read/writes to the point it'd be detrimental to the drives life span.

Log in

Don't have an account? Sign up now