Live Long and Prosper: The Logical Page

Computers are all about abstraction. In the early days of computing you had to write assembly code to get your hardware to do anything. Programming languages like C and C++ created a layer of abstraction between the programmer and the hardware, simplifying the development process. The key word there is simplification. You can be more efficient writing directly for the hardware, but it’s far simpler (and much more manageable) to write high-level code and let a compiler optimize it.

The same principles apply within SSDs.

The smallest writable location in NAND flash is a page; that doesn’t mean that it’s the largest size a controller can choose to write. Today I’d like to introduce the concept of a logical page, an abstraction of a physical page in NAND flash.

Confused? Let’s start with a (hopefully, I'm no artist) helpful diagram:

On one side of the fence we have how the software views storage: as a long list of logical block addresses (LBAs). It’s a bit more complicated than that, since a traditional hard drive is faster at certain LBAs than others, but to keep things simple we’ll ignore that.

On the other side we have how NAND flash stores data, in groups of cells called pages. These days a 4KB page size is common.

In reality there’s no fence that separates the two, but rather a lot of logic, several buses and eventually the SSD controller. The controller determines how the LBAs map to the NAND flash pages.
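To make that mapping a little more concrete, here is a minimal Python sketch of the address arithmetic only. The 512-byte sector size and the `locate` helper are my own assumptions for illustration; a real controller also keeps a table that says which physical page each logical page currently lives in, and that part is not shown.

```python
SECTOR_BYTES = 512  # assumed LBA sector size; not stated in the article


def locate(lba, logical_page_bytes=4096):
    """Return (logical page index, byte offset inside that page) for an LBA."""
    byte_address = lba * SECTOR_BYTES
    return byte_address // logical_page_bytes, byte_address % logical_page_bytes


print(locate(0))  # (0, 0)
print(locate(7))  # (0, 3584)
print(locate(8))  # (1, 0)
```

With a 4KB logical page, eight consecutive 512-byte LBAs land in the same logical page. Passing a larger `logical_page_bytes` groups more LBAs together.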

The most straightforward way for the controller to write to flash is by writing in pages. In that case the logical page size would equal the physical page size.

Unfortunately, there’s a huge downside to this approach: tracking overhead. If your logical page size is 4KB then an 80GB drive will have no fewer than twenty million logical pages to keep track of (20,971,520 to be exact). You need a fast controller to sort through and deal with that many pages, a lot of storage to keep tables in and larger caches/buffers.

The benefit of this approach, however, is very high 4KB write performance. If the majority of your writes are 4KB in size, this approach will yield the best performance.

If you don’t have the expertise, time or support structure to make a big honkin' controller that can handle page level mapping, you go to a larger logical page size. One such example would involve making your logical page equal to an erase block (128 x 4KB pages, or 512KB). This significantly reduces the number of pages you need to track and optimize around; instead of 20.9 million entries, you now have approximately 163 thousand. All of your controller’s internal structures shrink in size and you don’t need as powerful a microprocessor inside the controller.
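The entry counts are simple division, and a short Python sketch makes it easy to try other drive or page sizes. It assumes binary units (1GB = 1024^3 bytes), which is what the 20,971,520 figure implies; `mapping_entries` is a name I made up for the example.

```python
KB = 1024
GB = 1024 ** 3


def mapping_entries(drive_bytes, logical_page_bytes):
    """Number of logical pages the controller has to track."""
    return drive_bytes // logical_page_bytes


drive = 80 * GB
page_level = mapping_entries(drive, 4 * KB)         # logical page = one 4KB page
block_level = mapping_entries(drive, 128 * 4 * KB)  # logical page = one 512KB block

print(page_level)   # 20971520
print(block_level)  # 163840
```

Going from page level to block level mapping cuts the number of entries by a factor of 128, which is exactly the number of pages in a block.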

The benefit of this approach is very high large file sequential write performance. If you’re streaming large chunks of data, having big logical pages will be optimal. You’ll find that most flash controllers that come from the digital camera space are optimized for this sort of access pattern where you’re writing 2MB - 12MB images all the time.

Unfortunately, the sequential write performance comes at the expense of poor small file write speed. Remember that writing to MLC NAND flash already takes 3x as long as reading; writing small files when your controller needs large ones worsens the penalty. If you want to write an 8KB file, the controller will need to write 512KB (in this case) of data since that’s the smallest size it knows how to write. Write amplification goes up considerably: here, 64 times as much data hits the flash as the host asked to write.
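A rough Python sketch of that penalty follows. It assumes the write is aligned to a logical page boundary and counts only the rounding up to whole logical pages; it ignores garbage collection and anything else a real controller does, so treat the result as a floor rather than a measurement. The `write_amplification` function is hypothetical.

```python
KB = 1024


def write_amplification(host_bytes, logical_page_bytes):
    """Bytes written to flash divided by bytes the host asked to write."""
    # Round up to a whole number of logical pages (ceiling division).
    pages = -(-host_bytes // logical_page_bytes)
    return pages * logical_page_bytes / host_bytes


print(write_amplification(8 * KB, 512 * KB))  # 64.0
print(write_amplification(8 * KB, 4 * KB))    # 1.0
```

The same 8KB write that costs 64x with a 512KB logical page costs nothing extra with a 4KB logical page, since it fills exactly two pages.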

Remember the first OCZ Vertex drive based on the Indilinx Barefoot controller? Its logical page size was equal to a 512KB block. OCZ asked for a firmware that enabled page level mapping and Indilinx responded. The result was much improved 4KB write performance:

Iometer 4KB Random Writes, IOqueue=1, 8GB sector space | Logical Block Size = 128 pages | Logical Block Size = 1 Page
Pre-Release OCZ Vertex | 0.08 MB/s | 8.2 MB/s


296 Comments


  • jtleon - Tuesday, September 08, 2009 - link

    Yes, I fell asleep at least 3 times reading this article (it IS Monday after all)

    Yes, Indilinx clearly rocks the SSD world - Now I know thanks to Anand!

    Stories like this set the standard for all review sites - I don't come away with the feeling I was just sold a bill of goods by some shyster in Intel's pocket, or otherwise.

    Great Job Anand! Keep them coming!
  • SSDdaydreamer - Tuesday, September 08, 2009 - link

    I too am wondering whether TRIM will be available on the Intel Drives for Windows XP or Vista. I seriously doubt it, as the OCZ Wiper Tool appears to only be available for Indilinx controllers. Perhaps Intel will introduce their own wiper utility. I am leaning towards the OCZ Vertex or Patriot Torqx drives, as I am quite content with Windows XP and Windows Vista.
    I have an itchy trigger finger on these SSDs, but I want to hold back for the following unknowns.

    1. I would like to use the NTFS file system for my drive, but I am unsure of the proper/ideal block size.

    2. I would merely like to image my existing Windows installation, but I am worried that performance or stability problems will arise from the NTFS file system. A fresh install could be in order, but I would prefer to image.

    3. Is there a way to change the size of the spare area? Maybe I have the wrong idea (perhaps only format part of the drive, so the unformatted space gets appended to the spare area?). I am willing to sacrifice some of the usable partition space for an increased spare area for improved performance.

    4. Are there complications with multiple partitions? If there are multiple partitions on the drive (for multi-boot) do they all share the same spare area? Is it possible to allow their own respective spare areas?


    Is there anybody out there that could enlighten me? I'm sure others would do well to have the answers as well. If I make any discoveries, I will be sure to post them.
    Thanks in advance.
  • bradhs - Monday, September 07, 2009 - link

    Is there a "Wiper" app for Intel X-25m G2 drives? For people who don't have Windows 7 (TRIM) and want to keep the Intel X-25m G2 running smooth.
  • smjohns - Tuesday, September 08, 2009 - link

    No, there is no wiper tool for Intel drives at the mo. In addition to this, the current firmware on the Intel drives does not have TRIM enabled. I guess this will be released soon after Windows 7 is released. I think I have read somewhere that Intel are working on a TRIM version of its Matrix Storage Manager software that will provide this functionality to the other operating systems.
  • Burny - Monday, September 07, 2009 - link

    As many before me: great article! I learned a lot about SSDs. Even up to the point I'm ready to buy one.
    I still have 2 questions though:

    2. Will TRIM be available on the G2 Intel drives for sure? Some sources doubt this: http://www.microsoft.com/communities/newsgroups/en...


    3. As I understand, TRIM will work on a firmware level. Does that imply that TRIM will also function under Windows XP or any OS for that matter? Then why the need to build another TRIM into Windows 7? Or does a TRIM firmware enabled SSD simply allow the OS to use TRIM?

    Thanks!
  • smegforbrain - Monday, September 07, 2009 - link

    While I consider myself handy with computers, I'm not the best technical mind when it comes to the details. You do an excellent job of presenting everything in a manner that it can be understood with little difficulty. I look forward to future articles about SSDs.

    I do have a question I'm hoping somebody can answer. I'm as interested in the long-term storage outlook of SSD drives as I am every day use. I've seen it said that an SSD drive should hold its charge for 10 years if not used, and it was discussed a bit earlier in this thread.

    Yet, none of my current mechanical hard drives are more than 3 years old; none of my burned DVDs/CDs are older than 5 years. It seems far more likely that I would replace an SSD for one with a greater storage capacity after 5 years tops than to expect one to be in use, even as archival storage, for as long as 10 years.

    So, is the 10 year 'lifespan' even going to be an issue with archival storage for most people?

    Will this worry over the life span of an SSD become even less of an issue as the technology matures over the next couple of years?
  • Starcub - Tuesday, September 08, 2009 - link

    "So, is the 10 year 'lifespan' even going to be an issue with archival storage for most people?"

    No, but who takes wads of money out of their wallet to store it on their shelf?
  • smegforbrain - Tuesday, September 08, 2009 - link

    "No, but who takes wads of money out of their wallet to store it on their shelf?"

    That is simply assuming that they will remain as expensive as they are now. They won't.
  • BlackSphinx - Sunday, September 06, 2009 - link

    Hello! I'm taking the time to comment on this article, because I am very thankful for all of these awesome write-ups on SSD.

    I'm in the process of building a heavily overclocked i7 rig for gaming and video editing, and I was going to jam 2 Velociraptors in Raid0 in there. Why? I had only heard bad things about SSDs in the past.

    Reading your articles, which are, while in depth, very clear and easy to understand, I understand much better what happened in early SSDs, what's so good about recent Indilinx and Intel SSDs, and, truly, why I should forgo mechanical drives and instead go the SSD route (which, frankly, isn't more costly than a Raid0 raptor setup). In short, these articles are a great service to end users just like myself, and if they were intended as such, you have passed with flying colors. Congratulations and thanks.
  • Transisto - Sunday, September 06, 2009 - link

    Could someone reset my brain as to why there is no way to get a (very noticeable) improvement from USB thumb-drives. I mean these things also get 0.1 ms latency.

    It's a bit extreme but for the same price I could get 9 cheap 8GB SLC USB drives for around 20$ each and put them in three separate PCI-USB add-on cards (5$)

    They would saturate the USB controller with 3 drives in it so I could get around 140MB/s read and 60MB/s write.

    Say you manage to merge that into a raid or ... ? Is eboost or Readyboost any good at scaling up ?
