The Return of the JMicron based SSD

With the SSD slowdown addressed it’s time to start talking about new products. And I’ll start by addressing the infamous JMicron JMF602 based SSDs.

For starters, a second revision of the JMF602 controller came out last year - the JMF602B. This controller had twice the cache of the original JMF602A and thus didn’t pause/stutter as often.

The JMicron JMF602B is the controller found in G.Skill’s line of SSDs as well as OCZ’s Core V2, the OCZ Solid and the entire table below of SSDs:

JMicron JMF602B Based SSDs
G.Skill FM-25S2
G.Skill Titan
OCZ Apex
OCZ Core V2
OCZ Solid
Patriot Warp
SuperTalent MasterDrive

 

All I need to do is point to our trusty iometer test to tell you that the issues that plagued the original JMicron drives I complained about apply here as well:

Iometer 4KB Random Writes, IOqueue=3, 8GB sector space IOs per second MB/s Average Latency Maximum Latency
JMF602B MLC Drive 5.61 0.02 MB/s 532.2 ms 2042 ms

 

On average it takes nearly half a second to complete a random 4KB write request to one of these drives. No thanks.

The single-chip JMF602B based drives are now being sold as value solutions. While you can make the argument that the pausing and stuttering is acceptable for a very light workload in a single-tasking environment, simply try doing anything while installing an application or have anti-virus software running in the background and you won’t be pleased by these drives. Save your money, get a better drive.

The next step up from the JMF602B based drives are drives based on two JMF602B controllers. Confused? Allow me to explain. The problem is that JMicron’s next SSD controller design won’t be ready anytime in the near future, and shipping mediocre product is a better option than shipping no product, so some vendors chose to take two JMF602B controllers and put them in RAID, using another JMicron controller.


Two JMF602B controllers and a JMicron RAID controller

The problem, my dear friends, is that the worst case scenario latency penalty - at best, gets cut in half using this approach. You’ll remember that the JMF602 based drives could, under normal load, have a write-latency of nearly 0.5 - 2 seconds. Put two controllers together and you’ll get a worst-case scenario write latency of about one second under load or half a second with only a single app running. To test the theory I ran the now infamous 4K random write iometer script on these “new” drives:

Iometer 4KB Random Writes, IOqueue=3, 8GB sector space IOs per second MB/s Average Latency Maximum Latency
JMF602B MLC Drive 5.61 0.02 MB/s 532.2 ms 2042 ms
Dual JMF602B MLC Controller Drive 8.18 0.03 MB/s 366.1 ms 1168.2 ms

 

To some irate SSD vendors, these may just be numbers, but let’s put a little bit of thought into what we’re seeing here shall we? These iometer results are saying that occasionally when you go to write a 4KB file (for example, loading a website, sending an IM and having the conversation logged or even just saving changes to a word doc) the drive will take over a second to respond.

I don’t care what sort of drive you’re using, 2.5”, 3.5”, 5400RPM or 7200RPM, if you hit a 1 second pause you notice it and such performance degradation is not acceptable. Now these tests are more multitasking oriented, but if you run a single IO on the drive you'll find that the maximum latency is still half a second.

The average column tells an even more bothersome story. Not only is the worst case scenario a 1168 ms write, on average you’re looking at over a quarter of a second just to write 4KB.

The G.Skill Titan has recently garnered positive reviews for being a fast, affordable, SSD. Many argued that it was even on the level of the Intel X25-M. I’m sorry to say it folks, that’s just plain wrong.


One of the most popular dual JMF602B drives

If you focus exclusively on peak transfer rates then these drives work just fine. You’ll find that, unless you’re running a Blu-ray rip server, you don’t spend most of your time copying multi-GB files to and from the drive. Instead, on a normal desktop, the majority of your disk accesses will be small file reads and writes and these drives can’t cut it there.

Some vendors have put out optimization guides designed to minimize stuttering with these JMF602B based drives. The guides generally do whatever they can to limit the number and frequency of small file writes to your drive (e.g. disabling search indexing, storing your temporary internet files on a RAM drive). While it’s true that doing such things will reduce stuttering on these drives, the optimizations don’t solve the problem - they merely shift the cause of it. The moment an application other than Vista or your web browser goes to write to your SSD you’ll have to pay the small file write penalty once more. Don’t settle.

But what option is there? Is Intel’s X25-M the only drive on the market worth recommending? What if you can’t afford spending $390 for 80GB. Is there no cheaper option?

Latency vs. Bandwidth: What to Look for in a SSD OCZ Tries Again with the Vertex
Comments Locked

250 Comments

View All Comments

  • Kary - Thursday, March 19, 2009 - link

    Why use TRIM at all?!?!?

    If you have extras Blocks on the drive (NOT PAGES, FULL BLOCKS) then there is no need for TRIM command.

    1)Currently in use BLOCK is half full
    2)More than half a block needs to be written
    3)extra BLOCK is mapped into the system
    4)original/half full block is mapped out of system.. can be erased during idle time.

    You could even bind multiple continuous blocks this way (I assume that it is possible to erase simultaneously any of the internal groupings pages from Blocks on up...they probably share address lines...ex. erase 0000200 -> just erase block #200 ....erase 00002*0 -> erase block 200 to 290...btw, did addressing in base ten instead of binary just to simplify for some :)
  • korbendallas - Wednesday, March 18, 2009 - link

    Actually i think that the Trim command is merely used for marking blocks as free. The OS doesn't know how the data is placed on the SSD, so it can't make informed decision on when to forcefully erase pages. In the same way, the SSD doesn't know anything about what files are in which blocks, so you can't defrag files internally in the drive.

    So while you can't defrag files, you CAN now defrag free space, and you can improve the wear leveling because deleted data can be ignored.

    So let's say you have 10 pages where 50% of the blocks were marked deleted using the Trim command. That means you can move the data into 5 other pages, and erase the 10 pages. The more deleted blocks there are in a page, the better a candidate for this procedure. And there isn't really a problem with doing this while the drive is idle - since you're just doing something now, that you would have to do anyway when a write command comes.
  • GourdFreeMan - Wednesday, March 18, 2009 - link

    This is basically what I am arguing both for and against in the fourth paragraph of my original post, though I assumed it would be the OS'es responsibility, not the drive's.

    Do SSDs track dirty pages, or only dirty blocks? I don't think there is enough RAM on the controller to do the former...
  • korbendallas - Wednesday, March 18, 2009 - link

    Well, let's take a look at how much storage we actually need. A block can be erased, contain data, or be marked as trimmed or deallocated.

    That's three different states, or two bits of information. Since each block is 4kB, a 64GB drive would have 16777216 blocks. So that's 4MB of information.

    So yeah, saving the block information is totally feasible.
  • GourdFreeMan - Thursday, March 19, 2009 - link

    Actually the drive only needs to know if the page is in use or not, so you can cut that number in half. It can determine a partially full block that is a candidate for defragmentation by looking at whether neighboring pages are in use. By your calculation that would then be 2 MiB.

    That assumes the controller only needs to support drives of up to 64 GiB capacity, that pages are 4 KiB in size, and that the controller doesn't need to use RAM for any other purpose.

    Most consumer SSD lines go up to 256 GiB in capacity, which would bring the total RAM needed up to 8 MiB using your assumption of a 4 KiB page size.

    However, both hard drives and SSDs use 512 byte sectors. This does not necessarily mean that internal pages are therefore 512 bytes in size, but lacking any other data about internal pages sizes, let's run the numbers on that assumption. To support a 256 MiB drive with 512 byte pages, you would need 64 MiB of RAM -- which only the Intel line of SSDs has more than -- dedicated solely to this purpose.

    As I said before there are ways of getting around this RAM limitation (e.g. storing page allocation data per block, keeping only part of the page allocation table in RAM, etc.), so I don't think the technical challenge here is insurmountable. There still remains the issue of wear, however...
  • GourdFreeMan - Wednesday, March 18, 2009 - link

    Substitute "allocated" for "dirty" in my above post. I muddled the terminology, and there is no edit function to repair my mistake.

    Also, I suppose the SSD could store some per block data about page allocation appended to the blocks themselves at a small latency penalty to get around the RAM issue, but I am not sure if existing SSDs do such a thing.

    My concerns about added wear in my original post still stand, and doing periodic internal defragmentation is going to necessitate some unpredictable sporadic periods of poor response by the drive as well if this feature is to be offered by the drive and not the OS.
  • Basilisk - Wednesday, March 18, 2009 - link

    I think your concerns parallel mine, allbeit we have different conclusions.

    Parag.1: I think you misunderstand the ERASE concept: as I read it, after an ERASE parts of the block are re-written and parts are left erased -- those latter parts NEED NOT be re-erased before they are written, later. If the TRIM function can be accomplished at an idle moment, access time will be "saved"; if the TRIM can erase (release) multiple clusters in one block [unlikely?], that will reduce both wear & time.

    Parag.2: This argument reverses the concept that OS's should largely be ignorant about device internals. As devices with different internal structures have proliferated over the years -- and will continue so with SSD's -- such OS differentiation is costly to support.

    Parag 3 and onwards: Herein lies the problem: we want to save wear by not re-writing files to make them contiguous, but we now have a situation where wear and erase times could be considerably reduced by having those files be contiguous. A 2MB file fragmented randomly in 4KB clusters will result in around 500 erase cycles when it's deleted; if stored contiguously, that would only require 4-5 erase cycles (of 512KB SSD-blocks)... a 100:1 reduction in erases/wear.

    It would be nice to get the SSD blocks down to 4KB in size, but I have to infer there are counter arguments or it would've been done already.

    With current SSDs, I'd explore using larger cluster sizes -- and here we have a clash with MS [big surprise]. IIRC, NTFS clusters cannot exceed 4KB [for something to do with file compression!]. That makes it possible that FAT32 with 32KB clusters [IIRC clusters must be less than 64KB for all system tools to properly function] might be the best choice for systems actively rewriting large files. I'm unfamiliar with FAT32 issues that argue against this, but if the SSD's allocate clusters contiguously, wouldn't this reduce erases by a factor of 8 for large file deletions? 32KB clusters might ham-string caching efficiency and result in more disk accesses, but it might speed-up linear reads and s/w loads.

    The impact of very small file/directory usage and for small incremental file changes [like appending to logs] wouldn't be reduced -- it might be increased as data-transfer sizes would increase -- so the overall gain for having fewer clusters-per-SSD-block is hard to intuit, and it would vary in different environments.
  • GourdFreeMan - Wednesday, March 18, 2009 - link

    RE Parag. 1: As I understand it, the entire 512 KiB block must always be erased if there is even a single page of valid data written to it... hence my concerns. You may save time reading and writing data if the device could know a block were partially full, but you still suffer the 2ms erase penalty. Please correct me if I am mistaken in my assumption.

    RE Parag. 2: The problem is the SSD itself only knows the physical map of empty and used space. It doesn't have any knowledge of the logical file system. NTFS, FAT32, ext3 -- it doesn't matter to the drive, that is the OS'es responsibility.

    RE Parag. 3: I would hope that reducing the physical block size would also reduce the block erase time from 2ms, but I am not a flash engineer and so cannot comment. One thing I can state for certain, however is that moving to smaller physical block sizes would not increase wear across the surface of the drive, except possibly for the necessity to keep track of a map of used blocks. Rewriting 128 blocks on a hypothetical SSD with 4 KiB blocks versus 1 512 KiB block still erases 512 KiB of disk space (excepting the overhead in tracking which blocks are filled).

    Regarding using large filesystem clusters: 4 KiB clusters offer a nice tradeoff between filesystem size, performance and slack (lost space due to cluster size). If you wanted to make an SSD look artificially good versus a hard drive, a 512 KiB cluster size would do so admirably, but no one would use such a large cluster size except for a data drive used to store extremely large files (e.g. video) exclusively. BTW, in case you are unaware, you can format a non-OS partition with NTFS to cluster sizes other than 4 KiB. You can also force the OS to use a different cluster size by first formating the drive for the OS as a data drive with a different cluster size under Windows and then installing Windows on that partition. I have a 2 KiB cluster size on a drive that has many hundreds of thousands of small files. However, I should note that since virtual memory pages are by default 4 KiB (another compelling reason for the 4 KiB default cluster size), most people don't have a use for other cluster sizes if they intend to have a page file on the drive.
  • ssj4Gogeta - Wednesday, March 18, 2009 - link

    Thanks for the wonderful article. And yes, I read every single word. LOL
  • rudolphna - Wednesday, March 18, 2009 - link

    Hey anand, page 3, the random read latency graph, they are mixed up. it is listed as the WD Velociraptor having a .11ms latency, I think you might want to fix that. :)

Log in

Don't have an account? Sign up now