Native Command Queuing

Hard drives are the slowest components in your PC, largely because they are the only ones that still rely heavily on mechanics for normal operation. That being said, there are definite ways of improving disk performance by optimizing the electronics that augment the mechanical functions of a hard drive.

Hard drives work like this: they receive read/write requests from the chipset's I/O controller (e.g. Intel's ICH6); the requests are buffered by the disk's on-board memory and carried out by the disk's on-board controller, which moves the heads to the correct platter and to the right place on that platter to read or write the necessary data. The hard drive is, in fact, a very obedient device; it does exactly what it's told to do, which is a bit unfortunate. Here's why:

It is the hard drive, not the chipset's controller, not the CPU and not the OS that knows where all of the data is laid out across its various platters. So, when it receives requests for data, the requests are not always organized in the best manner for the hard disk to read them. They are organized in the order in which they are dispatched by the chipset's I/O controller.

Native Command Queuing is a technology that allows the hard drive to dynamically reorder its outstanding requests according to where the requested data lies on the platters. It's like this: say you had to go to the grocery store, then the drug store next to it, then the mall, and then back to the grocery store for something else. Doing the errands in that order would not make sense; you'd be wasting time and money. You would naturally reorder them to grocery store, grocery store, drug store and then the mall to improve efficiency. Native Command Queuing does just that for disk accesses.
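The reordering idea can be sketched in a few lines of Python. This is a toy model, not how drive firmware actually works: real NCQ implementations account for rotational position as well as seek distance, and the block addresses here are invented for the example. The sketch simply serves whichever queued request is nearest the current head position instead of serving them first-in-first-out.

```python
def head_travel(requests, start=0):
    """Total head movement when serving requests in the given order."""
    pos, total = start, 0
    for lba in requests:
        total += abs(lba - pos)
        pos = lba
    return total

def reorder_nearest_first(requests, start=0):
    """Greedily reorder queued requests by proximity to the head
    (a shortest-seek-first heuristic standing in for NCQ logic)."""
    pending = list(requests)
    pos, order = start, []
    while pending:
        nearest = min(pending, key=lambda lba: abs(lba - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order

# The errand example in disk terms: two requests near each other
# (the grocery store, twice), one next door, one far away.
fifo = [10, 900, 500, 12]           # order the controller issued them
ncq = reorder_nearest_first(fifo)   # order the drive might serve them

print(fifo, "travel:", head_travel(fifo))  # 1788 units of head movement
print(ncq, "travel:", head_travel(ncq))    # [10, 12, 500, 900] -> 900 units
```

Serving the same four requests in a smarter order roughly halves the total head travel in this contrived case, which is the whole point of command queuing.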

For most desktop applications, NCQ isn't necessary. Desktop applications are mostly sequential in nature and exhibit a high degree of spatial locality. What this means is that most disk accesses on desktop systems occur around the same basic areas of a platter. Applications store all of their data around the same location on your disk, as do games, so loading either one doesn't require many random accesses across the platter, which reduces the need for NCQ. Instead, most desktop applications benefit much more from higher platter densities (more data stored in the same physical area on a platter) and larger buffers that improve sequential read/write performance. This is the reason why Western Digital's 10,000 RPM Raptor can barely outperform the best 7200 RPM drives today.

Times are changing, however, and while a single desktop application may be sequential in nature, running two different desktop applications simultaneously changes the dynamics considerably. With Hyper-Threading and multi-core processors on the horizon, we can expect desktop hard disk access patterns to begin to resemble those of servers, with more random accesses. It is in these true multitasking and multithreading environments that technologies such as NCQ can improve performance.
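Why multitasking randomizes access patterns is easy to illustrate with a toy sketch (the block addresses are invented for the example): two applications each read perfectly sequentially, but because their requests reach the drive interleaved, the combined stream jumps back and forth across the platter.

```python
# Two hypothetical applications, each reading sequentially from its own
# region of the disk (LBAs are made up for illustration).
app_a = [100, 101, 102, 103]
app_b = [900, 901, 902, 903]

# Running both at once, the drive sees their requests interleaved.
interleaved = [lba for pair in zip(app_a, app_b) for lba in pair]
print(interleaved)  # [100, 900, 101, 901, 102, 902, 103, 903]
```

Each stream on its own needs almost no seeking; interleaved, every other request is a long seek between the two regions, which is exactly the kind of pattern command reordering can help with.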

38 Comments

  • dmxlite - Saturday, June 26, 2004 - link

    The Raptor has TCQ, which is basically the same, but in the Raptor's case, it is actually an ATA implementation. TCQ has been around for a while, originally from SCSI. The problem with the Raptor is that it is not a 'real' SATA drive. It is an ATA drive with a SATA bridge, which at the time only offers the benefits of the smaller SATA cabling and such.
    To get a better impression of Raptor vs. MaXLine III, you'll have to use a controller that supports both, like the Promise FastTrak TX4200. Here is a good article on TCQ and its advantages (using the Raptor):
    http://www.storagereview.com/articles/200406/20040...
  • skiboysteve - Saturday, June 26, 2004 - link

    anand, the raptor has NCQ....
  • PrinceGaz - Saturday, June 26, 2004 - link

    Does using NCQ increase disk-related CPU load?
  • TrogdorJW - Saturday, June 26, 2004 - link

    I'm more in agreement with Operandi on this review: Sure, there are instances where NCQ hurts performance (although we need to remember that this is just one implementation of NCQ on the SATA chips: Intel's), but overall, I would easily take one of these drives over a Raptor. Performance is pretty close, but three or four times the capacity for most likely the same price? Come on, who wouldn't take the increased storage?

    I'm just a little more than surprised that the largest 10,000+ RPM drives are still stuck at such low densities. You can only find a couple of 147 GB 10,000 RPM SCSI drives, and no 15,000 RPM drives that I'm aware of hold more than 74 GB. Give me a 10,000 RPM drive with 100 GB platters, and then we're talking! (Wishful thinking for a few more years, I'd imagine.)

    Anyway, the benefit of NCQ is going to depend largely on the bottleneck. Some things are CPU limited, in which case it won't help, at least not yet. Compression is a great example of this. If you had a multi-core or SMP setup and the compression was single-threaded, then NCQ could help out more. Other tasks are going to be limited by the sustained transfer, i.e. file copying. However, if you multitask file copying by, for example, copying two large files between drives at the same time, then sustained transfer rate is only part of the bottleneck, with access time being the other part. NCQ is designed to help improve access times, but even in the best case scenarios, it's not going to be tons faster.

    IIRC, technologies like NCQ (which have been in SCSI for ages, via the split-transaction bus) usually don't hurt performance much and can periodically improve performance a lot. I don't know if this has changed much, but I remember seeing my boss start four or five applications at the same time on a SCSI workstation, and they loaded TONS faster than those same applications loaded on my non-SCSI system.

    I've become so used to the bottlenecks in IDE systems that I usually avoid trying to launch multiple applications at the same time. I'd be interested to see some benchmarks in that area, i.e. launching Photoshop, Word, Excel, and Internet Explorer at the same time with and without NCQ. Yeah, it's a short-term bottleneck, but at times such delays are irritating.
  • FacelessNobody - Friday, June 25, 2004 - link

    So based on what I've read, NCQ (and HyperThreading, for that matter) are technologies aimed at improving computer performance in specific situations. I suppose in discussion they sound short-sighted, helping only a few people some of the time. However, it wasn't long ago that 3D accelerators were the same way. The idea could still tank, computers may not go the way foreseen in Intel's crystal ball, so it's a leap of faith developing technology like this, and that's why Intel tries so hard to push things like PCI Express and BTX.

    I'm a bottom line person. Like Pariah (whose criticism has been excellent), I'd like a definitive statement telling me what improves by how much and how it affects me. The more I look into it though, it doesn't seem a thing like NCQ can give me that. My computer use (gaming, word processing, music) is affected more by raw power than anything else.
  • Anand Lal Shimpi - Friday, June 25, 2004 - link

    Pariah

    Throughout the article we stated whenever the performance impact due to NCQ was positive or negative, as well as whether or not it was negligible.

    A 10% performance improvement is often our rule of thumb for perceivable performance impact. A number of companies have done studies that basically state a 10% impact is noticeable to the end user and we tend to agree.

    Your example of a 30 -> 33 fps performance increase is a valid one, but if we're talking about measuring average frame rates I'd argue that a 10% increase in average frame rate could be very noticeable depending on the rest of the frame rates over the test period. Remember that with our Winstone tests we are measuring the time it takes to complete a task, and a 10% improvement in time savings is nothing to cast aside.

    Performance improvement down the road due to NCQ is far from a guess in my opinion. The fact that Intel has integrated support for it into their new ICH6 and given how hard they have been pushing for the type of multithreaded/multitasking usage environments for the future should lend decent credibility to Intel seeing NCQ as a worthwhile feature. Again, look at the direction that even AMD is going when it comes to CPU design - the focus continues to be on multithreading/multitasking usage environments which, as long as you are doing so across more than one application, result in much more "server-like" disk access patterns than what we have currently.

    As the PC architecture moves into the next phase of parallelism (moving beyond instruction level parallelism to thread/application level parallelism) it's inevitable that hard drive usage patterns do get a bit more random. Definitely not as much as servers, but definitely more than they are now.

    Again I'll state that an improvement of 10% in a very realistic test is something that I'm sure any user would gladly have for free with their hard drive. It's very much like Hyper Threading; the performance differences are sometimes slightly negative or nothing at all, but in some very real world (albeit hard to benchmark) scenarios the performance improvement is undeniably evident.

    Take care,
    Anand
  • nourdmrolNMT1 - Friday, June 25, 2004 - link

    Pariah-

    there is a key word in the conclusion; it is "potential" - read it, 6th line, 8th word. A lot of the conclusion is based around the future and what NCQ can do in the future. It says that in today's world you get a slight increase, but as computers become more multithreaded, NCQ will increase performance.

    p.s. don't go getting your panties in a wad over an article on an HD.

    MIKE
  • Pariah - Friday, June 25, 2004 - link

    "In terms of our excitement about NCQ, the conclusion never stated that NCQ increased performance tremendously across the board."

    It also never stated that in 15 out of the 16 real-world tests you ran, the results ranged anywhere from a 1% improvement to a 4.9% reduction in performance, with the majority of tests showing a decrease. That wouldn't seem to be a minor point to just gloss over and ignore in the conclusion, don't you think?

    "But also remember that we only had three heavy-multitasking benchmarks, and the performance boost we saw in one of them (a very common scenario, who doesn't copy a file and browse the net or check email?) was nothing short of outstanding."

    10.2% is considered outstanding? Most people would consider a 10% improvement to be about the bare minimum necessary to invoke a perceptible difference in speed for hard drives. I wouldn't call the performance of a card getting 33fps in Far Cry vs a card getting 30fps (the same 10% difference) to be "outstanding."

    That's not to say that NCQ cannot actually provide outstanding performance. It most certainly can. Storage Review's server benchmarks show comparable 10k SCSI drives beating the Raptor by 30%+ and 15k drives just crushing it while only running neck and neck in workstation marks. But the fact remains, that those server benchmarks in no way mirror what us "ordinary folk" can duplicate on our home systems.

    So, again, while we may be able to develop scenarios where NCQ will make a big difference, none of the tests run in this review came anywhere close to displaying those capabilities unless you consider a minimal perceptible difference in one test to be "outstanding." I would take a slightly more subdued approach in the conclusion that tells what the tests actually show and how they would benefit people today, rather than hyping something you weren't able to show and are only guessing might improve in the future.
  • Anand Lal Shimpi - Friday, June 25, 2004 - link

    broberts

    Only the game tests used two drives, all of the other tests had the OS and the applications on the same drive.

    I don't believe that a large buffer negates the advantages of NCQ, as they largely address two different problems. Large buffers help mostly with sequential transfers while NCQ is geared towards improving random access performance; if anything they are complementary.

    You bring up a very good point: the benefit of NCQ could (in theory) be better seen with more fragmentation on a drive. Unfortunately, it is difficult to construct a repeatable, real-world test involving fragmentation; I can easily do it in synthetic benchmarks, though. That being said, we shouldn't need NCQ to improve fragmented performance - that's what defrag programs are for :)

    Take care,
    Anand
