Enterprise Disks: all about SCSI

There are currently only two kinds of hard disks: those which work with SCSI commands and those which work with (S)ATA commands. However, those SCSI commands can be sent over three disk interfaces:
  • SCSI-320 or 16 bit Parallel SCSI
  • SAS or Serial Attached SCSI
  • FC or Fibre Channel
Fibre Channel is much more than just an interface; it could be described as a complete network protocol like TCP/IP. However, as we are focusing on the disks right now, we will treat it as an interface through which SCSI commands are sent. Right now, Fibre Channel disk sales amount to about 20% of the enterprise market, mostly at the high end. SCSI-320 used to have about 70% of this market, but it is quickly being replaced by SAS[1]. Some vendors estimate that SAS drives already account for about 40% of the enterprise market. The share of enterprise drives that are SATA drives is less certain, but it is probably around 20%.


Will SATA kill off SCSI?

One look at the price of a typical "enterprise disk" -- whether it be a SCSI, FC or SAS disk -- will tell you that you have to pay at least 5 times and up to 10 times more per GB. Look at the specification sheets and you will see that the advantage you get for paying this enormous price premium seems to be only a 2 to 2.5 times lower access time (seek + latency) and a maximum transfer rate that is perhaps 20 to 50% better.

In the past, the enormous price difference between the disks which use ATA commands and the disks which use SCSI commands could easily be explained by the fact that a PATA disk would simply choke when you sent it a lot of random concurrent requests. As quite a few reviews here at AnandTech have shown, thanks to Native Command Queuing (NCQ) the current SATA drives handle enterprise workloads quite well. The number of concurrent I/O operations per second is easily increased by 50% thanks to NCQ. So while PATA disks were simply pathetically slow, the current SATA disks are, in the worst case, about half as fast as their SCSI counterparts when it comes to typical I/O intensive file serving.

There is more. The few roadblocks that kept SATA out of the enterprise world have also been cleared. One of the biggest problems was the point to point nature of SATA: each disk needs its own cable to the controller. This results in a lot of cable clutter, which made SATA-I undesirable for enterprise servers or storage rack enclosures.

This roadblock can be removed in two ways. The first way is to use a backplane with SATA port multipliers. A port multiplier can be compared to a switch: one host to SATA connection is multiplexed to multiple SATA connectors. At most 15 disks can make use of one SATA point to point connection. In reality, port multipliers connect 4 to 8 disks per port. As the best SATA disks are only able to sustain about 60 to 80 MB/s in the outer zones, a four disk port multiplier makes sense even for streaming applications. For more random applications, even 8 or 15 disks on one 300 MB/s SATA connection would not result in a bottleneck.
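
The bandwidth reasoning above can be checked with a quick back-of-the-envelope calculation. The sketch below is only an illustration: the 300 MB/s link and the 60 to 80 MB/s per-disk streaming rates come from the text, while the random I/O figures (100 operations per second per disk at 8 KB each) and all function names are assumptions chosen for the example, not measurements.

```python
# Rough estimate of how much of one shared SATA host link a port
# multiplier's disks could consume. Figures marked "assumed" are
# illustrative and do not come from the article.

SATA_LINK_MB_S = 300  # one SATA host connection, as quoted in the text


def streaming_demand(disks, per_disk_mb_s):
    """Aggregate sequential throughput (MB/s) the disks could ask for."""
    return disks * per_disk_mb_s


def random_demand(disks, iops_per_disk, block_kb):
    """Aggregate throughput (MB/s) for small random I/O, 1 MB = 1024 KB."""
    return disks * iops_per_disk * block_kb / 1024


def utilization(demand_mb_s, link_mb_s=SATA_LINK_MB_S):
    """Fraction of the shared link the demand would occupy."""
    return demand_mb_s / link_mb_s


# Streaming: four disks at the 60 and 80 MB/s figures from the text.
for per_disk in (60, 80):
    demand = streaming_demand(4, per_disk)
    print(f"4 disks x {per_disk} MB/s = {demand} MB/s "
          f"({utilization(demand):.0%} of the link)")

# Random I/O: 15 disks, assumed 100 IOPS per disk and 8 KB per request.
demand = random_demand(15, iops_per_disk=100, block_kb=8)
print(f"15 disks, random I/O = {demand:.1f} MB/s "
      f"({utilization(demand):.1%} of the link)")
```

Under these assumptions, four streaming disks land at or slightly above the capacity of the link, which is why four disks per port is a sensible ceiling for streaming work, while 15 disks doing small random requests use only a few percent of it.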


Port multipliers are mostly used on the back panel of a server or the backplane of a storage rack. The second way is to use your SATA disks in a SAS enclosure. We will discuss this later.


21 Comments


  • dickrpm - Saturday, October 21, 2006

    I have a big pile of "Storage Servers" in my basement that function as an audio, video and data server. I have used PATA, SATA and SCSI 320 (in that order) to achieve the necessary reliability. Put another way, when I started using enterprise class hardware, I quit having to worry (as much) about data loss.
  • ATWindsor - Friday, October 20, 2006

    What happens if you encounter an unrecoverable read error when you rebuild a RAID-5 array (after a disk has failed)? Is the whole array unusable, or do you only lose the file using the sector which can't be read?

    AtW
  • nah - Friday, October 20, 2006

    Actually, the cost of the original RAMAC was USD 35,000 per year to lease. IBM did not sell them outright in those days, and the size was roughly 4.9 MB.
  • yyrkoon - Friday, October 20, 2006

    It's nice to see that someone finally did an article that had information about SATA port multipliers (these devices have been around for about 2 years, and no one seems to know about them), but since I have no direct hands-on experience, I feel the article's coverage of them was a bit skimpy.

    Also, while I see you're talking about iSCSI (I think some call it SCSI over IP?) in the comments section here, I'm a bit interested as to why I didn't see it mentioned in the article.

    I plan on getting my own SATA port multiplier eventually, and I have a pretty good idea how well they would work under the right circumstances, with the right hardware, but since I do not work in a data center (or some such profession), the likelihood of me getting my hands on a SAS, iSCSI, FC, etc. rack/system is low. What I'm trying to say here is that I think you guys could go a good bit deeper into detail with each technology, and let each reader decide if the cost of product x is worth it for whatever they want to do. In the last several months (close to two years) I've been doing a lot of research in this area, and I still find some of these technologies a bit confusing. Take iSCSI, for example: the only documentation I could find on the subject (around 6 months ago) was some sort of technical document written by Microsoft that I had a very hard time digesting. Since then, I've only seen (going from memory) white papers from companies like HP pushing their own specific products, and I don't care about their product in particular. I care about the technology, and COULD be interested in building my own 'system' some day.

    What I am about to say next, I do not mean as an insult in ANY shape or form; however, I think when you guys write articles on such subjects, you NEED to go into more detail. Motherboards are one thing, hard drives, whatever, but when you get into technology that isn't very common (outside of enterprise solutions) such as SAS, iSCSI, etc., I think you're actually doing your readers a disservice by showing a flow chart or two and briefly describing the technology. NAS, SAN, etc. have all been done to death, but I think if you look around, you will find that a good article on AT LEAST iSCSI, how it works, and how to implement it, would be very hard to find (without buying a prebuilt solution from a company). Anyhow (again), I think I've beaten this horse to death; you get my drift by now, I'm sure ;)
  • photoguy99 - Thursday, October 19, 2006

    Great article, well worth it for AT to have this content.

    Can't wait for part 2.
  • ceefka - Thursday, October 19, 2006

    Can we also expect a breakdown and benchmarking on network storage solutions for the home and small office?
  • LoneWolf15 - Thursday, October 19, 2006

    Great article. It addressed points that I not only didn't think of, but that were far more useful to me than just baseline performance.

    It seems to me that for the moderately-sized business (or the "enterprise-on-a-budget" role, such as K-12 education), enterprise-level SATA such as Caviar RE drives in RAID-5, plus solid server backups (which should be done anyway), makes more sense cost-wise than SAS. Sure, the risk of error is a bit higher, but that is why no systems/network administrator in their right mind would rely on RAID-5 alone to keep data secure.

    I hope that AnandTech will do a similarly comprehensive article about backup for large storage someday, including multiple methods and software options. All this storage is great, but once you have it, data integrity (especially now that server backups can be hundreds of gigabytes or more) cannot be stressed enough.

    P.S. It's one of the reasons I can't wait until we have enough storage that I can enable Shadow Copy on our Win2k3 boxes. Just one more method on top of the existing structure.
  • Olaf van der Spek - Thursday, October 19, 2006

    quote:

    the command decoding and translating can take up to 1 ms.

    Why does this simple (non-mechanical) operation take so long?
  • Fantec - Thursday, October 19, 2006

    Working for an ISP, we started to use PATA/SATA a few years ago. We still use SCSI, FC and PATA/SATA depending on our needs. SATA is the first choice when we can have redundant data (and, in this case, disks are set up as JBOD (standalone) for performance reasons). At the other extreme, FC is only used for NFS filers (mostly used for mail storage, where the average file size is a few KB).
    In between, we look at the required storage size and I/O load to make up our minds. Even for huge I/O loads, as long as the requested block size is big enough, SATA behaves quite well.

    Nonetheless, something bugs me in your article about the Seagate test. I manage a cluster of servers whose total throughput is around 110 TB a day (using around 2400 SATA disks). With Seagate's figure (an unrecoverable error every 12.5 terabytes written or read), I would get around 9 unrecoverable errors every day. As far as I know, that is far from what I actually see (a very few per week/month).
