SAS layers in the real world

The SAS layered model is much more than just theory. Below you can see how our LSI Logic card is able to combine a SATA RAID volume with two SATA disks and a SAS RAID volume with four 15000 rpm Fujitsu SAS hard disks in the same Promise Vtrak 300JS JBOD storage rack. Don't let the word JBOD confuse you: it does not refer to the RAID level "Just a Bunch Of Disks". It means that the storage rack doesn't have any raid capabilities, and that the RAID implementation needs to be on the HBA card. Also notice the word "expander" in the screenshot below, which refers to the circuit which is basically a SAS crossbar switch and a router at the same time. We will discuss the expander's functionality in more detail.


The bandwidth between the HBA and the SAS storage racks is also much higher than similar SCSI-320 configurations. Let us take a look at the external port of our HBA card. We find an industry standard 4x wide-port SAS connector, which combines four 300 MB/s SAS cables in one "SFF-8470 4X to 4X" external SAS cable. This gives us 1.2 GB/s bandwidth to any external storage rack, four times more than what would be possible with SCSI-320.



A 4x wide port

When we combine this wide port with a storage rack that has a built-in expander, magic happens. We get the advantages of the parallel SCSI architecture and those of the Serial SATA architecture. To understand why we say that "magic happens", let us take a concrete example. The storage rack that we tried out was the Promise Vtrack J300S, which can contain up to 12 3.5 inch drives. Take a look at the picture below.


In this configuration you get a big 1.2 GB/s pipe, called a SAS wide port, to your storage rack. If we use SATA without a port multiplier, we would only be able to attach four drives: each drives needs its own cable, its own point to point connection. If we use SATA with a port multiplier, we are able to use all 12 drives, but our maximum bandwidth to our HBA would be limited to 300 MB/s. This is OK for transaction based traffic, but it might introduce a bottleneck for streaming applications.

With SCSI we would be able to use up to 14 drives without a port multiplier, but we would have to add another SCSI HBA card if we ran out of space with our 14 SCSI drives. As hot swap PCI slots are very rare and expensive, this could mean we would have to take the server and storage down for some time.

Thanks to the built-in expander of the Vtrack J300s, not only can we address 12 drives with only 4 point to point connections, but we can also use up to four cascaded storage racks. So in this case, Promise has limited the SAS routing and SAS traffic distribution to 48 (4x 12) drives. In theory SAS can support up to 128 devices, but in practice HBA is limited to about 122 drives.

In other words SAS combines all the advantages of SATA and SCSI, without inheriting any of the disadvantages:
  • You can use up to 122 drives instead of 14 (SCSI)
  • You do not have to use a cable for each drive (SATA-1)
  • Thanks to wide ports, the bandwidth of several channels can be combined into one big multiplexed pipe
  • Thanks to the serial signalling of SAS, the bandwidth of wide ports will double in 2008 (4x600 MB/s instead of 4x 300 MB/s)
  • You only need one SAS cable to attach external storage
  • You can use cheaper SATA and fast SAS drives in the same storage rack
It is now crystal clear why SAS will completely replace SCSI-320 and why SAS is pretty popular among the drive manufacturers. Seagate, Fujitsu-Siemens and Hitachi have entered the SAS drive market with new SAS drives. Western Digital is the exception and has no SAS disk plans at all, but that doesn't mean they don't see a future for SAS. Western Digital views SAS racks as an ecosystem, a breeding place for their Raptors (10000 RPM Enterprise SATA disks). If it was up to Western Digital, SAS would only exist as cables and storage racks, filled with their Raptor disks.

Serial Attached SCSI or SAS has been available since late 2005 and is a logical evolution of the old parallel SCSI-320. However, it is quite revolutionary as it offers in some ways functionality that was only available with high end fibre channel storage.

Parallel SCSI in trouble Enterprise SATA
POST A COMMENT

21 Comments

View All Comments

  • dickrpm - Saturday, October 21, 2006 - link

    I have a big pile of "Storage Servers" in my basement that function as a audio, video and data server. I have used PATA, SATA and SCSI 320 (in that order) to achieve necessary reliability. Put another way, when I started using enterprise class hardware, I quit having to worry (as much) about data loss. Reply
  • ATWindsor - Friday, October 20, 2006 - link

    What happens if you encounter a unrecovrable read error when you rebuid a raid5-array? (after a disk has failed) Is the whole array unusable, or do you only loose the file using the sector which can't be read?

    AtW
    Reply
  • nah - Friday, October 20, 2006 - link

    actually the cost of the original RAMAC was USD 35,000 per year to lease---IBM did not sell them outright in those days, and the size was roughly 4.9 MB. Reply
  • nah - Friday, October 20, 2006 - link

    actually the cost of the original RAMAC was USD 35,000 per year to lease---IBM did not sell them outright in those days, and the size was roughly 4.9 MB. Reply
  • yyrkoon - Friday, October 20, 2006 - link

    It's nice to see that someone finally did an article that had information about SATA port multipliers (these devices have been around for around 2 years, and no one seems to know about them), but since I have no direct hands on experience, I feel the article concerning these was a bit skimpy.

    Also, while I see you're talking about iSCSI (I think some call it SCSI over IP ?) in the comments section here, I'm a bit interrested as to why I didnt see it mentioned in the article.

    I plan on getting my own SATA port multiplier eventually, and I have a pretty good idea how well they would work under the right circumstances, with the right hardware, but since I do not work in a data center (or some such profession), the likelyhood of me getting my hands on a SAS, iSCSI, FC, etc rack/system is un-likely. What I'm trying to say here, is that I think you guys could go a good bit deeper into detail with each technology, and let each reader decide if the cost of product x is worth it for whatever they want to do. In the last several months (close to two years) I've been doing alot of research in this area, and still find some of these technologies a bit confusing. iSCSI for example, the only documention I could find on the subject (around 6 months ago) was some sort of technical document, written by Microsoft that I found very hard time digesting. Since then, I've only seen (going from memory) white papers from companies like HP pushing thier own specific products, and I dont care about thier product in particular, I care about the technology, and COULD be interrested in building my own 'system' some day.

    What I am about to say next, I do not mean as an insult in ANY shape or form, however I think when you guys write articles on such subjects, that you NEED to go into more detail. Motherboards are one thing, hard drives, whatever, but when you get into technology that isnt very common(outside of enterprise solutions) such as SAS, iSCSI, etc, I think you're actualy doing your readers a dis-service by showing a flow chart or two, and briefly describing the technology. NAS, SAN, etc have all been done to death, but I think if you look around, you will find that a good article on ATLEAST iSCSI, how it works, and how to implement it, would be very hard to find(without buying a prebuilt solution from a company). Anyhow (again) I think I've beat this horse to death, you get my drift by now im sure ;)
    Reply
  • photoguy99 - Thursday, October 19, 2006 - link

    Great article, well worth it for AT to have this content.

    Can't wait for part 2 -
    Reply
  • ceefka - Thursday, October 19, 2006 - link

    Can we also expect a breakdown and benchmarking on network storage solutions for the home and small office? Reply
  • LoneWolf15 - Thursday, October 19, 2006 - link

    Great article. It addressed points that I not only didn't think of, but that were far more useful to me than just baseline performance.

    It seems to me that for the moderately-sized business (or "enterprise-on-a-budget" role, such as K-12 education) that enterprise-level SATA such as Caviar RE drives in RAID-5, plus solid server backups (which should be done anyways) make more sense cost-wise than SAS. Sure, the risk for error is a bit higher, but that is why no systems/network administrator in their right minds would rely on RAID-5 alone to keep data secure.

    I hope that Anandtech will do a similarly comprehensive article about backup for large storage someday, including multiple methods and software options. All this storage is great, but once you have it, data integrity (especially now that server backups can be hundreds of gigabytes or more) cannot be stressed enough.

    P.S. It's one of the reasons I can't wait until we have enough storage that I can enable Shadow Copy on our Win2k3 boxes. Just one more method on top of the existing structure.
    Reply
  • Olaf van der Spek - Thursday, October 19, 2006 - link

    quote:

    the command decoding and translating can take up to 1 ms.


    Why does this simple (non-mechanical) operation take so long?
    Reply
  • Fantec - Thursday, October 19, 2006 - link

    Working for an ISP, we started to use PATA/SATA a few years ago. We still use SCSI, FC & PATA/SATA depending on our needs. SATA is the first choice when we may have redundant data (and, in this case, disks are setup in JBOD (standalone) for performances issues). At the opposite, FC is only used for NFS filers (mostly used for mail storage, where average file size is a few KB).
    Between both, we are looking at needed storage size & IO load to make up our mind. Even for huge IO loads but only when requested block size is big enough, SATA behaves quite well.

    Nonetheless, something bugs me in your article on Seagate test. I manage a cluster of servers whose total throughoutput is around 110 TB a day (using around 2400 SATA disks). With Seagate figure (an Unrecoverable Error every 12.5 terabytes written or read), I would get 10 Unrecoverable Errors every day. Which, as far as I know, is far away from what I may see (a very few per week/month).
    Reply

Log in

Don't have an account? Sign up now