Server Guide part 2: Affordable and Manageable Storage
Date: October 18th, 2006
Topic: IT Computing
Manufacturer: Various
Author: Johan De Gelas

Introduction

The first magnetic disk was introduced by IBM in the 305 RAMAC computer on September 13th, 1956. The first disk drive was the size of two large refrigerators, could hold 4.4 MB, and cost $10,000 per MB. Although the capacity of the hard disk has exploded and price per GB has decreased spectacularly, the price of a complete enterprise storage solution can still quickly amount to tens of thousands of dollars and more.

While building a complete server solution for our own server lab, we quickly found out that finding the best storage solution is pretty hard when you are on a tight budget. As usual, the companies active in this market are not helping. Minor evolutions are called "Breakthrough Architectures", "Affordability" means "not too expensive unless you need more than two drive bays filled", and "Business Intelligence" or "Investment Protection" just means that the marketing people were running out of buzzwords and inspiration. In fact, the storage companies do their best to confuse people by calling both a simple SCSI DAS and a very expensive Fibre Channel SAN "scalable", "flexible", "affordable" and "serviceable".

The seasoned storage veteran quickly weeds out all the fluffy buzzwords, but what if you are relatively new to this market? What if your own experience with storage has been limited to adding disks to your trusty old tower server or the workstations of your colleagues? Welcome to the second part of our server guide! Just like our first guide, our goal is to offer you a no-nonsense introduction to the server room, and in this particular server guide we focus on storage performance and different disk interfaces.


Disk performance?

Before we start discussing the different topologies and technologies in the storage world, it is good to get back to basics. The basic component of 99.9% of the storage technology out there is still the hard disk.

To understand the basic performance of a disk, take a look at what happens when a request is sent to the disk:
  1. The disk controller translates a logical address into a physical address (cylinder, track, and sector). Sending the request takes a few tens of nanoseconds; decoding and translating the command can take up to 1 ms.
  2. The actuator moves the head to the correct track. This is called seek time; the average seek time is somewhere between 3.5 and 10 ms.
  3. The rotational motor makes sure that the correct sector is located under the head. This is called rotational latency, and on average it takes from 5.6 ms (5400 rpm) to 2 ms (15000 rpm). Rotational latency is thus determined by how fast the rotational motor spins.
  4. The data is then read or written. The time this takes depends on how many sectors the disk has to read or write. The rate at which data is accessed is called the media transfer rate (MTR).
  5. If data is read, it goes into the disk buffer and is transferred by the disk interface to the system.
Media transfer rate (MTR) depends on the rotation speed and on the density with which data is stored. The higher the density, the more data moves under the head in the same amount of time.
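
The steps above can be sketched as a simple additive model. This is only a minimal sketch, not a drive-accurate simulation: the function names and the sample overhead, seek time, and MTR values are assumptions chosen for illustration. Only the rotational latency figures follow directly from the text, since the average latency is the time needed for half a revolution.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def service_time_ms(overhead_ms, seek_ms, rpm, size_mb, mtr_mb_per_s):
    """Time for one request: overhead + seek + rotational latency + transfer."""
    transfer_ms = size_mb / mtr_mb_per_s * 1000.0
    return overhead_ms + seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms


if __name__ == "__main__":
    for rpm in (5400, 7200, 10000, 15000):
        print(f"{rpm:>5} rpm: {avg_rotational_latency_ms(rpm):.2f} ms average latency")
    # Hypothetical drive: 0.5 ms overhead, 4.5 ms seek, 10000 rpm, 70 MB/s MTR.
    print(f"4 KB request: {service_time_ms(0.5, 4.5, 10000, 4 / 1024, 70):.2f} ms")
```

For a small request the transfer term almost vanishes, so the total is dominated by the mechanical steps 2 and 3.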

Which operation will be the most important? That depends on the amount of data you read or write. If you need many small pieces of data scattered all over the disk, seek time and latency are the most important. On the other hand, if you transfer larger, contiguous pieces of data (i.e. data located in close proximity on the drive surface), the MTR will be the most important parameter.

To illustrate this, take a look at the table below. It calculates how much time it would take to transfer one block of 4 MB, similar to opening an MP3 song on a desktop PC. We also calculate the time it takes to get 100 different blocks of 4 KB, similar to what would happen if 100 users simultaneously sent a very simple query to a database server. At the end of the table we calculate the total time it takes to perform the requested actions, and the sustained transfer rate (STR), which is the amount of data divided by the total time.



The Faster SATA and SCSI disk performing a database and a typical desktop workload
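
The comparison between the two workloads can be reproduced roughly in a few lines. This is a sketch under stated assumptions: the `Drive` class, the `workload` function, and every drive parameter below are hypothetical and are not the values from the table, so the exact results will differ from the article's figures.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    overhead_ms: float   # command decoding and address translation
    seek_ms: float       # average seek time
    rpm: float           # spindle speed
    mtr_mb_per_s: float  # media transfer rate

    def latency_ms(self) -> float:
        # Average rotational latency is half a revolution.
        return 60_000.0 / self.rpm / 2.0


def workload(drive: Drive, requests: int, block_kb: float) -> dict:
    """Total time, transfer share, and STR for `requests` random accesses."""
    data_mb = requests * block_kb / 1024.0
    transfer_ms = data_mb / drive.mtr_mb_per_s * 1000.0
    positioning_ms = requests * (drive.overhead_ms + drive.seek_ms + drive.latency_ms())
    total_ms = positioning_ms + transfer_ms
    return {
        "total_ms": total_ms,
        "transfer_share": transfer_ms / total_ms,
        "str_mb_per_s": data_mb / (total_ms / 1000.0),
    }


# Hypothetical 10000 rpm drive; these figures are NOT taken from the article's table.
drive = Drive(overhead_ms=0.5, seek_ms=4.5, rpm=10_000, mtr_mb_per_s=70.0)
desktop = workload(drive, requests=1, block_kb=4096)    # one 4 MB block
database = workload(drive, requests=100, block_kb=4)    # 100 blocks of 4 KB

for name, result in (("desktop", desktop), ("database", database)):
    print(f"{name:>8}: {result['total_ms']:7.1f} ms, "
          f"transfer {result['transfer_share']:.1%}, "
          f"STR {result['str_mb_per_s']:.2f} MB/s")
```

Even with these assumed values, the database workload moves about a tenth of the data yet takes more than ten times as long, and its sustained transfer rate drops below 1 MB/s.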

Although it transfers only one tenth of the data, the database access takes almost 15 times as long. In the database case, seek time and latency account for 90-95% of the total time, while transfer time is only 1%. If we increased the block size to 16 KB, little would change: the transfer time would quadruple, but the total time would hardly increase. However, if we increase the number of blocks, or more generally the number of "I/O operations" that we access, the total time necessary to complete the action scales almost linearly: twice as many I/O operations will double the time.
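
The scaling behavior described here is easy to check numerically. The sketch below assumes a hypothetical drive that needs 8 ms of positioning time per request (overhead, seek, and rotational latency lumped together) and has a 70 MB/s MTR; both numbers and the function name are illustrative assumptions, not measurements.

```python
def total_time_ms(requests, block_kb, positioning_ms=8.0, mtr_mb_per_s=70.0):
    """Total time for `requests` random accesses of `block_kb` each.

    positioning_ms lumps together command overhead, seek, and rotational
    latency; both defaults are assumed values for a hypothetical drive.
    """
    transfer_ms = (block_kb / 1024.0) / mtr_mb_per_s * 1000.0
    return requests * (positioning_ms + transfer_ms)


baseline = total_time_ms(requests=100, block_kb=4)
bigger_blocks = total_time_ms(requests=100, block_kb=16)
more_requests = total_time_ms(requests=200, block_kb=4)

print(f"100 x 4 KB : {baseline:.1f} ms")
print(f"100 x 16 KB: {bigger_blocks:.1f} ms ({bigger_blocks / baseline:.2f}x)")
print(f"200 x 4 KB : {more_requests:.1f} ms ({more_requests / baseline:.2f}x)")
```

Quadrupling the block size adds only a few percent to the total, while doubling the number of requests doubles it.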

In our "desktop MP3" example, transfer time accounts for 85% of the total time: MB/s is the most important metric. File and FTP servers sit somewhere between the desktop and database server examples: on average, the number of KB per I/O operation is much higher than for a transactional database, but I/O operations are also requested simultaneously.

So basically, there are two ways to measure storage performance:
  1. In MB/s
  2. In I/O operations per second
Notice that in the worst case, database storage server performance can be less than 1 MB/s. Of course, smart techniques such as Native Command Queuing, read ahead buffers, Out of Order Data delivery, and smart caches can lower the impact of concurrent accesses. However, it is not uncommon for database applications to lower the STR (Sustained Transfer Rate) of very fast drives to a few MB per second.
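
The two metrics are linked by the block size, which is why a drive that looks fast in MB/s can deliver well under 1 MB/s on small random accesses. The sketch below assumes about 8 ms per random request and ignores transfer time, which is only reasonable for small blocks. The function names and the 8 ms figure are illustrative assumptions, and the model ignores queuing and caching techniques such as NCQ.

```python
def iops(service_time_ms: float) -> float:
    """I/O operations per second for one drive handling one request at a time."""
    return 1000.0 / service_time_ms


def throughput_mb_per_s(io_per_second: float, block_kb: float) -> float:
    """Sustained transfer rate implied by an I/O rate and a block size."""
    return io_per_second * block_kb / 1024.0


# Assumed: about 8 ms per random request (overhead + seek + rotational latency).
rate = iops(8.0)
for block_kb in (4, 8, 16):
    print(f"{block_kb:>2} KB blocks: {rate:.0f} IOPS = "
          f"{throughput_mb_per_s(rate, block_kb):.2f} MB/s")
```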


21 Comments - Last by Bill Todd, 1119 days ago
I'd like a "Buyer's Guide" for this topic by slashbinslashbash, 1129 days ago
I appreciate the theory and the mentioning of some specific products and the general recommendations in this article, but you started off mentioning that you were building a system for AT's own use (at the lowest reasonable cost) without fully going into exactly what you ended up using or how much it cost.

So now I know something about SAS, SATA, and other technologies, but I have no idea what it will actually cost me to get (say) 1TB of highly-reliable storage suitable for use in a demanding database environment. I would love to see a line-item breakdown of the system that you ended up buying, along with prices and links to stores where I can buy everything. I'm talking about the cables, cards, drives, enclosures, backplanes, port multipliers, everything.

Of course my needs aren't the same as AnandTech's needs, but I just need to get an idea of what a typical "total solution" costs and then scale it to my needs. Also it'd be cool to have a price/performance comparison with vendor solutions like Apple, Sun, HP, Dell, etc.

RE: I'd like a "Buyer's Guide" for this topic by JohanAnandtech, 1128 days ago
Definitely... When I started writing this series, I began thinking about the questions I was asking myself years ago. For starters, what is this weird I/O per second benchmarking? If you are coming from the workstation world, you expect all storage benchmarks to be in MB/s and ms.

Secondly, one has to know the interfaces available. The features of SAS, for example, could make you decide to go for a simple DAS instead of an expensive SAN. Not always, but in some cases. So I had to make sure that before I start talking about iSCSI, FC SAN, DAS that can be turned into a SAN, etc., all my readers know what SAS is all about.

So I hope to address the things you brought up in the second storage article.

RE: I'd like a "Buyer's Guide" for this topic by slashbinslashbash, 1128 days ago
Sounds great, thanks. If possible it'd be great to see full schematics of the setup, pics of everything, etc. This is obviously outside the realm of your "everyday PC" stuff where we all know what's going on. I administer 6 servers at a colo facility and our servers (like 90% of the other servers that I see) are basically PC hardware stuck in a rackmount box (and a lot of the small-shop webhosting companies at the colo facility use plain towers! In the rack across from ours, there are 4 Shuttle XPC's! Unbelievable!).

We use workstation motherboards with ECC RAM, Raptor drives, etc. but still it's basically just a PC. These external enclosures, SAS, etc. are a whole new realm. I know that it'd be better than the ad-hoc storage situation we have now, but I'm kind of scared because I don't know how it works and I don't know how much it would cost. So now I know more about how it works, but the cost is still scary. ;)

I guess the last thing I'd want to know is the OS support situation. Linux support is obviously crucial.

how about using enterprise class drives on modest load systems? by BikeDude, 1127 days ago
What if you face a bunch of servers with modest disk I/O that require high availability? We typically use SATA drives in RAID-1 configurations, but I've seen some disturbing issues with the onboard SATA RAID controller on a SuperMicro server which leads me to believe that SCSI is the right way to go for us. (the issue was that the original Adaptec driver caused Windows to eventually freeze given a certain workload pattern -- I've also seen mirrors that refuse to rebuild after replacing a drive; we've now stopped buying Maxtor SATA drives completely)

More to the point: Seagate has shown that massive amounts of I/O require enterprise class drives, but do they say anything about how enterprise class drives behave with a modest desktop-type load? (I wish the article linked directly to the document on Seagate's site; instead it links to a PowerPoint presentation hosted by Microsoft?)

Table Correction by stelleg151, 1129 days ago
In the table, the Cheetah decodes 1000 blocks of 4 KB faster than the Raptor decodes 100 blocks of 4 KB. Guessing this is a typo. Liked the article.

RE: Table Correction by JarredWalton, 1128 days ago
Yeah, I notified Johan of the error but figured it wasn't a big enough problem to hold back releasing the article. I guess I can Photoshop the image myself... I probably should have just done that, but I was thinking it would be more difficult than it is. The error is corrected now.

Can we go further here? by Sunrise089, 1128 days ago
I liked this story, but I finished feeling informed but not satisfied. I love AT's focus on real-world performance, so I think an excellent addition would be more info into actually building a storage system, or at least some sort of a buyers guide to let us know how the tech theory translates over to the marketplace. The best idea would be a tour of AT's own equipment and a discussion of why it was chosen.

RE: Can we go further here? by JohanAnandtech, 1128 days ago
If you are feeling informed and not satisfied, we have reached our goal :-). The next article will go through the more complex stuff: when do I use NAS, when do I use DAS and SAN? What about iSCSI, and so on. We are also working on having different storage solutions in our lab.

SCSI/SATA/FC by Fantec, 1128 days ago
Working for an ISP, we started to use PATA/SATA a few years ago. We still use SCSI, FC & PATA/SATA depending on our needs. SATA is the first choice when we may have redundant data (and, in this case, disks are set up as JBOD (standalone) for performance reasons). At the other end, FC is only used for NFS filers (mostly used for mail storage, where the average file size is a few KB).
In between, we look at the needed storage size & I/O load to make up our mind. Even for huge I/O loads, but only when the requested block size is big enough, SATA behaves quite well.

Nonetheless, something bugs me in your article about the Seagate test. I manage a cluster of servers whose total throughput is around 110 TB a day (using around 2400 SATA disks). With Seagate's figure (an Unrecoverable Error every 12.5 terabytes written or read), I would get around 10 Unrecoverable Errors every day. Which, as far as I know, is far from what I actually see (a very few per week/month).

Error rate by JohanAnandtech, 1128 days ago
"Nonetheless, something bugs me in your article about the Seagate test. I manage a cluster of servers whose total throughput is around 110 TB a day (using around 2400 SATA disks). With Seagate's figure (an Unrecoverable Error every 12.5 terabytes written or read), I would get around 10 Unrecoverable Errors every day. Which, as far as I know, is far from what I actually see (a very few per week/month)."

1. The EUR number is worst case, so the 10 Unrec errors you expect to see are really the worst situation that you would get.
2. Cached reads are not included, as they do not access the magnetic media. So if on average the servers are able to cache rather well, the disks are probably seeing half of that throughput.

And it also depends on how you measured that. Is that throughput on your network or is that really measured like bi/bo of Vmstat or another tool?


Copyright © 1997-2009 AnandTech, Inc. All rights reserved. Terms, Conditions and Privacy Information.