Hard drives

One of the questions I hear most often is 'what's the most reliable hard drive?' The answer is straightforward: the one that's backed up frequently. Home file servers can be backed up to a variety of targets, from external hard drives to cloud storage. As a general guideline, RAID can improve performance or availability, but it is not a backup solution. Some RAID configurations (such as RAID 1) provide redundancy and so increase reliability, while others (such as RAID 0) actually decrease it, because the failure of any one drive destroys the whole array. A detailed discussion of the different kinds of disk arrays is outside the scope of this guide, but the Wikipedia page on RAID is a good place to start your research if you're unfamiliar with the technology.
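To put rough numbers on the RAID 0 versus RAID 1 point, here is a minimal Python sketch. It assumes drive failures are independent and uses a hypothetical 5% chance that any one drive fails in a given year; that figure is an illustration, not a measurement, and the function names are made up for this example. Real arrays also face correlated failures and rebuild windows, which this ignores.

```python
def raid0_loss_probability(p: float, n: int) -> float:
    """Striping: data is lost if ANY of the n drives fails."""
    return 1 - (1 - p) ** n


def raid1_loss_probability(p: float, n: int) -> float:
    """Mirroring: data is lost only if ALL n drives fail."""
    return p ** n


# Hypothetical per-drive annual failure probability (an assumption, not data).
P_DRIVE = 0.05

scenarios = [
    ("single drive", P_DRIVE),
    ("RAID 0, 2 drives", raid0_loss_probability(P_DRIVE, 2)),
    ("RAID 1, 2 drives", raid1_loss_probability(P_DRIVE, 2)),
]

for label, probability in scenarios:
    print(f"{label:18s} {probability:.4f}")
```

Under those assumptions, a two-drive RAID 0 array is nearly twice as likely to lose data as a single drive, while a two-drive RAID 1 mirror is far less likely to. Neither protects against accidental deletion, which is why RAID is not a backup.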

As for hard drive reliability, every hard drive can fail. While some models are more likely to fail than others, there are no authoritative studies that combine controlled conditions with large sample sizes. Most builders have preferences, but anecdotes do not add up to data. Many variables affect a drive's long-term reliability: shipping conditions, PSU quality, temperature patterns, and of course, the quality of the specific make and model. Unfortunately, as consumers we have little control over shipping and handling conditions until we get a drive in our own hands. We also generally don't have much insight into a specific hard drive model's quality, or even a manufacturer's general quality. However, we can control PSU quality and temperature patterns, and we can use S.M.A.R.T. monitoring tools to keep an eye on drive health.

One of the most useful studies on hard drive reliability was presented by Pinheiro, Weber, and Barroso at the 2007 USENIX Conference on File and Storage Technologies. Their paper, 'Failure Trends in a Large Disk Drive Population', relied on data gleaned from Google. So while the controls are not perfect, the sample size is enormous, and it's about as informative as any research on disk reliability. The PDF is widely available on the web and is definitely worth a read if you haven't already seen it and you have the time (it's short at only 12 pages, with many graphs and figures). In sum, they found that SMART errors are generally indicative of impending failure, especially scan errors, reallocation counts, offline reallocation counts, and probational counts. The take-home message: if one of your drives reports a SMART error, you should probably replace it, sending it in under warranty if it's still covered. If one of your drives reports multiple SMART errors, you should almost certainly replace it as soon as possible.

From Pinheiro, Weber, and Barroso 2007.  Of all failed HDDs, more than 60% had reported a SMART error. 
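The take-home message above can be written as a simple decision rule. The sketch below is only an illustration: the counter names are hypothetical labels for the four signals the study highlighted, not official SMART attribute names, and in practice the values would have to be read out with a SMART monitoring tool and mapped by hand.

```python
from typing import Mapping

# The four signals the study singled out. These key names are hypothetical
# labels chosen for this example, not official SMART attribute names.
STRONG_SIGNALS = (
    "scan_errors",
    "reallocation_count",
    "offline_reallocation_count",
    "probational_count",
)


def smart_advice(counters: Mapping[str, int]) -> str:
    """Turn raw counter values into the article's rule of thumb."""
    nonzero = [name for name in STRONG_SIGNALS if counters.get(name, 0) > 0]
    if len(nonzero) >= 2:
        return "replace as soon as possible"
    if len(nonzero) == 1:
        return "probably replace"
    return "no strong warning signs"


# Made-up sample readings for two drives.
print(smart_advice({"scan_errors": 0, "reallocation_count": 3}))
print(smart_advice({"scan_errors": 2, "probational_count": 1}))
```

Keep in mind that a clean report is not a guarantee: per the figure above, a sizable share of failed drives never reported a SMART error.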

Pinheiro, Weber, and Barroso also showed how temperature affects failure rates. They found that drives operating at low temperatures (i.e. below 75F/24C) actually had by far the highest failure rates, higher even than drives operating at 125F/52C. This is likely irrelevant to many readers, but for those of us who live farther north and like to keep our homes below 70F/21C in the winter, it's an important reminder that colder is not always better for computer hardware. Of use to everyone, the study showed that reliability peaks at around 104F/40C, within a band from about 95F/35C to 113F/45C.

From Pinheiro, Weber, and Barroso 2007.  AFR: Annualized Failure Rate - higher is worse!

Because hard drives appear to be most reliable within a fairly specific temperature range, it might take some experimentation with the drive layout in any given case to keep a home file server's hard drives within it.
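If you log drive temperatures while experimenting, a small helper can flag readings that fall outside the band quoted above. This is a sketch under stated assumptions: the 35C to 45C limits are simply the article's summary of the study, and the function names and sample readings are made up for illustration.

```python
# Band quoted in the text above (about 95F/35C to 113F/45C).
IDEAL_MIN_C = 35.0
IDEAL_MAX_C = 45.0


def fahrenheit_to_celsius(fahrenheit: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (fahrenheit - 32) * 5 / 9


def classify_drive_temp(celsius: float) -> str:
    """Say whether a drive temperature is below, within, or above the band."""
    if celsius < IDEAL_MIN_C:
        return "below range"
    if celsius > IDEAL_MAX_C:
        return "above range"
    return "in range"


# Hypothetical readings in Celsius, e.g. one per drive bay.
for reading in (24.0, 40.0, 52.0):
    print(f"{reading:.0f}C: {classify_drive_temp(reading)}")
```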

So rather than naming specific hard drive models as the most reliable, we recommend you do everything you can to prevent catastrophic failure: use a quality PSU, maintain optimal temperatures, and pay attention to SMART utilities. With a sample as small as the handful of drives in a home file server, the most important factor in long-term HDD reliability is probably luck.

Pragmatically, low-rpm 'green' drives are the most cost-effective storage drives.  Note that many of the low-rpm drives are not designed to operate in a RAID configuration - be sure to research specific models.  The largest drives currently available are 3TB, which can now be found for as little as $110.  The second-largest capacity drives at 2TB generally offer the best $/GB ratio, and can regularly be found for $70 (and less when on sale or after rebate).  1TB drives are fine if you don't need much space, and can sometimes be found for as little as $40.
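The $/GB comparison is easy to check with the prices quoted above. The short Python sketch below assumes 1TB = 1000GB and uses only the low-end street prices mentioned in the text; actual prices will vary.

```python
# (capacity in GB, price in USD), taken from the prices quoted above.
DRIVES = {
    "3TB": (3000, 110.0),
    "2TB": (2000, 70.0),
    "1TB": (1000, 40.0),
}


def dollars_per_gb(capacity_gb: int, price_usd: float) -> float:
    """Cost of one gigabyte of raw capacity."""
    return price_usd / capacity_gb


ratios = {name: dollars_per_gb(*spec) for name, spec in DRIVES.items()}
best_value = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: ${ratio:.4f}/GB")
print(f"best value: {best_value}")
```

At these prices the 2TB drive comes out cheapest per gigabyte, at 3.5 cents, with the 3TB drive close behind and the 1TB drive at 4 cents.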

152 Comments

  • EnzoFX - Sunday, September 4, 2011

    What about something for a mostly Mac environment with the occasional Windows system?
  • DesktopMan - Sunday, September 4, 2011

    Anyone looking for a setup with many HDDs should take a look at port multipliers. You can get external cases for the HDDs, which saves your PC from a lot of heat. Connect using one eSATA cable per 5 drives and you don't need many ports either. (Just make sure the ports support port multipliers.)
  • mongo lloyd - Sunday, September 4, 2011

    "It is impossible to hear active HDDs inside this case even when you're sitting just a few feet from it (even the notoriously loud VelociRaptors)."

    Maybe if you're half-deaf. And if you put 10 of them in that case, even your 97-year-old grandma would hear them.

    Don't be fooled, a 5-10 disk cabinet will be rather loud and you probably don't want it near your person if you are sensitive to noise. No way around that.
  • Rick83 - Sunday, September 4, 2011

    Just skimming the CPU page, I see a problem: none of these CPUs accelerate encryption in hardware.
    Encrypting your hard disks should be standard procedure, especially on a file server. You never know what someone may use against you, and in the case of a disk failure, you won't have to worry about sending in a disk with readable data on it.
    Without hardware acceleration, though, gigabit Ethernet may not end up being saturated, especially on the truly low-end Zacates, and forget about Atom...

    My recommendation is the Sandy Bridge i5-2390T. It should be trivial to cool passively if there are enough fans to keep the chassis below 50°C.
    Alternatively, there's VIA, but those Nanos are somewhat harder to obtain in the retail channel (and even the 2390T is pretty hard to get).
    And finally, something not touched on: first-gen Core i5 CPUs on old socket 1156 boards. As those go EOL, good deals can be had, and there's no clear power savings advantage in Sandy Bridge.

    RAM-wise, 2GB is a huuuuge amount for a slim OS.
    I'm running 2GB on my machine and never hit swap -ever-, even though I am running Gentoo, compiling my own kernels, and running multiple other services besides Samba and NFS.

    Finally, on boards: with a good deal you can get those SATA ports on the board on the cheap. I paid only 150 euro for my P55-UD5 last year, as it was going EOL. The bonus is that you generally get a better-featured board as well, so I also have IEEE 1394 and dual LAN.
  • DanNeely - Sunday, September 4, 2011

    Encryption is also one more factor to make things harder when everything goes wrong at once and you're trying to recover data. I'd rather write off the cost of the disks if they fail than impair my catastrophe recovery options.
  • Rick83 - Monday, September 5, 2011

    That's why you have backups.
  • DanNeely - Monday, September 5, 2011

    If your backup is still intact, not everything has gone wrong yet...
  • Rick83 - Monday, September 5, 2011

    That's the point of the backup: so that it's impossible for everything to go wrong.
    If everything goes wrong, there's usually a pretty big design flaw somewhere.
    The impact of encryption is relatively negligible, except for the performance impact.
  • don_k - Sunday, September 4, 2011

    That is why the article recommends a quality PSU. The enterprise space has 48-disk monstrosities; a 10- or even 20-drive home file server is perfectly possible. Get a good PSU.
  • chbarg - Sunday, September 4, 2011

    Excellent comment.

    Recently I built a Windows Server 2008 machine with an AMD processor without hardware encryption support, and I was surprised by TrueCrypt's high CPU utilization. In contrast, my laptop has an Intel CPU with hardware encryption support, and TrueCrypt barely loads the CPU.

    I use LUKS encryption on my file server at home for safety.

    Regards,
