Hard drives

One of the questions I hear most often is 'what's the most reliable hard drive?' The answer is straightforward: the one that's backed up frequently. Home file servers can be backed up with a variety of devices, from external hard drives to cloud storage. As a general guideline, RAID can improve performance, but it is not a backup solution. Some RAID configurations (such as RAID 1) increase reliability, but others (such as RAID 0) actually decrease it, because the loss of any single drive destroys the whole array. A detailed discussion of the different kinds of disk arrays is beyond the scope of this guide, but the Wikipedia page on RAID is a good place to start your research if you're unfamiliar with the technology.
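To make the RAID 0 versus RAID 1 point concrete, here is a rough back-of-the-envelope sketch. It assumes each drive fails independently with the same annual probability p, and it ignores rebuild windows and correlated failures (shared PSU, same batch, same heat), so treat it as an illustration rather than a prediction. The 5% figure is a hypothetical value, not a number from this guide.

```python
def raid0_failure_probability(p: float, n: int) -> float:
    """Striped array: data is lost if ANY of the n drives fails."""
    return 1.0 - (1.0 - p) ** n


def raid1_failure_probability(p: float, n: int) -> float:
    """Mirrored array: data is lost only if ALL n drives fail.

    Simplification: assumes a failed drive is not replaced during the year.
    """
    return p ** n


if __name__ == "__main__":
    p = 0.05  # hypothetical annual failure probability for one drive
    print(f"single drive:   {p:.4f}")
    print(f"2-drive RAID 0: {raid0_failure_probability(p, 2):.4f}")
    print(f"2-drive RAID 1: {raid1_failure_probability(p, 2):.4f}")
```

Under these assumptions the two-drive stripe is roughly twice as likely to lose data as a single drive, while the mirror is far less likely to. Neither protects against accidental deletion, which is why backups still matter.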

As for hard drive reliability, every hard drive can fail. While some models are more likely to fail than others, no authoritative studies combine controlled conditions with large sample sizes. Most builders have preferences - but anecdotes do not add up to data. Many variables affect a drive's long-term reliability: shipping conditions, PSU quality, temperature patterns, and of course, specific make and model quality. Unfortunately, as consumers we have little control over shipping and handling conditions until we get a drive in our own hands. We also generally don't have much insight into a specific hard drive model's quality, or even a manufacturer's general quality. However, we can control PSU quality and temperature patterns, and we can use S.M.A.R.T. monitoring tools.

One of the most useful studies on hard drive reliability was presented by Pinheiro, Weber, and Barroso at the 2007 USENIX Conference on File and Storage Technologies. Their paper, 'Failure Trends in a Large Disk Drive Population', relied on data gleaned from Google. So while the controls are not perfect, the sample size is enormous, and it's about as informative as any research on disk reliability. The PDF is widely available on the web and is definitely worth a read if you've not already seen it and you have the time (it's short at only 12 pages with many graphs and figures). In sum, they found that SMART errors are generally indicative of impending failure - especially scan errors, reallocation counts, offline reallocation counts, and probational counts. The take home message: if one of your drives reports a SMART error, you should probably replace it, and send it in to the manufacturer if it's still under warranty. If one of your drives reports multiple SMART errors, you should almost certainly replace it as soon as possible.

From Pinheiro, Weber, and Barroso 2007.  Of all failed HDDs, more than 60% had reported a SMART error. 
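As a minimal sketch of what 'paying attention to SMART' can look like in practice, the snippet below scans the attribute table printed by smartmontools' `smartctl -A` and flags a few counters with non-zero raw values. The attribute names listed are ones smartctl commonly reports, but mapping them onto the paper's categories is our assumption, names vary by drive vendor, and the sample output is made up for illustration.

```python
from typing import Dict

# Assumed rough mapping to the paper's reallocation / probational counts.
WATCHED_ATTRIBUTES = (
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)

# Hypothetical excerpt of `smartctl -A /dev/sdX` output, for illustration only.
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
194 Temperature_Celsius     0x0022   038   045   000    Old_age   Always       -       38
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""


def nonzero_watched_attributes(smartctl_output: str) -> Dict[str, int]:
    """Return watched attributes whose raw value is greater than zero."""
    flagged: Dict[str, int] = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED_ATTRIBUTES and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


if __name__ == "__main__":
    for name, raw in nonzero_watched_attributes(SAMPLE_OUTPUT).items():
        print(f"warning: {name} = {raw}")
```

In real use you would feed the function the text captured from smartctl rather than the sample string. Per the study's findings, any non-zero count here is a good reason to check your backups and consider replacing the drive.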

Pinheiro, Weber, and Barroso also showed how temperature affects failure rates. They found that drives operating at low temperatures (i.e. less than 75F/24C) actually have the highest failure rates by far, even greater than drives operating at 125F/52C. This is likely an irrelevant point to many readers, but for those of us who live farther north and like to keep our homes at less than 70F/21C in the winter, it's an important reminder that colder is not always better for computer hardware. Of use to everyone, the study showed that reliability peaks around 104F/40C, within a range of about 95F/35C to 113F/45C.

From Pinheiro, Weber, and Barroso 2007.  AFR: Annualized Failure Rate - higher is worse!

Given the temperature range in which hard drives appear to be most reliable, it may take some experimentation with drive layout in any given case to keep a home file server's hard drives within it.
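When experimenting with drive placement, it helps to compare each drive's reported temperature against the range quoted above. The sketch below hard-codes the 35C to 45C window taken from our reading of the study. The function names are ours, and the thresholds are a rule of thumb rather than a specification.

```python
LOW_C = 35.0   # 95F, lower edge of the range quoted above
HIGH_C = 45.0  # 113F, upper edge of the range quoted above


def fahrenheit_to_celsius(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32.0) * 5.0 / 9.0


def classify_drive_temperature(celsius: float) -> str:
    """Place a drive temperature relative to the 35C-45C window."""
    if celsius < LOW_C:
        return "below range"
    if celsius > HIGH_C:
        return "above range"
    return "within range"


if __name__ == "__main__":
    for temp_f in (75.0, 104.0, 125.0):
        temp_c = fahrenheit_to_celsius(temp_f)
        print(f"{temp_f:.0f}F = {temp_c:.1f}C: {classify_drive_temperature(temp_c)}")
```

A drive that sits below the window in a cold room may simply need less airflow, while one above it may need a different bay or an extra fan.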

So rather than answering which specific hard drive models are the most reliable, we recommend you do everything you can to prevent catastrophic failure by using quality PSUs, maintaining optimal temperatures, and paying attention to SMART utilities. With a sample as small as the handful of drives in a home file server, the most important factor in long-term HDD reliability is probably luck.

Pragmatically, low-rpm 'green' drives are the most cost-effective storage drives.  Note that many of the low-rpm drives are not designed to operate in a RAID configuration - be sure to research specific models.  The largest drives currently available are 3TB, which can now be found for as little as $110.  The second-largest capacity drives at 2TB generally offer the best $/GB ratio, and can regularly be found for $70 (and less when on sale or after rebate).  1TB drives are fine if you don't need much space, and can sometimes be found for as little as $40.
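For readers who want to redo the $/GB comparison as prices change, here is a small sketch using the prices quoted above. It assumes decimal capacities (1TB = 1000GB), and the function names are ours.

```python
# Capacity in GB -> price in USD, using the figures quoted in the text.
QUOTED_PRICES = {3000: 110.0, 2000: 70.0, 1000: 40.0}


def dollars_per_gb(price_usd: float, capacity_gb: int) -> float:
    """Cost of one gigabyte for a drive of the given price and capacity."""
    return price_usd / capacity_gb


def best_value(prices: dict) -> int:
    """Return the capacity (in GB) with the lowest cost per gigabyte."""
    return min(prices, key=lambda gb: dollars_per_gb(prices[gb], gb))


if __name__ == "__main__":
    for capacity_gb, price in sorted(QUOTED_PRICES.items()):
        print(f"{capacity_gb // 1000}TB at ${price:.0f}: "
              f"${dollars_per_gb(price, capacity_gb):.4f}/GB")
    print(f"best value: {best_value(QUOTED_PRICES) // 1000}TB")
```

At the quoted prices the 2TB drive comes out at 3.5 cents per GB, slightly ahead of the 3TB drive and comfortably ahead of the 1TB drive. Swap in current prices to see whether that still holds.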


152 Comments


  • Emantir - Sunday, September 4, 2011 - link

    Im using my file server since october'10 it consists of:
    - Lian Li PC-Q08
    - Zotac NM10 DTX Wifi
    - 200GB Matrox Sata HDD (System)
    - 4x 2TB Western Digital WD20EARS (Storage)
    - Asus EN210 Silent
    Sporting Ubuntu 10.10 with Software Raid 5 and XBmC for HD Playback, works like a charm.
    XBmC Remote Apps exist for iOS and Android so i skipped buying a MCE Remote.
  • Emantir - Sunday, September 4, 2011 - link

    Uh, forgot the Problems:
    - Asus EN210 blocks one Sata Port
    - There are some problems concerning the JMicron Sata Multiplier and Linux. one drive gets miserable write speeds, thus making the whole raid 5 somewhat slow. More: http://goo.gl/cM0gg
  • Lonyo - Sunday, September 4, 2011 - link

    Does the NM10 support staggered spin-up of hard drives?
  • Emantir - Sunday, September 4, 2011 - link

    AFAIK No, Im using a 300W PSU anyway, thus high initial current isn't a Problem.
  • pvdw - Monday, September 5, 2011 - link

    But you should have a good quality PSU to give nice clean, reliable current. Loads of PSUs are just rubbish.
  • Lonyo - Sunday, September 4, 2011 - link

    You may have mentioned needing a good power supply, but when you talk about Atom and Zacate boards, low power solutions, and low power "green" drives, you don't focus on the fact that total system power use in typical conditions could be lower than 30w. If you are buying a beefy 500w power supply, you could be wasting a LOT of power due to efficiency issues.

    The 80PLUS rating only tests as low as 20% of full load. 30w on a 500w PSU is below 10% load, so you could be getting 70% efficiency.
    While it's not a major concern, if you are looking to make things low power to leave it on 24/7, you might want to think about some DC power supplies rather than regular desktop power supplies.
    If you are making a 2~4 drive file server based on an Atom system, you could get a 100~120w picoPSU instead of a "real" PSU, and get potentially much higher efficiency than with a 300w+ normal PSU.

    Of course, not everyone (especially Americans) cares about efficiency, since for them power is so inexpensive, but for a 24/7 box, why not at least discuss things which might improve power efficiency?
  • jtag - Sunday, September 4, 2011 - link

    I have to say that a file server guide that mentions RAID/NAS really should include a discussion on which drives are suitable for using in a RAID. Not all drives are valid for use in a RAID, not because of reliability concerns, but rather because not all manufacturers support Error Recovery Control (see http://www.csc.liv.ac.uk/~greg/projects/erc/ for more info) in their consumer level drives.

    I'd very much appreciate it if AnandTech could run the following command on every drive they test and add it to bench, so we could come up with a list of drives that do support ERC:

    smartctl -l scterc /dev/sdX

    smartctl is available for both Windows and Linux (smartmontools.)

    Of course, this may say it is supported, but the real test would be to set timeouts:

    smartctl -l scterc,70,70 /dev/sdX

    And then cause the drive to have a block error and see if access times out, or causes the drive to drop out of the RAID. This would also be a good test of RAID controller cards, though personally I always use software RAID under Linux.
  • jtag - Sunday, September 4, 2011 - link

    And for the record - I run 6 2TB drives in a RAID-6 (2 drive redundancy) with one hot spare under Gentoo Linux software RAID. My drives are 5 Seagate ST32000542AS and one Samsung EcoGreen F4 HD204UI
  • jwilliams4200 - Sunday, September 4, 2011 - link

    Any tips on how to "cause the drive to have a block error"?
  • Rick83 - Monday, September 5, 2011 - link

    you can use hdparm to mark a block as faulty, IIRC.
