Concluding Remarks

Ten different 4 TB hard drives have been analyzed for NAS and DAS applications. Coming to the business end of the review, it is clear that there is no 'one size fits all' model in this area. The hard drives themselves were launched targeting different markets and their resulting performance varies accordingly. However, based on our test results we can arrive at the following conclusions:

The lowest power consumption numbers were recorded, as expected, with the 5400/5900 RPM drives: the WD Reds, Seagate NAS HDDs and the Seagate Terascale units. While the WD Red drew the least power in the resync test, the Terascale and the NAS HDD drew less power during our access sequence tests (though not by much). If power constraints are a primary factor, any one of the three would be a good choice. That said, despite carrying the highest workload rating of the three, the Terascale has the lowest MTBF rating: 800K hours versus 1M hours for the Western Digital and Seagate NAS drives. Consequently, depending on the usage scenario, the premium for the Terascale might not be worth it.
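
To put the MTBF gap in perspective, the two ratings can be converted into an approximate annualized failure rate. The sketch below is only a rough illustration: it assumes a constant failure rate (an exponential model) and 24x7 operation, neither of which comes from the vendors' datasheets, and the function name is our own.

```python
import math

HOURS_PER_YEAR = 8766  # 365.25 days * 24 hours; assumes 24x7 operation


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Approximate AFR, assuming a constant failure rate (exponential model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# Compare the two MTBF ratings quoted in the text.
for label, mtbf in [("800K-hour rating", 800_000), ("1M-hour rating", 1_000_000)]:
    print(f"{label}: {annualized_failure_rate(mtbf):.2%}")
```

Under these assumptions the difference works out to roughly 1.1% versus 0.9% per drive per year, which is small next to the effect of workload, temperature and batch variation.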

The best overall performance is recorded by the Seagate Enterprise Capacity v4, thanks to its clear lead in the random access patterns segment of the multi-client evaluation. The drawback is that the part is quite difficult to find for purchase and carries a premium wherever it is available.

Meanwhile, the Toshiba drive is a strong value proposition as the cheapest enterprise hard drive in the scalable storage class of drives. Even though the MSRP and street price put it in the same category as the Seagate Terascale, we have seen occasional deals which give it only a slight premium over the non-enterprise WD Red and the Seagate NAS HDDs. Otherwise, the Ultrastar 7K4000 SAS unit is also available for rock-bottom prices from third-party sellers on Amazon, and is a good candidate for users running SAS-based storage servers, though without a broader sample space it is difficult to recommend it further.

Finally we have the WD Red Pro, which aims to strike a balance between performance, power consumption and price. The attractive pricing (given the warranty) makes up for the fact that it doesn't impress in any particular category compared to the competition. If Seagate's Enterprise Capacity drive were to retail for the same price as that of the Red Pro, the choice would be a no-brainer in favour of the Seagate unit. But right now that is not the case, and Western Digital continues to present a unique value proposition with the Red Pro lineup.

All in all, there are plenty of options for NAS users looking to stock up their NAS units with high capacity drives. Though not at the bleeding edge of capacity, today's 4 TB drives offer a good mix of pricing, performance, and capacity, and for the cautious buyer they offer an alternative to the potential risk of going the new-technology route with 6 TB drives. In the end, with the right data in hand, it's easy enough to find the best fit by taking into consideration the expected workload and desired price points.

62 Comments

  • shodanshok - Sunday, August 10, 2014 - link

    It is not a single post. It is a lengthy discussion of 18 different posts. Let me forward you to the first post: http://marc.info/?l=linux-raid&m=1406709331293...

    When used in a single-parity scheme, no RAID implementation or file system is immune to UREs that happen during a rebuild. What ZFS can do is catch when a disk suddenly returns garbage, which with other filesystems normally results in silent data corruption.

    But UREs are NOT silent corruption. They happen when the disk can not read the requested block and give you a "sorry, I can't read that" message.

    Regards.
  • asmian - Sunday, August 10, 2014 - link

    >But URE's are NOT silent corruption.

    They are if you are using WD Red drives, which Ganesh has previously said are using URE masking to play nicer with RAID controllers. They issue dummy data and no error instead of a URE. This, and the serious implications of it especially with single parity RAID (mirror/RAID5), is NOT mentioned in this comparative article, which is shocking.

    To reiterate: if a RAID5 array (or a degraded RAID6) has a masked URE, there is no way to know which disk the error came from. And if the controller is NOT continuously checking parity against all reads, for the sake of speed, then the dummy data will be passed through without any error being raised at all. Worse, since you don't know there has been a read error, you will assume your data is OK to back up. You will then likely overwrite good old backups with corrupt data, since space for multiple copies is likely to be at a premium, so any backup mitigation strategy is screwed.

    Given the fact that these are 4TB consumer class drives with 1 in 10^14 URE numbers, the chance of a URE when rebuilding is very high, which is why these Red drives are extremely unsafe in RAID implementations that do NOT check parity continuously. I already ran the numbers in a previous post, although they haven't been verified - Ganesh said he was seeking clarification from the manufacturers. Bottom line: caveat emptor if you risk your data to these drives, with or without RAID or a backup strategy.
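
As a rough check on the rebuild risk described in the comment above, the chance of hitting at least one URE can be estimated from the quoted 1-in-10^14 figure. This is a simplified sketch, not the commenter's own calculation: it assumes every bit fails independently at exactly the spec'd rate (datasheets usually state it as an upper bound), and the 4-drive RAID-5 layout and function name are hypothetical.

```python
import math


def p_at_least_one_ure(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of >= 1 URE, assuming independent errors at the spec'd rate."""
    bits = bytes_read * 8
    # 1 - (1 - p)^n, computed stably for very small p via log1p/expm1
    return -math.expm1(bits * math.log1p(-ure_per_bit))


TB = 1e12  # decimal terabyte, as used on drive labels

# Hypothetical 4-drive RAID-5 of 4 TB disks: a rebuild reads the 3 survivors in full.
surviving_drives = 3
print(f"{p_at_least_one_ure(surviving_drives * 4 * TB):.1%}")
```

Under this worst-case model the estimate is a little over 60% for a full read of three 4 TB drives. Real-world error rates are often reported to be lower than the datasheet bound, so the figure is best read as a pessimistic ceiling rather than a prediction.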
  • shodanshok - Sunday, August 10, 2014 - link

    Can you provide a reference about URE masking? I carefully read the WD Red specs (http://www.wdc.com/wdproducts/library/SpecSheet/EN...) and nowhere do they mention anything similar to what you are referring to. Are you sure you are not confusing URE with TLER?

    After all, I find extremely difficult to think that an hard drive will intentionally return bad data instead of a URE.

    The only product range where I can _very remotely_ find a similar thing useful is the WD Purple (DVR) series: being often used as simple "video storage" in a single-disk configuration, masking a URE will not lead to big problems. However, the proper solution here is to implement a configurable SCTERC or TLER.

    Regards.
  • asmian - Sunday, August 10, 2014 - link

    > I find extremely difficult to think that an hard drive will intentionally return bad data instead of a URE.

    Ganesh wrote to me: "As discussed in earlier WD Red reviews, the drive hopes to tackle the URE issue by silently failing / returning dummy data instead of forcing the rebuild to fail (this is supposed to keep the RAID controller happy)."
  • shodanshok - Sunday, August 10, 2014 - link

    This seems more like the functionality of TLER, rather than some form of URE masking. Anyway, if the Red drive really, intentionally returns garbage instead of a read error, it should absolutely be avoided.

    Ganesh, can you clarify this point?
  • asmian - Sunday, August 10, 2014 - link

    A quick search back through previous WD Red drive reviews reveals nothing immediately. Ganesh ran a large article on Red firmware differences that covered configurable TLER behaviour, which is about dropping erroring drives out of an array quickly so that the array parity or other redundancy can take over and provide the data that the drive can't immediately retrieve, but nothing like this was mentioned.

    However, in http://www.anandtech.com/show/6083/wd-introduces-r... the author Jason Inofuentes wrote: "They've also included error correction optimizations to prevent a drive from dropping out of a RAID array while it chases down a piece of corrupt data. The downside is that you might see an artifact on the screen briefly while streaming a movie, the upside is that you won't have playback pause for a few seconds, or for good depending on your configuration, while the drive drops off the RAID to fix the error."

    That sounds like what Ganesh has said, although I can't see anything in his articles mentioning it. It may be a complete misunderstanding of the TLER behaviour, though. The problem with the behaviour described above is that it assumes that the data is not important, something that will only manifest as a little unnoticed corruption while watching a video file. But what if it happens while you're copying data to your backup array? What if it's not throwaway data, but critical data and you now have no idea that it's corrupt or unrecoverable on the disk so you NEED that last good backup you took... I don't think ANYONE is (or should be) as casual as that about the intrinsic VALUE of their data - why bother with parity/mirror RAID otherwise? If the statement is correct, it's extremely concerning. If not, it needs correcting urgently.
  • Zan Lynx - Monday, August 11, 2014 - link

    To me that sounds like a short TLER setting. The description says nothing about if the drive returns an error or not. It may very well be the playback software receiving the error but continuing playback.
  • asmian - Monday, August 11, 2014 - link

    But a short TLER is designed specifically to allow the array parity/redundancy to kick in immediately and provide the missing data by reconstruction. There wouldn't BE any bad data returned (unless there was no array redundancy). So as described this is NOT anything to do with short TLER. It is about the drive not returning an error when it can't read data successfully (ie. a URE), and issuing dummy data instead. The fundamental issue is that without an error being raised, neither the array hardware/software nor the user can take any action to remedy the data failure, whether that's restoring the bad data from backup or even highlighting the drive to see if this is a pattern indicative of likely failure.

    There are some comments about it in that article which try to explain the scope (it seems to be limited to some ATA commands), but not in sufficient detail for me or most average users who don't know what ATA commands are sent by specific applications or the file system, and they certainly didn't answer my questions and misgivings.
  • shodanshok - Monday, August 11, 2014 - link

    Hi, it seems more like a short TLER timeout than URE masking. Ganesh, can you clarify?
  • ganeshts - Saturday, August 23, 2014 - link

    Yes, shodanshok is right; the TLER feature in these NAS drives is a shorter timeout rather than URE masking. Ian's quote of my exchange in a private e-mail was later clarified, but the conversation didn't get updated here:

    1. When URE happens, the hard drive returns an error code back to the RAID controller (in the case of devices with software RAID, it sends the error back to the CPU). The error code can be used to gauge what exactly happened. A fairly detailed list can be found here: http://en.wikipedia.org/wiki/Key_Code_Qualifier : URE corresponds to a medium error with this key code description: "Medium Error - unrecovered read error"

    2. Upon recognition of a URE, it is up to the RAID controller to decide what needs to be done. Systems usually mark the sector as bad and try to remap it. It is then populated with data recovered using the other drives in the RAID array. It all depends on the vendor implementation. Since most off-the-shelf NAS vendors use mdadm, I think the behaviour will be similar for all of those.

    3. TLER just refers to quicker return of error code back to controller rather than 'hanging' for a long time. The latter behaviour might cause the RAID controller to mark the whole disk as bad when we have URE for only one sector.
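
The three steps above can be sketched as a toy single-parity stripe. This is a minimal illustration under stated assumptions, not mdadm's or any vendor's actual code: `MediumError`, `Disk` and `read_with_recovery` are hypothetical names, each disk holds a single block, and sector remapping is modelled simply as clearing a flag on rewrite.

```python
from functools import reduce


class MediumError(IOError):
    """Stand-in for the drive's 'unrecovered read error' status."""


def xor_blocks(blocks):
    # XOR equal-length byte strings column by column
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class Disk:
    def __init__(self, block: bytes):
        self.block = block
        self.bad = False  # True simulates a sector hit by a URE

    def read(self) -> bytes:
        if self.bad:
            raise MediumError("unrecovered read error")
        return self.block

    def write(self, block: bytes) -> None:
        # Rewriting lets the drive remap the sector; modelled as clearing the flag
        self.block, self.bad = block, False


def read_with_recovery(stripe, index):
    """Read one member of a single-parity stripe, rebuilding it on error."""
    try:
        return stripe[index].read()
    except MediumError:
        # Reconstruct the missing block from every other member (data + parity)
        others = [d.read() for i, d in enumerate(stripe) if i != index]
        rebuilt = xor_blocks(others)
        stripe[index].write(rebuilt)
        return rebuilt


data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = [Disk(b) for b in data] + [Disk(xor_blocks(data))]  # last disk is parity
stripe[1].bad = True
print(read_with_recovery(stripe, 1))
```

The recovery path only works because the drive reports the error. If a second member of the same stripe also fails to read, the `MediumError` propagates out of the reconstruction, which is the single-parity rebuild failure discussed earlier in the thread.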
