


Several years ago, I bought a number of new components from an online retailer only to discover some of the equipment was bad. I figured since the hardware was new and in the sealed packing when I bought it, I wouldn’t have any problems. Whether you’re building a new system or refurbishing an old one, it’s always a good idea to test your hardware. Since hard drives are most people’s permanent storage, having a healthy hard drive is almost as important as good steady power to the computer.

In the past, our refurbishing project has used a few methods to detect bad drives. The first method was simply listening to the drive. If the drive sounded whiny (or had the notorious click of death), we either wiped the drive with DBAN (Darik’s Boot and Nuke - http://www.dban.org/), or took apart the drive and sent it to our end-of-life processor. The second way we knew a drive was bad was if it failed DBAN. This method wasn’t foolproof, because drives with bad sectors could still complete a DBAN wipe. Sometimes our volunteers would also forget to hook up the data or power cable, so a drive wouldn’t be detected and would appear to fail DBAN. The last method was to examine the hard drive using GSmartControl. GSmartControl is great because it can instantly detect certain kinds of SMART (Self-Monitoring, Analysis and Reporting Technology) errors. Unfortunately, SMART isn’t perfect. Wikipedia has an excellent article covering SMART, which mentions that in one study more than 50% of drives that failed did so without triggering one of the main SMART failure indicators.

To augment our SMART testing, we’re starting to use WHDD, a tool ported to Ubuntu by Eugene San. WHDD bills itself as a hard disk drive diagnostic and recovery tool. What we like about WHDD is that it can run a read-only surface scan of a disk fairly rapidly. Smartmontools and GSmartControl can run a short (approximately 2 minute) electrical and mechanical test, but the short test covers only a small part of the drive. Both tools can also run a Long/Extended test that scans the entire surface of the drive, but the Long test is, as the name says, long: even small (80GB) hard drives can take several hours to scan.

This is where WHDD comes in handy. WHDD can run a complete surface READ scan on an 80GB hard drive in under 22 minutes (17 minutes on one of our Seagate drives). Our larger 3TB drive (which was completely full of data) finished in 245 minutes. Below is a small sampling of the time/size ratio we found for different drives we measured.
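To put those scan times in perspective, the average read rate they imply can be worked out directly. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal units (1GB = 1000MB), and the function name average_mb_per_s is ours, not part of WHDD.

```python
def average_mb_per_s(size_gb: float, minutes: float) -> float:
    """Average read rate in MB/s for a full scan (decimal units assumed)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return size_gb * 1000 / (minutes * 60)


# Scan times quoted in the article
for label, size_gb, minutes in [
    ("80GB, 22 min", 80, 22),
    ("80GB Seagate, 17 min", 80, 17),
    ("3TB, 245 min", 3000, 245),
]:
    print(f"{label}: {average_mb_per_s(size_gb, minutes):.1f} MB/s")
```

Under those assumptions, the 80GB scans average roughly 61 and 78 MB/s, while the 3TB scan averages roughly 204 MB/s.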

The ETA is the estimated time of arrival (that is, the expected finish time) that WHDD displays at the beginning of the test. For the most part the ETA, unlike many time indicators, is fairly accurate, to within a few minutes. But, like other time indicators, it does suffer time creep when the test comes across several slowly read sectors. We tested several drives and found the average time for an 80GB hard drive to be about 22 minutes.
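Why slow sectors make an ETA creep can be seen with a generic estimator. This is a minimal sketch of a common approach (remaining work divided by the average rate so far); it is an assumption for illustration and not a description of how WHDD itself computes its ETA. The function name and the block counts are hypothetical.

```python
def eta_minutes(total_blocks: int, blocks_done: int, elapsed_s: float) -> float:
    """Remaining minutes, assuming the average rate so far continues."""
    if blocks_done <= 0:
        raise ValueError("need at least one completed block")
    remaining = total_blocks - blocks_done
    return remaining * elapsed_s / blocks_done / 60


# Halfway through after 10 minutes: another 10 minutes expected
print(eta_minutes(1000, 500, 600))

# Same progress, but slow sectors stretched the elapsed time to 15 minutes,
# so the estimate for the second half grows as well
print(eta_minutes(1000, 500, 900))
```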

The Speed of the drive is a fluctuating number. In general, we found that the larger the drive, the higher the speed. This makes sense, since newer, larger drives should use faster technology to read their surface. We tested 15 drives of different sizes and makes and found that, with the odd exception, speed increased with size. Drive content didn’t seem to affect the numbers as much as the number of slowly read sectors did.

During the read test, WHDD charts the number of blocks read within each of the following access times: <3 ms, <10 ms, <50 ms, <150 ms, <500 ms, >500 ms. If you see a lot of blocks in the <500 ms and >500 ms ranges, it’s a really good idea to back up your data and switch to a drive with better read times. Below is a small sample of read times for the same drives.
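The grouping of blocks into those ranges can be sketched as a simple histogram. This is an illustration only, not WHDD’s implementation: the names are ours, the sample access times are made up, and we assume a block taking exactly 500 ms lands in the top range.

```python
from bisect import bisect_right
from collections import Counter

# Upper bounds (ms) of the ranges listed in the article; the last range is open-ended.
THRESHOLDS_MS = [3, 10, 50, 150, 500]
LABELS = ["<3 ms", "<10 ms", "<50 ms", "<150 ms", "<500 ms", ">500 ms"]


def bucket_label(access_ms: float) -> str:
    """Return the range a single block access time falls into."""
    return LABELS[bisect_right(THRESHOLDS_MS, access_ms)]


def histogram(access_times_ms):
    """Count blocks per range, keeping every range in the result."""
    counts = Counter(bucket_label(t) for t in access_times_ms)
    return {label: counts.get(label, 0) for label in LABELS}


# Made-up access times, for illustration only
sample = [1.2, 2.8, 4.0, 9.9, 35.0, 120.0, 480.0, 730.0]
print(histogram(sample))
```

A drive whose last two ranges keep filling up is the kind the text suggests backing up and replacing.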

In the example, the Western Digital WD5000AAKS-65V0 500GB hard drive has 3 blocks that took more than 500 milliseconds to read and 22 blocks in the 150 to 500 millisecond range. If you’re concerned about your drive being fast, or worried about bad blocks, this might be a good indication that it’s time to back up and replace the drive.

But what about data? Does the amount of data on a drive affect how long WHDD takes to read it? In these charts, all the drives were blank with the exception of the 3TB Seagate ST3000DM001-1ER166, which was almost full of very large files (20GB+ files). Although it took the longest to read, it’s also 6 times the size of the 500GB drive. If we take the 90 minutes of the 500GB drive and multiply it by 6, we get 540 minutes, more than double the 245 minutes the 3TB drive actually took to read. From this we can conclude that WHDD doesn’t seem to be affected much by the amount of data on a drive (it also helps that newer drives are simply faster).
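The comparison above is a straight linear scaling, which can be checked in a few lines. The sketch uses only the figures quoted in the article; the function name is hypothetical.

```python
def linear_estimate_minutes(ref_minutes: float, ref_size_gb: float,
                            target_size_gb: float) -> float:
    """Scale a reference scan time linearly with drive size."""
    return ref_minutes * target_size_gb / ref_size_gb


estimate = linear_estimate_minutes(90, 500, 3000)  # 500GB drive scaled up to 3TB
actual = 245                                       # measured 3TB scan time, in minutes

print(f"linear estimate: {estimate:.0f} min")
print(f"actual:          {actual} min")
print(f"estimate/actual: {estimate / actual:.1f}x")
```

The linear estimate comes out at 540 minutes, about 2.2 times the measured 245 minutes, so the full 3TB drive was read considerably faster per gigabyte than the blank 500GB drive.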

WHDD is a command-line tool and needs root privileges, so run it with sudo:

sudo whdd

WHDD can show SMART attributes, run a read test, run a copy test, run a write test, or set up a host protected area (HPA), a hidden area of the drive that an OS can’t normally read. WHDD doesn’t do the kind of SMART short test that Smartmontools or GSmartControl does, but one of the neat things WHDD might tell you (when looking at the SMART attributes) is whether the hard drive has a firmware update available.

The Read, Copy and Write tests are all visual: each block is drawn on screen as it is read, copied or written, and colored by its access time. Quickly read blocks are mostly black with a tiny bit of grey, and blocks with access times up to 50ms are still grey. Blocks under 150ms are green, blocks over 150ms are red, and blocks over 500ms are a bright salmon color. Errors are indicated by what looks like a small bug.

At any time during a test, you can stop the test by pressing CTRL+C. To get back to the WHDD menu after aborting the test, press m.

WHDD is a great tool when SMART doesn’t tell the whole story. SMART can tell you a lot about a hard drive: the number of hours in use, whether a sector has been reallocated, even whether the system has been suddenly shut down. But it doesn’t always help when a drive is simply slower than you might expect, and that is where WHDD fills the gap.

Of course there is no substitute for a good backup. More than any other tool, a good backup can save you the most grief. Whether you’re a system administrator or just storing terabytes of music and video on your home media server, you want some kind of backup to ensure your data (and effort) doesn’t disappear if a drive fails.

Alan Ward wrote an excellent article on backup using rsync in FCM#83: http://fullcirclemagazine.org/issue-83

In summary: if you’re just checking out blank drives, listen first. If a drive sounds bad but passes a SMART tool like GSmartControl or Smartmontools, give it a quick read test with WHDD as well. If you’re worried about data on a drive, back it up first with rsync (or dd if you have an identical or larger drive to copy to), then check with one of the SMART tools and WHDD to get a better overview of what’s going on with your drive(s).

GSmartControl: http://gsmartcontrol.sourceforge.net/home/

Smartmontools: https://www.smartmontools.org/

WHDD: http://whdd.org/

issue108/labolinux2.1462033296.txt.gz · Last modified: 2016/04/30 18:21 by auntiee