How to detect hard disk failure?


http://serverfault.com – One of my servers has a hard disk failure. It runs software RAID; the system locked up, and according to /proc/mdstat (and /var/log/messages) the disk really is down:

    Personalities : [raid1]
    md2 : active raid1 sdb2[1]
          104320 blocks [2/1] [_U]
    md5 : active raid1 sdb5[1]
          2104448 blocks [2/1] [_U]
    md6 : active raid1 sdb6[1]
          830134656 blocks [2/1] [_U]
    md1 : active raid1 sdb1[1]
          143363968 blocks [2/1] [_U]

and

    Nov 5 22:04:37 m38501 smartd[4467]: Device: /dev/sda, not capable of SMART self-check

However, when I run smartctl -H /dev/sda, it passes the test. It also ...
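
For anyone who wants to check for this state automatically, here is a minimal sketch of how the degraded arrays could be picked out of /proc/mdstat. It is not from the original question: the function name degraded_arrays and the SAMPLE text are made up for illustration, and the pattern assumes the simple "active raidN" layout quoted in the question (inactive or resyncing arrays print extra lines it does not handle). An array counts as degraded when fewer members are present than expected ([2/1]) or when the status flags contain an underscore ([_U]).

```python
import re

# One array entry in /proc/mdstat spans two lines, for example:
#   md2 : active raid1 sdb2[1]
#         104320 blocks [2/1] [_U]
# Assumption: only this simple layout is handled.
ARRAY_RE = re.compile(
    r"^(?P<name>md\d+) : (?P<state>\w+) (?P<level>\w+) (?P<members>[^\n]*)\n"
    r"\s+(?P<blocks>\d+) blocks[^\n]*?"
    r"\[(?P<want>\d+)/(?P<have>\d+)\] \[(?P<flags>[U_]+)\]",
    re.MULTILINE,
)


def degraded_arrays(mdstat_text):
    """Return {array name: details} for every array that is missing a member."""
    result = {}
    for m in ARRAY_RE.finditer(mdstat_text):
        want, have = int(m["want"]), int(m["have"])
        # "[2/1]" means 2 members expected, 1 present; "_" marks an empty slot.
        if have < want or "_" in m["flags"]:
            result[m["name"]] = {
                "members": m["members"].split(),
                "want": want,
                "have": have,
                "flags": m["flags"],
            }
    return result


# Sample text copied from the question (hypothetical variable name).
SAMPLE = """\
Personalities : [raid1]
md2 : active raid1 sdb2[1]
      104320 blocks [2/1] [_U]
md5 : active raid1 sdb5[1]
      2104448 blocks [2/1] [_U]
md6 : active raid1 sdb6[1]
      830134656 blocks [2/1] [_U]
md1 : active raid1 sdb1[1]
      143363968 blocks [2/1] [_U]
"""

for name, info in degraded_arrays(SAMPLE).items():
    print(f"{name}: {info['have']}/{info['want']} members, flags [{info['flags']}]")
```

On a live system the same function could be fed the contents of /proc/mdstat instead of SAMPLE. Note that this only reports what the md driver sees; it says nothing about why smartctl -H still reports the disk as passing.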