Replacing a failed disk in an MD array
Sooner or later you will end up with the infamous [_U] in your MD array. If you are lucky you may be able to recover by rebooting, or by removing and re-inserting the drive and then re-adding it to the MD device. The question is always whether you should trust a drive that has failed once. Sometimes it is just easier if the drive dies completely.

Whether or not the faulty drive is still accessible, I find the most robust way of removing it from an array is to simply power the system down and remove the drive. (Make sure to remove the correct drive!) On the next boot md will detect the missing drive, drop it from the array and run the MD device in a degraded state. Some people claim that you should fail the drive first, but failing it means it is no longer a member of the MD device, which makes it hard to recover the array solely from the removed drive if it is still readable.

Adding a new drive back to the array is easy. Let's say I have replaced sda and created a partition of the same size as on the old disk:

# mdadm --manage /dev/md0 --add /dev/sda1

As usual the status can be read from /proc/mdstat:

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
      1953512400 blocks super 1.2 [2/1] [_U]
      [=========>...........]  recovery = 45.0% (879803904/1953512400) finish=191.4min speed=93475K/sec

unused devices: <none>

If the rebuild is slow you can force a higher speed by raising the lower speed limit:

# echo 100000 > /proc/sys/dev/raid/speed_limit_min

This forces a rebuild speed of at least 100000K/sec (if the system can keep up with that). Watch out though! Rebuilding the array puts extra stress on the remaining drives, and that stress may cause another drive to fail. It may be safer to rebuild at a low speed. Sketches of the commands for the surrounding steps follow below.
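For reference, here is a sketch of the re-add approach mentioned at the top. The array /dev/md0 and partition /dev/sda1 are assumptions; substitute your own devices. A re-add typically only succeeds if the drive's md superblock is still intact:

# mdadm --manage /dev/md0 --re-add /dev/sda1   # assumes the old member is back online as sda1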
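Before powering down, it helps to confirm which physical drive to pull. A sketch, assuming the suspect drive is sda and smartmontools is installed:

# mdadm --detail /dev/md0                   # shows which member is faulty or missing
# smartctl -i /dev/sda | grep -i serial     # match the serial number against the drive's label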
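If you prefer the fail-first method that I argue against above, it would look like this (device names again assumed):

# mdadm --manage /dev/md0 --fail /dev/sda1
# mdadm --manage /dev/md0 --remove /dev/sda1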
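To create the partition on the replacement disk, one way is to copy the partition table from the surviving disk instead of partitioning by hand. A sketch, assuming sdb is the healthy disk and sda the new one; double-check the direction before running either command:

# sfdisk -d /dev/sdb | sfdisk /dev/sda                 # MBR disks: dump sdb's table onto sda
# sgdisk -R /dev/sda /dev/sdb && sgdisk -G /dev/sda    # GPT disks; -G gives the copy new unique GUIDs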
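The rebuild speed limits can also be inspected or capped through the same interface, and watch makes it easy to follow the progress (the 5-second interval is just an example):

# cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
# echo 50000 > /proc/sys/dev/raid/speed_limit_max      # cap the rebuild to reduce stress on the drives
# watch -n 5 cat /proc/mdstat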
