Replacing a faulted drive on Linux software RAID (MDTOOLS)

Here’s a very quick HOWTO for Linux software RAID. These notes cover replacing a faulty disk with a new one.

When you set up software RAID on Linux, you are planning to survive hardware failures. When such a failure happens, you need to replace the faulty drive with a new one and tell your RAID configuration about it.

First take a look at your current RAID config by running the command:

~# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[2](F) sdb2[1]
      70645760 blocks [2/1] [_U]
md0 : active raid1 sda1[0] sdb1[1]
      9767424 blocks [2/2] [UU]
unused devices: <none>
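
If you want to script this check, here is a minimal sketch that pulls the faulty members out of mdstat-style text with awk. The helper name list_faulty is made up for this example, and it assumes the "(F)" flag format shown in the listing.

```shell
#!/bin/sh
# list_faulty: hypothetical helper, not part of mdadm.
# Prints "<array> <member>" for every member flagged (F).
# Reads /proc/mdstat unless another file is given as $1.
list_faulty() {
    awk '
        # Array lines look like: md1 : active raid1 sda2[2](F) sdb2[1]
        $1 ~ /^md/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)$/) {
                    member = $i
                    sub(/\[.*/, "", member)   # drop the "[2](F)" suffix
                    print $1, member
                }
            }
        }
    ' "${1:-/proc/mdstat}"
}
```

Called with no argument, list_faulty reads the live /proc/mdstat; on the listing above it would print "md1 sda2".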

The mdstat output shows that array md1 has lost sda2 to a fault.
My setup is two disks in software RAID1: sda2 is marked as faulty (the “(F)” flag) and its slot in the array is empty (“_” instead of “U”). The first thing to do is to physically replace the drive; power off the machine first if you don’t have hot-swap drives.
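
Before pulling a drive it helps to know which physical disk is sda. Here is a small sketch, assuming your system populates /dev/disk/by-id (udev normally does); the link names there usually carry the model and serial number printed on the drive label. The helper name ids_for_disk is made up for this example.

```shell
#!/bin/sh
# ids_for_disk: hypothetical helper.
# Prints the names of the by-id links that point at the given
# whole disk (e.g. "sda"), skipping its partitions.
# $1 = kernel disk name, $2 = link directory (default /dev/disk/by-id)
ids_for_disk() {
    disk=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue          # only look at symlinks
        target=$(readlink "$link")          # e.g. ../../sda
        if [ "${target##*/}" = "$disk" ]; then
            printf '%s\n' "${link##*/}"
        fi
    done
}
```

Running "ids_for_disk sda" then lists the identifiers to compare against the label on the drive you are about to remove.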

Then you need to tell your RAID configuration about the new drive. First remove the old block device from the array (md1 in my case):

~# mdadm /dev/md1 -r /dev/sda2
mdadm: hot removed /dev/sda2
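
A brand-new disk has no partitions yet, so it needs the same layout as the surviving disk before anything can be added back. One common way is to dump the partition table with sfdisk and replay it onto the new disk. The sketch below assumes /dev/sdb is the healthy disk and /dev/sda is the blank replacement, as in this example; the wrapper name copy_partition_table and the DRY_RUN variable are made up here. It only prints the command unless DRY_RUN=0 is set, because replaying the table onto the wrong disk destroys its data.

```shell
#!/bin/sh
# copy_partition_table: hypothetical wrapper around sfdisk.
# $1 = healthy source disk, $2 = blank replacement disk.
# Prints the command by default; set DRY_RUN=0 to really run it.
copy_partition_table() {
    src=$1
    dst=$2
    if [ "$src" = "$dst" ]; then
        echo "source and target must differ" >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "sfdisk -d $src | sfdisk $dst"
    else
        # Dump the source layout and write it to the target disk.
        sfdisk -d "$src" | sfdisk "$dst"
    fi
}
```

For example, "copy_partition_table /dev/sdb /dev/sda" shows what would be run; double-check the device names before repeating it with DRY_RUN=0.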

Then add the matching partition from your new, already partitioned drive:

~# mdadm /dev/md1 -a /dev/sda2
mdadm: re-added /dev/sda2

Now you will see the array rebuilding in mdstat:

~# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[2] sdb2[1]
      70645760 blocks [2/1] [_U]
      [>....................] recovery = 0.3% (268800/70645760) finish=21.8min speed=53760K/sec
md0 : active raid1 sda1[0] sdb1[1]
      9767424 blocks [2/2] [UU]
unused devices: <none>
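
If you would rather not keep re-reading mdstat by hand, here is a minimal sketch that extracts just the rebuild percentage. The helper name recovery_progress is made up, and it assumes the "recovery = N%" line format shown above (it also accepts "resync").

```shell
#!/bin/sh
# recovery_progress: hypothetical helper.
# Prints "<array> <percent>" for each array that is rebuilding.
# Reads /proc/mdstat unless another file is given as $1.
recovery_progress() {
    awk '
        # Remember which array the following lines belong to.
        $1 ~ /^md/ && $2 == ":" { array = $1 }
        # Progress lines look like: [>....] recovery = 0.3% (...)
        /recovery|resync/ {
            for (i = 1; i < NF; i++) {
                if ($i == "=") { print array, $(i + 1); break }
            }
        }
    ' "${1:-/proc/mdstat}"
}
```

On the listing above it would print "md1 0.3%". Alternatively, "mdadm --wait /dev/md1" simply blocks until the rebuild is done.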

When the recovery has finished, both members show as “U” again and you have a working config.
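
To double-check the end result from a script, here is a small sketch that lists any array still missing a member, judged by a "_" in the status brackets. The helper name degraded_arrays is made up for this example; no output means every array is complete.

```shell
#!/bin/sh
# degraded_arrays: hypothetical helper.
# Prints the name of every array whose status field (e.g. [_U])
# still contains "_", meaning a member is missing.
# Reads /proc/mdstat unless another file is given as $1.
degraded_arrays() {
    awk '
        $1 ~ /^md/ && $2 == ":" { array = $1 }
        # Status lines end with a field like [UU] or [_U].
        /blocks/ && $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print array }
    ' "${1:-/proc/mdstat}"
}
```

For a closer look at a single array, "mdadm --detail /dev/md1" reports the state of each member.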

Hope it helps

Ben