Playing with mdadm

This is all for RAID1.  And it's written in December 2010 with reference to mdadm 2.6.7.2 running on Debian 5 (Lenny), or any other Linux system with a 2.6.26-ish kernel.

First, set up some partitions — normally in equal-sized pairs on different disks.  Let's assume the pair is /dev/sda1 and /dev/sdb1, to be combined as /dev/md0 (yes, I know — normal partitions are numbered from 1, RAID ones from 0.  Trying to line up /dev/sd?1 with /dev/md1 seems to cause problems when automatic numbering kicks in, so live with it.)

Create the RAID array:

mdadm --create -n 2 --level 1 /dev/md0 /dev/sda1 /dev/sdb1

or, if only one partition is available, that can be

mdadm --create -n 2 --level 1 /dev/md0 /dev/sda1 missing

That will start the process of syncing the two partitions, which can take a long time if they're big, even if they were initially empty.
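
For example, a sketch of the degraded-then-complete route (the ext3 filesystem and mount point are just assumptions for illustration): create the array with one member, use it straight away, and add the second partition when it becomes available.

mdadm --create -n 2 --level 1 /dev/md0 /dev/sda1 missing
mkfs.ext3 /dev/md0
mount /dev/md0 /mnt

and later, once /dev/sdb1 exists and is at least as big:

mdadm --add /dev/md0 /dev/sdb1
cat /proc/mdstat

The last command shows the sync progress as sdb1 is brought up to date.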

What happens if both contained data?  If either contains a partition table or filing system, mdadm will prompt before continuing.  The data (i.e. everything, including any partition table) on the first-named partition will overwrite the second partition when the array syncs.  Syncing doesn't start until the array is first written to — mounting it read-write is enough — and until then /proc/mdstat shows it as 'pending'.
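
It's worth checking what's already on the partitions before creating an array over them (same device names as above).  mdadm --examine reports any existing RAID superblock, and blkid reports any filesystem signature:

mdadm --examine /dev/sda1 /dev/sdb1
blkid /dev/sda1 /dev/sdb1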

Before I go on, I'll look for some existing information.  Here's some: http://www.ducea.com/2009/03/08/mdadm-cheat-sheet/

So I'll copy and paste a big chunk of that, and annotate it to my taste.

1. Create a new RAID array

Create (mdadm --create) is used to create a new array:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
or using the compact notation:
mdadm -Cv /dev/md0 -l1 -n2 /dev/sd[ab]1

2. /etc/mdadm.conf

/etc/mdadm.conf or /etc/mdadm/mdadm.conf (on Debian) is the main configuration file for mdadm. After we create our RAID arrays we add them to this file using:
mdadm --detail --scan >> /etc/mdadm.conf
or on Debian
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
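
For reference, the appended line looks something like this (the UUID here is invented; Debian's stock mdadm.conf normally already has a DEVICE line telling mdadm which devices to scan):

DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=a1b2c3d4:e5f6a7b8:c9d0e1f2:a3b4c5d6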

3. Remove a disk from an array

We can't remove a disk directly from the array unless it is marked as 'failed', so we first have to fail it (if the drive has really failed, mdadm should have detected this already and this step will not be needed):
mdadm --fail /dev/md0 /dev/sda1
and now we can remove it:
mdadm --remove /dev/md0 /dev/sda1

This can be done in a single step using:
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1

These commands use what mdadm calls 'manage' mode, as opposed to the 'create', 'assemble', etc. modes.

4. Add a disk to an existing array

We can add a new disk to an array (replacing a failed one probably):
mdadm --add /dev/md0 /dev/sdb1

This will only work if /dev/sdb1 does not already have a superblock, i.e. it has not already been made part of an array.

If it has a superblock, that can be removed with:
mdadm --zero-superblock /dev/sdb1

This will potentially lose data since the 'array' is considered to contain the up-to-date data, and the newly added partition will be overwritten.

See http://en.wikipedia.org/wiki/Mdadm#Known_problems
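
Putting 3 and 4 together, replacing a dead sdb might look roughly like this (a sketch, assuming the replacement disk arrives blank and the sfdisk trick from further down is used to copy the partition table):

mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1

then, after physically swapping the disk:

sfdisk -d /dev/sda | sfdisk /dev/sdb
mdadm --add /dev/md0 /dev/sdb1
watch cat /proc/mdstat

The last command just sits there showing the rebuild progress.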

5. Verifying the status of the RAID array

We can check the status of the arrays on the system with:
cat /proc/mdstat
or
mdadm --detail /dev/md0

The output of cat /proc/mdstat will look something like this:

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
104320 blocks [2/2] [UU]

md1 : active raid1 sdb3[1] sda3[0]
19542976 blocks [2/2] [UU]

md2 : active raid1 sdb4[1] sda4[0]
223504192 blocks [2/2] [UU]

Here we can see both drives are used and working fine (the Us).  A failed drive shows as (F) next to its name, and a degraded array has _ in place of U for the missing device.
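
For comparison, a degraded md0 with a failed sdb1 would look something like this (illustrative, not copied from a real machine):

md0 : active raid1 sdb1[1](F) sda1[0]
104320 blocks [2/1] [U_]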

Note: while a RAID rebuild is in progress, watching the status with watch can be useful:
watch cat /proc/mdstat

6. Stop and delete a RAID array

If we want to completely remove a RAID array, we have to stop it first and then remove it:
mdadm --stop /dev/md0
mdadm --remove /dev/md0

and finally we can even delete the superblock from the individual partitions:
mdadm --zero-superblock /dev/sd[ab]1

That presumably leaves two normal partitions, which will have identical contents if syncing was up-to-date.  They could then be recreated into an array with --create (but that would keep the contents of the first-named).
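
As an aside: with the 0.90-format superblock that this version of mdadm uses by default, the metadata lives at the end of the partition, so one half of a stopped RAID1 array should be mountable directly (read-only, to be safe) to check its contents:

mdadm --stop /dev/md0
mount -o ro /dev/sda1 /mnt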

7. Assembling

If the system is set up correctly (whatever that means), RAID arrays will be assembled when the machine is booted.

If not, partitions that have previously belonged to an array can be linked together:
mdadm --assemble /dev/md0 /dev/sd[ab]1

For more convenience, mdadm will scan partitions for matching superblocks and start the arrays automatically:
mdadm --assemble --scan
which is presumably what happens at boot time.
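
To see what a partition thinks it belongs to before assembling anything, inspect its superblock; --examine --scan prints config-file-style ARRAY lines for everything it can find:

mdadm --examine /dev/sda1
mdadm --examine --scan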

7a. Knoppix — why did that not work?

(when I was sorting out RAID on an HP Proliant server)

Because of dmraid.  If the BIOS has been used to create a fake-raid array in the past, even if you've now chosen not to use it, dmraid will notice the signature on the drives and will take them over, getting in the way of mdadm.  See http://en.wikipedia.org/wiki/Mdadm#Known_problems.

Normally, Knoppix sorts out RAID well.  It doesn't seem to assemble arrays automatically, but a simple
mdadm --assemble --scan
will start any arrays that it finds.
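
To check whether dmraid is what's getting in the way, list any fake-raid signatures it can see (and dmsetup ls shows whether it has actually mapped anything).  dmraid can also erase that metadata, but check its man page for the exact option before trying it:

dmraid -r
dmsetup ls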

8. Monitoring

mdadm --monitor will only daemonize and monitor things if there's an email address or program name (a MAILADDR or PROGRAM line) in /etc/mdadm/mdadm.conf.

Only Fail, FailSpare, DegradedArray, SparesMissing and TestMessage cause email to be sent.  All events cause the program to be run.  The program is run with two or three arguments: the event name, the array device and possibly a second device.

So, for information about other events, we need the specified 'program' to be a script that sends a suitable email.  Easy, but not particularly useful. 
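
A minimal version of such a script might look like this (the path /usr/local/bin/mdadm-event is my own invention, and it assumes a working 'mail' command; mdadm calls it with the event name, the array and possibly a member device):

#!/bin/sh
# called by mdadm --monitor: $1 = event, $2 = md device, $3 = member device (may be empty)
echo "mdadm event: $1 on $2 $3" | mail -s "mdadm: $1 $2" root

with a corresponding line in /etc/mdadm/mdadm.conf:

PROGRAM /usr/local/bin/mdadm-event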

Looking at /etc/init.d (on Debian Lenny), there are two init scripts that get run automatically:

  • mdadm — "Start the MD monitor daemon for all active MD arrays if desired."
  • mdadm-raid — "Start all arrays specified in the configuration file."

There's some configuration in /etc/default/mdadm, as set up when the mdadm package is configured.

Monitoring also happens as controlled by /etc/cron.d/mdadm, which runs /usr/share/mdadm/checkarray once a month.
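
The same check can be started by hand, either via the Debian wrapper (I believe it takes --all; its --help lists the options) or directly through sysfs, which is the generic kernel interface:

/usr/share/mdadm/checkarray --all
echo check > /sys/block/md0/md/sync_action

Progress shows up in /proc/mdstat as usual.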


Other things to do

Finally, since RAID1 arrays need identical partitions on both drives, this can be useful for copying the partition table from sda to sdb:
sfdisk -d /dev/sda | sfdisk /dev/sdb

(This dumps the partition table of sda and writes it to sdb, completely replacing sdb's existing partition table.  Be sure you want this before running it, as it will not warn you at all.)
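
A slightly more cautious variant saves the dump to a file first, which gives a chance to read it before applying it (the filename is arbitrary):

sfdisk -d /dev/sda > sda-partitions.txt
sfdisk /dev/sdb < sda-partitions.txt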

Things to sort out:

Boot partition: should grub be on /dev/sda1 and /dev/sdb1 separately, or on /dev/md0?  The latter, I think, now that Grub is capable of such things.

Aha!  That's all very well, but where does the BIOS find grub?  Some BIOSes will just boot from 'a hard drive', others from a specific hard drive.  For example, an HP Proliant I've been working on mentions specific drives.  It was willing to boot from a fake-raid array (using nVidia MediaShield), but I opted for mdadm software RAID instead.  That leaves the BIOS pointing at one of the drives: if that drive fails, a BIOS tweak will presumably be required.  It didn't have an option for 'try this hard drive, and then try the other one'.
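
Whichever drive the BIOS picks, it seems sensible to have grub's boot code in both MBRs so that either drive can boot on its own.  With the GRUB legacy shipped in Lenny, the usual trick is to map the second drive to hd0 inside the grub shell and install to it (a sketch, assuming /boot is on the first partition of each drive):

grub
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> quit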

Likewise swap: it won't be any slower on RAID1, and we won't have to tweak things if a drive fails.
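
So swap just becomes another small mirrored array (the partition names here are assumptions for illustration):

mdadm --create -n 2 --level 1 /dev/md1 /dev/sda2 /dev/sdb2
mkswap /dev/md1
swapon /dev/md1

with an /etc/fstab line along the lines of:

/dev/md1  none  swap  sw  0  0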
