Sunday 8 March 2015

Putting Bigger Disks in a Server ..... Without Rebooting

Yes, you read that right.  It really is possible to swap the disks, in a running server, without rebooting.  I know, because I was that soldier.  This is how I did it.

What you will need:
  • A server with one or more RAID-1 arrays.
  • Enough RAM to run without a swap area.
  • Quick-release drive carriers, preferably including one spare.
  • Enough new drives, preferably identical, to replace the little ones.
  • Another computer with at least two SATA ports.
  • A USB stick with System Rescue CD installed.
  • The usual tools, spare parts &c.

Most tutorials of this nature begin with a dire warning to back up all your data.  You don't actually need to do this here, because the whole point of a RAID-1 setup is that both  (all?)  the disks are copies of each other.  And the system tries its damnedest to make sure that this is the case.  Nonetheless, if you don't feel confident .....  Well, back up all your data anyway.

NOTE:  # is the root prompt on the server; % is the prompt on the work computer  (which will be running SystemRescueCD, and that uses zsh as its default shell instead of bash.)

Run top to check memory usage.  If the machine has not used any swap space, you're good to go.  If it has, you may as well take it down for a while anyway and stick more RAM in it.  Otherwise, turn swap off:

# swapoff -a
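
If you would rather have a number than squint at top, here is a minimal sketch that works out the swap in use from /proc/meminfo; run it before turning swap off.  The function name swap_used_kb is my own invention, and it assumes the usual SwapTotal and SwapFree lines are there.

```shell
# Print swap in use, in kB, from a meminfo-style file (default: /proc/meminfo).
# swap_used_kb is a made-up helper name, not a standard tool.
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}
```

Anything other than 0 means the machine has been leaning on its swap space.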

Now cat /proc/mdstat .  This will tell you which drives are part of which RAID arrays.
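
If the output is busy, a small sketch like this can pull out just the member partitions of one array.  md_members is a hypothetical helper of mine, and it assumes the usual mdstat layout where each member is shown as something like sdb1[1].

```shell
# List the member partitions of one array from an mdstat-style file.
# Usage: md_members md0 [file]   (file defaults to /proc/mdstat)
md_members() {
    awk -v arr="$1" '$1 == arr {
        for (i = 2; i <= NF; i++)
            if ($i ~ /\[[0-9]+\]/) {   # members look like sdb1[1]
                sub(/\[.*/, "", $i)    # drop the [slot] suffix and any flags
                print $i
            }
    }' "${2:-/proc/mdstat}"
}
```

So # md_members md0 on my server would list sdb1 and sda1, one per line.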

Disconnect one of the drives from the RAID array:

# mdadm --manage /dev/md0 --fail /dev/sdb1
# mdadm --manage /dev/md0 --remove /dev/sdb1

Now for the first scary bit.  You are going to physically remove the drive from the server.  Take a deep breath, release the caddy and withdraw it in one smooth motion, while trying not to think about what might happen if you have got it wrong and pulled out the wrong drive.  Run

# cat /proc/mdstat

again to make sure all is still well.
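
What you are looking for is the little status string on the line under the array name:  [UU] means both halves are present, and something like [U_] means one is missing, which is exactly what you expect right now.  A rough sketch for fishing it out, assuming the status string is the last thing on that line  (md_state is my own name for it):

```shell
# Print the [UU]-style status string for one array.
# Usage: md_state md0 [file]   (file defaults to /proc/mdstat)
md_state() {
    awk -v arr="$1" '
        $1 == arr { found = 1; next }   # the array header line
        found     { print $NF; exit }   # status is the last field of the next line
    ' "${2:-/proc/mdstat}"
}
```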

Label this drive at once.  There will be confusion later on, and you don't want to get things mixed up.  Now go to the computer you will be using for the donkey work.  Connect the ex-server drive to SATA port 0, and a brand new drive to SATA 1.  Plug in your SystemRescueCD USB stick, and boot up into the sysresccd shell  (and don't forget to select a keymap when prompted, otherwise you will find all your punctuation marks in the wrong places.  Maybe some letters, too, if you are French or German.)

Now run

% fdisk -l

and read and digest the output carefully.  You will know which disk is which, by the sizes:  the smaller one -- which should be sda, because you plugged it into port 0 -- is the one you took out of the server, and the larger one with no partitions on it should be sdb.  There will also be an sdc, and maybe even an sdd if you left the DVD±RW drive connected.  Assuming things are right here, all you need do now is

% dd if=/dev/sda of=/dev/sdb bs=65536

to copy the contents of sda, in 65536-byte chunks, over to sdb.  This will bring the boot sector, partition table and everything else across.

(You did get it the right way around, didn't you?  If it stops with "No space left on device", you just tried to copy the factory-fresh HDD onto the one out of your server.  Oops.  I can't even give you any instructions here, because this did not happen to me.  But it's not the end of the world; you still have a copy of your data on the disk that's still in the server .....)
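
If you want a seat belt for that, here is a sketch of a check that refuses to go ahead when the destination is smaller than the source.  size_of and fits are hypothetical helper names; the sketch assumes blockdev from util-linux is available for real disks, and falls back to wc so you can try it out on ordinary files first.

```shell
# Size in bytes of a block device or, for dry runs, of a regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        wc -c < "$1" | tr -d ' '
    fi
}

# Succeed only if the destination (second argument) is at least as large
# as the source (first argument).
fits() {
    [ "$(size_of "$1")" -le "$(size_of "$2")" ]
}
```

Used as % fits /dev/sda /dev/sdb && dd if=/dev/sda of=/dev/sdb bs=65536 , the copy only starts if sdb is the bigger disk.  It will not save you if both disks are the same size, so still read the fdisk output.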

(Why didn't she just say % sfdisk -d /dev/sda | sfdisk /dev/sdb to copy over just the partition table?  That would have been much quicker, for sure.  And the server is going to stomp all over the disk contents anyway.  But then you wouldn't be able to time how long the copying takes.  Disk I/O will be the major limiting factor for this operation, so you can be reasonably sure that the server will take about the same amount of time to recreate the RAID array as it took you to copy the disk.)

Now you have a clone of your original sdb.  But it isn't quite ready yet.  If there was a swap partition on the disk, we need to move that to the end of the free space, so we can grow the main partition. 

Messing around with partitions is a great way to trash a drive if you are not careful.  So power down the work computer and disconnect the original drive.  Then there is no way you can inadvertently change its contents.  Restart the work computer.  SystemRescueCD includes gparted, so start the GUI with

% startx

and then get to work.  Basically, you want the partition for your RAID member to be as big as it can be.  Exactly how you have to do this bit will depend on your original partitioning scheme.  Mine was easy -- just one primary partition for the RAID1 array that will hold the root file system, and a logical partition to act as swap space.

Note that moving a logical partition first requires you to grow the extended partition which contains it; then you can move the logical partition, and last of all shrink the extended partition  (I got bitten by this at first, but Google was very helpful).  Also, if it protests that a partition is in use, what probably has happened is that the kernel has seen that it looks like part of a RAID array -- and so the underlying partitions will be in use by the mdraid driver.  Try

% cat /proc/mdstat

If a  (degraded)  RAID array shows up, you need to stop it with

% mdadm --stop /dev/md0

and then you should be able to resize the partition.

Shut down the work computer, swap the new drive into the caddy, put the original sdb somewhere safe and insert the new drive into the server.  Wait a minute or so, for the server to recognise that a new disk has been plugged in.  If when you type

# fdisk -l

you can see the new, large drive, you're ready!  So add it to the RAID array:

# mdadm --manage /dev/md0 --add /dev/sdb1

and keep doing # cat /proc/mdstat a few times to convince yourself that stuff is happening.  At this point, you may as well brew a cup of tea, write a blog post or something.  You can't really go anywhere until the RAID system has done its stuff.  You know roughly how long it's going to take to copy that amount of data from one drive to another, because you've already done it.
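
If you get bored of reading the whole of /proc/mdstat each time, this sketch prints just the percentage done.  It assumes the progress line looks like "recovery = 12.6% (...)", which is what I saw; md_progress is a name I made up.

```shell
# Print the progress of the first resync/recovery found, e.g. "12.6%".
# Prints nothing when no array is rebuilding.
# Usage: md_progress [file]   (file defaults to /proc/mdstat)
md_progress() {
    awk '/(recovery|resync|reshape|check) *=/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) {
                sub(/^=/, "", $i)   # "=100.0%" has no space after the equals sign
                print $i
                exit
            }
    }' "${1:-/proc/mdstat}"
}
```

Something like # while [ -n "$(md_progress)" ]; do md_progress; sleep 60; done will then tick along until the rebuild is finished.  If you just want the shell to block until it is done, mdadm --wait /dev/md0 does that.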

Now it's time for a bit of a diversion.  Run

# blkid

and make a note of the UUIDs of any swap partitions on sda .
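
To save copying long strings by hand, a sketch like this filters the blkid output down to the swap partitions.  It assumes the usual one-line-per-device format with UUID="..." and TYPE="swap" fields; swap_uuids is a hypothetical name.

```shell
# Read blkid output on stdin; print "device uuid" for each swap partition.
swap_uuids() {
    sed -n '/TYPE="swap"/ s/^\([^:]*\):.* UUID="\([^"]*\)".*/\1 \2/p'
}
```

Run it as # blkid | swap_uuids and keep the line for sda somewhere you will find it again.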

When the array is no longer resyncing, then it's time for the second stage.  So this time, disconnect the other drive from the array:

# mdadm --manage /dev/md0 --fail /dev/sda1
# mdadm --manage /dev/md0 --remove /dev/sda1

and pop out the caddy with sda in it.  Label the drive; plug it and your other new disk into your work computer; and boot up SystemRescueCD.  Now I'm going to assume the small drive is sda and the large drive is sdb.  Make sure they really are that way around.  If needs be, power off and swap the cables.  You don't want to be attempting substitutions.

This time, we can use the quick way of copying.

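As this is the boot device, it needs its boot sector copying across.  Bear in mind that this copies one sector only.  A boot loader such as GRUB usually keeps more of itself in the gap between the boot sector and the first partition; if yours does, you will need to reinstall it onto the new disk from the server afterwards.  The boot sector itself is just: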

% dd if=/dev/sda of=/dev/sdb bs=512 count=1

And then you can copy the partition table:

% sfdisk -d /dev/sda | sfdisk /dev/sdb

Then run

% fdisk -l

and check that the partitions are the same on both drives.  Recreate any swap partition that may have been on the original drive  (I had a swap partition at sda5, so % mkswap /dev/sdb5 ).  Don't forget to give it the same UUID as was in /etc/fstab  (mkswap takes a -U option for exactly this).  Start the GUI and use gparted to fix the partitions.  At this stage, it would do no harm to run blkid and make sure the UUID for the swap partition is correct.
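
If you don't trust your eyes for the "same on both drives" check, here is a sketch that compares two saved sfdisk -d dumps while ignoring the device names.  It assumes each partition line looks like "/dev/sda1 : start= ..., size= ..., ..." as mine did; layout and same_layout are names I made up.  Do this before you let gparted loose, because afterwards the layouts are supposed to differ.

```shell
# Reduce an sfdisk -d dump to its start/size/type fields, dropping device names.
layout() {
    sed -n 's/^[^ ]* *: *\(start=.*\)$/\1/p' "$1" | tr -d ' '
}

# Succeed if both dumps contain partitions and describe the same layout.
same_layout() {
    [ -n "$(layout "$1")" ] && [ "$(layout "$1")" = "$(layout "$2")" ]
}
```

For example:  % sfdisk -d /dev/sda > /tmp/sda.dump , the same for sdb, and then % same_layout /tmp/sda.dump /tmp/sdb.dump && echo same .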

Power down the work computer, and swap the new sda into the caddy.  Re-insert it into the server, let it get recognised, and then

# mdadm --manage /dev/md0 --add /dev/sda1

As ever, keep an eye on /proc/mdstat until it finishes resyncing.

You now have a clone of your original system on two new drives -- and the original two drives in a safe place in case anything goes wrong.  But you are still not quite done yet.  There is all that empty space to be taken care of.

First you need to grow the RAID array to fit in its new-size partitions:

# mdadm --grow /dev/md0 --size=max

And when that's done, you can grow the file system to fit in the RAID array:

# resize2fs /dev/md0

(Yes, this does work.  You can resize a filesystem while it is still mounted, and nothing terrible will happen.)  You can estimate how long the resync that follows the --grow is going to take, based on the difference between the new and old drive sizes; but this probably will be a bit optimistic, and it is likely to take longer in real life.  Note that at first, /proc/mdstat will be estimating days for completion; it will speed up over time.
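
The back-of-envelope sum goes like this, as a sketch.  It assumes the array syncs at about the rate your earlier dd copy managed, which, as I said, is optimistic.  eta_seconds is a hypothetical helper; the figures you feed it are whatever dd reported when it finished.

```shell
# Estimate seconds needed to sync the extra space.
# Usage: eta_seconds BYTES_TO_SYNC BYTES_COPIED_BY_DD SECONDS_DD_TOOK
eta_seconds() {
    to_sync=$1 copied=$2 took=$3
    [ "$took" -gt 0 ] || return 1     # avoid dividing by zero
    rate=$(( copied / took ))         # bytes per second, rounded down
    [ "$rate" -gt 0 ] || return 1
    echo $(( to_sync / rate ))
}
```

With made-up numbers:  if dd took 5000 seconds to copy a 500 GB drive and the new drives are 2 TB, there are 1500 GB still to sync, and eta_seconds 1500000000000 500000000000 5000 prints 15000, a bit over four hours.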

The very last thing you need to do is to re-enable swap, with

# swapon -a

.....  And that's really all there is to it!  Not only did you swap out both drives, but you now have more free space.  And no reboot required -- although you really should think about doing so, just to prove that the array is still bootable.  Have your SystemRescueCD USB stick with you while doing this.


Many thanks to the Open Source community, in particular those upon whose efforts this builds.
