Backing up and archiving with rsync and ZFS snapshots

Overview

I look after a server for a colleague that runs a number of KVM virtual machines on top of Ubuntu 12.04 LTS. The server has a pair of USB drives plugged into the back for backups.

I talked briefly about this in my lightning talk at the LISA conference in Newcastle this week, but I’m not sure I made much sense, so this is a better write-up.

Using USB hard drives for backups has drawbacks: USB transfers are slow compared to modern SATA disks, and I don’t trust hard disks or USB to transfer and store data correctly. This solution makes use of ZFS to address both of these problems, and many others.

Why ZFS for the backup target?

The ZFS features we are leveraging are:

  • End-to-end checksums – ZFS provides end-to-end data integrity by computing and storing a checksum with every block on disk. This costs a small amount of space and CPU time, but the gains for backups are huge: you know that the data on disk is the data you backed up, and it is automatically verified every time you read your backups.
  • Compression – ZFS can compress data before it is sent to the disk. This helps because we can store more backups, and because data is compressed before it travels over the slow USB link.
  • Mirroring – The drives are set up as a ZFS mirror so that data is stored twice, once on each disk. This helps for two reasons: firstly, if ZFS does detect an error on read, there is a second copy of the data to repair the error; secondly, as rsync mostly reads the existing backups when syncing to the pool, we can read from both disks at once, so the pool is twice as fast and backups finish quicker.
  • Snapshots – We want to be able to “step back in time” and load a machine image from last week/month/year, but without storing the same data over and over again. ZFS snapshots are how we do that; as we are using both LVM and ZFS snapshots, I cover this in more detail next.
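The pool itself only needs creating once. A minimal sketch of how a mirrored, compressed pool might be set up, assuming the two USB drives appear as /dev/sdc and /dev/sdd (substitute your own device names) and using the pool and dataset names from the script below:

```shell
# Create a mirrored pool from the two USB drives (device names are
# examples - use the ones your drives actually appear as)
zpool create archives mirror /dev/sdc /dev/sdd
# Compress data before it is written, i.e. before it crosses the USB link
zfs set compression=on archives
# Create the dataset the backups will live in
zfs create archives/backups
# Check that both halves of the mirror are online
zpool status archives
```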

LVM vs ZFS snapshots

The disk images for the virtual machines are stored on an LVM-managed RAID5 array. The only user data on the machine is inside the disk images, so we need an efficient way to back up and archive large (100GB) disk images. We can do this by leveraging the power of snapshots, both LVM and ZFS snapshots.

LVM and ZFS snapshots are very different. LVM is a block management layer, so snapshots in LVM are block based, whereas ZFS is a copy-on-write (CoW) filesystem, so ZFS snapshots are tree snapshots. The differences are shown in this slide from a Sun presentation on ZFS. The key difference is that ZFS snapshots are a map of where the changed blocks are stored, whereas LVM snapshots are a copy of what has been overwritten.

You don’t need to understand exactly why they are different, just that they are and that we are going to make use of them. Now on to the meat of the backup process.
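One practical consequence you can see for yourself: once backups have been running for a while, listing the snapshots shows how little extra space each one costs, because a snapshot only accounts for the blocks that changed after it was taken. For example, with the dataset name used later in this post:

```shell
# List all snapshots of the backup dataset; the USED column is the
# space unique to each snapshot, not the full size of the backup
zfs list -t snapshot -r archives/backups
```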

Implementation

First of all we need to set some variables. Different people like the date in different formats; here we are using %s, the number of seconds since the UNIX epoch in 1970.

# These are the VMs that are sync'd before backup
vm_targets="Server-Tim MySQL"
# Get the date in UNIX time
date=$(date +%s)
# These are the LVM volume groups we are backing up
storage_targets="/dev/SSD/images /dev/raid5/VM.images"
# Where are we backing up to, needs to be ZFS
backup_target="archives/backups"
# Where we mount the LVM snapshots
mountpoint=/run/backup
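The snapshot names created below all end in this epoch timestamp. If you ever need to know which human-readable date a snapshot corresponds to, GNU date can convert it back with the -d @SECONDS syntax. A small example with a made-up timestamp:

```shell
# Convert an epoch timestamp (as used in the snapshot names) back into
# a readable date; -u keeps it in UTC. The value here is an example.
date -u -d @1356998400 +%Y-%m-%d
```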

As we are backing up virtual machines (VMs), it really helps if we can tell the machine that it’s about to have a backup taken. There are a number of approaches to this, such as logging in with an SSH key or using an agent, but because QEMU doesn’t have much support for agents yet, and SSH logins are just another thing to set up, I’ve settled on a different way. I make use of the fact that virsh can send a raw key press to the VM, just as if someone sitting at the keyboard had pressed it: by sending Alt+SysRq+S we trigger Linux’s emergency sync mechanism, which flushes all waiting disk buffers from memory to disk, perfect for a backup.

One second after requesting the disk flush we tell QEMU to suspend the machine so that nothing changes while we sync the other machines and take an LVM snapshot. This happens very quickly, and each machine is normally suspended for less than 5 seconds.

for VM in $vm_targets ; do
  virsh send-key $VM KEY_LEFTALT KEY_SYSRQ KEY_S
  sleep 1 ; sync
  virsh suspend $VM
done

Now that we have suspended all the machines we care about and made sure all of the disk images are consistent, we take an LVM snapshot of each volume group so we can get a consistent backup of the disk images. Later on we mount these snapshots and take an rsync backup of the disk images.

for storage in $storage_targets ; do
  shortname=$(basename $storage)
  /sbin/lvm lvcreate --quiet -L 10G --chunksize 512k -s -n ${shortname}.${date} $storage
done

We now have snapshots on the LVM storage, so we can release the machines to carry on working while we run the backups.

for VM in $vm_targets ; do
  virsh resume $VM
done

We also need to take a ZFS snapshot so we can roll back to this backup in the future.

/sbin/zfs snapshot $backup_target@daily.$date

We now have two snapshots frozen at the same point in time, one LVM and one ZFS. We plan to keep the ZFS one, but we need to discard the LVM one as soon as we can because it lowers the performance of the disks. Now we need to actually copy the files from the LVM snapshot to the ZFS storage.

This loop of the backup script is the complex part. It makes a temporary mount point, checks and mounts each LVM snapshot, then uses rsync to copy any changes from the snapshot to the ZFS backup target. The flags --no-whole-file and --inplace are worth mentioning: they force rsync to copy only changed blocks from the LVM storage to the ZFS storage, which makes the ZFS snapshots very space efficient as well as improving the speed of the backups. Finally the loop unmounts the LVM snapshots and removes them.

for storage in $storage_targets ; do
  shortname=$(basename $storage)
  mountname=$(echo $storage | sed -e s,^/dev/,,g -e s,/,.,g )
  mkdir -p $mountpoint/$date/$mountname
  /sbin/fsck -p ${storage}.${date}
  if ! mount -o ro ${storage}.${date} $mountpoint/$date/$mountname ; then
    echo failed to mount ${storage}.${date}
  else
    /usr/bin/rsync -axH --no-whole-file --inplace --delete $mountpoint/$date/$mountname/ /$backup_target/$mountname/
  fi
  sleep 10 ; sync
  umount ${storage}.${date}
  sleep 10 ; sync
  /sbin/lvm lvremove --quiet --force ${storage}.${date}
  rmdir $mountpoint/$date/$mountname
done
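To make the naming scheme in that loop concrete: the sed expression flattens a device path into a single directory name by stripping the leading /dev/ and turning any remaining slashes into dots. For example:

```shell
# Flatten an LVM device path into a backup directory name:
# /dev/raid5/VM.images becomes raid5.VM.images
storage=/dev/raid5/VM.images
mountname=$(echo $storage | sed -e 's,^/dev/,,' -e 's,/,.,g')
echo $mountname
```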

Just in case, grab a copy of the root filesystem as well, since it contains the configuration of the server and the VMs.

/usr/bin/rsync -axH --no-whole-file --inplace --delete / /$backup_target/root/

And finally…

Yes, it would be easier if the main VM data store ran on ZFS, but it doesn’t. I think, however, that this backup solution gets the best from both LVM and ZFS snapshots.
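When a restore is actually needed, ZFS makes it straightforward: every snapshot of a dataset is exposed read-only under the hidden .zfs/snapshot directory at the root of that dataset, so stepping back in time is just a copy. A sketch, with a made-up timestamp and image name:

```shell
# See what backups exist
zfs list -t snapshot -r archives/backups
# Copy a disk image back out of a snapshot (the timestamp and image
# name here are examples)
cp /archives/backups/.zfs/snapshot/daily.1356998400/raid5.VM.images/mysql.img \
   /var/tmp/restore/mysql.img
```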

Setting up a headless networked Raspberry Pi

I’m sure most people know about the Raspberry Pi and all of the cool and interesting things you can do with it, but I’m interested in using my Pi as a headless networked computer.

First you need an image to boot the Pi. I use Raspbian because it’s close to Debian (which I know) and is built for ARM hard-float (which makes it faster on the Pi). SD card images for Raspbian can be found on the official download page.

Once you have written the image to an SD card, you need to find your Pi after it boots. Because Raspbian gets an IP address from DHCP and starts an SSH server by default, the easiest way to do this is just to scan your local network for systems running an SSH server. You can do this with nmap; the reason you need to run nmap as root is to get the MAC address back off the wire.

sudo nmap -p 22 --open  192.168.1.0/24

You get something like this back. Notice the “Raspberry Pi Foundation” label by the MAC address: that means you’ve found a Pi.

Nmap scan report for 192.168.1.xxx
Host is up (0.00064s latency).
PORT   STATE SERVICE
22/tcp open  ssh
MAC Address: B8:27:EB:XX:XX:XX (Raspberry Pi Foundation)

Once you have found your Pi you need to log in and set it up. The user name is pi and the password is raspberry (you should really change this).

ssh -l pi <IP address>
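Once logged in, changing that default password is a single command:

```shell
# Change the pi user's password (prompts interactively)
passwd
```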

First off, updating. Raspbian is just a spin of Debian, so we can use apt-get to update.

sudo apt-get update
sudo apt-get dist-upgrade

You can also use apt-get to install packages, but that’s a different tutorial.

The latest version of the Pi model B has 512MB of RAM, which is much better and gives a lot more flexibility, but as we want to run a headless Pi we really don’t need to give the GPU much RAM. Edit /boot/config.txt and add this line to the end of the config file:

gpu_mem=16

As we don’t have a monitor plugged into the Pi there isn’t much point in running X, and disabling it will save some RAM.

sudo update-rc.d lightdm disable

By default Raspbian sets up a 100MB swap file on the SD card. I don’t really like that, as it’s slow and can wear out the SD card, so I disable swap.

sudo update-rc.d dphys-swapfile disable

There is a system in Linux called “zram”, a compressed ramdisk we can use to swap to instead of the SD card. When it’s not in use the zram ramdisk takes up almost no memory, so I tend to enable it on most systems I look after. I’ve hacked up an init script that sets up a compressed ramdisk half the size of the Pi’s memory and formats it as swap. To install this script, download it, make it executable, and enable it.

sudo wget -O /etc/init.d/zram http://co-lo.night-shade.org.uk/~tim/Pi/zram
sudo chmod 755 /etc/init.d/zram
sudo update-rc.d zram enable
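For reference, the core of what such a zram init script does is roughly this. This is a sketch of the standard zram setup, not the actual script from the URL above:

```shell
#!/bin/sh
# Set up one compressed ramdisk sized to half of physical RAM
modprobe zram num_devices=1
# MemTotal in /proc/meminfo is reported in kB; take half, convert to bytes
half_kb=$(( $(awk '/^MemTotal:/ {print $2}' /proc/meminfo) / 2 ))
echo $(( half_kb * 1024 )) > /sys/block/zram0/disksize
# Format the zram device as swap and enable it with a high priority,
# so it is used before any disk-based swap
mkswap /dev/zram0
swapon -p 100 /dev/zram0
```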

The Raspberry Pi Foundation calls the Linux kernel that the Pi boots from the SD card “firmware”. As the licensing is “complex”, and not something I’m going to comment on here, you can’t update it with standard Debian tools. There is an update script from https://github.com/Hexxeh/rpi-update that will automate downloading and installing the latest firmware.

sudo apt-get install git-core
wget -O rpi-update https://raw.github.com/Hexxeh/rpi-update/master/rpi-update
chmod 755 rpi-update
sudo ./rpi-update

Finally, we need to resize the main system partition to use all the space on the SD card. Raspbian includes a way to do this in the raspi-config application, but you can also just use sfdisk like this:

echo ",+," | sudo sfdisk --force -N2 --no-reread /dev/mmcblk0

You will then need to reboot for all of these changes to take effect.

Once the Pi has rebooted, the final step is to log in again and resize the root filesystem:

sudo resize2fs /dev/mmcblk0p2

You now have a tiny networked computer that you can use for all sorts of things. I’ll post some examples of what you can do later.