Checking Backup Integrity

If you’ve set up a backup solution for WordPress or other dynamic PHP websites, you will probably be backing up site files as well as the site database. For a proper backup solution, you need to check that the backup copy is viable.

You may have a copy of the site files, along with a (hopefully properly) dumped database, but unless you connect these up, how do you know that your backup copy is sound?

The integrity of your backups is not something that you should discover during an emergency recovery situation.

Manually rebuilding a working copy of a dynamic website is time-consuming. In each site database, the internal site URLs all point to production domains, which complicates the rebuild. When you have a server backup with ten or twenty important client sites, verifying backups looks pretty daunting, and I suspect that a lot of people just don’t bother.

This article describes how to partially automate this process.

If you want to hack on this, get the files on GitHub.

Context

In our case, site files from the Apache root of a production server are backed up incrementally on a daily basis to a date-stamped directory. This contains:

  • A subdirectory html – which in turn contains a subdirectory for each site under the document root
  • A subdirectory sql which contains a collection of dumped databases for the sites in question

Important config files are also backed up, but that is beyond the scope of this article.

This article assumes that the backup has been downloaded to a local machine.

Checking Integrity: Overview

To test the integrity of backed up sites, one option is to build working clones of the sites on a virtual machine. To avoid the need to change URLs on the backup copies, the /etc/hosts file is amended on the guest VM.
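
As a rough sketch of the hosts file step, the function below prints one /etc/hosts line per site directory. It assumes that each directory under the document root is named after the site’s production domain, which may not match your layout, and `print_hosts_entries` is a hypothetical helper rather than part of the linked repo.

```shell
# Sketch: print /etc/hosts lines that point production domains at the VM itself.
# ASSUMPTION: each directory under the document root is named after its domain.
print_hosts_entries() {
  local doc_root dir
  doc_root=${1:-/var/www/html}
  for dir in "$doc_root"/*/; do
    [ -d "$dir" ] || continue   # skip the literal pattern if nothing matched
    printf '127.0.0.1\t%s\n' "$(basename "$dir")"
  done
}

# Usage on the guest VM (review the output before appending it):
#   print_hosts_entries /var/www/html | sudo tee -a /etc/hosts
```

Because the entries only exist on the guest, the host machine still resolves the same domains to the live sites.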

Obviously, the guest VM needs to run a server that broadly matches the original backed up server (in this case Apache), and the virtual hosts settings for the guest VM server need to be set up correctly (this is a one-time import from the backed-up config directory).

You don’t necessarily need to use a VM – you could use any machine on the local network. The reason this is done on a VM/separate machine is so that the main host computer can access the actual live sites for maintenance purposes.

This method also keeps separation between backed up clones and ongoing development websites – which are two different things.

This article assumes that a backup archive is available. Building working copies involves:

  1. One-time setup of a suitable Virtual Machine – in this case, an Ubuntu Xenial, Apache, MariaDB and PHP 7 LAMP stack
  2. A one-time import of relevant database users to the VM
  3. Exporting files from the Host machine backup archive to the Guest VM (run command in Host)
  4. Importing databases in the Guest (run command in Guest)

Aims

Check the integrity of multiple site backups by building working local copies. This is achieved by:

  • Moving site files and databases for a backed-up production server from a host machine into a local virtual machine
  • Importing MySQL/MariaDB databases and setting up working sites on the VM

Backup integrity should be checked regularly, so this should be a simple process.

Ideally, once the system has been set up it should be run by administrators rather than developers.

Requirements

These BASH scripts have been tested on Ubuntu Xenial Xerus 16.04 Desktop.

Zenity is used to create user dialogues.

VirtualBox is required for the Virtual Machine. In this case, the VM runs Ubuntu 16.04 Xenial Xerus desktop – desktop rather than server because it allows easy checking of the moved sites. To achieve this, the guest machine hosts file (/etc/hosts) must be set up properly to point at the local copies.

The database server on the VM is MariaDB, but the commands would work on a standard MySQL database server.

The sql backups directory includes the performance_schema.sql, phpmyadmin.sql, mysql.sql and log files from the original server. These aren’t necessary to build clones from backups, and if imported will probably mess up the VM MySQL configuration. Because of this, we exclude these files from the transfer – see the sql-verification-exclude file in the linked repo for an example.
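
For illustration, an exclude file along the following lines would skip the system dumps and log files named above. The patterns are an assumption based on that list, not a copy of the file in the repo, and `write_sql_exclude` is a hypothetical helper.

```shell
# Sketch: create an rsync exclude file for the SQL sync.
# ASSUMPTION: these patterns mirror the files named in the text; the real
# sql-verification-exclude file in the repo may differ.
write_sql_exclude() {
  cat > "$1" <<'EOF'
performance_schema.sql
phpmyadmin.sql
mysql.sql
*.log
EOF
}

# Usage on the host:
#   write_sql_exclude /media/david/storage/sql-verification-exclude
```

rsync reads one pattern per line from the file passed to `--exclude-from`, so adding another dump to skip is a one-line change.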

Note: For the backed up sites on the guest machine to work properly, the MySQL users from the original server should be imported in a one-time operation.

Move Files to the VM

This is achieved with the move-backups script. This script prompts the user to choose a directory to move. The script is tightly coupled to our requirements, but would be easy to amend.

The directory to be moved is a datestamped directory that contains the entire html directory (i.e. document root) from a backed-up Apache server. It also contains backed up MySQL files (originally created by mysqldump) in a sql directory.
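
Since the script depends on this layout, a small guard could reject a directory that doesn’t look like a backup. This is a minimal sketch, assuming that the presence of both subdirectories is a good enough test; `is_backup_dir` is a hypothetical name and is not in the repo.

```shell
# Sketch: succeed only if the directory looks like one of our backups,
# i.e. it contains both an html and a sql subdirectory.
is_backup_dir() {
  [ -d "$1/html" ] && [ -d "$1/sql" ]
}

# Possible use in move-backups, straight after the directory is chosen:
#   is_backup_dir "$SOURCE" || { zenity --error --text="Not a backup directory: $SOURCE"; exit 1; }
```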

Move Backups Script

Run on the Host computer.


#!/bin/bash
#
# Move a directory into a local Virtual Machine for testing purposes.
#
# This file should be executable and in your path. E.g.:
# - `mv move-backups /usr/local/bin`
# - `chmod +x /usr/local/bin/move-backups`
# - Run: `move-backups`
#
# Add username@network-IP for your VM in place of `david@192.168.1.145`
# Add root@network-IP for your VM in place of `root@192.168.1.145`
# ------------------------------------------------------------------------------
STORAGE=/media/david/storage/servername
SQL_DESTINATION=david@192.168.1.145:staging
HTML_DESTINATION=root@192.168.1.145:/var/www/html
VM=Xenial
SQL_EXCLUDE=/media/david/storage/sql-verification-exclude # rsync excludes are controlled in a file

cd $STORAGE;

# Select the Directory to move
# ------------------------------------------------------------------------------
zenity --info \
--text="Begin the build process for backed-up client websites. Click \"OK\" to begin. Then select the date-stamped directory in the backup storage area."

SOURCE=$(zenity --file-selection --directory --title="Select a Directory to Sync")

# zenity exits with 0 on OK and 1 on Cancel; anything else is an error
case $? in
0)
echo "\"$SOURCE\" selected.";;
1)
echo "No directory selected."; exit 1;;
*)
echo "An unexpected error has occurred."; exit 1;;
esac

# Start the Virtual Machine headless - to show the VM window, remove `--type headless`
# ------------------------------------------------------------------------------
VBoxManage startvm "$VM" --type headless | zenity --progress \
--pulsate --width="320" --height="150" \
--text="Starting $VM Virtual Machine" \
--title="Please Wait while $VM is started" --auto-close

# Sync HTML directories INDIVIDUALLY
# ------------------------------------------------------------------------------
HTML_DIRS="$SOURCE/html"

# Loop through Directories only
for DIR in "$HTML_DIRS"/*/; do

# For our rsync setup, the source directory MUST NOT have a trailing slash:
# rsync then creates (or updates) a directory of the same name under the
# destination, and `--delete` only affects that one site.
SOURCE_DIR=${DIR%/}

# basename of the $DIR - the name of the site directory under `/var/www/html`
DEST_DIR=$(basename "$DIR")

rsync -azv --progress --delete "$SOURCE_DIR" "$HTML_DESTINATION/" | zenity --progress \
--pulsate --width="320" --height="150" \
--text="Syncing the HTML directory: $DEST_DIR" \
--title="Please Wait" --auto-close

done

# Sync SQL backups to a staging directory
# ------------------------------------------------------------------------------
rsync -azv --exclude-from=$SQL_EXCLUDE --progress --delete $SOURCE/sql/ $SQL_DESTINATION/sql | zenity --progress \
--pulsate --width="320" --height="150" \
--text="Syncing the SQL directory" \
--title="Please Wait" --auto-close

# Tidy up
zenity --question \
--text="Sync complete. Do you want to shut down the VM?"

case $? in

0)
echo "0"
# Close up the VM, maintain state
VBoxManage controlvm "$VM" savestate | zenity --progress \
--pulsate --width="320" --height="150" \
--text="Shutting down $VM Virtual Machine" \
--title="Please Wait while VMs are Saved" \
--auto-close
zenity --info \
--window-icon="info" \
--text="The VM $VM has been shut down."
echo "$VM was closed to a saved state"

;;

1)
echo "1"
zenity --info \
--window-icon="info" \
--text="Your VM $VM is running - though it may be in a headless state."

;;

*)
echo "An unexpected error has occurred."
;;

esac

GitHub repo with scripts: https://github.com/DavidCWebs/check-backups

Usage:

  • Add move-backups to /usr/local/bin on the Host computer: mv move-backups /usr/local/bin
  • Make executable: chmod +x /usr/local/bin/move-backups
  • Run move-backups in a terminal and follow instructions

When prompted, select the date-stamped backup directory. It should contain the backed-up html directory from the Apache document root (the directory that normally sits under /var/www/ in a standard Apache setup) alongside the sql directory.

Note that the moved files won’t do anything unless you also import the associated databases on the guest machine.

Import Databases

  • Add import-databases to /usr/local/sbin on the Guest computer/VM: mv import-databases /usr/local/sbin
  • Make executable: chmod +x /usr/local/sbin/import-databases
  • Run sudo import-databases in a terminal on the Guest VM

#!/bin/bash
#
# The purpose of this script is to import databases so that working copies of
# backed up PHP/WordPress websites can be quickly and easily checked.
#
# The script loops through all databases in a staging directory and imports them
# into MySQL/MariaDB. Existing databases having the same name will be overwritten.
#
#-------------------------------------------------------------------------------

SOURCE=/home/david/staging/sql
PASSWORD=thenicelongpassword
DATABASES=($SOURCE/*)

for (( i = 0; i < ${#DATABASES[@]}; i++ )); do

# The file extension - in our case, there are *.log files that should be ignored
EXT=${DATABASES[$i]##*.}

if [[ "sql" == "$EXT" ]]; then

DB_SOURCE=${DATABASES[$i]}
DB_NAME=$(basename "${DATABASES[$i]}" .sql)

# If a Database exists with this name, DROP it
mysql --user=root --password=$PASSWORD -e "DROP DATABASE IF EXISTS \`$DB_NAME\`"

# Create new DB with the name of the DB backup file
mysql --user=root --password=$PASSWORD -e "create database \`$DB_NAME\`; GRANT ALL PRIVILEGES ON \`$DB_NAME\`.* TO root@localhost IDENTIFIED BY '$PASSWORD'"

# Import the Database
mysql --user=root --password="$PASSWORD" "$DB_NAME" < "$DB_SOURCE"

fi

done

# Set proper ownership of site files
chown -R www-data /var/www/html/*
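
The import loop above runs silently. One way to get per-database feedback is to wrap the import in a function that reports the result. This is only a sketch: `import_one` and the `MYSQL_BIN` override are hypothetical names introduced here so the function can be tried without a real database server, and `PASSWORD` is the variable set at the top of the script.

```shell
# Sketch: import one dump and report whether it worked.
# MYSQL_BIN is a hypothetical override for dry runs; it defaults to `mysql`.
import_one() {
  local db_source db_name
  db_source=$1
  db_name=$(basename "$db_source" .sql)
  if "${MYSQL_BIN:-mysql}" --user=root --password="${PASSWORD:-}" "$db_name" < "$db_source"; then
    echo "Imported: $db_name"
  else
    echo "FAILED: $db_name" >&2
    return 1
  fi
}

# Possible use inside the loop, in place of the bare import command:
#   import_one "$DB_SOURCE" || FAILURES=$((FAILURES + 1))
```

Failures go to stderr and set a non-zero return status, so a calling script can count them or stop at the first one.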

TODO

These scripts are a good start, and allow us to build and check backup copies quite easily. There is room for further automation – ideally we’d like the process to be fully automated, integrated naturally into the backup process.

Our setup includes passwordless SSH keys, which allow easier rsync’ing; this has not been documented here.

Other enhancements might include:

  • Prevent selection of the ‘wrong’ backup directory
  • Auto creating the staging directory for the sql files transfer
  • Trigger the `import-databases` script from the host, so working copies are built with a single command
  • Better feedback on the `import-databases` script (there’s none at the moment!)
  • Document how to import users from original server to the guest machine
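
Two of these could probably be handled over SSH from the host. The sketch below assumes key-based SSH access to the guest and a sudoers rule that lets the guest user run import-databases; the function names and the `DRY_RUN` switch are hypothetical and are not in the repo.

```shell
# Sketch: prepare the staging directory and trigger the import from the host.
# ASSUMPTIONS: key-based SSH access to the guest, and the guest user may run
# import-databases via sudo. Set DRY_RUN=1 to print commands instead of running them.
GUEST=${GUEST:-david@192.168.1.145}

run_on_guest() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "ssh -t $GUEST $*"
  else
    ssh -t "$GUEST" "$@"   # -t gives sudo a terminal for its password prompt
  fi
}

# Call before the SQL rsync, so that the staging directory exists
prepare_staging() { run_on_guest mkdir -p staging/sql; }

# Call after both syncs have finished
import_on_guest() { run_on_guest sudo import-databases; }
```

With these in move-backups, `prepare_staging` would run before the SQL sync and `import_on_guest` before the VM is saved, so a single command on the host builds the working copies.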

Resources