Backing up a web server

From SoftwarePractice.org

Here's how I back up a web server to StrongSpace. This page assumes that you have root access to the server and that it was built roughly as described in Building a web server and Configuring a web server. Some of this overlaps with the Backing up to StrongSpace article.

Set up accounts

I use the apache and mysql users to do backups within their own file system space. This is so that cron jobs can read from /usr/www and /usr/var (resp.) without being run as root.

Create an empty home directory for each user (shown here for apache; repeat for mysql):

mkdir /home/apache
chown apache:apache /home/apache

Each of these users needs to be able to log into StrongSpace. Assuming that this is not the first such user, we need to generate a new key and append it to the authorized_keys file on StrongSpace. In the following, ssuser is the account name on StrongSpace:

su apache
cd
ssh-keygen -t rsa    # press Enter at the prompts for an empty passphrase
scp ssuser@ssuser.strongspace.com:.ssh/authorized_keys .
cat authorized_keys .ssh/id_rsa.pub > tmpfile
scp tmpfile ssuser@ssuser.strongspace.com:.ssh/authorized_keys
rm tmpfile authorized_keys
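
Running the steps above twice would append the same key twice. A minimal sketch of an idempotent append, assuming the public key file holds a single line; the function name append_key_once is hypothetical and not part of the original procedure:

```shell
#!/bin/sh
# Sketch only: append_key_once is a hypothetical helper name.
# Append the key in PUBKEY_FILE to AUTH_FILE unless an identical line is already there.
append_key_once() {
    pubkey_file=$1
    auth_file=$2
    # Make sure the target exists so grep has something to read
    touch "$auth_file"
    # -x: match whole lines, -F: fixed strings, -f: read the pattern from the key file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

You would call it on the downloaded copy, for example append_key_once .ssh/id_rsa.pub authorized_keys, and then scp authorized_keys back to StrongSpace.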

The easy way

The easy way simply does a full dump of the database every night and rsyncs it over to StrongSpace. It does at least remove backups over a week old -- presumably you will notice if there is a problem before then!

I am going to run the backups as the apache user. This is so that it gets access to all of the web directories.

  1. Set up the home directory if it doesn't exist yet.
    mkdir /home/apache
    cd /home/apache
    mkdir scripts backups
    chmod -R o-r,o-x .
    chown -R apache:apache .
    
  2. Write a script to do the mysql dumps:
    #!/bin/sh
    for db in \
            "dbname1 dbuser1 dbpass1" \
            "dbname2 dbuser2 dbpass2"
    do
      set -- $db
      /usr/local/bin/mysqldump --opt --skip-add-locks \
         --user=$2 --password=$3 $1 \
         | gzip > /home/apache/backups/${1}-`date "+%Y-%m-%d"`.sql.gz
    done
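    # Remove dumps more than a week old (quoting the pattern avoids an error when no .gz files exist)
    cd /home/apache/backups || exit 1
    /usr/bin/find . -maxdepth 1 -name '*.gz' -mtime +7 -delete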
    

    Save it as /home/apache/scripts/daily_backup.sh. Make sure you chown it to apache:apache and make it executable (chmod u+x), since the cron job below runs it directly.

  3. Write a script to do the rsync, and save it as /home/apache/scripts/remote_backup.sh. There are two loops in this script: one backs up arbitrary directories anywhere on the server (they have to be readable by the apache user), and one backs up directories in the web server space. All of these backups are put into the "myserver" directory on StrongSpace.
    #!/bin/sh
    #
    for dir in \
            "/home/apache/backups databases" \
            "/usr/local/apache2/conf apache2-conf"
    do
            set -- $dir
            /usr/local/bin/rsync -rtz --delete ${1}/* ssuser@ssuser.strongspace.com:myserver/$2
    done
    for webdir in \
            "domain1.com uploads" \
            "domain2.com images/avatars"
    do
            set -- $webdir
            origin=/usr/www/${1}/public_html/${2}
            target=myserver/${1}-`echo $2 | sed -e 's/\//-/g'`/
            /usr/local/bin/rsync -rtz $origin/* ssuser@ssuser.strongspace.com:${target}
    done
    
  4. Create a cron job to run these scripts. As the apache user, run 'crontab -e' and add the following lines:
    30 03 * * * /home/apache/scripts/daily_backup.sh
    00 04 * * * /home/apache/scripts/remote_backup.sh
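
Two pieces of the scripts above are easy to get wrong: the week-old cleanup and the remote directory name built with sed. A minimal sketch of both in isolation, so they can be tried on a scratch directory first; prune_old and remote_target are hypothetical names, and -maxdepth and -delete are assumed to be supported by your find (they are in GNU and BSD find):

```shell
#!/bin/sh
# Sketch only: prune_old and remote_target are hypothetical helper names.

# Delete *.gz files directly inside DIR that find's "-mtime +DAYS" test
# considers older than DAYS days.
prune_old() {
    dir=$1
    days=$2
    find "$dir" -maxdepth 1 -type f -name '*.gz' -mtime +"$days" -delete
}

# Build the remote directory name for a web subdirectory:
# slashes in the subdirectory become dashes.
remote_target() {
    domain=$1
    subdir=$2
    printf 'myserver/%s-%s/\n' "$domain" \
        "$(printf '%s' "$subdir" | sed -e 's/\//-/g')"
}
```

For example, remote_target domain2.com images/avatars prints myserver/domain2.com-images-avatars/, which is the directory the second loop syncs into.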
    

Using binary logs

The problem with the above is that it does a full dump of the database(s) every time, and also copies them across to the backup server (StrongSpace, in my case). For a large database, this means you are copying over a lot of data that hasn't changed.

What we need is a better strategy. We can use MySQL's binary logging facility to allow much more frequent and efficient backups.

  • Every week, generate a full backup (and sync to backup server).
  • Every three hours, flush the binary log files (and sync to backup server).

This strategy should ensure that we never lose more than three hours of data, while also reducing the cost of a full backup to once per week. In the following, I assume that the server is running for the purpose of hosting a single reasonably large site, so there is only one database that needs to have this strategy applied, which I'll call here maindb. (Any other databases I'll assume are either archived or non-critical and can be backed up using regular dumps.)

  1. Create a MySQL user to generate backups. There is no need to use the root user for this. In phpMyAdmin, create a new user called, say, bacman, and give it the global privileges RELOAD, SUPER, and REPLICATION CLIENT, plus the database-specific privileges LOCK TABLES and SELECT on maindb.
  2. Tell MySQL to generate binary logs for maindb. In /etc/my.cnf, add (or edit existing lines):
    log-bin=/usr/var/BINLOGS/myserver-bin
    binlog-do-db=maindb
    

    Note: The directory /usr/var/BINLOGS is where the binary logs will go. (I use this directory because /usr/var is where I told MySQL to put its data files.) You must create this directory yourself and set its ownership to the mysql user.

  3. Restart MySQL:
    /usr/local/mysql/share/mysql/mysql.server stop
    /usr/local/mysql/share/mysql/mysql.server start
    

    You will see a numbered log file such as myserver-bin.000001, along with myserver-bin.index, in your /usr/var/BINLOGS directory. To check that this works as expected, use phpMyAdmin to change something in maindb. Also change something in a different database. Use the Binary Log function in phpMyAdmin and verify that the log shows only the change to maindb. (Alternatively, use mysqlbinlog.)

  4. Figure out how to put your web applications into "maintenance mode." Since we are only going to do this once a week, it doesn't hurt too much to stop serving content for a few minutes. (Hint 1: look in your application's configuration table. Hint 2: check your application's configuration file.) In the case of SMF, for example, the flag is set in the Settings.php file. In the case of Coppermine, the offline flag in the cpg146_config table needs to be set.
  5. Write scripts to put the relevant applications into (and out of) maintenance mode. Mine, go_offline.sh and go_online.sh in /home/apache/scripts, look like this. (Rather than modifying the database field for Coppermine, I simply hack the code where it checks that field.)
    cd /usr/www/www.mysite.com/public_html/forums
    sed -e 's/maintenance = 0/maintenance = 1/' < Settings.php > tmpfile; mv tmpfile Settings.php
    cd /usr/www/www.mysite.com/public_html/gallery/include
    sed -e "s/CONFIG\['offline'\]/CONFIG\['offline'\]==0/" < init.inc.php > tmpfile; mv tmpfile init.inc.php
    

    and

    cd /usr/www/www.mysite.com/public_html/forums
    sed -e 's/maintenance = 1/maintenance = 0/' < Settings.php > tmpfile; mv tmpfile Settings.php
    cd /usr/www/www.mysite.com/public_html/gallery/include
    sed -e "s/CONFIG\['offline'\]==0/CONFIG\['offline'\]/" < init.inc.php > tmpfile; mv tmpfile init.inc.php
    
  6. Write the weekly_backup.sh script. This does a full dump while the site is in maintenance mode. The call to mysqldump also flushes the binary logs. The script then removes backups older than 30 days and syncs to the backup server. (Because rsync runs with --delete, files removed locally are removed from the backup server as well, so the 30-day limit applies there too.) Note also that compression is done as a separate step, after the site is back online, to minimize downtime.
    sh /home/apache/scripts/go_offline.sh
    /usr/local/mysql/bin/mysqldump --delete-master-logs \
           -u bacman -pbacmanpasswd maindb \
         > /home/apache/backups/maindb-`date "+%Y-%m-%d"`.sql
    sh /home/apache/scripts/go_online.sh
    cd /home/apache/backups
    nice gzip maindb-`date "+%Y-%m-%d"`.sql
    /usr/bin/find . -maxdepth 1 -name '*.gz' -mtime +30 -delete
    /usr/local/bin/rsync -rtz --delete /home/apache/backups/ ssuser@ssuser.strongspace.com:myserver/weekly
    
  7. Write the threehour_backup.sh script. This script simply flushes the binary logs and then does an rsync to a directory allocated for binary logs on the backup server. The catch is that the apache user can't read the MySQL data directory, so this script lives in the mysql user's filespace (/home/mysql/scripts) and runs from the mysql user's crontab instead.
    /usr/local/mysql/bin/mysqladmin -u bacman -pbacmanpasswd flush-logs
    /usr/local/bin/rsync -rtz --delete /usr/var/BINLOGS/ ssuser@ssuser.strongspace.com:myserver/binlogs
    
  8. Remove the database backups (the /home/apache/backups line) from remote_backup.sh, since weekly_backup.sh now syncs that directory itself.
  9. Modify crontabs. For the apache user, you want something like this. This backs up the web server filespace once a day, and does a full database dump and backup on Sunday mornings.
    00 02 * * sun sh /home/apache/scripts/weekly_backup.sh
    30 02 * * * sh /home/apache/scripts/remote_backup.sh
    

    For the mysql user, you want something like this, which does a binary log backup every three hours.

    10 */3 * * * sh /home/mysql/scripts/threehour_backup.sh
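
The go_offline.sh and go_online.sh scripts in step 5 are the same sed edit run in opposite directions. A minimal sketch of the SMF half as one function, assuming the flag appears in the file as "maintenance = 0" or "maintenance = 1" as in the sed commands above; set_maintenance is a hypothetical name:

```shell
#!/bin/sh
# Sketch only: set_maintenance is a hypothetical helper name.
# Usage: set_maintenance FILE 0|1
set_maintenance() {
    file=$1
    new=$2
    case $new in
        0) old=1 ;;
        1) old=0 ;;
        *) echo "usage: set_maintenance FILE 0|1" >&2; return 2 ;;
    esac
    tmp=$(mktemp "${TMPDIR:-/tmp}/maint.XXXXXX") || return 1
    status=0
    # Edit into a temp file, then copy the contents back so the
    # original file keeps its owner and permissions.
    if sed -e "s/maintenance = $old/maintenance = $new/" "$file" > "$tmp"; then
        cat "$tmp" > "$file" || status=1
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

With this, going offline would be set_maintenance /usr/www/www.mysite.com/public_html/forums/Settings.php 1 and coming back online the same call with 0.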
    


Reconstructing the database

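To reconstruct the database, restore the most recent weekly dump and then use the mysqlbinlog utility to replay the binary logs written since that dump.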
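
A rough sketch of what that could look like, under several assumptions that are not from the original procedure: the dump and the binary logs have already been copied back from the backup server, the paths in the index file are valid on the machine doing the restore and contain no spaces, and you restore as the MySQL root user. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names, paths, and the use of the MySQL root
# account are assumptions.

# Print the binary log files named in an index file, in order.
# The index lists one log file per line, oldest first; blank lines are skipped.
binlogs_in_order() {
    index_file=$1
    while IFS= read -r logfile; do
        if [ -n "$logfile" ]; then
            printf '%s\n' "$logfile"
        fi
    done < "$index_file"
}

# Load the gzipped weekly dump, then replay the binary logs on top of it.
# mysql prompts for the password each time because of -p.
restore_maindb() {
    dump=$1
    index_file=$2
    gunzip -c "$dump" | mysql -u root -p maindb || return 1
    # Unquoted on purpose: each log file becomes its own argument
    mysqlbinlog $(binlogs_in_order "$index_file") | mysql -u root -p maindb
}
```

Passing all the logs to a single mysqlbinlog invocation, rather than one invocation per file, keeps the replay in one mysql session.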


See also

Securing a web server

Optimizing a web server