Wednesday, February 13, 2013

Backing up SVN (Subversion) repository directories

When backing up Subversion (SVN) repositories, I find it best to use a bash shell script to search for the repositories.  Each one can then be passed to the svnadmin hotcopy command or the svnadmin dump command, so that every repository is backed up on its own.

First off, you should define a few variables at the top of your bash shell script.  The key one is ${BASE}, which defines the location of your SVN repositories.

# BASE should be location of SVN repositories (no trailing slash)
# such as: BASE=`pwd` or BASE="/var/svn"
BASE="/var/svn"


FIND=/usr/bin/find
GREP=/bin/grep
RM=/bin/rm
SED=/bin/sed
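
The no-trailing-slash rule matters because a later sed expression strips the "${BASE}/" prefix from each path, and an extra slash would likely stop that pattern from matching.  Here is a minimal sketch of a guard you could add.  Note that normalize_base is a made-up helper name for illustration, not part of the script shown here.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original script):
# strip trailing slashes from a path, but leave a lone "/" intact.
normalize_base() {
    local path="$1"
    while [ "${path}" != "/" ] && [ "${path%/}" != "${path}" ]; do
        path="${path%/}"
    done
    printf '%s\n' "${path}"
}

# Example: a BASE that was accidentally given a trailing slash.
BASE=$(normalize_base "/var/svn/")
echo "${BASE}"
```

With the sample value above, this prints /var/svn.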


Next is the bit of find/grep/sed magic that builds the list of repository directories.  In this particular case, we search for items named 'current' at a maximum depth of 3 below ${BASE} (which matches ${BASE}/<repository>/db/current, i.e. repositories sitting directly under ${BASE}), then make sure the full pathname ends in 'db/current'.  The sed commands strip the trailing '/db/current' and the leading "${BASE}/", leaving just the repository names.  Last, we sort the list so that we process things in alphabetical order.

DIRS=`$FIND "${BASE}" -maxdepth 3 -name current | \
    $GREP 'db/current$' | $SED 's:/db/current$::' | $SED "s:^${BASE}/::" | \
    sort`
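
To see what this pipeline produces, here is a small sketch that runs it against a throwaway directory tree.  The names projA, projB and notes are invented for the example, and it calls find, grep and sed by plain name rather than through the variables above so that it runs on systems where those paths differ.

```shell
#!/bin/bash
# Sketch: build a temporary tree that mimics two repositories plus a decoy.
# All directory names here are hypothetical.
BASE=$(mktemp -d)
mkdir -p "${BASE}/projB/db" "${BASE}/projA/db" "${BASE}/notes"
touch "${BASE}/projB/db/current" "${BASE}/projA/db/current"
touch "${BASE}/notes/current"   # named 'current' but not under db/, so grep drops it

# Same pipeline as above: find, filter on db/current, strip suffix and prefix, sort.
DIRS=$(find "${BASE}" -maxdepth 3 -name current | \
    grep 'db/current$' | sed 's:/db/current$::' | sed "s:^${BASE}/::" | \
    sort)

echo "${DIRS}"
rm -rf "${BASE}"
```

This prints projA and projB, one per line.  The notes directory is skipped because its 'current' file is not inside a db directory.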

As an alternative to processing in alphabetical order, you can use the following perl fragment to randomize the order of the directories.  The advantage is that if your backup script breaks partway through a run, the repositories near the bottom of the list are far less likely to end up badly out of date (their backups might be a few days old, but probably not a few months old).  This is an especially good idea if you are sending the backups out over a WAN link using rsync.

To speed up our backups, we also only search for repositories whose db/current file was modified on disk in the last 15 days (the -mtime -15 test).

DIRS=`$FIND "${BASE}" -maxdepth 3 -name current -mtime -15 | \
    $GREP 'db/current$' | $SED 's:/db/current$::' | $SED "s:^${BASE}/::" | \
    perl -MList::Util -e 'print List::Util::shuffle <>'`


The loop portion is simply (this particular example shows how to use "svnadmin verify"):

for DIR in ${DIRS}
do

    echo "verifying ${DIR}"
    svnadmin verify --quiet "${BASE}/${DIR}"
    status=$?
    if [ $status -ne 0 ]; then
        echo "svnadmin verify FAILED with status: $status"
    else
        echo "svnadmin verify succeeded"
    fi

    echo ""
done
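
The same loop shape works for the actual backup step.  Below is a rough sketch using "svnadmin dump", which writes the repository dump to standard output.  The names DEST, SVNADMIN and dump_repo are my own placeholders for illustration, and the destination path is only an example, so adjust them to your setup.

```shell
#!/bin/bash
# Sketch only: DEST, SVNADMIN and dump_repo are hypothetical names.
SVNADMIN="${SVNADMIN:-svnadmin}"
DEST="${DEST:-/tmp/svn-dumps}"   # example destination, an assumption

# Dump one repository to <dest_dir>/<name>.svndump; returns non-zero on failure.
dump_repo() {
    local repo_path="$1" dest_dir="$2"
    local name out
    name=$(basename "${repo_path}")
    out="${dest_dir}/${name}.svndump"

    mkdir -p "${dest_dir}" || return 1
    # Write to a temporary name first so a failed dump never replaces a good one.
    if "${SVNADMIN}" dump --quiet "${repo_path}" > "${out}.tmp"; then
        mv "${out}.tmp" "${out}"
    else
        rm -f "${out}.tmp"
        return 1
    fi
}

# DIRS and BASE are expected to be set as shown earlier; with DIRS empty
# this loop does nothing.
for DIR in ${DIRS}
do
    echo "dumping ${DIR}"
    if dump_repo "${BASE}/${DIR}" "${DEST}"; then
        echo "svnadmin dump succeeded"
    else
        echo "svnadmin dump FAILED for ${DIR}"
    fi
done
```

Writing to a .tmp file and renaming it afterwards means an interrupted run leaves the previous good dump in place.  If you prefer "svnadmin hotcopy", it takes the repository path and a new destination path as its two arguments instead of writing to standard output.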

Hope these tricks help.