LowEndBox - Cheap VPS, Hosting and Dedicated Server Deals

Incremental Remote Backups Using rsnapshot

Date/Time: April 17, 2021 @ 12:00 am, by raindog308

There’s an old adage about backups:

There are two kinds of people: people who’ve never lost data, and people who’ll never lose data again.

If you’ve ever experienced data loss, you will instantly become passionate about backups.  To prevent bad experiences with your data, you want backups that are comprehensive, manageable, versioned, automated, and secure.  Let’s break that down:

  • comprehensive: They should include everything by default.  It’s certainly legitimate to exclude OS directories, temp files, etc. but you don’t want a system where you have to manually add directories as you add applications and data.  Inevitably, you’ll forget something and not know it until it’s too late.
  • manageable: If you have a 1TB server and take a full backup every day and retain them for a month, that’s 30TB in a month.  You need a system that allows for regular pruning.
  • versioned: If you have a system that simply copies everything from system A to system B once a night, that’s better than nothing, but if you trash a file on Monday and don’t notice until Thursday, you can’t recover it.
  • automated: Because humans are lazy.
  • secure: It’s annoying to be hacked.  It’s heartbreaking to find the hacker also destroyed your backups.

In this tutorial, we’ll show you how to set up backups using rsnapshot.  Quoting rsnapshot.org:

rsnapshot is a filesystem snapshot utility based on rsync. rsnapshot makes it easy to make periodic snapshots of local machines, and remote machines over ssh. The code makes extensive use of hard links whenever possible, to greatly reduce the disk space required.

This means if you have 500MB of files, you want to retain 30 days’ backups, and your change rate is 10% over that period, you don’t need 30 * 500 = 15,000MB but rather only 550MB.  Beautifully, you still have point-in-time recovery (depending on your backup schedule) throughout that period.
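
To make that arithmetic concrete, here is a minimal sketch using the same figures.  It assumes the 10% change rate means roughly 10% of the data is stored a second time; the variable names are mine, and real usage will be a little higher because every snapshot also needs its own directory entries.

```shell
#!/bin/sh
# Figures from the example above: 500MB of data, 30 daily snapshots,
# 10% of the data changing over the whole retention period.
data_mb=500
snapshots=30
change_pct=10

# Naive approach: every snapshot is a full copy.
full_copies_mb=$((data_mb * snapshots))

# Hard-link approach: one full copy plus only the changed data.
changed_mb=$((data_mb * change_pct / 100))
hard_linked_mb=$((data_mb + changed_mb))

echo "full copies: ${full_copies_mb}MB"
echo "hard-linked: ${hard_linked_mb}MB"
```

Plug in your own data size and change rate to get a rough idea of how much disk the backup server needs.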

In this tutorial, we’ll set up the following:

  • server1.lowend.party has a directory called /data with lots of valuable files.
  • We want to back it up to backup.lowend.party using a scheme of hourly/daily/weekly/monthly backups.  The backups are stored in /backups/server1.lowend.party.
  • backup.lowend.party has other hosts it backs up as well.
  • We’re using passwordless ssh keys for authentication so we can run everything out of cron.

Before we start, there’s one more key concept.

Push vs. Pull Backups

I’ve long advocated pull backups.  In other words, the backup server comes along and backs up the client.  In this scenario, backup.lowend.party initiates the backups and contacts server1.lowend.party to get the data.  This is in contrast to push backups, where server1.lowend.party contacts backup.lowend.party and pushes the backups to it.

What’s the difference?  Imagine server1 is hacked.  If we’re using push backups, it would be trivial for the hacker to use the passwordless ssh keys to nuke the backups as well.  In a pull-based model, backup.lowend.party can authenticate to server1, but not vice-versa, so the hacker is out of luck.
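
The one-way trust can be sketched roughly as follows.  This is an illustration, not the exact steps I ran: it generates the key in a temporary directory instead of /root/.ssh, the ed25519 key type and the key comment are my assumptions, and the ssh-copy-id step is left as a comment.  The port and host name are the ones used later in this tutorial.

```shell
#!/bin/sh
# Run on the backup server. A temporary directory stands in for /root/.ssh
# so the sketch can be tried without touching real keys.
key_dir=$(mktemp -d)
key_file="$key_dir/backup"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -N "" sets an empty passphrase so cron can use the key unattended.
    ssh-keygen -q -t ed25519 -N "" -C "rsnapshot@backup" -f "$key_file"
    echo "public key to append to root's authorized_keys on server1:"
    cat "$key_file.pub"
else
    echo "ssh-keygen not found; install an OpenSSH client first"
fi

# Assumed next step (not run here): install the public key on server1 only, e.g.
#   ssh-copy-id -i "$key_file.pub" -p 8989 root@server1.lowend.party
# server1 never gets a key for backup.lowend.party, which keeps the trust one-way.
```

The private key never leaves the backup server, so compromising server1 gives an attacker nothing they can use to log in to backup.lowend.party.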

Installing rsnapshot

On Debian, it’s as easy as

apt install rsnapshot

Configuring rsnapshot

rsnapshot’s config lives in /etc/rsnapshot.conf.  I recommend making a backup of it before you start changing things:

mv /etc/rsnapshot.conf /etc/rsnapshot.conf.default

There are different philosophies about how to set up rsnapshot configs.  I prefer to have a separate config file for each client (system being backed up).  If you only have one system to back up, this is not necessary.  You can back up multiple systems in one config file, but you lose some flexibility.  Experiment and decide which you like.  In my case, I do this:

cp /etc/rsnapshot.conf.default /etc/rsnapshot.conf.server1

Now modify as follows.  Important Note: rsnapshot.conf requires TABs between elements.  So “cmd_ssh /usr/bin/ssh” is “cmd_ssh<TAB>/usr/bin/ssh”.
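
Since a stray space is easy to miss, a quick check like the one below can help.  It is only a rough sketch: the sample file and its three lines are made up for the demonstration, and it only flags lines that contain no TAB at all, so rsnapshot’s own configtest (shown later) remains the authoritative check.

```shell
#!/bin/sh
# Write a small sample config: two correct lines and one that uses a space.
conf=$(mktemp)
printf 'cmd_ssh\t/usr/bin/ssh\n'   >  "$conf"
printf 'interval\thourly\t6\n'     >> "$conf"
printf 'snapshot_root /backups/\n' >> "$conf"   # wrong: space, not TAB

# Report every non-blank, non-comment line that contains no TAB at all.
tab=$(printf '\t')
bad_lines=$(grep -n -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$conf" \
    | grep -v "$tab" || true)
echo "$bad_lines"
```

Point conf at your real config file instead of the sample to list suspicious lines with their line numbers.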

Enable remote backups:

cmd_ssh /usr/bin/ssh

Add these backup intervals:

interval hourly 6
interval daily 7
interval weekly 4
interval monthly 3
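
As a rough model of what a retention count such as “hourly 6” means, the sketch below rotates empty placeholder directories in a temporary location.  It is a simplification for illustration, not rsnapshot’s actual code: the real tool also hard-links and rsyncs the data, and the function name and marker files are mine.

```shell
#!/bin/sh
# Simplified model of rotation for "interval hourly 6".
root=$(mktemp -d)
retain=6

rotate_hourly() {
    # Drop the oldest snapshot once the retention limit is reached.
    rm -rf "$root/hourly.$((retain - 1))"
    # Shift hourly.4 -> hourly.5, ..., hourly.0 -> hourly.1.
    i=$((retain - 2))
    while [ "$i" -ge 0 ]; do
        if [ -d "$root/hourly.$i" ]; then
            mv "$root/hourly.$i" "$root/hourly.$((i + 1))"
        fi
        i=$((i - 1))
    done
    # The newest snapshot always lands in hourly.0.
    mkdir "$root/hourly.0"
    echo "run $1" > "$root/hourly.0/marker"
}

# Simulate eight hourly runs; only the six newest survive.
run=1
while [ "$run" -le 8 ]; do
    rotate_hourly "$run"
    run=$((run + 1))
done

ls "$root"
cat "$root/hourly.0/marker" "$root/hourly.5/marker"
```

After eight runs the directory holds hourly.0 through hourly.5, with the first two runs already pruned.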

I’m using a passwordless ssh key stored in /root/.ssh/backup.  I also use a different ssh port.  So make this change:

ssh_args -p 8989 -i ~/.ssh/backup

These two settings are for reporting (see below):

rsync_long_args --stats --delete --numeric-ids --relative --delete-excluded
verbose 4

Now I tell rsnapshot where to save backups:

snapshot_root /backups/server1.lowend.party/

Finally, I add the backup definition:

backup root@server1.lowend.party:/data/ .

This will keep files in /backups/server1.lowend.party/hourly.0, etc.

I want to exclude /data/cache on my backups:

exclude_file /etc/rsnapshot.server1.exclude

And in that file I put:

- /data/cache/*

OK, we’re ready to go.  Now because I’m not using the default /etc/rsnapshot.conf name, I need to use the -c parameter for all rsnapshot commands.  Let’s start by testing the config:

root@backup:/etc# rsnapshot -c /etc/rsnapshot.conf.server1 configtest
Syntax OK
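
With one config file per client, it can be handy to test them all in one go.  The loop below is a sketch under a few assumptions: the check_configs function, the RSNAPSHOT override variable, and the server2 file are hypothetical names I made up for the demonstration, which uses placeholder files and “true” in place of the real binary.

```shell
#!/bin/sh
# check_configs DIR: run "configtest" on every per-client config in DIR.
# RSNAPSHOT is a hypothetical override so the loop can be tried without
# rsnapshot installed; when it is unset the real binary is called.
check_configs() {
    dir=$1
    failed=0
    for conf in "$dir"/rsnapshot.conf.*; do
        [ -e "$conf" ] || continue      # no per-client configs at all
        case $conf in
            *.default) continue ;;      # skip the pristine copy
        esac
        if "${RSNAPSHOT:-rsnapshot}" -c "$conf" configtest; then
            echo "OK:     $conf"
        else
            echo "FAILED: $conf"
            failed=1
        fi
    done
    return "$failed"
}

# Demo with placeholder files and "true" standing in for rsnapshot.
demo=$(mktemp -d)
touch "$demo/rsnapshot.conf.default" \
      "$demo/rsnapshot.conf.server1" \
      "$demo/rsnapshot.conf.server2"
RSNAPSHOT=true
check_configs "$demo"
```

On a real backup server you would leave RSNAPSHOT unset and call check_configs /etc instead.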

Now we can run a simulation:

root@backup:/etc# rsnapshot -c /etc/rsnapshot.conf.server1 -t hourly
echo 9633 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /backups/hourly.0/
/usr/bin/rsync -a --stats --delete --numeric-ids --relative --delete-excluded \
--exclude-from=/etc/rsnapshot.server1.exclude --rsh=/usr/bin/ssh -p 8989 \
-i ~/.ssh/backup root@server1.lowend.party:/data/ \
/backups/hourly.0/server1.lowend.party/
touch /backups/hourly.0/

One more thing to do.  I like to use rsnapshot’s reporting tool, so let’s enable it:

cp /usr/share/doc/rsnapshot/examples/utils/rsnapreport.pl /usr/local/bin
chmod 755 /usr/local/bin/rsnapreport.pl 

We’re good to go!

Running rsnapshot

On server1, I have 547MB in /data, and 30MB in /data/cache which will be excluded:

root@server1:~# du -sm /data
547 /data
root@server1:~# du -sm /data/cache
30 /data/cache

Let’s run our first rsnapshot backup:

root@backup:/backups/server1.lowend.party# rsnapshot -c /etc/rsnapshot.conf.server1 hourly
Setting locale to POSIX "C"
echo 10012 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /backups/server1.lowend.party/hourly.0/
/usr/bin/rsync -av --stats --delete --numeric-ids --relative \
--delete-excluded --exclude-from=/etc/rsnapshot.server1.exclude \
--rsh=/usr/bin/ssh -p 8989 -i ~/.ssh/backup \
root@server1.lowend.party:/data/ \
/backups/server1.lowend.party/hourly.0/.
receiving incremental file list
data/
<snipped>
data/cache/

Number of files: 10,982 (reg: 10,980, dir: 2)
Number of created files: 10,982 (reg: 10,980, dir: 2)
Number of deleted files: 0
Number of regular files transferred: 10,980
Total file size: 518,702,282 bytes
Total transferred file size: 518,702,282 bytes
Literal data: 518,702,282 bytes
Matched data: 0 bytes
File list size: 611,123
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 208,691
Total bytes received: 519,874,481

sent 208,691 bytes received 519,874,481 bytes 80,012,795.69 bytes/sec
total size is 518,702,282 speedup is 1.00
touch /backups/server1.lowend.party/hourly.0/
rm -f /var/run/rsnapshot.pid
/usr/bin/logger -p user.info -t rsnapshot[10012] /usr/bin/rsnapshot -c \
/etc/rsnapshot.conf.server1 hourly: completed successfully

Now I can also run that using the rsnapreport.pl script we set up.  If I do, the output will look like this (the TOTAL MB is a little different because I ran these at different times):

# rsnapshot -c /etc/rsnapshot.conf.server1 hourly | /usr/local/bin/rsnapreport.pl
SOURCE                       TOTAL FILES  FILES TRANS  TOTAL MB  MB TRANS  LIST GEN TIME  FILE XFER TIME
--------------------------------------------------------------------------------------------------------------------
server1.lowend.party:/data/  11982        1            564.81    46.10     0.001 seconds  0.000 seconds

Now if I continue running hourly backups, I see new directories being created in /backups/server1.lowend.party:

drwxr-xr-x 3 root root 4096 Jul 12 16:03 hourly.0
drwxr-xr-x 3 root root 4096 Jul 12 16:01 hourly.1
drwxr-xr-x 3 root root 4096 Jul 12 15:58 hourly.2

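Interestingly, hourly.0 is 500-odd MB, while the rest are only 1MB.  Why?  Because the unchanged files in hourly.1, hourly.2, etc. are simply hard links to the same data as hourly.0, and du counts that data only once, under the first directory it scans.  This is a huge space savings.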
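
If hard links are new to you, this small experiment shows the effect outside of rsnapshot.  It is a sketch in a temporary directory with a made-up file, not something rsnapshot runs itself.

```shell
#!/bin/sh
# Build two tiny "snapshots" that share one file through a hard link.
root=$(mktemp -d)
mkdir "$root/hourly.0" "$root/hourly.1"
echo "valuable data" > "$root/hourly.0/file.txt"
ln "$root/hourly.0/file.txt" "$root/hourly.1/file.txt"

# ls -i prints the inode number first; both names point at the same inode.
inode_new=$(ls -i "$root/hourly.0/file.txt" | awk '{print $1}')
inode_old=$(ls -i "$root/hourly.1/file.txt" | awk '{print $1}')
echo "hourly.0 inode: $inode_new"
echo "hourly.1 inode: $inode_old"

# du counts each inode once per run, so the directory listed first is
# charged for the shared file and the later one looks almost empty.
du -sk "$root/hourly.0" "$root/hourly.1"
```

Both names refer to a single copy of the data on disk, which is why the older snapshots cost almost nothing until files change.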

If I nuke some files on server1’s /data and run a couple more backups, you’ll see this:

root@backup:/backups/server1.lowend.party# du -sm *
526 hourly.0
39 hourly.1
1 hourly.2
1 hourly.3

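hourly.1 has grown because it now holds the only remaining links to the files I deleted: they are gone from hourly.0, but rsnapshot retains them in hourly.1 so the backup for that hour can still be reconstructed.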
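
The same behavior can be reproduced by hand.  In this sketch (temporary directory, made-up file name) the newest snapshot loses a file while the older one keeps it:

```shell
#!/bin/sh
# hourly.1 is the older snapshot; hourly.0 starts as a hard link to its file.
root=$(mktemp -d)
mkdir "$root/hourly.0" "$root/hourly.1"
echo "report" > "$root/hourly.1/report.txt"
ln "$root/hourly.1/report.txt" "$root/hourly.0/report.txt"

# The file is deleted on the client, so the next run drops it from hourly.0.
rm "$root/hourly.0/report.txt"

# The older snapshot still holds the data, now as the only remaining link.
cat "$root/hourly.1/report.txt"
ls "$root/hourly.0"
```

Removing one name only frees the disk space once no snapshot links to the data any more, which happens when the last snapshot containing it is rotated out.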

Automating rsnapshot

Setting up automated backups is as easy as putting jobs in cron.  For example, in /etc/crontab or a file under /etc/cron.d (the format that includes a user field):

MAILTO=you@somewhere.com
# Staggered start times keep jobs from colliding on rsnapshot's lockfile.
0 * * * * root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 hourly 2>&1 | /usr/local/bin/rsnapreport.pl
50 2 * * * root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 daily 2>&1 | /usr/local/bin/rsnapreport.pl
40 2 * * 1 root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 weekly 2>&1 | /usr/local/bin/rsnapreport.pl
30 2 1 * * root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 monthly 2>&1 | /usr/local/bin/rsnapreport.pl


I'm Andrew, techno polymath and long-time LowEndTalk community Moderator. My technical interests include all things Unix, perl, python, shell scripting, and relational database systems. I enjoy writing technical articles here on LowEndBox to help people get more out of their VPSes.

1 Comment

  1. Can rsnapshot be used to backup an ENTIRE VPS?

    I have two LEB special VPS’s hosted on RackNerd, and RackNerd doesn’t provide snapshots or backups. Both VPS’s have CyberPanel installed, and while CyberPanel can back up and restore websites without issue, there is no facility to back up CyberPanel itself. So if one of the VPS’s fails, it means a complete reinstall and manual reconfiguration.

    –> How can I use rsnapshot to back up an entire (or enough of a) VPS such that after doing a complete reinstall, I could then restore everything completely?

    May 12, 2021 @ 8:08 am
