There’s an old adage about backups:
There are two kinds of people: people who’ve never lost data, and people who’ll never lose data again.
If you’ve ever experienced data loss, you will instantly become passionate about backups. To prevent bad experiences with your data, you want backups that are comprehensive, manageable, versioned, automated, and secure. Let’s break that down:
- comprehensive: They should include everything by default. It’s certainly legitimate to exclude OS directories, temp files, etc. but you don’t want a system where you have to manually add directories as you add applications and data. Inevitably, you’ll forget something and not know it until it’s too late.
- manageable: If you have a 1TB server and take a full backup every day and retain them for a month, that’s 30TB in a month. You need a system that allows for regular pruning.
- versioned: A system that simply copies everything from system A to system B once a night is better than nothing, but if you trash a file on Monday and don’t notice until Thursday, you can’t recover it.
- automated: Because humans are lazy.
- secure: It’s annoying to be hacked. It’s heartbreaking to find the hacker also destroyed your backups.
In this tutorial, we’ll show you how to set up backups using rsnapshot. Quoting rsnapshot.org:
rsnapshot is a filesystem snapshot utility based on rsync. rsnapshot makes it easy to make periodic snapshots of local machines, and remote machines over ssh. The code makes extensive use of hard links whenever possible, to greatly reduce the disk space required.
This means if you have 500MB of files, you want to retain 30 days’ backups, and your change rate is 10% over that period, you don’t need 30 * 500 = 15,000MB but rather only 550MB. Beautifully, you still have point-in-time recovery (depending on your backup schedule) throughout that period.
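The arithmetic behind that comparison can be sketched in a few lines of shell. The variable names are made up for illustration, and the model is deliberately simple: one full copy plus the changed data, ignoring directory and metadata overhead.

```shell
# Figures from the example above; adjust to taste.
files_mb=500      # size of the data set in MB
days=30           # number of daily backups retained
change_pct=10     # percentage of the data that changes over the whole period

# Thirty independent full copies.
full_copies_mb=$(( files_mb * days ))

# One full copy plus only the changed data, since unchanged files are hard links.
hardlinked_mb=$(( files_mb + files_mb * change_pct / 100 ))

echo "full copies: ${full_copies_mb}MB"
echo "hard-linked: ${hardlinked_mb}MB"
```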
In this tutorial, we’ll set up the following:
- server1.lowend.party has a directory called /data with lots of valuable files.
- We want to back it up to backup.lowend.party using a scheme of hourly/daily/weekly/monthly backups. These are stored in /backups/server1.lowend.party
- backup.lowend.party has other hosts it backs up as well.
- We’re using passwordless ssh keys for authentication so we can run everything out of cron.
Before we start, there’s one more key concept.
Push vs. Pull Backups
I’ve long advocated pull backups. In other words, the backup server comes along and backs up the client. In this scenario, backup.lowend.party initiates the backups and contacts server1.lowend.party to get the data. This is in contrast to push backups, where server1.lowend.party contacts backup.lowend.party and pushes the backups to it.
What’s the difference? Imagine server1 is hacked. If we’re using push backups, it would be trivial for the hacker to use the passwordless ssh keys to nuke the backups as well. In a pull-based model, backup.lowend.party can authenticate to server1, but not vice-versa, so the hacker is out of luck.
Installing rsnapshot
On Debian, it’s as easy as
apt install rsnapshot
Configuring rsnapshot
rsnapshot’s config lives in /etc/rsnapshot.conf. I recommend making a backup of it before you start changing things:
mv /etc/rsnapshot.conf /etc/rsnapshot.conf.default
There are different philosophies about how to set up rsnapshot configs. I prefer to have a separate config file for each client (system being backed up). If you only have one system to back up, this is not necessary. You can back up multiple systems in one config file, but you lose some flexibility. Experiment and decide which you like. In my case, I do this:
cp /etc/rsnapshot.conf.default /etc/rsnapshot.conf.server1
Now modify as follows. Important Note: rsnapshot.conf requires TABs between elements. So “cmd /usr/bin/ssh” is “cmd<TAB>/usr/bin/ssh”.
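Since a stray space is easy to miss by eye, here is a rough sketch of a check for it. The function name find_space_separated is hypothetical, and the pattern only catches a keyword at the start of a line followed by a space, so treat it as a quick sanity check rather than a replacement for rsnapshot’s own configtest.

```shell
# Hypothetical helper: print (with line numbers) every config line whose
# keyword is followed by a space instead of a TAB. Comment lines are ignored
# because they start with '#'.
find_space_separated() {
    grep -nE '^[a-z_]+ ' "$1"
}

# Demo on a throwaway sample: the first line uses a TAB, the second a space.
sample=$(mktemp)
printf 'cmd_ssh\t/usr/bin/ssh\nverbose 4\n' > "$sample"
find_space_separated "$sample"
rm -f "$sample"
```

Run against a real file, for example find_space_separated /etc/rsnapshot.conf.server1, no output means no offending lines were found.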
Enable remote backups:
cmd_ssh /usr/bin/ssh
Add these backup intervals:
interval hourly 6
interval daily 7
interval weekly 4
interval monthly 3
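To get a feel for how many snapshot directories this retention scheme keeps around, the counts can be summed straight from a config file. This is only a sketch: total_snapshots is a hypothetical helper, it assumes TAB-separated fields, and it also accepts the keyword retain, which newer rsnapshot releases use in place of interval.

```shell
# Hypothetical helper: sum the retention counts of all interval lines.
# Fields are TAB-separated: keyword, name, count.
total_snapshots() {
    awk -F'\t' '$1 == "interval" || $1 == "retain" { n += $3 } END { print n + 0 }' "$1"
}

# Demo on a throwaway sample holding the four intervals from this tutorial.
sample=$(mktemp)
printf 'interval\thourly\t6\ninterval\tdaily\t7\ninterval\tweekly\t4\ninterval\tmonthly\t3\n' > "$sample"
total_snapshots "$sample"
rm -f "$sample"
```

For the scheme above that is 6 + 7 + 4 + 3 = 20 snapshot directories at most.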
I’m using a passwordless ssh key stored in /root/.ssh/backup. I also use a different ssh port. So make this change:
ssh_args -p 8989 -i ~/.ssh/backup
These two settings are for reporting (see below):
rsync_long_args --stats --delete --numeric-ids --relative --delete-excluded
verbose 4
Now I tell rsnapshot where to save backups:
snapshot_root /backups/server1.lowend.party/
Finally, I add the backup definition:
backup root@server1.lowend.party:/data/ .
This will keep files in /backups/server1.lowend.party/hourly.0, etc.
I want to exclude /data/cache on my backups:
exclude_file /etc/rsnapshot.server1.exclude
And in that file I put:
- /data/cache/*
OK, we’re ready to go. Now because I’m not using the default /etc/rsnapshot.conf name, I need to use the -c parameter for all rsnapshot commands. Let’s start by testing the config:
root@backup:/etc# rsnapshot -c /etc/rsnapshot.conf.server1 configtest
Syntax OK
Now we can run a simulation:
root@backup:/etc# rsnapshot -c /etc/rsnapshot.conf.server1 -t hourly
echo 9633 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /backups/hourly.0/
/usr/bin/rsync -a --stats --delete --numeric-ids --relative --delete-excluded \
    --exclude-from=/etc/rsnapshot.server1.exclude --rsh=/usr/bin/ssh -p 8989 \
    -i ~/.ssh/backup root@server1.lowend.party:/data/ \
    /backups/hourly.0/server1.lowend.party/
touch /backups/hourly.0/
One more thing to do. I like to use rsnapshot’s reporting tool, so let’s enable it:
cp /usr/share/doc/rsnapshot/examples/utils/rsnapreport.pl /usr/local/bin
chmod 755 /usr/local/bin/rsnapreport.pl
We’re good to go!
Running rsnapshot
On server1, I have 547MB in /data, and 30MB in /data/cache which will be excluded:
root@server1:~# du -sm /data
547 /data
root@server1:~# du -sm /data/cache
30 /data/cache
Let’s run our first rsnapshot backup:
root@backup:/backups/server1.lowend.party# rsnapshot -c /etc/rsnapshot.conf.server1 hourly
Setting locale to POSIX "C"
echo 10012 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /backups/server1.lowend.party/hourly.0/
/usr/bin/rsync -av --stats --delete --numeric-ids --relative \
    --delete-excluded --exclude-from=/etc/rsnapshot.server1.exclude \
    --rsh=/usr/bin/ssh -p 8989 -i ~/.ssh/backup \
    root@server1.lowend.party:/data/ \
    /backups/server1.lowend.party/hourly.0/.
receiving incremental file list
data/
<snipped>
data/cache/

Number of files: 10,982 (reg: 10,980, dir: 2)
Number of created files: 10,982 (reg: 10,980, dir: 2)
Number of deleted files: 0
Number of regular files transferred: 10,980
Total file size: 518,702,282 bytes
Total transferred file size: 518,702,282 bytes
Literal data: 518,702,282 bytes
Matched data: 0 bytes
File list size: 611,123
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 208,691
Total bytes received: 519,874,481

sent 208,691 bytes  received 519,874,481 bytes  80,012,795.69 bytes/sec
total size is 518,702,282  speedup is 1.00
touch /backups/server1.lowend.party/hourly.0/
rm -f /var/run/rsnapshot.pid
/usr/bin/logger -p user.info -t rsnapshot[10012] /usr/bin/rsnapshot -c \
    /etc/rsnapshot.conf.server1 hourly: completed successfully
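If you only care about one figure from that --stats summary, say for a quick check of how much data a run moved, a small filter can pull it out. This is a sketch under the assumption that the output format matches the run above; transferred_bytes is a hypothetical name, not part of rsnapshot.

```shell
# Hypothetical helper: read rsnapshot/rsync output on stdin and print the
# "Total transferred file size" as a plain number of bytes.
transferred_bytes() {
    awk -F': ' '/^Total transferred file size:/ {
        gsub(/,/, "", $2)        # drop the thousands separators
        sub(/ bytes.*/, "", $2)  # drop the unit
        print $2
    }'
}

# Demo using two lines copied from the run above.
printf 'Total file size: 518,702,282 bytes\nTotal transferred file size: 518,702,282 bytes\n' | transferred_bytes
```

In practice you would pipe a real run into it, along the lines of rsnapshot -c /etc/rsnapshot.conf.server1 hourly | transferred_bytes.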
Now I can also run that using the rsnapreport.pl script we set up. If I do, the output will look like this (the TOTAL MB is a little different because I ran these at different times):
# rsnapshot -c /etc/rsnapshot.conf.server1 hourly | /usr/local/bin/rsnapreport.pl
SOURCE                       TOTAL FILES  FILES TRANS  TOTAL MB  MB TRANS  LIST GEN TIME  FILE XFER TIME
--------------------------------------------------------------------------------------------------------
server1.lowend.party:/data/  11982        1            564.81    46.10     0.001 seconds  0.000 seconds
Now if I continue running hourly backups, I see new directories being created in /backups/server1.lowend.party:
drwxr-xr-x 3 root root 4096 Jul 12 16:03 hourly.0
drwxr-xr-x 3 root root 4096 Jul 12 16:01 hourly.1
drwxr-xr-x 3 root root 4096 Jul 12 15:58 hourly.2
Interestingly, hourly.0 is 500-odd MB, while the rest are only 1MB each. Why? Because the files in hourly.1, hourly.2, etc. are simply hard links to the same data as hourly.0, and du counts that data only once. This is a huge space savings.
If I nuke some files on server1’s /data and run a couple more backups, you’ll see this:
root@backup:/backups/server1.lowend.party# du -sm *
526 hourly.0
39 hourly.1
1 hourly.2
1 hourly.3
rsnapshot is retaining data in hourly.1 because it’s needed to reconstruct the backups for that hour.
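The hard-link behaviour is easy to reproduce in a scratch directory. The sketch below is a simplified model, not rsnapshot’s actual code path: it assumes a cp that supports -al (GNU cp does), and it imitates a changed file arriving by writing a new file and renaming it over the old one, which is how rsync replaces files by default.

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
mkdir "$work/hourly.0"
echo "version 1" > "$work/hourly.0/file.txt"

# Rotate: hourly.1 becomes a hard-linked copy of hourly.0 (no extra data blocks).
cp -al "$work/hourly.0" "$work/hourly.1"

# At this stage both names refer to the same inode.
if [ "$work/hourly.0/file.txt" -ef "$work/hourly.1/file.txt" ]; then
    linked_before="yes"
else
    linked_before="no"
fi

# Simulate a changed file arriving: write a new file, then rename it into place.
echo "version 2" > "$work/hourly.0/file.txt.tmp"
mv "$work/hourly.0/file.txt.tmp" "$work/hourly.0/file.txt"

newest=$(cat "$work/hourly.0/file.txt")
older=$(cat "$work/hourly.1/file.txt")
echo "linked before change: $linked_before"
echo "hourly.0: $newest"
echo "hourly.1: $older"
rm -rf "$work"
```

The older snapshot keeps the old contents because the rename only replaces the name in hourly.0; the original data stays on disk for as long as hourly.1 still links to it.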
Automating rsnapshot
Setting up automated backups is as easy as putting jobs in cron. For example:
MAILTO=you@somewhere.com
0 * * * * root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 hourly 2>&1 | /usr/local/bin/rsnapreport.pl
0 3 * * * root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 daily 2>&1 | /usr/local/bin/rsnapreport.pl
0 3 * * 1 root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 weekly 2>&1 | /usr/local/bin/rsnapreport.pl
30 2 1 * * root /usr/bin/rsnapshot -c /etc/rsnapshot.conf.server1 monthly 2>&1 | /usr/local/bin/rsnapreport.pl
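With one config file per client, the same four cron lines get repeated for every host. As a convenience, something like the following could print them for a given client. It is only a sketch: rsnapshot_cron_lines is a hypothetical helper, it reuses the schedule and paths from the example above unchanged, and it assumes configs are named /etc/rsnapshot.conf.<client>.

```shell
# Hypothetical helper: print system-crontab lines (with a user field) for one
# client. $1 is the suffix of the config file /etc/rsnapshot.conf.<client>.
rsnapshot_cron_lines() {
    conf="/etc/rsnapshot.conf.$1"
    report="/usr/local/bin/rsnapreport.pl"
    # Each entry is "<cron schedule>|<rsnapshot interval name>".
    for entry in "0 * * * *|hourly" "0 3 * * *|daily" "0 3 * * 1|weekly" "30 2 1 * *|monthly"; do
        schedule=${entry%|*}
        level=${entry#*|}
        echo "$schedule root /usr/bin/rsnapshot -c $conf $level 2>&1 | $report"
    done
}

rsnapshot_cron_lines server1
```

The output can be pasted into a file under /etc/cron.d once you have checked that the times suit your hosts.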
Can rsnapshot be used to back up an ENTIRE VPS?
I have two LEB special VPS’s hosted on RackNerd, and RackNerd doesn’t provide snapshots or backups. Both VPS’s have CyberPanel installed, and while CyberPanel can back up and restore websites without issue, there is no facility to back up CyberPanel itself. So if one of the VPS’s fails, it means a complete reinstall and manual reconfiguration.
–> How can I use rsnapshot to back up an entire (or enough of a) VPS such that after doing a complete reinstall, I could then restore everything completely?