LowEndBox - Cheap VPS, Hosting and Dedicated Server Deals

Backing Up Your VPS With b2 and rclone

b2 is cloud storage from Backblaze. Like Amazon S3, Microsoft Azure, or Google Cloud Storage, it provides an API for storing files in the cloud and does so at a substantially cheaper price than its competitors.  In this tutorial, we’ll show how you can use b2 for backing up your VPS.

Understanding B2 Pricing

There are three components to B2 pricing:

  • The monthly storage cost.
  • The transaction costs (delete actions, list actions, etc.)
  • Download costs. As of this tutorial, there is no charge for upload.

The transaction costs are minimal (less than a penny per 1,000 operations such as delete), so we will ignore those. Download costs are only relevant if you are restoring a backup, which hopefully will be a rare event. This is where S3, Azure, and Google can be very expensive, because you pay their standard outbound transfer (egress) rates.

For my backups, I model the monthly storage costs (see full pricing details on Backblaze.com), since that’s what I’m going to see on my bill. I rarely restore, and typically it’s a few files.
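Modeling the storage line item only takes the backup size and the per-GB monthly rate. The following is a minimal sketch: the function name estimate_monthly_cost is hypothetical, and the rate in the usage comment is a placeholder rather than Backblaze's actual price, so substitute the current figure from Backblaze.com.

```shell
# Estimate monthly storage cost: size in GB times the per-GB monthly rate.
# Both arguments are supplied by you; no real B2 price is built in.
estimate_monthly_cost() {
    size_gb=$1
    rate_per_gb=$2
    awk -v gb="$size_gb" -v rate="$rate_per_gb" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# Example with a placeholder rate of 0.005 per GB per month:
#   estimate_monthly_cost 100 0.005
```

Running du -sh on the directories you plan to back up gives a reasonable starting number for the size argument.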

Getting the B2 Tools

In this tutorial, we’ll be using the b2 command line tool and rclone.

After signing up on backblaze.com, you will be assigned a Master Application Key. For this tutorial, I’ll create a separate Application Key rather than use the master key. It’s important to note that while you can retrieve the keyID at any time by logging into your account, you can only see the applicationKey once, at the time you create it.

Next, set up the b2 command line tool:

apt-get install python-pip
pip install b2

To use the b2 cli, you first have to authorize the client. I will replace my Key ID with XXXXX and my Application Key with YYYYY.

# b2 authorize-account XXXXX YYYYY
Using https://api.backblazeb2.com

You’ll now find a file called ~/.b2_account_info has been created. This is a sqlite3 database with configuration information. You can take a look at it with sqlite3 (e.g., sqlite3 ~/.b2_account_info) but don’t modify anything. You can move this file if you want, but if you change it to a non-default location you need to set the B2_ACCOUNT_INFO environment variable so the b2 cli knows where to find it.
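If you do decide to relocate the file, the move and the environment variable need to go together. Here is a small sketch of that; the function name relocate_b2_account_info and the destination path in the usage comment are hypothetical choices, not anything the b2 tool requires.

```shell
# Move the b2 account database to a new path (if it exists at the default
# location) and point the b2 CLI at it via B2_ACCOUNT_INFO.
relocate_b2_account_info() {
    new_path=$1
    default_path="$HOME/.b2_account_info"
    mkdir -p "$(dirname "$new_path")"
    if [ -f "$default_path" ]; then
        mv "$default_path" "$new_path"
    fi
    export B2_ACCOUNT_INFO="$new_path"
}

# Usage (the destination path is only an example):
#   relocate_b2_account_info "$HOME/.config/b2/account_info"
```

The export only lasts for the current shell, so you would also set B2_ACCOUNT_INFO in your shell profile or at the top of any backup script that calls b2.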

Picking a B2 Bucket Name

We’ll create a bucket next, which is where you’ll store your files. You can create as many buckets as you like – perhaps one for each VPS, one for different filesystems or projects, etc.

It’s important to realize that the b2 namespace is completely flat. You will never get a bucket named ‘backup’ because there are 7 billion other humans fighting for that name. Bucket names have a minimum of 6 and a maximum of 50 characters and consist of letters, digits, and hyphens. They also cannot start with b2-.

One trick to make sure you have a consistent naming standard is to prefix your bucket names with something long, like a UUID. A v4 UUID is 36 characters, so a UUID plus a trailing hyphen takes 37 of the 50 allowed, which leaves 13 characters for the rest of the bucket name, and you can rest assured it is available.

Using an online UUID generator (or install the uuid package and run uuid -v4), I chose this bucket name:

5e73a641-5685-4ab5-9276-9f480ebabd2e-backup

I could also remove the hyphens if I wanted more space (a UUID without hyphens is 32 characters).
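The naming rules above (6 to 50 characters, only letters, digits, and hyphens, no b2- prefix) are easy to check before you try to create the bucket. This is a sketch with two hypothetical helper functions, make_bucket_name and valid_bucket_name; it only encodes the rules stated in this tutorial and does not check whether the name is already taken.

```shell
# Join a UUID prefix and a short suffix into a bucket name.
make_bucket_name() {
    printf '%s-%s\n' "$1" "$2"
}

# Return 0 if the name satisfies the rules described above, 1 otherwise.
valid_bucket_name() {
    name=$1
    len=${#name}
    # Length must be between 6 and 50 characters.
    [ "$len" -ge 6 ] && [ "$len" -le 50 ] || return 1
    case $name in
        b2-*) return 1 ;;              # reserved prefix
        *[!A-Za-z0-9-]*) return 1 ;;   # anything other than letters, digits, hyphens
    esac
    return 0
}

# Usage:
#   name=$(make_bucket_name "$(uuid -v4)" backup)
#   valid_bucket_name "$name" && echo "ok: $name"
```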

Creating the Bucket

Let’s use the b2 command line tool to create a bucket. You could also create one on backblaze.com. By default, a new bucket has a lifecycle setting that retains previous versions of files. In our case, we don’t want that. We can either create the bucket on backblaze.com and set the lifecycle rules there, or pass some JSON on the b2 command line to accomplish the same thing. (I’ve split the long line here with backslashes, which the shell treats as line continuations; you can type it as shown, or all on one line without the backslashes.)

# b2 create-bucket --lifecycleRules \
'[{ "daysFromHidingToDeleting": 1, "daysFromUploadingToHiding": null, "fileNamePrefix":"" }]' \
5e73a641-5685-4ab5-9276-9f480ebabd2e-backup allPrivate
8be70287ff4bc9157a1e0915

That command says “create a bucket, set the lifecycle to keep only the most recent version of each file, and make all files in the bucket private”. The command returns the bucket ID.

If this looks confusing, set up a bucket on Backblaze.com, and then use b2 get-bucket to see the JSON for that particular config.
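If you create several buckets, quoting that JSON by hand each time is error-prone. One option is a tiny helper that prints the rule; lifecycle_rules is a hypothetical name, BUCKET is a placeholder, and the create-bucket form in the usage comment simply mirrors the command shown in this tutorial (newer releases of the b2 tool may spell the subcommand and flag differently, so check b2 --help on your version).

```shell
# Print a lifecycle rule that deletes hidden (old) versions after the given
# number of days and never hides files automatically.
lifecycle_rules() {
    days=$1
    printf '[{ "daysFromHidingToDeleting": %s, "daysFromUploadingToHiding": null, "fileNamePrefix": "" }]\n' "$days"
}

# Usage, following the create-bucket form shown in this tutorial:
#   b2 create-bucket --lifecycleRules "$(lifecycle_rules 1)" "$BUCKET" allPrivate
#   b2 get-bucket "$BUCKET"
```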

Setting Up rclone

Installing rclone is easy:

apt-get install rclone

And now run rclone config to set it up:

# rclone config
2020/04/28 18:22:30 NOTICE: Config file "/root/.config/rclone/rclone.conf" not found - using defaults
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> b2
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / A stackable unification remote, which can appear to merge the contents of several remotes
\ "union"
2 / Alias for a existing remote
\ "alias"
3 / Amazon Drive
\ "amazon cloud drive"
4 / Amazon S3 Compliant Storage Providers (AWS, Ceph, Dreamhost, IBM COS, Minio)
\ "s3"
5 / Backblaze B2
\ "b2"
<snip>
24 / http Connection
\ "http"
Storage> 5
** See help for b2 backend at: https://rclone.org/b2/ **

Account ID or Application Key ID
Enter a string value. Press Enter for the default ("").
account> XXXXX
Application Key
Enter a string value. Press Enter for the default ("").
key> YYYYY
Permanently delete files on remote removal, otherwise hide files.
Enter a boolean value (true or false). Press Enter for the default ("false").
hard_delete> 
Edit advanced config? (y/n)
y) Yes
n) No
y/n> n
Remote config
--------------------
[b2]
account = XXXXX
key = YYYYY
--------------------
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:

Name Type
==== ====
b2   b2

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q

In this case, I chose to call the b2 remote storage ‘b2’ (clever, eh?). I could have called it ‘cloud’, ‘b2backup’, ‘snuffleupagus’ or anything else. You can examine /root/.config/rclone/rclone.conf to see the values you entered.
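For reference, the section that rclone config writes is a short INI-style block. The sketch below prints an equivalent section so you can see its shape or script the setup on a fresh VPS; rclone_b2_section is a hypothetical helper, and as far as I know the only line beyond what the wizard echoed back is type = b2, which tells rclone which backend the remote uses.

```shell
# Print an rclone config section for a B2 remote.
# Arguments: remote name, key ID, application key.
rclone_b2_section() {
    printf '[%s]\ntype = b2\naccount = %s\nkey = %s\n' "$1" "$2" "$3"
}

# Usage (XXXXX and YYYYY stand in for your real credentials):
#   umask 077
#   mkdir -p ~/.config/rclone
#   rclone_b2_section b2 XXXXX YYYYY >> ~/.config/rclone/rclone.conf
```

The umask matters because the file holds your application key in plain text unless you set a configuration password.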

Backing Up With rclone

You are now ready to back up. In this example, I’ll back up /usr/include:

rclone -L sync /usr/include b2:5e73a641-5685-4ab5-9276-9f480ebabd2e-backup

Then I can list the files:

# rclone ls b2:5e73a641-5685-4ab5-9276-9f480ebabd2e-backup
7456 aio.h
2031 aliases.h
1203 alloca.h
1730 ar.h
<etc.>
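Keep in mind that rclone sync makes the destination match the source, which includes deleting remote files that are not in the source. If you sync two different directories to the root of the same bucket, each run will remove the other's files. A simple way around that is to give each directory its own path inside the bucket, as in this sketch (backup_dir is a hypothetical helper name):

```shell
# Sync a local directory to its own path inside the bucket.
# -L copies the files that symlinks point to, as in the command above.
backup_dir() {
    src=$1
    bucket=$2
    # Use the directory's base name as the path inside the bucket.
    dest_path=$(basename "$src")
    rclone -L sync "$src" "b2:$bucket/$dest_path"
}

# Usage:
#   backup_dir /usr/include 5e73a641-5685-4ab5-9276-9f480ebabd2e-backup
#   backup_dir /etc         5e73a641-5685-4ab5-9276-9f480ebabd2e-backup
```

Adding --dry-run to the rclone command is a safe way to preview what a sync would copy or delete before running it for real.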

Finer Points

One thing that b2 doesn’t handle natively is symlinks. You need to make a choice to either copy the linked file (what the symlink points to) or skip the link entirely. One technique is to tar up directories and then back up the tarball. If you ever need a restore, you restore the tarball, and when you untar it you get your symlinks back. Above I used -L (copy the linked file), but if you want to skip links, there’s --skip-links.
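The tarball approach can be sketched as follows. make_tarball is a hypothetical helper, and the paths and BUCKET in the usage comment are placeholders.

```shell
# Create a gzip-compressed tarball of a directory. tar stores symlinks as
# symlinks, so they come back intact when the archive is extracted.
make_tarball() {
    src_dir=$1
    out_file=$2
    # -C changes into the parent directory so the archive holds relative paths.
    tar -czf "$out_file" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Usage (paths and BUCKET are placeholders):
#   make_tarball /etc /var/backups/etc.tar.gz
#   rclone copy /var/backups/etc.tar.gz "b2:$BUCKET/tarballs"
```

The trade-off is that any change inside the directory produces a new tarball, so the whole archive is uploaded again instead of just the changed files.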

Remember that for live databases, you’ll want to run mysqldump, pg_dump, etc. before the backup. Because rclone runs so easily on the command line, it’s easy to put the database backup commands and rclone in a single script and stick it in cron.
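A script along those lines might look like the sketch below. Everything named here is an assumption for illustration: the run_backup function, the DUMP_DIR and BUCKET defaults, and the script path in the crontab comment. It also assumes mysqldump can find its credentials on its own (for example in ~/.my.cnf).

```shell
# Dump all MySQL databases to a file, then sync the dump directory to B2.
# DUMP_DIR and BUCKET are placeholders; set them for your own system.
DUMP_DIR=${DUMP_DIR:-/var/backups/db}
BUCKET=${BUCKET:-your-bucket-name}

run_backup() {
    mkdir -p "$DUMP_DIR" || return 1
    # --single-transaction gives a consistent snapshot for InnoDB tables.
    mysqldump --all-databases --single-transaction > "$DUMP_DIR/all-databases.sql" || return 1
    rclone sync "$DUMP_DIR" "b2:$BUCKET/db"
}

# Example crontab entry, assuming this is saved as /usr/local/bin/b2-backup.sh
# with a final line that calls run_backup:
#   30 3 * * * /usr/local/bin/b2-backup.sh
```

Stopping when the dump fails is deliberate: otherwise a broken or empty dump would be synced over the last good copy in the bucket.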

Another consideration is how many transfer threads you want to run in parallel. If you have many files, then increasing the number of transfers can increase performance. By default, rclone will use 4 parallel transfers, but you can override this with the --transfers flag.
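As an illustration, the wrapper below passes the transfer count through to rclone and falls back to 4 when none is given. sync_parallel is a hypothetical name, and the value 16 in the usage comment is just an example, not a recommendation; the right number depends on your file sizes, bandwidth, and memory.

```shell
# Sync with a configurable number of parallel transfers (default 4,
# matching rclone's own default).
sync_parallel() {
    src=$1
    dest=$2
    transfers=${3:-4}
    rclone -L sync --transfers "$transfers" "$src" "$dest"
}

# Usage (BUCKET is a placeholder):
#   sync_parallel /usr/include "b2:$BUCKET/include" 16
```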

There are many other options to rclone to limit bandwidth, tune memory, and configure encryption, so be sure to check out the rclone man page and/or docs.  Enjoy!

raindog308

2 Comments

  1. Nickolai:

    Have you heard of tarsnap? It works similar to tar, so at least one should not have to use a separate tar step to handle symlinks.

    It would be nice to see a comparison.

    January 27, 2021 @ 1:22 pm
  2. Zee:

    I found this opensource tool that allows you to backup upto 15GB on google drive using RCLONE 🤣

    January 29, 2021 @ 1:55 am
