B2 is cloud storage from Backblaze. Like Amazon S3, Microsoft Azure, or Google Cloud Storage, it provides an API for storing files in the cloud, and it does so at a substantially lower price than its competitors. In this tutorial, we’ll show how you can use B2 to back up your VPS.
Understanding B2 Pricing
There are three components to B2 pricing:
- The monthly storage cost.
- The transaction costs (delete actions, list actions, etc.)
- Download costs. As of this tutorial, there is no charge for upload.
The transaction costs are minimal (less than a penny per 1,000 operations such as delete), so we will ignore those. Download costs are only relevant if you are restoring a backup, which hopefully will be a rare event. This is where S3, Azure, and Google can be very expensive, because you pay their normal data transfer (egress) rates.
For my backups, I model the monthly storage costs (see full pricing details on Backblaze.com), since that’s what I’m going to see on my bill. I rarely restore, and typically it’s a few files.
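The storage-plus-download model described above comes down to simple arithmetic. As a minimal sketch, the function below multiplies stored gigabytes by a per-GB monthly rate and adds restored gigabytes times a per-GB download rate. The function name is made up for this example, and the rates in the sample call are placeholders, not Backblaze’s actual prices; plug in the current numbers from the pricing page.

```shell
#!/bin/sh
# estimate_monthly_cost STORED_GB STORAGE_RATE [RESTORED_GB DOWNLOAD_RATE]
# Rates are dollars per GB and are supplied by the caller (assumed values).
estimate_monthly_cost() {
    awk -v stored="$1" -v srate="$2" -v restored="${3:-0}" -v drate="${4:-0}" \
        'BEGIN { printf "%.2f\n", stored * srate + restored * drate }'
}

# Placeholder rates: 200 GB stored at $0.005/GB, 2 GB restored at $0.01/GB.
estimate_monthly_cost 200 0.005 2 0.01   # prints 1.02
```

Leaving off the last two arguments models a month with no restores, which is the usual case for a backup bucket.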
Getting the B2 Tools
In this tutorial, we’ll be using the b2 command line tool and rclone.
After signing up on backblaze.com, you will be assigned a Master Application Key. For this tutorial, I’ll create a separate Application Key rather than use the master key. It’s important to note that while you can retrieve the keyID at any time by logging into your account, you can only see the applicationKey once, at the time you create it.
Next, set up the b2 command line tool:
apt-get install python-pip
pip install b2
To use the b2 cli, you first have to authorize the client. I will replace my Key ID with XXXXX and my Application Key with YYYYY.
# b2 authorize-account XXXXX YYYYY
Using https://api.backblazeb2.com
You’ll now find a file called ~/.b2_account_info has been created. This is a sqlite3 database with configuration information. You can take a look at it with sqlite3 (e.g., sqlite3 ~/.b2_account_info) but don’t modify anything. You can move this file if you want, but if you change it to a non-default location you need to set the B2_ACCOUNT_INFO environment variable so the b2 cli knows where to find it.
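If you do relocate the file, the move and the environment variable might look like the sketch below. The destination path is a hypothetical example; any location works as long as B2_ACCOUNT_INFO points at it in every shell (or cron job) that runs the b2 cli.

```shell
#!/bin/sh
# Hypothetical new location for the b2 account database.
NEW_LOCATION="$HOME/.config/b2/account_info"

# Move the file only if it exists at the default location.
if [ -f "$HOME/.b2_account_info" ]; then
    mkdir -p "$(dirname "$NEW_LOCATION")"
    mv "$HOME/.b2_account_info" "$NEW_LOCATION"
fi

# Tell the b2 cli where to look; add this line to ~/.profile to make it stick.
export B2_ACCOUNT_INFO="$NEW_LOCATION"
```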
Picking a B2 Bucket Name
We’ll create a bucket next, which is where you’ll store your files. You can create as many buckets as you like, perhaps one for each VPS, one for different filesystems or projects, etc.
It’s important to realize that the b2 namespace is completely flat. You will never get a bucket named ‘backup’ because there are 7 billion other humans fighting for that name. Bucket names have a minimum of 6 and a maximum of 50 characters and consist of letters, digits, and hyphens. They also cannot start with b2-.
One trick to make sure you have a consistent naming standard is to prefix your bucket names with something long, like a UUID. A v4 UUID is 36 characters, so a UUID v4 plus a trailing hyphen takes up 37 characters. That leaves 13 characters for the rest of the bucket name, and you can rest assured it is available.
Using an online UUID generator (or install the uuid package and run uuid -v4), I chose this bucket name:
5e73a641-5685-4ab5-9276-9f480ebabd2e-backup
I could also remove the hyphens from the UUID if I wanted more space.
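The naming rules above are easy to check before you try to create the bucket. Here is a minimal sketch with two helper functions (both names are made up for this example): one joins a UUID and a suffix, the other tests a candidate against the length, character, and prefix rules quoted earlier. It only checks the format; whether a name is already taken is something only Backblaze can tell you.

```shell
#!/bin/sh
# make_bucket_name UUID SUFFIX: join a UUID prefix and a short suffix with a hyphen.
make_bucket_name() {
    printf '%s-%s\n' "$1" "$2"
}

# is_valid_bucket_name NAME: succeed if NAME is 6 to 50 characters long,
# uses only letters, digits, and hyphens, and does not start with "b2-".
is_valid_bucket_name() {
    candidate=$1
    [ "${#candidate}" -ge 6 ] || return 1
    [ "${#candidate}" -le 50 ] || return 1
    case $candidate in
        b2-*) return 1 ;;               # reserved prefix
        *[!A-Za-z0-9-]*) return 1 ;;    # a character outside the allowed set
    esac
    return 0
}

name=$(make_bucket_name 5e73a641-5685-4ab5-9276-9f480ebabd2e backup)
if is_valid_bucket_name "$name"; then
    echo "ok: $name (${#name} characters)"
else
    echo "invalid: $name"
fi
```

In practice you would pass a freshly generated UUID, for example make_bucket_name "$(uuid -v4)" backup.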
Creating the Bucket
Let’s use the b2 command line tool to create a bucket. You could also create one on backblaze.com. By default, a new bucket has a lifecycle setting that retains previous versions of files. In our case, we don’t want that. We can either create the bucket on backblaze.com and set the lifecycle rules there, or pass some JSON on the b2 command line to accomplish the same thing. (I’ve split the long line here, but you should type this all on one line without the backslashes.)
# b2 create-bucket --lifecycleRules \
'[{ "daysFromHidingToDeleting": 1, "daysFromUploadingToHiding": null, "fileNamePrefix":"" }]' \
5e73a641-5685-4ab5-9276-9f480ebabd2e-backup allPrivate
8be70287ff4bc9157a1e0915
That command says “create a bucket, set the life cycle to only keep the most recent version of the file, and make all files in the bucket private”. The command returns the bucket ID.
If this looks confusing, set up a bucket on backblaze.com, and then use b2 get-bucket to see the JSON for that particular config.
Setting Up rclone
Installing rclone is easy:
apt-get install rclone
And now run rclone config to set it up:
# rclone config
2020/04/28 18:22:30 NOTICE: Config file "/root/.config/rclone/rclone.conf" not found - using defaults
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> b2
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
 1 / A stackable unification remote, which can appear to merge the contents of several remotes
   \ "union"
 2 / Alias for a existing remote
   \ "alias"
 3 / Amazon Drive
   \ "amazon cloud drive"
 4 / Amazon S3 Compliant Storage Providers (AWS, Ceph, Dreamhost, IBM COS, Minio)
   \ "s3"
 5 / Backblaze B2
   \ "b2"
<snip>
24 / http Connection
   \ "http"
Storage> 5
** See help for b2 backend at: https://rclone.org/b2/ **
Account ID or Application Key ID
Enter a string value. Press Enter for the default ("").
account> XXXXX
Application Key
Enter a string value. Press Enter for the default ("").
key> YYYYY
Permanently delete files on remote removal, otherwise hide files.
Enter a boolean value (true or false). Press Enter for the default ("false").
hard_delete>
Edit advanced config? (y/n)
y) Yes
n) No
y/n> n
Remote config
--------------------
[b2]
account = XXXXX
key = YYYYY
--------------------
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:
Name                 Type
====                 ====
b2                   b2
e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q
In this case, I chose to call the b2 remote storage ‘b2’ (clever, eh?). I could have called it ‘cloud’, ‘b2backup’, ‘snuffleupagus’, or anything else. You can examine /root/.config/rclone/rclone.conf to see the values you entered.
Backing Up With rclone
You are now ready to back up. In this example, I’ll back up /usr/include:
rclone -L sync /usr/include b2:5e73a641-5685-4ab5-9276-9f480ebabd2e-backup
Then I can list the files:
# rclone ls b2:5e73a641-5685-4ab5-9276-9f480ebabd2e-backup
     7456 aio.h
     2031 aliases.h
     1203 alloca.h
     1730 ar.h
<etc.>
Finer Points
One thing that B2 doesn’t handle natively is symlinks. You need to choose either to copy the linked file (what the symlink points to) or to skip the link entirely. Another technique is to tar up directories and then back up the tarball. If you ever need a restore, you restore the tarball, and when you untar it, you get your symlinks back. Above I used -L (copy the linked file), but if you want to skip links without a warning for each one, there’s --skip-links.
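The tarball technique can be sketched with two small helper functions (the names are hypothetical). tar stores a symlink as a link rather than following it, so links come back intact when the archive is unpacked.

```shell
#!/bin/sh
# archive_dir SRC_DIR TARBALL: pack SRC_DIR into a gzipped tarball.
# Symlinks are stored as links, not as copies of their targets.
archive_dir() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# restore_dir TARBALL DEST_DIR: unpack the tarball under DEST_DIR.
restore_dir() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

For example, archive_dir /usr/include /tmp/include.tar.gz followed by rclone copy /tmp/include.tar.gz b2:5e73a641-5685-4ab5-9276-9f480ebabd2e-backup uploads a single file. The trade-off is that restoring even one header means downloading the whole tarball.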
Remember that for live databases, you’ll want to do a mysqldump, pg_dump, etc. before the backup. Because rclone runs so easily on the command line, it’s easy to put the database backup commands and rclone in a single script and stick it in cron.
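A single script along those lines might look like the sketch below. The database name, dump directory, and the db/ folder in the bucket are all assumptions made for illustration, and it uses MySQL as the example (for PostgreSQL you would swap in pg_dump). As a safety measure this sketch only prints the commands unless DRY_RUN=0 is set, so you can check them before anything runs.

```shell
#!/bin/sh
# Hypothetical settings: adjust the database name, dump directory, and remote.
DB_NAME="mydb"
DUMP_DIR="/var/backups/db"
REMOTE="b2:5e73a641-5685-4ab5-9276-9f480ebabd2e-backup"

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Dump the database to a file, then sync the dump directory to the bucket.
# mysqldump is assumed to find its credentials in ~/.my.cnf.
backup() {
    run mkdir -p "$DUMP_DIR" &&
    run mysqldump --single-transaction --result-file="$DUMP_DIR/$DB_NAME.sql" "$DB_NAME" &&
    run rclone sync "$DUMP_DIR" "$REMOTE/db"
}

backup
```

Saved as, say, /usr/local/bin/b2-backup.sh (a hypothetical path), a crontab entry such as 0 3 * * * DRY_RUN=0 /usr/local/bin/b2-backup.sh would run it every night at 3 a.m.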
Another consideration is how many transfers you want to run in parallel. If you have many files, increasing the number of transfers can improve performance. By default, rclone runs 4 file transfers in parallel, but you can override this with the --transfers flag (for example, --transfers 16).
There are many other options to rclone to limit bandwidth, tune memory, and configure encryption, so be sure to check out the rclone man page and/or docs. Enjoy!
Have you heard of tarsnap? It works similarly to tar, so at least one would not need a separate tar step to handle symlinks.
It would be nice to see a comparison.
I found this open-source tool that allows you to back up up to 15 GB on Google Drive using rclone 🤣