Full Circle Magazine FR

Backups are a fundamental part of IT life, for personal purposes as well as in professional environments. From personal documents to server configurations and databases, a backup can sometimes be life-saving. There is plenty of documentation around the web: documents about backup strategies, how-tos, backup software, and so on.

The big drawback of backups is, most of the time, the disk space they occupy. Good software, or a good strategy, can mitigate this by means of incremental backups. But in some cases, learning and implementing specialized backup software is laborious and tedious, and the learning curve is too steep, especially when we only need “a simple” copy of the latest useful files or database dump. In any case, there is no self-configuring software out there: strategies and policies are up to the system administrator or backup administrator. And last but not least, what about restores? We all hope never to need to restore a backup, but disasters do occur, from the deletion of an important document to a database server crash.

In this article we will talk about a sort of incremental backup. How difficult is it to retrieve a file or restore a database from an incremental backup? In a company, the system administrator is sometimes also the DBA and even the backup administrator, so testing that backups can actually be restored is, most of the time, a task with zero priority.

As said before, space occupancy (hard disks or tapes) is a big issue to address. Consider a disk shared between many users, or the home directories of those users. Let’s say we want an affordable, simple and reliable solution built on well-known tools. On Linux and Unix, the choice of tool will usually fall on rsync.

We have to take into account that, by default, rsync doesn’t delete files on the destination that are no longer in the source. But we cannot store files indefinitely, for example documents deliberately deleted by users or old configuration files: that is a lot of garbage. We can use the delete option ¹⁾ offered by rsync, but… who knows whether the user deleted a file by mistake?
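
Before letting the delete option loose, it can be previewed. The following is a minimal sketch, not a complete backup command: the preview_deletions helper name and the paths in the usage line are assumptions made for illustration. It relies on rsync's dry-run mode, so nothing is copied or removed.

```shell
# List the files that "rsync --delete" WOULD remove from the destination.
# --dry-run makes rsync report its actions without performing them;
# --itemize-changes prints one line per change, and deletions are
# reported on lines starting with "*deleting".
preview_deletions() {
    rsync -a --delete --dry-run --itemize-changes "$1" "$2" | grep '^\*deleting'
}

# Usage (hypothetical paths; the trailing slash on the source means
# "the contents of this directory"):
#   preview_deletions /home/ /tank/mybackup/home/
```

If the list contains something that should not disappear, the file is still in the destination and can be copied back before the real run.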

So, if we use rsync, we must take care of such things: we must avoid garbage, and we cannot rely on a single backup, namely the state left by the last rsync run. Usually we end up writing a complex script, or we spend time searching for and testing some script or software found on the Internet, facing yet another pointless learning curve or using a script that doesn’t fit our needs.

Well, a filesystem snapshot can save our bacon. In this article we will talk about the ZFS snapshot feature.

What is ZFS

ZFS is a robust, enterprise-grade filesystem and volume manager whose development began at Sun in 2001; today it is the default file system on Solaris and on a number of operating systems based on the open source illumos kernel, such as SmartOS, OpenIndiana, OmniOS, etc. Because of legal issues related to its licence, ZFS could not be included in the Linux kernel itself; on Linux it is available as a separately developed kernel module, and in September 2013 the OpenZFS project was created to coordinate the open source implementations. So, nowadays, the powerful features of ZFS can also be used on your preferred Linux distribution. There was a lot of hype before Ubuntu 16.04 LTS came out, when it was announced that the ZFS filesystem module would be included by default and that the OpenZFS-based implementation would receive official support from Canonical; many concerns about licensing issues came up.

I don’t want to go deep into the advantages, technical characteristics or options of ZFS; I just want to underline how easy it is to use some of the features this file system provides, namely snapshots.

Other filesystems such as Btrfs, and logical volume managers such as LVM, have snapshot functionality as well; even Windows has something similar, called Shadow Copy. But ZFS, as far as I know, is the simplest and most effective to use.

How to create a zpool

This is only a quick example. Here too, I don’t want to dig deep into technical details.

So let’s install the ZFS stuff.

sudo apt-get install zfsutils-linux

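Plug in a USB drive, and use fdisk to create a new empty GPT partition table (g key), then write it to the disk (w key). Be careful: this erases the drive, so make sure /dev/sdb really is the USB drive on your system.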

sudo fdisk /dev/sdb

Once fdisk has exited, let’s create the zpool (think of the zpool as a volume). “tank” is the name of the zpool; you can use a name of your choice.

sudo zpool create tank /dev/sdb

Please look at this link to understand why using sdb, instead of a persistent identifier for the disk, might not be a good idea: http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool

Let’s create a ZFS filesystem inside the zpool

sudo zfs create tank/mybackup

Set on-the-fly compression. Yes: this is another feature of ZFS. The LZ4 algorithm has a good balance between performance and compression level.

sudo zfs set compression=lz4 tank/mybackup
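
To confirm that the setting took effect, and later to see how much space compression is actually saving, the properties of the dataset can be queried. A minimal sketch, assuming the tank/mybackup filesystem created above; the show_compression helper name is only for illustration.

```shell
# Print the compression setting and the achieved compression ratio.
# "compression" and "compressratio" are standard ZFS properties.
show_compression() {
    zfs get compression,compressratio "$1"
}

# Usage:
#   show_compression tank/mybackup
```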

Let’s take the first snapshot

A snapshot is the state of the file system at a particular point in time. The command we need to use in order to take a snapshot is:

zfs snapshot filesystem@name

So in our case the command could be:

sudo zfs snapshot tank/mybackup@201707091030
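
Typing the timestamp by hand gets tedious, so the name can be generated. This is a small sketch, assuming the same year-month-day-hour-minute naming used in the example above; snapshot_name is a hypothetical helper, not a ZFS command.

```shell
# Build a snapshot name such as tank/mybackup@201707091030
# from a dataset name and the current date and time.
snapshot_name() {
    printf '%s@%s\n' "$1" "$(date +%Y%m%d%H%M)"
}

# Usage (needs root and an existing dataset):
#   sudo zfs snapshot "$(snapshot_name tank/mybackup)"
```

Names built this way sort chronologically, which helps when browsing or pruning snapshots later.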

Since a snapshot doesn’t consist of an incremental copy of files, as with Mac OS X Time Machine, but works at block level, the snapshot operation is immediate.

Initially, the disk space occupied by a snapshot is zero, since the snapshot corresponds exactly to the original file system. As files on the file system are updated or deleted, the old blocks remain referenced only by the snapshot, so the space used by the snapshot grows with the amount of data changed or deleted since it was taken. Lastly, the snapshot is read-only: there is no danger of altering your backup by accident.
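
The space held by each snapshot can be checked at any time. A minimal sketch, assuming the tank/mybackup filesystem from the example; list_snapshot_space is a hypothetical wrapper around the standard zfs list command.

```shell
# List the snapshots of a dataset, oldest first, with two size columns:
#   used        space that only this snapshot still holds
#   referenced  amount of data reachable through the snapshot
list_snapshot_space() {
    zfs list -r -t snapshot -o name,used,referenced -s creation "$1"
}

# Usage:
#   list_snapshot_space tank/mybackup
```

Right after a snapshot is taken its used column should be close to zero, and it grows as the live file system diverges.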

Displaying and Accessing ZFS Snapshots

Now you can access the data “backed up” in the snapshot in two ways: rolling back to the snapshot (overwriting the working file system; read this as a complete restore), or browsing it to recover single files or directories.

Rolling back the snapshot is as simple as taking it:

zfs rollback filesystem@name

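But pay attention: as said, rolling back to a snapshot will overwrite the working file system! Also, by default zfs rollback only goes back to the most recent snapshot; going further back requires the -r option, which destroys the snapshots taken after the one you roll back to. So a more convenient way to recover files is to go inside the snapshot directory and browse the frozen directory tree. Inside the mount point of the file system there is a hidden directory called “.zfs” (by default it is not visible even with ls -la, but you can still cd into it).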

cd /mountpoint/.zfs/snapshot

Here you will find one directory for every snapshot taken, and you can use the usual commands (cp, rsync, scp, etc.) to copy a previous version of a file wherever you need it: you can replace or restore a file or a directory directly in the working file system.
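
Recovering a single file this way can be wrapped in a few lines. This is only a sketch under assumptions: restore_from_snapshot is a hypothetical helper, and the mount point, snapshot name and path in the usage line are examples.

```shell
# restore_from_snapshot MOUNTPOINT SNAPSHOT RELATIVE_PATH
# Copy a file or directory from a snapshot back to the same place
# in the working file system, overwriting the current version.
restore_from_snapshot() {
    mountpoint=$1
    snapshot=$2
    relpath=$3
    src="$mountpoint/.zfs/snapshot/$snapshot/$relpath"
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # Recreate the parent directory in case it was deleted too.
    dest_dir=$(dirname "$mountpoint/$relpath")
    mkdir -p "$dest_dir" || return 1
    # -a preserves permissions, ownership and timestamps.
    cp -a "$src" "$dest_dir/"
}

# Usage (hypothetical file):
#   restore_from_snapshot /tank/mybackup 201707091030 docs/report.txt
```

Copying to a different destination first, and comparing, is the more cautious choice when the current version might still be wanted.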

That said, the backup policy can be the following: take a snapshot just before the rsync command and you are on your way. You no longer need to worry about previous versions of backups, incremental backups, huge amounts of used space, and so on.
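
As a rough illustration of that policy, here is a minimal sketch. It is not the author's script: the source host, user and paths are assumptions to be replaced, and it expects to run as root with SSH key-based login already working.

```shell
# Assumed values for illustration only: adjust all of them.
DATASET="tank/mybackup"                  # ZFS filesystem receiving the copy
DEST="/tank/mybackup/"                   # its mount point (ZFS default)
SOURCE="backupuser@fileserver:/home/"    # hypothetical remote source

run_backup() {
    stamp=$(date +%Y%m%d%H%M)
    # 1. Freeze the current state of the backup before changing it.
    zfs snapshot "$DATASET@$stamp" || return 1
    # 2. Mirror the source. --delete is acceptable here because files
    #    removed by this run remain reachable in the snapshot just taken.
    rsync -a --delete -e ssh "$SOURCE" "$DEST"
}

# Usage, for example from root's crontab:
#   run_backup
```

Stopping when the snapshot fails is deliberate: without it, a run with --delete would have no safety net.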

To delete a snapshot, the command is just as simple:

zfs destroy filesystem@name

In our example:

sudo zfs destroy tank/mybackup@201707091030
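
Since snapshots cannot be kept forever either, old ones have to be destroyed sooner or later. The sketch below keeps only the newest few; the number to keep and the snapshots_to_prune helper are assumptions for illustration, not part of ZFS.

```shell
# Read snapshot names on stdin, oldest first, and print every name
# except the newest KEEP ones (KEEP is the first argument).
snapshots_to_prune() {
    awk -v keep="$1" '
        { names[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print names[i] }
    '
}

# Usage sketch, keeping the last 7 snapshots (review the list first):
#   zfs list -H -r -t snapshot -o name -s creation tank/mybackup \
#       | snapshots_to_prune 7 \
#       | while read -r snap; do sudo zfs destroy "$snap"; done
```

Note that -r also lists snapshots of any child filesystems, so on a dataset with children the count applies to the combined list.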

Sample backup script

This is a simple sample script to illustrate the concepts. We use rsync over SSH with key-based authentication (you can find plenty of step-by-step guides around the web on how to set up SSH key-based login).

You can find the script here https://gist.github.com/alcir/7cb799edfb677a50fc38741dc706d73f

¹⁾

rsync --delete: delete extraneous files from dest dirs