Full Circle Magazine FR

1

Backups are a fundamental part of IT life, in personal as well as professional environments. From personal documents to server configurations and databases, a backup can sometimes be a lifesaver. There is plenty of documentation on the web: backup strategies, how-tos, backup software, and so on.

Most of the time, the big drawback of backups is the disk space they occupy. Good software, or a good strategy, can mitigate this by means of incremental backups. But in some cases, learning and deploying specialized backup software is laborious and tedious, and the learning curve is too steep, especially when all we need is “a simple copy” of the latest useful files or database dump. In any case, there is no self-configuring software out there: strategies and policies are up to the system administrator or the backup administrator. And last but not least, what about restoring? We all hope never to need to restore a backup, but disasters do happen, from the deletion of an important document to a database server crash.

In this article, we will talk about a kind of incremental backup. How difficult is it to retrieve a backup, or restore a database, from an incremental backup? In a company, the system administrator is sometimes also the DBA and even the backup administrator, all at once, so testing the restoration of backups is, most of the time, not a priority at all.

As said before, space occupancy (on hard disk or tape) is a major question to address.

Let’s take a disk shared between many users, or the home directories of those users. Let’s say we want an affordable, simple and reliable solution built on well-known tools. On Linux and Unix, the choice will usually fall on rsync. By default, rsync doesn’t delete files on the destination that are no longer in the source. But we cannot store files indefinitely (documents deliberately deleted by users, old configuration files, and so on), or the backup fills up with garbage. We can use the delete option ¹⁾ offered by rsync, but who knows whether the user deleted a file by mistake? So, with rsync, we must both avoid garbage and not rely on a single backup, namely the state left by the last rsync run. Usually we end up writing a complex script, or spending time finding and testing some script or software from the Internet, climbing yet another pointless learning curve or settling for a script that doesn’t fit our needs. Well, a filesystem snapshot can save our bacon. In this article we will talk about the ZFS snapshot feature.

2

What is ZFS

ZFS is a robust, enterprise-grade filesystem and volume manager whose development began at Sun Microsystems in 2001. Today it is the default file system on Solaris and on a number of operating systems based on the open source illumos kernel, such as SmartOS, OpenIndiana and OmniOS. Because of legal issues related to its licence, ZFS could not be merged into the Linux kernel itself; in September 2013 the OpenZFS project was announced, bringing the open source implementations together. So, nowadays, the powerful features of ZFS can also be used on your preferred Linux distribution. There was a lot of hype before Ubuntu 16.04 LTS came out, when it was announced that the ZFS filesystem module would be included by default and that the OpenZFS-based implementation would receive official support from Canonical; many concerns about licence issues were raised. I don’t want to go deep into the advantages, technical characteristics or options of ZFS; I just want to underline how easy some of its features are to use, snapshots in particular. Other filesystems such as Btrfs, and logical volume managers such as LVM, have snapshot functionality too, and even Windows has something similar, called Shadow Copy; but ZFS is, as far as I know, the simplest and most effective to use.

3

How to create a zpool

This is only a quick example; here too, I don’t want to dig deep into technical details. First, let’s install the ZFS tools:

    sudo apt-get install zfsutils-linux

Plug in a USB drive and use fdisk to create a new, empty GPT partition table (g key), then write it to disk and exit (w key):

    sudo fdisk /dev/sdb

Now let’s create the zpool (think of a zpool as a volume). “tank” is the name of the zpool; you can use a name of your choice.

    sudo zpool create tank /dev/sdb

See the following link to understand why using sdb, rather than a persistent identifier for the disk, might not be a good idea: http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool
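As a small sketch of the point made by that FAQ entry, the helper below lists the persistent names that point at a given kernel device, so that one of them can be passed to zpool create instead of sdb. The function name persistent_names is made up for this example, and it assumes that your distribution populates /dev/disk/by-id with symbolic links, as most udev-based systems do.

```shell
# List the persistent names (symbolic links) that point at a kernel device.
#   $1 = device as the kernel currently names it, e.g. /dev/sdb
#   $2 = directory holding the persistent links (assumed default: /dev/disk/by-id)
persistent_names() {
    dev=$(readlink -f "$1")
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symbolic link (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Usage (not run here; <name> stands for one of the lines printed by the helper):
#   persistent_names /dev/sdb
#   sudo zpool create tank /dev/disk/by-id/<name>
#   zpool status tank
```

A pool created this way should keep working even if the drive shows up as sdc after the next reboot.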

4

Let’s create a ZFS filesystem inside the zpool:

    sudo zfs create tank/mybackup

Enable on-the-fly compression (yes, this is another feature of ZFS). The LZ4 algorithm offers a good balance between performance and compression ratio.

    sudo zfs set compression=lz4 tank/mybackup

Let’s take the first snapshot

A snapshot is the state of the file system at a particular point in time. The command to take a snapshot is:

    zfs snapshot filesystem@name

So in our case the command could be:

    sudo zfs snapshot tank/mybackup@201707091030
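The snapshot name used in this example reads like a timestamp (year, month, day, hour, minute). Assuming that is the intended convention, the name can be generated rather than typed by hand. The helper name snapshot_name is only an illustration.

```shell
# Build a snapshot name of the form filesystem@YYYYMMDDHHMM,
# following the 201707091030 style used in the text.
#   $1 = ZFS file system, e.g. tank/mybackup
snapshot_name() {
    printf '%s@%s\n' "$1" "$(date +%Y%m%d%H%M)"
}

# Usage (not run here):
#   sudo zfs snapshot "$(snapshot_name tank/mybackup)"
```

Names built this way sort chronologically, which makes the snapshot list easier to read.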

5

Since a snapshot is not an incremental copy of files, as with Mac OS X Time Machine, but works at block level, taking it is immediate. Initially, a snapshot occupies no disk space, since it corresponds exactly to the original file system. As files on the file system change (new, deleted or updated files), the blocks that the live file system no longer uses remain referenced only by the snapshot, so the space a snapshot uses depends strictly on the changes, writes and deletes performed on the file system. Finally, a snapshot is read-only: there is no danger of altering your backup by mistake.

Displaying and Accessing ZFS Snapshots

You can now access the data “backed up” in the snapshot in two ways: by rolling back to the snapshot (which overwrites the working file system; read this as a complete restore), or by browsing it to recover single files or directories.
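To see how much space each snapshot actually holds, the snapshots of a file system can be listed with their USED and REFER columns. This is a minimal sketch: the function name list_snapshots and the ZFS variable (a hook for substituting another binary, for example when testing) are assumptions of this example, while the zfs list options themselves are standard.

```shell
# Show every snapshot of a dataset, oldest first, with the space unique to it
# (used) and the amount of data it references (referenced).
#   $1 = ZFS file system, e.g. tank/mybackup
#   ZFS = optional override for the zfs binary (hypothetical hook)
list_snapshots() {
    "${ZFS:-zfs}" list -r -t snapshot -o name,used,referenced -s creation "$1"
}

# Usage (not run here):
#   list_snapshots tank/mybackup
```

Right after a snapshot is taken, its used value should be close to zero, and it grows as the live file system diverges from it.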

6

Rolling back to a snapshot is as simple as taking it:

    zfs rollback filesystem@name

But pay attention: as said, rolling back a snapshot overwrites the working file system! A more convenient way to recover files is to go inside the snapshot directory and browse the frozen directory tree. Inside the mount point of the file system there is a hidden directory called “.zfs” (by default it is not listed even by ls -la).

    cd /mountpoint/.zfs/snapshot

Here you will find one directory for every snapshot taken, and you can use the usual commands (cp, rsync, scp, etc.) to copy a previous version of a file wherever you need it: you can replace or restore a file or a directory directly in the working file system.

That said, the backup policy can be the following: take a snapshot just before the rsync command, and you are on your way. You no longer need to worry about previous versions of backups, incremental backups, huge space usage, and so on.
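Recovering a single file from the snapshot directory can be wrapped in a few lines. The sketch below is only one way to do it: the function name restore_from_snapshot is invented for this example, and it assumes the snapshot is reachable under the .zfs/snapshot directory of the mount point, as described above.

```shell
# Copy one file or directory out of a snapshot back into the live file system.
#   $1 = mount point of the file system, e.g. /tank/mybackup
#   $2 = snapshot name (the part after "@"), e.g. 201707091030
#   $3 = path of the file or directory, relative to the mount point
restore_from_snapshot() {
    src="$1/.zfs/snapshot/$2/$3"
    dst_parent=$(dirname "$1/$3")
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # Recreate the parent directory if it was deleted, then copy the item
    # into it, preserving permissions and timestamps.
    mkdir -p "$dst_parent"
    cp -a "$src" "$dst_parent/"
}

# Usage (not run here):
#   restore_from_snapshot /tank/mybackup 201707091030 docs/report.txt
```

Note that the current version of the file, if any, is overwritten, so copy it elsewhere first if you might still need it.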

7

To delete a snapshot, the command is as simple as:

    zfs destroy filesystem@name

In our example:

    sudo zfs destroy tank/mybackup@201707091030

Sample backup script

This is a simple sample script to illustrate the concepts. It uses rsync over SSH with key-based authentication (there are plenty of step-by-step guides on the web on how to set up SSH key-based login). You can find the script here: https://gist.github.com/alcir/7cb799edfb677a50fc38741dc706d73f
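For readers who just want the shape of such a script, here is a rough sketch of the “snapshot first, then rsync” policy. It is not the script from the link above. The remote source user@server:/home/, the variable names and the DRY_RUN switch are assumptions made for this example, and the destination assumes that tank/mybackup is mounted at its default location, /tank/mybackup.

```shell
# Hypothetical values: adjust them to your environment.
DATASET=${DATASET:-tank/mybackup}       # ZFS file system that holds the backup
SOURCE=${SOURCE:-user@server:/home/}    # assumed remote source, SSH keys already set up
DEST=${DEST:-/tank/mybackup/}           # assumed default mount point of the dataset

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_once() {
    stamp=$(date +%Y%m%d%H%M)
    # Freeze the previous state first, so that files removed by --delete
    # in this run remain recoverable from the snapshot.
    run zfs snapshot "$DATASET@$stamp" || return 1
    # Mirror the source; --delete removes files that no longer exist there.
    run rsync -a --delete -e ssh "$SOURCE" "$DEST"
}

# Usage (not run here):
#   DRY_RUN=1 backup_once    # only show what would be done
#   backup_once              # real run, needs the rights to take snapshots
```

Taking the snapshot before the copy means each snapshot preserves the result of the previous run, which is exactly what you want to fall back on if the current run deletes something by mistake.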

¹⁾ rsync --delete: delete extraneous files from dest dirs