
HowTo: Build an Encrypted ZFS Array ~ Part 2 ~ The Array

This is a continuation of Build an Encrypted ZFS Array – Part 1 – Encryption, although if you did not choose to encrypt, you could pick up here.   This HowTo is Debian-centric.   Caution:   some command lines below wrap because of the width of the page.

~~   Building the Array   ~~

We now have 4 disk drives set up encrypted, and their raw devices reside at /dev/mapper/sdb_crypt ~ sde_crypt.   We want to assemble these into a ZFS array so they’ll appear as one volume to the system, and with RAID-Z for data integrity.   First a few rules:

  • Start a single-parity (raidz) configuration at 3 disks (2+1)
  • Start a double-parity (raidz2) configuration at 6 disks (4+2)
  • Start a triple-parity (raidz3) configuration at 9 disks (6+3)

(N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equals 2, 4, or 6

The recommended number of disks per group is between 3 and 9.   If you have more disks, use multiple groups.   For example, if you have three disks in a single-parity RAID-Z configuration, parity data occupies disk space equal to one of the three disks.   In our case with 4 disks, single-parity is the best choice.   If we were planning to add more than 2 disks in the near future, double-parity would be advisable, as the parity level cannot be changed once set.
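As a rough worked example of that rule (assuming four equal 4 TB drives and ignoring ZFS overhead):

raidz1 on 4 disks:   N = 3 data disks + P = 1 parity disk
usable space   ≈   3 × 4 TB   =   12 TB
parity overhead   =   1 disk / 4 disks   =   25% of raw capacity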

Create the array:
# zpool create -fo ashift=12 pool raidz /dev/mapper/sdb_crypt /dev/mapper/sdc_crypt /dev/mapper/sdd_crypt /dev/mapper/sde_crypt

We must use -f on Linux, as well as -o ashift=12, which aligns the pool to 4 KiB (2^12-byte) sectors for modern 4K-sector drives.   We’ve named our array ‘pool’ (‘pool’ and ‘tank’ are common), and assembled it from the 4 raw encrypted devices.
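If you want to confirm that the sector alignment actually took, one quick check (assuming the pool is recorded in the default cachefile that zdb reads) is:
# zdb -C pool | grep ashift
which should report ashift: 12 for the raidz vdev.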

# zpool list
NAME   SIZE     ALLOC    FREE   CAP   DEDUP   HEALTH   ALTROOT
pool   14.50T   13.72T   794G   94%   1.00x   ONLINE   -

# zpool status
  pool: pool
 state: ONLINE
  scan: scrub repaired 0 in 39h1m with 0 errors on Sat Aug 2 22:01:43 2016
config:

        NAME          STATE     READ WRITE CKSUM
        pool          ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            sdb       ONLINE       0     0     0
            sdc       ONLINE       0     0     0
            sdd       ONLINE       0     0     0
            sde       ONLINE       0     0     0

errors: No known data errors

Now reboot.   It’s still all there, isn’t it?

~~   Configuring the Array   ~~

Time to set the mountpoint for the array.   Where does it belong?   If it’s for backups, /media/backups:
# zfs set mountpoint=/media/backups pool
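You can verify it at any time with the standard property getter:
# zfs get mountpoint pool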

And set compression for all data.   This won’t help much for video and audio files, but it makes a big difference for everything else.
# zfs set compression=gzip-5 pool
gzip-5 is a good balance between space savings and performance.
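Later on, you can see how much you are actually saving;   compressratio is a built-in ZFS property:
# zfs get compressratio pool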

Now there are two ways to have the array automounted at boot:   the legacy way;   and the current way.   We will use the latter.   We have already set the mountpoint for the array, so now we need to invoke automount, the Debian Way.
# nano /etc/insserv.conf
… and change to:

$local_fs	+mountall +mountall-bootclean +mountoverflowtmp +umountfs +zfs-mount

(Careful of the line-wrap)   Reboot and make sure your pool is imported and mounted.

Very good.   Now we have our pool mounted at /media/backups and it’s set for compression.   Chances are you will be backing up various machines on your LAN to here, and it would be good to have a dedicated location for each machine.   For this we will create special zfilesystems, so we can take snapshots of each (more on that later).   The machines on our LAN are named Camelopardalis, Horologium, and Monoceros (our backups server is, naturally, Gemini), so:
# zfs create pool/camelopardalis
# zfs create pool/horologium
# zfs create pool/monoceros

Since no one else will tell you this, when giving zfs and zpool commands you must refer to the array by its ZFS name, as above, not its filesystem mountpoint, as below.   Now after the above commands, check it:
# ls -al /media/backups
total 88
drwxrwx---  8 root backup   8 Jun 15 07:56 ./
drwxr-xr-x  7 root root    64 Feb 11  2016 ../
drwxrwx---  9 root backup 106 Aug 10 08:25 camelopardalis/
drwxrwx---  9 root backup  36 Aug 10 08:25 horologium/
drwxrwx---  2 root backup  21 Aug 10 08:42 monoceros/

They look like Linux directories, and with Linux commands they behave like them, but they are actually special ZFS datastructures.   Before you actually put data in them, you must make damned sure they are ZFS mounted, or else you can never take snapshots.   This will usually happen automatically at boot with the import of the array, but when you’re first starting this whole bag of muffins, maybe not:
# zfs mount
pool                    /media/backups
pool/camelopardalis     /media/backups/camelopardalis
pool/horologium         /media/backups/horologium
pool/monoceros          /media/backups/monoceros

If you’re not sure that everything’s mounted:
# zfs mount -a

To destroy a zfilesystem:
# zfs destroy pool/camelopardalis
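Note that if the zfilesystem already has snapshots (or child filesystems), a plain destroy will refuse;   the -r flag descends and destroys them too, so use it deliberately:
# zfs destroy -r pool/camelopardalis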

~~   Snapshots!   ~~

Now you are free to put your data in your shiny new zfilesystems, without having to worry about anything else.   But a key feature of ZFS is the ability to take snapshots, which record the data state of every file at a given point in time.   Typically I do system backups daily, and snapshots weekly;   if something goes wrong I can turn to the latest daily backup.   But if I need a file that was deleted three months ago, I can go to the snapshot for that week or prior.   This will save your bacon.   You don’t have to worry about duplication of data with ZFS:   it is copy-on-write, so a snapshot only preserves blocks that are later changed or deleted, keeping just what is needed for a complete record.   Magick is used, so as not to make this process dog-slow.
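Getting an old file back is then just a copy out of the hidden snapshot directory;   a sketch, with a hypothetical snapshot name and file path:
# cp -a /media/backups/camelopardalis/.zfs/snapshot/camelopardalis-2016-05-15/etc/fstab /tmp/fstab.recovered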

To take regular snapshots, I have it done in a script which is invoked by cron:
# /sbin/zfs snapshot pool/camelopardalis@camelopardalis-$(date +"%Y-%m-%d")
So the snapshot is to be taken of pool/camelopardalis, and will be at /media/backups/camelopardalis/.zfs/snapshot/camelopardalis-$(date +"%Y-%m-%d").   You need a line like this for each zfilesystem.
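Put together, a minimal weekly snapshot script along these lines might look like the following (a sketch only;   the script name, location and list of zfilesystems are assumptions to adapt):

#!/bin/sh
# /root/bin/weekly-snapshots.sh -- dated snapshot of each backup zfilesystem
DATE=$(date +"%Y-%m-%d")
for fs in camelopardalis horologium monoceros; do
    /sbin/zfs snapshot pool/${fs}@${fs}-${DATE}
done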

Personally, I like to be able to actually see my snapshots (maybe I’m alone in this…), but they are stored in a special hidden format   —   they can be unmasked with this setting:
# zfs set snapdir=visible pool/{zfilesystem}
Then you will find all your snapshots in /media/backups/{zfilesystem}/.zfs/snapshot/.   You must give this command once for each zfilesystem you create with # zfs create, and you only need give it once.
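Since the property is per-zfilesystem, a one-liner can cover them all (assuming the three zfilesystems created above):
# for fs in camelopardalis horologium monoceros; do zfs set snapdir=visible pool/$fs; done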

Also, for convenience and visibility I make a symlink to each snapshot.   In my weekly snapshots cron script:
# ln -s /media/backups/camelopardalis/.zfs/snapshot/snap-home-2016-08-16 /media/backups/camelopardalis/snap-home-2016-08-16
# ln -s /media/backups/camelopardalis/.zfs/snapshot/snap-root-2016-08-16 /media/backups/camelopardalis/snap-root-2016-08-16

Speaking of the weekly snapshots script, it’s a good idea to regularly scrub your datasets as well:
# /sbin/zpool scrub pool ; sleep 1 ; zpool status | mail -s "zpool status" {username}
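For reference, the crontab entry that drives the weekly script might look like this (the day, hour and script path are just placeholders):

# m  h  dom mon dow  command
  0  3  *   *   0    /root/bin/weekly-snapshots.sh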

~~   Offline Storage   ~~

With disks as large as they are these days, no other method of storage is practical anymore.   Optical, magtape, paper tape: all unusable now.   Backing up to disk is the only way.   Some organizations feel that RAID alone is enough protection and don't even try backups.   Not a good idea:   Cryptolocker reached out to any media it could find, anywhere, and encrypted the data.   This is all a long way of saying: unmount your array when it is not doing backups.
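One way to take it offline at the end of a backup run, sketched here (zpool export unmounts and releases the whole pool;   the mapper names are the ones set up in Part 1):
# zpool export pool
# for d in sdb_crypt sdc_crypt sdd_crypt sde_crypt; do cryptsetup luksClose $d; done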

~~   Some Useful Commands   ~~

Disk usage report:
# zfs list -o space
# zfs list -t all
# du -x /media/backups/cygnus | sort -n | tail -n 100
# bc -l <<< "$(zfs get logicalused pool -Hp | cut -f 3) / $(zfs get used pool -Hp | cut -f 3)"

(this last is logicalused divided by used, i.e. the effective compression ratio;   ~1.33 here after long usage)

To set a different mountpoint:
# zfs set mountpoint=/home pool

To replace a flakey disk:
# zpool replace pool UUID=??? UUID=???
# zpool status

To add more space:
# zpool add pool raidz UUID=??? UUID=???

To add a disk:
# zpool add -f pool {UUID}

To put /home in the pool (heehee):
# mkdir /media/home
# zfs create pool/home
# zfs set mountpoint=/media/home pool/home
# zfs mount -a
# rsync -aXS --exclude='/*/.gvfs' /home/. /media/home/.

Boot to single-user mode (if you can).
Pivot /media/home to /home.
# zfs set mountpoint=/home pool/home
Reboot.

To destroy the array:
# zpool destroy pool
(sub zfilesystems are destroyed as well of course)

For a history of the array:
# zpool history pool
2013-10-12.11:07:11 zfs set mountpoint=/media/backups pool
2013-10-12.11:08:54 zfs set compression=gzip-5 pool
2013-10-12.11:09:12 zfs set snapdir=visible pool
2013-10-12.11:52:47 zfs create pool/camelopardalis
2013-10-12.11:52:54 zfs create pool/horologium
2013-10-12.11:52:59 zfs create pool/monoceros

Rename a pool:
# zfs list -t all
# zfs umount -a
# zpool export pool
# zpool import pool tank
# zfs mount -a
