Categories
Personal Computing

TrueNAS 12 Replication

This post introduces the reader to the TrueNAS 12 replication facility operational concept and describes how to use local replication for main pool backups and remote replications to initialize a server or for off-site backups

TrueNAS is built on FreeBSD and the ZFS copy on write filesystem originally developed at Sun Microsystems for use in petabyte scale systems, possibly with cluster filesystem support layered on top. Today among hobbyists and system integrators, TrueNAS finds use in small scale file servers in home and office environments.

Purchased storage systems from iX Systems (are you a Dune fan?) make ZFS and TrueNAS FreeBSD accessible to small organizations needing reliable shared file storage. It is ideal for small medical offices and graphic and creative arts professionals needing working and archival storage. The TrueNAS SOHO systems are price competitive with home brew from new systems as a result of volume purchase of components. TrueNAS carefully tailors the FreeBSD component selection and system configuration for the storage appliance mission. In most installations, TrueNAS functions solely in the storage role. In home and small office installations, it may also provide some application support. Here at Dismal Manor, our TrueNAS system also runs our Roon Audio service instance in a VM.

The capabilities of TrueNAS make it the superior choice for this application. The stand-out feature is check-summing of both file metadata and file data allowing TrueNAS to validate data at rest. TrueNAS will detect, report, and correct data bit rot.

It’s all about timing

When I was looking for a NAS to replace my Drobostores, the Linux NAS storage market was in flux. Most Linux storage systems were based on kernel RAID using the EXT4 filesystem. EXT4 is a journaled filed system designed for simplicity. RAID and LVM were separate and check-summing happened at the block level.

At the time I built my server (2017), Netgear was transitioning ReadyNAS to BTRFS (said butter-fuss) but some models were and some not. Since then, Synology has transitioned its current (2021) products to BTRFS.

BTRFS, developed by Oracle, is intended to have feature parity with OpenZFS and Oracle ZFS. If buying from the Linux NAS vendors check the fine print regarding software versions and beware old stock from sellers. At this time (winter 2021), all products should be BTRFS-based. BTRFS CRC checks both metadata and file data and offers snapshot and replication features similar to those in ZFS.

This is a CONOPS Article

In this article, I’ll describe TrueNAS replication operational concepts and several uses of snapshots with replication in the SOHO environment. We’ll look at local backups, off-site backups, and new system data initialization. This article deliberately avoids implementation switchology as iX Systems may change it from release to release. The TrueNAS Guide is very good and continuous improvement.

Since I originally set up replication on FreeNAS 9, life has become much easier using TrueNAS 12. The TrueNAS 12 user interface has eliminated most of the friction experienced when making the replication initial setup in 2018. iX Systems wizards have established dialogs that support the most common use cases. In most cases it is no longer necessary to explicitly create the snapshot task, snapshot schedule, and replication tasks. Creation of a TrueNAS 12 replication task using the standard dialog will create the supporting snapshot task and snapshot schedule.

This post aims to guide the novice admin through the workflows needed to configure basic internal replication and external replications independent of the applications of the replication results.

Revisions

  1. Revised to add a section “Replications for Backups” and added the setting needed for that use case.
  2. Revised the intro to make it clearer what we were talking about, TrueNAS, ZFS, and ZFS replication.
  3. Revised to mention that 2021 ReadyNAS and Synology NAS offerings now use BTRFS rather than Linux LVM, Kernel RAID, and EXT4 filesystem.

References

This article describes the capabilities of TrueNAS 12 which is built upon FreeBSD 12 with a significant storage system management environment added to the core system. iX Systems has written an excellent system Guide that describes the configuration and operation of the system. Please consult the Guide for all system operations tasks.

  1. https://www.truenas.com/docs/hub/tasks/scheduled/snapshot-scheduling/
  2. https://www.truenas.com/docs/hub/tasks/scheduled/replication/
  3. http://storagegaga.com/ransomware-recovery-with-truenas-zfs-snapshots/ (Windows 10 example)

ZFS Storage Model

ZFS is very flexible in its ability to represent storage. The basic element of ZFS storage is the block. ZFS combines one or more physical disks into a virtual block device called a pool.

Dismal’s disks and pools

A pool can contain volumes and file systems. Interestingly, ZFS allows volumes to contain volumes. This arrangement represents a physical disk having several slices, each with it own file system. Or you can think of it as modeling a MS volume having several partitions.

Dismal’s main pool organization

Pools also combine physical volumes into a larger logical volume. Depending upon the pool configuration, the user data can be mirrored or stored with single, double, or triply redundant data. A physical device can be replaced with one the same size or larger. ZFS will recreate the content of the original media on the replacement media. If a larger device is provided, some space will be lost until all devices are of the larger size.

The TrueNAS documentation tends to use filesystem and dataset interchangably. In Unix, a physical volume contains slices and a slice of storage contains a filesystem. Dataset is older terminology carried over from the IBM OS360 meaning a file or group of files in a physical container (volume usually). This terminology evolved to allow speaking of single reel and multi-reel tape sets with one or more files on them in addition to disk storage in the days of mountable disk packs and datasets that spanned disk packs.

A filesystem spans each pool. That file system may contain both filesystems and volumes. Volumes are sparsely provisioned but have an upper size limit. They typically represent a partition or slice of physical media or virtual media. A volume may contain one or more filesystems.

Filesystems contain volumes or files as noted above and are hierarchical. Thus filesystems may contain filesystems. Exporting occurs at the file system directory node allowing part of a filesystem to be exported.

Filesystem snapshots capture the current state of a filesystem. They are taken on the live filesystem as data is being written and read. A snapshot can be serialized to a new pool and filesystem and the destination may be local to the source TrueNAS system or sent to a destination TrueNAS system, pool and filesystem.

Snapshots and Replication

Snapshots and Replication are like peanut butter and jelly. They were made to go together. A snapshot describes a read-only version of a filesystem at a point in time. The snapshot is a compact encoding of the base version plus the changes made between snapshots. It allows the storage resource to be reconstructed in that state. When replicated to independent storage, the snapshot series serves as a sequence of backups of the state and contents of the pool or data set (filesystem).

Copy on write filesystems like APFS, BTRFS, and ZFS record filesystem changes to a journal as as active processes make them queuing the changes to be committed to disk. Another volume management process reads the journal and writes the changes to the disk resident data structures. If operation of the system is interrupted, the filesystem can be brought current by completing application of the queued transactions in the journal. This journal scheme replaces static auditing of the filesystem as part of the system recovery process. Conceptually, each snapshot is a serialization of the journal for retention and playback to update a second local or remote filesystem.

Replication is the process of transmitting and storing a snapshot from the snapshot source to a replication destination, usually a second internal pool and data set or a pool and data set on a second TrueNAS server. Replication of a snapshot is a good way to initialize the data on a newly commissioned TrueNAS server or to transfer data to a second pool that serves as a backup of the original.

The use of a snapshot sequence allows the saving of evolving versions of works in progress. If a work object is lost, an image from the most recent snapshot or any earlier snapshot of the object (directory content or file content) is possible.

Snapshots

Snapshots are a compact representation of the state of a filesystem. The first snapshot describes the initial state of the filesystem. Each subsequent snapshot represents the state of the filesystem at a later point in time. The snapshot is a journal of the changes made since the previous snapshot. Or if you will, it is the serialization of the changes made since the last snapshot. A snapshot specification tells ZFS how to make snapshots of a filesystem.

I suspect ZFS makes the snapshot on the fly and closes it at the end of the snapshot interval (time to start the next snapshot). Once complete, the snapshot triggers the replication.

Several things make up a snapshot description. Each snapshot specification has a content specification that describes the filesystem to be check pointed. It has a snapshot schedule that specifies how often snapshots are to be taken. And it has a sequence of snapshot datasets that encode the original snapshot and the incremental changes between each snapshot point in time. If the snapshot is replicated on a second machine, a TrueNAS SSH connection entry describes the connection and a TrueNAS SSH keypair entry at the destination provides the necessary public key which is kept locally.

The snapshot specification describes what is to be captured. Snapshots can be “recursive” or not. A recursive snapshot includes each child filesystem and its children much as a recursive directory listing contains each child directory and its children.

The snaphshot specification also specifies the number of snapshots to be retained. When the limit is reached, the oldest snapshot is folded into the initial state of the file system and the space the snapshot occupied is freed for reuse.

TrueNAS can list the available snapshot images and can perform the following operations on the selected images. It can

  • clone the dataset or pool to local storage in the state described in the snapshot,
  • it can restore the dataset to the state described in the snapshot (a rollback operation)
  • or it can delete the snapshot
dismal.local storage snapshot view

Note that snapshots are part of the pool or dataset for which they are defined. To have the data appear in a different place, a replication task must be created to move the data.

Replication

Replication is the process of merging the snapshot into the destination filesystem. The destination system applies the serialized changes to its local copy of the destination filesystem.

TrueNAS 12 supports simple replications and advanced replications. Simple replications permit a simplified setup shown below. Advanced replications setup allows access to all of the elements of replication (snapshot, source and destination, schedule) and to all of the options for each element. Most users will use the simple replication dialog.

Replication at Dismal Manor

The TrueNAS documentation and associated user interfaces have been greatly improved in moving to the TrueNAS 12 UI. There is now an express setup for local replication. Local replication allows TrueNAS to keep an image of a pool or dataset in a local pool added for backup purposes. Dismal, the manor’s home server, keeps an internal backup on a single 8 TB storage rated disk. This disk has no redundancy but it has full data and metadata CRC checking and CRC verification just like that in the doubly redundant main pool.

Dismal.Local pools

Dismal currently replicates the FreeNAS pool to the Internal Backups pool daily at 0000R. This is a local replication. We’ve never placed the Unifi Video volume into production. Unifi Video and later, Unifi Protect run on a separate Unifi appliance.

Replication for backups

When configuring a replication as a backup replication, set the option to enable Full Filesystem Replication. This setting results in all essential attributes being saved. Per the tooltip for the setting

“Completely replicate the selected dataset. The target dataset will have all of the properties, snapshots, child datasets, and clones from the source dataset.”

Local Replication

TrueNAS provides the Local Replication mechanism as a quick and easy means of setting up replication from a primary pool to a secondary pool both residing in a single TrueNAS server. Local replication uses the local transport. The dialog can configure local transport without further user assistance. Local Replication is intended for backing up a filesystem internal to the server on which it resides.

Dismal Manor uses local replication to make local backups of the primary pool’s outer filesystem. We have about 2 TB of data in the primary that we snapshot and replicate to the secondary. When setting up the snapshot time to live, you need to be mindful of snapshot retention time so as not to overflow the secondary. The Local Replication dialog handles creation of the snapshot data specification, the snapshot schedule, and the replication target specification and associated replication task. Manual snapshot specification is no longer needed.

Setting a finite retention time causes the aging out snapshots to be incorporated into the starting image and the snaphot data to be deleted. So the retention time specifies the time window back from now in which it is possible to restore any prior saved state. Before the retention window, only the initial state at window start is available.

Local Replication is Good

The job of local replication in the Dismal Manor grand scheme of computing is to provide a local backup of the primary storage pool in the unlikely event that 3 disks break before the primary can be repaired and reconstructed. It is not intended to protect the system against memory issues (ECC does this job) or a colicky CPU (processor reliability, availability, and serviceability capabilities do this job). Local replication proceeds at hardware speeds and takes advantage of transfer caching. Remote replication proceeds at SSH line speed and is subject to contention with playing videos and music or some feeble minded IOT device going mental.

Remote Replication

TrueNAS 12 also provides remote replication. A remote replication operation is the process of sending a replication snapshot from the primary machine and pool to a secondary machine and pool using a network connection. The secondary or receiving machine may be on the local network, at a friend’s house, or in a commercial service such as BackBlaze.

Remote replication can be used to initialize a newly commissioned TrueNAS server or to make colocated backups to a second TrueNAS server. Or to make backups over the Internet to a remotely located server.

Creating a local backup replication

In TrueNAS 12, you can duplicate and modify an existing replication or allow the replication dialog to create a snapshot for you (recommended for simple source specifications). First, create the replication and its associated snapshot as shown below.

FreeNAS pool replication to internal backup pool

Setting the Replication Schedule

Then create the replication schedule. A replication my occur once or periodically with some number of snapshots to be retained. The Next button takes you to the schedule form shown below.

Replication scheduling

Remote Replication SSH Authentication

Remote Replication requires that the target machine supports SSH authentication of the source clients. The new UI simplifies the procedures for establishing the SSH connection by moving it out of the replication dialog. Create the SSH connection first, then create the replication.

In TrueNAS 12, SSH connections are created before the Replication and referenced
  1. Generate the SSH key pair on the destination machine.
  2. Switch to the source machine
  3. Create the SSH connection (in System)
  4. Click the button to schlep the destination machine’s public key to the source machine.
  5. Save the connection.
Creating a connection with remote public key discovery

The source machine will use the destination machine’s public key to authenticate the connection and to secure the transfer.

Encrypting the Destination Dataset

With the new TrueNAS 12 system, creation of backups that are encrypted at rest is almost simple. The target pool must be encrypted. The results set filesystem and each individual snapshot replication will inherit the encryption of the pool. So this takes some forethought when creating the pool. The pool volume must be created with encryption enabled.

The encryption question appears at the start of the create pool dialog. The dialog will ask for or create a key. Let TrueNAS create the key as it knows best practices. AES-256 keys are recommended. The key is stored as part of the system parameters so be sure to back these up to a secure location and off-site . You will need the key if the pool is to be moved to a new system. I don’t believe the key is part of the pool’s metadata (for obvious reasons).

The TrueNAS 12 Replication UI will let you specify an encrypted target in a plain-text pool but the transfer will fail with the error that the encryption is different than that of the pool. The message in the alert Email is below. The quoted text is the names of the filesystems used as the replication source and replication destination. Note that a filesystem spans each pool.

Replication “FreeNAS pool to encrypted Internal Backups pool ” failed: Destination dataset ‘InternalBackups/Encrypted Backups’ already exists and is it’s own encryption root. This configuration is not supported yet. If you want to replicate into an encrypted dataset, please, encrypt it’s parent dataset.

TrueNAS 12 does not allow nesting of encrypted file systems so encryption must be applied at the pool level and inherited by all containers in that pool.