This post comes out of a discussion with the folks at Take Control Books, who publish a line of topic- and capability-oriented Apple product guides like Take Control of Your Digital Storage. That is a big topic, and Jeff Carlson explores it capably. NAS (network attached storage) is among the subjects he considers. There are three sorts of systems available in this market space:
- Those targeted at small offices and homes, offered by Synology and QNAP (the two best known) and some others.
- Professional products offered at departmental scale by the global IT vendors: EMC, NetApp, Dell, HP, IBM, the usual suspects.
- Cluster computing products like those offered by Red Hat (now part of IBM) and various Linux distributions. These systems present a single file system view to numerous computers formed into a capacity or reliability cluster. 45Drives, iXsystems, and others are in this space with hardware, drivers, the CephFS scale-out file system, and so on.
In this article we will consider the essential capabilities a small NAS should have. The intent is to prepare my readers to venture into the world of marketing slicks, spec sheets, and white papers.
- 2022-06-01 Original
NAS Killer App?
Network file storage is the killer app that a NAS provides. DVR, surveillance recording, photo identification, music serving, video serving, and the like are all potential sweeteners or potential horrors depending on implementation quality. At best, they are only as good as the underlying system storage.
Essential Storage System Capabilities
So just what is Network Attached Storage? It is a service that a computer on your network provides to other networked computers. The earliest “PC servers” provided network attached storage and printing capabilities. At about the same time, Sun Microsystems was creating the Network File System (NFS), and Carnegie Mellon was developing the Andrew File System. These systems are the beginnings of modern networked file systems. Some time later, Microsoft’s SMB protocol, rebranded CIFS in the Windows NT era, emerged; it is similar to NFS in many ways. CIFS, swept in on the tide of Windows success, became the dominant network storage communications interface.
Network File Operations
A network file system does the following things.
- It provides a means of mounting a network share as if it were local storage.
- It provides a means of access control compatible with that of the host operating system.
- It provides the means to traverse the file system to locate and operate on files.
- It provides read, write, and delete mechanisms.
- It provides the means to adjust access restrictions.
In short, it allows processes running on network computers to do all of the same things that a process running locally can do.
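To make that concrete, here is a small sketch of the point: once a share is mounted, ordinary file APIs work on it unchanged. The sketch uses a temporary local directory as a stand-in for a mount point (a path like /mnt/nas is hypothetical and depends on how you mount the share).

```python
import os
import tempfile

# Stand-in for a mounted network share such as /mnt/nas (hypothetical path).
# To a local process, a mounted share is just another directory.
share = tempfile.mkdtemp()

path = os.path.join(share, "notes.txt")

with open(path, "w") as f:          # write
    f.write("hello from the LAN\n")

print(os.listdir(share))            # traverse / list the directory
with open(path) as f:               # read
    print(f.read(), end="")

os.chmod(path, 0o640)               # adjust access restrictions
os.remove(path)                     # delete
print(os.listdir(share))            # the file is gone
```

The same code runs unmodified whether `share` is local disk or an NFS or SMB mount; that transparency is the whole point of a network file system.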
File Server Capabilities
The file server provides controlled access and controlled use of local files to remote computers operating on the local area network. In providing file service it makes almost all local file system capabilities available (versions, snapshots, and return to a prior version are notable exceptions).
A file service will have the following characteristics.
- Services will be defined independent of network media, link, network, transport, and session layers. They may include a presentation layer.
- The application layer provides access to the remote file services on offer including mounting, dismounting, listing directories, moving about, reading, writing, and deleting files.
- The file service abstracts volume management. To clients, volumes are a named resource. The server combines physical volumes into logical volumes, provides data redundancy, error detection and recovery, and verification of the integrity of data at rest.
- The file service may also encrypt the data at rest. Data becomes usable only with the encryption keys available and the user authenticated properly.
Small Server File Systems
Many small storage appliances are built on top of the Linux operating system. Most use the Linux Logical Volume Manager, the Linux RAID subsystem, commodity SATA disks, and one of the available Linux file systems. Reference 2 is a good review of the history of the file systems offered in the Linux kernel over time. EXT4 is the file system of choice for most basic applications of Linux. Most Linux file systems aimed for simplicity, speed, and reliability on well-behaved hardware. They largely ignored memory errors and used check-summing to detect data errors. Detection only: simple cyclic redundancy checks allowed corruption to be detected but not repaired at the block level. Volume scaling and data redundancy happened outside the file system.
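The "detection only" limitation is easy to demonstrate. This toy sketch (the block contents are invented for illustration) flips one bit in a checksummed block: the CRC reveals the corruption but carries no information to undo it.

```python
import zlib

# A block of "data at rest" and its stored checksum.
block = bytearray(b"important family photo bytes")
stored_crc = zlib.crc32(block)

# Bit rot: a single bit flips on disk.
block[3] ^= 0x01

# The checksum reveals that the block is corrupt...
corrupt = zlib.crc32(block) != stored_crc
print("corruption detected:", corrupt)

# ...but a CRC alone cannot say which bit flipped, let alone restore it.
# Repair requires redundancy (a mirror or parity copy), which ext4-era
# designs left to the RAID layer below the file system.
```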
Four file systems, APFS, BTRFS, XFS, and ZFS, attempted to merge volume management, data integrity management, and, with APFS and ZFS, data security at rest. APFS, BTRFS, and ZFS are copy-on-write file systems; XFS is not, though it has since gained some copy-on-write features such as reflinks. Two of these come from the engineering workstation community: XFS (Silicon Graphics) and ZFS (Sun). APFS comes from a commercial personal computing supplier, Apple, and BTRFS began at Oracle as an open source project for Linux. XFS and ZFS are mature and have been in production use for 20 or so years. BTRFS is approaching maturity and is available in the Linux kernel. APFS is an Apple commercial product found in iThings and MacThings.
ZFS has never been merged into the mainline Linux kernel because its CDDL license is generally considered incompatible with the GPL, but the OpenZFS project provides it as an out-of-tree module, and Ubuntu has shipped ZFS support since 2016. It has become a popular choice for new storage-focused Linux deployments.
Linux has its limitations
Computer systems houses catering to large scale professional computing (modeling, simulation, weather, climate, computational aerodynamics, nuclear reactor behavior modeling, particle physics, CAD, computer design, etc.) have developed large scale file systems that overcome the limitations of Logical Volume Manager, Linux Kernel Raid, etc. XFS and ZFS were the first of these. By and large, these systems integrate physical volume management, logical volume formation, data encryption, data and metadata error detection and correction, etc. into a unified framework. Successful integrations offer a number of important advantages for the long-term storage of data, most of which is at rest.
The large scale system vendors, especially those providing supercomputing systems to researchers, needed file system flexibility: computers drawn from the multi-processor pool each needed an application-tailored file system view of applications, reference data like material properties, case setup data, and case results, independent of the location of the referencing process. Moving data around would take entirely too long, so cluster file systems offered tailored views of a distributed store that were compact and easily made available to each participating runtime environment. File systems such as CephFS give an integrated virtual view of individual file systems located throughout the cluster, so each cluster member can have its own boot environment, OS release media, and utilities while applications and their data are shared.
These specialized cluster filesystems are beyond the scope of this article.
Modern File Systems have additional capabilities
The new file systems offer some capabilities above and beyond basic data storage semantics.
- The ability to address more data than there are stars in the known universe.
- The ability to expand the storage system.
- Data redundancy and check-summing to detect and recover from disk errors.
- Copy on write semantics. When a file is modified or extended, new blocks are written rather than overwriting old ones in place; only the changes consume additional disk space.
- Snapshots record the current state of each file at snapshot time by recording the appropriate metadata. Only blocks changed after the snapshot consume additional space.
- The file system state can be rewound to any previous snapshot.
- The host OS can perform verification scans of the filesystem. Errors developing in data at rest can be detected, reported, and corrected. Data redundancy and meta-data redundancy make this capability possible.
- Boot environments. Each OS and kernel update can be captured as a new version of the boot file system.
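Copy on write, snapshots, and rewind fit together neatly, and a toy model makes the interaction clear. This sketch is purely illustrative (real file systems track blocks and far richer metadata, not whole files): writes never overwrite in place, a snapshot merely pins the current block map, and rolling back restores that map.

```python
# Toy copy-on-write store. Hypothetical and illustrative only; this is
# not how ZFS or BTRFS are implemented, just the shape of the idea.

class CowStore:
    def __init__(self):
        self.blocks = {}      # block id -> immutable contents
        self.live = {}        # file name -> block id (the current block map)
        self.snapshots = {}   # snapshot label -> frozen copy of a block map
        self._next = 0

    def write(self, name, data):
        # Copy on write: every write allocates a fresh block.
        # Old blocks are never overwritten in place.
        self.blocks[self._next] = data
        self.live[name] = self._next
        self._next += 1

    def read(self, name):
        return self.blocks[self.live[name]]

    def snapshot(self, label):
        # A snapshot is just a copy of the (small) block map: metadata only,
        # which is why snapshots are cheap and nearly instantaneous.
        self.snapshots[label] = dict(self.live)

    def rollback(self, label):
        # Rewind: restore the pinned block map. The old blocks still exist.
        self.live = dict(self.snapshots[label])

fs = CowStore()
fs.write("taxes.pdf", "2021 return")
fs.snapshot("pre-intrusion")
fs.write("taxes.pdf", "ENCRYPTED BY RANSOMWARE")  # new block; original survives
fs.rollback("pre-intrusion")
print(fs.read("taxes.pdf"))  # the pre-intrusion contents are back
```

Note that the ransomware "overwrite" never destroyed the original block, which is exactly why snapshots on a copy-on-write file system are such an effective recovery tool.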
Why these capabilities matter
Error detection and recovery are important to maintaining the integrity of long-lived data like photos or videos. Your tax records, retirement accounts, investment capital gains, and investment transactions also have to be retained for many years, possibly a lifetime.
Second, system intrusion followed by encryption of data for ransom is a common extortion tactic used by bad actors. When ransomware encrypts files, a copy-on-write file system writes new copies rather than overwriting the originals, so regular snapshots make it possible to revert to a pre-intrusion version.
Reverting boot environments and reverting file system versions are powerful tools for recovering from buggy updates and intrusion damage.
Dismal Manor Storage Shopping Experience
In 2017, it was apparent that my 10-year-old Drobo storage appliances were approaching end of life. They were slow. They were on their second set of disks. And they couldn’t be connected to the new computer, a 2017 iMac (Apple had retired FireWire). So I made a set of criteria for their replacement.
- A supported physical interface, preferably a network interface, as other things were becoming users of stored data. Music had moved into Roon, photos were now in Apple Photos, etc.
- A physical interface that had demonstrated staying power. Of those in use, only Ethernet had demonstrated longevity.
- Data and metadata redundancy like that provided by Drobo.
- Easy recovery like that provided by Drobo. Replace the disk and let it go.
- A means of backup.
- Plays nicely with Time Machine.
Confessions of an OS junkie
Well into the ’90s, I was a processor architecture, system architecture, and OS architecture junkie. While I was a young engineer, computing was a frothing world of experimentation and survival of the successful. While working in the nuclear power industry, I delivered simulators on SunOS and IRIX and was familiar with the XFS work under way at Silicon Graphics.
At about this time, Linux was becoming usable, and Intel IA32 had gained uniform addressing, memory address translation, memory protection, etc., making it worth using professionally. When I moved to Navy modeling and simulation, my work environment became Linux-based. By this time, the excitement of the ’80s and ’90s had sort of fizzled, but cluster development and cluster file systems remained a hot topic in the supercomputing industry and among their users.
So I was aware of ZFS. Searching on DIY NAS, I quickly discovered FreeNAS, now TrueNAS, and iXsystems, and that these things were eminently suitable for home-brew. Oh, and built on ZFS, as was FreeBSD, a descendant of the work at Berkeley. So I investigated further and learned that ZFS data and metadata check-summing was unusual: of the file systems used in storage appliances, only ZFS offered data integrity verification. ZFS also offered snapshot replication, which became my primary means of backup. Sun had planned for backups!
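Snapshot replication as a backup strategy looks roughly like the following sketch. The pool, dataset, and host names (tank, media, backuphost) are hypothetical, and these commands assume real ZFS pools on both machines plus the necessary privileges, so treat this as an illustration rather than a recipe.

```shell
# Take a point-in-time snapshot of the dataset (cheap: metadata only).
zfs snapshot tank/media@2022-06-01

# Replicate the snapshot to a second machine as a backup.
zfs send tank/media@2022-06-01 | ssh backuphost zfs receive backup/media

# Later snapshots can be sent incrementally: only blocks changed since
# the previous snapshot cross the wire.
zfs send -i tank/media@2022-06-01 tank/media@2022-07-01 \
  | ssh backuphost zfs receive backup/media
```

Because an incremental send moves only changed blocks, nightly replication of a large, mostly-at-rest media library stays fast.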
So I built one. The other thing I learned was that Synology, QNAP, etc. were about pretty faces. You couldn’t find out which file system was used or whether they protected data at rest from bit rot. They were pretty. They were cheap. They were slow. They did not have ECC memory, vital in a system that operates continuously.
When you shop, look for
- ECC memory
- Data at rest integrity checking
- A means of backup
- Snapshots and snapshot restoration
If you are building with new parts, you will find it hard to complete a build for less than the cost of an iXsystems TrueNAS Mini. You’ll also find that storage dominates the cost of the system and that iXsystems storage prices will be hard to beat. There was literally no premium for proper NAS-spec drives from iXsystems.
When I purchased my Mini, the machine was beautifully packaged. The drives were properly packaged with proper shock and vibration protection. In fact, they were packed like OEM repair parts!