- Example: A RAID array of 2 hard drives and an SSD caching disk is controlled by Intel's RST system, part of the chipset and firmware built into a desktop computer. The user sees this as a single volume, containing an NTFS-formatted drive of their data, and NTFS is not necessarily aware of the manipulations that may be required (such as rebuilding the RAID array if a disk fails). The management of the individual devices and their presentation as a single device, is distinct from the management of the files held on that apparent device.
ZFS is unusual, because unlike most other storage systems, it unifies both of these roles and acts as both the volume manager and the file system. Therefore, it has complete knowledge of both the physical disks and volumes (including their condition, status, their logical arrangement into volumes, and also of all the files stored on them). ZFS is designed to ensure (subject to suitable hardware) that data stored on disks cannot be lost due to physical error or misprocessing by the hardware or operating system, or bit rot events and data corruption which may happen over time, and its complete control of the storage system is used to ensure that every step, whether related to file management or disk management, is verified, confirmed, corrected if needed, and optimized, in a way that storage controller cards, and separate volume and file managers cannot achieve.
ZFS also includes a mechanism for snapshots and replication, including snapshot cloning; the former is described by the FreeBSD documentation as one of its "most powerful features", having features that "even other file systems with snapshot functionality lack".[12] Very large numbers of snapshots can be taken, without degrading performance, allowing snapshots to be used prior to risky system operations and software changes, or an entire production ("live") file system to be fully snapshotted several times an hour, in order to mitigate data loss due to user error or malicious activity. Snapshots can be rolled back "live" or the file system at previous points in time viewed, even on very large file systems, leading to "tremendous" savings in comparison to formal backup and restore processes,[12] or cloned "on the spot" to form new independent file systems.
Summary of key differentiating features[edit]
Examples of features specific to ZFS which facilitate its objective include:
- Designed for long term storage of data, and indefinitely scaled datastore sizes with zero data loss, and high configurability.
- Hierarchical checksumming of all data and metadata, ensuring that the entire storage system can be verified on use, and confirmed to be correctly stored, or remedied if corrupt. Checksums are stored with a block's parent block, rather than with the block itself. This contrasts with many file systems where checksums (if held) are stored with the data so that if the data is lost or corrupt, the checksum is also likely to be lost or incorrect.
- Can store a user-specified number of copies of data or metadata, or selected types of data, to improve the ability to recover from data corruption of important files and structures.
- Automatic rollback of recent changes to the file system and data, in some circumstances, in the event of an error or inconsistency.
- Automated and (usually) silent self-healing of data inconsistencies and write failure when detected, for all errors where the data is capable of reconstruction. Data can be reconstructed using all of the following: error detection and correction checksums stored in each block's parent block; multiple copies of data (including checksums) held on the disk; write intentions logged on the SLOG (ZIL) for writes that should have occurred but did not occur (after a power failure); parity data from RAID/RAIDZ disks and volumes; copies of data from mirrored disks and volumes.
- Native handling of standard RAID levels and additional ZFS RAID layouts ("RAIDZ"). The RAIDZ levels stripe data across only the disks required, for efficiency (many RAID systems stripe indiscriminately across all devices), and checksumming allows rebuilding of inconsistent or corrupted data to be minimised to those blocks with defects;
- Native handling of tiered storage and caching devices, which is usually a volume related task. Because it also understands the file system, it can use file-related knowledge to inform, integrate and optimize its tiered storage handling which a separate device cannot;
- Native handling of snapshots and backup/replication which can be made efficient by integrating the volume and file handling. ZFS can routinely take snapshots several times an hour of the data system, efficiently and quickly. (Relevant tools are provided at a low level and require external scripts and software for utilization).
- Native data compression and deduplication, although the latter is largely handled in RAM and is memory hungry.
- Efficient rebuilding of RAID arrays — a RAID controller often has to rebuild an entire disk, but ZFS can combine disk and file knowledge to limit any rebuilding to data which is actually missing or corrupt, greatly speeding up rebuilding;
- Ability to identify data that would have been found in a cache but has been discarded recently instead; this allows ZFS to reassess its caching decisions in light of later use and facilitates very high cache hit levels;
- Alternative caching strategies can be used for data that would otherwise cause delays in data handling. For example, synchronous writes which are capable of slowing down the storage system can be converted to asynchronous writes by being written to a fast separate caching device, known as the SLOG (sometimes called the ZIL - ZFS Intent Log).
- Highly tunable - many internal parameters can be configured for optimal functionality.
- Can be used for high availability clusters and computing, although not fully designed for this use.