v.21.9 New Feature

Zero-copy Replication for ReplicatedMergeTree over HDFS Storage

Zero-copy replication for ReplicatedMergeTree over HDFS storage. #25918 (Zhichang Yu).
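Introduces zero-copy replication for ReplicatedMergeTree tables whose data is stored on HDFS. Replicas that share the same HDFS storage replicate only the metadata of a data part and reference the files already written there, instead of each replica fetching and storing its own full copy.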

Why it matters

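Without zero-copy replication, every replica downloads each new data part and writes its own copy, so a table with N replicas stores N copies of the data on HDFS, in addition to HDFS's own block-level replication. Zero-copy replication removes that redundant copy step: fetching a part becomes a metadata operation, which reduces both replication time and storage use. The benefit grows with the size of the dataset and the number of replicas.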

How to use it

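To use zero-copy replication, define an HDFS disk and a storage policy that uses it, create your ReplicatedMergeTree tables with that storage policy, and enable zero-copy replication in the MergeTree settings (the ClickHouse documentation names this setting allow_remote_fs_zero_copy_replication). All replicas must point at the same HDFS location. Once enabled, no further action is needed: when a replica fetches a part that already exists on HDFS, it reuses the existing files instead of copying them.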
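The configuration might look roughly like the sketch below. It is an illustration under stated assumptions rather than a verified setup: the disk name `hdfs_disk`, the policy name `hdfs_policy`, and the endpoint URL are hypothetical placeholders, and the setting name `allow_remote_fs_zero_copy_replication` is taken from the ClickHouse documentation and should be checked against your release.

```xml
<!-- Sketch of a server config file (for example, a file under config.d/). -->
<!-- Root tag: <yandex> was the usual root in 21.x; later releases also accept <clickhouse>. -->
<yandex>
    <storage_configuration>
        <disks>
            <!-- "hdfs_disk" is a hypothetical name chosen for this example. -->
            <hdfs_disk>
                <type>hdfs</type>
                <!-- Placeholder endpoint; every replica must use the same HDFS location. -->
                <endpoint>hdfs://namenode.example:9000/clickhouse/</endpoint>
            </hdfs_disk>
        </disks>
        <policies>
            <!-- "hdfs_policy" is a hypothetical name chosen for this example. -->
            <hdfs_policy>
                <volumes>
                    <main>
                        <disk>hdfs_disk</disk>
                    </main>
                </volumes>
            </hdfs_policy>
        </policies>
    </storage_configuration>

    <!-- Assumed setting name; verify it exists in your version before relying on it. -->
    <merge_tree>
        <allow_remote_fs_zero_copy_replication>1</allow_remote_fs_zero_copy_replication>
    </merge_tree>
</yandex>
```

With a configuration along these lines on every replica, a ReplicatedMergeTree table would opt in to the HDFS disk by adding SETTINGS storage_policy = 'hdfs_policy' to its CREATE TABLE statement.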