v.20.4 Improvement

Allow SAMPLE OFFSET Query for Splitting in ClickHouse Copier

Allow using a SAMPLE OFFSET query instead of cityHash64(PRIMARY KEY) % N == n for splitting in clickhouse-copier. To use this feature, pass --experimental-use-sample-offset 1 as a command-line argument. #10414 (Nikita Mikhaylov)
In short, clickhouse-copier now supports the SAMPLE OFFSET syntax as an alternative to cityHash64(PRIMARY KEY) % N == n for data splitting.

Why it matters

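This feature lets clickhouse-copier partition data during a copy with ClickHouse's built-in SAMPLE ... OFFSET clause instead of a hand-written hash filter. Each piece is selected with standard sampling syntax, which is easier to read and potentially more efficient for splitting data across multiple workers than evaluating cityHash64 over the primary key. Keep in mind that the SAMPLE clause only works on tables that declare a sampling key (SAMPLE BY), so the source table must have one.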
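
To make the difference concrete, the following Python sketch builds both forms of the split condition for N pieces. It is illustrative only: the helper names are hypothetical, and the SAMPLE 1/N OFFSET n/N shape is an assumption based on ClickHouse's SAMPLE k OFFSET m syntax, not a copy of the SQL that clickhouse-copier actually generates.

```python
def hash_split_filter(key_columns, pieces, piece):
    """Hash-based predicate in the form quoted by the changelog entry."""
    if not 0 <= piece < pieces:
        raise ValueError("piece must satisfy 0 <= piece < pieces")
    key = ", ".join(key_columns)
    return f"cityHash64({key}) % {pieces} == {piece}"


def sample_offset_clause(pieces, piece):
    """Sampling clause that selects one of `pieces` equal slices.

    Assumed shape: SAMPLE 1/N OFFSET n/N (ClickHouse's SAMPLE k OFFSET m).
    """
    if not 0 <= piece < pieces:
        raise ValueError("piece must satisfy 0 <= piece < pieces")
    return f"SAMPLE 1/{pieces} OFFSET {piece}/{pieces}"


if __name__ == "__main__":
    # Show both forms side by side for a table split into 3 pieces.
    for n in range(3):
        print(hash_split_filter(["id"], 3, n), "|", sample_offset_clause(3, n))
```

The hash form filters on a value computed from the primary key columns, while the sampling form relies on the table's sampling key to carve out each slice.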

How to use it

To enable this feature, run clickhouse-copier with the command-line argument --experimental-use-sample-offset 1. This switches splitting from the default hashing-based approach to SAMPLE OFFSET. As the flag name indicates, the option is experimental and stays off unless you pass it explicitly.
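
As a rough sketch, a wrapper script could assemble the invocation like this. The --config, --task-path, and --base-dir options are the usual clickhouse-copier arguments. The file and path values are placeholders, and the copier_command helper is hypothetical.

```python
import shlex


def copier_command(config, task_path, base_dir, use_sample_offset=True):
    """Build the argv list for a clickhouse-copier run.

    The config, task_path and base_dir values are placeholders supplied
    by the caller; only the experimental flag comes from this changelog entry.
    """
    argv = [
        "clickhouse-copier",
        "--config", config,
        "--task-path", task_path,
        "--base-dir", base_dir,
    ]
    if use_sample_offset:
        # Opt in to SAMPLE OFFSET splitting instead of the default hash filter.
        argv += ["--experimental-use-sample-offset", "1"]
    return argv


if __name__ == "__main__":
    # Print a shell-safe command line; pass the list to subprocess.run to execute it.
    print(shlex.join(copier_command("zookeeper.xml", "/task/path", "/path/to/dir")))
```

Building the arguments as a list keeps the flag and its value as separate tokens, which is how the command line expects them.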