v.23.3New Feature

Add Custom Key Mode for Replica Shard Splitting in ClickHouse

Add a new mode for splitting the work on replicas using settings parallel_replicas_custom_key and parallel_replicas_custom_key_filter_type. If the cluster consists of a single shard with multiple replicas, up to max_parallel_replicas will be randomly picked and turned into shards. For each shard, a corresponding filter is added to the query on the initiator before being sent to the shard. If the cluster consists of multiple shards, it will behave the same as sample_key but with the possibility to define an arbitrary key. #45108 (Antonio Andelic).
Introduces a new mode for distributing query work across replicas using customizable settings parallel_replicas_custom_key and parallel_replicas_custom_key_filter_type, enabling more flexible and efficient workload splitting in clusters.

Why it matters

This feature solves the problem of limited flexibility in parallelizing query execution across replicas. It allows users to define a custom key to split work on a cluster with a single shard and multiple replicas, randomly selecting up to max_parallel_replicas to act as shards with corresponding filters applied to queries. In multi-shard clusters, it behaves like the existing sample_key approach but supports arbitrary custom keys, thus improving parallel query distribution and potentially enhancing query performance.

How to use it

To enable this feature, set the parallel_replicas_custom_key to the desired key for splitting, and configure parallel_replicas_custom_key_filter_type to define the filter behavior. Adjust the max_parallel_replicas setting to control the number of replicas involved in parallelization. The query initiator will then apply the necessary filters before dispatching queries to shards or replicas.