v.21.6 New Feature
Added s3Cluster Function for Parallel File Processing on S3 in ClickHouse
Added a table function s3Cluster, which allows processing files from s3 in parallel on every node of a specified cluster. #22012 (Nikita Mikhaylov).
Why it matters
This feature solves the problem of efficiently reading and processing large datasets stored in S3 by distributing the workload across multiple cluster nodes. It improves performance and scalability when working with S3-based data sources in a distributed ClickHouse setup.
How to use it
Use the s3Cluster table function by specifying the target cluster and the S3 file paths. ClickHouse then reads the matching S3 files in parallel on every node of the cluster during query execution.
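As a rough illustration of the call shape, the sketch below builds a query string for s3Cluster in Python. It assumes the argument order given in the ClickHouse documentation (cluster name, source URL, format, structure) and leaves out the optional credentials. The cluster name, bucket URL, and column structure are hypothetical placeholders, not values from the changelog.

```python
def s3_cluster_query(cluster, url, fmt, structure):
    """Build a SELECT over the s3Cluster table function.

    Assumed argument order: cluster name, source URL, format, structure.
    Optional S3 credentials are omitted in this sketch.
    """
    def quote(value):
        # Escape backslashes and single quotes for a SQL string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    args = ", ".join(quote(v) for v in (cluster, url, fmt, structure))
    return f"SELECT count() FROM s3Cluster({args})"


# Hypothetical cluster, bucket, and schema, for illustration only.
query = s3_cluster_query(
    "my_cluster",
    "https://my-bucket.s3.amazonaws.com/data/*.csv",
    "CSV",
    "id UInt64, name String",
)
print(query)
```

The resulting string can be passed to any ClickHouse client. The cluster name must match a cluster defined in the server configuration, and the glob in the URL selects the set of files whose reading is spread across the nodes.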