v.24.8 New Feature
Interpretation of Hive-Style Partitioning for Various Data Engines
Interpret Hive-style partitioning for different engines (File, URL, S3, AzureBlobStorage, HDFS). Hive-style partitioning organizes data into partitioned sub-directories, making it efficient to query and manage large datasets. Currently, it only creates virtual columns with the appropriate name and data. The follow-up PR will introduce the appropriate data filtering (performance speedup). #65997 (Yarik Briukhovetskyi).
Why it matters
Hive-style partitioning organizes data into partitioned sub-directories, allowing efficient querying and management of large datasets. This feature provides support for interpreting these partitions in ClickHouse, enabling users to access partition information as virtual columns. It lays the foundation for improved performance through partition pruning in follow-up updates.
How to use it
When using external storage engines such as File, URL, S3, AzureBlobStorage, or HDFS, Hive-style partitions are automatically recognized and exposed as virtual columns with appropriate names and data types. Users can then reference these virtual columns in their SQL statements. Filtering of the underlying data based on these partitions will be enabled in a subsequent release.
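To make the idea of "partitions exposed as virtual columns" concrete, here is a minimal Python sketch of how `key=value` directory segments in a path can be turned into column values. It is an illustration only, not ClickHouse's implementation: the function name `hive_partition_columns` and the sample bucket path are hypothetical, and values are kept as plain strings rather than given inferred types. Depending on the ClickHouse version, the feature may also need to be switched on through a setting (for example `use_hive_partitioning`), and the exact virtual column names may differ; treat both as assumptions to check against the documentation for your release.

```python
def hive_partition_columns(path: str) -> dict[str, str]:
    """Return {key: value} for every `key=value` directory segment in `path`.

    Illustrative only: values stay as strings, no type inference is attempted.
    """
    columns: dict[str, str] = {}
    # The last component is the file name, so only the directories are inspected.
    for segment in path.strip("/").split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:  # keep only well-formed `key=value` segments
            columns[key] = value
    return columns


# Hypothetical object path laid out in Hive style.
path = "s3://bucket/events/year=2024/month=08/part-0.parquet"
print(hive_partition_columns(path))  # {'year': '2024', 'month': '08'}
```

Each extracted key corresponds to one virtual column, and every row read from that file carries the same value for it. Partition pruning, which the entry above defers to a follow-up release, would amount to skipping files whose extracted values cannot match the query's filter.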