v.24.6 Improvement
Added new setting for Parquet input format and updated default block size
Added a new setting input_format_parquet_prefer_block_bytes to control the average output block bytes, and modified the default value of input_format_parquet_max_block_size to 65409. #64427 (LiuNeng).
Why it matters
This setting lets users manage memory use and processing efficiency by controlling the size of the blocks produced when reading Parquet files. Tuning the output block size can improve performance and resource utilization during data ingestion.
How to use it
Users can set input_format_parquet_prefer_block_bytes to their preferred average output block size, in bytes. The default value of input_format_parquet_max_block_size is now 65409. Both settings can be configured in the ClickHouse server configuration files or as session settings before running queries that read Parquet files.
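As a rough illustration of per-query use, the Python sketch below attaches both settings to a single query through a ClickHouse SETTINGS clause. The helper name with_settings, the file name events.parquet, and the 8 MiB target are assumptions made for the example, not values recommended by this entry. Only the two setting names and the 65409 default come from the entry above.

```python
# Illustrative values: 8 MiB is an assumed target, not a recommendation.
PARQUET_SETTINGS = {
    "input_format_parquet_prefer_block_bytes": 8 * 1024 * 1024,
    "input_format_parquet_max_block_size": 65409,  # new default from this entry
}


def with_settings(query: str, settings: dict[str, int]) -> str:
    """Append a SETTINGS clause to a query that does not already have one.

    Hypothetical helper for this example; it only builds the SQL text.
    """
    clause = ", ".join(f"{name} = {value}" for name, value in settings.items())
    # Drop trailing whitespace and a trailing semicolon before appending.
    return f"{query.rstrip().rstrip(';')} SETTINGS {clause}"


# 'events.parquet' is a placeholder file name.
sql = with_settings(
    "SELECT count() FROM file('events.parquet', 'Parquet')",
    PARQUET_SETTINGS,
)
print(sql)

# Session-level alternative, run once before the queries:
#   SET input_format_parquet_prefer_block_bytes = 8388608;
```

The resulting string can be passed to whichever ClickHouse client is in use. A per-query SETTINGS clause keeps the change scoped to one statement, while SET applies it for the rest of the session.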