v.24.6 Improvement

Added new setting for Parquet input format and updated default block size

Added a new setting input_format_parquet_prefer_block_bytes to control the average output block bytes, and modified the default value of input_format_parquet_max_block_size to 65409. #64427 (LiuNeng).
Introduced a new setting, input_format_parquet_prefer_block_bytes, to control the target average size, in bytes, of the blocks produced when reading Parquet files, and updated the default value of input_format_parquet_max_block_size to 65409. The latter limit is counted in rows, not bytes.

Why it matters

The new setting gives users byte-level control over the size of the blocks produced when reading Parquet files. A byte-based target is useful when rows are wide, since a fixed row count alone can then produce very large blocks. Tuning the output block size helps balance memory use against per-block processing overhead during data ingestion.

How to use it

Set input_format_parquet_prefer_block_bytes to the preferred average output block size in bytes. input_format_parquet_max_block_size now defaults to 65409 and can still be overridden. Both are query-level settings: they can be set in a settings profile in the server's user configuration, for the current session with SET, or for a single query with a SETTINGS clause, before reading Parquet files.
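
As a minimal sketch of the session-level and query-level options, the statements below set both values and then check them. The file name events.parquet and the 16 MiB target are illustrative assumptions, not recommended values.

```sql
-- Session level: applies to every later query in this session.
-- 16777216 bytes (16 MiB) is an illustrative value only.
SET input_format_parquet_prefer_block_bytes = 16777216;

-- Query level: applies only to this statement.
-- 'events.parquet' is a hypothetical file name.
SELECT count()
FROM file('events.parquet', 'Parquet')
SETTINGS
    input_format_parquet_prefer_block_bytes = 16777216,
    input_format_parquet_max_block_size = 65409;

-- Check the values currently in effect and whether they were changed.
SELECT name, value, changed
FROM system.settings
WHERE name IN (
    'input_format_parquet_prefer_block_bytes',
    'input_format_parquet_max_block_size'
);
```

A SETTINGS clause is the least intrusive way to experiment, since it leaves the rest of the session unaffected.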