v.23.12 New Feature

Add Union Mode for Schema Inference in ClickHouse

Add 'union' mode for schema inference. In this mode, the resulting table schema is the union of all files' schemas (so the schema is inferred from each file). The mode of schema inference is controlled by the setting schema_inference_mode, which has two possible values: default and union. Closes #55428. #55892 (Kruglov Pavel).
In other words, ClickHouse infers a schema from each input file individually and merges the results into one table schema, rather than assuming that all files share the same schema.

Why it matters

When files ingested together do not share the same columns, a schema taken from a single file can miss columns that appear only in the other files. Union mode addresses this by merging the schemas of all files into one unified table schema, which makes it easier to work with heterogeneous data sources.

How to use it

To enable the feature, set schema_inference_mode to union. The default value is default, which infers the schema from a single source. Apply the setting before reading the files so that the table schema is the union of all file schemas.
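As a minimal sketch of the steps above, the following statements enable union mode and inspect the inferred schema. The file names data1.jsonl and data2.jsonl and their contents are hypothetical, chosen only for illustration. The example assumes the files are in a location the file() table function can read and that the format is JSONEachRow.

```sql
-- Hypothetical input files (names and contents are assumptions for illustration):
--   data1.jsonl: {"id": 1, "name": "alice"}
--   data2.jsonl: {"id": 2, "age": 30}

-- Enable union mode for the current session.
SET schema_inference_mode = 'union';

-- Inspect the inferred schema. In union mode it should cover the columns
-- of both files (id, name, and age in this hypothetical case).
DESCRIBE file('data{1,2}.jsonl', JSONEachRow);

-- Read both files through the merged schema.
SELECT * FROM file('data{1,2}.jsonl', JSONEachRow);

-- Alternatively, scope the setting to a single query instead of the session.
DESCRIBE file('data{1,2}.jsonl', JSONEachRow)
SETTINGS schema_inference_mode = 'union';
```

Running DESCRIBE first is a cheap way to confirm the merged schema before loading any data. Because every file has to be inspected, schema inference in union mode may take noticeably longer on large file sets than in default mode.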