v.23.6 Improvement

Add input_format_max_bytes_to_read_for_schema_inference setting for schema inference limit

Added the setting input_format_max_bytes_to_read_for_schema_inference to limit the number of bytes read during schema inference for input formats. Closes #50577. #50592 (Kruglov Pavel).

Why it matters

This setting addresses the problem of excessive data reading when ClickHouse infers a schema from input data. Capping the number of bytes read for schema inference bounds the time and resources spent on inference when importing or querying large files whose schema is not known in advance.

How to use it

Set input_format_max_bytes_to_read_for_schema_inference to the desired byte limit before running queries that rely on schema inference. The setting can be applied in a user settings profile, for the current session with SET, or for a single query with a SETTINGS clause.
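
As a minimal sketch of the session-level and per-query forms, the statements below might look as follows. The file name events.jsonl and the 1 MiB limit are assumptions chosen for illustration; they do not come from the changelog entry.

```sql
-- Assumption: 'events.jsonl' is a hypothetical file readable by the file() table function.
-- Assumption: 1048576 (1 MiB) is an illustrative limit, not a recommended value.

-- Apply the limit for the rest of the session.
SET input_format_max_bytes_to_read_for_schema_inference = 1048576;

-- Inspect the schema inferred under that limit.
DESCRIBE file('events.jsonl', 'JSONEachRow');

-- Or apply the limit to a single query only.
SELECT *
FROM file('events.jsonl', 'JSONEachRow')
LIMIT 10
SETTINGS input_format_max_bytes_to_read_for_schema_inference = 1048576;
```

A lower limit means inference sees less of the input, so the inferred schema may be based on fewer samples. It is worth checking the DESCRIBE output before relying on it.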