v.22.9New Feature

Add JSONObjectEachRow Format and New JSON Settings in Import/Export

Add new input/output format JSONObjectEachRow - Support import for formats JSON/JSONCompact/JSONColumnsWithMetadata. Add new setting input_format_json_validate_types_from_metadata that controls whether we should check if data types from metadata match data types from the header. - Add new setting input_format_json_validate_utf8, when it's enabled, all JSON formats will validate UTF-8 sequences. It will be disabled by default. Note that this setting doesn't influence output formats JSON/JSONCompact/JSONColumnsWithMetadata, they always validate utf8 sequences (this exception was made because of compatibility reasons). - Add new setting input_format_json_read_numbers_as_strings that allows to parse numbers in String column, the setting is disabled by default. - Add new setting output_format_json_quote_decimals that allows to output decimals in double quotes, disabled by default. - Allow to parse decimals in double quotes during data import. #40910 (Kruglov Pavel).
Introduces the new input/output format JSONObjectEachRow and enhances support for JSON formats including JSON, JSONCompact, and JSONColumnsWithMetadata with new validation and parsing settings.

Why it matters

This feature improves the flexibility and robustness of JSON data import and export in ClickHouse. It addresses common data compatibility and validation challenges by allowing users to control type validation based on metadata, UTF-8 validation, number parsing as strings, and decimal formatting. This results in more accurate data handling and better compatibility with varied JSON datasets.

How to use it

Users can enable the new format JSONObjectEachRow by specifying it in their input/output format settings. Additionally, they can configure the following settings to tailor JSON processing:

- input_format_json_validate_types_from_metadata: enables type validation against metadata.
- input_format_json_validate_utf8: validates UTF-8 sequences for JSON input (disabled by default).
- input_format_json_read_numbers_as_strings: allows parsing numbers into String columns (disabled by default).
- output_format_json_quote_decimals: outputs decimal numbers as quoted strings (disabled by default).

These settings can be adjusted via the client or server configuration or session settings accordingly.