v.22.2 New Feature

New File Creation and Overwrite Options for File/S3/HDFS Engines in ClickHouse

Adds an option to create new files on insert for the File/S3/HDFS engines. Allows overwriting a file in HDFS. Throws an exception on an attempt to overwrite a file in S3 by default. Throws an exception on an attempt to append data to a file in formats that have a suffix (and thus don't support appends, like Parquet, ORC). Closes #31640. Closes #31622. Closes #23862. Closes #15022. Closes #16674. #33302 (Kruglov Pavel).
Introduces an option to create new files on INSERT operations for File, S3, and HDFS table engines, allowing overwrites on HDFS and enforcing restrictions on file overwrites and appends for certain formats.

Why it matters

This feature addresses limitations around appending to or overwriting an existing file when inserting data into tables backed by the File, S3, and HDFS engines. Users can now have each INSERT write a new file instead of touching the existing one. Unintended overwrites in S3 are rejected by default, and appends are rejected for formats that end with a suffix and therefore cannot be appended to (e.g., Parquet, ORC). Overwriting a file in HDFS is supported. Together these changes reduce the risk of silently corrupting or losing data in external file-based storage.

How to use it

To enable this feature, turn on the corresponding setting for INSERT queries against tables that use the File, S3, or HDFS engines. ClickHouse then creates a new file on each INSERT instead of writing to the existing one. By default, an exception is thrown on an attempt to overwrite a file in S3 or to append to a format that does not support appends (e.g., Parquet, ORC). Overwriting a file in HDFS is also supported.
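A minimal sketch of how this might look for an S3-backed table follows. The changelog entry above does not name the settings, so the names used here (`s3_create_new_file_on_insert`, `s3_truncate_on_insert`, and the HDFS and File counterparts in the comments) are taken from the ClickHouse settings reference and should be checked against your server version. The bucket URL and table name are hypothetical, and S3 credentials are assumed to be configured elsewhere.

```sql
-- Hypothetical table and bucket URL; replace with your own.
CREATE TABLE s3_events (id UInt64, msg String)
ENGINE = S3('https://my-bucket.s3.amazonaws.com/events/data.parquet', 'Parquet');

-- First INSERT creates the object.
INSERT INTO s3_events VALUES (1, 'first');

-- With default settings, a second INSERT into the same key is expected
-- to throw, because the file already exists and Parquet cannot be appended to.

-- Option 1: write a separate file for each INSERT.
SET s3_create_new_file_on_insert = 1;
INSERT INTO s3_events VALUES (2, 'second');

-- Option 2: explicitly allow replacing the existing file.
SET s3_create_new_file_on_insert = 0;
SET s3_truncate_on_insert = 1;
INSERT INTO s3_events VALUES (3, 'third');

-- Assumed counterparts for the other engines (verify before relying on them):
--   HDFS: hdfs_create_new_file_on_insert, hdfs_truncate_on_insert
--   File: engine_file_allow_create_multiple_files, engine_file_truncate_on_insert
```

Creating a new file per INSERT keeps earlier data intact, while the truncate settings replace it, so the choice depends on whether the existing contents should survive the write.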