v.23.6 New Feature

Allow Skipping Empty Files in Table Functions with New Settings

Adds the settings s3_skip_empty_files, hdfs_skip_empty_files, engine_file_skip_empty_files, and engine_url_skip_empty_files, which let the s3, hdfs, file, and url table functions skip empty files when reading. #50364 (Kruglov Pavel).

Why it matters

External data sources often contain empty files alongside real data. Reading them adds overhead, and a zero-byte file can make a query fail when the input format cannot parse empty input. With these settings enabled, empty files are skipped when loading data from S3, HDFS, URLs, or local files, so queries avoid both the extra work and the failures.

How to use it

To use this feature, enable the setting that matches the table function before running the query. For example, for s3:

SET s3_skip_empty_files = 1;
SELECT * FROM s3('https://example.com/data.csv', 'CSV');


Similarly, the hdfs, file, and url table functions each have their own setting:

-- hdfs table function
SET hdfs_skip_empty_files = 1;
-- file table function
SET engine_file_skip_empty_files = 1;
-- url table function
SET engine_url_skip_empty_files = 1;
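Skipping empty files is most useful when a glob matches many files and only some of them are empty. A minimal sketch for the file table function follows; the logs/*.csv path and the column names are hypothetical, and the explicit structure argument is passed so the query does not rely on schema inference.

```sql
-- Hypothetical layout: the user files directory contains logs/*.csv,
-- and some of those files are zero bytes long.
SET engine_file_skip_empty_files = 1;

-- Empty files matched by the glob are skipped; the rest are read as usual.
SELECT count()
FROM file('logs/*.csv', 'CSV', 'ts DateTime, message String');
```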


These settings can be set for the session with SET, as above, or for a single query with a SETTINGS clause. Either way, empty files are skipped automatically.
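To limit the setting to one query instead of the whole session, it can be attached with a SETTINGS clause. The sketch below assumes a hypothetical bucket URL and column list; only the setting name comes from this release note.

```sql
-- Hypothetical bucket and schema, for illustration only.
SELECT count()
FROM s3(
    'https://example-bucket.s3.amazonaws.com/data/*.csv',
    'CSV',
    'id UInt64, name String'
)
-- Applies to this query only; the session value is left unchanged.
SETTINGS s3_skip_empty_files = 1;
```

The per-query form keeps the behavior explicit at the call site, which can help when only some queries should ignore empty files.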