v.23.6 New Feature
Allow Skipping Empty Files in Table Functions with New Settings
Allow to skip empty files in file/s3/url/hdfs table functions using settings s3_skip_empty_files, hdfs_skip_empty_files, engine_file_skip_empty_files, engine_url_skip_empty_files. #50364 (Kruglov Pavel).
Why it matters
This feature solves the problem of processing empty files in external data sources, which can cause unnecessary overhead or errors. By enabling this option, users can improve query efficiency and avoid failures caused by empty files when loading data from S3, HDFS, URLs, or local files.
How to use it
To use this feature, set the corresponding setting to 1 (true) before querying, for example:
SET s3_skip_empty_files = 1;
SELECT * FROM s3('https://example.com/data.csv', 'CSV');
Similarly, for HDFS, file, and URL sources, use:
SET hdfs_skip_empty_files = 1;
SET engine_file_skip_empty_files = 1;
SET engine_url_skip_empty_files = 1;
These settings can be applied for the whole session with SET, or for a single query with a SETTINGS clause, so that empty files are skipped automatically.
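As a rough sketch of the per-query form, the setting can be attached to a single statement with a SETTINGS clause instead of a session-wide SET. The bucket URL and the local path below are hypothetical placeholders, not taken from the changelog entry, and the glob patterns are assumed to match a mix of empty and non-empty files:

```sql
-- Hypothetical bucket and path: replace with your own.
-- The setting applies only to this query, not to the rest of the session.
SELECT count()
FROM s3('https://example-bucket.s3.amazonaws.com/logs/*.csv', 'CSV')
SETTINGS s3_skip_empty_files = 1;

-- Same idea for local files read through the file() table function.
-- 'data/*.csv' is a hypothetical path relative to the server's user files directory.
SELECT count()
FROM file('data/*.csv', 'CSV')
SETTINGS engine_file_skip_empty_files = 1;
```

Scoping the setting to one query keeps the default behavior for other queries in the same session, which may be preferable when an empty file elsewhere should still be reported.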