v24.7 Improvement

StorageS3Queue Enhancements: Default Values, Exception Handling, and Commit Controls

StorageS3Queue-related fixes and improvements:

- Deduce the default value of s3queue_processing_threads_num from the number of physical CPU cores on the server (instead of the previous fixed default of 1).
- Set the default value of s3queue_loading_retries to 10.
- Fix a possible vague "Uncaught exception" in the exception column of system.s3queue.
- Do not increment the retry count on MEMORY_LIMIT_EXCEEDED exceptions.
- Move the file commit to a stage after the insertion into the table has fully finished, so that files can no longer be committed before their data is inserted.
- Add the settings s3queue_max_processed_files_before_commit, s3queue_max_processed_rows_before_commit, s3queue_max_processed_bytes_before_commit, and s3queue_max_processing_time_sec_before_commit to better control commit and flush timing.

#65046 (Kseniia Sumarokova).
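With the clearer error reporting, failures can be inspected directly in system.s3queue. A minimal query sketch follows; the column names (file_name, exception) match the documented system.s3queue table, but the exact schema may vary by ClickHouse version:

```sql
-- Inspect recent S3Queue failures; after this change the exception column
-- should carry a concrete error message rather than "Uncaught exception".
SELECT file_name, exception
FROM system.s3queue
WHERE exception != ''
LIMIT 10;
```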
StorageS3Queue improvements including dynamic default thread count, enhanced retry logic, refined commit timing, and better error handling.

Why it matters

These enhancements address performance and reliability issues by automatically adjusting the number of processing threads based on physical CPU cores, increasing the default retry attempts, preventing retries on memory limit exceptions, and improving commit timing to avoid partial data commits. They also introduce new settings to give users finer control over commit and flush operations, leading to more efficient and stable S3 queue data processing.

How to use it

Users benefit from the new defaults automatically, with no changes required. To customize behavior, set s3queue_processing_threads_num, s3queue_loading_retries, or the new commit controls (s3queue_max_processed_files_before_commit, s3queue_max_processed_rows_before_commit, s3queue_max_processed_bytes_before_commit, and s3queue_max_processing_time_sec_before_commit) to match the workload.
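As a sketch of how these settings are applied, the following creates an S3Queue table with explicit commit controls. The table name, bucket URL, schema, and setting values are illustrative placeholders; the engine signature and setting names follow the ClickHouse S3Queue documentation:

```sql
-- Illustrative S3Queue table with explicit commit controls.
-- The bucket path and values below are placeholders; tune per workload.
CREATE TABLE s3queue_events
(
    ts DateTime,
    payload String
)
ENGINE = S3Queue('https://mybucket.s3.amazonaws.com/events/*.json', 'JSONEachRow')
SETTINGS
    mode = 'unordered',
    s3queue_processing_threads_num = 8,          -- override the CPU-derived default
    s3queue_loading_retries = 10,                -- now the default
    s3queue_max_processed_files_before_commit = 100,
    s3queue_max_processed_rows_before_commit = 1000000,
    s3queue_max_processed_bytes_before_commit = 268435456,  -- 256 MiB
    s3queue_max_processing_time_sec_before_commit = 60;
```

Lower thresholds commit (and thus checkpoint) progress more often at the cost of more frequent flushes; higher thresholds batch more work per commit but delay the point at which processed files are marked done.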