v.25.8 Improvement

Introduce a new backup_slow_all_threads_after_retryable_s3_error setting

Introduce a new backup_slow_all_threads_after_retryable_s3_error setting to reduce pressure on S3 during retry storms caused by errors such as SlowDown, by slowing down all threads once a single retryable error is observed. #84854 (Julia Kartseva).

Why it matters

When S3 returns a retryable error such as SlowDown, many backup threads can retry at the same time, which puts even more pressure on a service that is already throttling requests. With this setting, all threads are slowed down once a single retryable error is observed. This reduces the load on S3 and makes a retry storm less likely.
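
The idea can be illustrated with a minimal Python sketch. This is not ClickHouse's implementation: the SharedSlowdown class, its method names, and the fixed delay_s pause are hypothetical and exist only to show how one retryable error seen by one thread can delay the next request of every thread that shares the same gate.

```python
import threading
import time


class SharedSlowdown:
    """Shared gate: one retryable error delays the next request of every thread.

    Hypothetical illustration only; names and the fixed delay are assumptions.
    """

    def __init__(self, delay_s=0.5, clock=time.monotonic, sleep=time.sleep):
        self._delay_s = delay_s   # assumed fixed pause, not ClickHouse's value
        self._clock = clock       # injectable so the sketch can be tested
        self._sleep = sleep
        self._lock = threading.Lock()
        self._slow_until = 0.0    # requests are held back until this time

    def report_retryable_error(self):
        """Called by whichever thread receives an error such as SlowDown."""
        with self._lock:
            deadline = self._clock() + self._delay_s
            self._slow_until = max(self._slow_until, deadline)

    def wait_if_slowed(self):
        """Called by every thread before a request; returns seconds waited."""
        with self._lock:
            remaining = self._slow_until - self._clock()
        if remaining > 0:
            # Sleep outside the lock so other threads are not blocked on it.
            self._sleep(remaining)
            return remaining
        return 0.0
```

In this sketch each worker thread would call wait_if_slowed() before every S3 request and report_retryable_error() when a request fails with a retryable error, so a single report pauses all workers rather than only the one that failed.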

How to use it

To enable the feature, set backup_slow_all_threads_after_retryable_s3_error to 1 in the configuration. When it is enabled, all threads are slowed down whenever a retryable S3 error is encountered during a backup operation.