v.18.12 New Feature
Added min_merge_bytes_to_use_direct_io Option for MergeTree Engines
Added the min_merge_bytes_to_use_direct_io option for MergeTree engines, which allows you to set a threshold for the total size of a merge (when the merge is above the threshold, data part files are handled using O_DIRECT). #3117
Why it matters
This feature allows users to specify a size threshold for data merges in MergeTree tables. When the total size of a merge exceeds this threshold, the merge process handles data part files with O_DIRECT, which bypasses the operating system's page cache. Large merges then avoid evicting frequently read data from the cache, which can improve overall disk I/O efficiency.
How to use it
Set the min_merge_bytes_to_use_direct_io parameter in the MergeTree table engine settings to the desired threshold in bytes. For example:
CREATE TABLE example (
    ...
) ENGINE = MergeTree()
SETTINGS min_merge_bytes_to_use_direct_io = 1000000000;
This enables direct I/O for merges whose total size is larger than 1 GB (1,000,000,000 bytes).
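The threshold rule described under "Why it matters" can be illustrated with a small sketch. This is not ClickHouse code: the function name `should_use_direct_io` and the part sizes are hypothetical, the strict "greater than" comparison follows the wording of this release note rather than the server source, and treating a threshold of 0 as "disabled" is an assumption.

```python
from typing import Iterable


def should_use_direct_io(part_sizes_bytes: Iterable[int],
                         min_merge_bytes_to_use_direct_io: int) -> bool:
    """Hypothetical helper: decide whether a merge would use O_DIRECT.

    part_sizes_bytes: on-disk sizes of the data parts taking part in the merge.
    """
    # Assumption: a threshold of 0 turns direct I/O off for merges.
    if min_merge_bytes_to_use_direct_io == 0:
        return False
    # The "total size of the merge" is taken as the sum of all source part sizes.
    total_bytes = sum(part_sizes_bytes)
    # Assumption: strictly above the threshold, per the release note wording.
    return total_bytes > min_merge_bytes_to_use_direct_io


threshold = 1_000_000_000  # same value as in the SETTINGS example
small_merge = [200_000_000, 300_000_000]  # 0.5 GB in total
large_merge = [600_000_000, 700_000_000]  # 1.3 GB in total

print(should_use_direct_io(small_merge, threshold))  # False
print(should_use_direct_io(large_merge, threshold))  # True
```

With the 1 GB threshold from the example, only the second merge would be handled with direct I/O. The sizes of the individual parts do not matter, only their sum.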