v.24.12 New Feature
Added Cache for Primary Index in MergeTree Tables to Optimize Memory Usage
Added a cache for the primary index of MergeTree tables (can be enabled with the table setting use_primary_key_cache). If lazy loading and the cache are both enabled for the primary index, the index is loaded into the cache on demand (similar to the mark cache) instead of being kept in memory forever. Added prewarming of the primary index on inserts, merges, and fetches of data parts and on table restarts (can be enabled with the setting prewarm_primary_key_cache). This allows lower memory usage for huge tables on shared storage, and we tested it on tables over one quadrillion records. #72102 (Anton Popov). #72750 (Alexander Gololobov).
Why it matters
The feature reduces memory usage for huge MergeTree tables on shared storage by loading the primary index into cache only as needed instead of keeping it in memory permanently. This improves resource efficiency and scalability, demonstrated on tables with over one quadrillion records.
How to use it
Enable primary index caching by setting use_primary_key_cache = 1 on a MergeTree table. Optionally, enable automatic prewarming of the primary key cache on data inserts, merges, fetches, and table restarts by setting prewarm_primary_key_cache = 1.
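As a minimal sketch, the two settings named above could be applied at table creation as shown below. The table name, columns, and sorting key are hypothetical and serve only as illustration. The ALTER statement assumes these settings can be changed on an existing table through the usual MODIFY SETTING syntax, which is worth verifying against your server version.

```sql
-- Hypothetical example table; only the two SETTINGS names come from the release note.
CREATE TABLE events
(
    event_time DateTime,
    user_id    UInt64,
    payload    String
)
ENGINE = MergeTree
ORDER BY (user_id, event_time)
SETTINGS
    use_primary_key_cache = 1,      -- load the primary index into the cache on demand
    prewarm_primary_key_cache = 1;  -- optional: prewarm on inserts, merges, fetches, restarts

-- Assumed to work for an existing table as well (verify on your version):
ALTER TABLE events
    MODIFY SETTING use_primary_key_cache = 1, prewarm_primary_key_cache = 1;
```

Prewarming trades some of the memory savings for fewer cold reads of the index, so it may be reasonable to enable the cache first and add prewarming only if on-demand loading shows up in query latency.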