v.22.4 Experimental Feature

Enhancements for Remote Filesystem Cache Management in ClickHouse

Allow to write remote FS cache on all write operations. Add system.remote_filesystem_cache table. Add drop remote filesystem cache query. Add introspection for s3 metadata with system.remote_data_paths table. Closes #34021. Add cache option for merges by adding mode read_from_filesystem_cache_if_exists_otherwise_bypass_cache (turned on by default for merges and can also be turned on by query setting with the same name). Rename cache related settings (remote_fs_enable_cache -> enable_filesystem_cache, etc). #35475 (Kseniia Sumarokova).
Data can now be written to the remote filesystem (FS) cache on all write operations. This change also adds system tables for cache and S3 metadata introspection, a query to drop the cache, a cache read mode for merges, and renamed cache settings.

Why it matters

Populating the cache at write time means that freshly written data can already be in the local cache when it is next read, rather than only entering the cache after a first read from remote storage. The new system tables show what the cache currently holds and how local metadata maps to remote S3 data, and the drop query gives operators a direct way to clear the cache.

How to use it

Query the system.remote_filesystem_cache table to inspect the current state of the cache, and the system.remote_data_paths table to introspect S3 metadata. Merges get their own cache read mode, read_from_filesystem_cache_if_exists_otherwise_bypass_cache, which is separate from writing to the cache on write operations: it is turned on by default for merges and can also be enabled per query through a setting of the same name. As the name indicates, reads in this mode use the cache when the data is already there and bypass it otherwise. To clear the cache, run the DROP REMOTE FILESYSTEM CACHE query. Cache-related settings have also been renamed, e.g. remote_fs_enable_cache is now enable_filesystem_cache.
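The statements below are a minimal sketch of how these pieces might be used from a client session. The table and setting names come from the changelog entry above. The column layout of the system tables is not described there, so the sketch selects all columns. The SYSTEM prefix on the drop statement is an assumption based on how other ClickHouse cache-dropping statements are written, so check the exact syntax against your server version.

```sql
-- Inspect what the remote filesystem cache currently holds.
-- Columns are not listed in the changelog, so select everything.
SELECT * FROM system.remote_filesystem_cache LIMIT 10;

-- Introspect S3 metadata (how local metadata maps to remote data).
SELECT * FROM system.remote_data_paths LIMIT 10;

-- Renamed setting: remote_fs_enable_cache is now enable_filesystem_cache.
SET enable_filesystem_cache = 1;

-- Opt in to the merge-style read mode for queries in this session:
-- use the cache if the data is already there, otherwise bypass it.
SET read_from_filesystem_cache_if_exists_otherwise_bypass_cache = 1;

-- Clear the remote filesystem cache.
-- Assumption: the statement is issued with the SYSTEM prefix.
SYSTEM DROP REMOTE FILESYSTEM CACHE;
```

Because the feature is marked experimental, table and statement names may differ in later releases.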