v.23.9 New Feature
Added GCD Codec for Data Compression in ClickHouse
Added GCD, a.k.a. "greatest common divisor", as a new data compression codec. The codec computes the GCD of all column values and then divides each value by that GCD. GCD is a data preparation codec (similar to Delta and DoubleDelta) and cannot be used stand-alone. It works with integer, decimal, and date/time data types. A viable use case is a column whose values change (increase or decrease) in multiples of the GCD, e.g. 24 - 28 - 16 - 24 - 8 - 24 (GCD = 4). #53149 (Alexander Nam).
Why it matters
The GCD codec improves compression on columns whose values are all multiples of a common factor. Dividing every value by the GCD leaves smaller numbers for the compression codec that follows it in the chain, which can reduce disk usage for integer, decimal, and date/time columns.
How to use it
The GCD codec is a data preparation codec, like Delta and DoubleDelta, and cannot be used alone. To enable it, list GCD in the column's codec chain in the table definition, followed by a compression codec such as LZ4. For example:

CREATE TABLE example (
    column_name Int32 CODEC(GCD, LZ4)
) ENGINE = ...;
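
To make the transformation described above concrete, here is a minimal Python sketch of the idea: compute the GCD of all values, store the quotients, and multiply back on read. It is only an illustration of the arithmetic, not ClickHouse's implementation; the function names `gcd_encode` and `gcd_decode` are hypothetical, and the handling of empty or all-zero input is an assumption made so the sketch is well-defined.

```python
from functools import reduce
from math import gcd


def gcd_encode(values):
    """Return (divisor, quotients) for a list of integers.

    Hypothetical helper: illustrates the idea only, not ClickHouse code.
    """
    # Fold gcd over all values; gcd(0, x) == abs(x), so 0 is a safe start.
    divisor = reduce(gcd, values, 0)
    # Assumption: for empty or all-zero input, fall back to 1 so that
    # the division below is defined.
    if divisor == 0:
        divisor = 1
    # Every value is an exact multiple of divisor, so // loses nothing.
    return divisor, [v // divisor for v in values]


def gcd_decode(divisor, quotients):
    """Invert gcd_encode by multiplying each quotient by the divisor."""
    return [q * divisor for q in quotients]


values = [24, 28, 16, 24, 8, 24]  # the example sequence from the entry
divisor, quotients = gcd_encode(values)
print(divisor, quotients)  # 4 [6, 7, 4, 6, 2, 6]
assert gcd_decode(divisor, quotients) == values
```

The quotients are smaller than the original values, which is what gives the compression codec later in the chain (LZ4 in the example above) less to encode.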