v.20.3
New Feature
Added groupArraySample function with reservoir sampling algorithm
Added groupArraySample function (similar to groupArray) with reservoir sampling algorithm. #8286 (Amos Bird)
Why it matters
The groupArraySample function addresses the need to sample elements from each group efficiently when collecting every value, as groupArray does, would be too large or unnecessary. It gathers a random subset of bounded size within each group, saving memory and computation while keeping the sample uniform.
How to use it
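Use groupArraySample in SQL queries like other aggregate functions. It is a parametric function: the maximum sample size (and, optionally, a seed) goes in the first pair of parentheses, and the column to sample goes in the second. For example:

SELECT key, groupArraySample(3)(value) AS sample_values
FROM table
GROUP BY key

This returns, for each group, an array of at most 3 value elements chosen by reservoir sampling.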
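
To make the idea of per-group reservoir sampling concrete, here is a minimal Python sketch of the classic single-pass technique (often called Algorithm R). It is an illustration only, not ClickHouse's implementation: the function name group_array_sample, its parameters, and the sample rows are all hypothetical.

```python
import random
from collections import defaultdict


def group_array_sample(rows, max_size, seed=None):
    """Hypothetical helper: keep a uniform sample of at most max_size
    values per key, reading (key, value) rows in a single pass."""
    rng = random.Random(seed)       # a fixed seed makes the sample reproducible
    reservoirs = defaultdict(list)  # key -> current sample for that group
    seen = defaultdict(int)         # key -> number of values seen so far

    for key, value in rows:
        n = seen[key]
        if n < max_size:
            # The reservoir is not full yet: keep every value.
            reservoirs[key].append(value)
        else:
            # Pick a slot in [0, n]; the new value replaces an existing
            # one only if the slot falls inside the reservoir.
            j = rng.randint(0, n)
            if j < max_size:
                reservoirs[key][j] = value
        seen[key] = n + 1

    return dict(reservoirs)


# Illustrative data: group "a" has 100 values, group "b" has only 2.
example_rows = [("a", i) for i in range(100)] + [("b", 10), ("b", 20)]
example_sample = group_array_sample(example_rows, max_size=3, seed=42)
print(example_sample)
```

Memory per group is bounded by max_size no matter how many rows the group contains, which is the property that makes this approach cheaper than collecting every value. A group with fewer than max_size values, such as "b" here, simply keeps all of them.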