v.25.7 New Feature
NumericIndexedVector: a new vector data structure backed by bit-sliced, Roaring-bitmap compression, together with more than 20 functions for building and analysing vectors and for point-wise arithmetic. It can cut storage and speed up joins, filters, and aggregations on sparse data. Implements #70582 and the paper “Large-Scale Metric Computation in Online Controlled Experiment Platform” by T. Xiong and Y. Wang (VLDB 2024). #74193 (FriendLey).
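To make the "bit-sliced" idea concrete, here is a minimal conceptual sketch in Python. It is not ClickHouse's implementation: the class `BitSlicedVector` and its methods are hypothetical names invented for illustration, plain Python sets stand in for Roaring bitmaps, and only non-negative integers are handled. Each bit position of the stored values gets its own set of indices, so a sum needs only the size of each set, and point-wise addition reduces to set operations.

```python
# Conceptual sketch of a bit-sliced index. Assumptions: hypothetical class and
# method names, Python sets instead of Roaring bitmaps, non-negative ints only.

class BitSlicedVector:
    """Sparse mapping index -> value, stored as one index set per bit position."""

    def __init__(self, values=None):
        self.slices = []  # slices[b] holds the indices whose value has bit b set
        for index, value in (values or {}).items():
            self.set(index, value)

    def set(self, index, value):
        if value < 0:
            raise ValueError("this sketch only handles non-negative integers")
        for s in self.slices:          # clear any previous value for this index
            s.discard(index)
        bit = 0
        while value:
            if value & 1:
                while len(self.slices) <= bit:
                    self.slices.append(set())
                self.slices[bit].add(index)
            value >>= 1
            bit += 1

    def get(self, index):
        # Missing indices read as 0, which is what keeps the structure sparse.
        return sum(1 << b for b, s in enumerate(self.slices) if index in s)

    def total(self):
        # Sum of all values computed from per-slice cardinalities alone.
        return sum(len(s) << b for b, s in enumerate(self.slices))

    def to_dict(self):
        indices = set().union(*self.slices)
        return {i: self.get(i) for i in sorted(indices)}

    def pointwise_add(self, other):
        # Ripple-carry addition, one bit slice at a time, using set operations.
        result = BitSlicedVector()
        carry = set()
        width = max(len(self.slices), len(other.slices))
        for b in range(width):
            x = self.slices[b] if b < len(self.slices) else set()
            y = other.slices[b] if b < len(other.slices) else set()
            result.slices.append(x ^ y ^ carry)
            carry = (x & y) | (x & carry) | (y & carry)
        if carry:
            result.slices.append(carry)
        return result


a = BitSlicedVector({1: 10, 5: 3})
b = BitSlicedVector({1: 6, 7: 20})
c = a.pointwise_add(b)
print(c.to_dict())  # {1: 16, 5: 3, 7: 20}
print(c.total())    # 39
```

The point of the layout is that aggregate and point-wise operations touch whole sets rather than individual rows, which is where compressed bitmaps can pay off on sparse data.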
Why it matters
This feature targets efficient storage and faster queries on sparse data. Bit-sliced, Roaring-bitmap compression reduces storage requirements and can accelerate joins, filters, and aggregations, which suits large-scale metric computation and other sparse-data workloads.
How to use it
Use the NumericIndexedVector data structure in queries and tables to reduce storage and improve performance for sparse numeric data. The accompanying functions construct and manipulate these vectors. For exact syntax and examples, see the function documentation in the release notes or official docs.
Related resources
- Issue #70582 - NumericIndexedVector Implementation
- Pull Request #74193 - NumericIndexedVector Feature
- "Large-Scale Metric Computation in Online Controlled Experiment Platform" Paper