v.25.6 Experimental Feature

Improvement for the experimental text index

The experimental text index now accepts explicit parameters as key-value pairs. The currently supported parameters are the mandatory tokenizer and two optional ones, max_rows_per_postings_list and ngram_size. #80262 (Elmi Ahmadov).

Why it matters

This change gives users explicit control over how the experimental text index behaves. Naming each parameter in the index definition makes the configuration self-describing: the tokenizer determines how text is split into searchable tokens, ngram_size sets the n-gram length, and max_rows_per_postings_list caps how many rows a single postings list may hold. Together these settings let users trade off which searches the index can serve against how much memory and storage it consumes.

How to use it

Users can enable and configure the experimental text index by specifying parameters as key-value pairs within the index definition. The tokenizer parameter must be set, while max_rows_per_postings_list and ngram_size are optional. An example configuration might look like:

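CREATE TABLE example_table (
    text_column String,
    -- Parameters are passed as key-value pairs to the text index type.
    INDEX idx_text text_column TYPE text(
        tokenizer = 'some_tokenizer',           -- mandatory
        max_rows_per_postings_list = 1000000,   -- optional
        ngram_size = 3                          -- optional
    ) GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY tuple();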

Replace some_tokenizer with the desired tokenizer name.
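
The same key-value syntax should also work when adding the index to an existing table. The sketch below is illustrative only and rests on several assumptions not stated in the changelog entry: that the experimental index is gated behind the allow_experimental_full_text_index setting in this release, that 'ngram' is an accepted tokenizer name, and that the table docs and index idx_body (both hypothetical names) suit your schema. Check the documentation for your server version before relying on any of these.

```sql
-- Assumption: the experimental text index must be enabled explicitly.
-- The setting name may differ between releases.
SET allow_experimental_full_text_index = 1;

-- Hypothetical table used only for illustration.
CREATE TABLE docs
(
    id UInt64,
    body String
)
ENGINE = MergeTree
ORDER BY id;

-- Add the index afterwards. tokenizer is mandatory; ngram_size is optional.
-- 'ngram' is an assumed tokenizer name, not confirmed by the changelog entry.
ALTER TABLE docs
    ADD INDEX idx_body body TYPE text(tokenizer = 'ngram', ngram_size = 3);

-- Build the index for data parts written before the index existed.
ALTER TABLE docs MATERIALIZE INDEX idx_body;
```

Omitting max_rows_per_postings_list here leaves that parameter at whatever default the server applies, which is usually a reasonable starting point before tuning.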