v.25.5 Improvement

Tokens function was extended

The tokens function in ClickHouse now accepts an additional "tokenizer" argument, followed by further tokenizer-specific arguments. #79001 (Elmi Ahmadov).

Why it matters

Users can now choose which tokenizer the tokens function applies and tune it through tokenizer-specific arguments. This makes it possible to match the tokenization strategy to the use case rather than relying on one fixed way of splitting text.

How to use it

In the function call, pass the new tokenizer argument followed by any tokenizer-specific arguments it requires. This lets you select and configure a tokenizer directly in your SQL queries.
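
As a rough illustration of the call shape, the queries below compare a call without a tokenizer argument to one that names a tokenizer and passes it a parameter. The entry above does not list the accepted tokenizer names or their parameters, so the value 'ngram', the n-gram length 3, and the position of the tokenizer as the second argument are assumptions here; check the tokens function reference for your ClickHouse version before relying on them.

```sql
-- Baseline: no tokenizer argument, so the function's default tokenizer applies.
SELECT tokens('ClickHouse is fast, really fast') AS default_tokens;

-- Assumption: 'ngram' is an accepted tokenizer name and takes the
-- n-gram length (3 here) as its tokenizer-specific argument.
SELECT tokens('ClickHouse', 'ngram', 3) AS ngram_tokens;
```

Running both queries side by side on the same input is a quick way to see how the choice of tokenizer changes the resulting array of tokens.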