v.21.1New Features

Implement Substring Count Functions in ClickHouse

Implement countSubstrings()/countSubstringsCaseInsensitive()/countSubstringsCaseInsensitiveUTF8() (Count the number of substring occurrences). #17347 (Azat Khuzhin).
Added new functions countSubstrings(), countSubstringsCaseInsensitive(), and countSubstringsCaseInsensitiveUTF8() to count the number of occurrences of a substring within a string.

Why it matters

These functions provide users with a straightforward way to count how many times a specific substring appears in a string, with support for case-sensitive and case-insensitive matching, including UTF-8 case-insensitive comparisons. This simplifies string analysis and pattern counting tasks within ClickHouse queries.

How to use it

Use the functions by passing the target string and the substring to count as arguments. For example:

SELECT countSubstrings('hello world, hello ClickHouse', 'hello') AS count;
SELECT countSubstringsCaseInsensitive('Hello world', 'hello') AS count;
SELECT countSubstringsCaseInsensitiveUTF8('ΠŸΡ€ΠΈΠ²Π΅Ρ‚ ΠΌΠΈΡ€', 'ΠΏΡ€ΠΈΠ²Π΅Ρ‚') AS count;

These functions return the number of substring occurrences in the given string.