v.23.9Improvement
LIKE and Regular Expressions Now Support Non-UTF-8 Patterns with Binary Matching
LIKE,match, and other regular expressions matching functions now allow matching with patterns containing non-UTF-8 substrings by falling back to binary matching. Example: you can usestring LIKE '\xFE\xFF%'to detect BOM. This closes #54486. #54942 (Alexey Milovidov).
Why it matters
This feature addresses the limitation of matching patterns containing non-UTF-8 binary substrings by enabling fallback to binary matching. It allows users to perform pattern matching on binary data (e.g., detecting BOM) using standard SQL pattern matching functions, improving usability and functionality with non-textual data.How to use it
Use standard pattern matching functions such asLIKE or match with patterns containing binary sequences. For example, to detect a BOM, you can run:SELECT * FROM table WHERE string LIKE '\xFE\xFF%'The system automatically falls back to binary matching when non-UTF-8 substrings are detected in the pattern.