v.25.8 Performance Improvement
Remove zero byte
Closes #85062. A few minor bugs were fixed. The functions `structureToProtobufSchema` and `structureToCapnProtoSchema` did not write the terminating zero byte correctly and used a newline in its place. That led to a missing newline in the output, and could lead to buffer overflows in other functions that depend on the zero byte (such as `logTrace`, `demangle`, `extractURLParameter`, `toStringCutToZero`, and `encrypt`/`decrypt`). The `regexp_tree` dictionary layout did not support processing strings with zero bytes. The `formatRowNoNewline` function, when called with the `Values` format or with any other format without a newline at the end of rows, erroneously cut the last character of the output. The `stem` function contained an exception-safety error that could lead to a memory leak in a very rare scenario. The `initcap` function worked incorrectly for `FixedString` arguments: it did not recognize the start of a word at the start of the string if the previous string in a block ended with a word character. Fixed a security vulnerability in the Apache `ORC` format that could lead to the exposure of uninitialized memory. Changed the behavior of the function `replaceRegexpAll` and its alias `REGEXP_REPLACE`: it can now make an empty match at the end of the string even if the previous match consumed the whole string, as in the case of `^a*|a*$` or `^|.*`. This corresponds to the semantics of JavaScript, Perl, Python, PHP, and Ruby, but differs from the semantics of PostgreSQL. The implementation of many functions has been simplified and optimized. Documentation for several functions was wrong and has now been fixed. Keep in mind that the output of `byteSize` for String columns, and for complex types that consist of String columns, has changed (from 9 bytes per empty string to 8 bytes per empty string); this is expected. #85063 (Alexey Milovidov).
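The new empty-match rule for `replaceRegexpAll` can be illustrated with Python, one of the languages the entry names as having the same semantics. This is only a sketch of the reference behavior, assuming Python 3.7 or newer (earlier versions treated empty matches next to a previous match differently); it does not call ClickHouse, and the helper name `replace_all` is made up for the illustration.

```python
import re


def replace_all(pattern: str, replacement: str, text: str) -> str:
    """Replace every match of `pattern` in `text`, empty matches included.

    Hypothetical helper: a thin wrapper over Python's re.sub, used here
    only as a stand-in for the semantics described in the changelog.
    """
    return re.sub(pattern, replacement, text)


# `^a*` consumes the whole string "aaa". The alternative `a*$` can then
# still match the empty string at the very end, so two replacements occur.
result = replace_all(r"^a*|a*$", "x", "aaa")
print(result)  # xx (Python 3.7 or newer)
```

Under the semantics the entry attributes to PostgreSQL, the trailing empty match would not be made, so queries that rely on the exact number of replacements may be worth re-checking after upgrading.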