v.22.5 Improvement
Nullables Detection in Protobuf and ClickHouse Integration Proposal
Nullables detection in protobuf. In proto3, default values are not sent on the wire, which makes it non-trivial to distinguish between null and default values for Nullable columns. A standard way to deal with this problem is to use Google wrappers, which nest the target value within an inner message (see https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/wrappers.proto). In this case, a missing field is interpreted as a null value, a field with a missing value is interpreted as the default value, and a field with a regular value is interpreted as that value. However, ClickHouse interprets Google wrappers as nested columns. We propose to introduce special behaviour to detect Google wrappers and interpret them as described above. For example, to serialize values for a Nullable column test, we would use google.protobuf.StringValue test in our .proto schema. Note that these types are so-called "well-known types" in Protobuf, implemented in the library itself. #35149 (Jakub Kuklis).
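To make the wire-level ambiguity concrete, here is a minimal sketch in Python that hand-encodes the relevant bytes without any Protobuf library. The message names Plain and Wrapped and the field number 1 for test are assumptions made for illustration. The only detail taken from wrappers.proto is that StringValue holds a single string field named value with field number 1.

```python
from typing import Optional


def varint(n: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2): tag, length, payload."""
    tag = (field_number << 3) | 2
    return varint(tag) + varint(len(payload)) + payload


def encode_plain(value: str) -> bytes:
    """Hypothetical proto3 message: Plain { string test = 1; }"""
    data = value.encode("utf-8")
    # proto3 omits a field that holds its default value (the empty string).
    return length_delimited(1, data) if data else b""


def encode_wrapped(value: Optional[str]) -> bytes:
    """Hypothetical message: Wrapped { google.protobuf.StringValue test = 1; }"""
    if value is None:
        return b""  # field absent: null
    # StringValue is { string value = 1; }, so its body has the same shape
    # as the Plain message above.
    inner = encode_plain(value)
    return length_delimited(1, inner)  # field present, body may be empty


# A plain string field cannot tell "empty" apart from "missing":
print(repr(encode_plain("").hex()))      # ''
print(repr(encode_plain("hi").hex()))    # '0a026869'

# The wrapper yields three distinct encodings:
print(repr(encode_wrapped(None).hex()))  # ''             (null)
print(repr(encode_wrapped("").hex()))    # '0a00'         (default)
print(repr(encode_wrapped("hi").hex()))  # '0a040a026869' (regular value)
```

With the plain field, the empty string produces no bytes at all, so a reader cannot tell it apart from an absent field. With the wrapper, the empty string still produces an outer field with a zero-length body.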
Why it matters
In proto3, fields holding default values are omitted when serialized, making it difficult to differentiate between a null value and a default value for Nullable columns in ClickHouse. This feature addresses that by recognizing Google's wrapper types (well-known Protobuf types), which make the presence of a value explicit. It allows ClickHouse to interpret missing fields as null, fields with missing inner values as the default value, and fields with regular values as those values, improving data correctness and usability when importing Protobuf data.
How to use it
Define Nullable columns in your ClickHouse table schema as usual, and in your .proto schema, use the corresponding Google wrapper types from google/protobuf/wrappers.proto (e.g., google.protobuf.StringValue test for a Nullable String column named test). ClickHouse will detect these wrappers and handle them as described above during Protobuf data processing.
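The three-way interpretation of a wrapped field can be sketched as a small Python decoder. This is an illustration of the rule only, not ClickHouse's implementation. It assumes a hypothetical message whose only field is google.protobuf.StringValue test = 1, and it handles only length-delimited fields.

```python
from typing import Optional, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_field(buf: bytes, field_number: int) -> Optional[bytes]:
    """Return the payload of a length-delimited field, or None if absent."""
    pos = 0
    found = None
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        if tag & 0x07 != 2:
            raise ValueError("this sketch only handles length-delimited fields")
        length, pos = read_varint(buf, pos)
        payload = buf[pos:pos + length]
        pos += length
        if tag >> 3 == field_number:
            found = payload  # simplification: keep the last occurrence
    return found


def decode_nullable_string(message: bytes) -> Optional[str]:
    """Map a wrapped field to None (null), "" (default), or a regular value."""
    wrapper = read_field(message, 1)   # hypothetical field: test = 1
    if wrapper is None:
        return None                    # missing field: null
    value = read_field(wrapper, 1)     # StringValue.value = 1
    if value is None:
        return ""                      # field with missing value: default
    return value.decode("utf-8")       # field with regular value


print(decode_nullable_string(b""))                            # None
print(repr(decode_nullable_string(bytes.fromhex("0a00"))))    # ''
print(decode_nullable_string(bytes.fromhex("0a040a026869")))  # hi
```

The key point is that an empty payload (b"") and an absent field (None) are kept distinct at the outer level, which is what lets a Nullable column tell a default value from a null.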