v.20.1
New Feature
Add categoricalInformationValue aggregate function for discrete feature analysis
Add aggregate function categoricalInformationValue, which calculates the information value of a discrete feature. #8117 (hcz)
Why it matters
This function computes the Information Value (IV) metric for categorical features directly within ClickHouse. Information Value is widely used in feature selection, especially in credit scoring and risk modeling, to measure how well a discrete variable predicts a binary outcome. Computing it in the database lets users evaluate categorical features without exporting data to external tools.
How to use it
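Call the new aggregate function categoricalInformationValue in a SELECT query to calculate the information value of a categorical feature with respect to a binary target. Per the ClickHouse documentation, the signature is categoricalInformationValue(category1, category2, ..., tag): each category argument is a 0/1 indicator of whether the row belongs to that category, tag is the binary target, and the function returns one information value per category. The categories are passed as separate indicator arguments rather than produced by grouping on the feature column.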
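
For intuition about what the metric measures, the per-category calculation can be sketched outside ClickHouse. The Python below is a minimal illustration, assuming the common definition in which each category contributes (p1 - p0) * (ln p1 - ln p0), where p1 and p0 are the shares of target=1 and target=0 rows that fall in that category. The helper name information_value and the sample data are hypothetical, and this is not ClickHouse's implementation.

```python
import math
from collections import Counter


def information_value(categories, targets):
    """Return {category: IV contribution} for a binary (0/1) target.

    Illustrative sketch only; the name and behavior are assumptions,
    not part of ClickHouse.
    """
    pos = Counter()  # rows with target == 1, counted per category
    neg = Counter()  # rows with target == 0, counted per category
    for category, target in zip(categories, targets):
        if target == 1:
            pos[category] += 1
        else:
            neg[category] += 1

    total_pos = sum(pos.values())
    total_neg = sum(neg.values())
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both target classes must be present")

    result = {}
    for category in set(pos) | set(neg):
        p1 = pos[category] / total_pos  # share of positives in this category
        p0 = neg[category] / total_neg  # share of negatives in this category
        if p1 == 0 or p0 == 0:
            # The logarithm is undefined here; this sketch reports NaN (assumption).
            result[category] = float("nan")
        else:
            result[category] = (p1 - p0) * (math.log(p1) - math.log(p0))
    return result


# Hypothetical sample: category "a" leans positive, "b" leans negative.
sample_categories = ["a", "a", "a", "b", "b", "b"]
sample_targets = [1, 1, 0, 1, 0, 0]
iv_by_category = information_value(sample_categories, sample_targets)
total_iv = sum(iv_by_category.values())
```

Summing the per-category values gives the overall information value of the feature; higher totals indicate that the feature separates the two target classes more strongly.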