v.25.9 Improvement
Add a new startup_scripts_failure_reason dimensional metric
Add a new startup_scripts_failure_reason dimensional metric. This metric is needed to distinguish between different error types that result in failing startup scripts. In particular, for alerting purposes, we need to distinguish between transient (e.g., MEMORY_LIMIT_EXCEEDED or KEEPER_EXCEPTION) and non-transient errors. #86202 (Miсhael Stetsyuk).
Why it matters
This feature enables distinguishing between transient errors (such as MEMORY_LIMIT_EXCEEDED or KEEPER_EXCEPTION) and non-transient errors in startup scripts, improving alerting accuracy and operational diagnostics.
How to use it
Use the startup_scripts_failure_reason metric in monitoring and alerting setups to filter on, and respond to, specific types of startup script failures by querying this new dimension in the metrics system.
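As a rough illustration of the alerting split described above, the sketch below buckets failure counts by error name and derives a severity. It is a minimal sketch under stated assumptions, not part of ClickHouse: the input shape (a mapping from error name to count, as it might look after scraping the metric), the function names, and the severity levels are all hypothetical. Only MEMORY_LIMIT_EXCEEDED and KEEPER_EXCEPTION come from the changelog entry; the other error names are used purely as sample non-transient values.

```python
# Error names the changelog entry cites as transient examples.
# The set is an assumption for illustration; extend it for your own setup.
TRANSIENT_REASONS = {"MEMORY_LIMIT_EXCEEDED", "KEEPER_EXCEPTION"}


def classify_failures(samples):
    """Split failure counts into transient and non-transient buckets.

    `samples` maps an error name (the metric's dimension value) to a count.
    This input shape is hypothetical; adapt it to however your monitoring
    system exposes the scraped metric.
    """
    buckets = {"transient": 0, "non_transient": 0}
    for reason, count in samples.items():
        if count <= 0:
            continue  # ignore dimensions with no recorded failures
        key = "transient" if reason in TRANSIENT_REASONS else "non_transient"
        buckets[key] += count
    return buckets


def alert_severity(samples):
    """Return a hypothetical severity: "page", "warn", or "ok"."""
    buckets = classify_failures(samples)
    if buckets["non_transient"] > 0:
        return "page"  # a non-transient failure likely needs a human
    if buckets["transient"] > 0:
        return "warn"  # transient failures may clear on retry or restart
    return "ok"
```

With this split, a transient-only failure such as `{"KEEPER_EXCEPTION": 1}` yields a warning, while any non-transient error escalates regardless of how many transient ones accompany it.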