PromCon EU 2025

The Prometheus conference — October 21 - 22 in Munich

Talk abstract

SAAFE - A prioritized alerting model to troubleshoot your incidents

Existing taxonomies for time-series data, including the Four Golden Signals, the RED Method, and the USE Method, are mostly concerned with the nature of each type of series. The SAAFE alerting model (Saturation, Amend, Anomaly, Failure, and Error) helps you focus on what the series imply rather than on their type.

At Grafana Labs, we have built a scalable, fully automated alerting system that analyzes the data using its domain knowledge. The resulting alerts are categorized into the SAAFE model based on their implications for the system. The SAAFE alerts are then scored and ranked by combining severity level (info, warning, or critical), number of instances, and firing duration. When our on-call engineers troubleshoot incidents, they use the SAAFE categorization and ranking to prioritize, filter, and infer causality.
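
To make the scoring and ranking idea concrete, here is a minimal Python sketch. The abstract does not publish the actual formula, so everything here is an assumption for illustration only: the `Alert` fields, the `SEVERITY_WEIGHT` values, the additive `score` function, and the example alert names are all hypothetical. The framework described in the talk is built with PromQL and Grafana, not Python.

```python
from dataclasses import dataclass

# Hypothetical weights: the abstract names the severity levels but not how they are weighted.
SEVERITY_WEIGHT = {"info": 1, "warning": 2, "critical": 3}

# The five SAAFE categories named in the abstract.
SAAFE_CATEGORIES = {"saturation", "amend", "anomaly", "failure", "error"}


@dataclass(frozen=True)
class Alert:
    name: str            # hypothetical alert name
    category: str        # one of SAAFE_CATEGORIES
    severity: str        # "info", "warning", or "critical"
    instances: int       # number of instances the alert is firing on
    firing_minutes: int  # how long the alert has been firing


def score(alert: Alert) -> int:
    """Combine severity, instance count, and firing duration (assumed additive formula)."""
    if alert.category not in SAAFE_CATEGORIES:
        raise ValueError(f"unknown SAAFE category: {alert.category}")
    return SEVERITY_WEIGHT[alert.severity] * 100 + alert.instances * 10 + alert.firing_minutes


def rank(alerts: list[Alert]) -> list[Alert]:
    """Return alerts ordered from highest to lowest score."""
    return sorted(alerts, key=score, reverse=True)


def by_category(alerts: list[Alert], category: str) -> list[Alert]:
    """Filter alerts to a single SAAFE category, keeping their order."""
    return [a for a in alerts if a.category == category]


if __name__ == "__main__":
    firing = [
        Alert("ConfigRollout", "amend", "info", 1, 5),
        Alert("DiskFull", "saturation", "critical", 2, 30),
        Alert("LatencySpike", "anomaly", "warning", 4, 15),
    ]
    for alert in rank(firing):
        print(alert.name, alert.category, score(alert))
```

In this sketch an on-call engineer would read the ranked list from the top and use `by_category` to narrow it, for example to check whether an Amend alert (a change) preceded a Saturation or Failure alert.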

In this talk, we will introduce the SAAFE model with real-world examples of where it has been useful. We will also share the open-source framework, built purely with PromQL and Grafana, that you can adopt.

Speakers

Jorge Creixell

Manoj Acharya
