Talk abstract
Alert on All the Things: Integrating Quicksilver with Prometheus
Speaker: Lorenz Bauer
Cloudflare provides its services from 115 data centres in 57 countries. One of our most critical systems is Quicksilver, a key-value store that replicates configuration data to every single machine and that we recently rewrote from scratch. As developers, we were early adopters of Prometheus at Cloudflare. This talk will explain how we set up Grafana dashboards for monitoring and Alertmanager for alerting, which gave us unprecedented insight into the system. It will also cover the gotchas we encountered, like the time we triggered 7000 alerts at once.