Detecting outliers and anomalies in realtime at Datadog - Homin Lee (OSCON Austin 2016)

Monitoring even a modestly sized systems infrastructure quickly becomes untenable without automated alerting. For many metrics, it is nontrivial to define ahead of time what constitutes “normal” versus “abnormal” values. This is especially true for metrics whose baseline value fluctuates over time.

To make this problem more tractable, Datadog provides two features: outlier detection, which automatically identifies any host (or group of hosts) that is behaving abnormally compared to its peers, and anomaly detection, which alerts when a single metric is behaving differently than its past history would suggest. Homin Lee discusses the algorithms and open source tools Datadog uses for outlier and anomaly detection, lessons learned from using these alerts on its own systems, and real-life examples of how to avoid false positives and false negatives.
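
The difference between the two kinds of detection can be illustrated with a small sketch. The abstract does not name the algorithms covered in the talk, so the techniques here are stand-ins chosen for illustration: a scaled median absolute deviation (MAD) test for comparing a host against its peers, and a simple z-score test for comparing a metric against its own history. The function names, host names, thresholds, and sample values are all hypothetical.

```python
from statistics import mean, median, stdev

# Illustrative stand-ins only: the abstract does not say which algorithms
# Datadog uses. All names, thresholds, and data below are hypothetical.

MAD_SCALE = 1.4826  # makes MAD comparable to a standard deviation for normal data


def mad_outliers(latest_by_host, tolerance=3.0):
    """Outlier detection: compare each host against its peers.

    Returns hosts whose value lies more than `tolerance` scaled MADs
    away from the median of the group.
    """
    values = list(latest_by_host.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half the hosts agree exactly; flag any host that differs.
        return [h for h, v in latest_by_host.items() if v != center]
    return [
        h for h, v in latest_by_host.items()
        if abs(v - center) / (MAD_SCALE * mad) > tolerance
    ]


def is_anomalous(history, value, threshold=3.0):
    """Anomaly detection: compare one metric against its own past.

    Returns True when `value` is more than `threshold` sample standard
    deviations from the mean of `history` (needs at least two points).
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


cpu_by_host = {"web-1": 0.51, "web-2": 0.49, "web-3": 0.50,
               "web-4": 0.52, "web-5": 0.95}
print(mad_outliers(cpu_by_host))

requests_history = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_anomalous(requests_history, 101), is_anomalous(requests_history, 130))
```

In this sketch the outlier test needs no history at all, only the current values of a group of peers, while the anomaly test needs no peers, only the metric's own past. A plain z-score ignores trends and seasonality, which is exactly the fluctuating-baseline case the abstract calls out, so a practical anomaly detector would need something more robust.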
