Prometheus decides when to alert; Alertmanager decides what to do about it — routing alerts to the right people, grouping the noise, and silencing during maintenance.
Why: Prometheus only evaluates rules and pushes firing alerts to Alertmanager. Alertmanager does everything after that: it groups related alerts into one notification, routes by labels to the right receiver (Slack, email, PagerDuty), deduplicates, and handles silences. Splitting the two keeps alert logic and delivery logic separate.
Prometheus ──firing alerts──▶ Alertmanager ──▶ group + route + dedupe
 (rules)                           │
                                   ├─▶ Slack (severity=critical)
                                   └─▶ email (severity=warning)

Why: Alertmanager is a separate service. Add it to Compose, then tell Prometheus where to send alerts via the alerting block. Now firing alerts flow from Prometheus to Alertmanager on :9093.
# docker-compose.yml — add:
alertmanager:
  image: prom/alertmanager:latest
  ports: ["9093:9093"]
  volumes:
    - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml

# prometheus.yml — add:
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]

Why: the Alertmanager config defines receivers (where notifications go) and a route tree (which alerts go where). group_by batches related alerts into one message; routes match on labels like severity. This sends critical alerts to Slack and everything else to a default receiver.
# alertmanager.yml
route:
  group_by: ["alertname", "job"]
  receiver: default
  routes:
    - matchers: [severity="critical"]
      receiver: slack
receivers:
  - name: default
  - name: slack
    slack_configs:
      - api_url: "https://hooks.slack.com/services/XXX"
        channel: "#alerts"

Note: three features keep alert fatigue down. Grouping collapses many alerts (50 nodes down) into one notification. Inhibition suppresses lower-priority alerts when a related higher-priority one is firing (no "high latency" spam when the whole service is down). A silence mutes matching alerts for a window — use it during planned maintenance so expected alarms stay quiet.
grouping ─ many alerts → one notification (group_by labels)
inhibition ─ a firing critical mutes related warnings
silence ─ mute matching alerts for a time window (maintenance)
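
The three features above, traced over a constructed set of alerts. This is an added example, not code from the original page or from Alertmanager itself: a simplified Python model of the page's route tree. The inhibition rule, the silence matcher, the sample alerts and all function names are assumptions, and the time window of a silence is left out.

```python
from collections import defaultdict

# Route tree mirroring the page's alertmanager.yml.
ROUTE = {"receiver": "default", "group_by": ["alertname", "job"],
         "routes": [{"match": {"severity": "critical"}, "receiver": "slack"}]}
# Assumed inhibition rule (not in the page's config): a firing critical
# mutes warnings that carry the same "job" label.
INHIBIT = {"source": {"severity": "critical"},
           "target": {"severity": "warning"}, "equal": ["job"]}

def matches(labels, matcher):
    """True when every label=value pair of the matcher is on the alert."""
    return all(labels.get(k) == v for k, v in matcher.items())

def inhibited(alert, firing):
    """A target alert is muted while a source alert with equal labels fires."""
    if not matches(alert, INHIBIT["target"]):
        return False
    return any(matches(src, INHIBIT["source"])
               and all(src.get(n) == alert.get(n) for n in INHIBIT["equal"])
               for src in firing)

def pick_receiver(alert):
    """First matching child route wins; otherwise use the root receiver."""
    for child in ROUTE["routes"]:
        if matches(alert, child["match"]):
            return child["receiver"]
    return ROUTE["receiver"]

def notifications(firing, silences=()):
    """Drop silenced and inhibited alerts, then route and group the rest."""
    groups = defaultdict(list)
    for alert in firing:
        if any(matches(alert, s) for s in silences) or inhibited(alert, firing):
            continue
        key = (pick_receiver(alert),) + tuple(alert.get(n) for n in ROUTE["group_by"])
        groups[key].append(alert)
    return dict(groups)  # one entry per notification that would be sent

# Constructed sample: 50 nodes down, a latency warning on the same job,
# and a disk warning on a job that is under planned maintenance.
ALERTS = [{"alertname": "NodeDown", "job": "node", "severity": "critical",
           "instance": f"node{i}"} for i in range(50)]
ALERTS.append({"alertname": "HighLatency", "job": "node", "severity": "warning"})
ALERTS.append({"alertname": "DiskFull", "job": "backup", "severity": "warning"})

if __name__ == "__main__":
    for key, members in notifications(ALERTS, [{"job": "backup"}]).items():
        print(key, len(members), "alerts in one notification")
```

With the silence on job=backup in place, the 52 firing alerts produce a single Slack notification; without it, the disk warning goes to the default receiver as a second notification.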