feat(ui): allow acknowledging alerts using short lived silences

Łukasz Mierzwa
2019-11-04 14:46:56 +00:00
parent 6857368607
commit 63a3d2a30b
11 changed files with 788 additions and 74 deletions

@@ -229,6 +229,70 @@ setting multiple Alertmanager servers. For cases where only a single server
needs to be configured without a config file see
[Simplified Configuration](#simplified-configuration).
### Alert acknowledgement
In Prometheus Alertmanager an alert can be in one of 3 states:
- `active` - when the alert is firing
- `suppressed` - when the alert is either silenced by a
[silence rule](https://prometheus.io/docs/alerting/alertmanager/#silences) or
inhibited by another alert using
[inhibition rules](https://prometheus.io/docs/alerting/alertmanager/#inhibition)
- `unprocessed` - initial state for new alerts before they are checked against
all silence rules, so Alertmanager doesn't yet know if the alert should be
`active` or `suppressed`
A silence rule can be used to mark an alert as acknowledged and being worked on.
To simplify creating such silences karma provides a one-click button that
creates a silence matching the alert group it was clicked for.
The `alertAcknowledgement` section enables this feature and customizes its
configuration.
Syntax:
```YAML
alertAcknowledgement:
  enabled: bool
  duration: duration
  author: string
  commentPrefix: string
```
- `enabled` - setting it to true will enable creation of short lived
acknowledgement silences.
- `duration` - duration for acknowledgement silences, value is a string in
[time.Duration](https://golang.org/pkg/time/#ParseDuration) format.
- `author` - default author for acknowledgement silences. If the user sets the
author field on the silence form then that value will be used instead.
- `commentPrefix` - a string that will be added as a prefix to autogenerated
silence comment (optional).
Defaults:
```YAML
alertAcknowledgement:
  enabled: false
  duration: 15m0s
  author: karma
  commentPrefix: ACK!
```
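As an illustration, a configuration that enables the feature and overrides
some of the defaults might look like the fragment below. The `30m` duration
and the `oncall` author are placeholder values chosen for this sketch, not
recommendations.

```YAML
alertAcknowledgement:
  # turn on the one-click acknowledge button
  enabled: true
  # any string accepted by Go's time.ParseDuration, e.g. 30m or 1h30m
  duration: 30m
  # used only when the silence form has no author set (placeholder value)
  author: oncall
  # prefix added to the autogenerated silence comment
  commentPrefix: ACK!
```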
A common problem is setting a correct duration for the silence.
If the duration is too short the silence can expire before the issue is
resolved, and all the alerts will need to be silenced again.
If it is too long the silence can mask the same problem reoccurring in the
future, so the user has to expire the silence once the issue is resolved.
[kthxbye](https://github.com/prymitive/kthxbye) is a tiny daemon that can help
with managing short lived acknowledged silences. It will continuously extend
short lived acknowledgement silences if there are alerts firing against those
silences, which means that the user doesn't need to worry about setting the
right duration for such silences.
To use it run an instance of kthxbye with every Alertmanager instance or
cluster and configure it to use the same comment prefix as `commentPrefix`.
With this setup, when a user clicks to acknowledge an alert, karma will create
a short lived silence and kthxbye will keep that silence in Alertmanager
until there are no alerts matching it, meaning that the issue was resolved.
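A minimal sketch of the karma side of such a setup is shown below. It assumes
kthxbye is configured separately to look for the same prefix; the `KTHXBYE!`
prefix and the `10m` duration are illustrative values only. Because kthxbye
keeps extending the silence while alerts match it, the initial duration can
stay short.

```YAML
alertAcknowledgement:
  enabled: true
  # short initial lifetime; kthxbye is expected to extend it while alerts fire
  duration: 10m
  author: karma
  # must be identical to the comment prefix kthxbye is configured to match
  commentPrefix: KTHXBYE!
```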
### Annotations
The `annotations` section allows configuring how alert annotations are displayed in