15 Commits

Author SHA1 Message Date
Łukasz Mierzwa
fb75e6b083 Don't iterate a dict, just grab the key
Iterating is slow; it's a dict, so there is no need to iterate over it.
2017-07-10 09:14:50 -07:00
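As a rough illustration of the change described in this commit, assuming the "dict" is a Go map, the sketch below contrasts scanning every entry with a direct key lookup. The function names and the `map[string]int` shape are hypothetical and are not taken from the unsee code base.

```go
package main

// findByIteration walks every entry until it finds the key.
// This is O(n) in the size of the map. Hypothetical helper.
func findByIteration(counts map[string]int, key string) (int, bool) {
	for k, v := range counts {
		if k == key {
			return v, true
		}
	}
	return 0, false
}

// findByKey uses the map's own lookup, which is O(1) on average.
// The second return value reports whether the key was present.
func findByKey(counts map[string]int, key string) (int, bool) {
	v, ok := counts[key]
	return v, ok
}
```

Both helpers return the same result for any input; only the cost differs, which matters when the lookup runs on every request.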
Łukasz Mierzwa
9b5155e68c Re-implement metrics calculation as a collector
Split the metrics code into a collector. This way it's self-contained and doesn't require mixing metric calculation into the main logic.
Fixes #130
2017-07-10 09:09:43 -07:00
Łukasz Mierzwa
6166fdd474 Expose number of times unsee collected alerts from Alertmanager API
This way one can alert if unsee stops collecting alerts.
2017-07-06 08:52:13 -07:00
Łukasz Mierzwa
cd63ee512e Fix metric updates for alert counters
Metrics were incremented but never reset; this commit fixes that.
2017-07-06 08:52:13 -07:00
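The bug class described here (values incremented on every pull but never reset) can be sketched as follows. This is only an illustration under assumed names: `alertCounts`, `UpdateBuggy` and `Update` are hypothetical and do not reflect the actual unsee metrics code.

```go
package main

// alertCounts holds the number of alerts per state, as it would be
// exported in a gauge-style metric. Hypothetical type.
type alertCounts struct {
	byState map[string]int
}

// UpdateBuggy adds the latest pull on top of the old totals, so the
// values grow without bound across pulls.
func (c *alertCounts) UpdateBuggy(states []string) {
	if c.byState == nil {
		c.byState = make(map[string]int)
	}
	for _, s := range states {
		c.byState[s]++
	}
}

// Update resets the totals first, so the values always describe the
// most recent pull only.
func (c *alertCounts) Update(states []string) {
	c.byState = make(map[string]int, len(states))
	for _, s := range states {
		c.byState[s]++
	}
}
```

With two identical pulls of two active alerts, `UpdateBuggy` ends at 4 while `Update` stays at 2.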
Łukasz Mierzwa
81ce5d3098 Speed up alert fingerprint generation
Dynamic fingerprints made the code much slower; pprof shows they are responsible for ~70% of all CPU usage for any API call. To make it worse, they are computed for all alerts: the dedup layer doesn't know which alerts will be filtered later, so it operates on all of them. This PR will:
1. Add benchmarks so it's easier to track performance
2. Keep the current methods for accessing fingerprints, but use precomputed static fields in those
3. Refactor Alert methods to use pointers, so we're not working on a copy

Benchmark timing went down from ~4000ns to 0.4ns for fingerprint calls, and API response times went from 1.3s (for my test sample) to 0.2s, which puts them back at the same level as before fingerprints were made dynamic. I still don't like this code much; it's all over the place, but I don't have a good idea how to structure it better. Let's hope I'll be wiser in the future.
2017-07-05 23:35:00 -07:00
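Points 2 and 3 of this commit can be sketched roughly as below: the fingerprint is computed once and stored in a field, the accessor only returns that field, and methods use pointer receivers so the stored value lands on the original struct, not on a copy. The `Alert` fields, method names and the SHA-1 over sorted labels are assumptions made for illustration; the real unsee types and hashing may differ.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"sort"
)

// Alert is a simplified, hypothetical stand-in for the real type.
type Alert struct {
	Labels map[string]string

	// contentFP is the precomputed ("static") fingerprint.
	contentFP string
}

// UpdateFingerprints does the expensive hashing once. The pointer
// receiver ensures the result is stored on the caller's Alert.
func (a *Alert) UpdateFingerprints() {
	keys := make([]string, 0, len(a.Labels))
	for k := range a.Labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map order is random; sort for a stable hash

	h := sha1.New()
	for _, k := range keys {
		fmt.Fprintf(h, "%s=%s\n", k, a.Labels[k])
	}
	a.contentFP = fmt.Sprintf("%x", h.Sum(nil))
}

// ContentFingerprint keeps the accessor method, but it is now a plain
// field read and does no hashing.
func (a *Alert) ContentFingerprint() string {
	return a.contentFP
}
```

The trade-off in this sketch is that `UpdateFingerprints` must be called again whenever the labels change, since the accessor no longer recomputes anything.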
Łukasz Mierzwa
71c0dce1f6 Vendor renamed Sirupsen/logrus to sirupsen/logrus, fix imports 2017-07-02 10:12:33 -07:00
Łukasz Mierzwa
f2b21a60e2 Store alert state per instance 2017-07-01 12:43:15 -07:00
Łukasz Mierzwa
01c89082dd Calculate min/max timestamps and store those as globals, keep individual timestamps per instance 2017-07-01 12:09:55 -07:00
Łukasz Mierzwa
5bd03234a0 Clear Alertmanager data on pull error 2017-07-01 11:31:29 -07:00
Łukasz Mierzwa
5e020b9e01 Store alert source link per Alertmanager instance 2017-06-29 21:18:30 -07:00
Łukasz Mierzwa
c5724bb751 Handle per Alertmanager instance errors in the API and the UI 2017-06-28 22:36:25 -07:00
Łukasz Mierzwa
2647330f71 Generate AlertGroup unique fingerprint on the fly 2017-06-28 22:36:25 -07:00
Łukasz Mierzwa
b0d6628f82 Generate alert unique fingerprint on the fly 2017-06-28 22:36:24 -07:00
Łukasz Mierzwa
97e3728dab Compute alert content fingerprints on the fly
This will be more expensive but will simplify the code
2017-06-28 22:36:24 -07:00
Łukasz Mierzwa
26d14d1bd2 Refactor Alertmanager API client code to use multiple upstream instances
Alerts are stored per instance and deduplicated on read.
2017-06-28 22:35:16 -07:00
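A minimal sketch of "stored per instance and deduplicated on read", assuming alerts carry a fingerprint that is identical across Alertmanager instances. `Store`, `storedAlert`, `Set` and `Alerts` are hypothetical names chosen for this example, not the unsee API.

```go
package main

import "sort"

// storedAlert is a hypothetical, reduced alert record.
type storedAlert struct {
	Fingerprint string
	Name        string
}

// Store keeps the raw alerts separately for each upstream instance.
type Store struct {
	byInstance map[string][]storedAlert
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{byInstance: make(map[string][]storedAlert)}
}

// Set replaces the alerts held for one instance. Passing nil clears
// that instance's data without touching the others.
func (s *Store) Set(instance string, alerts []storedAlert) {
	s.byInstance[instance] = alerts
}

// Alerts merges all instances at read time, keeping one alert per
// fingerprint. The result is sorted by fingerprint for stable output.
func (s *Store) Alerts() []storedAlert {
	seen := make(map[string]storedAlert)
	for _, alerts := range s.byInstance {
		for _, a := range alerts {
			if _, dup := seen[a.Fingerprint]; !dup {
				seen[a.Fingerprint] = a
			}
		}
	}
	out := make([]storedAlert, 0, len(seen))
	for _, a := range seen {
		out = append(out, a)
	}
	sort.Slice(out, func(i, j int) bool {
		return out[i].Fingerprint < out[j].Fingerprint
	})
	return out
}
```

Keeping the per-instance data untouched and deduplicating only in `Alerts` means one instance can be cleared or refreshed independently of the rest.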