zhangyue
b51cb3219f
fix: print result's message when status unknown
Signed-off-by: zhangyue <huaihuan.zy@alibaba-inc.com>
2020-11-18 19:30:17 +08:00
varsha teratipally
50127b0512
changed labelname after code review
2020-08-06 00:43:45 +00:00
varsha teratipally
4c40b7e468
updated readme
2020-08-05 21:43:58 +00:00
varsha teratipally
e13210157d
Add more info to disk metrics
2020-08-05 21:12:25 +00:00
Frame
9678892546
Fix typo in custom-plugin-monitor
2020-08-03 17:08:42 +08:00
Kubernetes Prow Robot
f3ab10eddb
Merge pull request #442 from abansal4032/custom-plugin-logs-capture
Capture the logs from stderr of custom plugins
2020-07-29 14:18:03 -07:00
Archit Bansal
6acf5b1edb
Capture the logs from stderr of custom plugins.
2020-07-29 11:57:05 -07:00
Kubernetes Prow Robot
c3cf941e98
Merge pull request #441 from abansal4032/custom-plugin-log-fix
Generate new status log only on condition change
2020-07-28 09:45:48 -07:00
Archit Bansal
f80f3e0dfa
Generate status generation logs from custom plugin run only on condition change.
2020-07-24 09:39:39 -07:00
Archit Bansal
f56d0a929d
Use InactiveExitTimestamp instead of ActiveEnterTimestamp for cooldown period in health check monitor.
2020-07-16 18:53:47 -07:00
Archit Bansal
44dc4aa6c1
Add health-check-monitor
2020-05-27 14:08:42 -07:00
Abhilash Pallerlamudi
5342a50874
Add rhel support for osversion
2020-04-15 13:19:56 -07:00
Andrew DeMaria
7fd465e195
Add namespace option for events
2020-03-05 19:04:31 -07:00
Xuewei Zhang
83b09277f0
Collect more cpu/disk/memory metrics
2020-02-03 15:29:45 -08:00
Xuewei Zhang
fa7a3d7df1
Fix disk metrics unit and queue_length calculation
2020-01-02 17:19:38 -08:00
Kubernetes Prow Robot
0d0bba94e5
Merge pull request #402 from gmemcc/master
Ignore first collected disk stats to prevent metric distortion
2019-12-18 11:57:57 -08:00
Alex Wong
5a4ac81186
Only disk_avg_queue_len is distorted on first collection
2019-12-12 14:39:29 +08:00
Alex Wong
3d10c892a2
Ignore first collected disk stats to prevent metric distortion
2019-12-11 11:14:01 +08:00
yuzhiquan
9c24be2da4
cleanup: using time.Since(t) instead of t.Sub(time.Now())
2019-12-05 18:57:53 +08:00
yuzhiquan
b458f0d028
fix: modify typo
2019-12-03 15:21:57 +08:00
Xuewei Zhang
5e55ef89f1
Make log-counter respect ENABLE_JOURNALD
2019-11-26 13:58:10 -08:00
tongxin21
d5cb44646e
add a unit test for parsing the "/etc/os-release" of CentOS
add a newline character at the end
2019-11-01 13:34:22 +08:00
tongxin21
9b9f18a7ed
add a case where ID="centos"
2019-10-28 19:09:15 +08:00
Lantao Liu
be7cc78aa0
Properly close channel when monitor exits.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-10-25 14:11:39 -07:00
Kubernetes Prow Robot
705cb01e0c
Merge pull request #339 from wenjun93/logmonitor
avoid endless loop caused by closed log channel
2019-10-25 11:27:39 -07:00
Kubernetes Prow Robot
bac3429522
Merge pull request #359 from gmemcc/hotfix-closed-channel
fix close of closed channel
2019-10-24 20:57:38 -07:00
wenjun93
4a4ebc7097
avoid endless loop caused by closed log channel
2019-10-25 11:43:49 +08:00
Kubernetes Prow Robot
a999207a56
Merge pull request #367 from grosser/grosser/unwrap
untangle plugin runner a bit
2019-10-24 20:29:38 -07:00
Michael Grosser
3be50a088a
untangle plugin runner a bit
add some docs and make it clearer what is actually going on
(parallel rule execution on start and then on timer)
2019-10-10 15:46:04 -07:00
Xuewei Zhang
794300af59
Add stackdriver exporter endpoint for problem_gauge
2019-09-26 13:45:17 -07:00
Matt Matejczyk
2e9da8569d
Make heartbeatPeriod const into a flag.
2019-09-26 09:59:03 +02:00
Alex Wong
60e048d2ce
fix close of closed channel
2019-09-24 16:07:47 +08:00
Xuewei Zhang
e1939ebc03
Handle vendor change in k8s.io/apimachinery/pkg/util/clock
clock.Clock used to have Tick() method, but is now replaced with
NewTicker() method to prevent leaking. Changed NPD code to adapt to it.
See https://github.com/kubernetes/apimachinery/commit/10ebc22e for more
detail.
2019-09-14 15:22:09 -07:00
Xuewei Zhang
0f0e5eff0f
Adding stackdriver exporter
2019-09-12 18:30:00 -07:00
Xuewei Zhang
9e789b5f99
Refactor on metrics so that names for all the views are tracked
2019-09-11 12:07:13 -07:00
Xuewei Zhang
0f2fce56e5
Change host/uptime to GAUGE metrics
2019-09-10 16:58:06 -07:00
Kubernetes Prow Robot
2a07254f96
Merge pull request #253 from finn-no/master
Empty LogPath will use journald's default path.
2019-08-27 09:22:41 -07:00
Andrew Stribblehill
09c498ad74
Empty LogPath will use journald's default path.
2019-08-27 01:55:30 +02:00
Xuewei Zhang
82c2368795
Metric format fixes on host/uptime and disk/*
1. host/uptime, disk/io_time and disk/weighted_io should be
counter/cumulative metrics, so we have to use the Sum aggregation method
rather than the LastValue aggregation method (which would declare the
metric as a gauge metric).
2. Renamed label "device" for disk/* metrics to "device_name".
This is to clarify that it is the device name (sda1) rather than the
device path (/dev/sda1).
2019-08-16 15:14:54 -07:00
Kubernetes Prow Robot
424b864291
Merge pull request #323 from xueweiz/test
Add a simple e2e test
2019-08-16 14:56:09 -07:00
Xuewei Zhang
f9b5e60a43
Add e2e test for NPD
The first test is a very simple test. It installs NPD on a VM, and then
verifies that NPD reports metric host_uptime in Prometheus format.
2019-08-16 01:33:29 -07:00
Lang Chi
4d37d6fb68
fix a spelling error
Signed-off-by: Lang Chi <21860405@zju.edu.cn>
2019-08-13 15:12:01 +08:00
Kubernetes Prow Robot
e280e2075a
Merge pull request #320 from wangzhen127/custom-plugin-fix
Don't update condition if status stays False/Unknown for custom plugin
2019-08-07 17:09:18 -07:00
Zhen Wang
30e20c6a20
Validate that permanent problem has preset default condition
2019-08-01 23:40:16 -07:00
Zhen Wang
2f5d03280a
Don't update condition if status stays False/Unknown for custom plugin
2019-08-01 23:40:16 -07:00
Zhen Wang
182a9450dd
Print monitor config path in the logs
2019-07-30 11:00:47 -07:00
Kubernetes Prow Robot
599ca532e8
Merge pull request #315 from xueweiz/metrics
Report metrics from custom-plugin-monitor
2019-07-25 11:58:44 -07:00
Xuewei Zhang
94af7de97b
Report metrics from custom-plugin-monitor
2019-07-25 11:28:38 -07:00
Kubernetes Prow Robot
b8ce6360d9
Merge pull request #300 from xueweiz/metrics
Report metrics from system-log-monitor
2019-07-12 15:17:06 -07:00
Xuewei Zhang
fbebcf311b
Report metrics from system-log-monitor
2019-07-12 14:38:21 -07:00