Kubernetes Prow Robot
2a07254f96
Merge pull request #253 from finn-no/master
...
Empty LogPath will use journald's default path.
v0.7.1
2019-08-27 09:22:41 -07:00
Andrew Stribblehill
09c498ad74
Empty LogPath will use journald's default path.
2019-08-27 01:55:30 +02:00
Kubernetes Prow Robot
6aa308db81
Merge pull request #334 from xueweiz/cumulative
...
Metric format fixes on host/uptime and disk/*
2019-08-19 12:27:31 -07:00
Xuewei Zhang
82c2368795
Metric format fixes on host/uptime and disk/*
...
1. host/uptime, disk/io_time and disk/weighted_io should be
counter/cumulative metrics, so we have to use the Sum aggregation method
rather than the LastValue aggregation method (which would declare the
metric as a gauge metric).
2. Renamed label "device" for disk/* metrics to "device_name".
This is to clarify that it is device_name (sda1) rather than device_path
(/dev/sda1).
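
The difference between the two aggregation methods, and between a device name and a device path, can be sketched in plain Go. This is an illustrative sketch only, not the NPD implementation: the types sumAggregator and lastValueAggregator and the helper deviceName are hypothetical names introduced here, and the recorded values are made up.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// sumAggregator is a hypothetical counter-style aggregation:
// every recorded delta is added to a running total.
type sumAggregator struct{ total int64 }

func (s *sumAggregator) Record(v int64) { s.total += v }
func (s *sumAggregator) Value() int64   { return s.total }

// lastValueAggregator is a hypothetical gauge-style aggregation:
// only the most recently recorded value is kept.
type lastValueAggregator struct{ last int64 }

func (l *lastValueAggregator) Record(v int64) { l.last = v }
func (l *lastValueAggregator) Value() int64   { return l.last }

// deviceName turns a device path such as /dev/sda1 into the
// device name sda1 (hypothetical helper for illustration).
func deviceName(devicePath string) string {
	return filepath.Base(devicePath)
}

func main() {
	sum := &sumAggregator{}
	last := &lastValueAggregator{}
	for _, delta := range []int64{5, 3, 7} {
		sum.Record(delta)
		last.Record(delta)
	}
	fmt.Println(sum.Value(), last.Value()) // 15 7
	fmt.Println(deviceName("/dev/sda1"))   // sda1
}
```

With the same three recordings, the sum keeps growing (cumulative) while the last value only reflects the latest sample (gauge).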
2019-08-16 15:14:54 -07:00
Kubernetes Prow Robot
424b864291
Merge pull request #323 from xueweiz/test
...
Add a simple e2e test
2019-08-16 14:56:09 -07:00
Xuewei Zhang
f9b5e60a43
Add e2e test for NPD
...
The first test is a very simple test. It installs NPD on a VM, and then
verifies that NPD reports metric host_uptime in Prometheus format.
2019-08-16 01:33:29 -07:00
Kubernetes Prow Robot
81fcdcebb8
Merge pull request #331 from pigletfly/fix-cm
...
Move NPD into kube-system namespace
2019-08-13 23:04:24 -07:00
pigletfly
4118c56385
Move NPD into kube-system namespace
2019-08-14 12:06:07 +08:00
Xuewei Zhang
db2dbd1eb2
vendor changes for e2e tests
2019-08-13 17:34:20 -07:00
Kubernetes Prow Robot
a442e71190
Merge pull request #325 from lang710/fixSpelling
...
fix a spelling error
2019-08-13 10:53:42 -07:00
Lang Chi
4d37d6fb68
fix a spelling error
...
Signed-off-by: Lang Chi <21860405@zju.edu.cn>
2019-08-13 15:12:01 +08:00
Kubernetes Prow Robot
e280e2075a
Merge pull request #320 from wangzhen127/custom-plugin-fix
...
Don't update condition if status stays False/Unknown for custom plugin
2019-08-07 17:09:18 -07:00
Kubernetes Prow Robot
eeb51ee03a
Merge pull request #322 from nvtkaszpir/bump-image-debian-base
...
Bump base image debian-base to tag v1.0.0 to pick up some CVE fixes
2019-08-04 19:43:49 -07:00
Michał Sochoń
4641ba114f
Bump base image debian-base to tag v1.0.0 to pick up some CVE fixes
2019-08-03 18:21:16 +02:00
Kubernetes Prow Robot
b9adfbb26b
Merge pull request #321 from wangzhen127/overlay2
...
Update the detection method for docker overlay2 issue
2019-08-02 12:07:54 -07:00
Zhen Wang
30e20c6a20
Validate that permanent problem has preset default condition
2019-08-01 23:40:16 -07:00
Zhen Wang
2f5d03280a
Don't update condition if status stays False/Unknown for custom plugin
2019-08-01 23:40:16 -07:00
Kubernetes Prow Robot
7f0b914617
Merge pull request #318 from wangzhen127/log
...
Print monitor config path in the logs
2019-08-01 23:21:51 -07:00
Zhen Wang
a8527712f6
Update the detection method for docker overlay2 issue
2019-08-01 22:16:44 -07:00
Kubernetes Prow Robot
239913cae6
Merge pull request #319 from wangzhen127/systemd-monitor-fix
...
Make systemd monitor look back for 5m
2019-08-01 16:49:52 -07:00
Zhen Wang
570ae0cb20
Make systemd monitor look back for 5m
2019-07-30 11:17:02 -07:00
Zhen Wang
182a9450dd
Print monitor config path in the logs
2019-07-30 11:00:47 -07:00
Kubernetes Prow Robot
599ca532e8
Merge pull request #315 from xueweiz/metrics
...
Report metrics from custom-plugin-monitor
v0.7.0
2019-07-25 11:58:44 -07:00
Xuewei Zhang
94af7de97b
Report metrics from custom-plugin-monitor
2019-07-25 11:28:38 -07:00
Kubernetes Prow Robot
b8ce6360d9
Merge pull request #300 from xueweiz/metrics
...
Report metrics from system-log-monitor
2019-07-12 15:17:06 -07:00
Xuewei Zhang
fbebcf311b
Report metrics from system-log-monitor
2019-07-12 14:38:21 -07:00
Kubernetes Prow Robot
dbe7cafe1e
Merge pull request #308 from yguo0905/master
...
Support waiting for kube-apiserver to be ready with timeout during NPD startup
2019-07-09 16:50:24 -07:00
Yang Guo
ddb1d76178
Support waiting for kube-apiserver to be ready with timeout during NPD startup
2019-07-09 10:24:25 -07:00
Kubernetes Prow Robot
30babe906e
Merge pull request #303 from xueweiz/self
...
Implement host collector as part of system-stats-monitor
2019-07-03 13:38:12 -07:00
Xuewei Zhang
4944ac3e48
Implement host collector as part of system-stats-monitor
...
The host collector reports three things today:
1. Host OS uptime (in seconds)
2. Host kernel version (as a metric label)
3. Host OS version (as a metric label)
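
As a rough sketch of the three items above, the following Go snippet parses an uptime value and attaches the two versions as labels. It is not the NPD code: it assumes Linux /proc/uptime-style input ("<uptime> <idle>"), and the names hostInfo, parseUptime, and the label keys kernel_version and os_version are hypothetical, as are the sample values.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// hostInfo is a hypothetical container for the three reported items.
type hostInfo struct {
	UptimeSeconds int64
	KernelVersion string
	OSVersion     string
}

// parseUptime extracts whole seconds from /proc/uptime-style text,
// whose first field is the uptime in seconds as a decimal number.
func parseUptime(content string) (int64, error) {
	fields := strings.Fields(content)
	if len(fields) == 0 {
		return 0, fmt.Errorf("empty uptime content")
	}
	secs, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, fmt.Errorf("parse uptime %q: %w", fields[0], err)
	}
	return int64(secs), nil
}

// labels returns the versions as metric labels (label keys are assumptions).
func (h hostInfo) labels() map[string]string {
	return map[string]string{
		"kernel_version": h.KernelVersion,
		"os_version":     h.OSVersion,
	}
}

func main() {
	uptime, err := parseUptime("350735.47 234388.90\n")
	if err != nil {
		panic(err)
	}
	info := hostInfo{UptimeSeconds: uptime, KernelVersion: "4.19.0", OSVersion: "example-os 10"}
	fmt.Println(info.UptimeSeconds, info.labels()["kernel_version"]) // 350735 4.19.0
}
```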
2019-06-27 16:40:11 -07:00
Xuewei Zhang
ed16a29ec2
Add github.com/cobaugh/osrelease as dependency
...
This is done via:
GO111MODULE=on go get github.com/cobaugh/osrelease
GO111MODULE=on go mod vendor
2019-06-27 16:40:05 -07:00
Xuewei Zhang
935fab705e
Add github.com/shirou/gopsutil/host to vendor
...
This is needed for a coming PR to measure system uptime.
I separated the vendor changes out, because they are larger but easier to
review.
This is done via:
GO111MODULE=on go get github.com/shirou/gopsutil/host
GO111MODULE=on go mod vendor
2019-06-27 16:40:05 -07:00
Xuewei Zhang
29b0740f4c
Refactor systemstatsmonitor/metric_helper.go into a metrics package
2019-06-27 16:40:05 -07:00
Kubernetes Prow Robot
146dfd70b2
Merge pull request #299 from xueweiz/start
...
Correctly identify failures in problem daemon starting.
2019-06-27 10:47:22 -07:00
Xuewei Zhang
225de07427
Correctly identify failures in problem daemon starting.
2019-06-26 17:55:11 -07:00
Kubernetes Prow Robot
c95c37532b
Merge pull request #292 from wangzhen127/systemd-monitor
...
Add systemd monitor for kubelet, docker, and containerd restart events
2019-06-20 19:14:38 -07:00
Zhen Wang
ea6a141351
Allow using custom flags in build.sh
2019-06-18 10:26:53 -07:00
Zhen Wang
b94a555dfc
Add systemd monitor for kubelet, docker, and containerd restart events
2019-06-18 10:26:53 -07:00
Kubernetes Prow Robot
b667a12ee4
Merge pull request #294 from xueweiz/compile
...
Allow compilation time disabling for each type of Problem Daemon.
2019-06-17 16:32:15 -07:00
Xuewei Zhang
be2647a686
Allow compilation time disabling for each type of Problem Daemon.
2019-06-17 16:02:45 -07:00
Kubernetes Prow Robot
e10e6cc106
Merge pull request #293 from Random-Liu/do-not-import-plugins-unnecessarily
...
Do not import plugins unnecessarily.
2019-06-13 20:32:23 -07:00
Lantao Liu
d520ca89bd
Build node-problem-detector from a directory.
...
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-06-13 18:54:23 -07:00
Lantao Liu
f2d17ee77b
Do not import plugins unnecessarily.
...
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-06-13 17:57:53 -07:00
Kubernetes Prow Robot
975dc718a5
Merge pull request #275 from xueweiz/exp
...
node-problem-detector: report disk queue length in Prometheus format
2019-06-13 15:24:14 -07:00
Xuewei Zhang
cf6624661a
Update READMEs
2019-06-13 00:51:17 -07:00
Xuewei Zhang
7ad5dec712
Add disk metrics support.
2019-06-13 00:51:17 -07:00
Xuewei Zhang
23dc265971
Add Prometheus exporter.
2019-06-13 00:51:17 -07:00
Xuewei Zhang
a07176073a
Add existing monitors into the problem daemon registration hook.
2019-06-13 00:51:17 -07:00
Xuewei Zhang
63f0e35e56
Implement dynamic problemdaemon registration and initialization.
...
Added package problemdaemon. All future problem daemons should be
registered by calling problemdaemon.register().
CLI interfaces will be automatically generated for all registered
problem daemons in the form of "--config.DAEMON_NAME"
2019-06-12 18:29:18 -07:00
Xuewei Zhang
5814195ad5
Move apiserver-reporting logic into k8s_exporter.
...
Added CLI option "enable-k8s-exporter" (defaults to true). Users can use
this option to enable/disable exporting to the Kubernetes control plane.
This commit also removes all the apiserver-specific logic from package
problemdetector.
Future exporters (e.g. to local journald, Prometheus, other control
planes) should implement the types.Exporter interface.
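
A minimal Go sketch of the exporter idea follows. The interface shape is an assumption: the commit message only names types.Exporter, so the exporter interface, its ExportProblems method, the status type, recordingExporter, and buildExporters below are hypothetical stand-ins. Only the flag name "enable-k8s-exporter" and its default of true come from the message above.

```go
package main

import (
	"flag"
	"fmt"
)

// status is a hypothetical problem report passed to exporters.
type status struct {
	Source  string
	Message string
}

// exporter is a hypothetical stand-in for the types.Exporter interface;
// the real method set is not shown in the commit message.
type exporter interface {
	ExportProblems(s *status)
}

// recordingExporter remembers what it was asked to export.
type recordingExporter struct {
	name     string
	received []string
}

func (r *recordingExporter) ExportProblems(s *status) {
	r.received = append(r.received, s.Source+": "+s.Message)
}

// buildExporters always includes a local exporter and adds the
// Kubernetes one only when the flag is enabled.
func buildExporters(enableK8sExporter bool) []exporter {
	exporters := []exporter{&recordingExporter{name: "local"}}
	if enableK8sExporter {
		exporters = append(exporters, &recordingExporter{name: "k8s"})
	}
	return exporters
}

func main() {
	fs := flag.NewFlagSet("npd-sketch", flag.ContinueOnError)
	enableK8s := fs.Bool("enable-k8s-exporter", true, "export problems to the Kubernetes control plane")
	if err := fs.Parse([]string{"--enable-k8s-exporter=false"}); err != nil {
		panic(err)
	}

	exporters := buildExporters(*enableK8s)
	for _, e := range exporters {
		e.ExportProblems(&status{Source: "kernel-monitor", Message: "example problem"})
	}
	fmt.Println(len(exporters)) // 1
}
```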
2019-06-12 18:29:18 -07:00