Commit Graph

403 Commits

Author SHA1 Message Date
Alexandre
4df720c2a0 Improve systemctl check, style + cleanup
- Use `systemctl is-active` to check if service is running
  - Cleaner than `grep` on `systemctl status` output
  - A successful return means the service is running/active
  - A failed return means it is not running, either because the
    service is stopped/failed or because it does not exist

- Use `command -v` instead of `which`
  Ref: https://github.com/koalaman/shellcheck/wiki/SC2230

- Follow Google "Shell Style Guide": indent, use "readonly"

- Minor: Rephrase comment, avoid all caps
2019-11-29 14:14:19 +09:00
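
The commit above changes a shell script, but the same two checks can be sketched in Go for illustration. This is a rough equivalent, not the script's code: `exec.LookPath` stands in for `command -v` (it only searches PATH and does not see shell builtins or functions), the helper names are hypothetical, and `kubelet` is just an example service name.

```go
package main

import (
	"fmt"
	"os/exec"
)

// commandExists reports whether name resolves to an executable on PATH.
// It is roughly analogous to `command -v name` in a POSIX shell.
func commandExists(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

// serviceActive runs `systemctl is-active --quiet <service>` and reports
// true only when the command exits with status 0. A non-zero exit covers
// stopped, failed, and nonexistent services alike; a missing systemctl
// binary is also treated as "not active".
func serviceActive(service string) bool {
	if !commandExists("systemctl") {
		return false
	}
	return exec.Command("systemctl", "is-active", "--quiet", service).Run() == nil
}

func main() {
	// "kubelet" is only an example service name.
	fmt.Println("kubelet active:", serviceActive("kubelet"))
}
```

Relying on the exit status keeps the check independent of the human-readable `systemctl status` output, which is the point the commit message makes.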
Kubernetes Prow Robot
a1a7234878 Merge pull request #363 from grosser/grosser/old
remove kubernetes 1.8 support
2019-09-28 19:27:37 -07:00
Kubernetes Prow Robot
850ecf1f12 Merge pull request #368 from xueweiz/problem-state
Add stackdriver exporter endpoint for problem_gauge
2019-09-26 23:41:36 -07:00
Xuewei Zhang
794300af59 Add stackdriver exporter endpoint for problem_gauge 2019-09-26 13:45:17 -07:00
Kubernetes Prow Robot
76865bda54 Merge pull request #356 from mm4tt/heartbeat-period-flag
Make heartbeatPeriod const into a flag
2019-09-26 08:57:07 -07:00
Matt Matejczyk
2e9da8569d Make heartbeatPeriod const into a flag. 2019-09-26 09:59:03 +02:00
Michael Grosser
f77e80a8c4 remove kubernetes 1.8 support 2019-09-25 16:41:13 -07:00
Kubernetes Prow Robot
219b408222 Merge pull request #352 from xueweiz/test
Set SSH timeout to 5 minutes
2019-09-19 12:30:25 -07:00
Xuewei Zhang
ec4b615844 Set SSH timeout to 5 minutes 2019-09-19 12:01:16 -07:00
Kubernetes Prow Robot
56f42d902e Merge pull request #353 from xueweiz/family
Allow e2e test to pick up test VM image using image family
2019-09-18 23:45:00 -07:00
Xuewei Zhang
1989ab3681 Allow e2e test to pick up test VM image using image family 2019-09-18 16:09:14 -07:00
Kubernetes Prow Robot
9828ab7f06 Merge pull request #349 from xueweiz/test
Allow e2e test to rent project from Boskos
2019-09-16 12:22:38 -07:00
Kubernetes Prow Robot
5345185ec2 Merge pull request #341 from iranzo/patch-1
Update network_problem.sh
2019-09-15 01:00:37 -07:00
Kubernetes Prow Robot
9870e774d3 Merge pull request #350 from lang710/fixSpelling
fix a spelling error
2019-09-15 00:30:38 -07:00
Lang Chi
28233337fc fix a spelling error
Signed-off-by: Lang Chi <21860405@zju.edu.cn>
2019-09-15 12:31:19 +08:00
Xuewei Zhang
fb7fd239bb Add logic for renting test project from Boskos 2019-09-14 15:22:09 -07:00
Xuewei Zhang
e1939ebc03 Handle vendor change in k8s.io/apimachinery/pkg/util/clock
clock.Clock used to have a Tick() method, which has now been replaced with a
NewTicker() method to prevent leaks. Changed NPD code to adapt to it.

See https://github.com/kubernetes/apimachinery/commit/10ebc22e for more
detail.
2019-09-14 15:22:09 -07:00
Xuewei Zhang
3fc6c7f306 Add vendor code for Boskos
Added replace statement for apache/thrift, since it has been recently
moved from git.apache.org/thrift.git to github.com/apache/thrift, and is
causing `go get` to fail.

See https://github.com/jenkins-x/jx/pull/3321 for more detail.

Commands used:
GO111MODULE=on go get k8s.io/test-infra/boskos/client
GO111MODULE=on go mod vendor
2019-09-14 15:22:09 -07:00
Kubernetes Prow Robot
aea91e385c Merge pull request #335 from xueweiz/sd
Add Stackdriver exporter
2019-09-13 23:36:39 -07:00
Xuewei Zhang
0f0e5eff0f Adding stackdriver exporter 2019-09-12 18:30:00 -07:00
Xuewei Zhang
9e789b5f99 Refactor on metrics so that names for all the views are tracked 2019-09-11 12:07:13 -07:00
Xuewei Zhang
0f2fce56e5 Change host/uptime to GAUGE metrics 2019-09-10 16:58:06 -07:00
Xuewei Zhang
42285cb8db vendor changes 2019-09-10 16:58:06 -07:00
Kubernetes Prow Robot
0fdff95f22 Merge pull request #342 from iranzo/fixtypo
Fixes typo in README
2019-09-06 01:46:56 -07:00
Pablo Iranzo Gómez
eea584e78d Fixes typo in README 2019-09-05 16:26:27 +02:00
Pablo Iranzo Gómez
fa94b42849 Use bashate recommendations on network_problem script 2019-09-05 15:46:45 +02:00
Kubernetes Prow Robot
2a07254f96 Merge pull request #253 from finn-no/master
Empty LogPath will use journald's default path.
v0.7.1
2019-08-27 09:22:41 -07:00
Andrew Stribblehill
09c498ad74 Empty LogPath will use journald's default path. 2019-08-27 01:55:30 +02:00
Kubernetes Prow Robot
6aa308db81 Merge pull request #334 from xueweiz/cumulative
Metric format fixes on host/uptime and disk/*
2019-08-19 12:27:31 -07:00
Xuewei Zhang
82c2368795 Metric format fixes on host/uptime and disk/*
1. host/uptime, disk/io_time and disk/weighted_io should be
counter/cumulative metrics, so we have to use the Sum aggregation method
rather than the LastValue aggregation method (which would declare the metric
as a gauge metric).

2. Renamed label "device" for disk/* metrics to "device_name".
This is to clarify that it is device_name (sda1) rather than device_path
(/dev/sda1)
2019-08-16 15:14:54 -07:00
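
The difference between the two aggregation methods can be shown without any metrics library. The types below are hypothetical stand-ins written for this example, not the API NPD uses: a Sum-style aggregation accumulates every recorded value (counter/cumulative semantics), while a LastValue-style aggregation keeps only the most recent one (gauge semantics).

```go
package main

import "fmt"

// sumAggregation accumulates recorded values, like a cumulative counter.
type sumAggregation struct{ total float64 }

func (s *sumAggregation) Record(v float64) { s.total += v }
func (s *sumAggregation) Value() float64   { return s.total }

// lastValueAggregation keeps only the latest recorded value, like a gauge.
type lastValueAggregation struct{ last float64 }

func (l *lastValueAggregation) Record(v float64) { l.last = v }
func (l *lastValueAggregation) Value() float64   { return l.last }

func main() {
	sum := &sumAggregation{}
	last := &lastValueAggregation{}

	// Record the same three increments (for example, seconds of I/O time
	// observed in successive collection intervals) into both aggregations.
	for _, delta := range []float64{5, 3, 2} {
		sum.Record(delta)
		last.Record(delta)
	}

	fmt.Println(sum.Value(), last.Value()) // prints "10 2"
}
```

With a Sum-style aggregation the recorded values must be increments; recording an absolute reading each time would double count.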
Kubernetes Prow Robot
424b864291 Merge pull request #323 from xueweiz/test
Add a simple e2e test
2019-08-16 14:56:09 -07:00
Xuewei Zhang
f9b5e60a43 Add e2e test for NPD
The first test is a very simple test. It installs NPD on a VM, and then
verifies that NPD reports metric host_uptime in Prometheus format.
2019-08-16 01:33:29 -07:00
Kubernetes Prow Robot
81fcdcebb8 Merge pull request #331 from pigletfly/fix-cm
Move NPD into kube-system namespace
2019-08-13 23:04:24 -07:00
pigletfly
4118c56385 Move NPD into kube-system namespace 2019-08-14 12:06:07 +08:00
Xuewei Zhang
db2dbd1eb2 vendor changes for e2e tests 2019-08-13 17:34:20 -07:00
Kubernetes Prow Robot
a442e71190 Merge pull request #325 from lang710/fixSpelling
fix a spelling error
2019-08-13 10:53:42 -07:00
Lang Chi
4d37d6fb68 fix a spelling error
Signed-off-by: Lang Chi <21860405@zju.edu.cn>
2019-08-13 15:12:01 +08:00
Kubernetes Prow Robot
e280e2075a Merge pull request #320 from wangzhen127/custom-plugin-fix
Don't update condition if status stays False/Unknown for custom plugin
2019-08-07 17:09:18 -07:00
Kubernetes Prow Robot
eeb51ee03a Merge pull request #322 from nvtkaszpir/bump-image-debian-base
Bump base image debian-base to tag v1.0.0 to pick up some CVE fixes
2019-08-04 19:43:49 -07:00
Michał Sochoń
4641ba114f Bump base image debian-base to tag v1.0.0 to pick up some CVE fixes 2019-08-03 18:21:16 +02:00
Kubernetes Prow Robot
b9adfbb26b Merge pull request #321 from wangzhen127/overlay2
Update the detection method for docker overlay2 issue
2019-08-02 12:07:54 -07:00
Zhen Wang
30e20c6a20 Validate that permanent problem has preset default condition 2019-08-01 23:40:16 -07:00
Zhen Wang
2f5d03280a Don't update condition if status stays False/Unknown for custom plugin 2019-08-01 23:40:16 -07:00
Kubernetes Prow Robot
7f0b914617 Merge pull request #318 from wangzhen127/log
Print monitor config path in the logs
2019-08-01 23:21:51 -07:00
Zhen Wang
a8527712f6 Update the detection method for docker overlay2 issue 2019-08-01 22:16:44 -07:00
Kubernetes Prow Robot
239913cae6 Merge pull request #319 from wangzhen127/systemd-monitor-fix
Make systemd monitor look back for 5m
2019-08-01 16:49:52 -07:00
Zhen Wang
570ae0cb20 Make systemd monitor look back for 5m 2019-07-30 11:17:02 -07:00
Zhen Wang
182a9450dd Print monitor config path in the logs 2019-07-30 11:00:47 -07:00
Kubernetes Prow Robot
599ca532e8 Merge pull request #315 from xueweiz/metrics
Report metrics from custom-plugin-monitor
v0.7.0
2019-07-25 11:58:44 -07:00
Xuewei Zhang
94af7de97b Report metrics from custom-plugin-monitor 2019-07-25 11:28:38 -07:00