Commit Graph

212 Commits

Author SHA1 Message Date
whwreflux
3fba7a9e86 fix README under systemlogmonitor 2022-07-29 17:14:46 +08:00
Kubernetes Prow Robot
9a9b06d24d Merge pull request #660 from grosser/grosser/latest
simplify cri health check
2022-07-26 20:00:28 -07:00
Kubernetes Prow Robot
7bc362cfdc Merge pull request #668 from grosser/grosser/systemd
show failed statuses as warning
2022-07-26 19:16:38 -07:00
Kubernetes Prow Robot
341af62275 Merge pull request #646 from notchairmk/notchairmk/custom-skip-initial
Allow skipping condition during customplugin initialization
2022-07-26 19:16:31 -07:00
diamondburned
6809f445eb Remove unused resultChan field in CPM
This commit removes the resultChan field in ./pkg/custompluginmonitor's
customPluginMonitor struct. This was detected by staticcheck:

    ―❤―▶ staticcheck ./pkg/custompluginmonitor/
    pkg/custompluginmonitor/custom_plugin_monitor.go:50:2: field resultChan is unused (U1000)
2022-07-12 21:43:05 -07:00
Kubernetes Prow Robot
72f1672634 Merge pull request #675 from mmiranda96/feat/net-monitor-groupings
Add ExcludeInterfaceRegexp to Net Dev monitor
2022-06-29 14:50:06 -07:00
Mike Miranda
1471f74d98 Add ExcludeInterfaceRegexp to Net Dev monitor 2022-06-15 23:22:38 +00:00
Andrew Garrett
b1bd8e7424 Use %q instead of %s 2022-06-09 17:18:30 +00:00
Andrew Garrett
a39a7c6e0f Add condition message to event message
If you're using some monitoring solution that aggregates events from
your Kubernetes cluster, having the underlying reason why a condition
triggered could be very useful, especially if you are using custom
plugin monitors.

Co-authored-by: Micah Norman <micnorman@paypal.com>
Signed-off-by: Ryan Eschinger <reschinger@paypal.com>
2022-06-08 21:42:40 +00:00
Michael Grosser
011b9e6a46 show failed statuses as warning 2022-04-26 11:50:10 -07:00
Taylor Chaparro
9344c938bb Allow skipping condition during customplugin initialization 2022-04-26 10:12:01 -07:00
Kubernetes Prow Robot
c083db10f0 Merge pull request #628 from mx-psi/master
Change to using new dependency name for osreleaser
2022-04-22 11:35:37 -07:00
Kubernetes Prow Robot
9c23553e0b Merge pull request #650 from yankay/fix-deprecated-maintainer-in-dockerfile
Fix deprecated "MAINTAINER" in Dockerfile
2022-04-21 12:28:12 -07:00
Neo Zhuo
11ddb5e6bf support custom /proc path 2022-04-11 18:15:08 +08:00
Neo Zhuo
78c11c4ceb reimplement net collector metrics register, config check and recording 2022-04-11 18:15:07 +08:00
Michael Grosser
d764b1ab87 simplify cri health check 2022-03-28 17:05:53 -07:00
Kay Yan
bc89bbce56 MAINTAINER in Dockerfile is deprecated, change to label
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2022-03-07 15:27:08 +08:00
Pablo Baeyens
a859b5f027 Change to using new dependency name for osreleaser
To do this I
1. changed the name in go.mod and the Go code that used it,
2. ran `go mod tidy -go=1.15` and
3. ran `go mod vendor`.

Step 3 added another vendored dependency unrelated AFAIK to this change.
2021-11-29 16:45:48 +01:00
michelletandya
3344efd552 ensure time is in Universal Time Zone to properly calculate uptime 2021-09-02 17:41:54 +00:00
Kubernetes Prow Robot
56c592a5d7 Merge pull request #587 from vteratipally/bug_fix
Add a check if the metric is nil so that collector doesn't collect metrics.
2021-08-31 09:21:37 -07:00
Kubernetes Prow Robot
3c3609b5fa Merge pull request #612 from mcshooter/updateUptimeCMd
Update powershell command for uptime to help efficiency
2021-08-20 18:42:05 -07:00
michelletandya
dd0d0d71ab Update powershell command for uptime to help efficiency 2021-08-20 01:16:45 +00:00
michelletandya
26f070bfd4 Prevent uptimeFunc from being called every time CheckHealth is called 2021-08-17 19:30:28 +00:00
Julie Qi
fe09e416bd remove aufs hung check 2021-07-30 13:53:25 -07:00
Varsha Teratipally
ebdd9038b7 Add a check if the metric is nil so that collector doesn't collect the
metrics.
2021-06-30 19:50:16 +00:00
Oleg Atamanenko
c8629cea5d Check kube-proxy health on linux 2021-06-29 21:36:27 -07:00
Kubernetes Prow Robot
cbb029d905 Merge pull request #583 from pezzak/log-kubeapi-error
Log error from kube-api
2021-06-25 10:18:51 -07:00
Kubernetes Prow Robot
a0b0f9460f Merge pull request #578 from kubernetes/partitions
Reduce the number of reads to /proc/partitions file and gofmt.
2021-06-25 10:18:45 -07:00
Kubernetes Prow Robot
e349323507 Merge pull request #539 from smileusd/health_check
Improve health-checker
2021-06-25 09:48:45 -07:00
pezzak
ed97725ea1 Log error from kube-api 2021-06-17 12:51:44 +03:00
michelletandya
a14577dfa4 update CriCtl path for windows 2021-06-15 01:03:04 +00:00
varsha teratipally
7b51a90328 Reduce the number of reads to /proc/partitions file
to retrieve the partitions on disk
2021-06-13 21:11:34 +00:00
tashen
a3b928467e add loopbacktime to reduce time of journalctl call 2021-05-19 13:55:55 +08:00
Lantao Liu
8e94c930ee Fix the uptime timestamp parsing. 2021-05-14 16:43:09 -07:00
Kubernetes Prow Robot
9c541692ee Merge pull request #557 from vteratipally/adfad
Make sure the path to known-modules.json is relative
2021-05-14 14:39:59 -07:00
Varsha Teratipally
a79b87ce7e Make sure the path to known-modules.json is relative to the
system-stats-monitor.json file
2021-05-14 21:14:55 +00:00
Jeremy Edwards
d4933875ed Add support for basic system metrics for Windows. 2021-05-10 21:58:38 +00:00
michelletandya
01cd8dd08c Add healthChecker functionality for kube-proxy service 2021-05-05 17:27:58 +00:00
michelletandya
c4e5400ed6 separate linux/windows health checker files. 2021-04-26 21:45:05 +00:00
Jeremy Edwards
a7f78c5668 Enable NPD to run as a Windows Service. 2021-04-02 23:03:14 -07:00
Jeremy Edwards
4181ece888 Windows Support: Fix Build Regressions, Tests Pass 2021-03-14 10:24:45 -07:00
Archit Bansal
fb8bbe91d7 Fix for flaky unit test in health checker
The unit test was dependent on the order of map iteration. Changed to
using sorted keys while iterating.
2021-02-18 17:52:49 -08:00
Archit Bansal
100f2bf8e6 Make log pattern check configurable in health checker 2021-02-17 17:46:18 -08:00
Karan Goel
c2aceee61d remove os_versions and kernel_version labels 2021-02-02 08:25:10 -08:00
Kubernetes Prow Robot
422c088d62 Merge pull request #516 from karan/system_time
add metric for per-cpu, per-stage timing
2021-02-01 18:54:28 -08:00
zhangyue
98ba606d4f fix check for timeout
Signed-off-by: zhangyue <huaihuan.zy@alibaba-inc.com>
2021-01-30 21:35:00 +08:00
Karan Goel
8648fe265a add metric for per-cpu, per-stage timing 2021-01-29 08:46:39 -08:00
Karan Goel
2a2bab3d28 Add network interface stats
We do not have to collect these often, so for now set the collection
interval to 120s (even though the Stackdriver exporter is still set to
export every 60s).
2021-01-20 08:56:34 -08:00
Kubernetes Prow Robot
45f70a8b26 Merge pull request #456 from ZYecho/fix_timeout
fix: fix script timeout not working
2021-01-19 19:01:58 -08:00
Kubernetes Prow Robot
c2d7a7be62 Merge pull request #513 from karan/cpu_activity_metrics
add metrics for process stats
2021-01-19 18:38:07 -08:00