531 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
8f2a94fd7e Merge pull request #502 from jeremyje/windows
Introduce Windows build of Node Problem Detector
2020-12-07 22:21:11 -08:00
Jeremy Edwards
4adec4bbc6 Introduce Windows build of Node Problem Detector 2020-12-05 23:54:52 +00:00
Kubernetes Prow Robot
bf51d6600e Merge pull request #492 from vteratipally/module_stats_branch
add code to retrieve kernel modules in a linux system from /proc/modules
2020-12-03 09:51:00 -08:00
Kubernetes Prow Robot
1e917af560 Merge pull request #455 from ZYecho/fix_newmessage
fix: print result's message when status unknown
2020-11-24 16:14:39 -08:00
Kubernetes Prow Robot
6956e6074d Merge pull request #500 from Random-Liu/fix-staging-bucket
Change default staging bucket.
2020-11-20 09:44:51 -08:00
Lantao Liu
ed783da499 Change default staging bucket.
The new staging bucket for the promoter is `gcr.io/k8s-staging-npd`.
2020-11-20 09:08:35 -08:00
varsha teratipally
2b50e4af1a add testcases for cos and ubuntu to retrieve modules 2020-11-19 10:29:12 +00:00
varsha teratipally
944efce3a6 add code for retrieving kernel modules 2020-11-19 09:49:25 +00:00
Kubernetes Prow Robot
59536256e3 Merge pull request #475 from vteratipally/boot_size_disk
catching hung task with pattern like "tasks airflow scheduler: *"
v0.8.5
2020-11-18 14:42:50 -08:00
Kubernetes Prow Robot
112d53b10a Merge pull request #497 from vteratipally/fs_types
avoid duplicating the disk bytes used metrics based on fstype and mount types
2020-11-18 10:48:07 -08:00
zhangyue
b51cb3219f fix: print result's message when status unknown
Signed-off-by: zhangyue <huaihuan.zy@alibaba-inc.com>
2020-11-18 19:30:17 +08:00
vteratipally
0c258bb704 Update kernel-monitor.json 2020-11-17 13:38:07 -08:00
Kubernetes Prow Robot
438d014389 Merge pull request #425 from jsoref/grammar
Grammar
2020-11-16 21:38:04 -08:00
Kubernetes Prow Robot
3abcfb7063 Merge pull request #490 from karan/vendor
Bump some major dependencies to latest versions
2020-11-16 14:06:50 -08:00
Kubernetes Prow Robot
d8ea2538de Merge pull request #489 from abansal4032/health-check-kubelet-connection
Kubelet api server connection check in health checker
2020-11-16 14:06:42 -08:00
Kubernetes Prow Robot
cff4a54d6a Merge pull request #488 from vteratipally/io_errors
Add detection logic for I/O errors
2020-11-16 14:06:36 -08:00
Kubernetes Prow Robot
5919888571 Merge pull request #485 from karan/helm-readme
fix helm instructions
2020-11-16 14:06:28 -08:00
Kubernetes Prow Robot
2d53c0a2a6 Merge pull request #481 from tosi3k/oom-regex-fix
Adapt OOMKilling pattern to old and new Linux kernels
2020-11-16 14:06:20 -08:00
Kubernetes Prow Robot
33571a312d Merge pull request #478 from neoseele/master
fix: node memory metrics are off by 1024
2020-11-16 14:06:12 -08:00
Kubernetes Prow Robot
06e5a875be Merge pull request #430 from wawa0210/linux-only
avoid npd pod schedule on windows node
2020-11-16 14:06:04 -08:00
varsha teratipally
1550882948 avoid duplicating the disk bytes used metrics based on fstype and mountopts 2020-11-16 20:10:46 +00:00
Kubernetes Prow Robot
35bfe697a5 Merge pull request #484 from karan/trial-metric
Collect CPU load averages in a separate metric
2020-11-12 12:00:28 -08:00
Karan Goel
db35f6a857 bump some dependencies to latest versions 2020-11-09 15:33:13 -08:00
Archit Bansal
2513756583 Add kubelet apiserver connection fail check in health checker 2020-11-09 12:47:16 -08:00
Karan Goel
925ea7393c Collect CPU load averages in a separate metric 2020-11-09 09:41:52 -08:00
varsha teratipally
f01b5e5cfe Detect I/O errors 2020-11-06 03:48:33 +00:00
Karan Goel
d39915d392 fix helm instructions 2020-11-04 11:59:07 -08:00
Kubernetes Prow Robot
0fb464c24a Merge pull request #459 from abansal4032/logging-improvements
Add logging levels to custom plugin logs.
2020-11-04 11:59:01 -08:00
Antoni Zawodny
6b650e785e Adapt OOMKilling pattern to old and new Linux kernels 2020-10-22 15:12:26 +02:00
Neil
589411702a fix: node memory metrics are off by 1024
The memory unit in /proc/meminfo is kB (b/171164235)

```
MemTotal:       264129908 kB
MemFree:        153559480 kB
...
```
2020-10-19 17:26:31 +11:00
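
The unit issue described in the commit above can be illustrated with a minimal sketch. It is not the project's code: `parse_meminfo_bytes` and `SAMPLE` are hypothetical names, and the sample reuses the two values quoted in the commit message. The sketch assumes that a value tagged `kB` in `/proc/meminfo` counts units of 1024 bytes, so reporting it as bytes without scaling is off by a factor of 1024.

```python
def parse_meminfo_bytes(text):
    """Parse /proc/meminfo-style text into a dict of byte counts.

    Hypothetical helper for illustration only.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if not fields or not fields[0].isdigit():
            continue  # skip lines without a numeric value
        amount = int(fields[0])
        # Values tagged "kB" are scaled by 1024; untagged values are kept as-is.
        if len(fields) > 1 and fields[1] == "kB":
            amount *= 1024
        values[name.strip()] = amount
    return values


# Sample taken from the commit message above.
SAMPLE = """MemTotal:       264129908 kB
MemFree:        153559480 kB
"""

meminfo = parse_meminfo_bytes(SAMPLE)
```

Keeping the unit conversion in one place, next to the parsing, makes it harder for a metric exporter to forget the scaling later.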
varsha teratipally
f984abbe2e catching hung task with pattern like "tasks airflow scheduler:"; some of the events related to hung tasks were not identified 2020-10-08 23:04:15 +00:00
Kubernetes Prow Robot
f42281ee26 Merge pull request #459 from abansal4032/logging-improvements
Add logging levels to custom plugin logs.
v0.8.4
2020-08-28 17:05:19 -07:00
Archit Bansal
8c94d5e60c Add logging levels to custom plugin logs. 2020-08-28 12:51:50 -07:00
Kubernetes Prow Robot
7fa34545b7 Merge pull request #458 from abansal4032/logging-improvements
Log custom plugin stderr only if the status is not ok.
2020-08-27 10:41:53 -07:00
Archit Bansal
3a9370e01b Log custom plugin stderr only if the status is not ok.
Otherwise, for plugins that run frequently and report ok status, the
logs fill with unnecessary noise, which significantly increases log
size.
2020-08-27 10:17:05 -07:00
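
The behavior described in the commit above can be sketched roughly as follows. This is an illustration, not the project's implementation: the status constants and `log_plugin_result` are hypothetical names, and the numeric status values are assumptions.

```python
import logging

# Hypothetical status codes, assumed for this sketch.
OK, NON_OK, UNKNOWN = 0, 1, 2


def log_plugin_result(logger, plugin, status, stderr_text):
    """Log a plugin's stderr only when its status is not OK.

    Returns True if a log record was emitted, False otherwise.
    """
    if status == OK or not stderr_text:
        # Frequent, healthy runs stay silent so the logs do not grow.
        return False
    logger.warning(
        "plugin %s returned status %d, stderr: %s",
        plugin, status, stderr_text.strip(),
    )
    return True
```

With this shape, a plugin that runs every few seconds and reports ok adds nothing to the log, while a failing run still records its stderr for debugging.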
Kubernetes Prow Robot
8a41d4abe3 Merge pull request #453 from vteratipally/docker_failures
Detect docker startup failures
2020-08-14 15:26:18 -07:00
vteratipally
edfd70a16c Update docker-monitor.json
fixed json format error as it doesn't allow trailing commas
2020-08-11 10:02:17 -07:00
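
The trailing-comma problem fixed in the commit above is easy to reproduce with any strict JSON parser. The sketch below uses Python's standard `json` module. The two sample documents and `is_valid_json` are hypothetical and are not taken from `docker-monitor.json`.

```python
import json

# Hypothetical config fragments; only the trailing comma differs.
WITH_TRAILING_COMMA = '{"rules": [{"type": "temporary"},]}'
WITHOUT_TRAILING_COMMA = '{"rules": [{"type": "temporary"}]}'


def is_valid_json(text):
    """Return True if text parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Running a check like this over config files before committing them would catch this class of format error early.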
vteratipally
fbdd9eec9a Update docker-monitor.json
making the DockerContainerStartup failure temporary
2020-08-11 09:59:46 -07:00
Kubernetes Prow Robot
860e6b0145 Merge pull request #452 from vteratipally/add_fstypes
Add more info to disk metrics
2020-08-07 13:37:57 -07:00
varsha teratipally
4ce29a95d5 removed the $ symbol as npd handles end of the line 2020-08-06 01:30:11 +00:00
varsha teratipally
50127b0512 changed labelname after code review 2020-08-06 00:43:45 +00:00
varsha teratipally
4c40b7e468 updated readme 2020-08-05 21:43:58 +00:00
varsha teratipally
95237efb4d Detect docker startup failures 2020-08-05 21:29:11 +00:00
varsha teratipally
e13210157d Add more info to disk metrics 2020-08-05 21:12:25 +00:00
Kubernetes Prow Robot
c01ea4f582 Merge pull request #450 from saintube/master
Fix typo in custom-plugin-monitor
2020-08-04 12:14:21 -07:00
Frame
9678892546 Fix typo in custom-plugin-monitor 2020-08-03 17:08:42 +08:00
Kubernetes Prow Robot
f3ab10eddb Merge pull request #442 from abansal4032/custom-plugin-logs-capture
Capture the logs from stderr of custom plugins
v0.8.3
2020-07-29 14:18:03 -07:00
Archit Bansal
6acf5b1edb Capture the logs from stderr of custom plugins. 2020-07-29 11:57:05 -07:00
Kubernetes Prow Robot
c3cf941e98 Merge pull request #441 from abansal4032/custom-plugin-log-fix
Generate new status log only on condition change
2020-07-28 09:45:48 -07:00
Archit Bansal
f80f3e0dfa Generate status generation logs from custom plugin run only on condition change. 2020-07-24 09:39:39 -07:00