Kubernetes Prow Robot
cbb029d905
Merge pull request #583 from pezzak/log-kubeapi-error
...
Log error from kube-api
2021-06-25 10:18:51 -07:00
Kubernetes Prow Robot
a0b0f9460f
Merge pull request #578 from kubernetes/partitions
...
Reduce the number of reads to /proc/partitions file and gofmt.
2021-06-25 10:18:45 -07:00
Kubernetes Prow Robot
e349323507
Merge pull request #539 from smileusd/health_check
...
Improve health-checker
2021-06-25 09:48:45 -07:00
pezzak
ed97725ea1
Log error from kube-api
2021-06-17 12:51:44 +03:00
michelletandya
a14577dfa4
update CriCtl path for windows
2021-06-15 01:03:04 +00:00
varsha teratipally
7b51a90328
Reduce the number of reads to /proc/partitions file
...
to retrieve the partitions on disk
2021-06-13 21:11:34 +00:00
tashen
a3b928467e
add loopbacktime to reduce time of journalctl call
2021-05-19 13:55:55 +08:00
Lantao Liu
8e94c930ee
Fix the uptime timestamp parsing.
2021-05-14 16:43:09 -07:00
Kubernetes Prow Robot
9c541692ee
Merge pull request #557 from vteratipally/adfad
...
Make sure the path to known-modules.json is relative
2021-05-14 14:39:59 -07:00
Varsha Teratipally
a79b87ce7e
Make sure the path to known-modules.json is relative to the
...
system-stats-monitor.json file
2021-05-14 21:14:55 +00:00
Jeremy Edwards
d4933875ed
Add support for basic system metrics for Windows.
2021-05-10 21:58:38 +00:00
michelletandya
01cd8dd08c
Add healthChecker functionality for kube-proxy service
2021-05-05 17:27:58 +00:00
michelletandya
c4e5400ed6
separate linux/windows health checker files.
2021-04-26 21:45:05 +00:00
Jeremy Edwards
a7f78c5668
Enable NPD to run as a Windows Service.
2021-04-02 23:03:14 -07:00
Jeremy Edwards
4181ece888
Windows Support: Fix Build Regressions, Tests Pass
2021-03-14 10:24:45 -07:00
Archit Bansal
fb8bbe91d7
Fix for flaky unit test in health checker
...
The unit test was dependent on the order of map iteration. Changed to
using sorted keys while iterating.
2021-02-18 17:52:49 -08:00
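The fix described in the commit above (iterating over sorted keys rather than relying on map order) can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `sortedKeys` and the sample map are hypothetical. Go deliberately randomizes map iteration order, so any test that compares output built from a bare `range` over a map can pass or fail from run to run.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper: it returns the keys of m in
// ascending order so that callers can iterate deterministically.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Sample data invented for illustration only.
	checks := map[string]string{"kubelet": "ok", "docker": "ok", "kube-proxy": "ok"}

	// Ranging over the sorted key slice gives the same order on every run,
	// unlike ranging over the map directly.
	for _, k := range sortedKeys(checks) {
		fmt.Printf("%s=%s\n", k, checks[k])
	}
}
```

Sorting costs O(n log n) per iteration, which is usually negligible for the small maps used in tests and configuration.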
Archit Bansal
100f2bf8e6
Make log pattern check configurable in health checker
2021-02-17 17:46:18 -08:00
Karan Goel
c2aceee61d
remove os_versions and kernel_version labels
2021-02-02 08:25:10 -08:00
Kubernetes Prow Robot
422c088d62
Merge pull request #516 from karan/system_time
...
add metric for per-cpu, per-stage timing
2021-02-01 18:54:28 -08:00
zhangyue
98ba606d4f
fix check for timeout
...
Signed-off-by: zhangyue <huaihuan.zy@alibaba-inc.com>
2021-01-30 21:35:00 +08:00
Karan Goel
8648fe265a
add metric for per-cpu, per-stage timing
2021-01-29 08:46:39 -08:00
Karan Goel
2a2bab3d28
Add network interface stats
...
We do not have to collect these often, so for now set the collection
interval to 120s (even though the Stackdriver exporter is still set to
export every 60s).
2021-01-20 08:56:34 -08:00
Kubernetes Prow Robot
45f70a8b26
Merge pull request #456 from ZYecho/fix_timeout
...
fix: fix script timeout can't work
2021-01-19 19:01:58 -08:00
Kubernetes Prow Robot
c2d7a7be62
Merge pull request #513 from karan/cpu_activity_metrics
...
add metrics for process stats
2021-01-19 18:38:07 -08:00
Kubernetes Prow Robot
a8a1d30310
Merge pull request #509 from jeremyje/winrun
...
Support filelog watching in Windows.
2021-01-19 18:37:59 -08:00
varsha teratipally
2cb1195f18
cleanup the log
2021-01-13 17:54:53 +00:00
Karan Goel
f13d2a5449
don't run os feature collector if metric not initialized
2021-01-13 09:33:13 -08:00
Jeremy Edwards
adc587f222
Support filelog watching in Windows.
2021-01-13 17:16:46 +00:00
Karan Goel
71098097c0
add metrics for process stats
...
Tested on a COS VM:
```
$ curl -s localhost:20257/metrics | grep "^system_"
system_interrupts_total{kernel_version="5.4.49+",os_version="cos 85-13310.1041.24"} 8.759236e+07
system_processes_total{kernel_version="5.4.49+",os_version="cos 85-13310.1041.24"} 692506
system_procs_blocked{kernel_version="5.4.49+",os_version="cos 85-13310.1041.24"} 0
system_procs_running{kernel_version="5.4.49+",os_version="cos 85-13310.1041.24"} 2
```
2021-01-13 09:14:08 -08:00
zhangyue
4f68b251ac
fix: fix script timeout can't work
...
Signed-off-by: zhangyue <huaihuan.zy@alibaba-inc.com>
2021-01-13 20:53:25 +08:00
varsha teratipally
f89f620909
added new line in the known_modules.json
2021-01-08 23:25:02 +00:00
varsha teratipally
eb38b4b598
added a new metric to retrieve os features like unknown modules
2021-01-08 21:52:16 +00:00
Kubernetes Prow Robot
4ad49bbd84
Merge pull request #503 from vteratipally/label_fix
...
changing the label names as per the standards
2020-12-08 22:04:49 -08:00
Kubernetes Prow Robot
4dccc1ce24
Merge pull request #493 from vteratipally/kernel_cmdline_parameters
...
add code to retrieve kernel command line parameters
2020-12-08 17:58:18 -08:00
varsha teratipally
4085da817d
renaming splitWords to tokens
2020-12-08 18:34:54 +00:00
varsha teratipally
047958a49c
changing the label names as per the standards
2020-12-08 02:27:22 +00:00
varsha teratipally
ffc46f977d
add code to retrieve kernel command line parameters
2020-12-07 22:40:22 +00:00
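As a rough illustration of what retrieving kernel command line parameters involves, the sketch below tokenizes a command line string such as the contents of `/proc/cmdline` into key/value pairs. The function name `parseCmdline` and the sample string are assumptions made for this example, and the code does not reflect the actual implementation in the commit. Quoted values that contain spaces are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCmdline is a hypothetical helper: it splits a kernel command line
// on whitespace and then splits each token on the first "=".
// Flags without "=" (for example "ro") map to an empty value.
func parseCmdline(cmdline string) map[string]string {
	params := make(map[string]string)
	for _, tok := range strings.Fields(cmdline) {
		parts := strings.SplitN(tok, "=", 2)
		value := ""
		if len(parts) == 2 {
			value = parts[1]
		}
		params[parts[0]] = value
	}
	return params
}

func main() {
	// Sample command line invented for illustration; on a Linux host the
	// real value could be read from /proc/cmdline with os.ReadFile.
	sample := "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet console=ttyS0,115200"
	params := parseCmdline(sample)
	fmt.Println(params["root"], params["console"])
}
```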
Jeremy Edwards
4adec4bbc6
Introduce Windows build of Node Problem Detector
2020-12-05 23:54:52 +00:00
Kubernetes Prow Robot
bf51d6600e
Merge pull request #492 from vteratipally/module_stats_branch
...
add code to retrieve kernel modules in a linux system from /proc/modules
2020-12-03 09:51:00 -08:00
Kubernetes Prow Robot
1e917af560
Merge pull request #455 from ZYecho/fix_newmessage
...
fix: print result's message when status unknown
2020-11-24 16:14:39 -08:00
varsha teratipally
2b50e4af1a
add testcases for cos and ubuntu to retrieve modules
2020-11-19 10:29:12 +00:00
varsha teratipally
944efce3a6
add code for retrieving kernel modules
2020-11-19 09:49:25 +00:00
Kubernetes Prow Robot
112d53b10a
Merge pull request #497 from vteratipally/fs_types
...
avoid duplicating the disk bytes used metrics based on fstype and mount types
2020-11-18 10:48:07 -08:00
zhangyue
b51cb3219f
fix: print result's message when status unknown
...
Signed-off-by: zhangyue <huaihuan.zy@alibaba-inc.com>
2020-11-18 19:30:17 +08:00
Kubernetes Prow Robot
d8ea2538de
Merge pull request #489 from abansal4032/health-check-kubelet-connection
...
Kubelet api server connection check in health checker
2020-11-16 14:06:42 -08:00
Kubernetes Prow Robot
33571a312d
Merge pull request #478 from neoseele/master
...
fix: node memory metrics are off by 1024
2020-11-16 14:06:12 -08:00
varsha teratipally
1550882948
avoid duplicating the disk bytes used metrics based on fstype and mountopts
2020-11-16 20:10:46 +00:00
Archit Bansal
2513756583
Add kubelet apiserver connection fail check in health checker
2020-11-09 12:47:16 -08:00
Karan Goel
925ea7393c
Collect CPU load averages in a separate metric
2020-11-09 09:41:52 -08:00
Neil
589411702a
fix: node memory metrics are off by 1024
...
The memory unit in /proc/meminfo is kB (b/171164235)
```
MemTotal: 264129908 kB
MemFree: 153559480 kB
...
```
2020-10-19 17:26:31 +11:00
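The unit issue described in the commit above can be illustrated with a short sketch: values in `/proc/meminfo` that carry a `kB` suffix must be multiplied by 1024 before being reported as bytes. The function name `parseMeminfoBytes` is hypothetical and this is not the code from the fix; it only shows the conversion under the assumption that input follows the `Name: value kB` layout quoted in the commit message.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMeminfoBytes is a hypothetical helper: it parses /proc/meminfo-style
// text and returns each field converted to bytes where a "kB" unit is given.
// Fields without a unit (for example HugePages_Total) are returned unchanged.
func parseMeminfoBytes(text string) (map[string]uint64, error) {
	out := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue // skip blank or malformed lines
		}
		name := strings.TrimSuffix(fields[0], ":")
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", sc.Text(), err)
		}
		if len(fields) >= 3 && fields[2] == "kB" {
			v *= 1024 // the step whose omission leaves metrics off by 1024
		}
		out[name] = v
	}
	return out, sc.Err()
}

func main() {
	// Values taken from the commit message above.
	sample := "MemTotal: 264129908 kB\nMemFree: 153559480 kB\n"
	m, err := parseMeminfoBytes(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("MemTotal bytes:", m["MemTotal"])
}
```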