Commit Graph

81 Commits

Author SHA1 Message Date
Xuewei Zhang
cf6624661a Update READMEs 2019-06-13 00:51:17 -07:00
Xuewei Zhang
7ad5dec712 Add disk metrics support. 2019-06-13 00:51:17 -07:00
Xuewei Zhang
23dc265971 Add Prometheus exporter. 2019-06-13 00:51:17 -07:00
Xuewei Zhang
a07176073a Add existing monitors into the problem daemon registration hook. 2019-06-13 00:51:17 -07:00
Xuewei Zhang
63f0e35e56 Implement dynamic problemdaemon registration and initialization.
Added package problemdaemon. All future problem daemons should be
registered by calling problemdaemon.register().

CLI interfaces will be automatically generated for all registered
problem daemons in the form of "--config.DAEMON_NAME"
2019-06-12 18:29:18 -07:00
Xuewei Zhang
5814195ad5 Move apiserver-reporting logic into k8s_exporter.
Added CLI option "enable-k8s-exporter" (defaults to true). Users can use
this option to enable/disable exporting to Kubernetes control plane.

This commit also removes all the apiserver-specific logic from package
problemdetector.

Future exporters (e.g. to local journald, Prometheus, other control
planes) should implement the types.Exporter interface.
2019-06-12 18:29:18 -07:00
Xuewei Zhang
c6c4e80c9d Remove TestGoroutineLeak unit tests.
We are seeing some flakes on these tests because of some goroutine
fluctuation:
https://github.com/kubernetes/node-problem-detector/pull/275#issuecomment-499306727

Removing the tests, as it is more robust to test for leakage in a
soak/stress test than in a unit test.
2019-06-06 16:27:59 -07:00
Yang Guo
468a23d09a Run custom plugins immediately on startup 2019-06-04 09:42:34 -07:00
Andy Xie
33dffe0761 enable condition update when message changes for custom plugin 2018-12-11 13:14:49 +08:00
Zhen Wang
3062622d7c More fix to custom plugin monitor condition change 2018-11-27 10:59:40 -08:00
k8s-ci-robot
d793330dcd Merge pull request #203 from andyxning/fix_custom_plugin_monitor_condition_change
fix custom plugin monitor condition change
2018-11-27 10:37:42 -08:00
Zhen Wang
1f636381b8 Detect kubelet and container runtime frequent crashes 2018-11-26 22:41:06 -08:00
SataQiu
91adf37050 fix typo: NDDE -> NODE, permenantly -> permanently 2018-11-21 17:36:08 +08:00
Jason Stangroome
38330605c5 Fix the spelling of monitor in the error message 2018-11-20 14:00:30 +11:00
Andy Xie
e3b37719ec fix custom plugin monitor condition change 2018-11-12 17:57:55 +08:00
AdamDang
392ebe9c1b Typo fix in systemlogmonitor/README.md
configurtion->configuration
2018-09-25 10:13:48 +08:00
Andy Xie
89cfb5261d bump kubernetes to 1.9 2018-07-09 14:59:51 +08:00
k8s-ci-robot
f479d09e58 Merge pull request #183 from andyxning/adjust_client-go_user-agent
adjust client-go User-Agent
2018-06-25 18:15:51 -07:00
Andy Xie
866ae661da adjust client-go User-Agent 2018-06-24 10:39:28 +08:00
k8s-ci-robot
aabd369760 Merge pull request #151 from Random-Liu/improve-cpm
Improve cpm
2018-06-22 01:10:05 -07:00
AdamDang
e6e42175fa Typo fix: encounts->encounters
2018-06-22 14:04:45 +08:00
Lantao Liu
ee103dd4ac Generate event for condition change and support unknown status. 2018-06-21 15:29:53 -07:00
David Ashpole
bf730e9c63 add log-counter go plugin 2018-06-20 15:55:19 -07:00
Lantao Liu
9acad906ff Merge pull request #158 from cimomo/small-fix
Use camelCase instead of snake_case per Golang convention
2018-02-22 22:18:17 -08:00
Tim Hockin
3468934b7d Pushes go to staging-k8s.gcr.io 2018-02-01 20:11:55 -08:00
Kai Chen
bc08bd0b80 Use camelCase instead of snake_case per Golang convention 2018-01-22 23:42:13 +08:00
Tim Hockin
547c65ef89 Convert registry to k8s.gcr.io 2017-12-22 09:55:16 -08:00
Andy Xie
10dbfef1a8 add custom problem detector plugin 2017-11-22 10:14:09 +08:00
Cao Shufeng
b939fb575a return an error when error happens in SetConditions() 2017-08-23 17:56:15 +08:00
Random-Liu
f5a7ead8d6 Clarify the limitation of log matching pattern. 2017-06-20 18:11:29 -07:00
Random-Liu
51351f91b2 Cleanup kmsg log watcher. 2017-05-30 15:58:45 -07:00
Lantao Liu
be6c516cfd Merge pull request #41 from euank/kmsg-parser
logwatchers: add new kmsg-based kernel log watcher
2017-05-30 15:53:24 -07:00
Euan Kemp
73cba49db0 kmsg: update the docs to reference kmsg parser too 2017-03-09 21:38:11 -08:00
Euan Kemp
9c23921c11 logwatchers/kmsg: add initial kmsg watcher impl
This adds a logwatcher which is able to parse kernel messages directly
from the /dev/kmsg interface. This supports any modern Linux distro,
while also avoiding any dependency on libraries (such as those journald
needs).
2017-03-09 20:40:49 -08:00
Random-Liu
02d6b89536 Fix journald plugin to only look at the current boot. 2017-03-02 13:57:38 -08:00
Andy Xie
0a914cae09 refactor options pkg 2017-02-23 08:23:52 +08:00
fate-grand-order
a756ef48f3 fix misspell "timestamp" 2017-02-21 23:01:30 +08:00
Random-Liu
889d9efbc1 Add unit test for goroutine leak. 2017-02-16 00:08:56 -08:00
Random-Liu
6170b0c87f Add multiple log monitoring support. 2017-02-15 13:15:18 -08:00
Random-Liu
dba47bdc27 Update the README.md. 2017-02-15 13:07:01 -08:00
Random-Liu
10fc831409 Change kernel specific name in code base and change syslog to filelog. 2017-02-15 13:07:01 -08:00
Random-Liu
f16f0f630b Rename helpers.go to translator.go 2017-02-10 11:32:35 -08:00
Random-Liu
27cc831408 Add arbitrary daemon log support 2017-02-10 11:32:35 -08:00
Dawn Chen
5e563930c0 Merge pull request #81 from Random-Liu/fix-kernel-monitor-issues
Fix kernel monitor issues
2017-02-10 11:17:17 -08:00
Random-Liu
d281cb8a15 Fix kernel monitor issues:
* Change `unregister_netdevice` to be an event to fix #47.
* Change `KernelPanic` to `KernelOops` because we can't handle kernel
panic currently.
* Use system boot time instead of "StartPattern" to fix #48.
2017-02-09 16:09:27 -08:00
Lantao Liu
f20b892123 Merge pull request #84 from Random-Liu/fix-transition-timestamp
Only change transition timestamp when condition is changed.
2017-02-07 10:41:51 -08:00
Andy Xie
d0e0a8c765 add options pkg 2017-02-07 18:44:21 +08:00
Random-Liu
20ffe37cea Add NPD endpoints: /debug/pprof, /healthz, /conditions. 2017-02-03 11:07:06 -08:00
Dawn Chen
b66c4df364 Merge pull request #39 from Random-Liu/journald-support
Journald support
2017-02-01 12:41:51 -08:00
Random-Liu
a986976a1d Only change transition timestamp when condition is changed. 2017-01-27 14:48:28 -08:00