Random-Liu
c95a0d1096
Fix journald plugin to only look at the current boot.
2017-03-15 11:12:32 -07:00
Andy Xie
0a914cae09
refactor options pkg
2017-02-23 08:23:52 +08:00
fate-grand-order
a756ef48f3
fix misspell "timestamp"
2017-02-21 23:01:30 +08:00
Random-Liu
889d9efbc1
Add unit test for goroutine leak.
2017-02-16 00:08:56 -08:00
Random-Liu
6170b0c87f
Add multiple log monitoring support.
2017-02-15 13:15:18 -08:00
Random-Liu
dba47bdc27
Update the README.md.
2017-02-15 13:07:01 -08:00
Random-Liu
10fc831409
Change kernel specific name in code base and change syslog to filelog.
2017-02-15 13:07:01 -08:00
Random-Liu
f16f0f630b
Rename helpers.go to translator.go
2017-02-10 11:32:35 -08:00
Random-Liu
27cc831408
Add arbitrary daemon log support
2017-02-10 11:32:35 -08:00
Dawn Chen
5e563930c0
Merge pull request #81 from Random-Liu/fix-kernel-monitor-issues
Fix kernel monitor issues
2017-02-10 11:17:17 -08:00
Random-Liu
d281cb8a15
Fix kernel monitor issues:
* Change `unregister_netdevice` to be an event to fix #47.
* Change `KernelPanic` to `KernelOops` because we can't handle kernel
panic currently.
* Use system boot time instead of "StartPattern" to fix #48.
2017-02-09 16:09:27 -08:00
Lantao Liu
f20b892123
Merge pull request #84 from Random-Liu/fix-transition-timestamp
Only change transition timestamp when condition is changed.
2017-02-07 10:41:51 -08:00
Andy Xie
d0e0a8c765
add options pkg
2017-02-07 18:44:21 +08:00
Random-Liu
20ffe37cea
Add NPD endpoints: /debug/pprof, /healthz, /conditions.
2017-02-03 11:07:06 -08:00
Dawn Chen
b66c4df364
Merge pull request #39 from Random-Liu/journald-support
Journald support
2017-02-01 12:41:51 -08:00
Random-Liu
a986976a1d
Only change transition timestamp when condition is changed.
2017-01-27 14:48:28 -08:00
Lantao Liu
ba5f5a158d
Merge pull request #79 from Random-Liu/change-resync-mechanism
Update NPD to only force a sync every 1 minute.
2017-01-24 00:39:50 -08:00
Random-Liu
60975f5ad5
Update NPD to only force a sync every 1 minute.
2017-01-24 00:31:46 -08:00
fate-grand-order
9ac19a240a
correct spelling error in kernel_monitor.go
2017-01-22 22:21:39 +08:00
Random-Liu
2ef2af99eb
Update Readme.md
2017-01-19 01:59:09 -08:00
Random-Liu
c15d463ad5
Finish the journald support
2017-01-19 01:59:09 -08:00
Lantao Liu
f0ed07a0b4
Merge pull request #72 from andyxning/enrich_info_about_nodename
detail how node-problem-detector gets the node name in README
2017-01-18 11:02:56 -08:00
Andy Xie
7302c70143
add -hostname-override
2017-01-18 23:45:30 +08:00
fate-grand-order
a8a5538357
fix misspell
2017-01-17 15:13:02 +08:00
Random-Liu
6637139441
Add release tar ball support.
2017-01-13 11:13:59 -08:00
Random-Liu
aedb371d06
Add --version flag.
2017-01-12 02:07:25 -08:00
Lantao Liu
0cd7944653
Merge pull request #49 from andyxning/add_support_for_running_standalone
add support for running standalone
2017-01-09 23:35:15 -08:00
andy xie
68b379c423
add support for running npd standalone
2017-01-07 23:49:19 +08:00
andy xie
2606d52afb
check for linux os
2016-12-22 10:30:42 +08:00
andy xie
2c12274333
bump kubernetes version to v1.4.0-beta.3
2016-12-20 18:11:03 +08:00
AdoHe
86f4d07547
fix data race
2016-10-31 10:40:16 -04:00
AdoHe
ff0a099eec
fix test issue
2016-10-31 10:08:37 -04:00
AdoHe
1e33cddf10
mirror update
2016-10-26 08:43:04 +08:00
AdoHe
84c25077da
add journald support
2016-10-08 20:28:30 -04:00
Lantao Liu
aa9e268be7
Remove the function getStartPoint because the current logic no longer
needs it.
2016-09-12 14:04:23 -07:00
Lantao Liu
a8f491c0d3
Fix unit test.
2016-09-09 20:00:18 -07:00
Dawn Chen
ea83111c80
Merge pull request #22 from Random-Liu/add-look-back
Kernel Monitor: Add look back support and kernel panic handling
2016-08-23 17:13:58 -07:00
Lantao Liu
9054dab4c8
Get node name from the downward api.
2016-08-22 17:51:15 -07:00
Lantao Liu
532f933bd8
This PR:
1) Adds lookback support to the kernel monitor. After it starts, the kernel
monitor checks older logs to detect problems that happened before the last
node reboot.
2) Adds `lookback` and `startPattern` to the kernel monitor configuration.
* `lookback` specifies how far back the kernel monitor should look.
* `startPattern` specifies which log line indicates that the node has started.
The kernel monitor clears all current node conditions once it finds
a node start log. This makes sure that old problems won't change the
node condition.
3) Adds support for kernel panic monitoring: null pointer and divide-by-zero
kernel panics are surfaced as events. The kernel monitor usually
reports these events during the lookback phase.
2016-08-20 19:11:26 -07:00
Lantao Liu
5a19ac1868
Get the node name from the pod. This makes sure that the node
name is always consistent with the kubelet.
2016-08-11 14:22:29 -07:00
Lantao Liu
acabf68e06
Add README.md for kernel monitor
2016-06-24 16:19:44 -07:00
Girish Kalele
b687dfaafc
Containerize the nethealth bandwidth measurement utility
2016-06-07 20:51:30 -07:00
Girish Kalele
33a43545ca
Node network health check utility - performs a quick HTTP GET test
2016-06-03 14:26:12 -07:00
Lantao Liu
29ff791f08
Hack for unsupported OS distros.
2016-06-03 01:48:26 -07:00
Lantao Liu
5b07afd325
1. Make source and conditions configurable.
2. Add multiple events and conditions support in problem interface.
2016-06-02 15:32:02 -07:00
Lantao Liu
8759e4d610
Use Patch instead of UpdateStatus.
2016-05-30 19:22:32 -07:00
Lantao Liu
f0312655bd
Add first version of node-problem-detector
2016-05-17 15:55:33 -07:00