kured

mirror of https://github.com/kubereboot/kured.git synced 2026-05-17 05:47:06 +00:00

Author	SHA1	Message	Date
Jean-Philippe Evrard	837bd4eb2a	Refactor reboot blocks Without this patch, we rely on global state in many functions for which we check the reboot blockers. This is a problem, as it's harder to test. This patch fixes it by refactoring the reboot blockers. This also includes a first series of unit tests for our main.	2021-03-29 09:50:56 +02:00
Jean-Philippe Evrard	15c57927c8	Update the deprecated DeleteLocalData DeleteLocalData was deprecated for users of kubectl in 0.20 [1]. At the same time of the deprecation, the relevant code was also removed [2] without warning: The DeleteLocalData from the helper structure was simply renamed DeleteEmptyDirData, without shims on the exposed pkg. This is a problem, as it completely breaks kured. This should fix it, by using the new field name. [1]: `56ea9621b7` [2]: `56ea9621b7 (diff-041bdcdedca650a38a8d82cf15ab6f3665b7b84a0fb44a8bb5dcdc5cd944c63d)`	2021-03-22 14:28:17 +01:00
Daniel Holbach	f6ada05c5d	Merge pull request #320 from dholbach/alpine-3.13 update to alpine 3.13	2021-03-10 08:50:42 +01:00
Daniel Holbach	355813de30	update to alpine 3.13 Signed-off-by: Daniel Holbach <daniel@weave.works>	2021-03-10 08:10:36 +01:00
Daniel Holbach	250b9bad05	Merge pull request #296 from jackfrancis/node-annotations add node annotations to identify kured reboot operations	2021-03-09 10:14:46 +01:00
Jack Francis	baf83408b8	add node annotations adds a new --annotate-nodes daemonset runtime argument, which does the following when enabled: - adds a new node annotation "weave.works/kured-most-recent-reboot-needed" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted - adds a new node annotation "weave.works/kured-reboot-in-progress" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted - removes the annotation "weave.works/kured-reboot-in-progress" when kured has successfully rebooted the node	2021-03-08 17:22:47 -08:00
Jack Francis	93c8242b89	always drain before reboot This changes the pre-reboot drain functionality so that it always runs, regardless of the value of the Unschedulable node property. Because kubectl drain is idempotent, we shouldn't have to worry about whether the node has already been set to Unschedulable (perhaps due to a prior, unsuccessful loop of the kured reboot cycle): we can run it over and over again. And because this drain func actually does a cordon + drain (and it only performs the drain if a cordon is successful), we can be sure that we aren't going to be thrashing this node w/ respect to scheduled pods. This also fixes an edge case: if the node has been marked Unschedulable out-of-band, but workloads remain Running on this node, kured will no longer reboot the node's underlying VM/machine while it is actively running pods.	2021-03-08 17:20:31 -08:00
Daniel Holbach	fade706cbf	Merge pull request #250 from damoon/19-PreferNoSchedule implement issue-19 add prefer no schedule taint to avoid double draining of pods	2021-01-12 14:28:23 +01:00
David Sauer	5a4e197d27	change taint config to be disabled by default	2021-01-11 18:24:17 +01:00
David Sauer	3a35d6a46c	remove taint in case the reboot is not needed anymore	2021-01-06 22:21:41 +01:00
David Sauer	34446f949e	Allow to disable tainting during pending node reboot by setting the taint name to an empty string.	2021-01-06 21:39:32 +01:00
David Sauer	e4c684c3af	taint node with PreferNoSchedule to prevent receiving (and double draining) additional pods from other rebooting nodes	2021-01-06 21:23:40 +01:00
David Sauer	204a06ca38	fixed call of log.Fatal instead of log.Fatalf	2021-01-06 21:23:40 +01:00
David Sauer	48897eb0ab	avoid indentations to ease readability	2021-01-06 21:23:40 +01:00
Jean-Philippe Evrard	897834a9db	Temporarily workaround alpine issue Until a new alpine image is created, we should ensure the latest packages are used, and therefore we should upgrade default installed packages. Without this patch, we'll have outdated and vulnerable packages until a new 3.12 image is released. This is a problem, as we'll publish broken images. This should temporarily workaround it, at the expense of larger images (contains package cache)	2020-12-14 11:20:27 +01:00
Daniel Jimenez Garcia	51cab0dedc	rename message template parameters so they are not related to slack	2020-11-25 16:20:54 +00:00
Daniel Jimenez Garcia	f059cec794	GH-125, add additional parameters to override the drain/reboot slack messages	2020-11-25 16:19:31 +00:00
Bryan Boreham	1ba3acab98	Drain: allow pods grace period to terminate The default of 0 is taken as "delete immediately", which is not appropriate.	2020-11-23 18:07:56 +00:00
Daniel Holbach	aa49cfd8c4	Merge pull request #215 from evrardjp/make-lint-happier Make go lint on cmd folder happier	2020-11-09 11:49:51 +01:00
Bryan Boreham	4c31184422	Merge pull request #213 from mvisonneau/lock_ttl Replaced --annotationTTL with --lockTTL and fixed bug	2020-11-06 11:31:19 +00:00
Jean-Philippe Evrard	7091debe23	Make lint happier Without this, golint is complaining about a few cosmetic changes. This solves it, and is necessary if we want to add a lint test in CI.	2020-11-05 10:14:39 +01:00
Jean-Philippe Evrard	ce6075c800	Remove prom-active-alerts Prom-active-alerts command is not used, not tested, and currently broken. Let's remove it.	2020-11-05 10:13:50 +01:00
Maxime VISONNEAU	9648d1d759	Replaced --annotationTTL with --lockTTL and made it work correctly	2020-10-30 10:39:18 +00:00
Jean-Philippe Evrard	e5a2d4acc7	Refactor drain/uncordon Moving the drainer object close to its usage is more readable.	2020-10-29 11:45:20 +01:00
Jean-Philippe Evrard	72c4112e20	Use kubectl as library instead of calling from cli	2020-10-15 13:02:35 +02:00
Jean-Philippe Evrard	b0bd603931	fix: Follow DKL-DI-0004 guideline Without this patch, we need to build a cache, remove it. Since apk allows to work with no-cache and won't leave artifacts, we should use it. This will make the dockle best practices scanner happier.	2020-09-11 16:53:59 +02:00
Daniel Holbach	3ebc224958	update alpine to 3.12, k8s 1.18.8	2020-08-28 10:27:39 +02:00
Daniel Holbach	16109017ce	Prepare for k8s release 1.19 (Aug 25) This is #152, #139, #127 in disguise. Maybe this time let it simmer a bit longer until the k8s release is there?	2020-08-19 17:30:00 +02:00
Daniel Holbach	8fafad18bb	Revert #139 This is a follow-up to #150, so we can get a 1.4.x release out that will be geared towards k8s 1.1[6-8]. Update to latest 1.17 kubectl: 1.17.7.	2020-06-26 17:30:01 +02:00
Bryan Boreham	ec75533394	Merge pull request #119 from michalschott/annotationTTL Adding --annotation-ttl for automatic unlock	2020-05-20 11:30:44 +01:00
Michal Schott	59a6700add	Renaming flag as suggested.	2020-05-05 20:52:10 +02:00
Michal Schott	64ebf53264	Typo in logic.	2020-05-05 14:32:41 +02:00
Michal Schott	1257d97ead	Be clean when this feature is disabled.	2020-05-05 14:10:23 +02:00
Michal Schott	7fb16fed9b	Adding annotationTTL.	2020-05-05 14:10:22 +02:00
Daniel Holbach	72a31030db	replay changes from #127	2020-05-01 09:07:16 +02:00
Daniel Holbach	8e73cf224d	Revert parts of #127, move to client-go/kubectl 1.17 After the release of kured 1.4.0 we should be able to go back. This was decided in our meeting (https://docs.google.com/document/d/1bsHTjHhqaaZ7yJnXF6W8c89UB_yn-OoSZEmDnIP34n8/edit#heading=h.8cgszb6vuhza) Let's go with supporting 1.1[678] in this release.	2020-04-22 18:32:25 +02:00
Carlos Garcia Lalicata	800e9e19fb	print node id when commanding reboot, so that any monitoring tool can catch it and act on it	2020-04-20 10:58:35 +02:00
Jean-Philippe Evrard	bdd20c963c	Unpin base docker images The upside is that image building will always use the latest stable version of the alpine OS, which might include security fixes. The downside is that it's less reproducible, because the full version isn't given. While this commit isn't necessary per se, it's nice to have an image that will be up to date, when we'll build it.	2020-04-08 18:17:58 +02:00
Daniel Holbach	0a419d0d34	update to 1.18.0 API confirmed by running https://github.com/kubernetes-sigs/clientgofix closes: #123	2020-03-30 10:11:30 +02:00
Daniel Holbach	b75aec87d7	update urls to match k8s 1.18 release	2020-03-30 10:11:30 +02:00
Peter Groenewegen	7e7430f7df	Keeping alpine fresh Updating alpine to the latest version. Tested this version of alpine and running fine, keeping versions of dependencies up to date.	2020-02-25 11:09:28 -08:00
Daniel Holbach	7975a78025	update to latest kubectl 1.15	2020-02-21 16:11:10 +01:00
Peter Groenewegen	f86514c1e6	Use newer version of k8s client tools The version of k8s has security vulnerabilities, updating to a newer version. Tested this version on our clusters	2020-02-19 07:55:03 -08:00
Praveen Adusumilli	f2ae01120a	Upgrading to latest alpine (#100) * Upgrading to latest alpine 3.10.3	2019-11-26 16:53:43 +00:00
Nighthawk22	5c21206bdb	Merge branch 'master' into master	2019-10-28 10:56:13 +01:00
leigh capili	4beddb5338	Reboot only within time window specified on commandline (#66)	2019-10-23 22:23:51 -06:00
Maximilian Zollneritsch	d1315c691e	Added slack channel name configuration	2019-09-11 13:59:09 +02:00
Adam Harrison	8d809333b3	Update embedded kubectl to v1.14.1	2019-05-16 17:07:17 +01:00
Adam Harrison	556789e6c7	Update embedded kubectl to v1.13.6	2019-05-16 10:51:51 +01:00
JJ Jordan	357687b053	Fix seconds format in parser, address (an unimportant) corner case	2019-04-18 18:00:29 -07:00

74 Commits