Commit Graph

145 Commits

Author SHA1 Message Date
Jean-Philippe Evrard
00d8a524ab Move command line validations in pre function
Without this, validations are all over the place.
This moves some validations directly into the function, to
make the code simpler to read.

Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>
2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard
eeedf203c3 Extract blockers
This will make it easier to manipulate main in the future.

Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>
2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard
574065ff8a Add checker interface
This will be useful to refactor the checkers loop.

Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>
2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard
3bfdd76f29 Extract privileged command wrapper into util
Without this, the code is a bit harder to read.

This fixes it by extracting the method.

Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>
2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard
f34864758e Cleanup rebooter interface
Without this, the interface and the code to reboot are
a bit more complex than they should be.

We do not need setters and getters, as we are just
instantiating a single instance of a rebooter interface.

We create it based on user input, then pass the object
around. This should clean up the code.

Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>
2024-10-18 00:53:38 +02:00
Christian Hopf
87202d8fcf Add signal-reboot (#814)
* feat: sentinel-command without nsenter by default

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: no readonly mount

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: mount at different folder

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* feat: add signal-reboot

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* feat: make signal configurable and add tests

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* build: rename job

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* cleanup: linter

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* build: also adjust signal manifest

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* test: add e2e-tests

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: small code restructure

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: adjust version-range

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

---------

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>
2024-01-06 10:25:11 +01:00
Daniel Malon
d51258ffde feat: add drain delay (#852)
Signed-off-by: Daniel Malon <daniel.malon@me.com>
2023-12-11 10:58:29 -08:00
Jack Francis
8bc66c937d fix: don’t hold node lock if reboot is blocked (#819)
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2023-08-17 06:15:09 +02:00
Jim
9a4b8fdb32 add argument to invert the behavior of alert-filter-regexp (#786)
* add argument to invert the behavior of alert-filter-regexp

Signed-off-by: Jim Liming <james.k.liming@gmail.com>

* feat: small code-improvements

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

---------

Signed-off-by: Jim Liming <james.k.liming@gmail.com>
Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>
Co-authored-by: Christian Kotzbauer <git@ckotzbauer.de>
2023-08-14 18:52:12 +02:00
Thomas Stringer
3b9b190422 Add multiple concurrent node reboot feature (#660)
* Add ability to have multiple nodes get a lock

Currently in kured a single node can get a lock with Acquire. There
could be situations where multiple nodes might want a lock in the event
that a cluster can handle multiple nodes being rebooted. This adds the
side-by-side implementation for a multiple node lock situation.

Signed-off-by: Thomas Stringer <thomas@trstringer.com>

* Refactor to use the same code path for a single lock and a multilock

Signed-off-by: Thomas Stringer <thomas@trstringer.com>

* test: force rebuild

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* build: log pod-logs

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: change condition

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* build: fix test-script

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* build: add concurrent test

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: final changes

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

---------

Signed-off-by: Thomas Stringer <thomas@trstringer.com>
Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>
Co-authored-by: Christian Kotzbauer <git@ckotzbauer.de>
2023-08-14 18:33:18 +02:00
nkinkade
351ca71787 Adds new flag --metrics-host (#811)
* Replaces flag --metrics-port with --metrics-addresss

Signed-off-by: Nathan Kinkade <kinkade@measurementlab.net>

* Revert "Replaces flag --metrics-port with --metrics-addresss"

This reverts commit 528c7bb14b.

Signed-off-by: Nathan Kinkade <kinkade@measurementlab.net>

* Adds new --metrics-host flag

The flag --metrics-port already exists. While not as clean, to avoid
introducing a backward incompatible change to flags, this commit adds a
new --metrics-host flag, which in combination with the existing
--metrics-port flag can define a complete listen address for the metrics
server as "<metrics-host>:<metrics-port>"

Signed-off-by: Nathan Kinkade <kinkade@measurementlab.net>

* Adds new, commented flags --metrics-{port,host}

Signed-off-by: Nathan Kinkade <kinkade@measurementlab.net>

---------

Signed-off-by: Nathan Kinkade <kinkade@measurementlab.net>
2023-08-08 08:50:41 +02:00
Christian Kotzbauer
16dc5e30d9 fix: log on unusual sentinel-command exit code (#806)
Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>
2023-08-02 19:04:52 +02:00
Boris Prüßmann
d019e7a50a Support pod-selector for drain command (#788)
Signed-off-by: Boris Pruessmann <boris@pruessmann.org>
2023-08-02 11:33:29 +02:00
Maxime Leroy
4c75199b41 feat: metrics port command (#780)
Signed-off-by: Maxime Leroy <19607336+maxime1907@users.noreply.github.com>
2023-06-09 21:33:17 +02:00
Jack Francis
1929c11297 fix: annotate nodes for reboot before aborting due to blocked (#749)
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2023-04-14 10:29:22 +02:00
Christian Kotzbauer
ba1328ca12 feat: Integrate GoReleaser, Cosign and Syft (#595)
* build: integrate goreleaser, syft and cosign

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: chmod for all binaries

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: version-env

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: remove prefix

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: remove prefix

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: schellcheck

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: shellcheck

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: several script updates

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* fix: remove main-prefix

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>
2022-10-02 15:25:17 +02:00
Daniel Holbach
bce0bac183 Changed weaveworks to kubereboot in many places
Areas I did not touch:
- bot name, secrets
- image name
- LICENSE (would need to ask how/if that gets changed...?)
- one mention in the Dev docs that we used to do some
  pre-release smoke-testing on the Weave Dev cluster

Signed-off-by: Daniel Holbach <daniel@weave.works>
2022-09-20 13:17:55 +02:00
dependabot[bot]
9d4ebfc1f8 build(deps): bump alpine from 3.16.1 to 3.16.2 in /cmd/kured (#617)
Bumps alpine from 3.16.1 to 3.16.2.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-10 06:20:13 +02:00
Jack Francis
777f5b2cce update command line flags in README (#607)
2022-07-23 09:20:52 +02:00
dependabot[bot]
10d42b07a5 build(deps): bump alpine from 3.16.0 to 3.16.1 in /cmd/kured
Bumps alpine from 3.16.0 to 3.16.1.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-22 15:16:47 +00:00
Alexei Tighineanu
28c5332450 added notification when uncordoning (#587)
* added notification when uncordoning

 when reboot & uncordoning is successful -> notification will be sent

* added uncordon message tmpl

 added message template for announcing successful uncordoning and reboot.

* added proper documentation about new flag

 added readme note about new flag
2022-06-25 21:08:05 +02:00
Christian Kotzbauer
115fea9d2a Release 1.10.0 preparation (#572)
* feat: updated helm-chart for 1.10.0
close #551

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>

* feat: update multiarch-dockerfile to 3.16.0

Signed-off-by: Christian Kotzbauer <git@ckotzbauer.de>
2022-06-08 19:32:09 +02:00
David Shay
641c319eb8 Added support for multi-arch image build (#496)
* Added support for multi-arch image build

* Requested changes to multi-arch build

* Further optimizations of multi build

* multi needs QEMU for some pieces

* change main push for all platforms

* Update Dockerfile to call Makefile

* Remove manual workflow
2022-06-07 08:23:36 +02:00
dependabot[bot]
cd7c4f8da3 build(deps): bump alpine from 3.15.4 to 3.16.0 in /cmd/kured (#560)
Bumps alpine from 3.15.4 to 3.16.0.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-25 06:54:04 +02:00
harbottle
6191c73a3c Use clean patch to update node labels. Fixes #553
2022-05-20 08:16:45 +02:00
harbottle
48d112ba32 Change after-reboot-node-labels flag to post-reboot-node-labels
2022-05-18 11:39:38 +02:00
harbottle
50aac294b7 Use Errorf instead of Fatalf for node label logging
2022-05-18 11:39:38 +02:00
harbottle
c3cb2bbc6c Tidy node labelling code
2022-05-18 11:39:38 +02:00
harbottle
9be88fb878 Add verification for node labelling flags
2022-05-18 11:39:38 +02:00
harbottle
4fcf6e184b Add node labelling
2022-05-18 11:39:38 +02:00
Jack Francis
aa5c3e7783 strip unnecessary quotes for notify-url configurations
2022-05-17 19:33:35 +02:00
Jack Francis
d965e7f67e Merge pull request #486 from jackfrancis/retry-cordon-drain
retry cordon + drain if fail, keep lock
2022-05-06 12:19:31 -07:00
dependabot[bot]
6691996bc0 build(deps): bump alpine from 3.15.3 to 3.15.4 in /cmd/kured
Bumps alpine from 3.15.3 to 3.15.4.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-05 21:04:41 +02:00
Christian Kotzbauer
966698f3c6 update to alpine@3.15.3
2022-03-29 10:06:59 +02:00
dependabot[bot]
445310b9b7 build(deps): bump alpine from 3.15.1 to 3.15.2 in /cmd/kured
Bumps alpine from 3.15.1 to 3.15.2.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-24 07:08:45 +01:00
dependabot[bot]
1eec15b5dd build(deps): bump alpine from 3.15.0 to 3.15.1 in /cmd/kured
Bumps alpine from 3.15.0 to 3.15.1.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-17 18:58:10 +01:00
Jack
93d6a783a1 retry cordon + drain if fail, keep lock
2022-02-15 15:07:51 -08:00
Christian Kotzbauer
91b01b5524 Merge pull request #489 from dkulchinsky/dannyk/remove_env_values_from_logs
don't print env variable values in the logs (some are sensitive)
2022-01-05 05:55:28 +01:00
Danny Kulchinsky
22a76f0da2 small fix in deprecation log messages
2022-01-04 12:23:22 -05:00
Danny Kulchinsky
b52a9587f3 don't print env variable values in the logs (some are sensitive)
2022-01-04 10:55:46 -05:00
Christian Kotzbauer
1a6592851e Merge pull request #459 from georgekaz/patch-1
Exclude terminated pods from the blocking mechanism
2021-12-09 14:02:49 +01:00
Christian Kotzbauer
bba3b8d83f Merge pull request #464 from dkulchinsky/viper_env_vars
bind environment variables to cobra flags with viper
2021-12-09 14:00:11 +01:00
Danny Kulchinsky
687aeda813 use sprintf for value in log
2021-12-02 12:05:07 -05:00
Danny Kulchinsky
acddd6b675 minor restructure and adding log for flag to env var binding
2021-12-01 20:59:12 -05:00
Danny Kulchinsky
54e7d93902 dedup const block
2021-12-01 14:50:53 -05:00
Danny Kulchinsky
2666b49d01 address review comments
2021-12-01 11:14:19 -05:00
dependabot[bot]
16e6d3c4d3 build(deps): bump alpine from 3.14 to 3.15.0 in /cmd/kured
Bumps alpine from 3.14 to 3.15.0.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-29 09:51:54 +00:00
Danny Kulchinsky
79e19d84ba bind environment variables to cobra flags with viper
2021-11-25 13:53:30 -05:00
georgekaz
d3b59b8922 Exclude terminated pods from the blocking mechanism
Terminated pods should be excluded from blocking a reboot, as per https://github.com/weaveworks/kured/issues/227

This adds status filters to the fieldSelector in order to do that. I've not updated tests here, but I have successfully tested the exact same filter using kubectl.
2021-11-05 16:48:36 +00:00
Daniel Holbach
348b5b4c96 Merge pull request #368 from atighineanu/proto_removed_slack
removed notifications/slack package [Merge after 1.7.0 release]
2021-10-28 08:43:27 +02:00