123 Commits

Author SHA1 Message Date
David Shay
641c319eb8 Added support for multi-arch image build (#496)
* Added support for multi-arch image build

* Requested changes to multi-arch build

* Further optimizations of multi build

* multi needs QEMU for some pieces

* change main push for all platforms

* Update Dockerfile to call Makefile

* Remove manual workflow
2022-06-07 08:23:36 +02:00
dependabot[bot]
cd7c4f8da3 build(deps): bump alpine from 3.15.4 to 3.16.0 in /cmd/kured (#560)
Bumps alpine from 3.15.4 to 3.16.0.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-25 06:54:04 +02:00
harbottle
6191c73a3c Use clean patch to update node labels. Fixes #553 2022-05-20 08:16:45 +02:00
harbottle
48d112ba32 Change after-reboot-node-labels flag to post-reboot-node-labels 2022-05-18 11:39:38 +02:00
harbottle
50aac294b7 Use Errorf instead of Fatalf for node label logging 2022-05-18 11:39:38 +02:00
harbottle
c3cb2bbc6c Tidy node labelling code 2022-05-18 11:39:38 +02:00
harbottle
9be88fb878 Add verification for node labelling flags 2022-05-18 11:39:38 +02:00
harbottle
4fcf6e184b Add node labelling 2022-05-18 11:39:38 +02:00
Jack Francis
aa5c3e7783 strip unnecessary quotes for notify-url configurations 2022-05-17 19:33:35 +02:00
Jack Francis
d965e7f67e Merge pull request #486 from jackfrancis/retry-cordon-drain
retry cordon + drain if fail, keep lock
2022-05-06 12:19:31 -07:00
dependabot[bot]
6691996bc0 build(deps): bump alpine from 3.15.3 to 3.15.4 in /cmd/kured
Bumps alpine from 3.15.3 to 3.15.4.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-05 21:04:41 +02:00
Christian Kotzbauer
966698f3c6 update to alpine@3.15.3 2022-03-29 10:06:59 +02:00
dependabot[bot]
445310b9b7 build(deps): bump alpine from 3.15.1 to 3.15.2 in /cmd/kured
Bumps alpine from 3.15.1 to 3.15.2.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-24 07:08:45 +01:00
dependabot[bot]
1eec15b5dd build(deps): bump alpine from 3.15.0 to 3.15.1 in /cmd/kured
Bumps alpine from 3.15.0 to 3.15.1.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-17 18:58:10 +01:00
Jack
93d6a783a1 retry cordon + drain if fail, keep lock 2022-02-15 15:07:51 -08:00
Christian Kotzbauer
91b01b5524 Merge pull request #489 from dkulchinsky/dannyk/remove_env_values_from_logs
don't print env variable values in the logs (some are sensitive)
2022-01-05 05:55:28 +01:00
Danny Kulchinsky
22a76f0da2 small fix in deprecation log messages 2022-01-04 12:23:22 -05:00
Danny Kulchinsky
b52a9587f3 don't print env variable values in the logs (some are sensitive) 2022-01-04 10:55:46 -05:00
Christian Kotzbauer
1a6592851e Merge pull request #459 from georgekaz/patch-1
Exclude terminated pods from the blocking mechanism
2021-12-09 14:02:49 +01:00
Christian Kotzbauer
bba3b8d83f Merge pull request #464 from dkulchinsky/viper_env_vars
bind environment variables to cobra flags with viper
2021-12-09 14:00:11 +01:00
Danny Kulchinsky
687aeda813 use sprintf for value in log 2021-12-02 12:05:07 -05:00
Danny Kulchinsky
acddd6b675 minor restructure and adding log for flag to env var binding 2021-12-01 20:59:12 -05:00
Danny Kulchinsky
54e7d93902 dedup const block 2021-12-01 14:50:53 -05:00
Danny Kulchinsky
2666b49d01 address review comments 2021-12-01 11:14:19 -05:00
dependabot[bot]
16e6d3c4d3 build(deps): bump alpine from 3.14 to 3.15.0 in /cmd/kured
Bumps alpine from 3.14 to 3.15.0.

---
updated-dependencies:
- dependency-name: alpine
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-29 09:51:54 +00:00
Danny Kulchinsky
79e19d84ba bind environment variables to cobra flags with viper 2021-11-25 13:53:30 -05:00
georgekaz
d3b59b8922 Exclude terminated pods from the blocking mechanism
Terminated pods should be excluded from blocking a reboot, as per https://github.com/weaveworks/kured/issues/227

This adds status filters to the fieldSelector in order to do that. I've not updated tests here but have successfully tested the exact same filter using kubectl.
2021-11-05 16:48:36 +00:00
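The filter described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `blockingPodFieldSelector` and the exact list of excluded phases are assumptions. Only the idea of appending `status.phase` filters to the pod field selector comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// terminatedPhases lists the pod phases treated as "terminated" in this
// sketch. The exact set used by the commit is an assumption.
var terminatedPhases = []string{"Succeeded", "Failed"}

// blockingPodFieldSelector (hypothetical name) builds a Kubernetes field
// selector that matches pods on nodeName while excluding terminated pods.
func blockingPodFieldSelector(nodeName string) string {
	parts := []string{"spec.nodeName=" + nodeName}
	for _, phase := range terminatedPhases {
		parts = append(parts, "status.phase!="+phase)
	}
	return strings.Join(parts, ",")
}

func main() {
	// The resulting string can be passed to kubectl via --field-selector.
	fmt.Println(blockingPodFieldSelector("node-1"))
}
```

The same string can be tried by hand with `kubectl get pods --all-namespaces --field-selector <selector>`, which matches how the author reports testing the filter.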
Daniel Holbach
348b5b4c96 Merge pull request #368 from atighineanu/proto_removed_slack
removed notifications/slack package [Merge after 1.7.0 release]
2021-10-28 08:43:27 +02:00
Daniel Kvist
b108aa4d2d Support json logformatter
This commit introduces a new flag '--log-format' that allows a user
to configure json logging on the pods. If the log-format
is not specified, the formatter will default to the existing
text formatter.
2021-10-25 14:38:53 +02:00
atighineanu
bab1425e1a removed notifications/slack package
In this PR the slack-hook-url is translated into shoutrrr syntax.
Therefore, the slack package as well as the checks for
slack-hook-url in the drain and reboot functions are removed.
Also added a unit test for flagCheck(); this function also
checks the (slack) URL syntax.
2021-10-07 10:37:47 +02:00
Jack
3c2508050d fix: don't use nil context in drain helper 2021-09-27 12:43:20 -07:00
Cameron McAvoy
cee15cfc32 Add force-reboot and drain timeouts to chart config and ds 2021-09-15 10:42:50 -05:00
Daniel Holbach
0955403470 Merge pull request #429 from weaveworks/alpine-3.14
build: updated to alpine@3.14
2021-08-30 10:54:35 +02:00
Christian Kotzbauer
9473f831be build: updated to alpine@3.14
Signed-off-by: Christian Kotzbauer <christian.kotzbauer@gmail.com>
2021-08-25 20:19:03 +02:00
Andres Morey
3c5eb968d3 Add reboot-delay command line argument
Currently, kured issues the system reboot command immediately after
kubectl drain finishes.

This is a problem for processes that need extra time to finish but aren't
running on pods and therefore aren't controlled by kubectl drain (e.g.
de-registering nodes from external load balancers).

This patch solves the problem by introducing a `reboot-delay` command
line argument that can be used to add a delay after kubectl drain
finishes but before the reboot command is issued.
2021-08-03 16:48:25 +03:00
Matt Jeanes
6af3f1abc1 Add --alert-firing-only parameter to only consider firing alerts 2021-07-27 11:23:10 +01:00
SimeonPoot
c7d5810503 Restructuring Prometheus client, added unit-tests to regex-queries active alerts (#386)
* prometheus labels incl tests

* enable label in main, add log, docs

* revert the option to query by label

* revert the option to query by label

* PromClient instantiate by func,white space removal

* revert whitespace fix for readability.

* revert removal of newlines for readability

* rename New to NewPromClient to improve readability

Co-authored-by: simp <simp@saxobank.com>
2021-07-27 07:09:46 +02:00
Danny Kulchinsky
c826d73695 fix slack deprecation notice 2021-05-28 13:52:01 -04:00
Jean-Philippe Evrard
79f22cee67 Merge branch 'main' into release-lock-delay 2021-04-14 09:48:28 +02:00
Steffen Pingel
f7b3de36a6 Add parameter for delaying release of lock
This supports throttling of reboots across the cluster
and allows rebooted nodes to reschedule pods, e.g.
to synchronize replicated state before rebooting the next node.
2021-04-13 10:14:14 +02:00
Cameron McAvoy
25dcf3cb12 Expose SkipWaitForDeleteTimeoutSeconds and explicitly return when cordoning fails 2021-04-08 09:52:15 -05:00
Cameron McAvoy
5a86ef40e8 Update the default drain timeout to be infinite 2021-04-07 17:17:33 -05:00
Cameron McAvoy
2400f34cc0 Don't panic if the cordon fails and force-reboot is true 2021-04-07 14:58:21 -05:00
Cameron McAvoy
8db5650510 Refactor force-drain to be a drain-timeout in general 2021-04-07 12:57:01 -05:00
Cameron McAvoy
65292983f2 Add force-reboot after force-timeout duration has been exceeded 2021-04-07 09:39:01 -05:00
Jean-Philippe Evrard
4d45fa8bdb Fix invoke reboot for custom commands
Without this patch, the rebootCommand passed to invokeReboot is
ignored, and the command used for reboot is always systemctl reboot.

This is a problem, as we are aiming for flexible commands for this
release.

This fixes it by restoring the previous behaviour before commit
[1] happened.

[1]: 694957d56e
2021-04-02 09:15:59 +02:00
atighineanu
694957d56e Implement universal notification mechanism
This patch makes it possible to send notifications
across different technologies. It also deprecates
slack-hook-url, slack-username and slack-channel
(a warning is logged when they are used).
The documentation (Readme) is updated as well.
2021-03-29 11:26:18 +02:00
Jean-Philippe Evrard
5930d733f8 Fix the Fatal calls using formatting
Without this, go test will rightfully fail.

This is a problem, as we don't have go test enabled, but we want
to have this in the future.

This should fix it.
2021-03-29 09:50:56 +02:00
Jean-Philippe Evrard
fd63e9a74b Add flexible commands parameters
Without this patch, you cannot configure the reboot
command to use, or use another command to trigger
a reboot.

This is a problem, as multiple users have asked for
it in the past, and we are lacking flexibility.

This fixes it by introducing two new parameters,
- one to provide a custom reboot command.
  This should help people running kured on
  a non-systemd OS
- one to provide a custom sentinel command.
  This should help people running a non-Ubuntu OS,
  as they can directly use their command instead of
  generating a file (useful for CentOS/SUSE)

For this, several refactors had to be done, to
remove global state in some functions. Making those
functions closer to "pure functions" helps us
increase our test coverage here and later.

As commandReboot was very close to rebootCommand,
the function to reboot the node has been renamed
to invokeReboot.
2021-03-29 09:50:56 +02:00
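The two parameters described in the commit above can be sketched as follows. This is a minimal illustration, not kured's implementation: `buildCommand` and `rebootRequired` are hypothetical names, the example command lines are assumptions, and the convention that a sentinel exit status of 0 means "reboot required" is assumed for the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// buildCommand (hypothetical helper) turns a whitespace-separated command
// line into an *exec.Cmd. It does no shell quoting or expansion.
func buildCommand(commandLine string) (*exec.Cmd, error) {
	fields := strings.Fields(commandLine)
	if len(fields) == 0 {
		return nil, errors.New("empty command")
	}
	return exec.Command(fields[0], fields[1:]...), nil
}

// rebootRequired (hypothetical helper) runs the sentinel command and
// reports true when it exits with status 0. Taking the command as an
// argument, rather than reading global state, keeps it easy to test.
func rebootRequired(sentinel *exec.Cmd) bool {
	return sentinel.Run() == nil
}

func main() {
	// Both command lines are illustrative values, not confirmed defaults.
	sentinel, err := buildCommand("test -f /var/run/reboot-required")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	reboot, err := buildCommand("/bin/systemctl reboot")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if rebootRequired(sentinel) {
		// A real caller would run the command. Here it is only printed.
		fmt.Println("would run:", reboot.Args)
	} else {
		fmt.Println("no reboot required")
	}
}
```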
Jean-Philippe Evrard
837bd4eb2a Refactor reboot blocks
Without this patch, we rely on global state in many functions for
which we check the reboot blockers.

This is a problem, as it's harder to test.

This patch fixes it by refactoring the reboot blockers. This also
includes a first series of unit tests for our main.
2021-03-29 09:50:56 +02:00