Without this, go test will rightfully fail.
This is a problem, as we don't have go test enabled, but we want
to have this in the future.
This should fix it.
Without this patch, you cannot configure the reboot
command to use, or use another command to trigger
a reboot.
This is a problem, as multiple users have asked for
it in the past, and we are lacking flexibility.
This fixes it by introducing two new parameters,
- one to provide a custom reboot command.
This should help people running kured on
non-systemd OSes.
- one to provide a custom sentinel command.
This should help people running non-Ubuntu OSes,
as they can directly use their command instead of
generating a file (useful for CentOS/SUSE)
For this, several refactors had to be done, to
remove global state in some functions. Making those
functions closer to "pure functions" helps us
increase our test coverage here and later.
As commandReboot was very close to rebootCommand,
the function to reboot the node has been renamed
to invokeReboot.
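As an illustration of the two parameters described above, here is a
minimal sketch, not the actual kured code. The helper names
buildCommand and rebootRequired, the example command lines, and the
convention that a sentinel exit code of 0 means "reboot required" are
all assumptions made for this example.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// buildCommand (hypothetical helper) turns a whitespace-separated command
// line into an *exec.Cmd. It does no shell-style quote handling.
func buildCommand(commandLine string) (*exec.Cmd, error) {
	fields := strings.Fields(commandLine)
	if len(fields) == 0 {
		return nil, fmt.Errorf("empty command line")
	}
	return exec.Command(fields[0], fields[1:]...), nil
}

// rebootRequired (hypothetical helper) runs the sentinel command and
// treats a successful exit as "a reboot is required" (an assumption).
func rebootRequired(sentinel *exec.Cmd) bool {
	return sentinel.Run() == nil
}

func main() {
	// Example values only; a real deployment would pass its own commands.
	sentinel, err := buildCommand("test -f /var/run/reboot-required")
	if err != nil {
		fmt.Println("invalid sentinel command:", err)
		return
	}
	reboot, err := buildCommand("/sbin/shutdown -r now")
	if err != nil {
		fmt.Println("invalid reboot command:", err)
		return
	}

	fmt.Println("sentinel command:", sentinel.Args)
	fmt.Println("reboot command:", reboot.Args)
	// The reboot command is only printed here, never executed.
	fmt.Println("reboot required:", rebootRequired(sentinel))
}
```

Because both commands are passed in as values instead of being read
from globals, the parsing step can be unit tested without touching the
host.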
Without this patch, many of the functions that check the reboot
blockers rely on global state.
This is a problem, as it makes them harder to test.
This patch fixes it by refactoring the reboot blockers. This also
includes a first series of unit tests for our main.
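One way such a refactor can look is sketched below. The RebootBlocker
interface, the staticBlocker test double, and the rebootBlocked function
are hypothetical names chosen for this example. They show the general
idea of passing blockers in as arguments, not the code in this patch.

```go
package main

import "fmt"

// RebootBlocker (hypothetical) is anything that can veto a reboot.
type RebootBlocker interface {
	isBlocked() bool
}

// staticBlocker is a test double that always returns a fixed answer.
type staticBlocker struct{ blocked bool }

func (s staticBlocker) isBlocked() bool { return s.blocked }

// rebootBlocked reports whether any of the given blockers vetoes the
// reboot. It reads no global state, so it is trivial to unit test.
func rebootBlocked(blockers ...RebootBlocker) bool {
	for _, b := range blockers {
		if b.isBlocked() {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(rebootBlocked(staticBlocker{false}, staticBlocker{true})) // true
	fmt.Println(rebootBlocked())                                          // false
}
```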
DeleteLocalData was deprecated for users of kubectl in 0.20 [1].
At the same time as the deprecation, the relevant code was also
removed [2] without warning: The DeleteLocalData from the helper
structure was simply renamed DeleteEmptyDirData, without shims
on the exposed pkg.
This is a problem, as it completely breaks kured.
This should fix it, by using the new field name.
[1]:
56ea9621b7
[2]:
56ea9621b7 (diff-041bdcdedca650a38a8d82cf15ab6f3665b7b84a0fb44a8bb5dcdc5cd944c63d)
adds a new --annotate-nodes daemonset runtime argument, which does the following when enabled:
- adds a new node annotation "weave.works/kured-most-recent-reboot-needed" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted
- adds a new node annotation "weave.works/kured-reboot-in-progress" with a value of the current RFC3339 timestamp as soon as kured identifies that a node needs to be rebooted
- removes the annotation "weave.works/kured-reboot-in-progress" when kured has successfully rebooted the node
This changes the pre-reboot drain functionality so that it always runs, regardless of the value of the Unschedulable node property.
Because kubectl drain is idempotent, we shouldn't have to worry about whether the node has already been set to Unschedulable (perhaps due to a prior, unsuccessful loop of the kured reboot cycle): we can run it over and over again. And because this drain function actually does a cordon + drain (and it only performs the drain if the cordon is successful), we can be sure that we aren't going to be thrashing this node with respect to scheduled pods.
This also fixes an edge case: if the node has been marked Unschedulable out-of-band, but workloads remain Running on this node, kured will no longer reboot the node's underlying VM/machine while it is actively running pods.
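The reasoning above can be illustrated with a toy model. The node
struct and the cordon and drain functions below are hypothetical
stand-ins, not the kubectl drain helper. They only show why an
unconditional cordon + drain is safe to repeat and why it covers the
out-of-band case.

```go
package main

import "fmt"

// node is a toy model of a Kubernetes node (hypothetical).
type node struct {
	unschedulable bool
	runningPods   int
}

// cordon marks the node unschedulable. Calling it again changes nothing.
func cordon(n *node) error {
	n.unschedulable = true
	return nil
}

// drain cordons first and evicts pods only if the cordon succeeded.
// It runs regardless of the node's current Unschedulable value.
func drain(n *node) error {
	if err := cordon(n); err != nil {
		return err
	}
	n.runningPods = 0
	return nil
}

func main() {
	// Edge case: cordoned out-of-band, but pods are still running.
	n := &node{unschedulable: true, runningPods: 3}

	_ = drain(n)
	_ = drain(n) // repeating the drain leaves the node in the same state

	fmt.Println(n.unschedulable, n.runningPods) // true 0
}
```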
Until a new alpine image is created, we should ensure the latest
packages are used, and therefore we should upgrade default
installed packages.
Without this patch, we'll have outdated and vulnerable packages
until a new 3.12 image is released.
This is a problem, as we'll publish broken images.
This should temporarily work around it, at the expense of larger
images (they contain the package cache).
Without this patch, we need to build a cache and then remove it.
Since apk can work with --no-cache and won't leave artifacts,
we should use it.
This will make the dockle best practices scanner happier.
The upside is that image building will always use the latest
stable version of the alpine OS, which might include security fixes.
The downside is that it's less reproducible, because the full
version isn't given.
While this commit isn't necessary per se, it's nice to have
an image that is up to date whenever we build it.