mirror of
https://github.com/kubereboot/kured.git
synced 2026-05-19 23:07:37 +00:00
Without this, some CI jobs are flaky or slow due to the following issues:

- Triggering a reboot causes an unrecoverable boot loop. This fixes it by restarting the containers that have incorrectly exited.
- The API server is down while operations happen. This fixes it by ensuring at least one API server is up. In this case, we don't add a reboot marker on the single API server.
- The number of nodes in a test environment is larger than necessary. This fixes it by requiring only two nodes to reboot. This is enough for concurrency and for the e2e testing.
- The wait time between operations is high, and can cause a heartbeat to be missed in the check script. This fixes it by checking more often, at the expense of more logging. The shorter interval is compensated by increasing the number of tries.

Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>
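
The trade-off in the last item (check more often, allow more tries) can be illustrated with a minimal polling sketch. This is not the project's check script: `wait_until`, `interval_s`, and `max_tries` are hypothetical names chosen for illustration, and the numbers in the usage note are assumptions.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    interval_s: float,
    max_tries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll check() until it returns True.

    Returns the 1-based attempt on which check() succeeded.
    Raises TimeoutError if every attempt fails.
    Hypothetical helper for illustration only.
    """
    for attempt in range(1, max_tries + 1):
        if check():
            return attempt
        # Only sleep when another attempt will follow.
        if attempt < max_tries:
            sleep(interval_s)
    raise TimeoutError(f"condition not met after {max_tries} tries")


def total_wait_budget(interval_s: float, max_tries: int) -> float:
    """Upper bound on time spent sleeping between attempts."""
    return interval_s * (max_tries - 1)
```

Under these assumptions, going from a 60 second interval with 11 tries to a 30 second interval with 21 tries keeps the same 600 second budget while halving the window in which a short-lived state can go unobserved. The cost is roughly twice as many log lines.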