kured

mirror of https://github.com/kubereboot/kured.git synced 2026-05-20 07:12:58 +00:00

Author	SHA1	Message	Date
Jean-Philippe Evrard	73f00ce445	Make all the internal validations ... internal The main is doing flag validation through pflags, then did further validation by involving the constructors. With the recent refactor on the commit "Refactor constructors" in this branch, we moved away from that pattern. However, it means we reintroduced a log dependency into our external API, and the external API now had extra validations regardless of the type. This is unnecessary, so I moved away from that pattern, and moved back all the validation into a central place, internal, which is only doing what kured would desire, without exposing it to users. The users could still theoretically use the proper constructors for each type, as they would validate just fine. The only thing they would lose is the kured internal decision of validation/precedence. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	626db87158	Add error to reboot interface Without this, impossible to bubble up errors to main Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	67df0e935a	Remove deprecated PollWithContext Replaced with PollUntilContextTimeout. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	231888e58a	Use RegexpValue in pflags This will remove double pointers, and be explicit about the type we are using. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	d8b9e31ac9	Refactor constructors Without this, a bit of the validation is done in main, while the rest is done in each constructor. This fixes it by creating a new global constructor in checkers/reboot to solve all the cases and bubble up the errors. I preferred keeping the old constructors, and calling them, this way someone wanting to have a fork of the code could still create directly the good checker/rebooter, without the arbitrary decisions taken by the generic constructor. However, kured is not a library, and was never intended to be usable in forks, so we might want to reconsider in part 2 of the refactor. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	104a745305	Make locks more generic Implementation details of lock should not leak into the calling methods. Without this path, calls are a bit more complex and error handling is harder to find. This is a problem for long term maintenance, as it is tougher to refactor the locks without impacting the main. Decoupling the two (main usage of the lock, and the lock themselves) will allow us to introduce other kinds of locks easily. I solve this by inlining into the daemonsetlock package: - including all the methods for managing locks from the main.go functions. Those were mostly doing error handling where code became no-op by introducing multiple daemonsetlock types - adding the lock release delay part of lock info I also did not like the pattern include in Test method, which added a reference to nodeMeta: It was not very clear that Test was storing the current metadata of the node, or was returning the current state. (Metadata here only means unschedulable). The problem I saw was that the metadata was silently mutated from a lock Test method, which was very not obvious. Instead, I picked to explicitly return the lock data instead. I also made it explicit that the Acquire lock method is passing the node metadata as structured information, rather than an interface{}. This is a bit more fragile at runtime, but I prefer having very explicit errors if the locks are incorrect, rather than having to deal with unvalidated data. For the lock release delay, it was part of the rebootasrequired loop, where I believe it makes more sense to be part of the Release method itself, for readability. Yet, it hides the delay into the implementation detail, but it keeps the reboot as required goroutine more readable. Instead of passing the argument rebootDelay as parameter of the rebootasrequired method, this refactor took creation of the lock object in the main loop, close to all the variables, and then pass the lock object to the rebootasrequired. 
This makes the call for rebootasrequired more clear, and lock is now encompassing everything needed to acquire, release, or get info about the lock. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	aae5bb6ebb	Raise the error levels for wrong flag If the notification url configuration is known to be not working, this should be raised as an error, not a warning. Without this, it would be easy to miss a misconfiguration. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	a8132a2286	Remove viper/cobra deps Without this, the main loop is in need of 3 functions to simply parse flags and env variables (excluding input validation). This is a bit more complex than it should, especially since we only need to parse command line flags and env vars. This fixes it by simply using pflags (which we were already using) instead of pflags + viper + cobra (for which we do not have any benefit), and removing all the methods outside the mapping of env var with cli flag. The main code is now far simpler: It handles the reading, parsing, and returning in case of error. As we do not bubble up errors from rebootasRequired yet, this is good enough at this moment. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	42c4b8bc53	Revert to use a constructor again Without this, we have no validation of the data in command/signal reboot. This was not a problem in the first refactor, as the constructor was a dummy one, without validation. However, as we refactored, we now have code in the root method that is validation for the reboot command. This can now be encompassed in the constructor. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	3895a2f6d3	Remove nodeID from rebooter interface Without this patch, the rebooter interface has data which is not related to the rebooter interface. This should get removed to make it easier to maintain. The loss comes from the logging, which mentioned the node. In order to not have a regression compared to [1], this ensures that at least the node to be rebooted appears in the main. [1]: https://github.com/kubereboot/kured/pull/134 Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	f43ed1484e	Cleanup checkers Without this, the checkers are only shell calls: test -f sentinelFile, or sentinelCommand. This changes the behaviour of existing code to test file for sentinelFile checker, and to keep the sentinel command as a command. However, to avoid having validation in the root loop, it moves to use a constructor to cleanup the code. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-19 15:51:04 +02:00
Jean-Philippe Evrard	36e6c8b4d8	Rename variable Without this, the variable name is hard to follow. This fixes it by cleaning up the var name. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard	00d8a524ab	Move command line validations in pre function Without this, validations are all over the place. This moves some validations directly into the function, to make the code simpler to read. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard	eeedf203c3	Extract blockers This will make it easier to manipulate main in the future. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard	574065ff8a	Add checker interface This will be useful to refactor the checkers loop. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard	3bfdd76f29	Extract privileged command wrapper into util Without this, it makes the code a bit harder to read. This fixes it by extracting the method. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard	f34864758e	Cleanup rebooter interface Without this, the interface and the code to reboot is a bit more complex than it should be. We do not need setters and getters, as we are just instantiating a single instance of a rebooter interface. We create it based on user input, then pass the object around. This should cleanup the code. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-18 00:53:38 +02:00
Jean-Philippe Evrard	6b7d9be99f	Merge pull request #980 from kubereboot/dependabot/go_modules/github.com/prometheus/common-0.60.0 build(deps): bump github.com/prometheus/common from 0.57.0 to 0.60.0	2024-10-18 00:23:32 +02:00
dependabot[bot]	2eec401435	build(deps): bump github.com/prometheus/common from 0.57.0 to 0.60.0 Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.57.0 to 0.60.0. - [Release notes](https://github.com/prometheus/common/releases) - [Changelog](https://github.com/prometheus/common/blob/main/RELEASE.md) - [Commits](https://github.com/prometheus/common/compare/v0.57.0...v0.60.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-17 21:42:35 +00:00
Jean-Philippe Evrard	a1f3d1eba9	Merge pull request #994 from kubereboot/dependabot/go_modules/github.com/prometheus/client_golang-1.20.5 build(deps): bump github.com/prometheus/client_golang from 1.20.4 to 1.20.5	2024-10-17 23:41:22 +02:00
dependabot[bot]	d81b2fd93b	build(deps): bump github.com/prometheus/client_golang Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.20.4 to 1.20.5. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md) - [Commits](https://github.com/prometheus/client_golang/compare/v1.20.4...v1.20.5) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-17 21:15:47 +00:00
Jean-Philippe Evrard	9592fbc94f	Merge pull request #989 from dholbach/987 Update k8s deps and images	2024-10-17 22:35:01 +02:00
Jean-Philippe Evrard	199103498b	Merge pull request #993 from kubereboot/dependabot/github_actions/aquasecurity/trivy-action-0.28.0 build(deps): bump aquasecurity/trivy-action from 0.27.0 to 0.28.0	2024-10-17 22:31:55 +02:00
Jean-Philippe Evrard	f04f465cad	Merge pull request #991 from kubereboot/dependabot/github_actions/lycheeverse/lychee-action-2.0.2 build(deps): bump lycheeverse/lychee-action from 2.0.0 to 2.0.2	2024-10-17 22:31:09 +02:00
Daniel Holbach	575fd245ae	Update k8s deps and images Fixes: #987 Signed-off-by: Daniel Holbach <daniel.holbach@gmail.com>	2024-10-17 22:14:55 +02:00
Jean-Philippe Evrard	608abc6e89	Increase CI coverage and provide new dev tool (#982) * Move to stable kind cluster filenames Without this, we have to rename files at every version. This is really unnecessary, we should only change the files and be done with it. This is a problem, as if we move to programmatic test running, the tests would need to be mutated at every k8s version. With this model, we know that only the kind-cluster files need to be modified for the tests to be automatically adapted. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party> * Create e2e from go tests interface Without this, e2e tests need tons of manual work to test locally, and the results are not easily exposed. People are less likely to use the e2e tests if they are tough to use outside the CI. This commit makes it easier to run tests locally, and ensures the CI is closer to the Makefile. At the same time, this removes debt in the github workflows: By switching to newer versions of kind, we can remove the very old workaround for the failed to attach pid 1. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party> * Add node stays as cordoned test Without this, impossible to prove that the node stays as cordoned after a reboot by kured. This refactor also adds the test in the CI, and makes sure the CI is a bit simpler, by using matrix more extensively. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party> * Use hack dir instead of .tmp This is more idiomatic. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party> --------- Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-15 13:16:45 -07:00
Jean-Philippe Evrard	804ff87592	Merge pull request #965 from evrardjp/fix-lychee-pointer Change relative link to absolute link	2024-10-15 21:46:56 +02:00
dependabot[bot]	9ed8d412ac	build(deps): bump aquasecurity/trivy-action from 0.27.0 to 0.28.0 Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.27.0 to 0.28.0. - [Release notes](https://github.com/aquasecurity/trivy-action/releases) - [Commits](`5681af892c...915b19bbe7`) --- updated-dependencies: - dependency-name: aquasecurity/trivy-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-15 17:40:24 +00:00
dependabot[bot]	5615e1e3d2	build(deps): bump lycheeverse/lychee-action from 2.0.0 to 2.0.2 Bumps [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) from 2.0.0 to 2.0.2. - [Release notes](https://github.com/lycheeverse/lychee-action/releases) - [Commits](`7da8ec1fc4...7cd0af4c74`) --- updated-dependencies: - dependency-name: lycheeverse/lychee-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-14 17:33:05 +00:00
Jean-Philippe Evrard	1ce0d36b64	Merge pull request #988 from kubereboot/dependabot/github_actions/aquasecurity/trivy-action-0.27.0 build(deps): bump aquasecurity/trivy-action from 0.26.0 to 0.27.0	2024-10-11 21:14:50 +02:00
dependabot[bot]	3ff79eb20d	build(deps): bump aquasecurity/trivy-action from 0.26.0 to 0.27.0 Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/aquasecurity/trivy-action/releases) - [Commits](`a20de5420d...5681af892c`) --- updated-dependencies: - dependency-name: aquasecurity/trivy-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-11 17:36:33 +00:00
dholbach	98dfe109ce	Merge pull request #977 from kubereboot/dependabot/docker/alpine-3.20.3 build(deps): bump alpine from 3.20.2 to 3.20.3	2024-10-11 16:06:02 +02:00
dholbach	f986887214	Merge pull request #979 from kubereboot/dependabot/go_modules/github.com/prometheus/client_golang-1.20.4 build(deps): bump github.com/prometheus/client_golang from 1.20.3 to 1.20.4	2024-10-11 16:05:13 +02:00
dependabot[bot]	18c3c06b6e	build(deps): bump github.com/prometheus/client_golang Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.20.3 to 1.20.4. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md) - [Commits](https://github.com/prometheus/client_golang/compare/v1.20.3...v1.20.4) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-11 13:57:24 +00:00
dependabot[bot]	fc9a5c75e3	build(deps): bump alpine from 3.20.2 to 3.20.3 Bumps alpine from 3.20.2 to 3.20.3. --- updated-dependencies: - dependency-name: alpine dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-11 13:57:07 +00:00
dholbach	719d241e30	Merge pull request #978 from jackfrancis/go-1.22 update go to 1.22	2024-10-11 15:55:30 +02:00
dholbach	204b094554	Merge pull request #983 from kubereboot/dependabot/github_actions/lycheeverse/lychee-action-2.0.0 build(deps): bump lycheeverse/lychee-action from 1.10.0 to 2.0.0	2024-10-11 15:54:39 +02:00
dholbach	4451747a83	Merge pull request #985 from kubereboot/dependabot/github_actions/aquasecurity/trivy-action-0.26.0 build(deps): bump aquasecurity/trivy-action from 0.24.0 to 0.26.0	2024-10-11 15:54:04 +02:00
dependabot[bot]	cec0881290	build(deps): bump aquasecurity/trivy-action from 0.24.0 to 0.26.0 Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.24.0 to 0.26.0. - [Release notes](https://github.com/aquasecurity/trivy-action/releases) - [Commits](`6e7b7d1fd3...a20de5420d`) --- updated-dependencies: - dependency-name: aquasecurity/trivy-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-09 17:07:27 +00:00
dependabot[bot]	5c71880f32	build(deps): bump lycheeverse/lychee-action from 1.10.0 to 2.0.0 Bumps [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) from 1.10.0 to 2.0.0. - [Release notes](https://github.com/lycheeverse/lychee-action/releases) - [Commits](`2b973e86fc...7da8ec1fc4`) --- updated-dependencies: - dependency-name: lycheeverse/lychee-action dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2024-10-08 17:41:56 +00:00
Jean-Philippe Evrard	fdac3b1fe7	Merge pull request #981 from evrardjp/fix_ci Fix ci	2024-10-03 13:23:36 +02:00
Jean-Philippe Evrard	a02ae67559	Accelerate CI jobs Without this, some CI jobs are flaky or slow due to the following issues: - Triggering a reboot cause an unrecoverable boot loop. This fixes it by restarting the containers that are incorrectly exited. - API server is down while operations happen. This fixes it by ensuring at least one API server is up. In this case, we don't add a reboot marker on the unique api server. - The amount of nodes in a test environment is larger than necessary. This fixes it by ensuring two nodes are required to reboot. This is enough for concurrency, and for the e2e testing. - The wait time between operations is high, and can cause a heartbeat to be missed in the check script. This fixes it by checking more often, at the expense of more logging. This is compensated by increasing the amount of tries. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-02 23:30:41 +02:00
Jean-Philippe Evrard	5536bf7e30	Add CVE in ignore list We can't move to use 1.22 yet, so we'll ignore this one. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-01 23:47:40 +02:00
Jean-Philippe Evrard	29b4af1ab7	Automatically point to correct repository Without this, the CI would automatically point DH_ORG to kubereboot/kured on ghcr, instead of pointing to the owner of the repo. This makes the CI smoother. Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>	2024-10-01 22:10:17 +02:00
Jack Francis	a82b11f1c2	update go to 1.22 Signed-off-by: Jack Francis <jackfrancis@gmail.com>	2024-09-11 14:56:43 -07:00
Daniel Holbach	679cdc40b9	Merge pull request #975 from kubereboot/dependabot/go_modules/github.com/prometheus/client_golang-1.20.3 build(deps): bump github.com/prometheus/client_golang from 1.20.2 to 1.20.3	2024-09-06 07:56:16 +02:00
dependabot[bot]	efbd514af8	build(deps): bump github.com/prometheus/client_golang Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.20.2 to 1.20.3. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/v1.20.3/CHANGELOG.md) - [Commits](https://github.com/prometheus/client_golang/compare/v1.20.2...v1.20.3) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2024-09-05 17:39:10 +00:00
Daniel Holbach	54d356c420	Merge pull request #969 from kubereboot/dependabot/go_modules/github.com/prometheus/common-0.57.0 build(deps): bump github.com/prometheus/common from 0.55.0 to 0.57.0	2024-08-30 12:22:01 +02:00
dependabot[bot]	ee18dbf482	build(deps): bump github.com/prometheus/common from 0.55.0 to 0.57.0 Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.55.0 to 0.57.0. - [Release notes](https://github.com/prometheus/common/releases) - [Changelog](https://github.com/prometheus/common/blob/main/RELEASE.md) - [Commits](https://github.com/prometheus/common/compare/v0.55.0...v0.57.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2024-08-29 17:14:21 +00:00
Daniel Holbach	2d52f00bfe	Merge pull request #968 from kubereboot/dependabot/go_modules/github.com/prometheus/client_golang-1.20.2 build(deps): bump github.com/prometheus/client_golang from 1.20.1 to 1.20.2	2024-08-27 19:17:50 +02:00


1140 Commits