Commit Graph

322 Commits

Author SHA1 Message Date
Jian Zhu
b506d16cf8 🐛 Fix ManagedClusterAddons not removed when ClusterManagementAddon is deleted (#1160)
* Fix ManagedClusterAddons not removed when ClusterManagementAddon is deleted

The addon template controller was stopping addon managers immediately when
ClusterManagementAddon was deleted, without waiting for pre-delete jobs
to complete or ManagedClusterAddons to be cleaned up via owner reference
cascading deletion.

This change implements the TODO at line 105 by checking if all
ManagedClusterAddons are deleted before stopping the manager. The controller
now uses field selectors to efficiently query for remaining ManagedClusterAddons
and requeues after 10 seconds if any still exist, allowing time for proper
cleanup.
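
The wait-then-stop behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the controller's actual code: the `addon` type, `decideStop`, and `requeueDelay` are hypothetical names, and only the 10-second delay comes from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// requeueDelay is a hypothetical named constant; the 10s value is taken from the commit message.
const requeueDelay = 10 * time.Second

// addon is a hypothetical, simplified stand-in for a ManagedClusterAddon.
type addon struct {
	Namespace string
	Name      string
}

// decideStop reports whether the manager for addonName can be stopped now.
// If any addon with that name still exists, it returns false together with
// the delay to wait before checking again.
func decideStop(addonName string, remaining []addon) (stop bool, requeueAfter time.Duration) {
	for _, a := range remaining {
		if a.Name == addonName {
			return false, requeueDelay
		}
	}
	return true, 0
}

func main() {
	remaining := []addon{{Namespace: "cluster1", Name: "hello"}}

	// One addon is still being cleaned up, so the manager keeps running.
	stop, after := decideStop("hello", remaining)
	fmt.Println(stop, after)

	// Nothing is left, so the manager can be stopped.
	stop, after = decideStop("hello", nil)
	fmt.Println(stop, after)
}
```

In a real controller the requeue would go through the work queue rather than being returned as a plain value; the sketch only shows the decision itself.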

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* add e2e test

Signed-off-by: zhujian <jiazhu@redhat.com>

* return err when stopUnusedManagers failed

Signed-off-by: zhujian <jiazhu@redhat.com>

* Address review comments for addon manager deletion fix

- Use lister instead of API client for better performance
- Add named constant for requeue delay
- Fix test cache synchronization issues
- Improve test coverage from 74.7% to 75.6%

Addresses review feedback from Qiujian16 and CodeRabbit.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Fix e2e test timeout for configmap deletion check

Add explicit 180s timeout for pre-delete job configmap cleanup.
The default 90s timeout was insufficient for the deletion workflow.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Improve error logging in template agent

- Replace utilruntime.HandleError with structured logging in CSR functions
- Add more context to error messages for better debugging
- Use logger.Info for template retrieval errors to provide better visibility

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Use ManagedClusterAddonByName index for efficient lookup

- Replace inefficient list-and-filter with indexed lookup
- Add managedClusterAddonIndexer field to controller struct
- Update comment to accurately describe functionality
- Fix unit tests to properly set up the required index

This addresses the PR review feedback to use the existing index
instead of listing all ManagedClusterAddOns and filtering by name.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Remove unused mcaLister field

Since we now use managedClusterAddonIndexer for efficient lookup,
the mcaLister field is no longer needed. This cleanup reduces
memory usage and simplifies the controller structure.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Replace inefficient list-and-filter with indexed lookup in runController

Use managedClusterAddonIndexer.ByIndex() instead of listing all ManagedClusterAddOns
and filtering by name. This provides O(1) indexed lookup instead of O(n) linear scan.
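
To make the difference concrete, here is a small sketch that contrasts a linear list-and-filter with a lookup in an index built once. It uses a plain Go map rather than the informer indexer the commit refers to, and the `addon` type, `buildIndex`, and `listAndFilter` are hypothetical names for illustration only.

```go
package main

import "fmt"

// addon is a hypothetical, simplified stand-in for a ManagedClusterAddOn.
type addon struct {
	Namespace string
	Name      string
}

// nameIndex maps an addon name to every addon with that name
// (one per cluster namespace).
type nameIndex map[string][]addon

// buildIndex groups addons by name once, so later lookups do not
// need to scan the full list.
func buildIndex(all []addon) nameIndex {
	idx := make(nameIndex)
	for _, a := range all {
		idx[a.Name] = append(idx[a.Name], a)
	}
	return idx
}

// listAndFilter is the linear scan that an indexed lookup replaces.
func listAndFilter(all []addon, name string) []addon {
	var out []addon
	for _, a := range all {
		if a.Name == name {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	all := []addon{
		{Namespace: "cluster1", Name: "hello"},
		{Namespace: "cluster2", Name: "hello"},
		{Namespace: "cluster1", Name: "metrics"},
	}
	idx := buildIndex(all)

	// Both approaches find the same addons; only the lookup cost differs.
	fmt.Println(len(idx["hello"]), len(listAndFilter(all, "hello")))
}
```

The map lookup is constant time on average in the number of addons, while the scan visits every element on each call.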

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Fix review comments for addon manager deletion

- Fix closure capture bug in controller test by using captured variables
- Fix typo 'copyiedConfig' to 'copiedConfig' in e2e tests

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Optimize ManagedClusterAddOn event handling in addon template controller

Replace filtered event handling with custom event handlers that only trigger
reconciliation when AddOnTemplate configReferences actually change. This
reduces unnecessary reconciliation cycles by using reflect.DeepEqual to
compare config references between old and new objects.
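
The comparison this commit message describes might look something like the sketch below. The `configRef` type, its fields, and the helper names are simplified assumptions rather than the project's API types, and the `"addontemplates"` resource string is used only for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// configRef is a hypothetical, simplified config reference.
type configRef struct {
	Resource string
	Name     string
	SpecHash string
}

// templateRefs keeps only the references that point at an AddOnTemplate.
func templateRefs(refs []configRef) []configRef {
	var out []configRef
	for _, r := range refs {
		if r.Resource == "addontemplates" {
			out = append(out, r)
		}
	}
	return out
}

// shouldReconcile reports whether an update changed the template references.
// Changes to other kinds of config references are ignored.
func shouldReconcile(oldRefs, newRefs []configRef) bool {
	return !reflect.DeepEqual(templateRefs(oldRefs), templateRefs(newRefs))
}

func main() {
	oldRefs := []configRef{
		{Resource: "addontemplates", Name: "hello", SpecHash: "abc"},
		{Resource: "addondeploymentconfigs", Name: "cfg", SpecHash: "1"},
	}
	// Only the non-template reference changed.
	sameTemplate := []configRef{
		{Resource: "addontemplates", Name: "hello", SpecHash: "abc"},
		{Resource: "addondeploymentconfigs", Name: "cfg", SpecHash: "2"},
	}
	// The template reference itself changed.
	newTemplate := []configRef{
		{Resource: "addontemplates", Name: "hello", SpecHash: "def"},
	}
	fmt.Println(shouldReconcile(oldRefs, sameTemplate), shouldReconcile(oldRefs, newTemplate))
}
```

Note that `reflect.DeepEqual` treats a nil slice and an empty non-nil slice as different, so both sides are built by the same helper here.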

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Revert "Optimize ManagedClusterAddOn event handling in addon template controller"

This reverts commit 4649d1b9ac.

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-10 01:30:19 +00:00
Jian Qiu
b4b42aa0b5 Requeue ssar check if only hubKubeConfigSecret is unauthorized (#1169) (#1164)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-09-08 07:11:44 +00:00
Jian Qiu
7d42f5f9f6 Requeue ssar check if only hubKubeConfigSecret is unauthorized (#1169)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-09-05 02:26:30 +00:00
Jian Qiu
e2be403132 Update grpc configuration in operator API (#1159)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-09-04 11:15:15 +00:00
Morven Cao
e0476eebb4 upgrade grpc server. (#1157)
Signed-off-by: morvencao <lcao@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
2025-09-03 08:31:10 +00:00
Jian Qiu
b72eebc72e Fix wrong key queue for addon controllers (#1152)
The queue key for the clustermanagementaddon informer is not correct for
several controllers. Fix it by introducing a new queuekey func.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-09-01 08:51:20 +00:00
Wei Liu
74aa03b01c using api auth consts (#1146)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-08-28 07:15:36 +00:00
Jian Qiu
c5f6e30ab8 Ignore already existing error when creating cluster (#1142)
In the integration test, there is a chance that creating the cluster fails
because the cluster has already been created in the test. The alreadyExist
error should be ignored.
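
A minimal sketch of this create-or-ignore pattern is shown below. It is self-contained and uses only the standard library: `errAlreadyExists`, `fakeStore`, and `ensureCluster` are hypothetical names, and the sentinel error stands in for whatever "already exists" error the real API client returns.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists is a hypothetical sentinel standing in for an
// API "already exists" error.
var errAlreadyExists = errors.New("already exists")

// fakeStore is a hypothetical in-memory cluster store.
type fakeStore struct {
	clusters map[string]bool
}

// create fails with errAlreadyExists when the cluster is already present.
func (s *fakeStore) create(name string) error {
	if s.clusters[name] {
		return fmt.Errorf("cluster %q: %w", name, errAlreadyExists)
	}
	s.clusters[name] = true
	return nil
}

// ensureCluster creates the cluster and treats "already exists" as success,
// while still returning any other error.
func ensureCluster(s *fakeStore, name string) error {
	if err := s.create(name); err != nil && !errors.Is(err, errAlreadyExists) {
		return err
	}
	return nil
}

func main() {
	s := &fakeStore{clusters: map[string]bool{}}

	// The second call hits the "already exists" path and is ignored.
	fmt.Println(ensureCluster(s, "cluster1"), ensureCluster(s, "cluster1"))
}
```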

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-08-28 03:33:42 +00:00
Wei Liu
d7c82f4d4a support grpc auto approval user config (#1145)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-08-27 08:38:20 +00:00
Wei Liu
ef24cbbab4 support cert auto approve for grpc (#1134)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-08-25 07:44:21 +00:00
Wei Liu
5bac053fe0 using dir to reorg cluster-manager manifests (#1112)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-08-18 09:38:43 +00:00
Wei Liu
064c031545 should resync the grpc-server cert after clustermanager updated (#1126)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-08-15 02:26:48 +00:00
Wei Liu
c5e7e0711a adjust base and max delay for appliedmanifestwork deletion (#1120)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-08-11 07:07:00 +00:00
Wei Liu
6c4102f2ca support deploying grpc with clustermanager/klusterlet (#1107)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-08-06 09:45:10 +00:00
Jian Zhu
aa660678a4 ⚠️ Remove crd apiextensions v1beta1 (#1095)
* Remove crd apiextensions v1beta1

Signed-off-by: zhujian <jiazhu@redhat.com>

* fix unit test

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2025-07-30 01:59:42 +00:00
Jian Qiu
588f82f48b Refactor webhook to use a common webhook option (#1096)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-07-29 07:38:59 +00:00
Jian Qiu
a75eec0b7b Add unit test for agent options (#1097)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-07-29 07:27:38 +00:00
Jian Zhu
f989a37e1a 🌱 Upgrade golang to 1.24 and helm to 3.18.4 (#1085)
* Upgrade golang to 1.24

Signed-off-by: zhujian <jiazhu@redhat.com>

* Fix lint errors

Co-authored-by: gemini <gemini@google.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* upgrade sdk-go to latest

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: gemini <gemini@google.com>
2025-07-28 02:13:36 +00:00
Jian Qiu
334710ce0e Set deleting condition when mw is deleting (#1084)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-07-24 01:11:02 +00:00
Wei Liu
628d0d90ec cloudevents services integration test (#1086)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-07-23 13:55:55 +00:00
Jian Qiu
feccf1298d Should start clusterprofile informer when featuregate is enabled (#1079)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-07-21 08:46:17 +00:00
Wei Liu
405adb61cd starting grpc server with config file (#1071)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-07-16 02:13:37 +00:00
Ben Perry
c5e776cdd9 Manifest completion (#1033)
* Skip manifests in work reconcile that are marked Complete

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Aggregate Complete condition to work from manifests

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Delete work that is complete and satisfies configured TTL

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* tests

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* lint

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* go.mod

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Helper funcs for conditions

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Generic condition aggregation

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Support integration test args

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Remove work deletion from spoke, will be moved to hub GC

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Cleanup

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* update api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Wait for NS to exist before testing

Signed-off-by: Ben Perry <bhperry94@gmail.com>

---------

Signed-off-by: Ben Perry <bhperry94@gmail.com>
2025-07-14 04:53:04 +00:00
Jian Zhu
71236758c1 🌱 Send an event for evicting appliedmanifestwork (#1066)
* Increase the log level for evicting appliedmanifestwork

Signed-off-by: zhujian <jiazhu@redhat.com>

* Add an event for evicting appliedmanifestwork

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2025-07-13 13:47:30 +00:00
Wei Liu
7924226eba grpc server (#1058)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-07-09 08:59:10 +00:00
Ben Perry
cbff56ad4b Add bhperry as approver/reviewer of work API (#1065)
Signed-off-by: Ben Perry <bhperry94@gmail.com>
2025-07-09 01:05:19 +00:00
Zhiwei Yin
ce7d226bdd 🐛 fix the labels of hub deployments cannot be updated from the clustermanager (#1046)
* remove labels from spec.selector for cluster manager deployments

Signed-off-by: Zhiwei Yin <zyin@redhat.com>

* refactor labels of operators

Signed-off-by: Zhiwei Yin <zyin@redhat.com>

---------

Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-06-26 03:44:57 +00:00
Jian Zhu
a5757b46f3 Fix typo description of the cluster manager operator flags (#1044)
Signed-off-by: zhujian <jiazhu@redhat.com>
2025-06-19 08:06:21 +00:00
Zhiwei Yin
e11e84fcce fix no requeue when return requeueError (#1041)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-06-18 03:40:27 +00:00
Jian Qiu
567caa2fe9 Bump API to v1.0.0 (#1036)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-06-16 01:05:30 +00:00
Zhiwei Yin
27fc65e174 fix the ut failure in pod (#1034)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-06-12 03:41:34 +00:00
Jeffrey
215cfed77e Adding support for enableSyncLabels for clustermanager operator and registration controller (#1021)
Signed-off-by: Jeffrey Wong <jeffreywong0417@gmail.com>
2025-06-12 02:32:36 +00:00
Ben Perry
377ba25c26 Workload conditions (#910)
* Import OCM API changes for workload conditions

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Implement condition rule evaluator

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Evaluate manifest condition rules after apply

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* note to self

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Cleanup

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Return config option if rules are set

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* update api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Always return an error to inform user about the state of their condition rule

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Condition rule errors should not result in retrying apply

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Test condition rule reconciliation

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Return condition status Unknown when an internal CEL error occurs

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Switch to common CEL lib

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update to simplified celExpressions format

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Formatting

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* tidy

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update ocm api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update sdk-go

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Switch to sdk-go ConditionLib

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update API

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Switch to WellKnownConditions with required Condition field

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Support CEL evaluation budget

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update sdk-go

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update API

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* lint

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update go.mod

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Tests and comments

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Move condition reader to status controller for more frequent updates

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Ignore missing WellKnownCondition

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Fix test

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update condition tests

Signed-off-by: Ben Perry <bhperry94@gmail.com>

---------

Signed-off-by: Ben Perry <bhperry94@gmail.com>
2025-06-11 15:47:35 +00:00
Yang Le
0e2bbba84e 🐛 watch filtered configmaps & deployments to reduce memory usage of cluster-manager (#1030)
Signed-off-by: Yang Le <yangle@redhat.com>
2025-06-10 06:05:27 +00:00
Jian Qiu
0734a0b763 Enable about-api when ClusterProperty featuregate is enabled (#1025)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-06-06 10:11:30 +00:00
Jian Zhu
fb5ba3acaf 🐛 Use syncmap for the resource cache (#1023)
* Use syncmap for the resource cache

Signed-off-by: zhujian <jiazhu@redhat.com>

* update unit tests

Signed-off-by: zhujian <jiazhu@redhat.com>

* fix unit test

Signed-off-by: zhujian <jiazhu@redhat.com>

* use sync.map directly

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2025-06-05 01:58:40 +00:00
Jian Qiu
8faa1b2327 Added support for about-api for cluster properties (#1006)
* Added support for about-api for cluster properties

Signed-off-by: gnana997 <gnana097@gmail.com>

* refactored failing registration test cases

Signed-off-by: gnana997 <gnana097@gmail.com>

* Added new fake classes and test cases

Signed-off-by: gnana997 <gnana097@gmail.com>

* Refactored test cases and vendors

Signed-off-by: gnana997 <gnana097@gmail.com>

* updated the open-cluster api package and updated cluster property

Signed-off-by: gnana997 <gnana097@gmail.com>

* Refactored the pr with just registration details and crds

Signed-off-by: gnana997 <gnana097@gmail.com>

* Fix fake client

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Add integration test for clusterproperty

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: gnana997 <gnana097@gmail.com>
Signed-off-by: Jian Qiu <jqiu@redhat.com>
Co-authored-by: gnana997 <gnana097@gmail.com>
2025-06-04 09:17:55 +00:00
Ramesh Krishna
5bcfeca203 🐛 Switch the order of deletion of access entry and iamrole when managedcluster gets deleted. (#1022)
* Delete access entry before iam role

Signed-off-by: Jeffrey Wong <jeffreywong0417@gmail.com>

* Fix error handling to fix unit test

Signed-off-by: Jeffrey Wong <jeffreywong0417@gmail.com>

* Fix go fmt error

Signed-off-by: Jeffrey Wong <jeffreywong0417@gmail.com>

---------

Signed-off-by: Jeffrey Wong <jeffreywong0417@gmail.com>
Co-authored-by: Jeffrey Wong <jeffreywong0417@gmail.com>
2025-06-04 06:49:16 +00:00
Zhiwei Yin
98443736e9 support set hub qps and burst for work in the klusterlet (#1014)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-05-30 02:03:07 +00:00
Zhiwei Yin
88e7c32400 create a separate rest config for gc metadata client (#1013)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-05-29 01:26:14 +00:00
ivanscai
e753bd6e81 add hub QPS/Burst to hub work client, for talking with hub cluster apiserver (#1012)
Signed-off-by: caijing <caijing.cai@alibaba-inc.com>
2025-05-28 13:41:55 +00:00
Zhiwei Yin
e78a3a6d3d add deletionPolicy for manifestworkReplicaset (#996)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-05-28 01:12:21 +00:00
Jian Zhu
4cbb12d5a2 add support for custom ClusterClaim configuration (#1004)
* vendor api

Signed-off-by: Omar Farag <omarfarag74@gmail.com>

* add support for maxCustomClusterClaim

Signed-off-by: Omar Farag <omarfarag74@gmail.com>

* support ReservedClusterClaimSuffixes

Signed-off-by: Omar Farag <omarfarag74@gmail.com>

* add and use klusterletinformer

Signed-off-by: Omar Farag <omarfarag74@gmail.com>

* fix tests

Signed-off-by: Omar Farag <omarfarag74@gmail.com>

* update for change in clusterclaimconfiguration api

Signed-off-by: Omar Farag <omarfarag74@gmail.com>

* requested changes, clean up

Signed-off-by: Omar Farag <omarfarag74@gmail.com>

* Use flag to pass the reservedClusterClaimSuffixes

Signed-off-by: zhujian <jiazhu@redhat.com>

* Add cluster claim tests

Signed-off-by: zhujian <jiazhu@redhat.com>

* use StringSliceVar to parse the reserved cluster claim suffixes flag

Signed-off-by: zhujian <jiazhu@redhat.com>

* fix rebase issues

Signed-off-by: zhujian <jiazhu@redhat.com>

* address code review comments

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: Omar Farag <omarfarag74@gmail.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Omar Farag <omarfarag74@gmail.com>
2025-05-27 12:09:41 +00:00
Zhiwei Yin
3d7d770712 remove deprecated work execution clusterrolebinding (#992)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-05-27 09:40:47 +00:00
Jian Zhu
4d1b4ee8d5 make work status sync interval configurable (#1009)
* update api

Signed-off-by: zhujian <jiazhu@redhat.com>

* make work status sync interval configurable

Signed-off-by: zhujian <jiazhu@redhat.com>

* add unit tests

Signed-off-by: zhujian <jiazhu@redhat.com>

* fix flaky e2e tests

Signed-off-by: zhujian <jiazhu@redhat.com>

* drop go mod replace

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2025-05-27 07:47:58 +00:00
Ben Perry
f13599ffdb Refactor common CEL eval functions into shared pkg (#1003)
Signed-off-by: Ben Perry <bhperry94@gmail.com>
2025-05-26 14:36:04 +00:00
Jian Qiu
4eda44f2b9 Add jitter in requeue for status controller (#991)
Instead of requeuing all items every resyncInterval, we requeue
each item separately with a jitter to avoid bursty requests

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-05-14 07:09:27 +00:00
Qing Hao
67f0db9311 remove cel from placement decision group (#981)
Signed-off-by: Qing Hao <qhao@redhat.com>
2025-05-06 12:35:15 +00:00
Qing Hao
df87f528d7 add cost budget, runtime cost estimator and metrics (#964)
Signed-off-by: Qing Hao <qhao@redhat.com>
2025-04-30 08:15:22 +00:00
Qing Hao
f4b6dcb159 select clusters with cel selector (#693)
Signed-off-by: Qing Hao <qhao@redhat.com>
2025-04-22 15:00:56 +00:00