The TestHubTimeoutController_Sync test was failing intermittently in CI
due to timing issues with time.Sleep() and real-time execution overhead.
Changes:
- Removed time.Sleep() dependency that caused flakiness
- Set lease renew time in the past using time.Now().Add(-duration)
to deterministically simulate aged leases
- Made timeout threshold configurable per test case
- Increased safety margin from 2s to 3s in "not timeout" case
(2s lease age vs 5s timeout, previously 1s wait vs 3s timeout)
- Set startTime in the past to bypass the 10s grace period check
that was added to handle stale lease scenarios
This eliminates race conditions in CI environments where execution
overhead could push the test beyond the timeout threshold.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Scorecard supply-chain security / Scorecard analysis (push) Failing after 20s
Post / images (amd64, placement) (push) Failing after 47s
Post / images (amd64, registration) (push) Failing after 41s
Post / images (amd64, registration-operator) (push) Failing after 45s
Post / images (amd64, work) (push) Failing after 40s
Post / images (arm64, addon-manager) (push) Failing after 44s
Post / images (arm64, placement) (push) Failing after 41s
Post / images (arm64, registration) (push) Failing after 41s
Post / images (arm64, registration-operator) (push) Failing after 41s
Post / images (arm64, work) (push) Failing after 42s
Post / images (amd64, addon-manager) (push) Failing after 7m42s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
Post / coverage (push) Failing after 9m45s
* check namespace existence and state in clusterprofile lifecycle controller.
Signed-off-by: Morven Cao <lcao@redhat.com>
* optimize the queue key and log for clusterprofile controller.
Signed-off-by: Morven Cao <lcao@redhat.com>
---------
Signed-off-by: Morven Cao <lcao@redhat.com>
Scorecard supply-chain security / Scorecard analysis (push) Failing after 25s
Post / images (amd64, placement) (push) Failing after 47s
Post / images (amd64, registration) (push) Failing after 44s
Post / images (amd64, registration-operator) (push) Failing after 44s
Post / images (amd64, work) (push) Failing after 43s
Post / images (arm64, addon-manager) (push) Failing after 42s
Post / images (arm64, placement) (push) Failing after 41s
Post / images (arm64, registration) (push) Failing after 43s
Post / images (arm64, registration-operator) (push) Failing after 41s
Post / images (arm64, work) (push) Failing after 41s
Post / images (amd64, addon-manager) (push) Failing after 7m45s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
Post / coverage (push) Failing after 38m55s
Close stale issues and PRs / stale (push) Successful in 50s
* sync clusterprofile based on managedclusterset and managedclustersetbinding
Co-authored-by: Claude <claude@anthropic.com>
Signed-off-by: Morven Cao <lcao@redhat.com>
* Refactor ClusterProfile controller into two separate controllers.
Signed-off-by: Morven Cao <lcao@redhat.com>
* address comments.
Signed-off-by: Morven Cao <lcao@redhat.com>
* fix lint issues.
Signed-off-by: Morven Cao <lcao@redhat.com>
* address comments.
Signed-off-by: Morven Cao <lcao@redhat.com>
* address comments.
Signed-off-by: Morven Cao <lcao@redhat.com>
---------
Signed-off-by: Morven Cao <lcao@redhat.com>
Scorecard supply-chain security / Scorecard analysis (push) Failing after 1m24s
Post / coverage (push) Failing after 7m11s
Post / images (amd64, registration) (push) Failing after 45s
Post / images (amd64, registration-operator) (push) Failing after 42s
Post / images (amd64, placement) (push) Failing after 7m50s
Post / images (amd64, work) (push) Failing after 42s
Post / images (arm64, placement) (push) Failing after 42s
Post / images (arm64, registration) (push) Failing after 40s
Post / images (arm64, registration-operator) (push) Failing after 38s
Post / images (arm64, work) (push) Failing after 42s
Post / images (amd64, addon-manager) (push) Failing after 14m28s
Post / images (arm64, addon-manager) (push) Failing after 7m10s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
Refactored the removeClusterRbac function into separate functions to handle
different RBAC resource cleanup scenarios:
- removeClusterRBACResources: orchestrates full RBAC cleanup when cluster is deleted
- removeClusterSpecificRBAC: removes ClusterRole and ClusterRoleBinding
- removeClusterSpecificRoleBindings: removes registration and work RoleBindings
When hubAcceptsClient is false (cluster denied), only RoleBindings are removed
while ClusterRole and ClusterRoleBinding are preserved and updated. This ensures
proper RBAC state for denied clusters without deleting cluster-scoped resources.
Added unit test to verify that when a cluster is denied, only RoleBindings are
deleted while ClusterRole and ClusterRoleBinding remain intact.
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Scorecard supply-chain security / Scorecard analysis (push) Failing after 13s
Post / images (amd64, addon-manager) (push) Failing after 48s
Post / images (amd64, placement) (push) Failing after 1m22s
Post / images (amd64, registration) (push) Failing after 42s
Post / images (amd64, work) (push) Failing after 41s
Post / images (arm64, addon-manager) (push) Failing after 42s
Post / images (arm64, placement) (push) Failing after 41s
Post / images (arm64, registration) (push) Failing after 41s
Post / images (arm64, registration-operator) (push) Failing after 41s
Post / images (arm64, work) (push) Failing after 42s
Post / images (amd64, registration-operator) (push) Failing after 21m14s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
Post / coverage (push) Failing after 39m11s
Close stale issues and PRs / stale (push) Successful in 50s
* Fix work rolebinding cleanup when hubAcceptsClient is set to false
Signed-off-by: Erico G. Rimoli <erico.rimoli@totvs.com.br>
* Adds error handling to the removeClusterRbac call within the controller synchronization function
Signed-off-by: Erico G. Rimoli <erico.rimoli@totvs.com.br>
Post / images (amd64, addon-manager) (push) Failing after 46s
Post / images (amd64, placement) (push) Failing after 41s
Post / images (amd64, registration-operator) (push) Failing after 39s
Post / images (amd64, work) (push) Failing after 42s
Post / images (arm64, addon-manager) (push) Failing after 39s
Post / images (arm64, placement) (push) Failing after 39s
Post / images (arm64, registration) (push) Failing after 40s
Post / images (arm64, registration-operator) (push) Failing after 42s
Post / images (arm64, work) (push) Failing after 39s
Post / images (amd64, registration) (push) Failing after 7m46s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
Post / coverage (push) Failing after 14m33s
Scorecard supply-chain security / Scorecard analysis (push) Failing after 1m25s
Close stale issues and PRs / stale (push) Successful in 46s
* Add addon conversion webhook for v1alpha1/v1beta1 API migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Qing Hao <qhao@redhat.com>
* Fix GroupVersion compatibility issues after API dependency update
This commit fixes compilation and test errors introduced by updating
the API dependency to use native conversion functions from PR #411.
Changes include:
1. Fix GroupVersion type mismatches across the codebase:
- Updated OwnerReference creation to use schema.GroupVersion
- Fixed webhook scheme registration to use proper GroupVersion type
- Applied fixes to addon, placement, migration, work, and registration controllers
2. Enhance addon conversion webhook:
- Use native API conversion functions from addon/v1beta1/conversion.go
- Fix InstallNamespace annotation key to match expected format
- Add custom logic to populate deprecated ConfigReferent field in ConfigReferences
- Properly preserve annotations during v1alpha1 <-> v1beta1 conversion
3. Remove duplicate conversion code:
- Deleted pkg/addon/webhook/conversion/ directory (~500 lines)
- Now using native conversion functions from the API repository
4. Patch vendored addon-framework:
- Fixed GroupVersion errors in agentdeploy utils
All unit tests pass successfully (97 packages, 0 failures).
Signed-off-by: Qing Hao <qhao@redhat.com>
---------
Signed-off-by: Qing Hao <qhao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Skip garbage collection for ManifestWorks that have the
ManifestWorkReplicaSet controller label, as these should be
managed exclusively by the ManifestWorkReplicaSet controller.
Changes:
- Fix logic bug in controller to properly check for ReplicaSet label
- Add unit tests for label-based GC skip behavior
- Add integration test to verify GC skip for ReplicaSet-managed works
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
This change adds log tracing support to the work agent controllers by:
- Upgrading SDK to version with logging.SetLogTracingByObject helper
- Setting tracing keys from ManifestWork objects in all work controllers
- Adding clusterName to the base logger for better log context
- Propagating tracing context through cloud events
The tracing keys enable better correlation of logs across the work
lifecycle from source to agent.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Scorecard supply-chain security / Scorecard analysis (push) Failing after 1m11s
Post / coverage (push) Failing after 37m30s
Post / images (amd64, addon-manager) (push) Failing after 7m29s
Post / images (amd64, placement) (push) Failing after 6m57s
Post / images (amd64, registration) (push) Failing after 7m5s
Post / images (amd64, registration-operator) (push) Failing after 7m5s
Post / images (amd64, work) (push) Failing after 7m2s
Post / images (arm64, addon-manager) (push) Failing after 7m18s
Post / images (arm64, placement) (push) Failing after 7m7s
Post / images (arm64, registration) (push) Failing after 7m13s
Post / images (arm64, registration-operator) (push) Failing after 7m6s
Post / images (arm64, work) (push) Failing after 7m2s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
Close stale issues and PRs / stale (push) Successful in 45s
* Use base controller in sdk-go
We can leverage contextual logger in base controller.
Signed-off-by: Jian Qiu <jqiu@redhat.com>
* Fix integration test error
Signed-off-by: Jian Qiu <jqiu@redhat.com>
---------
Signed-off-by: Jian Qiu <jqiu@redhat.com>
Scorecard supply-chain security / Scorecard analysis (push) Failing after 43s
Post / coverage (push) Failing after 47s
Post / images (amd64, addon-manager) (push) Failing after 33s
Post / images (amd64, placement) (push) Failing after 39s
Post / images (amd64, registration) (push) Failing after 32s
Post / images (amd64, registration-operator) (push) Failing after 37s
Post / images (amd64, work) (push) Failing after 39s
Post / images (arm64, addon-manager) (push) Failing after 42s
Post / images (arm64, placement) (push) Failing after 42s
Post / images (arm64, registration) (push) Failing after 36s
Post / images (arm64, registration-operator) (push) Failing after 34s
Post / images (arm64, work) (push) Failing after 27s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
In integration test, there is change that creating cluster fails
since the cluster is created in the test. The alreadyExist
error should be ignored
Signed-off-by: Jian Qiu <jqiu@redhat.com>
1. make IsCertificateValid to public
2. check existence of bootstrap kubeconfig in the specific driver
3. initiate clients from driver using bootstrapCtx in bootstrap
phase
Signed-off-by: Jian Qiu <jqiu@redhat.com>
* Code refactor on registration driver
1. Move driver's options to each drivers implemenation
2. Add BuildClients interface so it is possible
to support other driver later.
3. create a general option to simplify driver init
4. remove option with any type
Signed-off-by: Jian Qiu <jqiu@redhat.com>
* Change to bootstrapKubeConfigFile
Signed-off-by: Jian Qiu <jqiu@redhat.com>
---------
Signed-off-by: Jian Qiu <jqiu@redhat.com>