Modified e2e test suite to validate image tags and fail tests when
tag-based images don't match the expected tag, while skipping validation
for digest-based images (SHA format).
Changes:
- Added validateImageFormat() helper to check image format (tag vs digest)
- Images using digest format (@sha256:...) skip validation
- Images using tag format (:tag) are validated against expected tag
- Tests fail with Expect() if tag validation fails
- Validation applies to test image variables and ClusterManager specs
- Only validates ClusterManager CR specs, not deployments
- Removed validateKlusterletImageSpecs() to avoid validation before resource creation
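The tag-versus-digest rule in the changes above can be sketched roughly as follows. This is a minimal illustration, not the helper from the commit: the function name `validateImageTag`, its signature, and the example image references are assumptions, since the commit message only names `validateImageFormat()` without showing it.

```go
package main

import (
	"fmt"
	"strings"
)

// validateImageTag is a hypothetical stand-in for the validateImageFormat()
// helper named in the commit message; the real signature is not shown there.
// Digest references (containing "@sha256:") are skipped, while tag
// references must carry exactly the expected tag.
func validateImageTag(image, expectedTag string) error {
	if strings.Contains(image, "@sha256:") {
		return nil // digest-based image: nothing to compare
	}
	// Only look for the tag after the last '/', so that a registry port
	// such as "localhost:5000/..." is not mistaken for a tag.
	name := image[strings.LastIndex(image, "/")+1:]
	idx := strings.LastIndex(name, ":")
	if idx < 0 {
		return fmt.Errorf("image %q has no tag, expected %q", image, expectedTag)
	}
	if tag := name[idx+1:]; tag != expectedTag {
		return fmt.Errorf("image %q has tag %q, expected %q", image, tag, expectedTag)
	}
	return nil
}

func main() {
	// Example references are made up for illustration.
	for _, img := range []string{
		"quay.io/example/registration:e2e",
		"quay.io/example/work@sha256:0123456789abcdef",
		"localhost:5000/example/work:latest",
	} {
		fmt.Printf("%s -> %v\n", img, validateImageTag(img, "e2e"))
	}
}
```

In a Ginkgo suite the returned error would typically be passed to Expect() so that a mismatch fails the run, which matches the behavior described above.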
Bug fix:
- Fixed CI failure where image validation ran before Klusterlet was created
- The validation now only checks test inputs (which are used to create Klusterlet)
- This ensures Klusterlet has correct images by design without redundant validation
This fixes the BeforeSuite error:
"image validation failed: [failed to get Klusterlet:
klusterlets.operator.open-cluster-management.io
"e2e-universal-klusterlet" not found]"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit fixes two bugs in the ManifestWorkReplicaSet e2e tests:
1. Use freshly fetched mwrSet instead of stale mwReplicaSet when checking
status summary. This ensures we're validating against the latest state
rather than the initial object.
2. Return descriptive error messages instead of nil error when condition
checks fail. This improves test debugging by providing clear failure
reasons.
These fixes improve test reliability and error reporting.
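The second fix can be illustrated with a small sketch. The `condition` type and `checkCondition` function below are hypothetical simplifications (the real tests work with Kubernetes condition types); the point is only that every failing branch returns an error that says what was wrong, instead of nil.

```go
package main

import "fmt"

// condition is a simplified, hypothetical stand-in for a Kubernetes
// status condition.
type condition struct {
	Type   string
	Status string
}

// checkCondition returns nil only when the named condition exists and has
// the wanted status. Every other path returns a descriptive error, so a
// polling assertion that consumes it can report why it is still failing.
func checkCondition(conds []condition, condType, wantStatus string) error {
	for _, c := range conds {
		if c.Type != condType {
			continue
		}
		if c.Status != wantStatus {
			return fmt.Errorf("condition %s has status %s, expected %s",
				condType, c.Status, wantStatus)
		}
		return nil
	}
	return fmt.Errorf("condition %s not found", condType)
}

func main() {
	conds := []condition{{Type: "PlacementVerified", Status: "False"}}
	fmt.Println(checkCondition(conds, "PlacementVerified", "True"))
	fmt.Println(checkCondition(conds, "ManifestworkApplied", "True"))
}
```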
🤖 Assisted by Claude Code
Signed-off-by: zhujian <jiazhu@redhat.com>
* ✨ Add e2e test for token-based authentication with template addons
This test validates the token-based authentication feature for template
addons introduced in PR #1363. It tests the complete authentication
lifecycle including switching between token and CSR authentication modes.
Test Flow:
1. Enable token-based authentication for addons on klusterlet
2. Deploy template addon and verify it uses token auth
3. Validate token field exists in hub kubeconfig secret
4. Test addon functionality with token authentication
5. Switch back to CSR-based authentication
6. Verify hub kubeconfig now uses client certificates
7. Test addon functionality with CSR authentication
8. Cleanup all resources
Key Features:
- Comprehensive validation of both token and CSR authentication
- No manual CSR approval needed (auto-approved by system)
- Works independently of klusterlet registration driver (grpc/csr)
- Uses label "addon-token-auth" for selective test execution
🤖 Generated with Claude Code
https://claude.com/claude-code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* ♻️ Refactor addon token auth test to use BeforeAll/AfterAll hooks
Move klusterlet configuration save/restore logic from defer in test
function to BeforeAll/AfterAll hooks for better test structure and
isolation.
Changes:
- Save original klusterlet configuration in BeforeAll before any setup
- Configure token auth for klusterlet in BeforeAll
- Restore original configuration in AfterAll after cleanup
- Remove redundant Steps 9-12 (CSR auth switch back)
- Renumber remaining steps from 1-10
- Remove unused strings import
This ensures the klusterlet's original AddOnKubeClientRegistrationDriver
is preserved for other tests and provides clearer separation of test
setup/teardown from test logic.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: zhujian <jiazhu@redhat.com>
* ✅ Wait for registration agent rollout before proceeding in token auth test
Add explicit wait for registration agent deployment to fully rollout after
token authentication configuration is applied. This ensures all replicas are
updated and ready before proceeding with the test, preventing race conditions.
The wait validates:
- ObservedGeneration matches current generation
- All replicas are updated with new configuration
- All replicas are ready and available
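The three rollout criteria listed above could be checked along these lines. The `deploymentState` struct is a hypothetical, flattened stand-in for the relevant fields of an apps/v1 Deployment, used here so the sketch stays self-contained; the actual test reads them from the Deployment object.

```go
package main

import "fmt"

// deploymentState is a hypothetical, flattened view of the Deployment
// fields the rollout check needs.
type deploymentState struct {
	Generation         int64 // metadata.generation
	ObservedGeneration int64 // status.observedGeneration
	Replicas           int32 // spec.replicas
	UpdatedReplicas    int32 // status.updatedReplicas
	ReadyReplicas      int32 // status.readyReplicas
	AvailableReplicas  int32 // status.availableReplicas
}

// checkRollout returns nil once the rollout is complete, otherwise an
// error naming the first criterion that is not yet met.
func checkRollout(d deploymentState) error {
	if d.ObservedGeneration != d.Generation {
		return fmt.Errorf("observed generation %d does not match generation %d",
			d.ObservedGeneration, d.Generation)
	}
	if d.UpdatedReplicas != d.Replicas {
		return fmt.Errorf("%d of %d replicas updated", d.UpdatedReplicas, d.Replicas)
	}
	if d.ReadyReplicas != d.Replicas {
		return fmt.Errorf("%d of %d replicas ready", d.ReadyReplicas, d.Replicas)
	}
	if d.AvailableReplicas != d.Replicas {
		return fmt.Errorf("%d of %d replicas available", d.AvailableReplicas, d.Replicas)
	}
	return nil
}

func main() {
	inProgress := deploymentState{Generation: 2, ObservedGeneration: 2,
		Replicas: 3, UpdatedReplicas: 3, ReadyReplicas: 2, AvailableReplicas: 2}
	fmt.Println(checkRollout(inProgress))
}
```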
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* ✨ Add deployment generation check to ensure token auth rollout
Capture the registration agent deployment generation before updating
the klusterlet configuration, then wait for it to increment after the
update. This ensures the test waits for the actual new deployment with
token auth configuration, not an old one with CSR-based auth.
Changes:
- Capture initial deployment generation before klusterlet update
- Calculate deployment name once based on Singleton vs Default mode
- Wait for deployment generation to increment after config change
- Verify deployment has fully rolled out with all pods updated and ready
This prevents race conditions where the test proceeds while old pods
with the previous CSR-based configuration are still running, which was
likely causing CI failures.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: zhujian <jiazhu@redhat.com>
* ✨ Add support for hosted mode in addon token auth test
This commit adds proper support for hosted mode deployment in the addon
token authentication e2e test. In hosted mode, the agent deployments run
on the hub cluster instead of the spoke cluster, and the agent namespace
is named after the klusterlet name rather than using a fixed namespace.
Key changes:
- Check for both InstallModeHosted and InstallModeSingletonHosted modes
- Use hub.KubeClient instead of spoke.KubeClient in hosted mode
- Use klusterlet.Name as agentNamespace in hosted mode
- Support InstallModeSingletonHosted for deployment naming
This ensures the test works correctly in all deployment modes:
Default, Singleton, Hosted, and SingletonHosted.
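The mode-dependent choices in the key changes above can be summarized in a small sketch. Everything concrete here is an assumption for illustration: the local `installMode` constants mirror the mode names in this message rather than importing the API package, and the default namespace and deployment-name suffixes are guesses that are not stated in the commit.

```go
package main

import "fmt"

// installMode and its constants are local stand-ins for the operator API's
// install mode type, named after the modes listed in the commit message.
type installMode string

const (
	modeDefault         installMode = "Default"
	modeSingleton       installMode = "Singleton"
	modeHosted          installMode = "Hosted"
	modeSingletonHosted installMode = "SingletonHosted"
)

// defaultAgentNamespace is an assumed value for the fixed agent namespace
// used outside hosted mode.
const defaultAgentNamespace = "open-cluster-management-agent"

// agentTarget describes where the test should look for the agent deployment.
type agentTarget struct {
	UseHubClient   bool   // true: use the hub client, false: the spoke client
	Namespace      string // namespace holding the agent deployment
	DeploymentName string // deployment to wait on
}

// resolveAgentTarget is a hypothetical helper. The "-agent" and
// "-registration-agent" suffixes are assumptions, not taken from the commit.
func resolveAgentTarget(mode installMode, klusterletName string) agentTarget {
	t := agentTarget{Namespace: defaultAgentNamespace}
	if mode == modeHosted || mode == modeSingletonHosted {
		// Hosted modes: agents run on the hub, in a namespace named
		// after the klusterlet.
		t.UseHubClient = true
		t.Namespace = klusterletName
	}
	if mode == modeSingleton || mode == modeSingletonHosted {
		t.DeploymentName = klusterletName + "-agent"
	} else {
		t.DeploymentName = klusterletName + "-registration-agent"
	}
	return t
}

func main() {
	for _, m := range []installMode{modeDefault, modeSingleton, modeHosted, modeSingletonHosted} {
		fmt.Printf("%-16s %+v\n", m, resolveAgentTarget(m, "klusterlet"))
	}
}
```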
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
---------
Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit adds validation to detect and reject duplicate manifests
in ManifestWork resources. A manifest is considered duplicate when
it has the same apiVersion, kind, namespace, and name as another
manifest in the same ManifestWork.
This prevents issues where duplicate manifests with different specs
can cause state inconsistency, as the Work Agent applies manifests
sequentially and later entries would overwrite earlier ones.
The validation returns a clear error message indicating the duplicate
manifest's index and the index of its first occurrence.
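As a minimal sketch of the duplicate check described above: manifests are reduced to their identifying tuple and tracked in a map from tuple to first index. The `manifestKey` type, the function name, and the error wording are assumptions; the actual validation operates on the ManifestWork's raw manifests and its message may differ.

```go
package main

import "fmt"

// manifestKey is the identity used for duplicate detection: apiVersion,
// kind, namespace, and name. It is a hypothetical simplification of a
// decoded manifest.
type manifestKey struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
}

// validateNoDuplicates returns an error for the first manifest whose key
// was already seen, reporting both its index and the index of the first
// occurrence.
func validateNoDuplicates(manifests []manifestKey) error {
	seen := make(map[manifestKey]int, len(manifests))
	for i, m := range manifests {
		if first, ok := seen[m]; ok {
			return fmt.Errorf(
				"manifest at index %d (%s %s %s/%s) duplicates manifest at index %d",
				i, m.APIVersion, m.Kind, m.Namespace, m.Name, first)
		}
		seen[m] = i
	}
	return nil
}

func main() {
	manifests := []manifestKey{
		{"v1", "ConfigMap", "default", "cm1"},
		{"apps/v1", "Deployment", "default", "app"},
		{"v1", "ConfigMap", "default", "cm1"},
	}
	fmt.Println(validateNoDuplicates(manifests))
}
```

Because the struct contains only comparable string fields, it can be used directly as a map key, which keeps the check linear in the number of manifests.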
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: xuezhaojun <zxue@redhat.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* Add addon conversion webhook for v1alpha1/v1beta1 API migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Qing Hao <qhao@redhat.com>
* Fix GroupVersion compatibility issues after API dependency update
This commit fixes compilation and test errors introduced by updating
the API dependency to use native conversion functions from PR #411.
Changes include:
1. Fix GroupVersion type mismatches across the codebase:
- Updated OwnerReference creation to use schema.GroupVersion
- Fixed webhook scheme registration to use proper GroupVersion type
- Applied fixes to addon, placement, migration, work, and registration controllers
2. Enhance addon conversion webhook:
- Use native API conversion functions from addon/v1beta1/conversion.go
- Fix InstallNamespace annotation key to match expected format
- Add custom logic to populate deprecated ConfigReferent field in ConfigReferences
- Properly preserve annotations during v1alpha1 <-> v1beta1 conversion
3. Remove duplicate conversion code:
- Deleted pkg/addon/webhook/conversion/ directory (~500 lines)
- Now using native conversion functions from the API repository
4. Patch vendored addon-framework:
- Fixed GroupVersion errors in agentdeploy utils
All unit tests pass successfully (97 packages, 0 failures).
Signed-off-by: Qing Hao <qhao@redhat.com>
---------
Signed-off-by: Qing Hao <qhao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
* Fix ManagedClusterAddOns not removed when ClusterManagementAddOn is deleted
The addon template controller stopped addon managers immediately when a
ClusterManagementAddOn was deleted, without waiting for pre-delete jobs
to complete or for ManagedClusterAddOns to be cleaned up via owner-reference
cascading deletion.
This change implements the TODO at line 105 by checking that all
ManagedClusterAddOns are deleted before stopping the manager. The controller
now uses field selectors to efficiently query for remaining ManagedClusterAddOns
and requeues after 10 seconds if any still exist, allowing time for proper
cleanup.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* add e2e test
Signed-off-by: zhujian <jiazhu@redhat.com>
* return err when stopUnusedManagers failed
Signed-off-by: zhujian <jiazhu@redhat.com>
* Address review comments for addon manager deletion fix
- Use lister instead of API client for better performance
- Add named constant for requeue delay
- Fix test cache synchronization issues
- Improve test coverage from 74.7% to 75.6%
Addresses review feedback from Qiujian16 and CodeRabbit.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Fix e2e test timeout for configmap deletion check
Add explicit 180s timeout for pre-delete job configmap cleanup.
The default 90s timeout was insufficient for the deletion workflow.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Improve error logging in template agent
- Replace utilruntime.HandleError with structured logging in CSR functions
- Add more context to error messages for better debugging
- Use logger.Info for template retrieval errors to provide better visibility
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Use ManagedClusterAddonByName index for efficient lookup
- Replace inefficient list-and-filter with indexed lookup
- Add managedClusterAddonIndexer field to controller struct
- Update comment to accurately describe functionality
- Fix unit tests to properly set up the required index
This addresses the PR review feedback to use the existing index
instead of listing all ManagedClusterAddOns and filtering by name.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Remove unused mcaLister field
Since we now use managedClusterAddonIndexer for efficient lookup,
the mcaLister field is no longer needed. This cleanup reduces
memory usage and simplifies the controller structure.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Replace inefficient list-and-filter with indexed lookup in runController
Use managedClusterAddonIndexer.ByIndex() instead of listing all ManagedClusterAddOns
and filtering by name. This provides O(1) indexed lookup instead of O(n) linear scan.
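The difference between the two lookup strategies can be shown with a self-contained sketch. The `addon` type and the map-based `addonIndex` are stand-ins invented for illustration; in the controller the index is maintained by the informer cache and queried through the indexer's ByIndex() method rather than built by hand.

```go
package main

import "fmt"

// addon is a simplified, hypothetical stand-in for a ManagedClusterAddOn,
// which lives in a cluster namespace and shares its name across clusters.
type addon struct {
	Namespace string
	Name      string
}

// listAndFilter is the O(n) approach: scan every addon and keep matches.
func listAndFilter(all []addon, name string) []addon {
	var out []addon
	for _, a := range all {
		if a.Name == name {
			out = append(out, a)
		}
	}
	return out
}

// addonIndex maps an addon name to all addons with that name.
type addonIndex map[string][]addon

// buildIndex pays the O(n) cost once; an informer cache would instead keep
// the index up to date incrementally as objects change.
func buildIndex(all []addon) addonIndex {
	idx := make(addonIndex)
	for _, a := range all {
		idx[a.Name] = append(idx[a.Name], a)
	}
	return idx
}

// byName is the indexed lookup: a single map access per query.
func (idx addonIndex) byName(name string) []addon {
	return idx[name]
}

func main() {
	all := []addon{
		{"cluster1", "hello"}, {"cluster2", "hello"}, {"cluster1", "other"},
	}
	idx := buildIndex(all)
	fmt.Println(len(listAndFilter(all, "hello")), len(idx.byName("hello")))
}
```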
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Fix review comments for addon manager deletion
- Fix closure capture bug in controller test by using captured variables
- Fix typo 'copyiedConfig' to 'copiedConfig' in e2e tests
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Optimize ManagedClusterAddOn event handling in addon template controller
Replace filtered event handling with custom event handlers that only trigger
reconciliation when AddOnTemplate configReferences actually change. This
reduces unnecessary reconciliation cycles by using reflect.DeepEqual to
compare config references between old and new objects.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Revert "Optimize ManagedClusterAddOn event handling in addon template controller"
This reverts commit 4649d1b9ac.
Signed-off-by: zhujian <jiazhu@redhat.com>
---------
Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
* Fix helm command syntax in e2e test deployment
Remove unnecessary commas from --set flags in helm commands.
Multiple --set flags don't require commas between them.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Add comprehensive image tag validation to e2e tests
- Add configurable expected-image-tag parameter to e2e test suite
- Create image_tag_validation_test.go with comprehensive validation:
* Validates all container images in cluster-manager and klusterlet deployments use expected tag
* Validates test image variables (registrationImage, workImage, singletonImage) end with expected tag
* Validates ClusterManager spec imagePullSpec fields use expected tag
- Pass IMAGE_TAG to e2e tests via expected-image-tag parameter
- Use BeforeEach to avoid duplicate validation checks
This ensures IMAGE_TAG=e2e properly propagates to all OCM components.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
* Move image validation from separate test to BeforeSuite
- Move all image validation logic from image_tag_validation_test.go to BeforeSuite in e2e_suite_test.go
- Validates test image variables, ClusterManager spec, and deployment containers in setup phase
- Provides early failure detection if image configuration is incorrect
- Removes duplicate test file and function declarations
This ensures image validation runs once during test setup rather than as separate test cases.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
---------
Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
In the integration test, there is a chance that creating the cluster
fails because the cluster has already been created in the test. The
AlreadyExists error should be ignored.
Signed-off-by: Jian Qiu <jqiu@redhat.com>
* Add a configmap to handle the proxy ca bundle
Signed-off-by: zhujian <jiazhu@redhat.com>
* Use contextual logger
Signed-off-by: zhujian <jiazhu@redhat.com>
---------
Signed-off-by: zhujian <jiazhu@redhat.com>
The addon namespace should always be enabled by default.
The operator will not create the addon namespace based
on the klusterlet install namespace.
Signed-off-by: Jian Qiu <jqiu@redhat.com>
* Set install namespace of addonTemplate from config
Signed-off-by: Jian Qiu <jqiu@redhat.com>
* Add an e2e test case
Signed-off-by: Jian Qiu <jqiu@redhat.com>
---------
Signed-off-by: Jian Qiu <jqiu@redhat.com>