27 Commits

Author SHA1 Message Date
Jian Qiu
63d9574ca2 Add watch-based feedback with dynamic informer lifecycle management (#1350)
* Add watch-based feedback with dynamic informer lifecycle management

Implements dynamic informer registration and cleanup for resources
configured with watch-based status feedback (FeedbackScrapeType=Watch).
This enables real-time status updates for watched resources while
efficiently managing resource lifecycle.

Features:
- Automatically register informers for resources with FeedbackWatchType
- Skip informer registration for FeedbackPollType or when not configured
- Clean up informers when resources are removed from manifestwork
- Clean up informers during applied manifestwork finalization
- Clean up informers when feedback type changes from watch to poll

Implementation:
- Refactored ObjectReader to interface for better modularity
- Added UnRegisterInformerFromAppliedManifestWork helper for bulk cleanup
- Enhanced AvailableStatusController to conditionally register informers
- Updated finalization controllers to unregister informers on cleanup
- Added nil safety checks to prevent panics during cleanup

Testing:
- Unit tests for informer registration based on feedback type
- Unit tests for bulk unregistration and nil safety
- Integration test for end-to-end watch-based feedback workflow
- Integration test for informer cleanup on manifestwork deletion
- All existing tests updated and passing

This feature improves performance by using watch-based updates for
real-time status feedback while maintaining efficient resource cleanup.
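
The lifecycle rules listed under Features can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `informerRegistry`, `reconcile`, `unregisterAll`, and the string resource keys are hypothetical names, and a real implementation would start and stop dynamic informers rather than toggle map entries.

```go
package main

import "fmt"

// feedbackType is a hypothetical stand-in for the feedback scrape type
// mentioned in the commit message (watch versus poll).
type feedbackType string

const (
	feedbackWatch feedbackType = "Watch"
	feedbackPoll  feedbackType = "Poll"
)

// informerRegistry tracks which resources currently have an informer.
// A map entry stands in for a running informer in this sketch.
type informerRegistry struct {
	active map[string]bool
}

func newInformerRegistry() *informerRegistry {
	return &informerRegistry{active: map[string]bool{}}
}

// reconcile registers an informer for watch-based feedback and removes
// any existing one for every other feedback type (poll or unset).
func (r *informerRegistry) reconcile(resource string, ft feedbackType) {
	if ft == feedbackWatch {
		r.active[resource] = true
		return
	}
	delete(r.active, resource)
}

// unregisterAll removes the informers of all given resources, as bulk
// cleanup during finalization would. It is safe on a nil registry.
func (r *informerRegistry) unregisterAll(resources []string) {
	if r == nil {
		return
	}
	for _, res := range resources {
		delete(r.active, res)
	}
}

func main() {
	reg := newInformerRegistry()
	reg.reconcile("deployments/default/nginx", feedbackWatch)
	fmt.Println("registered after watch:", reg.active["deployments/default/nginx"])

	// Switching the feedback type from watch to poll drops the informer.
	reg.reconcile("deployments/default/nginx", feedbackPoll)
	fmt.Println("registered after poll:", reg.active["deployments/default/nginx"])
}
```

Keeping registration and removal in one `reconcile` call means a change of feedback type is handled the same way as an initial configuration, which matches the "watch to poll" cleanup case listed above.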

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Fallback to get from client when informer is not synced

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2026-01-29 06:46:21 +00:00
Jian Qiu
9b010ef622 🌱 build object reader to get resource object from spoke (#1324)
* Add resource informer code to watch resources

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Use object reader in controller

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2026-01-23 07:32:13 +00:00
Anne Lau
4dc99cd621 Progressing status conditions, true wins (#1332)
Signed-off-by: annelau <annelau@salesforce.com>
Co-authored-by: annelau <annelau@salesforce.com>
2026-01-16 06:53:14 +00:00
Anne Lau
ff9f801aa0 Fix transition time for Applied + StatusFeedbackSynced (#1282)
Update code changes to only update observed generation without lastTransitionTime

Update with simple tests

Update with the latest PR changes

Add unit test changes

Add integration test generated by cursor

Fix unit tests

Signed-off-by: annelau <annelau@salesforce.com>
Co-authored-by: annelau <annelau@salesforce.com>
2025-12-31 02:27:59 +00:00
Jian Qiu
a06e37e65c 🌱 Integrate SDK logging tracing into work agent controllers (#1277)
This change adds log tracing support to the work agent controllers by:
- Upgrading SDK to version with logging.SetLogTracingByObject helper
- Setting tracing keys from ManifestWork objects in all work controllers
- Adding clusterName to the base logger for better log context
- Propagating tracing context through cloud events

The tracing keys enable better correlation of logs across the work
lifecycle from source to agent.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Jian Qiu <jqiu@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-04 13:12:38 +00:00
Jian Qiu
33310619d9 🌱 use SDK basecontroller for better logging. (#1269)
* Use basecontroller in sdk-go instead for better logging

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Rename to fakeSyncContext

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-12-01 03:07:02 +00:00
Jian Qiu
eb033993c2 🌱 Use base controller in sdk-go (#1251)
* Use base controller in sdk-go

We can leverage the contextual logger in the base controller.

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Fix integration test error

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-11-20 07:53:42 +00:00
Jian Qiu
5528aff6d3 🌱 Add contextual logging for work agent (#1242)
* Add contextual logging for work agent

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Resolve comments

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-11-07 05:28:13 +00:00
Ben Perry
c5e776cdd9 Manifest completion (#1033)
* Skip manifests in work reconcile that are marked Complete

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Aggregate Complete condition to work from manifests

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Delete work that is complete and satisfies configured TTL

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* tests

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* lint

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* go.mod

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Helper funcs for conditions

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Generic condition aggregation

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Support integration test args

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Remove work deletion from spoke, will be moved to hub GC

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Cleanup

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* update api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Wait for NS to exist before testing

Signed-off-by: Ben Perry <bhperry94@gmail.com>

---------

Signed-off-by: Ben Perry <bhperry94@gmail.com>
2025-07-14 04:53:04 +00:00
Ben Perry
377ba25c26 Workload conditions (#910)
* Import OCM API changes for workload conditions

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Implement condition rule evaluator

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Evaluate manifest condition rules after apply

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* note to self

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Cleanup

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Return config option if rules are set

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* update api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Always return an error to inform user about the state of their condition rule

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Condition rule errors should not result in retrying apply

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Test condition rule reconciliation

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Return condition status Unknown when an internal CEL error occurs

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Switch to common CEL lib

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update to simplified celExpressions format

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Formatting

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* tidy

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update ocm api

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update sdk-go

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Switch to sdk-go ConditionLib

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update API

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Switch to WellKnownConditions with required Condition field

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Support CEL evaluation budget

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update sdk-go

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update API

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* lint

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update go.mod

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Tests and comments

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Move condition reader to status controller for more frequent updates

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Ignore missing WellKnownCondition

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Fix test

Signed-off-by: Ben Perry <bhperry94@gmail.com>

* Update condition tests

Signed-off-by: Ben Perry <bhperry94@gmail.com>

---------

Signed-off-by: Ben Perry <bhperry94@gmail.com>
2025-06-11 15:47:35 +00:00
Jian Qiu
4eda44f2b9 Add jitter in requeue for status controller (#991)
Instead of requeuing all items every resyncInterval, requeue
each item separately with jitter to avoid bursty requests.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-05-14 07:09:27 +00:00
Wei Liu
73150dea19 reduce unnecessary log (#890)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-03-14 01:19:08 +00:00
Zhiwei Yin
568789fef4 refactor to use common HasFinalizer func (#830)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2025-02-13 02:48:46 +00:00
Zhiwei Yin
fa3a30b36e support wildcard in manifestConfigs (#703)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2024-11-21 06:56:46 +00:00
Jian Qiu
3a2250d974 Refactor NewUnstructured method (#418)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2024-04-11 12:01:07 +00:00
Jian Qiu
92d4f86837 Add a flag for work agent to set raw json length (#366)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2024-03-06 03:52:16 +00:00
Jian Qiu
6cfce8ce24 Revert apply func (#353)
This part depends on library-go, so remove it from
sdk-go.

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2024-01-22 03:46:46 +00:00
Jian Qiu
bede3edd92 Switch to patcher in sdk-go (#349)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2024-01-22 02:04:49 +00:00
Jian Qiu
3167826df9 Use finalizer in api repo (#241)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-08-04 11:54:55 +02:00
Yang Le
6d6a6f1d74 🌱 upgrade addondeploymentconfigs crd to latest version (#243)
Signed-off-by: Yang Le <yangle@redhat.com>
2023-08-03 09:56:39 +02:00
Jian Qiu
e810520961 🌱 Refactor code to fix lint warning (#218)
* Refactor code to fix lint warning

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* enable lint for testing files

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-07-25 07:12:34 +02:00
Jian Qiu
f7cd1402e9 run work and registration as a single binary (#201)
* run registration/work together

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Fix integration test and lint issue

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Update operator to deploy singleton mode

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Update deps

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-07-14 04:56:48 +02:00
Jian Qiu
e4792e4b83 Refactor to use common queue/filter funcs (#197)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-06-28 15:59:19 +02:00
Jian Qiu
e344e26a5e Use patcher in work (#190)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-06-21 05:38:59 +02:00
Jian Zhu
d3d648283e 🌱 Configure the golangci lint (#180)
* 🌱 Configure the golangci lint

Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Fix lint issues

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2023-06-13 03:51:48 -04:00
Jian Zhu
7332a585c0 🌱 add a verify rule for golang files import order (#177)
* 🌱 add a verify rule for golang files import order

This PR uses the [gci tool](https://github.com/daixiang0/gci) to give the import section of every Go file a consistent order. Imports are organized into groups in this order:
1. standard library modules
2. 3rd party modules
3. modules in OCM org, like the `open-cluster-management.io/api`
4. current project `open-cluster-management.io/ocm` modules

Developers can run `make fmt-imports` to format imports automatically and `make verify-fmt-imports` to check for violations.
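
To make the four groups concrete, the sketch below classifies an import path into its group. It illustrates the ordering rule only and is not how gci is implemented: `importGroup` and the group constants are hypothetical, and treating any path whose first element contains no dot as standard library is a simplifying assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Group numbers follow the order given in the commit message.
const (
	groupStandard = iota + 1
	groupThirdParty
	groupOCMOrg
	groupCurrentProject
)

// importGroup returns the group an import path belongs to.
// The most specific prefix is checked first.
func importGroup(path string) int {
	switch {
	case path == "open-cluster-management.io/ocm" ||
		strings.HasPrefix(path, "open-cluster-management.io/ocm/"):
		return groupCurrentProject
	case strings.HasPrefix(path, "open-cluster-management.io/"):
		return groupOCMOrg
	case !strings.Contains(strings.SplitN(path, "/", 2)[0], "."):
		// Heuristic: standard library paths have no domain in the first element.
		return groupStandard
	default:
		return groupThirdParty
	}
}

func main() {
	for _, p := range []string{
		"context",
		"k8s.io/client-go/dynamic",
		"open-cluster-management.io/api/work/v1",
		"open-cluster-management.io/ocm/pkg/work/helper",
	} {
		fmt.Printf("group %d: %s\n", importGroup(p), p)
	}
}
```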

Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 format the go files import

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2023-06-12 10:23:04 -04:00
xuezhaojun
ad38b9465f Relocate pkgs. (#146)
Signed-off-by: xuezhaojun <zxue@redhat.com>
2023-05-29 07:20:55 -04:00