Commit Graph

28 Commits

Author SHA1 Message Date
Jian Zhu
336e5b0e4d 🌱 Add TLS profile compliance for gRPC server (#1471)
Add TLS profile compliance to the gRPC server, completing TLS support
for all hub components. The operator reads the ocm-tls-profile ConfigMap
and injects --tls-min-version and --tls-cipher-suites flags into the
gRPC server deployment, matching the pattern used by all other hub
component deployments.

Changes:
- Add TLS flag injection to gRPC server deployment manifest
- Wire TLS flags from common options to gRPC server via closure
- Call ApplyTLSToCommand for the 8443 health server endpoint
- Apply TLS overrides to the 8090 gRPC port via SDK ApplyTLSFlags
- Update vendored sdk-go with CipherSuites support for gRPC server
- Add unit, controller, and integration tests

Assisted by Claude

Signed-off-by: zhujian <jiazhu@redhat.com>
2026-04-07 01:54:22 +00:00
Qing Hao
391ae86bff split debug controller as standalone service with proper validation (#1461)
* feat(placement): split debug controller as standalone service with proper validation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Qing Hao <qhao@redhat.com>

* feat(placement): make placement service conditional on PlacementDebugServer feature gate

Make placement debug service deployment conditional based on
PlacementDebugServer feature gate to allow users to control
whether to expose the debug endpoint.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Qing Hao <qhao@redhat.com>

---------

Signed-off-by: Qing Hao <qhao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
2026-04-03 02:40:24 +00:00
Wei Liu
6117a3e553 disable leader election for grpc server (#1468)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2026-04-02 08:25:46 +00:00
Jian Zhu
fc55a5df7c 🌱 Add TLS ConfigMap watch and restart for cluster-manager operator (#1452)
* 🌱 Add TLS profile configuration support via flags and ConfigMap

Add pkg/common/tls library to support TLS profile compliance
for OCM components. This enables components to receive TLS
configuration via command-line flags (--tls-min-version and
--tls-cipher-suites) from operators, aligning with the upstream
enhancement proposal for TLS profile configuration.

Key features:
- TLS version and cipher suite parsing from flags or ConfigMap
- ConfigMap-based TLS configuration for operator use
- ConfigMap watcher for operators to detect profile changes
- OpenSSL cipher name mapping to Go crypto/tls constants
- Safe defaults (TLS 1.2) when no configuration provided

Updated pkg/common/options/webhook.go to use TLS library instead
of hardcoded TLS 1.2, enabling webhook components to respect
TLS flags injected by operators.

This is the foundation for OCM TLS profile compliance, keeping
upstream code OpenShift-agnostic while supporting dynamic TLS
configuration.

Related: open-cluster-management-io/enhancements#175

Signed-off-by: Jia Zhu <jiazhu@redhat.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Add TLS ConfigMap watch and restart to cluster-manager operator

Implement ConfigMap-based TLS profile compliance for cluster-manager operator
with hash comparison to prevent infinite restart loops.

Changes:
- Add TLS ConfigMap informer to watch ocm-tls-profile ConfigMap
- Load current TLS config at startup and compute hash
- Add event handlers that compare ConfigMap hash with current hash
- Only restart if ConfigMap content actually differs from current config
- Add comprehensive logging for all scenarios

Scenarios handled:
 ConfigMap exists at startup (hash matches) → no restart
 ConfigMap created after startup (hash differs) → restart to apply
 ConfigMap updated (new hash differs) → restart to apply
 ConfigMap deleted (was using it) → restart to use defaults
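
The four scenarios above reduce to one comparison: hash the configuration seen at startup, then restart only when a later observation hashes differently. The sketch below assumes a SHA-256 over sorted key/value pairs; hashConfig and needsRestart are hypothetical names, and the operator's real hashing scheme is not described in this log.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// hashConfig returns a stable hash of ConfigMap-style data. Keys are sorted so
// that map iteration order does not affect the result. A nil map stands in for
// "ConfigMap absent, defaults in use".
func hashConfig(data map[string]string) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%q=%q\n", k, data[k])
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

// needsRestart reports whether the observed data differs from the
// configuration that was hashed at startup.
func needsRestart(startupHash string, observed map[string]string) bool {
	return hashConfig(observed) != startupHash
}

func main() {
	startup := map[string]string{"minTLSVersion": "VersionTLS12"}
	h := hashConfig(startup)

	// Same content as at startup: no restart.
	fmt.Println(needsRestart(h, map[string]string{"minTLSVersion": "VersionTLS12"}))
	// Updated content: restart to apply.
	fmt.Println(needsRestart(h, map[string]string{"minTLSVersion": "VersionTLS13"}))
	// ConfigMap deleted while in use: restart to fall back to defaults.
	fmt.Println(needsRestart(h, nil))
}
```

Comparing hashes rather than reacting to every informer event is what prevents the restart loop: the initial add event for an unchanged ConfigMap hashes to the startup value and is ignored.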

Leader election behavior:
- This code only runs on the leader pod (due to controllercmd framework)
- Non-leader pods wait idle until they acquire leadership
- New leaders load current ConfigMap state when they start, ensuring latest config
- Only the active leader monitors ConfigMap changes and restarts

🤖 Generated with Claude Code

Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Inject TLS config flags into addon-webhook deployment

Implement Case 2 pattern for addon-webhook TLS configuration:
cluster-manager-operator loads TLS config from ConfigMap and injects
it as flags into the addon-webhook deployment.

Changes:
- Add AddonWebhookTLSMinVersion and AddonWebhookTLSCipherSuites fields to HubConfig
- Load TLS config once when creating ClusterManagerController
- Pass TLS config strings as parameters to controller
- Inject --tls-min-version and --tls-cipher-suites flags into addon-webhook deployment template

This approach ensures addon-webhook receives TLS configuration via flags
without needing to watch the ConfigMap itself. When the ConfigMap changes,
cluster-manager-operator restarts, reloads the config, and updates the
deployment with new flags.

🤖 Generated with Claude Code

Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Log TLS min version and cipher suites on startup

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Move TLS library to sdk-go and update vendor dependencies

Relocates TLS config and cipher helpers from pkg/common/tls into the
vendored open-cluster-management.io/sdk-go/pkg/tls package, adds a
generic watcher utility, and updates all import references accordingly.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Inject TLS flags into all hub component deployments

Extend TLS flag injection from addon-webhook-only to all seven
hub deployments managed by cluster-manager-operator:

Manifests (operator → deployment args):
- Rename HubConfig.AddonWebhookTLS* → TLS* so the same fields
  drive all deployments rather than only the addon webhook
- Add {{- if .TLSMinVersion }} blocks to all six remaining
  deployment manifests (registration/work/placement controllers
  and registration/work webhook servers)

Controller binaries (registration, work, placement, addon-manager):
- Add --tls-min-version and --tls-cipher-suites flags to the
  common Options struct so the binaries accept the injected flags
  without failing; the flags are stored for future use

Note: library-go's NewCommandWithContext uses cmd.Run (not RunE),
so there is no clean programmatic hook to inject TLS into the 8443
health server without bypassing library-go's own boilerplate
(signal handling, log init, profiling). Upstream library-go also
has no native TLS configuration API on ControllerCommandConfig or
ControllerBuilder. The 8443 health server defaults to TLS 1.2 via
SetRecommendedHTTPServingInfoDefaults; configuring it further
requires an upstream library-go enhancement.

Webhook binaries already fully support these flags via WebhookOptions;
no binary changes are needed there.

Signed-off-by: Jian Zhu <zhujian@redhat.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Wire --tls-min-version to library-go 8443 health server via WithServingTLSConfig

Now that library-go has WithServingTLSConfig (ServingMinTLSVersion /
ServingCipherSuites fields + injection in StartController before
WithServer is called), wire the --tls-min-version and
--tls-cipher-suites flags from Options into it.

ApplyTLSToCommand installs a PersistentPreRunE hook that calls
CmdConfig.WithServingTLSConfig after cobra flag parsing completes.
PersistentPreRunE runs before cmd.Run, so all library-go boilerplate
(signal handling, logging, profiling) is preserved - unlike the
previous approach of replacing RunE which silently bypassed it.

Uses go mod replace → /Users/jiazhu/go/src/github.com/openshift/library-go
for local development/testing; replace directive to be removed once the
library-go PR is merged and vendored.

Signed-off-by: Jian Zhu <zhujian@redhat.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Switch to --config file for controller 8443 TLS configuration

Replace the WithServingTLSConfig approach with library-go's native
--config flag mechanism:

ApplyTLSToCommand now installs a PersistentPreRunE hook that:
1. Writes a minimal GenericOperatorConfig YAML to a temp file under
   /tmp (which is mounted as an emptyDir in all hub controller
   deployments, so writing is safe even with readOnlyRootFilesystem)
2. Sets --config to point at the temp file before cmd.Run executes

All library-go boilerplate in cmd.Run (signal handling, log init,
profiling, basicFlags.Validate) is fully preserved because
PersistentPreRunE runs before Run, not replacing it.

Inside StartController, Config() reads the temp file; the TLS values
survive SetRecommendedHTTPServingInfoDefaults because DefaultString
only sets fields that are currently empty.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Add tests for TLS profile compliance

Unit tests (pkg/common/options):
- TestApplyTLSToCommand: table-driven test covering all flag combinations:
  no flags (no-op), min-version only, cipher-suites only, both set,
  and --config pre-set by user (injection skipped).

Unit tests (clustermanager_controller):
- TestSyncDeployWithTLSConfig: verifies that when tlsMinVersion /
  tlsCipherSuites are set on the controller, the --tls-min-version and
  --tls-cipher-suites flags appear in the args of every managed hub
  deployment (registration, registration-webhook, placement, work-webhook).
  Also verifies the flags are absent when TLS config is not set.

Integration tests (test/integration/operator):
- "should inject tls-min-version into all hub deployments when
  ocm-tls-profile ConfigMap exists": creates the ocm-tls-profile
  ConfigMap with minTLSVersion=VersionTLS13 in the operator namespace
  and verifies all six hub deployments gain --tls-min-version=VersionTLS13
  in their container args.

Signed-off-by: Jian Zhu <zhujian@redhat.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Switch TLS cipher suite format from OpenSSL to IANA

Update vendored sdk-go to use IANA cipher suite names (e.g.
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) instead of OpenSSL names
(e.g. ECDHE-RSA-AES128-GCM-SHA256).

IANA is the canonical format used by Go's crypto/tls, the Kubernetes
apiserver --tls-cipher-suites flag, and library-go's ServingInfo.CipherSuites.
Using IANA names end-to-end eliminates the format mismatch that caused
library-go's 8443 health server to reject cipher suite names written by
ApplyTLSToCommand.

The ocm-tls-profile ConfigMap now accepts IANA names only. The downstream
tls-profile-sync sidecar is responsible for converting OpenShift
TLSSecurityProfile (OpenSSL-style) names to IANA before writing the ConfigMap.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 Fix TLS ConfigMap test: create ConfigMap before operator startup

The previous test created ocm-tls-profile ConfigMap after the operator
started, which triggered the watcher's hash-change detection and called
os.Exit(0), killing the test process. Move the test into a dedicated
Describe with BeforeEach that creates the ConfigMap before starting the
operator so the watcher seeds its hash at startup and no restart is
triggered.

Also add hubWorkControllerDeployment to the tlsDeployments list since
its manifest includes tls-min-version injection.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: Jia Zhu <jiazhu@redhat.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
Signed-off-by: Jian Zhu <zhujian@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-01 06:54:30 +00:00
Qing Hao
c516beffa6 Add addon conversion webhook for v1alpha1/v1beta1 API migration (#1289)
* Add addon conversion webhook for v1alpha1/v1beta1 API migration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Qing Hao <qhao@redhat.com>

* Fix GroupVersion compatibility issues after API dependency update

This commit fixes compilation and test errors introduced by updating
the API dependency to use native conversion functions from PR #411.

Changes include:

1. Fix GroupVersion type mismatches across the codebase:
   - Updated OwnerReference creation to use schema.GroupVersion
   - Fixed webhook scheme registration to use proper GroupVersion type
   - Applied fixes to addon, placement, migration, work, and registration controllers

2. Enhance addon conversion webhook:
   - Use native API conversion functions from addon/v1beta1/conversion.go
   - Fix InstallNamespace annotation key to match expected format
   - Add custom logic to populate deprecated ConfigReferent field in ConfigReferences
   - Properly preserve annotations during v1alpha1 <-> v1beta1 conversion

3. Remove duplicate conversion code:
   - Deleted pkg/addon/webhook/conversion/ directory (~500 lines)
   - Now using native conversion functions from the API repository

4. Patch vendored addon-framework:
   - Fixed GroupVersion errors in agentdeploy utils

All unit tests pass successfully (97 packages, 0 failures).

Signed-off-by: Qing Hao <qhao@redhat.com>

---------

Signed-off-by: Qing Hao <qhao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-24 08:26:35 +00:00
Jian Qiu
f9d4628f17 Enhance test coverage (#1174)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-09-11 07:26:59 +00:00
Jian Qiu
588f82f48b Refactor webhook to use a common webhook option (#1096)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-07-29 07:38:59 +00:00
Wei Liu
405adb61cd starting grpc server with config file (#1071)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2025-07-16 02:13:37 +00:00
Jian Zhu
a5757b46f3 Fix typo description of the cluster manager operator flags (#1044)
Signed-off-by: zhujian <jiazhu@redhat.com>
2025-06-19 08:06:21 +00:00
Jeffrey
215cfed77e Adding support for enableSyncLabels for clustermanager operator and registration controller (#1021)
Signed-off-by: Jeffrey Wong <jeffreywong0417@gmail.com>
2025-06-12 02:32:36 +00:00
Ankit Kurmi
cd8827572e feat: updated golang to v1.23.6 and related k8s.io packages (#870)
Signed-off-by: Ankit152 <ankitkurmi152@gmail.com>
2025-04-09 07:46:27 +00:00
Jian Qiu
d323b60253 Change the component name to klusterlet-agent (#809)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2025-01-17 10:34:11 +00:00
Qing Hao
4ebe9d7978 🐛 monitor the bootstrap kubeconfig and restart immediately when changes (#630)
* monitor the bootstrap kubeconfig and restart immediately when changes

Signed-off-by: haoqing0110 <qhao@redhat.com>

* fix comments

Signed-off-by: haoqing0110 <qhao@redhat.com>

---------

Signed-off-by: haoqing0110 <qhao@redhat.com>
2024-09-30 06:24:20 +00:00
Jian Qiu
8c1d286b11 Refactor registration (#535)
* Refactor registration

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Fix integration test

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Refactor cert controller to secret controller

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Update health check func

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2024-07-17 14:14:11 +00:00
Ohki Nozomu
1227b71043 Fix typo: Rename 'CommoOpts' to 'CommonOpts' (#523)
Signed-off-by: ohkinozomu <nozomunoise@gmail.com>
2024-06-17 02:21:10 +00:00
Zhiwei Yin
c4b2c65080 add enable-sync-labels flag to klusterlet operator (#505)
Signed-off-by: Zhiwei Yin <zyin@redhat.com>
2024-06-06 15:03:12 +00:00
Jian Qiu
c056181096 Add a disable-default-addon-namespace flag (#484)
* Add a disable-default-addon-namespace flag

if the flag is set, default addon ns will not be created
by the operator.

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Update with comments

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2024-06-03 06:54:15 +00:00
DONG BEIQING
22da639109 configurable controller replicas and master node selector (#468)
* configurable controller replicas and master node selector

Signed-off-by: Dong Beiqing <350758787@qq.com>

* run make fmt-imports

Signed-off-by: Dong Beiqing <350758787@qq.com>

* shorter lines

Signed-off-by: Dong Beiqing <350758787@qq.com>

* rename ControllerReplicas to DeploymentReplicas

Signed-off-by: Dong Beiqing <350758787@qq.com>

* rename masterNodeLabelSelectors to controlPlaneNodeLabels

Signed-off-by: Dong Beiqing <350758787@qq.com>

* rename controlPlaneNodeLabels to controlPlaneNodeLabelSelector

Signed-off-by: Dong Beiqing <350758787@qq.com>

---------

Signed-off-by: Dong Beiqing <350758787@qq.com>
2024-05-21 10:30:38 +00:00
Wei Liu
b1b734aa7a support cloudevents for manifestworkreplicaset (#352)
Signed-off-by: Wei Liu <liuweixa@redhat.com>
2024-03-06 13:17:22 +00:00
Yang Le
9aaa1327fa 🐛 move the rebootstrap logic to registration agent (#267)
Signed-off-by: Yang Le <yangle@redhat.com>
2023-10-18 09:58:06 +00:00
Jian Qiu
88f6f4dd17 Refactor code to start managers with shared informers (#232)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-08-21 00:03:58 -02:30
Jian Qiu
e22faa4545 🌱 Build a commonoption for all managers (#228)
* Build a commonoption for all managers

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Add unit tests

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-07-25 03:12:35 +02:00
Jian Qiu
f7cd1402e9 run work and registration as a single binary (#201)
* run registration/work together

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Fix integration test and lint issue

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Update operator to deploy singleton mode

Signed-off-by: Jian Qiu <jqiu@redhat.com>

* Update deps

Signed-off-by: Jian Qiu <jqiu@redhat.com>

---------

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-07-14 04:56:48 +02:00
Jian Qiu
8c92c70e6d Being able to match multiple items in jsonpath (#202)
This allows a jsonpatch to match multiple items and
returns a list with jsonraw

Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-07-05 04:32:52 -04:00
Jian Zhu
a78d9f457d 🌱 Move addon manager from addon-framework to ocm repo (#196)
* update vendor to add addon-framework

Signed-off-by: zhujian <jiazhu@redhat.com>

* Move addon manager from addon-framework to ocm repo

Signed-off-by: zhujian <jiazhu@redhat.com>

* add integration tests for addon manager

Signed-off-by: zhujian <jiazhu@redhat.com>

* push addon manager image post commit

Signed-off-by: zhujian <jiazhu@redhat.com>

* use library-go to refactor addon controllers

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2023-06-27 03:59:54 +02:00
Jian Zhu
7332a585c0 🌱 add a verify rule for golang files import order (#177)
* 🌱 add a verify rule for golang files import order

This PR uses the [gci tool](https://github.com/daixiang0/gci) to enforce a specific order on the import section of all Go files. Imports are organized into groups in this order:
1. standard library modules
2. 3rd party modules
3. modules in OCM org, like the `open-cluster-management.io/api`
4. current project `open-cluster-management.io/ocm` modules

Developers can run `make fmt-imports` to format imports automatically and `make verify-fmt-imports` to check for violations.

Signed-off-by: zhujian <jiazhu@redhat.com>

* 🌱 format the go files import

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
2023-06-12 10:23:04 -04:00
Jian Qiu
62efbf935b Build common options for agent (#163)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-06-08 02:55:43 -04:00
Jian Qiu
116ae8cc28 Refactor version/feature/cmd packages (#148)
Signed-off-by: Jian Qiu <jqiu@redhat.com>
2023-05-30 02:07:32 -04:00