Files
open-cluster-management/pkg/addon/templateagent/template_agent.go
Jian Zhu b506d16cf8
Some checks failed
Post / coverage (push) Failing after 38s
Post / images (amd64, addon-manager) (push) Failing after 33s
Post / images (amd64, placement) (push) Failing after 41s
Post / images (amd64, registration) (push) Failing after 40s
Post / images (amd64, registration-operator) (push) Failing after 38s
Post / images (amd64, work) (push) Failing after 36s
Post / images (arm64, addon-manager) (push) Failing after 35s
Post / images (arm64, placement) (push) Failing after 39s
Post / images (arm64, registration) (push) Failing after 34s
Post / images (arm64, registration-operator) (push) Failing after 33s
Post / images (arm64, work) (push) Failing after 35s
Post / image manifest (addon-manager) (push) Has been skipped
Post / image manifest (placement) (push) Has been skipped
Post / image manifest (registration) (push) Has been skipped
Post / image manifest (registration-operator) (push) Has been skipped
Post / image manifest (work) (push) Has been skipped
Post / trigger clusteradm e2e (push) Has been skipped
Scorecard supply-chain security / Scorecard analysis (push) Failing after 41s
Close stale issues and PRs / stale (push) Failing after 27s
🐛 Fix ManagedClusterAddons not removed when ClusterManagementAddon is deleted (#1160)
* Fix ManagedClusterAddons not removed when ClusterManagementAddon is deleted

The addon template controller was stopping addon managers immediately when
ClusterManagementAddon was deleted, without waiting for pre-delete jobs
to complete or ManagedClusterAddons to be cleaned up via owner reference
cascading deletion.

This change implements the TODO at line 105 by checking if all
ManagedClusterAddons are deleted before stopping the manager. The controller
now uses field selectors to efficiently query for remaining ManagedClusterAddons
and requeues after 10 seconds if any still exist, allowing time for proper
cleanup.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* add e2e test

Signed-off-by: zhujian <jiazhu@redhat.com>

* return err when stopUnusedManagers failed

Signed-off-by: zhujian <jiazhu@redhat.com>

* Address review comments for addon manager deletion fix

- Use lister instead of API client for better performance
- Add named constant for requeue delay
- Fix test cache synchronization issues
- Improve test coverage from 74.7% to 75.6%

Addresses review feedback from Qiujian16 and CodeRabbit.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Fix e2e test timeout for configmap deletion check

Add explicit 180s timeout for pre-delete job configmap cleanup.
The default 90s timeout was insufficient for the deletion workflow.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Improve error logging in template agent

- Replace utilruntime.HandleError with structured logging in CSR functions
- Add more context to error messages for better debugging
- Use logger.Info for template retrieval errors to provide better visibility

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Use ManagedClusterAddonByName index for efficient lookup

- Replace inefficient list-and-filter with indexed lookup
- Add managedClusterAddonIndexer field to controller struct
- Update comment to accurately describe functionality
- Fix unit tests to properly set up the required index

This addresses the PR review feedback to use the existing index
instead of listing all ManagedClusterAddOns and filtering by name.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Remove unused mcaLister field

Since we now use managedClusterAddonIndexer for efficient lookup,
the mcaLister field is no longer needed. This cleanup reduces
memory usage and simplifies the controller structure.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Replace inefficient list-and-filter with indexed lookup in runController

Use managedClusterAddonIndexer.ByIndex() instead of listing all ManagedClusterAddOns
and filtering by name. This provides O(1) indexed lookup instead of O(n) linear scan.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Fix review comments for addon manager deletion

- Fix closure capture bug in controller test by using captured variables
- Fix typo 'copyiedConfig' to 'copiedConfig' in e2e tests

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Optimize ManagedClusterAddOn event handling in addon template controller

Replace filtered event handling with custom event handlers that only trigger
reconciliation when AddOnTemplate configReferences actually change. This
reduces unnecessary reconciliation cycles by using reflect.DeepEqual to
compare config references between old and new objects.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>

* Revert "Optimize ManagedClusterAddOn event handling in addon template controller"

This reverts commit 4649d1b9ac.

Signed-off-by: zhujian <jiazhu@redhat.com>

---------

Signed-off-by: zhujian <jiazhu@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-10 01:30:19 +00:00

352 lines
12 KiB
Go

package templateagent
import (
"context"
"fmt"
"github.com/openshift/library-go/pkg/operator/events"
"github.com/valyala/fasttemplate"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
"k8s.io/client-go/kubernetes"
rbacv1lister "k8s.io/client-go/listers/rbac/v1"
"k8s.io/klog/v2"
"open-cluster-management.io/addon-framework/pkg/addonfactory"
addonconstants "open-cluster-management.io/addon-framework/pkg/addonmanager/constants"
"open-cluster-management.io/addon-framework/pkg/agent"
"open-cluster-management.io/addon-framework/pkg/utils"
addonapiv1alpha1 "open-cluster-management.io/api/addon/v1alpha1"
addonv1alpha1client "open-cluster-management.io/api/client/addon/clientset/versioned"
addoninformers "open-cluster-management.io/api/client/addon/informers/externalversions"
addonlisterv1alpha1 "open-cluster-management.io/api/client/addon/listers/addon/v1alpha1"
clusterv1 "open-cluster-management.io/api/cluster/v1"
)
const (
// Private value keys that are used internally by the addon template controller, should not be exposed to users.
// All private value keys should begin with "__"
NodePlacementPrivateValueKey = "__NODE_PLACEMENT"
RegistriesPrivateValueKey = "__REGISTRIES"
InstallNamespacePrivateValueKey = "__INSTALL_NAMESPACE"
ProxyPrivateValueKey = "__PROXY"
ResourceRequirementsPrivateValueKey = "__RESOURCE_REQUIREMENTS"
)
var PrivateValuesKeys = map[string]struct{}{
NodePlacementPrivateValueKey: {},
RegistriesPrivateValueKey: {},
InstallNamespacePrivateValueKey: {},
ProxyPrivateValueKey: {},
ResourceRequirementsPrivateValueKey: {},
}
// templateBuiltinValues includes the built-in values for crd template agentAddon.
// the values for template config should begin with an uppercase letter, so we need
// to convert it to Values by JsonStructToValues.
// the built-in values can not be overridden by getValuesFuncs
type templateCRDBuiltinValues struct {
ClusterName string `json:"CLUSTER_NAME,omitempty"`
InstallNamespace string `json:"INSTALL_NAMESPACE,omitempty"`
}
// templateDefaultValues includes the default values for crd template agentAddon.
// the values for template config should begin with an uppercase letter, so we need
// to convert it to Values by JsonStructToValues.
// the default values can be overridden by getValuesFuncs
type templateCRDDefaultValues struct {
HubKubeConfigPath string `json:"HUB_KUBECONFIG,omitempty"`
}
type CRDTemplateAgentAddon struct {
logger klog.Logger
getValuesFuncs []addonfactory.GetValuesFunc
trimCRDDescription bool
hubKubeClient kubernetes.Interface
addonClient addonv1alpha1client.Interface
addonLister addonlisterv1alpha1.ManagedClusterAddOnLister
addonTemplateLister addonlisterv1alpha1.AddOnTemplateLister
cmaLister addonlisterv1alpha1.ClusterManagementAddOnLister
rolebindingLister rbacv1lister.RoleBindingLister
addonName string
agentName string
eventRecorder events.Recorder
}
// NewCRDTemplateAgentAddon creates a CRDTemplateAgentAddon instance
func NewCRDTemplateAgentAddon(
ctx context.Context,
addonName string,
hubKubeClient kubernetes.Interface,
addonClient addonv1alpha1client.Interface,
addonInformers addoninformers.SharedInformerFactory,
rolebindingLister rbacv1lister.RoleBindingLister,
recorder events.Recorder,
getValuesFuncs ...addonfactory.GetValuesFunc,
) *CRDTemplateAgentAddon {
a := &CRDTemplateAgentAddon{
logger: klog.FromContext(ctx),
getValuesFuncs: getValuesFuncs,
trimCRDDescription: true,
hubKubeClient: hubKubeClient,
addonClient: addonClient,
addonLister: addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Lister(),
addonTemplateLister: addonInformers.Addon().V1alpha1().AddOnTemplates().Lister(),
cmaLister: addonInformers.Addon().V1alpha1().ClusterManagementAddOns().Lister(),
rolebindingLister: rolebindingLister,
addonName: addonName,
agentName: fmt.Sprintf("%s-agent", addonName),
eventRecorder: recorder,
}
return a
}
func (a *CRDTemplateAgentAddon) Manifests(
cluster *clusterv1.ManagedCluster,
addon *addonapiv1alpha1.ManagedClusterAddOn) ([]runtime.Object, error) {
template, err := a.getDesiredAddOnTemplateInner(addon.Name, addon.Status.ConfigReferences)
if err != nil {
return nil, err
}
if template == nil {
return nil, fmt.Errorf("addon %s/%s template not found in status", addon.Namespace, addon.Name)
}
return a.renderObjects(cluster, addon, template)
}
func (a *CRDTemplateAgentAddon) GetAgentAddonOptions() agent.AgentAddonOptions {
// TODO: consider a new way for developers to define their supported config GVRs
supportedConfigGVRs := []schema.GroupVersionResource{}
for gvr := range utils.BuiltInAddOnConfigGVRs {
supportedConfigGVRs = append(supportedConfigGVRs, gvr)
}
agentAddonOptions := agent.AgentAddonOptions{
AddonName: a.addonName,
HealthProber: &agent.HealthProber{
Type: agent.HealthProberTypeWorkloadAvailability,
},
HostedModeInfoFunc: addonconstants.GetHostedModeInfo,
SupportedConfigGVRs: supportedConfigGVRs,
Registration: &agent.RegistrationOption{
CSRConfigurations: a.TemplateCSRConfigurationsFunc(),
PermissionConfig: a.TemplatePermissionConfigFunc(),
CSRApproveCheck: a.TemplateCSRApproveCheckFunc(),
CSRSign: a.TemplateCSRSignFunc(),
AgentInstallNamespace: a.TemplateAgentRegistrationNamespaceFunc,
},
AgentDeployTriggerClusterFilter: func(old, new *clusterv1.ManagedCluster) bool {
return utils.ClusterImageRegistriesAnnotationChanged(old, new) ||
// if the cluster changes from unknow to true, recheck the health of the addon immediately
utils.ClusterAvailableConditionChanged(old, new)
},
// enable the ConfigCheckEnabled flag to check the configured condition before rendering manifests
ConfigCheckEnabled: true,
}
template, err := a.GetDesiredAddOnTemplate(nil, "", a.addonName)
if err != nil {
utilruntime.HandleError(fmt.Errorf("GetAgentAddonOptions failed to get addon %s template: %v", a.addonName, err))
return agentAddonOptions
}
if template == nil {
utilruntime.HandleError(fmt.Errorf("GetAgentAddonOptions addon %s template is nil", a.addonName))
return agentAddonOptions
}
agentAddonOptions.ManifestConfigs = template.Spec.AgentSpec.ManifestConfigs
return agentAddonOptions
}
func (a *CRDTemplateAgentAddon) renderObjects(
cluster *clusterv1.ManagedCluster,
addon *addonapiv1alpha1.ManagedClusterAddOn,
template *addonapiv1alpha1.AddOnTemplate) ([]runtime.Object, error) {
var objects []runtime.Object
presetValues, configValues, privateValues, err := a.getValues(cluster, addon, template)
if err != nil {
return objects, err
}
a.logger.V(4).Info("Logging presetValues, configValues, and privateValues",
"presetValues", presetValues,
"configValues", configValues,
"privateValues", privateValues)
for _, manifest := range template.Spec.AgentSpec.Workload.Manifests {
t := fasttemplate.New(string(manifest.Raw), "{{", "}}")
manifestStr := t.ExecuteString(configValues)
a.logger.V(4).Info("Addon render result",
"addonNamespace", addon.Namespace,
"addonName", addon.Name,
"renderResult", manifestStr)
object := &unstructured.Unstructured{}
if err := object.UnmarshalJSON([]byte(manifestStr)); err != nil {
return objects, err
}
object, err = a.decorateObject(template, object, presetValues, privateValues)
if err != nil {
return objects, err
}
objects = append(objects, object)
}
additionalObjects, err := a.injectAdditionalObjects(template, presetValues, privateValues)
if err != nil {
return objects, err
}
objects = append(objects, additionalObjects...)
return objects, nil
}
func (a *CRDTemplateAgentAddon) decorateObject(
template *addonapiv1alpha1.AddOnTemplate,
obj *unstructured.Unstructured,
orderedValues orderedValues,
privateValues addonfactory.Values) (*unstructured.Unstructured, error) {
decorators := []decorator{
newDeploymentDecorator(a.logger, a.addonName, template, orderedValues, privateValues),
newDaemonSetDecorator(a.logger, a.addonName, template, orderedValues, privateValues),
newNamespaceDecorator(privateValues),
}
var err error
for _, decorator := range decorators {
obj, err = decorator.decorate(obj)
if err != nil {
return obj, err
}
}
return obj, nil
}
func (a *CRDTemplateAgentAddon) injectAdditionalObjects(
template *addonapiv1alpha1.AddOnTemplate,
orderedValues orderedValues,
privateValues addonfactory.Values) ([]runtime.Object, error) {
injectors := []objectsInjector{
newProxyHandler(a.logger, a.addonName, privateValues),
}
decorators := []decorator{
// decorate the namespace of the additional objects
newNamespaceDecorator(privateValues),
}
var objs []runtime.Object
for _, injector := range injectors {
objects, err := injector.inject()
if err != nil {
return nil, err
}
for _, object := range objects {
// convert the runtime.Object to unstructured.Unstructured
unstructuredMapObj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(object)
if err != nil {
return nil, err
}
unstructuredObject := &unstructured.Unstructured{Object: unstructuredMapObj}
for _, decorator := range decorators {
unstructuredObject, err = decorator.decorate(unstructuredObject)
if err != nil {
return nil, err
}
}
objs = append(objs, unstructuredObject)
}
}
return objs, nil
}
// getDesiredAddOnTemplateInner returns the desired template of the addon,
// if the desired template is not found in the configReferences, it will
// return nil and no error, the caller should handle the nil template case.
func (a *CRDTemplateAgentAddon) getDesiredAddOnTemplateInner(
addonName string, configReferences []addonapiv1alpha1.ConfigReference,
) (*addonapiv1alpha1.AddOnTemplate, error) {
ok, templateRef := AddonTemplateConfigRef(configReferences)
if !ok {
a.logger.Info("Addon template config in addon status is empty", "addonName", addonName)
return nil, nil
}
desiredTemplate := templateRef.DesiredConfig
if desiredTemplate == nil || desiredTemplate.SpecHash == "" {
a.logger.Info("Addon template spec hash is empty", "addonName", addonName)
return nil, fmt.Errorf("addon %s template desired spec hash is empty", addonName)
}
template, err := a.addonTemplateLister.Get(desiredTemplate.Name)
if err != nil {
return nil, err
}
return template.DeepCopy(), nil
}
// TemplateAgentRegistrationNamespaceFunc reads deployment/daemonset resources in the manifests and use that namespace
// as the default registration namespace. If addonDeploymentConfig is set, uses the namespace in it.
func (a *CRDTemplateAgentAddon) TemplateAgentRegistrationNamespaceFunc(
addon *addonapiv1alpha1.ManagedClusterAddOn) (string, error) {
template, err := a.getDesiredAddOnTemplateInner(addon.Name, addon.Status.ConfigReferences)
if err != nil {
return "", err
}
if template == nil {
return "", fmt.Errorf("addon %s template not found in status", addon.Name)
}
// pick the namespace of the first deployment, if there is no deployment, pick the namespace of the first daemonset
var desiredNS = "open-cluster-management-agent-addon"
var firstDeploymentNamespace, firstDaemonSetNamespace string
for _, manifest := range template.Spec.AgentSpec.Workload.Manifests {
object := &unstructured.Unstructured{}
if err := object.UnmarshalJSON(manifest.Raw); err != nil {
a.logger.Error(err, "failed to extract the object")
continue
}
if firstDeploymentNamespace == "" {
if _, err = utils.ConvertToDeployment(object); err == nil {
firstDeploymentNamespace = object.GetNamespace()
break
}
}
if firstDaemonSetNamespace == "" {
if _, err = utils.ConvertToDaemonSet(object); err == nil {
firstDaemonSetNamespace = object.GetNamespace()
}
}
}
if firstDeploymentNamespace != "" {
desiredNS = firstDeploymentNamespace
} else if firstDaemonSetNamespace != "" {
desiredNS = firstDaemonSetNamespace
}
overrideNs, err := utils.AgentInstallNamespaceFromDeploymentConfigFunc(
utils.NewAddOnDeploymentConfigGetter(a.addonClient))(addon)
if err != nil {
return "", err
}
if len(overrideNs) > 0 {
desiredNS = overrideNs
}
return desiredNS, nil
}