* add advertiseAddress to NetworkProfileSpec
* use advertiseAddress for --advertise-address
* regenerate CRDs with advertiseAddress
* regenerate API docs
---------
Co-authored-by: bsctl <bsctl@clastix.io>
* feat(deployment): make startup probe failure threshold configurable
Add StartupProbeFailureThreshold field to TenantControlPlane CRD
DeploymentSpec, allowing users to configure how many consecutive
startup probe failures are tolerated before a container is considered
failed. The value is applied to all control plane components
(kube-apiserver, controller-manager, and scheduler).
Defaults to 3 (preserving current behavior). With PeriodSeconds=10,
the total startup timeout equals FailureThreshold * 10 seconds.
Setting this to 30 gives 5 minutes, which is useful for
resource-constrained environments.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* chore: regenerate CRD manifests for startupProbeFailureThreshold
Run `make manifests` to update Helm CRD files with the new
startupProbeFailureThreshold field in DeploymentSpec.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* feat(deployment): expand configurable probes to all probe types
Replace StartupProbeFailureThreshold with a full Probes config
supporting liveness, readiness, and startup probes with configurable
TimeoutSeconds, PeriodSeconds, and FailureThreshold parameters.
Use ptr.Deref for safe pointer dereferencing.
Ref: #471
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* chore: regenerate CRD manifests and API documentation
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* feat(deployment): add per-component probe overrides and expand ProbeSpec
Add cascading probe configuration: global defaults → per-component
overrides (apiServer, controllerManager, scheduler). Expand ProbeSpec
with InitialDelaySeconds and SuccessThreshold fields.
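The cascade can be pictured with the sketch below. It assumes optional fields are modelled as pointers where nil means "not set"; `deref`, `resolve`, and `int32Ptr` are hypothetical helpers, with `deref` standing in for `ptr.Deref` from k8s.io/utils/ptr so the example needs only the standard library.

```go
package main

import "fmt"

// deref returns *p, or def when p is nil. It mirrors the idea behind
// ptr.Deref and is redefined here only to keep the sketch self-contained.
func deref[T any](p *T, def T) T {
	if p == nil {
		return def
	}
	return *p
}

// resolve picks the first value that is set, in this order:
// per-component override, global default, built-in fallback.
func resolve(component, global *int32, fallback int32) int32 {
	return deref(component, deref(global, fallback))
}

// int32Ptr is a small helper for building optional values.
func int32Ptr(v int32) *int32 { return &v }

func main() {
	global := int32Ptr(30)    // value set in the global probe defaults
	apiServer := int32Ptr(60) // value set in the apiServer override

	fmt.Println(resolve(apiServer, global, 3)) // 60
	fmt.Println(resolve(nil, global, 3))       // 30
	fmt.Println(resolve(nil, nil, 3))          // 3
}
```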
Ref: #471
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
* chore: regenerate CRD manifests and API documentation
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
---------
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Co-authored-by: Claude <noreply@anthropic.com>
Previously, when Kamaji created TLSRoutes for the kube-apiserver and
konnectivity, it always set the parentRefs `port` and `sectionName` itself,
overriding any values the user provided in the TenantControlPlane (TCP)
spec. This prevented users from targeting specific Gateway listeners.
This change allows users to explicitly define parentRefs for the
kube-apiserver TLSRoute through `TCP.spec.controlPlane.gateway.parentRefs`.
The konnectivity TLSRoute behavior remains unchanged.
Signed-off-by: Parth Yadav <parth@coredge.io>
* refactor: migrate error packages from pkg/errors to stdlib
Replace github.com/pkg/errors with Go standard library error handling
in foundation error packages:
- internal/datastore/errors: errors.Wrap -> fmt.Errorf with %w
- internal/errors: errors.As -> stdlib errors.As
- controllers/soot/controllers/errors: errors.New -> stdlib errors.New
Part 1 of 4 in the pkg/errors migration.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor: migrate datastore package from pkg/errors to stdlib
Replace github.com/pkg/errors with Go standard library error handling
in the datastore layer:
- connection.go: errors.Wrap -> fmt.Errorf with %w
- datastore.go: errors.Wrap -> fmt.Errorf with %w
- etcd.go: goerrors alias removed, use stdlib errors.As
- nats.go: errors.Wrap/Is/New -> stdlib equivalents
- postgresql.go: goerrors.Wrap -> fmt.Errorf with %w
Part 2 of 4 in the pkg/errors migration.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor: migrate internal packages from pkg/errors to stdlib (partial)
Replace github.com/pkg/errors with Go standard library error handling
in internal packages:
- internal/builders/controlplane: errors.Wrap -> fmt.Errorf
- internal/crypto: errors.Wrap -> fmt.Errorf
- internal/kubeadm: errors.Wrap/Wrapf -> fmt.Errorf
- internal/upgrade: errors.Wrap -> fmt.Errorf
- internal/webhook: errors.Wrap -> fmt.Errorf
Part 3 of 4 in the pkg/errors migration.
Remaining files: internal/resources/*.go (8 files, 42 occurrences)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(resources): migrate from pkg/errors to stdlib
Replace github.com/pkg/errors with Go standard library:
- errors.Wrap(err, msg) → fmt.Errorf("msg: %w", err)
- errors.New(msg) → errors.New(msg) (stdlib)
Files migrated:
- internal/resources/kubeadm_phases.go
- internal/resources/kubeadm_upgrade.go
- internal/resources/kubeadm_utils.go
- internal/resources/datastore/datastore_multitenancy.go
- internal/resources/datastore/datastore_setup.go
- internal/resources/datastore/datastore_storage_config.go
- internal/resources/addons/coredns.go
- internal/resources/addons/kube_proxy.go
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(controllers): migrate from pkg/errors to stdlib
Replace github.com/pkg/errors with Go standard library:
- errors.Wrap(err, msg) → fmt.Errorf("msg: %w", err)
- errors.New(msg) → errors.New(msg) (stdlib)
- errors.Is/As → errors.Is/As (stdlib)
Files migrated:
- controllers/datastore_controller.go
- controllers/kubeconfiggenerator_controller.go
- controllers/tenantcontrolplane_controller.go
- controllers/telemetry_controller.go
- controllers/certificate_lifecycle_controller.go
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(soot): migrate from pkg/errors to stdlib
Replace github.com/pkg/errors with Go standard library:
- errors.Is() now uses stdlib errors.Is()
Files migrated:
- controllers/soot/controllers/kubeproxy.go
- controllers/soot/controllers/migrate.go
- controllers/soot/controllers/coredns.go
- controllers/soot/controllers/konnectivity.go
- controllers/soot/controllers/kubeadm_phase.go
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(api,cmd): migrate from pkg/errors to stdlib
Replace github.com/pkg/errors with Go standard library:
- errors.Wrap(err, msg) → fmt.Errorf("msg: %w", err)
Files migrated:
- api/v1alpha1/tenantcontrolplane_funcs.go
- cmd/utils/k8s_version.go
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* chore: run go mod tidy after pkg/errors migration
The github.com/pkg/errors package moved from direct to indirect
dependency. It remains as an indirect dependency because other
packages in the dependency tree still use it.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(datastore): use errors.Is for sentinel error comparison
The stdlib errors.As looks for an error of a given type in the chain and
needs a pointer to a variable of that type as its target; it does not
compare against an error value. For comparing against sentinel errors
like rpctypes.ErrGRPCUserNotFound, errors.Is should be used instead.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: resolve golangci-lint errors
- Fix GCI import formatting (remove extra blank lines between groups)
- Use errors.Is instead of errors.As for mutex sentinel errors
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(errors): use proper variable declarations for errors.As
The errors.As function requires a pointer to an assignable variable,
not a pointer to a composite literal. The previous pattern
`errors.As(err, &SomeError{})` creates a pointer to a temporary value
which errors.As cannot reliably use for assignment.
This fix declares proper variables for each error type and passes
their addresses to errors.As, ensuring correct error chain matching.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(datastore/etcd): use rpctypes.Error() for gRPC error comparison
The etcd gRPC status errors (ErrGRPCUserNotFound, ErrGRPCRoleNotFound)
cannot be compared directly using errors.Is() because they are wrapped
in gRPC status errors during transmission.
The etcd rpctypes package provides:
- ErrGRPC* constants: server-side gRPC status errors
- Err* constants (without GRPC prefix): client-side comparable errors
- Error() function: converts gRPC errors to comparable EtcdError values
The correct pattern is to use rpctypes.Error(err) to normalize the
received error, then compare against client-side error constants
like rpctypes.ErrUserNotFound.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* feat: add ObservedGeneration to all status types
Add ObservedGeneration field to DataStoreStatus, KubeconfigGeneratorStatus,
and TenantControlPlaneStatus to track which generation the controller has
processed. This enables clients and tools like kstatus to determine if the
controller has reconciled the latest spec changes.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor: follow Cluster API pattern for ObservedGeneration
Move ObservedGeneration setting for TenantControlPlane from intermediate
status updates to the final successful reconciliation completion. This
follows Cluster API conventions where ObservedGeneration indicates the
controller has fully processed the given generation.
Previously, ObservedGeneration was set on every status update during
resource processing, which could mislead clients into thinking the spec
was fully reconciled when the controller was still mid-reconciliation
or had hit a transient error.
Now:
- DataStore: Sets ObservedGeneration before single status update (simple controller)
- KubeconfigGenerator: Sets ObservedGeneration before single status update (simple controller)
- TenantControlPlane: Sets ObservedGeneration only after ALL resources processed successfully
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* test: verify ObservedGeneration equals Generation after reconciliation
Add assertion to e2e test to verify that status.observedGeneration
equals metadata.generation after a TenantControlPlane is successfully
reconciled.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* chore: regenerate CRDs with ObservedGeneration field
Run `make crds` to regenerate CRDs with the new ObservedGeneration
field in status types.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Run make manifests
* Run make apidoc
* Remove rbac role
* Remove webhook manifest
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* docs: update Bitnami cert-manager reference because the images no longer exist
* docs: update MetalLB version
* docs: add Podman-specific command for retrieving GW_IP
This change extends Gateway API support to Konnectivity addons.
When `spec.controlPlane.gateway` is configured and the Konnectivity addon
is enabled, Kamaji automatically creates two TLSRoutes:
1. A control plane TLSRoute (port 6443, sectionName "kube-apiserver")
2. A Konnectivity TLSRoute (port 8132, sectionName "konnectivity-server")
Both routes use the hostname specified in `gateway.hostname` and reference
the same Gateway resource via `parentRefs`, with `port` and `sectionName`
set automatically by Kamaji.
This patch also adds CEL validation to prevent users from specifying
`port` or `sectionName` in Gateway `parentRefs`, as these fields are now
managed automatically by Kamaji.
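A rough sketch of how the two sets of parentRefs could be derived from the same user-supplied references. `parentRef` is a hypothetical, trimmed-down stand-in for the Gateway API type, `withListener` is a hypothetical helper, and the Gateway name is invented; only the ports and section names come from the description above.

```go
package main

import "fmt"

// parentRef is a hypothetical, simplified stand-in for a Gateway API
// parent reference.
type parentRef struct {
	Name        string
	Port        int32
	SectionName string
}

// withListener returns a copy of refs with port and sectionName set,
// leaving the caller's slice unchanged.
func withListener(refs []parentRef, port int32, sectionName string) []parentRef {
	out := make([]parentRef, len(refs))
	for i, ref := range refs {
		ref.Port = port
		ref.SectionName = sectionName
		out[i] = ref
	}
	return out
}

func main() {
	gateway := []parentRef{{Name: "shared-gateway"}} // hypothetical Gateway name

	fmt.Printf("%+v\n", withListener(gateway, 6443, "kube-apiserver"))
	fmt.Printf("%+v\n", withListener(gateway, 8132, "konnectivity-server"))
}
```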
Signed-off-by: Parth Yadav <parth@coredge.io>
* feat: add support for multiple Datastores
* docs: add guide for datastore overrides
* feat(datastore): add e2e test for dataStoreOverrides
* ci: reclaim disk space from runner to fix flaky tests