2329 Commits

Author SHA1 Message Date
Simone Tiraboschi
e56144c7a2 fix(descheduler): reset prometheus usage client at each extension point
Profile creation was moved outside the descheduling cycle in b214c147,
but reconcileInClusterSAToken() still runs only in runFnc(), after
newDescheduler() returns. This leaves the prometheus client nil when
LowNodeUtilization's New() runs, causing "prometheus client not
initialized" at startup.

Avoid failing at plugin creation time if the prometheus
client is not yet available. Instead, usageClientForMetrics() is now
called at the start of every extension point via a resetUsageClient()
helper, so each descheduling cycle picks up the latest client regardless
of when the SA token is reconciled or rotated.

Fixes: https://github.com/kubernetes-sigs/descheduler/issues/1840

Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
2026-04-29 11:36:32 +02:00
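The per-extension-point refresh described in e56144c7a2 can be sketched roughly as follows. This is a minimal illustration, not the descheduler's code: the commit message only names resetUsageClient(), so its signature and the promClient, handle, plugin, and balance names here are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// promClient is a hypothetical stand-in for a Prometheus API client.
type promClient struct{ token string }

// handle is a hypothetical stand-in for the plugin handle. Its client stays
// nil until the service account token has been reconciled.
type handle struct{ client *promClient }

func (h *handle) prometheusClient() *promClient { return h.client }

// plugin keeps the handle and re-resolves the usage client on demand.
type plugin struct {
	handle *handle
	usage  *promClient
}

// newPlugin does not fail when no client is available yet.
func newPlugin(h *handle) *plugin { return &plugin{handle: h} }

// resetUsageClient picks up whatever client the handle currently exposes.
// The signature is assumed for this sketch.
func (p *plugin) resetUsageClient() error {
	c := p.handle.prometheusClient()
	if c == nil {
		p.usage = nil
		return errors.New("prometheus client not initialized")
	}
	p.usage = c
	return nil
}

// balance plays the role of an extension point: refresh first, then work.
func (p *plugin) balance() error {
	if err := p.resetUsageClient(); err != nil {
		return err
	}
	// A real plugin would query node usage through p.usage here.
	return nil
}

func main() {
	h := &handle{}
	p := newPlugin(h) // succeeds although the client is still nil

	fmt.Println("cycle 1:", p.balance())

	h.client = &promClient{token: "t1"} // token reconciled later
	fmt.Println("cycle 2:", p.balance())
}
```

In this sketch the first cycle reports the missing client instead of failing at construction, and any later cycle sees a client that appeared or was rotated in the meantime.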
Kubernetes Prow Robot
fa8ae489ae Merge pull request #1856 from tiraboschi/background_eviction_metrics
evictions: fix missing observability for background evictions
2026-04-20 15:23:52 +05:30
Simone Tiraboschi
bc0f0354c6 evictions: fix missing observability for background evictions
Background evictions were completely invisible in metrics: the ignore=true
path caused EvictPod to return before incrementing any counter, leaving
operators with no signal that a background eviction had been triggered or
completed.

Add a "background" result label emitted at eviction request time and a
"success" label emitted from the informer DeleteFunc when the pod is
actually gone. The two labels together give a complete picture:
"background" is recorded at eviction request time and may not have a
matching "success" if the descheduler restarts before the pod is deleted,
while "success" confirms the eviction completed within the same lifecycle.

Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
2026-04-20 11:20:58 +02:00
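The two-label scheme from bc0f0354c6 can be illustrated with a small sketch. It assumes a plain in-memory counter in place of the real metrics library, and the resultCounter, evictor, requestBackgroundEviction, and onPodDeleted names are hypothetical; only the "background" and "success" label values come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// resultCounter is a hypothetical stand-in for a counter with a "result" label.
type resultCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newResultCounter() *resultCounter {
	return &resultCounter{counts: map[string]int{}}
}

func (c *resultCounter) inc(result string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[result]++
}

func (c *resultCounter) get(result string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[result]
}

// evictor remembers which pods have a background eviction in flight.
type evictor struct {
	metrics *resultCounter
	mu      sync.Mutex
	pending map[string]bool
}

func newEvictor(m *resultCounter) *evictor {
	return &evictor{metrics: m, pending: map[string]bool{}}
}

// requestBackgroundEviction records "background" at request time.
func (e *evictor) requestBackgroundEviction(pod string) {
	e.mu.Lock()
	e.pending[pod] = true
	e.mu.Unlock()
	e.metrics.inc("background")
}

// onPodDeleted mimics an informer DeleteFunc. It records "success" only for
// pods whose eviction was requested by this process (an assumption here).
func (e *evictor) onPodDeleted(pod string) {
	e.mu.Lock()
	tracked := e.pending[pod]
	delete(e.pending, pod)
	e.mu.Unlock()
	if tracked {
		e.metrics.inc("success")
	}
}

func main() {
	m := newResultCounter()
	e := newEvictor(m)

	e.requestBackgroundEviction("ns/pod-a")
	e.requestBackgroundEviction("ns/pod-b")
	e.onPodDeleted("ns/pod-a") // pod-b is still terminating

	fmt.Println("background:", m.get("background"), "success:", m.get("success"))
}
```

A gap between the two counts then corresponds to evictions that were requested but not yet observed as completed, including any lost across a restart.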
Kubernetes Prow Robot
0bc278a816 Merge pull request #1859 from sammedsingalkar09/master
security: upgrade grpc and otel sdk dependencies
2026-04-18 20:15:36 +05:30
sammedsingalkar09
212b706950 security: upgrade grpc and otel sdk dependencies
Bump gRPC and OpenTelemetry SDK/exporter dependencies to patched releases and refresh vendored modules to address reported vulnerabilities while keeping tracing resource schema versions consistent.

Made-with: Cursor
2026-04-18 13:12:03 +05:30
Kubernetes Prow Robot
8f9d5c607d Merge pull request #1854 from kubernetes-sigs/security/update-trivy-action-v0.35.0
security: Update trivy-action to use sha for v0.35.0
2026-03-23 15:32:17 +05:30
Priyanka Saggu
1ca2edbb59 security: Update trivy-action to v0.35.0
Updates aquasecurity/trivy-action from mutable references to SHA-pinned
version to address security vulnerabilities.

- Updates to v0.35.0 (57a97c7e)
- Pins to specific SHA for immutability
- Addresses issue: aquasecurity/trivy#10425

Signed-off-by: Priyanka Saggu <priyankasaggu11929@gmail.com>
2026-03-22 18:42:48 +01:00
Kubernetes Prow Robot
0fafc09fff Merge pull request #1844 from a7i/extend-podlifetime-transitions
Extend PodLifeTime with condition, exit code, owner kind, and transition time filters
2026-03-07 01:04:21 +05:30
Amir Alavi
a4391ea73b Extract shared container state matching helpers into podutil
Move container waiting/terminated state checking from PodLifeTime and
RemovePodsHavingTooManyRestarts into podutil as separate exported helpers:
HasMatchingContainerWaitingState and HasMatchingContainerTerminatedState.
Each plugin composes only the helpers it needs.
2026-03-06 12:18:05 -05:00
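As a rough illustration of the split described in a4391ea73b, the sketch below composes two independent matchers. The helper names mirror the commit message, but their signatures and the simplified containerStatus types are assumptions; the real helpers operate on the Kubernetes API types.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the Kubernetes container state types.
type waitingState struct{ Reason string }

type terminatedState struct {
	Reason   string
	ExitCode int32
}

type containerStatus struct {
	Waiting    *waitingState
	Terminated *terminatedState
}

// hasMatchingContainerWaitingState reports whether any container is waiting
// for one of the given reasons. The signature is assumed for this sketch.
func hasMatchingContainerWaitingState(statuses []containerStatus, reasons map[string]bool) bool {
	for _, s := range statuses {
		if s.Waiting != nil && reasons[s.Waiting.Reason] {
			return true
		}
	}
	return false
}

// hasMatchingContainerTerminatedState does the same for terminated containers.
func hasMatchingContainerTerminatedState(statuses []containerStatus, reasons map[string]bool) bool {
	for _, s := range statuses {
		if s.Terminated != nil && reasons[s.Terminated.Reason] {
			return true
		}
	}
	return false
}

func main() {
	statuses := []containerStatus{
		{Waiting: &waitingState{Reason: "CrashLoopBackOff"}},
		{Terminated: &terminatedState{Reason: "OOMKilled", ExitCode: 137}},
	}

	// A plugin that only cares about waiting states calls just one helper.
	waiting := hasMatchingContainerWaitingState(statuses, map[string]bool{"CrashLoopBackOff": true})
	// Another plugin may combine both.
	terminated := hasMatchingContainerTerminatedState(statuses, map[string]bool{"Error": true})

	fmt.Println("waiting match:", waiting, "terminated match:", terminated)
}
```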
Amir Alavi
a845ed3b36 Extend PodLifeTime with condition, exit code, owner kind, and transition time filters 2026-03-06 12:17:07 -05:00
Kubernetes Prow Robot
ac815c26f6 Merge pull request #1848 from sammedsingalkar09/master
update golang semconv dependencies
2026-03-06 20:00:22 +05:30
sammedsingalkar09
e76287fbbf update go dependencies 2026-03-06 11:24:00 +05:30
Kubernetes Prow Robot
751ba2e76e Merge pull request #1847 from a7i/fix/upgrade-codeql-action-v4
fix(ci): upgrade codeql-action to v4 and clean up security workflow
2026-03-05 09:06:18 +05:30
Amir Alavi
d82437286b fix(ci): upgrade codeql-action to v4 and clean up security workflow
CodeQL Action v1 and v2 have been deprecated. Update
upload-sarif to v4, remove unnecessary strategy block
(missing required matrix property), and remove invalid
exit-code input from the upload-sarif step.
2026-03-04 22:06:29 -05:00
Kubernetes Prow Robot
905e762603 Merge pull request #1842 from ingvagabund/data-races
fix: resolve detected data races
2026-02-25 21:38:25 +05:30
Jan Chaloupka
cbdab93459 fix: resolve detected data races 2026-02-25 16:38:10 +01:00
Kubernetes Prow Robot
af6e2adf42 Merge pull request #1838 from a7i/helm-icon
Change icon URL in Chart.yaml
2026-02-24 20:05:36 +05:30
Kamlesh Joshi
9bfdbe92e9 Add init containers support to Helm chart (#1826)
This commit adds support for init containers in the descheduler Helm chart,
allowing users to run initialization tasks before the main descheduler
container starts.

Changes:
- Add initContainers field to values.yaml with example usage
- Update deployment.yaml template to render init containers
- Update cronjob.yaml template to render init containers
- Bump chart version from 0.34.0 to 0.34.1

Init containers can be used for various purposes such as:
- Pre-loading configuration from external sources
- Waiting for dependencies to be ready
- Setting up required files or permissions
- Running security scans or compliance checks

Example usage in values.yaml:
initContainers:
  - name: init-config
    image: busybox:1.28
    command: ['sh', '-c', 'echo Initializing && sleep 5']

Signed-off-by: kjoshi <kjoshi@egnyte.com>
2026-02-21 08:03:39 +05:30
Cayla Fauver
9e9595357a Update helm RBAC to account for pvc failure on 0.35.0 (#1836)
* Synchronize helm clusterrole RBAC with base yaml

I noticed in v0.35.

```
E0219 23:53:57.761596       1 reflector.go:204] "Failed to watch" err="failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User \"system:serviceaccount:kube-system:descheduler\" cannot list resource \"persistentvolumeclaims\" in API group \"\" at the cluster scope" logger="UnhandledError" reflector="k8s.io/client-go/informers/factory.go:161" type="*v1.PersistentVolumeClaim"
```

I saw it in rbac.yaml bec9cd38d0/kubernetes/base/rbac.yaml (L38-L40)

So I figured this just needed a bump

* remove dupe

* undo version change
2026-02-21 06:17:38 +05:30
Kubernetes Prow Robot
bec9cd38d0 Merge pull request #1835 from a7i/descheduler-chart-v0.35.0
[v0.35.0] update helm chart
descheduler-helm-chart-0.35.0
2026-02-20 02:37:39 +05:30
Amir Alavi
0d387fc794 [v0.35.0] update helm chart 2026-02-19 11:57:02 -05:00
Kubernetes Prow Robot
2efac6ae8a Merge pull request #1834 from a7i/fix/helm-unittest-plugin-version
fix(ci): pin helm-unittest plugin version and bump chart-testing-action
2026-02-19 21:31:40 +05:30
Amir Alavi
d4013fd80d fix(ci): pin helm-unittest plugin version and bump chart-testing-action
The helm-unittest plugin install was failing with:
  error unmarshaling JSON: while decoding JSON: json: unknown field "platformHooks"

Pin helm-unittest to v1.0.3 and bump chart-testing-action to v2.8.0.
2026-02-19 08:00:09 -05:00
Kubernetes Prow Robot
b49fd27d10 Merge pull request #1830 from davidandreoletti/patch-1
Change annotations condition to deploymentAnnotations for Deployment object annotations
v0.35.0
2026-02-19 14:13:39 +05:30
Kubernetes Prow Robot
ce6bf5b735 Merge pull request #1832 from a7i/v0.35.0-docs-manifests
[v0.35.0] update docs and manifests
2026-02-19 10:23:37 +05:30
Kubernetes Prow Robot
86e96b5b04 Merge pull request #1831 from a7i/amir/CVE-2024-44337
chore: upgrade github.com/gomarkdown/markdown to latest version
2026-02-19 09:37:37 +05:30
Amir Alavi
86070c62c6 [v0.35.0] update docs and manifests 2026-02-18 22:49:59 -05:00
Amir Alavi
118f466290 chore: upgrade github.com/gomarkdown/markdown to latest version
Upgrade github.com/gomarkdown/markdown from v0.0.0-20240328165702-4d01890c35c0
to v0.0.0-20260217112301-37c66b85d6ab (latest as of 2026-02-17)
2026-02-18 22:39:34 -05:00
Kubernetes Prow Robot
0de5bad232 Merge pull request #1827 from a7i/k8s-1.35
[v0.35.0] bump to kubernetes 1.35 deps
2026-02-18 16:31:37 +05:30
David Andreoletti
66075b069a Change annotations condition to deploymentAnnotations 2026-02-18 17:59:03 +08:00
Kubernetes Prow Robot
f587486296 Merge pull request #1829 from ingvagabund/bump-golangci-lint
bump(golangci-lint): update and migrate
2026-02-18 15:03:40 +05:30
Jan Chaloupka
a868c8d129 chore: update the code based on golangci-lint report 2026-02-17 22:01:25 +01:00
Jan Chaloupka
91c1297a54 bump(golangci-lint): update and migrate
golangci-lint migrate to make .golangci.yaml v2 compatible
2026-02-17 21:53:53 +01:00
Kubernetes Prow Robot
760e9cc2e1 Merge pull request #1828 from ingvagabund/extend-list-of-supported-go-versions
chore: extend the list of supported Go versions
2026-02-17 16:27:03 +05:30
Jan Chaloupka
a4ac8447b6 chore: extend the list of supported Go versions 2026-02-17 11:27:24 +01:00
Amir Alavi
a206a88d86 [v0.35.0] bump to kubernetes 1.35 deps
Signed-off-by: Amir Alavi <amiralavi7@gmail.com>
2026-02-15 19:50:50 -05:00
Kubernetes Prow Robot
7221fa7613 Merge pull request #1822 from sammedsingalkar09/master
Update go dependencies to fix vulnerabilities
2026-02-06 01:52:31 +05:30
Kubernetes Prow Robot
6e33b690d7 Merge pull request #1823 from ingvagabund/prom-client-controller
refactor: move prometheus client controller related code under a separate file
2026-02-05 21:14:33 +05:30
Jan Chaloupka
f4718bf928 refactor(prom client controllers): change the one letter receiver into ctrl 2026-02-05 15:54:49 +01:00
Jan Chaloupka
f149f5a083 refactor: move prometheus client controller related code under a separate file 2026-02-05 15:42:47 +01:00
Kubernetes Prow Robot
8de50a8a17 Merge pull request #1815 from ingvagabund/new-profile-under-new-descheduler
feat(pkg/descheduler): create profiles outside the descheduling cycle
2026-02-05 14:36:35 +05:30
Jan Chaloupka
b214c14793 feat(pkg/descheduler): create profiles outside the descheduling cycle 2026-02-04 20:04:31 +01:00
Kubernetes Prow Robot
fc863ff58d Merge pull request #1821 from ingvagabund/prom-client-controllers
refactor(promClientController): split it into two prom client controllers
2026-02-04 22:48:32 +05:30
sammedsingalkar09
f801f34346 update dependencies 2026-02-04 22:32:15 +05:30
Jan Chaloupka
d262c7af44 refactor(TestPromClientControllerSync_EventHandler): be more verbose about the target expectations 2026-02-04 17:44:17 +01:00
Jan Chaloupka
4b5be0a772 feat(prometheus client reconciling): be more strict about clearing the previous connection
This avoids lingering connections that are not expected to be kept, e.g.
when an invalid secret is provided.
2026-02-04 17:16:44 +01:00
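One way to read the stricter clearing in 4b5be0a772 is sketched below: the previous connection is always dropped before a new one is considered. The promConn type, its reconcile method, and the use of an empty token to represent an invalid secret are assumptions for illustration only; the sketch relies on the standard library's http.Transport.CloseIdleConnections.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// promConn holds the currently active HTTP client and its transport.
type promConn struct {
	mu        sync.Mutex
	transport *http.Transport
	client    *http.Client
}

// reconcile replaces the connection for a new token. An empty token stands
// in for an invalid secret (an assumption made for this sketch).
func (c *promConn) reconcile(token string) {
	c.mu.Lock()
	defer c.mu.Unlock()

	// Always drop the previous connection first so nothing stale survives.
	if c.transport != nil {
		c.transport.CloseIdleConnections()
	}
	c.transport, c.client = nil, nil

	if token == "" {
		return // invalid secret: stay disconnected
	}
	c.transport = &http.Transport{}
	c.client = &http.Client{Transport: c.transport}
}

// current returns the active client, or nil when disconnected.
func (c *promConn) current() *http.Client {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.client
}

func main() {
	var c promConn

	c.reconcile("valid-token")
	fmt.Println("connected:", c.current() != nil)

	c.reconcile("") // invalid secret clears the previous connection
	fmt.Println("connected:", c.current() != nil)
}
```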
Jan Chaloupka
29e5a51cb5 refactor(newSecretBasedPromClientController): inline setupPrometheusProvider into newSecretBasedPromClientController 2026-02-04 17:16:15 +01:00
Jan Chaloupka
a91a02cadc refactor(newSecretBasedPromClientController): move prometheus config validation under newSecretBasedPromClientController 2026-02-04 17:14:58 +01:00
Jan Chaloupka
964df4ce95 refactor(promClientController): split it into two prom client controllers 2026-02-04 16:59:13 +01:00
Kubernetes Prow Robot
5fb70c12c7 Merge pull request #1820 from ingvagabund/refactorings
test(token reconciling): have tests initialize the prom client reconciling through the descheduler's bootstrapping entry too
2026-02-04 18:31:59 +05:30