Profile creation was moved outside the descheduling cycle in b214c147,
but reconcileInClusterSAToken() still runs only in runFnc(), after
newDescheduler() returns. This leaves the prometheus client nil when
LowNodeUtilization's New() runs, causing "prometheus client not
initialized" at startup.
Avoid failing at plugin creation time if the prometheus
client is not yet available. Instead, usageClientForMetrics() is now
called at the start of every extension point via a resetUsageClient()
helper, so each descheduling cycle picks up the latest client regardless
of when the SA token is reconciled or rotated.
Fixes: https://github.com/kubernetes-sigs/descheduler/issues/1840
Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
Background evictions were completely invisible in metrics: the ignore=true
path caused EvictPod to return before incrementing any counter, leaving
operators with no signal that a background eviction had been triggered or
completed.
Add a "background" result label emitted at eviction request time and a
"success" label emitted from the informer DeleteFunc when the pod is
actually gone. The two labels together give a complete picture:
"background" is recorded at eviction request time and may not have a
matching "success" if the descheduler restarts before the pod is deleted,
while "success" confirms the eviction completed within the same lifecycle.
Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
Move container waiting/terminated state checking from PodLifeTime and
RemovePodsHavingTooManyRestarts into podutil as separate exported helpers:
HasMatchingContainerWaitingState and HasMatchingContainerTerminatedState.
Each plugin composes only the helpers it needs.
When the descheduler is running in dry run mode, restoring the kube client
sandbox may fail. This can be caused by timeouts when waiting for internal
caches to sync. These internal timeouts depend on the cluster size, which
changes over time. There is no reason to cancel the context because of that.
Currently, there is a single prometheus client reconciler for both the
in-cluster and secret-based strategies. The in-cluster reconciling runs in
sync with each descheduling cycle. An in-file token either changes or it
does not; if it changed, a new prometheus client is created. The
secret-based reconciling runs async and watches for secret object changes.
If a secret changes, a new client is created. The internal state of the
reconciler keeps previous connection data for cleanup and checks.
The current reconciler implementation lacks mutually exclusive access,
so data races are possible. The prometheus configuration validation is
performed during every sync. Future refactoring is expected to move
the validation to the creation phase of the reconciler.
The extra unit testing is expected to cover the following scenarios:
- in cluster:
  - in-file token is unchanged: no-op
  - in-file token is changed: client is created or fails to be created
- secret:
  - secret is not found: no client creation, internal state cleared
  - secret is found: if the token changed a new client is created,
    otherwise no-op
- prometheus config validation
- prometheus client injection
Any error during new prom client creation should be followed by closing
the previous connection and resetting the internal state. Yet, the error
handling is currently not that strict, so the extra unit tests keep the
incomplete testing cases as they are.
The tests are also used to make sure that every time a new prometheus
client is created, a descheduling cycle injects a new profile with the
updated prometheus client, so that future refactoring does not introduce
a regression.
The underlying implementation is the same. This only moves the code under a
separate controller that can be unit tested independently of the
descheduler type implementation.