diff --git a/content/runbooks/kubernetes/KubeCPUOvercommit.md b/content/runbooks/kubernetes/KubeCPUOvercommit.md
new file mode 100644
index 0000000..8641271
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeCPUOvercommit.md
@@ -0,0 +1,45 @@
+---
+title: Kube CPU Overcommit
+weight: 20
+---
+
+# KubeCPUOvercommit
+
+## Meaning
+
+Cluster has overcommitted CPU resource requests for Pods
+and cannot tolerate node failure.
+
+
+Full context
+
+The sum of CPU requests for pods exceeds cluster capacity.
+In case of node failure some pods will not fit in the remaining nodes.
+
+
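The condition described above can be approximated by hand. The following is only a rough sketch, not the alert's actual expression: the function name, the input format, and the choice of "the largest node fails" as the worst case are all assumptions made for illustration.

```python
from typing import Dict


def tolerates_node_failure(allocatable_cpu: Dict[str, float], requested_cpu: float) -> bool:
    """Return True if all CPU requests still fit after the largest node is lost.

    allocatable_cpu maps node name -> allocatable CPU cores (hypothetical input,
    e.g. collected by hand from `kubectl describe node`).
    requested_cpu is the sum of the CPU requests of all pods, in cores.
    """
    if not allocatable_cpu:
        return False  # no nodes at all, nothing can be scheduled
    # Assumption: the worst case is losing the node with the most allocatable CPU.
    remaining = sum(allocatable_cpu.values()) - max(allocatable_cpu.values())
    return requested_cpu <= remaining


# Three nodes with 4 cores each: 8 cores remain after one node fails.
nodes = {"node-a": 4.0, "node-b": 4.0, "node-c": 4.0}
print(tolerates_node_failure(nodes, 7.0))  # 7 cores requested, fits in 8
print(tolerates_node_failure(nodes, 9.0))  # 9 cores requested, does not fit
```

This compares aggregate numbers only. It does not account for how individual pods pack onto individual nodes, so a real cluster may run out of room even when this check passes.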
+
+## Impact
+
+The cluster cannot tolerate node failure. In the event of a node failure, some Pods will be in `Pending` state.
+
+## Diagnosis
+
+- Check if CPU resource requests are adjusted to the app usage
+- Check if some nodes are available and not cordoned
+- Check if cluster-autoscaler has issues with adding new nodes
+
+## Mitigation
+
+- Add more nodes to the cluster - usually it is better to have more, smaller
+  nodes than a few bigger ones.
+
+- Add different node pools with different instance types to avoid problems
+  when relying on only one instance type in the cloud.
+
+- Use pod priorities to keep important services from losing performance,
+  see [pod priority and preemption](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/)
+
+- Fine-tune settings for special pods used with [cluster-autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-cluster-autoscaler-work-with-pod-priority-and-preemption)
+
+- Prepare performance tests for the expected workload and plan cluster capacity
+  accordingly.
diff --git a/content/runbooks/kubernetes/KubeCPUQuotaOvercommit.md b/content/runbooks/kubernetes/KubeCPUQuotaOvercommit.md
new file mode 100644
index 0000000..f629012
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeCPUQuotaOvercommit.md
@@ -0,0 +1,39 @@
+---
+title: Kube CPU Quota Overcommit
+weight: 20
+---
+
+# KubeCPUQuotaOvercommit
+
+## Meaning
+
+Cluster has overcommitted CPU resource requests for Namespaces and cannot tolerate node failure.
+
+## Impact
+
+In the event of a node failure, some Pods will be in `Pending` state due to a lack of available CPU resources.
+
+## Diagnosis
+
+- Check if CPU resource requests are adjusted to the app usage
+- Check if some nodes are available and not cordoned
+- Check if cluster-autoscaler has issues with adding new nodes
+- Check if the given namespace's usage grows over time more than expected
+
+## Mitigation
+
+- Review the existing quota for the given namespace and adjust it accordingly.
+
+- Add more nodes to the cluster - usually it is better to have more, smaller
+  nodes than a few bigger ones.
+
+- Add different node pools with different instance types to avoid problems
+  when relying on only one instance type in the cloud.
+
+- Use pod priorities to keep important services from losing performance,
+  see [pod priority and preemption](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/)
+
+- Fine-tune settings for special pods used with [cluster-autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-cluster-autoscaler-work-with-pod-priority-and-preemption)
+
+- Prepare performance tests for the expected workload and plan cluster capacity
+  accordingly.
diff --git a/content/runbooks/kubernetes/KubeDaemonSetMisScheduled.md b/content/runbooks/kubernetes/KubeDaemonSetMisScheduled.md
new file mode 100644
index 0000000..d0433f6
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeDaemonSetMisScheduled.md
@@ -0,0 +1,35 @@
+---
+title: Kube DaemonSet MisScheduled
+weight: 20
+---
+
+# KubeDaemonSetMisScheduled
+
+## Meaning
+
+A number of DaemonSet pods are running where they are not supposed to run.
+
+## Impact
+
+Service degradation or unavailability.
+Excessive resource usage; the resources could be used by other apps.
+
+## Diagnosis
+
+Usually happens when the pod nodeSelector/tolerations/affinities are wrong, or when
+nodes (node pools) were tainted and existing pods were not scheduled for eviction.
+
+- Check daemonset status via `kubectl -n $NAMESPACE describe daemonset $NAME`.
+- Check [DaemonSet update strategy](https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/)
+- Check the status of the pods which belong to the daemonset.
+- Check pod template parameters such as:
+  - pod priority - maybe it was evicted by other, more important pods
+  - affinity rules - maybe due to affinities and not enough nodes it is not
+    possible to schedule pods
+- Check node taints and labels
+- Check logs for [node-feature-discovery](https://kubernetes-sigs.github.io/node-feature-discovery/master/get-started/index.html)
+  and other supporting tools such as gpu-feature-discovery
+
+## Mitigation
+
+Update the DaemonSet and apply the change; delete pods manually.
diff --git a/content/runbooks/kubernetes/KubeDaemonSetNotScheduled.md b/content/runbooks/kubernetes/KubeDaemonSetNotScheduled.md
new file mode 100644
index 0000000..e58e5f9
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeDaemonSetNotScheduled.md
@@ -0,0 +1,45 @@
+---
+title: Kube DaemonSet Not Scheduled
+weight: 20
+---
+
+# KubeDaemonSetNotScheduled
+
+## Meaning
+
+A number of DaemonSet pods are not scheduled.
+
+## Impact
+
+Service degradation or unavailability.
+
+## Diagnosis
+
+Usually happens when the pod tolerations/affinities are wrong or there is a lack of
+resources on the nodes.
+
+- Check daemonset status via `kubectl -n $NAMESPACE describe daemonset $NAME`.
+- Check [DaemonSet update strategy](https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/)
+- Check the status of the pods which belong to the daemonset.
+- Check pod template parameters such as:
+  - pod priority - maybe it was evicted by other, more important pods
+  - resources - maybe it tries to use an unavailable resource, such as a GPU, when
+    there is a limited number of nodes with GPUs
+  - affinity rules - maybe due to affinities and not enough nodes it is not
+    possible to schedule pods
+- Check if Horizontal Pod Autoscaler (HPA) is not triggered due to untested
+  values (requests values).
+- Check if cluster-autoscaler is able to create new nodes - see its logs or
+  cluster-autoscaler status configmap.
+
+## Mitigation
+
+Set the priority class of important daemonsets to system-node-critical.
+
+See [DaemonSet rolling update is stuck](https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/#daemonset-rolling-update-is-stuck)
+
+In some rare cases you may need to change node affinities or delete the pod
+manually if this is a special daemonset which has a specific pod priority class
+and is limited to only 1 replica (so it runs on a specific node only).
+
+See [Debugging Pods](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application/#debugging-pods)
diff --git a/content/runbooks/kubernetes/KubeDaemonSetRolloutStuck.md b/content/runbooks/kubernetes/KubeDaemonSetRolloutStuck.md
new file mode 100644
index 0000000..3304c9f
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeDaemonSetRolloutStuck.md
@@ -0,0 +1,44 @@
+---
+title: Kube DaemonSet Rollout Stuck
+weight: 20
+---
+
+# KubeDaemonSetRolloutStuck
+
+## Meaning
+
+DaemonSet update is stuck waiting for the replaced pod.
+
+
+## Impact
+
+Service degradation or unavailability.
+
+## Diagnosis
+
+- Check daemonset status via `kubectl -n $NAMESPACE describe daemonset $NAME`.
+- Check [DaemonSet update strategy](https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/)
+- Check the status of the pods which belong to the daemonset.
+- Check pod template parameters such as:
+  - pod priority - maybe it was evicted by other, more important pods
+  - resources - maybe it tries to use an unavailable resource, such as a GPU, when
+    there is a limited number of nodes with GPUs
+  - affinity rules - maybe due to affinities and not enough nodes it is not
+    possible to schedule pods
+  - pod termination grace period - if too long, pods may stay too long
+    in the terminating state
+- Check if Horizontal Pod Autoscaler (HPA) is not triggered due to untested
+  values (requests values).
+- Check if cluster-autoscaler is able to create new nodes - see its logs or
+  cluster-autoscaler status configmap.
+
+## Mitigation
+
+See [DaemonSet rolling update is stuck](https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/#daemonset-rolling-update-is-stuck)
+
+In some rare cases you may need to change node affinities or delete the pod
+manually if this is a special daemonset
+which has pod priority class system-cluster-critical and is limited to only
+1 replica (so it runs on a specific node only).
+
+See [Debugging Pods](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application/#debugging-pods)
diff --git a/content/runbooks/kubernetes/KubeDeploymentGenerationMismatch.md b/content/runbooks/kubernetes/KubeDeploymentGenerationMismatch.md
new file mode 100644
index 0000000..3635bad
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeDeploymentGenerationMismatch.md
@@ -0,0 +1,51 @@
+---
+title: Kube Deployment Generation Mismatch
+weight: 20
+---
+
+# KubeDeploymentGenerationMismatch
+
+## Meaning
+
+Deployment generation mismatch due to possible roll-back.
+
+## Impact
+
+Service degradation or unavailability.
+
+## Diagnosis
+
+See [Kubernetes Docs - Failed Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#failed-deployment)
+
+- Check rollout history via `kubectl -n $NAMESPACE rollout history deployment $NAME`
+- Check the rollout status to see whether it is paused
+- Check deployment status via `kubectl -n $NAMESPACE describe deployment $NAME`.
+- Check how many replicas are declared.
+- Investigate whether new pods are crashing.
+- Check the status of the pods which belong to the replica sets under the deployment.
+- Check pod template parameters such as:
+  - pod priority - maybe it was evicted by other, more important pods
+  - resources - maybe it tries to use an unavailable resource, such as a GPU,
+    when there is a limited number of nodes with GPUs
+  - affinity rules - maybe due to affinities and not enough nodes it is
+    not possible to schedule pods
+  - pod termination grace period - if too long, pods may stay too long
+    in the terminating state
+- Check if Horizontal Pod Autoscaler (HPA) is not triggered due to untested
+  values (requests values).
+- Check if cluster-autoscaler is able to create new nodes - see its logs or
+  cluster-autoscaler status configmap.
+
+## Mitigation
+
+Depending on the conditions, adding new nodes usually solves the issue.
+
+Otherwise the deployment or HPA definition probably needs to be fixed.
+If you cannot add nodes, you can change the update strategy to `Recreate`.
+Sometimes manually deleting a pod helps :)
+
+In rare cases, roll back to the previous version - see [Kubernetes Docs - Rolling Back](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-back-to-a-previous-revision)
+
+In extremely rare situations, scale the oldest ReplicaSets to 0 and delete them.
+
+See [Debugging Pods](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application/#debugging-pods)
diff --git a/content/runbooks/kubernetes/KubeDeploymentReplicasMismatch.md b/content/runbooks/kubernetes/KubeDeploymentReplicasMismatch.md
new file mode 100644
index 0000000..2e18f5e
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeDeploymentReplicasMismatch.md
@@ -0,0 +1,52 @@
+---
+title: Kube Deployment Replicas Mismatch
+weight: 20
+---
+
+# KubeDeploymentReplicasMismatch
+
+## Meaning
+
+Deployment has not matched the expected number of replicas.
+
+
+Full context
+
+The Kubernetes Deployment resource does not have the number of replicas which were
+declared to be in operation.
+For example, a deployment is expected to have 3 replicas, but it has fewer than
+that for a noticeable period of time.
+
+On rare occasions there may be more replicas than there should be, and the system did
+not clean them up.
+
+
+## Impact
+
+Service degradation or unavailability.
+
+## Diagnosis
+
+- Check deployment status via `kubectl -n $NAMESPACE describe deployment $NAME`.
+- Check how many replicas are declared.
+- Check the status of the pods which belong to the replica sets under the deployment.
+- Check pod template parameters such as:
+  - pod priority - maybe it was evicted by other, more important pods
+  - resources - maybe it tries to use an unavailable resource, such as a GPU,
+    when there is a limited number of nodes with GPUs
+  - affinity rules - maybe due to affinities and not enough nodes it is
+    not possible to schedule pods
+  - pod termination grace period - if too long, pods may stay too long
+    in the terminating state
+- Check if Horizontal Pod Autoscaler (HPA) is not triggered due to untested
+  values (requests values).
+- Check if cluster-autoscaler is able to create new nodes - see its logs or
+  cluster-autoscaler status configmap.
+
+## Mitigation
+
+Depending on the conditions, adding new nodes usually solves the issue.
+
+Otherwise the deployment or HPA definition probably needs to be fixed.
+
+See [Debugging Pods](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application/#debugging-pods)
diff --git a/content/runbooks/kubernetes/KubeHpaMaxedOut.md b/content/runbooks/kubernetes/KubeHpaMaxedOut.md
new file mode 100644
index 0000000..db1defd
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeHpaMaxedOut.md
@@ -0,0 +1,32 @@
+---
+title: Kube HPA Maxed Out
+weight: 20
+---
+
+# KubeHpaMaxedOut
+
+## Meaning
+
+Horizontal Pod Autoscaler has been running at max replicas for longer
+than 15 minutes.
+
+## Impact
+
+Horizontal Pod Autoscaler won't be able to add new pods and thus scale the application.
+**Notice**: for some services, maxing out the HPA is in fact desired.
+
+## Diagnosis
+
+Check why HPA was unable to scale:
+
+- max replicas too low
+- too low a value for requests, such as CPU?
+
+## Mitigation
+
+If using basic metrics like CPU/Memory, make sure to set proper values for
+`requests`.
+For memory-based scaling, ensure there are no memory leaks.
+If using custom metrics, fine-tune how the app scales according to them.
+
+Use performance tests to see how the app scales.
diff --git a/content/runbooks/kubernetes/KubeHpaReplicasMismatch.md b/content/runbooks/kubernetes/KubeHpaReplicasMismatch.md
new file mode 100644
index 0000000..a49e791
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeHpaReplicasMismatch.md
@@ -0,0 +1,28 @@
+---
+title: Kube HPA Replicas Mismatch
+weight: 20
+---
+
+# KubeHpaReplicasMismatch
+
+## Meaning
+
+Horizontal Pod Autoscaler has not matched the desired number of replicas for
+longer than 15 minutes.
+
+## Impact
+
+HPA was unable to schedule the desired number of pods.
+
+## Diagnosis
+
+Check why HPA was unable to scale:
+
+- not enough nodes in the cluster
+- hitting resource quotas in the cluster
+- pods evicted due to pod priority
+
+## Mitigation
+
+When using cluster-autoscaler, you may need to set up preemptive pod pools to
+ensure nodes are created in time.
diff --git a/content/runbooks/kubernetes/KubeJobCompletion.md b/content/runbooks/kubernetes/KubeJobCompletion.md
new file mode 100644
index 0000000..47eea38
--- /dev/null
+++ b/content/runbooks/kubernetes/KubeJobCompletion.md
@@ -0,0 +1,25 @@
+---
+title: Kube Job Completion
+weight: 20
+---
+
+# KubeJobCompletion
+
+## Meaning
+
+Job is taking more than 1h to complete.
+
+## Impact
+
+- Long processing of batch jobs.
+- Possible issues with scheduling the next Job
+
+## Diagnosis
+
+- Check job via `kubectl -n $NAMESPACE describe jobs $JOB`.
+- Check pod events via `kubectl -n $NAMESPACE describe pods -l job-name=$JOB`.
+
+## Mitigation
+
+- Give it more resources so it finishes faster, if applicable.
+- See [Job patterns](https://kubernetes.io/docs/tasks/job/)
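For manual triage, the "more than 1h" condition from the KubeJobCompletion Meaning section can be checked against a Job's status. The sketch below is only an illustration: it assumes the `status` object was obtained with something like `kubectl -n $NAMESPACE get job $JOB -o json`, it relies on the `startTime` and `completionTime` fields of the Job status, and the helper names are hypothetical. The alert itself may be evaluated differently.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value: str) -> datetime:
    """Parse a timestamp such as 2024-01-01T10:00:00Z into an aware datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def job_is_overdue(status: dict, now: datetime,
                   limit: timedelta = timedelta(hours=1)) -> bool:
    """Return True if the Job has started, has not completed, and has run longer than limit."""
    if status.get("completionTime"):
        return False  # the Job already finished
    start = status.get("startTime")
    if start is None:
        return False  # the Job has not started yet
    return now - parse_k8s_time(start) > limit


# Hypothetical status of a Job that started two hours before "now".
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(job_is_overdue({"startTime": "2024-01-01T10:00:00Z"}, now))
```

The one-hour default mirrors the threshold stated in the runbook; pass a different `limit` if the alert is configured otherwise.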