mirror of
https://github.com/fluxcd/flagger.git
synced 2026-04-15 06:57:34 +00:00
Add zero downtime deployments tutorial
This commit is contained in:
@@ -17,3 +17,4 @@
## Tutorials

* [Canaries with Helm charts and GitOps](tutorials/canary-helm-gitops.md)
* [Zero downtime deployments](tutorials/zero-downtime-deployments.md)
175
docs/gitbook/tutorials/zero-downtime-deployments.md
Normal file
@@ -0,0 +1,175 @@
# Zero downtime deployments

This is a list of things you should consider when dealing with a high traffic production environment if you want to
minimise the impact of rolling updates and downscaling.
### Deployment strategy

Limit the number of unavailable pods during a rolling update:

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  progressDeadlineSeconds: 120
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
```

The default progress deadline for a deployment is ten minutes.
You should consider adjusting this value to make the deployment process fail faster.
### Liveness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine if
your app has transitioned to a broken state from which it can't recover and needs to be restarted.

```yaml
livenessProbe:
  exec:
    command:
      - wget
      - --quiet
      - --tries=1
      - --timeout=4
      - --spider
      - http://localhost:8080/healthz
  timeoutSeconds: 5
  initialDelaySeconds: 5
```

If you've enabled mTLS, you'll have to use `exec` for liveness and readiness checks since
kubelet is not part of the service mesh and doesn't have access to the TLS cert.
### Readiness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine if
your app is ready to receive traffic.

```yaml
readinessProbe:
  exec:
    command:
      - wget
      - --quiet
      - --tries=1
      - --timeout=4
      - --spider
      - http://localhost:8080/readyz
  timeoutSeconds: 5
  initialDelaySeconds: 5
  periodSeconds: 5
```

If your app depends on external services, you should check if those services are available before allowing Kubernetes
to route traffic to an app instance. Keep in mind that the Envoy sidecar can have a slower startup than your app,
so on application start you should retry any external connection for at least a couple of seconds.
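The startup retry described above can be sketched as a small helper. This is a minimal sketch in Python; the host, port, and timing values are illustrative assumptions, not part of the Flagger docs:

```python
import socket
import time


def wait_for(host: str, port: int, deadline: float = 30.0, interval: float = 1.0) -> bool:
    """Retry a TCP connection until it succeeds or the deadline passes.

    Returns True once the dependency accepts a connection, False on timeout.
    """
    stop = time.monotonic() + deadline
    while True:
        try:
            # A successful connect means the dependency (and the sidecar's
            # outbound routing) is ready; close immediately, we only probe.
            socket.create_connection((host, port), timeout=2).close()
            return True
        except OSError:
            if time.monotonic() >= stop:
                return False
            time.sleep(interval)
```

An app would call this from its entrypoint and exit non-zero on failure, so Kubernetes restarts the pod instead of routing traffic to a half-initialised instance.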

### Graceful shutdown

Before a pod gets terminated, Kubernetes sends a `SIGTERM` signal to every container and waits for a period of
time (30s by default) for all containers to exit gracefully. If your app doesn't handle the `SIGTERM` signal or if it
doesn't exit within the grace period, Kubernetes will kill the container and any in-flight requests that your app is
processing will fail.

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                command:
                  - sleep
                  - "10"
```

Your app container should have a `preStop` hook that delays the container shutdown.
This will allow the service mesh to drain the traffic and remove this pod from all other Envoy sidecars before your app
becomes unavailable.
### Resource requests and limits

Setting CPU and memory requests/limits for all workloads is a mandatory step if you're running a production system.
Without limits your nodes could run out of memory or become unresponsive due to CPU exhaustion.
Without CPU and memory requests,
the Kubernetes scheduler will not be able to make decisions about which nodes to place pods on.

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: app
          resources:
            limits:
              cpu: 1000m
              memory: 1Gi
            requests:
              cpu: 100m
              memory: 128Mi
```

Note that without resource requests the horizontal pod autoscaler can't determine when to scale your app.
### Autoscaling

A production environment should be able to handle traffic bursts without impacting the quality of service.
This can be achieved with Kubernetes autoscaling capabilities.
Autoscaling in Kubernetes has two dimensions: the Cluster Autoscaler that deals with node scaling operations and
the Horizontal Pod Autoscaler that automatically scales the number of pods in a deployment.

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageValue: 900m
    - type: Resource
      resource:
        name: memory
        targetAverageValue: 768Mi
```

The above HPA ensures your app will be scaled up before the pods reach the CPU or memory limits.
|
||||
### Ingress retries
|
||||
|
||||
To minimise the impact of downscaling operations you can make use of Envoy retry capabilities.
|
||||
|
||||
```yaml
|
||||
apiVersion: flagger.app/v1alpha3
|
||||
kind: Canary
|
||||
spec:
|
||||
service:
|
||||
port: 9898
|
||||
gateways:
|
||||
- public-gateway.istio-system.svc.cluster.local
|
||||
hosts:
|
||||
- app.example.com
|
||||
appendHeaders:
|
||||
x-envoy-upstream-rq-timeout-ms: "15000"
|
||||
x-envoy-max-retries: "10"
|
||||
x-envoy-retry-on: "gateway-error,connect-failure,refused-stream"
|
||||
```
|
||||
|
||||
When the HPA scales down your app, your users could run into 503 errors.
|
||||
The above configuration will make Envoy retry the HTTP requests that failed due to gateway errors.
|
||||