From b5adee271c125a51633e7354972f8c786661270b Mon Sep 17 00:00:00 2001
From: stefanprodan
Date: Sun, 3 Mar 2019 13:24:15 +0200
Subject: [PATCH] Add zero downtime deployments tutorial

---
 docs/gitbook/SUMMARY.md                    |   1 +
 .../tutorials/zero-downtime-deployments.md | 175 ++++++++++++++++++
 2 files changed, 176 insertions(+)
 create mode 100644 docs/gitbook/tutorials/zero-downtime-deployments.md

diff --git a/docs/gitbook/SUMMARY.md b/docs/gitbook/SUMMARY.md
index 0cd9b38b..7109e1ff 100644
--- a/docs/gitbook/SUMMARY.md
+++ b/docs/gitbook/SUMMARY.md
@@ -17,3 +17,4 @@
 ## Tutorials
 
 * [Canaries with Helm charts and GitOps](tutorials/canary-helm-gitops.md)
+* [Zero downtime deployments](tutorials/zero-downtime-deployments.md)

diff --git a/docs/gitbook/tutorials/zero-downtime-deployments.md b/docs/gitbook/tutorials/zero-downtime-deployments.md
new file mode 100644
index 00000000..4f9a5e1b
--- /dev/null
+++ b/docs/gitbook/tutorials/zero-downtime-deployments.md
@@ -0,0 +1,175 @@
# Zero downtime deployments

This is a list of things you should consider when dealing with a high-traffic production environment if you want to
minimise the impact of rolling updates and downscaling.

### Deployment strategy

Limit the number of unavailable pods during a rolling update:

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  progressDeadlineSeconds: 120
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
```

The default progress deadline for a deployment is ten minutes.
You should consider adjusting this value to make the deployment process fail faster.

### Liveness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine whether
your app has transitioned to a broken state from which it can't recover and needs to be restarted.

```yaml
livenessProbe:
  exec:
    command:
    - wget
    - --quiet
    - --tries=1
    - --timeout=4
    - --spider
    - http://localhost:8080/healthz
  timeoutSeconds: 5
  initialDelaySeconds: 5
```

If you've enabled mTLS, you'll have to use `exec` for the liveness and readiness checks since
the kubelet is not part of the service mesh and doesn't have access to the TLS certificates.

### Readiness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine whether
your app is ready to receive traffic.

```yaml
readinessProbe:
  exec:
    command:
    - wget
    - --quiet
    - --tries=1
    - --timeout=4
    - --spider
    - http://localhost:8080/readyz
  timeoutSeconds: 5
  initialDelaySeconds: 5
  periodSeconds: 5
```

If your app depends on external services, you should check that those services are available before allowing Kubernetes
to route traffic to an app instance. Keep in mind that the Envoy sidecar can have a slower startup than your app.
This means that on application start you should retry any external connection for at least a couple of seconds.

### Graceful shutdown

Before a pod gets terminated, Kubernetes sends a `SIGTERM` signal to every container and waits for a period of
time (30 seconds by default) for all containers to exit gracefully. If your app doesn't handle the `SIGTERM` signal or
doesn't exit within the grace period, Kubernetes will kill the container and any in-flight requests that your app is
processing will fail.
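For example, a Go service can trap `SIGTERM` and drain in-flight requests before exiting. Here is a minimal
sketch of such a handler (the port and drain timeout are illustrative, not taken from the Flagger docs):

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}

	// Serve requests in the background.
	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %v", err)
		}
	}()

	// Block until Kubernetes sends SIGTERM (or until Ctrl+C when running locally).
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
	<-stop

	// Stop accepting new connections and wait for in-flight requests to finish,
	// but no longer than the pod's termination grace period.
	ctx, cancel := context.WithTimeout(context.Background(), 25*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("graceful shutdown failed: %v", err)
	}
}
```

On the Kubernetes side, you can extend the grace period and give the mesh time to drain traffic: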
```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: app
        lifecycle:
          preStop:
            exec:
              command:
              - sleep
              - "10"
```

Your app container should have a `preStop` hook that delays the container shutdown.
This allows the service mesh to drain the traffic and remove this pod from all the other Envoy sidecars before your app
becomes unavailable.

### Resource requests and limits

Setting CPU and memory requests/limits for all workloads is a mandatory step if you're running a production system.
Without limits, your nodes could run out of memory or become unresponsive due to CPU exhaustion.
Without CPU and memory requests, the Kubernetes scheduler can't make informed decisions about which nodes to place pods on.

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          limits:
            cpu: 1000m
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 128Mi
```

Note that without resource requests, the horizontal pod autoscaler can't determine when to scale your app.

### Autoscaling

A production environment should be able to handle traffic bursts without impacting the quality of service.
This can be achieved with Kubernetes autoscaling capabilities.
Autoscaling in Kubernetes has two dimensions: the Cluster Autoscaler, which deals with node scaling operations, and
the Horizontal Pod Autoscaler, which automatically scales the number of pods in a deployment.

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageValue: 900m
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 768Mi
```

The above HPA ensures your app will be scaled up before the pods reach the CPU or memory limits.

### Ingress retries

To minimise the impact of downscaling operations, you can make use of Envoy's retry capabilities.

```yaml
apiVersion: flagger.app/v1alpha3
kind: Canary
spec:
  service:
    port: 9898
    gateways:
    - public-gateway.istio-system.svc.cluster.local
    hosts:
    - app.example.com
    appendHeaders:
      x-envoy-upstream-rq-timeout-ms: "15000"
      x-envoy-max-retries: "10"
      x-envoy-retry-on: "gateway-error,connect-failure,refused-stream"
```

When the HPA scales down your app, your users could run into 503 errors.
The above configuration makes Envoy retry the HTTP requests that fail due to gateway errors.
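If you manage the Istio routing objects yourself instead of letting Flagger generate them, a similar retry policy
can be set directly on a VirtualService. This is a rough sketch assuming Istio 1.1+; the `app` destination host and
the retry budget below are illustrative:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app
spec:
  hosts:
  - app.example.com
  gateways:
  - public-gateway.istio-system.svc.cluster.local
  http:
  - route:
    - destination:
        host: app
    # Retry failed requests up to ten times, with a 15s timeout per attempt.
    retries:
      attempts: 10
      perTryTimeout: 15s
      retryOn: gateway-error,connect-failure,refused-stream
```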