mirror of
https://github.com/fluxcd/flagger.git
synced 2026-04-15 06:57:34 +00:00
Add zero downtime deployments tutorial
This commit is contained in:
@@ -17,3 +17,4 @@
## Tutorials

* [Canaries with Helm charts and GitOps](tutorials/canary-helm-gitops.md)
* [Zero downtime deployments](tutorials/zero-downtime-deployments.md)
175
docs/gitbook/tutorials/zero-downtime-deployments.md
Normal file
@@ -0,0 +1,175 @@
# Zero downtime deployments

This is a list of things you should consider when dealing with a high traffic production environment if you want to
minimise the impact of rolling updates and downscaling.
### Deployment strategy

Limit the number of unavailable pods during a rolling update:

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  progressDeadlineSeconds: 120
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
```

The default progress deadline for a deployment is ten minutes.
You should consider adjusting this value to make the deployment process fail faster.
### Liveness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine if
your app has transitioned to a broken state from which it can't recover and needs to be restarted.

```yaml
livenessProbe:
  exec:
    command:
      - wget
      - --quiet
      - --tries=1
      - --timeout=4
      - --spider
      - http://localhost:8080/healthz
  timeoutSeconds: 5
  initialDelaySeconds: 5
```

If you've enabled mTLS, you'll have to use `exec` for liveness and readiness checks since
kubelet is not part of the service mesh and doesn't have access to the TLS cert.
### Readiness health check

Your application should expose an HTTP endpoint that Kubernetes can call to determine if
your app is ready to receive traffic.

```yaml
readinessProbe:
  exec:
    command:
      - wget
      - --quiet
      - --tries=1
      - --timeout=4
      - --spider
      - http://localhost:8080/readyz
  timeoutSeconds: 5
  initialDelaySeconds: 5
  periodSeconds: 5
```

If your app depends on external services, you should check if those services are available before allowing Kubernetes
to route traffic to an app instance. Keep in mind that the Envoy sidecar can have a slower startup than your app,
so on application start you should retry any external connection for at least a couple of seconds.
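The startup retry described above can be sketched as a small helper. This is a minimal sketch in Python; the host, port, and timing values are illustrative assumptions, not part of the Flagger docs:

```python
import socket
import time


def wait_for(host: str, port: int, deadline: float = 30.0, interval: float = 1.0) -> bool:
    """Retry a TCP connection until it succeeds or the deadline passes.

    Returns True once the dependency accepts a connection, False on timeout.
    """
    stop = time.monotonic() + deadline
    while True:
        try:
            # A successful connect means the dependency (and the sidecar's
            # outbound routing) is ready; close immediately, we only probe.
            socket.create_connection((host, port), timeout=2).close()
            return True
        except OSError:
            if time.monotonic() >= stop:
                return False
            time.sleep(interval)
```

An app would call this from its entrypoint and exit non-zero on failure, so Kubernetes restarts the pod instead of routing traffic to a half-initialised instance.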

### Graceful shutdown

Before a pod gets terminated, Kubernetes sends a `SIGTERM` signal to every container and waits for a period of
time (30s by default) for all containers to exit gracefully. If your app doesn't handle the `SIGTERM` signal or if it
doesn't exit within the grace period, Kubernetes will kill the container and any in-flight requests that your app is
processing will fail.

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                command:
                  - sleep
                  - "10"
```

Your app container should have a `preStop` hook that delays the container shutdown.
This will allow the service mesh to drain the traffic and remove this pod from all other Envoy sidecars before your app
becomes unavailable.
### Resource requests and limits

Setting CPU and memory requests/limits for all workloads is a mandatory step if you're running a production system.
Without limits your nodes could run out of memory or become unresponsive due to CPU exhaustion.
Without CPU and memory requests,
the Kubernetes scheduler will not be able to make decisions about which nodes to place pods on.

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: app
          resources:
            limits:
              cpu: 1000m
              memory: 1Gi
            requests:
              cpu: 100m
              memory: 128Mi
```

Note that without resource requests the horizontal pod autoscaler can't determine when to scale your app.
### Autoscaling

A production environment should be able to handle traffic bursts without impacting the quality of service.
This can be achieved with Kubernetes autoscaling capabilities.
Autoscaling in Kubernetes has two dimensions: the Cluster Autoscaler that deals with node scaling operations and
the Horizontal Pod Autoscaler that automatically scales the number of pods in a deployment.

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageValue: 900m
    - type: Resource
      resource:
        name: memory
        targetAverageValue: 768Mi
```

The above HPA ensures your app will be scaled up before the pods reach the CPU or memory limits.
|
||||
### Ingress retries
|
||||
|
||||
To minimise the impact of downscaling operations you can make use of Envoy retry capabilities.
|
||||
|
||||
```yaml
|
||||
apiVersion: flagger.app/v1alpha3
|
||||
kind: Canary
|
||||
spec:
|
||||
service:
|
||||
port: 9898
|
||||
gateways:
|
||||
- public-gateway.istio-system.svc.cluster.local
|
||||
hosts:
|
||||
- app.example.com
|
||||
appendHeaders:
|
||||
x-envoy-upstream-rq-timeout-ms: "15000"
|
||||
x-envoy-max-retries: "10"
|
||||
x-envoy-retry-on: "gateway-error,connect-failure,refused-stream"
|
||||
```
|
||||
|
||||
When the HPA scales down your app, your users could run into 503 errors.
|
||||
The above configuration will make Envoy retry the HTTP requests that failed due to gateway errors.
|
||||