mirror of https://github.com/jpetazzo/container.training.git
synced 2026-02-14 17:49:59 +00:00
🩺 Update healthcheck exercise

(we will use the `rng` service in the dockercoins app)

- Observe the correct behavior of the readiness probe

  (when deploying e.g. an invalid image)

- See what happens when the load increases

- Observe the behavior of the liveness probe

  (spoiler alert: it involves timeouts!)

---

- We want to add healthchecks to the `rng` service in dockercoins

- The `rng` service exhibits an interesting behavior under load:

  *its latency increases (which will cause probes to time out!)*

- We want to see:

  - what happens when the readiness probe fails

  - what happens when the liveness probe fails

  - how to set "appropriate" probes and probe parameters

- First, deploy a new copy of dockercoins

- Then, add a readiness probe on the `rng` service

  (using a simple HTTP check on the `/` route of the service)

- Check what happens when deploying an invalid image for `rng` (e.g. `alpine`)

- Then roll back `rng` to the original image and add a liveness probe

  (with the same parameters)

- Scale up the `worker` service (to 15+ workers) and observe what happens

  (the main steps are sketched as commands below)
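
The corresponding commands might look like this. This is a hedged sketch: the namespace name, the manifest path, and the `rng` container name are assumptions for illustration, not part of the original exercise.

```bash
# Hypothetical walkthrough; namespace, manifest path, and the
# container name "rng" are assumptions for illustration.
kubectl create namespace healthcheck-exercise
kubectl apply -f dockercoins.yaml -n healthcheck-exercise

# Deploy an invalid image for rng (alpine runs no web server):
kubectl set image deployment/rng rng=alpine -n healthcheck-exercise

# Roll back rng to the original image:
kubectl rollout undo deployment/rng -n healthcheck-exercise

# Scale up the worker Deployment to generate load:
kubectl scale deployment/worker --replicas=15 -n healthcheck-exercise
```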
---

## Goal

- *Before* adding the readiness probe:

  updating the image of the `rng` service with `alpine` should break it

- *After* adding the readiness probe:

  updating the image of the `rng` service with `alpine` shouldn't break it

- When adding the liveness probe, nothing special should happen

- Scaling the `worker` service will then cause disruptions

---

## Setup

- First, deploy a new copy of dockercoins

  (for instance, in a brand new namespace)

- Pro tip #1: ping (e.g. with `httping`) the `rng` service at all times

  (see the sketch after this list)

  - it should initially show a few milliseconds latency

  - that will increase when we scale up

  - it will also let us detect when the service goes "boom"

- Pro tip #2: also keep an eye on the web UI
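
One way to run `httping` from inside the cluster (a sketch; it assumes the service is named `rng`, listens on port 80, and that installing `httping` from the Alpine package repository is acceptable):

```bash
# Throwaway pod that pings the rng service over HTTP;
# "rng" resolves through cluster DNS, port 80 is httping's default.
kubectl run pinger --rm -it --image=alpine --restart=Never -- \
  sh -c "apk add --no-cache httping && httping rng"
```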
---

## Readiness

- Add a readiness probe to `rng`

  - this requires editing the pod template in the Deployment manifest

  - use a simple HTTP check on the `/` route of the service

  - keep all other parameters (timeouts, thresholds...) at their default values

- Check what happens when deploying an invalid image for `rng` (e.g. `alpine`)

*(If the probe was set up correctly, the app will continue to work:
Kubernetes won't switch the traffic over to the `alpine` containers,
because they don't pass the readiness probe.)*
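
In the pod template, the addition could look like this. A minimal sketch: the container name, image tag, and port are assumptions about the dockercoins manifests, not verified values.

```yaml
# Hypothetical excerpt of the rng Deployment's pod template.
spec:
  containers:
  - name: rng                    # assumed container name
    image: dockercoins/rng:v0.1  # assumed image tag
    readinessProbe:
      httpGet:
        path: /                  # simple HTTP check on the / route
        port: 80                 # assumed container port
      # timeouts, thresholds, etc. stay at their default values
```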
---

## Readiness under load

- Then roll back `rng` to the original image

- Check what happens when we scale up the `worker` Deployment to 15+ workers

  (get the latency above 1 second)

*(We should now observe intermittent unavailability of the service, i.e. every
30 seconds it will be unreachable for a bit, then come back, then go away again, etc.)*
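
That 30-second rhythm lines up with the probe parameters we left at their defaults; spelled out explicitly (these values are the documented Kubernetes defaults; the port is an assumption, as before):

```yaml
readinessProbe:
  httpGet:
    path: /
    port: 80
  periodSeconds: 10    # probe every 10 seconds (default)
  timeoutSeconds: 1    # replies slower than 1 second count as failures (default)
  failureThreshold: 3  # 3 consecutive failures mark the pod unready (3 x 10s = 30s)
  successThreshold: 1  # one success marks it ready again (default)
```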
---

## Liveness

- Now replace the readiness probe with a liveness probe

- What happens now?

*(At first, the behavior looks the same as with the readiness probe:
the service becomes unreachable, then reachable again, etc.; but there is
a significant difference behind the scenes. What is it?)*
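
The edit itself is small (same assumptions as the sketch above):

```yaml
livenessProbe:         # was: readinessProbe
  httpGet:
    path: /
    port: 80           # assumed container port
```

Watching `kubectl get pods` (in particular the RESTARTS column) and the events in `kubectl describe pod` is a good way to spot the difference.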
---

## Readiness and liveness

- Bonus questions!

  - What happens if we enable both probes at the same time?

  - What strategies can we use so that both probes are useful?
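
For the second question, one possible strategy (an illustration, not the only answer) is to make the liveness probe much more tolerant than the readiness probe, so that overload merely takes the pod out of rotation while a truly wedged container still gets restarted:

```yaml
readinessProbe:
  httpGet:
    path: /
    port: 80
  # defaults: fails fast under load, removing the pod from the Service
livenessProbe:
  httpGet:
    path: /
    port: 80
  timeoutSeconds: 5    # tolerate slow replies
  periodSeconds: 10
  failureThreshold: 6  # only restart after ~1 minute of sustained failure
```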