diff --git a/k8s/consul.yaml b/k8s/consul.yaml
index 2e5bc138..a82d4733 100644
--- a/k8s/consul.yaml
+++ b/k8s/consul.yaml
@@ -1,3 +1,37 @@
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+ name: consul
+ labels:
+ app: consul
+rules:
+ - apiGroups: [""]
+ resources:
+ - pods
+ verbs:
+ - get
+ - list
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+ name: consul
+roleRef:
+ apiGroup: rbac.authorization.k8s.io
+ kind: ClusterRole
+ name: consul
+subjects:
+ - kind: ServiceAccount
+ name: consul
+ namespace: default
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+ name: consul
+ labels:
+ app: consul
+---
apiVersion: v1
kind: Service
metadata:
@@ -24,6 +58,7 @@ spec:
labels:
app: consul
spec:
+ serviceAccountName: consul
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
@@ -37,18 +72,11 @@ spec:
terminationGracePeriodSeconds: 10
containers:
- name: consul
- image: "consul:1.2.2"
- env:
- - name: NAMESPACE
- valueFrom:
- fieldRef:
- fieldPath: metadata.namespace
+ image: "consul:1.4.0"
args:
- "agent"
- "-bootstrap-expect=3"
- - "-retry-join=consul-0.consul.$(NAMESPACE).svc.cluster.local"
- - "-retry-join=consul-1.consul.$(NAMESPACE).svc.cluster.local"
- - "-retry-join=consul-2.consul.$(NAMESPACE).svc.cluster.local"
+ - "-retry-join=provider=k8s label_selector=\"app=consul\""
- "-client=0.0.0.0"
- "-data-dir=/consul/data"
- "-server"
diff --git a/slides/k8s/daemonset.md b/slides/k8s/daemonset.md
index f3bfc54f..a569f438 100644
--- a/slides/k8s/daemonset.md
+++ b/slides/k8s/daemonset.md
@@ -252,38 +252,29 @@ The master node has [taints](https://kubernetes.io/docs/concepts/configuration/t
---
-## What are all these pods doing?
+## Is this working?
-- Let's check the logs of all these `rng` pods
-
-- All these pods have the label `app=rng`:
-
- - the first pod, because that's what `kubectl create deployment` does
- - the other ones (in the daemon set), because we
- *copied the spec from the first one*
-
-- Therefore, we can query everybody's logs using that `app=rng` selector
-
-.exercise[
-
-- Check the logs of all the pods having a label `app=rng`:
- ```bash
- kubectl logs -l app=rng --tail 1
- ```
-
-]
+- Look at the web UI
--
-It appears that *all the pods* are serving requests at the moment.
+- The graph should now go above 10 hashes per second!
+
+--
+
+- It looks like the newly created pods are serving traffic correctly
+
+- How and why did this happen?
+
+ (We didn't do anything special to add them to the `rng` service load balancer!)
---
-## The magic of selectors
+# Labels and selectors
- The `rng` *service* is load balancing requests to a set of pods
-- This set of pods is defined as "pods having the label `app=rng`"
+- That set of pods is defined by the *selector* of the `rng` service
.exercise[
@@ -294,19 +285,60 @@ It appears that *all the pods* are serving requests at the moment.
]
-When we created additional pods with this label, they were
-automatically detected by `svc/rng` and added as *endpoints*
-to the associated load balancer.
+- The selector is `app=rng`
+
+- It means "all the pods having the label `app=rng`"
+
+ (They can have additional labels as well, that's OK!)
---
-## Removing the first pod from the load balancer
+## Selector evaluation
+
+- We can use selectors with many `kubectl` commands
+
+- For instance, with `kubectl get`, `kubectl logs`, `kubectl delete` ... and more
+
+.exercise[
+
+- Get the list of pods matching selector `app=rng`:
+ ```bash
+ kubectl get pods -l app=rng
+ kubectl get pods --selector app=rng
+ ```
+
+]
+
+But ... why do these pods (in particular, the *new* ones) have this `app=rng` label?
+
+---
+
+## Where do labels come from?
+
+- When we create a deployment with `kubectl create deployment rng`,
+
+  this deployment gets the label `app=rng`
+
+- The replica sets created by this deployment also get the label `app=rng`
+
+- The pods created by these replica sets also get the label `app=rng`
+
+- When we created the daemon set from the deployment, we re-used the same spec
+
+- Therefore, the pods created by the daemon set get the same labels
+
+.footnote[Note: when we use `kubectl run stuff`, the label is `run=stuff` instead.]
+
+---
+
+## Updating load balancer configuration
+
+- We would like to remove a pod from the load balancer
- What would happen if we removed that pod, with `kubectl delete pod ...`?
--
- The `replicaset` would re-create it immediately.
+ It would be re-created immediately (by the replica set or the daemon set)
--
@@ -314,90 +346,272 @@ to the associated load balancer.
--
- The `replicaset` would re-create it immediately.
+ It would *also* be re-created immediately
--
- ... Because what matters to the `replicaset` is the number of pods *matching that selector.*
-
---
-
-- But but but ... Don't we have more than one pod with `app=rng` now?
-
---
-
- The answer lies in the exact selector used by the `replicaset` ...
+ Why?!?
---
-## Deep dive into selectors
+## Selectors for replica sets and daemon sets
-- Let's look at the selectors for the `rng` *deployment* and the associated *replica set*
+- The "mission" of a replica set is:
+
+ "Make sure that there is the right number of pods matching this spec!"
+
+- The "mission" of a daemon set is:
+
+ "Make sure that there is a pod matching this spec on each node!"
+
+--
+
+- *In fact,* replica sets and daemon sets do not check pod specifications
+
+- They merely have a *selector*, and they look for pods matching that selector
+
+- Yes, we can fool them by manually creating pods with the "right" labels
+
+- Bottom line: if we remove our `app=rng` label ...
+
+  ... The pod "disappears" for its parent, which re-creates another pod to replace it
+
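+- As an illustrative sketch (names and values are examples), the relevant part of a replica set spec looks like this:
+
+  ```yaml
+    selector:
+      matchLabels:
+        app: rng
+    template:
+      metadata:
+        labels:
+          app: rng    # must match the selector above
+  ```
+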
+---
+
+class: extra-details
+
+## Isolation of replica sets and daemon sets
+
+- Since both the `rng` daemon set and the `rng` replica set use `app=rng` ...
+
+ ... Why don't they "find" each other's pods?
+
+--
+
+- *Replica sets* have a more specific selector, visible with `kubectl describe`
+
+ (It looks like `app=rng,pod-template-hash=abcd1234`)
+
+- *Daemon sets* also have a more specific selector, but it's invisible
+
+ (It looks like `app=rng,controller-revision-hash=abcd1234`)
+
+- As a result, each controller only "sees" the pods it manages
+
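+- Written out as YAML, the replica set's effective selector could look like this (the hash value is an example):
+
+  ```yaml
+    selector:
+      matchLabels:
+        app: rng
+        pod-template-hash: abcd1234
+  ```
+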
+---
+
+## Removing a pod from the load balancer
+
+- Currently, the `rng` service is defined by the `app=rng` selector
+
+- The only way to remove a pod is to remove or change the `app` label
+
+- ... But that will cause another pod to be created instead!
+
+- What's the solution?
+
+--
+
+- We need to change the selector of the `rng` service!
+
+- Let's add another label to that selector (e.g. `enabled=yes`)
+
+---
+
+## Complex selectors
+
+- If a selector specifies multiple labels, they are understood as a logical *AND*
+
+ (In other words: the pods must match all the labels)
+
+- Kubernetes has support for advanced, set-based selectors
+
+ (But these cannot be used with services, at least not yet!)
+
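+- For reference, here is what a set-based selector looks like, e.g. in a deployment (values are illustrative):
+
+  ```yaml
+    selector:
+      matchExpressions:
+        - key: app
+          operator: In
+          values: [rng, hasher]
+  ```
+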
+---
+
+## The plan
+
+1. Add the label `enabled=yes` to all our `rng` pods
+
+2. Update the selector for the `rng` service to also include `enabled=yes`
+
+3. Toggle traffic to a pod by manually adding/removing the `enabled` label
+
+4. Profit!
+
+*Note: if we swap steps 1 and 2, it will cause a short
+service disruption, because there will be a period of time
+during which the service selector won't match any pod.
+During that time, requests to the service will time out.
+By doing things in the order above, we guarantee that there won't
+be any interruption.*
+
+---
+
+## Adding labels to pods
+
+- We want to add the label `enabled=yes` to all pods that have `app=rng`
+
+- We could edit each pod one by one with `kubectl edit` ...
+
+- ... Or we could use `kubectl label` to label them all
+
+- `kubectl label` can use selectors itself
.exercise[
-- Show detailed information about the `rng` deployment:
+- Add `enabled=yes` to all pods that have `app=rng`:
```bash
- kubectl describe deploy rng
+ kubectl label pods -l app=rng enabled=yes
```
-- Show detailed information about the `rng` replica:
-
-  (The second command doesn't require you to get the exact name of the replica set)
+]
+
+---
+
+## Updating the service selector
+
+- We need to edit the service specification
+
+- Reminder: in the service definition, we will see `app: rng` in two places
+
+ - the label of the service itself (we don't need to touch that one)
+
+ - the selector of the service (that's the one we want to change)
+
+.exercise[
+
+- Update the service to add `enabled: yes` to its selector:
```bash
- kubectl describe rs rng-yyyyyyyy
- kubectl describe rs -l app=rng
+ kubectl edit service rng
```
+
+
]
--
-The replica set selector also has a `pod-template-hash`, unlike the pods in our daemon set.
+... And then we get *the weirdest error ever.* Why?
---
-# Updating a service through labels and selectors
+## When the YAML parser is being too smart
-- What if we want to drop the `rng` deployment from the load balancer?
+- YAML parsers try to help us:
-- Option 1:
+ - `xyz` is the string `"xyz"`
- - destroy it
+ - `42` is the integer `42`
-- Option 2:
+ - `yes` is the boolean value `true`
- - add an extra *label* to the daemon set
+- If we want the string `"42"` or the string `"yes"`, we have to quote them
- - update the service *selector* to refer to that *label*
+- So we have to use `enabled: "yes"`
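+
+- The selector of the `rng` service should then look like this:
+
+  ```yaml
+    selector:
+      app: rng
+      enabled: "yes"
+  ```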
---
-
-Of course, option 2 offers more learning opportunities. Right?
+.footnote[For a good laugh: if we had used "ja", "oui", "si" ... as the value, it would have worked!]
---
-## Add an extra label to the daemon set
+## Updating the service selector, take 2
-- We will update the daemon set "spec"
+.exercise[
-- Option 1:
+- Update the service to add `enabled: "yes"` to its selector:
+ ```bash
+ kubectl edit service rng
+ ```
- - edit the `rng.yml` file that we used earlier
+
- - load the new definition with `kubectl apply`
+]
-- Option 2:
+This time it should work!
- - use `kubectl edit`
-
---
-
-*If you feel like you got this, feel free to try directly.*
-
-*We've included a few hints on the next slides for your convenience!*
+If we did everything correctly, the web UI shouldn't show any change.
---
+## Updating labels
+
+- We want to disable the pod that was created by the deployment
+
+- All we have to do, is remove the `enabled` label from that pod
+
+- To identify that pod, we can use its name
+
+- ... Or rely on the fact that it's the only one with a `pod-template-hash` label
+
+- Good to know:
+
+ - `kubectl label ... foo=` doesn't remove a label (it sets it to an empty string)
+
+ - to remove label `foo`, use `kubectl label ... foo-`
+
+ - to change an existing label, we would need to add `--overwrite`
+
+---
+
+## Removing a pod from the load balancer
+
+.exercise[
+
+- In one window, check the logs of that pod:
+ ```bash
+ POD=$(kubectl get pod -l app=rng,pod-template-hash -o name)
+ kubectl logs --tail 1 --follow $POD
+  ```
+ (We should see a steady stream of HTTP logs)
+
+- In another window, remove the label from the pod:
+ ```bash
+ kubectl label pod -l app=rng,pod-template-hash enabled-
+ ```
+ (The stream of HTTP logs should stop immediately)
+
+]
+
+There might be a slight change in the web UI (since we removed a bit
+of capacity from the `rng` service). If we remove more pods,
+the effect should be more visible.
+
+---
+
+class: extra-details
+
+## Updating the daemon set
+
+- If we scale up our cluster by adding new nodes, the daemon set will create more pods
+
+- These pods won't have the `enabled=yes` label
+
+- If we want these pods to have that label, we need to edit the daemon set spec
+
+- We can do that with e.g. `kubectl edit daemonset rng`
+
+---
+
+class: extra-details
+
## We've put resources in your resources
- Reminder: a daemon set is a resource that creates more resources!
@@ -410,7 +624,9 @@ Of course, option 2 offers more learning opportunities. Right?
- the label(s) of the resource(s) created by the first resource (in the `template` block)
-- You need to update the selector and the template (metadata labels are not mandatory)
+- We would need to update the selector and the template
+
+ (metadata labels are not mandatory)
- The template must match the selector
@@ -418,175 +634,6 @@ Of course, option 2 offers more learning opportunities. Right?
---
-## Adding our label
-
-- Let's add a label `isactive: yes`
-
-- In YAML, `yes` should be quoted; i.e. `isactive: "yes"`
-
-.exercise[
-
-- Update the daemon set to add `isactive: "yes"` to the selector and template label:
- ```bash
- kubectl edit daemonset rng
- ```
-
-
-
-- Update the service to add `isactive: "yes"` to its selector:
- ```bash
- kubectl edit service rng
- ```
-
-
-
-]
-
----
-
-## Checking what we've done
-
-.exercise[
-
-- Check the most recent log line of all `app=rng` pods to confirm that exactly one per node is now active:
- ```bash
- kubectl logs -l app=rng --tail 1
- ```
-
-]
-
-The timestamps should give us a hint about how many pods are currently receiving traffic.
-
-.exercise[
-
-- Look at the pods that we have right now:
- ```bash
- kubectl get pods
- ```
-
-]
-
----
-
-## Cleaning up
-
-- The pods of the deployment and the "old" daemon set are still running
-
-- We are going to identify them programmatically
-
-.exercise[
-
-- List the pods with `app=rng` but without `isactive=yes`:
- ```bash
- kubectl get pods -l app=rng,isactive!=yes
- ```
-
-- Remove these pods:
- ```bash
- kubectl delete pods -l app=rng,isactive!=yes
- ```
-
-]
-
----
-
-## Cleaning up stale pods
-
-```
-$ kubectl get pods
-NAME READY STATUS RESTARTS AGE
-rng-54f57d4d49-7pt82 1/1 Terminating 0 51m
-rng-54f57d4d49-vgz9h 1/1 Running 0 22s
-rng-b85tm 1/1 Terminating 0 39m
-rng-hfbrr 1/1 Terminating 0 39m
-rng-vplmj 1/1 Running 0 7m
-rng-xbpvg 1/1 Running 0 7m
-[...]
-```
-
-- The extra pods (noted `Terminating` above) are going away
-
-- ... But a new one (`rng-54f57d4d49-vgz9h` above) was restarted immediately!
-
---
-
-- Remember, the *deployment* still exists, and makes sure that one pod is up and running
-
-- If we delete the pod associated to the deployment, it is recreated automatically
-
----
-
-## Deleting a deployment
-
-.exercise[
-
-- Remove the `rng` deployment:
- ```bash
- kubectl delete deployment rng
- ```
-]
-
---
-
-- The pod that was created by the deployment is now being terminated:
-
-```
-$ kubectl get pods
-NAME READY STATUS RESTARTS AGE
-rng-54f57d4d49-vgz9h 1/1 Terminating 0 4m
-rng-vplmj 1/1 Running 0 11m
-rng-xbpvg 1/1 Running 0 11m
-[...]
-```
-
-Ding, dong, the deployment is dead! And the daemon set lives on.
-
----
-
-## Avoiding extra pods
-
-- When we changed the definition of the daemon set, it immediately created new pods. We had to remove the old ones manually.
-
-- How could we have avoided this?
-
---
-
-- By adding the `isactive: "yes"` label to the pods before changing the daemon set!
-
-- This can be done programmatically with `kubectl patch`:
-
- ```bash
- PATCH='
- metadata:
- labels:
- isactive: "yes"
- '
- kubectl get pods -l app=rng -l controller-revision-hash -o name |
- xargs kubectl patch -p "$PATCH"
- ```
-
----
-
## Labels and debugging
- When a pod is misbehaving, we can delete it: another one will be recreated
diff --git a/slides/k8s/statefulsets.md b/slides/k8s/statefulsets.md
index 35c7fc20..3dc45286 100644
--- a/slides/k8s/statefulsets.md
+++ b/slides/k8s/statefulsets.md
@@ -266,7 +266,9 @@ spec:
---
-## Stateful sets in action
+# Running a Consul cluster
+
+- Here is a good use-case for Stateful sets!
- We are going to deploy a Consul cluster with 3 nodes
@@ -294,42 +296,54 @@ consul agent -data-dir=/consul/data -client=0.0.0.0 -server -ui \
-retry-join=`Y.Y.Y.Y`
```
-- We need to replace X.X.X.X and Y.Y.Y.Y with the addresses of other nodes
+- Replace X.X.X.X and Y.Y.Y.Y with the addresses of other nodes
-- We can specify DNS names, but then they have to be FQDN
-
-- It's OK for a pod to include itself in the list as well
-
-- We can therefore use the same command-line on all nodes (easier!)
+- The same command-line can be used on all nodes (convenient!)
---
-## Discovering the addresses of other pods
+## Cloud Auto-join
-- When a service is created for a stateful set, individual DNS entries are created
+- Since version 1.4.0, Consul can use the Kubernetes API to find its peers
-- These entries are constructed like this:
+- This is called [Cloud Auto-join]
-  `<name of stateful set>-<n>.<name of service>.<namespace>.svc.cluster.local`
+- Instead of passing an IP address, we need to pass a parameter like this:
-- `<n>` is the number of the pod in the set (starting at zero)
+ ```
+ consul agent -retry-join "provider=k8s label_selector=\"app=consul\""
+ ```
-- If we deploy Consul in the default namespace, the names could be:
+- Consul needs to be able to talk to the Kubernetes API
- - `consul-0.consul.default.svc.cluster.local`
- - `consul-1.consul.default.svc.cluster.local`
- - `consul-2.consul.default.svc.cluster.local`
+- We can provide a `kubeconfig` file
+
+- If Consul runs in a pod, it will use the *service account* of the pod
+
+[Cloud Auto-join]: https://www.consul.io/docs/agent/cloud-auto-join.html#kubernetes-k8s-
+
+---
+
+## Setting up Cloud auto-join
+
+- We need to create a service account for Consul
+
+- We need to create a role that can `list` and `get` pods
+
+- We need to bind that role to the service account
+
+- And of course, we need to make sure that Consul pods use that service account
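+
+- For reference, the cluster role in `k8s/consul.yaml` only needs read access to pods:
+
+  ```yaml
+    rules:
+      - apiGroups: [""]
+        resources: [pods]
+        verbs: [get, list]
+  ```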
---
## Putting it all together
-- The file `k8s/consul.yaml` defines a service and a stateful set
+- The file `k8s/consul.yaml` defines the required resources
+
+ (service account, cluster role, cluster role binding, service, stateful set)
- It has a few extra touches:
- - the name of the namespace is injected through an environment variable
-
- a `podAntiAffinity` prevents two pods from running on the same node
   - a `preStop` hook makes the pod leave the cluster when it is shut down gracefully