container.training/slides/k8s/multinode.md
Jérôme Petazzoni 02004317ac 🐞 Fix some ambiguous markdown link references
I thought that the links were local to each slide, but...
apparently not. Whoops.
2022-10-24 20:41:23 +02:00


# Adding nodes to the cluster
- So far, our cluster has only 1 node
- Let's see what it takes to add more nodes
- We are going to use another set of machines: `kubenet`
---
## The environment
- We have 3 identical machines: `kubenet1`, `kubenet2`, `kubenet3`
- The Docker Engine is installed (and running) on these machines
- The Kubernetes binaries are installed, but nothing is running
- We will use `kubenet1` to run the control plane
---
## The plan
- Start the control plane on `kubenet1`
- Join the 3 nodes to the cluster
- Deploy and scale a simple web server
.lab[
- Log into `kubenet1`
]
---
## Running the control plane
- We will use a Compose file to start the control plane components
.lab[
- Clone the repository containing the workshop materials:
```bash
git clone https://@@GITREPO@@
```
- Go to the `compose/simple-k8s-control-plane` directory:
```bash
cd container.training/compose/simple-k8s-control-plane
```
- Start the control plane:
```bash
docker-compose up
```
]
---
## Checking the control plane status
- Before moving on, verify that the control plane works
.lab[
- Show control plane component statuses:
```bash
kubectl get componentstatuses
kubectl get cs
```
- Show the (empty) list of nodes:
```bash
kubectl get nodes
```
]
---
class: extra-details
## Differences from `dmuc`
- Our new control plane listens on `0.0.0.0` instead of the default `127.0.0.1`
- The ServiceAccount admission plugin is disabled
---
## Joining the nodes
- We need to generate a `kubeconfig` file for kubelet
- This time, we need to put the public IP address of `kubenet1`
(instead of `localhost` or `127.0.0.1`)
.lab[
- Generate the `kubeconfig` file:
```bash
kubectl config set-cluster kubenet --server http://`X.X.X.X`:8080
kubectl config set-context kubenet --cluster kubenet
kubectl config use-context kubenet
cp ~/.kube/config ~/kubeconfig
```
]
---
## Distributing the `kubeconfig` file
- We need that `kubeconfig` file on the other nodes, too
.lab[
- Copy `kubeconfig` to the other nodes:
```bash
for N in 2 3; do
scp ~/kubeconfig kubenet$N:
done
```
]
---
## Starting kubelet
- Reminder: kubelet needs to run as root; don't forget `sudo`!
.lab[
- Join the first node:
```bash
sudo kubelet --kubeconfig ~/kubeconfig
```
- Open more terminals and join the other nodes to the cluster:
```bash
ssh kubenet2 sudo kubelet --kubeconfig ~/kubeconfig
ssh kubenet3 sudo kubelet --kubeconfig ~/kubeconfig
```
]
---
## Checking cluster status
- We should now see all 3 nodes
- At first, their `STATUS` will be `NotReady`
- They will move to `Ready` state after approximately 10 seconds
.lab[
- Check the list of nodes:
```bash
kubectl get nodes
```
]
---
## Deploy a web server
- Let's create a Deployment and scale it
(so that we have multiple pods on multiple nodes)
.lab[
- Create a Deployment running `jpetazzo/color`:
```bash
kubectl create deployment blue --image=jpetazzo/color
```
- Scale it:
```bash
kubectl scale deployment blue --replicas=5
```
]
---
## Check our pods
- The pods will be scheduled on the nodes
- The nodes will pull the `jpetazzo/color` image, and start the pods
- What are the IP addresses of our pods?
.lab[
- Check the IP addresses of our pods:
```bash
kubectl get pods -o wide
```
]
--
🤔 Something's not right ... Some pods have the same IP address!
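---
class: extra-details

## Spotting duplicate pod IP addresses

A minimal sketch for spotting the problem without reading the table by eye, assuming one pod IP address per line on standard input. The helper name `duplicate_ips` is ours, for illustration only.

```bash
# Read one pod IP address per line on stdin.
# Print each address that appears more than once (blank lines are ignored).
duplicate_ips() {
  sed '/^$/d' | sort | uniq -d
}
```

One possible way to feed it: `kubectl get pods -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}' | duplicate_ips`. Empty output means that every pod has a distinct address.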
---
## What's going on?
- Without the `--network-plugin` flag, kubelet defaults to "no-op" networking
- It lets the container engine use a default network
(in that case, we end up with the default Docker bridge)
- Our pods are running on independent, disconnected, host-local networks
---
## What do we need to do?
- On a normal cluster, kubelet is configured to set up pod networking with CNI plugins
- This requires:
  - installing CNI plugins
  - writing CNI configuration files
  - running kubelet with `--network-plugin=cni`
---
## Using network plugins
- We need to set up a better network
- Before diving into CNI, we will use the `kubenet` plugin
- This plugin creates a `cbr0` bridge and connects the containers to that bridge
- This plugin allocates IP addresses from a range:
  - either specified to kubelet (e.g. with `--pod-cidr`)
  - or stored in the node's `spec.podCIDR` field
.footnote[See [here][kubenet-plugin] for more details about this `kubenet` plugin.]
[kubenet-plugin]: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#kubenet
---
## What `kubenet` does and *does not* do
- It allocates IP addresses to pods *locally*
(each node has its own local subnet)
- It connects the pods to a *local* bridge
(pods on the same node can communicate together; not with other nodes)
- It doesn't set up routing or tunneling
(we get pods on separate networks; we need to connect them somehow)
- It doesn't allocate subnets to nodes
(this can be done manually, or by the controller manager)
---
## Setting up routing or tunneling
- *On each node*, we will add routes to the other nodes' pod network
- Of course, this is not convenient or scalable!
- We will see better techniques to do this; but for now, hang on!
---
## Allocating subnets to nodes
- There are multiple options:
  - passing the subnet to kubelet with the `--pod-cidr` flag
  - manually setting `spec.podCIDR` on each node
  - allocating node CIDRs automatically with the controller manager
- The last option would be implemented by adding these flags to controller manager:
```
--allocate-node-cidrs=true --cluster-cidr=<cidr>
```
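---
class: extra-details

## How many node subnets fit in a cluster CIDR?

A rough worked example of the arithmetic behind automatic allocation. It assumes that each node gets a `/24` (as in the `10.C.N.0/24` scheme used in these slides) and that the cluster CIDR is a `/16` such as `10.C.0.0/16`; both values are assumptions made for illustration, and `node_subnet_count` is a hypothetical helper.

```bash
# Usage: node_subnet_count CLUSTER_PREFIX NODE_PREFIX
# Print how many /NODE_PREFIX subnets fit in one /CLUSTER_PREFIX range.
node_subnet_count() {
  echo $(( 1 << ($2 - $1) ))
}
```

With these assumptions, `node_subnet_count 16 24` prints `256`: one `/24` for each possible value of `N`.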
---
class: extra-details
## The pod CIDR field is not mandatory
- `kubenet` needs the pod CIDR, but other plugins don't need it
(e.g. because they allocate addresses in multiple pools, or a single big one)
- The pod CIDR field may eventually be deprecated and replaced by an annotation
(see [kubernetes/kubernetes#57130](https://github.com/kubernetes/kubernetes/issues/57130))
---
## Restarting kubelet with pod CIDR
- We need to stop and restart all our kubelets
- We will add the `--network-plugin` and `--pod-cidr` flags
- We all have a "cluster number" (let's call that `C`) printed on our VM info card
- We will use pod CIDR `10.C.N.0/24` (where `N` is the node number: 1, 2, 3)
.lab[
- Stop all the kubelets (Ctrl-C is fine)
- Restart them all, adding `--network-plugin=kubenet --pod-cidr 10.C.N.0/24`
]
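---
class: extra-details

## Building the kubelet command line

A possible way to avoid typos when substituting `C` and `N` by hand. This sketch only prints the command, so that it can be checked before running it; `kubelet_command` is a hypothetical helper, and the flags are the ones shown on the previous slide.

```bash
# Usage: kubelet_command C N
#   C  cluster number (from the VM info card)
#   N  node number (1, 2, 3)
# Print the kubelet command line for that node; nothing is executed.
kubelet_command() {
  echo "sudo kubelet --kubeconfig ~/kubeconfig --network-plugin=kubenet --pod-cidr 10.$1.$2.0/24"
}
```

For instance, assuming cluster number 7, `kubelet_command 7 2` prints the command to run on `kubenet2`, ending with `--pod-cidr 10.7.2.0/24`.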
---
## What happens to our pods?
- When we stop (or kill) kubelet, the containers keep running
- When kubelet starts again, it detects the containers
.lab[
- Check that our pods are still here:
```bash
kubectl get pods -o wide
```
]
🤔 But our pods still use local IP addresses!
---
## Recreating the pods
- The IP address of a pod cannot change
- kubelet doesn't automatically kill/restart containers with "invalid" addresses
<br/>
(in fact, from kubelet's point of view, there is no such thing as an "invalid" address)
- We must delete our pods and recreate them
.lab[
- Delete all the pods, and let the ReplicaSet recreate them:
```bash
kubectl delete pods --all
```
- Wait for the pods to be up again:
```bash
kubectl get pods -o wide -w
```
]
---
## Adding kube-proxy
- Let's start kube-proxy to provide internal load balancing
- Then see if we can create a Service and use it to contact our pods
.lab[
- Start kube-proxy:
```bash
sudo kube-proxy --kubeconfig ~/.kube/config
```
- Expose our Deployment:
```bash
kubectl expose deployment blue --port=80
```
]
---
## Test internal load balancing
.lab[
- Retrieve the ClusterIP address:
```bash
kubectl get svc blue
```
- Send a few requests to the ClusterIP address (with `curl`)
]
--
Sometimes it works, sometimes it doesn't. Why?
---
## Routing traffic
- Our pods have new, distinct IP addresses
- But they are on host-local, isolated networks
- If we try to ping a pod on a different node, it won't work
- kube-proxy merely rewrites the destination IP address
- But we need that IP address to be reachable in the first place
- How do we fix this?
(hint: check the title of this slide!)
---
## Important warning
- The technique that we are about to use doesn't work everywhere
- It only works if:
  - all the nodes are directly connected to each other (at layer 2)
  - the underlying network allows the IP addresses of our pods
- If we are on physical machines connected by a switch: OK
- If we are on virtual machines in a public cloud: NOT OK
  - on AWS, we need to disable "source and destination checks" on our instances
  - on OpenStack, we need to disable "port security" on our network ports
---
## Routing basics
- We need to tell *each* node:
"The subnet 10.C.N.0/24 is located on node N" (for all values of N)
- This is how we add a route on Linux:
```bash
ip route add 10.C.N.0/24 via W.X.Y.Z
```
(where `W.X.Y.Z` is the internal IP address of node N)
- We can see the internal IP addresses of our nodes with:
```bash
kubectl get nodes -o wide
```
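---
class: extra-details

## Generating the routes for one node

The rule above can be sketched as a small loop. This is an illustration under assumptions, not the only way to do it: `print_routes` is a hypothetical helper, the internal IP addresses are passed by hand in node order, and the commands are only printed so that they can be reviewed first.

```bash
# Usage: print_routes C SELF IP1 IP2 IP3 ...
#   C     cluster number
#   SELF  number of the node that the routes are for (its own subnet is skipped)
#   IPn   internal IP address of node n, listed in node order
# Print one "ip route add" command per remote node; nothing is executed.
print_routes() {
  c=$1
  self=$2
  shift 2
  n=1
  for ip in "$@"; do
    if [ "$n" -ne "$self" ]; then
      echo "ip route add 10.$c.$n.0/24 via $ip"
    fi
    n=$((n + 1))
  done
}
```

With made-up addresses, `print_routes 7 1 10.0.0.11 10.0.0.12 10.0.0.13` prints the two routes that node 1 needs (to `10.7.2.0/24` and `10.7.3.0/24`). Adding routes requires root, so the output can be piped into `sudo sh` once it looks right.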
---
## Firewalling
- By default, Docker prevents containers from using arbitrary IP addresses
(by setting up iptables rules)
- We need to allow our containers to use our pod CIDR
- For simplicity, we will insert a blanket iptables rule allowing all traffic:
`iptables -I FORWARD -j ACCEPT`
- This has to be done on every node
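---
class: extra-details

## Applying the rule on every node

Since the rule has to be repeated on each node, one option is to generate the commands from the list of node names. This is a sketch: `print_forward_rules` is a hypothetical helper, and it assumes that the nodes are reachable with `ssh` and that `sudo` works there, as in the earlier kubelet commands.

```bash
# Usage: print_forward_rules NODE...
# Print (but do not run) the command inserting the blanket FORWARD rule on each node.
print_forward_rules() {
  for node in "$@"; do
    echo "ssh $node sudo iptables -I FORWARD -j ACCEPT"
  done
}
```

`print_forward_rules kubenet1 kubenet2 kubenet3` prints three commands, one per node; they can be pasted, or piped into `sh`, after review.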
---
## Setting up routing
.lab[
- Create all the routes on all the nodes
- Insert the iptables rule allowing traffic
- Check that you can ping all the pods from one of the nodes
- Check that you can `curl` the ClusterIP of the Service successfully
]
---
## What's next?
- We did a lot of manual operations:
  - allocating subnets to nodes
  - adding command-line flags to kubelet
  - updating the routing tables on our nodes
- We want to automate all these steps
- We want something that works on all networks
???
:EN:- Connecting nodes and pods
:FR:- Interconnecter les nœuds et les pods