mirror of
https://github.com/jpetazzo/container.training.git
synced 2026-05-19 23:36:33 +00:00
540 lines
9.9 KiB
Markdown
540 lines
9.9 KiB
Markdown
# Adding nodes to the cluster
|
|
|
|
- So far, our cluster has only 1 node
|
|
|
|
- Let's see what it takes to add more nodes
|
|
|
|
- We are going to use another set of machines: `kubenet`
|
|
|
|
---
|
|
|
|
## The environment
|
|
|
|
- We have 3 identical machines: `kubenet1`, `kubenet2`, `kubenet3`
|
|
|
|
- The Docker Engine is installed (and running) on these machines
|
|
|
|
- The Kubernetes binaries are installed, but nothing is running
|
|
|
|
- We will use `kubenet1` to run the control plane
|
|
|
|
---
|
|
|
|
## The plan
|
|
|
|
- Start the control plane on `kubenet1`
|
|
|
|
- Join the 3 nodes to the cluster
|
|
|
|
- Deploy and scale a simple web server
|
|
|
|
.lab[
|
|
|
|
- Log into `kubenet1`
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Running the control plane
|
|
|
|
- We will use a Compose file to start the control plane components
|
|
|
|
.lab[
|
|
|
|
- Clone the repository containing the workshop materials:
|
|
```bash
|
|
git clone https://@@GITREPO@@
|
|
```
|
|
|
|
- Go to the `compose/simple-k8s-control-plane` directory:
|
|
```bash
|
|
cd container.training/compose/simple-k8s-control-plane
|
|
```
|
|
|
|
- Start the control plane:
|
|
```bash
|
|
docker-compose up
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Checking the control plane status
|
|
|
|
- Before moving on, verify that the control plane works
|
|
|
|
.lab[
|
|
|
|
- Show control plane component statuses:
|
|
```bash
|
|
kubectl get componentstatuses
|
|
kubectl get cs
|
|
```
|
|
|
|
- Show the (empty) list of nodes:
|
|
```bash
|
|
kubectl get nodes
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
class: extra-details
|
|
|
|
## Differences from `dmuc`
|
|
|
|
- Our new control plane listens on `0.0.0.0` instead of the default `127.0.0.1`
|
|
|
|
- The ServiceAccount admission plugin is disabled
|
|
|
|
---
|
|
|
|
## Joining the nodes
|
|
|
|
- We need to generate a `kubeconfig` file for kubelet
|
|
|
|
- This time, we need to put the public IP address of `kubenet1`
|
|
|
|
(instead of `localhost` or `127.0.0.1`)
|
|
|
|
.lab[
|
|
|
|
- Generate the `kubeconfig` file:
|
|
```bash
|
|
kubectl config set-cluster kubenet --server http://`X.X.X.X`:8080
|
|
kubectl config set-context kubenet --cluster kubenet
|
|
kubectl config use-context kubenet
|
|
cp ~/.kube/config ~/kubeconfig
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Distributing the `kubeconfig` file
|
|
|
|
- We need that `kubeconfig` file on the other nodes, too
|
|
|
|
.lab[
|
|
|
|
- Copy `kubeconfig` to the other nodes:
|
|
```bash
|
|
for N in 2 3; do
|
|
scp ~/kubeconfig kubenet$N:
|
|
done
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Starting kubelet
|
|
|
|
- Reminder: kubelet needs to run as root; don't forget `sudo`!
|
|
|
|
.lab[
|
|
|
|
- Join the first node:
|
|
```bash
|
|
sudo kubelet --kubeconfig ~/kubeconfig
|
|
```
|
|
|
|
- Open more terminals and join the other nodes to the cluster:
|
|
```bash
|
|
ssh kubenet2 sudo kubelet --kubeconfig ~/kubeconfig
|
|
ssh kubenet3 sudo kubelet --kubeconfig ~/kubeconfig
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Checking cluster status
|
|
|
|
- We should now see all 3 nodes
|
|
|
|
- At first, their `STATUS` will be `NotReady`
|
|
|
|
- They will move to `Ready` state after approximately 10 seconds
|
|
|
|
.lab[
|
|
|
|
- Check the list of nodes:
|
|
```bash
|
|
kubectl get nodes
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Deploy a web server
|
|
|
|
- Let's create a Deployment and scale it
|
|
|
|
(so that we have multiple pods on multiple nodes)
|
|
|
|
.lab[
|
|
|
|
- Create a Deployment running `jpetazzo/color`:
|
|
```bash
|
|
kubectl create deployment blue --image=jpetazzo/color
|
|
```
|
|
|
|
- Scale it:
|
|
```bash
|
|
kubectl scale deployment blue --replicas=5
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Check our pods
|
|
|
|
- The pods will be scheduled on the nodes
|
|
|
|
- The nodes will pull the `jpetazzo/color` image, and start the pods
|
|
|
|
- What are the IP addresses of our pods?
|
|
|
|
.lab[
|
|
|
|
- Check the IP addresses of our pods
|
|
```bash
|
|
kubectl get pods -o wide
|
|
```
|
|
|
|
]
|
|
|
|
--
|
|
|
|
🤔 Something's not right ... Some pods have the same IP address!
|
|
|
|
---
|
|
|
|
## What's going on?
|
|
|
|
- Without the `--network-plugin` flag, kubelet defaults to "no-op" networking
|
|
|
|
- It lets the container engine use a default network
|
|
|
|
(in that case, we end up with the default Docker bridge)
|
|
|
|
- Our pods are running on independent, disconnected, host-local networks
|
|
|
|
---
|
|
|
|
## What do we need to do?
|
|
|
|
- On a normal cluster, kubelet is configured to set up pod networking with CNI plugins
|
|
|
|
- This requires:
|
|
|
|
- installing CNI plugins
|
|
|
|
- writing CNI configuration files
|
|
|
|
- running kubelet with `--network-plugin=cni`
|
|
|
|
---
|
|
|
|
## Using network plugins
|
|
|
|
- We need to set up a better network
|
|
|
|
- Before diving into CNI, we will use the `kubenet` plugin
|
|
|
|
- This plugin creates a `cbr0` bridge and connects the containers to that bridge
|
|
|
|
- This plugin allocates IP addresses from a range:
|
|
|
|
- either specified to kubelet (e.g. with `--pod-cidr`)
|
|
|
|
- or stored in the node's `spec.podCIDR` field
|
|
|
|
.footnote[See [here][kubenet-plugin] for more details about this `kubenet` plugin.]
|
|
|
|
[kubenet-plugin]: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#kubenet
|
|
|
|
---
|
|
|
|
## What `kubenet` does and *does not* do
|
|
|
|
- It allocates IP addresses to pods *locally*
|
|
|
|
(each node has its own local subnet)
|
|
|
|
- It connects the pods to a *local* bridge
|
|
|
|
(pods on the same node can communicate together; not with other nodes)
|
|
|
|
- It doesn't set up routing or tunneling
|
|
|
|
(we get pods on separated networks; we need to connect them somehow)
|
|
|
|
- It doesn't allocate subnets to nodes
|
|
|
|
(this can be done manually, or by the controller manager)
|
|
|
|
---
|
|
|
|
## Setting up routing or tunneling
|
|
|
|
- *On each node*, we will add routes to the other nodes' pod network
|
|
|
|
- Of course, this is not convenient or scalable!
|
|
|
|
- We will see better techniques to do this; but for now, hang on!
|
|
|
|
---
|
|
|
|
## Allocating subnets to nodes
|
|
|
|
- There are multiple options:
|
|
|
|
- passing the subnet to kubelet with the `--pod-cidr` flag
|
|
|
|
- manually setting `spec.podCIDR` on each node
|
|
|
|
- allocating node CIDRs automatically with the controller manager
|
|
|
|
- The last option would be implemented by adding these flags to controller manager:
|
|
```
|
|
--allocate-node-cidrs=true --cluster-cidr=<cidr>
|
|
```
|
|
|
|
---
|
|
|
|
class: extra-details
|
|
|
|
## The pod CIDR field is not mandatory
|
|
|
|
- `kubenet` needs the pod CIDR, but other plugins don't need it
|
|
|
|
(e.g. because they allocate addresses in multiple pools, or a single big one)
|
|
|
|
- The pod CIDR field may eventually be deprecated and replaced by an annotation
|
|
|
|
(see [kubernetes/kubernetes#57130](https://github.com/kubernetes/kubernetes/issues/57130))
|
|
|
|
---
|
|
|
|
## Restarting kubelet wih pod CIDR
|
|
|
|
- We need to stop and restart all our kubelets
|
|
|
|
- We will add the `--network-plugin` and `--pod-cidr` flags
|
|
|
|
- We all have a "cluster number" (let's call that `C`) printed on your VM info card
|
|
|
|
- We will use pod CIDR `10.C.N.0/24` (where `N` is the node number: 1, 2, 3)
|
|
|
|
.lab[
|
|
|
|
- Stop all the kubelets (Ctrl-C is fine)
|
|
|
|
- Restart them all, adding `--network-plugin=kubenet --pod-cidr 10.C.N.0/24`
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## What happens to our pods?
|
|
|
|
- When we stop (or kill) kubelet, the containers keep running
|
|
|
|
- When kubelet starts again, it detects the containers
|
|
|
|
.lab[
|
|
|
|
- Check that our pods are still here:
|
|
```bash
|
|
kubectl get pods -o wide
|
|
```
|
|
|
|
]
|
|
|
|
🤔 But our pods still use local IP addresses!
|
|
|
|
---
|
|
|
|
## Recreating the pods
|
|
|
|
- The IP address of a pod cannot change
|
|
|
|
- kubelet doesn't automatically kill/restart containers with "invalid" addresses
|
|
<br/>
|
|
(in fact, from kubelet's point of view, there is no such thing as an "invalid" address)
|
|
|
|
- We must delete our pods and recreate them
|
|
|
|
.lab[
|
|
|
|
- Delete all the pods, and let the ReplicaSet recreate them:
|
|
```bash
|
|
kubectl delete pods --all
|
|
```
|
|
|
|
- Wait for the pods to be up again:
|
|
```bash
|
|
kubectl get pods -o wide -w
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Adding kube-proxy
|
|
|
|
- Let's start kube-proxy to provide internal load balancing
|
|
|
|
- Then see if we can create a Service and use it to contact our pods
|
|
|
|
.lab[
|
|
|
|
- Start kube-proxy:
|
|
```bash
|
|
sudo kube-proxy --kubeconfig ~/.kube/config
|
|
```
|
|
|
|
- Expose our Deployment:
|
|
```bash
|
|
kubectl expose deployment blue --port=80
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Test internal load balancing
|
|
|
|
.lab[
|
|
|
|
- Retrieve the ClusterIP address:
|
|
```bash
|
|
kubectl get svc blue
|
|
```
|
|
|
|
- Send a few requests to the ClusterIP address (with `curl`)
|
|
|
|
]
|
|
|
|
--
|
|
|
|
Sometimes it works, sometimes it doesn't. Why?
|
|
|
|
---
|
|
|
|
## Routing traffic
|
|
|
|
- Our pods have new, distinct IP addresses
|
|
|
|
- But they are on host-local, isolated networks
|
|
|
|
- If we try to ping a pod on a different node, it won't work
|
|
|
|
- kube-proxy merely rewrites the destination IP address
|
|
|
|
- But we need that IP address to be reachable in the first place
|
|
|
|
- How do we fix this?
|
|
|
|
(hint: check the title of this slide!)
|
|
|
|
---
|
|
|
|
## Important warning
|
|
|
|
- The technique that we are about to use doesn't work everywhere
|
|
|
|
- It only works if:
|
|
|
|
- all the nodes are directly connected to each other (at layer 2)
|
|
|
|
- the underlying network allows the IP addresses of our pods
|
|
|
|
- If we are on physical machines connected by a switch: OK
|
|
|
|
- If we are on virtual machines in a public cloud: NOT OK
|
|
|
|
- on AWS, we need to disable "source and destination checks" on our instances
|
|
|
|
- on OpenStack, we need to disable "port security" on our network ports
|
|
|
|
---
|
|
|
|
## Routing basics
|
|
|
|
- We need to tell *each* node:
|
|
|
|
"The subnet 10.C.N.0/24 is located on node N" (for all values of N)
|
|
|
|
- This is how we add a route on Linux:
|
|
```bash
|
|
ip route add 10.C.N.0/24 via W.X.Y.Z
|
|
```
|
|
|
|
(where `W.X.Y.Z` is the internal IP address of node N)
|
|
|
|
- We can see the internal IP addresses of our nodes with:
|
|
```bash
|
|
kubectl get nodes -o wide
|
|
```
|
|
|
|
---
|
|
|
|
## Firewalling
|
|
|
|
- By default, Docker prevents containers from using arbitrary IP addresses
|
|
|
|
(by setting up iptables rules)
|
|
|
|
- We need to allow our containers to use our pod CIDR
|
|
|
|
- For simplicity, we will insert a blanket iptables rule allowing all traffic:
|
|
|
|
`iptables -I FORWARD -j ACCEPT`
|
|
|
|
- This has to be done on every node
|
|
|
|
---
|
|
|
|
## Setting up routing
|
|
|
|
.lab[
|
|
|
|
- Create all the routes on all the nodes
|
|
|
|
- Insert the iptables rule allowing traffic
|
|
|
|
- Check that you can ping all the pods from one of the nodes
|
|
|
|
- Check that you can `curl` the ClusterIP of the Service successfully
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## What's next?
|
|
|
|
- We did a lot of manual operations:
|
|
|
|
- allocating subnets to nodes
|
|
|
|
- adding command-line flags to kubelet
|
|
|
|
- updating the routing tables on our nodes
|
|
|
|
- We want to automate all these steps
|
|
|
|
- We want something that works on all networks
|
|
|
|
???
|
|
|
|
:EN:- Connecting nodes ands pods
|
|
:FR:- Interconnecter les nœuds et les pods
|