
Upgrading clusters

  • It's recommended to run consistent versions across a cluster

    (mostly to have feature parity and latest security updates)

  • It's not mandatory

    (otherwise, cluster upgrades would be a nightmare!)

  • Components can be upgraded one at a time without problems


Checking what we're running

  • It's easy to check the version of the API server

.lab[

  • Log into node oldversion1

  • Check the version of kubectl and of the API server:

    kubectl version
    

]

  • In an HA setup with multiple API servers, they can have different versions

  • Running the command above multiple times can return different values


Node versions

  • It's also easy to check the version of kubelet

.lab[

  • Check node versions (includes kubelet, kernel, container engine):
    kubectl get nodes -o wide
    

]

  • Different nodes can run different kubelet versions

  • Different nodes can run different kernel versions

  • Different nodes can run different container engines


Control plane versions

  • If the control plane is self-hosted (running in pods), we can check it

.lab[

  • Show image versions for all pods in kube-system namespace:
      kubectl --namespace=kube-system get pods -o json \
              | jq -r '
                .items[]
                | [.spec.nodeName, .metadata.name]
                  + 
                  (.spec.containers[].image | split(":"))
                | @tsv
                ' \
              | column -t
    

]


What version are we running anyway?

  • When I say, "I'm running Kubernetes 1.28", is that the version of:

    • kubectl

    • API server

    • kubelet

    • controller manager

    • something else?


Other versions that are important

  • etcd

  • kube-dns or CoreDNS

  • CNI plugin(s)

  • Network controller, network policy controller

  • Container engine

  • Linux kernel


Important questions

  • Should we upgrade the control plane before or after the kubelets?

  • Within the control plane, should we upgrade the API server first or last?

  • How often should we upgrade?

  • How long are versions maintained?

  • All the answers are in the documentation about version skew policy!

  • Let's review the key elements together ...


Kubernetes uses semantic versioning

  • Kubernetes versions look like MAJOR.MINOR.PATCH; e.g. in 1.28.9:

    • MAJOR = 1
    • MINOR = 28
    • PATCH = 9

  • It's always possible to mix and match different PATCH releases

    (e.g. 1.28.9 and 1.28.13 are compatible)

  • It is recommended to run the latest PATCH release

    (but it's mandatory only when there is a security advisory)
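
To relate this scheme to a live cluster, here is a minimal sketch (assuming jq is available, as in the earlier lab) that splits the API server's version string into its components:

    kubectl version -o json \
      | jq -r '.serverVersion.gitVersion
               | ltrimstr("v") | split(".")
               | "MAJOR=\(.[0]) MINOR=\(.[1]) PATCH=\(.[2])"'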


Version skew

  • API server must be at least as recent as its clients (kubelet and control plane)

  • ... Which means it must always be upgraded first

  • All components support a difference of one¹ MINOR version

  • This allows live upgrades (since we can mix e.g. 1.28 and 1.29)

  • It also means that going from 1.28 to 1.30 requires going through 1.29

.footnote[¹Except kubelet, which can be up to three MINOR versions behind the API server (two before Kubernetes 1.28), and kubectl, which can be one MINOR ahead or behind the API server.]
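
Before planning an upgrade, a quick sanity check like this sketch (again assuming jq) puts the API server version next to each node's kubelet version, so the skew is easy to eyeball:

    kubectl version -o json | jq -r '"API server: \(.serverVersion.gitVersion)"'
    kubectl get nodes \
        -o custom-columns=NODE:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion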


Release cycle

  • There is a new PATCH release whenever necessary

    (every few weeks, or "ASAP" when there is a security vulnerability)

  • There is a new MINOR release approximately every 4 months (3 releases per year)

  • At any given time, the three most recent MINOR releases are maintained

  • ... Which means that each MINOR release is maintained for approximately 14 months

    (12 months of standard support, plus about 2 months of maintenance mode)

  • We should expect to upgrade at least every 4 months (on average) to stay current


General guidelines

  • To update a component, use whatever was used to install it

  • If it's a distro package, update that distro package

  • If it's a container or pod, update that container or pod

  • If you used configuration management, update with that
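
For instance, on a Debian/Ubuntu node, a quick way to see what installed a component (a sketch assuming the usual kubeadm layout) is:

    # Did kubelet come from a distro package?
    dpkg -S "$(command -v kubelet)"
    # Is the control plane running as static pods managed by kubelet?
    ls /etc/kubernetes/manifests/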


Know where your binaries come from

  • Sometimes, we need to upgrade quickly

    (when a vulnerability is announced and patched)

  • If we are using an installer, we should:

    • make sure it's using upstream packages

    • or make sure that whatever packages it uses are current

    • make sure we can tell it to pin specific component versions
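
With the official APT packages, for example, we can list which versions the configured repositories offer and pin one explicitly (a sketch; the version string below is only an example):

    apt-cache madison kubeadm
    sudo apt-get install -y kubeadm=1.29.0-1.1   # pick an exact version from the list above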


In practice

  • We are going to update a few cluster components

  • We will change the kubelet version on one node

  • We will change the version of the API server

  • We will work with cluster oldversion (nodes oldversion1, oldversion2, oldversion3)


Updating the API server

  • This cluster has been deployed with kubeadm

  • The control plane runs in static pods

  • These pods are started automatically by kubelet

    (even when kubelet can't contact the API server)

  • They are defined in YAML files in /etc/kubernetes/manifests

    (this path is set in the kubelet configuration, e.g. via the staticPodPath option or the --pod-manifest-path flag)

  • kubelet automatically updates the pods when the files are changed
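
We can double-check both the manifests directory and where kubelet is told to look for it (a sketch assuming the default kubeadm paths):

    sudo ls /etc/kubernetes/manifests/
    sudo grep staticPodPath /var/lib/kubelet/config.yaml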


Changing the API server version

  • We will edit the YAML file to use a different image version

.lab[

  • Log into node oldversion1

  • Check API server version:

    kubectl version
    
  • Edit the API server pod manifest:

    sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
    
  • Look for the image: line, and update it to e.g. v1.30.1

]
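
If we prefer a non-interactive edit, the same change can be sketched with sed (the image tags below are examples; check the current image: line first):

    sudo sed -i 's|kube-apiserver:v1.28.9|kube-apiserver:v1.30.1|' \
        /etc/kubernetes/manifests/kube-apiserver.yaml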


Checking what we've done

  • The API server will be briefly unavailable while kubelet restarts it

.lab[

  • Check the API server version:
    kubectl version
    

]
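
Since the API server drops out for a moment, a small wait loop like this sketch avoids hammering it by hand:

    until kubectl version >/dev/null 2>&1; do sleep 2; done
    kubectl version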


Was that a good idea?

--

No!

--

  • Remember the guideline we gave earlier:

    To update a component, use whatever was used to install it.

  • This control plane was deployed with kubeadm

  • We should use kubeadm to upgrade it!


Updating the whole control plane

  • Let's make it right, and use kubeadm to upgrade the entire control plane

    (note: this is possible only because the cluster was installed with kubeadm)

.lab[

  • Check what will be upgraded:
    sudo kubeadm upgrade plan
    

]

Note 1: kubeadm thinks that our cluster is running 1.30.1.
It is confused by our manual upgrade of the API server!

Note 2: kubeadm itself is still version 1.28.X.
It doesn't know how to upgrade to 1.29.X.


Upgrading kubeadm

  • First things first: we need to upgrade kubeadm

  • The Kubernetes package repositories are now split by minor versions

    (i.e. there is one repository for 1.28, another for 1.29, etc.)

  • This avoids accidentally upgrading from one minor version to another

    (e.g. with unattended upgrades or if packages haven't been held/pinned)

  • We'll need to add the new package repository and unpin packages!


Installing the new packages

  • Edit /etc/apt/sources.list.d/kubernetes.list so that it points to the 1.29 repository

    (or copy it to e.g. kubernetes-1.29.list and edit that copy)

  • apt-get update

  • Now edit (or remove) /etc/apt/preferences.d/kubernetes to unpin the packages

  • apt-get install kubeadm should now upgrade kubeadm correctly! 🎉
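
Put together, the sequence looks roughly like this sketch (assuming the repository file references the 1.28 package repository and that packages were pinned in /etc/apt/preferences.d/kubernetes):

    sudo sed -i s/1.28/1.29/ /etc/apt/sources.list.d/kubernetes.list
    sudo rm /etc/apt/preferences.d/kubernetes
    sudo apt-get update
    sudo apt-get install -y kubeadm
    kubeadm version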


Reverting our manual API server upgrade

  • First, we should revert our image: change

    (so that kubeadm executes the right migration steps)

.lab[

  • Edit the API server pod manifest:

    sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
    
  • Look for the image: line, and restore it to the original value

    (e.g. v1.28.9)

  • Wait for the control plane to come back up

]
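
To watch the API server come back with the original image, a sketch like this helps (kubeadm labels its control plane static pods with component=kube-apiserver):

    kubectl --namespace=kube-system get pods -l component=kube-apiserver \
        -o custom-columns=NAME:.metadata.name,IMAGE:.spec.containers[*].image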


Upgrading the cluster with kubeadm

  • Now we can let kubeadm do its job!

.lab[

  • Check the upgrade plan:

    sudo kubeadm upgrade plan
    
  • Perform the upgrade:

    sudo kubeadm upgrade apply v1.29.0
    

]
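
Afterwards, we can confirm that the whole control plane (not just the API server) now runs the new version; a quick sketch reusing the image-listing trick from earlier:

    kubectl --namespace=kube-system get pods \
        -o custom-columns=NAME:.metadata.name,IMAGE:.spec.containers[*].image \
        | grep -E 'kube-(apiserver|controller-manager|scheduler)|etcd'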


Updating kubelet

  • These nodes have been installed using the official Kubernetes packages

  • We can therefore use apt or apt-get

.lab[

  • Log into node oldversion2

  • Update package lists and APT pins like we did before

  • Then upgrade kubelet

]
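
Concretely, on oldversion2 that boils down to something like this sketch (same repository and pin changes as on the first node):

    sudo sed -i s/1.28/1.29/ /etc/apt/sources.list.d/kubernetes.list
    sudo rm /etc/apt/preferences.d/kubernetes
    sudo apt-get update
    sudo apt-get install -y kubelet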


Checking what we've done

.lab[

  • Log into node oldversion1

  • Check node versions:

    kubectl get nodes -o wide
    
  • Create a deployment and scale it to make sure that the node still works

]
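
For the deployment test, something like this sketch will do (the deployment name and image are arbitrary examples):

    kubectl create deployment testweb --image=nginx
    kubectl scale deployment testweb --replicas=5
    kubectl get pods -o wide
    kubectl delete deployment testweb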


Was that a good idea?

--

Almost!

--

  • Yes, kubelet was installed with distribution packages

  • However, kubeadm took care of configuring kubelet

    (when doing kubeadm join ...)

  • We were supposed to run a special command (kubeadm upgrade node) before upgrading kubelet!

  • That command should be executed on each node

  • It will download the kubelet configuration generated by kubeadm


Upgrading kubelet the right way

  • We need to upgrade kubeadm, upgrade kubelet config, then upgrade kubelet

    (after upgrading the control plane)

.lab[

  • Execute the whole upgrade procedure on each node:
      for N in 1 2 3; do
        ssh oldversion$N "
          sudo sed -i s/1.28/1.29/ /etc/apt/sources.list.d/kubernetes.list &&
          sudo rm /etc/apt/preferences.d/kubernetes &&
          sudo apt update &&
          sudo apt install kubeadm -y &&
          sudo kubeadm upgrade node &&
          sudo apt install kubelet -y"
      done
    

]


Checking what we've done

  • All our nodes should now be updated to version 1.29

.lab[

  • Check node versions:
    kubectl get nodes -o wide
    

]


And now, was that a good idea?

--

Almost!

--

  • The official recommendation is to drain a node before performing node maintenance

    (migrate all workloads off the node before upgrading it)

  • How do we do that?

  • Is it really necessary?

  • Let's see!


Draining a node

  • This can be achieved with the kubectl drain command, which will:

    • cordon the node (prevent new pods from being scheduled there)

    • evict all the pods running on the node (delete them gracefully)

    • the evicted pods will automatically be recreated somewhere else (as long as they are managed by a controller, e.g. a Deployment)

    • evictions might be blocked in some cases (Pod Disruption Budgets, emptyDir volumes)

  • Once the node is drained, it can safely be upgraded, restarted...

  • Once it's ready, it can be put back in commission with kubectl uncordon
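
In practice, that gives something like the following sketch (the extra flags are commonly needed to get past DaemonSet pods and emptyDir volumes):

    kubectl drain oldversion2 --ignore-daemonsets --delete-emptydir-data
    # ...upgrade packages, reboot, etc...
    kubectl uncordon oldversion2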


Is it necessary?

  • When upgrading kubelet from one patch-level version to another:

    • it's probably fine

  • When upgrading system packages:

    • it's probably fine

  • When upgrading the kernel:

    • it's probably fine

    • ...as long as we can tolerate a restart of the containers on the node

    • ...and accept that they will be unavailable for a few minutes (during the reboot)


Is it necessary?

  • When upgrading kubelet from one minor version to another:

    • it may or may not be fine

    • in some cases (e.g. migrating from Docker to containerd) it will not

  • Here's what the documentation says:

    Draining nodes before upgrading kubelet ensures that pods are re-admitted and containers are re-created, which may be necessary to resolve some security issues or other important bugs.

  • Skipping the drain is at your own risk; if you do skip it, test extensively in staging environments!


Database operators to the rescue

  • Moving stateful pods (e.g. database servers) can cause downtime

  • Database replication can help:

    • if a node contains database servers, we make sure these servers aren't primaries

    • if they are primaries, we perform a switchover first

  • Some database operators (e.g. CNPG) will perform that switchover automatically

    (when they detect that a node has been cordoned)


class: extra-details

Skipping versions

  • This example worked because we went from 1.28 to 1.29

  • If you are upgrading from e.g. 1.26, you will have to go through 1.27 first

  • This means upgrading kubeadm to 1.27.X, then using it to upgrade the cluster

  • Then upgrading kubeadm to 1.28.X, etc.

  • Make sure to read the release notes before upgrading!
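
As a rough sketch (assuming the split package repositories seen earlier, and repeating the full control plane and node procedure at each hop), the intermediate steps could look like this:

    PREV=1.26
    for VERSION in 1.27 1.28 1.29; do
      sudo sed -i "s/$PREV/$VERSION/" /etc/apt/sources.list.d/kubernetes.list
      sudo apt-get update
      sudo apt-get install -y kubeadm
      sudo kubeadm upgrade plan
      # then: sudo kubeadm upgrade apply vX.Y.Z (latest PATCH listed by the plan),
      # followed by "kubeadm upgrade node" and a kubelet upgrade on each node
      PREV=$VERSION
    done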

???

:EN:- Best practices for cluster upgrades
:EN:- Example: upgrading a kubeadm cluster

:FR:- Bonnes pratiques pour la mise à jour des clusters
:FR:- Exemple : mettre à jour un cluster kubeadm