## Volumes

- Volumes are special directories that are mounted in containers

- Volumes can have many different purposes:

  - share files and directories between containers running on the same machine

  - share files and directories between containers and their host
    (see the sketch after this list)

  - centralize configuration information in Kubernetes and expose it to containers

  - manage credentials and secrets and expose them securely to containers

  - store persistent data for stateful services

  - access storage systems (like Ceph, EBS, NFS, Portworx, and many others)
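To make one of these purposes a little more concrete before we dive in, here is a minimal, hypothetical sketch of sharing a host directory with a container. The `hostPath` type, the Pod name, and the paths are illustrative assumptions; the labs in this section use a different (and simpler) volume type.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: host-logs-reader           # hypothetical name, not used in the labs
spec:
  volumes:
  - name: host-logs
    hostPath:
      path: /var/log               # a directory on the node's filesystem
  containers:
  - name: reader
    image: alpine
    command: [ "sh", "-c", "ls /host-logs && sleep infinity" ]
    volumeMounts:
    - name: host-logs
      mountPath: /host-logs
```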
class: extra-details

## Kubernetes volumes vs. Docker volumes

- Kubernetes and Docker volumes are very similar

  (the Kubernetes documentation says otherwise ...
  but it refers to Docker 1.7, which was released in 2015!)

- Docker volumes allow us to share data between containers running on the same host
  (see the sketch after this list)

- Kubernetes volumes allow us to share data between containers in the same pod

- Both Docker and Kubernetes volumes enable access to storage systems

- Kubernetes volumes are also used to expose configuration and secrets

- Docker has specific concepts for configuration and secrets

  (but under the hood, the technical implementation is similar)

- If you're not familiar with Docker volumes, you can safely ignore this slide!
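For readers who do know Docker and want a concrete point of comparison, here is a minimal sketch of the Docker-side equivalent; the volume and container names are made up for illustration.

```bash
# Create a named Docker volume and share it between two containers on the same host
docker volume create shared-data
docker run -d --name writer -v shared-data:/data alpine \
       sh -c 'echo hello > /data/greeting && sleep infinity'
docker run --rm -v shared-data:/data alpine cat /data/greeting
```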
## Volumes ≠ Persistent Volumes

- Volumes and Persistent Volumes are related, but very different!

- Volumes:

  - appear in Pod specifications (we'll see that in a few slides)

  - do not exist as API resources (cannot do `kubectl get volumes`)

- Persistent Volumes:

  - are API resources (can do `kubectl get persistentvolumes`)

  - correspond to concrete volumes (e.g. on a SAN, EBS, etc.)

  - cannot be associated with a Pod directly, only through a Persistent Volume Claim
    (see the sketch below)

  - won't be discussed further in this section
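As a hedged illustration of that indirection, not part of the labs in this section: a Pod references a Persistent Volume through a claim, roughly like this. The claim name and size are made-up examples, and a matching Persistent Volume (or a dynamic provisioner) must exist in the cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: www-pvc                    # hypothetical claim name
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-pvc             # hypothetical Pod, not used in the labs
spec:
  volumes:
  - name: www
    persistentVolumeClaim:
      claimName: www-pvc           # the Pod refers to the claim, not to the PV itself
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: www
      mountPath: /usr/share/nginx/html/
```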
## Adding a volume to a Pod

- We will start with the simplest Pod manifest we can find

- We will add a volume to that Pod manifest

- We will mount that volume in a container in the Pod

- By default, this volume will be an `emptyDir` (an empty directory)

- It will "shadow" the directory where it's mounted
## Our basic Pod

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-without-volume
spec:
  containers:
  - name: nginx
    image: nginx
```

This is a MVP! (Minimum Viable Pod 😉)

It runs a single NGINX container.
## Trying the basic pod

.lab[

- Create the Pod:
  ```bash
  kubectl create -f ~/container.training/k8s/nginx-1-without-volume.yaml
  ```

- Get its IP address:
  ```bash
  IPADDR=$(kubectl get pod nginx-without-volume -o jsonpath={.status.podIP})
  ```

- Send a request with curl:
  ```bash
  curl $IPADDR
  ```

]

(We should see the "Welcome to NGINX" page.)
## Adding a volume

- We need to add the volume in two places:

  - at the Pod level (to declare the volume)

  - at the container level (to mount the volume)

- We will declare a volume named `www`

- No type is specified, so it will default to `emptyDir`

  (as the name implies, it will be initialized as an empty directory at pod creation)

- In that pod, there is also a container named `nginx`

- That container mounts the volume `www` to path `/usr/share/nginx/html/`
## The Pod with a volume

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-volume
spec:
  volumes:
  - name: www
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: www
      mountPath: /usr/share/nginx/html/
```
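Since the `www` volume above doesn't specify a type, it is an `emptyDir`. For clarity, declaring it that way is equivalent to writing the type explicitly:

```yaml
  volumes:
  - name: www
    emptyDir: {}
```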
## Trying the Pod with a volume

.lab[

- Create the Pod:
  ```bash
  kubectl create -f ~/container.training/k8s/nginx-2-with-volume.yaml
  ```

- Get its IP address:
  ```bash
  IPADDR=$(kubectl get pod nginx-with-volume -o jsonpath={.status.podIP})
  ```

- Send a request with curl:
  ```bash
  curl $IPADDR
  ```

]

(We should now see a "403 Forbidden" error page.)
## Populating the volume with another container

- Let's add another container to the Pod

- Let's mount the volume in both containers

- That container will populate the volume with static files

- NGINX will then serve these static files

- To populate the volume, we will clone the Spoon-Knife repository

  - this repository is https://github.com/octocat/Spoon-Knife

  - it's very popular (more than 100K forks!)
## Sharing a volume between two containers

.small[
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-git
spec:
  volumes:
  - name: www
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: www
      mountPath: /usr/share/nginx/html/
  - name: git
    image: alpine
    command: [ "sh", "-c", "apk add git && git clone https://github.com/octocat/Spoon-Knife /www" ]
    volumeMounts:
    - name: www
      mountPath: /www/
  restartPolicy: OnFailure
```
]
## Sharing a volume, explained

- We added another container to the pod

- That container mounts the `www` volume on a different path (`/www`)

- It uses the `alpine` image

- When started, it installs `git` and clones the `octocat/Spoon-Knife` repository

  (that repository contains a tiny HTML website)

- As a result, NGINX now serves this website
## Trying the shared volume

- This one will be time-sensitive!

- We need to catch the Pod IP address as soon as it's created

- Then send a request to it as fast as possible

.lab[

- Watch the pods (so that we can catch the Pod IP address):
  ```bash
  kubectl get pods -o wide --watch
  ```

]
## Shared volume in action

.lab[

- Create the pod:
  ```bash
  kubectl create -f ~/container.training/k8s/nginx-3-with-git.yaml
  ```

- As soon as we see its IP address, access it (replacing `$IP` with that address):
  ```bash
  curl $IP
  ```

- A few seconds later, the state of the pod will change; access it again:
  ```bash
  curl $IP
  ```

]

The first time, we should see "403 Forbidden".

The second time, we should see the HTML file from the Spoon-Knife repository.
## Explanations

- Both containers are started at the same time

- NGINX starts very quickly

  (it can serve requests immediately)

- But at this point, the volume is empty

  (NGINX serves "403 Forbidden")

- The other container installs git and clones the repository

  (this takes a bit longer)

- When the other container is done, the volume holds the repository

  (NGINX serves the HTML file)
## The devil is in the details

- The default `restartPolicy` is `Always`

- This would cause our `git` container to run again ... and again ... and again

  (with an exponential back-off delay, as explained in the documentation)

- That's why we specified `restartPolicy: OnFailure`
## Inconsistencies

- There is a short period of time during which the website is not available

  (because the `git` container hasn't done its job yet)

- With a bigger website, we could get inconsistent results

  (where only a part of the content is ready)

- In real applications, this could cause incorrect results

- How can we avoid that?
## Init Containers

- We can define containers that should execute before the main ones

- They will be executed in order

  (instead of in parallel)

- They must all succeed before the main containers are started

- This is exactly what we need here!

- Let's see one in action

.footnote[See Init Containers documentation for all the details.]
## Defining Init Containers

.small[
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-init
spec:
  volumes:
  - name: www
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: www
      mountPath: /usr/share/nginx/html/
  initContainers:
  - name: git
    image: alpine
    command: [ "sh", "-c", "apk add git && git clone https://github.com/octocat/Spoon-Knife /www" ]
    volumeMounts:
    - name: www
      mountPath: /www/
```
]
## Trying the init container

.lab[

- Create the pod:
  ```bash
  kubectl create -f ~/container.training/k8s/nginx-4-with-init.yaml
  ```

- Try to send HTTP requests as soon as the pod comes up

]

- This time, instead of "403 Forbidden" we get a "connection refused"

- NGINX doesn't start until the `git` container has done its job

- We never get inconsistent results

  (a "half-ready" container)
## Volume lifecycle

- The lifecycle of a volume is linked to the pod's lifecycle

- This means that a volume is created when the pod is created

- This is mostly relevant for `emptyDir` volumes

  (other volumes, like remote storage, are not "created" but rather "attached")

- A volume survives across container restarts

  (see the sketch below)

- A volume is destroyed (or, for remote storage, detached) when the pod is destroyed
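To see that "survives across container restarts" in practice, here is an optional experiment with the `nginx-with-volume` Pod created earlier. It assumes that Pod is still running with its default `restartPolicy` of `Always`, so that stopping NGINX makes the kubelet restart the container.

```bash
# Write a file into the emptyDir volume through the nginx container
kubectl exec nginx-with-volume -- sh -c 'echo hello > /usr/share/nginx/html/index.html'

# Stop NGINX; the container exits and the kubelet restarts it
kubectl exec nginx-with-volume -- nginx -s quit

# Once the container is running again, the file written before the restart is still there
kubectl exec nginx-with-volume -- cat /usr/share/nginx/html/index.html
```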
## Other uses of init containers

- Load content, data sets...

- Generate configuration (or certificates)

- Database migrations

- Wait for other services to be up

  (to avoid a flurry of connection errors in the main container; see the sketch after this list)

- etc.
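As a minimal sketch of the "wait for other services" pattern: this hypothetical Pod (not part of the labs; the `db` Service name and the images are assumptions) uses an init container that loops until a Service name resolves in cluster DNS, then lets the main container start.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-waiting-for-db         # hypothetical Pod, not used in the labs
spec:
  initContainers:
  - name: wait-for-db
    image: busybox:1.28
    # Loop until the (hypothetical) "db" Service name resolves in cluster DNS
    command: [ "sh", "-c", "until nslookup db; do echo waiting for db; sleep 2; done" ]
  containers:
  - name: app
    image: alpine
    command: [ "sh", "-c", "echo db is up && sleep infinity" ]
```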
## Init containers vs sidecars

- Init containers run before the main container(s)

- Sidecars run in parallel to the main container(s)

- What's the difference between a sidecar and a "main container"?

--

- sidecar might need to start before the main container(s)

  (e.g. if it provides "ambassador"-style connectivity service)

- sidecar might need to stop after the main container(s)

  (ditto)

- sidecar might need to be stopped automatically when main container(s) complete

  (e.g. for batch jobs)

- Kubernetes has special support for sidecars!
## Sidecars

- Introduced as an alpha feature in K8S 1.28; GA in K8S 1.33

- A sidecar is an `initContainer` with a `restartPolicy: Always`

  (see the sketch below)

- Sidecars are started in the order defined by the `initContainers` list

- They can have healthchecks

- Kubernetes doesn't wait for them to complete

  (they run asynchronously)

- When all the main containers have completed, sidecars are shut down automatically

  (they don't prevent jobs from completing!)
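Here is a hedged sketch of that syntax (the Pod, container names, and images are illustrative, not part of the labs): the init container carries `restartPolicy: Always`, which makes it a sidecar that keeps running alongside the main container.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar           # hypothetical Pod, not used in the labs
spec:
  volumes:
  - name: logs                     # defaults to emptyDir, shared by both containers
  initContainers:
  - name: log-tailer
    image: alpine
    # restartPolicy: Always turns this init container into a sidecar:
    # it keeps running instead of blocking the main container until completion
    restartPolicy: Always
    command: [ "sh", "-c", "touch /logs/app.log && tail -f /logs/app.log" ]
    volumeMounts:
    - name: logs
      mountPath: /logs
  containers:
  - name: app
    image: alpine
    command: [ "sh", "-c", "while true; do date >> /logs/app.log; sleep 5; done" ]
    volumeMounts:
    - name: logs
      mountPath: /logs
```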
???
:EN:- Sharing data between containers with volumes
:EN:- When and how to use Init Containers

:FR:- Partager des données grâce aux volumes
:FR:- Quand et comment utiliser un Init Container