## Putting it all together

- We want to run that Consul cluster and actually persist data

- We'll use a StatefulSet that will leverage PV and PVC

- If we have a dynamic provisioner:

  the cluster will come up right away

- If we don't have a dynamic provisioner:

  we will need to create Persistent Volumes manually
## Persistent Volume Claims and Stateful sets

- A Stateful set can define one (or more) `volumeClaimTemplate`

- Each `volumeClaimTemplate` will create one Persistent Volume Claim per Pod

- Each Pod will therefore have its own individual volume

- These volumes are numbered (like the Pods)
- Example:

  - a Stateful set is named `consul`

  - it is scaled to 3 replicas

  - it has a `volumeClaimTemplate` named `data`

  - then it will create pods `consul-0`, `consul-1`, `consul-2`

  - these pods will have volumes named `data`, referencing PersistentVolumeClaims
    named `data-consul-0`, `data-consul-1`, `data-consul-2`
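To make the example concrete, the claim template could look roughly like this (a sketch; the name matches the example above, but the access mode and size are illustrative assumptions):

```yaml
volumeClaimTemplates:
- metadata:
    name: data            # each Pod gets a volume named "data"
  spec:
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 1Gi      # illustrative size
```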
## Persistent Volume Claims are sticky

- When updating the stateful set (e.g. image upgrade), each pod keeps its volume

- When pods get rescheduled (e.g. node failure), they keep their volume

  (this requires a storage system that is not node-local)

- These volumes are not automatically deleted

  (when the stateful set is scaled down or deleted)

- If a stateful set is scaled back up later, the pods get their data back
## Deploying Consul

- Let's use a new manifest for our Consul cluster

- The only differences between that file and the previous one are:

  - a `volumeClaimTemplate` defined in the Stateful Set spec

  - the corresponding `volumeMounts` in the Pod spec
.lab[

- Apply the persistent Consul YAML file:
  ```bash
  kubectl apply -f ~/container.training/k8s/consul-3.yaml
  ```

]
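The `volumeClaimTemplate` side was sketched earlier; the matching `volumeMounts` side could look roughly like this (a sketch, not the verbatim content of consul-3.yaml; the mount path is an assumption):

```yaml
containers:
- name: consul
  image: consul
  volumeMounts:
  - name: data               # must match the volumeClaimTemplate name
    mountPath: /consul/data  # assumed data directory; check the actual manifest
```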
## No dynamic provisioner

- If we don't have a dynamic provisioner, we need to create the PVs

- We are going to use local volumes

  (similar conceptually to `hostPath` volumes)

- We can use local volumes without installing extra plugins

- However, they are tied to a node

- If that node goes down, the volume becomes unavailable
## Observing the situation

- Let's look at Persistent Volume Claims and Pods

.lab[

- Check that we now have an unbound Persistent Volume Claim:
  ```bash
  kubectl get pvc
  ```

- We don't have any Persistent Volume:
  ```bash
  kubectl get pv
  ```

- The Pod `consul-0` is not scheduled yet:
  ```bash
  kubectl get pods -o wide
  ```

]

Hint: leave these commands running with `-w` in different windows.
## Explanations

- In a Stateful Set, the Pods are started one by one

- `consul-1` won't be created until `consul-0` is running

- `consul-0` has a dependency on an unbound Persistent Volume Claim

- The scheduler won't schedule the Pod until the PVC is bound

  (because the PVC might be bound to a volume that is only available on a subset of nodes; for instance, EBS volumes are tied to an availability zone)
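To observe this, we can inspect the pending Pod's events; there should be a `FailedScheduling` event mentioning an unbound PersistentVolumeClaim (the exact wording varies between Kubernetes versions):

```bash
kubectl describe pod consul-0
```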
## Creating Persistent Volumes

- Let's create 3 local directories (`/mnt/consul`) on node2, node3, node4

- Then create 3 Persistent Volumes corresponding to these directories

.lab[

- Create the local directories:
  ```bash
  for NODE in node2 node3 node4; do
    ssh $NODE sudo mkdir -p /mnt/consul
  done
  ```

- Create the PV objects:
  ```bash
  kubectl apply -f ~/container.training/k8s/volumes-for-consul.yaml
  ```

]
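For reference, each PV in that file might look roughly like this (a sketch using a `local` volume; the object name, capacity, and other details are illustrative, not the verbatim file contents):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: consul-node2
spec:
  capacity:
    storage: 1Gi             # used for matching, not enforced (see next slides)
  accessModes: [ "ReadWriteOnce" ]
  local:
    path: /mnt/consul
  nodeAffinity:              # required for local volumes: ties the PV to one node
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values: [ "node2" ]
```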
## Check our Consul cluster

- The PVs that we created will be automatically matched with the PVCs

- Once a PVC is bound, its pod can start normally

- Once the pod `consul-0` has started, `consul-1` can be created, etc.

- Eventually, our Consul cluster is up, backed by "persistent" volumes

.lab[

- Check that our Consul cluster indeed has 3 members:
  ```bash
  kubectl exec consul-0 -- consul members
  ```

]
## Devil is in the details (1/2)

- The size of the Persistent Volumes is bogus

  (it is used when matching PVs and PVCs together, but there is no actual quota or limit)

- The Pod might end up using more than the requested size

- The PV may or may not have the capacity that it's advertising

- It works well with dynamically provisioned block volumes

- ...Less so in other scenarios!
## Devil is in the details (2/2)

- This specific example worked because we had exactly 1 free PV per node:

  - if we had created multiple PVs per node ...

  - we could have ended up with two PVCs bound to PVs on the same node ...

  - which would have required two pods to be on the same node ...

  - which is forbidden by the anti-affinity constraints in the StatefulSet

- To avoid that, we need to associate the PVs with a Storage Class that has:
  ```yaml
  volumeBindingMode: WaitForFirstConsumer
  ```
  (this means that a PVC will be bound to a PV only after being used by a Pod)

- See this blog post for more details
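A minimal sketch of such a Storage Class (the name is arbitrary; `kubernetes.io/no-provisioner` is the standard placeholder provisioner for manually created volumes):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```

The PVs (and the PVCs, via `storageClassName`) would then both reference `local-storage`.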
## If we have a dynamic provisioner

These are the steps when dynamic provisioning happens:

- The Stateful Set creates PVCs according to the `volumeClaimTemplate`.

- The Stateful Set creates Pods using these PVCs.

- The PVCs are automatically annotated with our Storage Class.

- The dynamic provisioner provisions volumes and creates the corresponding PVs.

- The PersistentVolumeClaimBinder associates the PVs and the PVCs together.

- PVCs are now bound, the Pods can start.
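To see which Storage Class a claim actually received (using the PVC names from our example):

```bash
kubectl get pvc data-consul-0 -o jsonpath='{.spec.storageClassName}'
```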
## Validating persistence (1)

- When the StatefulSet is deleted, the PVC and PV still exist

- And if we recreate an identical StatefulSet, the PVC and PV are reused

- Let's see that!

.lab[

- Put some data in Consul:
  ```bash
  kubectl exec consul-0 -- consul kv put answer 42
  ```

- Delete the Consul cluster:
  ```bash
  kubectl delete -f ~/container.training/k8s/consul-3.yaml
  ```

]
## Validating persistence (2)
.lab[

- Wait until the last Pod is deleted:
  ```bash
  kubectl wait pod consul-0 --for=delete
  ```

- Check that PV and PVC are still here:
  ```bash
  kubectl get pv,pvc
  ```

]
## Validating persistence (3)
.lab[

- Re-create the cluster:
  ```bash
  kubectl apply -f ~/container.training/k8s/consul-3.yaml
  ```

- Wait until it's up

- Then access the key that we set earlier:
  ```bash
  kubectl exec consul-0 -- consul kv get answer
  ```

]
## Cleaning up

- PV and PVC don't get deleted automatically

- This is great (less risk of accidental data loss)

- This is not great (storage usage increases)

- Managing PVC lifecycle:

  - remove them manually

  - add their StatefulSet to their `ownerReferences` (see the sketch below)

  - delete the Namespace that they belong to
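For the `ownerReferences` option, here is a hedged sketch (the `uid` lookup is required; be careful: the PVC, and its data, will then be garbage-collected when the StatefulSet is deleted):

```bash
# Hypothetical sketch: make data-consul-0 owned by the consul StatefulSet,
# so that deleting the StatefulSet also deletes the PVC (and its data!)
STS_UID=$(kubectl get statefulset consul -o jsonpath='{.metadata.uid}')
kubectl patch pvc data-consul-0 --type=merge --patch '{
  "metadata": {"ownerReferences": [{
    "apiVersion": "apps/v1", "kind": "StatefulSet",
    "name": "consul", "uid": "'"$STS_UID"'"}]}}'
```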
???

:EN:- Defining volumeClaimTemplates

:FR:- Définir des volumeClaimTemplates