Compare commits

67 Commits

Author SHA1 Message Date
Jérôme Petazzoni
8c62ba7b28 🏖️ Highfive May 2025 2025-06-13 08:52:05 +02:00
Jérôme Petazzoni
71ee3012fb Add DMUC advanced exercises 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
5ed12d6631 🔧 Tweak backup chapter 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
839b50a7a6 📃 Update chapter on static pods 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
e0fdbfdb50 📃 Update control plane auth section 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
d9f53288f2 🔒️ Update section on user key and cert generation 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
697e9cf9f7 🔗 Links to docs and blog posts about ephemeral storage isolation 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
6b06fa2b35 🔗 Update Kyverno doc links 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
240b2a24e2 🐞 Typo fix 2025-06-13 08:49:59 +02:00
Hiranyey Gajbhiye
4bc97aa1b8 Update concepts-k8s.md
Fixed spelling mistake if it was unintentional
2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
798dc2216c 📃 Clarify what needs to be scaled up in healthcheck lab 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
5117b27386 🔧 Tweak portal VM size to use GP4 (GP2 is deprecated) 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
d2f736a850 📍 Pin express version in webui 2025-06-13 08:49:59 +02:00
Jérôme Petazzoni
01c374d0a4 Merge pull request #664 from lpiot/main
The missing slides…😅
2025-06-13 08:48:44 +02:00
Ludovic Piot
eee44979c5 📝 Add Kyverno install chapter 2025-06-12 22:13:19 +02:00
Ludovic Piot
4d3bc06e30 📝 Add Kyverno install chapter 2025-06-12 21:50:42 +02:00
Ludovic Piot
229ab045b3 🔥 2025-06-12 21:04:06 +02:00
Ludovic Piot
fe1a61eaeb 🎨 2025-06-12 21:03:49 +02:00
Ludovic Piot
9613589dea 📝 Add small section about SSH keypairs rotation for Flux 2025-06-12 20:23:59 +02:00
Ludovic Piot
ca8865a10b 📝 Change the mermaid scenario diagram 2025-06-12 20:07:11 +02:00
Ludovic Piot
f279bbea11 ✏️ 2025-06-12 20:06:27 +02:00
Ludovic Piot
bc6100301e 📝 Add monitoring stack install 2025-06-12 20:05:14 +02:00
Jérôme Petazzoni
a32751636a Merge pull request #663 from lpiot/main
The deck with a small fix
2025-06-11 20:33:27 +02:00
Ludovic Piot
4a0e23d131 🐛 Sorry Jerome 2025-06-11 19:59:52 +02:00
Ludovic Piot
6e987d1fca Merge branch 'm6' into main 2025-06-11 19:52:03 +02:00
Ludovic Piot
18b888009e 📝 Add an MVP Network policies section 2025-06-11 19:44:17 +02:00
Ludovic Piot
36dd8bb695 📝 Add the new chapters to the M6 stack 2025-06-11 19:33:35 +02:00
Ludovic Piot
395c5a38ab 🎨 Add reference to the chapter title 2025-06-11 19:24:57 +02:00
Ludovic Piot
2b0d3b87ac 📝 Add OpenEBS install chapter 2025-06-11 19:24:13 +02:00
Ludovic Piot
a165e60407 📝 Add k0s install chapter 2025-06-11 19:22:40 +02:00
Ludovic Piot
3c13fd51dd 🎨 Add Mario animation when Flux reconcile 2025-06-11 19:22:04 +02:00
Ludovic Piot
324ad2fdd0 🎨 Update mermaid scenario diagram 2025-06-11 19:21:13 +02:00
Ludovic Piot
269ae79e30 📝 Add k0s install chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
39a15b3d7d ✏️ Clean up consistency about how we evoke the OPS team 2025-06-11 17:08:52 +02:00
Ludovic Piot
9e7ed8cb49 📝 Add MOVY tenant creation chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
06e7a47659 📝 Upgrade the mermaid scenario 2025-06-11 17:08:52 +02:00
Ludovic Piot
802e525f57 📝 Add Ingress chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
0f68f89840 📝 Add Ingress chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
b275342bd2 ✏️ Fixing TEST emphasis 2025-06-11 17:08:52 +02:00
Ludovic Piot
e11e97ccff 📝 Add k0s install chapter 2025-06-11 15:10:43 +02:00
Ludovic Piot
023a9d0346 ✏️ Clean up consistency about how we evoke the OPS team 2025-06-10 19:20:25 +02:00
Ludovic Piot
3f5eaae6b9 📝 Add MOVY tenant creation chapter 2025-06-10 19:19:19 +02:00
Ludovic Piot
1634d5b5bc 📝 Upgrade the mermaid scenario 2025-06-10 17:15:38 +02:00
Ludovic Piot
40418be55a 📝 Add Ingress chapter 2025-06-10 16:19:06 +02:00
Ludovic Piot
04198b7f91 📝 Add Ingress chapter 2025-06-10 16:05:17 +02:00
Jérôme Petazzoni
150c8fc768 Merge pull request #660 from lpiot/main
Mostly the scenario upgrade with Mermaid schemas
2025-06-10 14:24:18 +02:00
Ludovic Piot
e2af1bb057 ✏️ Fixing TEST emphasis 2025-06-10 12:51:09 +02:00
Ludovic Piot
d4c260aa4a 💄 📝 🎨 Upgrade the mermaid scenario schema 2025-06-09 21:20:57 +02:00
Ludovic Piot
89cd677b09 📝 upgrade R01 chapter 2025-06-09 21:20:57 +02:00
Ludovic Piot
3008680c12 🛂 🐛 fix permissions for persistentVolumes management 2025-06-09 21:20:57 +02:00
Ludovic Piot
f7b8184617 🎨 2025-06-09 21:20:57 +02:00
Jérôme Petazzoni
a565c0979c Merge pull request #659 from lpiot/main
Add R01 chapter and fixes to previous chapters
2025-06-09 20:05:55 +02:00
Jérôme Petazzoni
7a11f03b5e Merge branch 'm6' into main 2025-06-09 20:05:26 +02:00
Ludovic Piot
b0760b99a5 ✏️ 📝 Fix shpod access methods 2025-06-09 17:11:57 +02:00
Ludovic Piot
bcb9c3003f 📝 Add R01 chapter about test-ROCKY tenant config 2025-06-09 17:10:35 +02:00
Ludovic Piot
99ce9b3a8a 🎨 📝 Add missing steps in demo 2025-06-09 16:09:45 +02:00
Ludovic Piot
0ba602b533 🎨 clean up code display 2025-06-09 16:08:58 +02:00
Jérôme Petazzoni
d43c41e11e Proof-read first half of M6-START 2025-06-09 14:46:13 +02:00
Ludovic Piot
331309dc63 🎨 cleanup display of some console results 2025-06-09 14:11:05 +02:00
Ludovic Piot
44146915e0 📝 🍱 add T03 chapter 2025-06-04 23:55:33 +02:00
Ludovic Piot
84996e739b 🍱 📝 rewording and updating pics 2025-06-04 23:54:51 +02:00
Ludovic Piot
2aea1f70b2 📝 Add Flux install 2025-05-29 18:00:18 +02:00
Ludovic Piot
985e2ae42c 📝 add M6 intro slidedeck 2025-05-29 12:25:57 +02:00
Ludovic Piot
ea58428a0c 🐛 Slides now generate! ♻️ Move a slide 2025-05-14 22:05:59 +02:00
Ludovic Piot
59e60786c0 🎨 make personnae and cluster names consistent 2025-05-14 21:49:09 +02:00
Ludovic Piot
af63cf1405 🚨 2025-05-14 21:25:59 +02:00
Ludovic Piot
f9041807f6 🎉 first M6 draft slidedeck 2025-05-14 20:52:32 +02:00
120 changed files with 1737 additions and 4986 deletions

@@ -9,7 +9,7 @@
"forwardPorts": [],
//"postCreateCommand": "... install extra packages...",
"postStartCommand": "dind.sh ; kind.sh",
"postStartCommand": "dind.sh",
// This lets us use "docker-outside-docker".
// Unfortunately, minikube, kind, etc. don't work very well that way;

.gitignore

@@ -17,7 +17,6 @@ slides/autopilot/state.yaml
slides/index.html
slides/past.html
slides/slides.zip
slides/_academy_*
node_modules
### macOS ###

@@ -1,7 +1,7 @@
FROM ruby:alpine
RUN apk add --update build-base curl
RUN gem install sinatra --version '~> 3'
RUN gem install thin --version '~> 1'
RUN gem install thin
ADD hasher.rb /
CMD ["ruby", "hasher.rb"]
EXPOSE 80

@@ -1,33 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: blue
name: blue
spec:
replicas: 1
selector:
matchLabels:
app: blue
template:
metadata:
labels:
app: blue
spec:
containers:
- image: jpetazzo/color
name: color
---
apiVersion: v1
kind: Service
metadata:
labels:
app: blue
name: blue
spec:
ports:
- name: "80"
port: 80
selector:
app: blue

@@ -1,12 +0,0 @@
# This removes the haproxy Deployment.
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
patches:
- patch: |-
$patch: delete
kind: Deployment
apiVersion: apps/v1
metadata:
name: haproxy

@@ -1,14 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
# Within a Kustomization, it is not possible to specify in which
# order transformations (patches, replacements, etc) should be
# executed. If we want to execute transformations in a specific
# order, one possibility is to put them in individual components,
# and then invoke these components in the order we want.
# It works, but it creates an extra level of indirection, which
# reduces readability and complicates maintenance.
components:
- setup
- cleanup

@@ -1,20 +0,0 @@
global
#log stdout format raw local0
#daemon
maxconn 32
defaults
#log global
timeout client 1h
timeout connect 1h
timeout server 1h
mode http
option abortonclose
frontend metrics
bind :9000
http-request use-service prometheus-exporter
frontend ollama_frontend
bind :8000
default_backend ollama_backend
maxconn 16
backend ollama_backend
server ollama_server localhost:11434 check

@@ -1,39 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: haproxy
name: haproxy
spec:
selector:
matchLabels:
app: haproxy
template:
metadata:
labels:
app: haproxy
spec:
volumes:
- name: haproxy
configMap:
name: haproxy
containers:
- image: haproxy:3.0
name: haproxy
volumeMounts:
- name: haproxy
mountPath: /usr/local/etc/haproxy
readinessProbe:
httpGet:
port: 9000
ports:
- name: haproxy
containerPort: 8000
- name: metrics
containerPort: 9000
resources:
requests:
cpu: 0.05
limits:
cpu: 1

@@ -1,75 +0,0 @@
# This adds a sidecar to the ollama Deployment, by taking
# the pod template and volumes from the haproxy Deployment.
# The idea is to allow running ollama+haproxy in two modes:
# - separately (each with their own Deployment),
# - together in the same Pod, sidecar-style.
# The YAML files define how to run them separately, and this
# "replacements" directive fetches a specific volume and
# a specific container from the haproxy Deployment, to add
# them to the ollama Deployment.
#
# This would be simpler if kustomize allowed appending or
# merging lists in "replacements"; but it doesn't seem to be
# possible at the moment.
#
# It would be even better if kustomize allowed performing
# a strategic merge using a fieldPath as the source, because
# we could merge both the containers and the volumes in a
# single operation.
#
# Note that technically, it might be possible to layer
# multiple kustomizations so that one generates the patch
# to be used in another; but it wouldn't be very readable
# or maintainable so we decided to not do that right now.
#
# However, the current approach (fetching fields one by one)
# has an advantage: it could let us transform the haproxy
# container into a real sidecar (i.e. an initContainer with
# a restartPolicy=Always).
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
resources:
- haproxy.yaml
configMapGenerator:
- name: haproxy
files:
- haproxy.cfg
replacements:
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.volumes.[name=haproxy]
targets:
- select:
kind: Deployment
name: ollama
fieldPaths:
- spec.template.spec.volumes.[name=haproxy]
options:
create: true
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.containers.[name=haproxy]
targets:
- select:
kind: Deployment
name: ollama
fieldPaths:
- spec.template.spec.containers.[name=haproxy]
options:
create: true
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.containers.[name=haproxy].ports.[name=haproxy].containerPort
targets:
- select:
kind: Service
name: ollama
fieldPaths:
- spec.ports.[name=11434].targetPort

@@ -1,34 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: blue
name: blue
spec:
replicas: 2
selector:
matchLabels:
app: blue
template:
metadata:
labels:
app: blue
spec:
containers:
- image: jpetazzo/color
name: color
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
labels:
app: blue
name: blue
spec:
ports:
- port: 80
selector:
app: blue

@@ -1,94 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Each of these YAML files contains a Deployment and a Service.
# The blue.yaml file is here just to demonstrate that the rest
# of this Kustomization can be precisely scoped to the ollama
# Deployment (and Service): the blue Deployment and Service
# shouldn't be affected by our kustomize transformers.
resources:
- ollama.yaml
- blue.yaml
buildMetadata:
# Add a label app.kubernetes.io/managed-by=kustomize-vX.Y.Z
- managedByLabel
# Add an annotation config.kubernetes.io/origin, indicating:
# - which file defined that resource;
# - if it comes from a git repository, which one, and which
# ref (tag, branch...) it was.
- originAnnotations
# Add an annotation alpha.config.kubernetes.io/transformations
# indicating which patches and other transformers have changed
# each resource.
- transformerAnnotations
# Let's generate a ConfigMap with literal values.
# Note that this will actually add a suffix to the name of the
# ConfigMaps (e.g.: ollama-8bk8bd8m76) and it will update all
# references to the ConfigMap (e.g. in Deployment manifests)
# accordingly. The suffix is a hash of the ConfigMap contents,
# so that basically, if the ConfigMap is edited, any workload
# using that ConfigMap will automatically do a rolling update.
configMapGenerator:
- name: ollama
literals:
- "model=gemma3:270m"
- "prompt=If you visit Paris, I suggest that you"
- "queue=4"
name: ollama
patches:
# The Deployment manifest in ollama.yaml doesn't specify
# resource requests and limits, so that it can run on any
# cluster (including resource-constrained local clusters
# like KiND or minikube). The example below adds CPU
# requests and limits using a strategic merge patch.
# The patch is inlined here, but it could also be put
# in a file and referenced with "path: xxxxxx.yaml".
- patch: |
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
spec:
template:
spec:
containers:
- name: ollama
resources:
requests:
cpu: 1
limits:
cpu: 2
# This will have the same effect, with one little detail:
# JSON patches cannot specify containers by name, so this
# assumes that the ollama container is the first one in
# the pod template (whereas the strategic merge patch can
# use "merge keys" and identify containers by their name).
#- target:
# kind: Deployment
# name: ollama
# patch: |
# - op: add
# path: /spec/template/spec/containers/0/resources
# value:
# requests:
# cpu: 1
# limits:
# cpu: 2
# A "component" is a bit like a "base", in the sense that
# it lets us define some reusable resources and behaviors.
# There is a key difference, though:
# - a "base" will be evaluated in isolation: it will
# generate+transform some resources, then these resources
# will be included in the main Kustomization;
# - a "component" has access to all the resources that
# have been generated by the main Kustomization, which
# means that it can transform them (with patches etc).
components:
- add-haproxy-sidecar
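
The configMapGenerator comment in this Kustomization says that the generated ConfigMap name gets a suffix derived from a hash of its contents, so that editing the ConfigMap changes its name and triggers a rolling update. The shell sketch below illustrates that idea only. `suffixed_name` is a hypothetical helper, and `cksum` is a stand-in: Kustomize computes its suffix with its own hashing scheme, so the suffixes printed here will not match real ones.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kustomize): derive a name suffix
# from the content, so that any content change yields a new name.
# cksum is only a stand-in for the hash that Kustomize really uses.
suffixed_name() {
  name=$1
  content=$2
  sum=$(printf '%s' "$content" | cksum | cut -d' ' -f1)
  printf '%s-%s\n' "$name" "$sum"
}

before=$(suffixed_name ollama "model=gemma3:270m queue=4")
after=$(suffixed_name ollama "model=gemma3:270m queue=8")

echo "before edit: $before"
echo "after edit:  $after"
# The two names differ, so a Deployment that references the ConfigMap
# by its generated name sees a changed pod template and rolls out again.
```

Because the name is a pure function of the content, regenerating an unchanged ConfigMap yields the same name and causes no rollout.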

@@ -1,73 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: ollama
name: ollama
spec:
selector:
matchLabels:
app: ollama
template:
metadata:
labels:
app: ollama
spec:
volumes:
- name: ollama
hostPath:
path: /opt/ollama
type: DirectoryOrCreate
containers:
- image: ollama/ollama
name: ollama
env:
- name: OLLAMA_MAX_QUEUE
valueFrom:
configMapKeyRef:
name: ollama
key: queue
- name: MODEL
valueFrom:
configMapKeyRef:
name: ollama
key: model
volumeMounts:
- name: ollama
mountPath: /root/.ollama
lifecycle:
postStart:
exec:
command:
- /bin/sh
- -c
- ollama pull $MODEL
livenessProbe:
httpGet:
port: 11434
readinessProbe:
exec:
command:
- /bin/sh
- -c
- ollama show $MODEL
ports:
- name: ollama
containerPort: 11434
---
apiVersion: v1
kind: Service
metadata:
labels:
app: ollama
name: ollama
spec:
ports:
- name: "11434"
port: 11434
protocol: TCP
targetPort: 11434
selector:
app: ollama
type: ClusterIP

@@ -1,5 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- microservices
- redis

@@ -1,13 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- microservices.yaml
transformers:
- |
apiVersion: builtin
kind: PrefixSuffixTransformer
metadata:
name: use-ghcr-io
prefix: ghcr.io/
fieldSpecs:
- path: spec/template/spec/containers/image

@@ -1,125 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: hasher
name: hasher
spec:
replicas: 1
selector:
matchLabels:
app: hasher
template:
metadata:
labels:
app: hasher
spec:
containers:
- image: dockercoins/hasher:v0.1
name: hasher
---
apiVersion: v1
kind: Service
metadata:
labels:
app: hasher
name: hasher
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: hasher
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rng
name: rng
spec:
replicas: 1
selector:
matchLabels:
app: rng
template:
metadata:
labels:
app: rng
spec:
containers:
- image: dockercoins/rng:v0.1
name: rng
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rng
name: rng
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: rng
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: webui
name: webui
spec:
replicas: 1
selector:
matchLabels:
app: webui
template:
metadata:
labels:
app: webui
spec:
containers:
- image: dockercoins/webui:v0.1
name: webui
---
apiVersion: v1
kind: Service
metadata:
labels:
app: webui
name: webui
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: webui
type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: worker
name: worker
spec:
replicas: 1
selector:
matchLabels:
app: worker
template:
metadata:
labels:
app: worker
spec:
containers:
- image: dockercoins/worker:v0.1
name: worker

@@ -1,4 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- redis.yaml

@@ -1,35 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: redis
name: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- image: redis
name: redis
---
apiVersion: v1
kind: Service
metadata:
labels:
app: redis
name: redis
spec:
ports:
- port: 6379
protocol: TCP
targetPort: 6379
selector:
app: redis
type: ClusterIP

@@ -1,160 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: hasher
name: hasher
spec:
replicas: 1
selector:
matchLabels:
app: hasher
template:
metadata:
labels:
app: hasher
spec:
containers:
- image: dockercoins/hasher:v0.1
name: hasher
---
apiVersion: v1
kind: Service
metadata:
labels:
app: hasher
name: hasher
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: hasher
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: redis
name: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- image: redis
name: redis
---
apiVersion: v1
kind: Service
metadata:
labels:
app: redis
name: redis
spec:
ports:
- port: 6379
protocol: TCP
targetPort: 6379
selector:
app: redis
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rng
name: rng
spec:
replicas: 1
selector:
matchLabels:
app: rng
template:
metadata:
labels:
app: rng
spec:
containers:
- image: dockercoins/rng:v0.1
name: rng
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rng
name: rng
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: rng
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: webui
name: webui
spec:
replicas: 1
selector:
matchLabels:
app: webui
template:
metadata:
labels:
app: webui
spec:
containers:
- image: dockercoins/webui:v0.1
name: webui
---
apiVersion: v1
kind: Service
metadata:
labels:
app: webui
name: webui
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: webui
type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: worker
name: worker
spec:
replicas: 1
selector:
matchLabels:
app: worker
template:
metadata:
labels:
app: worker
spec:
containers:
- image: dockercoins/worker:v0.1
name: worker

@@ -1,30 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- dockercoins.yaml
replacements:
- sourceValue: ghcr.io/dockercoins
targets:
- select:
kind: Deployment
labelSelector: "app in (hasher,rng,webui,worker)"
# It will soon be possible to use regexes in replacement selectors,
# meaning that the "labelSelector:" above can be replaced with the
# following "name:" selector which is a tiny bit simpler:
#name: hasher|rng|webui|worker
# Regex support in replacement selectors was added by this PR:
# https://github.com/kubernetes-sigs/kustomize/pull/5863
# This PR was merged in August 2025, but as of October 2025, the
# latest release of Kustomize is 5.7.1, which was released in July.
# Hopefully the feature will be available in the next release :)
# Another possibility would be to select all Deployments, and then
# reject the one(s) for which we don't want to update the registry;
# for instance:
#reject:
# kind: Deployment
# name: redis
fieldPaths:
- spec.template.spec.containers.*.image
options:
delimiter: "/"
index: 0
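
The `delimiter` and `index` options at the end of this replacement split each image reference on "/" and replace only the first segment with the source value. The shell sketch below mimics that behavior to show the expected results. `replace_segment0` is a hypothetical helper, and the handling of a value without any "/" (the whole value counts as segment 0) is an assumption about Kustomize's behavior rather than something stated in the file.

```shell
#!/bin/sh
# Hypothetical helper mimicking "delimiter: /" together with "index: 0":
# split the value on "/", replace segment 0, join the segments back.
replace_segment0() {
  new=$1
  value=$2
  case "$value" in
    # Keep everything after the first "/".
    */*) printf '%s/%s\n' "$new" "${value#*/}" ;;
    # Single segment: replaced entirely (assumption, see above).
    *)   printf '%s\n' "$new" ;;
  esac
}

replace_segment0 ghcr.io/dockercoins dockercoins/hasher:v0.1
replace_segment0 ghcr.io/dockercoins redis
```

The second call suggests why the selector is restricted to hasher, rng, webui and worker: the `redis` image has no leading segment to swap, so under this assumption it would be overwritten as a whole.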

@@ -66,7 +66,7 @@ Here is where we look for credentials for each provider:
- Civo: CLI configuration file (`~/.civo.json`)
- Digital Ocean: CLI configuration file (`~/.config/doctl/config.yaml`)
- Exoscale: CLI configuration file (`~/.config/exoscale/exoscale.toml`)
- Google Cloud: we're using "Application Default Credentials (ADC)"; run `gcloud auth application-default login`; note that we'll use the default "project" set in `gcloud` unless you set the `GOOGLE_PROJECT` environment variable
- Google Cloud: FIXME, note that the project name is currently hard-coded to `prepare-tf`
- Hetzner: CLI configuration file (`~/.config/hcloud/cli.toml`)
- Linode: CLI configuration file (`~/.config/linode-cli`)
- OpenStack: you will need to write a tfvars file (check [that example](terraform/virtual-machines/openstack/tfvars.example))

@@ -5,22 +5,16 @@
# 10% CPU
# (See https://docs.google.com/document/d/1n0lwp6rQKQUIuo_A5LQ1dgCzrmjkDjmDtNj1Jn92UrI)
# PRO2-XS = 4 core, 16 gb
#
# With vspod:
# 800 MB RAM
# 33% CPU
#
set -e
KONKTAG=konk
PROVIDER=linode
STUDENTS=5
PROVIDER=scaleway
STUDENTS=30
case "$PROVIDER" in
linode)
export TF_VAR_node_size=g6-standard-6
export TF_VAR_location=fr-par
export TF_VAR_location=us-east
;;
scaleway)
export TF_VAR_node_size=PRO2-XS
@@ -34,13 +28,11 @@ esac
export KUBECONFIG=~/kubeconfig
if [ "$PROVIDER" = "kind" ]; then
kind create cluster --name $KONKTAG
kind create cluster --name konk
ADDRTYPE=InternalIP
else
if ! [ -f tags/$KONKTAG/stage2/kubeconfig.101 ]; then
./labctl create --mode mk8s --settings settings/konk.env --provider $PROVIDER --tag $KONKTAG
fi
cp tags/$KONKTAG/stage2/kubeconfig.101 $KUBECONFIG
./labctl create --mode mk8s --settings settings/konk.env --provider $PROVIDER --tag konk
cp tags/konk/stage2/kubeconfig.101 $KUBECONFIG
ADDRTYPE=ExternalIP
fi

@@ -270,27 +270,7 @@ _cmd_create() {
ln -s ../../$SETTINGS tags/$TAG/settings.env.orig
cp $SETTINGS tags/$TAG/settings.env
# For Google Cloud, it is necessary to specify which "project" to use.
# Unfortunately, the Terraform provider doesn't seem to have a way
# to detect which Google Cloud project you want to use; it has to be
# specified one way or another. Let's decide that it should be set with
# the GOOGLE_PROJECT env var; and if that var is not set, we'll try to
# figure it out from gcloud.
# (See https://github.com/hashicorp/terraform-provider-google/issues/10907#issuecomment-1015721600)
# Since we need that variable to be set each time we'll call Terraform
# (e.g. when destroying the environment), let's save it to the settings.env
# file.
if [ "$PROVIDER" = "googlecloud" ]; then
if ! [ "$GOOGLE_PROJECT" ]; then
info "PROVIDER=googlecloud but GOOGLE_PROJECT is not set. Detecting it."
GOOGLE_PROJECT=$(gcloud config get project)
info "GOOGLE_PROJECT will be set to '$GOOGLE_PROJECT'."
fi
echo "export GOOGLE_PROJECT=$GOOGLE_PROJECT" >> tags/$TAG/settings.env
fi
. tags/$TAG/settings.env
. $SETTINGS
echo $MODE > tags/$TAG/mode
echo $PROVIDER > tags/$TAG/provider
@@ -503,7 +483,7 @@ _cmd_kubebins() {
curl -L https://github.com/etcd-io/etcd/releases/download/$ETCD_VERSION/etcd-$ETCD_VERSION-linux-$ARCH.tar.gz \
| sudo tar --strip-components=1 --wildcards -zx '*/etcd' '*/etcdctl'
fi
if ! [ -x kube-apiserver ]; then
if ! [ -x hyperkube ]; then
##VERSION##
curl -L https://dl.k8s.io/$K8SBIN_VERSION/kubernetes-server-linux-$ARCH.tar.gz \
| sudo tar --strip-components=3 -zx \
@@ -1238,17 +1218,14 @@ fi
"
}
_cmd ssh "Open an SSH session to a node (first one by default)"
_cmd ssh "Open an SSH session to the first node of a tag"
_cmd_ssh() {
TAG=$1
need_tag
if [ "$2" ]; then
ssh -l ubuntu -i tags/$TAG/id_rsa $2
else
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
fi
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
}
_cmd tags "List groups of VMs known locally"

@@ -23,14 +23,6 @@ pssh() {
# necessary - or down to zero, too.
sleep ${PSSH_DELAY_PRE-1}
# When things go wrong, it's convenient to ask pssh to show the output
# of the failed command. Let's make that easy with a DEBUG env var.
if [ "$DEBUG" ]; then
PSSH_I=-i
else
PSSH_I=""
fi
$(which pssh || which parallel-ssh) -h $HOSTFILE -l ubuntu \
--par ${PSSH_PARALLEL_CONNECTIONS-100} \
--timeout 300 \
@@ -39,6 +31,5 @@ pssh() {
-O UserKnownHostsFile=/dev/null \
-O StrictHostKeyChecking=no \
-O ForwardAgent=yes \
$PSSH_I \
"$@"
}
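
The pssh wrapper uses `${PSSH_DELAY_PRE-1}` and `${PSSH_PARALLEL_CONNECTIONS-100}`. The `-` form (without a colon) substitutes the default only when the variable is unset, which differs from the more common `:-` form. A short sketch of the difference, using a hypothetical variable `DELAY` that does not appear in the script:

```shell
#!/bin/sh
# DELAY is a hypothetical variable used only for this illustration.
show() {
  # ${DELAY-1}  : the default applies only if DELAY is unset
  # ${DELAY:-1} : the default applies if DELAY is unset OR empty
  printf 'dash=[%s] colon-dash=[%s]\n' "${DELAY-1}" "${DELAY:-1}"
}

unset DELAY; show   # both forms fall back to the default
DELAY="";    show   # only the colon form falls back
DELAY=0;     show   # both forms keep the explicit value
```

With the `-` form, an explicit value such as 0 is always kept, which matches the comment about bringing the delay down to zero; an empty value is also passed through unchanged, so it is safer to set 0 than an empty string.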

@@ -2,11 +2,7 @@ terraform {
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.38.0"
}
helm = {
source = "hashicorp/helm"
version = "~> 3.0"
version = "2.16.1"
}
}
}
@@ -20,7 +16,7 @@ provider "kubernetes" {
provider "helm" {
alias = "cluster_${index}"
kubernetes = {
kubernetes {
config_path = "./kubeconfig.${index}"
}
}
@@ -55,37 +51,42 @@ resource "helm_release" "shpod_${index}" {
name = "shpod"
namespace = "shpod"
create_namespace = false
values = [
yamlencode({
service = {
type = "NodePort"
}
resources = {
requests = {
cpu = "100m"
memory = "500M"
}
limits = {
cpu = "1"
memory = "1000M"
}
}
persistentVolume = {
enabled = true
}
ssh = {
password = random_string.shpod_${index}.result
}
rbac = {
cluster = {
clusterRoles = [ "cluster-admin" ]
}
}
codeServer = {
enabled = true
}
})
]
set {
name = "service.type"
value = "NodePort"
}
set {
name = "resources.requests.cpu"
value = "100m"
}
set {
name = "resources.requests.memory"
value = "500M"
}
set {
name = "resources.limits.cpu"
value = "1"
}
set {
name = "resources.limits.memory"
value = "1000M"
}
set {
name = "persistentVolume.enabled"
value = "true"
}
set {
name = "ssh.password"
value = random_string.shpod_${index}.result
}
set {
name = "rbac.cluster.clusterRoles"
value = "{cluster-admin}"
}
set {
name = "codeServer.enabled"
value = "true"
}
}
resource "helm_release" "metrics_server_${index}" {
@@ -100,36 +101,10 @@ resource "helm_release" "metrics_server_${index}" {
name = "metrics-server"
namespace = "metrics-server"
create_namespace = true
values = [
yamlencode({
args = [ "--kubelet-insecure-tls" ]
})
]
}
# As of October 2025, the ebs-csi-driver addon (which is used on EKS
# to provision persistent volumes) doesn't automatically create a
# StorageClass. Here, we're trying to detect the DaemonSet created
# by the ebs-csi-driver; and if we find it, we create the corresponding
# StorageClass.
data "kubernetes_resources" "ebs_csi_node_${index}" {
provider = kubernetes.cluster_${index}
api_version = "apps/v1"
kind = "DaemonSet"
label_selector = "app.kubernetes.io/name=aws-ebs-csi-driver"
namespace = "kube-system"
}
resource "kubernetes_storage_class" "ebs_csi_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 1 : 0
provider = kubernetes.cluster_${index}
metadata {
name = "ebs-csi"
annotations = {
"storageclass.kubernetes.io/is-default-class" = "true"
}
set {
name = "args"
value = "{--kubelet-insecure-tls}"
}
storage_provisioner = "ebs.csi.aws.com"
}
# This section here deserves a little explanation.
@@ -161,14 +136,8 @@ resource "kubernetes_storage_class" "ebs_csi_${index}" {
# Lastly - in the ConfigMap we actually put both the original kubeconfig,
# and the one where we injected our new user (just in case we want to
# use or look at the original for any reason).
#
# One more thing: the kubernetes.io/kube-apiserver-client signer is
# disabled on EKS, so... we don't generate that ConfigMap on EKS.
# To detect if we're on EKS, we're looking for the ebs-csi-node DaemonSet.
# (Which means that the detection will break if the ebs-csi addon is missing.)
resource "kubernetes_config_map" "kubeconfig_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 0 : 1
provider = kubernetes.cluster_${index}
metadata {
name = "kubeconfig"
@@ -194,7 +163,7 @@ resource "kubernetes_config_map" "kubeconfig_${index}" {
- name: cluster-admin
user:
client-key-data: $${base64encode(tls_private_key.cluster_admin_${index}.private_key_pem)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}[0].certificate)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}.certificate)}
EOT
}
}
@@ -232,7 +201,6 @@ resource "kubernetes_cluster_role_binding" "shpod_cluster_admin_${index}" {
}
resource "kubernetes_certificate_signing_request_v1" "cluster_admin_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 0 : 1
provider = kubernetes.cluster_${index}
metadata {
name = "cluster-admin"

@@ -23,7 +23,7 @@ variable "node_size" {
}
variable "location" {
type = string
type = string
default = null
}

@@ -1,45 +1,60 @@
data "aws_eks_cluster_versions" "_" {
default_only = true
# Taken from:
# https://github.com/hashicorp/learn-terraform-provision-eks-cluster/blob/main/main.tf
data "aws_availability_zones" "available" {}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.19.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 21.0"
name = var.cluster_name
kubernetes_version = data.aws_eks_cluster_versions._.cluster_versions[0].cluster_version
vpc_id = local.vpc_id
subnet_ids = local.subnet_ids
endpoint_public_access = true
enable_cluster_creator_admin_permissions = true
upgrade_policy = {
# The default policy is EXTENDED, which incurs additional costs
# when running an old control plane. We don't advise running old
# control planes, but we also don't want to incur costs if an
# old version is chosen accidentally.
support_type = "STANDARD"
}
source = "terraform-aws-modules/eks/aws"
version = "19.5.1"
cluster_name = var.cluster_name
cluster_version = "1.24"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_endpoint_public_access = true
eks_managed_node_group_defaults = {
ami_type = "AL2_x86_64"
addons = {
coredns = {}
eks-pod-identity-agent = {
before_compute = true
}
kube-proxy = {}
vpc-cni = {
before_compute = true
}
aws-ebs-csi-driver = {
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
}
}
eks_managed_node_groups = {
x86 = {
name = "x86"
one = {
name = "node-group-one"
instance_types = [local.node_size]
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
}
}
}
@@ -51,7 +66,7 @@ data "aws_iam_policy" "ebs_csi_policy" {
module "irsa-ebs-csi" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "~> 5.39.0"
version = "4.7.0"
create_role = true
role_name = "AmazonEKSTFEBSCSIRole-${module.eks.cluster_name}"
@@ -60,9 +75,13 @@ module "irsa-ebs-csi" {
oidc_fully_qualified_subjects = ["system:serviceaccount:kube-system:ebs-csi-controller-sa"]
}
resource "aws_vpc_security_group_ingress_rule" "_" {
security_group_id = module.eks.node_security_group_id
cidr_ipv4 = "0.0.0.0/0"
ip_protocol = -1
description = "Allow all traffic to Kubernetes nodes (so that we can use NodePorts, hostPorts, etc.)"
resource "aws_eks_addon" "ebs-csi" {
cluster_name = module.eks.cluster_name
addon_name = "aws-ebs-csi-driver"
addon_version = "v1.5.2-eksbuild.1"
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
tags = {
"eks_addon" = "ebs-csi"
"terraform" = "true"
}
}

View File

@@ -2,7 +2,7 @@ terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 6.17.0"
version = "~> 4.47.0"
}
}
}

View File

@@ -1,61 +0,0 @@
# OK, we have two options here.
# 1. Create our own VPC
# - Pros: provides good isolation from other stuff deployed in the
# AWS account; makes sure that we don't interact with
# existing security groups, subnets, etc.
# - Cons: by default, there is a quota of 5 VPCs per region, so
# we can only deploy 5 clusters
# 2. Use the default VPC
# - Pros/cons: the opposite :)
variable "use_default_vpc" {
type = bool
default = true
}
data "aws_vpc" "default" {
default = true
}
data "aws_subnets" "default" {
filter {
name = "vpc-id"
values = [data.aws_vpc.default.id]
}
}
data "aws_availability_zones" "available" {}
module "vpc" {
count = var.use_default_vpc ? 0 : 1
source = "terraform-aws-modules/vpc/aws"
version = "~> 6.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
public_subnets = ["10.0.21.0/24", "10.0.22.0/24", "10.0.23.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
map_public_ip_on_launch = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
locals {
vpc_id = var.use_default_vpc ? data.aws_vpc.default.id : module.vpc[0].vpc_id
subnet_ids = var.use_default_vpc ? data.aws_subnets.default.ids : module.vpc[0].public_subnets
}

View File

@@ -0,0 +1,12 @@
locals {
location = var.location != null ? var.location : "europe-north1-a"
region = replace(local.location, "/-[a-z]$/", "")
# Unfortunately, the following line doesn't work
# (that attribute just returns an empty string)
# so we have to hard-code the project name.
#project = data.google_client_config._.project
project = "prepare-tf"
}
data "google_client_config" "_" {}

View File

@@ -1,7 +1,7 @@
resource "google_container_cluster" "_" {
name = var.cluster_name
location = local.location
deletion_protection = false
name = var.cluster_name
project = local.project
location = local.location
#min_master_version = var.k8s_version
# To deploy private clusters, uncomment the section below,
@@ -42,7 +42,7 @@ resource "google_container_cluster" "_" {
node_pool {
name = "x86"
node_config {
tags = ["lab-${var.cluster_name}"]
tags = var.common_tags
machine_type = local.node_size
}
initial_node_count = var.min_nodes_per_pool
@@ -62,25 +62,3 @@ resource "google_container_cluster" "_" {
}
}
}
resource "google_compute_firewall" "_" {
name = "lab-${var.cluster_name}"
network = "default"
allow {
protocol = "tcp"
ports = ["0-65535"]
}
allow {
protocol = "udp"
ports = ["0-65535"]
}
allow {
protocol = "icmp"
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["lab-${var.cluster_name}"]
}

View File

@@ -6,8 +6,6 @@ output "has_metrics_server" {
value = true
}
data "google_client_config" "_" {}
output "kubeconfig" {
sensitive = true
value = <<-EOT

View File

@@ -1 +0,0 @@
../../providers/googlecloud/provider.tf

View File

@@ -0,0 +1,8 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "4.5.0"
}
}
}

View File

@@ -30,7 +30,7 @@ resource "scaleway_k8s_pool" "_" {
max_size = var.max_nodes_per_pool
autoscaling = var.max_nodes_per_pool > var.min_nodes_per_pool
autohealing = true
depends_on = [scaleway_instance_security_group._]
depends_on = [ scaleway_instance_security_group._ ]
}
data "scaleway_k8s_version" "_" {

View File

@@ -4,36 +4,25 @@ resource "helm_release" "_" {
create_namespace = true
repository = "https://charts.loft.sh"
chart = "vcluster"
version = "0.27.1"
values = [
yamlencode({
controlPlane = {
proxy = {
extraSANs = [ local.guest_api_server_host ]
}
service = {
spec = {
type = "NodePort"
}
}
statefulSet = {
persistence = {
volumeClaim = {
enabled = true
}
}
}
}
sync = {
fromHost = {
nodes = {
enabled = true
selector = {
all = true
}
}
}
}
})
]
version = "0.19.7"
set {
name = "service.type"
value = "NodePort"
}
set {
name = "storage.persistence"
value = "false"
}
set {
name = "sync.nodes.enabled"
value = "true"
}
set {
name = "sync.nodes.syncAllNodes"
value = "true"
}
set {
name = "syncer.extraArgs"
value = "{--tls-san=${local.guest_api_server_host}}"
}
}

View File

@@ -1,8 +0,0 @@
terraform {
required_providers {
helm = {
source = "hashicorp/helm"
version = "~> 3.0"
}
}
}

View File

@@ -1,8 +0,0 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 7.0"
}
}
}

View File

@@ -9,9 +9,5 @@ variable "node_sizes" {
variable "location" {
type = string
default = "europe-north1-a"
default = null
}
locals {
location = (var.location != "" && var.location != null) ? var.location : "europe-north1-a"
}

View File

@@ -1,5 +1,5 @@
provider "helm" {
kubernetes = {
kubernetes {
config_path = "~/kubeconfig"
}
}

View File

@@ -1 +0,0 @@
../common.tf

View File

@@ -1 +0,0 @@
../../providers/googlecloud/config.tf

View File

@@ -1,54 +0,0 @@
# Note: names and tags on GCP have to match a specific regex:
# (?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)
# In other words, they must start with a letter; and generally,
# we make them start with a number (year-month-day-etc, so 2025-...)
# so we prefix names and tags with "lab-" in this configuration.
resource "google_compute_instance" "_" {
for_each = local.nodes
zone = var.location
name = "lab-${each.value.node_name}"
tags = ["lab-${var.tag}"]
machine_type = each.value.node_size
boot_disk {
initialize_params {
image = "ubuntu-os-cloud/ubuntu-2404-lts-amd64"
}
}
network_interface {
network = "default"
access_config {}
}
metadata = {
"ssh-keys" = "ubuntu:${tls_private_key.ssh.public_key_openssh}"
}
}
locals {
ip_addresses = {
for key, value in local.nodes :
key => google_compute_instance._[key].network_interface[0].access_config[0].nat_ip
}
}
resource "google_compute_firewall" "_" {
name = "lab-${var.tag}"
network = "default"
allow {
protocol = "tcp"
ports = ["0-65535"]
}
allow {
protocol = "udp"
ports = ["0-65535"]
}
allow {
protocol = "icmp"
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["lab-${var.tag}"]
}

View File

@@ -1 +0,0 @@
../../providers/googlecloud/provider.tf

View File

@@ -1 +0,0 @@
../../providers/googlecloud/variables.tf

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-10-enix.container.training/
slides: https://2025-05-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "
@@ -49,7 +49,7 @@ content:
- containers/Advanced_Dockerfiles.md
- containers/Multi_Stage_Builds.md
- containers/Publishing_To_Docker_Hub.md
- containers/Exercise_Dockerfile_Multistage.md
- containers/Exercise_Dockerfile_Advanced.md
- # DAY 4
- containers/Buildkit.md
- containers/Network_Drivers.md

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-10-enix.container.training/
slides: https://2025-05-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "

View File

@@ -6,7 +6,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-10-enix.container.training/
slides: https://2025-05-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-10-enix.container.training/
slides: https://2025-05-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-10-enix.container.training/
slides: https://2025-05-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "
@@ -14,13 +14,14 @@ exclude:
content:
- shared/title.md
- logistics-m5.md
- logistics.md
- k8s/intro.md
- shared/about-slides.md
- shared/chat-room-im.md
#- shared/chat-room-zoom-meeting.md
#- shared/chat-room-zoom-webinar.md
- shared/toc.md
# DAY 1
-
- k8s/prereqs-advanced.md
- shared/handson.md
@@ -49,15 +50,22 @@ content:
- k8s/cluster-backup.md
#- k8s/cloud-controller-manager.md
-
- flux/scenario.md
- flux/bootstrap.md
- flux/tenants.md
- flux/app1-rocky-test.md
- flux/ingress.md
- flux/app2-movy-test.md
- k8s/k0s.md
- flux/add-cluster.md
- flux/openebs.md
- flux/observability.md
- flux/kyverno.md
- shared/thankyou.md
- k8s/M6-START-a-company-scenario.md
- k8s/M6-T02-flux-install.md
- k8s/M6-T03-installing-tenants.md
- k8s/M6-R01-flux_configure-ROCKY-deployment.md
- k8s/M6-T05-ingress-config.md
- k8s/M6-M01-adding-MOVY-tenant.md
- k8s/M6-K01-METAL-install.md
- k8s/M6-K03-openebs-install.md
- k8s/M6-monitoring-stack-install.md
- k8s/M6-kyverno-install.md
- shared/thankyou.md
#-
# |
# # (Extra content)
# - k8s/apiserver-deepdive.md
# - k8s/setup-overview.md
# - k8s/setup-devel.md
# - k8s/setup-managed.md
# - k8s/setup-selfhosted.md

View File

@@ -1,31 +0,0 @@
#!/usr/bin/env python
import os
import re
import sys
html_file = sys.argv[1]
output_file_template = "_academy_{}.html"
title_regex = "name: toc-(.*)"
redirects = open("_redirects", "w")
sections = re.split(title_regex, open(html_file).read())[1:]
while sections:
link, markdown = sections[0], sections[1]
sections = sections[2:]
output_file_name = output_file_template.format(link)
with open(output_file_name, "w") as f:
html = open("workshop.html").read()
html = html.replace("@@MARKDOWN@@", markdown)
titles = re.findall("# (.*)", markdown) + [""]
html = html.replace("@@TITLE@@", "{} — Kubernetes Academy".format(titles[0]))
html = html.replace("@@SLIDENUMBERPREFIX@@", "")
html = html.replace("@@EXCLUDE@@", "")
html = html.replace(".nav[", ".hide[")
f.write(html)
redirects.write("/{} /{} 200!\n".format(link, output_file_name))
html = open(html_file).read()
html = re.sub("#toc-([^)]*)", "_academy_\\1.html", html)
sys.stdout.write(html)

View File

@@ -84,9 +84,9 @@ like Windows, macOS, Solaris, FreeBSD ...
* Each `lxc-start` process exposes a custom API over a local UNIX socket, allowing interaction with the container.
* No notion of image (container filesystems had to be managed manually).
* No notion of image (container filesystems have to be managed manually).
* Networking had to be set up manually.
* Networking has to be set up manually.
---
@@ -98,22 +98,10 @@ like Windows, macOS, Solaris, FreeBSD ...
* Daemon exposing a REST API.
* Can run containers and virtual machines.
* Can manage images, snapshots, migrations, networking, storage.
* "offers a user experience similar to virtual machines but using Linux containers instead."
* Driven by Canonical.
---
## Incus
* Community-driven fork of LXD.
* Relatively recent ([announced in August 2023](https://linuxcontainers.org/incus/announcement/)), so time will tell what the notable differences will be.
---
## CRI-O
@@ -152,7 +140,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## [Kata containers](https://katacontainers.io/)
## Kata containers
* OCI-compliant runtime.
@@ -164,7 +152,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## [gVisor](https://gvisor.dev/)
## gVisor
* OCI-compliant runtime.
@@ -182,17 +170,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## Others
- Micro VMs: Firecracker, Edera...
- [crun](https://github.com/containers/crun) (runc rewritten in C)
- [youki](https://youki-dev.github.io/youki/) (runc rewritten in Rust)
---
## To Docker Or Not To Docker
## Overall ...
* The Docker Engine is very developer-centric:
@@ -206,26 +184,8 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
* As a result, it is a fantastic tool in development environments.
* On Kubernetes clusters, containerd or CRI-O are better choices.
* On servers:
* On Kubernetes clusters, the container engine is an implementation detail.
- Docker is a good default choice
---
## Different levels
- Directly use namespaces, cgroups, capabilities with custom code or scripts
*useful for troubleshooting/debugging and for educational purposes; e.g. pipework*
- Use low-level engines like runc, crun, youki
*useful when building custom architectures; e.g. a brand new orchestrator*
- Use low-level APIs like CRI or containerd grpc API
*useful to achieve high-level features like Docker, but without Docker; e.g. ctr, nerdctl*
- Use high-level APIs like Docker and Kubernetes
*that's what most people will do*
- If you use Kubernetes, the engine doesn't matter

View File

@@ -327,7 +327,9 @@ class: extra-details
## Which one is the best?
- In modern (2015+) systems, overlay2 should be the best option.
- Eventually, overlay2 should be the best option.
- It is available on all modern systems.
- Its memory usage is better than Device Mapper, BTRFS, or ZFS.

View File

@@ -141,13 +141,3 @@ class: pic
* etc.
* Docker Inc. launches commercial offers.
---
## Standardization of container runtimes
- Docker 1.11 (2016) introduces containerd and runc
- [Kubernetes 1.5 (2016)](https://kubernetes.io/blog/2016/12/kubernetes-1-5-supporting-production-workloads/) introduces the CRI
- First releases of CRI-O (2017), kata containers...

View File

@@ -1,4 +1,4 @@
# Exercise — multi-stage builds
# Exercise — writing better Dockerfiles
Let's update our Dockerfiles to leverage multi-stage builds!

View File

@@ -1,5 +0,0 @@
# Exercise — BuildKit cache mounts
We want to make our builds faster by leveraging BuildKit cache mounts.
Of course, if we don't make any changes to the code, the build should be instantaneous. Therefore, to benchmark our changes, we will make trivial changes to the code (e.g. change the message in a "print" statement) and measure (e.g. with `time`) how long it takes to rebuild the image.

View File

@@ -1,249 +0,0 @@
# Deep Dive Into Images
- Image = files (layers) + metadata (configuration)
- Layers = regular tar archives
(potentially with *whiteouts*)
- Configuration = everything needed to run the container
(e.g. Cmd, Env, WorkingDir...)
---
## Image formats
- Docker image [v1] (no longer used, except in `docker save` and `docker load`)
- Docker image v1.1 (IDs are now hashes instead of random values)
- Docker image [v2] (multi-arch support; content-addressable images)
- [OCI image format][oci] (almost the same, except for media types)
[v1]: https://github.com/moby/docker-image-spec?tab=readme-ov-file
[v2]: https://github.com/distribution/distribution/blob/main/docs/content/spec/manifest-v2-2.md
[oci]: https://github.com/opencontainers/image-spec/blob/main/spec.md
---
## OCI images
- Manifest = JSON document
- Used by container engines to know "what should I download to unpack this image?"
- Contains references to blobs, identified by their sha256 digest + size
- config (single sha256 digest)
- layers (list of sha256 digests)
- Also annotations (key/values)
- It's also possible to have a manifest list, or "fat manifest"
(which lists multiple manifests; this is used for multi-arch support)
---
## Config blob
- Also a JSON document
- `architecture` string (e.g. `amd64`)
- `config` object
Cmd, Entrypoint, Env, ExposedPorts, StopSignal, User, Volumes, WorkingDir
- `history` list
purely informative; shown with e.g. `docker history`
- `rootfs` object
`type` (always `layers`) + list of "diff ids"
---
class: extra-details
## Layers vs layers
- The image configuration contains digests of *uncompressed layers*
- The image manifest contains digests of *compressed layers*
(layer blobs in the registry can be tar, tar+gzip, tar+zstd)
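The distinction can be illustrated with a minimal sketch, assuming a gzip-compressed layer. The helper names (`make_layer_tar`, `sha256_digest`) are hypothetical and only serve the illustration: the digest in the configuration is computed on the uncompressed tar, while the digest in the manifest is computed on the blob as stored in the registry.

```python
import gzip
import hashlib
import io
import tarfile


def make_layer_tar(files):
    """Build an uncompressed tar archive (as bytes) from a {name: content} dict."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def sha256_digest(data):
    """Return a digest string in the 'sha256:<hex>' form used by image specs."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


layer_tar = make_layer_tar({"hello.txt": b"hello\n"})  # uncompressed layer
layer_blob = gzip.compress(layer_tar)                  # what a registry would store

diff_id = sha256_digest(layer_tar)           # listed in the image configuration
manifest_digest = sha256_digest(layer_blob)  # listed in the image manifest
```

The two values differ even though they describe the same layer, which is why a tool has to decompress a layer blob before it can check it against the configuration.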
---
## Layer format
- Layer = completely normal tar archive
- When a file is added or modified, it is added to the archive
(note: even trivial changes, e.g. permissions, require re-adding the whole file!)
- When a file is deleted, a *whiteout* file is created
e.g. `rm hello.txt` results in a file named `.wh.hello.txt`
- Files starting with `.wh.` are forbidden in containers
- There is a special file, `.wh..wh..opq`, which means "remove all siblings"
(optimization to completely empty a directory)
- See [layer specification](https://github.com/opencontainers/image-spec/blob/main/layer.md) for details
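The whiteout rules above can be sketched as follows. This is a simplified illustration, not a complete implementation: it assumes the root filesystem is an in-memory `{path: content}` dict, handles regular files only, and uses hypothetical helper names (`make_layer`, `apply_layer`). Whiteouts are processed first because they only apply to content coming from lower layers.

```python
import io
import posixpath
import tarfile

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def make_layer(files):
    """Build a layer (tar bytes) from a {path: content} dict."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def apply_layer(rootfs, layer_bytes):
    """Apply one layer to rootfs (a {path: content} dict) and return it."""
    with tarfile.open(fileobj=io.BytesIO(layer_bytes)) as tar:
        members = tar.getmembers()
        # Pass 1: whiteouts remove content that came from lower layers.
        for member in members:
            dirname, basename = posixpath.split(member.name)
            if basename == OPAQUE_MARKER:
                # Opaque whiteout: remove everything already in this directory.
                doomed_prefix = dirname + "/" if dirname else ""
            elif basename.startswith(WHITEOUT_PREFIX):
                # Regular whiteout: remove the named file (or directory tree).
                target = posixpath.join(dirname, basename[len(WHITEOUT_PREFIX):])
                rootfs.pop(target, None)
                doomed_prefix = target + "/"
            else:
                continue
            for path in [p for p in rootfs if p.startswith(doomed_prefix)]:
                del rootfs[path]
        # Pass 2: add or overwrite the regular files shipped by this layer.
        for member in members:
            basename = posixpath.basename(member.name)
            if member.isfile() and not basename.startswith(WHITEOUT_PREFIX):
                rootfs[member.name] = tar.extractfile(member).read()
    return rootfs


lower = make_layer({"hello.txt": b"hi", "etc/a.conf": b"a", "etc/b.conf": b"b"})
upper = make_layer({".wh.hello.txt": b"", "etc/.wh..wh..opq": b"", "etc/c.conf": b"c"})
rootfs = apply_layer(apply_layer({}, lower), upper)
```

After both layers are applied, only `etc/c.conf` remains: `hello.txt` was removed by its whiteout, and the opaque marker emptied `etc/` before the upper layer added its own file.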
---
class: extra-details
## Origin of layer format
- The initial storage driver for Docker was AUFS
- AUFS is out-of-tree but Debian and Ubuntu included it
(they used it for live CD / live USB boot)
- It meant that Docker could work out of the box on these distros
- Later, Docker added support for other systems
(devicemapper thin provisioning, btrfs, overlay...)
- Today, overlay is the best compromise for most use-cases
---
## Inspecting images
- `skopeo` can copy images between different places
(registries, Docker Engine, local storage as used by podman...)
- Example:
```bash
skopeo copy docker://alpine oci:/tmp/alpine.oci
```
- The image index will be in `/tmp/alpine.oci/index.json` (it references the image manifest by digest)
- Blobs (image manifest, image configuration, and layers) will be in `/tmp/alpine.oci/blobs/sha256`
- Note: as of version 1.20, `skopeo` doesn't handle extensions like stargz yet
(copying stargz images won't transfer the special index blobs)
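Here is a minimal sketch of how the copied image could be inspected, assuming the standard OCI image layout, where `index.json` lists manifest descriptors and every blob lives under `blobs/<algorithm>/<hex digest>`. The function names are hypothetical, and the sketch only follows the first manifest listed in the index.

```python
import json
import os


def read_blob(layout_dir, digest):
    """Read a blob from an OCI layout directory, given a 'sha256:<hex>' digest."""
    algorithm, hex_digest = digest.split(":", 1)
    with open(os.path.join(layout_dir, "blobs", algorithm, hex_digest), "rb") as f:
        return f.read()


def describe_image(layout_dir):
    """Follow index.json -> manifest -> config and summarize the image."""
    with open(os.path.join(layout_dir, "index.json")) as f:
        index = json.load(f)
    # Assumption: we only look at the first manifest referenced by the index.
    manifest_descriptor = index["manifests"][0]
    manifest = json.loads(read_blob(layout_dir, manifest_descriptor["digest"]))
    config = json.loads(read_blob(layout_dir, manifest["config"]["digest"]))
    return {
        "architecture": config.get("architecture"),
        "cmd": config.get("config", {}).get("Cmd"),
        "layers": [layer["digest"] for layer in manifest["layers"]],
        "diff_ids": config.get("rootfs", {}).get("diff_ids", []),
    }
```

After running the `skopeo copy` command above, calling `describe_image("/tmp/alpine.oci")` should return the architecture, the default command, the compressed layer digests from the manifest, and the uncompressed diff IDs from the configuration.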
---
## Layer surgery
Here is an example of how to manually edit an image.
https://github.com/jpetazzo/layeremove
It removes a specific layer from an image.
Note: it would be better to use a buildkit cache mount instead.
(This is just an educational example!)
---
## Stargz
- [Stargz] = Seekable Tar Gz, or "stargazer"
- Goal: start a container *before* its image has been fully downloaded
- Particularly useful for huge images that take minutes to download
- Also known as "streamable images" or "lazy loading"
- Alternative: [SOCI]
[stargz]: https://github.com/containerd/stargz-snapshotter
[SOCI]: https://github.com/awslabs/soci-snapshotter
---
## Stargz architecture
- Combination of:
- a backward-compatible extension to the OCI image format
- a containerd *snapshotter*
(=containerd component responsible for managing container and image storage)
- tooling to create, convert, optimize images
- Installation requires:
- running the snapshotter daemon
- configuring containerd
- building new images or converting the existing ones
---
## Stargz principle
- Normal image layer = tar.gz = gzip(tar(file1, file2, ...))
- Can't access fileN without uncompressing everything before it
- Seekable Tar Gz = gzip(tar(file1)) + gzip(tar(file2)) + ... + index
(big files can also be chunked)
- Can access individual files
(and even individual chunks, if needed)
- Downside: lower compression ratio
(less compression context; extra gzip headers)
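The principle can be sketched as follows. This is a deliberately simplified illustration and not the actual stargz format: it skips the tar wrapper, keeps the index as a plain Python dict, and uses hypothetical function names. Each file becomes its own gzip member, so a single file can be read from its byte range alone, while the concatenation remains a valid gzip stream.

```python
import gzip


def build_seekable_blob(files):
    """Compress each file as its own gzip member; return (blob, index)."""
    blob = bytearray()
    index = {}
    for name, content in files.items():
        member = gzip.compress(content)
        index[name] = (len(blob), len(member))  # (offset, length) within the blob
        blob += member
    return bytes(blob), index


def read_one_file(blob, index, name):
    """Decompress a single file using only its byte range in the blob."""
    offset, length = index[name]
    return gzip.decompress(blob[offset:offset + length])


files = {"file1": b"a" * 1000, "file2": b"b" * 1000}
blob, index = build_seekable_blob(files)
second_file = read_one_file(blob, index, "file2")  # does not touch file1's bytes
everything = gzip.decompress(blob)                 # still readable as one gzip stream
```

In a registry setting, the slice `blob[offset:offset + length]` would correspond to an HTTP range request. The per-member gzip headers and the loss of shared compression context are the reasons for the lower compression ratio mentioned above.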
---
## Stargz format
- The index mentioned above is stored in separate registry blobs
(one index for each layer)
- The digest of the index blobs is stored in annotations in normal OCI images
- Fully compatible with existing registries
- Existing container engines will load images transparently
(without leveraging stargz capabilities)
---
## Stargz limitations
- Tools like `skopeo` will ignore index blobs
(=copying images across registries will discard stargz capabilities)
- Indexes need to be downloaded before the container can be started
(=still significant start time when there are many files in images)
- Significant latency when accessing a file lazily
(need to hit the registry, typically with a range header, uncompress file)
- Images can be optimized to pre-load important files

File diff suppressed because it is too large

View File

@@ -1,75 +0,0 @@
# Rootless Networking
The "classic" approach for container networking is `veth` + bridge.
Pros:
- good performance
- easy to manage and understand
- flexible (possibility to use multiple, isolated bridges)
Cons:
- requires root access on the host to set up networking
---
## Rootless options
- Locked down helpers
- daemon, scripts started through sudo...
- used by some desktop virtualization platforms
- still requires root access at some point
- Userland networking stacks
- true solution that does not require root privileges
- lower performance
---
## Userland stacks
- [SLiRP](https://en.wikipedia.org/wiki/Slirp)
*the OG project that inspired the other ones!*
- [VPNKit](https://github.com/moby/vpnkit)
*introduced by Docker Desktop to play nice with enterprise VPNs*
- [slirp4netns](https://github.com/rootless-containers/slirp4netns)
*slirp adapted for network namespaces, and therefore, containers; better performance*
- [passt and pasta](https://passt.top/)
*more modern approach; better support for inbound traffic, IPv6...*
---
## Passt/Pasta
- No dependencies
- NAT (like slirp4netns) or no-NAT (for e.g. KubeVirt)
- Can handle inbound traffic dynamically
- No dynamic memory allocation
- Good security posture
- IPv6 support
- Reasonable performance
---
## Demo?

View File

@@ -1,162 +0,0 @@
# Security models
In this section, we want to address a few security-related questions:
- What permissions do we need to run containers or a container engine?
- Can we use containers to escalate permissions?
- Can we break out of a container (move from container to host)?
- Is it safe to run untrusted code in containers?
- What about Kubernetes?
---
## Running Docker, containerd, podman...
- In the early days, running containers required root permissions
(to set up namespaces, cgroups, networking, mount filesystems...)
- Eventually, new kernel features were developed to allow "rootless" operation
(user namespaces and associated tweaks)
- Rootless requires a little bit of additional setup on the system (e.g. subuid)
(although this is increasingly often automated in modern distros)
- Docker runs as root by default; Podman runs rootless by default
---
## Advantages of rootless
- Containers can run without any intervention from root
(no package install, no daemon running as root...)
- Containerized processes run with non-privileged UID
- Container escape doesn't automatically result in full host compromise
- Can isolate workloads by using different UID
---
## Downsides of rootless
- *Relatively* newer (rootless Docker was introduced in 2019)
- many quirks/issues/limitations in the initial implementations
- kernel features and other mechanisms were introduced over time
- they're not always very well documented
- I/O performance (disk, network) is typically lower
(due to using special mechanisms instead of more direct access)
- Rootless and rootful engines must use different image storage
(due to UID mapping)
---
## Why not rootless everywhere?
- Not very useful on clusters
- users shouldn't log into cluster nodes
- questionable security improvement
- lower I/O performance
- Not very useful with Docker Desktop / Podman Desktop
- container workloads are already inside a VM
- could arguably provide a layer of inter-workload isolation
- would require new APIs and concepts
---
## Permission escalation
- Access to the Docker socket = root access to the machine
```bash
docker run --privileged -v /:/hostfs -ti alpine
```
- That's why by default, the Docker socket access is locked down
(only accessible by `root` and group `docker`)
- If user `alice` has access to the Docker socket:
*compromising user `alice` leads to whole host compromise!*
- Doesn't fundamentally change the threat model
(if `alice` gets compromised in the first place, we're in trouble!)
- Enables new threats (persistence, kernel access...)
---
## Avoiding the problem
- Rootless containers
- Container VM (Docker Desktop, Podman Desktop, Orbstack...)
- Unfortunately: no fine-grained access to the Docker API
(no way to e.g. disable privileged containers, volume mounts...)
---
## Escaping containers
- Very easy with some features
(privileged containers, volume mounts, device access)
- Otherwise impossible in theory
(but of course, vulnerabilities do exist!)
- **Be careful with scripts invoking `docker run`, or Compose files!**
---
## Untrusted code
- Should be safe as long as we're not enabling dangerous features
(privileged containers, volume mounts, device access, capabilities...)
- Remember that by default, containers can make network calls
(but see: `--net none` and also `docker network create --internal`)
- And of course, again: vulnerabilities do exist!
---
## What about Kubernetes?
- Ability to run arbitrary pods = dangerous
- But there are multiple safety mechanisms available:
- Pod Security Settings (limit "dangerous" features)
- RBAC (control who can do what)
- webhooks and policy engines for even finer grained control

View File

@@ -1,159 +0,0 @@
# Exercise — Build a container from scratch
Our goal will be to execute a container running a simple web server.
(Example: NGINX, or https://github.com/jpetazzo/color.)
We want the web server to be isolated:
- it shouldn't be able to access the outside world,
- but we should be able to connect to it from our machine.
Make sure to automate / script things as much as possible!
---
## Steps
1. Prepare the filesystem
2. Run it with chroot
3. Isolation with namespaces
4. Network configuration
5. Cgroups
6. Non-root
---
## Prepare the filesystem
- Obtain a root filesystem with one of the following methods:
- download an Alpine mini root fs
- export an Alpine or NGINX container image with Docker
- download and convert a container image with Skopeo
- make it from scratch with busybox + a static [jpetazzo/color](https://github.com/jpetazzo/color)
- ...anything you want! (Nix, anyone?)
- Enter the root filesystem with `chroot`
---
## Help, network does not work!
- Check that you have external connectivity from the chroot:
```bash
ping 1.1.1.1
```
(that *should* work; if it doesn't, we have a serious problem!)
- Check that DNS resolution works:
```bash
ping enix.io
```
- If you're having a DNS resolution error, configure DNS in the container:
```bash
echo nameserver 1.1.1.1 > /etc/resolv.conf
```
---
## Running a web server
Here are a few possibilities...
- Install the NGINX package and run it with `nginx`
(note: by default it will start in the background)
- Run NGINX in the foreground with `nginx -g "daemon off;"`
- Install the Caddy package and run `caddy file-server -ab`
(it will remain in the foreground and show logs; **RECOMMENDED**)
- Download and/or build https://github.com/jpetazzo/color
(if you're familiar with the Go ecosystem!)
---
## Run with chroot
- Start the web server from within the chroot
- Confirm that you can connect to it from outside
- Write a script to start our "proto-container"
---
## Isolation with namespaces
- Now, enter the root filesystem with `unshare`
(enable all the namespaces you want; maybe not `user` yet, though!)
- Start the web server
(you might need to configure at least the loopback network interface!)
- Confirm that we *cannot* connect from outside
- Update our start script to use unshare
- Automate network configuration
(pay attention to the fact that network tools *may not* exist in the container)
---
## Network configuration
- While our "container" is running, create a `veth` pair
- Move one `veth` to the container
- Assign addresses to both `veth`
- Confirm that we can connect to the web server from outside
(using the address assigned to the container's `veth`)
- Update our start script to automate the setup of the `veth` pair
- Bonus points: update the script so that it can start *multiple* containers
---
## Cgroups
- Create a cgroup for our container
- Move the container to the cgroup
- Set a very low CPU limit and confirm that it slows down the server
(but doesn't affect the rest of the system)
- Update the script to automate this
---
## Non-root
- Switch to a non-privileged user when starting the container
- Adjust the web server configuration so that it starts
(non-privileged users cannot bind to ports below 1024)

View File

@@ -1,35 +0,0 @@
# Exercise — Images from scratch
There are two parts in this exercise:
1. Obtaining and unpacking an image from scratch
2. Adding overlay mounts to the "container from scratch" lab
---
## Pulling from scratch, easy mode
- Download manifest and layers with `skopeo`
- Parse manifest and configuration with e.g. `jq`
- Uncompress the layers in a directory
- Check that the result works (using `chroot`)
---
## Pulling from scratch, medium mode
- Don't use `skopeo`
- Hints: if pulling from the Docker Hub, you'll need a token
(there are examples in Docker's documentation)
---
## Pulling from scratch, hard mode
- Handle whiteouts!

View File

@@ -1,126 +0,0 @@
## Flux install
We'll install `Flux`.
And replay the whole scenario a second time.
Let's face it: we don't have that much time. 😅
Since our whole installation and configuration is `GitOps`-based, we might just rely on copy-paste and code configuration…
Maybe.
Let's copy the 📂 `./clusters/CLOUDY` folder and rename it 📂 `./clusters/METAL`.
---
### Modifying Flux config 📄 files
- In 📄 file `./clusters/METAL/flux-system/gotk-sync.yaml`
</br>change the `Kustomization` value `spec.path: ./clusters/METAL`
- ⚠️ We'll have to adapt the `Flux` _CLI_ command line
- And that's pretty much it!
- We'll see if anything goes wrong on that new cluster
---
### Connecting to our dedicated `GitHub` repo to host Flux config
.lab[
- let's replace `GITHUB_TOKEN` and `GITHUB_REPO` values
- don't forget to change the path to `clusters/METAL`
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="container-training-fleet" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--repository=${GITHUB_REPO} \
--team=OPS \
--team=ROCKY --team=MOVY \
--path=clusters/METAL
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Flux deployed our complete stack
Everything seems to be here but…
- one database is in `Pending` state
- our `ingresses` don't work well
```bash
k8s@shpod ~$ curl --header 'Host: rocky.test.enixdomain.com' http://${myIngressControllerSvcIP}
curl: (52) Empty reply from server
```
---
### Fixing the Ingress
The current `ingress-nginx` configuration relies on specific annotations used by Scaleway to bind an _IaaS_ load balancer to the `ingress-controller`.
We don't have anything like that here.😕
- We could bind our `ingress-controller` to a `NodePort`.
The `ingress-nginx` install manifests provide this option here:
</br>https://github.com/kubernetes/ingress-nginx/tree/release-1.14/deploy/static/provider/baremetal
- In the 📄file `./clusters/METAL/ingress-nginx/sync.yaml`,
</br>change the `Kustomization` value `spec.path: ./deploy/static/provider/baremetal`
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Troubleshooting the database
One of our `db-0` pods is in `Pending` state.
```bash
k8s@shpod ~$ k get pods db-0 -n *-test -oyaml
()
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2025-06-11T11:15:42Z"
message: '0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: Burstable
```
---
### Troubleshooting the PersistentVolumeClaims
```bash
k8s@shpod ~$ k get pvc postgresql-data-db-0 -n *-test -o yaml
()
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 9s (x182 over 45m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
```
No `storage class` is available on this cluster.
We didn't have this problem on our managed cluster, since a default storage class was configured there and then associated with our `PersistentVolumeClaim`.
Why is there no problem with the other database?

View File

@@ -12,119 +12,119 @@
<table>
<tr>
<td>Tuesday 23 September 2025</td>
<td>Tuesday 13 May 2025</td>
<td>
<a href="1.yml.html">Docker Intensif</a>
</td>
</tr>
<tr>
<td>Wednesday 24 September 2025</td>
<td>Wednesday 14 May 2025</td>
<td>
<a href="1.yml.html">Docker Intensif</a>
</td>
</tr>
<tr>
<td>Thursday 25 September 2025</td>
<td>Thursday 15 May 2025</td>
<td>
<a href="1.yml.html">Docker Intensif</a>
</td>
</tr>
<tr>
<td>Friday 26 September 2025</td>
<td>Friday 16 May 2025</td>
<td>
<a href="1.yml.html">Docker Intensif</a>
</td>
</tr>
<tr>
<td>Tuesday 30 September 2025</td>
<td>Tuesday 20 May 2025</td>
<td>
<a href="2.yml.html">Fondamentaux Kubernetes</a>
</td>
</tr>
<tr>
<td>Wednesday 1 October 2025</td>
<td>Wednesday 21 May 2025</td>
<td>
<a href="2.yml.html">Fondamentaux Kubernetes</a>
</td>
</tr>
<tr>
<td>Thursday 2 October 2025</td>
<td>Thursday 22 May 2025</td>
<td>
<a href="2.yml.html">Fondamentaux Kubernetes</a>
</td>
</tr>
<tr>
<td>Friday 3 October 2025</td>
<td>Friday 23 May 2025</td>
<td>
<a href="2.yml.html">Fondamentaux Kubernetes</a>
</td>
</tr>
<tr>
<td>Tuesday 7 October 2025</td>
<td>Monday 26 May 2025</td>
<td>
<a href="3.yml.html">Packaging d'applications pour Kubernetes</a>
</td>
</tr>
<tr>
<td>Wednesday 8 October 2025</td>
<td>Tuesday 27 May 2025</td>
<td>
<a href="3.yml.html">Packaging d'applications pour Kubernetes</a>
</td>
</tr>
<tr>
<td>Thursday 9 October 2025</td>
<td>Wednesday 28 May 2025</td>
<td>
<a href="3.yml.html">Packaging d'applications pour Kubernetes</a>
</td>
</tr>
<tr>
<td>Tuesday 14 October 2025</td>
<td>Monday 2 June 2025</td>
<td>
<a href="4.yml.html">Kubernetes Avancé</a>
</td>
</tr>
<tr>
<td>Wednesday 15 October 2025</td>
<td>Tuesday 3 June 2025</td>
<td>
<a href="4.yml.html">Kubernetes Avancé</a>
</td>
</tr>
<tr>
<td>Thursday 16 October 2025</td>
<td>Wednesday 4 June 2025</td>
<td>
<a href="4.yml.html">Kubernetes Avancé</a>
</td>
</tr>
<tr>
<td>Friday 17 October 2025</td>
<td>Thursday 5 June 2025</td>
<td>
<a href="4.yml.html">Kubernetes Avancé</a>
</td>
</tr>
<tr>
<td>Tuesday 4 November 2025</td>
<td>Tuesday 10 June 2025</td>
<td>
<a href="5.yml.html">Opérer Kubernetes</a>
</td>
</tr>
<tr>
<td>Wednesday 5 November 2025</td>
<td>Wednesday 11 June 2025</td>
<td>
<a href="5.yml.html">Opérer Kubernetes</a>
</td>
</tr>
<tr>
<td>Thursday 6 November 2025</td>
<td>Thursday 12 June 2025</td>
<td>
<a href="5.yml.html">Opérer Kubernetes</a>
</td>
</tr>
<tr>
<td>Friday 7 November 2025</td>
<td>Friday 13 June 2025</td>
<td>
<a href="5.yml.html">Opérer Kubernetes</a>
</td>

View File

@@ -54,11 +54,10 @@ content:
- containers/Multi_Stage_Builds.md
#- containers/Publishing_To_Docker_Hub.md
- containers/Dockerfile_Tips.md
- containers/Exercise_Dockerfile_Multistage.md
- containers/Exercise_Dockerfile_Advanced.md
#- containers/Docker_Machine.md
#- containers/Advanced_Dockerfiles.md
#- containers/Buildkit.md
#- containers/Exercise_Dockerfile_Buildkit.md
#- containers/Init_Systems.md
#- containers/Application_Configuration.md
#- containers/Logging.md
@@ -66,7 +65,6 @@ content:
#- containers/Copy_On_Write.md
#- containers/Containers_From_Scratch.md
#- containers/Container_Engines.md
#- containers/Security.md
#- containers/Pods_Anatomy.md
#- containers/Ecosystem.md
#- containers/Orchestration_Overview.md

View File

@@ -40,7 +40,7 @@ content:
- - containers/Multi_Stage_Builds.md
- containers/Publishing_To_Docker_Hub.md
- containers/Dockerfile_Tips.md
- containers/Exercise_Dockerfile_Multistage.md
- containers/Exercise_Dockerfile_Advanced.md
- - containers/Naming_And_Inspecting.md
- containers/Labels.md
- containers/Getting_Inside.md
@@ -58,17 +58,13 @@ content:
- containers/Docker_Machine.md
- - containers/Advanced_Dockerfiles.md
- containers/Buildkit.md
- containers/Exercise_Dockerfile_Buildkit.md
- containers/Init_Systems.md
- containers/Application_Configuration.md
- containers/Logging.md
- containers/Resource_Limits.md
- - containers/Namespaces_Cgroups.md
- containers/Copy_On_Write.md
- containers/Containers_From_Scratch.md
- containers/Security.md
- containers/Rootless_Networking.md
- containers/Images_Deep_Dive.md
#- containers/Containers_From_Scratch.md
- - containers/Container_Engines.md
- containers/Pods_Anatomy.md
- containers/Ecosystem.md

View File

@@ -41,7 +41,7 @@ content:
- containers/Dockerfile_Tips.md
- containers/Multi_Stage_Builds.md
- containers/Publishing_To_Docker_Hub.md
- containers/Exercise_Dockerfile_Multistage.md
- containers/Exercise_Dockerfile_Advanced.md
-
- containers/Naming_And_Inspecting.md
- containers/Labels.md
@@ -64,7 +64,6 @@ content:
- containers/Init_Systems.md
- containers/Advanced_Dockerfiles.md
- containers/Buildkit.md
- containers/Exercise_Dockerfile_Buildkit.md
-
- containers/Application_Configuration.md
- containers/Logging.md
@@ -78,7 +77,5 @@ content:
#- containers/Namespaces_Cgroups.md
#- containers/Copy_On_Write.md
#- containers/Containers_From_Scratch.md
#- containers/Rootless_Networking.md
#- containers/Security.md
#- containers/Pods_Anatomy.md
#- containers/Ecosystem.md

View File

@@ -0,0 +1,349 @@
# K01- Installing a Kubernetes cluster from scratch
We operated a managed cluster from **Scaleway** `Kapsule`.
It's great! Most batteries are included:
- storage classes, with an already configured default one
- a default CNI with `Cilium`
<br/>(`Calico` is supported too)
- an _IaaS_ load-balancer that can be managed by `ingress-controllers`
- a management _WebUI_ with the Kubernetes dashboard
- an observability stack with `metrics-server` and the Kubernetes dashboard
But what about _on premises_ needs?
---
class: extra-details
## On premises Kubernetes distributions
The [CNCF landscape](https://landscape.cncf.io/?fullscreen=yes&zoom=200&group=certified-partners-and-providers) currently lists **61!** Kubernetes distributions.
Not to mention the managed Kubernetes services from Cloud providers…
Please refer to the [`Setting up Kubernetes` chapter in the High Five M2 module](./2.yml.html#toc-setting-up-kubernetes) for more information about Kubernetes distributions.
---
## Introducing k0s
Nowadays, some "light" distros are considered good enough to run production clusters.
That's the case for `k0s`.
It's a lightweight, open source Kubernetes distribution.
It is mainly backed by **Mirantis**, a long-time software vendor in the Kubernetes ecosystem.
(The ones who bought `Docker Enterprise` a long time ago. Remember?)
`k0s` aims to be both
- a lightweight distribution for _edge-computing_ and development purposes
- an enterprise-grade HA distribution fully supported by its vendor
<br/>`MKE4` and `k0rdent` are built on `k0s`
---
### `k0s` package
Its single binary includes:
- a CRI (`containerd`)
- Kubernetes vanilla control plane components (including `etcd`)
- a vanilla network stack
- `kube-router`
- `kube-proxy`
- `coredns`
- `konnectivity`
- `kubectl` CLI
- install / uninstall features
- backup / restore features
---
class: pic
![k0s package](images/M6-k0s-packaging.png)
---
class: extra-details
### Konnectivity
You've seen that Kubernetes cluster architecture is very versatile.
I'm referring to the [`Kubernetes architecture` chapter in the High Five M5 module](./5.yml.html#toc-kubernetes-architecture)
Network communications between control plane components and worker nodes can be difficult to configure.
`Konnectivity` addresses this pain point. It acts as an RPC proxy for any communication initiated from the control plane to the workers.
These communications are listed in the [`Kubernetes internal APIs` chapter in the High Five M5 module](https://2025-01-enix.container.training/5.yml.html#toc-kubernetes-internal-apis).
The agent deployed on each worker node maintains an RPC tunnel with the Konnectivity server deployed on the control plane side.
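As a hedged illustration, the two ends of that tunnel show up in the `k0s` cluster configuration. The sketch below assumes the `k0s.k0sproject.io/v1beta1` `ClusterConfig` format, and the port numbers are what we believe to be the defaults; check the `k0s` documentation for your version before relying on them.

```yaml
# Sketch of the Konnectivity section of a k0s ClusterConfig.
# The port values are assumptions (believed defaults), not taken from this lab.
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  konnectivity:
    # Port the Konnectivity server exposes for agents:
    # each worker's agent dials in here and keeps the tunnel open
    agentPort: 8132
    # Admin port of the Konnectivity server, used on the control plane side
    adminPort: 8133
```

If these values apply to your setup, workers only need outbound access to `agentPort` on the controllers; the control plane never has to open a connection towards a worker by itself.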
---
class: pic
![konnectivity architecture](images/M6-konnectivity-architecture.png)
---
## Installing `k0s`
It installs with a one-liner command
- either as a lightweight single-node setup
- or as a multi-node HA setup
.lab[
- Get the binary
```bash
docker@m621: ~$ wget https://github.com/k0sproject/k0sctl/releases/download/v0.25.1/k0sctl-linux-amd64
```
]
---
### Prepare the config file
.lab[
- Create the config file
```bash
docker@m621: ~$ k0sctl init \
--controller-count 3 \
--user docker \
--k0s m621 m622 m623 > k0sctl.yaml
```
- change the following field: `spec.hosts[*].role: controller+worker`
- add the following fields: `spec.hosts[*].noTaints: true`
```bash
docker@m621: ~$ k0sctl apply --config k0sctl.yaml
```
]
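For reference, here is a minimal sketch of what the edited `k0sctl.yaml` might look like. It assumes the `k0sctl.k0sproject.io/v1beta1` format; the cluster name and SSH port are illustrative assumptions, and the file generated by `k0sctl init` contains more fields than shown here.

```yaml
# Sketch of k0sctl.yaml after the two edits above (trimmed; some values are assumptions).
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster          # assumed name
spec:
  hosts:
  - role: controller+worker  # was "controller" in the generated file
    noTaints: true           # let regular workloads run on controller nodes
    ssh:
      address: m621
      user: docker
      port: 22               # assumed SSH port
  - role: controller+worker
    noTaints: true
    ssh:
      address: m622
      user: docker
      port: 22
  - role: controller+worker
    noTaints: true
    ssh:
      address: m623
      user: docker
      port: 22
```

With `noTaints: true`, the three controllers also accept ordinary pods, which is what we want on a three-node lab cluster.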
---
### And the famous one-liner
.lab[
```bash
k8s@shpod: ~$ k0sctl apply --config k0sctl.yaml
```
]
---
### Check that k0s installed correctly
.lab[
```bash
docker@m621 ~$ sudo k0s status
Version: v1.33.1+k0s.1
Process ID: 60183
Role: controller
Workloads: true
SingleNode: false
Kube-api probing successful: true
Kube-api probing last error:
docker@m621 ~$ sudo k0s etcd member-list
{"members":{"m621":"https://10.10.3.190:2380","m622":"https://10.10.2.92:2380","m623":"https://10.10.2.110:2380"}}
```
]
---
### `kubectl` is included
.lab[
```bash
docker@m621 ~$ sudo k0s kubectl get nodes
NAME STATUS ROLES AGE VERSION
m621 Ready control-plane 66m v1.33.1+k0s
m622 Ready control-plane 66m v1.33.1+k0s
m623 Ready control-plane 66m v1.33.1+k0s
docker@m621 ~$ sudo k0s kubectl run shpod --image jpetazzo/shpod
```
]
---
class: extra-details
### Single node install (for info!)
For testing purposes, you may want to use a single-node (yet `etcd`-geared) install…
.lab[
- Install it
```bash
docker@m621 ~$ curl -sSLf https://get.k0s.sh | sudo sh
docker@m621 ~$ sudo k0s install controller --single
docker@m621 ~$ sudo k0s start
```
- Reset it
```bash
docker@m621 ~$ sudo k0s stop
docker@m621 ~$ sudo k0s reset
```
]
---
## Deploying shpod
.lab[
```bash
docker@m621 ~$ sudo k0s kubectl apply -f https://shpod.in/shpod.yaml
```
]
---
## Flux install
We'll install `Flux`.
And replay the whole scenario a second time.
Let's face it: we don't have that much time. 😅
Since all our installation and configuration is `GitOps`-based, we might just rely on copy-paste and configuration as code…
Maybe.
Let's copy the 📂 `./clusters/CLOUDY` folder and rename it 📂 `./clusters/METAL`.
---
### Modifying Flux config 📄 files
- In 📄 file `./clusters/METAL/flux-system/gotk-sync.yaml`
<br/>change the `Kustomization` value `spec.path: ./clusters/METAL`
- ⚠️ We'll have to adapt the `Flux` _CLI_ command line
- And that's pretty much it!
- We'll see if anything goes wrong on that new cluster
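As a rough sketch, the `Kustomization` in `gotk-sync.yaml` would then look something like this. The `interval` and `prune` values are assumptions based on what `flux bootstrap` typically generates; only `spec.path` is the change described above.

```yaml
# Sketch of the Kustomization in ./clusters/METAL/flux-system/gotk-sync.yaml
# (interval and prune are assumed values; only spec.path comes from this lab).
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m0s
  # Previously ./clusters/CLOUDY
  path: ./clusters/METAL
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```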
---
### Connecting to our dedicated `Github` repo to host Flux config
.lab[
- let's replace `GITHUB_TOKEN` and `GITHUB_REPO` values
- don't forget to change the path to `clusters/METAL`
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="container-training-fleet" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--repository=${GITHUB_REPO} \
--team=OPS \
--team=ROCKY --team=MOVY \
--path=clusters/METAL
```
]
---
class: pic
![Running Mario](images/M6-running-Mario.gif)
---
### Flux deployed our complete stack
Everything seems to be here but…
- one database is in `Pending` state
- our `ingresses` don't work well
```bash
k8s@shpod ~$ curl --header 'Host: rocky.test.enixdomain.com' http://${myIngressControllerSvcIP}
curl: (52) Empty reply from server
```
---
### Fixing the Ingress
The current `ingress-nginx` configuration relies on specific annotations used by Scaleway to bind an _IaaS_ load-balancer to the `ingress-controller`.
We don't have anything like that here.😕
- We could bind our `ingress-controller` to a `NodePort`.
`ingress-nginx` install manifests provide this option here:
<br/>https://github.com/kubernetes/ingress-nginx/tree/release-1.14/deploy/static/provider/baremetal
- In the 📄file `./clusters/METAL/ingress-nginx/sync.yaml`,
<br/>change the `Kustomization` value `spec.path: ./deploy/static/provider/baremetal`
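A minimal sketch of the resulting `Kustomization` is shown below. The resource name, namespace, interval and `sourceRef` name are assumptions for illustration; keep whatever your existing `sync.yaml` already uses and change only `spec.path`.

```yaml
# Sketch of the Kustomization in ./clusters/METAL/ingress-nginx/sync.yaml
# (metadata, interval and sourceRef names are hypothetical).
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  interval: 10m
  # Bare-metal manifests expose the controller through a NodePort Service
  # instead of requesting a cloud load-balancer
  path: ./deploy/static/provider/baremetal
  prune: true
  sourceRef:
    kind: GitRepository
    name: ingress-nginx
```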
---
class: pic
![Running Mario](images/M6-running-Mario.gif)
---
### Troubleshooting the database
One of our `db-0` pods is in `Pending` state.
```bash
k8s@shpod ~$ k get pods db-0 -n *-test -oyaml
()
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2025-06-11T11:15:42Z"
message: '0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: Burstable
```
---
### Troubleshooting the PersistentVolumeClaims
```bash
k8s@shpod ~$ k describe pvc postgresql-data-db-0 -n *-test
()
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 9s (x182 over 45m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
```
No `storage class` is available on this cluster.
We didn't have this problem on our managed cluster, since a default storage class was configured there and automatically assigned to our `PersistentVolumeClaim`.
Why is there no problem with the other database?
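To make the failure mode concrete, here is a sketch of what such a claim roughly looks like. The access mode and requested size are assumptions for illustration; the key point is the absence of `spec.storageClassName`.

```yaml
# Sketch of a claim like postgresql-data-db-0 (size and access mode are assumed).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-data-db-0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # No storageClassName here: the claim relies on the cluster's default
  # StorageClass. Without a default class, nothing provisions a volume
  # and the claim stays Pending.
```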

View File

@@ -76,7 +76,7 @@ And here we go!
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---

View File

@@ -9,7 +9,7 @@ but let's see if we can succeed by just adding manifests in our `Flux` configura
class: pic
![Flux configuration waterfall](images/flux/flux-config-dependencies.png)
![Flux configuration waterfall](images/M6-flux-config-dependencies.png)
---
@@ -89,7 +89,7 @@ k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---
@@ -132,7 +132,7 @@ k8s@shpod:~$ flux reconcile source git movy-app -n movy-test
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---
@@ -170,13 +170,13 @@ And push the modifications…
class: pic
![MOVY app has an incorrect dataset](images/flux/incorrect-dataset-in-MOVY-app.png)
![MOVY app has an incorrect dataset](images/M6-incorrect-dataset-in-MOVY-app.png)
---
class: pic
![ROCKY app has an incorrect dataset](images/flux/incorrect-dataset-in-ROCKY-app.png)
![ROCKY app has an incorrect dataset](images/M6-incorrect-dataset-in-ROCKY-app.png)
---
@@ -212,7 +212,7 @@ Please, refer to the [`Network policies` chapter in the High Five M4 module](./4
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---

View File

@@ -167,13 +167,13 @@ k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---
class: pic
![rocky config files](images/flux/R01-config-files.png)
![rocky config files](images/M6-R01-config-files.png)
---
@@ -300,7 +300,7 @@ class: extra-details
💡 This managed cluster comes with custom `StorageClasses` that rely on Cloud _IaaS_ capabilities (i.e. block devices)
![Flux configuration waterfall](images/flux/persistentvolumes.png)
![Flux configuration waterfall](images/M6-persistentvolumes.png)
- a default `StorageClass` is applied if none is specified (like here)
- for **_🏭PROD_** purpose, ops team might enforce a more performant `StorageClass`
@@ -310,7 +310,7 @@ class: extra-details
class: pic
![Flux configuration waterfall](images/flux/flux-config-dependencies.png)
![Flux configuration waterfall](images/M6-flux-config-dependencies.png)
---

View File

@@ -78,8 +78,10 @@ Prerequisites are:
- `Flux` _CLI_ needs a `Github` personal access token (_PAT_)
- to create and/or access the `Github` repository
- to give permissions to existing teams in our `Github` organization
- The _PAT_ needs _CRUD_ permissions on our `Github` organization
- The PAT needs _CRUD_ permissions on our `Github` organization
- repositories
- admin:public_key
- users
- As the **_⚙OPS_** team, let's create a `Github` personal access token…
@@ -87,7 +89,7 @@ Prerequisites are:
class: pic
![Generating a Github personal access token](images/flux/github-add-token.jpg)
![Generating a Github personal access token](images/M6-github-add-token.jpg)
---
@@ -116,32 +118,6 @@ k8s@shpod:~$ flux bootstrap github \
class: extra-details
### Creating a dedicated personal `Github` repo
You don't need to rely on a Github organization: any personal `Github` repository is OK.
.lab[
- let's replace the `GITHUB_TOKEN` value by our _Personal Access Token_
- and the `GITHUB_REPO` value by our specific repository name
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="lpiot" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--personal \
--repository=${GITHUB_REPO} \
--path=clusters/CLOUDY
```
]
---
class: extra-details
Here is the result
```bash
@@ -193,7 +169,7 @@ Here is the result
- `Flux` sets up permissions that allow teams within our organization to **access** the `Github` repository as maintainers
- Teams need to exist before `Flux` proceeds to this configuration
![Teams in Github](images/flux/github-teams.png)
![Teams in Github](images/M6-github-teams.png)
---
@@ -207,22 +183,6 @@ Here is the result
---
### The PAT is not needed anymore!
- During the install process, `Flux` creates an `ssh` key pair so that it is able to contribute to the `Github` repository.
```bash
► generating source secret
✔ public key: ecdsa-sha2-nistp384 AAAAE2VjZHNhLXNoYTItbmlzdHAzODQAAAAIbmlzdHAzODQAAABhBFqaT8B8SezU92qoE+bhnv9xONv9oIGuy7yVAznAZfyoWWEVkgP2dYDye5lMbgl6MorG/yjfkyo75ETieAE49/m9D2xvL4esnSx9zsOLdnfS9W99XSfFpC2n6soL+Exodw==
✔ configured deploy key "flux-system-main-flux-system-./clusters/CLOUDY" for "https://github.com/container-training-fleet/fleet-config-using-flux-XXXXX"
► applying source secret "flux-system/flux-system"
✔ reconciled source secret
```
- You can now delete the formerly created _Personal Access Token_: `Flux` won't use it anymore.
---
### 📂 Flux config files
`Flux` has been successfully installed onto our **_☁CLOUDY_** Kubernetes cluster!
@@ -232,13 +192,13 @@ Its configuration is managed through a _Gitops_ workflow sourced directly from o
Let's review our `Flux` configuration files we've created and pushed into the `Github` repository…
… as well as the corresponding components running in our Kubernetes cluster
![Flux config files](images/flux/flux-config-files.png)
![Flux config files](images/M6-flux-config-files.png)
---
class: pic
<!-- FIXME: wrong schema -->
![Flux architecture](images/flux/flux-controllers.png)
![Flux architecture](images/M6-flux-controllers.png)
---

View File

@@ -34,7 +34,7 @@ Several _tenants_ are created
class: pic
![Multi-tenants clusters](images/flux/cluster-multi-tenants.png )
![Multi-tenants clusters](images/M6-cluster-multi-tenants.png )
---
@@ -105,7 +105,7 @@ Let's review the `fleet-config-using-flux-XXXXX/clusters/CLOUDY/tenants.yaml` fi
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---

View File

@@ -90,13 +90,13 @@ k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---
class: pic
![Ingress-nginx provisioned an IaaS load-balancer in Scaleway Cloud services](images/flux/ingress-nginx-scaleway-lb.png)
![Ingress-nginx provisioned an IaaS load-balancer in Scaleway Cloud services](images/M6-ingress-nginx-scaleway-lb.png)
---
@@ -141,7 +141,7 @@ k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---
@@ -172,7 +172,7 @@ k8s@shpod:~$ \
class: pic
![Rocky application screenshot](images/flux/rocky-app-screenshot.png)
![Rocky application screenshot](images/M6-rocky-app-screenshot.png)
---

View File

@@ -0,0 +1,353 @@
# Installing a Kubernetes cluster from scratch
We operated a managed cluster from **Scaleway** `Kapsule`.
It's great! Most batteries are included:
- storage classes, with an already configured default one
- a default CNI with `Cilium`
<br/>(`Calico` is supported too)
- an _IaaS_ load-balancer that can be managed by `ingress-controllers`
- a management _WebUI_ with the Kubernetes dashboard
- an observability stack with `metrics-server` and the Kubernetes dashboard
But what about _on premises_ needs?
---
class: extra-details
## On premises Kubernetes distributions
The [CNCF landscape](https://landscape.cncf.io/?fullscreen=yes&zoom=200&group=certified-partners-and-providers) currently lists **61!** Kubernetes distributions.
Not to mention the managed Kubernetes services from Cloud providers…
Please refer to the [`Setting up Kubernetes` chapter in the High Five M2 module](./2.yml.html#toc-setting-up-kubernetes) for more information about Kubernetes distributions.
---
## Introducing k0s
Nowadays, some "light" distros are considered good enough to run production clusters.
That's the case for `k0s`.
It's a lightweight, open source Kubernetes distribution.
It is mainly backed by **Mirantis**, a long-time software vendor in the Kubernetes ecosystem.
(The ones who bought `Docker Enterprise` a long time ago. Remember?)
`k0s` aims to be both
- a lightweight distribution for _edge-computing_ and development purposes
- an enterprise-grade HA distribution fully supported by its vendor
<br/>`MKE4` and `k0rdent` are built on `k0s`
---
### `k0s` package
Its single binary includes:
- a CRI (`containerd`)
- Kubernetes vanilla control plane components (including `etcd`)
- a vanilla network stack
- `kube-router`
- `kube-proxy`
- `coredns`
- `konnectivity`
- `kubectl` CLI
- install / uninstall features
- backup / restore features
---
class: pic
![k0s package](images/M6-k0s-packaging.png)
---
class: extra-details
### Konnectivity
You've seen that Kubernetes cluster architecture is very versatile.
I'm referring to the [`Kubernetes architecture` chapter in the High Five M5 module](./5.yml.html#toc-kubernetes-architecture)
Network communications between control plane components and worker nodes can be difficult to configure.
`Konnectivity` addresses this pain point. It acts as an RPC proxy for any communication initiated from the control plane to the workers.
These communications are listed in the [`Kubernetes internal APIs` chapter in the High Five M5 module](https://2025-01-enix.container.training/5.yml.html#toc-kubernetes-internal-apis).
The agent deployed on each worker node maintains an RPC tunnel with the Konnectivity server deployed on the control plane side.
---
class: pic
![konnectivity architecture](images/M6-konnectivity-architecture.png)
---
## Installing `k0s`
It installs with a one-liner command
- either as a lightweight single-node setup
- or as a multi-node HA setup
.lab[
- Get the binary
```bash
docker@m621: ~$ wget https://github.com/k0sproject/k0sctl/releases/download/v0.25.1/k0sctl-linux-amd64
```
]
---
### Prepare the config file
.lab[
- Create the config file
```bash
docker@m621: ~$ k0sctl init \
--controller-count 3 \
--user docker \
--k0s m621 m622 m623 > k0sctl.yaml
```
- change the following field: `spec.hosts[*].role: controller+worker`
- add the following fields: `spec.hosts[*].noTaints: true`
```bash
docker@m621: ~$ k0sctl apply --config k0sctl.yaml
```
]
---
### And the famous one-liner
.lab[
```bash
k8s@shpod: ~$ k0sctl apply --config k0sctl.yaml
```
]
---
### Check that k0s installed correctly
.lab[
```bash
docker@m621 ~$ sudo k0s status
Version: v1.33.1+k0s.1
Process ID: 60183
Role: controller
Workloads: true
SingleNode: false
Kube-api probing successful: true
Kube-api probing last error:
docker@m621 ~$ sudo k0s etcd member-list
{"members":{"m621":"https://10.10.3.190:2380","m622":"https://10.10.2.92:2380","m623":"https://10.10.2.110:2380"}}
```
]
---
### `kubectl` is included
.lab[
```bash
docker@m621 ~$ sudo k0s kubectl get nodes
NAME STATUS ROLES AGE VERSION
m621 Ready control-plane 66m v1.33.1+k0s
m622 Ready control-plane 66m v1.33.1+k0s
m623 Ready control-plane 66m v1.33.1+k0s
docker@m621 ~$ sudo k0s kubectl run shpod --image jpetazzo/shpod
```
]
---
class: extra-details
### Single node install (for info!)
For testing purposes, you may want to use a single-node (yet `etcd`-geared) install…
.lab[
- Install it
```bash
docker@m621 ~$ curl -sSLf https://get.k0s.sh | sudo sh
docker@m621 ~$ sudo k0s install controller --single
docker@m621 ~$ sudo k0s start
```
- Reset it
```bash
docker@m621 ~$ sudo k0s stop
docker@m621 ~$ sudo k0s reset
```
]
---
## Deploying shpod
.lab[
```bash
docker@m621 ~$ sudo k0s kubectl apply -f https://shpod.in/shpod.yaml
```
]
---
## Flux install
We'll install `Flux`.
And replay the whole scenario a second time.
Let's face it: we don't have that much time. 😅
Since all our installation and configuration is `GitOps`-based, we might just rely on copy-paste and configuration as code…
Maybe.
Let's copy the 📂 `./clusters/CLOUDY` folder and rename it 📂 `./clusters/METAL`.
---
### Modifying Flux config 📄 files
- In 📄 file `./clusters/METAL/flux-system/gotk-sync.yaml`
<br/>change the `Kustomization` value `spec.path: ./clusters/METAL`
- ⚠️ We'll have to adapt the `Flux` _CLI_ command line
- And that's pretty much it!
- We'll see if anything goes wrong on that new cluster
---
### Connecting to our dedicated `Github` repo to host Flux config
.lab[
- let's replace `GITHUB_TOKEN` and `GITHUB_REPO` values
- don't forget to change the path to `clusters/METAL`
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="container-training-fleet" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--repository=${GITHUB_REPO} \
--team=OPS \
--team=ROCKY --team=MOVY \
--path=clusters/METAL
```
]
---
class: pic
![Running Mario](images/M6-running-Mario.gif)
---
### Flux deployed our complete stack
Everything seems to be here but…
- one database is in `Pending` state
- our `ingresses` don't work well
```bash
k8s@shpod ~$ curl --header 'Host: rocky.test.enixdomain.com' http://${myIngressControllerSvcIP}
curl: (52) Empty reply from server
```
---
### Fixing the Ingress
The current `ingress-nginx` configuration relies on specific annotations used by Scaleway to bind an _IaaS_ load-balancer to the `ingress-controller`.
We don't have anything like that here.😕
- We could bind our `ingress-controller` to a `NodePort`.
`ingress-nginx` install manifests provide this option here:
<br/>https://github.com/kubernetes/ingress-nginx/tree/release-1.14/deploy/static/provider/baremetal
- In the 📄file `./clusters/METAL/ingress-nginx/sync.yaml`,
<br/>change the `Kustomization` value `spec.path: ./deploy/static/provider/baremetal`
---
class: pic
![Running Mario](images/M6-running-Mario.gif)
---
### Troubleshooting the database
One of our `db-0` pods is in `Pending` state.
```bash
k8s@shpod ~$ k get pods db-0 -n *-test -oyaml
()
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2025-06-11T11:15:42Z"
message: '0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: Burstable
```
---
### Troubleshooting the PersistentVolumeClaims
```bash
k8s@shpod ~$ k describe pvc postgresql-data-db-0 -n *-test
()
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 9s (x182 over 45m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
```
No `storage class` is available on this cluster.
We didn't have this problem on our managed cluster, since a default storage class was configured there and automatically assigned to our `PersistentVolumeClaim`.
Why is there no problem with the other database?
---
## Installing OpenEBS as our CSI

View File

@@ -13,7 +13,7 @@ Please, refer to the [`Setting up Kubernetes` chapter in the High Five M4 module
---
## Creating a `Helm` source in Flux for Kyverno Helm chart
## Creating a `Helm` source in Flux for OpenEBS Helm chart
.lab[
@@ -107,7 +107,7 @@ flux create kustomization
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---

View File

@@ -58,7 +58,7 @@ k8s@shpod:~/fleet-config-using-flux-XXXXX$ flux create kustomization dashboards
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---
@@ -98,7 +98,7 @@ k8s@shpod:~$ flux create secret git flux-system \
class: pic
![Running Mario](images/running-mario.gif)
![Running Mario](images/M6-running-Mario.gif)
---
@@ -127,7 +127,7 @@ k8s@shpod:~$ k get secret kube-prometheus-stack-grafana -n monitoring \
class: pic
![Grafana dashboard screenshot](images/flux/grafana-dashboard.png)
![Grafana dashboard screenshot](images/M6-grafana-dashboard.png)
---

View File

@@ -18,7 +18,7 @@
- API backend
- database
- database (that we will keep out of Kubernetes for now)
- We have built images for our frontend and backend components
@@ -33,15 +33,7 @@
---
## Kubernetes, level 1
--
- Leave our database outside of Kubernetes (because database be scary🥺)
--
- Deploy a managed Kubernetes cluster (cloud or [professional services][enix-k8s-expert])
## Basic things we can ask Kubernetes to do
--
@@ -71,24 +63,18 @@
- Keep processing requests during the upgrade; update my containers one at a time
[enix-k8s-expert]: https://enix.io/en/kubernetes-expert/
---
## Kubernetes, level 2
- Deploy a pre-production environment
(still using our external database, for now)
- Resource management and scheduling
(reserve CPU/RAM for containers; placement constraints; priorities)
## Other things that Kubernetes can do for us
- Autoscaling
(straightforward on CPU; more complex on other metrics)
- Resource management and scheduling
(reserve CPU/RAM for containers; placement constraints)
- Advanced rollout patterns
(blue/green deployment, canary deployment)
@@ -109,74 +95,24 @@ class: pic
---
## Kubernetes, level 3
- Run staging databases on the cluster
(no replication, no backups, no scaling)
- Automatic or semi-automatic deployment of feature branches
(each with its own database)
- Fine-grained access control
(defining *what* can be done by *whom* on *which* resources)
## More things that Kubernetes can do for us
- Batch jobs
(one-off; parallel; also cron-style periodic execution)
- Package applications with e.g. Helm charts
- Fine-grained access control
---
(defining *what* can be done by *whom* on *which* resources)
## Kubernetes, level 4
- Stateful services with persistence, replication, backups
- Stateful services
(databases, message queues, etc.)
- Automate complex tasks with *operators*
- Automating complex tasks with *operators*
(e.g. database replication, failover, etc.)
- Combine the two previous points with database operators like [CloudNativePG][cnpg]
(learn more about database operators: [FR][pirates-video-fr], [EN][pirates-video-en])
- Leverage advanced storage with e.g. local ZFS volumes
(learn more about ZFS and databases on k8s: [FR][zfs-video-fr], [EN][zfs-video-en])
- Deploy and manage clusters in-house
[cnpg]: https://cloudnative-pg.io/
[pirates-video-fr]: https://www.youtube.com/watch?v=d_ka7PlWo1I
[pirates-video-en]: https://www.youtube.com/watch?v=ojUdBjbiKWk&t=5s
[zfs-video-fr]: https://www.youtube.com/watch?v=XN9YL93f8tI
[zfs-video-en]: https://www.youtube.com/watch?v=3sJIYiDnod4
---
## Kubernetes, level 5
- Deploying and managing clusters at scale
(hundreds of clusters, thousands of nodes...)
- Writing custom operators
- Hybrid deployments
---
## Disclaimer
The levels mentioned in the previous slides are not necessarily linear.
They aren't exhaustive either (we didn't mention e.g. observability and alerting).
---
## Kubernetes architecture


@@ -140,9 +140,7 @@ then make or request changes where needed.*
- [validating admission policies][ac-vap] (using CEL, Common Expression Language)
- More is coming; e.g. [mutating admission policies][ac-map]
(alpha in Kubernetes 1.32, beta in Kubernetes 1.34)
- More is coming; e.g. [mutating admission policies][ac-map] (alpha in Kubernetes 1.32)
[ac-controllers]: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/
[ac-webhooks]: https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/


@@ -1,523 +0,0 @@
# The Gateway API
- Over time, Kubernetes has introduced multiple ways to expose containers
- In the first versions of Kubernetes, we would use a `Service` of type `LoadBalancer`
- HTTP services often need extra features, though:
- content-based routing (route requests with URI, HTTP headers...)
- TLS termination
- middlewares (e.g. authentication)
- etc.
- This led to the introduction of the `Ingress` resource
---
## History of Ingress
- Kubernetes 1.8 (September 2017) introduced `Ingress` (v1beta1)
- Kubernetes 1.19 (August 2020) graduated `Ingress` to GA (v1)
- Ingress supports:
- content-based routing with URI or HTTP `Host:` header
- TLS termination (with neat integration with e.g. cert-manager)
- Ingress doesn't support:
- content-based routing with other headers (e.g. cookies)
- middlewares
- traffic split for e.g. canary deployments
---
## Everyone needed something better
- Virtually *every* ingress controller added proprietary extensions:
- `nginx.ingress.kubernetes.io/configuration-snippet` annotation
- Traefik has CRDs like `IngressRoute`, `TraefikService`, `Middleware`...
- HAProxy has CRDs like `Backend`, `TCP`...
- etc.
- Ingress was too specific to L7 (HTTP) traffic
- We needed a totally new set of APIs and resources!
---
## Gateway API in a nutshell
- Handle HTTP, GRPC, TCP, TLS, UDP routes
(note: as of October 2025, only HTTP and GRPC routes are in GA)
- Finer-grained permission model
(e.g. define which namespaces can use a specific "gateway"; more on that later)
- Standardize more "core" features than Ingress
(header-based routing, traffic weighting, rewrite requests and responses...)
- Pave the way for further extension thanks to different feature sets
(`Core` vs `Extended` vs `Implementation-specific`)
- Can also be used for service meshes
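
As an illustration of traffic weighting, here is a minimal sketch of an `HTTPRoute` that splits requests between two backends. The Gateway name, hostname, and Service names are hypothetical placeholders, not taken from these slides.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: canary-example          # hypothetical name
spec:
  parentRefs:
  - name: my-gateway            # assumes a Gateway in the same namespace
  hostnames: [ app.example.com ]
  rules:
  - backendRefs:
    - name: app-stable          # hypothetical Service receiving ~90% of requests
      port: 80
      weight: 90
    - name: app-canary          # hypothetical Service receiving ~10% of requests
      port: 80
      weight: 10
```

Weights are relative to each other, so `9` and `1` would give the same split.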
---
## Gateway API personas
- Ingress informally had two personas:
- cluster administrator (installs and manages the Ingress Controller)
- application developer (creates Ingress resources)
- Gateway [formally defines three personas][gateway-personas]:
- infrastructure provider
<br/>
(~network admin; potentially works within managed providers)
- cluster operator
<br/>
(~Kubernetes admin; potentially manages multiple clusters)
- application developer
[gateway-personas]: https://gateway-api.sigs.k8s.io/concepts/roles-and-personas/
---
class: pic
## Gateway API resources
![Diagram showing GatewayClass, Gateway, HTTPRoute, Service](https://gateway-api.sigs.k8s.io/images/resource-model.png)
---
## Gateway API resources
- `Service` = our good old Kubernetes service
- `HTTPRoute` = describes which requests should go to which `Service`
(similar to the `Ingress` resource)
- `Gateway` = how traffic enters the system
(could correspond to e.g. a `LoadBalancer` `Service`)
- `GatewayClass` = represents different types of `Gateways`
(many gateway controllers will offer only one)
---
## `HTTPRoute` anatomy
- `spec.parentRefs` = where requests come from
- typically a single `Gateway`
- could be multiple `Gateway` resources
- can also be a `Service` (for service mesh uses)
- `spec.hostnames` = which hosts (HTTP `Host:` header) this applies to
- `spec.rules[].matches` = which requests this applies to (match paths, headers...)
- `spec.rules[].filters` = optional transformations (change headers, rewrite URI...)
- `spec.rules[].backendRefs` = where requests go to
---
## Minimal `HTTPRoute`
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: xyz
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: my-gateway
    namespace: my-gateway-namespace
  hostnames: [ xyz.example.com ]
  rules:
  - backendRefs:
    - name: xyz
      port: 80
```
---
## Gateway API in action
- Let's deploy Traefik in Gateway API mode!
- We'll use the [official Helm chart for Traefik][traefik-chart]
- We'll need to set a few values
- `providers.kubernetesGateway.enabled=true`
*enable Gateway API provisioning*
- `gateway.listeners.web.namespacePolicy.from=All`
*allow `HTTPRoutes` in all namespaces to refer to the default `Gateway`*
[traefik-chart]: https://artifacthub.io/packages/helm/traefik/traefik
---
## `LoadBalancer` vs `hostPort`
- If we're using a managed Kubernetes cluster, we'll use the default mode:
- Traefik runs with a `Deployment`
- Traefik `Service` has type `LoadBalancer`
- we connect to the `LoadBalancer` public IP address
- If we don't have a CCM (or `LoadBalancer` `Service`), we'll do things differently:
- Traefik runs with a `DaemonSet`
- Traefik `Service` has type `ClusterIP` (not strictly necessary but cleaner)
- we connect to any node's public IP address
---
## Installing Traefik (with `LoadBalancer`)
Install the Helm chart:
```bash
helm upgrade --install --namespace traefik --create-namespace \
--repo https://traefik.github.io/charts traefik traefik \
--version 37.1.2 \
--set providers.kubernetesGateway.enabled=true \
--set gateway.listeners.web.namespacePolicy.from=All \
#
```
We'll connect by using the public IP address of the load balancer:
```bash
kubectl get services --namespace traefik
```
---
## Installing Traefik (with `hostPort`)
Install the Helm chart:
```bash
helm upgrade --install --namespace traefik --create-namespace \
--repo https://traefik.github.io/charts traefik traefik \
--version 37.1.2 \
--set deployment.kind=DaemonSet \
--set ports.web.hostPort=80 \
--set ports.websecure.hostPort=443 \
--set service.type=ClusterIP \
--set providers.kubernetesGateway.enabled=true \
--set gateway.listeners.web.namespacePolicy.from=All \
#
```
We'll connect by using the public IP address of any node of the cluster.
---
class: extra-details
## Taints and tolerations
- By default, Traefik Pods will respect node taints
- If some nodes have taints (e.g. control plane nodes) we might need tolerations
(if we want to run Traefik on all nodes)
- Adding the corresponding tolerations is left as an exercise for the reader!
---
class: extra-details
## Rolling updates with `hostPort`
- It is not possible to have two pods on the same node using the same `hostPort`
- Therefore, it is important to pay attention to the `DaemonSet` rolling update parameters
- If `maxUnavailable` is non-zero:
- old pods will be shut down first
- new pods will start without a problem
- there will be a short interruption of service
- If `maxSurge` is non-zero:
- new pods will be created but won't be able to start (since the `hostPort` is taken)
- old pods will remain running and the rolling update will not proceed
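
To make the two parameters concrete, here is a minimal sketch of a `DaemonSet` using a `hostPort` with an update strategy that stops the old pod before starting the new one. The names and image are hypothetical; this is not the manifest generated by the Traefik chart.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hostport-example        # hypothetical name
spec:
  selector:
    matchLabels:
      app: hostport-example
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1         # stop the old pod first, freeing the hostPort
      maxSurge: 0               # never run old and new pods side by side on a node
  template:
    metadata:
      labels:
        app: hostport-example
    spec:
      containers:
      - name: web
        image: nginx            # placeholder image
        ports:
        - containerPort: 80
          hostPort: 80
```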
---
## Testing our Gateway controller
- Send a test request to Traefik
(e.g. with `curl http://<ipaddress>`)
- For now we should get a `404 not found`
(as there are no routes configured)
---
## A basic HTTP route
- Create a basic HTTP container and expose it with a Service; e.g.:
```bash
kubectl create deployment blue --image jpetazzo/color --port 80
kubectl expose deployment blue
```
---
## A basic HTTP route
- Create an `HTTPRoute` with the following YAML:
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: blue
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: traefik-gateway
    namespace: traefik
  rules:
  - backendRefs:
    - name: blue
      port: 80
```
- Our `curl` command should now show a response from the `blue` pod
---
class: extra-details
## Traefik dashboard
- By default, Traefik exposes a dashboard
(on a different port than the one used for "normal" traffic)
- To access it:
```bash
kubectl port-forward --namespace traefik daemonset/traefik 1234:8080
```
(replace `daemonset` with `deployment` if necessary)
- Then connect to http://localhost:1234/dashboard/ (pay attention to the final `/`!)
---
## `Core` vs `Extended` vs `Implementation-specific`
- All Gateway controllers must support `Core` features
- Some optional features are in the `Extended` set:
- they may or may not be supported
- but at least, their specification is part of the API definition
- Gateway controllers can also have `Implementation-specific` features
(=proprietary extensions)
- In the following slides, we'll tag features with `Core` or `Extended`
---
## `HTTPRoute.spec.rules[].matches`
Some fields are part of the `Core` set; some are part of the `Extended` set.
```yaml
match:
  path:                         # Core
    value: /hello
    type: PathPrefix            # default value; can also be "Exact"
  headers:                      # Core
  - name: x-custom-header
    value: foo
  queryParams:                  # Extended
  - type: Exact                 # can also have implementation-specific values, e.g. Regex
    name: product
    value: pizza
  method: GET                   # Extended
```
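
For context, here is a sketch of where such a match sits inside a complete `HTTPRoute`. The Gateway and Service names are hypothetical placeholders. Conditions inside one entry of `matches` must all hold, while separate entries in the list are alternatives.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: match-example           # hypothetical name
spec:
  parentRefs:
  - name: my-gateway            # assumes a Gateway in the same namespace
  rules:
  - matches:
    - path:                     # path AND header must both match
        type: PathPrefix
        value: /hello
      headers:
      - name: x-custom-header
        value: foo
    backendRefs:
    - name: xyz                 # hypothetical Service
      port: 80
```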
---
## `HTTPRoute.spec.rules[].filters.*HeaderModifier`
`RequestHeaderModifier` is `Core`
`ResponseHeaderModifier` is `Extended`
```yaml
type: RequestHeaderModifier     # or ResponseHeaderModifier
requestHeaderModifier:          # or responseHeaderModifier
  set:                          # replace an existing header
  - name: x-my-header
    value: hello
  add:                          # appends to an existing header
  - name: x-my-header           # (adding a comma if it's already set)
    value: hello
  remove:
  - x-my-header
```
---
## `HTTPRoute.spec.rules[].filters.RequestRedirect`
```yaml
type: RequestRedirect
requestRedirect:
  scheme: https                 # http or https
  hostname: newxyz.example.com
  path:
    type: ReplaceFullPath       # or ReplacePrefixMatch
    replaceFullPath: /new       # or replacePrefixMatch
  port: 8080
  statusCode: 302               # default=302; can be 301 302 303 307 308
```
All fields are optional. Empty fields mean "leave as is".
Note that while `RequestRedirect` is `Core`, some options are `Extended`!
(See the [API specification for details][http-request-redirect].)
[http-request-redirect]: https://gateway-api.sigs.k8s.io/reference/spec/#httprequestredirectfilter
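
A common use of this filter is an HTTP to HTTPS redirect. The sketch below assumes the Gateway deployed earlier (`traefik-gateway` in namespace `traefik`) and assumes that its plain HTTP listener is named `web`; adjust both if your setup differs. The rule has no `backendRefs`, since the redirect is answered by the gateway itself.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: redirect-to-https       # hypothetical name
spec:
  parentRefs:
  - name: traefik-gateway
    namespace: traefik
    sectionName: web            # assumption: name of the plain HTTP listener
  rules:
  - filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
```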
---
## `HTTPRoute.spec.rules[].filters.URLRewrite`
```yaml
type: URLRewrite
urlRewrite:
  hostname: newxyz.example.com
  path:
    type: ReplaceFullPath       # or ReplacePrefixMatch
    replaceFullPath: /new       # or replacePrefixMatch
```
`hostname` will rewrite the HTTP `Host:` header.
This is an `Extended` feature.
It conflicts with `HTTPRequestRedirect`.
---
## `HTTPRoute.spec.rules[].filters.RequestMirror`
This is an `Extended` feature. It sends a copy of all (or a fraction of) requests to another backend. Responses from the mirrored backend are ignored.
```yaml
type: RequestMirror
requestMirror:
percent: 10
fraction:
numerator: 1
denominator: 10
backendRef:
group: "" # default
kind: Service # default
name: log-some-requests
namespace: my-observability-namespace # defaults to same namespace
port: 80
hostname: newxyz.example.com
```
Specify `percent` or `fraction`, not both. If neither is specified, all requests get mirrored.
---
## Other routes
- `GRPCRoute` can use GRPC services and methods to route requests
*this is useful if you're using GRPC; otherwise you can ignore it!*
- `TLSRoute` can use SNI header to route requests (without decrypting traffic)
*this is useful to host multiple TLS services on a single address with end-to-end encryption*
- `TCPRoute` can route TCP connections
*this is useful to colocate multiple protocols on the same address, e.g. HTTP+HTTPS+SSH*
- `UDPRoute` can route UDP packets
*ditto, e.g. for DNS/UDP, DNS/TCP, DNS/HTTPS*
---
## `gateway.spec.listeners.allowedRoutes`
- With `Ingress`, any `Ingress` resource can "catch" traffic
- This could be a problem e.g. if a dev/staging environment accidentally (or maliciously) creates an `Ingress` with a production hostname
- Gateway API introduces guardrails
- A `Gateway` can indicate if it can be referred to by routes:
- from all namespaces (like with `Ingress`)
- only from the same namespace
- only from specific namespaces matching a selector
- That's why we specified `gateway.listeners.web.namespacePolicy.from=All` when deploying Traefik
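
For comparison, here is a sketch of a `Gateway` that only accepts routes from namespaces carrying a specific label. The `gatewayClassName`, the names, and the label are assumptions made for illustration.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway              # hypothetical name
  namespace: my-gateway-namespace
spec:
  gatewayClassName: traefik     # assumption: the GatewayClass available on the cluster
  listeners:
  - name: web
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: Selector          # other values: All, Same
        selector:
          matchLabels:
            gateway-access: "true"   # hypothetical namespace label
```

With this configuration, an `HTTPRoute` created in a namespace without that label would not be attached to the listener.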
???
:EN:- The Gateway API
:FR:- La Gateway API


@@ -171,12 +171,10 @@ Note: it might take a minute or two for the worker to start.
- save successive revisions, allowing us to rollback
[Helm docs][helm-labels] and [Kubernetes docs][k8s-labels]
[Helm docs](https://helm.sh/docs/topics/chart_best_practices/labels/)
and [Kubernetes docs](https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/)
have details about recommended annotations and labels.
[helm-labels]: https://helm.sh/docs/chart_best_practices/labels/
[k8s-labels]: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
---
## Cleaning up


@@ -18,7 +18,51 @@
---
## Helm features
## From `kubectl run` to YAML
- We can create resources with one-line commands
(`kubectl run`, `kubectl create deployment`, `kubectl expose`...)
- We can also create resources by loading YAML files
(with `kubectl apply -f`, `kubectl create -f`...)
- There can be multiple resources in a single YAML file
(making them convenient to deploy entire stacks)
- However, these YAML bundles often need to be customized
(e.g.: number of replicas, image version to use, features to enable...)
---
## Beyond YAML
- Very often, after putting together our first `app.yaml`, we end up with:
- `app-prod.yaml`
- `app-staging.yaml`
- `app-dev.yaml`
- instructions indicating to users "please tweak this and that in the YAML"
- That's where using something like
[CUE](https://github.com/cue-labs/cue-by-example/tree/main/003_kubernetes_tutorial),
[Kustomize](https://kustomize.io/),
or [Helm](https://helm.sh/) can help!
- Now we can do something like this:
```bash
helm install app ... --set this.parameter=that.value
```
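
The same override can also be kept in a values file instead of being passed on the command line. As a sketch, the placeholder `this.parameter=that.value` from the command above corresponds to the following nested YAML (the file name is hypothetical):

```yaml
# values-prod.yaml (hypothetical file name)
this:
  parameter: that.value
```

That file can then be passed to `helm install` with the `--values` flag, with one file per environment replacing the per-environment copies of the YAML.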
---
## Other features of Helm
- With Helm, we create "charts"
@@ -226,7 +270,7 @@ class: extra-details
]
Then go to → https://artifacthub.io/packages/helm/securecodebox/juice-shop
Then go to → https://artifacthub.io/packages/helm/seccurecodebox/juice-shop
---

Some files were not shown because too many files have changed in this diff.