Compare commits

..

75 Commits

Author SHA1 Message Date
Jérôme Petazzoni
514c6d8362 🍂 Highfive Sep-Oct-Nov 2025 2025-11-05 19:28:41 +01:00
Ludovic Piot
96ecb86f23 📝 🎨 lpiot-issue-8: Add the Flux bootstrap without relying on an organization 2025-11-05 18:59:42 +01:00
Ludovic Piot
58255d47fa 📝 lpiot-issue-10: Add a "delete PAT" step during the Flux install process 2025-11-05 18:59:42 +01:00
Ludovic Piot
8ca2d2a4fb ✏️ 2025-11-05 18:59:42 +01:00
Ludovic Piot
641e0ea98b 📝 lpiot-issue-12: Flux only need REPO permissions in Github PAT 2025-11-05 18:59:42 +01:00
Ludovic Piot
356a0e814f 🎨 Change the name of the k0s servers 2025-11-05 18:59:42 +01:00
Ludovic Piot
2effd41ff0 📝 🐛 lpiot-issue-25: broken link 2025-11-05 18:59:42 +01:00
Ludovic Piot
af448c4540 🐛 add the YAML files needed by the M5/M6 section 2025-11-05 18:59:42 +01:00
Jérôme Petazzoni
9f0224bb26 🖼️ Re-add images for flux/M6 chapter 2025-11-04 08:19:09 +01:00
Jérôme Petazzoni
39a71565a0 🔧 Replace hyperkube with kube-apiserver
Hyperkube isn't available anymore, so the previous version of
the script would constantly redownload the tarball over and over
2025-11-04 07:46:27 +01:00
Jérôme Petazzoni
cbea696d2c ️ Invoke kind script to automatically start a k8s cluster 2025-10-29 16:09:42 +01:00
Jérôme Petazzoni
46b56b90e2 🐞 Typo fix 2025-10-29 13:40:00 +01:00
Jérôme Petazzoni
6d0d394948 ⚙️ Add academy builder script 2025-10-29 13:37:02 +01:00
Jérôme Petazzoni
d6017b5d40 ️ Add chapter about codespaces and dev clusters 2025-10-28 21:44:09 +01:00
Jérôme Petazzoni
8b91bd6ef0 🔗 Add link to FluxCD Kustomization 2025-10-28 17:59:55 +01:00
Jérôme Petazzoni
078e799666 Update Kustomize content 2025-10-28 16:22:54 +01:00
Jérôme Petazzoni
f25abf663b 🛠️ Improve AWS EKS support
- detect which EKS version to use
  (instead of hard-coding it in the TF config)
- do not issue a CSR on EKS
  (because EKS is broken and doesn't support it)
- automatically install a StorageClass on EKS
  (because the EBS CSI addon doesn't install one by default)
- put EKS clusters in the default VPC
  (instead of creating one VPC per cluster,
  since there is a default limit of 5 VPC per region)
2025-10-25 11:26:13 +02:00
Jérôme Petazzoni
6d8ae7132d ️ Improve googlecloud support
- add support to provision VMs on googlecloud
- refactor the way we define the project used by Terraform
  (we'll now use the GOOGLE_PROJECT environment variable,
  and if it's not set, we'll set it automatically by getting
  the default project from the gcloud CLI)
2025-10-24 10:46:54 +02:00
Jérôme Petazzoni
404f816de6 ️ Add a couple of slides about sidecars 2025-10-23 10:06:13 +02:00
Jérôme Petazzoni
b0a3460efa 🛜 Add details about Traffic Distribution
KEP4444 hit GA in 1.33, so I've updated the relevant slide
2025-10-22 17:05:54 +02:00
Jérôme Petazzoni
944db5f8ea ️ Add chapter on Gateway API 2025-10-22 16:48:49 +02:00
Ludovic Piot
e820ca466f 🆕 Add Flux (M5B/M6) content 2025-10-21 13:21:16 +02:00
Jérôme Petazzoni
d3c5bde6de ✏️ Mutating CEL is coming 2025-10-14 17:45:55 +02:00
Jérôme Petazzoni
b56e7bdb52 ️ Add content about Extended Resources and Dynamic Resource Allocation 2025-10-14 17:42:27 +02:00
Jérôme Petazzoni
f98c77564f 📃 Update information about swap 2025-10-13 17:30:32 +02:00
Jérôme Petazzoni
3d98d56bf8 🔗 Fix a couple of Helm URLs 2025-10-08 08:33:29 +02:00
Jérôme Petazzoni
25576a570f ♻️ Update vcluster Helm chart; improve konk script
It is now possible to have multiple konk clusters in parallel,
thanks to the KONKTAG environment variable.
2025-10-01 16:44:11 +02:00
Jérôme Petazzoni
47fc74a21a 🔗 Add a bunch of links to CNPG and ZFS talks in concept slides 2025-09-29 15:23:22 +02:00
Jérôme Petazzoni
d524cd73fa ️ Add mention to kl and gonzo 2025-09-22 16:13:48 +02:00
Jérôme Petazzoni
6b1fa88887 ️ Compile some cloud native security recs 2025-09-11 16:48:13 +02:00
Jérôme Petazzoni
f37d8112f8 🔧 Mention container engine levels 2025-09-11 16:21:27 +02:00
Jérôme Petazzoni
5005de823d ️ Merge container security content 2025-09-11 16:01:33 +02:00
Jérôme Petazzoni
de60cdbc7e ✏️ Tweak container from scratch exercise 2025-09-08 15:31:47 +02:00
Jérôme Petazzoni
605ee21b83 ️ Add BuildKit exercise 2025-09-07 10:52:42 +02:00
Jérôme Petazzoni
fd06364ab0 ♻️ Update notes about overlay support 2025-09-06 13:16:39 +02:00
Jérôme Petazzoni
1be66f3513 ️ Add image deep dive + exercise 2025-09-06 13:08:01 +02:00
Jérôme Petazzoni
3c142ad06d ️ Add logistics file for Enix 2025-09-04 17:00:39 +02:00
Jérôme Petazzoni
b291243472 ️ Add container from scratch exercise; update cgroup to v2 2025-09-04 15:01:11 +02:00
emanulato
ef7d4fcdaa fix PuTTY link in handson.md
The link to PuTTY was pointing to putty.org. This domain has no relation to the PuTTY project! Instead, the website run by the actual PuTTY team can be found under https://putty.software , see https://hachyderm.io/@simontatham/115025974777386803
2025-08-29 14:51:59 +02:00
Jérôme Petazzoni
0fd5499233 🏷️ Add descriptions for Helmfile 2025-06-30 19:34:10 +02:00
Jérôme Petazzoni
0e4d7df9fc Update Terraform Helm provider to 3.X 2025-06-27 17:40:10 +02:00
Jérôme Petazzoni
9175a5c42a 📍 Pin version of thin
Thin 2.0 was released June 22 (ish), so... We need to pin Thin to 1.X.

This is embarrassing in a way, but also a great debugging opportunity every couple of years! 😬😅
2025-06-25 17:07:27 +02:00
Jérôme Petazzoni
d090aec9f6 ️ Add a basic manifest for a Deployment+Service 2025-06-24 15:02:37 +02:00
Jérôme Petazzoni
08c702423f Add DMUC advanced exercises 2025-06-11 20:43:07 +02:00
Jérôme Petazzoni
5d5aad347b 🔧 Tweak backup chapter 2025-06-11 08:35:58 +02:00
Jérôme Petazzoni
2390783cfd 📃 Update chapter on static pods 2025-06-09 10:04:03 +02:00
Jérôme Petazzoni
10fbfa135a 📃 Update control plane auth section 2025-06-06 15:35:20 +02:00
Jérôme Petazzoni
64376c5ec2 🔒️ Update section on user key and cert generation 2025-06-06 12:01:39 +02:00
Jérôme Petazzoni
b536318b03 🔗 Links to docs and blog posts about ephemeral storage isolation 2025-06-06 09:08:51 +02:00
Jérôme Petazzoni
2a8bbfb719 🔗 Update Kyverno doc links 2025-06-06 09:08:45 +02:00
Jérôme Petazzoni
a3c2c92984 🐞 Typo fix 2025-06-02 08:03:19 +02:00
Hiranyey Gajbhiye
1062c519b8 Update concepts-k8s.md
Fixed spelling mistake if it was unintentional
2025-05-31 10:25:44 +02:00
Jérôme Petazzoni
bc0ac34f5b 📃 Clarify what needs to be scaled up in healthcheck lab 2025-05-22 15:39:11 +02:00
Jérôme Petazzoni
4896a91bd4 🔧 Tweak portal VM size to use GP4 (GP2 is deprecated) 2025-05-22 15:38:27 +02:00
Jérôme Petazzoni
303dc93ac8 📍 Pin express version in webui 2025-05-20 17:33:41 +02:00
Jérôme Petazzoni
785d704726 🏭️ Rework Kyverno chapter 2025-05-11 18:34:11 +02:00
Jérôme Petazzoni
cd346ecace 📃 Update slides about k8s setup 2025-05-07 22:33:30 +02:00
Jérôme Petazzoni
4de3c303a6 🐞 Don't query when overwriting partial zip download
Thanks @swacquie for that one
2025-05-05 19:04:52 +02:00
Jérôme Petazzoni
121713a6c7 🔧 Tweak devcontainer configuration 2025-05-02 19:43:45 +02:00
Jérôme Petazzoni
4431cfe68a 📦️ Add devcontainer
This is still highly experimental, but hopefully it'll
let us go through the beginning of the class with
github codespaces.
2025-05-02 13:04:14 +02:00
Jérôme Petazzoni
dcf218dbe2 🐞 Fix webssh python version 2025-04-28 10:07:55 +02:00
Jérôme Petazzoni
43ff815d9f 🐞 Fix tabs in logins.jsonl 2025-04-27 14:03:02 +02:00
Jérôme Petazzoni
92e61ef83b ☁️ Add nano instances for scaleway konk usecase 2025-04-27 12:53:41 +02:00
Jérôme Petazzoni
45770cc584 Add monokube exercise 2025-03-25 17:35:01 -05:00
Jérôme Petazzoni
58700396f9 🐞 Fix permissions for injected kubeconfig in mk8s stage2 2025-03-23 18:27:31 -05:00
Jérôme Petazzoni
8783da014c 🐞 Handle dualstack nodes (with multiple ExternalIP) 2025-03-23 18:15:50 -05:00
Jérôme Petazzoni
f780100217 Add kuik and a blue green exercise 2025-03-22 18:46:55 -05:00
Jérôme Petazzoni
555cd058bb 🔗 Fix source link in API deep dive 2025-03-22 18:07:18 -05:00
Jérôme Petazzoni
a05d1f9d4f ♻️ Use a variable for proxmox VM storage 2025-02-17 18:38:18 +01:00
Jérôme Petazzoni
84365d03c6 🔧 Add tags to Proxmox VMs; use linked clones by default 2025-02-17 17:28:53 +00:00
Jérôme Petazzoni
164bc01388 🛜 code-server will now also listen on IPv6 2025-02-17 17:28:01 +00:00
Jérôme Petazzoni
c07116bd29 ♻️ Update etcdctl snapshot commands; mention auger 2025-02-17 18:26:34 +01:00
Jérôme Petazzoni
c4057f9c35 🔧 Minor update to Kyverno chapter and manifests 2025-02-17 14:46:07 +01:00
Jérôme Petazzoni
f57bd9a072 Bump code server version 2025-02-17 12:55:24 +01:00
Jérôme Petazzoni
fca6396540 🐞 Fix Flux link ref 2025-02-12 11:01:00 +01:00
163 changed files with 9184 additions and 1476 deletions

View File

@@ -0,0 +1,26 @@
{
"name": "container.training environment to get started with Docker and/or Kubernetes",
"image": "ghcr.io/jpetazzo/shpod",
"features": {
//"ghcr.io/devcontainers/features/common-utils:2": {}
},
// Use 'forwardPorts' to make a list of ports inside the container available locally.
"forwardPorts": [],
//"postCreateCommand": "... install extra packages...",
"postStartCommand": "dind.sh ; kind.sh",
// This lets us use "docker-outside-docker".
// Unfortunately, minikube, kind, etc. don't work very well that way;
// so for now, we'll likely use "docker-in-docker" instead (with a
// privilege dcontainer). But we're still exposing that socket in case
// someone wants to do something interesting with it.
"mounts": ["source=/var/run/docker.sock,target=/var/run/docker-host.sock,type=bind"],
// This is for docker-in-docker.
"privileged": true,
// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
"remoteUser": "k8s"
}

1
.gitignore vendored
View File

@@ -17,6 +17,7 @@ slides/autopilot/state.yaml
slides/index.html
slides/past.html
slides/slides.zip
slides/_academy_*
node_modules
### macOS ###

View File

@@ -1,7 +1,7 @@
FROM ruby:alpine
RUN apk add --update build-base curl
RUN gem install sinatra --version '~> 3'
RUN gem install thin
RUN gem install thin --version '~> 1'
ADD hasher.rb /
CMD ["ruby", "hasher.rb"]
EXPOSE 80

View File

@@ -1,5 +1,5 @@
FROM node:4-slim
RUN npm install express
RUN npm install express@4
RUN npm install redis@3
COPY files/ /files/
COPY webui.js /

View File

@@ -0,0 +1,9 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: ingress-nginx-controller
namespace: ingress-nginx
data:
use-forwarded-headers: true
compute-full-forwarded-for: true
use-proxy-protocol: true

View File

@@ -0,0 +1,10 @@
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: flux-system
app.kubernetes.io/part-of: flux
app.kubernetes.io/version: v2.5.1
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/warn-version: latest
name: ingress-nginx

View File

@@ -0,0 +1,12 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- M6-ingress-nginx-components.yaml
- sync.yaml
patches:
- path: M6-ingress-nginx-cm-patch.yaml
target:
kind: ConfigMap
- path: M6-ingress-nginx-svc-patch.yaml
target:
kind: Service

View File

@@ -0,0 +1,8 @@
apiVersion: v1
kind: Service
metadata:
name: ingress-nginx-controller
namespace: ingress-nginx
annotations:
service.beta.kubernetes.io/scw-loadbalancer-proxy-protocol-v2: true
service.beta.kubernetes.io/scw-loadbalancer-use-hostname: true

View File

@@ -0,0 +1,10 @@
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: flux-system
app.kubernetes.io/part-of: flux
app.kubernetes.io/version: v2.5.1
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/warn-version: latest
name: kyverno

View File

@@ -0,0 +1,72 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: flux-multi-tenancy
spec:
validationFailureAction: enforce
rules:
- name: serviceAccountName
exclude:
resources:
namespaces:
- flux-system
match:
resources:
kinds:
- Kustomization
- HelmRelease
validate:
message: ".spec.serviceAccountName is required"
pattern:
spec:
serviceAccountName: "?*"
- name: kustomizationSourceRefNamespace
exclude:
resources:
namespaces:
- flux-system
- ingress-nginx
- kyverno
- monitoring
- openebs
match:
resources:
kinds:
- Kustomization
preconditions:
any:
- key: "{{request.object.spec.sourceRef.namespace}}"
operator: NotEquals
value: ""
validate:
message: "spec.sourceRef.namespace must be the same as metadata.namespace"
deny:
conditions:
- key: "{{request.object.spec.sourceRef.namespace}}"
operator: NotEquals
value: "{{request.object.metadata.namespace}}"
- name: helmReleaseSourceRefNamespace
exclude:
resources:
namespaces:
- flux-system
- ingress-nginx
- kyverno
- monitoring
- openebs
match:
resources:
kinds:
- HelmRelease
preconditions:
any:
- key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
operator: NotEquals
value: ""
validate:
message: "spec.chart.spec.sourceRef.namespace must be the same as metadata.namespace"
deny:
conditions:
- key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
operator: NotEquals
value: "{{request.object.metadata.namespace}}"

View File

@@ -0,0 +1,29 @@
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: flux-system
app.kubernetes.io/part-of: flux
app.kubernetes.io/version: v2.5.1
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/warn-version: latest
name: monitoring
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: grafana
namespace: monitoring
spec:
ingressClassName: nginx
rules:
- host: grafana.test.metal.mybestdomain.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kube-prometheus-stack-grafana
port:
number: 80

View File

@@ -0,0 +1,35 @@
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: deny-from-other-namespaces
spec:
podSelector: {}
ingress:
- from:
- podSelector: {}
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: allow-webui
spec:
podSelector:
matchLabels:
app: web
ingress:
- from: []
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: allow-db
spec:
podSelector:
matchLabels:
app: db
ingress:
- from:
- podSelector:
matchLabels:
app: web

View File

@@ -0,0 +1,10 @@
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: flux-system
app.kubernetes.io/part-of: flux
app.kubernetes.io/version: v2.5.1
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/warn-version: latest
name: openebs

View File

@@ -0,0 +1,12 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: openebs
resources:
- M6-openebs-components.yaml
- sync.yaml
configMapGenerator:
- name: openebs-values
files:
- values.yaml=M6-openebs-values.yaml
configurations:
- M6-openebs-kustomizeconfig.yaml

View File

@@ -0,0 +1,6 @@
nameReference:
- kind: ConfigMap
version: v1
fieldSpecs:
- path: spec/valuesFrom/name
kind: HelmRelease

View File

@@ -0,0 +1,15 @@
# helm install openebs --namespace openebs openebs/openebs
# --set engines.replicated.mayastor.enabled=false
# --set lvm-localpv.lvmNode.kubeletDir=/var/lib/k0s/kubelet/
# --create-namespace
engines:
replicated:
mayastor:
enabled: false
# Needed for k0s install since kubelet install is slightly divergent from vanilla install >:-(
lvm-localpv:
lvmNode:
kubeletDir: /var/lib/k0s/kubelet/
localprovisioner:
hostpathClass:
isDefaultClass: true

View File

@@ -0,0 +1,38 @@
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
namespace: rocky-test
name: rocky-full-access
rules:
- apiGroups: ["", extensions, apps]
resources: [deployments, replicasets, pods, services, ingresses, statefulsets]
verbs: [get, list, watch, create, update, patch, delete] # You can also use [*]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: rocky-pv-access
rules:
- apiGroups: [""]
resources: [persistentvolumes]
verbs: [get, list, watch, create, patch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
toolkit.fluxcd.io/tenant: rocky
name: rocky-reconciler2
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: rocky-pv-access
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: User
name: gotk:rocky-test:reconciler
- kind: ServiceAccount
name: rocky
namespace: rocky-test

19
k8s/M6-rocky-ingress.yaml Normal file
View File

@@ -0,0 +1,19 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: rocky
namespace: rocky-test
spec:
ingressClassName: nginx
rules:
- host: rocky.test.mybestdomain.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: web
port:
number: 80

View File

@@ -0,0 +1,8 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base/rocky
patches:
- path: M6-rocky-test-patch.yaml
target:
kind: Kustomization

View File

@@ -0,0 +1,7 @@
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
name: rocky
namespace: rocky-test
spec:
path: ./k8s/plain

33
k8s/blue.yaml Normal file
View File

@@ -0,0 +1,33 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: blue
name: blue
spec:
replicas: 1
selector:
matchLabels:
app: blue
template:
metadata:
labels:
app: blue
spec:
containers:
- image: jpetazzo/color
name: color
---
apiVersion: v1
kind: Service
metadata:
labels:
app: blue
name: blue
spec:
ports:
- name: "80"
port: 80
selector:
app: blue

View File

@@ -0,0 +1,12 @@
# This removes the haproxy Deployment.
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
patches:
- patch: |-
$patch: delete
kind: Deployment
apiVersion: apps/v1
metadata:
name: haproxy

View File

@@ -0,0 +1,14 @@
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
# Within a Kustomization, it is not possible to specify in which
# order transformations (patches, replacements, etc) should be
# executed. If we want to execute transformations in a specific
# order, one possibility is to put them in individual components,
# and then invoke these components in the order we want.
# It works, but it creates an extra level of indirection, which
# reduces readability and complicates maintenance.
components:
- setup
- cleanup

View File

@@ -0,0 +1,20 @@
global
#log stdout format raw local0
#daemon
maxconn 32
defaults
#log global
timeout client 1h
timeout connect 1h
timeout server 1h
mode http
option abortonclose
frontend metrics
bind :9000
http-request use-service prometheus-exporter
frontend ollama_frontend
bind :8000
default_backend ollama_backend
maxconn 16
backend ollama_backend
server ollama_server localhost:11434 check

View File

@@ -0,0 +1,39 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: haproxy
name: haproxy
spec:
selector:
matchLabels:
app: haproxy
template:
metadata:
labels:
app: haproxy
spec:
volumes:
- name: haproxy
configMap:
name: haproxy
containers:
- image: haproxy:3.0
name: haproxy
volumeMounts:
- name: haproxy
mountPath: /usr/local/etc/haproxy
readinessProbe:
httpGet:
port: 9000
ports:
- name: haproxy
containerPort: 8000
- name: metrics
containerPort: 9000
resources:
requests:
cpu: 0.05
limits:
cpu: 1

View File

@@ -0,0 +1,75 @@
# This adds a sidecar to the ollama Deployment, by taking
# the pod template and volumes from the haproxy Deployment.
# The idea is to allow to run ollama+haproxy in two modes:
# - separately (each with their own Deployment),
# - together in the same Pod, sidecar-style.
# The YAML files define how to run them separetely, and this
# "replacements" directive fetches a specific volume and
# a specific container from the haproxy Deployment, to add
# them to the ollama Deployment.
#
# This would be simpler if kustomize allowed to append or
# merge lists in "replacements"; but it doesn't seem to be
# possible at the moment.
#
# It would be even better if kustomize allowed to perform
# a strategic merge using a fieldPath as the source, because
# we could merge both the containers and the volumes in a
# single operation.
#
# Note that technically, it might be possible to layer
# multiple kustomizations so that one generates the patch
# to be used in another; but it wouldn't be very readable
# or maintainable so we decided to not do that right now.
#
# However, the current approach (fetching fields one by one)
# has an advantage: it could let us transform the haproxy
# container into a real sidecar (i.e. an initContainer with
# a restartPolicy=Always).
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
resources:
- haproxy.yaml
configMapGenerator:
- name: haproxy
files:
- haproxy.cfg
replacements:
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.volumes.[name=haproxy]
targets:
- select:
kind: Deployment
name: ollama
fieldPaths:
- spec.template.spec.volumes.[name=haproxy]
options:
create: true
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.containers.[name=haproxy]
targets:
- select:
kind: Deployment
name: ollama
fieldPaths:
- spec.template.spec.containers.[name=haproxy]
options:
create: true
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.containers.[name=haproxy].ports.[name=haproxy].containerPort
targets:
- select:
kind: Service
name: ollama
fieldPaths:
- spec.ports.[name=11434].targetPort

View File

@@ -0,0 +1,34 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: blue
name: blue
spec:
replicas: 2
selector:
matchLabels:
app: blue
template:
metadata:
labels:
app: blue
spec:
containers:
- image: jpetazzo/color
name: color
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
labels:
app: blue
name: blue
spec:
ports:
- port: 80
selector:
app: blue

View File

@@ -0,0 +1,94 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Each of these YAML files contains a Deployment and a Service.
# The blue.yaml file is here just to demonstrate that the rest
# of this Kustomization can be precisely scoped to the ollama
# Deployment (and Service): the blue Deployment and Service
# shouldn't be affected by our kustomize transformers.
resources:
- ollama.yaml
- blue.yaml
buildMetadata:
# Add a label app.kubernetes.io/managed-by=kustomize-vX.Y.Z
- managedByLabel
# Add an annotation config.kubernetes.io/origin, indicating:
# - which file defined that resource;
# - if it comes from a git repository, which one, and which
# ref (tag, branch...) it was.
- originAnnotations
# Add an annotation alpha.config.kubernetes.io/transformations
# indicating which patches and other transformers have changed
# each resource.
- transformerAnnotations
# Let's generate a ConfigMap with literal values.
# Note that this will actually add a suffix to the name of the
# ConfigMaps (e.g.: ollama-8bk8bd8m76) and it will update all
# references to the ConfigMap (e.g. in Deployment manifests)
# accordingly. The suffix is a hash of the ConfigMap contents,
# so that basically, if the ConfigMap is edited, any workload
# using that ConfigMap will automatically do a rolling update.
configMapGenerator:
- name: ollama
literals:
- "model=gemma3:270m"
- "prompt=If you visit Paris, I suggest that you"
- "queue=4"
name: ollama
patches:
# The Deployment manifest in ollama.yaml doesn't specify
# resource requests and limits, so that it can run on any
# cluster (including resource-constrained local clusters
# like KiND or minikube). The example belows add CPU
# requests and limits using a strategic merge patch.
# The patch is inlined here, but it could also be put
# in a file and referenced with "path: xxxxxx.yaml".
- patch: |
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
spec:
template:
spec:
containers:
- name: ollama
resources:
requests:
cpu: 1
limits:
cpu: 2
# This will have the same effect, with one little detail:
# JSON patches cannot specify containers by name, so this
# assumes that the ollama container is the first one in
# the pod template (whereas the strategic merge patch can
# use "merge keys" and identify containers by their name).
#- target:
# kind: Deployment
# name: ollama
# patch: |
# - op: add
# path: /spec/template/spec/containers/0/resources
# value:
# requests:
# cpu: 1
# limits:
# cpu: 2
# A "component" is a bit like a "base", in the sense that
# it lets us define some reusable resources and behaviors.
# There is a key different, though:
# - a "base" will be evaluated in isolation: it will
# generate+transform some resources, then these resources
# will be included in the main Kustomization;
# - a "component" has access to all the resources that
# have been generated by the main Kustomization, which
# means that it can transform them (with patches etc).
components:
- add-haproxy-sidecar

View File

@@ -0,0 +1,73 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: ollama
name: ollama
spec:
selector:
matchLabels:
app: ollama
template:
metadata:
labels:
app: ollama
spec:
volumes:
- name: ollama
hostPath:
path: /opt/ollama
type: DirectoryOrCreate
containers:
- image: ollama/ollama
name: ollama
env:
- name: OLLAMA_MAX_QUEUE
valueFrom:
configMapKeyRef:
name: ollama
key: queue
- name: MODEL
valueFrom:
configMapKeyRef:
name: ollama
key: model
volumeMounts:
- name: ollama
mountPath: /root/.ollama
lifecycle:
postStart:
exec:
command:
- /bin/sh
- -c
- ollama pull $MODEL
livenessProbe:
httpGet:
port: 11434
readinessProbe:
exec:
command:
- /bin/sh
- -c
- ollama show $MODEL
ports:
- name: ollama
containerPort: 11434
---
apiVersion: v1
kind: Service
metadata:
labels:
app: ollama
name: ollama
spec:
ports:
- name: "11434"
port: 11434
protocol: TCP
targetPort: 11434
selector:
app: ollama
type: ClusterIP

View File

@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- microservices
- redis

View File

@@ -0,0 +1,13 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- microservices.yaml
transformers:
- |
apiVersion: builtin
kind: PrefixSuffixTransformer
metadata:
name: use-ghcr-io
prefix: ghcr.io/
fieldSpecs:
- path: spec/template/spec/containers/image

View File

@@ -0,0 +1,125 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: hasher
name: hasher
spec:
replicas: 1
selector:
matchLabels:
app: hasher
template:
metadata:
labels:
app: hasher
spec:
containers:
- image: dockercoins/hasher:v0.1
name: hasher
---
apiVersion: v1
kind: Service
metadata:
labels:
app: hasher
name: hasher
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: hasher
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rng
name: rng
spec:
replicas: 1
selector:
matchLabels:
app: rng
template:
metadata:
labels:
app: rng
spec:
containers:
- image: dockercoins/rng:v0.1
name: rng
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rng
name: rng
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: rng
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: webui
name: webui
spec:
replicas: 1
selector:
matchLabels:
app: webui
template:
metadata:
labels:
app: webui
spec:
containers:
- image: dockercoins/webui:v0.1
name: webui
---
apiVersion: v1
kind: Service
metadata:
labels:
app: webui
name: webui
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: webui
type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: worker
name: worker
spec:
replicas: 1
selector:
matchLabels:
app: worker
template:
metadata:
labels:
app: worker
spec:
containers:
- image: dockercoins/worker:v0.1
name: worker

View File

@@ -0,0 +1,4 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- redis.yaml

View File

@@ -0,0 +1,35 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: redis
name: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- image: redis
name: redis
---
apiVersion: v1
kind: Service
metadata:
labels:
app: redis
name: redis
spec:
ports:
- port: 6379
protocol: TCP
targetPort: 6379
selector:
app: redis
type: ClusterIP

View File

@@ -0,0 +1,160 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: hasher
name: hasher
spec:
replicas: 1
selector:
matchLabels:
app: hasher
template:
metadata:
labels:
app: hasher
spec:
containers:
- image: dockercoins/hasher:v0.1
name: hasher
---
apiVersion: v1
kind: Service
metadata:
labels:
app: hasher
name: hasher
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: hasher
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: redis
name: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- image: redis
name: redis
---
apiVersion: v1
kind: Service
metadata:
labels:
app: redis
name: redis
spec:
ports:
- port: 6379
protocol: TCP
targetPort: 6379
selector:
app: redis
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rng
name: rng
spec:
replicas: 1
selector:
matchLabels:
app: rng
template:
metadata:
labels:
app: rng
spec:
containers:
- image: dockercoins/rng:v0.1
name: rng
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rng
name: rng
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: rng
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: webui
name: webui
spec:
replicas: 1
selector:
matchLabels:
app: webui
template:
metadata:
labels:
app: webui
spec:
containers:
- image: dockercoins/webui:v0.1
name: webui
---
apiVersion: v1
kind: Service
metadata:
labels:
app: webui
name: webui
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: webui
type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: worker
name: worker
spec:
replicas: 1
selector:
matchLabels:
app: worker
template:
metadata:
labels:
app: worker
spec:
containers:
- image: dockercoins/worker:v0.1
name: worker

View File

@@ -0,0 +1,30 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- dockercoins.yaml
replacements:
- sourceValue: ghcr.io/dockercoins
targets:
- select:
kind: Deployment
labelSelector: "app in (hasher,rng,webui,worker)"
# It will soon be possible to use regexes in replacement selectors,
# meaning that the "labelSelector:" above can be replaced with the
# following "name:" selector which is a tiny bit simpler:
#name: hasher|rng|webui|worker
# Regex support in replacement selectors was added by this PR:
# https://github.com/kubernetes-sigs/kustomize/pull/5863
# This PR was merged in August 2025, but as of October 2025, the
# latest release of Kustomize is 5.7.1, which was released in July.
# Hopefully the feature will be available in the next release :)
# Another possibility would be to select all Deployments, and then
# reject the one(s) for which we don't want to update the registry;
# for instance:
#reject:
# kind: Deployment
# name: redis
fieldPaths:
- spec.template.spec.containers.*.image
options:
delimiter: "/"
index: 0

View File

@@ -3,7 +3,6 @@ kind: ClusterPolicy
metadata:
name: pod-color-policy-1
spec:
validationFailureAction: enforce
rules:
- name: ensure-pod-color-is-valid
match:
@@ -18,5 +17,6 @@ spec:
operator: NotIn
values: [ red, green, blue ]
validate:
failureAction: Enforce
message: "If it exists, the label color must be red, green, or blue."
deny: {}

View File

@@ -3,7 +3,6 @@ kind: ClusterPolicy
metadata:
name: pod-color-policy-2
spec:
validationFailureAction: enforce
background: false
rules:
- name: prevent-color-change
@@ -22,6 +21,7 @@ spec:
operator: NotEquals
value: ""
validate:
failureAction: Enforce
message: "Once label color has been added, it cannot be changed."
deny:
conditions:

View File

@@ -3,7 +3,6 @@ kind: ClusterPolicy
metadata:
name: pod-color-policy-3
spec:
validationFailureAction: enforce
background: false
rules:
- name: prevent-color-change
@@ -22,7 +21,6 @@ spec:
operator: Equals
value: ""
validate:
failureAction: Enforce
message: "Once label color has been added, it cannot be removed."
deny:
conditions:
deny: {}

View File

@@ -66,7 +66,7 @@ Here is where we look for credentials for each provider:
- Civo: CLI configuration file (`~/.civo.json`)
- Digital Ocean: CLI configuration file (`~/.config/doctl/config.yaml`)
- Exoscale: CLI configuration file (`~/.config/exoscale/exoscale.toml`)
- Google Cloud: FIXME, note that the project name is currently hard-coded to `prepare-tf`
- Google Cloud: we're using "Application Default Credentials (ADC)"; run `gcloud auth application-default login`; note that we'll use the default "project" set in `gcloud` unless you set the `GOOGLE_PROJECT` environment variable
- Hetzner: CLI configuration file (`~/.config/hcloud/cli.toml`)
- Linode: CLI configuration file (`~/.config/linode-cli`)
- OpenStack: you will need to write a tfvars file (check [that exemple](terraform/virtual-machines/openstack/tfvars.example))

View File

@@ -5,19 +5,27 @@
# 10% CPU
# (See https://docs.google.com/document/d/1n0lwp6rQKQUIuo_A5LQ1dgCzrmjkDjmDtNj1Jn92UrI)
# PRO2-XS = 4 core, 16 gb
#
# With vspod:
# 800 MB RAM
# 33% CPU
#
set -e
PROVIDER=scaleway
STUDENTS=30
KONKTAG=konk
PROVIDER=linode
STUDENTS=5
case "$PROVIDER" in
linode)
export TF_VAR_node_size=g6-standard-6
export TF_VAR_location=eu-west
export TF_VAR_location=fr-par
;;
scaleway)
export TF_VAR_node_size=PRO2-XS
# For tiny testing purposes, these are okay too:
#export TF_VAR_node_size=PLAY2-NANO
export TF_VAR_location=fr-par-2
;;
esac
@@ -26,17 +34,19 @@ esac
export KUBECONFIG=~/kubeconfig
if [ "$PROVIDER" = "kind" ]; then
kind create cluster --name konk
kind create cluster --name $KONKTAG
ADDRTYPE=InternalIP
else
./labctl create --mode mk8s --settings settings/konk.env --provider $PROVIDER --tag konk
cp tags/konk/stage2/kubeconfig.101 $KUBECONFIG
if ! [ -f tags/$KONKTAG/stage2/kubeconfig.101 ]; then
./labctl create --mode mk8s --settings settings/konk.env --provider $PROVIDER --tag $KONKTAG
fi
cp tags/$KONKTAG/stage2/kubeconfig.101 $KUBECONFIG
ADDRTYPE=ExternalIP
fi
# set external_ip labels
kubectl get nodes -o=jsonpath='{range .items[*]}{.metadata.name} {.status.addresses[?(@.type=="'$ADDRTYPE'")].address}{"\n"}{end}' |
while read node address; do
while read node address ignoredaddresses; do
kubectl label node $node external_ip=$address
done

View File

@@ -55,7 +55,7 @@ _cmd_codeserver() {
need_tag
ARCH=${ARCHITECTURE-amd64}
CODESERVER_VERSION=4.96.2
CODESERVER_VERSION=4.96.4
CODESERVER_URL=https://github.com/coder/code-server/releases/download/v${CODESERVER_VERSION}/code-server-${CODESERVER_VERSION}-linux-${ARCH}.tar.gz
pssh "
set -e
@@ -76,7 +76,7 @@ Description=code-server
WantedBy=default.target
[Service]
ExecStart=/usr/local/bin/code-server --bind-addr 0:1789
ExecStart=/usr/local/bin/code-server --bind-addr [::]:1789
Restart=always
EOF
sudo systemctl --user -M $USER_LOGIN@ enable code-server.service --now
@@ -270,7 +270,27 @@ _cmd_create() {
ln -s ../../$SETTINGS tags/$TAG/settings.env.orig
cp $SETTINGS tags/$TAG/settings.env
. $SETTINGS
# For Google Cloud, it is necessary to specify which "project" to use.
# Unfortunately, the Terraform provider doesn't seem to have a way
# to detect which Google Cloud project you want to use; it has to be
# specified one way or another. Let's decide that it should be set with
# the GOOGLE_PROJECT env var; and if that var is not set, we'll try to
# figure it out from gcloud.
# (See https://github.com/hashicorp/terraform-provider-google/issues/10907#issuecomment-1015721600)
# Since we need that variable to be set each time we'll call Terraform
# (e.g. when destroying the environment), let's save it to the settings.env
# file.
if [ "$PROVIDER" = "googlecloud" ]; then
if ! [ "$GOOGLE_PROJECT" ]; then
info "PROVIDER=googlecloud but GOOGLE_PROJECT is not set. Detecting it."
GOOGLE_PROJECT=$(gcloud config get project)
info "GOOGLE_PROJECT will be set to '$GOOGLE_PROJECT'."
fi
echo "export GOOGLE_PROJECT=$GOOGLE_PROJECT" >> tags/$TAG/settings.env
fi
. tags/$TAG/settings.env
echo $MODE > tags/$TAG/mode
echo $PROVIDER > tags/$TAG/provider
@@ -374,9 +394,13 @@ _cmd_clusterize() {
done < /tmp/cluster
"
while read line; do
printf '{"login": "%s", "password": "%s", "ipaddrs": "%s"}\n' "$USER_LOGIN" "$USER_PASSWORD" "$line"
done < tags/$TAG/clusters.tsv > tags/$TAG/logins.jsonl
jq --raw-input --compact-output \
--arg USER_LOGIN "$USER_LOGIN" --arg USER_PASSWORD "$USER_PASSWORD" '
{
"login": $USER_LOGIN,
"password": $USER_PASSWORD,
"ipaddrs": .
}' < tags/$TAG/clusters.tsv > tags/$TAG/logins.jsonl
echo cluster_ok > tags/$TAG/status
}
@@ -479,7 +503,7 @@ _cmd_kubebins() {
curl -L https://github.com/etcd-io/etcd/releases/download/$ETCD_VERSION/etcd-$ETCD_VERSION-linux-$ARCH.tar.gz \
| sudo tar --strip-components=1 --wildcards -zx '*/etcd' '*/etcdctl'
fi
if ! [ -x hyperkube ]; then
if ! [ -x kube-apiserver ]; then
##VERSION##
curl -L https://dl.k8s.io/$K8SBIN_VERSION/kubernetes-server-linux-$ARCH.tar.gz \
| sudo tar --strip-components=3 -zx \
@@ -1116,7 +1140,7 @@ _cmd_tailhist () {
set -e
sudo apt-get install unzip -y
wget -c https://github.com/joewalnes/websocketd/releases/download/v0.3.0/websocketd-0.3.0-linux_$ARCH.zip
unzip websocketd-0.3.0-linux_$ARCH.zip websocketd
unzip -o websocketd-0.3.0-linux_$ARCH.zip websocketd
sudo mv websocketd /usr/local/bin/websocketd
sudo mkdir -p /opt/tailhist
sudo tee /opt/tailhist.service <<EOF
@@ -1214,14 +1238,17 @@ fi
"
}
_cmd ssh "Open an SSH session to the first node of a tag"
_cmd ssh "Open an SSH session to a node (first one by default)"
_cmd_ssh() {
TAG=$1
need_tag
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
if [ "$2" ]; then
ssh -l ubuntu -i tags/$TAG/id_rsa $2
else
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
fi
}
_cmd tags "List groups of VMs known locally"
@@ -1392,7 +1419,7 @@ WantedBy=multi-user.target
[Service]
WorkingDirectory=/opt/webssh
ExecStart=/usr/bin/env python run.py --fbidhttp=false --port=1080 --policy=reject
ExecStart=/usr/bin/env python3 run.py --fbidhttp=false --port=1080 --policy=reject
User=nobody
Group=nogroup
Restart=always

View File

@@ -23,6 +23,14 @@ pssh() {
# necessary - or down to zero, too.
sleep ${PSSH_DELAY_PRE-1}
# When things go wrong, it's convenient to ask pssh to show the output
# of the failed command. Let's make that easy with a DEBUG env var.
if [ "$DEBUG" ]; then
PSSH_I=-i
else
PSSH_I=""
fi
$(which pssh || which parallel-ssh) -h $HOSTFILE -l ubuntu \
--par ${PSSH_PARALLEL_CONNECTIONS-100} \
--timeout 300 \
@@ -31,5 +39,6 @@ pssh() {
-O UserKnownHostsFile=/dev/null \
-O StrictHostKeyChecking=no \
-O ForwardAgent=yes \
$PSSH_I \
"$@"
}

View File

@@ -1,4 +1,4 @@
#export TF_VAR_node_size=GP2.4
#export TF_VAR_node_size=GP4.4
#export TF_VAR_node_size=g6-standard-6
#export TF_VAR_node_size=m7i.xlarge

View File

@@ -2,7 +2,11 @@ terraform {
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = "2.16.1"
version = "~> 2.38.0"
}
helm = {
source = "hashicorp/helm"
version = "~> 3.0"
}
}
}
@@ -16,7 +20,7 @@ provider "kubernetes" {
provider "helm" {
alias = "cluster_${index}"
kubernetes {
kubernetes = {
config_path = "./kubeconfig.${index}"
}
}
@@ -51,42 +55,37 @@ resource "helm_release" "shpod_${index}" {
name = "shpod"
namespace = "shpod"
create_namespace = false
set {
name = "service.type"
value = "NodePort"
}
set {
name = "resources.requests.cpu"
value = "100m"
}
set {
name = "resources.requests.memory"
value = "500M"
}
set {
name = "resources.limits.cpu"
value = "1"
}
set {
name = "resources.limits.memory"
value = "1000M"
}
set {
name = "persistentVolume.enabled"
value = "true"
}
set {
name = "ssh.password"
value = random_string.shpod_${index}.result
}
set {
name = "rbac.cluster.clusterRoles"
value = "{cluster-admin}"
}
set {
name = "codeServer.enabled"
value = "true"
}
values = [
yamlencode({
service = {
type = "NodePort"
}
resources = {
requests = {
cpu = "100m"
memory = "500M"
}
limits = {
cpu = "1"
memory = "1000M"
}
}
persistentVolume = {
enabled = true
}
ssh = {
password = random_string.shpod_${index}.result
}
rbac = {
cluster = {
clusterRoles = [ "cluster-admin" ]
}
}
codeServer = {
enabled = true
}
})
]
}
resource "helm_release" "metrics_server_${index}" {
@@ -101,13 +100,75 @@ resource "helm_release" "metrics_server_${index}" {
name = "metrics-server"
namespace = "metrics-server"
create_namespace = true
set {
name = "args"
value = "{--kubelet-insecure-tls}"
}
values = [
yamlencode({
args = [ "--kubelet-insecure-tls" ]
})
]
}
# As of October 2025, the ebs-csi-driver addon (which is used on EKS
# to provision persistent volumes) doesn't automatically create a
# StorageClass. Here, we're trying to detect the DaemonSet created
# by the ebs-csi-driver; and if we find it, we create the corresponding
# StorageClass.
data "kubernetes_resources" "ebs_csi_node_${index}" {
provider = kubernetes.cluster_${index}
api_version = "apps/v1"
kind = "DaemonSet"
label_selector = "app.kubernetes.io/name=aws-ebs-csi-driver"
namespace = "kube-system"
}
resource "kubernetes_storage_class" "ebs_csi_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 1 : 0
provider = kubernetes.cluster_${index}
metadata {
name = "ebs-csi"
annotations = {
"storageclass.kubernetes.io/is-default-class" = "true"
}
}
storage_provisioner = "ebs.csi.aws.com"
}
# This section here deserves a little explanation.
#
# When we access a cluster with shpod (either through SSH or code-server)
# there is no kubeconfig file - we simply use "in-cluster" authentication
# with a ServiceAccount token. This is a bit unusual, and ideally, I would
# prefer to have a "normal" kubeconfig file in the students' shell.
#
# So what we're doing here, is that we're populating a ConfigMap with
# a kubeconfig file; and in the initialization scripts (e.g. bashrc) we
# automatically download the kubeconfig file from the ConfigMap and place
# it in ~/.kube/kubeconfig.
#
# But, which kubeconfig file should we use? We could use the "normal"
# kubeconfig file that was generated by the provider; but in some cases,
# that kubeconfig file might use a token instead of a certificate for
# user authentication - and ideally, I would like to have a certificate
# so that in the section about auth and RBAC, we can dissect that TLS
# certificate and explain where our permissions come from.
#
# So we're creating a TLS key pair; using the CSR API to issue a user
# certificate belongong to a special group; and grant the cluster-admin
# role to that group; then we use the kubeconfig file generated by the
# provider but override the user with that TLS key pair.
#
# This is not strictly necessary but it streamlines the lesson on auth.
#
# Lastly - in the ConfigMap we actually put both the original kubeconfig,
# and the one where we injected our new user (just in case we want to
# use or look at the original for any reason).
#
# One more thing: the kubernetes.io/kube-apiserver-client signer is
# disabled on EKS, so... we don't generate that ConfigMap on EKS.
# To detect if we're on EKS, we're looking for the ebs-csi-node DaemonSet.
# (Which means that the detection will break if the ebs-csi addon is missing.)
resource "kubernetes_config_map" "kubeconfig_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 0 : 1
provider = kubernetes.cluster_${index}
metadata {
name = "kubeconfig"
@@ -133,7 +194,7 @@ resource "kubernetes_config_map" "kubeconfig_${index}" {
- name: cluster-admin
user:
client-key-data: $${base64encode(tls_private_key.cluster_admin_${index}.private_key_pem)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}.certificate)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}[0].certificate)}
EOT
}
}
@@ -153,7 +214,25 @@ resource "tls_cert_request" "cluster_admin_${index}" {
}
}
resource "kubernetes_cluster_role_binding" "shpod_cluster_admin_${index}" {
provider = kubernetes.cluster_${index}
metadata {
name = "shpod-cluster-admin"
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "ClusterRole"
name = "cluster-admin"
}
subject {
api_group = "rbac.authorization.k8s.io"
kind = "Group"
name = "shpod-cluster-admins"
}
}
resource "kubernetes_certificate_signing_request_v1" "cluster_admin_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 0 : 1
provider = kubernetes.cluster_${index}
metadata {
name = "cluster-admin"

View File

@@ -23,7 +23,7 @@ variable "node_size" {
}
variable "location" {
type = string
type = string
default = null
}

View File

@@ -1,60 +1,45 @@
# Taken from:
# https://github.com/hashicorp/learn-terraform-provision-eks-cluster/blob/main/main.tf
data "aws_availability_zones" "available" {}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.19.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
data "aws_eks_cluster_versions" "_" {
default_only = true
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "19.5.1"
cluster_name = var.cluster_name
cluster_version = "1.24"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_endpoint_public_access = true
eks_managed_node_group_defaults = {
ami_type = "AL2_x86_64"
source = "terraform-aws-modules/eks/aws"
version = "~> 21.0"
name = var.cluster_name
kubernetes_version = data.aws_eks_cluster_versions._.cluster_versions[0].cluster_version
vpc_id = local.vpc_id
subnet_ids = local.subnet_ids
endpoint_public_access = true
enable_cluster_creator_admin_permissions = true
upgrade_policy = {
# The default policy is EXTENDED, which incurs additional costs
# when running an old control plane. We don't advise to run old
# control planes, but we also don't want to incur costs if an
# old version is chosen accidentally.
support_type = "STANDARD"
}
addons = {
coredns = {}
eks-pod-identity-agent = {
before_compute = true
}
kube-proxy = {}
vpc-cni = {
before_compute = true
}
aws-ebs-csi-driver = {
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
}
}
eks_managed_node_groups = {
one = {
name = "node-group-one"
x86 = {
name = "x86"
instance_types = [local.node_size]
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
}
}
}
@@ -66,7 +51,7 @@ data "aws_iam_policy" "ebs_csi_policy" {
module "irsa-ebs-csi" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "4.7.0"
version = "~> 5.39.0"
create_role = true
role_name = "AmazonEKSTFEBSCSIRole-${module.eks.cluster_name}"
@@ -75,13 +60,9 @@ module "irsa-ebs-csi" {
oidc_fully_qualified_subjects = ["system:serviceaccount:kube-system:ebs-csi-controller-sa"]
}
resource "aws_eks_addon" "ebs-csi" {
cluster_name = module.eks.cluster_name
addon_name = "aws-ebs-csi-driver"
addon_version = "v1.5.2-eksbuild.1"
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
tags = {
"eks_addon" = "ebs-csi"
"terraform" = "true"
}
resource "aws_vpc_security_group_ingress_rule" "_" {
security_group_id = module.eks.node_security_group_id
cidr_ipv4 = "0.0.0.0/0"
ip_protocol = -1
description = "Allow all traffic to Kubernetes nodes (so that we can use NodePorts, hostPorts, etc.)"
}

View File

@@ -2,7 +2,7 @@ terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.47.0"
version = "~> 6.17.0"
}
}
}

View File

@@ -0,0 +1,61 @@
# OK, we have two options here.
# 1. Create our own VPC
# - Pros: provides good isolation from other stuff deployed in the
# AWS account; makes sure that we don't interact with
# existing security groups, subnets, etc.
# - Cons: by default, there is a quota of 5 VPC per region, so
# we can only deploy 5 clusters
# 2. Use the default VPC
# - Pros/cons: the opposite :)
variable "use_default_vpc" {
type = bool
default = true
}
data "aws_vpc" "default" {
default = true
}
data "aws_subnets" "default" {
filter {
name = "vpc-id"
values = [data.aws_vpc.default.id]
}
}
data "aws_availability_zones" "available" {}
module "vpc" {
count = var.use_default_vpc ? 0 : 1
source = "terraform-aws-modules/vpc/aws"
version = "~> 6.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
public_subnets = ["10.0.21.0/24", "10.0.22.0/24", "10.0.23.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
map_public_ip_on_launch = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
locals {
vpc_id = var.use_default_vpc ? data.aws_vpc.default.id : module.vpc[0].vpc_id
subnet_ids = var.use_default_vpc ? data.aws_subnets.default.ids : module.vpc[0].public_subnets
}

View File

@@ -1,12 +0,0 @@
locals {
location = var.location != null ? var.location : "europe-north1-a"
region = replace(local.location, "/-[a-z]$/", "")
# Unfortunately, the following line doesn't work
# (that attribute just returns an empty string)
# so we have to hard-code the project name.
#project = data.google_client_config._.project
project = "prepare-tf"
}
data "google_client_config" "_" {}

View File

@@ -1,7 +1,7 @@
resource "google_container_cluster" "_" {
name = var.cluster_name
project = local.project
location = local.location
name = var.cluster_name
location = local.location
deletion_protection = false
#min_master_version = var.k8s_version
# To deploy private clusters, uncomment the section below,
@@ -42,7 +42,7 @@ resource "google_container_cluster" "_" {
node_pool {
name = "x86"
node_config {
tags = var.common_tags
tags = ["lab-${var.cluster_name}"]
machine_type = local.node_size
}
initial_node_count = var.min_nodes_per_pool
@@ -62,3 +62,25 @@ resource "google_container_cluster" "_" {
}
}
}
resource "google_compute_firewall" "_" {
name = "lab-${var.cluster_name}"
network = "default"
allow {
protocol = "tcp"
ports = ["0-65535"]
}
allow {
protocol = "udp"
ports = ["0-65535"]
}
allow {
protocol = "icmp"
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["lab-${var.cluster_name}"]
}

View File

@@ -6,6 +6,8 @@ output "has_metrics_server" {
value = true
}
data "google_client_config" "_" {}
output "kubeconfig" {
sensitive = true
value = <<-EOT

View File

@@ -1,8 +0,0 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "4.5.0"
}
}
}

View File

@@ -0,0 +1 @@
../../providers/googlecloud/provider.tf

View File

@@ -30,7 +30,7 @@ resource "scaleway_k8s_pool" "_" {
max_size = var.max_nodes_per_pool
autoscaling = var.max_nodes_per_pool > var.min_nodes_per_pool
autohealing = true
depends_on = [ scaleway_instance_security_group._ ]
depends_on = [scaleway_instance_security_group._]
}
data "scaleway_k8s_version" "_" {

View File

@@ -4,25 +4,36 @@ resource "helm_release" "_" {
create_namespace = true
repository = "https://charts.loft.sh"
chart = "vcluster"
version = "0.19.7"
set {
name = "service.type"
value = "NodePort"
}
set {
name = "storage.persistence"
value = "false"
}
set {
name = "sync.nodes.enabled"
value = "true"
}
set {
name = "sync.nodes.syncAllNodes"
value = "true"
}
set {
name = "syncer.extraArgs"
value = "{--tls-san=${local.guest_api_server_host}}"
}
version = "0.27.1"
values = [
yamlencode({
controlPlane = {
proxy = {
extraSANs = [ local.guest_api_server_host ]
}
service = {
spec = {
type = "NodePort"
}
}
statefulSet = {
persistence = {
volumeClaim = {
enabled = true
}
}
}
}
sync = {
fromHost = {
nodes = {
enabled = true
selector = {
all = true
}
}
}
}
})
]
}

View File

@@ -0,0 +1,8 @@
terraform {
required_providers {
helm = {
source = "hashicorp/helm"
version = "~> 3.0"
}
}
}

View File

@@ -0,0 +1,8 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 7.0"
}
}
}

View File

@@ -9,5 +9,9 @@ variable "node_sizes" {
variable "location" {
type = string
default = null
default = "europe-north1-a"
}
locals {
location = (var.location != "" && var.location != null) ? var.location : "europe-north1-a"
}

View File

@@ -13,6 +13,11 @@ variable "proxmox_password" {
default = null
}
variable "proxmox_storage" {
type = string
default = "local"
}
variable "proxmox_template_node_name" {
type = string
default = null

View File

@@ -1,5 +1,5 @@
provider "helm" {
kubernetes {
kubernetes = {
config_path = "~/kubeconfig"
}
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../providers/googlecloud/config.tf

View File

@@ -0,0 +1,54 @@
# Note: names and tags on GCP have to match a specific regex:
# (?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)
# In other words, they must start with a letter; and generally,
# we make them start with a number (year-month-day-etc, so 2025-...)
# so we prefix names and tags with "lab-" in this configuration.
resource "google_compute_instance" "_" {
for_each = local.nodes
zone = var.location
name = "lab-${each.value.node_name}"
tags = ["lab-${var.tag}"]
machine_type = each.value.node_size
boot_disk {
initialize_params {
image = "ubuntu-os-cloud/ubuntu-2404-lts-amd64"
}
}
network_interface {
network = "default"
access_config {}
}
metadata = {
"ssh-keys" = "ubuntu:${tls_private_key.ssh.public_key_openssh}"
}
}
locals {
ip_addresses = {
for key, value in local.nodes :
key => google_compute_instance._[key].network_interface[0].access_config[0].nat_ip
}
}
resource "google_compute_firewall" "_" {
name = "lab-${var.tag}"
network = "default"
allow {
protocol = "tcp"
ports = ["0-65535"]
}
allow {
protocol = "udp"
ports = ["0-65535"]
}
allow {
protocol = "icmp"
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["lab-${var.tag}"]
}

View File

@@ -0,0 +1 @@
../../providers/googlecloud/provider.tf

View File

@@ -0,0 +1 @@
../../providers/googlecloud/variables.tf

View File

@@ -8,6 +8,7 @@ resource "proxmox_virtual_environment_vm" "_" {
node_name = local.pve_nodes[each.value.node_index % length(local.pve_nodes)]
for_each = local.nodes
name = each.value.node_name
tags = ["container.training", var.tag]
stop_on_destroy = true
cpu {
cores = split(" ", each.value.node_size)[0]
@@ -17,7 +18,7 @@ resource "proxmox_virtual_environment_vm" "_" {
dedicated = split(" ", each.value.node_size)[1]
}
#disk {
# datastore_id = "ceph"
# datastore_id = var.proxmox_storage
# file_id = proxmox_virtual_environment_file._.id
# interface = "scsi0"
# size = 30
@@ -26,12 +27,13 @@ resource "proxmox_virtual_environment_vm" "_" {
clone {
vm_id = var.proxmox_template_vm_id
node_name = var.proxmox_template_node_name
full = false
}
agent {
enabled = true
}
initialization {
datastore_id = "ceph"
datastore_id = var.proxmox_storage
user_account {
username = "ubuntu"
keys = [trimspace(tls_private_key.ssh.public_key_openssh)]

View File

@@ -8,6 +8,9 @@ proxmox_endpoint = "https://localhost:8006/"
proxmox_username = "terraform@pve"
proxmox_password = "CHANGEME"
# Which storage to use for VM disks. Defaults to "local".
#proxmox_storage = "ceph"
proxmox_template_node_name = "CHANGEME"
proxmox_template_vm_id = CHANGEME

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-01-enix.container.training/
slides: https://2025-10-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "
@@ -49,7 +49,7 @@ content:
- containers/Advanced_Dockerfiles.md
- containers/Multi_Stage_Builds.md
- containers/Publishing_To_Docker_Hub.md
- containers/Exercise_Dockerfile_Advanced.md
- containers/Exercise_Dockerfile_Multistage.md
- # DAY 4
- containers/Buildkit.md
- containers/Network_Drivers.md

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-01-enix.container.training/
slides: https://2025-10-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "
@@ -25,7 +25,7 @@ content:
#- shared/webssh.md
- shared/connecting.md
- exercises/k8sfundamentals-brief.md
- exercises/yaml-brief.md
- exercises/yaml-dockercoins-brief.md
- exercises/localcluster-brief.md
- exercises/healthchecks-brief.md
- shared/toc.md
@@ -64,7 +64,7 @@ content:
- k8s/localkubeconfig.md
- k8s/accessinternal.md
- k8s/kubectlproxy.md
- exercises/yaml-details.md
- exercises/yaml-dockercoins-details.md
- exercises/localcluster-details.md
- # 3
#- k8s/kubectlscale.md

View File

@@ -6,7 +6,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-01-enix.container.training/
slides: https://2025-10-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-01-enix.container.training/
slides: https://2025-10-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "
@@ -26,7 +26,7 @@ content:
- shared/toc.md
- exercises/netpol-brief.md
- exercises/sealed-secrets-brief.md
- exercices/rbac-brief.md
- exercises/rbac-brief.md
- exercises/kyverno-ingress-domain-name-brief.md
- exercises/reqlim-brief.md
- #1

View File

@@ -5,7 +5,7 @@ chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-01-enix.container.training/
slides: https://2025-10-enix.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "
@@ -14,46 +14,50 @@ exclude:
content:
- shared/title.md
- logistics-ludovic.md
- logistics-m5.md
- k8s/intro.md
- shared/about-slides.md
- shared/chat-room-im.md
#- shared/chat-room-zoom-meeting.md
#- shared/chat-room-zoom-webinar.md
- shared/toc.md
# DAY 1
-
- k8s/prereqs-advanced.md
- shared/handson.md
- k8s/architecture.md
- k8s/deploymentslideshow.md
- k8s/dmuc-easy.md
-
- k8s/dmuc-medium.md
- k8s/dmuc-hard.md
- k8s/cni-internals.md
#- k8s/interco.md
- k8s/apilb.md
-
- k8s/internal-apis.md
- k8s/staticpods.md
- k8s/cluster-upgrade.md
- k8s/cluster-backup.md
#- k8s/cloud-controller-manager.md
-
- k8s/control-plane-auth.md
- k8s/user-cert.md
- k8s/control-plane-auth.md
- k8s/staticpods.md
- exercises/dmuc-auth-details.md
- exercises/dmuc-networking-details.md
- exercises/dmuc-staticpods-details.md
-
- k8s/dmuc-hard.md
- k8s/apilb.md
- k8s/cni-internals.md
- k8s/csr-api.md
- k8s/openid-connect.md
- k8s/pod-security-intro.md
- k8s/pod-security-policies.md
- k8s/pod-security-admission.md
- shared/thankyou.md
#-
# |
# # (Extra content)
# - k8s/apiserver-deepdive.md
# - k8s/setup-overview.md
# - k8s/setup-devel.md
# - k8s/setup-managed.md
# - k8s/setup-selfhosted.md
#- k8s/interco.md
#- k8s/internal-apis.md
- k8s/cluster-upgrade.md
- k8s/cluster-backup.md
#- k8s/cloud-controller-manager.md
-
- flux/scenario.md
- flux/bootstrap.md
- flux/tenants.md
- flux/app1-rocky-test.md
- flux/ingress.md
- flux/app2-movy-test.md
- k8s/k0s.md
- flux/add-cluster.md
- flux/openebs.md
- flux/observability.md
- flux/kyverno.md
- shared/thankyou.md

31
slides/academy-build.py Executable file
View File

@@ -0,0 +1,31 @@
#!/usr/bin/env python
import os
import re
import sys
html_file = sys.argv[1]
output_file_template = "_academy_{}.html"
title_regex = "name: toc-(.*)"
redirects = open("_redirects", "w")
sections = re.split(title_regex, open(html_file).read())[1:]
while sections:
link, markdown = sections[0], sections[1]
sections = sections[2:]
output_file_name = output_file_template.format(link)
with open(output_file_name, "w") as f:
html = open("workshop.html").read()
html = html.replace("@@MARKDOWN@@", markdown)
titles = re.findall("# (.*)", markdown) + [""]
html = html.replace("@@TITLE@@", "{} — Kubernetes Academy".format(titles[0]))
html = html.replace("@@SLIDENUMBERPREFIX@@", "")
html = html.replace("@@EXCLUDE@@", "")
html = html.replace(".nav[", ".hide[")
f.write(html)
redirects.write("/{} /{} 200!\n".format(link, output_file_name))
html = open(html_file).read()
html = re.sub("#toc-([^)]*)", "_academy_\\1.html", html)
sys.stdout.write(html)

View File

@@ -84,9 +84,9 @@ like Windows, macOS, Solaris, FreeBSD ...
* Each `lxc-start` process exposes a custom API over a local UNIX socket, allowing to interact with the container.
* No notion of image (container filesystems have to be managed manually).
* No notion of image (container filesystems had be managed manually).
* Networking has to be set up manually.
* Networking had to be set up manually.
---
@@ -98,10 +98,22 @@ like Windows, macOS, Solaris, FreeBSD ...
* Daemon exposing a REST API.
* Can run containers and virtual machines.
* Can manage images, snapshots, migrations, networking, storage.
* "offers a user experience similar to virtual machines but using Linux containers instead."
* Driven by Canonical.
---
## Incus
* Community-driven fork of LXD.
* Relatively recent [announced in August 2023](https://linuxcontainers.org/incus/announcement/) so time will tell what the notable differences will be.
---
## CRI-O
@@ -140,7 +152,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## Kata containers
## [Kata containers](https://katacontainers.io/)
* OCI-compliant runtime.
@@ -152,7 +164,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## gVisor
## [gVisor](https://gvisor.dev/)
* OCI-compliant runtime.
@@ -170,7 +182,17 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## Overall ...
## Others
- Micro VMs: Firecracker, Edera...
- [crun](https://github.com/containers/crun) (runc rewritten in C)
- [youki](https://youki-dev.github.io/youki/) (runc rewritten in Rust)
---
## To Docker Or Not To Docker
* The Docker Engine is very developer-centric:
@@ -184,8 +206,26 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
* As a result, it is a fantastic tool in development environments.
* On servers:
* On Kubernetes clusters, containerd or CRI-O are better choices.
- Docker is a good default choice
* On Kubernetes clusters, the container engine is an implementation detail.
- If you use Kubernetes, the engine doesn't matter
---
## Different levels
- Directly use namespaces, cgroups, capabilities with custom code or scripts
*useful for troubleshooting/debugging and for educative purposes; e.g. pipework*
- Use low-level engines like runc, crun, youki
*useful when building custom architectures; e.g. a brand new orchestrator*
- Use low-level APIs like CRI or containerd grpc API
*useful to achieve high-level features like Docker, but without Docker; e.g. ctr, nerdctl*
- Use high-level APIs like Docker and Kubernetes
*that's what most people will do*

View File

@@ -327,9 +327,7 @@ class: extra-details
## Which one is the best?
- Eventually, overlay2 should be the best option.
- It is available on all modern systems.
- In modern (2015+) systems, overlay2 should be the best option.
- Its memory usage is better than Device Mapper, BTRFS, or ZFS.

View File

@@ -141,3 +141,13 @@ class: pic
* etc.
* Docker Inc. launches commercial offers.
---
## Standardization of container runtimes
- Docker 1.11 (2016) introduces containerd and runc
- [Kubernetes 1.5 (2016)](https://kubernetes.io/blog/2016/12/kubernetes-1-5-supporting-production-workloads/) introduces the CRI
- First releases of CRI-O (2017), kata containers...

View File

@@ -0,0 +1,5 @@
# Exercise — BuildKit cache mounts
We want to make our builds faster by leveraging BuildKit cache mounts.
Of course, if we don't make any changes to the code, the build should be instantaneous. Therefore, to benchmark our changes, we will make trivial changes to the code (e.g. change the message in a "print" statement) and measure (e.g. with `time`) how long it takes to rebuild the image.

View File

@@ -1,4 +1,4 @@
# Exercise — writing better Dockerfiles
# Exercise — multi-stage builds
Let's update our Dockerfiles to leverage multi-stage builds!

View File

@@ -0,0 +1,249 @@
# Deep Dive Into Images
- Image = files (layers) + metadata (configuration)
- Layers = regular tar archives
(potentially with *whiteouts*)
- Configuration = everything needed to run the container
(e.g. Cmd, Env, WorkdingDir...)
---
## Image formats
- Docker image [v1] (no longer used, except in `docker save` and `docker load`)
- Docker image v1.1 (IDs are now hashes instead of random values)
- Docker image [v2] (multi-arch support; content-addressable images)
- [OCI image format][oci] (almost the same, except for media types)
[v1]: https://github.com/moby/docker-image-spec?tab=readme-ov-file
[v2]: https://github.com/distribution/distribution/blob/main/docs/content/spec/manifest-v2-2.md
[oci]: https://github.com/opencontainers/image-spec/blob/main/spec.md
---
## OCI images
- Manifest = JSON document
- Used by container engines to know "what should I download to unpack this image?"
- Contains references to blobs, identified by their sha256 digest + size
- config (single sha256 digest)
- layers (list of sha256 digests)
- Also annotations (key/values)
- It's also possible to have a manifest list, or "fat manifest"
(which lists multiple manifests; this is used for multi-arch support)
---
## Config blob
- Also a JSON document
- `architecture` string (e.g. `amd64`)
- `config` object
Cmd, Entrypoint, Env, ExposedPorts, StopSignal, User, Volumes, WorkingDir
- `history` list
purely informative; shown with e.g. `docker history`
- `rootfs` object
`type` (always `layers`) + list of "diff ids"
---
class: extra-details
## Layers vs layers
- The image configuration contains digests of *uncompressed layers*
- The image manifest contains digests of *compressed layers*
(layer blobs in the registry can be tar, tar+gzip, tar+zstd)
---
## Layer format
- Layer = completely normal tar archive
- When a file is added or modified, it is added to the archive
(note: trivial changes, e.g. permissions, require to re-add the whole file!)
- When a file is deleted, a *whiteout* file is created
e.g. `rm hello.txt` results in a file named `.wh.hello.txt`
- Files starting with `.wh.` are forbidden in containers
- There is a special file, `.wh..wh..opq`, which means "remove all siblings"
(optimization to completely empty a directory)
- See [layer specification](https://github.com/opencontainers/image-spec/blob/main/layer.md) for details
---
class: extra-details
## Origin of layer format
- The initial storage driver for Docker was AUFS
- AUFS is out-of-tree but Debian and Ubuntu included it
(they used it for live CD / live USB boot)
- It meant that Docker could work out of the box on these distros
- Later, Docker added support for other systems
(devicemapper thin provisioning, btrfs, overlay...)
- Today, overlay is the best compromise for most use-cases
---
## Inspecting images
- `skopeo` can copy images between different places
(registries, Docker Engine, local storage as used by podman...)
- Example:
```bash
skopeo copy docker://alpine oci:/tmp/alpine.oci
```
- The image manifest will be in `/tmp/alpine.oci/index.json`
- Blobs (image configuration and layers) will be in `/tmp/alpine.oci/blobs/sha256`
- Note: as of version 1.20, `skopeo` doesn't handle extensions like stargz yet
(copying stargz images won't transfer the special index blobs)
---
## Layer surgery
Here is an example of how to manually edit an image.
https://github.com/jpetazzo/layeremove
It removes a specific layer from an image.
Note: it would be better to use a buildkit cache mount instead.
(This is just an educative example!)
---
## Stargz
- [Stargz] = Seekable Tar Gz, or "stargazer"
- Goal: start a container *before* its image has been fully downloaded
- Particularly useful for huge images that take minutes to download
- Also known as "streamable images" or "lazy loading"
- Alternative: [SOCI]
[stargz]: https://github.com/containerd/stargz-snapshotter
[SOCI]: https://github.com/awslabs/soci-snapshotter
---
## Stargz architecture
- Combination of:
- a backward-compatible extension to the OCI image format
- a containerd *snapshotter*
(=containerd component responsible for managing container and image storage)
- tooling to create, convert, optimize images
- Installation requires:
- running the snapshotter daemon
- configuring containerd
- building new images or converting the existing ones
---
## Stargz principle
- Normal image layer = tar.gz = gzip(tar(file1, file2, ...))
- Can't access fileN without uncompressing everything before it
- Seekable Tar Gz = gzip(tar(file1)) + gzip(tar(file2)) + ... + index
(big files can also be chunked)
- Can access individual files
(and even individual chunks, if needed)
- Downside: lower compression ratio
(less compression context; extra gzip headers)
---
## Stargz format
- The index mentioned above is stored in separate registry blobs
(one index for each layer)
- The digest of the index blobs is stored in annotations in normal OCI images
- Fully compatible with existing registries
- Existing container engines will load images transparently
(without leveraging stargz capabilities)
---
## Stargz limitations
- Tools like `skopeo` will ignore index blobs
(=copying images across registries will discard stargz capabilities)
- Indexes need to be downloaded before container can be started
(=still significant start time when there are many files in images)
- Significant latency when accessing a file lazily
(need to hit the registry, typically with a range header, uncompress file)
- Images can be optimized to pre-load important files

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,75 @@
# Rootless Networking
The "classic" approach for container networking is `veth` + bridge.
Pros:
- good performance
- easy to manage and understand
- flexible (possibility to use multiple, isolated bridges)
Cons:
- requires root access on the host to set up networking
---
## Rootless options
- Locked down helpers
- daemon, scripts started through sudo...
- used by some desktop virtualization platforms
- still requires root access at some point
- Userland networking stacks
- true solution that does not require root privileges
- lower performance
---
## Userland stacks
- [SLiRP](https://en.wikipedia.org/wiki/Slirp)
*the OG project that inspired the other ones!*
- [VPNKit](https://github.com/moby/vpnkit)
*introduced by Docker Desktop to play nice with enterprise VPNs*
- [slirp4netns](https://github.com/rootless-containers/slirp4netns)
*slirp adapted for network namespaces, and therefore, containers; better performance*
- [passt and pasta](https://passt.top/)
*more modern approach; better support for inbound traffic; IPv6...)*
---
## Passt/Pasta
- No dependencies
- NAT (like slirp4netns) or no-NAT (for e.g. KubeVirt)
- Can handle inbound traffic dynamically
- No dynamic memory allocation
- Good security posture
- IPv6 support
- Reasonable performance
---
## Demo?

View File

@@ -0,0 +1,162 @@
# Security models
In this section, we want to address a few security-related questions:
- What permissions do we need to run containers or a container engine?
- Can we use containers to escalate permissions?
- Can we break out of a container (move from container to host)?
- Is it safe to run untrusted code in containers?
- What about Kubernetes?
---
## Running Docker, containerd, podman...
- In the early days, running containers required root permissions
(to set up namespaces, cgroups, networking, mount filesystems...)
- Eventually, new kernel features were developed to allow "rootless" operation
(user namespaces and associated tweaks)
- Rootless requires a little bit of additional setup on the system (e.g. subuid)
(although this is increasingly often automated in modern distros)
- Docker runs as root by default; Podman runs rootless by default
---
## Advantages of rootless
- Containers can run without any intervention from root
(no package install, no daemon running as root...)
- Containerized processes run with non-privileged UID
- Container escape doesn't automatically result in full host compromise
- Can isolate workloads by using different UID
---
## Downsides of rootless
- *Relatively* newer (rootless Docker was introduced in 2019)
- many quirks/issues/limitations in the initial implementations
- kernel features and other mechanisms were introduced over time
- they're not always very well documented
- I/O performance (disk, network) is typically lower
(due to using special mechanisms instead of more direct access)
- Rootless and rootful engines must use different image storage
(due to UID mapping)
---
## Why not rootless everywhere?
- Not very useful on clusters
- users shouldn't log into cluster nodes
- questionable security improvement
- lower I/O performance
- Not very useful with Docker Desktop / Podman Desktop
- container workloads are already inside a VM
- could arguably provide a layer of inter-workload isolation
- would require new APIs and concepts
---
## Permission escalation
- Access to the Docker socket = root access to the machine
```bash
docker run --privileged -v /:/hostfs -ti alpine
```
- That's why by default, the Docker socket access is locked down
(only accessible by `root` and group `docker`)
- If user `alice` has access to the Docker socket:
*compromising user `alice` leads to whole host compromise!*
- Doesn't fundamentally change the threat model
(if `alice` gets compromised in the first place, we're in trouble!)
- Enables new threats (persistence, kernel access...)
---
## Avoiding the problem
- Rootless containers
- Container VM (Docker Desktop, Podman Desktop, Orbstack...)
- Unfortunately: no fine-grained access to the Docker API
(no way to e.g. disable privileged containers, volume mounts...)
---
## Escaping containers
- Very easy with some features
(privileged containers, volume mounts, device access)
- Otherwise impossible in theory
(but of course, vulnerabilities do exist!)
- **Be careful with scripts invoking `docker run`, or Compose files!**
---
## Untrusted code
- Should be safe as long as we're not enabling dangerous features
(privileged containers, volume mounts, device access, capabilities...)
- Remember that by default, containers can make network calls
(but see: `--net none` and also `docker network create --internal`)
- And of course, again: vulnerabilities do exist!
---
## What about Kubernetes?
- Ability to run arbitrary pods = dangerous
- But there are multiple safety mechanisms available:
- Pod Security Settings (limit "dangerous" features)
- RBAC (control who can do what)
- webhooks and policy engines for even finer grained control

View File

@@ -0,0 +1,159 @@
# Exercise — Build a container from scratch
Our goal will be to execute a container running a simple web server.
(Example: NGINX, or https://github.com/jpetazzo/color.)
We want the web server to be isolated:
- it shouldn't be able to access the outside world,
- but we should be able to connect to it from our machine.
Make sure to automate / script things as much as possible!
---
## Steps
1. Prepare the filesystem
2. Run it with chroot
3. Isolation with namespaces
4. Network configuration
5. Cgroups
6. Non-root
---
## Prepare the filesystem
- Obtain a root filesystem with one of the following methods:
- download an Alpine mini root fs
- export an Alpine or NGINX container image with Docker
- download and convert a container image with Skopeo
- make it from scratch with busybox + a static [jpetazzo/color](https://github.com/jpetazzo/color)
- ...anything you want! (Nix, anyone?)
- Enter the root filesystem with `chroot`
---
## Help, network does not work!
- Check that you have external connectivity from the chroot:
```bash
ping 1.1.1.1
```
(that *should* work; if it doesn't, we have a serious problem!)
- Check that DNS resolution works:
```bash
ping enix.io
```
- If you're having a DNS resolution error, configure DNS in the container:
```bash
echo nameserver 1.1.1.1 > /etc/resolv.conf
```
---
## Running a web server
Here are a few possibilities...
- Install the NGINX package and run it with `nginx`
(note: by default it will start in the background)
- Run NGINX in the foreground with `nginx -g "daemon off;"`
- Install the package Caddy and run `caddy file-server -ab`
(it will remain in the foreground and show logs; **RECOMMENDED**)
- Download and/or build https://github.com/jpetazzo/color
(if you're familiar with the Go ecosystem!)
---
## Run with chroot
- Start the web server from within the chroot
- Confirm that you can connect to it from outside
- Write a script to start our "proto-container"
---
## Isolation with namespaces
- Now, enter the root filesystem with `unshare`
(enable all the namespaces you want; maybe not `user` yet, though!)
- Start the web server
(you might need to configure at least the loopback network interface!)
- Confirm that we *cannot* connect from outside
- Update the our start script to use unshare
- Automate network configuration
(pay attention to the fact that network tools *may not* exist in the container)
---
## Network configuration
- While our "container" is running, create a `veth` pair
- Move one `veth` to the container
- Assign addresses to both `veth`
- Confirm that we can connect to the web server from outside
(using the address assigned to the container's `veth`)
- Update our start script to automate the setup of the `veth` pair
- Bonus points: update the script to that it can start *multiple* containers
---
## Cgroups
- Create a cgroup for our container
- Move the container to the cgroup
- Set a very low CPU limit and confirm that it slows down the server
(but doesn't affect the rest of the system)
- Update the script to automate this
---
## Non-root
- Switch to a non-privileged user when starting the container
- Adjust the web server configuration so that it starts
(non-privileged users cannot bind to ports below 1024)

View File

@@ -0,0 +1,32 @@
# Exercise — enable auth
- We want to enable authentication and authorization
- Checklist:
- non-privileged user can deploy in their namespace
<br/>(and nowhere else)
- each controller uses its own key, certificate, and identity
- each node uses its own key, certificate, and identity
- Service Accounts work properly
- See next slide for help / hints!
---
## Checklist
- Generate keys, certs, and kubeconfig for everything that needs them
(cluster admin, cluster user, controller manager, scheduler, kubelet)
- Reconfigure and restart each component to use its new identity
- Turn on `RBAC` and `Node` authorizers on the API server
- Check that everything works properly
(e.g. that you can create and scale a Deployment using the "cluster user" identity)

View File

@@ -0,0 +1,51 @@
# Exercise — networking
- We want to install extra networking components:
- a CNI configuration
- kube-proxy
- CoreDNS
- After doing that, we should be able to deploy a "complex" app
(with multiple containers communicating together + service discovery)
---
## CNI
- Easy option: Weave
https://github.com/weaveworks/weave/releases
- Better option: Cilium
https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-the-cilium-cli
or https://docs.cilium.io/en/stable/installation/k8s-install-helm/#installation-using-helm
---
## kube-proxy
- Option 1: author a DaemonSet
- Option 2: leverage the CNI (some CNIs like Cilium can replace kube-proxy)
---
## CoreDNS
- Suggested method: Helm chart
(available on https://github.com/coredns/helm)
---
## Testing
- Try to deploy DockerCoins and confirm that it works
(for instance with [this YAML file](https://raw.githubusercontent.com/jpetazzo/container.training/refs/heads/main/k8s/dockercoins.yaml))

View File

@@ -0,0 +1,22 @@
# Exercise — static pods
- We want to run the control plane in static pods
(etcd, API server, controller manager, scheduler)
- For Kubernetes components, we can use [these images](https://kubernetes.io/releases/download/#container-images)
- For etcd, we can use [this image](https://quay.io/repository/coreos/etcd?tab=tags)
- If we're using keys, certificates... We can use [hostPath volumes](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath)
---
## Testing
After authoring our static pod manifests and placing them in the right directory,
we should be able to start our cluster simply by starting kubelet.
(Assuming that the container engine is already running.)
For bonus points: write and enable a systemd unit for kubelet!

View File

@@ -26,7 +26,7 @@
- it should initially show a few milliseconds latency
- that will increase when we scale up
- that will increase when we scale up the number of `worker` Pods
- it will also let us detect when the service goes "boom"

View File

@@ -0,0 +1,35 @@
# Exercise — Images from scratch
There are two parts in this exercise:
1. Obtaining and unpacking an image from scratch
2. Adding overlay mounts to the "container from scratch" lab
---
## Pulling from scratch, easy mode
- Download manifest and layers with `skopeo`
- Parse manifest and configuration with e.g. `jq`
- Uncompress the layers in a directory
- Check that the result works (using `chroot`)
---
## Pulling from scratch, medium mode
- Don't use `skopeo`
- Hints: if pulling from the Docker Hub, you'll need a token
(there are examples in Docker's documentation)
---
## Pulling from scratch, hard mode
- Handle whiteouts!

View File

@@ -26,8 +26,8 @@ When a Service gets created...
- We want to use a Kyverno `generate` ClusterPolicy
- For step 1, check [Generate Resources](https://kyverno.io/docs/writing-policies/generate/) documentation
- For step 1, check [Generate Resources](https://kyverno.io/docs/policy-types/cluster-policy/generate/) documentation
- For step 2, check [Preconditions](https://kyverno.io/docs/writing-policies/preconditions/) documentation
- For step 2, check [Preconditions](https://kyverno.io/docs/policy-types/cluster-policy/preconditions/) documentation
- For step 3, check [External Data Sources](https://kyverno.io/docs/writing-policies/external-data-sources/) documentation
- For step 3, check [External Data Sources](https://kyverno.io/docs/policy-types/cluster-policy/external-data-sources/) documentation

View File

@@ -0,0 +1,51 @@
# Exercise — Monokube static pods
- We want to run a very basic Kubernetes cluster by starting only:
- kubelet
- a container engine (e.g. Docker)
- The other components (control plane and otherwise) should be started with:
- static pods
- "classic" manifests loaded with e.g. `kubectl apply`
- This should be done with the "monokube" VM
(which has Docker and kubelet 1.19 binaries available)
---
## Images to use
Here are some suggestions of images:
- etcd → `quay.io/coreos/etcd:vX.Y.Z`
- Kubernetes components → `registry.k8s.io/kube-XXX:vX.Y.Z`
(where `XXX` = `apiserver`, `scheduler`, `controller-manager`)
To know which versions to use, check the version of the binaries installed on the `monokube` VM, and use the same ones.
See next slide for more hints!
---
## Inventory
We'll need to run:
- kubelet (with the flag for static pod manifests)
- Docker
- static pods for control plane components
(suggestion: use `hostNetwork`)
- static pod or DaemonSet for `kube-proxy`
(will require a privileged security context)

View File

@@ -0,0 +1,86 @@
# Exercise — Writing blue/green YAML
- We want to author YAML manifests for the "color" app
(use image `jpetazzo/color` or `ghcr.io/jpetazzo/color`)
- That app serves web requests on port 80
- We want to deploy two instances of that app (`blue` and `green`)
- We want to expose the app with a service named `front`, such that:
90% of the requests are sent to `blue`, and 10% to `green`
---
## End goal
- We want to be able to do something like:
```bash
kubectl apply -f blue-green-demo.yaml
```
- Then connect to the `front` service and see responses from `blue` and `green`
- Then measure e.g. on 100 requests how many go to `blue` and `green`
(we want a 90/10 traffic split)
- Go ahead, or check the next slides for hints!
---
## Step 1
- Test the app in isolation:
- create a Deployment called `blue`
- expose it with a Service
- connect to the service and see a "blue" reply
- If you use a `ClusterIP` service:
- if you're logged directly on the clusters you can connect directly
- otherwise you can use `kubectl port-forward`
- Otherwise, you can use a `NodePort` or `LoadBalancer` service
---
## Step 2
- Add the `green` Deployment
- Create the `front` service
- Edit the `front` service to replace its selector with a custom one
- Edit `blue` and `green` to add the label(s) of your custom selector
- Check that traffic hits both green and blue
- Think about how to obtain the 90/10 traffic split
---
## Step 3
- Generate, write, extract, ... YAML manifests for all components
(`blue` and `green` Deployments, `front` Service)
- Check that applying the manifests (e.g. in a brand new namespace) works
- Bonus points: add a one-shot pod to check the traffic split!
---
## Discussion
- Would this be a viable option to obtain, say, a 95% / 5% traffic split?
- What about 99% / 1 %?

126
slides/flux/add-cluster.md Normal file
View File

@@ -0,0 +1,126 @@
## Flux install
We'll install `Flux`.
And replay the all scenario a 2nd time.
Let's face it: we don't have that much time. 😅
Since all our install and configuration is `GitOps`-based, we might just leverage on copy-paste and code configuration…
Maybe.
Let's copy the 📂 `./clusters/CLOUDY` folder and rename it 📂 `./clusters/METAL`.
---
### Modifying Flux config 📄 files
- In 📄 file `./clusters/METAL/flux-system/gotk-sync.yaml`
</br>change the `Kustomization` value `spec.path: ./clusters/METAL`
- ⚠️ We'll have to adapt the `Flux` _CLI_ command line
- And that's pretty much it!
- We'll see if anything goes wrong on that new cluster
---
### Connecting to our dedicated `Github` repo to host Flux config
.lab[
- let's replace `GITHUB_TOKEN` and `GITHUB_REPO` values
- don't forget to change the patch to `clusters/METAL`
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="container-training-fleet" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--repository=${GITHUB_REPO} \
--team=OPS \
--team=ROCKY --team=MOVY \
--path=clusters/METAL
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Flux deployed our complete stack
Everything seems to be here but…
- one database is in `Pending` state
- our `ingresses` don't work well
```bash
k8s@shpod ~$ curl --header 'Host: rocky.test.enixdomain.com' http://${myIngressControllerSvcIP}
curl: (52) Empty reply from server
```
---
### Fixing the Ingress
The current `ingress-nginx` configuration leverages on specific annotations used by Scaleway to bind a _IaaS_ load-balancer to the `ingress-controller`.
We don't have such kind of things here.😕
- We could bind our `ingress-controller` to a `NodePort`.
`ingress-nginx` install manifests propose it here:
</br>https://github.com/kubernetes/ingress-nginx/tree/release-1.14/deploy/static/provider/baremetal
- In the 📄file `./clusters/METAL/ingress-nginx/sync.yaml`,
</br>change the `Kustomization` value `spec.path: ./deploy/static/provider/baremetal`
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Troubleshooting the database
One of our `db-0` pod is in `Pending` state.
```bash
k8s@shpod ~$ k get pods db-0 -n *-test -oyaml
()
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2025-06-11T11:15:42Z"
message: '0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: Burstable
```
---
### Troubleshooting the PersistentVolumeClaims
```bash
k8s@shpod ~$ k get pvc postgresql-data-db-0 -n *-test -o yaml
()
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 9s (x182 over 45m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
```
No `storage class` is available on this cluster.
We hadn't the problem on our managed cluster since a default storage class was configured and then associated to our `PersistentVolumeClaim`.
Why is there no problem with the other database?

View File

@@ -0,0 +1,417 @@
# R01- Configuring **_🎸ROCKY_** deployment with Flux
The **_⚙OPS_** team manages 2 distinct envs: **_⚗TEST_** et _**🚜PROD**_
Thanks to _Kustomize_
1. it creates a **_base_** common config
2. this common config is overwritten with a **_⚗TEST_** _tenant_-specific configuration
3. the same applies with a _**🚜PROD**_-specific configuration
> 💡 This seems complex, but no worries: Flux's CLI handles most of it.
---
## Creating the **_🎸ROCKY_**-dedicated _tenant_ in **_⚗TEST_** env
- Using the `flux` _CLI_, we create the file configuring the **_🎸ROCKY_** team's dedicated _tenant_
- … this file takes place in the `base` common configuration for both envs
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
mkdir -p ./tenants/base/rocky && \
flux create tenant rocky \
--with-namespace=rocky-test \
--cluster-role=rocky-full-access \
--export > ./tenants/base/rocky/rbac.yaml
```
]
---
class: extra-details
### 📂 ./tenants/base/rocky/rbac.yaml
Let's see our file…
3 resources are created: `Namespace`, `ServiceAccount`, and `ClusterRoleBinding`
`Flux` **impersonates** as this `ServiceAccount` when it applies any resources found in this _tenant_-dedicated source(s)
- By default, the `ServiceAccount` is bound to the `cluster-admin` `ClusterRole`
- The team maintaining the sourced `Github` repository is almighty at cluster scope
A not that much isolated _tenant_! 😕
That's why the **_⚙OPS_** team enforces specific `ClusterRoles` with restricted permissions
Let's create these permissions!
---
## _namespace_ isolation for **_🎸ROCKY_**
.lab[
- Here are the restricted permissions to use in the `rocky-test` `Namespace`
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
cp ~/container.training/k8s/M6-rocky-cluster-role.yaml ./tenants/base/rocky/
```
]
> 💡 Note that some resources are managed at cluster scope (like `PersistentVolumes`).
> We need specific permissions, then…
---
## Creating `Github` source in Flux for **_🎸ROCKY_** app repository
A specific _branch_ of the `Github` repository is monitored by the `Flux` source
.lab[
- ⚠️ you may change the **repository URL** to the one of your own clone
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ flux create source git rocky-app \
--namespace=rocky-test \
--url=https://github.com/Musk8teers/container.training-spring-music/ \
--branch=rocky --export > ./tenants/base/rocky/sync.yaml
```
]
---
## Creating `kustomization` in Flux for **_🎸ROCKY_** app repository
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ flux create kustomization rocky \
--namespace=rocky-test \
--service-account=rocky \
--source=GitRepository/rocky-app \
--path="./k8s/" --export >> ./tenants/base/rocky/sync.yaml
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
cd ./tenants/base/rocky/ && \
kustomize create --autodetect && \
cd -
```
]
---
class: extra-details
### 📂 Flux config files
Let's review our `Flux` configuration files
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
cat ./tenants/base/rocky/sync.yaml && \
cat ./tenants/base/rocky/kustomization.yaml
```
]
---
## Adding a kustomize patch for **_⚗TEST_** cluster deployment
💡 Remember the DRY strategy!
- The `Flux` tenant-dedicated configuration is looking for this file: `.tenants/test/rocky/kustomization.yaml`
- It has been configured here: `clusters/CLOUDY/tenants.yaml`
- All the files we just created are located in `.tenants/base/rocky`
- So we have to create a specific kustomization in the right location
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
mkdir -p ./tenants/test/rocky && \
cp ~/container.training/k8s/M6-rocky-test-patch.yaml ./tenants/test/rocky/ && \
cp ~/container.training/k8s/M6-rocky-test-kustomization.yaml ./tenants/test/rocky/kustomization.yaml
```
---
### Synchronizing Flux config with its Github repo
Locally, our `Flux` config repo is ready
The **_⚙OPS_** team has to push it to `Github` for `Flux` controllers to watch and catch it!
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
git add . && \
git commit -m':wrench: :construction_worker: add ROCKY tenant configuration' && \
git push
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
class: pic
![rocky config files](images/flux/R01-config-files.png)
---
class: extra-details
### Flux resources for ROCKY tenant 1/2
.lab[
```bash
k8s@shpod:~$ flux get all -A
NAMESPACE NAME REVISION SUSPENDED
READY MESSAGE
flux-system gitrepository/flux-system main@sha1:8ffd72cf False
True stored artifact for revision 'main@sha1:8ffd72cf'
rocky-test gitrepository/rocky-app rocky@sha1:ffe9f3fe False
True stored artifact for revision 'rocky@sha1:ffe9f3fe'
()
```
]
---
class: extra-details
### Flux resources for ROCKY _tenant_ 2/2
.lab[
```bash
k8s@shpod:~$ flux get all -A
()
NAMESPACE NAME REVISION SUSPENDED
READY MESSAGE
flux-system kustomization/flux-system main@sha1:8ffd72cf False
True Applied revision: main@sha1:8ffd72cf
flux-system kustomization/tenant-prod False
False kustomization path not found: stat /tmp/kustomization-1164119282/tenants/prod: no such file or directory
flux-system kustomization/tenant-test main@sha1:8ffd72cf False
True Applied revision: main@sha1:8ffd72cf
rocky-test kustomization/rocky False
False StatefulSet/db dry-run failed (Forbidden): statefulsets.apps "db" is forbidden: User "system:serviceaccount:rocky-test:rocky" cannot patch resource "statefulsets" in API group "apps" at the cluster scope
```
]
And here is our 2nd Flux error(s)! 😅
---
class: extra-details
### Flux Kustomization, mutability, …
🔍 Notice that none of the expected resources is created:
the whole kustomization is rejected, even if the `StatefulSet` is this only resource that fails!
🔍 Flux Kustomization uses the dry-run feature to templatize the resources and then applying patches onto them
Good but some resources are not completely mutable, such as `StatefulSets`
We have to fix the mutation by applying the change without having to patch the resource.
🔍 Simply add the `spec.targetNamespace: rocky-test` to the `Kustomization` named `rocky`
---
class: extra-details
## And then it's deployed 1/2
You should see the following resources in the `rocky-test` namespace
.lab[
```bash
k8s@shpod-578d64468-tp7r2 ~/$ k get pods,svc,deployments -n rocky-test
NAME READY STATUS RESTARTS AGE
pod/db-0 1/1 Running 0 47s
pod/web-6c677bf97f-c7pkv 0/1 Running 1 (22s ago) 47s
pod/web-6c677bf97f-p7b4r 0/1 Running 1 (19s ago) 47s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/db ClusterIP 10.32.6.128 <none> 5432/TCP 48s
service/web ClusterIP 10.32.2.202 <none> 80/TCP 48s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 0/2 2 0 47s
```
]
---
class: extra-details
## And then it's deployed 2/2
You should see the following resources in the `rocky-test` namespace
.lab[
```bash
k8s@shpod-578d64468-tp7r2 ~/$ k get statefulsets,pvc,pv -n rocky-test
NAME READY AGE
statefulset.apps/db 1/1 47s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
persistentvolumeclaim/postgresql-data-db-0 Bound pvc-c1963a2b-4fc9-4c74-9c5a-b0870b23e59a 1Gi RWO sbs-default <unset> 47s
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
persistentvolume/postgresql-data 1Gi RWO,RWX Retain Available <unset> 47s
persistentvolume/pvc-150fcef5-ebba-458e-951f-68a7e214c635 1G RWO Delete Bound shpod/shpod sbs-default <unset> 4h46m
persistentvolume/pvc-c1963a2b-4fc9-4c74-9c5a-b0870b23e59a 1Gi RWO Delete Bound rocky-test/postgresql-data-db-0 sbs-default <unset> 47s
```
]
---
class: extra-details
### PersistentVolumes are using a default `StorageClass`
💡 This managed cluster comes with custom `StorageClasses` leveraging on Cloud _IaaS_ capabilities (i.e. block devices)
![Flux configuration waterfall](images/flux/persistentvolumes.png)
- a default `StorageClass` is applied if none is specified (like here)
- for **_🏭PROD_** purpose, ops team might enforce a more performant `StorageClass`
- on a bare-metal cluster, **_🏭PROD_** team has to configure and provide `StorageClasses` on its own
---
class: pic
![Flux configuration waterfall](images/flux/flux-config-dependencies.png)
---
## Upgrading ROCKY app
The Git source named `rocky-app` is pointing at
- a Github repository named [Musk8teers/container.training-spring-music](https://github.com/Musk8teers/container.training-spring-music/)
- on its branch named `rocky`
This branch deploy the v1.0.0 of the _Web_ app:
`spec.template.spec.containers.image: ghcr.io/musk8teers/container.training-spring-music:1.0.0`
What happens if the **_🎸ROCKY_** team upgrades its branch to deploy `v1.0.1` of the _Web_ app?
---
## _tenant_ **_🏭PROD_**
💡 **_🏭PROD_** _tenant_ is still waiting for its `Flux` configuration, but don't bother for it right now.
---
### 🗺️ Where are we in our scenario?
<pre class="mermaid">
%%{init:
{
"theme": "default",
"gitGraph": {
"mainBranchName": "OPS",
"mainBranchOrder": 0
}
}
}%%
gitGraph
commit id:"0" tag:"start"
branch ROCKY order:3
branch MOVY order:4
branch YouRHere order:5
checkout OPS
commit id:'Flux install on CLOUDY cluster' tag:'T01'
branch TEST-env order:1
commit id:'FLUX install on TEST' tag:'T02' type: HIGHLIGHT
checkout OPS
commit id:'Flux config. for TEST tenant' tag:'T03'
commit id:'namespace isolation by RBAC'
checkout TEST-env
merge OPS id:'ROCKY tenant creation' tag:'T04'
checkout OPS
commit id:'ROCKY deploy. config.' tag:'R01'
checkout TEST-env
merge OPS id:'TEST ready to deploy ROCKY' type: HIGHLIGHT tag:'R02'
checkout ROCKY
commit id:'ROCKY' tag:'v1.0.0'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.0'
checkout YouRHere
commit id:'x'
checkout OPS
merge YouRHere id:'YOU ARE HERE'
checkout OPS
commit id:'Ingress-controller config.' tag:'T05'
checkout TEST-env
merge OPS id:'Ingress-controller install' type: HIGHLIGHT tag:'T06'
checkout OPS
commit id:'ROCKY patch for ingress config.' tag:'R03'
checkout TEST-env
merge OPS id:'ingress config. for ROCKY app'
checkout ROCKY
commit id:'blue color' tag:'v1.0.1'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.1'
checkout ROCKY
commit id:'pink color' tag:'v1.0.2'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.2'
checkout OPS
commit id:'FLUX config for MOVY deployment' tag:'M01'
checkout TEST-env
merge OPS id:'FLUX ready to deploy MOVY' type: HIGHLIGHT tag:'M02'
checkout MOVY
commit id:'MOVY' tag:'v1.0.3'
checkout TEST-env
merge MOVY tag:'MOVY v1.0.3' type: REVERSE
checkout OPS
commit id:'Network policies'
checkout TEST-env
merge OPS type: HIGHLIGHT
</pre>

View File

@@ -0,0 +1,320 @@
# M01- Configuring **_🎬MOVY_** deployment with Flux
**_🎸ROCKY_** _tenant_ is now fully usable in **_⚗TEST_** env, let's do the same for another _dev_ team: **_🎬MOVY_**
😈 We could do it by using `Flux` _CLI_,
but let's see if we can succeed by just adding manifests in our `Flux` configuration repository.
---
class: pic
![Flux configuration waterfall](images/flux/flux-config-dependencies.png)
---
## Impact study
In our `Flux` configuration repository:
- Creation of the following 📂 folders: `./tenants/[base|test]/MOVY`
- Modification of the following 📄 file: `./clusters/CLOUDY/tenants.yaml`?
- Well, we don't need to: the watched path include the whole `./tenants/[test]/*` folder
In the app repository:
- Creation of a `movy` branch to deploy another version of the app dedicated to movie soundtracks
---
### Creation of the 📂 folders
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
cp -pr tenants/base/rocky tenants/base/movy
cp -pr tenants/test/rocky tenants/test/movy
```
]
---
### Modification of tenants/[base|test]/movy/* 📄 files
- For 📄`M6-rocky-*.yaml`, change the file names…
- and update the 📄`kustomization.yaml` file as a result
- In any file, replace any `rocky` entry by `movy`
- In 📄 `sync.yaml` be aware of what repository and what branch you want `Flux` to watch for **_🎬MOVY_** app deployment.
- for this demo, let's assume we create a `movy` branch
---
class: extra-details
### What about reusing rocky-cluster-roles?
💡 In 📄`M6-movy-cluster-role.yaml` and 📄`rbac.yaml`, we could have reused the already existing `ClusterRoles`: `rocky-full-access`, and `rocky-pv-access`
A `ClusterRole` is cluster wide. It is not dedicated to a namespace.
- Its permissions are restrained to a specific namespace by being bound to a `ServiceAccount` by a `RoleBinding`.
- Whereas a `ClusterRoleBinding` extends the permissions to the whole cluster scope.
But a _tenant_ is a **_tenant_** and permissions might evolved separately for **_🎸ROCKY_** and **_🎬MOVY_**.
So [we got to keep'em separated](https://www.youtube.com/watch?v=GHUql3OC_uU).
---
### Let-su-go!
The **_⚙OPS_** team push this new tenant configuration to `Github` for `Flux` controllers to watch and catch it!
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
git add . && \
git commit -m':wrench: :construction_worker: add MOVY tenant configuration' && \
git push
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
class: extra-details
### Another Flux error?
.lab[
- It seems that our `movy` branch is not present in the app repository
```bash
k8s@shpod:~$ flux get kustomization -A
NAMESPACE NAME REVISION SUSPENDED MESSAGE
()
flux-system tenant-prod False False kustomization path not found: stat /tmp/kustomization-113582828/tenants/prod: no such file or directory
()
movy-test movy False False Source artifact not found, retrying in 30s
```
]
---
### Creating the `movy` branch
- Let's create this new `movy` branch from `rocky` branch
.lab[
- You can force immediate reconciliation by typing this command:
```bash
k8s@shpod:~$ flux reconcile source git movy-app -n movy-test
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
### New branch detected
You now have a second app responding on [http://movy.test.mybestdomain.com]
But as of now, it's just the same as the **_🎸ROCKY_** one.
We want a specific (pink-colored) version with a dataset full of movie soundtracks.
---
## New version of the **_🎬MOVY_** app
In our branch `movy`
Let's modify our `deployment.yaml` file with 2 modifications.
- in `spec.template.spec.containers.image` change the container image tag to `1.0.3`
- and… let's introduce some evil enthropy by changing this line… 😈😈😈
```yaml
value: jdbc:postgresql://db/music
```
by this one
```yaml
value: jdbc:postgresql://db.rocky-test/music
```
And push the modifications…
---
class: pic
![MOVY app has an incorrect dataset](images/flux/incorrect-dataset-in-MOVY-app.png)
---
class: pic
![ROCKY app has an incorrect dataset](images/flux/incorrect-dataset-in-ROCKY-app.png)
---
### MOVY app is connected to ROCKY database
How evil have we been! 😈
We connected the **_🎬MOVY_** app to the **_🎸ROCKY_** database.
Even if our tenants are isolated in how they manage their Kubernetes resources…
pod network is still full mesh and any connection is authorized.
> The **_⚙OPS_** team should fix this!
---
class: extra-details
## Adding NetworkPolicies to **_🎸ROCKY_** and **_🎬MOVY_** namespaces
`Network policies` may be seen as the firewall feature in the pod network.
They rules ingress and egress network connections considering a described subset of pods.
Please, refer to the [`Network policies` chapter in the High Five M4 module](./4.yml.html#toc-network-policies)
- In our case, we just add the file `~/container.training/k8s/M6-network-policies.yaml`
</br>in our `./tenants/base/movy` folder
- without forgetting to update our `kustomization.yaml` file
- and without forgetting to commit 😁
---
class: pic
![Running Mario](images/running-mario.gif)
---
### 🗺️ Where are we in our scenario?
<pre class="mermaid">
%%{init:
{
"theme": "default",
"gitGraph": {
"mainBranchName": "OPS",
"mainBranchOrder": 0
}
}
}%%
gitGraph
commit id:"0" tag:"start"
branch ROCKY order:3
branch MOVY order:4
branch YouRHere order:5
checkout OPS
commit id:'Flux install on CLOUDY cluster' tag:'T01'
branch TEST-env order:1
commit id:'FLUX install on TEST' tag:'T02' type: HIGHLIGHT
checkout OPS
commit id:'Flux config. for TEST tenant' tag:'T03'
commit id:'namespace isolation by RBAC'
checkout TEST-env
merge OPS id:'ROCKY tenant creation' tag:'T04'
checkout OPS
commit id:'ROCKY deploy. config.' tag:'R01'
checkout TEST-env
merge OPS id:'TEST ready to deploy ROCKY' type: HIGHLIGHT tag:'R02'
checkout ROCKY
commit id:'ROCKY' tag:'v1.0.0'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.0'
checkout OPS
commit id:'Ingress-controller config.' tag:'T05'
checkout TEST-env
merge OPS id:'Ingress-controller install' type: HIGHLIGHT tag:'T06'
checkout OPS
commit id:'ROCKY patch for ingress config.' tag:'R03'
checkout TEST-env
merge OPS id:'ingress config. for ROCKY app'
checkout ROCKY
commit id:'blue color' tag:'v1.0.1'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.1'
checkout ROCKY
commit id:'pink color' tag:'v1.0.2'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.2'
checkout OPS
commit id:'FLUX config for MOVY deployment' tag:'M01'
checkout TEST-env
merge OPS id:'FLUX ready to deploy MOVY' type: HIGHLIGHT tag:'M02'
checkout MOVY
commit id:'MOVY' tag:'v1.0.3'
checkout TEST-env
merge MOVY tag:'MOVY v1.0.3' type: REVERSE
checkout OPS
commit id:'Network policies'
checkout TEST-env
merge OPS type: HIGHLIGHT
checkout YouRHere
commit id:'x'
checkout OPS
merge YouRHere id:'YOU ARE HERE'
checkout OPS
commit id:'k0s install on METAL cluster' tag:'K01'
commit id:'Flux config. for METAL cluster' tag:'K02'
branch METAL_TEST-PROD order:3
commit id:'ROCKY/MOVY tenants on METAL' type: HIGHLIGHT
checkout OPS
commit id:'Flux config. for OpenEBS' tag:'K03'
checkout METAL_TEST-PROD
merge OPS id:'openEBS on METAL' type: HIGHLIGHT
checkout OPS
commit id:'Prometheus install'
checkout TEST-env
merge OPS type: HIGHLIGHT
checkout OPS
commit id:'Kyverno install'
commit id:'Kyverno rules'
checkout TEST-env
merge OPS type: HIGHLIGHT
</pre>

450
slides/flux/bootstrap.md Normal file
View File

@@ -0,0 +1,450 @@
# T02- creating **_⚗TEST_** env on our **_☁CLOUDY_** cluster
Let's take a look at our **_☁CLOUDY_** cluster!
**_☁CLOUDY_** is a Kubernetes cluster created with [Scaleway Kapsule](https://www.scaleway.com/en/kubernetes-kapsule/) managed service
This managed cluster comes preinstalled with specific features:
- Kubernetes dashboard
- specific _Storage Classes_ based on Scaleway _IaaS_ block storage offerings
- a `Cilium` _CNI_ stack already set up
---
## Accessing the managed Kubernetes cluster
To access our cluster, we'll connect via [`shpod`](https://github.com/jpetazzo/shpod)
.lab[
- If you already have a kubectl on your desktop computer
```bash
kubectl -n shpod run shpod --image=jpetazzo/shpod
kubectl -n shpod exec -it shpod -- bash
```
- or directly via ssh
```bash
ssh -p myPort k8s@mySHPODSvcIpAddress
```
]
---
## Flux installation
Once `Flux` is installed,
the **_⚙OPS_** team exclusively operates its clusters by updating a code base in a `Github` repository
_GitOps_ and `Flux` enable the **_⚙OPS_** team to rely on the _first-class citizen pattern_ in Kubernetes' world through these steps:
- describe the **desired target state**
- and let the **automated convergence** happens
---
### Checking prerequisites
The `Flux` _CLI_ is available in our `shpod` pod
Before installation, we need to check that:
- `Flux` _CLI_ is correctly installed
- it can connect to the `API server`
- our versions of `Flux` and Kubernetes are compatible
.lab[
```bash
k8s@shpod:~$ flux --version
flux version 2.5.1
k8s@shpod:~$ flux check --pre
► checking prerequisites
✔ Kubernetes 1.32.3 >=1.30.0-0
✔ prerequisites checks passed
```
]
---
### Git repository for Flux configuration
The **_⚙OPS_** team uses `Flux` _CLI_
- to create a `git` repository named `fleet-config-using-flux-XXXXX` (⚠ replace `XXXXX` by a personnal suffix)
- in our `Github` organization named `container-training-fleet`
Prerequisites are:
- `Flux` _CLI_ needs a `Github` personal access token (_PAT_)
- to create and/or access the `Github` repository
- to give permissions to existing teams in our `Github` organization
- The _PAT_ needs _CRUD_ permissions on our `Github` organization
- repositories
- As **_⚙OPS_** team, let's creates a `Github` personal access token…
---
class: pic
![Generating a Github personal access token](images/flux/github-add-token.jpg)
---
### Creating dedicated `Github` repo to host Flux config
.lab[
- let's replace the `GITHUB_TOKEN` value by our _Personal Access Token_
- and the `GITHUB_REPO` value by our specific repository name
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="container-training-fleet" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--repository=${GITHUB_REPO} \
--team=OPS \
--team=ROCKY --team=MOVY \
--path=clusters/CLOUDY
```
]
---
class: extra-details
### Creating a personnal dedicated `Github` repo
You don't need to rely onto a Github organization: any `Github` personnal repository is OK.
.lab[
- let's replace the `GITHUB_TOKEN` value by our _Personal Access Token_
- and the `GITHUB_REPO` value by our specific repository name
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="lpiot" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--personal \
--repository=${GITHUB_REPO} \
--path=clusters/CLOUDY
```
]
---
class: extra-details
Here is the result
```bash
✔ repository "https://github.com/container-training-fleet/fleet-config-using-flux-XXXXX" created
► reconciling repository permissions
✔ granted "maintain" permissions to "OPS"
✔ granted "maintain" permissions to "ROCKY"
✔ granted "maintain" permissions to "MOVY"
► reconciling repository permissions
✔ reconciled repository permissions
► cloning branch "main" from Git repository "https://github.com/container-training-fleet/fleet-config-using-flux-XXXXX.git"
✔ cloned repository
► generating component manifests
✔ generated component manifests
✔ committed component manifests to "main" ("7c97bdeb5b932040fd8d8a65fe1dc84c66664cbf")
► pushing component manifests to "https://github.com/container-training-fleet/fleet-config-using-flux-XXXXX.git"
✔ component manifests are up to date
► installing components in "flux-system" namespace
✔ installed components
✔ reconciled components
► determining if source secret "flux-system/flux-system" exists
► generating source secret
✔ public key: ecdsa-sha2-nistp384 AAAAE2VjZHNhLXNoYTItbmlzdHAzODQAAAAIbmlzdHAzODQAAABhBFqaT8B8SezU92qoE+bhnv9xONv9oIGuy7yVAznAZfyoWWEVkgP2dYDye5lMbgl6MorG/yjfkyo75ETieAE49/m9D2xvL4esnSx9zsOLdnfS9W99XSfFpC2n6soL+Exodw==
✔ configured deploy key "flux-system-main-flux-system-./clusters/CLOUDY" for "https://github.com/container-training-fleet/fleet-config-using-flux-XXXXX"
► applying source secret "flux-system/flux-system"
✔ reconciled source secret
► generating sync manifests
✔ generated sync manifests
✔ committed sync manifests to "main" ("11035e19cabd9fd2c7c94f6e93707f22d69a5ff2")
► pushing sync manifests to "https://github.com/container-training-fleet/fleet-config-using-flux-XXXXX.git"
► applying sync manifests
✔ reconciled sync configuration
◎ waiting for GitRepository "flux-system/flux-system" to be reconciled
✔ GitRepository reconciled successfully
◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✔ Kustomization reconciled successfully
► confirming components are healthy
✔ helm-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ all components are healthy
```
---
### Flux configures Github repository access for teams
- `Flux` sets up permissions that allow teams within our organization to **access** the `Github` repository as maintainers
- Teams need to exist before `Flux` proceeds to this configuration
![Teams in Github](images/flux/github-teams.png)
---
### ⚠️ Disclaimer
- In this lab, adding these teams as maintainers was merely a demonstration of how `Flux` _CLI_ sets up permissions in Github
- But there is no need for dev teams to have access to this `Github` repository
- One advantage of _GitOps_ lies in its ability to easily set up 💪🏼 **Separation of concerns** by using multiple `Flux` sources
---
### The PAT is not needed anymore!
- During the install process, `Flux` creates an `ssh` key pair so that it is able to contribute to the `Github` repository.
```bash
► generating source secret
✔ public key: ecdsa-sha2-nistp384 AAAAE2VjZHNhLXNoYTItbmlzdHAzODQAAAAIbmlzdHAzODQAAABhBFqaT8B8SezU92qoE+bhnv9xONv9oIGuy7yVAznAZfyoWWEVkgP2dYDye5lMbgl6MorG/yjfkyo75ETieAE49/m9D2xvL4esnSx9zsOLdnfS9W99XSfFpC2n6soL+Exodw==
✔ configured deploy key "flux-system-main-flux-system-./clusters/CLOUDY" for "https://github.com/container-training-fleet/fleet-config-using-flux-XXXXX"
► applying source secret "flux-system/flux-system"
✔ reconciled source secret
```
- You can now delete the formerly created _Personal Access Token_: `Flux` won't use it anymore.
---
### 📂 Flux config files
`Flux` has been successfully installed onto our **_☁CLOUDY_** Kubernetes cluster!
Its configuration is managed through a _Gitops_ workflow sourced directly from our `Github` repository
Let's review our `Flux` configuration files we've created and pushed into the `Github` repository…
… as well as the corresponding components running in our Kubernetes cluster
![Flux config files](images/flux/flux-config-files.png)
---
class: pic
<!-- FIXME: wrong schema -->
![Flux architecture](images/flux/flux-controllers.png)
---
class: extra-details
### Flux resources 1/2
.lab[
```bash
k8s@shpod:~$ kubectl get all --namespace flux-system
NAME READY STATUS RESTARTS AGE
pod/helm-controller-b6767d66-h6qhk 1/1 Running 0 5m
pod/kustomize-controller-57c7ff5596-94rnd 1/1 Running 0 5m
pod/notification-controller-58ffd586f7-zxfvk 1/1 Running 0 5m
pod/source-controller-6ff87cb475-g6gn6 1/1 Running 0 5m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/notification-controller ClusterIP 10.104.139.156 <none> 80/TCP 5m1s
service/source-controller ClusterIP 10.106.120.137 <none> 80/TCP 5m
service/webhook-receiver ClusterIP 10.96.28.236 <none> 80/TCP 5m
()
```
]
---
class: extra-details
### Flux resources 2/2
.lab[
```bash
k8s@shpod:~$ kubectl get all --namespace flux-system
()
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/helm-controller 1/1 1 1 5m
deployment.apps/kustomize-controller 1/1 1 1 5m
deployment.apps/notification-controller 1/1 1 1 5m
deployment.apps/source-controller 1/1 1 1 5m
NAME DESIRED CURRENT READY AGE
replicaset.apps/helm-controller-b6767d66 1 1 1 5m
replicaset.apps/kustomize-controller-57c7ff5596 1 1 1 5m
replicaset.apps/notification-controller-58ffd586f7 1 1 1 5m
replicaset.apps/source-controller-6ff87cb475 1 1 1 5m
```
]
---
### Flux components
- the `source controller` monitors `Git` repositories to apply Kubernetes resources on the cluster
- the `Helm controller` checks for new `Helm` _charts_ releases in `Helm` repositories and installs updates as needed
- _CRDs_ store `Flux` configuration within the Kubernetes control plane
---
class: extra-details
### Flux resources that have been created
.lab[
```bash
k8s@shpod:~$ flux get all --all-namespaces
NAMESPACE NAME REVISION SUSPENDED
READY MESSAGE
flux-system gitrepository/flux-system main@sha1:d48291a8 False
True stored artifact for revision 'main@sha1:d48291a8'
NAMESPACE NAME REVISION SUSPENDED
READY MESSAGE
flux-system kustomization/flux-system main@sha1:d48291a8 False
True Applied revision: main@sha1:d48291a8
```
]
---
### Flux CLI
`Flux` Command-Line Interface fulfills 3 primary functions:
1. It installs and configures first mandatory `Flux` resources in a _Gitops_ `git` repository
- ensuring proper access and permissions
2. It locally generates `YAML` files for desired `Flux` resources so that we just need to `git push` them
- _tenants_
- sources
-
3. It requests the API server to manage `Flux`-related resources
- _operators_
- _CRDs_
- logs
---
class: extra-details
### Flux -- for more info
Please, refer to the [`Flux` chapter in the High Five M3 module](./3.yml.html#toc-helm-chart-format)
---
### Flux relies on Kustomize
The `Flux` component named `kustomize controller` look for `Kustomize` resources in `Flux` code-based sources
1. `Kustomize` look for `YAML` manifests listed in the `kustomization.yaml` file
2. and aggregates, hydrates and patches them following the `kustomization` configuration
---
class: extra-details
### 2 different kustomization resources
⚠️ `Flux` uses 2 distinct resources with `kind: kustomization`
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: kustomization
```
describes how Kustomize (the _CLI_ tool) appends and transforms `YAML` manifests into a single bunch of `YAML` described resources
```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1 group
kind: Kustomization
```
describes where `Flux kustomize-controller` looks for a `kustomization.yaml` file in a given `Flux` code-based source
---
class: extra-details
### Kustomize -- for more info
Please, refer to the [`Kustomize` chapter in the High Five M3 module](./3.yml.html#toc-kustomize)
---
class: extra-details
### Group / Version / Kind -- for more info
For more info about how Kubernetes resource natures are identified by their `Group / Version / Kind` triplet…
… please, refer to the [`Kubernetes API` chapter in the High Five M5 module](./5.yml.html#toc-the-kubernetes-api)
---
### 🗺️ Where are we in our scenario?
<pre class="mermaid">
%%{init:
{
"theme": "default",
"gitGraph": {
"mainBranchName": "OPS",
"mainBranchOrder": 0
}
}
}%%
gitGraph
commit id:"0" tag:"start"
branch ROCKY order:3
branch MOVY order:4
branch YouRHere order:5
checkout OPS
commit id:'Flux install on CLOUDY cluster' tag:'T01'
branch TEST-env order:1
commit id:'FLUX install on TEST' tag:'T02' type: HIGHLIGHT
checkout YouRHere
commit id:'x'
checkout OPS
merge YouRHere id:'YOU ARE HERE'
checkout OPS
commit id:'Flux config. for TEST tenant' tag:'T03'
commit id:'namespace isolation by RBAC'
checkout TEST-env
merge OPS id:'ROCKY tenant creation' tag:'T04'
checkout OPS
commit id:'ROCKY deploy. config.' tag:'R01'
checkout TEST-env
merge OPS id:'TEST ready to deploy ROCKY' type: HIGHLIGHT tag:'R02'
checkout ROCKY
commit id:'ROCKY' tag:'v1.0.0'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.0'
</pre>

284
slides/flux/ingress.md Normal file
View File

@@ -0,0 +1,284 @@
# T05- Configuring ingress for **_🎸ROCKY_** app
🍾 **_🎸ROCKY_** team has just deployed its `v1.0.0`
We would like to reach it from our workstations
The regular way to do it in Kubernetes is to configure an `Ingress` resource.
- `Ingress` is an abstract resource that manages how services are exposed outside of the Kubernetes cluster (Layer 7).
- It relies on `ingress-controller`(s) that are technical solutions to handle all the rules related to ingress.
- Available features vary, depending on the `ingress-controller`: load-balancing, networking, firewalling, API management, throttling, TLS encryption, etc.
- `ingress-controller` may provision Cloud _IaaS_ network resources such as load-balancer, persistent IPs, etc.
---
class: extra-details
## Ingress -- for more info
Please, refer to the [`Ingress` chapter in the High Five M2 module](./2.yml.html#toc-exposing-http-services-with-ingress-resources)
---
## Installing `ingress-nginx` as our `ingress-controller`
We'll use `ingress-nginx` (relying on `NGinX`), quite a popular choice.
- It is able to provision IaaS load-balancer in ScaleWay Cloud services
- As a reverse-proxy, it is able to balance HTTP connections on an on-premises cluster
The **_⚙OPS_** Team add this new install to its `Flux` config. repo
---
### Creating a `Github` source in Flux for `ingress-nginx`
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
mkdir -p ./clusters/CLOUDY/ingress-nginx && \
flux create source git ingress-nginx \
--namespace=ingress-nginx \
--url=https://github.com/kubernetes/ingress-nginx/ \
--branch=release-1.12 \
--export > ./clusters/CLOUDY/ingress-nginx/sync.yaml
```
]
---
### Creating `kustomization` in Flux for `ingress-nginx`
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ flux create kustomization ingress-nginx \
--namespace=ingress-nginx \
--source=GitRepository/ingress-nginx \
--path="./deploy/static/provider/scw/" \
--export >> ./clusters/CLOUDY/ingress-nginx/sync.yaml
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
cp -p ~/container.training/k8s/M6-ingress-nginx-kustomization.yaml \
./clusters/CLOUDY/ingress-nginx/kustomization.yaml && \
cp -p ~/container.training/k8s/M6-ingress-nginx-components.yaml \
~/container.training/k8s/M6-ingress-nginx-*-patch.yaml \
./clusters/CLOUDY/ingress-nginx/
```
]
---
### Applying the new config
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
git add ./clusters/CLOUDY/ingress-nginx && \
git commit -m':wrench: :rocket: add Ingress-controller' && \
git push
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
class: pic
![Ingress-nginx provisionned a IaaS load-balancer in Scaleway Cloud services](images/flux/ingress-nginx-scaleway-lb.png)
---
class: extra-details
### Using external Git source
💡 Note that you can directly use pubilc `Github` repository (not maintained by your company).
- If you have to alter the configuration, `Kustomize` patching capabilities might help.
- Depending on the _gitflow_ this repository uses, updates will be deployed automatically to your cluster (here we're using a `release` branch).
- This repo exposes a `kustomization.yaml`. Well done!
---
## Adding the `ingress` resource to ROCKY app
.lab[
- Add the new manifest to our kustomization bunch
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
cp -pr ~/container.training/k8s/M6-rocky-ingress.yaml ./tenants/base/rocky && \
echo '- M6-rocky-ingress.yaml' >> ./tenants/base/rocky/kustomization.yaml
```
- Commit and its done
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
git add . && \
git commit -m':wrench: :rocket: add Ingress' && \
git push
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Here is the result
After Flux reconciled the whole bunch of sources and kustomizations, you should see
- `Ingress-NGinX` controller components in `ingress-nginx` namespace
- A new `Ingress` in `rocky-test` namespace
.lab[
```bash
k8s@shpod:~$ kubectl get all -n ingress-nginx && \
kubectl get ingress -n rocky-test
k8s@shpod:~$ \
PublicIP=$(kubectl get ingress rocky -n rocky-test \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}')
k8s@shpod:~$ \
curl --header 'rocky.test.mybestdomain.com' http://$PublicIP/
```
]
---
class: pic
![Rocky application screenshot](images/flux/rocky-app-screenshot.png)
---
## Upgrading **_🎸ROCKY_** app
**_🎸ROCKY_** team is now fully able to upgrade and deploy its app autonomously.
Just give it a try!
- In the `deployment.yaml` file
- in the app repo ([https://github.com/Musk8teers/container.training-spring-music/])
- you can change the `spec.template.spec.containers.image` to `1.0.1` and then to `1.0.2`
Dont' forget which branch is watched by `Flux` Git source named `rocky`
Don't forget to commit!
---
## Few considerations
- The **_⚙OPS_** team has to decide how to manage name resolution for public IPs
- Scaleway propose to expose a wildcard domain for its Kubernetes clusters
- Here, we chose that `Ingress-controller` (that makes sense) but `Ingress` as well were managed by the **_⚙OPS_** team.
- It might have been done in many different ways!
---
### 🗺️ Where are we in our scenario?
<pre class="mermaid">
%%{init:
{
"theme": "default",
"gitGraph": {
"mainBranchName": "OPS",
"mainBranchOrder": 0
}
}
}%%
gitGraph
commit id:"0" tag:"start"
branch ROCKY order:3
branch MOVY order:4
branch YouRHere order:5
checkout OPS
commit id:'Flux install on CLOUDY cluster' tag:'T01'
branch TEST-env order:1
commit id:'FLUX install on TEST' tag:'T02' type: HIGHLIGHT
checkout OPS
commit id:'Flux config. for TEST tenant' tag:'T03'
commit id:'namespace isolation by RBAC'
checkout TEST-env
merge OPS id:'ROCKY tenant creation' tag:'T04'
checkout OPS
commit id:'ROCKY deploy. config.' tag:'R01'
checkout TEST-env
merge OPS id:'TEST ready to deploy ROCKY' type: HIGHLIGHT tag:'R02'
checkout ROCKY
commit id:'ROCKY' tag:'v1.0.0'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.0'
checkout OPS
commit id:'Ingress-controller config.' tag:'T05'
checkout TEST-env
merge OPS id:'Ingress-controller install' type: HIGHLIGHT tag:'T06'
checkout OPS
commit id:'ROCKY patch for ingress config.' tag:'R03'
checkout TEST-env
merge OPS id:'ingress config. for ROCKY app'
checkout ROCKY
commit id:'blue color' tag:'v1.0.1'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.1'
checkout ROCKY
commit id:'pink color' tag:'v1.0.2'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.2'
checkout YouRHere
commit id:'x'
checkout OPS
merge YouRHere id:'YOU ARE HERE'
checkout OPS
commit id:'FLUX config for MOVY deployment' tag:'M01'
checkout TEST-env
merge OPS id:'FLUX ready to deploy MOVY' type: HIGHLIGHT tag:'M02'
checkout MOVY
commit id:'MOVY' tag:'v1.0.3'
checkout TEST-env
merge MOVY tag:'MOVY v1.0.3' type: REVERSE
checkout OPS
commit id:'Network policies'
checkout TEST-env
merge OPS type: HIGHLIGHT
</pre>

241
slides/flux/kyverno.md Normal file
View File

@@ -0,0 +1,241 @@
## introducing Kyverno
Kyverno is a tool to extend Kubernetes permission management to express complex policies…
</br>… and override manifests delivered by client teams.
---
class: extra-details
### Kyverno -- for more info
Please, refer to the [`Setting up Kubernetes` chapter in the High Five M4 module](./4.yml.html#toc-policy-management-with-kyverno) for more infos about `Kyverno`.
---
## Creating an `Helm` source in Flux for Kyverno Helm chart
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
mkdir -p clusters/CLOUDY/kyverno && \
cp -pr ~/container.training/k8s/
k8s@shpod ~$ flux create source helm kyverno \
--namespace=kyverno \
--url=https://kyverno.github.io/kyverno/ \
--interval=3m \
--export > ./clusters/CLOUDY/kyverno/sync2.yaml
```
]
---
## Creating the `HelmRelease` in Flux
.lab[
```bash
k8s@shpod ~$ flux create helmrelease kyverno \
--namespace=kyverno \
--source=HelmRepository/kyverno.flux-system \
--target-namespace=kyverno \
--create-target-namespace=true \
--chart-version=">=3.4.2" \
--chart=kyverno \
--export >> ./clusters/CLOUDY/kyverno/sync.yaml
```
]
---
## Add Kyverno policy
This polivy is just an example.
It enforces the use of a `Service Account` in `Flux` configurations
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
mkdir -p clusters/CLOUDY/kyverno-policies && \
cp -pr ~/container.training/k8s/M6-kyverno-enforce-service-account.yaml \
./clusters/CLOUDY/kyverno-policies/
---
### Creating `kustomization` in Flux for Kyverno policies
.lab[
```bash
k8s@shpod:~/fleet-config-using-flux-XXXXX$ \
flux create kustomization kyverno-policies \
--namespace=kyverno \
--source=GitRepository/flux-system \
--path="./clusters/CLOUDY/kyverno-policies/" \
--prune true --interval 5m \
--depends-on kyverno \
--export >> ./clusters/CLOUDY/kyverno-policies/sync.yaml
```
]
## Apply Kyverno policy
```bash
flux create kustomization
--path
--source GitRepository/
--export > ./clusters/CLOUDY/kyverno-policies/sync.yaml
```
---
## Add Kyverno dependency for **_⚗TEST_** cluster
- Now that we've got `Kyverno` policies,
- ops team will enforce any upgrade from any kustomization in our dev team tenants
- to wait for the `kyverno` policies to be reconciled (in a `Flux` perspective)
- upgrade file `./clusters/CLOUDY/tenants.yaml`,
- by adding this property: `spec.dependsOn.{name: kyverno-policies}`
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Debugging
`Kyverno-policies` `Kustomization` failed because `spec.dependsOn` property can only target a resource from the same `Kind`.
- Let's suppress the `spec.dependsOn` property.
Now `Kustomizations` for **_🎸ROCKY_** and **_🎬MOVY_** tenants failed because of our policies.
---
### 🗺️ Where are we in our scenario?
<pre class="mermaid">
%%{init:
{
"theme": "default",
"gitGraph": {
"mainBranchName": "OPS",
"mainBranchOrder": 0
}
}
}%%
gitGraph
commit id:"0" tag:"start"
branch ROCKY order:4
branch MOVY order:5
branch YouRHere order:6
checkout OPS
commit id:'Flux install on CLOUDY cluster' tag:'T01'
branch TEST-env order:1
commit id:'FLUX install on TEST' tag:'T02' type: HIGHLIGHT
checkout OPS
commit id:'Flux config. for TEST tenant' tag:'T03'
commit id:'namespace isolation by RBAC'
checkout TEST-env
merge OPS id:'ROCKY tenant creation' tag:'T04'
checkout OPS
commit id:'ROCKY deploy. config.' tag:'R01'
checkout TEST-env
merge OPS id:'TEST ready to deploy ROCKY' type: HIGHLIGHT tag:'R02'
checkout ROCKY
commit id:'ROCKY' tag:'v1.0.0'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.0'
checkout OPS
commit id:'Ingress-controller config.' tag:'T05'
checkout TEST-env
merge OPS id:'Ingress-controller install' type: HIGHLIGHT tag:'T06'
checkout OPS
commit id:'ROCKY patch for ingress config.' tag:'R03'
checkout TEST-env
merge OPS id:'ingress config. for ROCKY app'
checkout ROCKY
commit id:'blue color' tag:'v1.0.1'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.1'
checkout ROCKY
commit id:'pink color' tag:'v1.0.2'
checkout TEST-env
merge ROCKY tag:'ROCKY v1.0.2'
checkout OPS
commit id:'FLUX config for MOVY deployment' tag:'M01'
checkout TEST-env
merge OPS id:'FLUX ready to deploy MOVY' type: HIGHLIGHT tag:'M02'
checkout MOVY
commit id:'MOVY' tag:'v1.0.3'
checkout TEST-env
merge MOVY tag:'MOVY v1.0.3' type: REVERSE
checkout OPS
commit id:'Network policies'
checkout TEST-env
merge OPS type: HIGHLIGHT tag:'T07'
checkout OPS
commit id:'k0s install on METAL cluster' tag:'K01'
commit id:'Flux config. for METAL cluster' tag:'K02'
branch METAL_TEST-PROD order:3
commit id:'ROCKY/MOVY tenants on METAL' type: HIGHLIGHT
checkout OPS
commit id:'Flux config. for OpenEBS' tag:'K03'
checkout METAL_TEST-PROD
merge OPS id:'openEBS on METAL' type: HIGHLIGHT
checkout OPS
commit id:'Prometheus install'
checkout TEST-env
merge OPS type: HIGHLIGHT
checkout OPS
commit id:'Kyverno install'
commit id:'Kyverno rules'
checkout TEST-env
merge OPS type: HIGHLIGHT
checkout YouRHere
commit id:'x'
checkout OPS
merge YouRHere id:'YOU ARE HERE'
checkout OPS
commit id:'Flux config. for PROD tenant' tag:'P01'
branch PROD-env order:2
commit id:'ROCKY tenant on PROD'
checkout OPS
commit id:'ROCKY patch for PROD' tag:'R04'
checkout PROD-env
merge OPS id:'PROD ready to deploy ROCKY' type: HIGHLIGHT
checkout PROD-env
merge ROCKY tag:'ROCKY v1.0.2'
checkout MOVY
commit id:'MOVY HELM chart' tag:'M03'
checkout TEST-env
merge MOVY tag:'MOVY v1.0'
</pre>

Some files were not shown because too many files have changed in this diff Show More