Compare commits


54 Commits

Author SHA1 Message Date
Jérôme Petazzoni
01c374d0a4 Merge pull request #664 from lpiot/main
The missing slides…😅
2025-06-13 08:48:44 +02:00
Ludovic Piot
eee44979c5 📝 Add Kyverno install chapter 2025-06-12 22:13:19 +02:00
Ludovic Piot
4d3bc06e30 📝 Add Kyverno install chapter 2025-06-12 21:50:42 +02:00
Ludovic Piot
229ab045b3 🔥 2025-06-12 21:04:06 +02:00
Ludovic Piot
fe1a61eaeb 🎨 2025-06-12 21:03:49 +02:00
Ludovic Piot
9613589dea 📝 Add small section about SSH keypairs rotation for Flux 2025-06-12 20:23:59 +02:00
Ludovic Piot
ca8865a10b 📝 Change the mermaid scenario diagram 2025-06-12 20:07:11 +02:00
Ludovic Piot
f279bbea11 ✏️ 2025-06-12 20:06:27 +02:00
Ludovic Piot
bc6100301e 📝 Add monitoring stack install 2025-06-12 20:05:14 +02:00
Jérôme Petazzoni
a32751636a Merge pull request #663 from lpiot/main
The deck with a small fix
2025-06-11 20:33:27 +02:00
Ludovic Piot
4a0e23d131 🐛 Sorry Jerome 2025-06-11 19:59:52 +02:00
Ludovic Piot
6e987d1fca Merge branch 'm6' into main 2025-06-11 19:52:03 +02:00
Ludovic Piot
18b888009e 📝 Add an MVP Network policies section 2025-06-11 19:44:17 +02:00
Ludovic Piot
36dd8bb695 📝 Add the new chapters to the M6 stack 2025-06-11 19:33:35 +02:00
Ludovic Piot
395c5a38ab 🎨 Add reference to the chapter title 2025-06-11 19:24:57 +02:00
Ludovic Piot
2b0d3b87ac 📝 Add OpenEBS install chapter 2025-06-11 19:24:13 +02:00
Ludovic Piot
a165e60407 📝 Add k0s install chapter 2025-06-11 19:22:40 +02:00
Ludovic Piot
3c13fd51dd 🎨 Add Mario animation when Flux reconcile 2025-06-11 19:22:04 +02:00
Ludovic Piot
324ad2fdd0 🎨 Update mermaid scenario diagram 2025-06-11 19:21:13 +02:00
Ludovic Piot
269ae79e30 📝 Add k0s install chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
39a15b3d7d ✏️ Clean up consistency about how we evoke the OPS team 2025-06-11 17:08:52 +02:00
Ludovic Piot
9e7ed8cb49 📝 Add MOVY tenant creation chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
06e7a47659 📝 Upgrade the mermaid scenario 2025-06-11 17:08:52 +02:00
Ludovic Piot
802e525f57 📝 Add Ingress chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
0f68f89840 📝 Add Ingress chapter 2025-06-11 17:08:52 +02:00
Ludovic Piot
b275342bd2 ✏️ Fixing TEST emphasis 2025-06-11 17:08:52 +02:00
Ludovic Piot
e11e97ccff 📝 Add k0s install chapter 2025-06-11 15:10:43 +02:00
Ludovic Piot
023a9d0346 ✏️ Clean up consistency about how we evoke the OPS team 2025-06-10 19:20:25 +02:00
Ludovic Piot
3f5eaae6b9 📝 Add MOVY tenant creation chapter 2025-06-10 19:19:19 +02:00
Ludovic Piot
1634d5b5bc 📝 Upgrade the mermaid scenario 2025-06-10 17:15:38 +02:00
Ludovic Piot
40418be55a 📝 Add Ingress chapter 2025-06-10 16:19:06 +02:00
Ludovic Piot
04198b7f91 📝 Add Ingress chapter 2025-06-10 16:05:17 +02:00
Jérôme Petazzoni
150c8fc768 Merge pull request #660 from lpiot/main
Mostly the scenario upgrade with Mermaid schemas
2025-06-10 14:24:18 +02:00
Ludovic Piot
e2af1bb057 ✏️ Fixing TEST emphasis 2025-06-10 12:51:09 +02:00
Ludovic Piot
d4c260aa4a 💄 📝 🎨 Upgrade the mermaid scenario schema 2025-06-09 21:20:57 +02:00
Ludovic Piot
89cd677b09 📝 upgrade R01 chapter 2025-06-09 21:20:57 +02:00
Ludovic Piot
3008680c12 🛂 🐛 fix permissions for persistentVolumes management 2025-06-09 21:20:57 +02:00
Ludovic Piot
f7b8184617 🎨 2025-06-09 21:20:57 +02:00
Jérôme Petazzoni
a565c0979c Merge pull request #659 from lpiot/main
Add R01 chapter and fixes to previous chapters
2025-06-09 20:05:55 +02:00
Jérôme Petazzoni
7a11f03b5e Merge branch 'm6' into main 2025-06-09 20:05:26 +02:00
Ludovic Piot
b0760b99a5 ✏️ 📝 Fix shpod access methods 2025-06-09 17:11:57 +02:00
Ludovic Piot
bcb9c3003f 📝 Add R01 chapter about test-ROCKY tenant config 2025-06-09 17:10:35 +02:00
Ludovic Piot
99ce9b3a8a 🎨 📝 Add missing steps in demo 2025-06-09 16:09:45 +02:00
Ludovic Piot
0ba602b533 🎨 clean up code display 2025-06-09 16:08:58 +02:00
Jérôme Petazzoni
d43c41e11e Proof-read first half of M6-START 2025-06-09 14:46:13 +02:00
Ludovic Piot
331309dc63 🎨 cleanup display of some console results 2025-06-09 14:11:05 +02:00
Ludovic Piot
44146915e0 📝 🍱 add T03 chapter 2025-06-04 23:55:33 +02:00
Ludovic Piot
84996e739b 🍱 📝 rewording and updating pics 2025-06-04 23:54:51 +02:00
Ludovic Piot
2aea1f70b2 📝 Add Flux install 2025-05-29 18:00:18 +02:00
Ludovic Piot
985e2ae42c 📝 add M6 intro slidedeck 2025-05-29 12:25:57 +02:00
Ludovic Piot
ea58428a0c 🐛 Slides now generate! ♻️ Move a slide 2025-05-14 22:05:59 +02:00
Ludovic Piot
59e60786c0 🎨 make personnae and cluster names consistent 2025-05-14 21:49:09 +02:00
Ludovic Piot
af63cf1405 🚨 2025-05-14 21:25:59 +02:00
Ludovic Piot
f9041807f6 🎉 first M6 draft slidedeck 2025-05-14 20:52:32 +02:00
159 changed files with 4246 additions and 6160 deletions

View File

@@ -9,7 +9,7 @@
"forwardPorts": [],
//"postCreateCommand": "... install extra packages...",
"postStartCommand": "dind.sh ; kind.sh",
"postStartCommand": "dind.sh",
// This lets us use "docker-outside-docker".
// Unfortunately, minikube, kind, etc. don't work very well that way;

.gitignore vendored
View File

@@ -17,7 +17,6 @@ slides/autopilot/state.yaml
slides/index.html
slides/past.html
slides/slides.zip
slides/_academy_*
node_modules
### macOS ###

View File

@@ -1,24 +1,26 @@
services:
version: "2"
services:
rng:
build: rng
ports:
- "8001:80"
- "8001:80"
hasher:
build: hasher
ports:
- "8002:80"
- "8002:80"
webui:
build: webui
ports:
- "8000:80"
- "8000:80"
volumes:
- "./webui/files/:/files/"
- "./webui/files/:/files/"
redis:
image: redis
worker:
build: worker
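
To exercise the updated Compose file (a quick check, assuming it sits at the root of the dockercoins directory with the rng, hasher, webui, and worker build contexts next to it):

  docker compose config --quiet      # validate the file
  docker compose up -d --build
  curl localhost:8000/json           # the webui publishes its stats on port 8000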

View File

@@ -1,8 +1,7 @@
FROM ruby:alpine
WORKDIR /app
RUN apk add --update build-base curl
RUN gem install sinatra --version '~> 3'
RUN gem install thin
COPY hasher.rb .
CMD ["ruby", "hasher.rb", "-o", "::"]
ADD hasher.rb /
CMD ["ruby", "hasher.rb"]
EXPOSE 80

View File

@@ -2,6 +2,7 @@ require 'digest'
require 'sinatra'
require 'socket'
set :bind, '0.0.0.0'
set :port, 80
post '/' do

View File

@@ -1,7 +1,5 @@
FROM python:alpine
WORKDIR /app
RUN pip install Flask
COPY rng.py .
ENV FLASK_APP=rng FLASK_RUN_HOST=:: FLASK_RUN_PORT=80
CMD ["flask", "run"]
COPY rng.py /
CMD ["python", "rng.py"]
EXPOSE 80

View File

@@ -28,5 +28,5 @@ def rng(how_many_bytes):
if __name__ == "__main__":
app.run(port=80)
app.run(host="0.0.0.0", port=80, threaded=False)

View File

@@ -1,8 +1,7 @@
FROM node:23-alpine
WORKDIR /app
FROM node:4-slim
RUN npm install express
RUN npm install morgan
RUN npm install redis@5
COPY . .
RUN npm install redis@3
COPY files/ /files/
COPY webui.js /
CMD ["node", "webui.js"]
EXPOSE 80

View File

@@ -1,34 +1,26 @@
import express from 'express';
import morgan from 'morgan';
import { createClient } from 'redis';
var client = await createClient({
url: "redis://redis",
socket: {
family: 0
}
})
.on("error", function (err) {
console.error("Redis error", err);
})
.connect();
var express = require('express');
var app = express();
var redis = require('redis');
app.use(morgan('common'));
var client = redis.createClient(6379, 'redis');
client.on("error", function (err) {
console.error("Redis error", err);
});
app.get('/', function (req, res) {
res.redirect('/index.html');
});
app.get('/json', async(req, res) => {
var coins = await client.hLen('wallet');
var hashes = await client.get('hashes');
var now = Date.now() / 1000;
res.json({
coins: coins,
hashes: hashes,
now: now
app.get('/json', function (req, res) {
client.hlen('wallet', function (err, coins) {
client.get('hashes', function (err, hashes) {
var now = Date.now() / 1000;
res.json( {
coins: coins,
hashes: hashes,
now: now
});
});
});
});
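
The rewritten webui.js reads the wallet hash and the hashes key from Redis; assuming the Compose stack above is running, a quick way to check that the worker is actually populating them:

  docker compose exec redis redis-cli hlen wallet
  docker compose exec redis redis-cli get hashes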

View File

@@ -1,6 +1,5 @@
FROM python:alpine
WORKDIR /app
RUN pip install redis
RUN pip install requests
COPY worker.py .
COPY worker.py /
CMD ["python", "worker.py"]

View File

@@ -1,33 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: blue
name: blue
spec:
replicas: 1
selector:
matchLabels:
app: blue
template:
metadata:
labels:
app: blue
spec:
containers:
- image: jpetazzo/color
name: color
---
apiVersion: v1
kind: Service
metadata:
labels:
app: blue
name: blue
spec:
ports:
- name: "80"
port: 80
selector:
app: blue

View File

@@ -1,12 +0,0 @@
# This removes the haproxy Deployment.
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
patches:
- patch: |-
$patch: delete
kind: Deployment
apiVersion: apps/v1
metadata:
name: haproxy

View File

@@ -1,14 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
# Within a Kustomization, it is not possible to specify in which
# order transformations (patches, replacements, etc) should be
# executed. If we want to execute transformations in a specific
# order, one possibility is to put them in individual components,
# and then invoke these components in the order we want.
# It works, but it creates an extra level of indirection, which
# reduces readability and complicates maintenance.
components:
- setup
- cleanup

View File

@@ -1,20 +0,0 @@
global
#log stdout format raw local0
#daemon
maxconn 32
defaults
#log global
timeout client 1h
timeout connect 1h
timeout server 1h
mode http
option abortonclose
frontend metrics
bind :9000
http-request use-service prometheus-exporter
frontend ollama_frontend
bind :8000
default_backend ollama_backend
maxconn 16
backend ollama_backend
server ollama_server localhost:11434 check

View File

@@ -1,39 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: haproxy
name: haproxy
spec:
selector:
matchLabels:
app: haproxy
template:
metadata:
labels:
app: haproxy
spec:
volumes:
- name: haproxy
configMap:
name: haproxy
containers:
- image: haproxy:3.0
name: haproxy
volumeMounts:
- name: haproxy
mountPath: /usr/local/etc/haproxy
readinessProbe:
httpGet:
port: 9000
ports:
- name: haproxy
containerPort: 8000
- name: metrics
containerPort: 9000
resources:
requests:
cpu: 0.05
limits:
cpu: 1

View File

@@ -1,75 +0,0 @@
# This adds a sidecar to the ollama Deployment, by taking
# the pod template and volumes from the haproxy Deployment.
# The idea is to allow running ollama+haproxy in two modes:
# - separately (each with their own Deployment),
# - together in the same Pod, sidecar-style.
# The YAML files define how to run them separately, and this
# "replacements" directive fetches a specific volume and
# a specific container from the haproxy Deployment, to add
# them to the ollama Deployment.
#
# This would be simpler if kustomize allowed to append or
# merge lists in "replacements"; but it doesn't seem to be
# possible at the moment.
#
# It would be even better if kustomize allowed to perform
# a strategic merge using a fieldPath as the source, because
# we could merge both the containers and the volumes in a
# single operation.
#
# Note that technically, it might be possible to layer
# multiple kustomizations so that one generates the patch
# to be used in another; but it wouldn't be very readable
# or maintainable so we decided to not do that right now.
#
# However, the current approach (fetching fields one by one)
# has an advantage: it could let us transform the haproxy
# container into a real sidecar (i.e. an initContainer with
# a restartPolicy=Always).
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
resources:
- haproxy.yaml
configMapGenerator:
- name: haproxy
files:
- haproxy.cfg
replacements:
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.volumes.[name=haproxy]
targets:
- select:
kind: Deployment
name: ollama
fieldPaths:
- spec.template.spec.volumes.[name=haproxy]
options:
create: true
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.containers.[name=haproxy]
targets:
- select:
kind: Deployment
name: ollama
fieldPaths:
- spec.template.spec.containers.[name=haproxy]
options:
create: true
- source:
kind: Deployment
name: haproxy
fieldPath: spec.template.spec.containers.[name=haproxy].ports.[name=haproxy].containerPort
targets:
- select:
kind: Service
name: ollama
fieldPaths:
- spec.ports.[name=11434].targetPort

View File

@@ -1,34 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: blue
name: blue
spec:
replicas: 2
selector:
matchLabels:
app: blue
template:
metadata:
labels:
app: blue
spec:
containers:
- image: jpetazzo/color
name: color
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
labels:
app: blue
name: blue
spec:
ports:
- port: 80
selector:
app: blue

View File

@@ -1,94 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Each of these YAML files contains a Deployment and a Service.
# The blue.yaml file is here just to demonstrate that the rest
# of this Kustomization can be precisely scoped to the ollama
# Deployment (and Service): the blue Deployment and Service
# shouldn't be affected by our kustomize transformers.
resources:
- ollama.yaml
- blue.yaml
buildMetadata:
# Add a label app.kubernetes.io/managed-by=kustomize-vX.Y.Z
- managedByLabel
# Add an annotation config.kubernetes.io/origin, indicating:
# - which file defined that resource;
# - if it comes from a git repository, which one, and which
# ref (tag, branch...) it was.
- originAnnotations
# Add an annotation alpha.config.kubernetes.io/transformations
# indicating which patches and other transformers have changed
# each resource.
- transformerAnnotations
# Let's generate a ConfigMap with literal values.
# Note that this will actually add a suffix to the name of the
# ConfigMaps (e.g.: ollama-8bk8bd8m76) and it will update all
# references to the ConfigMap (e.g. in Deployment manifests)
# accordingly. The suffix is a hash of the ConfigMap contents,
# so that basically, if the ConfigMap is edited, any workload
# using that ConfigMap will automatically do a rolling update.
configMapGenerator:
- name: ollama
literals:
- "model=gemma3:270m"
- "prompt=If you visit Paris, I suggest that you"
- "queue=4"
name: ollama
patches:
# The Deployment manifest in ollama.yaml doesn't specify
# resource requests and limits, so that it can run on any
# cluster (including resource-constrained local clusters
# like KiND or minikube). The example below adds CPU
# requests and limits using a strategic merge patch.
# The patch is inlined here, but it could also be put
# in a file and referenced with "path: xxxxxx.yaml".
- patch: |
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
spec:
template:
spec:
containers:
- name: ollama
resources:
requests:
cpu: 1
limits:
cpu: 2
# This will have the same effect, with one little detail:
# JSON patches cannot specify containers by name, so this
# assumes that the ollama container is the first one in
# the pod template (whereas the strategic merge patch can
# use "merge keys" and identify containers by their name).
#- target:
# kind: Deployment
# name: ollama
# patch: |
# - op: add
# path: /spec/template/spec/containers/0/resources
# value:
# requests:
# cpu: 1
# limits:
# cpu: 2
# A "component" is a bit like a "base", in the sense that
# it lets us define some reusable resources and behaviors.
# There is a key difference, though:
# - a "base" will be evaluated in isolation: it will
# generate+transform some resources, then these resources
# will be included in the main Kustomization;
# - a "component" has access to all the resources that
# have been generated by the main Kustomization, which
# means that it can transform them (with patches etc).
components:
- add-haproxy-sidecar
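
To see the effect of the configMapGenerator and the buildMetadata options described above, one can render the kustomization and look for the hashed ConfigMap name and the added annotations (a sketch, run from the directory containing this kustomization.yaml with a recent kustomize):

  kustomize build . | grep 'name: ollama-'                      # ConfigMap name with its content-hash suffix
  kustomize build . | grep -B1 'config.kubernetes.io/origin'    # annotation added by originAnnotations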

View File

@@ -1,73 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: ollama
name: ollama
spec:
selector:
matchLabels:
app: ollama
template:
metadata:
labels:
app: ollama
spec:
volumes:
- name: ollama
hostPath:
path: /opt/ollama
type: DirectoryOrCreate
containers:
- image: ollama/ollama
name: ollama
env:
- name: OLLAMA_MAX_QUEUE
valueFrom:
configMapKeyRef:
name: ollama
key: queue
- name: MODEL
valueFrom:
configMapKeyRef:
name: ollama
key: model
volumeMounts:
- name: ollama
mountPath: /root/.ollama
lifecycle:
postStart:
exec:
command:
- /bin/sh
- -c
- ollama pull $MODEL
livenessProbe:
httpGet:
port: 11434
readinessProbe:
exec:
command:
- /bin/sh
- -c
- ollama show $MODEL
ports:
- name: ollama
containerPort: 11434
---
apiVersion: v1
kind: Service
metadata:
labels:
app: ollama
name: ollama
spec:
ports:
- name: "11434"
port: 11434
protocol: TCP
targetPort: 11434
selector:
app: ollama
type: ClusterIP
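
The postStart hook above pulls the model, and the readiness probe only passes once "ollama show $MODEL" succeeds, so the Pod is marked Ready only when the model is available. One way to verify this end to end (a sketch, assuming the kustomization has been applied and that /api/tags is the usual Ollama model-listing endpoint):

  kubectl wait --for=condition=Ready pod -l app=ollama --timeout=300s
  kubectl port-forward service/ollama 11434:11434 &
  curl localhost:11434/api/tags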

View File

@@ -1,5 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- microservices
- redis

View File

@@ -1,13 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- microservices.yaml
transformers:
- |
apiVersion: builtin
kind: PrefixSuffixTransformer
metadata:
name: use-ghcr-io
prefix: ghcr.io/
fieldSpecs:
- path: spec/template/spec/containers/image
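
This transformer prepends ghcr.io/ to every container image referenced by the kustomization above; rendering it makes the rewrite visible (a sketch, run from the directory containing this kustomization and microservices.yaml):

  kustomize build . | grep 'image:'
  # expected: image: ghcr.io/dockercoins/hasher:v0.1, ghcr.io/dockercoins/rng:v0.1, and so on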

View File

@@ -1,125 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: hasher
name: hasher
spec:
replicas: 1
selector:
matchLabels:
app: hasher
template:
metadata:
labels:
app: hasher
spec:
containers:
- image: dockercoins/hasher:v0.1
name: hasher
---
apiVersion: v1
kind: Service
metadata:
labels:
app: hasher
name: hasher
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: hasher
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rng
name: rng
spec:
replicas: 1
selector:
matchLabels:
app: rng
template:
metadata:
labels:
app: rng
spec:
containers:
- image: dockercoins/rng:v0.1
name: rng
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rng
name: rng
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: rng
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: webui
name: webui
spec:
replicas: 1
selector:
matchLabels:
app: webui
template:
metadata:
labels:
app: webui
spec:
containers:
- image: dockercoins/webui:v0.1
name: webui
---
apiVersion: v1
kind: Service
metadata:
labels:
app: webui
name: webui
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: webui
type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: worker
name: worker
spec:
replicas: 1
selector:
matchLabels:
app: worker
template:
metadata:
labels:
app: worker
spec:
containers:
- image: dockercoins/worker:v0.1
name: worker

View File

@@ -1,4 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- redis.yaml

View File

@@ -1,35 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: redis
name: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- image: redis
name: redis
---
apiVersion: v1
kind: Service
metadata:
labels:
app: redis
name: redis
spec:
ports:
- port: 6379
protocol: TCP
targetPort: 6379
selector:
app: redis
type: ClusterIP

View File

@@ -1,160 +0,0 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: hasher
name: hasher
spec:
replicas: 1
selector:
matchLabels:
app: hasher
template:
metadata:
labels:
app: hasher
spec:
containers:
- image: dockercoins/hasher:v0.1
name: hasher
---
apiVersion: v1
kind: Service
metadata:
labels:
app: hasher
name: hasher
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: hasher
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: redis
name: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- image: redis
name: redis
---
apiVersion: v1
kind: Service
metadata:
labels:
app: redis
name: redis
spec:
ports:
- port: 6379
protocol: TCP
targetPort: 6379
selector:
app: redis
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rng
name: rng
spec:
replicas: 1
selector:
matchLabels:
app: rng
template:
metadata:
labels:
app: rng
spec:
containers:
- image: dockercoins/rng:v0.1
name: rng
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rng
name: rng
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: rng
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: webui
name: webui
spec:
replicas: 1
selector:
matchLabels:
app: webui
template:
metadata:
labels:
app: webui
spec:
containers:
- image: dockercoins/webui:v0.1
name: webui
---
apiVersion: v1
kind: Service
metadata:
labels:
app: webui
name: webui
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: webui
type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: worker
name: worker
spec:
replicas: 1
selector:
matchLabels:
app: worker
template:
metadata:
labels:
app: worker
spec:
containers:
- image: dockercoins/worker:v0.1
name: worker

View File

@@ -1,30 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- dockercoins.yaml
replacements:
- sourceValue: ghcr.io/dockercoins
targets:
- select:
kind: Deployment
labelSelector: "app in (hasher,rng,webui,worker)"
# It will soon be possible to use regexes in replacement selectors,
# meaning that the "labelSelector:" above can be replaced with the
# following "name:" selector which is a tiny bit simpler:
#name: hasher|rng|webui|worker
# Regex support in replacement selectors was added by this PR:
# https://github.com/kubernetes-sigs/kustomize/pull/5863
# This PR was merged in August 2025, but as of October 2025, the
# latest release of Kustomize is 5.7.1, which was released in July.
# Hopefully the feature will be available in the next release :)
# Another possibility would be to select all Deployments, and then
# reject the one(s) for which we don't want to update the registry;
# for instance:
#reject:
# kind: Deployment
# name: redis
fieldPaths:
- spec.template.spec.containers.*.image
options:
delimiter: "/"
index: 0
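
Unlike the PrefixSuffixTransformer shown earlier, this replacement splits each image on '/' and only rewrites the first segment, and the label selector keeps the redis Deployment out of scope; rendering the kustomization shows both effects (a sketch):

  kustomize build . | grep 'image:'
  # dockercoins/* images become ghcr.io/dockercoins/*, while 'image: redis' is left untouched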

View File

@@ -66,7 +66,7 @@ Here is where we look for credentials for each provider:
- Civo: CLI configuration file (`~/.civo.json`)
- Digital Ocean: CLI configuration file (`~/.config/doctl/config.yaml`)
- Exoscale: CLI configuration file (`~/.config/exoscale/exoscale.toml`)
- Google Cloud: we're using "Application Default Credentials (ADC)"; run `gcloud auth application-default login`; note that we'll use the default "project" set in `gcloud` unless you set the `GOOGLE_PROJECT` environment variable
- Google Cloud: FIXME, note that the project name is currently hard-coded to `prepare-tf`
- Hetzner: CLI configuration file (`~/.config/hcloud/cli.toml`)
- Linode: CLI configuration file (`~/.config/linode-cli`)
- OpenStack: you will need to write a tfvars file (check [that example](terraform/virtual-machines/openstack/tfvars.example))

View File

@@ -36,12 +36,8 @@ _populate_zone() {
ZONE_ID=$(_get_zone_id $1)
shift
for IPADDR in $*; do
case "$IPADDR" in
*.*) TYPE=A;;
*:*) TYPE=AAAA;;
esac
cloudflare zones/$ZONE_ID/dns_records "name=*" "type=$TYPE" "content=$IPADDR"
cloudflare zones/$ZONE_ID/dns_records "name=\@" "type=$TYPE" "content=$IPADDR"
cloudflare zones/$ZONE_ID/dns_records "name=*" "type=A" "content=$IPADDR"
cloudflare zones/$ZONE_ID/dns_records "name=\@" "type=A" "content=$IPADDR"
done
}
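
The case statement above derives the DNS record type from the address format; as a standalone sketch with made-up addresses:

  for IPADDR in 203.0.113.10 2001:db8::1; do
    case "$IPADDR" in
      *.*) TYPE=A ;;
      *:*) TYPE=AAAA ;;
    esac
    echo "$IPADDR -> $TYPE record"
  done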

View File

@@ -5,22 +5,16 @@
# 10% CPU
# (See https://docs.google.com/document/d/1n0lwp6rQKQUIuo_A5LQ1dgCzrmjkDjmDtNj1Jn92UrI)
# PRO2-XS = 4 core, 16 gb
#
# With vspod:
# 800 MB RAM
# 33% CPU
#
set -e
KONKTAG=konk
PROVIDER=linode
STUDENTS=5
PROVIDER=scaleway
STUDENTS=30
case "$PROVIDER" in
linode)
export TF_VAR_node_size=g6-standard-6
export TF_VAR_location=fr-par
export TF_VAR_location=us-east
;;
scaleway)
export TF_VAR_node_size=PRO2-XS
@@ -34,13 +28,11 @@ esac
export KUBECONFIG=~/kubeconfig
if [ "$PROVIDER" = "kind" ]; then
kind create cluster --name $KONKTAG
kind create cluster --name konk
ADDRTYPE=InternalIP
else
if ! [ -f tags/$KONKTAG/stage2/kubeconfig.101 ]; then
./labctl create --mode mk8s --settings settings/konk.env --provider $PROVIDER --tag $KONKTAG
fi
cp tags/$KONKTAG/stage2/kubeconfig.101 $KUBECONFIG
./labctl create --mode mk8s --settings settings/konk.env --provider $PROVIDER --tag konk
cp tags/konk/stage2/kubeconfig.101 $KUBECONFIG
ADDRTYPE=ExternalIP
fi

View File

@@ -56,7 +56,7 @@ _cmd_codeserver() {
ARCH=${ARCHITECTURE-amd64}
CODESERVER_VERSION=4.96.4
CODESERVER_URL=\$GITHUB/coder/code-server/releases/download/v${CODESERVER_VERSION}/code-server-${CODESERVER_VERSION}-linux-${ARCH}.tar.gz
CODESERVER_URL=https://github.com/coder/code-server/releases/download/v${CODESERVER_VERSION}/code-server-${CODESERVER_VERSION}-linux-${ARCH}.tar.gz
pssh "
set -e
i_am_first_node || exit 0
@@ -230,7 +230,7 @@ _cmd_create() {
;;
*) die "Invalid mode: $MODE (supported modes: mk8s, pssh)." ;;
esac
if ! [ -f "$SETTINGS" ]; then
die "Settings file ($SETTINGS) not found."
fi
@@ -270,27 +270,7 @@ _cmd_create() {
ln -s ../../$SETTINGS tags/$TAG/settings.env.orig
cp $SETTINGS tags/$TAG/settings.env
# For Google Cloud, it is necessary to specify which "project" to use.
# Unfortunately, the Terraform provider doesn't seem to have a way
# to detect which Google Cloud project you want to use; it has to be
# specified one way or another. Let's decide that it should be set with
# the GOOGLE_PROJECT env var; and if that var is not set, we'll try to
# figure it out from gcloud.
# (See https://github.com/hashicorp/terraform-provider-google/issues/10907#issuecomment-1015721600)
# Since we need that variable to be set each time we'll call Terraform
# (e.g. when destroying the environment), let's save it to the settings.env
# file.
if [ "$PROVIDER" = "googlecloud" ]; then
if ! [ "$GOOGLE_PROJECT" ]; then
info "PROVIDER=googlecloud but GOOGLE_PROJECT is not set. Detecting it."
GOOGLE_PROJECT=$(gcloud config get project)
info "GOOGLE_PROJECT will be set to '$GOOGLE_PROJECT'."
fi
echo "export GOOGLE_PROJECT=$GOOGLE_PROJECT" >> tags/$TAG/settings.env
fi
. tags/$TAG/settings.env
. $SETTINGS
echo $MODE > tags/$TAG/mode
echo $PROVIDER > tags/$TAG/provider
@@ -375,8 +355,8 @@ _cmd_clusterize() {
pssh -I < tags/$TAG/clusters.tsv "
grep -w \$PSSH_HOST | tr '\t' '\n' > /tmp/cluster"
pssh "
echo \$PSSH_HOST > /tmp/ip_address
head -n 1 /tmp/cluster | sudo tee /etc/ip_address_of_first_node
echo \$PSSH_HOST > /tmp/ipv4
head -n 1 /tmp/cluster | sudo tee /etc/ipv4_of_first_node
echo ${CLUSTERPREFIX}1 | sudo tee /etc/name_of_first_node
echo HOSTIP=\$PSSH_HOST | sudo tee -a /etc/environment
NODEINDEX=\$((\$PSSH_NODENUM%$CLUSTERSIZE+1))
@@ -459,7 +439,7 @@ _cmd_docker() {
set -e
### Install docker-compose.
sudo curl -fsSL -o /usr/local/bin/docker-compose \
\$GITHUB/docker/compose/releases/download/$COMPOSE_VERSION/docker-compose-$COMPOSE_PLATFORM
https://github.com/docker/compose/releases/download/$COMPOSE_VERSION/docker-compose-$COMPOSE_PLATFORM
sudo chmod +x /usr/local/bin/docker-compose
docker-compose version
@@ -467,7 +447,7 @@ _cmd_docker() {
##VERSION## https://github.com/docker/machine/releases
MACHINE_VERSION=v0.16.2
sudo curl -fsSL -o /usr/local/bin/docker-machine \
\$GITHUB/docker/machine/releases/download/\$MACHINE_VERSION/docker-machine-\$(uname -s)-\$(uname -m)
https://github.com/docker/machine/releases/download/\$MACHINE_VERSION/docker-machine-\$(uname -s)-\$(uname -m)
sudo chmod +x /usr/local/bin/docker-machine
docker-machine version
"
@@ -500,10 +480,10 @@ _cmd_kubebins() {
set -e
cd /usr/local/bin
if ! [ -x etcd ]; then
curl -L \$GITHUB/etcd-io/etcd/releases/download/$ETCD_VERSION/etcd-$ETCD_VERSION-linux-$ARCH.tar.gz \
curl -L https://github.com/etcd-io/etcd/releases/download/$ETCD_VERSION/etcd-$ETCD_VERSION-linux-$ARCH.tar.gz \
| sudo tar --strip-components=1 --wildcards -zx '*/etcd' '*/etcdctl'
fi
if ! [ -x kube-apiserver ]; then
if ! [ -x hyperkube ]; then
##VERSION##
curl -L https://dl.k8s.io/$K8SBIN_VERSION/kubernetes-server-linux-$ARCH.tar.gz \
| sudo tar --strip-components=3 -zx \
@@ -512,7 +492,7 @@ _cmd_kubebins() {
sudo mkdir -p /opt/cni/bin
cd /opt/cni/bin
if ! [ -x bridge ]; then
curl -L \$GITHUB/containernetworking/plugins/releases/download/$CNI_VERSION/cni-plugins-linux-$ARCH-$CNI_VERSION.tgz \
curl -L https://github.com/containernetworking/plugins/releases/download/$CNI_VERSION/cni-plugins-linux-$ARCH-$CNI_VERSION.tgz \
| sudo tar -zx
fi
"
@@ -562,18 +542,6 @@ EOF"
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl &&
echo 'alias k=kubecolor' | sudo tee /etc/bash_completion.d/k &&
echo 'complete -F __start_kubectl k' | sudo tee -a /etc/bash_completion.d/k"
# Install helm early
# (so that we can use it to install e.g. Cilium etc.)
ARCH=${ARCHITECTURE-amd64}
HELM_VERSION=3.19.1
pssh "
if [ ! -x /usr/local/bin/helm ]; then
curl -fsSL https://get.helm.sh/helm-v${HELM_VERSION}-linux-${ARCH}.tar.gz |
sudo tar --strip-components=1 --wildcards -zx -C /usr/local/bin '*/helm'
helm completion bash | sudo tee /etc/bash_completion.d/helm
helm version
fi"
}
_cmd kubeadm "Setup kubernetes clusters with kubeadm"
@@ -597,18 +565,6 @@ _cmd_kubeadm() {
# Initialize kube control plane
pssh --timeout 200 "
IPV6=\$(ip -json a | jq -r '.[].addr_info[] | select(.scope==\"global\" and .family==\"inet6\") | .local' | head -n1)
if [ \"\$IPV6\" ]; then
ADVERTISE=\"advertiseAddress: \$IPV6\"
SERVICE_SUBNET=\"serviceSubnet: fdff::/112\"
touch /tmp/install-cilium-ipv6-only
touch /tmp/ipv6-only
else
ADVERTISE=
SERVICE_SUBNET=
touch /tmp/install-weave
fi
echo IPV6=\$IPV6 ADVERTISE=\$ADVERTISE
if i_am_first_node && [ ! -f /etc/kubernetes/admin.conf ]; then
kubeadm token generate > /tmp/token &&
cat >/tmp/kubeadm-config.yaml <<EOF
@@ -616,12 +572,9 @@ kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- token: \$(cat /tmp/token)
localAPIEndpoint:
\$ADVERTISE
nodeRegistration:
ignorePreflightErrors:
- NumCPU
- FileContent--proc-sys-net-ipv6-conf-default-forwarding
$IGNORE_SYSTEMVERIFICATION
$IGNORE_SWAP
$IGNORE_IPTABLES
@@ -648,9 +601,7 @@ kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
apiServer:
certSANs:
- \$(cat /tmp/ip_address)
networking:
\$SERVICE_SUBNET
- \$(cat /tmp/ipv4)
$CLUSTER_CONFIGURATION_KUBERNETESVERSION
EOF
sudo kubeadm init --config=/tmp/kubeadm-config.yaml
@@ -669,20 +620,9 @@ EOF
# Install weave as the pod network
pssh "
if i_am_first_node; then
if [ -f /tmp/install-weave ]; then
curl -fsSL \$GITHUB/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s-1.11.yaml |
sed s,weaveworks/weave,quay.io/rackspace/weave, |
kubectl apply -f-
fi
if [ -f /tmp/install-cilium-ipv6-only ]; then
helm upgrade -i cilium cilium --repo https://helm.cilium.io/ \
--namespace kube-system \
--set cni.chainingMode=portmap \
--set ipv6.enabled=true \
--set ipv4.enabled=false \
--set underlayProtocol=ipv6 \
--version 1.18.3
fi
curl -fsSL https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s-1.11.yaml |
sed s,weaveworks/weave,quay.io/rackspace/weave, |
kubectl apply -f-
fi"
# FIXME this is a gross hack to add the deployment key to our SSH agent,
@@ -705,16 +645,13 @@ EOF
fi
# Install metrics server
pssh -I <../k8s/metrics-server.yaml "
pssh "
if i_am_first_node; then
kubectl apply -f-
fi"
# It would be nice to be able to use that helm chart for metrics-server.
# Unfortunately, the charts themselves are on github.com and we want to
# avoid that due to their lack of IPv6 support.
kubectl apply -f https://raw.githubusercontent.com/jpetazzo/container.training/master/k8s/metrics-server.yaml
#helm upgrade --install metrics-server \
# --repo https://kubernetes-sigs.github.io/metrics-server/ metrics-server \
# --namespace kube-system --set args={--kubelet-insecure-tls}
fi"
}
_cmd kubetools "Install a bunch of CLI tools for Kubernetes"
@@ -741,7 +678,7 @@ _cmd_kubetools() {
# Install ArgoCD CLI
##VERSION## https://github.com/argoproj/argo-cd/releases/latest
URL=\$GITHUB/argoproj/argo-cd/releases/latest/download/argocd-linux-${ARCH}
URL=https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-${ARCH}
pssh "
if [ ! -x /usr/local/bin/argocd ]; then
sudo curl -o /usr/local/bin/argocd -fsSL $URL
@@ -754,7 +691,7 @@ _cmd_kubetools() {
##VERSION## https://github.com/fluxcd/flux2/releases
FLUX_VERSION=2.3.0
FILENAME=flux_${FLUX_VERSION}_linux_${ARCH}
URL=\$GITHUB/fluxcd/flux2/releases/download/v$FLUX_VERSION/$FILENAME.tar.gz
URL=https://github.com/fluxcd/flux2/releases/download/v$FLUX_VERSION/$FILENAME.tar.gz
pssh "
if [ ! -x /usr/local/bin/flux ]; then
curl -fsSL $URL |
@@ -769,7 +706,7 @@ _cmd_kubetools() {
set -e
if ! [ -x /usr/local/bin/kctx ]; then
cd /tmp
git clone \$GITHUB/ahmetb/kubectx
git clone https://github.com/ahmetb/kubectx
sudo cp kubectx/kubectx /usr/local/bin/kctx
sudo cp kubectx/kubens /usr/local/bin/kns
sudo cp kubectx/completion/*.bash /etc/bash_completion.d
@@ -780,7 +717,7 @@ _cmd_kubetools() {
set -e
if ! [ -d /opt/kube-ps1 ]; then
cd /tmp
git clone \$GITHUB/jonmosco/kube-ps1
git clone https://github.com/jonmosco/kube-ps1
sudo mv kube-ps1 /opt/kube-ps1
sudo -u $USER_LOGIN sed -i s/docker-prompt/kube_ps1/ /home/$USER_LOGIN/.bashrc &&
sudo -u $USER_LOGIN tee -a /home/$USER_LOGIN/.bashrc <<EOF
@@ -797,7 +734,7 @@ EOF
##VERSION## https://github.com/stern/stern/releases
STERN_VERSION=1.29.0
FILENAME=stern_${STERN_VERSION}_linux_${ARCH}
URL=\$GITHUB/stern/stern/releases/download/v$STERN_VERSION/$FILENAME.tar.gz
URL=https://github.com/stern/stern/releases/download/v$STERN_VERSION/$FILENAME.tar.gz
pssh "
if [ ! -x /usr/local/bin/stern ]; then
curl -fsSL $URL |
@@ -808,11 +745,9 @@ EOF
fi"
# Install helm
HELM_VERSION=3.19.1
pssh "
if [ ! -x /usr/local/bin/helm ]; then
curl -fsSL https://get.helm.sh/helm-v${HELM_VERSION}-linux-${ARCH}.tar.gz |
sudo tar --strip-components=1 --wildcards -zx -C /usr/local/bin '*/helm'
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get-helm-3 | sudo bash &&
helm completion bash | sudo tee /etc/bash_completion.d/helm
helm version
fi"
@@ -820,7 +755,7 @@ EOF
# Install kustomize
##VERSION## https://github.com/kubernetes-sigs/kustomize/releases
KUSTOMIZE_VERSION=v5.4.1
URL=\$GITHUB/kubernetes-sigs/kustomize/releases/download/kustomize/${KUSTOMIZE_VERSION}/kustomize_${KUSTOMIZE_VERSION}_linux_${ARCH}.tar.gz
URL=https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize/${KUSTOMIZE_VERSION}/kustomize_${KUSTOMIZE_VERSION}_linux_${ARCH}.tar.gz
pssh "
if [ ! -x /usr/local/bin/kustomize ]; then
curl -fsSL $URL |
@@ -837,17 +772,15 @@ EOF
pssh "
if [ ! -x /usr/local/bin/ship ]; then
##VERSION##
curl -fsSL \$GITHUB/replicatedhq/ship/releases/download/v0.51.3/ship_0.51.3_linux_$ARCH.tar.gz |
curl -fsSL https://github.com/replicatedhq/ship/releases/download/v0.51.3/ship_0.51.3_linux_$ARCH.tar.gz |
sudo tar -C /usr/local/bin -zx ship
fi"
# Install the AWS IAM authenticator
AWSIAMAUTH_VERSION=0.7.8
URL=\$GITHUB/kubernetes-sigs/aws-iam-authenticator/releases/download/v${AWSIAMAUTH_VERSION}/aws-iam-authenticator_${AWSIAMAUTH_VERSION}_linux_${ARCH}
pssh "
if [ ! -x /usr/local/bin/aws-iam-authenticator ]; then
##VERSION##
sudo curl -fsSLo /usr/local/bin/aws-iam-authenticator $URL
sudo curl -fsSLo /usr/local/bin/aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.12.7/2019-03-27/bin/linux/$ARCH/aws-iam-authenticator
sudo chmod +x /usr/local/bin/aws-iam-authenticator
aws-iam-authenticator version
fi"
@@ -857,17 +790,17 @@ EOF
if [ ! -x /usr/local/bin/jless ]; then
##VERSION##
sudo apt-get install -y libxcb-render0 libxcb-shape0 libxcb-xfixes0
wget \$GITHUB/PaulJuliusMartinez/jless/releases/download/v0.9.0/jless-v0.9.0-x86_64-unknown-linux-gnu.zip
wget https://github.com/PaulJuliusMartinez/jless/releases/download/v0.9.0/jless-v0.9.0-x86_64-unknown-linux-gnu.zip
unzip jless-v0.9.0-x86_64-unknown-linux-gnu
sudo mv jless /usr/local/bin
fi"
# Install the krew package manager
pssh "
if [ ! -d /home/$USER_LOGIN/.krew ] && [ ! -f /tmp/ipv6-only ]; then
if [ ! -d /home/$USER_LOGIN/.krew ]; then
cd /tmp &&
KREW=krew-linux_$ARCH
curl -fsSL \$GITHUB/kubernetes-sigs/krew/releases/latest/download/\$KREW.tar.gz |
curl -fsSL https://github.com/kubernetes-sigs/krew/releases/latest/download/\$KREW.tar.gz |
tar -zxf- &&
sudo -u $USER_LOGIN -H ./\$KREW install krew &&
echo export PATH=/home/$USER_LOGIN/.krew/bin:\\\$PATH | sudo -u $USER_LOGIN tee -a /home/$USER_LOGIN/.bashrc
@@ -875,7 +808,7 @@ EOF
# Install kubecolor
KUBECOLOR_VERSION=0.4.0
URL=\$GITHUB/kubecolor/kubecolor/releases/download/v${KUBECOLOR_VERSION}/kubecolor_${KUBECOLOR_VERSION}_linux_${ARCH}.tar.gz
URL=https://github.com/kubecolor/kubecolor/releases/download/v${KUBECOLOR_VERSION}/kubecolor_${KUBECOLOR_VERSION}_linux_${ARCH}.tar.gz
pssh "
if [ ! -x /usr/local/bin/kubecolor ]; then
##VERSION##
@@ -887,7 +820,7 @@ EOF
pssh "
if [ ! -x /usr/local/bin/k9s ]; then
FILENAME=k9s_Linux_$ARCH.tar.gz &&
curl -fsSL \$GITHUB/derailed/k9s/releases/latest/download/\$FILENAME |
curl -fsSL https://github.com/derailed/k9s/releases/latest/download/\$FILENAME |
sudo tar -C /usr/local/bin -zx k9s
k9s version
fi"
@@ -896,7 +829,7 @@ EOF
pssh "
if [ ! -x /usr/local/bin/popeye ]; then
FILENAME=popeye_Linux_$ARCH.tar.gz &&
curl -fsSL \$GITHUB/derailed/popeye/releases/latest/download/\$FILENAME |
curl -fsSL https://github.com/derailed/popeye/releases/latest/download/\$FILENAME |
sudo tar -C /usr/local/bin -zx popeye
popeye version
fi"
@@ -909,7 +842,7 @@ EOF
if [ ! -x /usr/local/bin/tilt ]; then
TILT_VERSION=0.33.13
FILENAME=tilt.\$TILT_VERSION.linux.$TILT_ARCH.tar.gz
curl -fsSL \$GITHUB/tilt-dev/tilt/releases/download/v\$TILT_VERSION/\$FILENAME |
curl -fsSL https://github.com/tilt-dev/tilt/releases/download/v\$TILT_VERSION/\$FILENAME |
sudo tar -C /usr/local/bin -zx tilt
tilt completion bash | sudo tee /etc/bash_completion.d/tilt
tilt version
@@ -927,7 +860,7 @@ EOF
# Install Kompose
pssh "
if [ ! -x /usr/local/bin/kompose ]; then
curl -fsSLo kompose \$GITHUB/kubernetes/kompose/releases/latest/download/kompose-linux-$ARCH &&
curl -fsSLo kompose https://github.com/kubernetes/kompose/releases/latest/download/kompose-linux-$ARCH &&
sudo install kompose /usr/local/bin
kompose completion bash | sudo tee /etc/bash_completion.d/kompose
kompose version
@@ -936,7 +869,7 @@ EOF
# Install KinD
pssh "
if [ ! -x /usr/local/bin/kind ]; then
curl -fsSLo kind \$GITHUB/kubernetes-sigs/kind/releases/latest/download/kind-linux-$ARCH &&
curl -fsSLo kind https://github.com/kubernetes-sigs/kind/releases/latest/download/kind-linux-$ARCH &&
sudo install kind /usr/local/bin
kind completion bash | sudo tee /etc/bash_completion.d/kind
kind version
@@ -945,7 +878,7 @@ EOF
# Install YTT
pssh "
if [ ! -x /usr/local/bin/ytt ]; then
curl -fsSLo ytt \$GITHUB/vmware-tanzu/carvel-ytt/releases/latest/download/ytt-linux-$ARCH &&
curl -fsSLo ytt https://github.com/vmware-tanzu/carvel-ytt/releases/latest/download/ytt-linux-$ARCH &&
sudo install ytt /usr/local/bin
ytt completion bash | sudo tee /etc/bash_completion.d/ytt
ytt version
@@ -953,7 +886,7 @@ EOF
##VERSION## https://github.com/bitnami-labs/sealed-secrets/releases
KUBESEAL_VERSION=0.26.2
URL=\$GITHUB/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION}/kubeseal-${KUBESEAL_VERSION}-linux-${ARCH}.tar.gz
URL=https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION}/kubeseal-${KUBESEAL_VERSION}-linux-${ARCH}.tar.gz
#case $ARCH in
#amd64) FILENAME=kubeseal-linux-amd64;;
#arm64) FILENAME=kubeseal-arm64;;
@@ -970,7 +903,7 @@ EOF
VELERO_VERSION=1.13.2
pssh "
if [ ! -x /usr/local/bin/velero ]; then
curl -fsSL \$GITHUB/vmware-tanzu/velero/releases/download/v$VELERO_VERSION/velero-v$VELERO_VERSION-linux-$ARCH.tar.gz |
curl -fsSL https://github.com/vmware-tanzu/velero/releases/download/v$VELERO_VERSION/velero-v$VELERO_VERSION-linux-$ARCH.tar.gz |
sudo tar --strip-components=1 --wildcards -zx -C /usr/local/bin '*/velero'
velero completion bash | sudo tee /etc/bash_completion.d/velero
velero version --client-only
@@ -980,7 +913,7 @@ EOF
KUBENT_VERSION=0.7.2
pssh "
if [ ! -x /usr/local/bin/kubent ]; then
curl -fsSL \$GITHUB/doitintl/kube-no-trouble/releases/download/${KUBENT_VERSION}/kubent-${KUBENT_VERSION}-linux-$ARCH.tar.gz |
curl -fsSL https://github.com/doitintl/kube-no-trouble/releases/download/${KUBENT_VERSION}/kubent-${KUBENT_VERSION}-linux-$ARCH.tar.gz |
sudo tar -zxvf- -C /usr/local/bin kubent
kubent --version
fi"
@@ -988,7 +921,7 @@ EOF
# Ngrok. Note that unfortunately, this is the x86_64 binary.
# We might have to rethink how to handle this for multi-arch environments.
pssh "
if [ ! -x /usr/local/bin/ngrok ] && [ ! -f /tmp/ipv6-only ]; then
if [ ! -x /usr/local/bin/ngrok ]; then
curl -fsSL https://bin.equinox.io/c/bNyj1mQVY4c/ngrok-v3-stable-linux-amd64.tgz |
sudo tar -zxvf- -C /usr/local/bin ngrok
fi"
@@ -1087,9 +1020,7 @@ _cmd_ping() {
TAG=$1
need_tag
# If we connect to our VMs over IPv6, the IP address is between brackets.
# Unfortunately, fping doesn't support that; so let's strip brackets here.
tr -d [] < tags/$TAG/ips.txt | fping
fping < tags/$TAG/ips.txt
}
_cmd stage2 "Finalize the setup of managed Kubernetes clusters"
@@ -1161,7 +1092,7 @@ _cmd_standardize() {
sudo netfilter-persistent start
fi"
# oracle-cloud-agent upgrades packages in the background.
# oracle-cloud-agent upgrades pacakges in the background.
# This breaks our deployment scripts, because when we invoke apt-get, it complains
# that the lock already exists (symptom: random "Exited with error code 100").
# Workaround: if we detect oracle-cloud-agent, remove it.
@@ -1173,15 +1104,6 @@ _cmd_standardize() {
sudo snap remove oracle-cloud-agent
sudo dpkg --remove --force-remove-reinstreq unified-monitoring-agent
fi"
# Check if a cachttps instance is available.
# (This is used to access GitHub on IPv6-only hosts.)
pssh "
if curl -fsSLI http://cachttps.internal:3131/https://github.com/ >/dev/null; then
echo GITHUB=http://cachttps.internal:3131/https://github.com
else
echo GITHUB=https://github.com
fi | sudo tee -a /etc/environment"
}
_cmd tailhist "Install history viewer on port 1088"
@@ -1197,7 +1119,7 @@ _cmd_tailhist () {
pssh "
set -e
sudo apt-get install unzip -y
wget -c \$GITHUB/joewalnes/websocketd/releases/download/v0.3.0/websocketd-0.3.0-linux_$ARCH.zip
wget -c https://github.com/joewalnes/websocketd/releases/download/v0.3.0/websocketd-0.3.0-linux_$ARCH.zip
unzip -o websocketd-0.3.0-linux_$ARCH.zip websocketd
sudo mv websocketd /usr/local/bin/websocketd
sudo mkdir -p /opt/tailhist
@@ -1296,17 +1218,14 @@ fi
"
}
_cmd ssh "Open an SSH session to a node (first one by default)"
_cmd ssh "Open an SSH session to the first node of a tag"
_cmd_ssh() {
TAG=$1
need_tag
if [ "$2" ]; then
ssh -l ubuntu -i tags/$TAG/id_rsa $2
else
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
fi
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
}
_cmd tags "List groups of VMs known locally"
@@ -1463,7 +1382,7 @@ _cmd_webssh() {
sudo apt-get install python3-tornado python3-paramiko -y"
pssh "
cd /opt
[ -d webssh ] || sudo git clone \$GITHUB/jpetazzo/webssh"
[ -d webssh ] || sudo git clone https://github.com/jpetazzo/webssh"
pssh "
for KEYFILE in /etc/ssh/*.pub; do
read a b c < \$KEYFILE; echo localhost \$a \$b
@@ -1548,7 +1467,7 @@ test_vm() {
"whoami" \
"hostname -i" \
"ls -l /usr/local/bin/i_am_first_node" \
"grep . /etc/name_of_first_node /etc/ip_addres_of_first_node" \
"grep . /etc/name_of_first_node /etc/ipv4_of_first_node" \
"cat /etc/hosts" \
"hostnamectl status" \
"docker version | grep Version -B1" \

View File

@@ -23,14 +23,6 @@ pssh() {
# necessary - or down to zero, too.
sleep ${PSSH_DELAY_PRE-1}
# When things go wrong, it's convenient to ask pssh to show the output
# of the failed command. Let's make that easy with a DEBUG env var.
if [ "$DEBUG" ]; then
PSSH_I=-i
else
PSSH_I=""
fi
$(which pssh || which parallel-ssh) -h $HOSTFILE -l ubuntu \
--par ${PSSH_PARALLEL_CONNECTIONS-100} \
--timeout 300 \
@@ -39,6 +31,5 @@ pssh() {
-O UserKnownHostsFile=/dev/null \
-O StrictHostKeyChecking=no \
-O ForwardAgent=yes \
$PSSH_I \
"$@"
}

View File

@@ -1,4 +1,4 @@
#export TF_VAR_node_size=GP4.4
#export TF_VAR_node_size=GP2.4
#export TF_VAR_node_size=g6-standard-6
#export TF_VAR_node_size=m7i.xlarge

View File

@@ -2,11 +2,7 @@ terraform {
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.38.0"
}
helm = {
source = "hashicorp/helm"
version = "~> 3.0"
version = "2.16.1"
}
}
}
@@ -20,7 +16,7 @@ provider "kubernetes" {
provider "helm" {
alias = "cluster_${index}"
kubernetes = {
kubernetes {
config_path = "./kubeconfig.${index}"
}
}
@@ -55,37 +51,42 @@ resource "helm_release" "shpod_${index}" {
name = "shpod"
namespace = "shpod"
create_namespace = false
values = [
yamlencode({
service = {
type = "NodePort"
}
resources = {
requests = {
cpu = "100m"
memory = "500M"
}
limits = {
cpu = "1"
memory = "1000M"
}
}
persistentVolume = {
enabled = true
}
ssh = {
password = random_string.shpod_${index}.result
}
rbac = {
cluster = {
clusterRoles = [ "cluster-admin" ]
}
}
codeServer = {
enabled = true
}
})
]
set {
name = "service.type"
value = "NodePort"
}
set {
name = "resources.requests.cpu"
value = "100m"
}
set {
name = "resources.requests.memory"
value = "500M"
}
set {
name = "resources.limits.cpu"
value = "1"
}
set {
name = "resources.limits.memory"
value = "1000M"
}
set {
name = "persistentVolume.enabled"
value = "true"
}
set {
name = "ssh.password"
value = random_string.shpod_${index}.result
}
set {
name = "rbac.cluster.clusterRoles"
value = "{cluster-admin}"
}
set {
name = "codeServer.enabled"
value = "true"
}
}
resource "helm_release" "metrics_server_${index}" {
@@ -100,36 +101,10 @@ resource "helm_release" "metrics_server_${index}" {
name = "metrics-server"
namespace = "metrics-server"
create_namespace = true
values = [
yamlencode({
args = [ "--kubelet-insecure-tls" ]
})
]
}
# As of October 2025, the ebs-csi-driver addon (which is used on EKS
# to provision persistent volumes) doesn't automatically create a
# StorageClass. Here, we're trying to detect the DaemonSet created
# by the ebs-csi-driver; and if we find it, we create the corresponding
# StorageClass.
data "kubernetes_resources" "ebs_csi_node_${index}" {
provider = kubernetes.cluster_${index}
api_version = "apps/v1"
kind = "DaemonSet"
label_selector = "app.kubernetes.io/name=aws-ebs-csi-driver"
namespace = "kube-system"
}
resource "kubernetes_storage_class" "ebs_csi_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 1 : 0
provider = kubernetes.cluster_${index}
metadata {
name = "ebs-csi"
annotations = {
"storageclass.kubernetes.io/is-default-class" = "true"
}
set {
name = "args"
value = "{--kubelet-insecure-tls}"
}
storage_provisioner = "ebs.csi.aws.com"
}
# This section here deserves a little explanation.
@@ -161,14 +136,8 @@ resource "kubernetes_storage_class" "ebs_csi_${index}" {
# Lastly - in the ConfigMap we actually put both the original kubeconfig,
# and the one where we injected our new user (just in case we want to
# use or look at the original for any reason).
#
# One more thing: the kubernetes.io/kube-apiserver-client signer is
# disabled on EKS, so... we don't generate that ConfigMap on EKS.
# To detect if we're on EKS, we're looking for the ebs-csi-node DaemonSet.
# (Which means that the detection will break if the ebs-csi addon is missing.)
resource "kubernetes_config_map" "kubeconfig_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 0 : 1
provider = kubernetes.cluster_${index}
metadata {
name = "kubeconfig"
@@ -194,7 +163,7 @@ resource "kubernetes_config_map" "kubeconfig_${index}" {
- name: cluster-admin
user:
client-key-data: $${base64encode(tls_private_key.cluster_admin_${index}.private_key_pem)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}[0].certificate)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}.certificate)}
EOT
}
}
@@ -232,7 +201,6 @@ resource "kubernetes_cluster_role_binding" "shpod_cluster_admin_${index}" {
}
resource "kubernetes_certificate_signing_request_v1" "cluster_admin_${index}" {
count = (length(data.kubernetes_resources.ebs_csi_node_${index}.objects) > 0) ? 0 : 1
provider = kubernetes.cluster_${index}
metadata {
name = "cluster-admin"

View File

@@ -23,7 +23,7 @@ variable "node_size" {
}
variable "location" {
type = string
type = string
default = null
}

View File

@@ -1,45 +1,60 @@
data "aws_eks_cluster_versions" "_" {
default_only = true
# Taken from:
# https://github.com/hashicorp/learn-terraform-provision-eks-cluster/blob/main/main.tf
data "aws_availability_zones" "available" {}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.19.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 21.0"
name = var.cluster_name
kubernetes_version = data.aws_eks_cluster_versions._.cluster_versions[0].cluster_version
vpc_id = local.vpc_id
subnet_ids = local.subnet_ids
endpoint_public_access = true
enable_cluster_creator_admin_permissions = true
upgrade_policy = {
# The default policy is EXTENDED, which incurs additional costs
# when running an old control plane. We don't advise running old
# control planes, but we also don't want to incur costs if an
# old version is chosen accidentally.
support_type = "STANDARD"
}
source = "terraform-aws-modules/eks/aws"
version = "19.5.1"
cluster_name = var.cluster_name
cluster_version = "1.24"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_endpoint_public_access = true
eks_managed_node_group_defaults = {
ami_type = "AL2_x86_64"
addons = {
coredns = {}
eks-pod-identity-agent = {
before_compute = true
}
kube-proxy = {}
vpc-cni = {
before_compute = true
}
aws-ebs-csi-driver = {
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
}
}
eks_managed_node_groups = {
x86 = {
name = "x86"
one = {
name = "node-group-one"
instance_types = [local.node_size]
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
}
}
}
@@ -51,7 +66,7 @@ data "aws_iam_policy" "ebs_csi_policy" {
module "irsa-ebs-csi" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "~> 5.39.0"
version = "4.7.0"
create_role = true
role_name = "AmazonEKSTFEBSCSIRole-${module.eks.cluster_name}"
@@ -60,9 +75,13 @@ module "irsa-ebs-csi" {
oidc_fully_qualified_subjects = ["system:serviceaccount:kube-system:ebs-csi-controller-sa"]
}
resource "aws_vpc_security_group_ingress_rule" "_" {
security_group_id = module.eks.node_security_group_id
cidr_ipv4 = "0.0.0.0/0"
ip_protocol = -1
description = "Allow all traffic to Kubernetes nodes (so that we can use NodePorts, hostPorts, etc.)"
resource "aws_eks_addon" "ebs-csi" {
cluster_name = module.eks.cluster_name
addon_name = "aws-ebs-csi-driver"
addon_version = "v1.5.2-eksbuild.1"
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
tags = {
"eks_addon" = "ebs-csi"
"terraform" = "true"
}
}

View File

@@ -2,7 +2,7 @@ terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 6.17.0"
version = "~> 4.47.0"
}
}
}

View File

@@ -1,61 +0,0 @@
# OK, we have two options here.
# 1. Create our own VPC
# - Pros: provides good isolation from other stuff deployed in the
# AWS account; makes sure that we don't interact with
# existing security groups, subnets, etc.
# - Cons: by default, there is a quota of 5 VPC per region, so
# we can only deploy 5 clusters
# 2. Use the default VPC
# - Pros/cons: the opposite :)
variable "use_default_vpc" {
type = bool
default = true
}
data "aws_vpc" "default" {
default = true
}
data "aws_subnets" "default" {
filter {
name = "vpc-id"
values = [data.aws_vpc.default.id]
}
}
data "aws_availability_zones" "available" {}
module "vpc" {
count = var.use_default_vpc ? 0 : 1
source = "terraform-aws-modules/vpc/aws"
version = "~> 6.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
public_subnets = ["10.0.21.0/24", "10.0.22.0/24", "10.0.23.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
map_public_ip_on_launch = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
locals {
vpc_id = var.use_default_vpc ? data.aws_vpc.default.id : module.vpc[0].vpc_id
subnet_ids = var.use_default_vpc ? data.aws_subnets.default.ids : module.vpc[0].public_subnets
}
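
The use_default_vpc toggle above trades isolation for the default quota of 5 VPCs per region; since it defaults to true, deploying the cluster into its own VPC would be done by overriding the variable at apply time (a sketch):

  terraform apply -var="use_default_vpc=false"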

View File

@@ -0,0 +1,12 @@
locals {
location = var.location != null ? var.location : "europe-north1-a"
region = replace(local.location, "/-[a-z]$/", "")
# Unfortunately, the following line doesn't work
# (that attribute just returns an empty string)
# so we have to hard-code the project name.
#project = data.google_client_config._.project
project = "prepare-tf"
}
data "google_client_config" "_" {}

View File

@@ -1,7 +1,7 @@
resource "google_container_cluster" "_" {
name = var.cluster_name
location = local.location
deletion_protection = false
name = var.cluster_name
project = local.project
location = local.location
#min_master_version = var.k8s_version
# To deploy private clusters, uncomment the section below,
@@ -42,7 +42,7 @@ resource "google_container_cluster" "_" {
node_pool {
name = "x86"
node_config {
tags = ["lab-${var.cluster_name}"]
tags = var.common_tags
machine_type = local.node_size
}
initial_node_count = var.min_nodes_per_pool
@@ -62,25 +62,3 @@ resource "google_container_cluster" "_" {
}
}
}
resource "google_compute_firewall" "_" {
name = "lab-${var.cluster_name}"
network = "default"
allow {
protocol = "tcp"
ports = ["0-65535"]
}
allow {
protocol = "udp"
ports = ["0-65535"]
}
allow {
protocol = "icmp"
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["lab-${var.cluster_name}"]
}

View File

@@ -6,8 +6,6 @@ output "has_metrics_server" {
value = true
}
data "google_client_config" "_" {}
output "kubeconfig" {
sensitive = true
value = <<-EOT

View File

@@ -1 +0,0 @@
../../providers/googlecloud/provider.tf

View File

@@ -0,0 +1,8 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "4.5.0"
}
}
}

View File

@@ -30,7 +30,7 @@ resource "scaleway_k8s_pool" "_" {
max_size = var.max_nodes_per_pool
autoscaling = var.max_nodes_per_pool > var.min_nodes_per_pool
autohealing = true
depends_on = [scaleway_instance_security_group._]
depends_on = [ scaleway_instance_security_group._ ]
}
data "scaleway_k8s_version" "_" {

View File

@@ -4,36 +4,25 @@ resource "helm_release" "_" {
create_namespace = true
repository = "https://charts.loft.sh"
chart = "vcluster"
version = "0.27.1"
values = [
yamlencode({
controlPlane = {
proxy = {
extraSANs = [ local.guest_api_server_host ]
}
service = {
spec = {
type = "NodePort"
}
}
statefulSet = {
persistence = {
volumeClaim = {
enabled = true
}
}
}
}
sync = {
fromHost = {
nodes = {
enabled = true
selector = {
all = true
}
}
}
}
})
]
version = "0.19.7"
set {
name = "service.type"
value = "NodePort"
}
set {
name = "storage.persistence"
value = "false"
}
set {
name = "sync.nodes.enabled"
value = "true"
}
set {
name = "sync.nodes.syncAllNodes"
value = "true"
}
set {
name = "syncer.extraArgs"
value = "{--tls-san=${local.guest_api_server_host}}"
}
}

View File

@@ -1,8 +0,0 @@
terraform {
required_providers {
helm = {
source = "hashicorp/helm"
version = "~> 3.0"
}
}
}

View File

@@ -1,8 +0,0 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 7.0"
}
}
}

View File

@@ -9,9 +9,5 @@ variable "node_sizes" {
variable "location" {
type = string
default = "europe-north1-a"
default = null
}
locals {
location = (var.location != "" && var.location != null) ? var.location : "europe-north1-a"
}

View File

@@ -1,5 +1,5 @@
provider "helm" {
kubernetes = {
kubernetes {
config_path = "~/kubeconfig"
}
}

View File

@@ -63,8 +63,7 @@ locals {
resource "local_file" "ip_addresses" {
content = join("", formatlist("%s\n", [
for key, value in local.ip_addresses :
strcontains(value, ".") ? value : "[${value}]"
for key, value in local.ip_addresses : value
]))
filename = "ips.txt"
file_permission = "0600"

View File

@@ -1 +0,0 @@
../common.tf

View File

@@ -1 +0,0 @@
../../providers/googlecloud/config.tf

View File

@@ -1,54 +0,0 @@
# Note: names and tags on GCP have to match a specific regex:
# (?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)
# In other words, they must start with a letter; but ours generally
# start with a number (year-month-day-etc, so 2025-...),
# so we prefix names and tags with "lab-" in this configuration.
resource "google_compute_instance" "_" {
for_each = local.nodes
zone = var.location
name = "lab-${each.value.node_name}"
tags = ["lab-${var.tag}"]
machine_type = each.value.node_size
boot_disk {
initialize_params {
image = "ubuntu-os-cloud/ubuntu-2404-lts-amd64"
}
}
network_interface {
network = "default"
access_config {}
}
metadata = {
"ssh-keys" = "ubuntu:${tls_private_key.ssh.public_key_openssh}"
}
}
locals {
ip_addresses = {
for key, value in local.nodes :
key => google_compute_instance._[key].network_interface[0].access_config[0].nat_ip
}
}
resource "google_compute_firewall" "_" {
name = "lab-${var.tag}"
network = "default"
allow {
protocol = "tcp"
ports = ["0-65535"]
}
allow {
protocol = "udp"
ports = ["0-65535"]
}
allow {
protocol = "icmp"
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["lab-${var.tag}"]
}

View File

@@ -1 +0,0 @@
../../providers/googlecloud/provider.tf

View File

@@ -1 +0,0 @@
../../providers/googlecloud/variables.tf

View File

@@ -1,34 +1,12 @@
data "proxmox_virtual_environment_nodes" "_" {}
data "proxmox_virtual_environment_vms" "_" {
filter {
name = "template"
values = [true]
}
}
data "proxmox_virtual_environment_vms" "templates" {
for_each = toset(data.proxmox_virtual_environment_nodes._.names)
tags = ["ubuntu"]
filter {
name = "node_name"
values = [each.value]
}
filter {
name = "template"
values = [true]
}
}
locals {
pve_nodes = data.proxmox_virtual_environment_nodes._.names
pve_node = { for k, v in local.nodes : k => local.pve_nodes[v.node_index % length(local.pve_nodes)] }
pve_template_id = { for k, v in local.nodes : k => data.proxmox_virtual_environment_vms.templates[local.pve_node[k]].vms[0].vm_id }
pve_nodes = data.proxmox_virtual_environment_nodes._.names
}
resource "proxmox_virtual_environment_vm" "_" {
node_name = local.pve_nodes[each.value.node_index % length(local.pve_nodes)]
for_each = local.nodes
node_name = local.pve_node[each.key]
name = each.value.node_name
tags = ["container.training", var.tag]
stop_on_destroy = true
@@ -46,17 +24,9 @@ resource "proxmox_virtual_environment_vm" "_" {
# size = 30
# discard = "on"
#}
### Strategy 1: clone from shared storage
#clone {
# vm_id = var.proxmox_template_vm_id
# node_name = var.proxmox_template_node_name
# full = false
#}
### Strategy 2: clone from local storage
### (requires that the template exists on each node)
clone {
vm_id = local.pve_template_id[each.key]
node_name = local.pve_node[each.key]
vm_id = var.proxmox_template_vm_id
node_name = var.proxmox_template_node_name
full = false
}
agent {
@@ -71,9 +41,7 @@ resource "proxmox_virtual_environment_vm" "_" {
ip_config {
ipv4 {
address = "dhcp"
}
ipv6 {
address = "dhcp"
#gateway =
}
}
}
@@ -104,11 +72,8 @@ resource "proxmox_virtual_environment_vm" "_" {
locals {
ip_addresses = {
for key, value in local.nodes :
key => [for addr in flatten(concat(
proxmox_virtual_environment_vm._[key].ipv6_addresses,
proxmox_virtual_environment_vm._[key].ipv4_addresses,
["ERROR"])) :
addr if addr != "127.0.0.1" && addr != "::1"][0]
key => [for addr in flatten(concat(proxmox_virtual_environment_vm._[key].ipv4_addresses, ["ERROR"])) :
addr if addr != "127.0.0.1"][0]
}
}

View File

@@ -2,7 +2,7 @@ terraform {
required_providers {
proxmox = {
source = "bpg/proxmox"
version = "~> 0.86.0"
version = "~> 0.70.1"
}
}
}

View File

@@ -2,7 +2,6 @@
#/ /kube-halfday.yml.html 200!
#/ /kube-fullday.yml.html 200!
#/ /kube-twodays.yml.html 200!
/ /docker.yml.html 200!
# And this allows to do "git clone https://container.training".
/info/refs service=git-upload-pack https://github.com/jpetazzo/container.training/info/refs?service=git-upload-pack

View File

@@ -1,31 +0,0 @@
#!/usr/bin/env python
import os
import re
import sys
html_file = sys.argv[1]
output_file_template = "_academy_{}.html"
title_regex = "name: toc-(.*)"
redirects = open("_redirects", "w")
sections = re.split(title_regex, open(html_file).read())[1:]
while sections:
link, markdown = sections[0], sections[1]
sections = sections[2:]
output_file_name = output_file_template.format(link)
with open(output_file_name, "w") as f:
html = open("workshop.html").read()
html = html.replace("@@MARKDOWN@@", markdown)
titles = re.findall("# (.*)", markdown) + [""]
html = html.replace("@@TITLE@@", "{} — Kubernetes Academy".format(titles[0]))
html = html.replace("@@SLIDENUMBERPREFIX@@", "")
html = html.replace("@@EXCLUDE@@", "")
html = html.replace(".nav[", ".hide[")
f.write(html)
redirects.write("/{} /{} 200!\n".format(link, output_file_name))
html = open(html_file).read()
html = re.sub("#toc-([^)]*)", "_academy_\\1.html", html)
sys.stdout.write(html)

View File

@@ -29,20 +29,6 @@ At the end of this lesson, you will be able to:
---
## `Dockerfile` example
```
FROM python:alpine
WORKDIR /app
RUN pip install Flask
COPY rng.py .
ENV FLASK_APP=rng FLASK_RUN_HOST=:: FLASK_RUN_PORT=80
CMD ["flask", "run"]
EXPOSE 80
```
---
## Writing our first `Dockerfile`
Our Dockerfile must be in a **new, empty directory**.
@@ -101,7 +87,119 @@ To keep things simple for now: this is the directory where our Dockerfile is loc
---
## Build output
## What happens when we build the image?
It depends on whether we're using BuildKit or not!
If there are lots of blue lines and the first line looks like this:
```
[+] Building 1.8s (4/6)
```
... then we're using BuildKit.
If the output is mostly black-and-white and the first line looks like this:
```
Sending build context to Docker daemon 2.048kB
```
... then we're using the "classic" or "old-style" builder.
---
## To BuildKit or Not To BuildKit
Classic builder:
- copies the whole "build context" to the Docker Engine
- linear (processes lines one after the other)
- requires a full Docker Engine
BuildKit:
- only transfers parts of the "build context" when needed
- will parallelize operations (when possible)
- can run in non-privileged containers (e.g. on Kubernetes)
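If you want to compare both on your own machine, you can select the builder explicitly (a quick sketch; this assumes your Docker Engine still ships the classic builder, which is deprecated in recent versions):

```bash
DOCKER_BUILDKIT=0 docker build -t demo .   # force the classic builder
DOCKER_BUILDKIT=1 docker build -t demo .   # force BuildKit
```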
---
## With the classic builder
The output of `docker build` looks like this:
.small[
```bash
docker build -t figlet .
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM ubuntu
---> f975c5035748
Step 2/3 : RUN apt-get update
---> Running in e01b294dbffd
(...output of the RUN command...)
Removing intermediate container e01b294dbffd
---> eb8d9b561b37
Step 3/3 : RUN apt-get install figlet
---> Running in c29230d70f9b
(...output of the RUN command...)
Removing intermediate container c29230d70f9b
---> 0dfd7a253f21
Successfully built 0dfd7a253f21
Successfully tagged figlet:latest
```
]
* The output of the `RUN` commands has been omitted.
* Let's explain what this output means.
---
## Sending the build context to Docker
```bash
Sending build context to Docker daemon 2.048 kB
```
* The build context is the `.` directory given to `docker build`.
* It is sent (as an archive) by the Docker client to the Docker daemon.
* This allows us to use a remote machine to build using local files.
* Be careful (or patient) if that directory is big and your link is slow.
* You can speed up the process with a [`.dockerignore`](https://docs.docker.com/engine/reference/builder/#dockerignore-file) file
* It tells Docker to ignore specific files in the directory
* Only ignore files that you won't need in the build context!
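For illustration, here is a minimal sketch (the ignored paths are hypothetical; adjust them to your project):

```bash
# Keep bulky or sensitive paths out of the build context
cat > .dockerignore <<'EOF'
.git
node_modules
*.log
.env
EOF
docker build -t myimage .   # the excluded paths are not sent to the builder
```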
---
## Executing each step
```bash
Step 2/3 : RUN apt-get update
---> Running in e01b294dbffd
(...output of the RUN command...)
Removing intermediate container e01b294dbffd
---> eb8d9b561b37
```
* A container (`e01b294dbffd`) is created from the base image.
* The `RUN` command is executed in this container.
* The container is committed into an image (`eb8d9b561b37`).
* The build container (`e01b294dbffd`) is removed.
* The output of this step will be the base image for the next one.
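Conceptually, each step of the classic builder is close to a manual `docker run` followed by `docker commit` (a rough sketch, not how you would build images in practice):

```bash
# Roughly what "Step 2/3 : RUN apt-get update" does under the hood
docker run --name step2 ubuntu apt-get update   # run the instruction in a fresh container
docker commit step2 intermediate-image          # freeze the result into an image
docker rm step2                                 # remove the build container
# ...and the next step starts from "intermediate-image"
```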
---
## With BuildKit
.small[
```bash
@@ -133,7 +231,7 @@ To keep things simple for now: this is the directory where our Dockerfile is loc
---
## Understanding builder output
## Understanding BuildKit output
- BuildKit transfers the Dockerfile and the *build context*
@@ -151,9 +249,9 @@ To keep things simple for now: this is the directory where our Dockerfile is loc
class: extra-details
## Builder plain output
## BuildKit plain output
- When running builds in e.g. a CI pipeline, its output will be different
- When running BuildKit in e.g. a CI pipeline, its output will be different
- We can see the same output format by using `--progress=plain`
@@ -262,8 +360,6 @@ class: extra-details
---
class: extra-details
## Shell syntax vs exec syntax
Dockerfile commands that execute something can have two forms:
@@ -278,8 +374,6 @@ We are going to change our Dockerfile to see how it affects the resulting image.
---
class: extra-details
## Using exec syntax in our Dockerfile
Let's change our Dockerfile as follows!
@@ -298,8 +392,6 @@ $ docker build -t figlet .
---
class: extra-details
## History with exec syntax
Compare the new history:
@@ -321,8 +413,6 @@ IMAGE CREATED CREATED BY SIZE
---
class: extra-details
## When to use exec syntax and shell syntax
* shell syntax:
@@ -341,8 +431,6 @@ class: extra-details
---
class: extra-details
## Pro-tip: the `exec` shell built-in
POSIX shells have a built-in command named `exec`.
@@ -359,8 +447,6 @@ From a user perspective:
---
class: extra-details
## Example using `exec`
```dockerfile

View File

@@ -42,7 +42,7 @@ Our new Dockerfile will look like this:
```dockerfile
FROM ubuntu
RUN apt-get update
RUN apt-get install figlet
RUN ["apt-get", "install", "figlet"]
CMD figlet -f script hello
```
@@ -96,8 +96,6 @@ root@7ac86a641116:/#
---
class: extra-details
## Using `ENTRYPOINT`
We want to be able to specify a different message on the command line,
@@ -119,8 +117,6 @@ We will use the `ENTRYPOINT` verb in Dockerfile.
---
class: extra-details
## Adding `ENTRYPOINT` to our Dockerfile
Our new Dockerfile will look like this:
@@ -128,7 +124,7 @@ Our new Dockerfile will look like this:
```dockerfile
FROM ubuntu
RUN apt-get update
RUN apt-get install figlet
RUN ["apt-get", "install", "figlet"]
ENTRYPOINT ["figlet", "-f", "script"]
```
@@ -142,8 +138,6 @@ Why did we use JSON syntax for our `ENTRYPOINT`?
---
class: extra-details
## Implications of JSON vs string syntax
* When CMD or ENTRYPOINT use string syntax, they get wrapped in `sh -c`.
@@ -164,8 +158,6 @@ sh -c "figlet -f script" salut
---
class: extra-details
## Build and test our image
Let's build it:
@@ -190,8 +182,6 @@ $ docker run figlet salut
---
class: extra-details
## Using `CMD` and `ENTRYPOINT` together
What if we want to define a default message for our container?
@@ -206,8 +196,6 @@ Then we will use `ENTRYPOINT` and `CMD` together.
---
class: extra-details
## `CMD` and `ENTRYPOINT` together
Our new Dockerfile will look like this:
@@ -215,7 +203,7 @@ Our new Dockerfile will look like this:
```dockerfile
FROM ubuntu
RUN apt-get update
RUN apt-get install figlet
RUN ["apt-get", "install", "figlet"]
ENTRYPOINT ["figlet", "-f", "script"]
CMD ["hello world"]
```
@@ -229,8 +217,6 @@ CMD ["hello world"]
---
class: extra-details
## Build and test our image
Let's build it:
@@ -255,8 +241,6 @@ $ docker run myfiglet
---
class: extra-details
## Overriding the image default parameters
Now let's pass extra arguments to the image.
@@ -274,8 +258,6 @@ We overrode `CMD` but still used `ENTRYPOINT`.
---
class: extra-details
## Overriding `ENTRYPOINT`
What if we want to run a shell in our container?
@@ -292,8 +274,6 @@ root@6027e44e2955:/#
---
class: extra-details
## `CMD` and `ENTRYPOINT` recap
- `docker run myimage` executes `ENTRYPOINT` + `CMD`
@@ -317,8 +297,6 @@ class: extra-details
---
class: extra-details
## When to use `ENTRYPOINT` vs `CMD`
`ENTRYPOINT` is great for "containerized binaries".

View File

@@ -157,6 +157,8 @@ Here is the file used in the demo:
.small[
```yaml
version: "3"
services:
www:
build: www
@@ -276,8 +278,6 @@ For the full list, check: https://docs.docker.com/compose/compose-file/
---
class: extra-details
## Running multiple copies of a stack
- Copy the stack in two different directories, e.g. `front` and `frontcopy`
@@ -353,8 +353,6 @@ Use `docker compose down -v` to remove everything including volumes.
---
class: extra-details
## Special handling of volumes
- When an image gets updated, Compose automatically creates a new container
@@ -373,8 +371,6 @@ class: extra-details
---
class: extra-details
## Gotchas with volumes
- Unfortunately, Docker volumes don't have labels or metadata
@@ -395,8 +391,6 @@ class: extra-details
---
class: extra-details
## Managing volumes explicitly
Option 1: *named volumes*
@@ -418,8 +412,6 @@ volumes:
---
class: extra-details
## Managing volumes explicitly
Option 2: *relative paths*
@@ -439,8 +431,6 @@ services:
---
class: extra-details
## Managing complex stacks
- Compose provides multiple features to manage complex stacks
@@ -463,8 +453,6 @@ class: extra-details
---
class: extra-details
## Dependencies
- A service can have a `depends_on` section
@@ -477,6 +465,28 @@ class: extra-details
⚠️ It doesn't make a service "wait" for another one to be up!
---
class: extra-details
## A bit of history and trivia
- Compose was initially named "Fig"
- Compose is one of the only components of Docker written in Python
(almost everything else is in Go)
- In 2020, Docker introduced "Compose CLI":
- `docker compose` command to deploy Compose stacks to some clouds
- in Go instead of Python
- progressively getting feature parity with `docker-compose`
- also provides numerous improvements (e.g. leverages BuildKit by default)
???
:EN:- Using compose to describe an environment

View File

@@ -84,9 +84,9 @@ like Windows, macOS, Solaris, FreeBSD ...
* Each `lxc-start` process exposes a custom API over a local UNIX socket, allowing to interact with the container.
* No notion of image (container filesystems had be managed manually).
* No notion of image (container filesystems have to be managed manually).
* Networking had to be set up manually.
* Networking has to be set up manually.
---
@@ -98,22 +98,10 @@ like Windows, macOS, Solaris, FreeBSD ...
* Daemon exposing a REST API.
* Can run containers and virtual machines.
* Can manage images, snapshots, migrations, networking, storage.
* "offers a user experience similar to virtual machines but using Linux containers instead."
* Driven by Canonical.
---
## Incus
* Community-driven fork of LXD.
* Relatively recent ([announced in August 2023](https://linuxcontainers.org/incus/announcement/)), so time will tell what the notable differences will be.
---
## CRI-O
@@ -152,7 +140,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## [Kata containers](https://katacontainers.io/)
## Kata containers
* OCI-compliant runtime.
@@ -164,7 +152,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## [gVisor](https://gvisor.dev/)
## gVisor
* OCI-compliant runtime.
@@ -182,17 +170,7 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
---
## Others
- Micro VMs: Firecracker, Edera...
- [crun](https://github.com/containers/crun) (runc rewritten in C)
- [youki](https://youki-dev.github.io/youki/) (runc rewritten in Rust)
---
## To Docker Or Not To Docker
## Overall ...
* The Docker Engine is very developer-centric:
@@ -206,26 +184,8 @@ We're not aware of anyone using it directly (i.e. outside of Kubernetes).
* As a result, it is a fantastic tool in development environments.
* On Kubernetes clusters, containerd or CRI-O are better choices.
* On servers:
* On Kubernetes clusters, the container engine is an implementation detail.
- Docker is a good default choice
---
## Different levels
- Directly use namespaces, cgroups, capabilities with custom code or scripts
*useful for troubleshooting/debugging and for educational purposes; e.g. pipework*
- Use low-level engines like runc, crun, youki
*useful when building custom architectures; e.g. a brand new orchestrator*
- Use low-level APIs like CRI or containerd grpc API
*useful to achieve high-level features like Docker, but without Docker; e.g. ctr, nerdctl*
- Use high-level APIs like Docker and Kubernetes
*that's what most people will do*
- If you use Kubernetes, the engine doesn't matter
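As a tiny illustration of the "without Docker" levels (a hedged sketch, assuming containerd and nerdctl are installed on the host):

```bash
sudo ctr images pull docker.io/library/alpine:latest     # talk to containerd directly
sudo ctr run --rm -t docker.io/library/alpine:latest demo sh
sudo nerdctl run --rm -it alpine sh                       # Docker-like UX on top of containerd
```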

View File

@@ -235,8 +235,6 @@ communication across hosts, and publishing/load balancing for inbound traffic.
---
class: extra-details
## Finding the container's IP address
We can use the `docker inspect` command to find the IP address of the
@@ -255,8 +253,6 @@ $ docker inspect --format '{{ .NetworkSettings.IPAddress }}' <yourContainerID>
---
class: extra-details
## Pinging our container
Let's try to ping our container *from another container.*

View File

@@ -327,7 +327,9 @@ class: extra-details
## Which one is the best?
- In modern (2015+) systems, overlay2 should be the best option.
- Eventually, overlay2 should be the best option.
- It is available on all modern systems.
- Its memory usage is better than Device Mapper, BTRFS, or ZFS.

View File

@@ -141,13 +141,3 @@ class: pic
* etc.
* Docker Inc. launches commercial offers.
---
## Standardization of container runtimes
- Docker 1.11 (2016) introduces containerd and runc
- [Kubernetes 1.5 (2016)](https://kubernetes.io/blog/2016/12/kubernetes-1-5-supporting-production-workloads/) introduces the CRI
- First releases of CRI-O (2017), kata containers...

View File

@@ -1,44 +1,140 @@
# Docker? Containers?
# Docker 30,000ft overview
- **Docker:** open-source platform that runs containers.
In this lesson, we will learn about:
- **Container:** unit of software/deployment that contains everything needed for the code to run.
* Why containers (non-technical elevator pitch)
- Docker containers can run (almost) everywhere.
* Why containers (technical elevator pitch)
- Containers typically use fewer resources than VMs.
* How Docker helps us to build, ship, and run
- Can be easily copied and deployed. Make development faster.
* The history of containers
- Isolated from each other and from the host.
We won't actually run Docker or containers in this chapter (yet!).
Don't worry, we will get to that fast enough!
---
## Container vs VM
## Elevator pitch
**Virtual Machine**
### (for your manager, your boss...)
- Heavier and slower to boot.
- Include a full guest OS.
- Better for running multiple OS types on one host.
---
**Container**
- Lightweight and fast to start.
- Share the host OS kernel.
- Use fewer resources (CPU, RAM, storage).
- Ideal for microservices and scalable applications.
## OK... Why the buzz around containers?
* The software industry has changed
* Before:
* monolithic applications
* long development cycles
* single environment
* slowly scaling up
* Now:
* decoupled services
* fast, iterative improvements
* multiple environments
* quickly scaling out
---
## Deployment becomes very complex
* Many different stacks:
* languages
* frameworks
* databases
* Many different targets:
* individual development environments
* pre-production, QA, staging...
* production: on prem, cloud, hybrid
---
class: pic
![Container vs VM](images/cont_vs_vm.png)
## The deployment problem
![problem](images/shipping-software-problem.png)
---
## Basic workflow
class: pic
## The matrix from hell
![matrix](images/shipping-matrix-from-hell.png)
---
class: pic
## The parallel with the shipping industry
![history](images/shipping-industry-problem.png)
---
class: pic
## Intermodal shipping containers
![shipping](images/shipping-industry-solution.png)
---
class: pic
## A new shipping ecosystem
![shipeco](images/shipping-indsutry-results.png)
---
class: pic
## A shipping container system for applications
![shipapp](images/shipping-software-solution.png)
---
class: pic
## Eliminate the matrix from hell
![elimatrix](images/shipping-matrix-solved.png)
---
## Results
* [Dev-to-prod reduced from 9 months to 15 minutes (ING)](
https://gallant-turing-d0d520.netlify.com/docker-case-studies/CS_ING_01.25.2015_1.pdf)
* [Continuous integration job time reduced by more than 60% (BBC)](
https://gallant-turing-d0d520.netlify.com/docker-case-studies/CS_BBCNews_01.25.2015_1.pdf)
* [Deploy 100 times a day instead of once a week (GILT)](
https://gallant-turing-d0d520.netlify.com/docker-case-studies/CS_Gilt_Groupe_03.18.2015_0.pdf)
* [70% infrastructure consolidation (MetLife)](
https://www.youtube.com/watch?v=Bwt3xigvlj0)
* etc.
---
## Elevator pitch
### (for your fellow devs and ops)
---
## Escape dependency hell
1. Write installation instructions into an `INSTALL.txt` file
@@ -66,7 +162,7 @@ Never again "worked in dev - ops problem now!"
```bash
git clone ...
docker compose up
docker-compose up
```
With this, you can create development, integration, QA environments in minutes!
@@ -113,6 +209,109 @@ Images contain all the libraries, dependencies, etc. needed to run the app.
class: extra-details
## Decouple "plumbing" from application logic
1. Write your code to connect to named services ("db", "api"...)
2. Use Compose to start your stack
3. Docker will set up a per-container DNS resolver for those names
4. You can now scale, add load balancers, replication ... without changing your code
Note: this is not covered in this intro level workshop!
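Still, here is a minimal sketch of the idea (hypothetical stack; the only thing the `web` code needs to know is the service name `db`):

```bash
cat > compose.yaml <<'EOF'
services:
  web:
    image: nginx
  db:
    image: redis
EOF
docker compose up -d
docker compose exec web getent hosts db   # "db" resolves thanks to Compose's per-stack DNS
```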
---
class: extra-details
## What did Docker bring to the table?
### Docker before/after
---
class: extra-details
## Formats and APIs, before Docker
* No standardized exchange format.
<br/>(No, a rootfs tarball is *not* a format!)
* Containers are hard to use for developers.
<br/>(Where's the equivalent of `docker run debian`?)
* As a result, they are *hidden* from the end users.
* No re-usable components, APIs, tools.
<br/>(At best: VM abstractions, e.g. libvirt.)
Analogy:
* Shipping containers are not just steel boxes.
* They are steel boxes that are a standard size, with the same hooks and holes.
---
class: extra-details
## Formats and APIs, after Docker
* Standardize the container format, because containers were not portable.
* Make containers easy to use for developers.
* Emphasis on re-usable components, APIs, ecosystem of standard tools.
* Improvement over ad-hoc, in-house, specific tools.
---
class: extra-details
## Shipping, before Docker
* Ship packages: deb, rpm, gem, jar, homebrew...
* Dependency hell.
* "Works on my machine."
* Base deployment often done from scratch (debootstrap...) and unreliable.
---
class: extra-details
## Shipping, after Docker
* Ship container images with all their dependencies.
* Images are bigger, but they are broken down into layers.
* Only ship layers that have changed.
* Save disk, network, memory usage.
---
class: extra-details
## Example
Layers:
* CentOS
* JRE
* Tomcat
* Dependencies
* Application JAR
* Configuration
---
class: extra-details
## Devs vs Ops, before Docker
* Drop a tarball (or a commit hash) with instructions.
@@ -149,71 +348,3 @@ class: extra-details
* Devs can be empowered to make releases themselves
more easily.
---
## Pets vs. Cattle
* In the "pets vs. cattle" metaphor, there are two kinds of servers.
* Pets:
* have distinctive names and unique configurations
* when they have an outage, we do everything we can to fix them
* Cattle:
* have generic names (e.g. with numbers) and generic configuration
* configuration is enforced by configuration management, golden images ...
* when they have an outage, we can replace them immediately with a new server
* What's the connection with Docker and containers?
---
## Local development environments
* When we use local VMs (with e.g. VirtualBox or VMware), our workflow looks like this:
* create VM from base template (Ubuntu, CentOS...)
* install packages, set up environment
* work on project
* when done, shut down VM
* next time we need to work on project, restart VM as we left it
* if we need to tweak the environment, we do it live
* Over time, the VM configuration evolves, diverges.
* We don't have a clean, reliable, deterministic way to provision that environment.
---
## Local development with Docker
* With Docker, the workflow looks like this:
* create container image with our dev environment
* run container with that image
* work on project
* when done, shut down container
* next time we need to work on project, start a new container
* if we need to tweak the environment, we create a new image
* We have a clear definition of our environment, and can share it reliably with others.
* Let's see in the next chapters how to bake a custom image with `figlet`!

View File

@@ -6,6 +6,8 @@ We will see how to:
* Leverage the build cache so that builds can be faster.
* Embed unit testing in the build process.
---
## Reducing the number of layers
@@ -74,8 +76,6 @@ CMD ["python", "app.py"]
---
class: extra-details
## Be careful with `chown`, `chmod`, `mv`
* Layers cannot store efficiently changes in permissions or ownership.
@@ -117,8 +117,6 @@ class: extra-details
---
class: extra-details
## Use `COPY --chown`
* The Dockerfile instruction `COPY` can take a `--chown` parameter.
@@ -142,8 +140,6 @@ class: extra-details
---
class: extra-details
## Set correct permissions locally
* Instead of using `chmod`, set the right file permissions locally.
@@ -152,6 +148,29 @@ class: extra-details
---
## Embedding unit tests in the build process
```dockerfile
FROM <baseimage>
RUN <install dependencies>
COPY <code>
RUN <build code>
RUN <install test dependencies>
COPY <test data sets and fixtures>
RUN <unit tests>
FROM <baseimage>
RUN <install dependencies>
COPY <code>
RUN <build code>
CMD, EXPOSE ...
```
* The build fails as soon as an instruction fails
* If `RUN <unit tests>` fails, the build doesn't produce an image
* If it succeeds, it produces a clean image (without test libraries and data)
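One way to put this into practice (a hedged sketch with made-up file names; note that BuildKit only builds the stages required by the requested target, so we build the test stage explicitly):

```bash
# Hypothetical Python project with pytest-based unit tests
cat > Dockerfile <<'EOF'
FROM python:alpine AS test
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt pytest
COPY . .
RUN pytest

FROM python:alpine
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
EOF
docker build --target test .   # stops here if the unit tests fail
docker build -t myapp .        # final image, without test dependencies
```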
---
# Dockerfile examples
There are a number of tips, tricks, and techniques that we can use in Dockerfiles.
@@ -267,8 +286,6 @@ ENV PIP=9.0.3 \
---
class: extra-details
## Entrypoints and wrappers
It is very common to define a custom entrypoint.
@@ -286,8 +303,6 @@ That entrypoint will generally be a script, performing any combination of:
---
class: extra-details
## A typical entrypoint script
```dockerfile
@@ -342,6 +357,67 @@ RUN ...
---
## Overrides
In theory, development and production images should be the same.
In practice, we often need to enable specific behaviors in development (e.g. debug statements).
One way to reconcile both needs is to use Compose to enable these behaviors.
Let's look at the [trainingwheels](https://github.com/jpetazzo/trainingwheels) demo app for an example.
---
## Production image
This Dockerfile builds an image leveraging gunicorn:
```dockerfile
FROM python
RUN pip install flask
RUN pip install gunicorn
RUN pip install redis
COPY . /src
WORKDIR /src
CMD gunicorn --bind 0.0.0.0:5000 --workers 10 counter:app
EXPOSE 5000
```
(Source: [trainingwheels Dockerfile](https://github.com/jpetazzo/trainingwheels/blob/master/www/Dockerfile))
---
## Development Compose file
This Compose file uses the same image, but with a few overrides for development:
- the Flask development server is used (overriding `CMD`),
- the `DEBUG` environment variable is set,
- a volume is used to provide a faster local development workflow.
.small[
```yaml
services:
www:
build: www
ports:
- 8000:5000
user: nobody
environment:
DEBUG: 1
command: python counter.py
volumes:
- ./www:/src
```
]
(Source: [trainingwheels Compose file](https://github.com/jpetazzo/trainingwheels/blob/master/docker-compose.yml))
---
## How to know which best practices are better?
- The main goal of containers is to make our lives easier.

View File

@@ -1,4 +1,4 @@
# Exercise — multi-stage builds
# Exercise — writing better Dockerfiles
Let's update our Dockerfiles to leverage multi-stage builds!

View File

@@ -1,5 +0,0 @@
# Exercise — BuildKit cache mounts
We want to make our builds faster by leveraging BuildKit cache mounts.
Of course, if we don't make any changes to the code, the build should be instantaneous. Therefore, to benchmark our changes, we will make trivial changes to the code (e.g. change the message in a "print" statement) and measure (e.g. with `time`) how long it takes to rebuild the image.

View File

@@ -147,9 +147,6 @@ Now, try to:
* run `figlet`. Does that work?
???
On macOS: brew list | wc -l
---
class: self-paced
@@ -228,3 +225,73 @@ bash: figlet: command not found
*This puts a strong emphasis on automation and repeatability. Let's see why ...*
---
## Pets vs. Cattle
* In the "pets vs. cattle" metaphor, there are two kinds of servers.
* Pets:
* have distinctive names and unique configurations
* when they have an outage, we do everything we can to fix them
* Cattle:
* have generic names (e.g. with numbers) and generic configuration
* configuration is enforced by configuration management, golden images ...
* when they have an outage, we can replace them immediately with a new server
* What's the connection with Docker and containers?
---
## Local development environments
* When we use local VMs (with e.g. VirtualBox or VMware), our workflow looks like this:
* create VM from base template (Ubuntu, CentOS...)
* install packages, set up environment
* work on project
* when done, shut down VM
* next time we need to work on project, restart VM as we left it
* if we need to tweak the environment, we do it live
* Over time, the VM configuration evolves, diverges.
* We don't have a clean, reliable, deterministic way to provision that environment.
---
## Local development with Docker
* With Docker, the workflow looks like this:
* create container image with our dev environment
* run container with that image
* work on project
* when done, shut down container
* next time we need to work on project, start a new container
* if we need to tweak the environment, we create a new image
* We have a clear definition of our environment, and can share it reliably with others.
* Let's see in the next chapters how to bake a custom image with `figlet`!
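In shell terms, that workflow boils down to something like this (a hedged sketch; image name and mount path are made up):

```bash
docker build -t my-dev-env .                        # bake the environment once
docker run -it --rm -v "$(pwd)":/src my-dev-env     # work on the project inside it
```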
???
:EN:- Running our first container
:FR:- Lancer nos premiers conteneurs

View File

@@ -1,249 +0,0 @@
# Deep Dive Into Images
- Image = files (layers) + metadata (configuration)
- Layers = regular tar archives
(potentially with *whiteouts*)
- Configuration = everything needed to run the container
(e.g. Cmd, Env, WorkingDir...)
---
## Image formats
- Docker image [v1] (no longer used, except in `docker save` and `docker load`)
- Docker image v1.1 (IDs are now hashes instead of random values)
- Docker image [v2] (multi-arch support; content-addressable images)
- [OCI image format][oci] (almost the same, except for media types)
[v1]: https://github.com/moby/docker-image-spec?tab=readme-ov-file
[v2]: https://github.com/distribution/distribution/blob/main/docs/content/spec/manifest-v2-2.md
[oci]: https://github.com/opencontainers/image-spec/blob/main/spec.md
---
## OCI images
- Manifest = JSON document
- Used by container engines to know "what should I download to unpack this image?"
- Contains references to blobs, identified by their sha256 digest + size
- config (single sha256 digest)
- layers (list of sha256 digests)
- Also annotations (key/values)
- It's also possible to have a manifest list, or "fat manifest"
(which lists multiple manifests; this is used for multi-arch support)
---
## Config blob
- Also a JSON document
- `architecture` string (e.g. `amd64`)
- `config` object
Cmd, Entrypoint, Env, ExposedPorts, StopSignal, User, Volumes, WorkingDir
- `history` list
purely informative; shown with e.g. `docker history`
- `rootfs` object
`type` (always `layers`) + list of "diff ids"
---
class: extra-details
## Layers vs layers
- The image configuration contains digests of *uncompressed layers*
- The image manifest contains digests of *compressed layers*
(layer blobs in the registry can be tar, tar+gzip, tar+zstd)
---
## Layer format
- Layer = completely normal tar archive
- When a file is added or modified, it is added to the archive
(note: trivial changes, e.g. permissions, require re-adding the whole file!)
- When a file is deleted, a *whiteout* file is created
e.g. `rm hello.txt` results in a file named `.wh.hello.txt`
- Files starting with `.wh.` are forbidden in containers
- There is a special file, `.wh..wh..opq`, which means "remove all siblings"
(optimization to completely empty a directory)
- See [layer specification](https://github.com/opencontainers/image-spec/blob/main/layer.md) for details
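To see whiteouts for yourself, here is a rough sketch (the exact layout of the `docker save` output varies between Docker versions, hence the generic search):

```bash
# Build an image where a file is created in one layer and deleted in the next
docker build -t whiteout-demo - <<'EOF'
FROM alpine
RUN touch /hello.txt
RUN rm /hello.txt
EOF
docker save -o demo.tar whiteout-demo
mkdir -p demo && tar -xf demo.tar -C demo
# Look for ".wh.hello.txt" inside the layer archives
find demo -type f -exec sh -c 'tar -tf "$1" 2>/dev/null | grep -q "\.wh\." && echo "$1"' _ {} \;
```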
---
class: extra-details
## Origin of layer format
- The initial storage driver for Docker was AUFS
- AUFS is out-of-tree but Debian and Ubuntu included it
(they used it for live CD / live USB boot)
- It meant that Docker could work out of the box on these distros
- Later, Docker added support for other systems
(devicemapper thin provisioning, btrfs, overlay...)
- Today, overlay is the best compromise for most use-cases
---
## Inspecting images
- `skopeo` can copy images between different places
(registries, Docker Engine, local storage as used by podman...)
- Example:
```bash
skopeo copy docker://alpine oci:/tmp/alpine.oci
```
- The image manifest will be in `/tmp/alpine.oci/index.json`
- Blobs (image configuration and layers) will be in `/tmp/alpine.oci/blobs/sha256`
- Note: as of version 1.20, `skopeo` doesn't handle extensions like stargz yet
(copying stargz images won't transfer the special index blobs)
---
## Layer surgery
Here is an example of how to manually edit an image.
https://github.com/jpetazzo/layeremove
It removes a specific layer from an image.
Note: it would be better to use a buildkit cache mount instead.
(This is just an educative example!)
---
## Stargz
- [Stargz] = Seekable Tar Gz, or "stargazer"
- Goal: start a container *before* its image has been fully downloaded
- Particularly useful for huge images that take minutes to download
- Also known as "streamable images" or "lazy loading"
- Alternative: [SOCI]
[stargz]: https://github.com/containerd/stargz-snapshotter
[SOCI]: https://github.com/awslabs/soci-snapshotter
---
## Stargz architecture
- Combination of:
- a backward-compatible extension to the OCI image format
- a containerd *snapshotter*
(=containerd component responsible for managing container and image storage)
- tooling to create, convert, optimize images
- Installation requires:
- running the snapshotter daemon
- configuring containerd
- building new images or converting the existing ones
---
## Stargz principle
- Normal image layer = tar.gz = gzip(tar(file1, file2, ...))
- Can't access fileN without uncompressing everything before it
- Seekable Tar Gz = gzip(tar(file1)) + gzip(tar(file2)) + ... + index
(big files can also be chunked)
- Can access individual files
(and even individual chunks, if needed)
- Downside: lower compression ratio
(less compression context; extra gzip headers)
---
## Stargz format
- The index mentioned above is stored in separate registry blobs
(one index for each layer)
- The digest of the index blobs is stored in annotations in normal OCI images
- Fully compatible with existing registries
- Existing container engines will load images transparently
(without leveraging stargz capabilities)
---
## Stargz limitations
- Tools like `skopeo` will ignore index blobs
(=copying images across registries will discard stargz capabilities)
- Indexes need to be downloaded before container can be started
(=still significant start time when there are many files in images)
- Significant latency when accessing a file lazily
(need to hit the registry, typically with a range header, uncompress file)
- Images can be optimized to pre-load important files

View File

@@ -115,7 +115,46 @@ If an image is read-only, how do we change it?
* A new image is created by stacking the new layer on top of the old image.
* This can be automated by writing a `Dockerfile` and then running `docker build`.
---
## A chicken-and-egg problem
* The only way to create an image is by "freezing" a container.
* The only way to create a container is by instantiating an image.
* Help!
---
## Creating the first images
There is a special empty image called `scratch`.
* It allows us to *build from scratch*.
The `docker import` command loads a tarball into Docker.
* The imported tarball becomes a standalone image.
* That new image has a single layer.
Note: you will probably never have to do this yourself.
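For the curious, a tiny sketch of `docker import` in action (not something you normally need to do):

```bash
# Export a container's filesystem and re-import it as a single-layer image
docker create --name tmp alpine
docker export tmp | docker import - my-flat-alpine
docker rm tmp
docker run --rm my-flat-alpine /bin/sh -c 'echo hello from a single-layer image'
```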
---
## Creating other images
`docker commit`
* Saves all the changes made to a container into a new layer.
* Creates a new image (effectively a copy of the container).
`docker build` **(used 99% of the time)**
* Performs a repeatable build sequence.
* This is the preferred method!
We will explain both methods in a moment.
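As a quick preview before we get there, a hedged sketch of `docker commit`:

```bash
docker run -d --name scratchpad ubuntu sleep 600   # start a throwaway container
docker exec scratchpad touch /hello.txt            # change something inside it
docker commit scratchpad my-modified-ubuntu        # freeze the changes into a new image
docker rm -f scratchpad
```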
---
@@ -123,15 +162,15 @@ If an image is read-only, how do we change it?
There are three namespaces:
* Official images on the Docker Hub
* Official images
e.g. `ubuntu`, `busybox` ...
* User (and organizations) images on the Docker Hub
* User (and organizations) images
e.g. `jpetazzo/clock`
* Images on registries that are NOT the Docker Hub
* Self-hosted images
e.g. `registry.example.com:5000/my-private/image`
@@ -244,6 +283,30 @@ jpetazzo/clock latest 12068b93616f 12 months ago 2.433 MB
---
## Searching for images
We cannot list *all* images on a remote registry, but
we can search for a specific keyword:
```bash
$ docker search marathon
NAME                     DESCRIPTION                  STARS  OFFICIAL  AUTOMATED
mesosphere/marathon      A cluster-wide init and co...  105            [OK]
mesoscloud/marathon      Marathon                        31            [OK]
mesosphere/marathon-lb   Script to update haproxy b...   22            [OK]
tobilg/mongodb-marathon  A Docker image to start a ...    4            [OK]
```
* "Stars" indicate the popularity of the image.
* "Official" images are those in the root namespace.
* "Automated" images are built automatically by the Docker Hub.
<br/>(This means that their build recipe is always available.)
---
## Downloading images
There are two ways to download images.

View File

@@ -314,6 +314,52 @@ class: extra-details
---
## Trash your servers and burn your code
*(This is the title of a
[2013 blog post][immutable-deployments]
by Chad Fowler, where he explains the concept of immutable infrastructure.)*
[immutable-deployments]: https://web.archive.org/web/20160305073617/http://chadfowler.com/blog/2013/06/23/immutable-deployments/
--
* Let's majorly mess up our container.
(Remove files or whatever.)
* Now, how can we fix this?
--
* Our old container (with the blue version of the code) is still running.
* See on which port it is exposed:
```bash
docker ps
```
* Point our browser to it to confirm that it still works fine.
---
## Immutable infrastructure in a nutshell
* Instead of *updating* a server, we deploy a new one.
* This might be challenging with classical servers, but it's trivial with containers.
* In fact, with Docker, the most logical workflow is to build a new image and run it.
* If something goes wrong with the new image, we can always restart the old one.
* We can even keep both versions running side by side.
If this pattern sounds interesting, you might want to read about *blue/green deployment*
and *canary deployments*.
---
## Recap of the development workflow
1. Write a Dockerfile to build an image containing our development environment.
@@ -341,6 +387,35 @@ class: extra-details
class: extra-details
## Debugging inside the container
Docker has a command called `docker exec`.
It allows users to run a new process in a container which is already running.
If you sometimes find yourself wishing you could SSH into a container, you can use `docker exec` instead.
You can get a shell prompt inside an existing container this way, or run an arbitrary process for automation.
---
class: extra-details
## `docker exec` example
```bash
$ # You can run ruby commands in the area the app is running and more!
$ docker exec -it <yourContainerId> bash
root@5ca27cf74c2e:/opt/namer# irb
irb(main):001:0> [0, 1, 2, 3, 4].map {|x| x ** 2}.compact
=> [0, 1, 4, 9, 16]
irb(main):002:0> exit
```
---
class: extra-details
## Stopping the container
Now that we're done let's stop our container.

View File

@@ -1,140 +0,0 @@
class: title
# More Dockerfile Instructions
![construction](images/title-advanced-dockerfiles.jpg)
---
## `Dockerfile` usage summary
* `Dockerfile` instructions are executed in order.
* Each instruction creates a new layer in the image.
* Docker maintains a cache with the layers of previous builds.
* When there are no changes in the instructions and files making a layer,
the builder re-uses the cached layer, without executing the instruction for that layer.
* The `FROM` instruction MUST be the first non-comment instruction.
* Lines starting with `#` are treated as comments.
* Some instructions (like `CMD` or `ENTRYPOINT`) update a piece of metadata.
(As a result, each call to these instructions makes the previous one useless.)
---
## The `EXPOSE` instruction
The `EXPOSE` instruction tells Docker what ports are to be published
in this image.
```dockerfile
EXPOSE 8080
EXPOSE 80 443
EXPOSE 53/tcp 53/udp
```
* All ports are private by default.
* Declaring a port with `EXPOSE` is not enough to make it public.
* The `Dockerfile` doesn't control on which port a service gets exposed.
---
## Exposing ports
* When you `docker run -p <port> ...`, that port becomes public.
(Even if it was not declared with `EXPOSE`.)
* When you `docker run -P ...` (without port number), all ports
declared with `EXPOSE` become public.
A *public port* is reachable from other containers and from outside the host.
A *private port* is not reachable from outside.
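A quick sketch of the difference:

```bash
docker run -d -p 8080:80 nginx    # publish container port 80 on host port 8080
docker run -d -P nginx            # publish every EXPOSEd port on a random host port
docker port <yourContainerID>     # show the resulting port mappings
```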
---
## `VOLUME`
The `VOLUME` instruction tells Docker that a specific directory
should be a *volume*.
```dockerfile
VOLUME /var/lib/mysql
```
Filesystem access in volumes bypasses the copy-on-write layer,
offering native performance to I/O done in those directories.
Volumes can be attached to multiple containers, allowing us to
"port" data from one container to another, e.g. to
upgrade a database to a newer version.
It is possible to start a container in "read-only" mode.
The container filesystem will be made read-only, but volumes
can still have read/write access if necessary.
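For example (a small sketch): writes to the volume succeed even when the container filesystem is read-only, while writes anywhere else fail:

```bash
docker volume create demo-data
docker run --rm --read-only -v demo-data:/data alpine \
  sh -c 'echo ok > /data/file && echo ko > /tmp/file'   # the second write fails: read-only FS
```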
---
## The `WORKDIR` instruction
The `WORKDIR` instruction sets the working directory for subsequent
instructions.
It also affects `CMD` and `ENTRYPOINT`, since it sets the working
directory used when starting the container.
```dockerfile
WORKDIR /src
```
You can specify `WORKDIR` again to change the working directory for
further operations.
---
## The `ENV` instruction
The `ENV` instruction specifies environment variables that should be
set in any container launched from the image.
```dockerfile
ENV WEBAPP_PORT 8080
```
This will result in the following environment variable being set
in any container created from this image:
```bash
WEBAPP_PORT=8080
```
You can also specify environment variables when you use `docker run`.
```bash
$ docker run -e WEBAPP_PORT=8000 -e WEBAPP_HOST=www.example.com ...
```
---
class: extra-details
## The `USER` instruction
The `USER` instruction sets the user name or UID to use when running
the image.
It can be used multiple times to change back to root or to another user.
???
:EN:- Advanced Dockerfile syntax
:FR:- Dockerfile niveau expert

View File

@@ -48,6 +48,161 @@ Therefore, `RUN rm` does not reduce the size of the image or free up disk space.
---
## Removing unnecessary files
Various techniques are available to obtain smaller images:
- collapsing layers,
- adding binaries that are built outside of the Dockerfile,
- squashing the final image,
- multi-stage builds.
Let's review them quickly.
---
## Collapsing layers
You will frequently see Dockerfiles like this:
```dockerfile
FROM ubuntu
RUN apt-get update && apt-get install xxx && ... && apt-get remove xxx && ...
```
Or the (more readable) variant:
```dockerfile
FROM ubuntu
RUN apt-get update \
&& apt-get install xxx \
&& ... \
&& apt-get remove xxx \
&& ...
```
This `RUN` command gives us a single layer.
The files that are added, then removed in the same layer, do not grow the layer size.
---
## Collapsing layers: pros and cons
Pros:
- works on all versions of Docker
- doesn't require extra tools
Cons:
- not very readable
- some unnecessary files might still remain if the cleanup is not thorough
- that layer is expensive (slow to build)
---
## Building binaries outside of the Dockerfile
This results in a Dockerfile looking like this:
```dockerfile
FROM ubuntu
COPY xxx /usr/local/bin
```
Of course, this implies that the file `xxx` exists in the build context.
That file has to exist before you can run `docker build`.
For instance, it can:
- exist in the code repository,
- be created by another tool (script, Makefile...),
- be created by another container image and extracted from the image.
See for instance the [busybox official image](https://github.com/docker-library/busybox/blob/fe634680e32659aaf0ee0594805f74f332619a90/musl/Dockerfile) or this [older busybox image](https://github.com/jpetazzo/docker-busybox).
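For the last option, a hedged sketch (image name and binary path are hypothetical):

```bash
# Extract a pre-built binary from another image, so the Dockerfile can just COPY it
docker create --name extract my-builder-image
docker cp extract:/usr/local/bin/xxx ./xxx
docker rm extract
```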
---
## Building binaries outside: pros and cons
Pros:
- final image can be very small
Cons:
- requires an extra build tool
- we're back in dependency hell and "works on my machine"
Cons, if binary is added to code repository:
- breaks portability across different platforms
- grows repository size a lot if the binary is updated frequently
---
## Squashing the final image
The idea is to transform the final image into a single-layer image.
This can be done in (at least) two ways.
- Activate experimental features and squash the final image:
```bash
docker image build --squash ...
```
- Export/import the final image.
```bash
docker build -t temp-image .
docker run --entrypoint true --name temp-container temp-image
docker export temp-container | docker import - final-image
docker rm temp-container
docker rmi temp-image
```
---
## Squashing the image: pros and cons
Pros:
- single-layer images are smaller and faster to download
- removed files no longer take up storage and network resources
Cons:
- we still need to actively remove unnecessary files
- squash operation can take a lot of time (on big images)
- squash operation does not benefit from cache
<br/>
(even if we change just a tiny file, the whole image needs to be re-squashed)
---
## Multi-stage builds
Multi-stage builds allow us to have multiple *stages*.
Each stage is a separate image, and can copy files from previous stages.
We're going to see how they work in more detail.
---
# Multi-stage builds
* At any point in our `Dockerfile`, we can add a new `FROM` line.
@@ -160,7 +315,7 @@ class: extra-details
(instead of using multiple Dockerfiles, which could go out of sync)
---
--
class: extra-details

(File diff suppressed because it is too large.)

View File

@@ -1,75 +0,0 @@
# Rootless Networking
The "classic" approach for container networking is `veth` + bridge.
Pros:
- good performance
- easy to manage and understand
- flexible (possibility to use multiple, isolated bridges)
Cons:
- requires root access on the host to set up networking
---
## Rootless options
- Locked down helpers
- daemon, scripts started through sudo...
- used by some desktop virtualization platforms
- still requires root access at some point
- Userland networking stacks
- true solution that does not require root privileges
- lower performance
---
## Userland stacks
- [SLiRP](https://en.wikipedia.org/wiki/Slirp)
*the OG project that inspired the other ones!*
- [VPNKit](https://github.com/moby/vpnkit)
*introduced by Docker Desktop to play nice with enterprise VPNs*
- [slirp4netns](https://github.com/rootless-containers/slirp4netns)
*slirp adapted for network namespaces, and therefore, containers; better performance*
- [passt and pasta](https://passt.top/)
*more modern approach; better support for inbound traffic; IPv6...*
---
## Passt/Pasta
- No dependencies
- NAT (like slirp4netns) or no-NAT (for e.g. KubeVirt)
- Can handle inbound traffic dynamically
- No dynamic memory allocation
- Good security posture
- IPv6 support
- Reasonable performance
---
## Demo?

View File

@@ -1,162 +0,0 @@
# Security models
In this section, we want to address a few security-related questions:
- What permissions do we need to run containers or a container engine?
- Can we use containers to escalate permissions?
- Can we break out of a container (move from container to host)?
- Is it safe to run untrusted code in containers?
- What about Kubernetes?
---
## Running Docker, containerd, podman...
- In the early days, running containers required root permissions
(to set up namespaces, cgroups, networking, mount filesystems...)
- Eventually, new kernel features were developed to allow "rootless" operation
(user namespaces and associated tweaks)
- Rootless requires a little bit of additional setup on the system (e.g. subuid)
(although this is increasingly often automated in modern distros)
- Docker runs as root by default; Podman runs rootless by default
---
## Advantages of rootless
- Containers can run without any intervention from root
(no package install, no daemon running as root...)
- Containerized processes run with non-privileged UID
- Container escape doesn't automatically result in full host compromise
- Can isolate workloads by using different UID
---
## Downsides of rootless
- *Relatively* newer (rootless Docker was introduced in 2019)
- many quirks/issues/limitations in the initial implementations
- kernel features and other mechanisms were introduced over time
- they're not always very well documented
- I/O performance (disk, network) is typically lower
(due to using special mechanisms instead of more direct access)
- Rootless and rootful engines must use different image storage
(due to UID mapping)
---
## Why not rootless everywhere?
- Not very useful on clusters
- users shouldn't log into cluster nodes
- questionable security improvement
- lower I/O performance
- Not very useful with Docker Desktop / Podman Desktop
- container workloads are already inside a VM
- could arguably provide a layer of inter-workload isolation
- would require new APIs and concepts
---
## Permission escalation
- Access to the Docker socket = root access to the machine
```bash
docker run --privileged -v /:/hostfs -ti alpine
```
- That's why by default, the Docker socket access is locked down
(only accessible by `root` and group `docker`)
- If user `alice` has access to the Docker socket:
*compromising user `alice` leads to whole host compromise!*
- Doesn't fundamentally change the threat model
(if `alice` gets compromised in the first place, we're in trouble!)
- Enables new threats (persistence, kernel access...)
---
## Avoiding the problem
- Rootless containers
- Container VM (Docker Desktop, Podman Desktop, Orbstack...)
- Unfortunately: no fine-grained access to the Docker API
(no way to e.g. disable privileged containers, volume mounts...)
---
## Escaping containers
- Very easy with some features
(privileged containers, volume mounts, device access)
- Otherwise impossible in theory
(but of course, vulnerabilities do exist!)
- **Be careful with scripts invoking `docker run`, or Compose files!**
---
## Untrusted code
- Should be safe as long as we're not enabling dangerous features
(privileged containers, volume mounts, device access, capabilities...)
- Remember that by default, containers can make network calls
(but see: `--net none` and also `docker network create --internal`)
- And of course, again: vulnerabilities do exist!
---
## What about Kubernetes?
- Ability to run arbitrary pods = dangerous
- But there are multiple safety mechanisms available:
- Pod Security Settings (limit "dangerous" features)
- RBAC (control who can do what)
- webhooks and policy engines for even finer grained control

View File

@@ -15,21 +15,11 @@ class: title
- If you are doing or re-doing this course on your own, you can:
- install [Docker Desktop][docker-desktop] or [Podman Desktop][podman-desktop]
<br/>(available for Linux, Mac, Windows; provides a nice GUI)
- install Docker locally (as explained in the chapter "Installing Docker")
- install [Docker CE][docker-ce] or [Podman][podman]
<br/>(for intermediate/advanced users who prefer the CLI)
- install Docker on e.g. a cloud VM
- try platforms like [Play With Docker][pwd] or [KodeKloud]
<br/>(if you can't/won't install anything locally)
[docker-desktop]: https://docs.docker.com/desktop/
[podman-desktop]: https://podman-desktop.io/downloads
[docker-ce]: https://docs.docker.com/engine/install/
[podman]: https://podman.io/docs/installation#installing-on-linux
[pwd]: https://labs.play-with-docker.com/
[KodeKloud]: https://kodekloud.com/free-labs/docker/
- use https://www.play-with-docker.com/ to instantly get a training environment
---
@@ -49,6 +39,42 @@ individual Docker VM.*
---
## What *is* Docker?
- "Installing Docker" really means "Installing the Docker Engine and CLI".
- The Docker Engine is a daemon (a service running in the background).
- This daemon manages containers, the same way that a hypervisor manages VMs.
- We interact with the Docker Engine by using the Docker CLI.
- The Docker CLI and the Docker Engine communicate through an API.
- There are many other programs and client libraries which use that API.
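You can see that split directly (assuming Docker is installed and `curl` is available):

```bash
docker version    # shows a "Client" section and a "Server: Docker Engine" section
curl --unix-socket /var/run/docker.sock http://localhost/version   # same API, without the CLI
```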
---
## Why don't we run Docker locally?
- We are going to download container images and distribution packages.
- This could put a bit of stress on the local WiFi and slow us down.
- Instead, we use a remote VM that has a good connectivity
- In some rare cases, installing Docker locally is challenging:
- no administrator/root access (computer managed by strict corp IT)
- 32-bit CPU or OS
- old OS version (e.g. CentOS 6, OSX pre-Yosemite, Windows 7)
- It's better to spend time learning containers than fiddling with the installer!
---
## Connecting to your Virtual Machine
You need an SSH client.
@@ -67,6 +93,23 @@ $ ssh <login>@<ip-address>
* MobaXterm (https://mobaxterm.mobatek.net/)
---
class: in-person
## `tailhist`
The shell history of the instructor is available online in real time.
Note the IP address of the instructor's virtual machine (A.B.C.D).
Open http://A.B.C.D:1088 in your browser and you should see the history.
The history is updated in real time (using a WebSocket connection).
It should be green when the WebSocket is connected.
If it turns red, reloading the page should fix it.
---
@@ -101,47 +144,10 @@ Server:
If this doesn't work, raise your hand so that an instructor can assist you!
---
???
## Installing Docker
- "Installing Docker" really means "Installing the **Docker Engine** and **CLI**".
- The Docker Engine is a **daemon** (a service running in the background); it manages containers, the same way that a hypervisor manages VMs.
- We interact with the Docker Engine by using the Docker CLI.
- The Docker CLI and the Docker Engine communicate through an API.
- There are many other programs and client libraries which use that API.
---
class: pic
![Docker Architecture](images/docker-engine-architecture.svg)
---
## Can we run Docker locally?
- If you already have Docker (or Podman) installed, you can use it!
- The VMs can be convenient if:
- you can't/won't install Docker or Podman on your machine,
- your local internet connection is slow.
- We're going to download many container images and distribution packages.
- If the class takes place in a venue with slow WiFi, this can slow us down.
- The remote VMs have good connectivity and downloads will be fast there.
(Initially, we provided VMs to make sure that nobody would waste time
with installers, or because they didn't have the right permissions
on their machine, etc.)
:EN:Container concepts
:FR:Premier contact avec les conteneurs
:EN:- What's a container engine?
:FR:- Qu'est-ce qu'un *container engine* ?

View File

@@ -1,62 +0,0 @@
title: |
Docker Fundamentals
& Optimizations
<div style="display:flex; justify-content:center; align-items:center; gap:70px;">
<img src="https://images.seeklogo.com/logo-png/44/1/ecosia-logo-png_seeklogo-440094.png" width="250">
<img src="https://gist.githubusercontent.com/jpetazzo/dcecd53a111f1fbe65c29ee15b9143e4/raw/fe18ea3aa66d1dc16964d4223bf6cf8f6a51d40a/empowered.png" width="200">
<img src="https://gist.githubusercontent.com/jpetazzo/dcecd53a111f1fbe65c29ee15b9143e4/raw/fe18ea3aa66d1dc16964d4223bf6cf8f6a51d40a/pyladies.png" width="300">
</div>
#chat: "[Mattermost](https://training.enix.io/mattermost)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2025-11-docker.container.training/
slidenumberprefix: "workshop.container.training &mdash; login = firstname@ &mdash; password = where we are :) &mdash; "
exclude:
- self-paced
content:
- shared/title.md
- shared/contact.md
- logistics.md
- containers/intro.md
- shared/about-slides.md
#- shared/chat-room-im.md
#- shared/chat-room-zoom-meeting.md
#- shared/chat-room-zoom-webinar.md
- shared/toc.md
- # MORNING
#- containers/Docker_History.md
- containers/Training_Environment.md
#- containers/Installing_Docker.md
- containers/Docker_Overview.md
- containers/First_Containers.md
- containers/Background_Containers.md
- containers/Initial_Images.md
#- containers/Building_Images_Interactively.md
- containers/Building_Images_With_Dockerfiles.md
- containers/Cmd_And_Entrypoint.md
- containers/Copying_Files_During_Build.md
- containers/Dockerfile_Tips.md
- containers/More_Dockerfile_Instructions.md
- containers/Multi_Stage_Builds.md
- containers/Exercise_Dockerfile_Basic.md
- containers/Exercise_Dockerfile_Multistage.md
- # AFTERNOON
- containers/Container_Networking_Basics.md
- containers/Local_Development_Workflow.md
#- containers/Container_Network_Model.md
- containers/Compose_For_Dev_Stacks.md
- containers/Exercise_Composefile.md
#- containers/Start_And_Attach.md
#- containers/Naming_And_Inspecting.md
#- containers/Labels.md
#- containers/Getting_Inside.md
#- containers/Publishing_To_Docker_Hub.md
#- containers/Buildkit.md
- shared/thankyou.md

View File

@@ -1,159 +0,0 @@
# Exercise — Build a container from scratch
Our goal will be to execute a container running a simple web server.
(Example: NGINX, or https://github.com/jpetazzo/color.)
We want the web server to be isolated:
- it shouldn't be able to access the outside world,
- but we should be able to connect to it from our machine.
Make sure to automate / script things as much as possible!
---
## Steps
1. Prepare the filesystem
2. Run it with chroot
3. Isolation with namespaces
4. Network configuration
5. Cgroups
6. Non-root
---
## Prepare the filesystem
- Obtain a root filesystem with one of the following methods:
- download an Alpine mini root fs
- export an Alpine or NGINX container image with Docker
- download and convert a container image with Skopeo
- make it from scratch with busybox + a static [jpetazzo/color](https://github.com/jpetazzo/color)
- ...anything you want! (Nix, anyone?)
- Enter the root filesystem with `chroot`
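For instance, a minimal sketch using the Alpine mini root filesystem (the URL and version below are just an example; check the Alpine downloads page for the current one):

```bash
# Download and unpack an Alpine mini root filesystem
mkdir -p /tmp/rootfs
curl -fsSL https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-minirootfs-3.20.3-x86_64.tar.gz \
  | tar -xz -C /tmp/rootfs

# Enter it
chroot /tmp/rootfs /bin/sh
```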
---
## Help, network does not work!
- Check that you have external connectivity from the chroot:
```bash
ping 1.1.1.1
```
(that *should* work; if it doesn't, we have a serious problem!)
- Check that DNS resolution works:
```bash
ping enix.io
```
- If you're having a DNS resolution error, configure DNS in the container:
```bash
echo nameserver 1.1.1.1 > /etc/resolv.conf
```
---
## Running a web server
Here are a few possibilities...
- Install the NGINX package and run it with `nginx`
(note: by default it will start in the background)
- Run NGINX in the foreground with `nginx -g "daemon off;"`
- Install the Caddy package and run `caddy file-server -ab`
(it will remain in the foreground and show logs; **RECOMMENDED**)
- Download and/or build https://github.com/jpetazzo/color
(if you're familiar with the Go ecosystem!)
---
## Run with chroot
- Start the web server from within the chroot
- Confirm that you can connect to it from outside
- Write a script to start our "proto-container"
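A first, minimal version of that script could be (a sketch, assuming the rootfs from the previous step lives in `/tmp/rootfs` and Caddy is installed inside it):

```bash
#!/bin/sh
# start-proto-container.sh — no isolation yet, just chroot + the web server
ROOTFS=/tmp/rootfs
exec chroot "$ROOTFS" caddy file-server -ab
```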
---
## Isolation with namespaces
- Now, enter the root filesystem with `unshare`
(enable all the namespaces you want; maybe not `user` yet, though!)
- Start the web server
(you might need to configure at least the loopback network interface!)
- Confirm that we *cannot* connect from outside
- Update our start script to use `unshare`
- Automate network configuration
(pay attention to the fact that network tools *may not* exist in the container)
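A possible invocation (a sketch, reusing the `/tmp/rootfs` path assumed earlier; add or remove namespaces as you see fit):

```bash
# Enter the rootfs in fresh mount, UTS, IPC, network, and PID namespaces
unshare --mount --uts --ipc --net --pid --fork chroot /tmp/rootfs /bin/sh

# Inside the "container": bring up loopback, then start the web server
ip link set lo up    # busybox usually provides `ip`; otherwise configure from the host with nsenter
caddy file-server -ab
```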
---
## Network configuration
- While our "container" is running, create a `veth` pair
- Move one `veth` to the container
- Assign addresses to both `veth`
- Confirm that we can connect to the web server from outside
(using the address assigned to the container's `veth`)
- Update our start script to automate the setup of the `veth` pair
- Bonus points: update the script so that it can start *multiple* containers
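One way to wire things up (a sketch: interface names, the address range, and the way we grab the container's PID are all arbitrary choices):

```bash
# Run on the host while the "container" (the unshared shell) is running.
CTR_PID=$(pgrep -f 'unshare.*chroot' | head -n1)   # example way to find its PID

ip link add veth-host type veth peer name veth-ctr
ip link set veth-ctr netns "$CTR_PID"              # move one end into the container
ip addr add 10.99.99.1/24 dev veth-host
ip link set veth-host up

# Configure the container end from the host (works even without `ip` in the container)
nsenter -t "$CTR_PID" -n ip addr add 10.99.99.2/24 dev veth-ctr
nsenter -t "$CTR_PID" -n ip link set veth-ctr up
nsenter -t "$CTR_PID" -n ip link set lo up

# Then, from the host:
curl http://10.99.99.2
```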
---
## Cgroups
- Create a cgroup for our container
- Move the container to the cgroup
- Set a very low CPU limit and confirm that it slows down the server
(but doesn't affect the rest of the system)
- Update the script to automate this
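For a cgroup v2 system (the default on recent distributions), the manual steps could look like this (a sketch; the cgroup name and the CPU quota are arbitrary):

```bash
CTR_PID=12345   # PID of the container's main process (example value)

# Create a cgroup and give it ~5% of one CPU (5ms every 100ms)
mkdir -p /sys/fs/cgroup/protocontainer
echo "5000 100000" > /sys/fs/cgroup/protocontainer/cpu.max

# Move the container's main process into it (processes forked later inherit the cgroup)
echo "$CTR_PID" > /sys/fs/cgroup/protocontainer/cgroup.procs
```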
---
## Non-root
- Switch to a non-privileged user when starting the container
- Adjust the web server configuration so that it can still start
(non-privileged users cannot bind to ports below 1024)
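One way to do the switch (a sketch; the UID/GID and port are arbitrary, and `--userspec` is the GNU coreutils `chroot` option):

```bash
# Run the server as an unprivileged user, on a port above 1024
chroot --userspec=1000:1000 /tmp/rootfs caddy file-server -ab --listen :8080
```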

View File

@@ -1,32 +0,0 @@
# Exercise — enable auth
- We want to enable authentication and authorization
- Checklist:
- non-privileged user can deploy in their namespace
<br/>(and nowhere else)
- each controller uses its own key, certificate, and identity
- each node uses its own key, certificate, and identity
- Service Accounts work properly
- See next slide for help / hints!
---
## Checklist
- Generate keys, certs, and kubeconfig for everything that needs them
(cluster admin, cluster user, controller manager, scheduler, kubelet)
- Reconfigure and restart each component to use its new identity
- Turn on `RBAC` and `Node` authorizers on the API server
- Check that everything works properly
(e.g. that you can create and scale a Deployment using the "cluster user" identity)
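For example, creating the "cluster user" identity could look roughly like this (a sketch: file names, the CN/O values, and the location of the cluster CA are all assumptions):

```bash
# Key + certificate signed by the cluster CA
openssl genrsa -out user.key 2048
openssl req -new -key user.key -subj "/CN=cluster-user/O=devs" -out user.csr
openssl x509 -req -in user.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out user.crt -days 365

# kubeconfig wrapping that identity
kubectl config set-cluster training --server=https://<api-server>:6443 \
  --certificate-authority=ca.crt --embed-certs=true --kubeconfig=user.kubeconfig
kubectl config set-credentials cluster-user --client-certificate=user.crt \
  --client-key=user.key --embed-certs=true --kubeconfig=user.kubeconfig
kubectl config set-context default --cluster=training --user=cluster-user \
  --kubeconfig=user.kubeconfig
kubectl config use-context default --kubeconfig=user.kubeconfig
```

Repeat the same idea (with different CN/O values) for the controller manager, scheduler, and each kubelet.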

View File

@@ -1,51 +0,0 @@
# Exercise — networking
- We want to install extra networking components:
- a CNI configuration
- kube-proxy
- CoreDNS
- After doing that, we should be able to deploy a "complex" app
(with multiple containers communicating together + service discovery)
---
## CNI
- Easy option: Weave
https://github.com/weaveworks/weave/releases
- Better option: Cilium
https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-the-cilium-cli
or https://docs.cilium.io/en/stable/installation/k8s-install-helm/#installation-using-helm
---
## kube-proxy
- Option 1: author a DaemonSet
- Option 2: leverage the CNI (some CNIs like Cilium can replace kube-proxy)
---
## CoreDNS
- Suggested method: Helm chart
(available on https://github.com/coredns/helm)
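That could look like this (a sketch; the chart values will likely need to match your cluster's service CIDR and DNS address):

```bash
helm repo add coredns https://coredns.github.io/helm
helm repo update
# clusterIP should match the DNS address configured on the kubelets (example value)
helm install coredns coredns/coredns --namespace kube-system \
  --set service.clusterIP=10.96.0.10
```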
---
## Testing
- Try to deploy DockerCoins and confirm that it works
(for instance with [this YAML file](https://raw.githubusercontent.com/jpetazzo/container.training/refs/heads/main/k8s/dockercoins.yaml))

View File

@@ -1,22 +0,0 @@
# Exercise — static pods
- We want to run the control plane in static pods
(etcd, API server, controller manager, scheduler)
- For Kubernetes components, we can use [these images](https://kubernetes.io/releases/download/#container-images)
- For etcd, we can use [this image](https://quay.io/repository/coreos/etcd?tab=tags)
- If we're using keys, certificates... We can use [hostPath volumes](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath)
---
## Testing
After authoring our static pod manifests and placing them in the right directory,
we should be able to start our cluster simply by starting kubelet.
(Assuming that the container engine is already running.)
For bonus points: write and enable a systemd unit for kubelet!
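A very rough starting point for that unit (a sketch: the binary path, kubeconfig path, and flags depend on how kubelet was set up; the static pod directory can also be set through `staticPodPath` in the kubelet config file):

```bash
cat > /etc/systemd/system/kubelet.service <<'EOF'
[Unit]
Description=kubelet
After=network-online.target

[Service]
ExecStart=/usr/local/bin/kubelet \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --pod-manifest-path=/etc/kubernetes/manifests
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now kubelet
```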

View File

@@ -26,7 +26,7 @@
- it should initially show a few milliseconds latency
- that will increase when we scale up the number of `worker` Pods
- that will increase when we scale up
- it will also let us detect when the service goes "boom"

View File

@@ -1,35 +0,0 @@
# Exercise — Images from scratch
There are two parts in this exercise:
1. Obtaining and unpacking an image from scratch
2. Adding overlay mounts to the "container from scratch" lab
---
## Pulling from scratch, easy mode
- Download manifest and layers with `skopeo`
- Parse manifest and configuration with e.g. `jq`
- Uncompress the layers in a directory
- Check that the result works (using `chroot`)
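A possible sequence (a sketch; the image and target directory are arbitrary):

```bash
# Download manifest + layers into a local directory
skopeo copy docker://docker.io/library/alpine:latest dir:./alpine

# With the dir: transport, each layer is a file named after its digest
mkdir -p rootfs
for digest in $(jq -r '.layers[].digest' ./alpine/manifest.json); do
  tar -xzf "./alpine/${digest#sha256:}" -C rootfs
done

# Sanity check
chroot rootfs /bin/sh -c "cat /etc/os-release"
```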
---
## Pulling from scratch, medium mode
- Don't use `skopeo`
- Hints: if pulling from the Docker Hub, you'll need a token
(there are examples in Docker's documentation)
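The token dance against the Docker Hub looks roughly like this (a sketch; these are the public Hub endpoints, and `jq` is only used for convenience):

```bash
IMAGE=library/alpine
TAG=latest

# 1. Get an anonymous pull token for that repository
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:${IMAGE}:pull" \
        | jq -r .token)

# 2. Fetch the manifest with that token
curl -s -H "Authorization: Bearer ${TOKEN}" \
     -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
     "https://registry-1.docker.io/v2/${IMAGE}/manifests/${TAG}" | jq .

# 3. Layers can then be fetched from /v2/${IMAGE}/blobs/<digest> with the same token
```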
---
## Pulling from scratch, hard mode
- Handle whiteouts!

View File

@@ -26,8 +26,8 @@ When a Service gets created...
- We want to use a Kyverno `generate` ClusterPolicy
- For step 1, check [Generate Resources](https://kyverno.io/docs/policy-types/cluster-policy/generate/) documentation
- For step 1, check [Generate Resources](https://kyverno.io/docs/writing-policies/generate/) documentation
- For step 2, check [Preconditions](https://kyverno.io/docs/policy-types/cluster-policy/preconditions/) documentation
- For step 2, check [Preconditions](https://kyverno.io/docs/writing-policies/preconditions/) documentation
- For step 3, check [External Data Sources](https://kyverno.io/docs/policy-types/cluster-policy/external-data-sources/) documentation
- For step 3, check [External Data Sources](https://kyverno.io/docs/writing-policies/external-data-sources/) documentation

View File

@@ -1,126 +0,0 @@
## Flux install
We'll install `Flux`.
And replay the whole scenario a second time.
Let's face it: we don't have that much time. 😅
Since all our installation and configuration is `GitOps`-based, we might just leverage copy-paste and configuration as code…
Maybe.
Let's copy the 📂 `./clusters/CLOUDY` folder and rename it 📂 `./clusters/METAL`.
---
### Modifying Flux config 📄 files
- In 📄 file `./clusters/METAL/flux-system/gotk-sync.yaml`
<br/>change the `Kustomization` value `spec.path: ./clusters/METAL`
- ⚠️ We'll have to adapt the `Flux` _CLI_ command line
- And that's pretty much it!
- We'll see if anything goes wrong on that new cluster
---
### Connecting to our dedicated `GitHub` repo to host the Flux config
.lab[
- let's replace the `GITHUB_TOKEN` and `GITHUB_REPO` values
- don't forget to change the path to `clusters/METAL`
```bash
k8s@shpod:~$ export GITHUB_TOKEN="my-token" && \
export GITHUB_USER="container-training-fleet" && \
export GITHUB_REPO="fleet-config-using-flux-XXXXX"
k8s@shpod:~$ flux bootstrap github \
--owner=${GITHUB_USER} \
--repository=${GITHUB_REPO} \
--team=OPS \
--team=ROCKY --team=MOVY \
--path=clusters/METAL
```
]
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Flux deployed our complete stack
Everything seems to be here but…
- one database is in `Pending` state
- our `ingresses` don't work well
```bash
k8s@shpod ~$ curl --header 'Host: rocky.test.enixdomain.com' http://${myIngressControllerSvcIP}
curl: (52) Empty reply from server
```
---
### Fixing the Ingress
The current `ingress-nginx` configuration relies on Scaleway-specific annotations to bind an _IaaS_ load balancer to the `ingress-controller`.
We don't have anything like that here. 😕
- We could bind our `ingress-controller` to a `NodePort`.
The `ingress-nginx` install manifests provide that option here:
<br/>https://github.com/kubernetes/ingress-nginx/tree/release-1.14/deploy/static/provider/baremetal
- In the 📄 file `./clusters/METAL/ingress-nginx/sync.yaml`,
<br/>change the `Kustomization` value `spec.path: ./deploy/static/provider/baremetal`
---
class: pic
![Running Mario](images/running-mario.gif)
---
### Troubleshooting the database
One of our `db-0` pods is in the `Pending` state.
```bash
k8s@shpod ~$ k get pods db-0 -n *-test -oyaml
()
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2025-06-11T11:15:42Z"
message: '0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: Burstable
```
---
### Troubleshooting the PersistentVolumeClaims
```bash
k8s@shpod ~$ k get pvc postgresql-data-db-0 -n *-test -o yaml
()
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 9s (x182 over 45m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
```
No `storage class` is available on this cluster.
We didn't have this problem on our managed cluster, since a default storage class was configured there and automatically associated with our `PersistentVolumeClaim`.
Why is there no problem with the other database?
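To confirm the diagnosis (a quick sketch; replace `<tenant>` with the actual tenant namespace prefix):

```bash
# Is there any StorageClass at all on this cluster?
k8s@shpod ~$ k get storageclass

# The PVC stays Pending: it has no storageClassName, and there is no default class
k8s@shpod ~$ k describe pvc postgresql-data-db-0 -n <tenant>-test
```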
