Compare commits

1 Commits

Author SHA1 Message Date
Jérôme Petazzoni
549c8f5eaf Docker content for derivco 2021-11-11 09:38:21 +01:00
222 changed files with 2920 additions and 8051 deletions

View File

@@ -1,67 +1,14 @@
# (1) Setting up a registry, and telling Tilt to use it.
# Tilt needs a registry to store images.
# The following manifest defines a Deployment to run a basic Docker registry,
# and a NodePort Service to access it. Using a NodePort means that we don't
# need to obtain a TLS certificate, because we will be accessing the registry
# through localhost.
k8s_yaml('../k8s/tilt-registry.yaml')
# Tell Tilt to use the registry that we just deployed instead of whatever
# is defined in our Kubernetes resources. Tilt will patch image names to
# use our registry.
default_registry('localhost:30555')
# Create a port forward so that we can access the registry from our local
# environment, too. Note that if you run Tilt directly from a Kubernetes node
# (which is not typical, but might happen in some lab/training environments)
# the following might cause an error because port 30555 is already taken.
k8s_resource(workload='tilt-registry', port_forwards='30555:5000')
# (2) Telling Tilt how to build and run our app.
# The following two lines will use the kubectl-build plugin
# to leverage buildkit and build the images in our Kubernetes
# cluster. This is not enabled by default, because it requires
# the plugin to be installed.
# See https://github.com/vmware-tanzu/buildkit-cli-for-kubectl
# for more information about this plugin.
#load('ext://kubectl_build', 'kubectl_build')
#docker_build = kubectl_build
# Our Kubernetes manifests use images 'dockercoins/...' so we tell Tilt
# how each of these images should be built. The first argument is the name
# of the image, the second argument is the directory containing the build
# context (i.e. the Dockerfile to build the image).
docker_build('dockercoins/hasher', 'hasher')
docker_build('dockercoins/rng', 'rng')
docker_build('dockercoins/webui', 'webui')
docker_build('dockercoins/worker', 'worker')
# The following manifest defines five Deployments and four Services for
# our application.
k8s_yaml('../k8s/dockercoins.yaml')
# (3) Finishing touches.
# Uncomment the following line to let tilt run with the default kubeadm cluster-admin context.
#allow_k8s_contexts('kubernetes-admin@kubernetes')
# The following line lets Tilt run with the default kubeadm cluster-admin context.
allow_k8s_contexts('kubernetes-admin@kubernetes')
# This will run an ngrok tunnel to expose Tilt to the outside world.
# This is intended to be used when Tilt runs on a remote machine.
local_resource(name='ngrok:tunnel', serve_cmd='ngrok http 10350')
# This will wait until the ngrok tunnel is up, and show its URL to the user.
# We send the output to /dev/tty so that it doesn't get intercepted by
# Tilt, and gets displayed to the user's terminal instead.
# Note: this assumes that the ngrok instance will be running on port 4040.
# If you have other ngrok instances running on the machine, this might not work.
local_resource(name='ngrok:showurl', cmd='''
while sleep 1; do
TUNNELS=$(curl -fsSL http://localhost:4040/api/tunnels | jq -r .tunnels[].public_url)
[ "$TUNNELS" ] && break
done
printf "\nYou should be able to connect to the Tilt UI with the following URL(s): %s\n" "$TUNNELS" >/dev/tty
'''
)
# While we're here: if you're controlling a remote cluster, uncomment the line below.
# It will create a port forward so that you can access the remote registry.
#k8s_resource(workload='registry', port_forwards='30555:5000')
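
Note on the `ngrok:showurl` resource above: it polls until the tunnel list is non-empty, then prints it. A minimal sketch of that polling pattern follows. The helper name `wait_for_output` is hypothetical and not part of the Tiltfile, and `echo` stands in for the `curl | jq` pipeline so that the sketch runs without ngrok. Unlike the original loop, this version also requires the command to exit successfully.

```bash
#!/bin/sh
# Hypothetical helper (not part of the Tiltfile): re-run a command every
# second until it succeeds AND prints something, then print that output.
wait_for_output() {
  while :; do
    OUTPUT=$("$@" 2>/dev/null) && [ -n "$OUTPUT" ] && break
    sleep 1
  done
  printf '%s\n' "$OUTPUT"
}

# Stand-in for the real "curl ... | jq ..." pipeline; it succeeds right away.
wait_for_output echo "https://example.invalid"
```

In the Tiltfile, the final message goes to `/dev/tty` so that Tilt does not capture it; the sketch simply writes to standard output.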

View File

@@ -1,6 +1,6 @@
FROM node:4-slim
RUN npm install express
RUN npm install redis@3
RUN npm install redis
COPY files/ /files/
COPY webui.js /
CMD ["node", "webui.js"]

View File

@@ -1,16 +0,0 @@
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
configuration:
apiVersion: pod-security.admission.config.k8s.io/v1alpha1
kind: PodSecurityConfiguration
defaults:
enforce: baseline
audit: baseline
warn: baseline
exemptions:
usernames:
- cluster-admin
namespaces:
- kube-system

View File

@@ -3,12 +3,6 @@
# - no actual persistence
# - scaling down to 1 will break the cluster
# - pods may be colocated
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: consul
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
@@ -34,6 +28,11 @@ subjects:
name: consul
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: consul
---
apiVersion: v1
kind: Service
metadata:
name: consul
@@ -62,7 +61,7 @@ spec:
serviceAccountName: consul
containers:
- name: consul
image: "consul:1.11"
image: "consul:1.8"
env:
- name: NAMESPACE
valueFrom:

View File

@@ -2,12 +2,6 @@
# There is still no actual persistence, but:
# - podAntiaffinity prevents pod colocation
# - cluster works when scaling down to 1 (thanks to lifecycle hook)
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: consul
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
@@ -33,6 +27,11 @@ subjects:
name: consul
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: consul
---
apiVersion: v1
kind: Service
metadata:
name: consul
@@ -69,7 +68,7 @@ spec:
terminationGracePeriodSeconds: 10
containers:
- name: consul
image: "consul:1.11"
image: "consul:1.8"
env:
- name: NAMESPACE
valueFrom:

View File

@@ -1,11 +1,5 @@
# Even better Consul cluster.
# That one uses a volumeClaimTemplate to achieve true persistence.
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: consul
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
@@ -31,6 +25,11 @@ subjects:
name: consul
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: consul
---
apiVersion: v1
kind: Service
metadata:
name: consul
@@ -76,7 +75,7 @@ spec:
terminationGracePeriodSeconds: 10
containers:
- name: consul
image: "consul:1.11"
image: "consul:1.8"
volumeMounts:
- name: data
mountPath: /consul/data

View File

@@ -1,16 +1,18 @@
global
daemon
maxconn 256
defaults
mode tcp
timeout connect 5s
timeout client 50s
timeout server 50s
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
listen very-basic-load-balancer
frontend the-frontend
bind *:80
server blue color.blue.svc:80
server green color.green.svc:80
default_backend the-backend
backend the-backend
server google.com-80 google.com:80 maxconn 32 check
server ibm.fr-80 ibm.fr:80 maxconn 32 check
# Note: the services above must exist,
# otherwise HAProxy won't start.

View File

@@ -1,28 +0,0 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: ingress-domain-name
spec:
rules:
- name: create-ingress
match:
resources:
kinds:
- Service
generate:
kind: Ingress
name: "{{request.object.metadata.name}}"
namespace: "{{request.object.metadata.namespace}}"
data:
spec:
rules:
- host: "{{request.object.metadata.name}}.{{request.object.metadata.namespace}}.A.B.C.D.nip.io"
http:
paths:
- backend:
service:
name: "{{request.object.metadata.name}}"
port:
number: 80
path: /
pathType: Prefix
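
Note on the `host` field in the policy above: the generated Ingress host is the Service name, then the namespace, then a nip.io suffix in which `A.B.C.D` is a placeholder for an IPv4 address (nip.io resolves such names to the embedded address). A small sketch of how the name is assembled follows. The function name `ingress_host` and the sample address are made up for illustration and do not come from the policy.

```bash
#!/bin/sh
# Hypothetical helper: build the host name the policy would generate.
# $1 = Service name, $2 = namespace, $3 = IPv4 address replacing A.B.C.D
ingress_host() {
  printf '%s.%s.%s.nip.io\n' "$1" "$2" "$3"
}

# Example with a documentation-range address (assumption, not a real cluster).
ingress_host color blue 203.0.113.10
```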

View File

@@ -1,32 +0,0 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: ingress-domain-name
spec:
rules:
- name: create-ingress
match:
resources:
kinds:
- Service
preconditions:
- key: "{{request.object.spec.ports[0].name}}"
operator: Equals
value: http
generate:
kind: Ingress
name: "{{request.object.metadata.name}}"
namespace: "{{request.object.metadata.namespace}}"
data:
spec:
rules:
- host: "{{request.object.metadata.name}}.{{request.object.metadata.namespace}}.A.B.C.D.nip.io"
http:
paths:
- backend:
service:
name: "{{request.object.metadata.name}}"
port:
name: http
path: /
pathType: Prefix

View File

@@ -1,37 +0,0 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: ingress-domain-name
spec:
rules:
- name: create-ingress
context:
- name: configmap
configMap:
name: ingress-domain-name
namespace: "{{request.object.metadata.namespace}}"
match:
resources:
kinds:
- Service
preconditions:
- key: "{{request.object.spec.ports[0].name}}"
operator: Equals
value: http
generate:
kind: Ingress
name: "{{request.object.metadata.name}}"
namespace: "{{request.object.metadata.namespace}}"
data:
spec:
rules:
- host: "{{request.object.metadata.name}}.{{request.object.metadata.namespace}}.{{configmap.data.domain}}"
http:
paths:
- backend:
service:
name: "{{request.object.metadata.name}}"
port:
name: http
path: /
pathType: Prefix

View File

@@ -1,20 +0,0 @@
kind: Pod
apiVersion: v1
metadata:
generateName: mounter-
labels:
container.training/mounter: ""
spec:
volumes:
- name: pvc
persistentVolumeClaim:
claimName: my-pvc-XYZ45
containers:
- name: mounter
image: alpine
stdin: true
tty: true
volumeMounts:
- name: pvc
mountPath: /pvc
workingDir: /pvc

View File

@@ -3,7 +3,8 @@ apiVersion: networking.k8s.io/v1
metadata:
name: deny-from-other-namespaces
spec:
podSelector: {}
podSelector:
matchLabels:
ingress:
- from:
- podSelector: {}

View File

@@ -1,20 +0,0 @@
kind: PersistentVolume
apiVersion: v1
metadata:
generateName: my-pv-
labels:
container.training/pv: ""
spec:
accessModes:
- ReadWriteOnce
- ReadWriteMany
capacity:
storage: 1G
hostPath:
path: /tmp/my-pv
#storageClassName: my-sc
#claimRef:
# kind: PersistentVolumeClaim
# apiVersion: v1
# namespace: default
# name: my-pvc-XYZ45

View File

@@ -1,13 +0,0 @@
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
generateName: my-pvc-
labels:
container.training/pvc: ""
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1G
#storageClassName: my-sc

View File

@@ -1,147 +0,0 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: blue
labels:
app: rainbow
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rainbow
color: blue
name: color
namespace: blue
spec:
selector:
matchLabels:
app: rainbow
color: blue
template:
metadata:
labels:
app: rainbow
color: blue
spec:
containers:
- image: jpetazzo/color
name: color
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rainbow
color: blue
name: color
namespace: blue
spec:
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
selector:
app: rainbow
color: blue
type: ClusterIP
---
apiVersion: v1
kind: Namespace
metadata:
name: green
labels:
app: rainbow
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rainbow
color: green
name: color
namespace: green
spec:
selector:
matchLabels:
app: rainbow
color: green
template:
metadata:
labels:
app: rainbow
color: green
spec:
containers:
- image: jpetazzo/color
name: color
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rainbow
color: green
name: color
namespace: green
spec:
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
selector:
app: rainbow
color: green
type: ClusterIP
---
apiVersion: v1
kind: Namespace
metadata:
name: red
labels:
app: rainbow
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: rainbow
color: red
name: color
namespace: red
spec:
selector:
matchLabels:
app: rainbow
color: red
template:
metadata:
labels:
app: rainbow
color: red
spec:
containers:
- image: jpetazzo/color
name: color
---
apiVersion: v1
kind: Service
metadata:
labels:
app: rainbow
color: red
name: color
namespace: red
spec:
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
selector:
app: rainbow
color: red
type: ClusterIP

View File

@@ -1,107 +1,17 @@
⚠️ This is work in progress. The UX needs to be improved,
and the docs could be better.
This directory contains a Terraform configuration to deploy
a bunch of Kubernetes clusters on various cloud providers,
using their respective managed Kubernetes products.
a bunch of Kubernetes clusters on various cloud providers, using their respective managed Kubernetes products.
## With shell wrapper
This is the recommended use. It makes it easy to start N clusters
on any provider. It will create a directory with a name like
`tag-YYYY-MM-DD-HH-MM-SS-SEED-PROVIDER`, copy the Terraform configuration
to that directory, then create the clusters using that configuration.
1. One-time setup: configure provider authentication for the provider(s) that you wish to use.
- Digital Ocean:
```bash
doctl auth init
```
- Google Cloud Platform: you will need to create a project named `prepare-tf`
and enable the relevant APIs for this project (sorry, if you're new to GCP,
this sounds vague; but if you're familiar with it you know what to do; if you
want to change the project name you can edit the Terraform configuration)
- Linode:
```bash
linode-cli configure
```
- Oracle Cloud: FIXME
(set up `oci` through the `oci-cli` Python package)
- Scaleway: run `scw init`
2. Optional: set number of clusters, cluster size, and region.
By default, 1 cluster will be configured, with 2 nodes, and auto-scaling up to 5 nodes.
If you want, you can override these parameters, with the following variables.
```bash
export TF_VAR_how_many_clusters=5
export TF_VAR_min_nodes_per_pool=2
export TF_VAR_max_nodes_per_pool=4
export TF_VAR_location=xxx
```
The `location` variable is optional. Each provider should have a default value.
The value of the `location` variable is provider-specific. Examples:
| Provider | Example value | How to see possible values
|---------------|-------------------|---------------------------
| Digital Ocean | `ams3` | `doctl compute region list`
| Google Cloud | `europe-north1-a` | `gcloud compute zones list`
| Linode | `eu-central` | `linode-cli regions list`
| Oracle Cloud | `eu-stockholm-1` | `oci iam region list`
You can also specify multiple locations, and then they will be
used in round-robin fashion.
For example, with Google Cloud, since the default quotas are very
low (my account is limited to 8 public IP addresses per zone, and
my requests to increase that quota were denied) you can do the
following:
```bash
export TF_VAR_location=$(gcloud compute zones list --format=json | jq -r .[].name | grep ^europe)
```
Then when you apply, clusters will be created across all available
zones in Europe. (As I write this, there are 20+ zones in Europe,
so even with my quota, I can create 40 clusters.)
3. Run!
```bash
./run.sh <providername>
```
(If you don't specify a provider name, it will list available providers.)
4. Shutting down
Go to the directory that was created by the previous step (`tag-YYYY-MM...`)
and run `terraform destroy`.
You can also run `./clean.sh` which will destroy ALL clusters deployed by the previous run script.
## Without shell wrapper
Expert mode.
Useful to run steps separately, and/or when working on the Terraform configurations.
To use it:
1. Select the provider you wish to use.
Go to the `source` directory and edit `main.tf`.
Change the `source` attribute of the `module "clusters"` section.
Check the content of the `modules` directory to see available choices.
```bash
vim main.tf
```
2. Initialize the provider.
```bash
@@ -110,20 +20,24 @@ terraform init
3. Configure provider authentication.
See steps above, and add the following extra steps:
- Digital Ocean:
```bash
export DIGITALOCEAN_ACCESS_TOKEN=$(grep ^access-token ~/.config/doctl/config.yaml | cut -d: -f2 | tr -d " ")
```
- Linode:
```bash
export LINODE_TOKEN=$(grep ^token ~/.config/linode-cli | cut -d= -f2 | tr -d " ")
```
- Digital Ocean: `export DIGITALOCEAN_ACCESS_TOKEN=...`
(check `~/.config/doctl/config.yaml` for the token)
- Linode: `export LINODE_TOKEN=...`
(check `~/.config/linode-cli` for the token)
- Oracle Cloud: it should use `~/.oci/config`
- Scaleway: run `scw init`
4. Decide how many clusters and how many nodes per cluster you want.
```bash
export TF_VAR_how_many_clusters=5
export TF_VAR_min_nodes_per_pool=2
# Optional (will enable autoscaler when available)
export TF_VAR_max_nodes_per_pool=4
# Optional (will only work on some providers)
export TF_VAR_enable_arm_pool=true
```
5. Provision clusters.
```bash
@@ -132,7 +46,7 @@ terraform apply
6. Perform second stage provisioning.
This will install an SSH server on the clusters.
This will install a SSH server on the clusters.
```bash
cd stage2
@@ -158,5 +72,5 @@ terraform destroy
9. Clean up stage2.
```bash
rm stage2/terraform.tfstate*
rm stage/terraform.tfstate*
```

View File

@@ -1,9 +0,0 @@
#!/bin/sh
export LINODE_TOKEN=$(grep ^token ~/.config/linode-cli | cut -d= -f2 | tr -d " ")
export DIGITALOCEAN_ACCESS_TOKEN=$(grep ^access-token ~/.config/doctl/config.yaml | cut -d: -f2 | tr -d " ")
for T in tag-*; do
(
cd $T
terraform apply -destroy -auto-approve && mv ../$T ../deleted$T
)
done
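
Note on the two `export` lines in the script above: each one reads a token out of a CLI configuration file with `grep | cut | tr`. The sketch below runs the same pipelines against throwaway files. The file contents and token values are invented for this example and only approximate the real `linode-cli` and `doctl` formats.

```bash
#!/bin/sh
# Create sample config files in a temporary directory (made-up contents).
WORKDIR=$(mktemp -d)
printf '[DEFAULT]\ntoken = abc123\nregion = eu-central\n' > "$WORKDIR/linode-cli"
printf 'access-token: xyz789\ncontext: default\n' > "$WORKDIR/config.yaml"

# INI style: keep the line that starts with "token", take the field after
# "=", then delete the spaces around the value.
LINODE_TOKEN=$(grep ^token "$WORKDIR/linode-cli" | cut -d= -f2 | tr -d " ")

# YAML style: same idea, with ":" as the separator.
DIGITALOCEAN_ACCESS_TOKEN=$(grep ^access-token "$WORKDIR/config.yaml" | cut -d: -f2 | tr -d " ")

echo "$LINODE_TOKEN $DIGITALOCEAN_ACCESS_TOKEN"
rm -r "$WORKDIR"
```

One caveat of this approach: `grep ^token` keeps every line that begins with `token`, so a configuration file with several matching lines would yield several values.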

16
prepare-tf/locals.tf Normal file
View File

@@ -0,0 +1,16 @@
resource "random_string" "_" {
length = 5
special = false
upper = false
}
resource "time_static" "_" {}
locals {
tag = format("tf-%s-%s", formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339), random_string._.result)
# Common tags to be assigned to all resources
common_tags = [
"created-by=terraform",
"tag=${local.tag}"
]
}

View File

@@ -1,5 +1,5 @@
module "clusters" {
source = "./modules/PROVIDER"
source = "./modules/linode"
for_each = local.clusters
cluster_name = each.value.cluster_name
min_nodes_per_pool = var.min_nodes_per_pool
@@ -7,24 +7,22 @@ module "clusters" {
enable_arm_pool = var.enable_arm_pool
node_size = var.node_size
common_tags = local.common_tags
location = each.value.location
}
locals {
clusters = {
for i in range(101, 101 + var.how_many_clusters) :
i => {
cluster_name = format("%s-%03d", local.tag, i)
kubeconfig_path = format("./stage2/kubeconfig.%03d", i)
cluster_name = format("%s-%03d", local.tag, i)
kubeconfig_path = format("./stage2/kubeconfig.%03d", i)
#dashdash_kubeconfig = format("--kubeconfig=./stage2/kubeconfig.%03d", i)
externalips_path = format("./stage2/externalips.%03d", i)
flags_path = format("./stage2/flags.%03d", i)
location = local.locations[i % length(local.locations)]
}
}
}
resource "local_file" "stage2" {
filename = "./stage2/main.tf"
filename = "./stage2/main.tf"
file_permission = "0644"
content = templatefile(
"./stage2.tmpl",
@@ -32,15 +30,6 @@ resource "local_file" "stage2" {
)
}
resource "local_file" "flags" {
for_each = local.clusters
filename = each.value.flags_path
file_permission = "0600"
content = <<-EOT
has_metrics_server: ${module.clusters[each.key].has_metrics_server}
EOT
}
resource "local_file" "kubeconfig" {
for_each = local.clusters
filename = each.value.kubeconfig_path
@@ -70,8 +59,8 @@ resource "null_resource" "wait_for_nodes" {
}
data "external" "externalips" {
for_each = local.clusters
depends_on = [null_resource.wait_for_nodes]
for_each = local.clusters
depends_on = [ null_resource.wait_for_nodes ]
program = [
"sh",
"-c",

View File

@@ -1,13 +1,12 @@
resource "digitalocean_kubernetes_cluster" "_" {
name = var.cluster_name
tags = var.common_tags
# Region is mandatory, so let's provide a default value.
region = var.location != null ? var.location : "nyc1"
name = var.cluster_name
tags = local.common_tags
region = var.region
version = var.k8s_version
node_pool {
name = "x86"
tags = var.common_tags
name = "dok-x86"
tags = local.common_tags
size = local.node_type
auto_scale = true
min_nodes = var.min_nodes_per_pool

View File

@@ -5,7 +5,3 @@ output "kubeconfig" {
output "cluster_id" {
value = digitalocean_kubernetes_cluster._.id
}
output "has_metrics_server" {
value = false
}

View File

@@ -8,6 +8,10 @@ variable "common_tags" {
default = []
}
locals {
common_tags = [for tag in var.common_tags : replace(tag, "=", "-")]
}
variable "node_size" {
type = string
default = "M"
@@ -42,11 +46,9 @@ locals {
node_type = var.node_types[var.node_size]
}
# To view supported regions, run:
# doctl compute region list
variable "location" {
variable "region" {
type = string
default = null
default = "ams3"
}
# To view supported versions, run:

View File

@@ -1,8 +1,7 @@
resource "linode_lke_cluster" "_" {
label = var.cluster_name
tags = var.common_tags
# "region" is mandatory, so let's provide a default value if none was given.
region = var.location != null ? var.location : "eu-central"
label = var.cluster_name
tags = var.common_tags
region = var.region
k8s_version = var.k8s_version
pool {

View File

@@ -5,7 +5,3 @@ output "kubeconfig" {
output "cluster_id" {
value = linode_lke_cluster._.id
}
output "has_metrics_server" {
value = false
}

View File

@@ -42,11 +42,11 @@ locals {
node_type = var.node_types[var.node_size]
}
# To view supported regions, run:
# To view supported versions, run:
# linode-cli regions list
variable "location" {
variable "region" {
type = string
default = null
default = "us-east"
}
# To view supported versions, run:

View File

@@ -1,7 +1,6 @@
resource "oci_identity_compartment" "_" {
name = var.cluster_name
description = var.cluster_name
enable_delete = true
name = var.cluster_name
description = var.cluster_name
}
locals {

View File

@@ -9,7 +9,3 @@ output "kubeconfig" {
output "cluster_id" {
value = oci_containerengine_cluster._.id
}
output "has_metrics_server" {
value = false
}

View File

@@ -70,13 +70,6 @@ locals {
node_type = var.node_types[var.node_size]
}
# To view supported regions, run:
# oci iam region list | jq .data[].name
variable "location" {
type = string
default = null
}
# To view supported versions, run:
# oci ce cluster-options get --cluster-option-id all | jq -r '.data["kubernetes-versions"][]'
variable "k8s_version" {

View File

@@ -1,15 +1,13 @@
resource "scaleway_k8s_cluster" "_" {
name = var.cluster_name
region = var.location
tags = var.common_tags
version = var.k8s_version
cni = var.cni
delete_additional_resources = true
name = var.cluster_name
tags = var.common_tags
version = var.k8s_version
cni = var.cni
}
resource "scaleway_k8s_pool" "_" {
cluster_id = scaleway_k8s_cluster._.id
name = "x86"
name = "scw-x86"
tags = var.common_tags
node_type = local.node_type
size = var.min_nodes_per_pool

View File

@@ -5,7 +5,3 @@ output "kubeconfig" {
output "cluster_id" {
value = scaleway_k8s_cluster._.id
}
output "has_metrics_server" {
value = sort([var.k8s_version, "1.22"])[0] == "1.22"
}
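
Note on the `has_metrics_server` expression above: it sorts the two strings and checks whether `"1.22"` comes first, which is true when `var.k8s_version` sorts at or after `"1.22"`. Terraform's `sort` orders strings lexicographically, so this is a string comparison rather than a numeric version comparison. The sketch below mimics it with `sort` in the C locale. The function name `at_least_1_22` is hypothetical.

```bash
#!/bin/sh
# Hypothetical helper: succeed when "1.22" sorts first (or ties) against $1,
# using plain byte-wise string ordering.
at_least_1_22() {
  FIRST=$(printf '%s\n' "$1" "1.22" | LC_ALL=C sort | head -n 1)
  [ "$FIRST" = "1.22" ]
}

for V in 1.21 1.22 1.23 1.9; do
  if at_least_1_22 "$V"; then
    echo "$V: true"
  else
    echo "$V: false"
  fi
done
```

The `1.9` case also reports true, because `"1.22"` sorts before `"1.9"` as a string; the trick is only reliable while the minor version has the same number of digits.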

View File

@@ -47,12 +47,7 @@ variable "cni" {
default = "cilium"
}
variable "location" {
type = string
default = null
}
# To view supported versions, run:
# See supported versions with:
# scw k8s version list -o json | jq -r .[].name
variable "k8s_version" {
type = string

View File

@@ -1,49 +0,0 @@
#!/bin/sh
set -e
TIME=$(which time)
PROVIDER=$1
[ "$PROVIDER" ] || {
echo "Please specify a provider as first argument, or 'ALL' for parallel mode."
echo "Available providers:"
ls -1 source/modules
exit 1
}
[ "$TAG" ] || {
TIMESTAMP=$(date +%Y-%m-%d-%H-%M-%S)
RANDOMTAG=$(base64 /dev/urandom | tr A-Z a-z | tr -d /+ | head -c5)
export TAG=tag-$TIMESTAMP-$RANDOMTAG
}
[ "$PROVIDER" = "ALL" ] && {
for PROVIDER in $(ls -1 source/modules); do
$TERMINAL -T $TAG-$PROVIDER -e sh -c "
export TAG=$TAG-$PROVIDER
$0 $PROVIDER
cd $TAG-$PROVIDER
bash
" &
done
exit 0
}
[ -d "source/modules/$PROVIDER" ] || {
echo "Provider '$PROVIDER' not found."
echo "Available providers:"
ls -1 source/modules
exit 1
}
export LINODE_TOKEN=$(grep ^token ~/.config/linode-cli | cut -d= -f2 | tr -d " ")
export DIGITALOCEAN_ACCESS_TOKEN=$(grep ^access-token ~/.config/doctl/config.yaml | cut -d: -f2 | tr -d " ")
cp -a source $TAG
cd $TAG
cp -r modules/$PROVIDER modules/PROVIDER
$TIME -o time.1.init terraform init
$TIME -o time.2.stage1 terraform apply -auto-approve
cd stage2
$TIME -o ../time.3.init terraform init
$TIME -o ../time.4.stage2 terraform apply -auto-approve
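
Note on the `TAG` logic in the script above: when `TAG` is not already set, it is built from a timestamp and five random lowercase characters. The sketch below reproduces that construction in isolation. It reads a fixed number of random bytes before encoding them, which is an assumption made here for portability; the script pipes `/dev/urandom` through `base64` directly.

```bash
#!/bin/sh
# Timestamp in the same format as the script (year through seconds).
TIMESTAMP=$(date +%Y-%m-%d-%H-%M-%S)

# Five random characters: encode random bytes, lowercase them, drop "/" and
# "+", and keep the first five characters.
RANDOMTAG=$(head -c 32 /dev/urandom | base64 | tr A-Z a-z | tr -d /+ | head -c5)

TAG=tag-$TIMESTAMP-$RANDOMTAG
echo "$TAG"
```

In `ALL` mode the script appends `-$PROVIDER` to this value, so that each provider gets its own directory.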

View File

@@ -1,19 +0,0 @@
resource "random_string" "_" {
length = 4
number = false
special = false
upper = false
}
resource "time_static" "_" {}
locals {
timestamp = formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339)
tag = random_string._.result
# Common tags to be assigned to all resources
common_tags = [
"created-by-terraform",
format("created-at-%s", local.timestamp),
format("created-for-%s", local.tag)
]
}

View File

@@ -1,65 +0,0 @@
resource "google_container_cluster" "_" {
name = var.cluster_name
project = local.project
location = local.location
min_master_version = var.k8s_version
# To deploy private clusters, uncomment the section below,
# and uncomment the block in network.tf.
# Private clusters require extra resources (Cloud NAT,
# router, network, subnet) and the quota for some of these
# resources is fairly low on GCP; so if you want to deploy
# a lot of private clusters (more than 10), you can use these
# blocks as a base but you will probably have to refactor
# things quite a bit (you will at least need to define a single
# shared router and use it across all the clusters).
/*
network = google_compute_network._.name
subnetwork = google_compute_subnetwork._.name
private_cluster_config {
enable_private_nodes = true
# This must be set to "false".
# (Otherwise, access to the public endpoint is disabled.)
enable_private_endpoint = false
# This must be set to a /28.
# I think it shouldn't collide with the pod network subnet.
master_ipv4_cidr_block = "10.255.255.0/28"
}
# Private clusters require "VPC_NATIVE" networking mode
# (as opposed to the legacy "ROUTES").
networking_mode = "VPC_NATIVE"
# ip_allocation_policy is required for VPC_NATIVE clusters.
ip_allocation_policy {
# This is the block that will be used for pods.
cluster_ipv4_cidr_block = "10.0.0.0/12"
# The services block is optional
# (GKE will pick one automatically).
#services_ipv4_cidr_block = ""
}
*/
node_pool {
name = "x86"
node_config {
tags = var.common_tags
machine_type = local.node_type
}
initial_node_count = var.min_nodes_per_pool
autoscaling {
min_node_count = var.min_nodes_per_pool
max_node_count = max(var.min_nodes_per_pool, var.max_nodes_per_pool)
}
}
# This is not strictly necessary.
# We'll see if we end up using it.
# (If it is removed, make sure to also remove the corresponding
# key+cert variables from outputs.tf!)
master_auth {
client_certificate_config {
issue_client_certificate = true
}
}
}

View File

@@ -1,38 +0,0 @@
/*
resource "google_compute_network" "_" {
name = var.cluster_name
project = local.project
# The default is to create subnets automatically.
# However, this creates one subnet per zone in all regions,
# which causes a quick exhaustion of the subnet quota.
auto_create_subnetworks = false
}
resource "google_compute_subnetwork" "_" {
name = var.cluster_name
ip_cidr_range = "10.254.0.0/16"
region = local.region
network = google_compute_network._.id
project = local.project
}
resource "google_compute_router" "_" {
name = var.cluster_name
region = local.region
network = google_compute_network._.name
project = local.project
}
resource "google_compute_router_nat" "_" {
name = var.cluster_name
router = google_compute_router._.name
region = local.region
project = local.project
# Everyone in the network is allowed to NAT out.
# (We would change this if we only wanted to allow specific subnets to NAT out.)
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
# Pick NAT addresses automatically.
# (We would change this if we wanted to use specific addresses to NAT out.)
nat_ip_allocate_option = "AUTO_ONLY"
}
*/

View File

@@ -1,35 +0,0 @@
data "google_client_config" "_" {}
output "kubeconfig" {
value = <<-EOT
apiVersion: v1
kind: Config
current-context: ${google_container_cluster._.name}
clusters:
- name: ${google_container_cluster._.name}
cluster:
server: https://${google_container_cluster._.endpoint}
certificate-authority-data: ${google_container_cluster._.master_auth[0].cluster_ca_certificate}
contexts:
- name: ${google_container_cluster._.name}
context:
cluster: ${google_container_cluster._.name}
user: client-token
users:
- name: client-cert
user:
client-key-data: ${google_container_cluster._.master_auth[0].client_key}
client-certificate-data: ${google_container_cluster._.master_auth[0].client_certificate}
- name: client-token
user:
token: ${data.google_client_config._.access_token}
EOT
}
output "cluster_id" {
value = google_container_cluster._.id
}
output "has_metrics_server" {
value = true
}

View File

@@ -1,8 +0,0 @@
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "4.5.0"
}
}
}

View File

@@ -1,68 +0,0 @@
variable "cluster_name" {
type = string
default = "deployed-with-terraform"
}
variable "common_tags" {
type = list(string)
default = []
}
variable "node_size" {
type = string
default = "M"
}
variable "min_nodes_per_pool" {
type = number
default = 2
}
variable "max_nodes_per_pool" {
type = number
default = 5
}
# FIXME
variable "enable_arm_pool" {
type = bool
default = false
}
variable "node_types" {
type = map(string)
default = {
"S" = "e2-small"
"M" = "e2-medium"
"L" = "e2-standard-2"
}
}
locals {
node_type = var.node_types[var.node_size]
}
# To view supported locations, run:
# gcloud compute zones list
variable "location" {
type = string
default = null
}
# To view supported versions, run:
# gcloud container get-server-config --region=europe-north1 '--format=flattened(channels)'
# But it's also possible to just specify e.g. "1.20" and it figures it out.
variable "k8s_version" {
type = string
default = "1.21"
}
locals {
location = var.location != null ? var.location : "europe-north1-a"
region = replace(local.location, "/-[a-z]$/", "")
# Unfortunately, the following line doesn't work
# (that attribute just returns an empty string)
# so we have to hard-code the project name.
#project = data.google_client_config._.project
project = "prepare-tf"
}

View File

@@ -1,40 +0,0 @@
variable "how_many_clusters" {
type = number
default = 1
}
variable "node_size" {
type = string
default = "M"
# Can be S, M, L.
# We map these values to different specific instance types for each provider,
# but the idea is that they should correspond to the following sizes:
# S = 2 GB RAM
# M = 4 GB RAM
# L = 8 GB RAM
}
variable "min_nodes_per_pool" {
type = number
default = 1
}
variable "max_nodes_per_pool" {
type = number
default = 0
}
variable "enable_arm_pool" {
type = bool
default = false
}
variable "location" {
type = string
default = null
}
# TODO: perhaps handle if it's space-separated instead of newline?
locals {
locations = var.location == null ? [null] : split("\n", var.location)
}
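
Note on `local.locations` above: the `location` variable is split on newlines, and each cluster index `i` then picks `locations[i % length(locations)]`, which spreads clusters over the locations in round-robin order. A minimal sketch of that selection follows. The function name `pick_location` is hypothetical and the zone names are only example values.

```bash
#!/bin/sh
# Hypothetical helper: $1 is the cluster index, the remaining arguments are
# the candidate locations. Prints the location chosen for that index.
pick_location() {
  INDEX=$1
  shift
  # Positional parameters are 1-based, hence the "+ 1".
  POSITION=$(( INDEX % $# + 1 ))
  eval "printf '%s\n' \"\${$POSITION}\""
}

# Cluster indexes start at 101 in the Terraform configuration.
for I in 101 102 103 104; do
  echo "cluster $I -> $(pick_location "$I" europe-north1-a europe-west1-b europe-west4-a)"
done
```

Because the index starts at 101 rather than 0, the first cluster does not necessarily land on the first location in the list.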

View File

@@ -2,7 +2,7 @@ terraform {
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = "2.7.1"
version = "2.0.3"
}
}
}
@@ -119,11 +119,6 @@ resource "kubernetes_cluster_role_binding" "shpod_${index}" {
name = "shpod"
namespace = "shpod"
}
subject {
api_group = "rbac.authorization.k8s.io"
kind = "Group"
name = "shpod-cluster-admins"
}
}
resource "random_string" "shpod_${index}" {
@@ -140,14 +135,9 @@ provider "helm" {
}
resource "helm_release" "metrics_server_${index}" {
# Some providers pre-install metrics-server.
# Some don't. Let's install metrics-server,
# but only if it's not already installed.
count = yamldecode(file("./flags.${index}"))["has_metrics_server"] ? 0 : 1
provider = helm.cluster_${index}
repository = "https://charts.bitnami.com/bitnami"
chart = "metrics-server"
version = "5.8.8"
name = "metrics-server"
namespace = "metrics-server"
create_namespace = true
@@ -191,7 +181,7 @@ resource "kubernetes_config_map" "kubeconfig_${index}" {
- name: cluster-admin
user:
client-key-data: $${base64encode(tls_private_key.cluster_admin_${index}.private_key_pem)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}.certificate)}
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request.cluster_admin_${index}.certificate)}
EOT
}
}
@@ -205,14 +195,11 @@ resource "tls_cert_request" "cluster_admin_${index}" {
private_key_pem = tls_private_key.cluster_admin_${index}.private_key_pem
subject {
common_name = "cluster-admin"
# Note: CSR API v1 doesn't allow issuing certs with "system:masters" anymore.
#organization = "system:masters"
# We'll use this custom group name instead.
organization = "shpod-cluster-admins"
organization = "system:masters"
}
}
resource "kubernetes_certificate_signing_request_v1" "cluster_admin_${index}" {
resource "kubernetes_certificate_signing_request" "cluster_admin_${index}" {
provider = kubernetes.cluster_${index}
metadata {
name = "cluster-admin"
@@ -220,7 +207,6 @@ resource "kubernetes_certificate_signing_request_v1" "cluster_admin_${index}" {
spec {
usages = ["client auth"]
request = tls_cert_request.cluster_admin_${index}.cert_request_pem
signer_name = "kubernetes.io/kube-apiserver-client"
}
auto_approve = true
}

28
prepare-tf/variables.tf Normal file
View File

@@ -0,0 +1,28 @@
variable "how_many_clusters" {
type = number
default = 2
}
variable "node_size" {
type = string
default = "M"
# Can be S, M, L.
# S = 2 GB RAM
# M = 4 GB RAM
# L = 8 GB RAM
}
variable "min_nodes_per_pool" {
type = number
default = 1
}
variable "max_nodes_per_pool" {
type = number
default = 0
}
variable "enable_arm_pool" {
type = bool
default = true
}

View File

@@ -14,9 +14,7 @@ These tools can help you to create VMs on:
- [Docker](https://docs.docker.com/engine/installation/)
- [Docker Compose](https://docs.docker.com/compose/install/)
- [Parallel SSH](https://github.com/lilydjwg/pssh)
(should be installable with `pip install git+https://github.com/lilydjwg/pssh`;
on a Mac, try `brew install pssh`)
- [Parallel SSH](https://code.google.com/archive/p/parallel-ssh/) (on a Mac: `brew install pssh`)
Depending on the infrastructure that you want to use, you also need to install
the CLI that is specific to that cloud. For OpenStack deployments, you will

View File

@@ -75,11 +75,9 @@ _cmd_createuser() {
echo '$USER_LOGIN ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/$USER_LOGIN
"
# The MaxAuthTries is here to help with folks who have many SSH keys.
pssh "
set -e
sudo sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
sudo sed -i 's/#MaxAuthTries 6/MaxAuthTries 42/' /etc/ssh/sshd_config
sudo service ssh restart
"
@@ -238,12 +236,6 @@ _cmd_docker() {
sudo add-apt-repository 'deb https://download.docker.com/linux/ubuntu bionic stable'
sudo apt-get -q update
sudo apt-get -qy install docker-ce
# Add registry mirror configuration.
if ! [ -f /etc/docker/daemon.json ]; then
echo '{\"registry-mirrors\": [\"https://mirror.gcr.io\"]}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
fi
"
##VERSION## https://github.com/docker/compose/releases
@@ -311,15 +303,13 @@ _cmd_kube() {
need_login_password
# Optional version, e.g. 1.13.5
SETTINGS=tags/$TAG/settings.yaml
KUBEVERSION=$(awk '/^kubernetes_version:/ {print $2}' $SETTINGS)
KUBEVERSION=$2
if [ "$KUBEVERSION" ]; then
pssh "
sudo tee /etc/apt/preferences.d/kubernetes <<EOF
Package: kubectl kubeadm kubelet
Pin: version $KUBEVERSION*
Pin-Priority: 1000
EOF"
EXTRA_APTGET="=$KUBEVERSION-00"
EXTRA_KUBEADM="kubernetesVersion: v$KUBEVERSION"
else
EXTRA_APTGET=""
EXTRA_KUBEADM=""
fi
# Install packages
@@ -330,8 +320,7 @@ EOF"
sudo tee /etc/apt/sources.list.d/kubernetes.list"
pssh --timeout 200 "
sudo apt-get update -q &&
sudo apt-get install -qy kubelet kubeadm kubectl &&
sudo apt-mark hold kubelet kubeadm kubectl
sudo apt-get install -qy kubelet$EXTRA_APTGET kubeadm$EXTRA_APTGET kubectl$EXTRA_APTGET &&
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl &&
echo 'alias k=kubectl' | sudo tee /etc/bash_completion.d/k &&
echo 'complete -F __start_kubectl k' | sudo tee -a /etc/bash_completion.d/k"
@@ -343,11 +332,6 @@ EOF"
sudo swapoff -a"
fi
# Re-enable CRI interface in containerd
pssh "
echo '# Use default parameters for containerd.' | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd"
# Initialize kube control plane
pssh --timeout 200 "
if i_am_first_node && [ ! -f /etc/kubernetes/admin.conf ]; then
@@ -357,38 +341,19 @@ kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- token: \$(cat /tmp/token)
nodeRegistration:
# Comment out the next line to switch back to Docker.
criSocket: /run/containerd/containerd.sock
ignorePreflightErrors:
- NumCPU
---
kind: JoinConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
discovery:
bootstrapToken:
apiServerEndpoint: \$(cat /etc/name_of_first_node):6443
token: \$(cat /tmp/token)
unsafeSkipCAVerification: true
nodeRegistration:
# Comment out the next line to switch back to Docker.
criSocket: /run/containerd/containerd.sock
ignorePreflightErrors:
- NumCPU
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
# The following line is necessary when using Docker.
# It doesn't seem necessary when using containerd.
#cgroupDriver: cgroupfs
cgroupDriver: cgroupfs
---
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
apiServer:
certSANs:
- \$(cat /tmp/ipv4)
$EXTRA_KUBEADM
EOF
sudo kubeadm init --config=/tmp/kubeadm-config.yaml
sudo kubeadm init --config=/tmp/kubeadm-config.yaml --ignore-preflight-errors=NumCPU
fi"
# Put kubeconfig in ubuntu's and $USER_LOGIN's accounts
@@ -412,8 +377,8 @@ EOF
pssh --timeout 200 "
if ! i_am_first_node && [ ! -f /etc/kubernetes/kubelet.conf ]; then
FIRSTNODE=\$(cat /etc/name_of_first_node) &&
ssh $SSHOPTS \$FIRSTNODE cat /tmp/kubeadm-config.yaml > /tmp/kubeadm-config.yaml &&
sudo kubeadm join --config /tmp/kubeadm-config.yaml
TOKEN=\$(ssh $SSHOPTS \$FIRSTNODE cat /tmp/token) &&
sudo kubeadm join --discovery-token-unsafe-skip-ca-verification --token \$TOKEN \$FIRSTNODE:6443
fi"
# Install metrics server
@@ -504,7 +469,7 @@ EOF
if [ ! -x /usr/local/bin/kustomize ]; then
curl -fsSL $URL |
sudo tar -C /usr/local/bin -zx kustomize
kustomize completion bash | sudo tee /etc/bash_completion.d/kustomize
echo complete -C /usr/local/bin/kustomize kustomize | sudo tee /etc/bash_completion.d/kustomize
kustomize version
fi"
@@ -713,7 +678,7 @@ _cmd_tailhist () {
ARCH=${ARCHITECTURE-amd64}
[ "$ARCH" = "aarch64" ] && ARCH=arm64
pssh "
pssh -i "
set -e
wget https://github.com/joewalnes/websocketd/releases/download/v0.3.0/websocketd-0.3.0-linux_$ARCH.zip
unzip websocketd-0.3.0-linux_$ARCH.zip websocketd

View File

@@ -1,7 +1,7 @@
infra_start() {
COUNT=$1
cp terraform-openstack/*.tf tags/$TAG
cp terraform/*.tf tags/$TAG
(
cd tags/$TAG
if ! terraform init; then

View File

@@ -1,82 +0,0 @@
#!/bin/sh
# https://open-api.netlify.com/#tag/dnsZone
[ "$1" ] || {
echo ""
echo "Add a record in Netlify DNS."
echo "This script is hardcoded to add a record to container.training".
echo ""
echo "Syntax:"
echo "$0 list"
echo "$0 add <name> <ipaddr>"
echo "$0 del <recordid>"
echo ""
echo "Example to create an A record for eu.container.training:"
echo "$0 add eu 185.145.250.0"
echo ""
exit 1
}
NETLIFY_USERID=$(jq .userId < ~/.config/netlify/config.json)
NETLIFY_TOKEN=$(jq -r .users[$NETLIFY_USERID].auth.token < ~/.config/netlify/config.json)
netlify() {
URI=$1
shift
http https://api.netlify.com/api/v1/$URI "$@" "Authorization:Bearer $NETLIFY_TOKEN"
}
ZONE_ID=$(netlify dns_zones |
jq -r '.[] | select ( .name == "container.training" ) | .id')
_list() {
netlify dns_zones/$ZONE_ID/dns_records |
jq -r '.[] | select(.type=="A") | [.hostname, .type, .value, .id] | @tsv'
}
_add() {
NAME=$1.container.training
ADDR=$2
# It looks like if we create two identical records, then delete one of them,
# Netlify DNS ends up in a weird state (the name doesn't resolve anymore even
# though it's still visible through the API and the website?)
if netlify dns_zones/$ZONE_ID/dns_records |
jq '.[] | select(.hostname=="'$NAME'" and .type=="A" and .value=="'$ADDR'")' |
grep .
then
echo "It looks like that record already exists. Refusing to create it."
exit 1
fi
netlify dns_zones/$ZONE_ID/dns_records type=A hostname=$NAME value=$ADDR ttl=300
netlify dns_zones/$ZONE_ID/dns_records |
jq '.[] | select(.hostname=="'$NAME'")'
}
_del() {
RECORD_ID=$1
# OK, since that one is dangerous, I'm putting the whole request explicitly here
http DELETE \
https://api.netlify.com/api/v1/dns_zones/$ZONE_ID/dns_records/$RECORD_ID \
"Authorization:Bearer $NETLIFY_TOKEN"
}
case "$1" in
list)
_list
;;
add)
_add $2 $3
;;
del)
_del $2
;;
*)
echo "Unknown command '$1'."
exit 1
;;
esac

View File

@@ -1,33 +0,0 @@
# Number of VMs per cluster
clustersize: 3
# The hostname of each node will be clusterprefix + a number
clusterprefix: oldversion
# Jinja2 template to use to generate ready-to-cut cards
cards_template: cards.html
# Use "Letter" in the US, and "A4" everywhere else
paper_size: A4
# Login and password that students will use
user_login: k8s
user_password: training
# For a list of old versions, check:
# https://kubernetes.io/releases/patch-releases/#non-active-branch-history
kubernetes_version: 1.18.20
image:
steps:
- wait
- clusterize
- tools
- docker
- createuser
- webssh
- tailhist
- kube
- kubetools
- kubetest
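
The `kubernetes_version` field above is what the `kube` step reads (with `awk`) to decide whether to pin package versions. A minimal sketch of that lookup, assuming a throwaway settings file created with `mktemp` (the temporary file and the `echo` messages are illustrative, not part of `workshopctl`):

```shell
# Write a throwaway settings file containing only a couple of fields.
SETTINGS=$(mktemp)
printf 'clustersize: 3\nkubernetes_version: 1.18.20\n' > "$SETTINGS"

# Same awk expression as in the kube step: print the second field
# of the line that starts with "kubernetes_version:".
KUBEVERSION=$(awk '/^kubernetes_version:/ {print $2}' "$SETTINGS")

if [ "$KUBEVERSION" ]; then
  echo "pinning kubectl, kubeadm, kubelet to $KUBEVERSION*"
else
  echo "no version pinned; installing the latest packages"
fi

rm -f "$SETTINGS"
```

If the field is absent, `KUBEVERSION` is empty and the pinning branch is skipped.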

View File

@@ -3,7 +3,7 @@ set -e
export AWS_INSTANCE_TYPE=t3a.small
INFRA=infra/aws-eu-north-1
INFRA=infra/aws-us-east-2
STUDENTS=2
@@ -33,15 +33,9 @@ TAG=$PREFIX-$SETTINGS
--settings settings/$SETTINGS.yaml \
--students $STUDENTS
INFRA=infra/enix
#INFRA=infra/aws-us-west-1
SETTINGS=admin-oldversion
TAG=$PREFIX-$SETTINGS
./workshopctl start \
--tag $TAG \
--infra $INFRA \
--settings settings/$SETTINGS.yaml \
--students $STUDENTS
export AWS_INSTANCE_TYPE=t3a.medium
SETTINGS=admin-test
TAG=$PREFIX-$SETTINGS

56
slides/1.yml Normal file
View File

@@ -0,0 +1,56 @@
title: |
Docker & Kubernetes
Part 1 - Docker
chat: "[Teams](https://teams.microsoft.com/l/channel/19%3arctk01XQVWxbj6pjTtJDfVd0_QOzfzYe7Xt8VDpl9681%40thread.tacv2/General?groupId=89c621d8-7080-447f-a7eb-9d6704776dd5&tenantId=72aa0d83-624a-4ebf-a683-1b9b45548610)"
gitrepo: github.com/jpetazzo/container.training
slides: https://2021-11-derivco.container.training/
#slidenumberprefix: "#SomeHashTag &mdash; "
exclude:
- self-paced
content:
- shared/title.md
- logistics.md
- containers/intro.md
- shared/about-slides.md
- shared/chat-room-im.md
#- shared/chat-room-zoom-meeting.md
#- shared/chat-room-zoom-webinar.md
- shared/toc.md
- # DAY 1
#- containers/Docker_Overview.md
#- containers/Docker_History.md
- containers/Training_Environment.md
#- containers/Installing_Docker.md
- containers/First_Containers.md
- containers/Background_Containers.md
- containers/Initial_Images.md
-
- containers/Building_Images_Interactively.md
- containers/Building_Images_With_Dockerfiles.md
- containers/Cmd_And_Entrypoint.md
- containers/Copying_Files_During_Build.md
- containers/Exercise_Dockerfile_Basic.md
- # DAY 2
- containers/Dockerfile_Tips.md
- containers/Multi_Stage_Builds.md
- containers/Container_Networking_Basics.md
- containers/Local_Development_Workflow.md
- containers/Getting_Inside.md
-
- containers/Container_Network_Model.md
- containers/Compose_For_Dev_Stacks.md
- containers/Exercise_Composefile.md
- containers/Exercise_Dockerfile_Advanced.md
- shared/thankyou.md
- # EXTRA
- containers/Start_And_Attach.md
- containers/Naming_And_Inspecting.md
- containers/Labels.md
- containers/Advanced_Dockerfiles.md
- containers/Network_Drivers.md

View File

@@ -2,7 +2,7 @@
#/ /kube-halfday.yml.html 200!
#/ /kube-fullday.yml.html 200!
#/ /kube-twodays.yml.html 200!
/ /kube.yml.html 200!
/ /1.yml.html 200!
# And this allows to do "git clone https://container.training".
/info/refs service=git-upload-pack https://github.com/jpetazzo/container.training/info/refs?service=git-upload-pack

View File

@@ -109,7 +109,7 @@ class: extra-details
- Example: [ctr.run](https://ctr.run/)
.lab[
.exercise[
- Use ctr.run to automatically build a container image and run it:
```bash

View File

@@ -28,7 +28,7 @@ class: self-paced
- Likewise, it will take more than merely *reading* these slides
to make you an expert
- These slides include *tons* of demos, exercises, and examples
- These slides include *tons* of exercises and examples
- They assume that you have access to a machine running Docker

View File

@@ -1,5 +0,0 @@
## Exercise — Application Configuration
- Configure an application with a ConfigMap
- Generate configuration file from the downward API

View File

@@ -1,87 +0,0 @@
# Exercise — Application Configuration
- We want to configure an application with a ConfigMap
- We will use the "rainbow" example shown previously
(HAProxy load balancing traffic to services in multiple namespaces)
- We won't provide the HAProxy configuration file
- Instead, we will provide a list of namespaces
(e.g. as a space-delimited list in a ConfigMap)
- Our Pod should generate the HAProxy configuration using the ConfigMap
---
## Setup
- Let's say that we have the "rainbow" app deployed:
```bash
kubectl apply -f ~/container.training/k8s/rainbow.yaml
```
- And a ConfigMap like the following one:
```bash
kubectl create configmap rainbow --from-literal=namespaces="blue green"
```
---
## Goal 1
- We want a Deployment and a Service called `rainbow`
- The `rainbow` Service should load balance across Namespaces `blue` and `green`
(i.e. to the Services called `color` in both these Namespaces)
- We want to be able to update the configuration:
- update the ConfigMap to put `blue green red`
- what should we do so that HAProxy picks up the change?
---
## Goal 2
- Check what happens if we specify a backend that doesn't exist
(e.g. add `purple` to the list of namespaces)
- If we specify invalid backends to HAProxy, it won't start!
- Implement a workaround among these two:
- remove invalid backends from the list before starting HAProxy
- wait until all backends are valid before starting HAProxy
---
## Goal 3
- We'd like HAProxy to pick up ConfigMap updates automatically
- How can we do that?
---
## Hints
- Check the following slides if you need help!
--
- We want to generate the HAProxy configuration in an `initContainer`
--
- The `namespaces` entry of the `rainbow` ConfigMap should be exposed to the `initContainer`
--
- The HAProxy configuration should be in a volume shared with HAProxy

View File

@@ -1,7 +0,0 @@
## Exercise — Build a Cluster
- Deploy a cluster by configuring and running each component manually
- Add CNI networking
- Generate and validate ServiceAccount tokens

View File

@@ -1,33 +0,0 @@
# Exercise — Build a Cluster
- Step 1: deploy a cluster
- follow the steps in the "Dessine-moi un cluster" section
- Step 2: add CNI networking
- use kube-router
- interconnect with the route-reflector
- check that you receive the routes of other clusters
- Step 3: generate and validate ServiceAccount tokens
- see next slide for help!
---
## ServiceAccount tokens
- We need to generate a TLS key pair and certificate
- A self-signed certificate will work
- We don't need anything particular in the certificate
(no particular CN, key use flags, etc.)
- The key needs to be passed to both API server and controller manager
- Check that ServiceAccount tokens are generated correctly

View File

@@ -4,6 +4,8 @@
(we will use the `rng` service in the dockercoins app)
- See what happens when the load increases
- Observe the correct behavior of the readiness probe
(spoiler alert: it involves timeouts!)
(when deploying e.g. an invalid image)
- Observe the behavior of the liveness probe

View File

@@ -2,85 +2,36 @@
- We want to add healthchecks to the `rng` service in dockercoins
- The `rng` service exhibits an interesting behavior under load:
*its latency increases (which will cause probes to time out!)*
- We want to see:
- what happens when the readiness probe fails
- what happens when the liveness probe fails
- how to set "appropriate" probes and probe parameters
---
## Setup
- First, deploy a new copy of dockercoins
(for instance, in a brand new namespace)
- Then, add a readiness probe on the `rng` service
- Pro tip #1: ping (e.g. with `httping`) the `rng` service at all times
- it should initially show a few milliseconds latency
- that will increase when we scale up
- it will also let us detect when the service goes "boom"
- Pro tip #2: also keep an eye on the web UI
---
## Readiness
- Add a readiness probe to `rng`
- this requires editing the pod template in the Deployment manifest
- use a simple HTTP check on the `/` route of the service
- keep all other parameters (timeouts, thresholds...) at their default values
(using a simple HTTP check on the `/` route of the service)
- Check what happens when deploying an invalid image for `rng` (e.g. `alpine`)
*(If the probe was set up correctly, the app will continue to work,
because Kubernetes won't switch over the traffic to the `alpine` containers,
because they don't pass the readiness probe.)*
- Then roll back `rng` to the original image and add a liveness probe
(with the same parameters)
- Scale up the `worker` service (to 15+ workers) and observe
- What happens, and how can we improve the situation?
---
## Readiness under load
## Goal
- Then roll back `rng` to the original image
- *Before* adding the readiness probe:
- Check what happens when we scale up the `worker` Deployment to 15+ workers
updating the image of the `rng` service with `alpine` should break it
(get the latency above 1 second)
- *After* adding the readiness probe:
*(We should now observe intermittent unavailability of the service, i.e. every
30 seconds it will be unreachable for a bit, then come back, then go away again, etc.)*
updating the image of the `rng` service with `alpine` shouldn't break it
---
- When adding the liveness probe, nothing special should happen
## Liveness
- Scaling the `worker` service will then cause disruptions
- Now replace the readiness probe with a liveness probe
- What happens now?
*(At first the behavior looks the same as with the readiness probe:
service becomes unreachable, then reachable again, etc.; but there is
a significant difference behind the scenes. What is it?)*
---
## Readiness and liveness
- Bonus questions!
- What happens if we enable both probes at the same time?
- What strategies can we use so that both probes are useful?
- The final goal is to understand why, and how to fix it

View File

@@ -6,7 +6,7 @@
- the web app itself (dockercoins, NGINX, whatever we want)
- an ingress controller
- an ingress controller (we suggest Traefik)
- a domain name (use `*.nip.io` or `*.localdev.me`)
@@ -16,7 +16,7 @@
## Goal
- We want to be able to access the web app using a URL like:
- We want to be able to access the web app using an URL like:
http://webapp.localdev.me
@@ -30,13 +30,11 @@
## Hints
- For the ingress controller, we can use:
- Traefik can be installed with Helm
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx/blob/main/docs/deploy/index.md)
(it can be found on the Artifact Hub)
- the [Traefik Helm chart](https://doc.traefik.io/traefik/getting-started/install-traefik/#use-the-helm-chart)
- the container.training [Traefik DaemonSet](https://raw.githubusercontent.com/jpetazzo/container.training/main/k8s/traefik-v2.yaml)
- If using Kubernetes 1.22+, make sure to use Traefik 2.5+
- If our cluster supports LoadBalancer Services: easy

View File

@@ -1,5 +1,3 @@
⚠️ BROKEN EXERCISE - DO NOT USE
## Exercise — Ingress Secret Policy
*Implement policy to limit impact of ingress controller vulnerabilities.*

View File

@@ -1,5 +1,3 @@
⚠️ BROKEN EXERCISE - DO NOT USE
# Exercise — Ingress Secret Policy
- Most ingress controllers have access to all Secrets
@@ -90,6 +88,6 @@
## Step 5: double-check
- Check that the Ingress Controller can't access other secrets
- Check that the Ingres Controller can't access other secrets
(e.g. by manually creating a Secret and checking with `kubectl exec`?)

View File

@@ -8,37 +8,25 @@
- We'll use one Deployment for each component
(created with `kubectl create deployment`)
(see next slide for the images to use)
- We'll connect them with Services
(create with `kubectl expose`)
- We'll check that we can access the web UI in a browser
---
## Images
- We'll use the following images:
- hasher → `dockercoins/hasher:v0.1`
- hasher → `dockercoins/hasher:v0.1`
- redis → `redis`
- redis`redis`
- rng`dockercoins/rng:v0.1`
- rng`dockercoins/rng:v0.1`
- webui`dockercoins/webui:v0.1`
- webui`dockercoins/webui:v0.1`
- worker → `dockercoins/worker:v0.1`
- All services should be internal services, except the web UI
(since we want to be able to connect to the web UI from outside)
---
class: pic
![Dockercoins architecture diagram](images/dockercoins-diagram.png)
- worker`dockercoins/worker:v0.1`
---
@@ -46,7 +34,7 @@ class: pic
- We should be able to see the web UI in our browser
(with the graph showing approximately 3-4 hashes/second)
(with the graph showing approximatiely 3-4 hashes/second)
---
@@ -56,4 +44,4 @@ class: pic
(check the logs of the worker; they indicate the port numbers)
- The web UI can be exposed with a NodePort or LoadBalancer Service
- The web UI can be exposed with a NodePort Service

View File

@@ -1,9 +0,0 @@
## Exercise — Generating Ingress With Kyverno
- When a Service gets created, automatically generate an Ingress
- Step 1: expose all services with a hard-coded domain name
- Step 2: only expose services that have a port named `http`
- Step 3: configure the domain name with a per-namespace ConfigMap

View File

@@ -1,33 +0,0 @@
# Exercise — Generating Ingress With Kyverno
When a Service gets created...
*(for instance, Service `blue` in Namespace `rainbow`)*
...Automatically generate an Ingress.
*(for instance, with host name `blue.rainbow.MYDOMAIN.COM`)*
---
## Goals
- Step 1: expose all services with a hard-coded domain name
- Step 2: only expose services that have a port named `http`
- Step 3: configure the domain name with a per-namespace ConfigMap
(e.g. `kubectl create configmap ingress-domain-name --from-literal=domain=1.2.3.4.nip.io`)
---
## Hints
- We want to use a Kyverno `generate` ClusterPolicy
- For step 1, check [Generate Resources](https://kyverno.io/docs/writing-policies/generate/) documentation
- For step 2, check [Preconditions](https://kyverno.io/docs/writing-policies/preconditions/) documentation
- For step 3, check [External Data Sources](https://kyverno.io/docs/writing-policies/external-data-sources/) documentation

View File

@@ -1,9 +0,0 @@
## Exercise — Remote Cluster
- Install kubectl locally
- Retrieve the kubeconfig file of our remote cluster
- Deploy dockercoins on that cluster
- Access an internal service without exposing it

View File

@@ -1,62 +0,0 @@
# Exercise — Remote Cluster
- We want to control a remote cluster
- Then we want to run a copy of dockercoins on that cluster
- We want to be able to connect to an internal service
---
## Goal
- Be able to access e.g. hasher, rng, or webui
(without exposing them with a NodePort or LoadBalancer service)
---
## Getting access to the cluster
- If you don't have `kubectl` on your machine, install it
- Download the kubeconfig file from the remote cluster
(you can use `scp` or even copy-paste it)
- If you already have a kubeconfig file on your machine:
- save the remote kubeconfig with another name (e.g. `~/.kube/config.remote`)
- set the `KUBECONFIG` environment variable to point to that file name
- ...or use the `--kubeconfig=...` option with `kubectl`
- Check that you can access the cluster (e.g. `kubectl get nodes`)
---
## If you get an error...
⚠️ The following applies to clusters deployed with `kubeadm`
- If you have a cluster where the nodes are named `node1`, `node2`, etc.
- `kubectl` commands might show connection errors with internal IP addresses
(e.g. 10.10... or 172.17...)
- In that case, you might need to edit the `kubeconfig` file:
- find the server address
- update it to put the *external* address of the first node of the cluster
---
## Deploying an app
- Deploy another copy of dockercoins from your local machine
- Access internal services (e.g. with `kubectl port-forward`)

View File

@@ -24,9 +24,9 @@ We will call them "dev cluster" and "prod cluster".
- Our application needs two secrets:
- a *logging API token* (not too sensitive; same in dev and prod)
- `logging_api_token` (not too sensitive; same in dev and prod)
- a *database password* (sensitive; different in dev and prod)
- `database_password` (sensitive; different in dev and prod)
- Secrets can be exposed as env vars, or mounted in volumes
@@ -42,7 +42,7 @@ We will call them "dev cluster" and "prod cluster".
- On the dev cluster, create a Namespace called `dev`
- Create the two secrets, `logging-api-token` and `database-password`
- Create the two secrets, `logging_api_token` and `database_password`
(the content doesn't matter; put a random string of your choice)
@@ -110,8 +110,8 @@ We want Alice to be able to:
- deploy the whole application in the `prod` namespace
- access the *logging API token* secret
- access the `logging_api_token` secret
- but *not* the *database password* secret
- but *not* the `database_password` secret
- view the logs of the app

View File

@@ -1,9 +0,0 @@
## Exercise — Terraform Node Pools
- Write a Terraform configuration to deploy a cluster
- The cluster should have two node pools with autoscaling
- Deploy two apps, each using exclusively one node pool
- Bonus: deploy an app balanced across both node pools

View File

@@ -1,69 +0,0 @@
# Exercise — Terraform Node Pools
- Write a Terraform configuration to deploy a cluster
- The cluster should have two node pools with autoscaling
- Deploy two apps, each using exclusively one node pool
- Bonus: deploy an app balanced across both node pools
---
## Cluster deployment
- Write a Terraform configuration to deploy a cluster
- We want to have two node pools with autoscaling
- Example for sizing:
- 4 GB / 1 CPU per node
- pools of 1 to 4 nodes
---
## Cluster autoscaling
- Deploy an app on the cluster
(you can use `nginx`, `jpetazzo/color`...)
- Set a resource request (e.g. 1 GB RAM)
- Scale up and verify that the autoscaler kicks in
---
## Pool isolation
- We want to deploy two apps
- The first app should be deployed exclusively on the first pool
- The second app should be deployed exclusively on the second pool
- Check the next slide for hints!
---
## Hints
- One solution involves adding a `nodeSelector` to the pod templates
- Another solution involves adding:
- `taints` to the node pools
- matching `tolerations` to the pod templates
---
## Balancing
- Step 1: make sure that the pools are not balanced
- Step 2: deploy a new app, check that it goes to the emptiest pool
- Step 3: update the app so that it balances (as much as possible) between pools

View File

@@ -1,60 +0,0 @@
#!/bin/sh
# The materials for a given training live in their own branch.
# Sometimes, we write custom content (or simply new content) for a training,
# and that content doesn't get merged back to main. This script tries to
# detect that with the following heuristics:
# - list all remote branches
# - for each remote branch, list the changes that weren't merged into main
# (using "diff main...$BRANCH", three dots)
# - ignore a bunch of training-specific files that change all the time anyway
# - for the remaining files, compute the diff between main and the branch
# (using "diff main..$BRANCH", two dots)
# - ignore changes of less than 10 lines
# - also ignore a few red herrings
# - display whatever is left
# For "git diff" (in the filter function) to work correctly, we must be
# at the root of the repo.
cd $(git rev-parse --show-toplevel)
BRANCHES=$(git branch -r | grep -v origin/HEAD | grep origin/2)
filter() {
threshold=10
while read filename; do
case $filename in
# Generic training-specific files
slides/*.html) continue;;
slides/*.yml) continue;;
slides/logistics*.md) continue;;
# Specific content that can be ignored
#slides/containers/Local_Environment.md) threshold=100;;
# Content that was moved/refactored enough to confuse us
slides/containers/Local_Environment.md) threshold=100;;
slides/exercises.md) continue;;
slides/k8s/batch-jobs) threshold=20;;
# Renames
*/{*}*) continue;;
esac
git diff --find-renames --numstat main..$BRANCH -- "$filename" | {
# If the files are identical, the diff will be empty, and "read" will fail.
read plus minus filename || return
# Ignore binary files (FIXME though?)
if [ $plus = - ]; then
return
fi
diff=$((plus-minus))
if [ $diff -gt $threshold ]; then
echo git diff main..$BRANCH -- $filename
fi
}
done
}
for BRANCH in $BRANCHES; do
if FILES=$(git diff --find-renames --name-only main...$BRANCH | filter | grep .); then
echo "🌳 $BRANCH:"
echo "$FILES"
fi
done

View File

@@ -32,7 +32,7 @@
- You're welcome to use whatever you like (e.g. AWS profiles)
.lab[
.exercise[
- Set the AWS region, API access key, and secret key:
```bash
@@ -58,7 +58,7 @@
- register it in our kubeconfig file
.lab[
.exercise[
- Update our kubeconfig file:
```bash

View File

@@ -20,13 +20,13 @@
## Suspension of disbelief
The labs and demos in this section assume that we have set up `kubectl` on our
The exercises in this section assume that we have set up `kubectl` on our
local machine in order to access a remote cluster.
We will therefore show how to access services and pods of the remote cluster,
from our local machine.
You can also run these commands directly on the cluster (if you haven't
You can also run these exercises directly on the cluster (if you haven't
installed and set up `kubectl` locally).
Running commands locally will be less useful
@@ -58,7 +58,7 @@ installed and set up `kubectl` to communicate with your cluster.
- Let's access the `webui` service through `kubectl proxy`
.lab[
.exercise[
- Run an API proxy in the background:
```bash
@@ -101,7 +101,7 @@ installed and set up `kubectl` to communicate with your cluster.
- Let's access our remote Redis server
.lab[
.exercise[
- Forward connections from local port 10000 to remote port 6379:
```bash

View File

@@ -198,7 +198,7 @@ Some examples ...
(the Node "echo" app, the Flask app, and one ngrok tunnel for each of them)
.lab[
.exercise[
- Go to the webhook directory:
```bash
@@ -244,7 +244,7 @@ class: extra-details
- We need to update the configuration with the correct `url`
.lab[
.exercise[
- Edit the webhook configuration manifest:
```bash
@@ -271,7 +271,7 @@ class: extra-details
(so if the webhook server is down, we can still create pods)
.lab[
.exercise[
- Register the webhook:
```bash
@@ -288,7 +288,7 @@ It is strongly recommended to tail the logs of the API server while doing that.
- Let's create a pod and try to set a `color` label
.lab[
.exercise[
- Create a pod named `chroma`:
```bash
@@ -328,7 +328,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
## Update the webhook configuration
.lab[
.exercise[
- First, check the ngrok URL of the tunnel for the Flask app:
```bash
@@ -395,7 +395,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
## Let's get to work!
.lab[
.exercise[
- Make sure we're in the right directory:
```bash
@@ -424,7 +424,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
... we'll store it in a ConfigMap, and install dependencies on the fly
.lab[
.exercise[
- Load the webhook source in a ConfigMap:
```bash
@@ -446,7 +446,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
(of course, there are plenty of other options; e.g. `cfssl`)
.lab[
.exercise[
- Generate a self-signed certificate:
```bash
@@ -470,7 +470,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
- Let's reconfigure the webhook to use our Service instead of ngrok
.lab[
.exercise[
- Edit the webhook configuration manifest:
```bash
@@ -504,7 +504,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
Shell to the rescue!
.lab[
.exercise[
- Load up our cert and encode it in base64:
```bash

View File

@@ -66,7 +66,7 @@
- We'll ask `kubectl` to show us the exact requests that it's making
.lab[
.exercise[
- Check the URI for a cluster-scope, "core" resource, e.g. a Node:
```bash
@@ -122,7 +122,7 @@ class: extra-details
- What about namespaced resources?
.lab[
.exercise[
- Check the URI for a namespaced, "core" resource, e.g. a Service:
```bash
@@ -169,7 +169,7 @@ class: extra-details
## Accessing a subresource
.lab[
.exercise[
- List `kube-proxy` pods:
```bash
@@ -200,7 +200,7 @@ command=echo&command=hello&command=world&container=kube-proxy&stderr=true&stdout
- There are at least three useful commands to introspect the API server
.lab[
.exercise[
- List resource types, their group, kind, short names, and scope:
```bash
@@ -249,7 +249,7 @@ command=echo&command=hello&command=world&container=kube-proxy&stderr=true&stdout
The following assumes that `metrics-server` is deployed on your cluster.
.lab[
.exercise[
- Check that the metrics.k8s.io is registered with `metrics-server`:
```bash
@@ -271,7 +271,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
- We can have multiple resources with the same name
.lab[
.exercise[
- Look for resources named `node`:
```bash
@@ -298,7 +298,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
- But we can look at the raw data (with `-o json` or `-o yaml`)
.lab[
.exercise[
- Look at NodeMetrics objects with one of these commands:
```bash
@@ -320,7 +320,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
--
.lab[
.exercise[
- Display node metrics:
```bash
@@ -342,7 +342,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
- Then we can register that server by creating an APIService resource
.lab[
.exercise[
- Check the definition used for the `metrics-server`:
```bash

View File

@@ -103,7 +103,7 @@ class: extra-details
---
## `WithWaitGroup`
## `WithWaitGroup`,
- When we shut down, tells clients (with in-flight requests) to retry

View File

@@ -20,67 +20,25 @@ The control plane can run:
- in containers, on the same nodes that run other application workloads
(default behavior for local clusters like [Minikube](https://github.com/kubernetes/minikube), [kind](https://kind.sigs.k8s.io/)...)
(example: [Minikube](https://github.com/kubernetes/minikube); 1 node runs everything, [kind](https://kind.sigs.k8s.io/))
- on a dedicated node
(default behavior when deploying with kubeadm)
(example: a cluster installed with kubeadm)
- on a dedicated set of nodes
([Kubernetes The Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way); [kops](https://github.com/kubernetes/kops); also kubeadm)
(example: [Kubernetes The Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way); [kops](https://github.com/kubernetes/kops))
- outside of the cluster
(most managed clusters like AKS, DOK, EKS, GKE, Kapsule, LKE, OKE...)
(example: most managed clusters like AKS, EKS, GKE)
---
class: pic
![](images/control-planes/single-node-dev.svg)
---
class: pic
![](images/control-planes/managed-kubernetes.svg)
---
class: pic
![](images/control-planes/single-control-and-workers.svg)
---
class: pic
![](images/control-planes/stacked-control-plane.svg)
---
class: pic
![](images/control-planes/non-dedicated-stacked-nodes.svg)
---
class: pic
![](images/control-planes/advanced-control-plane.svg)
---
class: pic
![](images/control-planes/advanced-control-plane-split-events.svg)
---
class: pic
![Kubernetes architecture diagram: communication between components](images/k8s-arch4-thanks-luxas.png)
![Kubernetes architecture diagram: control plane and nodes](images/k8s-arch2.png)
---
@@ -157,6 +115,12 @@ The kubelet agent uses a number of special-purpose protocols and interfaces, inc
---
class: pic
![Kubernetes architecture diagram: communication between components](images/k8s-arch4-thanks-luxas.png)
---
# The Kubernetes API
[
@@ -203,9 +167,9 @@ What does that mean?
## Let's experiment a bit!
- For this section, connect to the first node of the `test` cluster
- For the exercises in this section, connect to the first node of the `test` cluster
.lab[
.exercise[
- SSH to the first node of the test cluster
@@ -224,7 +188,7 @@ What does that mean?
- Let's create a simple object
.lab[
.exercise[
- Create a namespace with the following command:
```bash
@@ -246,7 +210,7 @@ This is equivalent to `kubectl create namespace hello`.
- Let's retrieve the object we just created
.lab[
.exercise[
- Read back our object:
```bash
@@ -354,7 +318,7 @@ class: extra-details
- The easiest way is to use `kubectl label`
.lab[
.exercise[
- In one terminal, watch namespaces:
```bash
@@ -402,7 +366,7 @@ class: extra-details
- DELETED resources
.lab[
.exercise[
- In one terminal, watch pods, displaying full events:
```bash

View File

@@ -361,7 +361,7 @@ class: extra-details
## Listing service accounts
.lab[
.exercise[
- The resource name is `serviceaccount` or `sa` for short:
```bash
@@ -378,7 +378,7 @@ class: extra-details
## Finding the secret
.lab[
.exercise[
- List the secrets for the `default` service account:
```bash
@@ -398,7 +398,7 @@ class: extra-details
- The token is stored in the secret, wrapped with base64 encoding
.lab[
.exercise[
- View the secret:
```bash
@@ -421,7 +421,7 @@ class: extra-details
- Let's send a request to the API, without and with the token
.lab[
.exercise[
- Find the ClusterIP for the `kubernetes` service:
```bash
@@ -495,49 +495,6 @@ class: extra-details
---
class: extra-details
## Listing all possible verbs
- The Kubernetes API is self-documented
- We can ask it which resources, subresources, and verbs exist
- One way to do this is to use:
- `kubectl get --raw /api/v1` (for core resources with `apiVersion: v1`)
- `kubectl get --raw /apis/<group>/<version>` (for other resources)
- The JSON response can be formatted with e.g. `jq` for readability
---
class: extra-details
## Examples
- List all verbs across all `v1` resources
```bash
kubectl get --raw /api/v1 | jq -r .resources[].verbs[] | sort -u
```
- List all resources and subresources in `apps/v1`
```bash
kubectl get --raw /apis/apps/v1 | jq -r .resources[].name
```
- List which verbs are available on which resources in `networking.k8s.io`
```bash
kubectl get --raw /apis/networking.k8s.io/v1 | \
jq -r '.resources[] | .name + ": " + (.verbs | join(", "))'
```
---
## From rules to roles to rolebindings
- A *role* is an API object containing a list of *rules*
@@ -616,7 +573,7 @@ class: extra-details
- Nixery automatically generates images with the requested packages
.lab[
.exercise[
- Run our pod:
```bash
@@ -632,7 +589,7 @@ class: extra-details
- Normally, at this point, we don't have any API permission
.lab[
.exercise[
- Check our permissions with `kubectl`:
```bash
@@ -658,7 +615,7 @@ class: extra-details
(but again, we could call it `view` or whatever we like)
.lab[
.exercise[
- Create the new role binding:
```bash
@@ -716,7 +673,7 @@ It's important to note a couple of details in these flags...
- We should be able to *view* things, but not to *edit* them
.lab[
.exercise[
- Check our permissions with `kubectl`:
```bash
@@ -971,18 +928,6 @@ class: extra-details
kubectl describe clusterrole cluster-admin
```
---
## `list` vs. `get`
⚠️ `list` grants read permissions to resources!
- It's not possible to give permission to list resources without also reading them
- This has implications for e.g. Secrets
(if a controller needs to be able to enumerate Secrets, it will be able to read them)
???
:EN:- Authentication and authorization in Kubernetes

View File

@@ -93,7 +93,7 @@
- We can use the `--dry-run=client` option
.lab[
.exercise[
- Generate the YAML for a Deployment without creating it:
```bash
@@ -128,7 +128,7 @@ class: extra-details
## The limits of `kubectl apply --dry-run=client`
.lab[
.exercise[
- Generate the YAML for a deployment:
```bash
@@ -161,7 +161,7 @@ class: extra-details
(all validation and mutation hooks will be executed)
.lab[
.exercise[
- Try the same YAML file as earlier, with server-side dry run:
```bash
@@ -200,7 +200,7 @@ class: extra-details
- `kubectl diff` does a server-side dry run, *and* shows differences
.lab[
.exercise[
- Try `kubectl diff` on the YAML that we tweaked earlier:
```bash

View File

@@ -1,693 +0,0 @@
# Amazon EKS
- Elastic Kubernetes Service
- AWS runs the Kubernetes control plane
(all we see is an API server endpoint)
- Pods can run on any combination of:
- EKS-managed nodes
- self-managed nodes
- Fargate
- Leverages and integrates with AWS services and APIs
---
## Some integrations
- Authenticate with IAM users and roles
- Associate IAM roles to Kubernetes ServiceAccounts
- Load balance traffic with ALB/ELB/NLB
- Persist data with EBS/EFS
- Label nodes with instance ID, instance type, region, AZ ...
- Pods can be "first class citizens" of VPC
---
## Pros/cons
- Fully managed control plane
- Handles deployment, upgrade, scaling of the control plane
- Available versions and features tend to lag a bit
- Doesn't fit the most demanding users
("demanding" starts somewhere between 100 and 1000 nodes)
---
## Good to know ...
- Some integrations are specific to EKS
(some authentication models)
- Many integrations are *not* specific to EKS
- The Cloud Controller Manager can run outside of EKS
(and provide LoadBalancer services, EBS volumes, and more)
---
# Provisioning clusters
- AWS console, API, CLI
- `eksctl`
- Infrastructure-as-Code
---
## AWS "native" provisioning
- AWS web console
- click-click-click!
- difficulty: low
- AWS API or CLI
- must provide subnets, ARNs
- difficulty: medium
---
## `eksctl`
- Originally developed by Weave
(back when AWS "native" provisioning wasn't very good)
- `eksctl create cluster` just works™
- Has been "adopted" by AWS
(it is listed in the official documentation)
---
## Infrastructure-as-Code
- Cloud Formation
- Terraform
[terraform-aws-eks](https://github.com/terraform-aws-modules/terraform-aws-eks)
by the community
([example](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic))
[terraform-provider-aws](https://github.com/hashicorp/terraform-provider-aws)
by Hashicorp
([example](https://github.com/hashicorp/terraform-provider-aws/tree/main/examples/eks-getting-started))
[Kubestack](https://www.kubestack.com/)
---
## Node groups
- Virtually all provisioning models have a concept of "node group"
- Node group = group of similar nodes in an ASG
- can span multiple AZ
- can have instances of different types¹
- A cluster will need at least one node group
.footnote[¹As I understand it, to specify fallbacks if one instance type is unavailable or out of capacity.]
---
# IAM → EKS authentication
- Access EKS clusters using IAM users and roles
- No special role, permission, or policy is needed in IAM
(but the `eks:DescribeCluster` permission can be useful, see later)
- Users and roles need to be explicitly listed in the cluster
- Configuration is done through a ConfigMap in the cluster
---
## Setting it up
- Nothing to do when creating the cluster
(feature is always enabled)
- Users and roles are *mapped* to Kubernetes users and groups
(through the `aws-auth` ConfigMap in `kube-system`)
- That's it!
---
## Mapping
- The `aws-auth` ConfigMap can contain two entries:
- `mapRoles` (map IAM roles)
- `mapUsers` (map IAM users)
- Each entry is a YAML file
- Each entry includes:
- `rolearn` or `userarn` to map
- `username` (as a string)
- `groups` (as a list; can be empty)
---
## Example
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: kube-system
name: aws-auth
data:
mapRoles: `|`
- rolearn: arn:aws:iam::111122223333:role/blah
username: blah
groups: [ devs, ops ]
mapUsers: `|`
- userarn: arn:aws:iam::111122223333:user/alice
username: alice
groups: [ system:masters ]
- userarn: arn:aws:iam::111122223333:user/bob
username: bob
groups: [ system:masters ]
```
---
## Client setup
- We need either the `aws` CLI or the `aws-iam-authenticator`
- We use them as `exec` plugins in `~/.kube/config`
- Done automatically by `eksctl`
- Or manually with `aws eks update-kubeconfig`
- Discovering the address of the API server requires one IAM permission
```json
"Action": [
"eks:DescribeCluster"
],
"Resource": "arn:aws:eks:<region>:<account>:cluster/<cluster-name>"
```
(wildcards can be used when specifying the resource)
---
class: extra-details
## How it works
- The helper generates a token
(with `aws eks get-token` or `aws-iam-authenticator token`)
- Note: these calls will always succeed!
(even if AWS API keys are invalid)
- The token is used to authenticate with the Kubernetes API
- AWS' Kubernetes API server will decode and validate the token
(and map the underlying user or role accordingly)
---
## Read The Fine Manual
https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
---
# EKS → IAM authentication
- Access AWS services from workloads running on EKS
(e.g.: access S3 bucket from code running in a Pod)
- This works by associating an IAM role to a K8S ServiceAccount
- There are also a few specific roles used internally by EKS
(e.g. to let the nodes establish network configurations)
- ... We won't talk about these
---
## The big picture
- One-time setup task
([create an OIDC provider associated to our EKS cluster](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html))
- Create (or update) a role with an appropriate *trust policy*
(more on that later)
- Annotate service accounts to map them to that role
`eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/some-iam-role`
- Create (or re-create) pods using that ServiceAccount
- The pods can now use that role!
---
## Trust policies
- IAM roles have a *trust policy* (aka *assume role policy*)
(cf `aws iam create-role ... --assume-role-policy-document ...`)
- That policy contains a *statement* list
- This list indicates who/what is allowed to assume (use) the role
- In the current scenario, that policy will contain something saying:
*ServiceAccount S on EKS cluster C is allowed to use this role*
---
## Trust policy for a single ServiceAccount
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"${OIDC_PROVIDER}:sub":
"system:serviceaccount:<namespace>:<service-account>"
}
}
}
]
}
```
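The `${AWS_ACCOUNT_ID}` and `${OIDC_PROVIDER}` placeholders above can be filled in by the shell. The following is a minimal sketch, not an official procedure: the account ID, OIDC provider, namespace, and ServiceAccount name are hypothetical values chosen for illustration, and `trust_policy` is just a helper name made up for this example.

```bash
# Hypothetical values for illustration; replace them with your own.
AWS_ACCOUNT_ID=111122223333
OIDC_PROVIDER=oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0123456789
NAMESPACE=default
SERVICE_ACCOUNT=my-app

# Print a trust policy for a single ServiceAccount.
# The heredoc delimiter is unquoted, so the shell expands the variables.
trust_policy() {
cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
        }
      }
    }
  ]
}
EOF
}

trust_policy
```

The output can be redirected to a file and passed to the `--assume-role-policy-document` flag of `aws iam create-role` mentioned earlier.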
---
## Trust policy for multiple ServiceAccounts
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringLike": {
"${OIDC_PROVIDER}:sub":
["system:serviceaccount:container-training:*"]
}
}
}
]
}
```
---
## The little details
- When pods are created, they are processed by a mutating webhook
(typically named `pod-identity-webhook`)
- Pods using a ServiceAccount with the right annotation get:
- an extra token
<br/>
(mounted in `/var/run/secrets/eks.amazonaws.com/serviceaccount/token`)
- a few env vars
<br/>
(including `AWS_WEB_IDENTITY_TOKEN_FILE` and `AWS_ROLE_ARN`)
- AWS client libraries and tooling will work with that
(see [this list](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html) for supported versions)
---
# CNI
- EKS is a compliant Kubernetes implementation
(which means we can use a wide range of CNI plugins)
- However, the recommended CNI plugin is the "AWS VPC CNI"
(https://github.com/aws/amazon-vpc-cni-k8s)
- Pods are then "first class citizens" of AWS VPC
---
## AWS VPC CNI
- Each Pod gets an address in a VPC subnet
- No overlay network, no encapsulation, no overhead
(other than AWS network fabric, obviously)
- Probably the fastest network option when running on AWS
- Allows "direct" load balancing (more on that later)
- Can use security groups with Pod traffic
- But: limits the number of Pods per Node
- But: more complex configuration (more on that later)
---
## Number of Pods per Node
- Each Pod gets an IP address on an ENI
(Elastic Network Interface)
- EC2 instances can only have a limited number of ENIs
(the exact limit depends on the instance type)
- ENIs can only have a limited number of IP addresses
(with variations here as well)
- This gives limits of e.g. 35 pods on `t3.large`, 29 on `c5.large` ...
(see
[full list of limits per instance type](https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt
)
and
[ENI/IP details](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/pkg/awsutils/vpc_ip_resource_limit.go
))
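The figures above are consistent with the commonly cited formula `ENIs × (addresses per ENI - 1) + 2`. Here is a small sketch of that arithmetic; the `max_pods` helper is a name made up for this example, and the ENI and address counts passed to it are assumptions that should be checked against the linked lists.

```bash
# Sketch: estimate the Pod limit of an instance type from its ENI limits.
# Usage: max_pods <number of ENIs> <IPv4 addresses per ENI>
max_pods() {
  local enis=$1 ips_per_eni=$2
  # The first address of each ENI is not available to Pods;
  # the extra 2 is usually explained by Pods using the host network.
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

max_pods 3 12   # assumed limits for t3.large (3 ENIs, 12 addresses each)
max_pods 3 10   # assumed limits for c5.large (3 ENIs, 10 addresses each)
```

With these inputs, the helper prints 35 and 29, matching the limits quoted above.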
---
## Limits?
- These limits might seem low
- They're not *that* low if you compute e.g. the RAM/Pod ratio
- Except if you're running lots of tiny pods
- Bottom line: do the math!
---
class: extra-details
## Pre-loading
- It can take a little while to allocate/attach an ENI
- The AWS VPC CNI can keep a few extra addresses on each Node
(by default, one ENI worth of IP addresses)
- This is tunable if needed
(see [the docs](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/eni-and-ip-target.md
) for details)
---
## Better load balancing
- The default path for inbound traffic is:
Load balancer → NodePort → Pod
- With the AWS VPC CNI, it becomes possible to do:
Load balancer → Pod
- More on that in the load balancing section!
---
## Configuration complexity
- The AWS VPC CNI is a very good solution when running EKS
- It brings optimized solutions to various use-cases:
- direct load balancing
- user authentication
- interconnection with other infrastructure
- etc.
- Keep in mind that all these solutions are AWS-specific
- They can require a non-trivial amount of specific configuration
- Especially when moving from a simple POC to an IAC deployment!
---
# Load Balancers
- Here be dragons!
- Multiple options, each with different pros/cons
- It's necessary to know both AWS products and K8S concepts
---
## AWS load balancers
- CLB / Classic Load Balancer (formerly known as ELB)
- can work in L4 (TCP) or L7 (HTTP) mode
- can do TLS unrolling
- can't do websockets, HTTP/2, content-based routing ...
- NLB / Network Load Balancer
- high-performance L4 load balancer with TLS support
- ALB / Application Load Balancer
- HTTP load balancer
- can do TLS unrolling
- can do websockets, HTTP/2, content-based routing ...
---
## Load balancing modes
- "IP targets"
- send traffic directly from LB to Pods
- Pods must use the AWS VPC CNI
- compatible with Fargate Pods
- "Instance targets"
- send traffic to a NodePort (generally incurs an extra hop)
- Pods can use any CNI
- not compatible with Fargate Pods
- Each LB (Service) can use a different mode, if necessary
---
## Kubernetes load balancers
- Service (L4)
- ClusterIP: internal load balancing
- NodePort: external load balancing on ports in the 30000-32767 range (by default)
- LoadBalancer: external load balancing on the port you want
- ExternalIP: external load balancing directly on nodes
- Ingress (L7 HTTP)
- partial content-based routing (`Host` header, request path)
- requires an Ingress Controller (in front)
- works with Services (in back)
---
## Two controllers are available
- Kubernetes "in-tree" load balancer controller
- always available
- used by default for LoadBalancer Services
- creates CLB by default; can also do NLB
- can only do "instance targets"
- can use extra CLB features (TLS, HTTP)
- AWS Load Balancer Controller (fka AWS ALB Ingress Controller)
- optional add-on (requires additional config)
- primarily meant to be an Ingress Controller
- creates NLB and ALB
- can do "instance targets" and "IP targets"
- can also be used for LoadBalancer Services with type `nlb-ip`
- They can run side by side
---
## Which one should we use?
- AWS Load Balancer Controller supports "IP targets"
(which means direct routing of traffic to Pods)
- It can be used as an Ingress controller
- It *seems* to be the perfect solution for EKS!
- However ...
---
## Caveats
- AWS Load Balancer Controller requires extensive configuration
- a few hours to a few days to get it to work in a POC ...
- a few days to a few weeks to industrialize that process?
- It's AWS-specific
- It still introduces an extra hop, even if that hop is invisible
- Other ingress controllers can have interesting features
(canary deployment, A/B testing ...)
---
## Noteworthy annotations and docs
- `service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip`
- LoadBalancer Service with "IP targets" ([docs](https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/service/nlb_ip_mode/))
- requires AWS Load Balancer Controller
- `service.beta.kubernetes.io/aws-load-balancer-internal: "true"`
- internal load balancer (for private VPC)
- `service.beta.kubernetes.io/aws-load-balancer-type: nlb`
- opt for NLB instead of CLB with in-tree controller
- `service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"`
- use HAProxy [PROXY protocol](https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt)
---
## TLS-related annotations
- `service.beta.kubernetes.io/aws-load-balancer-ssl-cert`
- enable TLS and use that certificate
- example value: `arn:aws:acm:<region>:<account>:certificate/<cert-id>`
- `service.beta.kubernetes.io/aws-load-balancer-ssl-ports`
- enable TLS *only* on the specified ports (when multiple ports are exposed)
- example value: `"443,8443"`
- `service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy`
- specify ciphers and other TLS parameters to use (see [that list](https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-security-policy-table.html))
- example value: `"ELBSecurityPolicy-TLS-1-2-2017-01"`
---
## To HTTP(S) or not to HTTP(S)
- `service.beta.kubernetes.io/aws-load-balancer-backend-protocol`
- can be either `http`, `https`, `ssl`, or `tcp`
- if `https` or `ssl`: enable TLS to the backend
- if `http` or `https`: enable HTTP `x-forwarded-for` headers
???
## Cluster autoscaling
## Logging
https://docs.aws.amazon.com/eks/latest/userguide/logging-using-cloudtrail.html
:EN:- Working with EKS
:EN:- Cluster and user provisioning
:EN:- Networking and load balancing
:FR:- Travailler avec EKS
:FR:- Outils de déploiement
:FR:- Intégration avec IAM
:FR:- Fonctionalités réseau

View File

@@ -30,7 +30,7 @@
- or we hit the *backoff limit* of the Job (default=6)
.lab[
.exercise[
- Create a Job that has a 50% chance of success:
```bash
@@ -49,7 +49,7 @@
- If the Pod fails, the Job creates another Pod
.lab[
.exercise[
- Check the status of the Pod(s) created by the Job:
```bash
@@ -108,7 +108,7 @@ class: extra-details
(The Cron Job will not hold if a previous job is still running)
.lab[
.exercise[
- Create the Cron Job:
```bash
@@ -135,7 +135,7 @@ class: extra-details
(re-creating another one if it fails, for instance if its node fails)
.lab[
.exercise[
- Check the Jobs that are created:
```bash

View File

@@ -98,7 +98,7 @@
- Let's list our bootstrap tokens on a cluster created with kubeadm
.lab[
.exercise[
- Log into node `test1`
@@ -145,7 +145,7 @@ class: extra-details
- The token we need to use has the form `abcdef.1234567890abcdef`
.lab[
.exercise[
- Check that it is accepted by the API server:
```bash
@@ -177,7 +177,7 @@ class: extra-details
- That information is stored in a public ConfigMap
.lab[
.exercise[
- Retrieve that ConfigMap:
```bash

View File

@@ -88,7 +88,7 @@ spec:
- Let's try this out!
.lab[
.exercise[
- Check the port used by our self-hosted registry:
```bash

View File

@@ -40,7 +40,7 @@
- Let's build the image for the DockerCoins `worker` service with Kaniko
.lab[
.exercise[
- Find the port number for our self-hosted registry:
```bash
@@ -160,7 +160,7 @@ spec:
- The YAML for the pod is in `k8s/kaniko-build.yaml`
.lab[
.exercise[
- Create the pod:
```bash

View File

@@ -37,7 +37,7 @@ so that your build pipeline is automated.*
- We will deploy a registry container, and expose it with a NodePort
.lab[
.exercise[
- Create the registry service:
```bash
@@ -57,7 +57,7 @@ so that your build pipeline is automated.*
- We need to find out which port has been allocated
.lab[
.exercise[
- View the service details:
```bash
@@ -78,7 +78,7 @@ so that your build pipeline is automated.*
- A convenient Docker registry API route to remember is `/v2/_catalog`
.lab[
.exercise[
<!-- ```hide kubectl wait deploy/registry --for condition=available```-->
@@ -102,7 +102,7 @@ We should see:
- We can retag a small image, and push it to the registry
.lab[
.exercise[
- Make sure we have the busybox image, and retag it:
```bash
@@ -123,7 +123,7 @@ We should see:
- Let's use the same endpoint as before
.lab[
.exercise[
- Ensure that our busybox image is now in the local registry:
```bash
@@ -143,7 +143,7 @@ The curl command should now output:
- We are going to use a convenient feature of Docker Compose
.lab[
.exercise[
- Go to the `stacks` directory:
```bash
@@ -217,7 +217,7 @@ class: extra-details
- All our images should now be in the registry
.lab[
.exercise[
- Re-run the same `curl` command as earlier:
```bash
@@ -232,4 +232,4 @@ variable, so that we can quickly switch from
the self-hosted registry to pre-built images
hosted on the Docker Hub. So make sure that
this $REGISTRY variable is set correctly when
running these commands!*
running the exercises!*

View File

@@ -56,7 +56,7 @@
- It can be installed with a YAML manifest, or with Helm
.lab[
.exercise[
- Let's install the cert-manager Helm chart with this one-liner:
```bash
@@ -86,7 +86,7 @@
- The manifest shown on the previous slide is in @@LINK[k8s/cm-clusterissuer.yaml]
.lab[
.exercise[
- Create the ClusterIssuer:
```bash
@@ -115,7 +115,7 @@
- The manifest shown on the previous slide is in @@LINK[k8s/cm-certificate.yaml]
.lab[
.exercise[
- Edit the Certificate to update the domain name
@@ -140,7 +140,7 @@
- then it waits for the challenge to complete
.lab[
.exercise[
- View the resources created by cert-manager:
```bash
@@ -158,7 +158,7 @@
`http://<our-domain>/.well-known/acme-challenge/<token>`
.lab[
.exercise[
- Check the *path* of the Ingress in particular:
```bash
@@ -176,7 +176,7 @@
An Ingress Controller! 😅
.lab[
.exercise[
- Install an Ingress Controller:
```bash

View File

@@ -1,445 +0,0 @@
# Cluster autoscaler
- When the cluster is full, we need to add more nodes
- This can be done manually:
- deploy new machines and add them to the cluster
- if using managed Kubernetes, use some API/CLI/UI
- Or automatically with the cluster autoscaler:
https://github.com/kubernetes/autoscaler
---
## Use-cases
- Batch job processing
"once in a while, we need to execute these 1000 jobs in parallel"
"...but the rest of the time there is almost nothing running on the cluster"
- Dynamic workload
"a few hours per day or a few days per week, we have a lot of traffic"
"...but the rest of the time, the load is much lower"
---
## Pay for what you use
- The point of the cloud is to "pay for what you use"
- If you have a fixed number of cloud instances running at all times:
*you're doing it wrong (except if your load is always the same)*
- If you're not using some kind of autoscaling, you're wasting money
(except if you like lining the pockets of your cloud provider)
---
## Running the cluster autoscaler
- We must run nodes on a supported infrastructure
- See [here] for a non-exhaustive list of supported providers
- Sometimes, the cluster autoscaler is installed automatically
(or by setting a flag / checking a box when creating the cluster)
- Sometimes, it requires additional work
(which is often non-trivial and highly provider-specific)
[here]: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider
---
## Scaling up in theory
IF a Pod is `Pending`,
AND adding a Node would allow this Pod to be scheduled,
THEN add a Node.
---
## Fine print 1
*IF a Pod is `Pending`...*
- First of all, the Pod must exist
- Pod creation might be blocked by e.g. a namespace quota
- In that case, the cluster autoscaler will never trigger
---
## Fine print 2
*IF a Pod is `Pending`...*
- If our Pods do not have resource requests:
*they will be in the `BestEffort` class*
- Generally, Pods in the `BestEffort` class are schedulable
- except if they have anti-affinity placement constraints
- except if all Nodes already run the max number of pods (110 by default)
- Therefore, if we want to leverage cluster autoscaling:
*our Pods should have resource requests*
---
## Fine print 3
*AND adding a Node would allow this Pod to be scheduled...*
- The autoscaler won't act if:
- the Pod is too big to fit on a single Node
- the Pod has impossible placement constraints
- Examples:
- "run one Pod per datacenter" with 4 pods and 3 datacenters
- "use this nodeSelector" but no such Node exists
---
## Trying it out
- We're going to check how much capacity is available on the cluster
- Then we will create a basic deployment
- We will add resource requests to that deployment
- Then scale the deployment to exceed the available capacity
- **The following commands require a working cluster autoscaler!**
---
## Checking available resources
.lab[
- Check how much CPU is allocatable on the cluster:
```bash
kubectl get nodes -o jsonpath={..allocatable.cpu}
```
]
- If we see e.g. `2800m 2800m 2800m`, that means:
3 nodes with 2.8 CPUs allocatable each
- To trigger autoscaling, we will create 7 pods requesting 1 CPU each
(each node can fit 2 such pods)
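The "how many pods fit" arithmetic can be sketched as follows. This is a rough estimate only: it assumes empty nodes (it ignores the requests of Pods that are already running), it handles values like `2800m` or `2` but not fractional values like `1.5`, and `pods_that_fit` is a helper name made up for this example.

```bash
# Sketch: how many Pods with a given CPU request fit on a set of nodes?
# Usage: pods_that_fit <request in millicores> <allocatable CPU per node>...
pods_that_fit() {
  local request=$1 total=0 cpu
  shift
  for cpu in "$@"; do
    case $cpu in
      *m) cpu=${cpu%m} ;;            # "2800m" -> 2800 millicores
      *)  cpu=$(( cpu * 1000 )) ;;   # "2"     -> 2000 millicores
    esac
    # Pods cannot be split across nodes, so round down per node.
    total=$(( total + cpu / request ))
  done
  echo "$total"
}

pods_that_fit 1000 2800m 2800m 2800m
```

With three nodes at `2800m` each, this prints 6, so the seventh Pod requesting 1 CPU would stay `Pending`. The node values can also come from the `kubectl get nodes -o jsonpath={..allocatable.cpu}` command shown above.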
---
## Creating our test Deployment
.lab[
- Create the Deployment:
```bash
kubectl create deployment blue --image=jpetazzo/color
```
- Add a request for 1 CPU:
```bash
kubectl patch deployment blue --patch='
spec:
template:
spec:
containers:
- name: color
resources:
requests:
cpu: 1
'
```
]
---
## Scaling up in practice
- This assumes that we have strictly less than 7 CPUs available
(adjust the numbers if necessary!)
.lab[
- Scale up the Deployment:
```bash
kubectl scale deployment blue --replicas=7
```
- Check that we have a new Pod, and that it's `Pending`:
```bash
kubectl get pods
```
]
---
## Cluster autoscaling
- After a few minutes, a new Node should appear
- When that Node becomes `Ready`, the Pod will be assigned to it
- The Pod will then be `Running`
- Reminder: the `AGE` of the Pod indicates when the Pod was *created*
(it doesn't indicate when the Pod was scheduled or started!)
- To see other state transitions, check the `status.conditions` of the Pod
---
## Scaling down in theory
IF a Node has less than 50% utilization for 10 minutes,
AND all its Pods can be scheduled on other Nodes,
AND all its Pods are *evictable*,
AND the Node doesn't have a "don't scale me down" annotation¹,
THEN drain the Node and shut it down.
.footnote[¹The annotation is: `cluster-autoscaler.kubernetes.io/scale-down-disabled=true`]
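The first condition can be sketched as a simple comparison. This is a simplification under stated assumptions: utilization is taken to be the sum of Pod requests divided by the allocatable amount for a single resource (CPU here, in millicores), the 10 minute delay and the other conditions are not modeled, and `below_threshold` is a helper name made up for this example.

```bash
# Sketch: is a node under the utilization threshold?
# Usage: below_threshold <sum of Pod requests> <node allocatable> [threshold in %]
below_threshold() {
  local requested=$1 allocatable=$2 threshold=${3:-50}
  # Integer math for: requested / allocatable < threshold / 100
  (( requested * 100 < allocatable * threshold ))
}

# Hypothetical node: 1000m requested out of 2800m allocatable.
if below_threshold 1000 2800; then
  echo "scale-down candidate"
else
  echo "keep node"
fi
```

A node passing this check would still need to satisfy the other conditions listed above before being drained.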
---
## When is a Pod "evictable"?
By default, Pods are evictable, except if any of the following is true.
- They have a restrictive Pod Disruption Budget
- They are "standalone" (not controlled by a ReplicaSet/Deployment, StatefulSet, Job...)
- They are in `kube-system` and don't have a Pod Disruption Budget
- They have local storage (that includes `EmptyDir`!)
This can be overridden by setting the annotation:
<br/>
`cluster-autoscaler.kubernetes.io/safe-to-evict`
<br/>(it can be set to `true` or `false`)
---
## Pod Disruption Budget
- Special resource to configure how many Pods can be *disrupted*
(i.e. shutdown/terminated)
- Applies to Pods matching a given selector
(typically matching the selector of a Deployment)
- Only applies to *voluntary disruption*
(e.g. cluster autoscaler draining a node, planned maintenance...)
- Can express `minAvailable` or `maxUnavailable`
- See [documentation] for details and examples
[documentation]: https://kubernetes.io/docs/tasks/run-application/configure-pdb/
---
## Local storage
- If our Pods use local storage, they will prevent scaling down
- If we have e.g. an `EmptyDir` volume for caching/sharing:
make sure to set the `.../safe-to-evict` annotation to `true`!
- Even if the volume...
- ...only has a PID file or UNIX socket
- ...is empty
- ...is not mounted by any container in the Pod!
---
## Expensive batch jobs
- Careful if we have long-running batch jobs!
(e.g. jobs that take many hours/days to complete)
- These jobs could get evicted before they complete
(especially if they use less than 50% of the allocatable resources)
- Make sure to set the `.../safe-to-evict` annotation to `false`!
---
## Node groups
- Easy scenario: all nodes have the same size
- Realistic scenario: we have nodes of different sizes
- e.g. mix of CPU and GPU nodes
- e.g. small nodes for control plane, big nodes for batch jobs
- e.g. leveraging spot capacity
- The cluster autoscaler can handle it!
---
class: extra-details
## Leveraging spot capacity
- AWS, Azure, and Google Cloud are typically more expensive than their competitors
- However, they offer *spot* capacity (spot instances, spot VMs...)
- *Spot* capacity:
- has a much lower cost (see e.g. AWS [spot instance advisor][awsspot])
- has a cost that varies continuously depending on regions, instance type...
- can be preempted at any time
- To be cost-effective, it is strongly recommended to leverage spot capacity
[awsspot]: https://aws.amazon.com/ec2/spot/instance-advisor/
---
## Node groups in practice
- The cluster autoscaler maps nodes to *node groups*
- this is an internal, provider-dependent mechanism
- the node group is sometimes visible through a proprietary label or annotation
- Each node group is scaled independently
- The cluster autoscaler uses [expanders] to decide which node group to scale up
(the default expander is "random", i.e. pick a node group at random!)
- Of course, only acceptable node groups will be considered
(i.e. node groups that could accommodate the `Pending` Pods)
[expanders]: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-expanders
---
class: extra-details
## Scaling to zero
- *In general,* a node group needs to have at least one node at all times
(the cluster autoscaler uses that node to figure out the size, labels, taints... of the group)
- *On some providers,* there are special ways to specify labels and/or taints
(but if you want to scale to zero, check that the provider supports it!)
---
## Warning
- Autoscaling up is easy
- Autoscaling down is harder
- It might get stuck because Pods are not evictable
- Do at least a dry run to make sure that the cluster scales down correctly!
- Have alerts on cloud spend
- *Especially when using big/expensive nodes (e.g. with GPU!)*
---
## Preferred vs. Required
- Some Kubernetes mechanisms let us express "soft preferences":
- affinity (`requiredDuringSchedulingIgnoredDuringExecution` vs `preferredDuringSchedulingIgnoredDuringExecution`)
- taints (`NoSchedule`/`NoExecute` vs `PreferNoSchedule`)
- Remember that these "soft preferences" can be ignored
(and given enough time and churn on the cluster, they will!)
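
To make the difference concrete, here is a sketch of a Pod that combines a hard requirement with a soft preference. The label `example.com/pool`, the Pod name, and the file name are hypothetical; `kubernetes.io/arch` is a standard node label.

```bash
# Write a Pod manifest with both kinds of node affinity.
# example.com/pool is a made-up label used only for illustration.
cat > affinity-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: the Pod stays Pending if no node matches
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values: ["amd64"]
      # Soft preference: the scheduler may ignore it
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50
        preference:
          matchExpressions:
          - key: example.com/pool
            operator: In
            values: ["spot"]
  containers:
  - name: main
    image: nginx
EOF

# kubectl apply -f affinity-demo.yaml
```

If a placement really matters (e.g. keeping expensive GPU nodes for GPU workloads), the required form is the one to rely on.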
---
## Troubleshooting
- The cluster autoscaler publishes its status in a ConfigMap
.lab[
- Check the cluster autoscaler status:
```bash
kubectl describe configmap --namespace kube-system cluster-autoscaler-status
```
]
- We can also check the logs of the autoscaler
(except on managed clusters where it's running internally, not visible to us)
---
## Acknowledgements
Special thanks to [@s0ulshake] for their help with this section!
If you need help to run your data science workloads on Kubernetes,
<br/>they're available for consulting.
(Get in touch with them through https://www.linkedin.com/in/ajbowen/)
[@s0ulshake]: https://twitter.com/s0ulshake

View File

@@ -18,9 +18,9 @@
- It's easy to check the version for the API server
.lab[
.exercise[
- Log into node `oldversion1`
- Log into node `test1`
- Check the version of kubectl and of the API server:
```bash
@@ -39,7 +39,7 @@
- It's also easy to check the version of kubelet
.lab[
.exercise[
- Check node versions (includes kubelet, kernel, container engine):
```bash
@@ -60,7 +60,7 @@
- If the control plane is self-hosted (running in pods), we can check it
.lab[
.exercise[
- Show image versions for all pods in `kube-system` namespace:
```bash
@@ -81,7 +81,7 @@
## What version are we running anyway?
- When I say, "I'm running Kubernetes 1.18", is that the version of:
- When I say, "I'm running Kubernetes 1.15", is that the version of:
- kubectl
@@ -157,15 +157,15 @@
## Kubernetes uses semantic versioning
- Kubernetes versions look like MAJOR.MINOR.PATCH; e.g. in 1.18.20:
- Kubernetes versions look like MAJOR.MINOR.PATCH; e.g. in 1.17.2:
- MAJOR = 1
- MINOR = 18
- PATCH = 20
- MINOR = 17
- PATCH = 2
- It's always possible to mix and match different PATCH releases
(e.g. 1.18.20 and 1.18.15 are compatible)
(e.g. 1.16.1 and 1.16.6 are compatible)
- It is recommended to run the latest PATCH release
@@ -181,9 +181,9 @@
- All components support a difference of one¹ MINOR version
- This allows live upgrades (since we can mix e.g. 1.18 and 1.19)
- This allows live upgrades (since we can mix e.g. 1.15 and 1.16)
- It also means that going from 1.18 to 1.20 requires going through 1.19
- It also means that going from 1.14 to 1.16 requires going through 1.15
.footnote[¹Except kubelet, which can be up to two MINOR behind API server,
and kubectl, which can be one MINOR ahead or behind API server.]
@@ -214,7 +214,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
- We will change the version of the API server
- We will work with cluster `oldversion` (nodes `oldversion1`, `oldversion2`, `oldversion3`)
- We will work with cluster `test` (nodes `test1`, `test2`, `test3`)
---
@@ -240,9 +240,9 @@ and kubectl, which can be one MINOR ahead or behind API server.]
- We will edit the YAML file to use a different image version
.lab[
.exercise[
- Log into node `oldversion1`
- Log into node `test1`
- Check API server version:
```bash
@@ -254,7 +254,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
```
- Look for the `image:` line, and update it to e.g. `v1.19.0`
- Look for the `image:` line, and update it to e.g. `v1.16.0`
]
@@ -264,7 +264,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
- The API server will be briefly unavailable while kubelet restarts it
.lab[
.exercise[
- Check the API server version:
```bash
@@ -299,7 +299,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
(note: this is possible only because the cluster was installed with kubeadm)
.lab[
.exercise[
- Check what will be upgraded:
```bash
@@ -308,11 +308,11 @@ and kubectl, which can be one MINOR ahead or behind API server.]
]
Note 1: kubeadm thinks that our cluster is running 1.19.0.
Note 1: kubeadm thinks that our cluster is running 1.16.0.
<br/>It is confused by our manual upgrade of the API server!
Note 2: kubeadm itself is still version 1.18.20.
<br/>It doesn't know how to upgrade to 1.19.X.
Note 2: kubeadm itself is still version 1.15.9.
<br/>It doesn't know how to upgrade to 1.16.X.
---
@@ -320,7 +320,7 @@ Note 2: kubeadm itself is still version 1.18.20..
- First things first: we need to upgrade kubeadm
.lab[
.exercise[
- Upgrade kubeadm:
```
@@ -335,28 +335,28 @@ Note 2: kubeadm itself is still version 1.18.20..
]
Problem: kubeadm doesn't know how to handle
upgrades from version 1.18.
upgrades from version 1.15.
This is because we installed version 1.22 (or even later).
This is because we installed version 1.17 (or even later).
We need to install kubeadm version 1.19.X.
We need to install kubeadm version 1.16.X.
---
## Downgrading kubeadm
- We need to go back to version 1.19.X.
- We need to go back to version 1.16.X (e.g. 1.16.6)
.lab[
.exercise[
- View available versions for package `kubeadm`:
```bash
apt show kubeadm -a | grep ^Version | grep 1.19
apt show kubeadm -a | grep ^Version | grep 1.16
```
- Downgrade kubeadm:
```
sudo apt install kubeadm=1.19.8-00
sudo apt install kubeadm=1.16.6-00
```
- Check what kubeadm tells us:
@@ -366,7 +366,7 @@ We need to install kubeadm version 1.19.X.
]
kubeadm should now agree to upgrade to 1.19.8.
kubeadm should now agree to upgrade to 1.16.6.
---
@@ -378,11 +378,11 @@ kubeadm should now agree to upgrade to 1.19.8.
- Or we can try the upgrade anyway
.lab[
.exercise[
- Perform the upgrade:
```bash
sudo kubeadm upgrade apply v1.19.8
sudo kubeadm upgrade apply v1.16.6
```
]
@@ -395,9 +395,9 @@ kubeadm should now agree to upgrade to 1.19.8.
- We can therefore use `apt` or `apt-get`
.lab[
.exercise[
- Log into node `oldversion3`
- Log into node `test3`
- View available versions for package `kubelet`:
```bash
@@ -406,7 +406,7 @@ kubeadm should now agree to upgrade to 1.19.8.
- Upgrade kubelet:
```bash
sudo apt install kubelet=1.19.8-00
sudo apt install kubelet=1.16.6-00
```
]
@@ -415,9 +415,9 @@ kubeadm should now agree to upgrade to 1.19.8.
## Checking what we've done
.lab[
.exercise[
- Log into node `oldversion1`
- Log into node `test1`
- Check node versions:
```bash
@@ -458,15 +458,15 @@ kubeadm should now agree to upgrade to 1.19.8.
(after upgrading the control plane)
.lab[
.exercise[
- Download the configuration on each node, and upgrade kubelet:
```bash
for N in 1 2 3; do
ssh oldversion$N "
sudo apt install kubeadm=1.19.8-00 &&
ssh test$N "
sudo apt install kubeadm=1.16.6-00 &&
sudo kubeadm upgrade node &&
sudo apt install kubelet=1.19.8-00"
sudo apt install kubelet=1.16.6-00"
done
```
]
@@ -475,9 +475,9 @@ kubeadm should now agree to upgrade to 1.19.8.
## Checking what we've done
- All our nodes should now be updated to version 1.19.8
- All our nodes should now be updated to version 1.16.6
.lab[
.exercise[
- Check node versions:
```bash
@@ -492,13 +492,13 @@ class: extra-details
## Skipping versions
- This example worked because we went from 1.18 to 1.19
- This example worked because we went from 1.15 to 1.16
- If you are upgrading from e.g. 1.16, you will have to go through 1.17 first
- If you are upgrading from e.g. 1.14, you will have to go through 1.15 first
- This means upgrading kubeadm to 1.17.X, then using it to upgrade the cluster
- This means upgrading kubeadm to 1.15.X, then using it to upgrade the cluster
- Then upgrading kubeadm to 1.18.X, etc.
- Then upgrading kubeadm to 1.16.X, etc.
- **Make sure to read the release notes before upgrading!**

View File

@@ -204,7 +204,7 @@ class: extra-details
## Logging into the new cluster
.lab[
.exercise[
- Log into node `kuberouter1`
@@ -228,7 +228,7 @@ class: extra-details
- By default, kubelet gets the CNI configuration from `/etc/cni/net.d`
.lab[
.exercise[
- Check the content of `/etc/cni/net.d`
@@ -262,7 +262,7 @@ class: extra-details
(where `C` is our cluster number)
.lab[
.exercise[
- Edit the Compose file to set the Cluster CIDR:
```bash
@@ -298,7 +298,7 @@ class: extra-details
(where `A.B.C.D` is the public address of `kuberouter1`, running the control plane)
.lab[
.exercise[
- Edit the YAML file to set the API server address:
```bash
@@ -320,7 +320,7 @@ Note: the DaemonSet won't create any pods (yet) since there are no nodes (yet).
- This is similar to what we did for the `kubenet` cluster
.lab[
.exercise[
- Generate the kubeconfig file (replacing `X.X.X.X` with the address of `kuberouter1`):
```bash
@@ -338,7 +338,7 @@ Note: the DaemonSet won't create any pods (yet) since there are no nodes (yet).
- We need to copy that kubeconfig file to the other nodes
.lab[
.exercise[
- Copy `kubeconfig` to the other nodes:
```bash
@@ -359,7 +359,7 @@ Note: the DaemonSet won't create any pods (yet) since there are no nodes (yet).
- We need to pass `--network-plugin=cni`
.lab[
.exercise[
- Join the first node:
```bash
@@ -384,7 +384,7 @@ class: extra-details
(in `/etc/cni/net.d`)
.lab[
.exercise[
- Check the content of `/etc/cni/net.d`
@@ -400,7 +400,7 @@ class: extra-details
- Let's create a Deployment and expose it with a Service
.lab[
.exercise[
- Create a Deployment running a web server:
```bash
@@ -423,7 +423,7 @@ class: extra-details
## Checking that everything works
.lab[
.exercise[
- Get the ClusterIP address for the service:
```bash
@@ -449,7 +449,7 @@ class: extra-details
- What if we need to check that everything is working properly?
.lab[
.exercise[
- Check the IP addresses of our pods:
```bash
@@ -490,7 +490,7 @@ class: extra-details
## Trying `kubectl logs` / `kubectl exec`
.lab[
.exercise[
- Try to show the logs of a kube-router pod:
```bash

View File

@@ -344,94 +344,32 @@ We'll cover them just after!*
---
## Example: HAProxy configuration
## Passing a configuration file with a configmap
- We are going to deploy HAProxy, a popular load balancer
- We will start a load balancer powered by HAProxy
- It expects to find its configuration in a specific place:
- We will use the [official `haproxy` image](https://hub.docker.com/_/haproxy/)
`/usr/local/etc/haproxy/haproxy.cfg`
- It expects to find its configuration in `/usr/local/etc/haproxy/haproxy.cfg`
- We will create a ConfigMap holding the configuration file
- We will provide a simple HAProxy configuration, `k8s/haproxy.cfg`
- Then we will mount that ConfigMap in a Pod running HAProxy
- It listens on port 80, and load balances connections between IBM and Google
---
## Blue/green load balancing
## Creating the configmap
- In this example, we will deploy two versions of our app:
.exercise[
- the "blue" version in the `blue` namespace
- the "green" version in the `green` namespace
- In both namespaces, we will have a Deployment and a Service
(both named `color`)
- We want to load balance traffic between both namespaces
(we can't do that with a simple service selector: these don't cross namespaces)
---
## Deploying the app
- We're going to use the image `jpetazzo/color`
(it is a simple "HTTP echo" server showing which pod served the request)
- We can create each Namespace, Deployment, and Service by hand, or...
.lab[
- We can deploy the app with a YAML manifest:
- Go to the `k8s` directory in the repository:
```bash
kubectl apply -f ~/container.training/k8s/rainbow.yaml
cd ~/container.training/k8s
```
]
---
## Testing the app
- Reminder: Service `x` in Namespace `y` is available through:
`x.y`, `x.y.svc`, `x.y.svc.cluster.local`
- Since the `cluster.local` suffix can change, we'll use `x.y.svc`
.lab[
- Check that the app is up and running:
- Create a configmap named `haproxy` and holding the configuration file:
```bash
kubectl run --rm -it --restart=Never --image=nixery.dev/curl my-test-pod \
curl color.blue.svc
```
]
---
## Creating the HAProxy configuration
Here is the file that we will use, @@LINK[k8s/haproxy.cfg]:
```
@@INCLUDE[k8s/haproxy.cfg]
```
---
## Creating the ConfigMap
.lab[
- Create a ConfigMap named `haproxy` and holding the configuration file:
```bash
kubectl create configmap haproxy --from-file=$HOME/container.training/k8s/haproxy.cfg
kubectl create configmap haproxy --from-file=haproxy.cfg
```
- Check what our configmap looks like:
@@ -443,21 +381,37 @@ Here is the file that we will use, @@LINK[k8s/haproxy.cfg]:
---
## Using the ConfigMap
## Using the configmap
Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
We are going to use the following pod definition:
```yaml
@@INCLUDE[k8s/haproxy.yaml]
apiVersion: v1
kind: Pod
metadata:
  name: haproxy
spec:
  volumes:
  - name: config
    configMap:
      name: haproxy
  containers:
  - name: haproxy
    image: haproxy
    volumeMounts:
    - name: config
      mountPath: /usr/local/etc/haproxy/
```
---
## Creating the Pod
## Using the configmap
.lab[
- The resource definition from the previous slide is in `k8s/haproxy.yaml`
- Create the HAProxy Pod:
.exercise[
- Create the HAProxy pod:
```bash
kubectl apply -f ~/container.training/k8s/haproxy.yaml
```
@@ -476,21 +430,27 @@ Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
## Testing our load balancer
- If everything went well, we should see a perfect round robin
- The load balancer will send:
(one request to `blue`, one request to `green`, one request to `blue`, etc.)
- half of the connections to Google
.lab[
- the other half to IBM
- Send a few requests:
.exercise[
- Access the load balancer a few times:
```bash
for i in $(seq 10); do
curl $IP
done
curl $IP
curl $IP
```
]
We should see connections served by Google, and others served by IBM.
<br/>
(Each server sends us a redirect page. Look at the URL that they send us to!)
---
## Exposing configmaps with the downward API
@@ -509,7 +469,7 @@ Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
## Creating the configmap
.lab[
.exercise[
- Our configmap will have a single key, `http.addr`:
```bash
@@ -530,16 +490,29 @@ Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
We are going to use the following pod definition:
```yaml
@@INCLUDE[k8s/registry.yaml]
apiVersion: v1
kind: Pod
metadata:
  name: registry
spec:
  containers:
  - name: registry
    image: registry
    env:
    - name: REGISTRY_HTTP_ADDR
      valueFrom:
        configMapKeyRef:
          name: registry
          key: http.addr
```
---
## Using the configmap
- The resource definition from the previous slide is in @@LINK[k8s/registry.yaml]
- The resource definition from the previous slide is in `k8s/registry.yaml`
.lab[
.exercise[
- Create the registry pod:
```bash

Some files were not shown because too many files have changed in this diff