mirror of
https://github.com/jpetazzo/container.training.git
synced 2026-03-02 17:30:20 +00:00
Compare commits
1 Commits
2022-01-lu...2021-11-de
| Author | SHA1 | Date | |
|---|---|---|---|
| | 549c8f5eaf | | |
@@ -1,67 +1,14 @@
|
||||
# (1) Setting up a registry, and telling Tilt to use it.
|
||||
|
||||
# Tilt needs a registry to store images.
|
||||
|
||||
# The following manifest defines a Deployment to run a basic Docker registry,
|
||||
# and a NodePort Service to access it. Using a NodePort means that we don't
|
||||
# need to obtain a TLS certificate, because we will be accessing the registry
|
||||
# through localhost.
|
||||
k8s_yaml('../k8s/tilt-registry.yaml')
|
||||
|
||||
# Tell Tilt to use the registry that we just deployed instead of whatever
|
||||
# is defined in our Kubernetes resources. Tilt will patch image names to
|
||||
# use our registry.
|
||||
default_registry('localhost:30555')
|
||||
|
||||
# Create a port forward so that we can access the registry from our local
|
||||
# environment, too. Note that if you run Tilt directly from a Kubernetes node
|
||||
# (which is not typical, but might happen in some lab/training environments)
|
||||
# the following might cause an error because port 30555 is already taken.
|
||||
k8s_resource(workload='tilt-registry', port_forwards='30555:5000')
|
||||
|
||||
# (2) Telling Tilt how to build and run our app.
|
||||
|
||||
# The following two lines will use the kubectl-build plugin
|
||||
# to leverage buildkit and build the images in our Kubernetes
|
||||
# cluster. This is not enabled by default, because it requires
|
||||
# the plugin to be installed.
|
||||
# See https://github.com/vmware-tanzu/buildkit-cli-for-kubectl
|
||||
# for more information about this plugin.
|
||||
#load('ext://kubectl_build', 'kubectl_build')
|
||||
#docker_build = kubectl_build
|
||||
|
||||
# Our Kubernetes manifests use images 'dockercoins/...' so we tell Tilt
|
||||
# how each of these images should be built. The first argument is the name
|
||||
# of the image, the second argument is the directory containing the build
|
||||
# context (i.e. the Dockerfile to build the image).
|
||||
docker_build('dockercoins/hasher', 'hasher')
|
||||
docker_build('dockercoins/rng', 'rng')
|
||||
docker_build('dockercoins/webui', 'webui')
|
||||
docker_build('dockercoins/worker', 'worker')
|
||||
|
||||
# The following manifest defines five Deployments and four Services for
|
||||
# our application.
|
||||
k8s_yaml('../k8s/dockercoins.yaml')
|
||||
|
||||
# (3) Finishing touches.
|
||||
# Uncomment the following line to let tilt run with the default kubeadm cluster-admin context.
|
||||
#allow_k8s_contexts('kubernetes-admin@kubernetes')
|
||||
|
||||
# The following line lets Tilt run with the default kubeadm cluster-admin context.
|
||||
allow_k8s_contexts('kubernetes-admin@kubernetes')
|
||||
|
||||
# This will run an ngrok tunnel to expose Tilt to the outside world.
|
||||
# This is intended to be used when Tilt runs on a remote machine.
|
||||
local_resource(name='ngrok:tunnel', serve_cmd='ngrok http 10350')
|
||||
|
||||
# This will wait until the ngrok tunnel is up, and show its URL to the user.
|
||||
# We send the output to /dev/tty so that it doesn't get intercepted by
|
||||
# Tilt, and gets displayed to the user's terminal instead.
|
||||
# Note: this assumes that the ngrok instance will be running on port 4040.
|
||||
# If you have other ngrok instances running on the machine, this might not work.
|
||||
local_resource(name='ngrok:showurl', cmd='''
|
||||
while sleep 1; do
|
||||
TUNNELS=$(curl -fsSL http://localhost:4040/api/tunnels | jq -r .tunnels[].public_url)
|
||||
[ "$TUNNELS" ] && break
|
||||
done
|
||||
printf "\nYou should be able to connect to the Tilt UI with the following URL(s): %s\n" "$TUNNELS" >/dev/tty
|
||||
'''
|
||||
)
|
||||
# While we're here: if you're controlling a remote cluster, uncomment that line.
|
||||
# It will create a port forward so that you can access the remote registry.
|
||||
#k8s_resource(workload='registry', port_forwards='30555:5000')
|
||||
|
||||
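The `ngrok:showurl` resource above polls the local ngrok API once per second until at least one tunnel is reported. Below is a minimal Python sketch of the same loop, assuming the response shape implied by the `jq` filter (`.tunnels[].public_url`). The helper names `public_urls` and `wait_for_tunnels` are hypothetical, and the fetch function is injected so that the sketch can run without ngrok.

```python
import json
import time
from typing import Callable, List


def public_urls(payload: str) -> List[str]:
    """Extract every tunnel's public_url, like `jq -r .tunnels[].public_url`."""
    data = json.loads(payload)
    return [tunnel["public_url"] for tunnel in data.get("tunnels", [])]


def wait_for_tunnels(
    fetch: Callable[[], str],
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until at least one tunnel URL is available, then return the URLs."""
    while True:
        sleep(delay)  # same cadence as `while sleep 1; do ... done`
        try:
            urls = public_urls(fetch())
        except (OSError, ValueError):
            # API not reachable yet, or the answer is not valid JSON: retry.
            urls = []
        if urls:
            return urls
```

In practice, `fetch` could be something like `lambda: urllib.request.urlopen("http://localhost:4040/api/tunnels").read().decode()`. As with the shell version, this assumes that ngrok's API is listening on port 4040.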
@@ -1,6 +1,6 @@
|
||||
FROM node:4-slim
|
||||
RUN npm install express
|
||||
RUN npm install redis@3
|
||||
RUN npm install redis
|
||||
COPY files/ /files/
|
||||
COPY webui.js /
|
||||
CMD ["node", "webui.js"]
|
||||
|
||||
@@ -1,16 +0,0 @@
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1alpha1
    kind: PodSecurityConfiguration
    defaults:
      enforce: baseline
      audit: baseline
      warn: baseline
    exemptions:
      usernames:
      - cluster-admin
      namespaces:
      - kube-system
@@ -3,12 +3,6 @@
|
||||
# - no actual persistence
|
||||
# - scaling down to 1 will break the cluster
|
||||
# - pods may be colocated
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: Role
|
||||
metadata:
|
||||
@@ -34,6 +28,11 @@ subjects:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
name: consul
|
||||
@@ -62,7 +61,7 @@ spec:
|
||||
serviceAccountName: consul
|
||||
containers:
|
||||
- name: consul
|
||||
image: "consul:1.11"
|
||||
image: "consul:1.8"
|
||||
env:
|
||||
- name: NAMESPACE
|
||||
valueFrom:
|
||||
|
||||
@@ -2,12 +2,6 @@
|
||||
# There is still no actual persistence, but:
|
||||
# - podAntiaffinity prevents pod colocation
|
||||
# - cluster works when scaling down to 1 (thanks to lifecycle hook)
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: Role
|
||||
metadata:
|
||||
@@ -33,6 +27,11 @@ subjects:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
name: consul
|
||||
@@ -69,7 +68,7 @@ spec:
|
||||
terminationGracePeriodSeconds: 10
|
||||
containers:
|
||||
- name: consul
|
||||
image: "consul:1.11"
|
||||
image: "consul:1.8"
|
||||
env:
|
||||
- name: NAMESPACE
|
||||
valueFrom:
|
||||
|
||||
@@ -1,11 +1,5 @@
|
||||
# Even better Consul cluster.
|
||||
# That one uses a volumeClaimTemplate to achieve true persistence.
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: Role
|
||||
metadata:
|
||||
@@ -31,6 +25,11 @@ subjects:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
name: consul
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
name: consul
|
||||
@@ -76,7 +75,7 @@ spec:
|
||||
terminationGracePeriodSeconds: 10
|
||||
containers:
|
||||
- name: consul
|
||||
image: "consul:1.11"
|
||||
image: "consul:1.8"
|
||||
volumeMounts:
|
||||
- name: data
|
||||
mountPath: /consul/data
|
||||
|
||||
@@ -1,16 +1,18 @@
|
||||
global
|
||||
daemon
|
||||
maxconn 256
|
||||
|
||||
defaults
|
||||
mode tcp
|
||||
timeout connect 5s
|
||||
timeout client 50s
|
||||
timeout server 50s
|
||||
timeout connect 5000ms
|
||||
timeout client 50000ms
|
||||
timeout server 50000ms
|
||||
|
||||
listen very-basic-load-balancer
|
||||
frontend the-frontend
|
||||
bind *:80
|
||||
server blue color.blue.svc:80
|
||||
server green color.green.svc:80
|
||||
default_backend the-backend
|
||||
|
||||
backend the-backend
|
||||
server google.com-80 google.com:80 maxconn 32 check
|
||||
server ibm.fr-80 ibm.fr:80 maxconn 32 check
|
||||
|
||||
# Note: the services above must exist,
|
||||
# otherwise HAproxy won't start.
|
||||
|
||||
@@ -1,28 +0,0 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: ingress-domain-name
spec:
  rules:
  - name: create-ingress
    match:
      resources:
        kinds:
        - Service
    generate:
      kind: Ingress
      name: "{{request.object.metadata.name}}"
      namespace: "{{request.object.metadata.namespace}}"
      data:
        spec:
          rules:
          - host: "{{request.object.metadata.name}}.{{request.object.metadata.namespace}}.A.B.C.D.nip.io"
            http:
              paths:
              - backend:
                  service:
                    name: "{{request.object.metadata.name}}"
                    port:
                      number: 80
                path: /
                pathType: Prefix
@@ -1,32 +0,0 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: ingress-domain-name
spec:
  rules:
  - name: create-ingress
    match:
      resources:
        kinds:
        - Service
    preconditions:
    - key: "{{request.object.spec.ports[0].name}}"
      operator: Equals
      value: http
    generate:
      kind: Ingress
      name: "{{request.object.metadata.name}}"
      namespace: "{{request.object.metadata.namespace}}"
      data:
        spec:
          rules:
          - host: "{{request.object.metadata.name}}.{{request.object.metadata.namespace}}.A.B.C.D.nip.io"
            http:
              paths:
              - backend:
                  service:
                    name: "{{request.object.metadata.name}}"
                    port:
                      name: http
                path: /
                pathType: Prefix
@@ -1,37 +0,0 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: ingress-domain-name
spec:
  rules:
  - name: create-ingress
    context:
    - name: configmap
      configMap:
        name: ingress-domain-name
        namespace: "{{request.object.metadata.namespace}}"
    match:
      resources:
        kinds:
        - Service
    preconditions:
    - key: "{{request.object.spec.ports[0].name}}"
      operator: Equals
      value: http
    generate:
      kind: Ingress
      name: "{{request.object.metadata.name}}"
      namespace: "{{request.object.metadata.namespace}}"
      data:
        spec:
          rules:
          - host: "{{request.object.metadata.name}}.{{request.object.metadata.namespace}}.{{configmap.data.domain}}"
            http:
              paths:
              - backend:
                  service:
                    name: "{{request.object.metadata.name}}"
                    port:
                      name: http
                path: /
                pathType: Prefix
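The `host` field in the policy above is built by substituting `{{...}}` placeholders with values taken from the admission request and from the ConfigMap context. The Python sketch below illustrates that substitution only. It is not how Kyverno is implemented: Kyverno evaluates JMESPath expressions, while this sketch handles plain dotted paths, and the names `lookup` and `render` as well as the sample values are hypothetical.

```python
import re
from typing import Any, Mapping

# Matches {{ dotted.path }} placeholders (no indexes or filters).
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(context: Mapping[str, Any], dotted_path: str) -> Any:
    """Walk nested mappings following a path such as 'configmap.data.domain'."""
    value: Any = context
    for key in dotted_path.split("."):
        value = value[key]
    return value


def render(template: str, context: Mapping[str, Any]) -> str:
    """Replace each placeholder with the value found in the context."""
    return PLACEHOLDER.sub(lambda m: str(lookup(context, m.group(1))), template)


# Hypothetical Service "web" in namespace "blue", with a ConfigMap that
# sets domain to "example.com".
context = {
    "request": {"object": {"metadata": {"name": "web", "namespace": "blue"}}},
    "configmap": {"data": {"domain": "example.com"}},
}
host = render(
    "{{request.object.metadata.name}}."
    "{{request.object.metadata.namespace}}."
    "{{configmap.data.domain}}",
    context,
)
```

With these sample values, `host` is `web.blue.example.com`.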
@@ -1,20 +0,0 @@
kind: Pod
apiVersion: v1
metadata:
  generateName: mounter-
  labels:
    container.training/mounter: ""
spec:
  volumes:
  - name: pvc
    persistentVolumeClaim:
      claimName: my-pvc-XYZ45
  containers:
  - name: mounter
    image: alpine
    stdin: true
    tty: true
    volumeMounts:
    - name: pvc
      mountPath: /pvc
    workingDir: /pvc
@@ -3,7 +3,8 @@ apiVersion: networking.k8s.io/v1
|
||||
metadata:
|
||||
name: deny-from-other-namespaces
|
||||
spec:
|
||||
podSelector: {}
|
||||
podSelector:
|
||||
matchLabels:
|
||||
ingress:
|
||||
- from:
|
||||
- podSelector: {}
|
||||
|
||||
20
k8s/pv.yaml
@@ -1,20 +0,0 @@
kind: PersistentVolume
apiVersion: v1
metadata:
  generateName: my-pv-
  labels:
    container.training/pv: ""
spec:
  accessModes:
  - ReadWriteOnce
  - ReadWriteMany
  capacity:
    storage: 1G
  hostPath:
    path: /tmp/my-pv
  #storageClassName: my-sc
  #claimRef:
  #  kind: PersistentVolumeClaim
  #  apiVersion: v1
  #  namespace: default
  #  name: my-pvc-XYZ45
13
k8s/pvc.yaml
@@ -1,13 +0,0 @@
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  generateName: my-pvc-
  labels:
    container.training/pvc: ""
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1G
  #storageClassName: my-sc
147
k8s/rainbow.yaml
@@ -1,147 +0,0 @@
---
apiVersion: v1
kind: Namespace
metadata:
  name: blue
  labels:
    app: rainbow
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: rainbow
    color: blue
  name: color
  namespace: blue
spec:
  selector:
    matchLabels:
      app: rainbow
      color: blue
  template:
    metadata:
      labels:
        app: rainbow
        color: blue
    spec:
      containers:
      - image: jpetazzo/color
        name: color
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: rainbow
    color: blue
  name: color
  namespace: blue
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: rainbow
    color: blue
  type: ClusterIP
---
apiVersion: v1
kind: Namespace
metadata:
  name: green
  labels:
    app: rainbow
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: rainbow
    color: green
  name: color
  namespace: green
spec:
  selector:
    matchLabels:
      app: rainbow
      color: green
  template:
    metadata:
      labels:
        app: rainbow
        color: green
    spec:
      containers:
      - image: jpetazzo/color
        name: color
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: rainbow
    color: green
  name: color
  namespace: green
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: rainbow
    color: green
  type: ClusterIP
---
apiVersion: v1
kind: Namespace
metadata:
  name: red
  labels:
    app: rainbow
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: rainbow
    color: red
  name: color
  namespace: red
spec:
  selector:
    matchLabels:
      app: rainbow
      color: red
  template:
    metadata:
      labels:
        app: rainbow
        color: red
    spec:
      containers:
      - image: jpetazzo/color
        name: color
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: rainbow
    color: red
  name: color
  namespace: red
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: rainbow
    color: red
  type: ClusterIP
@@ -1,107 +1,17 @@
|
||||
⚠️ This is work in progress. The UX needs to be improved,
|
||||
and the docs could be better.
|
||||
|
||||
This directory contains a Terraform configuration to deploy
|
||||
a bunch of Kubernetes clusters on various cloud providers,
|
||||
using their respective managed Kubernetes products.
|
||||
a bunch of Kubernetes clusters on various cloud providers, using their respective managed Kubernetes products.
|
||||
|
||||
## With shell wrapper
|
||||
|
||||
This is the recommended use. It makes it easy to start N clusters
|
||||
on any provider. It will create a directory with a name like
|
||||
`tag-YYYY-MM-DD-HH-MM-SS-SEED-PROVIDER`, copy the Terraform configuration
|
||||
to that directory, then create the clusters using that configuration.
|
||||
|
||||
1. One-time setup: configure provider authentication for the provider(s) that you wish to use.
|
||||
|
||||
- Digital Ocean:
|
||||
```bash
|
||||
doctl auth init
|
||||
```
|
||||
|
||||
- Google Cloud Platform: you will need to create a project named `prepare-tf`
|
||||
and enable the relevant APIs for this project (sorry, if you're new to GCP,
|
||||
this sounds vague; but if you're familiar with it you know what to do; if you
|
||||
want to change the project name you can edit the Terraform configuration)
|
||||
|
||||
- Linode:
|
||||
```bash
|
||||
linode-cli configure
|
||||
```
|
||||
|
||||
- Oracle Cloud: FIXME
|
||||
(set up `oci` through the `oci-cli` Python package)
|
||||
|
||||
- Scaleway: run `scw init`
|
||||
|
||||
2. Optional: set number of clusters, cluster size, and region.
|
||||
|
||||
By default, 1 cluster will be configured, with 2 nodes, and auto-scaling up to 5 nodes.
|
||||
|
||||
If you want, you can override these parameters with the following variables:
|
||||
|
||||
```bash
|
||||
export TF_VAR_how_many_clusters=5
|
||||
export TF_VAR_min_nodes_per_pool=2
|
||||
export TF_VAR_max_nodes_per_pool=4
|
||||
export TF_VAR_location=xxx
|
||||
```
|
||||
|
||||
The `location` variable is optional. Each provider should have a default value.
|
||||
The value of the `location` variable is provider-specific. Examples:
|
||||
|
||||
| Provider      | Example value     | How to see possible values
|---------------|-------------------|---------------------------
| Digital Ocean | `ams3`            | `doctl compute region list`
| Google Cloud  | `europe-north1-a` | `gcloud compute zones list`
| Linode        | `eu-central`      | `linode-cli regions list`
| Oracle Cloud  | `eu-stockholm-1`  | `oci iam region list`
|
||||
|
||||
You can also specify multiple locations, and then they will be
|
||||
used in round-robin fashion.
|
||||
|
||||
For example, with Google Cloud, since the default quotas are very
|
||||
low (my account is limited to 8 public IP addresses per zone, and
|
||||
my requests to increase that quota were denied) you can do the
|
||||
following:
|
||||
|
||||
```bash
|
||||
export TF_VAR_location=$(gcloud compute zones list --format=json | jq -r .[].name | grep ^europe)
|
||||
```
|
||||
|
||||
Then when you apply, clusters will be created across all available
|
||||
zones in Europe. (When I write this, there are 20+ zones in Europe,
|
||||
so even with my quota, I can create 40 clusters.)
|
||||
|
||||
3. Run!
|
||||
|
||||
```bash
|
||||
./run.sh <providername>
|
||||
```
|
||||
|
||||
(If you don't specify a provider name, it will list available providers.)
|
||||
|
||||
4. Shutting down
|
||||
|
||||
Go to the directory that was created by the previous step (`tag-YYYY-MM...`)
|
||||
and run `terraform destroy`.
|
||||
|
||||
You can also run `./clean.sh` which will destroy ALL clusters deployed by the previous run script.
|
||||
|
||||
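The round-robin placement mentioned in step 2 can be pictured with a short Python sketch. It mirrors the `local.locations[i % length(local.locations)]` expression and the `range(101, 101 + var.how_many_clusters)` loop that appear in `main.tf`. How `TF_VAR_location` is turned into the `locations` list is not shown in this diff, so splitting it on whitespace is an assumption here, and `assign_locations` is a hypothetical name.

```python
from typing import Dict, List


def assign_locations(how_many_clusters: int, locations: List[str]) -> Dict[int, str]:
    """Map each cluster index (starting at 101) to a location, round-robin."""
    return {
        i: locations[i % len(locations)]
        for i in range(101, 101 + how_many_clusters)
    }


# Assumption: the variable holds whitespace-separated zone names.
zones = "europe-west1-b europe-west1-c europe-north1-a".split()
placement = assign_locations(4, zones)
for index, zone in placement.items():
    print(f"cluster {index:03d} -> {zone}")
```

Because the index starts at 101 rather than 0, the first cluster does not necessarily land in the first location of the list. The placement still cycles evenly through all the locations.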
## Without shell wrapper
|
||||
|
||||
Expert mode.
|
||||
|
||||
Useful to run steps separately, and/or when working on the Terraform configurations.
|
||||
To use it:
|
||||
|
||||
1. Select the provider you wish to use.
|
||||
|
||||
Go to the `source` directory and edit `main.tf`.
|
||||
|
||||
Change the `source` attribute of the `module "clusters"` section.
|
||||
|
||||
Check the content of the `modules` directory to see available choices.
|
||||
|
||||
```bash
|
||||
vim main.tf
|
||||
```
|
||||
|
||||
2. Initialize the provider.
|
||||
|
||||
```bash
|
||||
@@ -110,20 +20,24 @@ terraform init
|
||||
|
||||
3. Configure provider authentication.
|
||||
|
||||
See steps above, and add the following extra steps:
|
||||
|
||||
- Digital Ocean:
|
||||
```bash
|
||||
export DIGITALOCEAN_ACCESS_TOKEN=$(grep ^access-token ~/.config/doctl/config.yaml | cut -d: -f2 | tr -d " ")
|
||||
```
|
||||
|
||||
- Linode:
|
||||
```bash
|
||||
export LINODE_TOKEN=$(grep ^token ~/.config/linode-cli | cut -d= -f2 | tr -d " ")
|
||||
```
|
||||
- Digital Ocean: `export DIGITALOCEAN_ACCESS_TOKEN=...`
|
||||
(check `~/.config/doctl/config.yaml` for the token)
|
||||
- Linode: `export LINODE_TOKEN=...`
|
||||
(check `~/.config/linode-cli` for the token)
|
||||
- Oracle Cloud: it should use `~/.oci/config`
|
||||
- Scaleway: run `scw init`
|
||||
|
||||
4. Decide how many clusters and how many nodes per cluster you want.
|
||||
|
||||
```bash
|
||||
export TF_VAR_how_many_clusters=5
|
||||
export TF_VAR_min_nodes_per_pool=2
|
||||
# Optional (will enable autoscaler when available)
|
||||
export TF_VAR_max_nodes_per_pool=4
|
||||
# Optional (will only work on some providers)
|
||||
export TF_VAR_enable_arm_pool=true
|
||||
```
|
||||
|
||||
5. Provision clusters.
|
||||
|
||||
```bash
|
||||
@@ -132,7 +46,7 @@ terraform apply
|
||||
|
||||
6. Perform second stage provisioning.
|
||||
|
||||
This will install an SSH server on the clusters.
|
||||
This will install a SSH server on the clusters.
|
||||
|
||||
```bash
|
||||
cd stage2
|
||||
@@ -158,5 +72,5 @@ terraform destroy
|
||||
9. Clean up stage2.
|
||||
|
||||
```bash
|
||||
rm stage2/terraform.tfstate*
|
||||
rm stage/terraform.tfstate*
|
||||
```
|
||||
|
||||
@@ -1,9 +0,0 @@
#!/bin/sh
export LINODE_TOKEN=$(grep ^token ~/.config/linode-cli | cut -d= -f2 | tr -d " ")
export DIGITALOCEAN_ACCESS_TOKEN=$(grep ^access-token ~/.config/doctl/config.yaml | cut -d: -f2 | tr -d " ")
for T in tag-*; do
  (
    cd $T
    terraform apply -destroy -auto-approve && mv ../$T ../deleted$T
  )
done
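The two `export` lines above pull tokens out of CLI configuration files with a `grep | cut | tr` pipeline. Here is a Python sketch of what that pipeline does, step by step. `extract_field` is a hypothetical helper, and the sample file contents are made up for illustration.

```python
from typing import List


def extract_field(lines: List[str], prefix: str, delimiter: str) -> str:
    """Mimic: grep ^PREFIX file | cut -dDELIM -f2 | tr -d " "."""
    values = []
    for line in lines:
        if not line.startswith(prefix):  # grep ^PREFIX
            continue
        fields = line.split(delimiter)
        # cut -f2 keeps only the second field; a line without the
        # delimiter is passed through unchanged.
        value = fields[1] if len(fields) > 1 else line
        values.append(value.replace(" ", ""))  # tr -d " "
    return "\n".join(values)


# Made-up file contents, in the shape the two pipelines expect.
linode_cfg = "[DEFAULT]\ntoken = abc123\nregion = us-east".splitlines()
doctl_cfg = "access-token: dop_v1_example\ncontext: default".splitlines()

linode_token = extract_field(linode_cfg, "token", "=")
do_token = extract_field(doctl_cfg, "access-token", ":")
```

One consequence of `cut -f2`: if the value itself contained the delimiter, only the part before the second delimiter would be kept.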
16
prepare-tf/locals.tf
Normal file
@@ -0,0 +1,16 @@
resource "random_string" "_" {
  length  = 5
  special = false
  upper   = false
}

resource "time_static" "_" {}

locals {
  tag = format("tf-%s-%s", formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339), random_string._.result)
  # Common tags to be assigned to all resources
  common_tags = [
    "created-by=terraform",
    "tag=${local.tag}"
  ]
}
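To see what names this produces, here is a Python sketch of the `tag` expression above, together with the `format("%s-%03d", local.tag, i)` cluster name used in `main.tf`. The function names and the sample timestamp are hypothetical. The random suffix assumes that `random_string` with `special = false` and `upper = false` draws from lowercase letters and digits.

```python
import random
import string
from datetime import datetime, timezone


def random_suffix(length: int = 5) -> str:
    """Lowercase letters and digits only (no specials, no uppercase)."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choices(alphabet, k=length))


def make_tag(created_at: datetime, suffix: str) -> str:
    """Equivalent of format("tf-%s-%s", formatdate("YYYY-MM-DD-hh-mm", ...), ...)."""
    return "tf-%s-%s" % (created_at.strftime("%Y-%m-%d-%H-%M"), suffix)


def cluster_name(tag: str, index: int) -> str:
    """Equivalent of format("%s-%03d", local.tag, i)."""
    return "%s-%03d" % (tag, index)


# Sample values, for illustration only.
tag = make_tag(datetime(2022, 1, 15, 9, 30, tzinfo=timezone.utc), "abc12")
name = cluster_name(tag, 101)
```

With the sample values, `tag` is `tf-2022-01-15-09-30-abc12` and `name` is `tf-2022-01-15-09-30-abc12-101`.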
@@ -1,5 +1,5 @@
|
||||
module "clusters" {
|
||||
source = "./modules/PROVIDER"
|
||||
source = "./modules/linode"
|
||||
for_each = local.clusters
|
||||
cluster_name = each.value.cluster_name
|
||||
min_nodes_per_pool = var.min_nodes_per_pool
|
||||
@@ -7,24 +7,22 @@ module "clusters" {
|
||||
enable_arm_pool = var.enable_arm_pool
|
||||
node_size = var.node_size
|
||||
common_tags = local.common_tags
|
||||
location = each.value.location
|
||||
}
|
||||
|
||||
locals {
|
||||
clusters = {
|
||||
for i in range(101, 101 + var.how_many_clusters) :
|
||||
i => {
|
||||
cluster_name = format("%s-%03d", local.tag, i)
|
||||
kubeconfig_path = format("./stage2/kubeconfig.%03d", i)
|
||||
cluster_name = format("%s-%03d", local.tag, i)
|
||||
kubeconfig_path = format("./stage2/kubeconfig.%03d", i)
|
||||
#dashdash_kubeconfig = format("--kubeconfig=./stage2/kubeconfig.%03d", i)
|
||||
externalips_path = format("./stage2/externalips.%03d", i)
|
||||
flags_path = format("./stage2/flags.%03d", i)
|
||||
location = local.locations[i % length(local.locations)]
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
resource "local_file" "stage2" {
|
||||
filename = "./stage2/main.tf"
|
||||
filename = "./stage2/main.tf"
|
||||
file_permission = "0644"
|
||||
content = templatefile(
|
||||
"./stage2.tmpl",
|
||||
@@ -32,15 +30,6 @@ resource "local_file" "stage2" {
|
||||
)
|
||||
}
|
||||
|
||||
resource "local_file" "flags" {
|
||||
for_each = local.clusters
|
||||
filename = each.value.flags_path
|
||||
file_permission = "0600"
|
||||
content = <<-EOT
|
||||
has_metrics_server: ${module.clusters[each.key].has_metrics_server}
|
||||
EOT
|
||||
}
|
||||
|
||||
resource "local_file" "kubeconfig" {
|
||||
for_each = local.clusters
|
||||
filename = each.value.kubeconfig_path
|
||||
@@ -70,8 +59,8 @@ resource "null_resource" "wait_for_nodes" {
|
||||
}
|
||||
|
||||
data "external" "externalips" {
|
||||
for_each = local.clusters
|
||||
depends_on = [null_resource.wait_for_nodes]
|
||||
for_each = local.clusters
|
||||
depends_on = [ null_resource.wait_for_nodes ]
|
||||
program = [
|
||||
"sh",
|
||||
"-c",
|
||||
@@ -1,13 +1,12 @@
|
||||
resource "digitalocean_kubernetes_cluster" "_" {
|
||||
name = var.cluster_name
|
||||
tags = var.common_tags
|
||||
# Region is mandatory, so let's provide a default value.
|
||||
region = var.location != null ? var.location : "nyc1"
|
||||
name = var.cluster_name
|
||||
tags = local.common_tags
|
||||
region = var.region
|
||||
version = var.k8s_version
|
||||
|
||||
node_pool {
|
||||
name = "x86"
|
||||
tags = var.common_tags
|
||||
name = "dok-x86"
|
||||
tags = local.common_tags
|
||||
size = local.node_type
|
||||
auto_scale = true
|
||||
min_nodes = var.min_nodes_per_pool
|
||||
@@ -5,7 +5,3 @@ output "kubeconfig" {
|
||||
output "cluster_id" {
|
||||
value = digitalocean_kubernetes_cluster._.id
|
||||
}
|
||||
|
||||
output "has_metrics_server" {
|
||||
value = false
|
||||
}
|
||||
@@ -8,6 +8,10 @@ variable "common_tags" {
|
||||
default = []
|
||||
}
|
||||
|
||||
locals {
|
||||
common_tags = [for tag in var.common_tags : replace(tag, "=", "-")]
|
||||
}
|
||||
|
||||
variable "node_size" {
|
||||
type = string
|
||||
default = "M"
|
||||
@@ -42,11 +46,9 @@ locals {
|
||||
node_type = var.node_types[var.node_size]
|
||||
}
|
||||
|
||||
# To view supported regions, run:
|
||||
# doctl compute region list
|
||||
variable "location" {
|
||||
variable "region" {
|
||||
type = string
|
||||
default = null
|
||||
default = "ams3"
|
||||
}
|
||||
|
||||
# To view supported versions, run:
|
||||
@@ -1,8 +1,7 @@
|
||||
resource "linode_lke_cluster" "_" {
|
||||
label = var.cluster_name
|
||||
tags = var.common_tags
|
||||
# "region" is mandatory, so let's provide a default value if none was given.
|
||||
region = var.location != null ? var.location : "eu-central"
|
||||
label = var.cluster_name
|
||||
tags = var.common_tags
|
||||
region = var.region
|
||||
k8s_version = var.k8s_version
|
||||
|
||||
pool {
|
||||
@@ -5,7 +5,3 @@ output "kubeconfig" {
|
||||
output "cluster_id" {
|
||||
value = linode_lke_cluster._.id
|
||||
}
|
||||
|
||||
output "has_metrics_server" {
|
||||
value = false
|
||||
}
|
||||
@@ -42,11 +42,11 @@ locals {
|
||||
node_type = var.node_types[var.node_size]
|
||||
}
|
||||
|
||||
# To view supported regions, run:
|
||||
# To view supported versions, run:
|
||||
# linode-cli regions list
|
||||
variable "location" {
|
||||
variable "region" {
|
||||
type = string
|
||||
default = null
|
||||
default = "us-east"
|
||||
}
|
||||
|
||||
# To view supported versions, run:
|
||||
@@ -1,7 +1,6 @@
|
||||
resource "oci_identity_compartment" "_" {
|
||||
name = var.cluster_name
|
||||
description = var.cluster_name
|
||||
enable_delete = true
|
||||
name = var.cluster_name
|
||||
description = var.cluster_name
|
||||
}
|
||||
|
||||
locals {
|
||||
@@ -9,7 +9,3 @@ output "kubeconfig" {
|
||||
output "cluster_id" {
|
||||
value = oci_containerengine_cluster._.id
|
||||
}
|
||||
|
||||
output "has_metrics_server" {
|
||||
value = false
|
||||
}
|
||||
@@ -70,13 +70,6 @@ locals {
|
||||
node_type = var.node_types[var.node_size]
|
||||
}
|
||||
|
||||
# To view supported regions, run:
|
||||
# oci iam region list | jq .data[].name
|
||||
variable "location" {
|
||||
type = string
|
||||
default = null
|
||||
}
|
||||
|
||||
# To view supported versions, run:
|
||||
# oci ce cluster-options get --cluster-option-id all | jq -r '.data["kubernetes-versions"][]'
|
||||
variable "k8s_version" {
|
||||
@@ -1,15 +1,13 @@
|
||||
resource "scaleway_k8s_cluster" "_" {
|
||||
name = var.cluster_name
|
||||
region = var.location
|
||||
tags = var.common_tags
|
||||
version = var.k8s_version
|
||||
cni = var.cni
|
||||
delete_additional_resources = true
|
||||
name = var.cluster_name
|
||||
tags = var.common_tags
|
||||
version = var.k8s_version
|
||||
cni = var.cni
|
||||
}
|
||||
|
||||
resource "scaleway_k8s_pool" "_" {
|
||||
cluster_id = scaleway_k8s_cluster._.id
|
||||
name = "x86"
|
||||
name = "scw-x86"
|
||||
tags = var.common_tags
|
||||
node_type = local.node_type
|
||||
size = var.min_nodes_per_pool
|
||||
@@ -5,7 +5,3 @@ output "kubeconfig" {
|
||||
output "cluster_id" {
|
||||
value = scaleway_k8s_cluster._.id
|
||||
}
|
||||
|
||||
output "has_metrics_server" {
|
||||
value = sort([var.k8s_version, "1.22"])[0] == "1.22"
|
||||
}
|
||||
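The `has_metrics_server` expression above compares versions by sorting two strings, and Terraform's `sort` orders strings lexicographically. The Python sketch below reproduces that comparison and contrasts it with a numeric one. The function names are hypothetical and the numeric variant is not part of the repository. It is only there to show where the two approaches would disagree.

```python
from typing import Tuple


def has_metrics_server_lexical(k8s_version: str) -> bool:
    """Same logic as sort([var.k8s_version, "1.22"])[0] == "1.22"."""
    return sorted([k8s_version, "1.22"])[0] == "1.22"


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn "1.22.2" into (1, 22, 2) so that fields compare as numbers."""
    return tuple(int(part) for part in version.split("."))


def at_least_numeric(k8s_version: str, minimum: str = "1.22") -> bool:
    """Numeric comparison, shown for contrast only."""
    return version_tuple(k8s_version) >= version_tuple(minimum)


for version in ["1.21.5", "1.22.2", "1.23.0", "1.9.0", "1.100.0"]:
    print(version, has_metrics_server_lexical(version), at_least_numeric(version))
```

For two-digit minor versions such as `1.21.5` or `1.22.2`, both approaches agree. They diverge only when the minor version has a different number of digits (`1.9.0` or a hypothetical `1.100.0`), which is probably why the simpler string comparison is good enough here.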
@@ -47,12 +47,7 @@ variable "cni" {
|
||||
default = "cilium"
|
||||
}
|
||||
|
||||
variable "location" {
|
||||
type = string
|
||||
default = null
|
||||
}
|
||||
|
||||
# To view supported versions, run:
|
||||
# See supported versions with:
|
||||
# scw k8s version list -o json | jq -r .[].name
|
||||
variable "k8s_version" {
|
||||
type = string
|
||||
@@ -1,49 +0,0 @@
#!/bin/sh
set -e

TIME=$(which time)

PROVIDER=$1
[ "$PROVIDER" ] || {
  echo "Please specify a provider as first argument, or 'ALL' for parallel mode."
  echo "Available providers:"
  ls -1 source/modules
  exit 1
}

[ "$TAG" ] || {
  TIMESTAMP=$(date +%Y-%m-%d-%H-%M-%S)
  RANDOMTAG=$(base64 /dev/urandom | tr A-Z a-z | tr -d /+ | head -c5)
  export TAG=tag-$TIMESTAMP-$RANDOMTAG
}

[ "$PROVIDER" = "ALL" ] && {
  for PROVIDER in $(ls -1 source/modules); do
    $TERMINAL -T $TAG-$PROVIDER -e sh -c "
      export TAG=$TAG-$PROVIDER
      $0 $PROVIDER
      cd $TAG-$PROVIDER
      bash
      " &
  done
  exit 0
}

[ -d "source/modules/$PROVIDER" ] || {
  echo "Provider '$PROVIDER' not found."
  echo "Available providers:"
  ls -1 source/modules
  exit 1
}

export LINODE_TOKEN=$(grep ^token ~/.config/linode-cli | cut -d= -f2 | tr -d " ")
export DIGITALOCEAN_ACCESS_TOKEN=$(grep ^access-token ~/.config/doctl/config.yaml | cut -d: -f2 | tr -d " ")

cp -a source $TAG
cd $TAG
cp -r modules/$PROVIDER modules/PROVIDER
$TIME -o time.1.init terraform init
$TIME -o time.2.stage1 terraform apply -auto-approve
cd stage2
$TIME -o ../time.3.init terraform init
$TIME -o ../time.4.stage2 terraform apply -auto-approve
@@ -1,19 +0,0 @@
resource "random_string" "_" {
  length  = 4
  number  = false
  special = false
  upper   = false
}

resource "time_static" "_" {}

locals {
  timestamp = formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339)
  tag       = random_string._.result
  # Common tags to be assigned to all resources
  common_tags = [
    "created-by-terraform",
    format("created-at-%s", local.timestamp),
    format("created-for-%s", local.tag)
  ]
}
@@ -1,65 +0,0 @@
resource "google_container_cluster" "_" {
  name               = var.cluster_name
  project            = local.project
  location           = local.location
  min_master_version = var.k8s_version

  # To deploy private clusters, uncomment the section below,
  # and uncomment the block in network.tf.
  # Private clusters require extra resources (Cloud NAT,
  # router, network, subnet) and the quota for some of these
  # resources is fairly low on GCP; so if you want to deploy
  # a lot of private clusters (more than 10), you can use these
  # blocks as a base but you will probably have to refactor
  # things quite a bit (you will at least need to define a single
  # shared router and use it across all the clusters).
  /*
  network    = google_compute_network._.name
  subnetwork = google_compute_subnetwork._.name

  private_cluster_config {
    enable_private_nodes = true
    # This must be set to "false".
    # (Otherwise, access to the public endpoint is disabled.)
    enable_private_endpoint = false
    # This must be set to a /28.
    # I think it shouldn't collide with the pod network subnet.
    master_ipv4_cidr_block = "10.255.255.0/28"
  }
  # Private clusters require "VPC_NATIVE" networking mode
  # (as opposed to the legacy "ROUTES").
  networking_mode = "VPC_NATIVE"
  # ip_allocation_policy is required for VPC_NATIVE clusters.
  ip_allocation_policy {
    # This is the block that will be used for pods.
    cluster_ipv4_cidr_block = "10.0.0.0/12"
    # The services block is optional
    # (GKE will pick one automatically).
    #services_ipv4_cidr_block = ""
  }
  */

  node_pool {
    name = "x86"
    node_config {
      tags         = var.common_tags
      machine_type = local.node_type
    }
    initial_node_count = var.min_nodes_per_pool
    autoscaling {
      min_node_count = var.min_nodes_per_pool
      max_node_count = max(var.min_nodes_per_pool, var.max_nodes_per_pool)
    }
  }

  # This is not strictly necessary.
  # We'll see if we end up using it.
  # (If it is removed, make sure to also remove the corresponding
  # key+cert variables from outputs.tf!)
  master_auth {
    client_certificate_config {
      issue_client_certificate = true
    }
  }
}

@@ -1,38 +0,0 @@
|
||||
/*
|
||||
resource "google_compute_network" "_" {
|
||||
name = var.cluster_name
|
||||
project = local.project
|
||||
# The default is to create subnets automatically.
|
||||
# However, this creates one subnet per zone in all regions,
|
||||
# which causes a quick exhaustion of the subnet quota.
|
||||
auto_create_subnetworks = false
|
||||
}
|
||||
|
||||
resource "google_compute_subnetwork" "_" {
|
||||
name = var.cluster_name
|
||||
ip_cidr_range = "10.254.0.0/16"
|
||||
region = local.region
|
||||
network = google_compute_network._.id
|
||||
project = local.project
|
||||
}
|
||||
|
||||
resource "google_compute_router" "_" {
|
||||
name = var.cluster_name
|
||||
region = local.region
|
||||
network = google_compute_network._.name
|
||||
project = local.project
|
||||
}
|
||||
|
||||
resource "google_compute_router_nat" "_" {
|
||||
name = var.cluster_name
|
||||
router = google_compute_router._.name
|
||||
region = local.region
|
||||
project = local.project
|
||||
# Everyone in the network is allowed to NAT out.
|
||||
# (We would change this if we only wanted to allow specific subnets to NAT out.)
|
||||
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
|
||||
# Pick NAT addresses automatically.
|
||||
# (We would change this if we wanted to use specific addresses to NAT out.)
|
||||
nat_ip_allocate_option = "AUTO_ONLY"
|
||||
}
|
||||
*/
|
||||
@@ -1,35 +0,0 @@
|
||||
data "google_client_config" "_" {}
|
||||
|
||||
output "kubeconfig" {
|
||||
value = <<-EOT
|
||||
apiVersion: v1
|
||||
kind: Config
|
||||
current-context: ${google_container_cluster._.name}
|
||||
clusters:
|
||||
- name: ${google_container_cluster._.name}
|
||||
cluster:
|
||||
server: https://${google_container_cluster._.endpoint}
|
||||
certificate-authority-data: ${google_container_cluster._.master_auth[0].cluster_ca_certificate}
|
||||
contexts:
|
||||
- name: ${google_container_cluster._.name}
|
||||
context:
|
||||
cluster: ${google_container_cluster._.name}
|
||||
user: client-token
|
||||
users:
|
||||
- name: client-cert
|
||||
user:
|
||||
client-key-data: ${google_container_cluster._.master_auth[0].client_key}
|
||||
client-certificate-data: ${google_container_cluster._.master_auth[0].client_certificate}
|
||||
- name: client-token
|
||||
user:
|
||||
token: ${data.google_client_config._.access_token}
|
||||
EOT
|
||||
}
|
||||
|
||||
output "cluster_id" {
|
||||
value = google_container_cluster._.id
|
||||
}
|
||||
|
||||
output "has_metrics_server" {
|
||||
value = true
|
||||
}
|
||||
@@ -1,8 +0,0 @@
|
||||
terraform {
|
||||
required_providers {
|
||||
google = {
|
||||
source = "hashicorp/google"
|
||||
version = "4.5.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -1,68 +0,0 @@
|
||||
variable "cluster_name" {
|
||||
type = string
|
||||
default = "deployed-with-terraform"
|
||||
}
|
||||
|
||||
variable "common_tags" {
|
||||
type = list(string)
|
||||
default = []
|
||||
}
|
||||
|
||||
variable "node_size" {
|
||||
type = string
|
||||
default = "M"
|
||||
}
|
||||
|
||||
variable "min_nodes_per_pool" {
|
||||
type = number
|
||||
default = 2
|
||||
}
|
||||
|
||||
variable "max_nodes_per_pool" {
|
||||
type = number
|
||||
default = 5
|
||||
}
|
||||
|
||||
# FIXME
|
||||
variable "enable_arm_pool" {
|
||||
type = bool
|
||||
default = false
|
||||
}
|
||||
|
||||
variable "node_types" {
|
||||
type = map(string)
|
||||
default = {
|
||||
"S" = "e2-small"
|
||||
"M" = "e2-medium"
|
||||
"L" = "e2-standard-2"
|
||||
}
|
||||
}
|
||||
|
||||
locals {
|
||||
node_type = var.node_types[var.node_size]
|
||||
}
|
||||
|
||||
# To view supported locations, run:
|
||||
# gcloud compute zones list
|
||||
variable "location" {
|
||||
type = string
|
||||
default = null
|
||||
}
|
||||
|
||||
# To view supported versions, run:
|
||||
# gcloud container get-server-config --region=europe-north1 '--format=flattened(channels)'
|
||||
# But it's also possible to just specify e.g. "1.20" and it figures it out.
|
||||
variable "k8s_version" {
|
||||
type = string
|
||||
default = "1.21"
|
||||
}
|
||||
|
||||
locals {
|
||||
location = var.location != null ? var.location : "europe-north1-a"
|
||||
region = replace(local.location, "/-[a-z]$/", "")
|
||||
# Unfortunately, the following line doesn't work
|
||||
# (that attribute just returns an empty string)
|
||||
# so we have to hard-code the project name.
|
||||
#project = data.google_client_config._.project
|
||||
project = "prepare-tf"
|
||||
}
|
||||
@@ -1,40 +0,0 @@
|
||||
variable "how_many_clusters" {
|
||||
type = number
|
||||
default = 1
|
||||
}
|
||||
|
||||
variable "node_size" {
|
||||
type = string
|
||||
default = "M"
|
||||
# Can be S, M, L.
|
||||
# We map these values to different specific instance types for each provider,
|
||||
# but the idea is that they should correspond to the following sizes:
|
||||
# S = 2 GB RAM
|
||||
# M = 4 GB RAM
|
||||
# L = 8 GB RAM
|
||||
}
|
||||
|
||||
variable "min_nodes_per_pool" {
|
||||
type = number
|
||||
default = 1
|
||||
}
|
||||
|
||||
variable "max_nodes_per_pool" {
|
||||
type = number
|
||||
default = 0
|
||||
}
|
||||
|
||||
variable "enable_arm_pool" {
|
||||
type = bool
|
||||
default = false
|
||||
}
|
||||
|
||||
variable "location" {
|
||||
type = string
|
||||
default = null
|
||||
}
|
||||
|
||||
# TODO: perhaps handle if it's space-separated instead of newline?
|
||||
locals {
|
||||
locations = var.location == null ? [null] : split("\n", var.location)
|
||||
}
|
||||
@@ -2,7 +2,7 @@ terraform {
|
||||
required_providers {
|
||||
kubernetes = {
|
||||
source = "hashicorp/kubernetes"
|
||||
version = "2.7.1"
|
||||
version = "2.0.3"
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -119,11 +119,6 @@ resource "kubernetes_cluster_role_binding" "shpod_${index}" {
|
||||
name = "shpod"
|
||||
namespace = "shpod"
|
||||
}
|
||||
subject {
|
||||
api_group = "rbac.authorization.k8s.io"
|
||||
kind = "Group"
|
||||
name = "shpod-cluster-admins"
|
||||
}
|
||||
}
|
||||
|
||||
resource "random_string" "shpod_${index}" {
|
||||
@@ -140,14 +135,9 @@ provider "helm" {
|
||||
}
|
||||
|
||||
resource "helm_release" "metrics_server_${index}" {
|
||||
# Some providers pre-install metrics-server.
|
||||
# Some don't. Let's install metrics-server,
|
||||
# but only if it's not already installed.
|
||||
count = yamldecode(file("./flags.${index}"))["has_metrics_server"] ? 0 : 1
|
||||
provider = helm.cluster_${index}
|
||||
repository = "https://charts.bitnami.com/bitnami"
|
||||
chart = "metrics-server"
|
||||
version = "5.8.8"
|
||||
name = "metrics-server"
|
||||
namespace = "metrics-server"
|
||||
create_namespace = true
|
||||
@@ -191,7 +181,7 @@ resource "kubernetes_config_map" "kubeconfig_${index}" {
|
||||
- name: cluster-admin
|
||||
user:
|
||||
client-key-data: $${base64encode(tls_private_key.cluster_admin_${index}.private_key_pem)}
|
||||
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request_v1.cluster_admin_${index}.certificate)}
|
||||
client-certificate-data: $${base64encode(kubernetes_certificate_signing_request.cluster_admin_${index}.certificate)}
|
||||
EOT
|
||||
}
|
||||
}
|
||||
@@ -205,14 +195,11 @@ resource "tls_cert_request" "cluster_admin_${index}" {
|
||||
private_key_pem = tls_private_key.cluster_admin_${index}.private_key_pem
|
||||
subject {
|
||||
common_name = "cluster-admin"
|
||||
# Note: CSR API v1 doesn't allow issuing certs with "system:masters" anymore.
|
||||
#organization = "system:masters"
|
||||
# We'll use this custom group name instead.
|
||||
organization = "shpod-cluster-admins"
|
||||
organization = "system:masters"
|
||||
}
|
||||
}
|
||||
|
||||
resource "kubernetes_certificate_signing_request_v1" "cluster_admin_${index}" {
|
||||
resource "kubernetes_certificate_signing_request" "cluster_admin_${index}" {
|
||||
provider = kubernetes.cluster_${index}
|
||||
metadata {
|
||||
name = "cluster-admin"
|
||||
@@ -220,7 +207,6 @@ resource "kubernetes_certificate_signing_request_v1" "cluster_admin_${index}" {
|
||||
spec {
|
||||
usages = ["client auth"]
|
||||
request = tls_cert_request.cluster_admin_${index}.cert_request_pem
|
||||
signer_name = "kubernetes.io/kube-apiserver-client"
|
||||
}
|
||||
auto_approve = true
|
||||
}
|
||||
28
prepare-tf/variables.tf
Normal file
@@ -0,0 +1,28 @@
|
||||
variable "how_many_clusters" {
|
||||
type = number
|
||||
default = 2
|
||||
}
|
||||
|
||||
variable "node_size" {
|
||||
type = string
|
||||
default = "M"
|
||||
# Can be S, M, L.
|
||||
# S = 2 GB RAM
|
||||
# M = 4 GB RAM
|
||||
# L = 8 GB RAM
|
||||
}
|
||||
|
||||
variable "min_nodes_per_pool" {
|
||||
type = number
|
||||
default = 1
|
||||
}
|
||||
|
||||
variable "max_nodes_per_pool" {
|
||||
type = number
|
||||
default = 0
|
||||
}
|
||||
|
||||
variable "enable_arm_pool" {
|
||||
type = bool
|
||||
default = true
|
||||
}
|
||||
@@ -14,9 +14,7 @@ These tools can help you to create VMs on:
|
||||
|
||||
- [Docker](https://docs.docker.com/engine/installation/)
|
||||
- [Docker Compose](https://docs.docker.com/compose/install/)
|
||||
- [Parallel SSH](https://github.com/lilydjwg/pssh)
|
||||
(should be installable with `pip install git+https://github.com/lilydjwg/pssh`;
|
||||
on a Mac, try `brew install pssh`)
|
||||
- [Parallel SSH](https://code.google.com/archive/p/parallel-ssh/) (on a Mac: `brew install pssh`)
|
||||
|
||||
Depending on the infrastructure that you want to use, you also need to install
|
||||
the CLI that is specific to that cloud. For OpenStack deployments, you will
|
||||
|
||||
@@ -75,11 +75,9 @@ _cmd_createuser() {
|
||||
echo '$USER_LOGIN ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/$USER_LOGIN
|
||||
"
|
||||
|
||||
# The MaxAuthTries is here to help with folks who have many SSH keys.
|
||||
pssh "
|
||||
set -e
|
||||
sudo sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
|
||||
sudo sed -i 's/#MaxAuthTries 6/MaxAuthTries 42/' /etc/ssh/sshd_config
|
||||
sudo service ssh restart
|
||||
"
|
||||
|
||||
@@ -238,12 +236,6 @@ _cmd_docker() {
|
||||
sudo add-apt-repository 'deb https://download.docker.com/linux/ubuntu bionic stable'
|
||||
sudo apt-get -q update
|
||||
sudo apt-get -qy install docker-ce
|
||||
|
||||
# Add registry mirror configuration.
|
||||
if ! [ -f /etc/docker/daemon.json ]; then
|
||||
echo '{\"registry-mirrors\": [\"https://mirror.gcr.io\"]}' | sudo tee /etc/docker/daemon.json
|
||||
sudo systemctl restart docker
|
||||
fi
|
||||
"
|
||||
|
||||
##VERSION## https://github.com/docker/compose/releases
|
||||
@@ -311,15 +303,13 @@ _cmd_kube() {
|
||||
need_login_password
|
||||
|
||||
# Optional version, e.g. 1.13.5
|
||||
SETTINGS=tags/$TAG/settings.yaml
|
||||
KUBEVERSION=$(awk '/^kubernetes_version:/ {print $2}' $SETTINGS)
|
||||
KUBEVERSION=$2
|
||||
if [ "$KUBEVERSION" ]; then
|
||||
pssh "
|
||||
sudo tee /etc/apt/preferences.d/kubernetes <<EOF
|
||||
Package: kubectl kubeadm kubelet
|
||||
Pin: version $KUBEVERSION*
|
||||
Pin-Priority: 1000
|
||||
EOF"
|
||||
EXTRA_APTGET="=$KUBEVERSION-00"
|
||||
EXTRA_KUBEADM="kubernetesVersion: v$KUBEVERSION"
|
||||
else
|
||||
EXTRA_APTGET=""
|
||||
EXTRA_KUBEADM=""
|
||||
fi
|
||||
|
||||
# Install packages
|
||||
@@ -330,8 +320,7 @@ EOF"
|
||||
sudo tee /etc/apt/sources.list.d/kubernetes.list"
|
||||
pssh --timeout 200 "
|
||||
sudo apt-get update -q &&
|
||||
sudo apt-get install -qy kubelet kubeadm kubectl &&
|
||||
sudo apt-mark hold kubelet kubeadm kubectl
|
||||
sudo apt-get install -qy kubelet$EXTRA_APTGET kubeadm$EXTRA_APTGET kubectl$EXTRA_APTGET &&
|
||||
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl &&
|
||||
echo 'alias k=kubectl' | sudo tee /etc/bash_completion.d/k &&
|
||||
echo 'complete -F __start_kubectl k' | sudo tee -a /etc/bash_completion.d/k"
|
||||
@@ -343,11 +332,6 @@ EOF"
|
||||
sudo swapoff -a"
|
||||
fi
|
||||
|
||||
# Re-enable CRI interface in containerd
|
||||
pssh "
|
||||
echo '# Use default parameters for containerd.' | sudo tee /etc/containerd/config.toml
|
||||
sudo systemctl restart containerd"
|
||||
|
||||
# Initialize kube control plane
|
||||
pssh --timeout 200 "
|
||||
if i_am_first_node && [ ! -f /etc/kubernetes/admin.conf ]; then
|
||||
@@ -357,38 +341,19 @@ kind: InitConfiguration
|
||||
apiVersion: kubeadm.k8s.io/v1beta2
|
||||
bootstrapTokens:
|
||||
- token: \$(cat /tmp/token)
|
||||
nodeRegistration:
|
||||
# Comment out the next line to switch back to Docker.
|
||||
criSocket: /run/containerd/containerd.sock
|
||||
ignorePreflightErrors:
|
||||
- NumCPU
|
||||
---
|
||||
kind: JoinConfiguration
|
||||
apiVersion: kubeadm.k8s.io/v1beta2
|
||||
discovery:
|
||||
bootstrapToken:
|
||||
apiServerEndpoint: \$(cat /etc/name_of_first_node):6443
|
||||
token: \$(cat /tmp/token)
|
||||
unsafeSkipCAVerification: true
|
||||
nodeRegistration:
|
||||
# Comment out the next line to switch back to Docker.
|
||||
criSocket: /run/containerd/containerd.sock
|
||||
ignorePreflightErrors:
|
||||
- NumCPU
|
||||
---
|
||||
kind: KubeletConfiguration
|
||||
apiVersion: kubelet.config.k8s.io/v1beta1
|
||||
# The following line is necessary when using Docker.
|
||||
# It doesn't seem necessary when using containerd.
|
||||
#cgroupDriver: cgroupfs
|
||||
cgroupDriver: cgroupfs
|
||||
---
|
||||
kind: ClusterConfiguration
|
||||
apiVersion: kubeadm.k8s.io/v1beta2
|
||||
apiServer:
|
||||
certSANs:
|
||||
- \$(cat /tmp/ipv4)
|
||||
$EXTRA_KUBEADM
|
||||
EOF
|
||||
sudo kubeadm init --config=/tmp/kubeadm-config.yaml
|
||||
sudo kubeadm init --config=/tmp/kubeadm-config.yaml --ignore-preflight-errors=NumCPU
|
||||
fi"
|
||||
|
||||
# Put kubeconfig in ubuntu's and $USER_LOGIN's accounts
|
||||
@@ -412,8 +377,8 @@ EOF
|
||||
pssh --timeout 200 "
|
||||
if ! i_am_first_node && [ ! -f /etc/kubernetes/kubelet.conf ]; then
|
||||
FIRSTNODE=\$(cat /etc/name_of_first_node) &&
|
||||
ssh $SSHOPTS \$FIRSTNODE cat /tmp/kubeadm-config.yaml > /tmp/kubeadm-config.yaml &&
|
||||
sudo kubeadm join --config /tmp/kubeadm-config.yaml
|
||||
TOKEN=\$(ssh $SSHOPTS \$FIRSTNODE cat /tmp/token) &&
|
||||
sudo kubeadm join --discovery-token-unsafe-skip-ca-verification --token \$TOKEN \$FIRSTNODE:6443
|
||||
fi"
|
||||
|
||||
# Install metrics server
|
||||
@@ -504,7 +469,7 @@ EOF
|
||||
if [ ! -x /usr/local/bin/kustomize ]; then
|
||||
curl -fsSL $URL |
|
||||
sudo tar -C /usr/local/bin -zx kustomize
|
||||
kustomize completion bash | sudo tee /etc/bash_completion.d/kustomize
|
||||
echo complete -C /usr/local/bin/kustomize kustomize | sudo tee /etc/bash_completion.d/kustomize
|
||||
kustomize version
|
||||
fi"
|
||||
|
||||
@@ -713,7 +678,7 @@ _cmd_tailhist () {
|
||||
ARCH=${ARCHITECTURE-amd64}
|
||||
[ "$ARCH" = "aarch64" ] && ARCH=arm64
|
||||
|
||||
pssh "
|
||||
pssh -i "
|
||||
set -e
|
||||
wget https://github.com/joewalnes/websocketd/releases/download/v0.3.0/websocketd-0.3.0-linux_$ARCH.zip
|
||||
unzip websocketd-0.3.0-linux_$ARCH.zip websocketd
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
infra_start() {
|
||||
COUNT=$1
|
||||
|
||||
cp terraform-openstack/*.tf tags/$TAG
|
||||
cp terraform/*.tf tags/$TAG
|
||||
(
|
||||
cd tags/$TAG
|
||||
if ! terraform init; then
|
||||
|
||||
@@ -1,82 +0,0 @@
|
||||
#!/bin/sh
|
||||
|
||||
# https://open-api.netlify.com/#tag/dnsZone
|
||||
[ "$1" ] || {
|
||||
echo ""
|
||||
echo "Add a record in Netlify DNS."
|
||||
echo "This script is hardcoded to add a record to container.training."
|
||||
echo ""
|
||||
echo "Syntax:"
|
||||
echo "$0 list"
|
||||
echo "$0 add <name> <ipaddr>"
|
||||
echo "$0 del <recordid>"
|
||||
echo ""
|
||||
echo "Example to create an A record for eu.container.training:"
|
||||
echo "$0 add eu 185.145.250.0"
|
||||
echo ""
|
||||
exit 1
|
||||
}
|
||||
|
||||
NETLIFY_USERID=$(jq .userId < ~/.config/netlify/config.json)
|
||||
NETLIFY_TOKEN=$(jq -r .users[$NETLIFY_USERID].auth.token < ~/.config/netlify/config.json)
|
||||
|
||||
netlify() {
|
||||
URI=$1
|
||||
shift
|
||||
http https://api.netlify.com/api/v1/$URI "$@" "Authorization:Bearer $NETLIFY_TOKEN"
|
||||
}
|
||||
|
||||
ZONE_ID=$(netlify dns_zones |
|
||||
jq -r '.[] | select ( .name == "container.training" ) | .id')
|
||||
|
||||
_list() {
|
||||
netlify dns_zones/$ZONE_ID/dns_records |
|
||||
jq -r '.[] | select(.type=="A") | [.hostname, .type, .value, .id] | @tsv'
|
||||
}
|
||||
|
||||
_add() {
|
||||
NAME=$1.container.training
|
||||
ADDR=$2
|
||||
|
||||
|
||||
# It looks like if we create two identical records, then delete one of them,
|
||||
# Netlify DNS ends up in a weird state (the name doesn't resolve anymore even
|
||||
# though it's still visible through the API and the website?)
|
||||
|
||||
if netlify dns_zones/$ZONE_ID/dns_records |
|
||||
jq '.[] | select(.hostname=="'$NAME'" and .type=="A" and .value=="'$ADDR'")' |
|
||||
grep .
|
||||
then
|
||||
echo "It looks like that record already exists. Refusing to create it."
|
||||
exit 1
|
||||
fi
|
||||
|
||||
netlify dns_zones/$ZONE_ID/dns_records type=A hostname=$NAME value=$ADDR ttl=300
|
||||
|
||||
netlify dns_zones/$ZONE_ID/dns_records |
|
||||
jq '.[] | select(.hostname=="'$NAME'")'
|
||||
}
|
||||
|
||||
_del() {
|
||||
RECORD_ID=$1
|
||||
# OK, since that one is dangerous, I'm putting the whole request explicitly here
|
||||
http DELETE \
|
||||
https://api.netlify.com/api/v1/dns_zones/$ZONE_ID/dns_records/$RECORD_ID \
|
||||
"Authorization:Bearer $NETLIFY_TOKEN"
|
||||
}
|
||||
|
||||
case "$1" in
|
||||
list)
|
||||
_list
|
||||
;;
|
||||
add)
|
||||
_add $2 $3
|
||||
;;
|
||||
del)
|
||||
_del $2
|
||||
;;
|
||||
*)
|
||||
echo "Unknown command '$1'."
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
@@ -1,33 +0,0 @@
|
||||
# Number of VMs per cluster
|
||||
clustersize: 3
|
||||
|
||||
# The hostname of each node will be clusterprefix + a number
|
||||
clusterprefix: oldversion
|
||||
|
||||
# Jinja2 template to use to generate ready-to-cut cards
|
||||
cards_template: cards.html
|
||||
|
||||
# Use "Letter" in the US, and "A4" everywhere else
|
||||
paper_size: A4
|
||||
|
||||
# Login and password that students will use
|
||||
user_login: k8s
|
||||
user_password: training
|
||||
|
||||
# For a list of old versions, check:
|
||||
# https://kubernetes.io/releases/patch-releases/#non-active-branch-history
|
||||
kubernetes_version: 1.18.20
|
||||
|
||||
image:
|
||||
|
||||
steps:
|
||||
- wait
|
||||
- clusterize
|
||||
- tools
|
||||
- docker
|
||||
- createuser
|
||||
- webssh
|
||||
- tailhist
|
||||
- kube
|
||||
- kubetools
|
||||
- kubetest
|
||||
@@ -3,7 +3,7 @@ set -e
|
||||
|
||||
export AWS_INSTANCE_TYPE=t3a.small
|
||||
|
||||
INFRA=infra/aws-eu-north-1
|
||||
INFRA=infra/aws-us-east-2
|
||||
|
||||
STUDENTS=2
|
||||
|
||||
@@ -33,15 +33,9 @@ TAG=$PREFIX-$SETTINGS
|
||||
--settings settings/$SETTINGS.yaml \
|
||||
--students $STUDENTS
|
||||
|
||||
INFRA=infra/enix
|
||||
#INFRA=infra/aws-us-west-1
|
||||
|
||||
SETTINGS=admin-oldversion
|
||||
TAG=$PREFIX-$SETTINGS
|
||||
./workshopctl start \
|
||||
--tag $TAG \
|
||||
--infra $INFRA \
|
||||
--settings settings/$SETTINGS.yaml \
|
||||
--students $STUDENTS
|
||||
export AWS_INSTANCE_TYPE=t3a.medium
|
||||
|
||||
SETTINGS=admin-test
|
||||
TAG=$PREFIX-$SETTINGS
|
||||
|
||||
56
slides/1.yml
Normal file
@@ -0,0 +1,56 @@
|
||||
title: |
|
||||
Docker & Kubernetes
|
||||
Part 1 - Docker
|
||||
|
||||
chat: "[Teams](https://teams.microsoft.com/l/channel/19%3arctk01XQVWxbj6pjTtJDfVd0_QOzfzYe7Xt8VDpl9681%40thread.tacv2/General?groupId=89c621d8-7080-447f-a7eb-9d6704776dd5&tenantId=72aa0d83-624a-4ebf-a683-1b9b45548610)"
|
||||
|
||||
gitrepo: github.com/jpetazzo/container.training
|
||||
|
||||
slides: https://2021-11-derivco.container.training/
|
||||
|
||||
#slidenumberprefix: "#SomeHashTag — "
|
||||
|
||||
exclude:
|
||||
- self-paced
|
||||
|
||||
content:
|
||||
- shared/title.md
|
||||
- logistics.md
|
||||
- containers/intro.md
|
||||
- shared/about-slides.md
|
||||
- shared/chat-room-im.md
|
||||
#- shared/chat-room-zoom-meeting.md
|
||||
#- shared/chat-room-zoom-webinar.md
|
||||
- shared/toc.md
|
||||
- # DAY 1
|
||||
#- containers/Docker_Overview.md
|
||||
#- containers/Docker_History.md
|
||||
- containers/Training_Environment.md
|
||||
#- containers/Installing_Docker.md
|
||||
- containers/First_Containers.md
|
||||
- containers/Background_Containers.md
|
||||
- containers/Initial_Images.md
|
||||
-
|
||||
- containers/Building_Images_Interactively.md
|
||||
- containers/Building_Images_With_Dockerfiles.md
|
||||
- containers/Cmd_And_Entrypoint.md
|
||||
- containers/Copying_Files_During_Build.md
|
||||
- containers/Exercise_Dockerfile_Basic.md
|
||||
- # DAY 2
|
||||
- containers/Dockerfile_Tips.md
|
||||
- containers/Multi_Stage_Builds.md
|
||||
- containers/Container_Networking_Basics.md
|
||||
- containers/Local_Development_Workflow.md
|
||||
- containers/Getting_Inside.md
|
||||
-
|
||||
- containers/Container_Network_Model.md
|
||||
- containers/Compose_For_Dev_Stacks.md
|
||||
- containers/Exercise_Composefile.md
|
||||
- containers/Exercise_Dockerfile_Advanced.md
|
||||
- shared/thankyou.md
|
||||
- # EXTRA
|
||||
- containers/Start_And_Attach.md
|
||||
- containers/Naming_And_Inspecting.md
|
||||
- containers/Labels.md
|
||||
- containers/Advanced_Dockerfiles.md
|
||||
- containers/Network_Drivers.md
|
||||
@@ -2,7 +2,7 @@
|
||||
#/ /kube-halfday.yml.html 200!
|
||||
#/ /kube-fullday.yml.html 200!
|
||||
#/ /kube-twodays.yml.html 200!
|
||||
/ /kube.yml.html 200!
|
||||
/ /1.yml.html 200!
|
||||
|
||||
# And this makes it possible to run "git clone https://container.training".
|
||||
/info/refs service=git-upload-pack https://github.com/jpetazzo/container.training/info/refs?service=git-upload-pack
|
||||
|
||||
@@ -109,7 +109,7 @@ class: extra-details
|
||||
|
||||
- Example: [ctr.run](https://ctr.run/)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Use ctr.run to automatically build a container image and run it:
|
||||
```bash
|
||||
|
||||
@@ -28,7 +28,7 @@ class: self-paced
|
||||
- Likewise, it will take more than merely *reading* these slides
|
||||
to make you an expert
|
||||
|
||||
- These slides include *tons* of demos, exercises, and examples
|
||||
- These slides include *tons* of exercises and examples
|
||||
|
||||
- They assume that you have access to a machine running Docker
|
||||
|
||||
|
||||
@@ -1,5 +0,0 @@
|
||||
## Exercise — Application Configuration
|
||||
|
||||
- Configure an application with a ConfigMap
|
||||
|
||||
- Generate configuration file from the downward API
|
||||
@@ -1,87 +0,0 @@
|
||||
# Exercise — Application Configuration
|
||||
|
||||
- We want to configure an application with a ConfigMap
|
||||
|
||||
- We will use the "rainbow" example shown previously
|
||||
|
||||
(HAProxy load balancing traffic to services in multiple namespaces)
|
||||
|
||||
- We won't provide the HAProxy configuration file
|
||||
|
||||
- Instead, we will provide a list of namespaces
|
||||
|
||||
(e.g. as a space-delimited list in a ConfigMap)
|
||||
|
||||
- Our Pod should generate the HAProxy configuration using the ConfigMap
|
||||
|
||||
---
|
||||
|
||||
## Setup
|
||||
|
||||
- Let's say that we have the "rainbow" app deployed:
|
||||
```bash
|
||||
kubectl apply -f ~/container.training/k8s/rainbow.yaml
|
||||
```
|
||||
|
||||
- And a ConfigMap like the following one:
|
||||
```bash
|
||||
kubectl create configmap rainbow --from-literal=namespaces="blue green"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Goal 1
|
||||
|
||||
- We want a Deployment and a Service called `rainbow`
|
||||
|
||||
- The `rainbow` Service should load balance across Namespaces `blue` and `green`
|
||||
|
||||
(i.e. to the Services called `color` in both these Namespaces)
|
||||
|
||||
- We want to be able to update the configuration:
|
||||
|
||||
- update the ConfigMap to put `blue green red`
|
||||
|
||||
- what should we do so that HAProxy picks up the change?
|
||||
|
||||
---
|
||||
|
||||
## Goal 2
|
||||
|
||||
- Check what happens if we specify a backend that doesn't exist
|
||||
|
||||
(e.g. add `purple` to the list of namespaces)
|
||||
|
||||
- If we specify invalid backends to HAProxy, it won't start!
|
||||
|
||||
- Implement one of these two workarounds:
|
||||
|
||||
- remove invalid backends from the list before starting HAProxy
|
||||
|
||||
- wait until all backends are valid before starting HAProxy
|
||||
|
||||
---
|
||||
|
||||
## Goal 3
|
||||
|
||||
- We'd like HAProxy to pick up ConfigMap updates automatically
|
||||
|
||||
- How can we do that?
|
||||
|
||||
---
|
||||
|
||||
## Hints
|
||||
|
||||
- Check the following slides if you need help!
|
||||
|
||||
--
|
||||
|
||||
- We want to generate the HAProxy configuration in an `initContainer`
|
||||
|
||||
--
|
||||
|
||||
- The `namespaces` entry of the `rainbow` ConfigMap should be exposed to the `initContainer`
|
||||
|
||||
--
|
||||
|
||||
- The HAProxy configuration should be in a volume shared with HAProxy
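
The hints above can be sketched as a small shell function that an `initContainer` might run to turn the space-delimited namespace list into an HAProxy configuration. This is only an illustration under assumptions: the function name `generate_haproxy_cfg`, the backend name `colors`, the timeouts, and the `color.<namespace>:80` server addresses are hypothetical and are not taken from the exercise solution.

```bash
#!/bin/sh
# Hypothetical generator: prints a minimal HAProxy configuration
# with one "server" line per namespace given in $1 (space-delimited).
generate_haproxy_cfg() {
  namespaces=$1
  cat <<EOF
defaults
  mode tcp
  timeout connect 5s
  timeout client 50s
  timeout server 50s

frontend rainbow
  bind :80
  default_backend colors

backend colors
EOF
  # Assumption: each namespace has a Service named "color" on port 80.
  for ns in $namespaces; do
    echo "  server $ns color.$ns:80"
  done
}

# In the Pod, the list would come from the "namespaces" ConfigMap entry
# (e.g. exposed as an environment variable) and the output would be
# written to the volume shared with the HAProxy container.
generate_haproxy_cfg "blue green"
```

Writing the file from an `initContainer` keeps the HAProxy image unmodified; the trade-off is that the configuration is only regenerated when the Pod restarts.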
|
||||
@@ -1,7 +0,0 @@
|
||||
## Exercise — Build a Cluster
|
||||
|
||||
- Deploy a cluster by configuring and running each component manually
|
||||
|
||||
- Add CNI networking
|
||||
|
||||
- Generate and validate ServiceAccount tokens
|
||||
@@ -1,33 +0,0 @@
|
||||
# Exercise — Build a Cluster
|
||||
|
||||
- Step 1: deploy a cluster
|
||||
|
||||
- follow the steps in the "Dessine-moi un cluster" section
|
||||
|
||||
- Step 2: add CNI networking
|
||||
|
||||
- use kube-router
|
||||
|
||||
- interconnect with the route-reflector
|
||||
|
||||
- check that you receive the routes of other clusters
|
||||
|
||||
- Step 3: generate and validate ServiceAccount tokens
|
||||
|
||||
- see next slide for help!
|
||||
|
||||
---
|
||||
|
||||
## ServiceAccount tokens
|
||||
|
||||
- We need to generate a TLS key pair and certificate
|
||||
|
||||
- A self-signed key will work
|
||||
|
||||
- We don't need anything particular in the certificate
|
||||
|
||||
(no particular CN, key use flags, etc.)
|
||||
|
||||
- The key needs to be passed to both API server and controller manager
|
||||
|
||||
- Check that ServiceAccount tokens are generated correctly
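
As a rough sketch of the key generation step described above, the following commands create a key pair and a self-signed certificate with `openssl`. The file names `sa.key` and `sa.crt`, the scratch directory, and the `/CN=service-accounts` subject are arbitrary choices for illustration (the slide notes that no particular CN is needed).

```bash
#!/bin/sh
set -e

# Work in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
cd "$workdir"

# Generate a 2048-bit RSA private key.
openssl genrsa -out sa.key 2048 2>/dev/null

# Create a self-signed certificate from that key.
# The subject is arbitrary: nothing particular is required in it.
openssl req -new -x509 -key sa.key -out sa.crt -days 365 \
  -subj "/CN=service-accounts"

# Display the subject to confirm that the certificate is readable.
openssl x509 -in sa.crt -noout -subject
```

The private key would then typically be given to the controller manager with `--service-account-private-key-file`, and the certificate (or public key) to the API server with `--service-account-key-file`; recent Kubernetes versions may also require the API server to be given a signing key and an issuer.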
|
||||
@@ -4,6 +4,8 @@
|
||||
|
||||
(we will use the `rng` service in the dockercoins app)
|
||||
|
||||
- See what happens when the load increases
|
||||
- Observe the correct behavior of the readiness probe
|
||||
|
||||
(spoiler alert: it involves timeouts!)
|
||||
(when deploying e.g. an invalid image)
|
||||
|
||||
- Observe the behavior of the liveness probe
|
||||
|
||||
@@ -2,85 +2,36 @@
|
||||
|
||||
- We want to add healthchecks to the `rng` service in dockercoins
|
||||
|
||||
- The `rng` service exhibits an interesting behavior under load:
|
||||
|
||||
*its latency increases (which will cause probes to time out!)*
|
||||
|
||||
- We want to see:
|
||||
|
||||
- what happens when the readiness probe fails
|
||||
|
||||
- what happens when the liveness probe fails
|
||||
|
||||
- how to set "appropriate" probes and probe parameters
|
||||
|
||||
---
|
||||
|
||||
## Setup
|
||||
|
||||
- First, deploy a new copy of dockercoins
|
||||
|
||||
(for instance, in a brand new namespace)
|
||||
- Then, add a readiness probe on the `rng` service
|
||||
|
||||
- Pro tip #1: ping (e.g. with `httping`) the `rng` service at all times
|
||||
|
||||
- it should initially show a few milliseconds latency
|
||||
|
||||
- that will increase when we scale up
|
||||
|
||||
- it will also let us detect when the service goes "boom"
|
||||
|
||||
- Pro tip #2: also keep an eye on the web UI
|
||||
|
||||
---
|
||||
|
||||
## Readiness
|
||||
|
||||
- Add a readiness probe to `rng`
|
||||
|
||||
- this requires editing the pod template in the Deployment manifest
|
||||
|
||||
- use a simple HTTP check on the `/` route of the service
|
||||
|
||||
- keep all other parameters (timeouts, thresholds...) at their default values
|
||||
(using a simple HTTP check on the `/` route of the service)
|
||||
|
||||
- Check what happens when deploying an invalid image for `rng` (e.g. `alpine`)
|
||||
|
||||
*(If the probe was set up correctly, the app will continue to work,
|
||||
because Kubernetes won't switch over the traffic to the `alpine` containers,
|
||||
because they don't pass the readiness probe.)*
|
||||
- Then roll back `rng` to the original image and add a liveness probe
|
||||
|
||||
(with the same parameters)
|
||||
|
||||
- Scale up the `worker` service (to 15+ workers) and observe
|
||||
|
||||
- What happens, and how can we improve the situation?
|
||||
|
||||
---
|
||||
|
||||
## Readiness under load
|
||||
## Goal
|
||||
|
||||
- Then roll back `rng` to the original image
|
||||
- *Before* adding the readiness probe:
|
||||
|
||||
- Check what happens when we scale up the `worker` Deployment to 15+ workers
|
||||
updating the image of the `rng` service with `alpine` should break it
|
||||
|
||||
(get the latency above 1 second)
|
||||
- *After* adding the readiness probe:
|
||||
|
||||
*(We should now observe intermittent unavailability of the service, i.e. every
|
||||
30 seconds it will be unreachable for a bit, then come back, then go away again, etc.)*
|
||||
updating the image of the `rng` service with `alpine` shouldn't break it
|
||||
|
||||
---
|
||||
- When adding the liveness probe, nothing special should happen
|
||||
|
||||
## Liveness
|
||||
- Scaling the `worker` service will then cause disruptions
|
||||
|
||||
- Now replace the readiness probe with a liveness probe
|
||||
|
||||
- What happens now?
|
||||
|
||||
*(At first the behavior looks the same as with the readiness probe:
|
||||
service becomes unreachable, then reachable again, etc.; but there is
|
||||
a significant difference behind the scenes. What is it?)*
|
||||
|
||||
---
|
||||
|
||||
## Readiness and liveness
|
||||
|
||||
- Bonus questions!
|
||||
|
||||
- What happens if we enable both probes at the same time?
|
||||
|
||||
- What strategies can we use so that both probes are useful?
|
||||
- The final goal is to understand why, and how to fix it
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
- the web app itself (dockercoins, NGINX, whatever we want)
|
||||
|
||||
- an ingress controller
|
||||
- an ingress controller (we suggest Traefik)
|
||||
|
||||
- a domain name (use `*.nip.io` or `*.localdev.me`)
|
||||
|
||||
@@ -16,7 +16,7 @@
|
||||
|
||||
## Goal
|
||||
|
||||
- We want to be able to access the web app using a URL like:
|
||||
- We want to be able to access the web app using an URL like:
|
||||
|
||||
http://webapp.localdev.me
|
||||
|
||||
@@ -30,13 +30,11 @@
|
||||
|
||||
## Hints
|
||||
|
||||
- For the ingress controller, we can use:
|
||||
- Traefik can be installed with Helm
|
||||
|
||||
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx/blob/main/docs/deploy/index.md)
|
||||
(it can be found on the Artifact Hub)
|
||||
|
||||
- the [Traefik Helm chart](https://doc.traefik.io/traefik/getting-started/install-traefik/#use-the-helm-chart)
|
||||
|
||||
- the container.training [Traefik DaemonSet](https://raw.githubusercontent.com/jpetazzo/container.training/main/k8s/traefik-v2.yaml)
|
||||
- If using Kubernetes 1.22+, make sure to use Traefik 2.5+
|
||||
|
||||
- If our cluster supports LoadBalancer Services: easy
|
||||
|
||||
|
||||
@@ -1,5 +1,3 @@
|
||||
⚠️ BROKEN EXERCISE - DO NOT USE
|
||||
|
||||
## Exercise — Ingress Secret Policy
|
||||
|
||||
*Implement policy to limit impact of ingress controller vulnerabilities.*
|
||||
|
||||
@@ -1,5 +1,3 @@
|
||||
⚠️ BROKEN EXERCISE - DO NOT USE
|
||||
|
||||
# Exercise — Ingress Secret Policy
|
||||
|
||||
- Most ingress controllers have access to all Secrets
|
||||
@@ -90,6 +88,6 @@
|
||||
|
||||
## Step 5: double-check
|
||||
|
||||
- Check that the Ingress Controller can't access other secrets
|
||||
- Check that the Ingres Controller can't access other secrets
|
||||
|
||||
(e.g. by manually creating a Secret and checking with `kubectl exec`?)
|
||||
|
||||
@@ -8,37 +8,25 @@
|
||||
|
||||
- We'll use one Deployment for each component
|
||||
|
||||
(created with `kubectl create deployment`)
|
||||
(see next slide for the images to use)
|
||||
|
||||
- We'll connect them with Services
|
||||
|
||||
(create with `kubectl expose`)
|
||||
- We'll check that we can access the web UI in a browser
|
||||
|
||||
---
|
||||
|
||||
## Images
|
||||
|
||||
- We'll use the following images:
|
||||
- hasher → `dockercoins/hasher:v0.1`
|
||||
|
||||
- hasher → `dockercoins/hasher:v0.1`
|
||||
- redis → `redis`
|
||||
|
||||
- redis → `redis`
|
||||
- rng → `dockercoins/rng:v0.1`
|
||||
|
||||
- rng → `dockercoins/rng:v0.1`
|
||||
- webui → `dockercoins/webui:v0.1`
|
||||
|
||||
- webui → `dockercoins/webui:v0.1`
|
||||
|
||||
- worker → `dockercoins/worker:v0.1`
|
||||
|
||||
- All services should be internal services, except the web UI
|
||||
|
||||
(since we want to be able to connect to the web UI from outside)
|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
- worker → `dockercoins/worker:v0.1`
|
||||
|
||||
---
|
||||
|
||||
@@ -46,7 +34,7 @@ class: pic
|
||||
|
||||
- We should be able to see the web UI in our browser
|
||||
|
||||
(with the graph showing approximately 3-4 hashes/second)
|
||||
(with the graph showing approximatiely 3-4 hashes/second)
|
||||
|
||||
---
|
||||
|
||||
@@ -56,4 +44,4 @@ class: pic
|
||||
|
||||
(check the logs of the worker; they indicate the port numbers)
|
||||
|
||||
- The web UI can be exposed with a NodePort or LoadBalancer Service
|
||||
- The web UI can be exposed with a NodePort Service
|
||||
|
||||
@@ -1,9 +0,0 @@
|
||||
## Exercise — Generating Ingress With Kyverno
|
||||
|
||||
- When a Service gets created, automatically generate an Ingress
|
||||
|
||||
- Step 1: expose all services with a hard-coded domain name
|
||||
|
||||
- Step 2: only expose services that have a port named `http`
|
||||
|
||||
- Step 3: configure the domain name with a per-namespace ConfigMap
|
||||
@@ -1,33 +0,0 @@
|
||||
# Exercise — Generating Ingress With Kyverno
|
||||
|
||||
When a Service gets created...
|
||||
|
||||
*(for instance, Service `blue` in Namespace `rainbow`)*
|
||||
|
||||
...Automatically generate an Ingress.
|
||||
|
||||
*(for instance, with host name `blue.rainbow.MYDOMAIN.COM`)*
|
||||
|
||||
---
|
||||
|
||||
## Goals
|
||||
|
||||
- Step 1: expose all services with a hard-coded domain name
|
||||
|
||||
- Step 2: only expose services that have a port named `http`
|
||||
|
||||
- Step 3: configure the domain name with a per-namespace ConfigMap
|
||||
|
||||
(e.g. `kubectl create configmap ingress-domain-name --from-literal=domain=1.2.3.4.nip.io`)
|
||||
|
||||
---
|
||||
|
||||
## Hints
|
||||
|
||||
- We want to use a Kyverno `generate` ClusterPolicy
|
||||
|
||||
- For step 1, check [Generate Resources](https://kyverno.io/docs/writing-policies/generate/) documentation
|
||||
|
||||
- For step 2, check [Preconditions](https://kyverno.io/docs/writing-policies/preconditions/) documentation
|
||||
|
||||
- For step 3, check [External Data Sources](https://kyverno.io/docs/writing-policies/external-data-sources/) documentation
|
||||
@@ -1,9 +0,0 @@
|
||||
## Exercise — Remote Cluster
|
||||
|
||||
- Install kubectl locally
|
||||
|
||||
- Retrieve the kubeconfig file of our remote cluster
|
||||
|
||||
- Deploy dockercoins on that cluster
|
||||
|
||||
- Access an internal service without exposing it
|
||||
@@ -1,62 +0,0 @@
|
||||
# Exercise — Remote Cluster
|
||||
|
||||
- We want to control a remote cluster
|
||||
|
||||
- Then we want to run a copy of dockercoins on that cluster
|
||||
|
||||
- We want to be able to connect to an internal service
|
||||
|
||||
---
|
||||
|
||||
## Goal
|
||||
|
||||
- Be able to access e.g. hasher, rng, or webui
|
||||
|
||||
(without exposing them with a NodePort or LoadBalancer service)
|
||||
|
||||
---
|
||||
|
||||
## Getting access to the cluster
|
||||
|
||||
- If you don't have `kubectl` on your machine, install it
|
||||
|
||||
- Download the kubeconfig file from the remote cluster
|
||||
|
||||
(you can use `scp` or even copy-paste it)
|
||||
|
||||
- If you already have a kubeconfig file on your machine:
|
||||
|
||||
- save the remote kubeconfig with another name (e.g. `~/.kube/config.remote`)
|
||||
|
||||
- set the `KUBECONFIG` environment variable to point to that file name
|
||||
|
||||
- ...or use the `--kubeconfig=...` option with `kubectl`
|
||||
|
||||
- Check that you can access the cluster (e.g. `kubectl get nodes`)
|
||||
|
||||
---
|
||||
|
||||
## If you get an error...
|
||||
|
||||
⚠️ The following applies to clusters deployed with `kubeadm`
|
||||
|
||||
- If you have a cluster where the nodes are named `node1`, `node2`, etc.
|
||||
|
||||
- `kubectl` commands might show connection errors with internal IP addresses
|
||||
|
||||
(e.g. 10.10... or 172.17...)
|
||||
|
||||
- In that case, you might need to edit the `kubeconfig` file:
|
||||
|
||||
- find the server address
|
||||
|
||||
- update it to put the *external* address of the first node of the cluster
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Deploying an app
|
||||
|
||||
- Deploy another copy of dockercoins from your local machine
|
||||
|
||||
- Access internal services (e.g. with `kubectl port-forward`)
|
||||
@@ -24,9 +24,9 @@ We will call them "dev cluster" and "prod cluster".
|
||||
|
||||
- Our application needs two secrets:
|
||||
|
||||
- a *logging API token* (not too sensitive; same in dev and prod)
|
||||
- `logging_api_token` (not too sensitive; same in dev and prod)
|
||||
|
||||
- a *database password* (sensitive; different in dev and prod)
|
||||
- `database_password` (sensitive; different in dev and prod)
|
||||
|
||||
- Secrets can be exposed as env vars, or mounted in volumes
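
The two options can be sketched in a single Pod manifest, as below. The Pod name, the image, the mount path, and the key name `token` are assumptions made for this example. The Secret names use hyphens, because Kubernetes object names cannot contain underscores.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo            # hypothetical name
  namespace: dev
spec:
  containers:
  - name: app
    image: nginx               # placeholder image
    env:
    # Option 1: expose one key of a Secret as an environment variable.
    - name: LOGGING_API_TOKEN
      valueFrom:
        secretKeyRef:
          name: logging-api-token
          key: token           # assumed key inside the Secret
    volumeMounts:
    # Option 2: mount a Secret as files (one file per key).
    - name: database-password
      mountPath: /etc/secrets/database
      readOnly: true
  volumes:
  - name: database-password
    secret:
      secretName: database-password
```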
|
||||
|
||||
@@ -42,7 +42,7 @@ We will call them "dev cluster" and "prod cluster".
|
||||
|
||||
- On the dev cluster, create a Namespace called `dev`
|
||||
|
||||
- Create the two secrets, `logging-api-token` and `database-password`
|
||||
- Create the two secrets, `logging_api_token` and `database_password`
|
||||
|
||||
(the content doesn't matter; put a random string of your choice)
|
||||
|
||||
@@ -110,8 +110,8 @@ We want Alice to be able to:
|
||||
|
||||
- deploy the whole application in the `prod` namespace
|
||||
|
||||
- access the *logging API token* secret
|
||||
- access the `logging_api_token` secret
|
||||
|
||||
- but *not* the *database password* secret
|
||||
- but *not* the `database_password` secret
|
||||
|
||||
- view the logs of the app
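
One possible way to express the Secret and log requirements is a Role that uses `resourceNames`. This is only a partial sketch: it does not cover the permissions needed to deploy the application, it still has to be bound to Alice with a RoleBinding, and the Role name is hypothetical. It assumes the Secret is named `logging-api-token`.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: alice-limited-secrets   # hypothetical name
  namespace: prod
rules:
# Only "get" on one named Secret; no "list", which would expose every Secret.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["logging-api-token"]
  verbs: ["get"]
# Viewing logs requires access to Pods and to the "log" subresource.
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list"]
```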
|
||||
|
||||
@@ -1,9 +0,0 @@
|
||||
## Exercise — Terraform Node Pools
|
||||
|
||||
- Write a Terraform configuration to deploy a cluster
|
||||
|
||||
- The cluster should have two node pools with autoscaling
|
||||
|
||||
- Deploy two apps, each using exclusively one node pool
|
||||
|
||||
- Bonus: deploy an app balanced across both node pools
|
||||
@@ -1,69 +0,0 @@
|
||||
# Exercise — Terraform Node Pools
|
||||
|
||||
- Write a Terraform configuration to deploy a cluster
|
||||
|
||||
- The cluster should have two node pools with autoscaling
|
||||
|
||||
- Deploy two apps, each using exclusively one node pool
|
||||
|
||||
- Bonus: deploy an app balanced across both node pools
|
||||
|
||||
---
|
||||
|
||||
## Cluster deployment
|
||||
|
||||
- Write a Terraform configuration to deploy a cluster
|
||||
|
||||
- We want to have two node pools with autoscaling
|
||||
|
||||
- Example for sizing:
|
||||
|
||||
- 4 GB / 1 CPU per node
|
||||
|
||||
- pools of 1 to 4 nodes
|
||||
|
||||
---
|
||||
|
||||
## Cluster autoscaling
|
||||
|
||||
- Deploy an app on the cluster
|
||||
|
||||
(you can use `nginx`, `jpetazzo/color`...)
|
||||
|
||||
- Set a resource request (e.g. 1 GB RAM)
|
||||
|
||||
- Scale up and verify that the autoscaler kicks in
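
A Deployment with a memory request could look like the sketch below. The name `scale-test` and the `nginx` image are placeholders, and the `1G` request matches the example value given above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-test             # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: scale-test
  template:
    metadata:
      labels:
        app: scale-test
    spec:
      containers:
      - name: web
        image: nginx
        resources:
          requests:
            memory: 1G         # the scheduler reserves this much per Pod
```

With nodes of about 4 GB, only a few such Pods fit on each node. Scaling the Deployment beyond that (for instance with `kubectl scale`) should leave some Pods `Pending`, which is what prompts the cluster autoscaler to add nodes.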
|
||||
|
||||
---
|
||||
|
||||
## Pool isolation
|
||||
|
||||
- We want to deploy two apps
|
||||
|
||||
- The first app should be deployed exclusively on the first pool
|
||||
|
||||
- The second app should be deployed exclusively on the second pool
|
||||
|
||||
- Check the next slide for hints!
|
||||
|
||||
---
|
||||
|
||||
## Hints
|
||||
|
||||
- One solution involves adding a `nodeSelector` to the pod templates
|
||||
|
||||
- Another solution involves adding:
|
||||
|
||||
- `taints` to the node pools
|
||||
|
||||
- matching `tolerations` to the pod templates
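
Both hints can be sketched in a pod template fragment. The label key `pool`, the taint key `dedicated`, and the value `first` are assumptions: the actual node labels and taints depend on how the node pools are declared in the Terraform configuration.

```yaml
# Hypothetical pod template fragment for the app pinned to the first pool.
spec:
  # Hint 1: only schedule on nodes carrying this label.
  nodeSelector:
    pool: first
  # Hint 2: tolerate the taint placed on the first pool's nodes
  # (assumed here to be dedicated=first:NoSchedule).
  tolerations:
  - key: dedicated
    operator: Equal
    value: first
    effect: NoSchedule
```

A toleration only allows a Pod onto tainted nodes; it does not attract it there. Taints alone achieve isolation only if every pool is tainted and each app tolerates just its own pool's taint.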
|
||||
|
||||
---
|
||||
|
||||
## Balancing
|
||||
|
||||
- Step 1: make sure that the pools are not balanced
|
||||
|
||||
- Step 2: deploy a new app, check that it goes to the emptiest pool
|
||||
|
||||
- Step 3: update the app so that it balances (as much as possible) between pools
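
For step 3, one option (an assumption here, not necessarily the intended solution) is a topology spread constraint keyed on a node label that identifies the pool. The label key `pool` and the `app: balanced` selector are hypothetical.

```yaml
# Hypothetical pod template fragment.
spec:
  topologySpreadConstraints:
  - maxSkew: 1                  # at most 1 Pod of difference between pools
    topologyKey: pool           # assumed node label identifying the pool
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: balanced           # must match the labels of the app's own Pods
```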
|
||||
@@ -1,60 +0,0 @@
|
||||
#!/bin/sh

# The materials for a given training live in their own branch.
# Sometimes, we write custom content (or simply new content) for a training,
# and that content doesn't get merged back to main. This script tries to
# detect that with the following heuristics:
# - list all remote branches
# - for each remote branch, list the changes that weren't merged into main
#   (using "diff main...$BRANCH", three dots)
# - ignore a bunch of training-specific files that change all the time anyway
# - for the remaining files, compute the diff between main and the branch
#   (using "diff main..$BRANCH", two dots)
# - ignore changes of less than 10 lines
# - also ignore a few red herrings
# - display whatever is left

# For "git diff" (in the filter function) to work correctly, we must be
# at the root of the repo.
cd $(git rev-parse --show-toplevel)

BRANCHES=$(git branch -r | grep -v origin/HEAD | grep origin/2)

filter() {
  threshold=10
  while read filename; do
    case $filename in
    # Generic training-specific files
    slides/*.html) continue;;
    slides/*.yml) continue;;
    slides/logistics*.md) continue;;
    # Specific content that can be ignored
    #slides/containers/Local_Environment.md) threshold=100;;
    # Content that was moved/refactored enough to confuse us
    slides/containers/Local_Environment.md) threshold=100;;
    slides/exercises.md) continue;;
    slides/k8s/batch-jobs) threshold=20;;
    # Renames
    */{*}*) continue;;
    esac
    git diff --find-renames --numstat main..$BRANCH -- "$filename" | {
      # If the files are identical, the diff will be empty, and "read" will fail.
      read plus minus filename || return
      # Ignore binary files (FIXME though?)
      if [ $plus = - ]; then
        return
      fi
      diff=$((plus-minus))
      if [ $diff -gt $threshold ]; then
        echo git diff main..$BRANCH -- $filename
      fi
    }
  done
}

for BRANCH in $BRANCHES; do
  if FILES=$(git diff --find-renames --name-only main...$BRANCH | filter | grep .); then
    echo "🌳 $BRANCH:"
    echo "$FILES"
  fi
done
|
||||
@@ -32,7 +32,7 @@
|
||||
|
||||
- You're welcome to use whatever you like (e.g. AWS profiles)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Set the AWS region, API access key, and secret key:
|
||||
```bash
|
||||
@@ -58,7 +58,7 @@
|
||||
|
||||
- register it in our kubeconfig file
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Update our kubeconfig file:
|
||||
```bash
|
||||
|
||||
@@ -20,13 +20,13 @@
|
||||
|
||||
## Suspension of disbelief
|
||||
|
||||
The labs and demos in this section assume that we have set up `kubectl` on our
|
||||
The exercises in this section assume that we have set up `kubectl` on our
|
||||
local machine in order to access a remote cluster.
|
||||
|
||||
We will therefore show how to access services and pods of the remote cluster,
|
||||
from our local machine.
|
||||
|
||||
You can also run these commands directly on the cluster (if you haven't
|
||||
You can also run these exercises directly on the cluster (if you haven't
|
||||
installed and set up `kubectl` locally).
|
||||
|
||||
Running commands locally will be less useful
|
||||
@@ -58,7 +58,7 @@ installed and set up `kubectl` to communicate with your cluster.
|
||||
|
||||
- Let's access the `webui` service through `kubectl proxy`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Run an API proxy in the background:
|
||||
```bash
|
||||
@@ -101,7 +101,7 @@ installed and set up `kubectl` to communicate with your cluster.
|
||||
|
||||
- Let's access our remote Redis server
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Forward connections from local port 10000 to remote port 6379:
|
||||
```bash
|
||||
|
||||
@@ -198,7 +198,7 @@ Some examples ...
|
||||
|
||||
(the Node "echo" app, the Flask app, and one ngrok tunnel for each of them)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Go to the webhook directory:
|
||||
```bash
|
||||
@@ -244,7 +244,7 @@ class: extra-details
|
||||
|
||||
- We need to update the configuration with the correct `url`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Edit the webhook configuration manifest:
|
||||
```bash
|
||||
@@ -271,7 +271,7 @@ class: extra-details
|
||||
|
||||
(so if the webhook server is down, we can still create pods)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Register the webhook:
|
||||
```bash
|
||||
@@ -288,7 +288,7 @@ It is strongly recommended to tail the logs of the API server while doing that.
|
||||
|
||||
- Let's create a pod and try to set a `color` label
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create a pod named `chroma`:
|
||||
```bash
|
||||
@@ -328,7 +328,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
|
||||
|
||||
## Update the webhook configuration
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- First, check the ngrok URL of the tunnel for the Flask app:
|
||||
```bash
|
||||
@@ -395,7 +395,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
|
||||
|
||||
## Let's get to work!
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Make sure we're in the right directory:
|
||||
```bash
|
||||
@@ -424,7 +424,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
|
||||
|
||||
... we'll store it in a ConfigMap, and install dependencies on the fly
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Load the webhook source in a ConfigMap:
|
||||
```bash
|
||||
@@ -446,7 +446,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
|
||||
|
||||
(of course, there are plenty of other options; e.g. `cfssl`)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Generate a self-signed certificate:
|
||||
```bash
|
||||
@@ -470,7 +470,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
|
||||
|
||||
- Let's reconfigure the webhook to use our Service instead of ngrok
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Edit the webhook configuration manifest:
|
||||
```bash
|
||||
@@ -504,7 +504,7 @@ Note: the webhook doesn't do anything (other than printing the request payload).
|
||||
|
||||
Shell to the rescue!
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Load up our cert and encode it in base64:
|
||||
```bash
|
||||
|
||||
@@ -66,7 +66,7 @@
|
||||
|
||||
- We'll ask `kubectl` to show us the exact requests that it's making
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the URI for a cluster-scope, "core" resource, e.g. a Node:
|
||||
```bash
|
||||
@@ -122,7 +122,7 @@ class: extra-details
|
||||
|
||||
- What about namespaced resources?
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the URI for a namespaced, "core" resource, e.g. a Service:
|
||||
```bash
|
||||
@@ -169,7 +169,7 @@ class: extra-details
|
||||
|
||||
## Accessing a subresource
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- List `kube-proxy` pods:
|
||||
```bash
|
||||
@@ -200,7 +200,7 @@ command=echo&command=hello&command=world&container=kube-proxy&stderr=true&stdout
|
||||
|
||||
- There are at least three useful commands to introspect the API server
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- List resource types, their group, kind, short names, and scope:
|
||||
```bash
|
||||
@@ -249,7 +249,7 @@ command=echo&command=hello&command=world&container=kube-proxy&stderr=true&stdout
|
||||
|
||||
The following assumes that `metrics-server` is deployed on your cluster.
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check that the `metrics.k8s.io` API group is registered with `metrics-server`:
|
||||
```bash
|
||||
@@ -271,7 +271,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
|
||||
|
||||
- We can have multiple resources with the same name
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Look for resources named `node`:
|
||||
```bash
|
||||
@@ -298,7 +298,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
|
||||
|
||||
- But we can look at the raw data (with `-o json` or `-o yaml`)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Look at NodeMetrics objects with one of these commands:
|
||||
```bash
|
||||
@@ -320,7 +320,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
|
||||
|
||||
--
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Display node metrics:
|
||||
```bash
|
||||
@@ -342,7 +342,7 @@ The following assumes that `metrics-server` is deployed on your cluster.
|
||||
|
||||
- Then we can register that server by creating an APIService resource
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the definition used for the `metrics-server`:
|
||||
```bash
|
||||
|
||||
@@ -103,7 +103,7 @@ class: extra-details
|
||||
|
||||
---
|
||||
|
||||
## `WithWaitGroup`
|
||||
## `WithWaitGroup`,
|
||||
|
||||
- When we shut down, tells clients (with in-flight requests) to retry
|
||||
|
||||
|
||||
@@ -20,67 +20,25 @@ The control plane can run:
|
||||
|
||||
- in containers, on the same nodes that run other application workloads
|
||||
|
||||
(default behavior for local clusters like [Minikube](https://github.com/kubernetes/minikube), [kind](https://kind.sigs.k8s.io/)...)
|
||||
(example: [Minikube](https://github.com/kubernetes/minikube); 1 node runs everything, [kind](https://kind.sigs.k8s.io/))
|
||||
|
||||
- on a dedicated node
|
||||
|
||||
(default behavior when deploying with kubeadm)
|
||||
(example: a cluster installed with kubeadm)
|
||||
|
||||
- on a dedicated set of nodes
|
||||
|
||||
([Kubernetes The Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way); [kops](https://github.com/kubernetes/kops); also kubeadm)
|
||||
(example: [Kubernetes The Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way); [kops](https://github.com/kubernetes/kops))
|
||||
|
||||
- outside of the cluster
|
||||
|
||||
(most managed clusters like AKS, DOK, EKS, GKE, Kapsule, LKE, OKE...)
|
||||
(example: most managed clusters like AKS, EKS, GKE)
|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||

|
||||
|
||||
---
|
||||
|
||||
@@ -157,6 +115,12 @@ The kubelet agent uses a number of special-purpose protocols and interfaces, inc
|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
# The Kubernetes API
|
||||
|
||||
[
|
||||
@@ -203,9 +167,9 @@ What does that mean?
|
||||
|
||||
## Let's experiment a bit!
|
||||
|
||||
- For this section, connect to the first node of the `test` cluster
|
||||
- For the exercises in this section, connect to the first node of the `test` cluster
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- SSH to the first node of the test cluster
|
||||
|
||||
@@ -224,7 +188,7 @@ What does that mean?
|
||||
|
||||
- Let's create a simple object
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create a namespace with the following command:
|
||||
```bash
|
||||
@@ -246,7 +210,7 @@ This is equivalent to `kubectl create namespace hello`.
|
||||
|
||||
- Let's retrieve the object we just created
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Read back our object:
|
||||
```bash
|
||||
@@ -354,7 +318,7 @@ class: extra-details
|
||||
|
||||
- The easiest way is to use `kubectl label`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- In one terminal, watch namespaces:
|
||||
```bash
|
||||
@@ -402,7 +366,7 @@ class: extra-details
|
||||
|
||||
- DELETED resources
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- In one terminal, watch pods, displaying full events:
|
||||
```bash
|
||||
|
||||
@@ -361,7 +361,7 @@ class: extra-details
|
||||
|
||||
## Listing service accounts
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- The resource name is `serviceaccount` or `sa` for short:
|
||||
```bash
|
||||
@@ -378,7 +378,7 @@ class: extra-details
|
||||
|
||||
## Finding the secret
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- List the secrets for the `default` service account:
|
||||
```bash
|
||||
@@ -398,7 +398,7 @@ class: extra-details
|
||||
|
||||
- The token is stored in the secret, wrapped with base64 encoding
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- View the secret:
|
||||
```bash
|
||||
@@ -421,7 +421,7 @@ class: extra-details
|
||||
|
||||
- Let's send a request to the API, without and with the token
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Find the ClusterIP for the `kubernetes` service:
|
||||
```bash
|
||||
@@ -495,49 +495,6 @@ class: extra-details
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Listing all possible verbs
|
||||
|
||||
- The Kubernetes API is self-documented
|
||||
|
||||
- We can ask it which resources, subresources, and verbs exist
|
||||
|
||||
- One way to do this is to use:
|
||||
|
||||
- `kubectl get --raw /api/v1` (for core resources with `apiVersion: v1`)
|
||||
|
||||
- `kubectl get --raw /apis/<group>/<version>` (for other resources)
|
||||
|
||||
- The JSON response can be formatted with e.g. `jq` for readability
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Examples
|
||||
|
||||
- List all verbs across all `v1` resources
|
||||
|
||||
```bash
|
||||
kubectl get --raw /api/v1 | jq -r '.resources[].verbs[]' | sort -u
|
||||
```
|
||||
|
||||
- List all resources and subresources in `apps/v1`
|
||||
|
||||
```bash
|
||||
kubectl get --raw /apis/apps/v1 | jq -r '.resources[].name'
|
||||
```
|
||||
|
||||
- List which verbs are available on which resources in `networking.k8s.io`
|
||||
|
||||
```bash
|
||||
kubectl get --raw /apis/networking.k8s.io/v1 | \
|
||||
jq -r '.resources[] | .name + ": " + (.verbs | join(", "))'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## From rules to roles to rolebindings
|
||||
|
||||
- A *role* is an API object containing a list of *rules*
|
||||
@@ -616,7 +573,7 @@ class: extra-details
|
||||
|
||||
- Nixery automatically generates images with the requested packages
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Run our pod:
|
||||
```bash
|
||||
@@ -632,7 +589,7 @@ class: extra-details
|
||||
|
||||
- Normally, at this point, we don't have any API permission
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check our permissions with `kubectl`:
|
||||
```bash
|
||||
@@ -658,7 +615,7 @@ class: extra-details
|
||||
|
||||
(but again, we could call it `view` or whatever we like)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create the new role binding:
|
||||
```bash
|
||||
@@ -716,7 +673,7 @@ It's important to note a couple of details in these flags...
|
||||
|
||||
- We should be able to *view* things, but not to *edit* them
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check our permissions with `kubectl`:
|
||||
```bash
|
||||
@@ -971,18 +928,6 @@ class: extra-details
|
||||
kubectl describe clusterrole cluster-admin
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## `list` vs. `get`
|
||||
|
||||
⚠️ `list` grants read permissions to resources!
|
||||
|
||||
- It's not possible to give permission to list resources without also reading them
|
||||
|
||||
- This has implications for e.g. Secrets
|
||||
|
||||
(if a controller needs to be able to enumerate Secrets, it will be able to read them)
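
To make this concrete, here is a minimal Role granting only `list` on Secrets. The Role name and the namespace are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-lister          # hypothetical name
  namespace: default
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["list"]              # no "get", yet the content is still readable
```

A subject bound to this Role cannot `get` an individual Secret by name, but a list request (e.g. `kubectl get secrets -o yaml`) returns the full objects, including their `data` fields.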
|
||||
|
||||
???
|
||||
|
||||
:EN:- Authentication and authorization in Kubernetes
|
||||
|
||||
@@ -93,7 +93,7 @@
|
||||
|
||||
- We can use the `--dry-run=client` option
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Generate the YAML for a Deployment without creating it:
|
||||
```bash
|
||||
@@ -128,7 +128,7 @@ class: extra-details
|
||||
|
||||
## The limits of `kubectl apply --dry-run=client`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Generate the YAML for a deployment:
|
||||
```bash
|
||||
@@ -161,7 +161,7 @@ class: extra-details
|
||||
|
||||
(all validation and mutation hooks will be executed)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Try the same YAML file as earlier, with server-side dry run:
|
||||
```bash
|
||||
@@ -200,7 +200,7 @@ class: extra-details
|
||||
|
||||
- `kubectl diff` does a server-side dry run, *and* shows differences
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Try `kubectl diff` on the YAML that we tweaked earlier:
|
||||
```bash
|
||||
|
||||
@@ -1,693 +0,0 @@
|
||||
# Amazon EKS
|
||||
|
||||
- Elastic Kubernetes Service
|
||||
|
||||
- AWS runs the Kubernetes control plane
|
||||
|
||||
(all we see is an API server endpoint)
|
||||
|
||||
- Pods can run on any combination of:
|
||||
|
||||
- EKS-managed nodes
|
||||
|
||||
- self-managed nodes
|
||||
|
||||
- Fargate
|
||||
|
||||
- Leverages and integrates with AWS services and APIs
|
||||
|
||||
---
|
||||
|
||||
## Some integrations
|
||||
|
||||
- Authenticate with IAM users and roles
|
||||
|
||||
- Associate IAM roles to Kubernetes ServiceAccounts
|
||||
|
||||
- Load balance traffic with ALB/ELB/NLB
|
||||
|
||||
- Persist data with EBS/EFS
|
||||
|
||||
- Label nodes with instance ID, instance type, region, AZ ...
|
||||
|
||||
- Pods can be "first class citizens" of VPC
|
||||
|
||||
---
|
||||
|
||||
## Pros/cons
|
||||
|
||||
- Fully managed control plane
|
||||
|
||||
- Handles deployment, upgrade, scaling of the control plane
|
||||
|
||||
- Available versions and features tend to lag a bit
|
||||
|
||||
- Doesn't fit the most demanding users
|
||||
|
||||
("demanding" starts somewhere between 100 and 1000 nodes)
|
||||
|
||||
---
|
||||
|
||||
## Good to know ...
|
||||
|
||||
- Some integrations are specific to EKS
|
||||
|
||||
(some authentication models)
|
||||
|
||||
- Many integrations are *not* specific to EKS
|
||||
|
||||
- The Cloud Controller Manager can run outside of EKS
|
||||
|
||||
(and provide LoadBalancer services, EBS volumes, and more)
|
||||
|
||||
---
|
||||
|
||||
# Provisioning clusters
|
||||
|
||||
- AWS console, API, CLI
|
||||
|
||||
- `eksctl`
|
||||
|
||||
- Infrastructure-as-Code
|
||||
|
||||
---
|
||||
|
||||
## AWS "native" provisioning
|
||||
|
||||
- AWS web console
|
||||
|
||||
- click-click-click!
|
||||
|
||||
- difficulty: low
|
||||
|
||||
- AWS API or CLI
|
||||
|
||||
- must provide subnets, ARNs
|
||||
|
||||
- difficulty: medium
|
||||
|
||||
---
|
||||
|
||||
## `eksctl`
|
||||
|
||||
- Originally developed by Weave
|
||||
|
||||
(back when AWS "native" provisioning wasn't very good)
|
||||
|
||||
- `eksctl create cluster` just works™
|
||||
|
||||
- Has been "adopted" by AWS
|
||||
|
||||
(is listed in the official documentation)
|
||||
|
||||
---
|
||||
|
||||
## Infrastructure-as-Code
|
||||
|
||||
- CloudFormation
|
||||
|
||||
- Terraform
|
||||
|
||||
[terraform-aws-eks](https://github.com/terraform-aws-modules/terraform-aws-eks)
|
||||
by the community
|
||||
([example](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic))
|
||||
|
||||
[terraform-provider-aws](https://github.com/hashicorp/terraform-provider-aws)
|
||||
by HashiCorp
|
||||
([example](https://github.com/hashicorp/terraform-provider-aws/tree/main/examples/eks-getting-started))
|
||||
|
||||
[Kubestack](https://www.kubestack.com/)
|
||||
|
||||
---
|
||||
|
||||
## Node groups
|
||||
|
||||
- Virtually all provisioning models have a concept of "node group"
|
||||
|
||||
- Node group = group of similar nodes in an ASG
|
||||
|
||||
- can span multiple AZ
|
||||
|
||||
- can have instances of different types¹
|
||||
|
||||
- A cluster will need at least one node group
|
||||
|
||||
.footnote[¹As I understand it, to specify fallbacks if one instance type is unavailable or out of capacity.]
|
||||
|
||||
---
|
||||
|
||||
# IAM → EKS authentication
|
||||
|
||||
- Access EKS clusters using IAM users and roles
|
||||
|
||||
- No special role, permission, or policy is needed in IAM
|
||||
|
||||
(but the `eks:DescribeCluster` permission can be useful, see later)
|
||||
|
||||
- Users and roles need to be explicitly listed in the cluster
|
||||
|
||||
- Configuration is done through a ConfigMap in the cluster
|
||||
|
||||
---
|
||||
|
||||
## Setting it up
|
||||
|
||||
- Nothing to do when creating the cluster
|
||||
|
||||
(feature is always enabled)
|
||||
|
||||
- Users and roles are *mapped* to Kubernetes users and groups
|
||||
|
||||
(through the `aws-auth` ConfigMap in `kube-system`)
|
||||
|
||||
- That's it!
|
||||
|
||||
---
|
||||
|
||||
## Mapping
|
||||
|
||||
- The `aws-auth` ConfigMap can contain two entries:
|
||||
|
||||
- `mapRoles` (map IAM roles)
|
||||
|
||||
- `mapUsers` (map IAM users)
|
||||
|
||||
- Each entry is a YAML document (stored as a multi-line string)
|
||||
|
||||
- Each entry includes:
|
||||
|
||||
- `rolearn` or `userarn` to map
|
||||
|
||||
- `username` (as a string)
|
||||
|
||||
- `groups` (as a list; can be empty)
|
||||
|
||||
---
|
||||
|
||||
## Example
|
||||
|
||||
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: kube-system
  name: aws-auth
data:
  mapRoles: `|`
    - rolearn: arn:aws:iam::111122223333:role/blah
      username: blah
      groups: [ devs, ops ]
  mapUsers: `|`
    - userarn: arn:aws:iam::111122223333:user/alice
      username: alice
      groups: [ system:masters ]
    - userarn: arn:aws:iam::111122223333:user/bob
      username: bob
      groups: [ system:masters ]
```
|
||||
|
||||
---
|
||||
|
||||
## Client setup
|
||||
|
||||
- We need either the `aws` CLI or the `aws-iam-authenticator`
|
||||
|
||||
- We use them as `exec` plugins in `~/.kube/config`
|
||||
|
||||
- Done automatically by `eksctl`
|
||||
|
||||
- Or manually with `aws eks update-kubeconfig`
|
||||
|
||||
- Discovering the address of the API server requires one IAM permission
|
||||
|
||||
```json
"Action": [
  "eks:DescribeCluster"
],
"Resource": "arn:aws:eks:<region>:<account>:cluster/<cluster-name>"
```
|
||||
|
||||
(wildcards can be used when specifying the resource)
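As a rough illustration of what the `exec` plugin configuration in `~/.kube/config` looks like, the sketch below builds the user entry as a Python dictionary. The cluster name and region are placeholders, the helper name `exec_plugin_user` is hypothetical, and the `apiVersion` is an assumption that depends on the client versions in use.

```python
import json

def exec_plugin_user(cluster_name, region):
    """Build a kubeconfig user entry that calls `aws eks get-token`."""
    return {
        "name": cluster_name,
        "user": {
            "exec": {
                # Assumption: the exact apiVersion varies with kubectl / AWS CLI versions.
                "apiVersion": "client.authentication.k8s.io/v1beta1",
                "command": "aws",
                "args": ["eks", "get-token",
                         "--cluster-name", cluster_name,
                         "--region", region],
            }
        },
    }

# Placeholder values, for illustration only.
print(json.dumps(exec_plugin_user("my-cluster", "eu-west-1"), indent=2))
```

In practice there is no need to write this by hand, since `eksctl` and `aws eks update-kubeconfig` generate it.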
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## How it works
|
||||
|
||||
- The helper generates a token
|
||||
|
||||
(with `aws eks get-token` or `aws-iam-authenticator token`)
|
||||
|
||||
- Note: these calls will always succeed!
|
||||
|
||||
(even if AWS API keys are invalid)
|
||||
|
||||
- The token is used to authenticate with the Kubernetes API
|
||||
|
||||
- AWS' Kubernetes API server will decode and validate the token
|
||||
|
||||
(and map the underlying user or role accordingly)
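To the best of my knowledge, the token is essentially a pre-signed AWS STS URL, base64-encoded and prefixed with `k8s-aws-v1.`. That would be consistent with the note above: generating it is a local computation, so it can succeed even with invalid keys. The sketch below only wraps and unwraps such a token; it does not sign or validate anything, and the prefix and demo URL are assumptions.

```python
import base64

PREFIX = "k8s-aws-v1."  # assumption: prefix used by aws-iam-authenticator tokens

def decode_token(token):
    """Return the URL wrapped inside a token (no signature validation)."""
    if not token.startswith(PREFIX):
        raise ValueError("unexpected token prefix")
    payload = token[len(PREFIX):]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return base64.urlsafe_b64decode(payload).decode()

def encode_token(url):
    """Inverse of decode_token, used here only to build a demo token."""
    return PREFIX + base64.urlsafe_b64encode(url.encode()).decode().rstrip("=")

# Demo URL without any real signature parameters.
demo = encode_token("https://sts.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15")
print(decode_token(demo))
```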
|
||||
|
||||
---
|
||||
|
||||
## Read The Fine Manual
|
||||
|
||||
https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
|
||||
|
||||
---
|
||||
|
||||
# EKS → IAM authentication
|
||||
|
||||
- Access AWS services from workloads running on EKS
|
||||
|
||||
(e.g.: access S3 bucket from code running in a Pod)
|
||||
|
||||
- This works by associating an IAM role with a K8S ServiceAccount
|
||||
|
||||
- There are also a few specific roles used internally by EKS
|
||||
|
||||
(e.g. to let the nodes establish network configurations)
|
||||
|
||||
- ... We won't talk about these
|
||||
|
||||
---
|
||||
|
||||
## The big picture
|
||||
|
||||
- One-time setup task
|
||||
|
||||
([create an OIDC provider associated to our EKS cluster](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html))
|
||||
|
||||
- Create (or update) a role with an appropriate *trust policy*
|
||||
|
||||
(more on that later)
|
||||
|
||||
- Annotate service accounts to map them to that role
|
||||
|
||||
`eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/some-iam-role`
|
||||
|
||||
- Create (or re-create) pods using that ServiceAccount
|
||||
|
||||
- The pods can now use that role!
|
||||
|
||||
---
|
||||
|
||||
## Trust policies
|
||||
|
||||
- IAM roles have a *trust policy* (aka *assume role policy*)
|
||||
|
||||
(cf `aws iam create-role ... --assume-role-policy-document ...`)
|
||||
|
||||
- That policy contains a *statement* list
|
||||
|
||||
- This list indicates who/what is allowed to assume (use) the role
|
||||
|
||||
- In the current scenario, that policy will contain something saying:
|
||||
|
||||
*ServiceAccount S on EKS cluster C is allowed to use this role*
|
||||
|
||||
---
|
||||
|
||||
## Trust policy for a single ServiceAccount
|
||||
|
||||
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub":
          "system:serviceaccount:<namespace>:<service-account>"
        }
      }
    }
  ]
}
```
|
||||
|
||||
---
|
||||
|
||||
## Trust policy for multiple ServiceAccounts
|
||||
|
||||
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "${OIDC_PROVIDER}:sub":
          ["system:serviceaccount:container-training:*"]
        }
      }
    }
  ]
}
```
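The two trust policies differ only in the condition operator and in the list of subjects. A small generator, sketched here in Python, can make that explicit. The function name `trust_policy`, the account ID, and the OIDC provider value are placeholders chosen for illustration.

```python
import json

def trust_policy(account_id, oidc_provider, subjects, wildcard=False):
    """Build an IAM trust policy letting ServiceAccount subjects assume a role."""
    # StringLike is needed as soon as a subject contains a wildcard.
    operator = "StringLike" if wildcard else "StringEquals"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {operator: {f"{oidc_provider}:sub": subjects}},
        }],
    }

# Placeholder account ID and OIDC provider.
single = trust_policy("111122223333", "oidc.example.com/id/EXAMPLE",
                      "system:serviceaccount:default:my-app")
many = trust_policy("111122223333", "oidc.example.com/id/EXAMPLE",
                    ["system:serviceaccount:container-training:*"], wildcard=True)
print(json.dumps(single, indent=2))
```

The resulting JSON document is what would be passed to `aws iam create-role ... --assume-role-policy-document ...`.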
|
||||
|
||||
---
|
||||
|
||||
## The little details
|
||||
|
||||
- When pods are created, they are processed by a mutating webhook
|
||||
|
||||
(typically named `pod-identity-webhook`)
|
||||
|
||||
- Pods using a ServiceAccount with the right annotation get:
|
||||
|
||||
- an extra token
|
||||
<br/>
|
||||
(mounted in `/var/run/secrets/eks.amazonaws.com/serviceaccount/token`)
|
||||
|
||||
- a few env vars
|
||||
<br/>
|
||||
(including `AWS_WEB_IDENTITY_TOKEN_FILE` and `AWS_ROLE_ARN`)
|
||||
|
||||
- AWS client libraries and tooling will work with that
|
||||
|
||||
(see [this list](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html) for supported versions)
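A quick way to check from inside a Pod whether the webhook did its job is to look for the two environment variables. This is a minimal sketch; the helper name `irsa_settings` is hypothetical, and real AWS SDKs do considerably more (reading the token, calling STS, refreshing credentials).

```python
import os

def irsa_settings(environ=os.environ):
    """Return (role_arn, token_file) if both variables are set, else None."""
    role_arn = environ.get("AWS_ROLE_ARN")
    token_file = environ.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if role_arn and token_file:
        return role_arn, token_file
    return None

# Prints None when run outside of a Pod that has the annotation-driven setup.
print(irsa_settings())
```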
|
||||
|
||||
---
|
||||
|
||||
# CNI
|
||||
|
||||
- EKS is a compliant Kubernetes implementation
|
||||
|
||||
(which means we can use a wide range of CNI plugins)
|
||||
|
||||
- However, the recommended CNI plugin is the "AWS VPC CNI"
|
||||
|
||||
(https://github.com/aws/amazon-vpc-cni-k8s)
|
||||
|
||||
- Pods are then "first class citizens" of AWS VPC
|
||||
|
||||
---
|
||||
|
||||
## AWS VPC CNI
|
||||
|
||||
- Each Pod gets an address in a VPC subnet
|
||||
|
||||
- No overlay network, no encapsulation, no overhead
|
||||
|
||||
(other than AWS network fabric, obviously)
|
||||
|
||||
- Probably the fastest network option when running on AWS
|
||||
|
||||
- Allows "direct" load balancing (more on that later)
|
||||
|
||||
- Can use security groups with Pod traffic
|
||||
|
||||
- But: limits the number of Pods per Node
|
||||
|
||||
- But: more complex configuration (more on that later)
|
||||
|
||||
---
|
||||
|
||||
## Number of Pods per Node
|
||||
|
||||
- Each Pod gets an IP address on an ENI
|
||||
|
||||
(Elastic Network Interface)
|
||||
|
||||
- EC2 instances can only have a limited number of ENIs
|
||||
|
||||
(the exact limit depends on the instance type)
|
||||
|
||||
- ENIs can only have a limited number of IP addresses
|
||||
|
||||
(with variations here as well)
|
||||
|
||||
- This gives limits of e.g. 35 pods on `t3.large`, 29 on `c5.large` ...
|
||||
|
||||
(see [full list of limits per instance type](https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt) and [ENI/IP details](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/pkg/awsutils/vpc_ip_resource_limit.go))
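The numbers above can be reproduced with the commonly documented formula: ENIs × (IPs per ENI − 1) + 2. The sketch below assumes that formula, and the ENI and IP counts per instance type are assumptions taken from AWS documentation rather than from these slides.

```python
def max_pods(enis, ips_per_eni):
    """Pods per node with the AWS VPC CNI (assumed formula).

    One address per ENI is used by the ENI itself; 2 is added for
    Pods that use the host network and need no VPC address.
    """
    return enis * (ips_per_eni - 1) + 2

# Assumed ENI / IPv4-per-ENI limits for each instance type.
print(max_pods(3, 12))  # t3.large → 35
print(max_pods(3, 10))  # c5.large → 29
```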
|
||||
|
||||
---
|
||||
|
||||
## Limits?
|
||||
|
||||
- These limits might seem low
|
||||
|
||||
- They're not *that* low if you compute e.g. the RAM/Pod ratio
|
||||
|
||||
- Except if you're running lots of tiny pods
|
||||
|
||||
- Bottom line: do the math!
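As an example of "doing the math", this sketch computes how much RAM each Pod gets on average when a node is filled to its Pod limit. The instance memory sizes (8 GiB for `t3.large`, 4 GiB for `c5.large`) are assumptions from AWS instance specifications, and the calculation ignores memory reserved for the system and the kubelet.

```python
def mib_per_pod(ram_gib, max_pods):
    """Average RAM (in MiB) available per Pod on a completely full node."""
    return ram_gib * 1024 / max_pods

print(round(mib_per_pod(8, 35)))  # t3.large → 234
print(round(mib_per_pod(4, 29)))  # c5.large → 141
```

If most Pods request more memory than that, the Pod limit will not be the bottleneck.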
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Pre-loading
|
||||
|
||||
- It can take a little while to allocate/attach an ENI
|
||||
|
||||
- The AWS VPC CNI can keep a few extra addresses on each Node
|
||||
|
||||
(by default, one ENI worth of IP addresses)
|
||||
|
||||
- This is tunable if needed
|
||||
|
||||
(see [the docs](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/eni-and-ip-target.md) for details)
|
||||
|
||||
---
|
||||
|
||||
## Better load balancing
|
||||
|
||||
- The default path for inbound traffic is:
|
||||
|
||||
Load balancer → NodePort → Pod
|
||||
|
||||
- With the AWS VPC CNI, it becomes possible to do:
|
||||
|
||||
Load balancer → Pod
|
||||
|
||||
- More on that in the load balancing section!
|
||||
|
||||
---
|
||||
|
||||
## Configuration complexity
|
||||
|
||||
- The AWS VPC CNI is a very good solution when running EKS
|
||||
|
||||
- It brings optimized solutions to various use-cases:
|
||||
|
||||
- direct load balancing
|
||||
- user authentication
|
||||
- interconnection with other infrastructure
|
||||
- etc.
|
||||
|
||||
- Keep in mind that all these solutions are AWS-specific
|
||||
|
||||
- They can require a non-trivial amount of specific configuration
|
||||
|
||||
- Especially when moving from a simple POC to an IAC deployment!
|
||||
|
||||
---
|
||||
|
||||
# Load Balancers
|
||||
|
||||
- Here be dragons!
|
||||
|
||||
- Multiple options, each with different pros/cons
|
||||
|
||||
- It's necessary to know both AWS products and K8S concepts
|
||||
|
||||
---
|
||||
|
||||
## AWS load balancers
|
||||
|
||||
- CLB / Classic Load Balancer (formerly known as ELB)
|
||||
|
||||
- can work in L4 (TCP) or L7 (HTTP) mode
|
||||
- can do TLS termination
|
||||
- can't do websockets, HTTP/2, content-based routing ...
|
||||
|
||||
- NLB / Network Load Balancer
|
||||
|
||||
- high-performance L4 load balancer with TLS support
|
||||
|
||||
- ALB / Application Load Balancer
|
||||
|
||||
- HTTP load balancer
|
||||
- can do TLS termination
|
||||
- can do websockets, HTTP/2, content-based routing ...
|
||||
|
||||
---
|
||||
|
||||
## Load balancing modes
|
||||
|
||||
- "IP targets"
|
||||
|
||||
- send traffic directly from LB to Pods
|
||||
|
||||
- Pods must use the AWS VPC CNI
|
||||
|
||||
- compatible with Fargate Pods
|
||||
|
||||
- "Instance targets"
|
||||
|
||||
- send traffic to a NodePort (generally incurs an extra hop)
|
||||
|
||||
- Pods can use any CNI
|
||||
|
||||
- not compatible with Fargate Pods
|
||||
|
||||
- Each LB (Service) can use a different mode, if necessary
|
||||
|
||||
---
|
||||
|
||||
## Kubernetes load balancers
|
||||
|
||||
- Service (L4)
|
||||
|
||||
- ClusterIP: internal load balancing
|
||||
- NodePort: external load balancing on high ports (30000-32767 by default)
|
||||
- LoadBalancer: external load balancing on the port you want
|
||||
- ExternalIP: external load balancing directly on nodes
|
||||
|
||||
- Ingress (L7 HTTP)
|
||||
|
||||
- partial content-based routing (`Host` header, request path)
|
||||
- requires an Ingress Controller (in front)
|
||||
- works with Services (in back)
|
||||
|
||||
---
|
||||
|
||||
## Two controllers are available
|
||||
|
||||
- Kubernetes "in-tree" load balancer controller
|
||||
|
||||
- always available
|
||||
- used by default for LoadBalancer Services
|
||||
- creates CLB by default; can also do NLB
|
||||
- can only do "instance targets"
|
||||
- can use extra CLB features (TLS, HTTP)
|
||||
|
||||
- AWS Load Balancer Controller (fka AWS ALB Ingress Controller)
|
||||
|
||||
- optional add-on (requires additional config)
|
||||
- primarily meant to be an Ingress Controller
|
||||
- creates NLB and ALB
|
||||
- can do "instance targets" and "IP targets"
|
||||
- can also be used for LoadBalancer Services with type `nlb-ip`
|
||||
|
||||
- They can run side by side
|
||||
|
||||
---
|
||||
|
||||
## Which one should we use?
|
||||
|
||||
- AWS Load Balancer Controller supports "IP targets"
|
||||
|
||||
(which means direct routing of traffic to Pods)
|
||||
|
||||
- It can be used as an Ingress controller
|
||||
|
||||
- It *seems* to be the perfect solution for EKS!
|
||||
|
||||
- However ...
|
||||
|
||||
---
|
||||
|
||||
## Caveats
|
||||
|
||||
- AWS Load Balancer Controller requires extensive configuration
|
||||
|
||||
- a few hours to a few days to get it to work in a POC ...
|
||||
|
||||
- a few days to a few weeks to industrialize that process?
|
||||
|
||||
- It's AWS-specific
|
||||
|
||||
- It still introduces an extra hop, even if that hop is invisible
|
||||
|
||||
- Other ingress controllers can have interesting features
|
||||
|
||||
(canary deployment, A/B testing ...)
|
||||
|
||||
---
|
||||
|
||||
## Noteworthy annotations and docs
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip`
|
||||
|
||||
- LoadBalancer Service with "IP targets" ([docs](https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/service/nlb_ip_mode/))
|
||||
- requires AWS Load Balancer Controller
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-internal: "true"`
|
||||
|
||||
- internal load balancer (for private VPC)
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-type: nlb`
|
||||
|
||||
- opt for NLB instead of CLB with in-tree controller
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"`
|
||||
|
||||
- use HAProxy [PROXY protocol](https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt)
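To show where these annotations go, here is a sketch that builds a LoadBalancer Service manifest as a Python dictionary and prints it as JSON (which `kubectl apply -f -` accepts). The helper name `loadbalancer_service`, the Service name, and the `app` selector label are hypothetical.

```python
import json

PREFIX = "service.beta.kubernetes.io/"

def loadbalancer_service(name, port, lb_type=None, internal=False):
    """Build a LoadBalancer Service with optional AWS-specific annotations."""
    annotations = {}
    if lb_type:
        annotations[PREFIX + "aws-load-balancer-type"] = lb_type
    if internal:
        annotations[PREFIX + "aws-load-balancer-internal"] = "true"
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "annotations": annotations},
        "spec": {
            "type": "LoadBalancer",
            # Assumption: the target Pods carry the label app=<name>.
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }

print(json.dumps(loadbalancer_service("web", 80, lb_type="nlb-ip"), indent=2))
```

Note that annotation values are always strings, which is why `"true"` is quoted.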
|
||||
|
||||
---
|
||||
|
||||
## TLS-related annotations
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-ssl-cert`
|
||||
|
||||
- enable TLS and use that certificate
|
||||
- example value: `arn:aws:acm:<region>:<account>:certificate/<cert-id>`
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-ssl-ports`
|
||||
|
||||
- enable TLS *only* on the specified ports (when multiple ports are exposed)
|
||||
- example value: `"443,8443"`
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy`
|
||||
|
||||
- specify ciphers and other TLS parameters to use (see [that list](https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-security-policy-table.html))
|
||||
- example value: `"ELBSecurityPolicy-TLS-1-2-2017-01"`
|
||||
|
||||
---
|
||||
|
||||
## To HTTP(S) or not to HTTP(S)
|
||||
|
||||
- `service.beta.kubernetes.io/aws-load-balancer-backend-protocol`
|
||||
|
||||
- can be either `http`, `https`, `ssl`, or `tcp`
|
||||
|
||||
- if `https` or `ssl`: enable TLS to the backend
|
||||
|
||||
- if `http` or `https`: enable HTTP `x-forwarded-for` headers
|
||||
|
||||
???
|
||||
|
||||
## Cluster autoscaling
|
||||
|
||||
## Logging
|
||||
|
||||
https://docs.aws.amazon.com/eks/latest/userguide/logging-using-cloudtrail.html
|
||||
|
||||
:EN:- Working with EKS
|
||||
:EN:- Cluster and user provisioning
|
||||
:EN:- Networking and load balancing
|
||||
|
||||
:FR:- Travailler avec EKS
|
||||
:FR:- Outils de déploiement
|
||||
:FR:- Intégration avec IAM
|
||||
:FR:- Fonctionalités réseau
|
||||
@@ -30,7 +30,7 @@
|
||||
|
||||
- or we hit the *backoff limit* of the Job (default=6)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create a Job that has a 50% chance of success:
|
||||
```bash
|
||||
@@ -49,7 +49,7 @@
|
||||
|
||||
- If the Pod fails, the Job creates another Pod
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the status of the Pod(s) created by the Job:
|
||||
```bash
|
||||
@@ -108,7 +108,7 @@ class: extra-details
|
||||
|
||||
(The Cron Job will not wait if a previous job is still running)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create the Cron Job:
|
||||
```bash
|
||||
@@ -135,7 +135,7 @@ class: extra-details
|
||||
|
||||
(re-creating another one if it fails, for instance if its node fails)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the Jobs that are created:
|
||||
```bash
|
||||
|
||||
@@ -98,7 +98,7 @@
|
||||
|
||||
- Let's list our bootstrap tokens on a cluster created with kubeadm
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Log into node `test1`
|
||||
|
||||
@@ -145,7 +145,7 @@ class: extra-details
|
||||
|
||||
- The token we need to use has the form `abcdef.1234567890abcdef`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check that it is accepted by the API server:
|
||||
```bash
|
||||
@@ -177,7 +177,7 @@ class: extra-details
|
||||
|
||||
- That information is stored in a public ConfigMap
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Retrieve that ConfigMap:
|
||||
```bash
|
||||
|
||||
@@ -88,7 +88,7 @@ spec:
|
||||
|
||||
- Let's try this out!
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the port used by our self-hosted registry:
|
||||
```bash
|
||||
|
||||
@@ -40,7 +40,7 @@
|
||||
|
||||
- Let's build the image for the DockerCoins `worker` service with Kaniko
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Find the port number for our self-hosted registry:
|
||||
```bash
|
||||
@@ -160,7 +160,7 @@ spec:
|
||||
|
||||
- The YAML for the pod is in `k8s/kaniko-build.yaml`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create the pod:
|
||||
```bash
|
||||
|
||||
@@ -37,7 +37,7 @@ so that your build pipeline is automated.*
|
||||
|
||||
- We will deploy a registry container, and expose it with a NodePort
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create the registry service:
|
||||
```bash
|
||||
@@ -57,7 +57,7 @@ so that your build pipeline is automated.*
|
||||
|
||||
- We need to find out which port has been allocated
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- View the service details:
|
||||
```bash
|
||||
@@ -78,7 +78,7 @@ so that your build pipeline is automated.*
|
||||
|
||||
- A convenient Docker registry API route to remember is `/v2/_catalog`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
<!-- ```hide kubectl wait deploy/registry --for condition=available```-->
|
||||
|
||||
@@ -102,7 +102,7 @@ We should see:
|
||||
|
||||
- We can retag a small image, and push it to the registry
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Make sure we have the busybox image, and retag it:
|
||||
```bash
|
||||
@@ -123,7 +123,7 @@ We should see:
|
||||
|
||||
- Let's use the same endpoint as before
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Ensure that our busybox image is now in the local registry:
|
||||
```bash
|
||||
@@ -143,7 +143,7 @@ The curl command should now output:
|
||||
|
||||
- We are going to use a convenient feature of Docker Compose
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Go to the `stacks` directory:
|
||||
```bash
|
||||
@@ -217,7 +217,7 @@ class: extra-details
|
||||
|
||||
- All our images should now be in the registry
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Re-run the same `curl` command as earlier:
|
||||
```bash
|
||||
@@ -232,4 +232,4 @@ variable, so that we can quickly switch from
|
||||
the self-hosted registry to pre-built images
|
||||
hosted on the Docker Hub. So make sure that
|
||||
this $REGISTRY variable is set correctly when
|
||||
running these commands!*
|
||||
running the exercises!*
|
||||
@@ -56,7 +56,7 @@
|
||||
|
||||
- It can be installed with a YAML manifest, or with Helm
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Let's install the cert-manager Helm chart with this one-liner:
|
||||
```bash
|
||||
@@ -86,7 +86,7 @@
|
||||
|
||||
- The manifest shown on the previous slide is in @@LINK[k8s/cm-clusterissuer.yaml]
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create the ClusterIssuer:
|
||||
```bash
|
||||
@@ -115,7 +115,7 @@
|
||||
|
||||
- The manifest shown on the previous slide is in @@LINK[k8s/cm-certificate.yaml]
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Edit the Certificate to update the domain name
|
||||
|
||||
@@ -140,7 +140,7 @@
|
||||
|
||||
- then it waits for the challenge to complete
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- View the resources created by cert-manager:
|
||||
```bash
|
||||
@@ -158,7 +158,7 @@
|
||||
|
||||
`http://<our-domain>/.well-known/acme-challenge/<token>`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the *path* of the Ingress in particular:
|
||||
```bash
|
||||
@@ -176,7 +176,7 @@
|
||||
|
||||
An Ingress Controller! 😅
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Install an Ingress Controller:
|
||||
```bash
|
||||
|
||||
@@ -1,445 +0,0 @@
|
||||
# Cluster autoscaler
|
||||
|
||||
- When the cluster is full, we need to add more nodes
|
||||
|
||||
- This can be done manually:
|
||||
|
||||
- deploy new machines and add them to the cluster
|
||||
|
||||
- if using managed Kubernetes, use some API/CLI/UI
|
||||
|
||||
- Or automatically with the cluster autoscaler:
|
||||
|
||||
https://github.com/kubernetes/autoscaler
|
||||
|
||||
---
|
||||
|
||||
## Use-cases
|
||||
|
||||
- Batch job processing
|
||||
|
||||
"once in a while, we need to execute these 1000 jobs in parallel"
|
||||
|
||||
"...but the rest of the time there is almost nothing running on the cluster"
|
||||
|
||||
- Dynamic workload
|
||||
|
||||
"a few hours per day or a few days per week, we have a lot of traffic"
|
||||
|
||||
"...but the rest of the time, the load is much lower"
|
||||
|
||||
---
|
||||
|
||||
## Pay for what you use
|
||||
|
||||
- The point of the cloud is to "pay for what you use"
|
||||
|
||||
- If you have a fixed number of cloud instances running at all times:
|
||||
|
||||
*you're doing it wrong (except if your load is always the same)*
|
||||
|
||||
- If you're not using some kind of autoscaling, you're wasting money
|
||||
|
||||
(except if you like lining the pockets of your cloud provider)
|
||||
|
||||
---
|
||||
|
||||
## Running the cluster autoscaler
|
||||
|
||||
- We must run nodes on a supported infrastructure
|
||||
|
||||
- See [here] for a non-exhaustive list of supported providers
|
||||
|
||||
- Sometimes, the cluster autoscaler is installed automatically
|
||||
|
||||
(or by setting a flag / checking a box when creating the cluster)
|
||||
|
||||
- Sometimes, it requires additional work
|
||||
|
||||
(which is often non-trivial and highly provider-specific)
|
||||
|
||||
[here]: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider
|
||||
|
||||
---
|
||||
|
||||
## Scaling up in theory
|
||||
|
||||
IF a Pod is `Pending`,
|
||||
|
||||
AND adding a Node would allow this Pod to be scheduled,
|
||||
|
||||
THEN add a Node.
|
||||
|
||||
---
|
||||
|
||||
## Fine print 1
|
||||
|
||||
*IF a Pod is `Pending`...*
|
||||
|
||||
- First of all, the Pod must exist
|
||||
|
||||
- Pod creation might be blocked by e.g. a namespace quota
|
||||
|
||||
- In that case, the cluster autoscaler will never trigger
|
||||
|
||||
---
|
||||
|
||||
## Fine print 2
|
||||
|
||||
*IF a Pod is `Pending`...*
|
||||
|
||||
- If our Pods do not have resource requests:
|
||||
|
||||
*they will be in the `BestEffort` class*
|
||||
|
||||
- Generally, Pods in the `BestEffort` class are schedulable
|
||||
|
||||
- except if they have anti-affinity placement constraints
|
||||
|
||||
- except if all Nodes already run the max number of pods (110 by default)
|
||||
|
||||
- Therefore, if we want to leverage cluster autoscaling:
|
||||
|
||||
*our Pods should have resource requests*
|
||||
|
||||
---
|
||||
|
||||
## Fine print 3
|
||||
|
||||
*AND adding a Node would allow this Pod to be scheduled...*
|
||||
|
||||
- The autoscaler won't act if:
|
||||
|
||||
- the Pod is too big to fit on a single Node
|
||||
|
||||
- the Pod has impossible placement constraints
|
||||
|
||||
- Examples:
|
||||
|
||||
- "run one Pod per datacenter" with 4 pods and 3 datacenters
|
||||
|
||||
- "use this nodeSelector" but no such Node exists
|
||||
|
||||
---
|
||||
|
||||
## Trying it out
|
||||
|
||||
- We're going to check how much capacity is available on the cluster
|
||||
|
||||
- Then we will create a basic deployment
|
||||
|
||||
- We will add resource requests to that deployment
|
||||
|
||||
- Then scale the deployment to exceed the available capacity
|
||||
|
||||
- **The following commands require a working cluster autoscaler!**
|
||||
|
||||
---
|
||||
|
||||
## Checking available resources
|
||||
|
||||
.lab[
|
||||
|
||||
- Check how much CPU is allocatable on the cluster:
|
||||
```bash
|
||||
kubectl get nodes -o jsonpath={..allocatable.cpu}
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
- If we see e.g. `2800m 2800m 2800m`, that means:
|
||||
|
||||
3 nodes with 2.8 CPUs allocatable each
|
||||
|
||||
- To trigger autoscaling, we will create 7 pods requesting 1 CPU each
|
||||
|
||||
(each node can fit 2 such pods)
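The reasoning above can be checked with a few lines of Python. This is a simplified sketch: it only looks at CPU, and it ignores Pods that are already running on the nodes. The helper names are hypothetical.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity ('2800m' or '2') to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)

def schedulable(allocatable, request, replicas):
    """Return (Pods that fit, Pods left Pending) for identical Pods."""
    req = parse_cpu(request)
    fit = sum(parse_cpu(node) // req for node in allocatable)
    return min(fit, replicas), max(replicas - fit, 0)

print(schedulable(["2800m", "2800m", "2800m"], "1", 7))  # → (6, 1)
```

With 7 replicas, one Pod stays `Pending`, which is what should trigger the cluster autoscaler.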
|
||||
|
||||
---
|
||||
|
||||
## Creating our test Deployment
|
||||
|
||||
.lab[
|
||||
|
||||
- Create the Deployment:
|
||||
```bash
|
||||
kubectl create deployment blue --image=jpetazzo/color
|
||||
```
|
||||
|
||||
- Add a request for 1 CPU:
|
||||
```bash
|
||||
kubectl patch deployment blue --patch='
spec:
  template:
    spec:
      containers:
      - name: color
        resources:
          requests:
            cpu: 1
'
|
||||
```
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Scaling up in practice
|
||||
|
||||
- This assumes that we have strictly fewer than 7 CPUs available
|
||||
|
||||
(adjust the numbers if necessary!)
|
||||
|
||||
.lab[
|
||||
|
||||
- Scale up the Deployment:
|
||||
```bash
|
||||
kubectl scale deployment blue --replicas=7
|
||||
```
|
||||
|
||||
- Check that we have a new Pod, and that it's `Pending`:
|
||||
```bash
|
||||
kubectl get pods
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Cluster autoscaling
|
||||
|
||||
- After a few minutes, a new Node should appear
|
||||
|
||||
- When that Node becomes `Ready`, the Pod will be assigned to it
|
||||
|
||||
- The Pod will then be `Running`
|
||||
|
||||
- Reminder: the `AGE` of the Pod indicates when the Pod was *created*
|
||||
|
||||
(it doesn't indicate when the Pod was scheduled or started!)
|
||||
|
||||
- To see other state transitions, check the `status.conditions` of the Pod
|
||||
|
||||
---
|
||||
|
||||
## Scaling down in theory
|
||||
|
||||
IF a Node has less than 50% utilization for 10 minutes,
|
||||
|
||||
AND all its Pods can be scheduled on other Nodes,
|
||||
|
||||
AND all its Pods are *evictable*,
|
||||
|
||||
AND the Node doesn't have a "don't scale me down" annotation¹,
|
||||
|
||||
THEN drain the Node and shut it down.
|
||||
|
||||
.footnote[¹The annotation is: `cluster-autoscaler.kubernetes.io/scale-down-disabled=true`]
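The scale-down rule can be written as a single predicate. The sketch below is a simplified model, not the autoscaler's actual code: the `Node` fields are hypothetical, and the 50% and 10 minutes values are the ones quoted above.

```python
from dataclasses import dataclass, field

SCALE_DOWN_DISABLED = "cluster-autoscaler.kubernetes.io/scale-down-disabled"

@dataclass
class Node:
    utilization: float        # fraction of allocatable resources requested
    underused_minutes: int    # how long utilization has been below threshold
    pods_reschedulable: bool  # all Pods fit on other Nodes
    pods_evictable: bool      # all Pods are evictable
    annotations: dict = field(default_factory=dict)

def can_scale_down(node, threshold=0.5, delay_minutes=10):
    """True when every condition of the scale-down rule holds."""
    return (node.utilization < threshold
            and node.underused_minutes >= delay_minutes
            and node.pods_reschedulable
            and node.pods_evictable
            and node.annotations.get(SCALE_DOWN_DISABLED) != "true")

print(can_scale_down(Node(0.3, 12, True, True)))
print(can_scale_down(Node(0.3, 12, True, True, {SCALE_DOWN_DISABLED: "true"})))
```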
|
||||
|
||||
---
|
||||
|
||||
## When is a Pod "evictable"?
|
||||
|
||||
By default, Pods are evictable, except if any of the following is true.
|
||||
|
||||
- They have a restrictive Pod Disruption Budget
|
||||
|
||||
- They are "standalone" (not controlled by a ReplicaSet/Deployment, StatefulSet, Job...)
|
||||
|
||||
- They are in `kube-system` and don't have a Pod Disruption Budget
|
||||
|
||||
- They have local storage (that includes `emptyDir`!)
|
||||
|
||||
This can be overridden by setting the annotation:
|
||||
<br/>
|
||||
`cluster-autoscaler.kubernetes.io/safe-to-evict`
|
||||
<br/>(it can be set to `true` or `false`)
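These rules can be summarized in a small function. This is only a sketch: the Pod is a plain dictionary with hypothetical keys, and checking the Pod Disruption Budget before the annotation is an assumption on my part (a restrictive budget is still enforced when the Pod is evicted).

```python
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"

def is_evictable(pod):
    """Decide whether the cluster autoscaler may evict this Pod."""
    # Assumption: a restrictive PDB blocks eviction regardless of the annotation.
    if pod.get("pdb_blocks_eviction"):
        return False
    # The annotation overrides the remaining default rules.
    override = pod.get("annotations", {}).get(SAFE_TO_EVICT)
    if override in ("true", "false"):
        return override == "true"
    if not pod.get("has_controller"):
        return False  # "standalone" Pod
    if pod.get("namespace") == "kube-system" and not pod.get("has_pdb"):
        return False
    if pod.get("uses_local_storage"):
        return False  # includes emptyDir volumes
    return True

cache = {"namespace": "default", "has_controller": True, "uses_local_storage": True}
print(is_evictable(cache))
print(is_evictable({**cache, "annotations": {SAFE_TO_EVICT: "true"}}))
```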
|
||||
|
||||
---
|
||||
|
||||
## Pod Disruption Budget
|
||||
|
||||
- Special resource to configure how many Pods can be *disrupted*
|
||||
|
||||
(i.e. shutdown/terminated)
|
||||
|
||||
- Applies to Pods matching a given selector
|
||||
|
||||
(typically matching the selector of a Deployment)
|
||||
|
||||
- Only applies to *voluntary disruption*
|
||||
|
||||
(e.g. cluster autoscaler draining a node, planned maintenance...)
|
||||
|
||||
- Can express `minAvailable` or `maxUnavailable`
|
||||
|
||||
- See [documentation] for details and examples
|
||||
|
||||
[documentation]: https://kubernetes.io/docs/tasks/run-application/configure-pdb/
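To see how `minAvailable` and `maxUnavailable` translate into a number of Pods that may be disrupted, here is a simplified sketch. It handles integer values only (percentages are left out), and the function name `allowed_disruptions` is hypothetical.

```python
def allowed_disruptions(expected, healthy, min_available=None, max_unavailable=None):
    """Number of Pods that can be voluntarily disrupted right now.

    expected: number of Pods the controller wants (e.g. Deployment replicas)
    healthy:  number of Pods currently ready
    """
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available / max_unavailable")
    if min_available is not None:
        desired_healthy = min_available
    else:
        desired_healthy = expected - max_unavailable
    return max(healthy - desired_healthy, 0)

print(allowed_disruptions(5, 5, min_available=4))    # → 1
print(allowed_disruptions(5, 4, max_unavailable=1))  # → 0
```

When the result is 0, draining a node that hosts one of these Pods has to wait, which is one way scale-down can get stuck.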
|
||||
|
||||
---
|
||||
|
||||
## Local storage
|
||||
|
||||
- If our Pods use local storage, they will prevent scaling down
|
||||
|
||||
- If we have e.g. an `emptyDir` volume for caching/sharing:
|
||||
|
||||
make sure to set the `.../safe-to-evict` annotation to `true`!
|
||||
|
||||
- Even if the volume...
|
||||
|
||||
- ...only has a PID file or UNIX socket
|
||||
|
||||
- ...is empty
|
||||
|
||||
- ...is not mounted by any container in the Pod!
|
||||
|
||||
---
|
||||
|
||||
## Expensive batch jobs
|
||||
|
||||
- Careful if we have long-running batch jobs!
|
||||
|
||||
(e.g. jobs that take many hours/days to complete)
|
||||
|
||||
- These jobs could get evicted before they complete
|
||||
|
||||
(especially if they use less than 50% of the allocatable resources)
|
||||
|
||||
- Make sure to set the `.../safe-to-evict` annotation to `false`!
|
||||
|
||||
---
|
||||
|
||||
## Node groups
|
||||
|
||||
- Easy scenario: all nodes have the same size
|
||||
|
||||
- Realistic scenario: we have nodes of different sizes
|
||||
|
||||
- e.g. mix of CPU and GPU nodes
|
||||
|
||||
- e.g. small nodes for control plane, big nodes for batch jobs
|
||||
|
||||
- e.g. leveraging spot capacity
|
||||
|
||||
- The cluster autoscaler can handle it!
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Leveraging spot capacity
|
||||
|
||||
- AWS, Azure, and Google Cloud are typically more expensive than their competitors
|
||||
|
||||
- However, they offer *spot* capacity (spot instances, spot VMs...)
|
||||
|
||||
- *Spot* capacity:
|
||||
|
||||
- has a much lower cost (see e.g. AWS [spot instance advisor][awsspot])
|
||||
|
||||
- has a cost that varies continuously depending on regions, instance type...
|
||||
|
||||
- can be preempted at any time
|
||||
|
||||
- To be cost-effective, it is strongly recommended to leverage spot capacity
|
||||
|
||||
[awsspot]: https://aws.amazon.com/ec2/spot/instance-advisor/
|
||||
|
||||
---
|
||||
|
||||
## Node groups in practice
|
||||
|
||||
- The cluster autoscaler maps nodes to *node groups*
|
||||
|
||||
- this is an internal, provider-dependent mechanism
|
||||
|
||||
- the node group is sometimes visible through a proprietary label or annotation
|
||||
|
||||
- Each node group is scaled independently
|
||||
|
||||
- The cluster autoscaler uses [expanders] to decide which node group to scale up
|
||||
|
||||
(the default expander is "random", i.e. pick a node group at random!)
|
||||
|
||||
- Of course, only acceptable node groups will be considered
|
||||
|
||||
(i.e. node groups that could accommodate the `Pending` Pods)
|
||||
|
||||
[expanders]: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-expanders
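The behavior of the "random" expander can be sketched as follows: filter the node groups that could fit the `Pending` Pod, then pick one at random. The data model (one CPU size per node group) and the function name `pick_node_group` are hypothetical simplifications.

```python
import random

def pick_node_group(groups, pending_cpu_millis, rng=random):
    """Pick a node group at random among those able to fit the Pending Pod.

    groups maps a node group name to the CPU (millicores) of one of its nodes.
    """
    acceptable = [name for name, cpu in groups.items() if cpu >= pending_cpu_millis]
    if not acceptable:
        return None  # no node group can accommodate the Pod: no scale up
    return rng.choice(sorted(acceptable))

groups = {"small": 2000, "big": 16000, "gpu": 32000}
print(pick_node_group(groups, 8000))   # either "big" or "gpu"
print(pick_node_group(groups, 64000))  # nothing fits
```

With expensive node groups in the mix, a random pick may not be what we want, which is a reason to look at the other expanders.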
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Scaling to zero
|
||||
|
||||
- *In general,* a node group needs to have at least one node at all times
|
||||
|
||||
(the cluster autoscaler uses that node to figure out the size, labels, taints... of the group)
|
||||
|
||||
- *On some providers,* there are special ways to specify labels and/or taints
|
||||
|
||||
(but if you want to scale to zero, check that the provider supports it!)
|
||||
|
||||
---
|
||||
|
||||
## Warning
|
||||
|
||||
- Autoscaling up is easy
|
||||
|
||||
- Autoscaling down is harder
|
||||
|
||||
- It might get stuck because Pods are not evictable
|
||||
|
||||
- Do at least a dry run to make sure that the cluster scales down correctly!
|
||||
|
||||
- Have alerts on cloud spend
|
||||
|
||||
- *Especially when using big/expensive nodes (e.g. with GPU!)*
|
||||
|
||||
---
|
||||
|
||||
## Preferred vs. Required
|
||||
|
||||
- Some Kubernetes mechanisms allow us to express "soft preferences":
|
||||
|
||||
- affinity (`requiredDuringSchedulingIgnoredDuringExecution` vs `preferredDuringSchedulingIgnoredDuringExecution`)
|
||||
|
||||
- taints (`NoSchedule`/`NoExecute` vs `PreferNoSchedule`)
|
||||
|
||||
- Remember that these "soft preferences" can be ignored
|
||||
|
||||
(and given enough time and churn on the cluster, they will!)
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- The cluster autoscaler publishes its status in a ConfigMap
|
||||
|
||||
.lab[
|
||||
|
||||
- Check the cluster autoscaler status:
|
||||
```bash
|
||||
kubectl describe configmap --namespace kube-system cluster-autoscaler-status
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
- We can also check the logs of the autoscaler
|
||||
|
||||
(except on managed clusters where it's running internally, not visible to us)
|
||||
|
||||
---
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
Special thanks to [@s0ulshake] for their help with this section!
|
||||
|
||||
If you need help to run your data science workloads on Kubernetes,
|
||||
<br/>they're available for consulting.
|
||||
|
||||
(Get in touch with them through https://www.linkedin.com/in/ajbowen/)
|
||||
|
||||
[@s0ulshake]: https://twitter.com/s0ulshake
|
||||
@@ -18,9 +18,9 @@
|
||||
|
||||
- It's easy to check the version for the API server
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Log into node `oldversion1`
|
||||
- Log into node `test1`
|
||||
|
||||
- Check the version of kubectl and of the API server:
|
||||
```bash
|
||||
@@ -39,7 +39,7 @@
|
||||
|
||||
- It's also easy to check the version of kubelet
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check node versions (includes kubelet, kernel, container engine):
|
||||
```bash
|
||||
@@ -60,7 +60,7 @@
|
||||
|
||||
- If the control plane is self-hosted (running in pods), we can check it
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Show image versions for all pods in `kube-system` namespace:
|
||||
```bash
|
||||
@@ -81,7 +81,7 @@
|
||||
|
||||
## What version are we running anyway?
|
||||
|
||||
- When I say, "I'm running Kubernetes 1.18", is that the version of:
|
||||
- When I say, "I'm running Kubernetes 1.15", is that the version of:
|
||||
|
||||
- kubectl
|
||||
|
||||
@@ -157,15 +157,15 @@
|
||||
|
||||
## Kubernetes uses semantic versioning
|
||||
|
||||
- Kubernetes versions look like MAJOR.MINOR.PATCH; e.g. in 1.18.20:
|
||||
- Kubernetes versions look like MAJOR.MINOR.PATCH; e.g. in 1.17.2:
|
||||
|
||||
- MAJOR = 1
|
||||
- MINOR = 18
|
||||
- PATCH = 20
|
||||
- MINOR = 17
|
||||
- PATCH = 2
|
||||
|
||||
- It's always possible to mix and match different PATCH releases
|
||||
|
||||
(e.g. 1.18.20 and 1.18.15 are compatible)
|
||||
(e.g. 1.16.1 and 1.16.6 are compatible)
|
||||
|
||||
- It is recommended to run the latest PATCH release
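
As an illustration of the MAJOR.MINOR.PATCH layout described above, here is a minimal sketch that splits a version string into its three parts. The helper name `split_version` is hypothetical (it is not part of kubectl or kubeadm), and the sketch assumes a plain `X.Y.Z` string, optionally prefixed with `v`.

```bash
# Split a Kubernetes version string into its MAJOR, MINOR and PATCH parts.
# Hypothetical helper; accepts an optional leading "v" (e.g. "v1.18.20").
split_version() {
  local version="${1#v}"          # strip the leading "v" if there is one
  local major minor patch
  IFS=. read -r major minor patch <<< "$version"
  echo "MAJOR=$major MINOR=$minor PATCH=$patch"
}

split_version 1.18.20
```

Two versions that report the same MAJOR and MINOR values differ only by their PATCH release, which is the case that can always be mixed.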
|
||||
|
||||
@@ -181,9 +181,9 @@
|
||||
|
||||
- All components support a difference of one¹ MINOR version
|
||||
|
||||
- This allows live upgrades (since we can mix e.g. 1.18 and 1.19)
|
||||
- This allows live upgrades (since we can mix e.g. 1.15 and 1.16)
|
||||
|
||||
- It also means that going from 1.18 to 1.20 requires going through 1.19
|
||||
- It also means that going from 1.14 to 1.16 requires going through 1.15
|
||||
|
||||
.footnote[¹Except kubelet, which can be up to two MINOR behind API server,
|
||||
and kubectl, which can be one MINOR ahead or behind API server.]
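
The "one MINOR version" rule can be sketched as a small shell check. The function names `minor_of` and `minor_skew_ok` are hypothetical, both versions are assumed to share the same MAJOR number, and the sketch deliberately ignores the kubelet and kubectl exceptions mentioned in the footnote.

```bash
# Extract the MINOR number from a version such as "1.18.20" or "v1.19.0".
minor_of() {
  local version="${1#v}"
  version="${version#*.}"   # drop "MAJOR."
  echo "${version%%.*}"     # drop ".PATCH" (and anything after it)
}

# Succeed if the two versions are at most one MINOR version apart.
# Assumption: both versions have the same MAJOR number.
minor_skew_ok() {
  local a b diff
  a=$(minor_of "$1")
  b=$(minor_of "$2")
  diff=$(( a > b ? a - b : b - a ))
  [ "$diff" -le 1 ]
}

minor_skew_ok 1.18.20 1.19.8 && echo "1.18 and 1.19 can be mixed"
minor_skew_ok 1.18.20 1.20.0 || echo "1.18 and 1.20 cannot be mixed"
```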
|
||||
@@ -214,7 +214,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
|
||||
|
||||
- We will change the version of the API server
|
||||
|
||||
- We will work with cluster `oldversion` (nodes `oldversion1`, `oldversion2`, `oldversion3`)
|
||||
- We will work with cluster `test` (nodes `test1`, `test2`, `test3`)
|
||||
|
||||
---
|
||||
|
||||
@@ -240,9 +240,9 @@ and kubectl, which can be one MINOR ahead or behind API server.]
|
||||
|
||||
- We will edit the YAML file to use a different image version
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Log into node `oldversion1`
|
||||
- Log into node `test1`
|
||||
|
||||
- Check API server version:
|
||||
```bash
|
||||
@@ -254,7 +254,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
|
||||
sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
|
||||
```
|
||||
|
||||
- Look for the `image:` line, and update it to e.g. `v1.19.0`
|
||||
- Look for the `image:` line, and update it to e.g. `v1.16.0`
|
||||
|
||||
]
|
||||
|
||||
@@ -264,7 +264,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
|
||||
|
||||
- The API server will be briefly unavailable while kubelet restarts it
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the API server version:
|
||||
```bash
|
||||
@@ -299,7 +299,7 @@ and kubectl, which can be one MINOR ahead or behind API server.]
|
||||
|
||||
(note: this is possible only because the cluster was installed with kubeadm)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check what will be upgraded:
|
||||
```bash
|
||||
@@ -308,11 +308,11 @@ and kubectl, which can be one MINOR ahead or behind API server.]
|
||||
|
||||
]
|
||||
|
||||
Note 1: kubeadm thinks that our cluster is running 1.19.0.
|
||||
Note 1: kubeadm thinks that our cluster is running 1.16.0.
|
||||
<br/>It is confused by our manual upgrade of the API server!
|
||||
|
||||
Note 2: kubeadm itself is still version 1.18.20.
|
||||
<br/>It doesn't know how to upgrade to 1.19.X.
|
||||
Note 2: kubeadm itself is still version 1.15.9.
|
||||
<br/>It doesn't know how to upgrade to 1.16.X.
|
||||
|
||||
---
|
||||
|
||||
@@ -320,7 +320,7 @@ Note 2: kubeadm itself is still version 1.18.20..
|
||||
|
||||
- First things first: we need to upgrade kubeadm
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Upgrade kubeadm:
|
||||
```
|
||||
@@ -335,28 +335,28 @@ Note 2: kubeadm itself is still version 1.18.20..
|
||||
]
|
||||
|
||||
Problem: kubeadm doesn't know how to handle
|
||||
upgrades from version 1.18.
|
||||
upgrades from version 1.15.
|
||||
|
||||
This is because we installed version 1.22 (or even later).
|
||||
This is because we installed version 1.17 (or even later).
|
||||
|
||||
We need to install kubeadm version 1.19.X.
|
||||
We need to install kubeadm version 1.16.X.
|
||||
|
||||
---
|
||||
|
||||
## Downgrading kubeadm
|
||||
|
||||
- We need to go back to version 1.19.X.
|
||||
- We need to go back to version 1.16.X (e.g. 1.16.6)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- View available versions for package `kubeadm`:
|
||||
```bash
|
||||
apt show kubeadm -a | grep ^Version | grep 1.19
|
||||
apt show kubeadm -a | grep ^Version | grep 1.16
|
||||
```
|
||||
|
||||
- Downgrade kubeadm:
|
||||
```
|
||||
sudo apt install kubeadm=1.19.8-00
|
||||
sudo apt install kubeadm=1.16.6-00
|
||||
```
|
||||
|
||||
- Check what kubeadm tells us:
|
||||
@@ -366,7 +366,7 @@ We need to install kubeadm version 1.19.X.
|
||||
|
||||
]
|
||||
|
||||
kubeadm should now agree to upgrade to 1.19.8.
|
||||
kubeadm should now agree to upgrade to 1.16.6.
|
||||
|
||||
---
|
||||
|
||||
@@ -378,11 +378,11 @@ kubeadm should now agree to upgrade to 1.19.8.
|
||||
|
||||
- Or we can try the upgrade anyway
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Perform the upgrade:
|
||||
```bash
|
||||
sudo kubeadm upgrade apply v1.19.8
|
||||
sudo kubeadm upgrade apply v1.16.6
|
||||
```
|
||||
|
||||
]
|
||||
@@ -395,9 +395,9 @@ kubeadm should now agree to upgrade to 1.19.8.
|
||||
|
||||
- We can therefore use `apt` or `apt-get`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Log into node `oldversion3`
|
||||
- Log into node `test3`
|
||||
|
||||
- View available versions for package `kubelet`:
|
||||
```bash
|
||||
@@ -406,7 +406,7 @@ kubeadm should now agree to upgrade to 1.19.8.
|
||||
|
||||
- Upgrade kubelet:
|
||||
```bash
|
||||
sudo apt install kubelet=1.19.8-00
|
||||
sudo apt install kubelet=1.16.6-00
|
||||
```
|
||||
|
||||
]
|
||||
@@ -415,9 +415,9 @@ kubeadm should now agree to upgrade to 1.19.8.
|
||||
|
||||
## Checking what we've done
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Log into node `oldversion1`
|
||||
- Log into node `test1`
|
||||
|
||||
- Check node versions:
|
||||
```bash
|
||||
@@ -458,15 +458,15 @@ kubeadm should now agree to upgrade to 1.19.8.
|
||||
|
||||
(after upgrading the control plane)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Download the configuration on each node, and upgrade kubelet:
|
||||
```bash
|
||||
for N in 1 2 3; do
|
||||
ssh oldversion$N "
|
||||
sudo apt install kubeadm=1.19.8-00 &&
|
||||
ssh test$N "
|
||||
sudo apt install kubeadm=1.16.6-00 &&
|
||||
sudo kubeadm upgrade node &&
|
||||
sudo apt install kubelet=1.19.8-00"
|
||||
sudo apt install kubelet=1.16.6-00"
|
||||
done
|
||||
```
|
||||
]
|
||||
@@ -475,9 +475,9 @@ kubeadm should now agree to upgrade to 1.19.8.
|
||||
|
||||
## Checking what we've done
|
||||
|
||||
- All our nodes should now be updated to version 1.19.8
|
||||
- All our nodes should now be updated to version 1.16.6
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check node versions:
|
||||
```bash
|
||||
@@ -492,13 +492,13 @@ class: extra-details
|
||||
|
||||
## Skipping versions
|
||||
|
||||
- This example worked because we went from 1.18 to 1.19
|
||||
- This example worked because we went from 1.15 to 1.16
|
||||
|
||||
- If you are upgrading from e.g. 1.16, you will have to go through 1.17 first
|
||||
- If you are upgrading from e.g. 1.14, you will have to go through 1.15 first
|
||||
|
||||
- This means upgrading kubeadm to 1.17.X, then using it to upgrade the cluster
|
||||
- This means upgrading kubeadm to 1.15.X, then using it to upgrade the cluster
|
||||
|
||||
- Then upgrading kubeadm to 1.18.X, etc.
|
||||
- Then upgrading kubeadm to 1.16.X, etc.
|
||||
|
||||
- **Make sure to read the release notes before upgrading!**
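
To make the "one MINOR version at a time" constraint concrete, the following sketch lists the intermediate releases that an upgrade has to go through. The function `upgrade_path` is a hypothetical helper, not a kubeadm feature; it takes `MAJOR.MINOR` strings only and assumes that both share the same MAJOR number.

```bash
# Print each MINOR release that must be reached, in order, to go from one
# version to another (e.g. 1.16 -> 1.19 means 1.17, then 1.18, then 1.19).
# Hypothetical helper; arguments must be MAJOR.MINOR (no PATCH component).
upgrade_path() {
  local major="${1%%.*}"
  local from="${1#*.}" to="${2#*.}"
  local minor
  for (( minor = from + 1; minor <= to; minor++ )); do
    echo "$major.$minor"
  done
}

upgrade_path 1.16 1.19
```

Each line of output corresponds to one round of "upgrade kubeadm to that MINOR release, then use it to upgrade the cluster".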
|
||||
|
||||
|
||||
@@ -204,7 +204,7 @@ class: extra-details
|
||||
|
||||
## Logging into the new cluster
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Log into node `kuberouter1`
|
||||
|
||||
@@ -228,7 +228,7 @@ class: extra-details
|
||||
|
||||
- By default, kubelet gets the CNI configuration from `/etc/cni/net.d`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the content of `/etc/cni/net.d`
|
||||
|
||||
@@ -262,7 +262,7 @@ class: extra-details
|
||||
|
||||
(where `C` is our cluster number)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Edit the Compose file to set the Cluster CIDR:
|
||||
```bash
|
||||
@@ -298,7 +298,7 @@ class: extra-details
|
||||
|
||||
(where `A.B.C.D` is the public address of `kuberouter1`, running the control plane)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Edit the YAML file to set the API server address:
|
||||
```bash
|
||||
@@ -320,7 +320,7 @@ Note: the DaemonSet won't create any pods (yet) since there are no nodes (yet).
|
||||
|
||||
- This is similar to what we did for the `kubenet` cluster
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Generate the kubeconfig file (replacing `X.X.X.X` with the address of `kuberouter1`):
|
||||
```bash
|
||||
@@ -338,7 +338,7 @@ Note: the DaemonSet won't create any pods (yet) since there are no nodes (yet).
|
||||
|
||||
- We need to copy that kubeconfig file to the other nodes
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Copy `kubeconfig` to the other nodes:
|
||||
```bash
|
||||
@@ -359,7 +359,7 @@ Note: the DaemonSet won't create any pods (yet) since there are no nodes (yet).
|
||||
|
||||
- We need to pass `--network-plugin=cni`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Join the first node:
|
||||
```bash
|
||||
@@ -384,7 +384,7 @@ class: extra-details
|
||||
|
||||
(in `/etc/cni/net.d`)
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the content of `/etc/cni/net.d`
|
||||
|
||||
@@ -400,7 +400,7 @@ class: extra-details
|
||||
|
||||
- Let's create a Deployment and expose it with a Service
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create a Deployment running a web server:
|
||||
```bash
|
||||
@@ -423,7 +423,7 @@ class: extra-details
|
||||
|
||||
## Checking that everything works
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Get the ClusterIP address for the service:
|
||||
```bash
|
||||
@@ -449,7 +449,7 @@ class: extra-details
|
||||
|
||||
- What if we need to check that everything is working properly?
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Check the IP addresses of our pods:
|
||||
```bash
|
||||
@@ -490,7 +490,7 @@ class: extra-details
|
||||
|
||||
## Trying `kubectl logs` / `kubectl exec`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Try to show the logs of a kube-router pod:
|
||||
```bash
|
||||
|
||||
@@ -344,94 +344,32 @@ We'll cover them just after!*
|
||||
|
||||
---
|
||||
|
||||
## Example: HAProxy configuration
|
||||
## Passing a configuration file with a configmap
|
||||
|
||||
- We are going to deploy HAProxy, a popular load balancer
|
||||
- We will start a load balancer powered by HAProxy
|
||||
|
||||
- It expects to find its configuration in a specific place:
|
||||
- We will use the [official `haproxy` image](https://hub.docker.com/_/haproxy/)
|
||||
|
||||
`/usr/local/etc/haproxy/haproxy.cfg`
|
||||
- It expects to find its configuration in `/usr/local/etc/haproxy/haproxy.cfg`
|
||||
|
||||
- We will create a ConfigMap holding the configuration file
|
||||
- We will provide a simple HAProxy configuration, `k8s/haproxy.cfg`
|
||||
|
||||
- Then we will mount that ConfigMap in a Pod running HAProxy
|
||||
- It listens on port 80, and load balances connections between IBM and Google
|
||||
|
||||
---
|
||||
|
||||
## Blue/green load balancing
|
||||
## Creating the configmap
|
||||
|
||||
- In this example, we will deploy two versions of our app:
|
||||
.exercise[
|
||||
|
||||
- the "blue" version in the `blue` namespace
|
||||
|
||||
- the "green" version in the `green` namespace
|
||||
|
||||
- In both namespaces, we will have a Deployment and a Service
|
||||
|
||||
(both named `color`)
|
||||
|
||||
- We want to load balance traffic between both namespaces
|
||||
|
||||
(we can't do that with a simple service selector: these don't cross namespaces)
|
||||
|
||||
---
|
||||
|
||||
## Deploying the app
|
||||
|
||||
- We're going to use the image `jpetazzo/color`
|
||||
|
||||
(it is a simple "HTTP echo" server showing which pod served the request)
|
||||
|
||||
- We can create each Namespace, Deployment, and Service by hand, or...
|
||||
|
||||
.lab[
|
||||
|
||||
- We can deploy the app with a YAML manifest:
|
||||
- Go to the `k8s` directory in the repository:
|
||||
```bash
|
||||
kubectl apply -f ~/container.training/k8s/rainbow.yaml
|
||||
cd ~/container.training/k8s
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Testing the app
|
||||
|
||||
- Reminder: Service `x` in Namespace `y` is available through:
|
||||
|
||||
`x.y`, `x.y.svc`, `x.y.svc.cluster.local`
|
||||
|
||||
- Since the `cluster.local` suffix can change, we'll use `x.y.svc`
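
The naming scheme in the reminder above can be spelled out with a short sketch. The function `service_dns_names` is hypothetical, and the default suffix `cluster.local` is only an assumption, since (as noted) it can be changed on a given cluster.

```bash
# Print the DNS names under which Service "x" in Namespace "y" can be reached.
# Hypothetical helper; the third argument (cluster domain) defaults to
# "cluster.local", which is an assumption that may not hold on every cluster.
service_dns_names() {
  local service="$1" namespace="$2" domain="${3:-cluster.local}"
  echo "$service.$namespace"
  echo "$service.$namespace.svc"
  echo "$service.$namespace.svc.$domain"
}

service_dns_names color blue
```

The second form (`color.blue.svc`) is the one that does not depend on the cluster domain while still being unambiguous.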
|
||||
|
||||
.lab[
|
||||
|
||||
- Check that the app is up and running:
|
||||
- Create a configmap named `haproxy` that holds the configuration file:
|
||||
```bash
|
||||
kubectl run --rm -it --restart=Never --image=nixery.dev/curl my-test-pod \
|
||||
curl color.blue.svc
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Creating the HAProxy configuration
|
||||
|
||||
Here is the file that we will use, @@LINK[k8s/haproxy.cfg]:
|
||||
|
||||
```
|
||||
@@INCLUDE[k8s/haproxy.cfg]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Creating the ConfigMap
|
||||
|
||||
.lab[
|
||||
|
||||
- Create a ConfigMap named `haproxy` that holds the configuration file:
|
||||
```bash
|
||||
kubectl create configmap haproxy --from-file=~/container.training/k8s/haproxy.cfg
|
||||
kubectl create configmap haproxy --from-file=haproxy.cfg
|
||||
```
|
||||
|
||||
- Check what our configmap looks like:
|
||||
@@ -443,21 +381,37 @@ Here is the file that we will use, @@LINK[k8s/haproxy.cfg]:
|
||||
|
||||
---
|
||||
|
||||
## Using the ConfigMap
|
||||
## Using the configmap
|
||||
|
||||
Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
|
||||
We are going to use the following pod definition:
|
||||
|
||||
```yaml
|
||||
@@INCLUDE[k8s/haproxy.yaml]
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: haproxy
|
||||
spec:
|
||||
volumes:
|
||||
- name: config
|
||||
configMap:
|
||||
name: haproxy
|
||||
containers:
|
||||
- name: haproxy
|
||||
image: haproxy
|
||||
volumeMounts:
|
||||
- name: config
|
||||
mountPath: /usr/local/etc/haproxy/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Creating the Pod
|
||||
## Using the configmap
|
||||
|
||||
.lab[
|
||||
- The resource definition from the previous slide is in `k8s/haproxy.yaml`
|
||||
|
||||
- Create the HAProxy Pod:
|
||||
.exercise[
|
||||
|
||||
- Create the HAProxy pod:
|
||||
```bash
|
||||
kubectl apply -f ~/container.training/k8s/haproxy.yaml
|
||||
```
|
||||
@@ -476,21 +430,27 @@ Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
|
||||
|
||||
## Testing our load balancer
|
||||
|
||||
- If everything went well, we should see a perfect round robin
|
||||
- The load balancer will send:
|
||||
|
||||
(one request to `blue`, one request to `green`, one request to `blue`, etc.)
|
||||
- half of the connections to Google
|
||||
|
||||
.lab[
|
||||
- the other half to IBM
|
||||
|
||||
- Send a few requests:
|
||||
.exercise[
|
||||
|
||||
- Access the load balancer a few times:
|
||||
```bash
|
||||
for i in $(seq 10); do
|
||||
curl $IP
|
||||
done
|
||||
curl $IP
|
||||
curl $IP
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
We should see connections served by Google, and others served by IBM.
|
||||
<br/>
|
||||
(Each server sends us a redirect page. Look at the URL that they send us to!)
|
||||
|
||||
---
|
||||
|
||||
## Exposing configmaps with the downward API
|
||||
@@ -509,7 +469,7 @@ Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
|
||||
|
||||
## Creating the configmap
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Our configmap will have a single key, `http.addr`:
|
||||
```bash
|
||||
@@ -530,16 +490,29 @@ Here is @@LINK[k8s/haproxy.yaml], a Pod manifest using that ConfigMap:
|
||||
We are going to use the following pod definition:
|
||||
|
||||
```yaml
|
||||
@@INCLUDE[k8s/registry.yaml]
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: registry
|
||||
spec:
|
||||
containers:
|
||||
- name: registry
|
||||
image: registry
|
||||
env:
|
||||
- name: REGISTRY_HTTP_ADDR
|
||||
valueFrom:
|
||||
configMapKeyRef:
|
||||
name: registry
|
||||
key: http.addr
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Using the configmap
|
||||
|
||||
- The resource definition from the previous slide is in @@LINK[k8s/registry.yaml]
|
||||
- The resource definition from the previous slide is in `k8s/registry.yaml`
|
||||
|
||||
.lab[
|
||||
.exercise[
|
||||
|
||||
- Create the registry pod:
|
||||
```bash
|
||||
|
||||
Some files were not shown because too many files have changed in this diff.