Compare commits


89 Commits

Author SHA1 Message Date
Jérôme Petazzoni
77606044f6 😈 Demonware advanced Kubernetes custom content 2023-12-07 15:31:04 -06:00
Jérôme Petazzoni
dbfda8b458 🐞 Typo fix 2023-12-06 15:31:09 -06:00
Jérôme Petazzoni
c8fc67c995 📃 Update V's name and social media link 2023-12-04 16:41:03 -06:00
Jérôme Petazzoni
28222db2e4 Add 1-second pre-pssh delay
Seems to help with AT&T fiber router.
(Actually it takes a longer delay to make a difference,
like 10 seconds, but this patch makes the delay configurable.)
2023-12-04 16:38:33 -06:00
Jérôme Petazzoni
a38f930858 📦 Use new k8s package repositories 2023-12-03 21:33:25 -06:00
Jérôme Petazzoni
2cef200726 Add DMUC+RBAC exercises 2023-12-03 15:38:43 -06:00
Jérôme Petazzoni
1f77a52137 📃 Flesh out upgrade information
Add the official policy (which is to drain nodes before upgrading),
and give some explanations about when it may/may not be fine to
upgrade without draining nodes.
2023-11-30 16:45:11 -06:00
Jérôme Petazzoni
b188e0f8a9 🔧 Mention priorityClasses around resource pressure 2023-11-30 16:10:12 -06:00
Jérôme Petazzoni
ac203a128d Add content about disruptions and PDB 2023-11-30 15:36:32 -06:00
Jérôme Petazzoni
a9920e5cf0 🌐 Add IPv6 support in netlify DNS scriptlet 2023-11-30 15:32:03 -06:00
Jérôme Petazzoni
d1047f950d 📃 Update resource limits to add ephemeral-storage 2023-11-29 14:23:24 -06:00
Jérôme Petazzoni
e380509ffe 💈 Tweak CSS for consistent spacing after titles 2023-11-29 14:22:54 -06:00
Jérôme Petazzoni
b5c754211e Mention Validating Admission Policies and CEL 2023-11-24 12:29:44 -06:00
Jérôme Petazzoni
cc57d983b2 🔧 Add Linode portal size for reference 2023-10-30 13:12:20 +01:00
Jérôme Petazzoni
fd86e6079d ✂️ Remove Service Catalog
This doesn't seem to be supported anymore, and looking at
https://github.com/kubernetes-retired/service-catalog/tree/master
it even looks like the whole thing might be deprecated?
2023-10-26 18:20:09 +02:00
Jérôme Petazzoni
08f2e76082 🐞 Fix a couple of typos 2023-10-26 17:53:53 +02:00
Jérôme Petazzoni
db848767c1 Update kubebuilder instructions for new controller semantics 2023-10-26 17:49:26 +02:00
Jérôme Petazzoni
c07f52c493 🔧 Add function to delete CloudFlare DNS records 2023-10-22 09:20:39 +02:00
Jérôme Petazzoni
016c8fc863 🔧 Add GP2 instance size to portal env (for reference) 2023-10-17 10:17:29 +02:00
Jérôme Petazzoni
b9bbccb346 Bump up Network Policy documentation link versions 2023-10-10 15:09:20 +02:00
Jérôme Petazzoni
311a2aaf32 🔧 Add scaleway invocation to konk script 2023-10-10 07:37:56 +02:00
Jérôme Petazzoni
a19585a587 🧹 Add clean up snippet for Scaleway PVC 2023-09-22 09:21:29 +02:00
Jérôme Petazzoni
354bd9542e Add scriptlet to list exoscale zones 2023-09-14 14:50:36 +02:00
Jérôme Petazzoni
0c73e91e6f 🔧 Tweak slides order + typo fix 2023-09-14 13:59:20 +02:00
Jérôme Petazzoni
23064b5d26 🔧 Show file name in vim 2023-09-13 16:11:03 +02:00
Jérôme Petazzoni
971314a84f 🔧 Minor fixes in DMUC refactor 2023-09-13 16:09:26 +02:00
Jérôme Petazzoni
c0689cc5df New content for M5
Instead of showing kubenet and kuberouter with
Kubernetes 1.19, we now start with Kubernetes
1.28 (or whatever is the latest version) along
with containerd and CNI.
2023-08-27 21:16:34 +02:00
Jérôme Petazzoni
033873064a 🏭️ Refactor deployment scripts for monokube/polykube
Break out kubernetes package installation and kubeadm invocation
to two different steps, so that we can install kubernetes packages
without setting up the cluster (for the new DMUC labs).
2023-08-25 17:49:30 +02:00
Jérôme Petazzoni
1ed3af6eff 🖼️ Change openstack image selection mechanism
Instead of passing an image name through a terraform variable,
use tags to select the latest image matching the specified
tags (in this case, os=Ubuntu version=22.04).
2023-08-24 01:11:31 +02:00
Jérôme Petazzoni
33ddfce3fa 🐞 Tweak index.yaml
There's something wrong with the self-paced slides (see #632) but I'm not sure
what the problem is exactly 😅
2023-08-17 21:22:43 +02:00
Jérôme Petazzoni
943783c8fb 🐞 Fix typo in swarm metrics setup
Closes #631.

Thanks @Zakariasemlali for noticing this :)
2023-08-04 02:11:39 +02:00
Or Navon
46b3aa23bf Fix minor grammar mistake 2023-07-31 11:27:28 +02:00
Jérôme Petazzoni
4498dc41a4 🔧 Make TF_VAR_cluster_name mandatory in testing script 2023-07-28 14:51:20 +02:00
Jérôme Petazzoni
58de0d31f8 🔧 Fix AWS and OCI configurations 2023-06-19 22:38:44 +02:00
Jérôme Petazzoni
d32d986a9e Add support for Azure AKS and OVH MKS 2023-06-18 19:55:31 +02:00
Jérôme Petazzoni
fcb922628c 📃 Add documentation for cloud credentials 2023-06-17 19:22:58 +02:00
Jérôme Petazzoni
77ceba7f5b 🔧 Fix broken links in intro to docker slides
Closes #622

I recovered some of the case studies from the internet
archive, and removed the other links.
2023-06-15 23:07:25 +02:00
Jérôme Petazzoni
ccb73fc872 Add CloudFlare script (WIP) 2023-05-29 12:24:54 +02:00
Jérôme Petazzoni
bb302a25de ✂️ Split prereqs/handson instructions 2023-05-29 09:05:57 +02:00
Julien Girardin
e66b90eb4e Replace ship lab by kustomize lab 2023-05-26 17:33:38 +02:00
dependabot[bot]
74add4d435 Bump socket.io-parser from 4.2.2 to 4.2.3 in /slides/autopilot
Bumps [socket.io-parser](https://github.com/socketio/socket.io-parser) from 4.2.2 to 4.2.3.
- [Release notes](https://github.com/socketio/socket.io-parser/releases)
- [Changelog](https://github.com/socketio/socket.io-parser/blob/main/CHANGELOG.md)
- [Commits](https://github.com/socketio/socket.io-parser/compare/4.2.2...4.2.3)

---
updated-dependencies:
- dependency-name: socket.io-parser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-25 16:25:15 +02:00
Jérôme Petazzoni
5ee1367e79 🖼️ Use ngrok/ngrok image instead of building it from scratch 2023-05-25 16:09:47 +02:00
Jérôme Petazzoni
c1f8177f4e 🔧 Pass kubernetesVersion: in kubeadm config file 2023-05-17 19:04:32 +02:00
Jérôme Petazzoni
d4a9ea2461 🪆 Fix vcluster deployment and add konk.sh script 2023-05-16 19:16:19 +02:00
Jérôme Petazzoni
dd0f6d00fa 🏭️ Refactor the DaemonSet section 2023-05-14 20:10:23 +02:00
Jérôme Petazzoni
79359e2abc 🏭️ Refactor YAML and Namespace chapters 2023-05-14 19:58:45 +02:00
Jérôme Petazzoni
9cd812de75 Update ingress chapter and manifest 2023-05-13 12:06:47 +02:00
Jérôme Petazzoni
e29bfe7921 🔧 Improve mk8s Terraform configuration
- instead of using 'kubectl wait nodes', we now use a simpler
  'kubectl get nodes -o name' and check if there is anything
  in the output. This seems to work better (as the previous
  method would sometimes remain stuck because the kubectl
  process would never get stopped by SIGPIPE).
- the shpod SSH NodePort is no longer hard-coded to 32222,
  which allows us to use e.g. vcluster to deploy multiple
  Kubernetes labs on a single 'home' (or 'outer') Kubernetes
  cluster.
2023-05-13 08:19:19 +02:00
Jérôme Petazzoni
11bc78851b Add Scaleway and Hetzner to ARM providers 2023-05-12 18:13:19 +02:00
Jérôme Petazzoni
c611f55dca Update cluster upgrade section
We now go from 1.22 to 1.23.

Updating to 1.22 was necessary because Kubernetes 1.27
deprecated kubeadm config v1beta2, which forced us to
upgrade to v1beta3, which was only introduced in 1.22.
In other words, our scripts can only install Kubernetes
1.22+ now.
2023-05-12 07:23:36 +02:00
Jérôme Petazzoni
980bc66c3a 🔧 Improve output of 'labctl tags' 2023-05-12 07:03:49 +02:00
Jérôme Petazzoni
fd0bc97a7a 🔓️ Disable port protection on AWS and OpenStack
This is required for the kubenet and kuberouter labs, for
'operating kubernetes' training classes.
2023-05-12 06:57:54 +02:00
Jérôme Petazzoni
8f6c32e94a 🔧 Tweak history limit to keep 1 million lines 2023-05-11 14:43:04 +02:00
Jérôme Petazzoni
1a711f8c2c Add kubent
Kube No Trouble (kubent) is a simple tool to check whether you're using any deprecated API versions in your cluster, and therefore should upgrade your workloads first, before upgrading your Kubernetes cluster.
2023-05-10 19:12:55 +02:00
Jérôme Petazzoni
0080f21817 Add velero CLI 2023-05-10 18:45:34 +02:00
ENIX NOC
f937456232 Fixed executable name for pssh on ubuntu 2023-05-09 15:28:37 +00:00
ENIX NOC
8376aba5fd Fixed ssh key usage when setting password 2023-05-09 15:28:20 +00:00
Jérôme Petazzoni
6d13122a4d Add BuildKit RUN --mount=type=cache... 2023-05-09 07:50:40 +02:00
Jérôme Petazzoni
8184c46ed3 Upgrade metrics-server install instructions 2023-05-09 07:25:48 +02:00
Jérôme Petazzoni
0b900f9e5c Add example file for OpenStack tfvars 2023-05-09 07:25:11 +02:00
Jérôme Petazzoni
e14d0d4ca4 🔧 Tweak netlify DNS script to take domain as env var
Now that script can be used for container.training, but also our
other properties at Netlify (e.g. tinyshellscript.com)
2023-05-08 21:50:17 +02:00
dependabot[bot]
cdb1e41524 Bump engine.io and socket.io in /slides/autopilot
Bumps [engine.io](https://github.com/socketio/engine.io) to 6.4.2 and updates ancestor dependency [socket.io](https://github.com/socketio/socket.io). These dependencies need to be updated together.


Updates `engine.io` from 6.2.1 to 6.4.2
- [Release notes](https://github.com/socketio/engine.io/releases)
- [Changelog](https://github.com/socketio/engine.io/blob/main/CHANGELOG.md)
- [Commits](https://github.com/socketio/engine.io/compare/6.2.1...6.4.2)

Updates `socket.io` from 4.5.1 to 4.6.1
- [Release notes](https://github.com/socketio/socket.io/releases)
- [Changelog](https://github.com/socketio/socket.io/blob/main/CHANGELOG.md)
- [Commits](https://github.com/socketio/socket.io/compare/4.5.1...4.6.1)

---
updated-dependencies:
- dependency-name: engine.io
  dependency-type: indirect
- dependency-name: socket.io
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-04 10:25:18 +02:00
Jérôme Petazzoni
600e7c441c Bump up kubeadm configuration version
v1beta2 support was removed in Kubernetes 1.27.
Warning, v1beta3 was introduced in Kubernetes 1.22
(I think?) which means that the minimum version for
"old cluster" deployments is now 1.22.
2023-04-24 06:58:06 +02:00
Jérôme Petazzoni
81913d88a0 Add script to list civo locations 2023-04-23 16:13:51 +02:00
Jérôme Petazzoni
17d3d9a92a ♻️ Add clean up script to remove stray LBs and PVs 2023-04-12 08:25:47 +02:00
Jérôme Petazzoni
dd026b3db2 📃 Update healthchecks section 2023-04-11 12:42:51 +02:00
Jérôme Petazzoni
b9426af9cd ✂️ Remove Dockerfile and Compose file
They're not valid anymore, and fixing them would require quite a lot of
work, since we drastically changed the way we provision things. I'm
removing them rather than leaving a completely broken thing.
2023-04-11 10:19:20 +02:00
MrUtkarsh
aa4c0846ca Update Dockerfile_Tips.md
Updated the chown to chmod as it's repeated.
2023-04-10 16:18:34 +02:00
Jérôme Petazzoni
abca33af29 🏭️ Second pass of Terraform refactoring
Break down provider-specific configuration into two files:
- config.tf (actual configuration, e.g. credentials, that cannot be
  included in submodules)
- variables.tf (per-provider knobs and settings, e.g. mapping logical
  VM size like S/M/L to actual cloud SKUs)
2023-04-09 09:45:05 +02:00
Jérôme Petazzoni
f69a9d3eb8 🔧 Update .gitignore to get some Terraform stuff out of the way 2023-04-04 19:34:51 +02:00
Jérôme Petazzoni
bc10c5a5ca 📔 A bit of doc 😅 2023-04-04 19:32:49 +02:00
Jérôme Petazzoni
b6340acb6e ⚛️ Huge refactoring of lab environment deployment system
Summary of changes:
- "workshopctl" is now "labctl"
- it can handle deployment of VMs but also of managed
  Kubernetes clusters (and therefore, it replaces
  the "prepare-tf" directory)
- support for many more providers has been added

Check the README.md, in particular the "directory structure";
it has the most important information.
2023-03-29 18:36:48 +02:00
Jérôme Petazzoni
f8ab4adfb7 ⚙️ Make it possible to change number of parallel SSH connections with env var 2023-03-21 17:54:29 +01:00
Jérôme Petazzoni
dc8bd21062 📃 Add YAML exercise 2023-03-20 12:56:06 +01:00
Jérôme Petazzoni
c9710a9f70 📃 Update YAML section
- fix mapping example
- fix indentation
- add information about multi-documents
- add information about multi-line strings
2023-03-20 12:46:16 +01:00
ENIX NOC
bc1ba942c0 🔧 Retry 'terraform apply' 3 times if it fails
Some platforms (looking at you OpenStack) can exhibit random
transient failures. This helps to work around them.
2023-03-11 19:42:57 +01:00
ENIX NOC
fa0a894ebc 🔧 OpenStack pool and external_network_id are now variables 2023-03-11 19:42:57 +01:00
ENIX NOC
e78e0de377 🐞 Fix bug in 'passwords' action
It was still hard-coded to user 'docker' instead of using
the USER_LOGIN environment variable.

Also add download-retry when wgetting the websocketd deb.
2023-03-11 19:42:57 +01:00
Jérôme Petazzoni
cba2ff5ff7 🔧 Check for httpie in netlify DNS script 2023-03-08 17:57:17 +01:00
Jérôme Petazzoni
d8f8bf6d87 ♻️ Switch Hetzner to the new Terraform system 2023-03-04 15:24:51 +01:00
Jérôme Petazzoni
84f131cdc5 🏭️ Refactor Digital Ocean and Linode authentication in prepare-tf
Fetch credentials from CLI configuration files instead of environment variables.
2023-03-04 14:35:09 +01:00
Jérôme Petazzoni
8738f68a72 🏭️ Small refactorings to prepare Terraform migration
- add support for Digital Ocean (through Terraform)
- add support for per-cluster SSH key (hackish for now)
- pre-load Kubernetes APT GPG key (because of GCS outage)
2023-03-04 13:40:43 +01:00
Jérôme Petazzoni
e130884184 Bump up DOK version 2023-03-04 10:18:53 +01:00
Jérôme Petazzoni
74cb1aec85 ⚙️ Store terraform variables (# of nodes...) in tfvars file
Using environment variables was a mistake, because they must be set again
manually each time we want to re-apply the Terraform configurations.
Instead, put the variables in a tfvars file.
2023-03-04 10:18:35 +01:00
Jérôme Petazzoni
70e60d7f4e 🏭️ Big refactoring to move to Ubuntu 22.04
Instead of Ubuntu 18.04, we should use 22.04 (especially as
18.04 will be EOL soon). This moves a few providers to 22.04
(and more will follow).

We now ship a small containerd configuration file (instead
of defaulting to an empty configuration like we did before)
since it looks like recent versions of containerd cause
infinite crashloops if the cgroups driver isn't set properly.

Also, Linode is now provisioned using Terraform (instead of
the old-style system relying on linode-cli) which should make
instance provisioning faster (thanks to Terraform parallelism).

The "wait" command now tries to log in with both "ubuntu" and
"root", and if it fails with "ubuntu" but succeeds with "root",
it will create the "ubuntu" user and give it full sudo rights.

Finally, a "standardize" action has been created to gather all
the commands that deal with non-standard Ubuntu images.

Note that for completeness, we should check that all providers
work correctly; currently only Linode has been validated.
2023-02-23 16:32:10 +01:00
Jérôme Petazzoni
29b3185e7e 🐘 Add link to Mastodon profile 2023-02-23 10:06:38 +01:00
Jérôme Petazzoni
0616d74e37 Add gentle intro to YAML 2023-02-22 20:56:46 +01:00
Jérôme Petazzoni
676ebcdd3f ♻️ Replace jpetazzo/httpenv with jpetazzo/color 2023-02-20 14:22:02 +01:00
Jérôme Petazzoni
28f0253242 Add kubectl np-viewer in network policy section 2023-02-20 10:37:53 +01:00
323 changed files with 7183 additions and 6710 deletions

14
.gitignore vendored

@@ -2,11 +2,14 @@
*.swp
*~
prepare-vms/tags
prepare-vms/infra
prepare-vms/www
prepare-tf/tag-*
**/terraform.tfstate
**/terraform.tfstate.backup
prepare-labs/terraform/lab-environments
prepare-labs/terraform/many-kubernetes/one-kubernetes-config/config.tf
prepare-labs/terraform/many-kubernetes/one-kubernetes-module/*.tf
prepare-labs/terraform/tags
prepare-labs/terraform/virtual-machines/openstack/*.tfvars
prepare-labs/www
slides/*.yml.html
slides/autopilot/state.yaml
@@ -26,3 +29,4 @@ node_modules
Thumbs.db
ehthumbs.db
ehthumbs_vista.db


@@ -0,0 +1,13 @@
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  #minAvailable: 2
  #minAvailable: 90%
  maxUnavailable: 1
  #maxUnavailable: 10%
  selector:
    matchLabels:
      app: my-app


@@ -1,36 +1,44 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: traefik
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: traefik-ingress-controller
namespace: kube-system
name: traefik
namespace: traefik
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: traefik-ingress-controller
namespace: kube-system
name: traefik
namespace: traefik
labels:
k8s-app: traefik-ingress-lb
app: traefik
spec:
selector:
matchLabels:
k8s-app: traefik-ingress-lb
app: traefik
template:
metadata:
labels:
k8s-app: traefik-ingress-lb
name: traefik-ingress-lb
app: traefik
name: traefik
spec:
tolerations:
- effect: NoSchedule
operator: Exists
hostNetwork: true
serviceAccountName: traefik-ingress-controller
# If, for some reason, our CNI plugin doesn't support hostPort,
# we can enable hostNetwork instead. That should work everywhere
# but it doesn't provide the same isolation.
#hostNetwork: true
serviceAccountName: traefik
terminationGracePeriodSeconds: 60
containers:
- image: traefik:v2.5
name: traefik-ingress-lb
- image: traefik:v2.10
name: traefik
ports:
- name: http
containerPort: 80
@@ -61,7 +69,7 @@ spec:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
name: traefik
rules:
- apiGroups:
- ""
@@ -73,14 +81,6 @@ rules:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- networking.k8s.io
resources:
@@ -94,15 +94,15 @@ rules:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: traefik-ingress-controller
name: traefik
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: traefik-ingress-controller
name: traefik
subjects:
- kind: ServiceAccount
name: traefik-ingress-controller
namespace: kube-system
name: traefik
namespace: traefik
---
kind: IngressClass
apiVersion: networking.k8s.io/v1

222
prepare-labs/README.md Normal file

@@ -0,0 +1,222 @@
# Tools to create lab environments
This directory contains tools to create lab environments for Docker and Kubernetes courses and workshops.
It also contains Terraform configurations that can be used stand-alone to create simple Kubernetes clusters.
Assuming that you have installed all the necessary dependencies, and placed cloud provider access tokens in the right locations, you could do, for instance:
```bash
# For a Docker course with 50 students,
# create 50 VMs on Digital Ocean.
./labctl create --students 50 --settings settings/docker.env --provider digitalocean
# For a Kubernetes training with 20 students,
# create 20 clusters of 4 VMs each using kubeadm,
# on a private OpenStack cluster.
./labctl create --students 20 --settings settings/kubernetes.env --provider openstack/enix
# For a Kubernetes workshop with 80 students,
# create 80 clusters with 2 VMs each,
# using Scaleway Kapsule (managed Kubernetes).
./labctl create --students 80 --settings settings/mk8s.env --provider scaleway --mode mk8s
```
Interested? Read on!
## Software requirements
For Docker labs and Kubernetes labs based on kubeadm:
- [Parallel SSH](https://github.com/lilydjwg/pssh)
(should be installable with `pip install git+https://github.com/lilydjwg/pssh`;
on a Mac, try `brew install pssh`)
For all labs:
- Terraform
If you want to generate printable cards:
- [pyyaml](https://pypi.python.org/pypi/PyYAML)
- [jinja2](https://pypi.python.org/pypi/Jinja2)
These require Python 3. If you are on a Mac, see below for specific instructions on making
Python 3 the default. In particular, if you installed `mosh`, Homebrew
may have changed your default Python to Python 2.
You will also need an account with the cloud provider(s) that you want to use to deploy the lab environments.
## Cloud provider account(s) and credentials
These scripts create VMs or Kubernetes clusters on cloud providers, so you will need cloud provider account(s) and credentials.
Generally, we try to use the credentials stored in the configuration files used by the cloud providers' CLI tools.
This means, for instance, that for Linode, if you install `linode-cli` and configure it properly, it will place your credentials in `~/.config/linode-cli`, and our Terraform configurations will try to read that file and use the credentials in it.
You don't **have to** install the CLI tools of the cloud provider(s) that you want to use; but we recommend that you do.
If you want to provide your cloud credentials through other means, you will have to adjust the Terraform configuration files in `terraform/provider-config` accordingly.
Here is where we look for credentials for each provider:
- AWS: Terraform defaults; see [AWS provider documentation][creds-aws] (for instance, you can use the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables, or AWS config and profile files)
- Azure: Terraform defaults; see [AzureRM provider documentation][creds-azure] (typically, you can authenticate with the `az` CLI and Terraform will pick it up automatically)
- Civo: CLI configuration file (`~/.civo.json`)
- Digital Ocean: CLI configuration file (`~/.config/doctl/config.yaml`)
- Exoscale: CLI configuration file (`~/.config/exoscale/exoscale.toml`)
- Google Cloud: FIXME, note that the project name is currently hard-coded to `prepare-tf`
- Hetzner: CLI configuration file (`~/.config/hcloud/cli.toml`)
- Linode: CLI configuration file (`~/.config/linode-cli`)
- OpenStack: you will need to write a tfvars file (check [this example](terraform/virtual-machines/openstack/tfvars.example))
- Oracle: Terraform defaults; see [OCI provider documentation][creds-oci] (for instance, you can set up API keys; or you can use a short-lived token generated by the OCI CLI with `oci session authenticate`)
- OVH: Terraform defaults; see [OVH provider documentation][creds-ovh] (this typically involves setting up 5 `OVH_...` environment variables)
- Scaleway: Terraform defaults; see [Scaleway provider documentation][creds-scw] (for instance, you can set environment variables, but it will also automatically pick up CLI authentication from `~/.config/scw/config.yaml`)
[creds-aws]: https://registry.terraform.io/providers/hashicorp/aws/latest/docs#authentication-and-configuration
[creds-azure]: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs#authenticating-to-azure
[creds-oci]: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/terraformproviderconfiguration.htm#authentication
[creds-ovh]: https://registry.terraform.io/providers/ovh/ovh/latest/docs#provider-configuration
[creds-scw]: https://registry.terraform.io/providers/scaleway/scaleway/latest/docs#authentication
## General Workflow
- fork/clone repo
- make sure your cloud credentials have been configured properly
- run `./labctl create ...` to create lab environments
- run `./labctl destroy ...` when you don't need the environments anymore
## Customizing things
You can edit the `settings/*.env` files, for instance to change the size of the clusters, the login or password used for the students...
Note that these files are sourced before executing any operation on a specific set of lab environments, which means that you can set Terraform variables by adding lines like the following ones in the `*.env` files:
```bash
export TF_VAR_node_size=GP1.L
export TF_VAR_location=eu-north
```
## `./labctl` Usage
If you run `./labctl` without arguments, it will show a list of available commands.
### Summary of What `./labctl` Does For You
The script will create a Terraform configuration using a provider-specific template.
There are two modes: `pssh` and `mk8s`.
In `pssh` mode, students connect directly to the virtual machines using SSH.
The Terraform configuration creates a bunch of virtual machines; provisioning and configuration are then done with `pssh`. A number of "steps" are executed on the VMs to install Docker, install various convenient tools, install and set up Kubernetes (if needed), and so on. The list of "steps" to execute is configured in the `settings/*.env` file.
In `mk8s` mode, students don't connect directly to the virtual machines. Instead, they connect to an SSH server running in a Pod (using the `jpetazzo/shpod` image), itself running on a Kubernetes cluster. The Kubernetes cluster is a managed cluster created by the Terraform configuration.
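Concretely, the way students connect differs between the two modes. The sketch below uses placeholder addresses, a placeholder login, and an example NodePort; the actual values are generated per-deployment, so none of these are literal outputs of `labctl`:

```shell
# pssh mode: each student SSHes straight into their VM.
# (Hypothetical IP address; real ones come from the generated environment.)
ssh student@203.0.113.10

# mk8s mode: students SSH into the shpod Pod, exposed through a
# NodePort on the managed cluster (32222 is just an example port;
# it is no longer hard-coded and varies between labs).
ssh -p 32222 student@203.0.113.10
```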
## `terraform` directory structure and principles
Legend:
- `📁` directory
- `📄` file
- `📄📄📄` multiple files
- `🌍` Terraform configuration that can be used "as-is"
```
📁terraform
├── 📁list-locations
│ └── 📄📄📄 helper scripts
│ (to list available locations for each provider)
├── 📁many-kubernetes
│ └── 📄📄📄 Terraform configuration template
│ (used in mk8s mode)
├── 📁one-kubernetes
│ │ (contains Terraform configurations that can spawn
│ │ a single Kubernetes cluster on a given provider)
│ ├── 📁🌍aws
│ ├── 📁🌍civo
│ ├── 📄common.tf
│ ├── 📁🌍digitalocean
│ └── ...
├── 📁providers
│ ├── 📁aws
│ │ ├── 📄config.tf
│ │ └── 📄variables.tf
│ ├── 📁azure
│ │ ├── 📄config.tf
│ │ └── 📄variables.tf
│ ├── 📁civo
│ │ ├── 📄config.tf
│ │ └── 📄variables.tf
│ ├── 📁digitalocean
│ │ ├── 📄config.tf
│ │ └── 📄variables.tf
│ └── ...
├── 📁tags
│ │ (contains Terraform configurations + other files
│ │ for a specific set of VMs or K8S clusters; these
│ │ are created by labctl)
│ ├── 📁2023-03-27-10-04-79-jp
│ ├── 📁2023-03-27-10-07-41-jp
│ ├── 📁2023-03-27-10-16-418-jp
│ └── ...
└── 📁virtual-machines
│ (contains Terraform configurations that can spawn
│ a bunch of virtual machines on a given provider)
├── 📁🌍aws
├── 📁🌍azure
├── 📄common.tf
├── 📁🌍digitalocean
└── ...
```
The directory structure can feel a bit overwhelming at first, but it's built with specific goals in mind.
**Consistent input/output between providers.** The per-provider configurations in `one-kubernetes` all take the same input variables, and provide the same output variables. Same thing for the per-provider configurations in `virtual-machines`.
**Don't repeat yourself.** As much as possible, common variables, definitions, and logic have been factored into the `common.tf` file that you can see in `one-kubernetes` and `virtual-machines`. That file is then symlinked into each provider-specific directory, to make sure that all providers use the same version of the `common.tf` file.
**Don't repeat yourself (again).** The things that are specific to each provider have been placed in the `providers` directory, and are shared between the `one-kubernetes` and the `virtual-machines` configurations. Specifically, for each provider, there is `config.tf` (which contains provider configuration, e.g. how to obtain the credentials for that provider) and `variables.tf` (which contains default values like which location and which VM size to use).
**Terraform configurations should work in `labctl` or standalone, without extra work.** The Terraform configurations (identified by 🌍 in the directory tree above) can be used directly. Just go to one of these directories, `terraform init`, `terraform apply`, and you're good to go. But they can also be used from `labctl`. `labctl` shouldn't barf out if you did a `terraform apply` in one of these directories (because it will only copy the `*.tf` files, and leave alone the other files, like the Terraform state).
The latter means that it should be easy to tweak these configurations, or create a new one, without having to use `labctl` to test it. It also means that if you want to use these configurations but don't care about `labctl`, you absolutely can!
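For instance, using one of the 🌍 configurations standalone could look like the following (the provider directory chosen here is just an example):

```shell
# Spin up a single Kubernetes cluster on one provider, without labctl.
cd terraform/one-kubernetes/digitalocean
terraform init
terraform apply

# The local Terraform state stays in this directory (labctl only copies
# the *.tf files), so tearing down is the usual:
terraform destroy
```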
## Miscellaneous info
### Making sure Python3 is the default (Mac only)
Check the `/usr/local/bin/python` symlink. It should be pointing to
`/usr/local/Cellar/python/3`-something. If it isn't, follow these
instructions.
1) Verify that Python 3 is installed.
```
ls -la /usr/local/Cellar/Python
```
You should see one or more versions of Python 3. If you don't,
install it with `brew install python`.
2) Verify that `python` points to Python3.
```
ls -la /usr/local/bin/python
```
If this points to `/usr/local/Cellar/python@2`, then we'll need to change it.
```
rm /usr/local/bin/python
ln -s /usr/local/Cellar/Python/xxxx /usr/local/bin/python
# where xxxx is the most recent Python 3 version you saw above
```
### AWS specific notes
These instructions assume that you're using a root account. If you'd like to use an IAM user instead, it will need the right permissions. For `pssh` mode, that includes at least `AmazonEC2FullAccess` and `IAMReadOnlyAccess`.
In `pssh` mode, the Terraform configuration currently uses the default VPC and Security Group. If you want to use another one, you'll have to make changes to `terraform/virtual-machines/aws`.
The default VPC Security Group does not open any ports from the Internet by default, so you'll need to add inbound rules for `SSH | TCP | 22 | 0.0.0.0/0` and `Custom TCP Rule | TCP | 8000 - 8002 | 0.0.0.0/0`.
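If you prefer the command line over the AWS console, a sketch of adding those rules with the AWS CLI could look like this (assuming the security group is the one named `default` in the default VPC):

```shell
# Allow SSH from anywhere on the default security group.
aws ec2 authorize-security-group-ingress \
    --group-name default --protocol tcp --port 22 --cidr 0.0.0.0/0

# Allow the lab HTTP ports (8000-8002) from anywhere.
aws ec2 authorize-security-group-ingress \
    --group-name default --protocol tcp --port 8000-8002 --cidr 0.0.0.0/0
```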

33
prepare-labs/cleanup.sh Executable file

@@ -0,0 +1,33 @@
#!/bin/sh

case "$1-$2" in
    linode-lb)
        linode-cli nodebalancers list --json |
            jq '.[] | select(.label | startswith("ccm-")) | .id' |
            xargs -n1 -P10 linode-cli nodebalancers delete
        ;;
    linode-pvc)
        linode-cli volumes list --json |
            jq '.[] | select(.label | startswith("pvc")) | .id' |
            xargs -n1 -P10 linode-cli volumes delete
        ;;
    digitalocean-lb)
        doctl compute load-balancer list --output json |
            jq .[].id |
            xargs -n1 -P10 doctl compute load-balancer delete --force
        ;;
    digitalocean-pvc)
        doctl compute volume list --output json |
            jq '.[] | select(.name | startswith("pvc-")) | .id' |
            xargs -n1 -P10 doctl compute volume delete --force
        ;;
    scaleway-pvc)
        scw instance volume list --output json |
            jq '.[] | select(.name | contains("_pvc-")) | .id' |
            xargs -n1 -P10 scw instance volume delete
        ;;
    *)
        echo "Unknown combination of provider ('$1') and resource ('$2')."
        ;;
esac

59
prepare-labs/dns-cloudflare.sh Executable file

@@ -0,0 +1,59 @@
#!/bin/sh
#set -eu

if ! command -v http >/dev/null; then
    echo "Could not find the 'http' command line tool."
    echo "Please install it (the package name might be 'httpie')."
    exit 1
fi

. ~/creds/creds.cloudflare.dns

cloudflare() {
    case "$1" in
        GET|POST|DELETE)
            METHOD="$1"
            shift
            ;;
        *)
            METHOD=""
            ;;
    esac
    URI=$1
    shift
    http --ignore-stdin $METHOD https://api.cloudflare.com/client/v4/$URI "$@" "Authorization:Bearer $CLOUDFLARE_TOKEN"
}

_list_zones() {
    cloudflare zones | jq -r .result[].name
}

_get_zone_id() {
    cloudflare zones?name=$1 | jq -r .result[0].id
}

_populate_zone() {
    ZONE_ID=$(_get_zone_id $1)
    shift
    for IPADDR in $*; do
        cloudflare zones/$ZONE_ID/dns_records "name=*" "type=A" "content=$IPADDR"
        cloudflare zones/$ZONE_ID/dns_records "name=\@" "type=A" "content=$IPADDR"
    done
}

_clear_zone() {
    ZONE_ID=$(_get_zone_id $1)
    for RECORD_ID in $(
        cloudflare zones/$ZONE_ID/dns_records | jq -r .result[].id
    ); do
        cloudflare DELETE zones/$ZONE_ID/dns_records/$RECORD_ID
    done
}

_add_zone() {
    cloudflare zones "name=$1"
}

echo "This script is still work in progress."
echo "You can source it and then use its individual functions."


@@ -2,16 +2,16 @@
"""
There are two ways to use this script:
1. Pass a file name and a tag name as a single argument.
It will load a list of domains from the given file (one per line),
and assign them to the clusters corresponding to that tag.
There should be more domains than clusters.
Example: ./map-dns.py domains.txt 2020-08-15-jp
2. Pass a domain as the 1st argument, and IP addresses then.
1. Pass a domain as the 1st argument, and IP addresses then.
It will configure the domain with the listed IP addresses.
Example: ./map-dns.py open-duck.site 1.2.3.4 2.3.4.5 3.4.5.6
2. Pass two file names as arguments, in which case the first
file should contain a list of domains, and the second a list of
groups of IP addresses, with one group per line.
There should be more domains than groups of addresses.
Example: ./map-dns.py domains.txt tags/2020-08-15-jp/clusters.txt
In both cases, the domains should be configured to use GANDI LiveDNS.
"""
import os
@@ -30,18 +30,9 @@ domain_or_domain_file = sys.argv[1]
if os.path.isfile(domain_or_domain_file):
domains = open(domain_or_domain_file).read().split()
domains = [ d for d in domains if not d.startswith('#') ]
ips_file_or_tag = sys.argv[2]
if os.path.isfile(ips_file_or_tag):
lines = open(ips_file_or_tag).read().split('\n')
clusters = [line.split() for line in lines]
else:
ips = open(f"tags/{ips_file_or_tag}/ips.txt").read().split()
settings_file = f"tags/{ips_file_or_tag}/settings.yaml"
clustersize = yaml.safe_load(open(settings_file))["clustersize"]
clusters = []
while ips:
clusters.append(ips[:clustersize])
ips = ips[clustersize:]
clusters_file = sys.argv[2]
lines = open(clusters_file).read().split('\n')
clusters = [line.split() for line in lines]
else:
domains = [domain_or_domain_file]
clusters = [sys.argv[2:]]


@@ -12,12 +12,15 @@
echo "$0 del <recordid>"
echo ""
echo "Example to create a A record for eu.container.training:"
echo "$0 add eu 185.145.250.0"
echo "$0 add eu A 185.145.250.0"
echo ""
exit 1
}
NETLIFY_CONFIG_FILE=~/.config/netlify/config.json
if ! [ "$DOMAIN" ]; then
DOMAIN=container.training
fi
if ! [ -f "$NETLIFY_CONFIG_FILE" ]; then
echo "Could not find Netlify configuration file ($NETLIFY_CONFIG_FILE)."
@@ -26,6 +29,12 @@ if ! [ -f "$NETLIFY_CONFIG_FILE" ]; then
exit 1
fi
if ! command -v http >/dev/null; then
echo "Could not find the 'http' command line tool."
echo "Please install it (the package name might be 'httpie')."
exit 1
fi
NETLIFY_USERID=$(jq .userId < "$NETLIFY_CONFIG_FILE")
NETLIFY_TOKEN=$(jq -r .users[$NETLIFY_USERID].auth.token < "$NETLIFY_CONFIG_FILE")
@@ -36,31 +45,33 @@ netlify() {
}
ZONE_ID=$(netlify dns_zones |
jq -r '.[] | select ( .name == "container.training" ) | .id')
jq -r '.[] | select ( .name == "'$DOMAIN'" ) | .id')
_list() {
netlify dns_zones/$ZONE_ID/dns_records |
jq -r '.[] | select(.type=="A") | [.hostname, .type, .value, .id] | @tsv'
jq -r '.[] | select(.type=="A" or .type=="AAAA") | [.hostname, .type, .value, .id] | @tsv' |
sort |
column --table
}
_add() {
NAME=$1.container.training
ADDR=$2
NAME=$1.$DOMAIN
TYPE=$2
VALUE=$3
# It looks like if we create two identical records, then delete one of them,
# Netlify DNS ends up in a weird state (the name doesn't resolve anymore even
# though it's still visible through the API and the website?)
if netlify dns_zones/$ZONE_ID/dns_records |
jq '.[] | select(.hostname=="'$NAME'" and .type=="A" and .value=="'$ADDR'")' |
jq '.[] | select(.hostname=="'$NAME'" and .type=="'$TYPE'" and .value=="'$VALUE'")' |
grep .
then
echo "It looks like that record already exists. Refusing to create it."
exit 1
fi
netlify dns_zones/$ZONE_ID/dns_records type=A hostname=$NAME value=$ADDR ttl=300
netlify dns_zones/$ZONE_ID/dns_records type=$TYPE hostname=$NAME value=$VALUE ttl=300
netlify dns_zones/$ZONE_ID/dns_records |
jq '.[] | select(.hostname=="'$NAME'")'
@@ -79,7 +90,7 @@ case "$1" in
_list
;;
add)
_add $2 $3
_add $2 $3 $4
;;
del)
_del $2


(binary image file changed; 127 KiB before and after)

prepare-labs/konk.sh Executable file

@@ -0,0 +1,23 @@
#!/bin/sh
# deploy big cluster
#TF_VAR_node_size=g6-standard-6 \
#TF_VAR_nodes_per_cluster=5 \
#TF_VAR_location=eu-west \
TF_VAR_node_size=PRO2-XS \
TF_VAR_nodes_per_cluster=5 \
TF_VAR_location=fr-par-2 \
./labctl create --mode mk8s --settings settings/mk8s.env --provider scaleway --tag konk
# set kubeconfig file
cp tags/konk/stage2/kubeconfig.101 ~/kubeconfig
# set external_ip labels
kubectl get nodes -o=jsonpath='{range .items[*]}{.metadata.name} {.status.addresses[?(@.type=="ExternalIP")].address}{"\n"}{end}' |
while read node address; do
kubectl label node $node external_ip=$address
done
# vcluster all the things
./labctl create --settings settings/mk8s.env --provider vcluster --mode mk8s --students 50


@@ -21,10 +21,13 @@ DEPENDENCIES="
man
pssh
ssh
wkhtmltopdf
yq
"
UNUSED_DEPENDENCIES="
wkhtmltopdf
"
# Check for missing dependencies, and issue a warning if necessary.
missing=0
for dependency in $DEPENDENCIES; do


@@ -50,20 +50,6 @@ sep() {
fi
}
need_infra() {
if [ -z "$1" ]; then
die "Please specify infrastructure file. (e.g.: infra/aws)"
fi
if [ "$1" = "--infra" ]; then
die "The infrastructure file should be passed directly to this command. Remove '--infra' and try again."
fi
if [ ! -f "$1" ]; then
die "Infrastructure file $1 doesn't exist."
fi
. "$1"
. "lib/infra/$INFRACLASS.sh"
}
need_tag() {
if [ -z "$TAG" ]; then
die "Please specify a tag. To see available tags, run: $0 tags"
@@ -71,25 +57,12 @@ need_tag() {
if [ ! -d "tags/$TAG" ]; then
die "Tag $TAG not found (directory tags/$TAG does not exist)."
fi
for FILE in settings.yaml ips.txt infra.sh; do
for FILE in settings.env ips.txt; do
if [ ! -f "tags/$TAG/$FILE" ]; then
warning "File tags/$TAG/$FILE not found."
fi
done
. "tags/$TAG/infra.sh"
. "lib/infra/$INFRACLASS.sh"
}
need_settings() {
if [ -z "$1" ]; then
die "Please specify a settings file. (e.g.: settings/kube101.yaml)"
fi
if [ ! -f "$1" ]; then
die "Settings file $1 doesn't exist."
if [ -f "tags/$TAG/settings.env" ]; then
. tags/$TAG/settings.env
fi
}
need_login_password() {
USER_LOGIN=$(yq -r .user_login < tags/$TAG/settings.yaml)
USER_PASSWORD=$(yq -r .user_password < tags/$TAG/settings.yaml)
}


@@ -1,5 +1,3 @@
export AWS_DEFAULT_OUTPUT=text
# Ignore SSH key validation when connecting to these remote hosts.
# (Otherwise, deployment scripts break when a VM IP address is reused.)
SSHOPTS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=ERROR"
@@ -16,25 +14,17 @@ _cmd_help() {
printf "%s" "$HELP" | sort
}
_cmd build "Build the Docker image to run this program in a container"
_cmd_build() {
docker-compose build
}
_cmd wrap "Run this program in a container"
_cmd_wrap() {
docker-compose run --rm workshopctl "$@"
}
_cmd cards "Generate ready-to-print cards for a group of VMs"
_cmd_cards() {
TAG=$1
need_tag
die FIXME
# This will process ips.txt to generate two files: ips.pdf and ips.html
(
cd tags/$TAG
../../lib/ips-txt-to-html.py settings.yaml
../../../lib/ips-txt-to-html.py settings.yaml
)
ln -sf ../tags/$TAG/ips.html www/$TAG.html
@@ -47,10 +37,10 @@ _cmd_cards() {
info "$0 www"
}
_cmd clean "Remove information about stopped clusters"
_cmd clean "Remove information about destroyed clusters"
_cmd_clean() {
for TAG in tags/*; do
if grep -q ^stopped$ "$TAG/status"; then
if grep -q ^destroyed$ "$TAG/status"; then
info "Removing $TAG..."
rm -rf "$TAG"
fi
@@ -61,12 +51,13 @@ _cmd createuser "Create the user that students will use"
_cmd_createuser() {
TAG=$1
need_tag
need_login_password
pssh "
set -e
# Create the user if it doesn't exist yet.
id $USER_LOGIN || sudo useradd -d /home/$USER_LOGIN -g users -m -s /bin/bash $USER_LOGIN
# Make sure there is at least exec permission on their home.
sudo chmod a+X /home/$USER_LOGIN
# Add them to the docker group, if there is one.
grep ^docker: /etc/group && sudo usermod -aG docker $USER_LOGIN
# Set their password.
@@ -80,7 +71,7 @@ _cmd_createuser() {
set -e
sudo sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
sudo sed -i 's/#MaxAuthTries 6/MaxAuthTries 42/' /etc/ssh/sshd_config
sudo service ssh restart
sudo systemctl restart ssh.service
"
pssh "
@@ -96,6 +87,12 @@ _cmd_createuser() {
fi
"
# FIXME this is a gross hack to add the deployment key to our SSH agent,
# so that it can be used to bounce from host to host (which is necessary
# in the next deployment step). In the long run, we probably want to
# generate these keys locally and push them to the machines instead
# (once we move everything to Terraform).
ssh-add tags/$TAG/id_rsa
pssh "
set -e
cd /home/$USER_LOGIN
@@ -105,6 +102,7 @@ _cmd_createuser() {
sudo -u $USER_LOGIN tar -xf-
fi
"
ssh-add -d tags/$TAG/id_rsa
# FIXME do this only once.
pssh -I "sudo -u $USER_LOGIN tee -a /home/$USER_LOGIN/.bashrc" <<"SQRL"
@@ -128,6 +126,7 @@ set number
set shiftwidth=2
set softtabstop=2
set nowrap
set laststatus=2
SQRL
pssh -I "sudo -u $USER_LOGIN tee /home/$USER_LOGIN/.tmux.conf" <<SQRL
@@ -142,9 +141,11 @@ bind l select-pane -R
set -g mouse on
# Make scrolling with wheels work
bind -n WheelUpPane if-shell -F -t = "#{mouse_any_flag}" "send-keys -M" "if -Ft= '#{pane_in_mode}' 'send-keys -M' 'select-pane -t=; copy-mode -e; send-keys -M'"
bind -n WheelDownPane select-pane -t= \; send-keys -M
# Retain one million lines
set-option -g history-limit 1000000
SQRL
# Install docker-prompt script
@@ -154,80 +155,195 @@ SQRL
echo user_ok > tags/$TAG/status
}
_cmd create "Create lab environments"
_cmd_create() {
while [ ! -z "$*" ]; do
case "$1" in
--mode) MODE=$2; shift 2;;
--provider) PROVIDER=$2; shift 2;;
--settings) SETTINGS=$2; shift 2;;
--students) STUDENTS=$2; shift 2;;
--tag) TAG=$2; shift 2;;
*) die "Unrecognized parameter: $1."
esac
done
if [ -z "$MODE" ]; then
info "Using default mode (pssh)."
MODE=pssh
fi
if [ -z "$PROVIDER" ]; then
die "Please add --provider flag to specify which provider to use."
fi
if [ -z "$SETTINGS" ]; then
die "Please add --settings flag to specify which settings file to use."
fi
if [ -z "$STUDENTS" ]; then
info "Defaulting to 1 student since --students flag wasn't specified."
STUDENTS=1
fi
case "$MODE" in
mk8s)
PROVIDER_BASE=terraform/one-kubernetes
;;
pssh)
PROVIDER_BASE=terraform/virtual-machines
;;
*) die "Invalid mode: $MODE (supported modes: mk8s, pssh)." ;;
esac
if ! [ -f "$SETTINGS" ]; then
die "Settings file ($SETTINGS) not found."
fi
# Check that the provider is valid.
if [ -d $PROVIDER_BASE/$PROVIDER ]; then
if [ -f $PROVIDER_BASE/$PROVIDER/requires_tfvars ]; then
die "Provider $PROVIDER cannot be used directly, because it requires a tfvars file."
fi
PROVIDER_DIRECTORY=$PROVIDER_BASE/$PROVIDER
TFVARS=""
elif [ -f $PROVIDER_BASE/$PROVIDER.tfvars ]; then
TFVARS=$PROVIDER_BASE/$PROVIDER.tfvars
PROVIDER_DIRECTORY=$(dirname $PROVIDER_BASE/$PROVIDER)
else
error "Provider $PROVIDER not found."
info "Available providers for mode $MODE:"
(
cd $PROVIDER_BASE
for P in *; do
if [ -d "$P" ]; then
[ -f "$P/requires_tfvars" ] || info "$P"
for V in $P/*.tfvars; do
[ -f "$V" ] && info "${V%.tfvars}"
done
fi
done
)
die "Please specify a valid provider."
fi
if [ -z "$TAG" ]; then
TAG=$(_cmd_maketag)
fi
mkdir -p tags/$TAG
echo creating > tags/$TAG/status
ln -s ../../$SETTINGS tags/$TAG/settings.env.orig
cp $SETTINGS tags/$TAG/settings.env
. $SETTINGS
echo $MODE > tags/$TAG/mode
echo $PROVIDER > tags/$TAG/provider
case "$MODE" in
mk8s)
cp -d terraform/many-kubernetes/*.* tags/$TAG
mkdir tags/$TAG/one-kubernetes-module
cp $PROVIDER_DIRECTORY/*.tf tags/$TAG/one-kubernetes-module
mkdir tags/$TAG/one-kubernetes-config
mv tags/$TAG/one-kubernetes-module/config.tf tags/$TAG/one-kubernetes-config
;;
pssh)
cp $PROVIDER_DIRECTORY/*.tf tags/$TAG
if [ "$TFVARS" ]; then
cp "$TFVARS" "tags/$TAG/$(basename $TFVARS).auto.tfvars"
fi
;;
esac
(
cd tags/$TAG
terraform init
echo tag = \"$TAG\" >> terraform.tfvars
echo how_many_clusters = $STUDENTS >> terraform.tfvars
echo nodes_per_cluster = $CLUSTERSIZE >> terraform.tfvars
for RETRY in 1 2 3; do
if terraform apply -auto-approve; then
touch terraform.ok
break
fi
done
if ! [ -f terraform.ok ]; then
die "Terraform failed."
fi
)
sep
info "Successfully created $COUNT instances with tag $TAG"
echo create_ok > tags/$TAG/status
# If the settings.env file has a "STEPS" field,
# automatically execute all the actions listed in that field.
# If an action fails, retry it up to 10 times.
for STEP in $(echo $STEPS); do
sep "$TAG -> $STEP"
TRY=1
MAXTRY=10
while ! $0 $STEP $TAG ; do
TRY=$(($TRY+1))
if [ $TRY -gt $MAXTRY ]; then
error "This step ($STEP) failed after $MAXTRY attempts."
info "You can troubleshoot the situation manually, or terminate these instances with:"
info "$0 destroy $TAG"
die "Giving up."
else
sep
info "Step '$STEP' failed for '$TAG'. Let's wait 10 seconds and try again."
info "(Attempt $TRY out of $MAXTRY.)"
sleep 10
fi
done
done
sep
info "Deployment successful."
info "To log into the first machine of that batch, you can run:"
info "$0 ssh $TAG"
info "To terminate these instances, you can run:"
info "$0 destroy $TAG"
}
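The STEPS loop in `_cmd_create` retries each failed action up to 10 times with a 10-second pause between attempts. Stripped of the labctl context, the pattern can be sketched as a standalone helper (the `retry` function and `RETRY_DELAY` variable are illustrative names, not part of workshopctl):

```shell
# Generic retry helper in the spirit of the STEPS loop above.
# RETRY_DELAY overrides the pause between attempts (defaults to 10 seconds).
retry() {
    maxtry=10
    try=1
    delay=${RETRY_DELAY:-10}
    while ! "$@"; do
        try=$((try+1))
        if [ $try -gt $maxtry ]; then
            echo "Giving up after $maxtry attempts." >&2
            return 1
        fi
        echo "Attempt $try out of $maxtry, retrying in $delay seconds..." >&2
        sleep $delay
    done
}
```

Usage example: `RETRY_DELAY=1 retry some_flaky_command`.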
_cmd destroy "Destroy lab environments"
_cmd_destroy() {
TAG=$1
need_tag
cd tags/$TAG
echo destroying > status
terraform destroy -auto-approve
echo destroyed > status
}
_cmd clusterize "Group VMs in clusters"
_cmd_clusterize() {
TAG=$1
need_tag
# Disable unattended upgrades so that they don't interfere with the subsequent steps
pssh sudo rm -f /etc/apt/apt.conf.d/50unattended-upgrades
pssh "
set -e
grep PSSH_ /etc/ssh/sshd_config || echo 'AcceptEnv PSSH_*' | sudo tee -a /etc/ssh/sshd_config
sudo systemctl restart ssh.service"
# Special case for scaleway since it doesn't come with sudo
if [ "$INFRACLASS" = "scaleway" ]; then
pssh -l root "
grep DEBIAN_FRONTEND /etc/environment || echo DEBIAN_FRONTEND=noninteractive >> /etc/environment
grep cloud-init /etc/sudoers && rm /etc/sudoers
apt-get update && apt-get install sudo -y"
pssh -I < tags/$TAG/clusters.txt "
grep -w \$PSSH_HOST | tr ' ' '\n' > /tmp/cluster"
pssh "
echo \$PSSH_HOST > /tmp/ipv4
head -n 1 /tmp/cluster | sudo tee /etc/ipv4_of_first_node
echo ${CLUSTERPREFIX}1 | sudo tee /etc/name_of_first_node
echo HOSTIP=\$PSSH_HOST | sudo tee -a /etc/environment
NODEINDEX=\$((\$PSSH_NODENUM%$CLUSTERSIZE+1))
if [ \$NODEINDEX = 1 ]; then
sudo ln -sf /bin/true /usr/local/bin/i_am_first_node
else
sudo ln -sf /bin/false /usr/local/bin/i_am_first_node
fi
# FIXME
# Special case for hetzner since it doesn't have an ubuntu user
#if [ "$INFRACLASS" = "hetzner" ]; then
# pssh -l root "
#[ -d /home/ubuntu ] ||
# useradd ubuntu -m -s /bin/bash
#echo 'ubuntu ALL=(ALL:ALL) NOPASSWD:ALL' > /etc/sudoers.d/ubuntu
#[ -d /home/ubuntu/.ssh ] ||
# install --owner=ubuntu --mode=700 --directory /home/ubuntu/.ssh
#[ -f /home/ubuntu/.ssh/authorized_keys ] ||
# install --owner=ubuntu --mode=600 /root/.ssh/authorized_keys --target-directory /home/ubuntu/.ssh"
#fi
# Special case for oracle since their iptables blocks everything but SSH
pssh "
if [ -f /etc/iptables/rules.v4 ]; then
sudo sed -i 's/-A INPUT -j REJECT --reject-with icmp-host-prohibited//' /etc/iptables/rules.v4
sudo netfilter-persistent flush
sudo netfilter-persistent start
fi"
# oracle-cloud-agent upgrades packages in the background.
# This breaks our deployment scripts, because when we invoke apt-get, it complains
# that the lock already exists (symptom: random "Exited with error code 100").
# Workaround: if we detect oracle-cloud-agent, remove it.
# But this agent seems to also take care of installing/upgrading
# the unified-monitoring-agent package, so when we stop the snap,
# it can leave dpkg in a broken state. We "fix" it with the 2nd command.
pssh "
if [ -d /snap/oracle-cloud-agent ]; then
sudo snap remove oracle-cloud-agent
sudo dpkg --remove --force-remove-reinstreq unified-monitoring-agent
fi"
# Copy settings and install Python YAML parser
pssh -I tee /tmp/settings.yaml <tags/$TAG/settings.yaml
pssh "
sudo apt-get update &&
sudo apt-get install -y python-yaml"
# If there is no "python" binary, symlink to python3
pssh "
if ! which python; then
sudo ln -s $(which python3) /usr/local/bin/python
fi"
# Copy clusterize.py to the remote machines, and execute it, feeding it the list of IP addresses
pssh -I tee /tmp/clusterize.py <lib/clusterize.py
pssh --timeout 900 --send-input "python /tmp/clusterize.py >>/tmp/pp.out 2>>/tmp/pp.err" <tags/$TAG/ips.txt
# On the first node, create and deploy TLS certs using Docker Machine
# (Currently disabled.)
true || pssh "
if i_am_first_node; then
grep '[0-9]\$' /etc/hosts |
xargs -n2 sudo -H -u $USER_LOGIN \
docker-machine create -d generic --generic-ssh-user $USER_LOGIN --generic-ip-address
fi"
echo $CLUSTERPREFIX\$NODEINDEX | sudo tee /etc/hostname
sudo hostname $CLUSTERPREFIX\$NODEINDEX
N=1
while read ip; do
grep -w \$ip /etc/hosts || echo \$ip $CLUSTERPREFIX\$N | sudo tee -a /etc/hosts
N=\$((\$N+1))
done < /tmp/cluster
"
echo cluster_ok > tags/$TAG/status
}
@@ -261,7 +377,7 @@ _cmd_docker() {
# This will install the latest Docker.
sudo apt-get -qy install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository 'deb https://download.docker.com/linux/ubuntu bionic stable'
sudo add-apt-repository 'deb https://download.docker.com/linux/ubuntu jammy stable'
sudo apt-get -q update
sudo apt-get -qy install docker-ce
@@ -305,10 +421,23 @@ _cmd_kubebins() {
TAG=$1
need_tag
if [ "$KUBEVERSION" = "" ]; then
KUBEVERSION="$(curl -fsSL https://cdn.dl.k8s.io/release/stable.txt | sed s/^v//)"
fi
##VERSION##
ETCD_VERSION=v3.4.13
K8SBIN_VERSION=v1.19.11 # Can't go to 1.20 because it requires a serviceaccount signing key.
CNI_VERSION=v0.8.7
case "$KUBEVERSION" in
1.19.*)
ETCD_VERSION=v3.4.13
CNI_VERSION=v0.8.7
;;
*)
ETCD_VERSION=v3.5.10
CNI_VERSION=v1.3.0
;;
esac
K8SBIN_VERSION="v$KUBEVERSION"
ARCH=${ARCHITECTURE-amd64}
pssh --timeout 300 "
set -e
@@ -332,29 +461,41 @@ _cmd_kubebins() {
"
}
_cmd kube "Setup kubernetes clusters with kubeadm (must be run AFTER deploy)"
_cmd_kube() {
_cmd kubepkgs "Install Kubernetes packages (kubectl, kubeadm, kubelet)"
_cmd_kubepkgs() {
TAG=$1
need_tag
need_login_password
# Optional version, e.g. 1.13.5
SETTINGS=tags/$TAG/settings.yaml
KUBEVERSION=$(awk '/^kubernetes_version:/ {print $2}' $SETTINGS)
if [ "$KUBEVERSION" ]; then
pssh "
sudo tee /etc/apt/preferences.d/kubernetes <<EOF
# Prior to September 2023, there was a single Kubernetes package repo that
# contained packages for all versions, so we could just add that repo
# and install whatever was the latest version available there.
# Things have changed (versions after September 2023, e.g. 1.28.3 are
# not in the old repo) and now there is a different repo for each
# minor version, so we need to figure out what minor version we are
# installing to add the corresponding repo.
if [ "$KUBEVERSION" = "" ]; then
KUBEVERSION="$(curl -fsSL https://cdn.dl.k8s.io/release/stable.txt | sed s/^v//)"
fi
KUBEREPOVERSION="$(echo $KUBEVERSION | cut -d. -f1-2)"
# Since the new repo doesn't have older versions, add a safety check here.
MINORVERSION="$(echo $KUBEVERSION | cut -d. -f2)"
if [ "$MINORVERSION" -lt 24 ]; then
die "Cannot install kubepkgs for versions before 1.24."
fi
pssh "
sudo tee /etc/apt/preferences.d/kubernetes <<EOF
Package: kubectl kubeadm kubelet
Pin: version $KUBEVERSION*
Pin: version $KUBEVERSION-*
Pin-Priority: 1000
EOF"
fi
# Install packages
pssh --timeout 200 "
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg |
sudo apt-key add - &&
echo deb http://apt.kubernetes.io/ kubernetes-xenial main |
curl -fsSL https://pkgs.k8s.io/core:/stable:/v$KUBEREPOVERSION/deb/Release.key |
gpg --dearmor | sudo tee /etc/apt/keyrings/kubernetes-apt-keyring.gpg &&
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v$KUBEREPOVERSION/deb/ /' |
sudo tee /etc/apt/sources.list.d/kubernetes.list"
pssh --timeout 200 "
sudo apt-get update -q &&
@@ -364,18 +505,25 @@ EOF"
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl &&
echo 'alias k=kubectl' | sudo tee /etc/bash_completion.d/k &&
echo 'complete -F __start_kubectl k' | sudo tee -a /etc/bash_completion.d/k"
}
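The per-minor-version repo selection in `_cmd_kubepkgs` above boils down to a couple of string operations on the version number; here is the same derivation in isolation (the version value is just an example):

```shell
# Reproduces the KUBEREPOVERSION / MINORVERSION derivation used by kubepkgs.
KUBEVERSION=1.28.3                                      # example value
KUBEREPOVERSION="$(echo $KUBEVERSION | cut -d. -f1-2)"  # major.minor -> 1.28
MINORVERSION="$(echo $KUBEVERSION | cut -d. -f2)"       # minor only  -> 28
echo "https://pkgs.k8s.io/core:/stable:/v$KUBEREPOVERSION/deb/"
# prints https://pkgs.k8s.io/core:/stable:/v1.28/deb/
```

The safety check on `MINORVERSION` then rejects anything below 1.24, since the pkgs.k8s.io repos don't carry older releases.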
# Disable swap
# (note that this won't survive across node reboots!)
if [ "$INFRACLASS" = "linode" ]; then
pssh "
sudo swapoff -a"
_cmd kubeadm "Setup kubernetes clusters with kubeadm"
_cmd_kubeadm() {
TAG=$1
need_tag
if [ "$KUBEVERSION" ]; then
CLUSTER_CONFIGURATION_KUBERNETESVERSION='kubernetesVersion: "v'$KUBEVERSION'"'
IGNORE_SYSTEMVERIFICATION="- SystemVerification"
IGNORE_SWAP="- Swap"
fi
# Re-enable CRI interface in containerd
pssh "
echo '# Use default parameters for containerd.' | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd"
# Install a valid configuration for containerd
# (first, the CRI interface needs to be re-enabled;
# also, the correct systemd cgroup driver must be selected,
# otherwise containerd just restarts containers for no good reason)
pssh -I "sudo tee /etc/containerd/config.toml" < lib/containerd-config.toml
pssh "sudo systemctl restart containerd"
# Initialize kube control plane
pssh --timeout 200 "
@@ -383,39 +531,38 @@ EOF"
kubeadm token generate > /tmp/token &&
cat >/tmp/kubeadm-config.yaml <<EOF
kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- token: \$(cat /tmp/token)
nodeRegistration:
# Comment out the next line to switch back to Docker.
criSocket: /run/containerd/containerd.sock
ignorePreflightErrors:
- NumCPU
$IGNORE_SYSTEMVERIFICATION
$IGNORE_SWAP
---
kind: JoinConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
apiVersion: kubeadm.k8s.io/v1beta3
discovery:
bootstrapToken:
apiServerEndpoint: \$(cat /etc/name_of_first_node):6443
token: \$(cat /tmp/token)
unsafeSkipCAVerification: true
nodeRegistration:
# Comment out the next line to switch back to Docker.
criSocket: /run/containerd/containerd.sock
ignorePreflightErrors:
- NumCPU
$IGNORE_SYSTEMVERIFICATION
$IGNORE_SWAP
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
# The following line is necessary when using Docker.
# It doesn't seem necessary when using containerd.
#cgroupDriver: cgroupfs
failSwapOn: false
---
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta2
apiVersion: kubeadm.k8s.io/v1beta3
apiServer:
certSANs:
- \$(cat /tmp/ipv4)
$CLUSTER_CONFIGURATION_KUBERNETESVERSION
EOF
sudo kubeadm init --config=/tmp/kubeadm-config.yaml
fi"
@@ -433,11 +580,17 @@ EOF
# Install weave as the pod network
pssh "
if i_am_first_node; then
#kubever=\$(kubectl version | base64 | tr -d '\n') &&
#kubectl apply -f https://cloud.weave.works/k8s/net?k8s-version=\$kubever
kubectl apply -f https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s-1.11.yaml
fi"
# FIXME this is a gross hack to add the deployment key to our SSH agent,
# so that it can be used to bounce from host to host (which is necessary
# in the next deployment step). In the long run, we probably want to
# generate these keys locally and push them to the machines instead
# (once we move everything to Terraform).
if [ -f "tags/$TAG/id_rsa" ]; then
ssh-add tags/$TAG/id_rsa
fi
# Join the other nodes to the cluster
pssh --timeout 200 "
if ! i_am_first_node && [ ! -f /etc/kubernetes/kubelet.conf ]; then
@@ -445,6 +598,9 @@ EOF
ssh $SSHOPTS \$FIRSTNODE cat /tmp/kubeadm-config.yaml > /tmp/kubeadm-config.yaml &&
sudo kubeadm join --config /tmp/kubeadm-config.yaml
fi"
if [ -f "tags/$TAG/id_rsa" ]; then
ssh-add -d tags/$TAG/id_rsa
fi
# Install metrics server
pssh "
@@ -460,7 +616,6 @@ _cmd kubetools "Install a bunch of CLI tools for Kubernetes"
_cmd_kubetools() {
TAG=$1
need_tag
need_login_password
ARCH=${ARCHITECTURE-amd64}
@@ -655,6 +810,25 @@ EOF
sudo tar -zxvf- -C /usr/local/bin kubeseal
kubeseal --version
fi"
##VERSION## https://github.com/vmware-tanzu/velero/releases
VELERO_VERSION=1.11.0
pssh "
if [ ! -x /usr/local/bin/velero ]; then
curl -fsSL https://github.com/vmware-tanzu/velero/releases/download/v$VELERO_VERSION/velero-v$VELERO_VERSION-linux-$ARCH.tar.gz |
sudo tar --strip-components=1 --wildcards -zx -C /usr/local/bin '*/velero'
velero completion bash | sudo tee /etc/bash_completion.d/velero
velero version --client-only
fi"
##VERSION## https://github.com/doitintl/kube-no-trouble/releases
KUBENT_VERSION=0.7.0
pssh "
if [ ! -x /usr/local/bin/kubent ]; then
curl -fsSL https://github.com/doitintl/kube-no-trouble/releases/download/${KUBENT_VERSION}/kubent-${KUBENT_VERSION}-linux-$ARCH.tar.gz |
sudo tar -zxvf- -C /usr/local/bin kubent
kubent --version
fi"
}
_cmd kubereset "Wipe out Kubernetes configuration on all nodes"
@@ -688,8 +862,6 @@ _cmd_ips() {
TAG=$1
need_tag $TAG
SETTINGS=tags/$TAG/settings.yaml
CLUSTERSIZE=$(awk '/^clustersize:/ {print $2}' $SETTINGS)
while true; do
for I in $(seq $CLUSTERSIZE); do
read ip || return 0
@@ -699,22 +871,9 @@ _cmd_ips() {
done < tags/$TAG/ips.txt
}
_cmd inventory "List all VMs on a given infrastructure (or all infras if no arg given)"
_cmd inventory "List all VMs on a given provider (or across all providers if no arg given)"
_cmd_inventory() {
case "$1" in
"")
for INFRA in infra/*; do
$0 inventory $INFRA
done
;;
*/example.*)
;;
*)
need_infra $1
sep "Listing instances for $1"
infra_list
;;
esac
FIXME
}
_cmd maketag "Generate a quasi-unique tag for a group of instances"
@@ -759,18 +918,92 @@ _cmd_ping() {
fping < tags/$TAG/ips.txt
}
_cmd stage2 "Finalize the setup of managed Kubernetes clusters"
_cmd_stage2() {
TAG=$1
need_tag
cd tags/$TAG/stage2
terraform init -upgrade
terraform apply -auto-approve
}
_cmd standardize "Deal with non-standard Ubuntu cloud images"
_cmd_standardize() {
TAG=$1
need_tag
# Try to log in as root.
# If successful, make sure that we have:
# - sudo
# - ubuntu user
# Note that on Scaleway, the keys of the root account get copied
# a little while after boot; so the first time we run "standardize"
# we might end up copying an incomplete authorized_keys file.
# That's why we copy it unconditionally here, rather than checking
# for existence and skipping if it already exists.
pssh -l root -t 5 true 2>&1 >/dev/null && {
pssh -l root "
grep DEBIAN_FRONTEND /etc/environment || echo DEBIAN_FRONTEND=noninteractive >> /etc/environment
#grep cloud-init /etc/sudoers && rm /etc/sudoers
apt-get update && apt-get install sudo -y
getent passwd ubuntu || {
useradd ubuntu -m -s /bin/bash
echo 'ubuntu ALL=(ALL:ALL) NOPASSWD:ALL' > /etc/sudoers.d/ubuntu
}
install --owner=ubuntu --mode=700 --directory /home/ubuntu/.ssh
install --owner=ubuntu --mode=600 /root/.ssh/authorized_keys --target-directory /home/ubuntu/.ssh
"
}
# Now make sure that we have an ubuntu user
pssh true
# Disable unattended upgrades so that they don't interfere with the subsequent steps
pssh sudo rm -f /etc/apt/apt.conf.d/50unattended-upgrades
# Digital Ocean's cloud init disables password authentication; re-enable it.
pssh "
if [ -f /etc/ssh/sshd_config.d/50-cloud-init.conf ]; then
sudo rm /etc/ssh/sshd_config.d/50-cloud-init.conf
sudo systemctl restart ssh.service
fi"
# Special case for oracle since their iptables blocks everything but SSH
pssh "
if [ -f /etc/iptables/rules.v4 ]; then
sudo sed -i 's/-A INPUT -j REJECT --reject-with icmp-host-prohibited//' /etc/iptables/rules.v4
sudo netfilter-persistent flush
sudo netfilter-persistent start
fi"
# oracle-cloud-agent upgrades packages in the background.
# This breaks our deployment scripts, because when we invoke apt-get, it complains
# that the lock already exists (symptom: random "Exited with error code 100").
# Workaround: if we detect oracle-cloud-agent, remove it.
# But this agent seems to also take care of installing/upgrading
# the unified-monitoring-agent package, so when we stop the snap,
# it can leave dpkg in a broken state. We "fix" it with the 2nd command.
pssh "
if [ -d /snap/oracle-cloud-agent ]; then
sudo snap remove oracle-cloud-agent
sudo dpkg --remove --force-remove-reinstreq unified-monitoring-agent
fi"
}
_cmd tailhist "Install history viewer on port 1088"
_cmd_tailhist () {
TAG=$1
need_tag
need_login_password
ARCH=${ARCHITECTURE-amd64}
[ "$ARCH" = "aarch64" ] && ARCH=arm64
# We use "wget -c" here in case the download was aborted
# halfway through and we're actually trying to download it again.
pssh "
set -e
wget https://github.com/joewalnes/websocketd/releases/download/v0.3.0/websocketd-0.3.0-linux_$ARCH.zip
wget -c https://github.com/joewalnes/websocketd/releases/download/v0.3.0/websocketd-0.3.0-linux_$ARCH.zip
unzip websocketd-0.3.0-linux_$ARCH.zip websocketd
sudo mv websocketd /usr/local/bin/websocketd
sudo mkdir -p /tmp/tailhist
@@ -804,25 +1037,9 @@ _cmd_tools() {
sudo apt-get -qy install apache2-utils emacs-nox git httping htop jid joe jq mosh python-setuptools tree unzip
# This is for VMs with broken PRNG (symptom: running docker-compose randomly hangs)
sudo apt-get -qy install haveged
# I don't remember why we need to remove this
sudo apt-get remove -y --purge dnsmasq-base
"
}
_cmd opensg "Open the default security group to ALL ingress traffic"
_cmd_opensg() {
need_infra $1
infra_opensg
}
_cmd disableaddrchecks "Disable source/destination IP address checks"
_cmd_disableaddrchecks() {
TAG=$1
need_tag
infra_disableaddrchecks
}
_cmd pssh "Run an arbitrary command on all nodes"
_cmd_pssh() {
TAG=$1
@@ -864,122 +1081,21 @@ fi
"
}
_cmd quotas "Check our infrastructure quotas (max instances)"
_cmd_quotas() {
need_infra $1
infra_quotas
}
_cmd ssh "Open an SSH session to the first node of a tag"
_cmd_ssh() {
TAG=$1
need_tag
need_login_password
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
}
_cmd start "Start a group of VMs"
_cmd_start() {
while [ ! -z "$*" ]; do
case "$1" in
--infra) INFRA=$2; shift 2;;
--settings) SETTINGS=$2; shift 2;;
--count) die "Flag --count is deprecated; please use --students instead." ;;
--tag) TAG=$2; shift 2;;
--students) STUDENTS=$2; shift 2;;
*) die "Unrecognized parameter: $1."
esac
done
if [ -z "$INFRA" ]; then
die "Please add --infra flag to specify which infrastructure file to use."
fi
if [ -z "$SETTINGS" ]; then
die "Please add --settings flag to specify which settings file to use."
fi
if [ -z "$COUNT" ]; then
CLUSTERSIZE=$(awk '/^clustersize:/ {print $2}' $SETTINGS)
if [ -z "$STUDENTS" ]; then
warning "Neither --count nor --students was specified."
warning "According to the settings file, the cluster size is $CLUSTERSIZE."
warning "Deploying one cluster of $CLUSTERSIZE nodes."
STUDENTS=1
fi
COUNT=$(($STUDENTS*$CLUSTERSIZE))
fi
# Check that the specified settings and infrastructure are valid.
need_settings $SETTINGS
need_infra $INFRA
if [ -z "$TAG" ]; then
TAG=$(_cmd_maketag)
fi
mkdir -p tags/$TAG
ln -s ../../$INFRA tags/$TAG/infra.sh
ln -s ../../$SETTINGS tags/$TAG/settings.yaml
echo creating > tags/$TAG/status
infra_start $COUNT
sep
info "Successfully created $COUNT instances with tag $TAG"
echo create_ok > tags/$TAG/status
# If the settings.yaml file has a "steps" field,
# automatically execute all the actions listed in that field.
# If an action fails, retry it up to 10 times.
python -c 'if True: # hack to deal with indentation
import sys, yaml
settings = yaml.safe_load(sys.stdin)
print ("\n".join(settings.get("steps", [])))
' < tags/$TAG/settings.yaml \
| while read step; do
if [ -z "$step" ]; then
break
fi
sep "$TAG -> $step"
TRY=1
MAXTRY=10
while ! $0 $step $TAG ; do
TRY=$(($TRY+1))
if [ $TRY -gt $MAXTRY ]; then
error "This step ($step) failed after $MAXTRY attempts."
info "You can troubleshoot the situation manually, or terminate these instances with:"
info "$0 stop $TAG"
die "Giving up."
else
sep
info "Step '$step' failed for '$TAG'. Let's wait 10 seconds and try again."
info "(Attempt $TRY out of $MAXTRY.)"
sleep 10
fi
done
done
sep
info "Deployment successful."
info "To log into the first machine of that batch, you can run:"
info "$0 ssh $TAG"
info "To terminate these instances, you can run:"
info "$0 stop $TAG"
}
_cmd stop "Stop (terminate, shutdown, kill, remove, destroy...) instances"
_cmd_stop() {
TAG=$1
need_tag
infra_stop
echo stopped > tags/$TAG/status
}
_cmd tags "List groups of VMs known locally"
_cmd_tags() {
(
cd tags
echo "[#] [Status] [Tag] [Infra]" \
| awk '{ printf "%-7s %-12s %-25s %-25s\n", $1, $2, $3, $4}'
echo "[#] [Status] [Tag] [Mode] [Provider]"
for tag in *; do
if [ -f $tag/ips.txt ]; then
count="$(wc -l < $tag/ips.txt)"
@@ -991,15 +1107,19 @@ _cmd_tags() {
else
status="?"
fi
if [ -f $tag/infra.sh ]; then
infra="$(basename $(readlink $tag/infra.sh))"
if [ -f $tag/mode ]; then
mode="$(cat $tag/mode)"
else
infra="?"
mode="?"
fi
echo "$count $status $tag $infra" \
| awk '{ printf "%-7s %-12s %-25s %-25s\n", $1, $2, $3, $4}'
if [ -f $tag/provider ]; then
provider="$(cat $tag/provider)"
else
provider="?"
fi
echo "$count $status $tag $mode $provider"
done
)
) | column -t
}
_cmd test "Run tests (pre-flight checks) on a group of VMs"
@@ -1054,21 +1174,28 @@ _cmd_passwords() {
$0 ips "$TAG" | paste "$PASSWORDS_FILE" - | while read password nodes; do
info "Setting password for $nodes..."
for node in $nodes; do
echo docker:$password | ssh $SSHOPTS ubuntu@$node sudo chpasswd
echo $USER_LOGIN:$password | ssh $SSHOPTS -i tags/$TAG/id_rsa ubuntu@$node sudo chpasswd
done
done
info "Done."
}
_cmd wait "Wait until VMs are ready (reachable and cloud init is done)"
_cmd wait "Wait until VMs are ready (reachable, cloud init is done, ubuntu user is up)"
_cmd_wait() {
TAG=$1
need_tag
# Wait until all hosts are reachable.
info "Trying to reach $TAG instances..."
while ! pssh -t 5 true 2>&1 >/dev/null; do
>/dev/stderr echo -n "."
while >/dev/stderr echo -n "."; do
pssh -t 5 true 2>&1 >/dev/null && {
SSH_USER=ubuntu
break
}
pssh -l root -t 5 true 2>&1 >/dev/null && {
SSH_USER=root
break
}
sleep 2
done
>/dev/stderr echo ""
@@ -1076,11 +1203,9 @@ _cmd_wait() {
# If this VM image is using cloud-init,
# wait for cloud-init to be done
info "Waiting for cloud-init to be done on $TAG instances..."
pssh "
pssh -l $SSH_USER "
if [ -d /var/lib/cloud ]; then
while [ ! -f /var/lib/cloud/instance/boot-finished ]; do
sleep 1
done
cloud-init status --wait
fi"
}
@@ -1106,7 +1231,6 @@ _cmd_webssh() {
need_tag
pssh "
sudo apt-get update &&
sudo apt-get install python-tornado python-paramiko -y ||
sudo apt-get install python3-tornado python3-paramiko -y"
pssh "
cd /opt


@@ -0,0 +1,7 @@
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true


@@ -16,18 +16,18 @@ pssh() {
}
echo "[parallel-ssh] $@"
export PSSH=$(which pssh || which parallel-ssh)
case "$INFRACLASS" in
hetzner) LOGIN=root ;;
linode) LOGIN=root ;;
*) LOGIN=ubuntu ;;
esac
# There are some routers that really struggle with the number of TCP
# connections that we open when deploying large fleets of clusters.
# We're adding a 1-second delay here, but it can be cranked up if
# necessary - or down to zero, too.
sleep ${PSSH_DELAY_PRE-1}
$PSSH -h $HOSTFILE -l $LOGIN \
--par 100 \
$(which pssh || which parallel-ssh) -h $HOSTFILE -l ubuntu \
--par ${PSSH_PARALLEL_CONNECTIONS-100} \
--timeout 300 \
-O LogLevel=ERROR \
-O IdentityFile=tags/$TAG/id_rsa \
-O UserKnownHostsFile=/dev/null \
-O StrictHostKeyChecking=no \
-O ForwardAgent=yes \
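The `${PSSH_DELAY_PRE-1}` and `${PSSH_PARALLEL_CONNECTIONS-100}` expansions above use POSIX default-value substitution; a quick standalone illustration of the idiom:

```shell
# ${VAR-default} expands to "default" only when VAR is unset;
# a variable that is set (even to 0 or the empty string) is used as-is.
unset DELAY
echo "${DELAY-1}"      # unset: falls back to 1
DELAY=0
echo "${DELAY-1}"      # set to 0: the 0 wins (delay effectively disabled)
DELAY=
echo "${DELAY-1}"      # set but empty: expands to the empty string, not 1
```

Note this is `${VAR-default}`, not `${VAR:-default}`: the colon form would also substitute the default when the variable is set but empty.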


@@ -0,0 +1,21 @@
CLUSTERSIZE=3
CLUSTERPREFIX=kubenet
CLUSTERNUMBER=100
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubebins
kubetools
ips
"


@@ -0,0 +1,21 @@
CLUSTERSIZE=3
CLUSTERPREFIX=kuberouter
CLUSTERNUMBER=200
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubebins
kubetools
ips
"


@@ -0,0 +1,26 @@
CLUSTERSIZE=1
CLUSTERPREFIX=monokube
# We're sticking to this old version in the first DMUC lab,
# because it still works with Docker, and doesn't
# require a ServiceAccount signing key.
KUBEVERSION=1.19.11
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
disabledocker
createuser
webssh
tailhist
kubebins
kubetools
ips
"


@@ -0,0 +1,25 @@
CLUSTERSIZE=3
CLUSTERPREFIX=oldversion
USER_LOGIN=k8s
USER_PASSWORD=training
# For a list of old versions, check:
# https://kubernetes.io/releases/patch-releases/#non-active-branch-history
KUBEVERSION=1.24.14
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubepkgs
kubeadm
kubetools
kubetest
"


@@ -0,0 +1,20 @@
CLUSTERSIZE=3
CLUSTERPREFIX=polykube
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
kubepkgs
kubebins
createuser
webssh
tailhist
kubetools
ips
"


@@ -0,0 +1,21 @@
CLUSTERSIZE=3
CLUSTERPREFIX=test
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubepkgs
kubeadm
kubetools
kubetest
"


@@ -0,0 +1,19 @@
CLUSTERSIZE=1
CLUSTERPREFIX=moby
USER_LOGIN=docker
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
cards
ips
"


@@ -0,0 +1,21 @@
CLUSTERSIZE=4
CLUSTERPREFIX=node
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubepkgs
kubeadm
kubetools
kubetest
"


@@ -0,0 +1,22 @@
CLUSTERSIZE=10
export TF_VAR_node_size=GP1.M
CLUSTERPREFIX=node
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubepkgs
kubeadm
kubetools
kubetest
"


@@ -0,0 +1,6 @@
CLUSTERSIZE=2
USER_LOGIN=k8s
USER_PASSWORD=
STEPS="stage2"


@@ -0,0 +1,19 @@
#export TF_VAR_node_size=GP2.4
#export TF_VAR_node_size=g6-standard-6
CLUSTERSIZE=1
CLUSTERPREFIX=CHANGEME
USER_LOGIN=portal
USER_PASSWORD=CHANGEME
STEPS="
wait
standardize
clusterize
tools
docker
createuser
ips
"


@@ -0,0 +1,40 @@
#!/bin/sh
set -e
PREFIX=$(date +%Y-%m-%d-%H-%M)
PROVIDER=openstack/enix # aws also works
STUDENTS=2
#export TF_VAR_location=eu-north-1
export TF_VAR_node_size=S
SETTINGS=admin-monokube
TAG=$PREFIX-$SETTINGS
./labctl create \
--tag $TAG \
--provider $PROVIDER \
--settings settings/$SETTINGS.env \
--students $STUDENTS
SETTINGS=admin-polykube
TAG=$PREFIX-$SETTINGS
./labctl create \
--tag $TAG \
--provider $PROVIDER \
--settings settings/$SETTINGS.env \
--students $STUDENTS
SETTINGS=admin-oldversion
TAG=$PREFIX-$SETTINGS
./labctl create \
--tag $TAG \
--provider $PROVIDER \
--settings settings/$SETTINGS.env \
--students $STUDENTS
SETTINGS=admin-test
TAG=$PREFIX-$SETTINGS
./labctl create \
--tag $TAG \
--provider $PROVIDER \
--settings settings/$SETTINGS.env \
--students $STUDENTS

prepare-labs/tags Symbolic link

@@ -0,0 +1 @@
terraform/tags




@@ -0,0 +1,4 @@
#!/bin/sh
az account list-locations -o table \
--query "sort_by([?metadata.regionType == 'Physical'], &regionalDisplayName)[]
.{ displayName: displayName, regionalDisplayName: regionalDisplayName }"


@@ -0,0 +1,2 @@
#!/bin/sh
civo region ls


@@ -0,0 +1,2 @@
#!/bin/sh
exo zone


@@ -8,8 +8,10 @@ resource "random_string" "_" {
resource "time_static" "_" {}
locals {
timestamp = formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339)
tag = random_string._.result
min_nodes_per_pool = var.nodes_per_cluster
max_nodes_per_pool = var.nodes_per_cluster * 2
timestamp = formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339)
tag = random_string._.result
# Common tags to be assigned to all resources
common_tags = [
"created-by-terraform",


@@ -1,10 +1,9 @@
module "clusters" {
source = "./modules/PROVIDER"
source = "./one-kubernetes-module"
for_each = local.clusters
cluster_name = each.value.cluster_name
min_nodes_per_pool = var.min_nodes_per_pool
max_nodes_per_pool = var.max_nodes_per_pool
enable_arm_pool = var.enable_arm_pool
min_nodes_per_pool = local.min_nodes_per_pool
max_nodes_per_pool = local.max_nodes_per_pool
node_size = var.node_size
common_tags = local.common_tags
location = each.value.location
@@ -63,7 +62,7 @@ resource "null_resource" "wait_for_nodes" {
}
command = <<-EOT
while sleep 1; do
kubectl get nodes --watch | grep --silent --line-buffered . &&
kubectl get nodes -o name | grep --silent . &&
kubectl wait node --for=condition=Ready --all --timeout=10m &&
break
done


@@ -0,0 +1 @@
one-kubernetes-config/config.tf


@@ -0,0 +1,3 @@
This directory should contain a config.tf file, even if it's empty.
(Because if the file doesn't exist, then the Terraform configuration
in the parent directory will fail.)


@@ -0,0 +1,8 @@
This directory should contain a copy of one of the "one-kubernetes" modules.
For instance, when located in this directory, you can do:
cp ../../one-kubernetes/linode/* .
Then, move the config.tf file to ../one-kubernetes-config:
mv config.tf ../one-kubernetes-config


@@ -0,0 +1 @@
one-kubernetes-module/provider.tf


@@ -0,0 +1,3 @@
terraform {
required_version = ">= 1.4"
}


@@ -90,7 +90,6 @@ resource "kubernetes_service" "shpod_${index}" {
name = "ssh"
port = 22
target_port = 22
node_port = 32222
}
type = "NodePort"
}
@@ -222,7 +221,10 @@ output "ip_addresses_of_nodes" {
value = join("\n", [
%{ for index, cluster in clusters ~}
join("\t", concat(
[ random_string.shpod_${index}.result, "ssh -l k8s -p 32222" ],
[
random_string.shpod_${index}.result,
"ssh -l k8s -p $${kubernetes_service.shpod_${index}.spec[0].port[0].node_port}"
],
split(" ", file("./externalips.${index}"))
)),
%{ endfor ~}


@@ -0,0 +1,28 @@
variable "tag" {
type = string
}
variable "how_many_clusters" {
type = number
default = 2
}
variable "nodes_per_cluster" {
type = number
default = 2
}
variable "node_size" {
type = string
default = "M"
}
variable "location" {
type = string
default = null
}
# TODO: perhaps handle space-separated lists in addition to newline-separated?
locals {
locations = var.location == null ? [null] : split("\n", var.location)
}
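One way to address that TODO (a sketch, not tested against the rest of this configuration): normalize newlines to spaces before splitting, and drop the empty entries that stray whitespace would produce:

```hcl
# Hypothetical variant accepting newline- or space-separated locations.
# compact() removes the empty strings left by consecutive separators.
locals {
  locations = var.location == null ? [null] : compact(split(" ", replace(var.location, "\n", " ")))
}
```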


@@ -0,0 +1 @@
../common.tf


@@ -0,0 +1 @@
../../providers/aws/config.tf


@@ -0,0 +1,87 @@
# Taken from:
# https://github.com/hashicorp/learn-terraform-provision-eks-cluster/blob/main/main.tf
data "aws_availability_zones" "available" {}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.19.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "19.5.1"
cluster_name = var.cluster_name
cluster_version = "1.24"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_endpoint_public_access = true
eks_managed_node_group_defaults = {
ami_type = "AL2_x86_64"
}
eks_managed_node_groups = {
one = {
name = "node-group-one"
instance_types = [local.node_size]
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
}
}
}
# https://aws.amazon.com/blogs/containers/amazon-ebs-csi-driver-is-now-generally-available-in-amazon-eks-add-ons/
data "aws_iam_policy" "ebs_csi_policy" {
arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
}
module "irsa-ebs-csi" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "4.7.0"
create_role = true
role_name = "AmazonEKSTFEBSCSIRole-${module.eks.cluster_name}"
provider_url = module.eks.oidc_provider
role_policy_arns = [data.aws_iam_policy.ebs_csi_policy.arn]
oidc_fully_qualified_subjects = ["system:serviceaccount:kube-system:ebs-csi-controller-sa"]
}
resource "aws_eks_addon" "ebs-csi" {
cluster_name = module.eks.cluster_name
addon_name = "aws-ebs-csi-driver"
addon_version = "v1.5.2-eksbuild.1"
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
tags = {
"eks_addon" = "ebs-csi"
"terraform" = "true"
}
}


@@ -0,0 +1,44 @@
output "cluster_id" {
value = module.eks.cluster_arn
}
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
sensitive = true
value = yamlencode({
apiVersion = "v1"
kind = "Config"
clusters = [{
name = var.cluster_name
cluster = {
certificate-authority-data = module.eks.cluster_certificate_authority_data
server = module.eks.cluster_endpoint
}
}]
contexts = [{
name = var.cluster_name
context = {
cluster = var.cluster_name
user = var.cluster_name
}
}]
users = [{
name = var.cluster_name
user = {
exec = {
apiVersion = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", var.cluster_name]
}
}
}]
current-context = var.cluster_name
})
}
data "aws_eks_cluster_auth" "_" {
name = module.eks.cluster_name
}


@@ -0,0 +1,8 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.47.0"
}
}
}


@@ -0,0 +1 @@
../../providers/aws/variables.tf


@@ -0,0 +1 @@
../common.tf


@@ -0,0 +1 @@
../../providers/azure/config.tf


@@ -0,0 +1,22 @@
resource "azurerm_resource_group" "_" {
name = var.cluster_name
location = var.location
}
resource "azurerm_kubernetes_cluster" "_" {
name = var.cluster_name
location = var.location
dns_prefix = var.cluster_name
identity {
type = "SystemAssigned"
}
resource_group_name = azurerm_resource_group._.name
default_node_pool {
name = "x86"
node_count = var.min_nodes_per_pool
min_count = var.min_nodes_per_pool
max_count = var.max_nodes_per_pool
vm_size = local.node_size
enable_auto_scaling = true
}
}


@@ -0,0 +1,12 @@
output "cluster_id" {
value = azurerm_kubernetes_cluster._.id
}
output "has_metrics_server" {
value = true
}
output "kubeconfig" {
value = azurerm_kubernetes_cluster._.kube_config_raw
sensitive = true
}


@@ -0,0 +1,7 @@
terraform {
required_providers {
azurerm = {
source = "hashicorp/azurerm"
}
}
}


@@ -0,0 +1 @@
../../providers/azure/variables.tf


@@ -0,0 +1 @@
../common.tf


@@ -0,0 +1 @@
../../providers/civo/config.tf


@@ -0,0 +1,17 @@
# As of March 2023, the default type ("k3s") only supports up
# to Kubernetes 1.23, which belongs to a museum.
# So let's use Talos, which supports up to 1.25.
resource "civo_kubernetes_cluster" "_" {
name = var.cluster_name
firewall_id = civo_firewall._.id
cluster_type = "talos"
pools {
size = local.node_size
node_count = var.min_nodes_per_pool
}
}
resource "civo_firewall" "_" {
name = var.cluster_name
}


@@ -0,0 +1,12 @@
output "cluster_id" {
value = civo_kubernetes_cluster._.id
}
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
value = civo_kubernetes_cluster._.kubeconfig
sensitive = true
}


@@ -0,0 +1,7 @@
terraform {
required_providers {
civo = {
source = "civo/civo"
}
}
}


@@ -0,0 +1 @@
../../providers/civo/variables.tf


@@ -0,0 +1,28 @@
variable "cluster_name" {
type = string
default = "deployed-with-terraform"
}
variable "common_tags" {
type = list(string)
default = []
}
variable "node_size" {
type = string
default = "M"
}
variable "min_nodes_per_pool" {
type = number
default = 2
}
variable "max_nodes_per_pool" {
type = number
default = 4
}
locals {
node_size = lookup(var.node_sizes, var.node_size, var.node_size)
}
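The `lookup()` above expects each provider to declare a `node_sizes` map (not shown in this hunk); passing `var.node_size` itself as the fallback means an unknown key falls through unchanged, so users can give a raw instance type instead of a T-shirt size. A hypothetical Linode-style mapping:

```hcl
# Hypothetical per-provider mapping from T-shirt sizes to instance types.
# lookup(var.node_sizes, "M", "M")                         → "g6-standard-4"
# lookup(var.node_sizes, "g6-dedicated-8", "g6-dedicated-8") → "g6-dedicated-8"
variable "node_sizes" {
  type = map(string)
  default = {
    S = "g6-standard-2"
    M = "g6-standard-4"
    L = "g6-standard-6"
  }
}
```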


@@ -0,0 +1 @@
../common.tf


@@ -0,0 +1 @@
../../providers/digitalocean/config.tf


@@ -3,15 +3,18 @@ resource "digitalocean_kubernetes_cluster" "_" {
tags = var.common_tags
# Region is mandatory, so let's provide a default value.
region = var.location != null ? var.location : "nyc1"
version = var.k8s_version
version = data.digitalocean_kubernetes_versions._.latest_version
node_pool {
name = "x86"
tags = var.common_tags
size = local.node_type
auto_scale = true
size = local.node_size
auto_scale = var.max_nodes_per_pool > var.min_nodes_per_pool
min_nodes = var.min_nodes_per_pool
max_nodes = max(var.min_nodes_per_pool, var.max_nodes_per_pool)
}
}
data "digitalocean_kubernetes_versions" "_" {
}


@@ -1,7 +1,3 @@
output "kubeconfig" {
value = digitalocean_kubernetes_cluster._.kube_config.0.raw_config
}
output "cluster_id" {
value = digitalocean_kubernetes_cluster._.id
}
@@ -9,3 +5,8 @@ output "cluster_id" {
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
value = digitalocean_kubernetes_cluster._.kube_config.0.raw_config
sensitive = true
}


@@ -0,0 +1 @@
../../providers/digitalocean/variables.tf


@@ -0,0 +1 @@
../common.tf


@@ -0,0 +1 @@
../../providers/exoscale/config.tf


@@ -0,0 +1,20 @@
resource "exoscale_sks_cluster" "_" {
zone = var.location
name = var.cluster_name
service_level = "starter"
}
resource "exoscale_sks_nodepool" "_" {
cluster_id = exoscale_sks_cluster._.id
zone = exoscale_sks_cluster._.zone
name = var.cluster_name
instance_type = local.node_size
size = var.min_nodes_per_pool
}
resource "exoscale_sks_kubeconfig" "_" {
cluster_id = exoscale_sks_cluster._.id
zone = exoscale_sks_cluster._.zone
user = "kubernetes-admin"
groups = ["system:masters"]
}


@@ -0,0 +1,12 @@
output "cluster_id" {
value = exoscale_sks_cluster._.id
}
output "has_metrics_server" {
value = true
}
output "kubeconfig" {
value = exoscale_sks_kubeconfig._.kubeconfig
sensitive = true
}


@@ -0,0 +1,7 @@
terraform {
required_providers {
exoscale = {
source = "exoscale/exoscale"
}
}
}


@@ -0,0 +1 @@
../../providers/exoscale/variables.tf


@@ -0,0 +1 @@
../common.tf


@@ -0,0 +1 @@
../../providers/googlecloud/config.tf


@@ -0,0 +1,12 @@
locals {
location = var.location != null ? var.location : "europe-north1-a"
region = replace(local.location, "/-[a-z]$/", "")
# Unfortunately, the following line doesn't work
# (that attribute just returns an empty string)
# so we have to hard-code the project name.
#project = data.google_client_config._.project
project = "prepare-tf"
}
data "google_client_config" "_" {}
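The `replace()` call above relies on Terraform's `/.../` pattern syntax, which switches `replace()` into regex mode, to strip a trailing `-<letter>` zone suffix and recover the region. For example:

```hcl
# "europe-north1-a" → "europe-north1"; an input that is already a region
# (no trailing zone letter) passes through unchanged.
locals {
  example_region = replace("europe-north1-a", "/-[a-z]$/", "")
}
```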


@@ -1,8 +1,8 @@
resource "google_container_cluster" "_" {
name = var.cluster_name
project = local.project
location = local.location
min_master_version = var.k8s_version
name = var.cluster_name
project = local.project
location = local.location
#min_master_version = var.k8s_version
# To deploy private clusters, uncomment the section below,
# and uncomment the block in network.tf.
@@ -43,12 +43,12 @@ resource "google_container_cluster" "_" {
name = "x86"
node_config {
tags = var.common_tags
machine_type = local.node_type
machine_type = local.node_size
}
initial_node_count = var.min_nodes_per_pool
autoscaling {
min_node_count = var.min_nodes_per_pool
max_node_count = max(var.min_nodes_per_pool, var.max_nodes_per_pool)
max_node_count = var.max_nodes_per_pool
}
}
@@ -62,4 +62,3 @@ resource "google_container_cluster" "_" {
}
}
}


@@ -1,7 +1,14 @@
data "google_client_config" "_" {}
output "cluster_id" {
value = google_container_cluster._.id
}
output "has_metrics_server" {
value = true
}
output "kubeconfig" {
value = <<-EOT
sensitive = true
value = <<-EOT
apiVersion: v1
kind: Config
current-context: ${google_container_cluster._.name}
@@ -25,11 +32,3 @@ output "kubeconfig" {
token: ${data.google_client_config._.access_token}
EOT
}
output "cluster_id" {
value = google_container_cluster._.id
}
output "has_metrics_server" {
value = true
}


@@ -0,0 +1 @@
../../providers/googlecloud/variables.tf


@@ -0,0 +1 @@
../common.tf


@@ -0,0 +1 @@
../../providers/linode/config.tf


@@ -3,10 +3,10 @@ resource "linode_lke_cluster" "_" {
tags = var.common_tags
# "region" is mandatory, so let's provide a default value if none was given.
region = var.location != null ? var.location : "eu-central"
k8s_version = local.k8s_version
k8s_version = data.linode_lke_versions._.versions[0].id
pool {
type = local.node_type
type = local.node_size
count = var.min_nodes_per_pool
autoscaler {
min = var.min_nodes_per_pool
@@ -15,3 +15,9 @@ resource "linode_lke_cluster" "_" {
}
}
data "linode_lke_versions" "_" {
}
# FIXME: sort the versions to be sure that we get the most recent one?
# (We don't know in which order they are returned by the provider.)


@@ -1,7 +1,3 @@
output "kubeconfig" {
value = base64decode(linode_lke_cluster._.kubeconfig)
}
output "cluster_id" {
value = linode_lke_cluster._.id
}
@@ -9,3 +5,8 @@ output "cluster_id" {
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
value = base64decode(linode_lke_cluster._.kubeconfig)
sensitive = true
}


@@ -2,7 +2,7 @@ terraform {
required_providers {
linode = {
source = "linode/linode"
version = "1.22.0"
version = "1.30.0"
}
}
}


@@ -0,0 +1 @@
../../providers/linode/variables.tf


@@ -0,0 +1 @@
../common.tf

Some files were not shown because too many files have changed in this diff.