⚛️ Huge refactoring of lab environment deployment system

Summary of changes:
- "workshopctl" is now "labctl"
- it can handle deployment of VMs but also of managed
  Kubernetes clusters (and therefore, it replaces
  the "prepare-tf" directory)
- support for many more providers has been added

Check the README.md, in particular the "directory structure";
it has the most important information.
Jérôme Petazzoni
2023-03-29 16:30:06 +02:00
parent f8ab4adfb7
commit b6340acb6e
205 changed files with 1793 additions and 2836 deletions

12
.gitignore vendored
View File

@@ -2,11 +2,12 @@
*.swp
*~
prepare-vms/tags
prepare-vms/infra
prepare-vms/www
prepare-tf/tag-*
**/terraform.tfstate
**/terraform.tfstate.backup
prepare-labs/terraform/lab-environments
prepare-labs/terraform/many-kubernetes/one-kubernetes-config/config.tf
prepare-labs/terraform/many-kubernetes/one-kubernetes-module/*.tf
prepare-labs/www
slides/*.yml.html
slides/autopilot/state.yaml
@@ -26,3 +27,4 @@ node_modules
Thumbs.db
ehthumbs.db
ehthumbs_vista.db

196
prepare-labs/README.md Normal file
View File

@@ -0,0 +1,196 @@
# Tools to create lab environments
This directory contains tools to create lab environments for Docker and Kubernetes courses and workshops.
It also contains Terraform configurations that can be used stand-alone to create simple Kubernetes clusters.
Assuming that you have installed all the necessary dependencies, and placed cloud provider access tokens in the right locations, you could do, for instance:
```bash
# For a Docker course with 50 students,
# create 50 VMs on Digital Ocean.
./labctl create --students 50 --settings settings/docker.env --provider digitalocean
# For a Kubernetes training with 20 students,
# create 20 clusters of 4 VMs each using kubeadm,
# on a private Openstack cluster.
./labctl create --students 20 --settings settings/kubernetes.env --provider openstack/enix
# For a Kubernetes workshop with 80 students,
# create 80 clusters with 2 VMs each,
# using Scaleway Kapsule (managed Kubernetes).
./labctl create --students 80 --settings settings/mk8s.env --provider scaleway --mode mk8s
```
Interested? Read on!
## Software requirements
For Docker labs and Kubernetes labs based on kubeadm:
- [Parallel SSH](https://github.com/lilydjwg/pssh)
(should be installable with `pip install git+https://github.com/lilydjwg/pssh`;
on a Mac, try `brew install pssh`)
For all labs:
- Terraform
If you want to generate printable cards:
- [pyyaml](https://pypi.python.org/pypi/PyYAML)
- [jinja2](https://pypi.python.org/pypi/Jinja2)
These require Python 3. If you are on a Mac, see below for specific instructions on making
Python 3 the default Python. In particular, if you installed `mosh`, Homebrew
may have changed your default Python to Python 2.
If you can't (or don't want to) install these requirements, you
can also run `labctl` in a container.
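For instance, the `wrap` command re-runs `labctl` through `docker-compose`; here is a minimal sketch, reusing the flags from the example above (any `labctl` command can be wrapped this way):
```bash
# Run labctl inside a container instead of installing the dependencies locally.
./labctl wrap create --students 50 --settings settings/docker.env --provider digitalocean
```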
You will also need an account with the cloud provider(s) that you want to use to deploy the lab environments.
## Cloud provider account(s) and credentials
These scripts create VMs or Kubernetes clusters on cloud providers, so you will need cloud provider account(s) and credentials.
Generally, we try to use the credentials stored in the configuration files used by the cloud providers' CLI tools.
This means, for instance, that for Linode, if you install `linode-cli` and configure it properly, it will place your credentials in `~/.config/linode-cli`, and our Terraform configurations will try to read that file and use the credentials in it.
You don't **have to** install the CLI tools of the cloud provider(s) that you want to use; but we recommend that you do.
If you want to provide your cloud credentials through other means, you will have to adjust the Terraform configuration files in `terraform/provider-config` accordingly.
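As an illustration of the default mechanism: for Linode and Digital Ocean, a one-time CLI setup is usually all it takes for the Terraform configurations to find your credentials (a sketch, assuming you use these two providers):
```bash
# Linode: stores the API token in ~/.config/linode-cli,
# which is where the Terraform configurations look for it.
linode-cli configure

# Digital Ocean: stores the API token in ~/.config/doctl/config.yaml
# (assumed to be read by the Digital Ocean provider config here).
doctl auth init
```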
## General Workflow
- fork/clone repo
- make sure your cloud credentials have been configured properly
- run `./labctl create ...` to create lab environments
- run `./labctl destroy ...` when you don't need the environments anymore (see the full example below)
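Put together, a typical session might look like this (the repository URL, provider, settings file, and tag are placeholders):
```bash
# One-time setup: clone your fork of the repository.
git clone <your-fork-of-this-repository>
cd <repository>/prepare-labs

# Create lab environments (the assigned tag is displayed along the way).
./labctl create --students 10 --settings settings/kubernetes.env --provider digitalocean

# List known batches of lab environments and their status.
./labctl tags

# When the training is over, tear everything down.
./labctl destroy <tag>
```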
## Customizing things
You can edit the `settings/*.env` files, for instance to change the size of the clusters, the login and password used by the students...
Note that these files are sourced before executing any operation on a specific set of lab environments, which means that you can set Terraform variables by adding lines like the following ones to the `*.env` files:
```bash
export TF_VAR_node_size=GP1.L
export TF_VAR_location=eu-north
```
## `./labctl` Usage
If you run `./labctl` without arguments, it will show a list of available commands.
### Summary of What `./labctl` Does For You
The script will create a Terraform configuration using a provider-specific template.
There are two modes: `pssh` and `mk8s`.
In `pssh` mode, students connect directly to the virtual machines using SSH.
The Terraform configuration creates a bunch of virtual machines, then the provisioning and configuration are done with `pssh`. There are a number of "steps" that are executed on the VMs, to install Docker, install a number of convenient tools, install and set up Kubernetes (if needed)... The list of "steps" to be executed is configured in the `settings/*.env` file.
In `mk8s` mode, students don't connect directly to the virtual machines. Instead, they connect to an SSH server running in a Pod (using the `jpetazzo/shpod` image), itself running on a Kubernetes cluster. The Kubernetes cluster is a managed cluster created by the Terraform configuration.
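In `mk8s` mode, the settings file used in the examples above (`settings/mk8s.env`) has a single `stage2` step, which runs an extra Terraform configuration to finalize each cluster. Roughly, that step boils down to the following (a sketch; `<tag>` stands for the tag of your batch):
```bash
# What "labctl stage2 <tag>" does, approximately:
cd tags/<tag>/stage2
terraform init -upgrade
terraform apply -auto-approve
```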
## `terraform` directory structure and principles
Legend:
- `📁` directory
- `📄` file
- `📄📄📄` multiple files
- `🌍` Terraform configuration that can be used "as-is"
```
📁terraform
├── 📁list-locations
│ └── 📄📄📄 helper scripts
│ (to list available locations for each provider)
├── 📁many-kubernetes
│ └── 📄📄📄 Terraform configuration template
│ (used in mk8s mode)
├── 📁one-kubernetes
│ │ (contains Terraform configurations that can spawn
│ │ a single Kubernetes cluster on a given provider)
│ ├── 📁🌍aws
│ ├── 📁🌍civo
│ ├── 📄common.tf
│ ├── 📁🌍digitalocean
│ └── ...
├── 📁provider-config
│ ├── 📄aws.tf
│ ├── 📄azure.tf
│ ├── 📄civo.tf
│ ├── 📄digitalocean.tf
│ └── ...
├── 📁tags
│ │ (contains Terraform configurations + other files
│ │ for a specific set of VMs or K8S clusters; these
│ │ are created by labctl)
│ ├── 📁2023-03-27-10-04-79-jp
│ ├── 📁2023-03-27-10-07-41-jp
│ ├── 📁2023-03-27-10-16-418-jp
│ └── ...
└── 📁virtual-machines
│ (contains Terraform configurations that can spawn
│ a bunch of virtual machines on a given provider)
├── 📁🌍aws
├── 📁🌍azure
├── 📄common.tf
├── 📁🌍digitalocean
└── ...
```
The directory structure can feel a bit overwhelming at first, but it's built with specific goals in mind.
**Consistent input/output between providers.** The per-provider configurations in `one-kubernetes` all take the same input variables, and provide the same output variables. Same thing for the per-provider configurations in `virtual-machines`.
**Don't repeat yourself.** As much as possible, common variables, definitions, and logic have been factored into the `common.tf` file that you can see in `one-kubernetes` and `virtual-machines`. That file is then symlinked into each provider-specific directory, to make sure that all providers use the same version of the `common.tf` file.
**Don't repeat yourself (again).** The things that are specific to each provider (e.g. how to obtain the credentials; the size of the VMs to use...) have been placed in the `provider-config` directory, and are shared between the `one-kubernetes` and the `virtual-machines` configurations.
**Terraform configurations should work in `labctl` or standalone, without extra work.** The Terraform configurations (identified by 🌍 in the directory tree above) can be used directly. Just go to one of these directories, `terraform init`, `terraform apply`, and you're good to go. But they can also be used from `labctl`. `labctl` shouldn't barf out if you did a `terraform apply` in one of these directories (because it will only copy the `*.tf` files, and leave alone the other files, like the Terraform state).
The latter means that it should be easy to tweak these configurations, or create a new one, without having to use `labctl` to test it. It also means that if you want to use these configurations but don't care about `labctl`, you absolutely can!
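For instance, to create a single managed Kubernetes cluster without going through `labctl` (a sketch; the provider and variable values are just examples, and all variables have defaults anyway):
```bash
cd terraform/one-kubernetes/digitalocean
terraform init
terraform apply -var cluster_name=my-test-cluster -var node_size=M

# When you're done with the cluster:
terraform destroy -var cluster_name=my-test-cluster -var node_size=M
```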
## Miscellaneous info
### Making sure Python3 is the default (Mac only)
Check the `/usr/local/bin/python` symlink. It should be pointing to
`/usr/local/Cellar/python/3`-something. If it isn't, follow these
instructions.
1) Verify that Python 3 is installed.
```
ls -la /usr/local/Cellar/Python
```
You should see one or more versions of Python 3. If you don't,
install it with `brew install python`.
2) Verify that `python` points to Python 3.
```
ls -la /usr/local/bin/python
```
If this points to `/usr/local/Cellar/python@2`, then we'll need to change it.
```
rm /usr/local/bin/python
ln -s /usr/local/Cellar/Python/xxxx /usr/local/bin/python
# where xxxx is the most recent Python 3 version you saw above
```
### AWS specific notes
The initial assumption is that you're using a root account. If you'd like to use an IAM user, it will need the right permissions. For `pssh` mode, that includes at least `AmazonEC2FullAccess` and `IAMReadOnlyAccess`.
In `pssh` mode, the Terraform configuration currently uses the default VPC and Security Group. If you want to use another one, you'll have to make changes to `terraform/virtual-machines/aws`.
The default VPC Security Group does not open any ports from the Internet by default, so you'll need to add Inbound rules for `SSH | TCP | 22 | 0.0.0.0/0` and `Custom TCP Rule | TCP | 8000 - 8002 | 0.0.0.0/0`.
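If you prefer the AWS CLI to the console, the equivalent rules can be added with something like the following (a sketch; the security group ID is a placeholder for the one of your default VPC):
```bash
# Open SSH and ports 8000-8002 to the world on the default security group.
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 22 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 8000-8002 --cidr 0.0.0.0/0
```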

View File

(binary image file; 127 KiB before and after)

View File

@@ -21,10 +21,13 @@ DEPENDENCIES="
man
pssh
ssh
wkhtmltopdf
yq
"
UNUSED_DEPENDENCIES="
wkhtmltopdf
"
# Check for missing dependencies, and issue a warning if necessary.
missing=0
for dependency in $DEPENDENCIES; do

View File

@@ -50,20 +50,6 @@ sep() {
fi
}
need_infra() {
if [ -z "$1" ]; then
die "Please specify infrastructure file. (e.g.: infra/aws)"
fi
if [ "$1" = "--infra" ]; then
die "The infrastructure file should be passed directly to this command. Remove '--infra' and try again."
fi
if [ ! -f "$1" ]; then
die "Infrastructure file $1 doesn't exist."
fi
. "$1"
. "lib/infra/$INFRACLASS.sh"
}
need_tag() {
if [ -z "$TAG" ]; then
die "Please specify a tag. To see available tags, run: $0 tags"
@@ -71,25 +57,12 @@ need_tag() {
if [ ! -d "tags/$TAG" ]; then
die "Tag $TAG not found (directory tags/$TAG does not exist)."
fi
for FILE in settings.yaml ips.txt infra.sh; do
for FILE in settings.env ips.txt; do
if [ ! -f "tags/$TAG/$FILE" ]; then
warning "File tags/$TAG/$FILE not found."
fi
done
. "tags/$TAG/infra.sh"
. "lib/infra/$INFRACLASS.sh"
}
need_settings() {
if [ -z "$1" ]; then
die "Please specify a settings file. (e.g.: settings/kube101.yaml)"
fi
if [ ! -f "$1" ]; then
die "Settings file $1 doesn't exist."
if [ -f "tags/$TAG/settings.env" ]; then
. tags/$TAG/settings.env
fi
}
need_login_password() {
USER_LOGIN=$(yq -r .user_login < tags/$TAG/settings.yaml)
USER_PASSWORD=$(yq -r .user_password < tags/$TAG/settings.yaml)
}

View File

@@ -1,5 +1,3 @@
export AWS_DEFAULT_OUTPUT=text
# Ignore SSH key validation when connecting to these remote hosts.
# (Otherwise, deployment scripts break when a VM IP address reuse.)
SSHOPTS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=ERROR"
@@ -23,7 +21,7 @@ _cmd_build() {
_cmd wrap "Run this program in a container"
_cmd_wrap() {
docker-compose run --rm workshopctl "$@"
docker-compose run --rm labctl "$@"
}
_cmd cards "Generate ready-to-print cards for a group of VMs"
@@ -31,10 +29,12 @@ _cmd_cards() {
TAG=$1
need_tag
die FIXME
# This will process ips.txt to generate two files: ips.pdf and ips.html
(
cd tags/$TAG
../../lib/ips-txt-to-html.py settings.yaml
../../../lib/ips-txt-to-html.py settings.yaml
)
ln -sf ../tags/$TAG/ips.html www/$TAG.html
@@ -47,10 +47,10 @@ _cmd_cards() {
info "$0 www"
}
_cmd clean "Remove information about stopped clusters"
_cmd clean "Remove information about destroyed clusters"
_cmd_clean() {
for TAG in tags/*; do
if grep -q ^stopped$ "$TAG/status"; then
if grep -q ^destroyed$ "$TAG/status"; then
info "Removing $TAG..."
rm -rf "$TAG"
fi
@@ -61,7 +61,6 @@ _cmd createuser "Create the user that students will use"
_cmd_createuser() {
TAG=$1
need_tag
need_login_password
pssh "
set -e
@@ -82,7 +81,7 @@ _cmd_createuser() {
set -e
sudo sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
sudo sed -i 's/#MaxAuthTries 6/MaxAuthTries 42/' /etc/ssh/sshd_config
sudo service ssh restart
sudo systemctl restart ssh.service
"
pssh "
@@ -103,9 +102,7 @@ _cmd_createuser() {
# in the next deployment step). In the long run, we probably want to
# generate these keys locally and push them to the machines instead
# (once we move everything to Terraform).
if [ -f "tags/$TAG/id_rsa" ]; then
ssh-add tags/$TAG/id_rsa
fi
ssh-add tags/$TAG/id_rsa
pssh "
set -e
cd /home/$USER_LOGIN
@@ -115,9 +112,7 @@ _cmd_createuser() {
sudo -u $USER_LOGIN tar -xf-
fi
"
if [ -f "tags/$TAG/id_rsa" ]; then
ssh-add -d tags/$TAG/id_rsa
fi
ssh-add -d tags/$TAG/id_rsa
# FIXME do this only once.
pssh -I "sudo -u $USER_LOGIN tee -a /home/$USER_LOGIN/.bashrc" <<"SQRL"
@@ -167,49 +162,162 @@ SQRL
echo user_ok > tags/$TAG/status
}
_cmd standardize "Deal with non-standard Ubuntu cloud images"
_cmd_standardize() {
TAG=$1
need_tag
# Disable unattended upgrades so that they don't mess up with the subsequent steps
pssh sudo rm -f /etc/apt/apt.conf.d/50unattended-upgrades
_cmd create "Create lab environments"
_cmd_create() {
while [ ! -z "$*" ]; do
case "$1" in
--mode) MODE=$2; shift 2;;
--provider) PROVIDER=$2; shift 2;;
--settings) SETTINGS=$2; shift 2;;
--students) STUDENTS=$2; shift 2;;
--tag) TAG=$2; shift 2;;
*) die "Unrecognized parameter: $1."
esac
done
# Digital Ocean's cloud init disables password authentication; re-enable it.
pssh "
if [ -f /etc/ssh/sshd_config.d/50-cloud-init.conf ]; then
sudo rm /etc/ssh/sshd_config.d/50-cloud-init.conf
sudo systemctl restart ssh.service
fi"
# Special case for scaleway since it doesn't come with sudo
if [ "$INFRACLASS" = "scaleway" ]; then
pssh -l root "
grep DEBIAN_FRONTEND /etc/environment || echo DEBIAN_FRONTEND=noninteractive >> /etc/environment
grep cloud-init /etc/sudoers && rm /etc/sudoers
apt-get update && apt-get install sudo -y"
if [ -z "$MODE" ]; then
info "Using default mode (pssh)."
MODE=pssh
fi
if [ -z "$PROVIDER" ]; then
die "Please add --provider flag to specify which provider to use."
fi
if [ -z "$SETTINGS" ]; then
die "Please add --settings flag to specify which settings file to use."
fi
if [ -z "$STUDENTS" ]; then
info "Defaulting to 1 student since --students flag wasn't specified."
STUDENTS=1
fi
# Special case for oracle since their iptables blocks everything but SSH
pssh "
if [ -f /etc/iptables/rules.v4 ]; then
sudo sed -i 's/-A INPUT -j REJECT --reject-with icmp-host-prohibited//' /etc/iptables/rules.v4
sudo netfilter-persistent flush
sudo netfilter-persistent start
fi"
case "$MODE" in
mk8s)
PROVIDER_BASE=terraform/one-kubernetes
;;
pssh)
PROVIDER_BASE=terraform/virtual-machines
;;
*) die "Invalid mode: $MODE (supported modes: mk8s, pssh)." ;;
esac
if ! [ -f "$SETTINGS" ]; then
die "Settings file ($SETTINGS) not found."
fi
# oracle-cloud-agent upgrades packages in the background.
# This breaks our deployment scripts, because when we invoke apt-get, it complains
# that the lock already exists (symptom: random "Exited with error code 100").
# Workaround: if we detect oracle-cloud-agent, remove it.
# But this agent seems to also take care of installing/upgrading
# the unified-monitoring-agent package, so when we stop the snap,
# it can leave dpkg in a broken state. We "fix" it with the 2nd command.
pssh "
if [ -d /snap/oracle-cloud-agent ]; then
sudo snap remove oracle-cloud-agent
sudo dpkg --remove --force-remove-reinstreq unified-monitoring-agent
fi"
# Check that the provider is valid.
if [ -d $PROVIDER_BASE/$PROVIDER ]; then
if [ -f $PROVIDER_BASE/$PROVIDER/requires_tfvars ]; then
die "Provider $PROVIDER cannot be used directly, because it requires a tfvars file."
fi
PROVIDER_DIRECTORY=$PROVIDER_BASE/$PROVIDER
TFVARS=""
elif [ -f $PROVIDER_BASE/$PROVIDER.tfvars ]; then
TFVARS=$PROVIDER_BASE/$PROVIDER.tfvars
PROVIDER_DIRECTORY=$(dirname $PROVIDER_BASE/$PROVIDER)
else
error "Provider $PROVIDER not found."
info "Available providers for mode $MODE:"
(
cd $PROVIDER_BASE
for P in *; do
if [ -d "$P" ]; then
[ -f "$P/requires_tfvars" ] || info "$P"
for V in $P/*.tfvars; do
[ -f "$V" ] && info "${V%.tfvars}"
done
fi
done
)
die "Please specify a valid provider."
fi
if [ -z "$TAG" ]; then
TAG=$(_cmd_maketag)
fi
mkdir -p tags/$TAG
echo creating > tags/$TAG/status
ln -s ../../$SETTINGS tags/$TAG/settings.env.orig
cp $SETTINGS tags/$TAG/settings.env
. $SETTINGS
echo $MODE > tags/$TAG/mode
echo $PROVIDER > tags/$TAG/provider
case "$MODE" in
mk8s)
cp -d terraform/many-kubernetes/*.* tags/$TAG
mkdir tags/$TAG/one-kubernetes-module
cp $PROVIDER_DIRECTORY/*.tf tags/$TAG/one-kubernetes-module
mkdir tags/$TAG/one-kubernetes-config
mv tags/$TAG/one-kubernetes-module/config.tf tags/$TAG/one-kubernetes-config
;;
pssh)
cp $PROVIDER_DIRECTORY/*.tf tags/$TAG
if [ "$TFVARS" ]; then
cp "$TFVARS" "tags/$TAG/$(basename $TFVARS).auto.tfvars"
fi
;;
esac
(
cd tags/$TAG
terraform init
echo tag = \"$TAG\" >> terraform.tfvars
echo how_many_clusters = $STUDENTS >> terraform.tfvars
echo nodes_per_cluster = $CLUSTERSIZE >> terraform.tfvars
for RETRY in 1 2 3; do
if terraform apply -auto-approve; then
touch terraform.ok
break
fi
done
if ! [ -f terraform.ok ]; then
die "Terraform failed."
fi
)
sep
info "Successfully created $COUNT instances with tag $TAG"
echo create_ok > tags/$TAG/status
# If the settings.env file has a "STEPS" field,
# automatically execute all the actions listed in that field.
# If an action fails, retry it up to 10 times.
for STEP in $(echo $STEPS); do
sep "$TAG -> $STEP"
TRY=1
MAXTRY=10
while ! $0 $STEP $TAG ; do
TRY=$(($TRY+1))
if [ $TRY -gt $MAXTRY ]; then
error "This step ($STEP) failed after $MAXTRY attempts."
info "You can troubleshoot the situation manually, or terminate these instances with:"
info "$0 destroy $TAG"
die "Giving up."
else
sep
info "Step '$STEP' failed for '$TAG'. Let's wait 10 seconds and try again."
info "(Attempt $TRY out of $MAXTRY.)"
sleep 10
fi
done
done
sep
info "Deployment successful."
info "To log into the first machine of that batch, you can run:"
info "$0 ssh $TAG"
info "To terminate these instances, you can run:"
info "$0 destroy $TAG"
}
_cmd destroy "Destroy lab environments"
_cmd_destroy() {
TAG=$1
need_tag
cd tags/$TAG
echo destroying > status
terraform destroy -auto-approve
echo destroyed > status
}
_cmd clusterize "Group VMs in clusters"
@@ -217,24 +325,32 @@ _cmd_clusterize() {
TAG=$1
need_tag
# Copy settings and install Python YAML parser
pssh -I tee /tmp/settings.yaml <tags/$TAG/settings.yaml
pssh "
sudo apt-get update &&
sudo apt-get install -y python3-yaml python-is-python3"
set -e
grep PSSH_ /etc/ssh/sshd_config || echo 'AcceptEnv PSSH_*' | sudo tee -a /etc/ssh/sshd_config
sudo systemctl restart ssh.service"
# Copy clusterize.py to the remote machines, and execute it, feeding it the list of IP addresses
pssh -I tee /tmp/clusterize.py <lib/clusterize.py
pssh --timeout 900 --send-input "python /tmp/clusterize.py >>/tmp/pp.out 2>>/tmp/pp.err" <tags/$TAG/ips.txt
# On the first node, create and deploy TLS certs using Docker Machine
# (Currently disabled.)
true || pssh "
if i_am_first_node; then
grep '[0-9]\$' /etc/hosts |
xargs -n2 sudo -H -u $USER_LOGIN \
docker-machine create -d generic --generic-ssh-user $USER_LOGIN --generic-ip-address
fi"
pssh -I < tags/$TAG/clusters.txt "
grep -w \$PSSH_HOST | tr ' ' '\n' > /tmp/cluster"
pssh "
echo \$PSSH_HOST > /tmp/ipv4
head -n 1 /tmp/cluster | sudo tee /etc/ipv4_of_first_node
echo ${CLUSTERPREFIX}1 | sudo tee /etc/name_of_first_node
echo HOSTIP=\$PSSH_HOST | sudo tee -a /etc/environment
NODEINDEX=\$((\$PSSH_NODENUM%$CLUSTERSIZE+1))
if [ \$NODEINDEX = 1 ]; then
sudo ln -sf /bin/true /usr/local/bin/i_am_first_node
else
sudo ln -sf /bin/false /usr/local/bin/i_am_first_node
fi
echo $CLUSTERPREFIX\$NODEINDEX | sudo tee /etc/hostname
sudo hostname $CLUSTERPREFIX\$NODEINDEX
N=1
while read ip; do
grep -w \$ip /etc/hosts || echo \$ip $CLUSTERPREFIX\$N | sudo tee -a /etc/hosts
N=\$((\$N+1))
done < /tmp/cluster
"
echo cluster_ok > tags/$TAG/status
}
@@ -343,11 +459,7 @@ _cmd kube "Setup kubernetes clusters with kubeadm (must be run AFTER deploy)"
_cmd_kube() {
TAG=$1
need_tag
need_login_password
# Optional version, e.g. 1.13.5
SETTINGS=tags/$TAG/settings.yaml
KUBEVERSION=$(awk '/^kubernetes_version:/ {print $2}' $SETTINGS)
if [ "$KUBEVERSION" ]; then
pssh "
sudo tee /etc/apt/preferences.d/kubernetes <<EOF
@@ -476,7 +588,6 @@ _cmd kubetools "Install a bunch of CLI tools for Kubernetes"
_cmd_kubetools() {
TAG=$1
need_tag
need_login_password
ARCH=${ARCHITECTURE-amd64}
@@ -704,8 +815,6 @@ _cmd_ips() {
TAG=$1
need_tag $TAG
SETTINGS=tags/$TAG/settings.yaml
CLUSTERSIZE=$(awk '/^clustersize:/ {print $2}' $SETTINGS)
while true; do
for I in $(seq $CLUSTERSIZE); do
read ip || return 0
@@ -715,22 +824,9 @@ _cmd_ips() {
done < tags/$TAG/ips.txt
}
_cmd inventory "List all VMs on a given infrastructure (or all infras if no arg given)"
_cmd inventory "List all VMs on a given provider (or across all providers if no arg given)"
_cmd_inventory() {
case "$1" in
"")
for INFRA in infra/*; do
$0 inventory $INFRA
done
;;
*/example.*)
;;
*)
need_infra $1
sep "Listing instances for $1"
infra_list
;;
esac
FIXME
}
_cmd maketag "Generate a quasi-unique tag for a group of instances"
@@ -775,11 +871,83 @@ _cmd_ping() {
fping < tags/$TAG/ips.txt
}
_cmd stage2 "Finalize the setup of managed Kubernetes clusters"
_cmd_stage2() {
TAG=$1
need_tag
cd tags/$TAG/stage2
terraform init -upgrade
terraform apply -auto-approve
}
_cmd standardize "Deal with non-standard Ubuntu cloud images"
_cmd_standardize() {
TAG=$1
need_tag
# Try to log in as root.
# If successful, make sure that we have:
# - sudo
# - ubuntu user
# Note that on Scaleway, the keys of the root account get copied
# a little bit later after boot; so the first time we run "standardize"
# we might end up copying an incomplete authorized_keys file.
# That's why we copy it unconditionally here, rather than checking
# for existence and skipping if it already exists.
pssh -l root -t 5 true 2>&1 >/dev/null && {
pssh -l root "
grep DEBIAN_FRONTEND /etc/environment || echo DEBIAN_FRONTEND=noninteractive >> /etc/environment
#grep cloud-init /etc/sudoers && rm /etc/sudoers
apt-get update && apt-get install sudo -y
getent passwd ubuntu || {
useradd ubuntu -m -s /bin/bash
echo 'ubuntu ALL=(ALL:ALL) NOPASSWD:ALL' > /etc/sudoers.d/ubuntu
}
install --owner=ubuntu --mode=700 --directory /home/ubuntu/.ssh
install --owner=ubuntu --mode=600 /root/.ssh/authorized_keys --target-directory /home/ubuntu/.ssh
"
}
# Now make sure that we have an ubuntu user
pssh true
# Disable unattended upgrades so that they don't mess up with the subsequent steps
pssh sudo rm -f /etc/apt/apt.conf.d/50unattended-upgrades
# Digital Ocean's cloud init disables password authentication; re-enable it.
pssh "
if [ -f /etc/ssh/sshd_config.d/50-cloud-init.conf ]; then
sudo rm /etc/ssh/sshd_config.d/50-cloud-init.conf
sudo systemctl restart ssh.service
fi"
# Special case for oracle since their iptables blocks everything but SSH
pssh "
if [ -f /etc/iptables/rules.v4 ]; then
sudo sed -i 's/-A INPUT -j REJECT --reject-with icmp-host-prohibited//' /etc/iptables/rules.v4
sudo netfilter-persistent flush
sudo netfilter-persistent start
fi"
# oracle-cloud-agent upgrades packages in the background.
# This breaks our deployment scripts, because when we invoke apt-get, it complains
# that the lock already exists (symptom: random "Exited with error code 100").
# Workaround: if we detect oracle-cloud-agent, remove it.
# But this agent seems to also take care of installing/upgrading
# the unified-monitoring-agent package, so when we stop the snap,
# it can leave dpkg in a broken state. We "fix" it with the 2nd command.
pssh "
if [ -d /snap/oracle-cloud-agent ]; then
sudo snap remove oracle-cloud-agent
sudo dpkg --remove --force-remove-reinstreq unified-monitoring-agent
fi"
}
_cmd tailhist "Install history viewer on port 1088"
_cmd_tailhist () {
TAG=$1
need_tag
need_login_password
ARCH=${ARCHITECTURE-amd64}
[ "$ARCH" = "aarch64" ] && ARCH=arm64
@@ -825,20 +993,6 @@ _cmd_tools() {
"
}
_cmd opensg "Open the default security group to ALL ingress traffic"
_cmd_opensg() {
need_infra $1
infra_opensg
}
_cmd disableaddrchecks "Disable source/destination IP address checks"
_cmd_disableaddrchecks() {
TAG=$1
need_tag
infra_disableaddrchecks
}
_cmd pssh "Run an arbitrary command on all nodes"
_cmd_pssh() {
TAG=$1
@@ -880,122 +1034,22 @@ fi
"
}
_cmd quotas "Check our infrastructure quotas (max instances)"
_cmd_quotas() {
need_infra $1
infra_quotas
}
_cmd ssh "Open an SSH session to the first node of a tag"
_cmd_ssh() {
TAG=$1
need_tag
need_login_password
IP=$(head -1 tags/$TAG/ips.txt)
info "Logging into $IP (default password: $USER_PASSWORD)"
ssh $SSHOPTS $USER_LOGIN@$IP
}
_cmd start "Start a group of VMs"
_cmd_start() {
while [ ! -z "$*" ]; do
case "$1" in
--infra) INFRA=$2; shift 2;;
--settings) SETTINGS=$2; shift 2;;
--count) die "Flag --count is deprecated; please use --students instead." ;;
--tag) TAG=$2; shift 2;;
--students) STUDENTS=$2; shift 2;;
*) die "Unrecognized parameter: $1."
esac
done
if [ -z "$INFRA" ]; then
die "Please add --infra flag to specify which infrastructure file to use."
fi
if [ -z "$SETTINGS" ]; then
die "Please add --settings flag to specify which settings file to use."
fi
if [ -z "$COUNT" ]; then
CLUSTERSIZE=$(awk '/^clustersize:/ {print $2}' $SETTINGS)
if [ -z "$STUDENTS" ]; then
warning "Neither --count nor --students was specified."
warning "According to the settings file, the cluster size is $CLUSTERSIZE."
warning "Deploying one cluster of $CLUSTERSIZE nodes."
STUDENTS=1
fi
COUNT=$(($STUDENTS*$CLUSTERSIZE))
fi
# Check that the specified settings and infrastructure are valid.
need_settings $SETTINGS
need_infra $INFRA
if [ -z "$TAG" ]; then
TAG=$(_cmd_maketag)
fi
mkdir -p tags/$TAG
ln -s ../../$INFRA tags/$TAG/infra.sh
ln -s ../../$SETTINGS tags/$TAG/settings.yaml
echo creating > tags/$TAG/status
infra_start $COUNT
sep
info "Successfully created $COUNT instances with tag $TAG"
echo create_ok > tags/$TAG/status
# If the settings.yaml file has a "steps" field,
# automatically execute all the actions listed in that field.
# If an action fails, retry it up to 10 times.
python -c 'if True: # hack to deal with indentation
import sys, yaml
settings = yaml.safe_load(sys.stdin)
print ("\n".join(settings.get("steps", [])))
' < tags/$TAG/settings.yaml \
| while read step; do
if [ -z "$step" ]; then
break
fi
sep "$TAG -> $step"
TRY=1
MAXTRY=10
while ! $0 $step $TAG ; do
TRY=$(($TRY+1))
if [ $TRY -gt $MAXTRY ]; then
error "This step ($step) failed after $MAXTRY attempts."
info "You can troubleshoot the situation manually, or terminate these instances with:"
info "$0 stop $TAG"
die "Giving up."
else
sep
info "Step '$step' failed for '$TAG'. Let's wait 10 seconds and try again."
info "(Attempt $TRY out of $MAXTRY.)"
sleep 10
fi
done
done
sep
info "Deployment successful."
info "To log into the first machine of that batch, you can run:"
info "$0 ssh $TAG"
info "To terminate these instances, you can run:"
info "$0 stop $TAG"
}
_cmd stop "Stop (terminate, shutdown, kill, remove, destroy...) instances"
_cmd_stop() {
TAG=$1
need_tag
infra_stop
echo stopped > tags/$TAG/status
}
_cmd tags "List groups of VMs known locally"
_cmd_tags() {
(
cd tags
echo "[#] [Status] [Tag] [Infra]" \
| awk '{ printf "%-7s %-12s %-25s %-25s\n", $1, $2, $3, $4}'
echo "[#] [Status] [Tag] [Mode] [Provider]" \
| awk '{ printf "%-7s %-12s %-30s %-10s %-25s\n", $1, $2, $3, $4, $5 }'
for tag in *; do
if [ -f $tag/ips.txt ]; then
count="$(wc -l < $tag/ips.txt)"
@@ -1007,13 +1061,18 @@ _cmd_tags() {
else
status="?"
fi
if [ -f $tag/infra.sh ]; then
infra="$(basename $(readlink $tag/infra.sh))"
if [ -f $tag/mode ]; then
mode="$(cat $tag/mode)"
else
infra="?"
mode="?"
fi
echo "$count $status $tag $infra" \
| awk '{ printf "%-7s %-12s %-25s %-25s\n", $1, $2, $3, $4}'
if [ -f $tag/provider ]; then
provider="$(cat $tag/provider)"
else
provider="?"
fi
echo "$count $status $tag $mode $provider" \
| awk '{ printf "%-7s %-12s %-30s %-10s %-25s\n", $1, $2, $3, $4, $5 }'
done
)
}
@@ -1055,7 +1114,6 @@ _cmd passwords "Set individual passwords for each cluster"
_cmd_passwords() {
TAG=$1
need_tag
need_login_password
PASSWORDS_FILE="tags/$TAG/passwords"
if ! [ -f "$PASSWORDS_FILE" ]; then
error "File $PASSWORDS_FILE not found. Please create it first."
@@ -1104,22 +1162,6 @@ _cmd_wait() {
if [ -d /var/lib/cloud ]; then
cloud-init status --wait
fi"
if [ "$SSH_USER" = "root" ]; then
pssh -l root "
getent passwd ubuntu || {
useradd ubuntu -m -s /bin/bash
echo 'ubuntu ALL=(ALL:ALL) NOPASSWD:ALL' > /etc/sudoers.d/ubuntu
}
[ -d /home/ubuntu/.ssh ] ||
install --owner=ubuntu --mode=700 --directory /home/ubuntu/.ssh
[ -f /home/ubuntu/.ssh/authorized_keys ] ||
install --owner=ubuntu --mode=600 /root/.ssh/authorized_keys --target-directory /home/ubuntu/.ssh
"
fi
# Now make sure that we have an ubuntu user
pssh true
}
# Sometimes, weave fails to come up on some nodes.

View File

@@ -16,24 +16,12 @@ pssh() {
}
echo "[parallel-ssh] $@"
export PSSH=$(which pssh || which parallel-ssh)
case "$INFRACLASS" in
hetzner) LOGIN=root ;;
linode) LOGIN=root ;;
*) LOGIN=ubuntu ;;
esac
if [ -f "tags/$TAG/id_rsa" ]; then
KEYFLAG="-O IdentityFile=tags/$TAG/id_rsa"
else
KEYFLAG=""
fi
$PSSH $KEYFLAG -h $HOSTFILE -l $LOGIN \
$(which pssh) -h $HOSTFILE -l ubuntu \
--par ${PSSH_PARALLEL_CONNECTIONS-100} \
--timeout 300 \
-O LogLevel=ERROR \
-O IdentityFile=tags/$TAG/id_rsa \
-O UserKnownHostsFile=/dev/null \
-O StrictHostKeyChecking=no \
-O ForwardAgent=yes \

View File

@@ -2,16 +2,16 @@
"""
There are two ways to use this script:
1. Pass a file name and a tag name as a single argument.
It will load a list of domains from the given file (one per line),
and assign them to the clusters corresponding to that tag.
There should be more domains than clusters.
Example: ./map-dns.py domains.txt 2020-08-15-jp
2. Pass a domain as the 1st argument, and IP addresses then.
1. Pass a domain as the 1st argument, followed by IP addresses.
It will configure the domain with the listed IP addresses.
Example: ./map-dns.py open-duck.site 1.2.3.4 2.3.4.5 3.4.5.6
2. Pass two file names as arguments, in which case the first
file should contain a list of domains, and the second a list of
groups of IP addresses, with one group per line.
There should be more domains than groups of addresses.
Example: ./map-dns.py domains.txt tags/2020-08-15-jp/clusters.txt
In both cases, the domains should be configured to use GANDI LiveDNS.
"""
import os
@@ -30,18 +30,9 @@ domain_or_domain_file = sys.argv[1]
if os.path.isfile(domain_or_domain_file):
domains = open(domain_or_domain_file).read().split()
domains = [ d for d in domains if not d.startswith('#') ]
ips_file_or_tag = sys.argv[2]
if os.path.isfile(ips_file_or_tag):
lines = open(ips_file_or_tag).read().split('\n')
clusters = [line.split() for line in lines]
else:
ips = open(f"tags/{ips_file_or_tag}/ips.txt").read().split()
settings_file = f"tags/{ips_file_or_tag}/settings.yaml"
clustersize = yaml.safe_load(open(settings_file))["clustersize"]
clusters = []
while ips:
clusters.append(ips[:clustersize])
ips = ips[clustersize:]
clusters_file = sys.argv[2]
lines = open(clusters_file).read().split('\n')
clusters = [line.split() for line in lines]
else:
domains = [domain_or_domain_file]
clusters = [sys.argv[2:]]

View File

@@ -0,0 +1,22 @@
CLUSTERSIZE=1
CLUSTERPREFIX=dmuc
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
disabledocker
createuser
webssh
tailhist
kubebins
kubetools
cards
ips
"

View File

@@ -0,0 +1,23 @@
CLUSTERSIZE=3
CLUSTERPREFIX=kubenet
CLUSTERNUMBER=100
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
disableaddrchecks
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubebins
kubetools
cards
ips
"

View File

@@ -0,0 +1,23 @@
CLUSTERSIZE=3
CLUSTERPREFIX=kuberouter
CLUSTERNUMBER=200
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
disableaddrchecks
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kubebins
kubetools
cards
ips
"

View File

@@ -0,0 +1,24 @@
CLUSTERSIZE=3
CLUSTERPREFIX=oldversion
USER_LOGIN=k8s
USER_PASSWORD=training
# For a list of old versions, check:
# https://kubernetes.io/releases/patch-releases/#non-active-branch-history
KUBEVERSION=1.20.15
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kube
kubetools
kubetest
"

View File

@@ -0,0 +1,20 @@
CLUSTERSIZE=3
CLUSTERPREFIX=test
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kube
kubetools
kubetest
"

View File

@@ -0,0 +1,19 @@
CLUSTERSIZE=1
CLUSTERPREFIX=moby
USER_LOGIN=docker
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
cards
ips
"

View File

@@ -0,0 +1,20 @@
CLUSTERSIZE=4
CLUSTERPREFIX=node
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kube
kubetools
kubetest
"

View File

@@ -0,0 +1,21 @@
CLUSTERSIZE=10
export TF_VAR_node_size=GP1.M
CLUSTERPREFIX=node
USER_LOGIN=k8s
USER_PASSWORD=training
STEPS="
wait
standardize
clusterize
tools
docker
createuser
webssh
tailhist
kube
kubetools
kubetest
"

View File

@@ -0,0 +1,6 @@
CLUSTERSIZE=2
USER_LOGIN=k8s
USER_PASSWORD=
STEPS="stage2"

View File

@@ -0,0 +1,16 @@
CLUSTERSIZE=1
CLUSTERPREFIX=CHANGEME
USER_LOGIN=portal
USER_PASSWORD=CHANGEME
STEPS="
wait
standardize
clusterize
tools
docker
createuser
ips
"

1
prepare-labs/tags Symbolic link
View File

@@ -0,0 +1 @@
terraform/tags

View File


View File

@@ -0,0 +1,4 @@
#!/bin/sh
az account list-locations -o table \
--query "sort_by([?metadata.regionType == 'Physical'], &regionalDisplayName)[]
.{ displayName: displayName, regionalDisplayName: regionalDisplayName }"

View File

@@ -8,8 +8,10 @@ resource "random_string" "_" {
resource "time_static" "_" {}
locals {
timestamp = formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339)
tag = random_string._.result
min_nodes_per_pool = var.nodes_per_cluster
max_nodes_per_pool = var.nodes_per_cluster * 2
timestamp = formatdate("YYYY-MM-DD-hh-mm", time_static._.rfc3339)
tag = random_string._.result
# Common tags to be assigned to all resources
common_tags = [
"created-by-terraform",

View File

@@ -1,9 +1,9 @@
module "clusters" {
source = "./modules/PROVIDER"
source = "./one-kubernetes-module"
for_each = local.clusters
cluster_name = each.value.cluster_name
min_nodes_per_pool = var.min_nodes_per_pool
max_nodes_per_pool = var.max_nodes_per_pool
min_nodes_per_pool = local.min_nodes_per_pool
max_nodes_per_pool = local.max_nodes_per_pool
enable_arm_pool = var.enable_arm_pool
node_size = var.node_size
common_tags = local.common_tags

View File

@@ -0,0 +1 @@
one-kubernetes-config/config.tf

View File

@@ -0,0 +1,3 @@
This directory should contain a config.tf file, even if it's empty.
(Because if the file doesn't exist, then the Terraform configuration
in the parent directory will fail.)

View File

@@ -0,0 +1,8 @@
This directory should contain a copy of one of the "one-kubernetes" modules.
For instance, when located in this directory, you can do:
cp ../../one-kubernetes/linode/* .
Then, move the config.tf file to ../one-kubernetes-config:
mv config.tf ../one-kubernetes-config

View File

@@ -0,0 +1 @@
one-kubernetes-module/provider.tf

View File

@@ -0,0 +1,3 @@
terraform {
required_version = ">= 1.4"
}

View File

@@ -1,27 +1,20 @@
variable "tag" {
type = string
}
variable "how_many_clusters" {
type = number
default = 1
default = 2
}
variable "nodes_per_cluster" {
type = number
default = 2
}
variable "node_size" {
type = string
default = "M"
# Can be S, M, L.
# We map these values to different specific instance types for each provider,
# but the idea is that they should correspond to the following sizes:
# S = 2 GB RAM
# M = 4 GB RAM
# L = 8 GB RAM
}
variable "min_nodes_per_pool" {
type = number
default = 1
}
variable "max_nodes_per_pool" {
type = number
default = 0
}
variable "enable_arm_pool" {

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/aws.tf

View File

@@ -0,0 +1,87 @@
# Taken from:
# https://github.com/hashicorp/learn-terraform-provision-eks-cluster/blob/main/main.tf
data "aws_availability_zones" "available" {}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.19.0"
name = var.cluster_name
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "19.5.1"
cluster_name = var.cluster_name
cluster_version = "1.24"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_endpoint_public_access = true
eks_managed_node_group_defaults = {
ami_type = "AL2_x86_64"
}
eks_managed_node_groups = {
one = {
name = "node-group-one"
instance_types = [local.node_size]
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
desired_size = var.min_nodes_per_pool
}
}
}
# https://aws.amazon.com/blogs/containers/amazon-ebs-csi-driver-is-now-generally-available-in-amazon-eks-add-ons/
data "aws_iam_policy" "ebs_csi_policy" {
arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
}
module "irsa-ebs-csi" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "4.7.0"
create_role = true
role_name = "AmazonEKSTFEBSCSIRole-${module.eks.cluster_name}"
provider_url = module.eks.oidc_provider
role_policy_arns = [data.aws_iam_policy.ebs_csi_policy.arn]
oidc_fully_qualified_subjects = ["system:serviceaccount:kube-system:ebs-csi-controller-sa"]
}
resource "aws_eks_addon" "ebs-csi" {
cluster_name = module.eks.cluster_name
addon_name = "aws-ebs-csi-driver"
addon_version = "v1.5.2-eksbuild.1"
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
tags = {
"eks_addon" = "ebs-csi"
"terraform" = "true"
}
}

View File

@@ -0,0 +1,44 @@
output "cluster_id" {
value = module.eks.cluster_arn
}
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
sensitive = true
value = yamlencode({
apiVersion = "v1"
kind = "Config"
clusters = [{
name = var.cluster_name
cluster = {
certificate-authority-data = module.eks.cluster_certificate_authority_data
server = module.eks.cluster_endpoint
}
}]
contexts = [{
name = var.cluster_name
context = {
cluster = var.cluster_name
user = var.cluster_name
}
}]
users = [{
name = var.cluster_name
user = {
exec = {
apiVersion = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", var.cluster_name]
}
}
}]
current-context = var.cluster_name
})
}
data "aws_eks_cluster_auth" "_" {
name = module.eks.cluster_name
}

View File

@@ -0,0 +1,7 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
}
}
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/civo.tf

View File

@@ -0,0 +1,17 @@
# As of March 2023, the default type ("k3s") only supports up
# to Kubernetes 1.23, which belongs to a museum.
# So let's use Talos, which supports up to 1.25.
resource "civo_kubernetes_cluster" "_" {
name = var.cluster_name
firewall_id = civo_firewall._.id
cluster_type = "talos"
pools {
size = local.node_size
node_count = var.min_nodes_per_pool
}
}
resource "civo_firewall" "_" {
name = var.cluster_name
}

View File

@@ -0,0 +1,12 @@
output "cluster_id" {
value = civo_kubernetes_cluster._.id
}
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
value = civo_kubernetes_cluster._.kubeconfig
sensitive = true
}

View File

@@ -0,0 +1,7 @@
terraform {
required_providers {
civo = {
source = "civo/civo"
}
}
}

View File

@@ -0,0 +1,28 @@
variable "cluster_name" {
type = string
default = "deployed-with-terraform"
}
variable "common_tags" {
type = list(string)
default = []
}
variable "node_size" {
type = string
default = "M"
}
variable "min_nodes_per_pool" {
type = number
default = 2
}
variable "max_nodes_per_pool" {
type = number
default = 4
}
locals {
node_size = lookup(var.node_sizes, var.node_size, var.node_size)
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/digitalocean.tf

View File

@@ -3,15 +3,18 @@ resource "digitalocean_kubernetes_cluster" "_" {
tags = var.common_tags
# Region is mandatory, so let's provide a default value.
region = var.location != null ? var.location : "nyc1"
version = var.k8s_version
version = data.digitalocean_kubernetes_versions._.latest_version
node_pool {
name = "x86"
tags = var.common_tags
size = local.node_type
auto_scale = true
size = local.node_size
auto_scale = var.max_nodes_per_pool > var.min_nodes_per_pool
min_nodes = var.min_nodes_per_pool
max_nodes = max(var.min_nodes_per_pool, var.max_nodes_per_pool)
}
}
data "digitalocean_kubernetes_versions" "_" {
}

View File

@@ -1,7 +1,3 @@
output "kubeconfig" {
value = digitalocean_kubernetes_cluster._.kube_config.0.raw_config
}
output "cluster_id" {
value = digitalocean_kubernetes_cluster._.id
}
@@ -9,3 +5,8 @@ output "cluster_id" {
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
value = digitalocean_kubernetes_cluster._.kube_config.0.raw_config
sensitive = true
}

View File

@@ -6,7 +6,3 @@ terraform {
}
}
}
provider "digitalocean" {
token = yamldecode(file("~/.config/doctl/config.yaml"))["access-token"]
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/exoscale.tf

View File

@@ -0,0 +1,20 @@
resource "exoscale_sks_cluster" "_" {
zone = var.location
name = var.cluster_name
service_level = "starter"
}
resource "exoscale_sks_nodepool" "_" {
cluster_id = exoscale_sks_cluster._.id
zone = exoscale_sks_cluster._.zone
name = var.cluster_name
instance_type = local.node_size
size = var.min_nodes_per_pool
}
resource "exoscale_sks_kubeconfig" "_" {
cluster_id = exoscale_sks_cluster._.id
zone = exoscale_sks_cluster._.zone
user = "kubernetes-admin"
groups = ["system:masters"]
}

View File

@@ -0,0 +1,12 @@
output "cluster_id" {
value = exoscale_sks_cluster._.id
}
output "has_metrics_server" {
value = true
}
output "kubeconfig" {
value = exoscale_sks_kubeconfig._.kubeconfig
sensitive = true
}

View File

@@ -0,0 +1,7 @@
terraform {
required_providers {
exoscale = {
source = "exoscale/exoscale"
}
}
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/googlecloud.tf

View File

@@ -1,8 +1,8 @@
resource "google_container_cluster" "_" {
name = var.cluster_name
project = local.project
location = local.location
min_master_version = var.k8s_version
name = var.cluster_name
project = local.project
location = local.location
#min_master_version = var.k8s_version
# To deploy private clusters, uncomment the section below,
# and uncomment the block in network.tf.
@@ -43,12 +43,12 @@ resource "google_container_cluster" "_" {
name = "x86"
node_config {
tags = var.common_tags
machine_type = local.node_type
machine_type = local.node_size
}
initial_node_count = var.min_nodes_per_pool
autoscaling {
min_node_count = var.min_nodes_per_pool
max_node_count = max(var.min_nodes_per_pool, var.max_nodes_per_pool)
max_node_count = var.max_nodes_per_pool
}
}
@@ -62,4 +62,3 @@ resource "google_container_cluster" "_" {
}
}
}

View File

@@ -1,7 +1,14 @@
data "google_client_config" "_" {}
output "cluster_id" {
value = google_container_cluster._.id
}
output "has_metrics_server" {
value = true
}
output "kubeconfig" {
value = <<-EOT
sensitive = true
value = <<-EOT
apiVersion: v1
kind: Config
current-context: ${google_container_cluster._.name}
@@ -25,11 +32,3 @@ output "kubeconfig" {
token: ${data.google_client_config._.access_token}
EOT
}
output "cluster_id" {
value = google_container_cluster._.id
}
output "has_metrics_server" {
value = true
}

View File

@@ -0,0 +1,12 @@
locals {
location = var.location != null ? var.location : "europe-north1-a"
region = replace(local.location, "/-[a-z]$/", "")
# Unfortunately, the following line doesn't work
# (that attribute just returns an empty string)
# so we have to hard-code the project name.
#project = data.google_client_config._.project
project = "prepare-tf"
}
data "google_client_config" "_" {}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/linode.tf

View File

@@ -3,10 +3,10 @@ resource "linode_lke_cluster" "_" {
tags = var.common_tags
# "region" is mandatory, so let's provide a default value if none was given.
region = var.location != null ? var.location : "eu-central"
k8s_version = local.k8s_version
k8s_version = data.linode_lke_versions._.versions[0].id
pool {
type = local.node_type
type = local.node_size
count = var.min_nodes_per_pool
autoscaler {
min = var.min_nodes_per_pool
@@ -15,3 +15,9 @@ resource "linode_lke_cluster" "_" {
}
}
data "linode_lke_versions" "_" {
}
# FIXME: sort the versions to be sure that we get the most recent one?
# (We don't know in which order they are returned by the provider.)

View File

@@ -1,7 +1,3 @@
output "kubeconfig" {
value = base64decode(linode_lke_cluster._.kubeconfig)
}
output "cluster_id" {
value = linode_lke_cluster._.id
}
@@ -9,3 +5,8 @@ output "cluster_id" {
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
value = base64decode(linode_lke_cluster._.kubeconfig)
sensitive = true
}

View File

@@ -0,0 +1,8 @@
terraform {
required_providers {
linode = {
source = "linode/linode"
version = "1.30.0"
}
}
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/oci.tf

View File

@@ -4,8 +4,13 @@ resource "oci_identity_compartment" "_" {
enable_delete = true
}
data "oci_containerengine_cluster_option" "_" {
cluster_option_id = "all"
}
locals {
compartment_id = oci_identity_compartment._.id
compartment_id = oci_identity_compartment._.id
kubernetes_version = data.oci_containerengine_cluster_option._.kubernetes_versions[0]
}
data "oci_identity_availability_domains" "_" {
@@ -13,16 +18,15 @@ data "oci_identity_availability_domains" "_" {
}
data "oci_core_images" "_" {
for_each = local.pools
compartment_id = local.compartment_id
operating_system = "Oracle Linux"
operating_system_version = "7.9"
shape = each.value.shape
operating_system_version = "8"
shape = local.shape
}
resource "oci_containerengine_cluster" "_" {
compartment_id = local.compartment_id
kubernetes_version = var.k8s_version
kubernetes_version = local.kubernetes_version
name = "tf-oke"
vcn_id = oci_core_vcn._.id
options {
@@ -35,15 +39,14 @@ resource "oci_containerengine_cluster" "_" {
}
resource "oci_containerengine_node_pool" "_" {
for_each = local.pools
cluster_id = oci_containerengine_cluster._.id
compartment_id = local.compartment_id
kubernetes_version = var.k8s_version
name = each.key
node_shape = each.value.shape
kubernetes_version = local.kubernetes_version
name = "pool"
node_shape = local.shape
node_shape_config {
memory_in_gbs = local.node_type.memory_in_gbs
ocpus = local.node_type.ocpus
memory_in_gbs = local.memory_in_gbs
ocpus = local.ocpus
}
node_config_details {
size = var.min_nodes_per_pool
@@ -53,7 +56,7 @@ resource "oci_containerengine_node_pool" "_" {
}
}
node_source_details {
image_id = data.oci_core_images._[each.key].images[0].id
image_id = data.oci_core_images._.images[0].id
source_type = "image"
}
}

View File

@@ -1,11 +1,3 @@
data "oci_containerengine_cluster_kube_config" "_" {
cluster_id = oci_containerengine_cluster._.id
}
output "kubeconfig" {
value = data.oci_containerengine_cluster_kube_config._.content
}
output "cluster_id" {
value = oci_containerengine_cluster._.id
}
@@ -13,3 +5,11 @@ output "cluster_id" {
output "has_metrics_server" {
value = false
}
output "kubeconfig" {
value = data.oci_containerengine_cluster_kube_config._.content
}
data "oci_containerengine_cluster_kube_config" "_" {
cluster_id = oci_containerengine_cluster._.id
}

View File

@@ -1,8 +1,7 @@
terraform {
required_providers {
oci = {
source = "hashicorp/oci"
version = "4.48.0"
source = "oracle/oci"
}
}
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/scaleway.tf

View File

@@ -0,0 +1,28 @@
resource "scaleway_k8s_cluster" "_" {
name = var.cluster_name
#region = var.location
tags = var.common_tags
version = local.k8s_version
cni = "cilium"
delete_additional_resources = true
}
resource "scaleway_k8s_pool" "_" {
cluster_id = scaleway_k8s_cluster._.id
name = "x86"
tags = var.common_tags
node_type = local.node_size
size = var.min_nodes_per_pool
min_size = var.min_nodes_per_pool
max_size = var.max_nodes_per_pool
autoscaling = var.max_nodes_per_pool > var.min_nodes_per_pool
autohealing = true
}
data "scaleway_k8s_version" "_" {
name = "latest"
}
locals {
k8s_version = data.scaleway_k8s_version._.name
}

View File

@@ -0,0 +1,12 @@
output "cluster_id" {
value = scaleway_k8s_cluster._.id
}
output "has_metrics_server" {
value = sort([local.k8s_version, "1.22"])[0] == "1.22"
}
output "kubeconfig" {
sensitive = true
value = scaleway_k8s_cluster._.kubeconfig.0.config_file
}

View File

@@ -1,8 +1,7 @@
terraform {
required_providers {
scaleway = {
source = "scaleway/scaleway"
version = "2.1.0"
source = "scaleway/scaleway"
}
}
}

View File

@@ -0,0 +1 @@
../common.tf

View File

@@ -0,0 +1 @@
../../provider-config/vcluster.tf

View File

@@ -0,0 +1,20 @@
resource "helm_release" "_" {
name = "vcluster"
namespace = var.cluster_name
create_namespace = true
#tags = var.common_tags
repository = "https://charts.loft.sh"
chart = "vcluster"
set {
name = "storage.persistence"
value = "false"
}
set {
name = "service.type"
value = "NodePort"
}
set {
name = "syncer.extraArgs"
value = "{--tls-san=${local.outer_api_server_host}}"
}
}

View File

@@ -0,0 +1,39 @@
output "cluster_id" {
value = var.cluster_name
}
output "has_metrics_server" {
value = true
}
output "kubeconfig" {
sensitive = true
value = local.kubeconfig
}
data "kubernetes_secret_v1" "kubeconfig" {
depends_on = [helm_release._]
metadata {
name = "vc-vcluster"
namespace = var.cluster_name
}
}
data "kubernetes_service_v1" "vcluster" {
depends_on = [helm_release._]
metadata {
name = "vcluster"
namespace = var.cluster_name
}
}
locals {
kubeconfig_raw = data.kubernetes_secret_v1.kubeconfig.data.config
node_port = data.kubernetes_service_v1.vcluster.spec[0].port[0].node_port
outer_api_server_url = yamldecode(file("~/kubeconfig")).clusters[0].cluster.server
outer_api_server_host = regex("https://([^:]+):", local.outer_api_server_url)[0]
inner_api_server_host = local.outer_api_server_host
inner_old_server_url = yamldecode(local.kubeconfig_raw).clusters[0].cluster.server
inner_new_server_url = "https://${local.inner_api_server_host}:${local.node_port}"
kubeconfig = replace(local.kubeconfig_raw, local.inner_old_server_url, local.inner_new_server_url)
}

View File

@@ -0,0 +1,9 @@
variable "node_sizes" {
type = map(string)
default = {}
}
variable "location" {
type = string
default = null
}

View File

@@ -0,0 +1,17 @@
provider "aws" {
region = var.location
}
variable "node_sizes" {
type = map(any)
default = {
S = "t3.small"
M = "t3.medium"
L = "t3.large"
}
}
variable "location" {
type = string
default = "eu-north-1"
}

View File

@@ -0,0 +1,25 @@
provider "azurerm" {
features {}
}
/*
Available sizes:
"Standard_D11_v2" # CPU=2 RAM=14
"Standard_F4s_v2" # CPU=4 RAM=8
"Standard_D1_v2" # CPU=1 RAM=3.5
"Standard_B1ms" # CPU=1 RAM=2
"Standard_B2s" # CPU=2 RAM=4
*/
variable "node_sizes" {
type = map(any)
default = {
S = "Standard_B1ms"
M = "Standard_B2s"
L = "Standard_F4s_v2"
}
}
variable "location" {
type = string
default = "France Central"
}

Some files were not shown because too many files have changed in this diff.