In Kubernetes 1.18, `kubectl run` no longer creates
a Deployment, and cannot create Jobs or CronJobs
anymore. It only creates Pods. Since we were using
`kubectl run` to create our first Deployment, I've
changed the materials to explain that change, and
explain how the behavior differs between 1.17- and
1.18+, since I expect that people will deal with
a mix of both scenarios for a while (at least a
year).
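In a nutshell, the difference looks like this (same command, different client versions):
```
# With kubectl 1.17 and earlier, this creates a Deployment
# (and prints a deprecation warning):
kubectl run web --image=nginx

# With kubectl 1.18 and later, the same command creates a single Pod.
# To get a Deployment, Job, or CronJob, use the dedicated commands:
kubectl create deployment web --image=nginx
kubectl create job hello --image=busybox -- echo hello
kubectl create cronjob hello --image=busybox --schedule="*/5 * * * *" -- echo hello
```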
Many improvements. QR code, fixed page size, better
use of page estate, etc.
Also pdfkit should kind of work now (not quite using
the full page size, but at least it's not utterly
broken like before).
This script needs:
- a list of domains managed by GANDI LiveDNS
- a list of IP addresses of clusters (like in tags/*/ips.txt)
It will replace the current configuration for these
domains so that they point to the clusters.
The apex of each domain and a wildcard entry will
have round-robin records pointing to all the nodes
of the cluster.
In addition, there will be records node[1234...]
pointing to each individual node.
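For illustration, with a hypothetical domain example.com and a 4-node cluster, the resulting records could be checked like this (domain and node names are placeholders):
```
dig +short A example.com         # apex: round-robin over all node IPs
dig +short A foo.example.com     # wildcard: resolves to the same set of nodes
dig +short A node1.example.com   # individual records: node1, node2, node3, ...
```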
Instead of upgrading from 1.16 to <latest> we upgrade from 1.15
to 1.16, because upgrading from <latest-1> is a special case and
it is better to show the general case.
Also, the script that sets up admin clusters now has some retry
logic to accommodate hiccups in pssh or in the cloud provider.
- Kubernetes binaries installed for ops labs bumped up to 1.17.2
- Compose-based control plane bumped up to 1.17.2
- kuberouter now uses apps/v1 DaemonSet (compatible with 1.16+)
- disable containerd (cosmetic)
Use v1beta1 for the first example (it's a bit simpler) and v1 for the second example.
The second example illustrates the served and storage attributes, and the fact that
each version can have a different schema.
Closes #541
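As a rough sketch of what the second example covers (group, kind, and schemas below are made up for illustration), a CRD can serve several versions at once, with exactly one of them marked as the storage version, and each version carrying its own schema:
```
kubectl apply -f- <<'EOF'
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: things.example.com
spec:
  group: example.com
  names:
    kind: Thing
    plural: things
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true        # still accessible through the API...
    storage: false      # ...but not the version written to etcd
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              size: {type: integer}
  - name: v1
    served: true
    storage: true       # exactly one version must have storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas: {type: integer}
EOF
```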
Revamp most of the Helm content:
- overview of Helm moved to helm-intro.md
- explanation of chart format in helm-chart-format.md
- the very crude chart example is now in helm-create-basic-chart.md
- the more advanced chart (with templates etc) is now in helm-create-better-chart.md
- deep dive into Helm internals (how it stores its data) in helm-secrets.md
This is all for Helm 3. Helm 2 is not supported anymore.
Autopilot can now continue when errors happen, and it writes
success/failure of each snippet in a log file for later review.
Also added e2e.sh to provision a test environment and start
the remote tmux instance.
Bump up Consul version to 1.6.
Change persistent consul demo; instead of a separate namespace,
use a different label. This way, the two manifests can be more
similar; and this simplifies the demo flow.
'keys' does not handle special keys (like ^J) anymore.
Instead, we should use `key`, which will pass its entire
argument to tmux, without any processing. It is therefore
possible to do something like:
```key ^C```
Or
```key Escape```
Most (if not all) calls to special keys have been
converted to use 'key' instead of 'keys'.
Action ```copypaste``` has been deprecated in favor
of three separate actions:
```copy REGEX``` (searches the regex in the active pane,
and if found, places it in an internal clipboard)
```paste``` (inserts the content of the clipboard as
keystrokes)
```check``` (forces a status check)
Also, a 'tmux' command has been added. It makes it possible
to do things like:
```tmux split-pane -v```
The flow is better this way, since we can introduce pods
just after seeing them in kubectl describe node.
Also, add some extra info when we curl the Kubernetes API.
Thanks @qerub for letting me know that the protobuf format
was deprecated in Prom 2. Also, that technical document by
@beorn7 is a real delight to read. 💯
These variables are useful when deploying images
from a local registry (or from another place than
the Docker Hub) but they turned out to be quite
confusing. After holding on to them for a while,
I think it is time to see the error of my ways
and simplify that stuff.
Installing `mosh` via Homebrew may change `/usr/local/bin/python` to
Python 2. Adds docs to check and fix this so that `pyyaml` and `jinja2`
can be installed.
Most parameters used by the Jinja template for the cards
can now be specified in settings.yaml. This should make
the generation of cards for admin training much easier.
This helps when the customer's internet connection filters out
the default port range. It still requires having a port range
open somewhere, though. Here we use 10000-10999, but this should
be adjusted if necessary.
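For reference, the port range is controlled by a kube-apiserver flag; a rough sketch of how to check it on a kubeadm-style cluster (the remap script may proceed differently):
```
# The flag in question, to be added to the "command" list of the static pod
# manifest /etc/kubernetes/manifests/kube-apiserver.yaml:
#     --service-node-port-range=10000-10999
# Quick sanity check that it took effect:
kubectl create deployment probe --image=nginx
kubectl expose deployment probe --port=80 --type=NodePort
kubectl get svc probe   # the allocated NodePort should be within 10000-10999
```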
This introduces concepts more progressively (instead of
front-loading most of the theory before tackling first
useful commands). It was successfully tested at PyCon
and at a few 1-day engagements and works really well.
I'm now making it the official flow.
I'm also reformatting the YAML a little bit to facilitate
content shuffling.
Three changes:
- CSRs don't have expiry dates
- "-nodes" just means "no encryption"; it's not really specific to DES
- the cert comes from the controller, not the CSR
I'd like to use these YAML files without having to tell people
to explicitly check a specific branch. So I'm merging the YAML
files right away. I'm not merging the Markdown content so that
it can be reviewed further.
Includes a simplified demo using Google OAuth Playground,
as well as numerous examples aiming at piercing the veil
to explain JWT, JWS, and associated protocols and algos.
This pins to a specific version of Alpine to insulate against Alpine version bumps renaming packages (or changing the way they work like when `pip` got split out into a separate package) and uses `apk add --no-cache` instead of `apk update` to create a slightly smaller end result.
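For context, the difference boils down to this (package name is just an example):
```
# Before: updates the package index in its own layer, leaving /var/cache/apk behind
apk update && apk add py-pip

# After: fetches the index on the fly and doesn't store it in the image
apk add --no-cache py-pip
```
(The version pinning itself happens in the Dockerfile's `FROM` line, e.g. a specific `alpine:3.x` tag instead of `alpine:latest`.)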
Indicate clearly if we expect people to deploy
Prometheus or not. Explain better what the Helm
deployment does. Add a conclusion slide about
Grafana dashboards.
Prometheus deployment with Helm now correctly
stores Helm files in ~docker instead of ~ubuntu.
This is a rather convoluted example, showing step by
step how to build a system where each user gets a
ServiceAccount and token with limited access, and
can use this token to submit a CSR that will give
them a short-lived certificate.
Even if this is not a 100% realistic scenario,
the general idea (using a "long-term" password
or token to obtain a "short-term" token) is used
by many other systems, so it makes sense to get
acquainted with the various moving parts.
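A very condensed sketch of the moving parts (names and the CSR manifest are placeholders; the actual exercise walks through each step in detail):
```
# "Long-term" credential: a ServiceAccount and its token
kubectl create serviceaccount alice
TOKEN=$(kubectl create token alice)   # on older clusters: read the SA's Secret instead

# Generate a key pair and a certificate request for user "alice"
openssl req -newkey rsa:2048 -nodes -keyout alice.key -subj /CN=alice -out alice.csr

# Submit the CSR using only the ServiceAccount token, then approve it as an admin
kubectl --token=$TOKEN apply -f alice-csr.yaml    # wraps alice.csr in a CertificateSigningRequest
kubectl certificate approve alice

# Retrieve the short-lived certificate
kubectl get csr alice -o jsonpath='{.status.certificate}' | base64 -d > alice.crt
```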
In a few places, we were using 'Persistent Volume' the
wrong way. This was fixed.
Also added a whole chapter showing how to use local
persistent volumes, with an actually persistent
Consul cluster.
Use Replicated Ship to generate the base and overlays
from the kubercoins GitHub repo.
The namespaces chapter has been slightly tweaked so
that we can use it for either Helm or Kustomize demo.
This will be used for kubernetes admin labs, to upgrade
an existing cluster. In order to be able to perform an
upgrade, we need a cluster running an older version.
This is a new thing in Kubernetes 1.14. Added some details
about it (TL;DR: it helps with cluster scalability but you
don't even have to know/care about it).
This will use a more recent Debian-based image, instead of the
older alpine image. It also sets a couple of env vars to
avoid spurious messages. And it removes a lot of defaults
and useless parameters to make the YAML file more readable.
Slight grammatical adjustments. If you wanted to say "an etcd instance" that works, but "an etcd" doesn't parse correctly. And for "allows to use" we have to say who's allowed - "one" or "us" or "you".
1) When introducing "kubectl describe", we ask people to
look at "kubectl describe node node1", which shows
them a bunch of pods. This makes it easier to contrast
with the (empty) output of "kubectl get pods" later.
2) Then, instead of going straight to "-n kube-system",
we introduce "--all-namespaces" to show pods across
all namespaces. Of course we also mention "-n" and
we also explain when these flags can be used.
3) Finally, I rewrote the section about kube-public,
because it was misleading. It pointed at the Secret
in kube-public, but that Secret merely corresponds
to the token automatically created for the default
ServiceAccount in that namespace. Instead, it's
more relevant to look at the ConfigMap cluster-info,
which contains a kubeconfig data piece.
The last item gives us an opportunity to talk to the
API with curl, because that cluster-info ConfigMap is
a public resource.
In the Kubernetes courses, it takes a bit too long before we
reach the Kubernetes content. Furthermore, learning how to
scale with Compose is not super helpful. These changes
make it possible to switch between two course flows:
- show how to scale with Compose, then transition to k8s/Swarm
- do not show how to scale with Compose; jump to k8s/Swarm earlier
In the latter case, we still benchmark the speed of rng and
hasher, but we do it on Kubernetes (by running httping on
the ClusterIP of these services).
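Concretely, that benchmark looks something like this (service names are the ones from DockerCoins; port 80 is assumed):
```
RNG_IP=$(kubectl get svc rng -o jsonpath='{.spec.clusterIP}')
HASHER_IP=$(kubectl get svc hasher -o jsonpath='{.spec.clusterIP}')
httping -c 3 $RNG_IP      # latency of the rng service
httping -c 3 $HASHER_IP   # latency of the hasher service
```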
These changes will also make the whole DaemonSet
section optional, for shorter courses when we want to
simply scale the rng service without telling the bogus
explanation about entropy.
This is a concatenation of the files found in this directory:
https://github.com/kubernetes-incubator/metrics-server/tree/master/deploy/1.8%2B
... but with extra args added to the metrics server process,
to use InternalIP to contact the nodes, disable TLS cert validation
and reduce the polling interval to 5s.
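The extra arguments in question look roughly like this (a sketch of the patched container command, not the full manifest):
```
/metrics-server \
    --kubelet-preferred-address-types=InternalIP \
    --kubelet-insecure-tls \
    --metric-resolution=5s
```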
Now that we have this file here, we can refer to it in the deployment
scripts to create clusters that have metrics-server pre-installed.
When following all the instructions, the Helm Chart that
we create is buggy, and the app shows up but with a zero
hash rate. This explains why, and how to fix it.
Fixes #432
The slides didn't mention to clone the git repo containing
the Compose file for the ELK stack. This is now fixed.
Also, the version numbers were not all correctly set
in this Compose file. Also fixed.
This is a bunch of changes that I had staged, + a few
typo fixes after going through the deck to check its readiness.
There are no deep changes; just a few extra slides
(e.g. about Kata containers and gVisor, and about
service meshes) and typo fixes.
Consul 1.4 introduces Cloud auto-join, which finds the
IP addresses of the other nodes by querying an API (in
that case, the Kubernetes API).
This involves creating a service account and granting
permissions to list and get pods. It is a little bit
more complex, but it reuses previous notions (like RBAC)
so I like it better.
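For the curious, the two key pieces look roughly like this (label selector, namespace, and names are illustrative):
```
# Consul agent flag enabling Cloud auto-join through the Kubernetes API:
#     -retry-join 'provider=k8s label_selector="app=consul"'

# Minimal RBAC so the pod's ServiceAccount can discover its peers:
kubectl create role consul-auto-join --verb=get,list --resource=pods
kubectl create rolebinding consul-auto-join \
    --role=consul-auto-join --serviceaccount=default:consul
```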
This reformulates the section where we run DockerCoins
to better explain why we use images (and how they are
essential to the "ship" part of the action), and it
tells upfront that it will be possible to use images
from the Docker Hub (and skip altogether the part where
we run our own registry and build and push images).
It also reshuffles section headers a bit, because that
part had a handful of really small sections. Now we
have:
- Shipping images with a registry
- Running our application on Kubernetes
I think that's better.
It also paves the way to make the entire self-hosted
registry part optional.
We show how to change namespace by creating a new context, then
switching to the new context. It works, but it is very cumbersome.
Instead, let's just update the current context, and give some
details about when it's better to update the current context, and
when it is better to use different contexts and hop between them.
This supersedes #399.
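In practice, the two approaches look like this (context, cluster, user, and namespace names are examples):
```
# Cumbersome: create a dedicated context, then switch to it
kubectl config set-context blue --namespace=blue \
    --cluster=kubernetes --user=kubernetes-admin
kubectl config use-context blue

# Simpler: just update the namespace of the current context
kubectl config set-context --current --namespace=blue
```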
There was a bug in Kubernetes 1.12. It was fixed in 1.13.
Let's just mention the issue in one brief slide but not add
too much extra fluff about it.
This was a request by @abuisine, so I'm flagging him for review :-)
This section explains the challenges associated with self-hosting
the control plane; and segues into static pods. It also mentions
bootkube and the Pod Checkpointer. There is an exercise showing
how to run a static pod.
In index.yaml, the date can now be specified as a range. For instance,
instead of:
date: 2018-11-28
We can use:
date: [2018-11-28, 2018-12-05]
For now, only the start date is shown (so the event still appears
as happening on 2018-11-28 in that example), but it will be considered
"current" (and show up in the list of "coming soon" events) until
the end date.
This way, when updating the content during a multi-day event, the
event stays in the top list and is not pushed to the "past events"
section.
Single-day events can still use the old syntax, of course.
The old version was using a slightly confusing way to
show which pods were receiving traffic:
kubectl logs --tail 1 --selector app=rng
(And then we look at the timestamp of the last request.)
In this new version, concepts are introduced progressively;
the YAML parser magic is isolated from the other concerns;
we show the impact of removing a pod from load balancing
in a way that is (IMHO) more straightforward:
- follow logs of specific pod
- remove pod from load balancer
- logs instantly stop flowing
These slides also explain why the DaemonSet and the
ReplicaSet for the rng service don't step on each other's
toes.
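A sketch of the new flow (pod name and label are illustrative; the label actually used in the slides may differ):
```
# Follow the logs of one specific pod
kubectl logs -f pod/rng-xxxxxxxxxx-yyyyy

# In another terminal: remove the label matched by the service selector
kubectl label pod rng-xxxxxxxxxx-yyyyy app-

# New requests immediately stop showing up in the log stream above, since
# the pod is no longer an endpoint of the service. (The ReplicaSet will also
# create a replacement pod, since its selector no longer matches.)
```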
The last 5(ish) times I presented DockerCoins, I ended up
explaining it slightly differently. While the application
is building, I explain what it does and its architecture
(instead of watching the build and pointing out, 'oh look
there is ruby... and python...') and I found that it
worked better. It may also be better for shorter
workshops, because we can deliver useful information
while the app is building (instead of filling the time with
a tap-dancing show).
@bretfisher and @bridgetkromhout, do you like the new
flow for that section? If not, I can figure something
out so that we each have our own section here, but I
hope you will actually like this one better. :)
Committing straight to master since this file
is not used by @bridgetkromhout, and people use
that file by cloning the repo (so it has to be
merged in master for people to see it).
HASHTAG YOLO
We have images on the Docker Hub for the various components
of dockercoins. Let's add one slide explaining how to use that,
for people who would be lost or would have issues with their
registry, so that they can catch up.
In some cases, I would like Prometheus to be pre-installed (so that
it shows a bunch of metrics) without relying on people doing it (and
setting up Helm correctly). This patch makes it possible to run:
./workshopctl helmprom TAG
It will set up Helm with a proper service account, then deploy
the Prometheus chart, disabling the alert manager, persistence,
and assigning the Prometheus server to NodePort 30090.
This command is idempotent.
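Under the hood, this amounts to roughly the following (Helm 2-era syntax; chart values are from the stable/prometheus chart, and the actual script may differ in the details):
```
# Set up Tiller with a service account
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller \
    --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller

# Deploy the Prometheus chart without alertmanager or persistence,
# exposing the server on NodePort 30090
helm install stable/prometheus --name prometheus \
    --set alertmanager.enabled=false \
    --set server.persistentVolume.enabled=false \
    --set server.service.type=NodePort \
    --set server.service.nodePort=30090
```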
kubectl run is being deprecated as a multi-purpose tool.
This PR replaces 'kubectl run' with 'kubectl create deployment'
in most places (except in the very first example, to reduce the
cognitive load; and when we really want a single-shot container).
It also updates the places where we use a 'run' label, since
'kubectl create deployment' uses the 'app' label instead.
NOTE: this hasn't gone through end-to-end testing yet.
For each concept that is present in the full-length tutorial,
I added a link to the corresponding chapter in the final section,
so that people who liked the short version can get similarly
presented info from the longer version.
kube-fullday is now suitable for one-day tutorials
kube-twodays is now suitable for two-day tutorials
I also tweaked the files (added a couple of line breaks) so that line
numbers would be aligned on all kube-...yml files.
This adds a few features:
- ./workshopctl kubereset TAG (closes #306)
- remove python-setuptools (prepare for #353)
- ./workshopctl weavetest TAG (helps detect Weave issues
like the ones we had at OSCON, July 2018)
- remove a bit of dead code
kubectl and kubens are added as kctl and kns (to avoid clashing with
completion for kubectl). Their completion is added too (so you can
do 'kns kube-sy[TAB]' to switch to kube-system).
kube_ps1 is added and enabled. The default prompt for the docker
user now shows the current context and namespace.
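A minimal sketch of the idea (paths are illustrative; the completion wiring for the new names is not shown here):
```
# Short names, so they don't interfere with completion for "kubectl" itself
alias kctl=kubectl
alias kns=kubens

# kube-ps1 shows the current context and namespace in the prompt
source /opt/kube-ps1/kube-ps1.sh
PS1='$(kube_ps1) \$ '
```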
This makes it possible to manage groups of VMs across multiple infrastructure
providers. It also adds support to create groups of VMs on OpenStack.
WARNING: the syntax of workshopctl has changed slightly. Check READMEs
for details.
ElasticSearch slowly uses up to 2GB of RAM.
Eventually, on instances provisioned with
only 4GB of RAM and without swap, if more
than one ElasticSearch pod ends up on the
same instance, it will cause the instance
to slow down and ultimately crash. Instead,
we now use a tiny Go web server that shows
its environment in JSON. It still highlights
that multiple backends are serving requests
but without the memory usage issue.
I've dispatched the new content so that the fullday training
(actually two days, don't let the file name distract you)
is broken down into 8 chapters of approximately equal length,
where the most complex content is preferably located at the
end of the chapter (to allow people to catch up and ask questions
during breaks), plus one chapter with the what's next / links / thank you
slides.
In this section, we set up Portworx to have a dynamic provisioner.
Then we use it to deploy a PostgreSQL Stateful Set.
Finally we simulate a node failure and observe the failover.
- explain the reason why we have stateful sets
- explain the relationship between volumes, persistent volumes,
persistent volume claims, volume claim templates
- show how to run a Consul cluster with a stateful set
- volumes (general overview)
- building with the docker engine (bind-mounting the docker socket)
- building with kaniko (and init containers)
- managing configuration (configmaps, downward api)
Also added a new-content.yml file with just the new content
(for easier review), containing my plans for future chapters.
The commands to install Stern and Helm aren't super exciting,
so let's pre-install these tools. That way, we also generate
completion for them. We still give installation instructions
just in case, but this saves time for more important stuff.
This was discussed and agreed in #246. It will probably break a few
outstanding PRs as well as a few external links, but it's for the
greater good in the long term.
We don't always need to track slides, switch desktops, and open links.
(These things are not necessary when we're purely testing the labs.)
All these features are now behind boolean flags saved in the state file.
@abuisine ran through the whole deck recently, taking the long route each time it was possible; and he noticed that another field had to be removed when transforming the Deployment into a DaemonSet.
Thanks to @abuisine for reminding me that Heapster is going through a deprecation cycle.
I'm also expanding these two slides to be a bit more useful and relevant.
Added needed single quotes. I've also moved `nginx` to the end of the line, to follow a more consistent syntax (`options` before `name|id`).
```
Usage: docker inspect [OPTIONS] NAME|ID [NAME|ID...]
Return low-level information on Docker objects
Options:
-f, --format string Format the output using the given Go template
-s, --size Display total file sizes if the type is container
--type string Return JSON for specified type
```
The section about namespaces and cgroups is very thorough,
but we also need something showing how to practically
limit container resource usage without diving into a very
deep technical chapter.
The events are now listed in index.yaml, and generated
with index.py. The latter is called automatically by
build.sh.
The list of events has been slightly improved:
- we only show the last 5 past events
- video recordings now get a section of their own
count-slides.py will count the number of slides per section,
and compute the size of each chapter as well. It is not perfect
(for instance, it assumes that excluded_classes=in_person)
but it should help to assess the size of the content before
delivering long workshops.
We were using 'kubectl apply' with a YAML snippet.
It's valid, but it's quite convoluted. Instead,
let's use 'kubectl create namespace'. We can still
mention the other method of course.
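For comparison (namespace name is an example):
```
# Convoluted: apply a YAML snippet
kubectl apply -f- <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: blue
EOF

# Much simpler:
kubectl create namespace blue
```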
We have two extra variables in the slides:
@@GITREPO@@ (current value: github.com/jpetazzo/container.training)
@@SLIDES@@ (current value: http://container.training/)
These variables are set with gitrepo and slides in the YAML files.
(Just like the chat variable.)
Supersedes #256
This adds a few realistic examples of label usage.
It also adds explanations about why deploying a new
version of the worker doesn't seem to be effective
immediately (the worker doesn't handle signals).
The current version of the logistics.md slide shows AJ and JP.
The new version is an obvious template, i.e. it says 'this slide
should be customized' and it uses imaginary personas instead.
This is a very high-level overview (we can't cover a lot within the current time constraints) but it gives a primer about network policies and a few links to explore further.
In these chapters, we:
- show how to install Helm
- run the Helm tiller on our cluster
- use Helm to install Prometheus
- don't do anything fancy with
Prometheus (it's just for the
sake of installing something)
- create a basic Helm chart for
DockerCoins
- explain namespace concepts
- show how to use contexts to hop
between namespaces
- use Helm to deploy DockerCoins
to a new namespace
These two chapters go together.
Explain the purpose of centralized logging. Describe the
EFK stack. Deploy a simplified EFK stack through a YAML
file. Use it to view container logs. Profit.
We had two open-ended exercises (questions without
answers). We have added more explanations, as well
as solutions for the exercises. It lets us show a
few more tricks with selectors, and how to apply
changes to sets of resources.
Bump up Compose and Machine to latest versions.
Bump down Engine to stable branch.
I'm pushing straight to master because YOLO^W^W
because @bridgetkromhout is using the kube101.yaml
file anyway, so this shouldn't break her things.
(Famous last words...)
Explain better that the control plane can run outside
of the cluster, and that the word master can be
confusing (does it designate the control plane, or
the node running the control plane? What if there is
no node running the control plane, because the control
plane is external?)
The kubetest command used to say [SUCCESS] on completely
fresh nodes. Now we check the existence of the /tmp/node
file, as well as of the kubectl executable.
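The check itself is tiny; a sketch of the idea (the real command wires this through pssh and its reporting):
```
if [ -f /tmp/node ] && command -v kubectl >/dev/null; then
    echo "[SUCCESS]"
else
    echo "[FAILED] node is not provisioned yet"
fi
```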
This tags the first build with v0.1, allowing for a smoother, more
logical rollback. Also adds a slide explaining why to stay away
from latest. @kelseyhightower would be proud :-)
`trap ... ERR` does not automatically propagate to functions. Therefore,
our fancy error-reporting mechanism did not catch errors happening in
functions; and we do most of the actual work in functions. The solution
is to `set -E` or `set -o errtrace`.
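A minimal standalone repro of the problem (and of the fix):
```
#!/bin/bash
set -e
trap 'echo "ERROR near line $LINENO"' ERR

do_work() {
    false             # this failure happens inside a function...
    echo "never reached"
}

do_work               # ...so without errtrace, the ERR trap never runs:
                      # the script just exits silently because of set -e.

# The fix, placed near the top of the script:
#     set -E          # equivalent to: set -o errtrace
```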
Navigation now includes all slides and all snippets.
ENTER skips to the next snippet, or executes the
selected snippet.
More improvements to come: allow SPACE to navigate
step by step through slides and snippets, executing
the snippets.
While you wrote:
`--detach=false` does not complete *faster*. It just *doesn't wait* for completion.
I think you actually meant:
`--detach=TRUE` does not complete *faster*. It just *doesn't wait* for completion.
- added self-paced manifest for intro workshop
- refactored intro content to work for all workshops
- fixed footnote CSS to work in all contexts
- intro+logistics content is good to go
We now have two clean classes: btp-manual and btp-auto
Exclude one or the other depending on whether you want to
show the full process of building and pushing images, or
if you want to shortcut and use stack files directly.
- [ ] Create event-named branch (such as `conferenceYYYY`) in the [main repo](https://github.com/jpetazzo/container.training/)
- [ ] Create file `slides/_redirects` containing a link to the desired tutorial: `/ /kube-halfday.yml.html 200`
- [ ] Push local branch to GitHub and merge into main repo
- [ ] [Netlify setup](https://app.netlify.com/sites/container-training/settings/domain): create subdomain for event-named branch
- [ ] Add link to event-named branch to [container.training front page](https://github.com/jpetazzo/container.training/blob/master/slides/index.html)
- [ ] Update the slides that say which versions we are using for [kube](https://github.com/jpetazzo/container.training/blob/master/slides/kube/versions-k8s.md) or [swarm](https://github.com/jpetazzo/container.training/blob/master/slides/swarm/versions.md) workshops
- [ ] Update the version of Compose and Machine in [settings](https://github.com/jpetazzo/container.training/tree/master/prepare-vms/settings)
- [ ] (optional) Create chatroom
- [ ] (optional) Set chatroom in YML ([kube half-day example](https://github.com/jpetazzo/container.training/blob/master/slides/kube-halfday.yml#L6-L8)) and deploy
- [ ] (optional) Put chat link on [container.training front page](https://github.com/jpetazzo/container.training/blob/master/slides/index.html)
- [ ] How many VMs do we need? Check with event organizers ahead of time
- [ ] Provision VMs (slightly more than we think we'll need)
- [ ] Change password on presenter's VMs (to forestall any hijinx)
- [ ] Onsite: walk the room to count seats, check power supplies, lectern, A/V setup
- [ ] Print cards
- [ ] Cut cards
- [ ] Last-minute merge from master
- [ ] Check that all looks good
- [ ] DELIVER!
- [ ] Shut down VMs
- [ ] Update index.html to remove chat link and move session to past things
- Contributed scripts to automate the creation of local environments.
These could use some help to test/check that they work.
- [prepare-vms](prepare-vms/):
- Scripts to automate the creation of AWS instances for students.
These are routinely used and actively maintained.
- [slides](slides/):
- All the slides! They are assembled from Markdown files with
a custom Python script, and then rendered using [gnab/remark](
https://github.com/gnab/remark). Check this directory for more details.
- [stacks](stacks/):
- A handful of Compose files (version 3) that make it easy to
deploy complex application stacks.
### Using Docker Machine to create your own cluster

This method requires a bit more work to get started, but you get a permanent
cluster, with fewer limitations.

You will need Docker Machine (if you have Docker Mac, Docker Windows, or
the Docker Toolbox, you're all set already). You will also need:

- credentials for a cloud provider (e.g. API keys or tokens),
- or a local install of VirtualBox or VMware (or anything supported
  by Docker Machine).

Full instructions are in the [prepare-machine](prepare-machine) subdirectory.

### Using our scripts to mass-create a bunch of clusters

Since we often deliver the workshop during conferences or similar events,
we have scripts to automate the creation of a bunch of clusters using
AWS EC2. If you want to create multiple clusters and have EC2 credits,
check the [prepare-vms](prepare-vms) directory for more information.

## Course structure

(This applies only for the orchestration workshops.)

### DockerCoins

The workshop introduces a demo app, "DockerCoins," built
around a micro-services architecture. First, we run it
on a single node, using Docker Compose. Then, we pretend
that we need to scale it, and we use an orchestrator
(SwarmKit or Kubernetes) to deploy and scale the app on
a cluster.

We explain the concepts of the orchestrator. For SwarmKit,
we set up the cluster with `docker swarm init` and `docker swarm join`.
For Kubernetes, we use pre-configured clusters.

Then, we cover more advanced concepts: scaling, load balancing,
updates, global services or daemon sets.

There are a number of advanced optional chapters about
logging, metrics, secrets, network encryption, etc.

The content is very modular: it is broken down into a large
number of Markdown files, which are put together according
to a YAML manifest. This makes it very easy to re-use content
between different workshops.
## How This Repo is Organized
- **dockercoins**
- Sample App: compose files and source code for the dockercoins sample apps
used throughout the workshop
- **docs**
- Slide Deck: presentation slide deck, works out-of-box with GitHub Pages,
uses https://remarkjs.com
- **prepare-local**
- untested scripts for automating the creation of local VirtualBox VMs
(could use your help validating)
- **prepare-machine**
- instructions explaining how to use Docker Machine to create VMs
- **prepare-vms**
- scripts for automating the creation of AWS instances for students
## Slide Deck
- The slides are in the `docs` directory.
- To view them locally open `docs/index.html` in your browser. It works
offline too.
- To view them online open https://jpetazzo.github.io/orchestration-workshop/
in your browser.
- When you fork this repo, be sure GitHub Pages is enabled in repo Settings
for "master branch /docs folder" and you'll have your own website for them.
- They use https://remarkjs.com to allow simple markdown in an HTML file that
remark will transform into a presentation in the browser.
## Sample App: Dockercoins!
The sample app is in the `dockercoins` directory.
It's used during all chapters
for explaining different concepts of orchestration.
To see it in action:
- the web UI will be available on port 8000
*If you just want to run the workshop for yourself, you can stop reading
here. If you want to deliver the workshop for others (i.e. if you
want to become an instructor), keep reading!*
## Running the Workshop
If you want to deliver one of these workshops yourself,
this section is for you!
> *This section has been mostly contributed by
> [Bret Fisher](https://twitter.com/bretfisher), who was
> one of the first persons to have the bravery of delivering
> this workshop without me. Thanks Bret! 🍻
>
> Jérôme.*
### General timeline of planning a workshop
understand the different `dockercoins` repos and the steps we go through to
get to a full Swarm Mode cluster of many containers. You'll update the first
few slides and last slide at a minimum, with your info.
- ~~Your docs directory can use GitHub Pages.~~
- This workshop expects 5 servers per student. You can get away with as few
as 2 servers per student, but you'll need to change the slide deck to
accommodate. More servers = more fun.
they need for class.
- Typically you create the servers the day before or morning of workshop, and
leave them up the rest of day after workshop. If creating hundreds of servers,
you'll likely want to run all these `workshopctl` commands from a dedicated
instance you have in the same region as the instances you want to create. It is much
faster this way if you're on a poor internet connection. Also, create 2 sets of servers
for yourself: use one during the workshop, and keep the 2nd as a backup.
- Remember you'll need to print the "cards" for students, so you'll need to
create instances while you have a way to print them.
### Things That Could Go Wrong
- Creating AWS instances ahead of time, and you hit its limits in region and
locked-down computer, host firewall, etc.
- Horrible wifi, or ssh port TCP/22 not open on network! If wifi sucks you
can try using MOSH https://mosh.org which handles SSH over UDP. TMUX can also
prevent you from losing your place if you get disconnected from servers.
https://tmux.github.io
- Forget to print "cards" and cut them up for handing out IPs.
- Forget to have fun and focus on your students!
### Creating the VMs
`prepare-vms/workshopctl` is the script that gets you most of what you need for
setting up instances. See
[prepare-vms/README.md](prepare-vms)
for all the info on tools and scripts.
### Content for Different Workshop Durations
With all the slides, this workshop is a full day long. If you need to deliver
can replace `---` with `???` which will hide slides. Or leave them there and
add something like `(EXTRA CREDIT)` to title so students can still view the
content but you also know to skip during presentation.
#### 3 Hour Version
- Limit time on debug tools, maybe skip a few. *"Chapter 1:
- Mention what DAB's are, but make this part optional in case you run out
of time
#### 2 Hour Version
- Skip all the above, and:
- Last 15-30 minutes is for stateful services, DAB files, and questions.
### Pre-built images
There are pre-built images for the 4 components of the DockerCoins demo app: `dockercoins/hasher:v0.1`, `dockercoins/rng:v0.1`, `dockercoins/webui:v0.1`, and `dockercoins/worker:v0.1`. They correspond to the code in this repository.
There are also three variants, for demo purposes:
- `dockercoins/rng:v0.2` is broken (the server won't even start),
- `dockercoins/webui:v0.2` has bigger font on the Y axis and a green graph (instead of blue),
- `dockercoins/worker:v0.2` is 11x slower than `v0.1`.
## Past events
Since its inception, this workshop has been delivered dozens of times,
If there is a bug and you can't fix it, but you can
reproduce it: submit an issue explaining how to reproduce.
If there is a bug and you can't even reproduce it:
sorry. It is probably a Heisenbug. We can't act on it
until it's reproducible, alas.
If you have attended this workshop and have feedback,
or if you want us to deliver that workshop at your
conference or for your company: contact me (jerome
at docker dot com).
Thank you!
# “Please teach us!”
If you have attended one of these workshops, and want
your team or organization to attend a similar one, you
can look at the list of upcoming events on
http://container.training/.
You are also welcome to reuse these materials to run
your own workshop, for your team or even at a meetup
or conference. In that case, you might enjoy watching
# This is mirrored from https://github.com/upmc-enterprises/elasticsearch-operator/blob/master/example/controller.yaml but using the elasticsearch-operator namespace instead of operator
- [Parallel SSH](https://code.google.com/archive/p/parallel-ssh/) (on a Mac: `brew install pssh`)
Depending on the infrastructure that you want to use, you also need to install
the Azure CLI, the AWS CLI, or terraform (for OpenStack deployment).
And if you want to generate printable cards:
- [pyyaml](https://pypi.python.org/pypi/PyYAML)
- [jinja2](https://pypi.python.org/pypi/Jinja2)
You can install them with pip (perhaps with `pip install --user`, or even use `virtualenv` if that's your thing).
These require Python 3. If you are on a Mac, see below for specific instructions on setting up
Python 3 to be the default Python. In particular, if you installed `mosh`, Homebrew
may have changed your default Python to Python 2.
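For instance, using the `--user` scheme mentioned above:
```
python3 -m pip install --user pyyaml jinja2
```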
## General Workflow
- fork/clone repo
- set required environment variables for AWS
- create an infrastructure configuration in the `prepare-vms/infra` directory
(using one of the example files in that directory)
- create your own setting file from `settings/example.yaml`
- if necessary, increase allowed open files: `ulimit -Sn 10000`
- run `./workshopctl start` to create instances
- run `./workshopctl deploy` to install Docker and setup environment
- run `./workshopctl kube` (if you want to install and setup Kubernetes)
- run `./workshopctl cards` (if you want to generate PDF for printing handouts of each user's host IPs and login info)
- run `./workshopctl stop` at the end of the workshop to terminate instances
## Clone/Fork the Repo, and Build the Tools Image
The Docker Compose file here is used to build an image with all the dependencies to run the `./workshopctl` commands and optional tools. Each run of the script will check if you have those dependencies locally on your host, and will only use the container if you're [missing a dependency](workshopctl#L5).
- Initial assumptions are you're using a root account. If you'd like to use an IAM user, it will need `AmazonEC2FullAccess` and `IAMReadOnlyAccess`.
- Using a non-default VPC or Security Group isn't supported out of box yet, so you will have to customize `lib/commands.sh` if you want to change that.
- These instances will assign the default VPC Security Group, which does not open any ports from the Internet by default. So you'll need to add Inbound rules for `SSH | TCP | 22 | 0.0.0.0/0` and `Custom TCP Rule | TCP | 8000 - 8002 | 0.0.0.0/0`, or run `./workshopctl opensg` which opens up all ports.
### Required Environment Variables
- `AWS_ACCESS_KEY_ID`
- `AWS_SECRET_ACCESS_KEY`
- `AWS_DEFAULT_REGION`
### Create your `infra` file
Make a copy of one of the example files in the `infra` directory.
You need to do this only once. (On AWS, you can create one `infra`
file per region.)
### Update/copy `settings/example.yaml`
Then pass `settings/YOUR_WORKSHOP_NAME-settings.yaml` as an argument to `trainer deploy`, `trainer cards`, etc.
test Run tests (pre-flight checks) on a group of VMs
weavetest Check that weave seems properly setup
webssh Install a WEB SSH server on the machines (port 1080)
wrap Run this program in a container
www Run a web server to access card HTML and PDF
```
### Summary of What `./workshopctl` Does For You
- Used to manage bulk AWS instances for you without needing to use the AWS CLI or GUI.
- Can manage multiple "tags" or groups of instances, which are tracked in `prepare-vms/tags/`
- Can also create PDF/HTML for printing student info with instance IPs and logins.
- The `./workshopctl` script can be executed directly.
- It will run locally if all its dependencies are fulfilled; otherwise it will run in the Docker container you created with `docker-compose build` (preparevms_prepare-vms).
- During `start` it will add your default local SSH key to all instances under the `ubuntu` user.
- During `deploy` it will create the `docker` user with password `training`, which is printed on the cards for students. This can be configured with the `docker_user_password` property in the settings file.
### Example Steps to Launch a group of AWS Instances for a Workshop
- Export the environment variables needed by the AWS CLI (see **Required Environment Variables** above)
- Run `./trainer start N` to create `N` EC2 instances
- Your local SSH key will be synced to instances under `ubuntu` user
- AWS instances will be created and tagged based on date, and IP's stored in `prepare-vms/tags/`
- Run `./workshopctl deploy TAG` to run `lib/postprep.py` via parallel-ssh
- If it errors or times out, you should be able to rerun
- Requires a good connection to run all the parallel SSH connections, up to 100 in parallel (ProTip: create a dedicated management instance in the same AWS region, and run all these utilities from there)
- Run `./workshopctl pull_images TAG` to pre-pull a bunch of Docker images to the instances
- Run `./workshopctl cards TAG` to generate PDF/HTML files to print, cut, and hand out to students
- *Have a great workshop*
- Run `./workshopctl stop TAG` to terminate instances.
## Other Tools
### Deploying your SSH key to all the machines
- Make sure that you have SSH keys loaded (`ssh-add -l`).
- Source `rc`.
- Run `pcopykey`.
### Installing extra packages
- Source `postprep.rc`.
(This will install a few extra packages, add entries to
/etc/hosts, generate SSH keys, and deploy them on all hosts.)
### Example Steps to Launch Azure Instances
- Install the [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) and authenticate with a valid account (`az login`)
- Customize `azuredeploy.parameters.json`
  - Required:
    - Provide the SSH public key you plan to use for instance configuration
  - Optional:
    - Choose a name for the workshop (default is "workshop")
    - Choose the number of instances (default is 3)
    - Customize the desired instance size (default is Standard_D1_v2)
- Launch instances with your chosen resource group name and your preferred region; the examples are "workshop" and "eastus":
```
az group create --name workshop --location eastus
az group deployment create --resource-group workshop --template-file azuredeploy.json --parameters @azuredeploy.parameters.json
```
The `az group deployment create` command can take several minutes and will only say `- Running ..` until it completes, unless you increase the verbosity with `--verbose` or `--debug`.
To display the IPs of the instances you've launched:
```
az vm list-ip-addresses --resource-group workshop --output table
```
If you want to put the IPs into `prepare-vms/tags/<tag>/ips.txt` for a tag of "myworkshop":
1) If you haven't yet installed `jq` and/or created your event's tags directory in `prepare-vms`:
```
brew install jq
mkdir -p tags/myworkshop
```
2) And then generate the IP list:
```
az vm list-ip-addresses --resource-group workshop --output json | jq -r '.[].virtualMachine.network.publicIpAddresses[].ipAddress' > tags/myworkshop/ips.txt
```
After the workshop is over, remove the instances:
```
az group delete --resource-group workshop
```
### Example Steps to Configure Instances from a non-AWS Source
- Copy `infra/example.generic` to `infra/generic`
- Run `./workshopctl start --infra infra/generic --settings settings/...yaml`
- Note the `prepare-vms/tags/TAG/` path that has been auto-created.
- Launch instances via your preferred method. You'll need to get the instance IPs and be able to SSH into them.
- Edit the file `prepare-vms/tags/TAG/ips.txt`; it should list the IP addresses of the VMs (one per line, without any comments or other info)
- Continue deployment of cluster configuration with `./workshopctl deploy TAG`
- Optionally, configure Kubernetes clusters of the size in the settings: `workshopctl kube TAG`
- Optionally, test your Kubernetes clusters. They may take a little time to become ready: `workshopctl kubetest TAG`
- Generate cards to print and hand out: `workshopctl cards TAG`
- Print the cards file: `prepare-vms/tags/TAG/ips.html`
## Even More Details
To see which local key will be uploaded, run `ssh-add -l | grep RSA`.
#### Instance + tag creation
The VMs will be started, with an automatically generated tag (timestamp + your username).
Your SSH key will be added to the `authorized_keys` of the ubuntu user.
Following the creation of the VMs, a text file will be created containing a list of their IPs.
This ips.txt file will be created in the $TAG/ directory and a symlink will be placed in the working directory of the script.
If you create new VMs, the symlinked file will be overwritten.
#### Deployment
Instances can be deployed manually using the `deploy` command:
$ ./workshopctl deploy TAG
The `postprep.py` file will be copied via parallel-ssh to all of the VMs and executed.
#### Pre-pull images
$ ./workshopctl pull_images TAG
#### Generate cards
$ ./workshopctl cards TAG
If you don't have `wkhtmltopdf` installed, you will get a warning that it is a missing dependency. If you plan to just print the HTML cards, you can ignore this.
#### List tags
$ ./workshopctl list infra/some-infra-file
#### List VMs
$ ./workshopctl listall
This will print a human-friendly list containing some information about each instance.
$ ./workshopctl tags
#### Stop and destroy VMs
$ ./workshopctl stop TAG
## ToDo
- Don't write to bash history in system() in postprep
- compose, etc version inconsistent (int vs str)
## Making sure Python3 is the default (Mac only)
Check the `/usr/local/bin/python` symlink. It should be pointing to
`/usr/local/Cellar/python/3`-something. If it isn't, follow these
instructions.
1) Verify that Python 3 is installed.
```
ls -la /usr/local/Cellar/Python
```
You should see one or more versions of Python 3. If you don't,
install it with `brew install python`.
2) Verify that `python` points to Python3.
```
ls -la /usr/local/bin/python
```
If this points to `/usr/local/Cellar/python@2`, then we'll need to change it.
ExecStart=/usr/bin/websocketd --port=1088 --staticdir=. sh -c \"tail -n +1 -f /home/docker/.history || echo 'Could not read history file. Perhaps you need to \\\"chmod +r .history\\\"?'\"
User=nobody
Group=nogroup
Restart=always
EOF
sudo systemctl enable /root/tailhist.service
sudo systemctl start tailhist"
pssh -I sudo tee /tmp/tailhist/index.html <lib/tailhist.html
}
_cmd opensg "Open the default security group to ALL ingress traffic"
_cmd_opensg() {
    need_infra $1
    infra_opensg
}

_cmd portworx "Prepare the nodes for Portworx deployment"
_cmd_portworx() {
    TAG=$1
    need_tag
    pssh "
    sudo truncate --size 10G /portworx.blk &&
    sudo losetup /dev/loop4 /portworx.blk"
}

_cmd disableaddrchecks "Disable source/destination IP address checks"
_cmd_disableaddrchecks() {
    TAG=$1
    need_tag
    infra_disableaddrchecks
}

_cmd pssh "Run an arbitrary command on all nodes"
_cmd_pssh() {
    TAG=$1
    need_tag
    shift
    pssh "$@"
}

_cmd pull_images "Pre-pull a bunch of Docker images"
_cmd_pull_images() {
    TAG=$1
    need_tag
    pull_tag
}

_cmd remap_nodeports "Remap NodePort range to 10000-10999"
        || die "The output of \`ssh-add -l\` doesn't contain 'RSA'. Start the agent, add your keys?"
    AWS_KEY_NAME=$(make_key_name)
    info "Syncing keys... "
    if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &>/dev/null; then
        aws ec2 import-key-pair --key-name $AWS_KEY_NAME \
            --public-key-material "$(ssh-add -L \
                | grep -i RSA \
                | head -n1 \
                | cut -d " " -f 1-2)" &>/dev/null
        if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &>/dev/null; then
            die "Somehow, importing the key didn't work. Make sure that 'ssh-add -l | grep RSA | head -n1' returns an RSA key?"
        else
            info "Imported new key $AWS_KEY_NAME."
        fi
    else
        info "Using existing key $AWS_KEY_NAME."
    fi
}