Merge the big 2017 refactor

.gitignore (vendored): 2 changes

@@ -6,3 +6,5 @@ prepare-vms/ips.html
prepare-vms/ips.pdf
prepare-vms/settings.yaml
prepare-vms/tags
docs/*.yml.html
autotest/nextstep
README.md: 79 changes

@@ -8,10 +8,27 @@ non-stop since June 2015.

## Content

- Chapter 1: Getting Started: running apps with docker-compose
- Chapter 2: Scaling out with Swarm Mode
- Chapter 3: Operating the Swarm (networks, updates, logging, metrics)
- Chapter 4: Deeper in Swarm (stateful services, scripting, DABs)

The workshop introduces a demo app, "DockerCoins," built
around a microservices architecture. First, we run it
on a single node, using Docker Compose. Then, we pretend
that we need to scale it, and we use an orchestrator
(SwarmKit or Kubernetes) to deploy and scale the app on
a cluster.

We explain the concepts of the orchestrator. For SwarmKit,
we set up the cluster with `docker swarm init` and `docker swarm join`.
For Kubernetes, we use pre-configured clusters.

Then, we cover more advanced concepts: scaling, load balancing,
updates, global services or daemon sets.

There are a number of advanced optional chapters about
logging, metrics, secrets, network encryption, etc.

The content is very modular: it is broken down into a large
number of Markdown files that are put together according
to a YAML manifest. This makes it easy to re-use content
between different workshops.


## Quick start (or, "I want to try it!")

@@ -32,8 +49,8 @@ own cluster, we have multiple solutions for you!

### Using [play-with-docker](http://play-with-docker.com/)

-This method is very easy to get started (you don't need any extra account
-or resources!) but will require a bit of adaptation from the workshop slides.
+This method is very easy to get started with: you don't need any extra account
+or resources! It works only for the SwarmKit version of the workshop, though.

To get started, go to [play-with-docker](http://play-with-docker.com/), and
click on _ADD NEW INSTANCE_ five times. You will get five "docker-in-docker"

@@ -44,31 +61,9 @@ the tab corresponding to that node.

The nodes are not directly reachable from outside; so when the slides tell
you to "connect to the IP address of your node on port XYZ" you will have
-to use a different method.
-
-We suggest to use "supergrok", a container offering a NGINX+ngrok combo to
-expose your services. To use it, just start (on any of your nodes) the
-`jpetazzo/supergrok` image. The image will output further instructions:
-
-```
-docker run --name supergrok -d jpetazzo/supergrok
-docker logs --follow supergrok
-```
-
-The logs of the container will give you a tunnel address and explain you
-how to connected to exposed services. That's all you need to do!
-
-We are also working on a native proxy, embedded to Play-With-Docker.
-Stay tuned!
-
-<!--
-
-- You can use a proxy provided by Play-With-Docker. When the slides
-  instruct you to connect to nodeX on port ABC, instead, you will connect
-  to http://play-with-docker.com/XXX.XXX.XXX.XXX:ABC, where XXX.XXX.XXX.XXX
-  is the IP address of nodeX.
-
--->
+to use a different method: click on the port number that should appear on
+top of the play-with-docker window. This only works for HTTP services,
+though.

Note that the instances provided by Play-With-Docker have a short lifespan
(a few hours only), so if you want to do the workshop over multiple sessions,

@@ -119,14 +114,16 @@ check the [prepare-vms](prepare-vms) directory for more information.

## Slide Deck

- The slides are in the `docs` directory.
- To view them locally open `docs/index.html` in your browser. It works
  offline too.
- To view them online open https://jpetazzo.github.io/orchestration-workshop/
  in your browser.
- When you fork this repo, be sure GitHub Pages is enabled in repo Settings
  for "master branch /docs folder" and you'll have your own website for them.
- They use https://remarkjs.com to allow simple Markdown in an HTML file that
  remark will transform into a presentation in the browser.
- For each slide deck, there is a `.yml` file referencing `.md` files.
- The `.md` files contain Markdown snippets.
- When you run `build.sh once`, it will "compile" all the `.yml` files
  into `.yml.html` files that you can open in your browser.
- You can also run `build.sh forever`, which will watch the directory
  and rebuild slides automatically when files are modified.
- If needed, you can fine-tune `workshop.css` and `workshop.html`
  (respectively the CSS style used, and the boilerplate template).
- The slides use https://remarkjs.com to render Markdown into HTML in
  a web browser.

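As a quick sanity check of that slide toolchain, a minimal local build loop could look like the sketch below (it assumes a POSIX shell, a working `markmaker.py`, and the `entr` utility for watch mode; `dockercon.yml.html` is just one example of a generated deck):

```bash
# Build every .yml manifest in docs/ into a .yml.html slide deck, once
cd docs
./build.sh once

# Open one of the generated decks in a browser
open dockercon.yml.html      # macOS; on Linux use: xdg-open dockercon.yml.html

# Or keep rebuilding automatically whenever a .md or .yml file changes
./build.sh forever
```
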
## Sample App: Dockercoins!

@@ -181,7 +178,7 @@ want to become an instructor), keep reading!*

  they need for class.

- Typically you create the servers the day before or the morning of the workshop, and
  leave them up for the rest of the day after the workshop. If creating hundreds of servers,
- you'll likely want to run all these `trainer` commands from a dedicated
+ you'll likely want to run all these `workshopctl` commands from a dedicated
  instance you have in the same region as the instances you want to create. This is much
  faster if you're on a poor internet connection. Also, create two sets of servers for
  yourself: use one during the workshop and keep the second as a backup.

@@ -203,7 +200,7 @@ want to become an instructor), keep reading!*

### Creating the VMs

-`prepare-vms/trainer` is the script that gets you most of what you need for
+`prepare-vms/workshopctl` is the script that gets you most of what you need for
setting up instances. See
[prepare-vms/README.md](prepare-vms)
for all the info on tools and scripts.

@@ -1,15 +1,28 @@
#!/usr/bin/env python

import uuid
import logging
import os
import re
import signal
import subprocess
import sys
import time
import uuid


def print_snippet(snippet):
    print(78*'-')
    print(snippet)
    print(78*'-')
logging.basicConfig(level=logging.DEBUG)


TIMEOUT = 60 # 1 minute


def hrule():
    return "="*int(subprocess.check_output(["tput", "cols"]))

# A "snippet" is something that the user is supposed to do in the workshop.
# Most of the "snippets" are shell commands.
# Some of them can be key strokes or other actions.
# In the markdown source, they are the code sections (identified by triple-
# quotes) within .exercise[] sections.

class Snippet(object):

@@ -29,26 +42,22 @@ class Slide(object):

    def __init__(self, content):
        Slide.current_slide += 1
        self.number = Slide.current_slide

        # Remove commented-out slides
        # (remark.js considers ??? to be the separator for speaker notes)
        content = re.split("\n\?\?\?\n", content)[0]
        self.content = content

        self.snippets = []
        exercises = re.findall("\.exercise\[(.*)\]", content, re.DOTALL)
        for exercise in exercises:
            if "```" in exercise and "<br/>`" in exercise:
                print("! Exercise on slide {} has both ``` and <br/>` delimiters, skipping."
                      .format(self.number))
                print_snippet(exercise)
            elif "```" in exercise:
            if "```" in exercise:
                for snippet in exercise.split("```")[1::2]:
                    self.snippets.append(Snippet(self, snippet))
            elif "<br/>`" in exercise:
                for snippet in re.findall("<br/>`(.*)`", exercise):
                    self.snippets.append(Snippet(self, snippet))
            else:
                print("  Exercise on slide {} has neither ``` or <br/>` delimiters, skipping."
                      .format(self.number))
                logging.warning("Exercise on slide {} does not have any ``` snippet."
                                .format(self.number))
                self.debug()

    def __str__(self):
        text = self.content

@@ -56,136 +65,165 @@ class Slide(object):
            text = text.replace(snippet.content, ansi("7")(snippet.content))
        return text

    def debug(self):
        logging.debug("\n{}\n{}\n{}".format(hrule(), self.content, hrule()))


def ansi(code):
    return lambda s: "\x1b[{}m{}\x1b[0m".format(code, s)

slides = []
with open("index.html") as f:
    content = f.read()
for slide in re.split("\n---?\n", content):
    slides.append(Slide(slide))

is_editing_file = False
placeholders = {}
def wait_for_string(s):
    logging.debug("Waiting for string: {}".format(s))
    deadline = time.time() + TIMEOUT
    while time.time() < deadline:
        output = capture_pane()
        if s in output:
            return
        time.sleep(1)
    raise Exception("Timed out while waiting for {}!".format(s))


def wait_for_prompt():
    logging.debug("Waiting for prompt.")
    deadline = time.time() + TIMEOUT
    while time.time() < deadline:
        output = capture_pane()
        # If we are not at the bottom of the screen, there will be a bunch of extra \n's
        output = output.rstrip('\n')
        if output[-2:] == "\n$":
            return
        time.sleep(1)
    raise Exception("Timed out while waiting for prompt!")


def check_exit_status():
    token = uuid.uuid4().hex
    data = "echo {} $?\n".format(token)
    logging.debug("Sending {!r} to get exit status.".format(data))
    send_keys(data)
    time.sleep(0.5)
    wait_for_prompt()
    screen = capture_pane()
    status = re.findall("\n{} ([0-9]+)\n".format(token), screen, re.MULTILINE)
    logging.debug("Got exit status: {}.".format(status))
    if len(status) == 0:
        raise Exception("Couldn't retrieve status code {}. Timed out?".format(token))
    if len(status) > 1:
        raise Exception("More than one status code {}. I'm seeing double! Shoot them both.".format(token))
    code = int(status[0])
    if code != 0:
        raise Exception("Non-zero exit status: {}.".format(code))
    # Otherwise just return peacefully.


slides = []
content = open(sys.argv[1]).read()
for slide in re.split("\n---?\n", content):
    slides.append(Slide(slide))

actions = []
for slide in slides:
    for snippet in slide.snippets:
        content = snippet.content
        # Multi-line snippets should be ```highlightsyntax...
        # Single-line snippets will be interpreted as shell commands
        # Extract the "method" (e.g. bash, keys, ...)
        # On multi-line snippets, the method is alone on the first line
        # On single-line snippets, the data follows the method immediately
        if '\n' in content:
            highlight, content = content.split('\n', 1)
            method, data = content.split('\n', 1)
        else:
            highlight = "bash"
            content = content.strip()
        # If the previous snippet was a file fragment, and the current
        # snippet is not YAML or EDIT, complain.
        if is_editing_file and highlight not in ["yaml", "edit"]:
            print("! On slide {}, previous snippet was YAML, so what do what do?"
                  .format(slide.number))
            print_snippet(content)
            is_editing_file = False
        if highlight == "yaml":
            is_editing_file = True
        elif highlight == "placeholder":
            for line in content.split('\n'):
                variable, value = line.split(' ', 1)
                placeholders[variable] = value
        elif highlight == "bash":
            for variable, value in placeholders.items():
                quoted = "`{}`".format(variable)
                if quoted in content:
                    content = content.replace(quoted, value)
                    del placeholders[variable]
            if '`' in content:
                print("! The following snippet on slide {} contains a backtick:"
                      .format(slide.number))
                print_snippet(content)
                continue
            print("_ "+content)
            snippet.actions.append((highlight, content))
        elif highlight == "edit":
            print(". "+content)
            snippet.actions.append((highlight, content))
        elif highlight == "meta":
            print("^ "+content)
            snippet.actions.append((highlight, content))
        else:
            print("! Unknown highlight {!r} on slide {}.".format(highlight, slide.number))
if placeholders:
    print("! Remaining placeholder values: {}".format(placeholders))
        method, data = content.split(' ', 1)
        actions.append((slide, snippet, method, data))

actions = sum([snippet.actions for snippet in sum([slide.snippets for slide in slides], [])], [])

# Strip ^{ ... ^} for now
def strip_curly_braces(actions, in_braces=False):
    if actions == []:
        return []
    elif actions[0] == ("meta", "^{"):
        return strip_curly_braces(actions[1:], True)
    elif actions[0] == ("meta", "^}"):
        return strip_curly_braces(actions[1:], False)
    elif in_braces:
        return strip_curly_braces(actions[1:], True)
def send_keys(data):
    subprocess.check_call(["tmux", "send-keys", data])

def capture_pane():
    return subprocess.check_output(["tmux", "capture-pane", "-p"])


try:
    i = int(open("nextstep").read())
    logging.info("Loaded next step ({}) from file.".format(i))
except Exception as e:
    logging.warning("Could not read nextstep file ({}), initializing to 0.".format(e))
    i = 0

interactive = True

while i < len(actions):
    with open("nextstep", "w") as f:
        f.write(str(i))
    slide, snippet, method, data = actions[i]

    # Remove extra spaces (we don't want them in the terminal) and carriage returns
    data = data.strip()

    print(hrule())
    print(slide.content.replace(snippet.content, ansi(7)(snippet.content)))
    print(hrule())
    if interactive:
        print("[{}/{}] Shall we execute that snippet above?".format(i, len(actions)))
        print("(ENTER to execute, 'c' to continue until next error, N to jump to step #N)")
        command = raw_input("> ")
    else:
        return [actions[0]] + strip_curly_braces(actions[1:], False)
        command = ""

    actions = strip_curly_braces(actions)
    # For now, remove the `highlighted` sections
    # (Make sure to use $() in shell snippets!)
    if '`' in data:
        logging.info("Stripping ` from snippet.")
        data = data.replace('`', '')

    background = []
    cwd = os.path.expanduser("~")
    env = {}
    for current_action, next_action in zip(actions, actions[1:]+[("bash", "true")]):
        if current_action[0] == "meta":
            continue
        print(ansi(7)(">>> {}".format(current_action[1])))
        time.sleep(1)
        popen_options = dict(shell=True, cwd=cwd, stdin=subprocess.PIPE, preexec_fn=os.setpgrp)
        # The follow hack allows to capture the environment variables set by `docker-machine env`
        # FIXME: this doesn't handle `unset` for now
        if any([
            "eval $(docker-machine env" in current_action[1],
            "DOCKER_HOST" in current_action[1],
            "COMPOSE_FILE" in current_action[1],
        ]):
            popen_options["stdout"] = subprocess.PIPE
            current_action[1] += "\nenv"
        proc = subprocess.Popen(current_action[1], **popen_options)
        proc.cmd = current_action[1]
        if next_action[0] == "meta":
            print(">>> {}".format(next_action[1]))
            time.sleep(3)
            if next_action[1] == "^C":
                os.killpg(proc.pid, signal.SIGINT)
                proc.wait()
            elif next_action[1] == "^Z":
                # Let the process run
                background.append(proc)
            elif next_action[1] == "^D":
                proc.communicate()
                proc.wait()
    if command == "c":
        # continue until next timeout
        interactive = False
    elif command.isdigit():
        i = int(command)
    elif command == "":
        logging.info("Running with method {}: {}".format(method, data))
        if method == "keys":
            send_keys(data)
        elif method == "bash":
            # Make sure that we're ready
            wait_for_prompt()
            # Strip leading spaces
            data = re.sub("\n +", "\n", data)
            # Add "RETURN" at the end of the command :)
            data += "\n"
            # Send command
            send_keys(data)
            # Force a short sleep to avoid race condition
            time.sleep(0.5)
            _, _, next_method, next_data = actions[i+1]
            if next_method == "wait":
                wait_for_string(next_data)
            else:
                wait_for_prompt()
            # Verify return code FIXME should be optional
            check_exit_status()
        elif method == "copypaste":
            screen = capture_pane()
            matches = re.findall(data, screen, flags=re.DOTALL)
            if len(matches) == 0:
                raise Exception("Could not find regex {} in output.".format(data))
            # Arbitrarily get the most recent match
            match = matches[-1]
            # Remove line breaks (like a screen copy paste would do)
            match = match.replace('\n', '')
            send_keys(match + '\n')
            # FIXME: we should factor out the "bash" method
            wait_for_prompt()
            check_exit_status()
        else:
            print("! Unknown meta action {} after snippet:".format(next_action[1]))
            print_snippet(next_action[1])
            print(ansi(7)("<<< {}".format(current_action[1])))
        else:
            proc.wait()
            if "stdout" in popen_options:
                stdout, stderr = proc.communicate()
                for line in stdout.split('\n'):
                    if line.startswith("DOCKER_"):
                        variable, value = line.split('=', 1)
                        env[variable] = value
                        print("=== {}={}".format(variable, value))
            print(ansi(7)("<<< {} >>> {}".format(proc.returncode, current_action[1])))
            if proc.returncode != 0:
                print("Got non-zero status code; aborting.")
                break
        if current_action[1].startswith("cd "):
            cwd = os.path.expanduser(current_action[1][3:])
    for proc in background:
        print("Terminating background process:")
        print_snippet(proc.cmd)
        proc.terminate()
        proc.wait()
            logging.warning("Unknown method {}: {!r}".format(method, data))
        i += 1

    else:
        i += 1
        logging.warning("Unknown command {}, skipping to next step.".format(command))

# Reset slide counter
with open("nextstep", "w") as f:
    f.write(str(0))

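A hedged sketch of how this autotest script is meant to be driven (its file name is not shown in this diff, so `autotest.py` below is only a placeholder): it reads snippets from a compiled slide deck and types them into the currently active tmux pane with `tmux send-keys`, checkpointing its progress in `./nextstep`.

```bash
# Terminal 1: start a tmux session; the script will send keystrokes to its active pane
tmux new-session -s autotest

# Terminal 2 (same machine, same tmux server): run the script against a compiled deck
# "autotest.py" is a placeholder name; adjust the path to the deck you want to replay
python autotest.py ../docs/dockercon.yml.html
```
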
@@ -1 +0,0 @@
../www/htdocs/index.html

docs/TODO (new file): 8 lines

@@ -0,0 +1,8 @@
Black belt references that I want to add somewhere:

What Have Namespaces Done for You Lately?
https://www.youtube.com/watch?v=MHv6cWjvQjM&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=8

Cilium: Network and Application Security with BPF and XDP
https://www.youtube.com/watch?v=ilKlmTDdFgk&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=9

docs/aj-containers.jpeg (new binary file): 127 KiB
docs/apiscope.md (new file): 41 lines

@@ -0,0 +1,41 @@
## A reminder about *scope*

- Out of the box, Docker API access is "all or nothing"

- When someone has access to the Docker API, they can access *everything*

- If your developers are using the Docker API to deploy on the dev cluster ...

  ... and the dev cluster is the same as the prod cluster ...

  ... it means that your devs have access to your production data, passwords, etc.

- This can easily be avoided

---

## Fine-grained API access control

A few solutions, in increasing order of flexibility:

- Use separate clusters for different security perimeters

  (And different credentials for each cluster)

--

- Add an extra layer of abstraction (sudo scripts, hooks, or full-blown PAAS)

--

- Enable [authorization plugins]

  - each API request is vetted by your plugin(s)

  - by default, the *subject name* in the client TLS certificate is used as user name

  - example: [user and permission management] in [UCP]

[authorization plugins]: https://docs.docker.com/engine/extend/plugins_authorization/
[UCP]: https://docs.docker.com/datacenter/ucp/2.1/guides/
[user and permission management]: https://docs.docker.com/datacenter/ucp/2.1/guides/admin/manage-users/

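For reference, an authorization plugin is enabled with a daemon flag; a minimal sketch (the plugin name `my-authz-plugin` is only an example) looks like this:

```bash
# Enable an authorization plugin when starting the Docker Engine
dockerd --authorization-plugin=my-authz-plugin

# Or persistently, via /etc/docker/daemon.json:
#   { "authorization-plugins": ["my-authz-plugin"] }
```
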
docs/blackbelt.png (new binary file): 16 KiB
docs/build.sh (new executable file): 33 lines

@@ -0,0 +1,33 @@
#!/bin/sh
case "$1" in
once)
    for YAML in *.yml; do
        ./markmaker.py < $YAML > $YAML.html || {
            rm $YAML.html
            break
        }
    done
    ;;

forever)
    # There is a weird bug in entr, at least on MacOS,
    # where it doesn't restore the terminal to a clean
    # state when exiting. So let's try to work around
    # it with stty.
    STTY=$(stty -g)
    while true; do
        find . | entr -d $0 once
        STATUS=$?
        case $STATUS in
        2) echo "Directory has changed. Restarting.";;
        130) echo "SIGINT or q pressed. Exiting."; break;;
        *) echo "Weird exit code: $STATUS. Retrying in 1 second."; sleep 1;;
        esac
    done
    stty $STTY
    ;;

*)
    echo "$0 <once|forever>"
    ;;
esac
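The `forever` mode relies on `entr`, which is not installed everywhere by default; a possible setup (package names may differ on your distribution) is:

```bash
# Install entr (used by "./build.sh forever" to watch for file changes)
sudo apt-get install entr     # Debian/Ubuntu
brew install entr             # macOS with Homebrew

# Then build once, or keep rebuilding on every change
./build.sh once
./build.sh forever
```
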
@@ -1,9 +0,0 @@
<html>
<!-- Generated with index.html.sh -->
<head>
<meta http-equiv="refresh" content="0; URL='https://dockercommunity.slack.com/messages/docker-mentor'" />
</head>
<body>
<a href="https://dockercommunity.slack.com/messages/docker-mentor">https://dockercommunity.slack.com/messages/docker-mentor</a>
</body>
</html>

@@ -1,16 +0,0 @@
#!/bin/sh
#LINK=https://gitter.im/jpetazzo/workshop-20170322-sanjose
LINK=https://dockercommunity.slack.com/messages/docker-mentor
#LINK=https://usenix-lisa.slack.com/messages/docker
sed "s,@@LINK@@,$LINK,g" >index.html <<EOF
<html>
<!-- Generated with index.html.sh -->
<head>
<meta http-equiv="refresh" content="0; URL='$LINK'" />
</head>
<body>
<a href="$LINK">$LINK</a>
</body>
</html>
EOF

docs/concepts-k8s.md (new file): 296 lines

@@ -0,0 +1,296 @@
# Kubernetes concepts

- Kubernetes is a container management system

- It runs and manages containerized applications on a cluster

--

- What does that really mean?

---

## Basic things we can ask Kubernetes to do

--

- Start 5 containers using image `atseashop/api:v1.3`

--

- Place an internal load balancer in front of these containers

--

- Start 10 containers using image `atseashop/webfront:v1.3`

--

- Place a public load balancer in front of these containers

--

- It's Black Friday (or Christmas), traffic spikes, grow our cluster and add containers

--

- New release! Replace my containers with the new image `atseashop/webfront:v1.4`

--

- Keep processing requests during the upgrade; update my containers one at a time

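For reference, with a kubectl of that era (circa 1.8), those requests could be expressed roughly like this (the `atseashop` image names are the fictional ones used above, and the exact flags are a sketch rather than the workshop's canonical commands):

```bash
# Start 5 containers using image atseashop/api:v1.3 (creates a deployment)
kubectl run api --image=atseashop/api:v1.3 --replicas=5

# Place an internal load balancer in front of these containers
kubectl expose deployment api --port=80

# Traffic spikes: add containers
kubectl scale deployment api --replicas=10

# New release! Roll out a new image, a few pods at a time
kubectl set image deployment api api=atseashop/api:v1.4
```
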
---

## Other things that Kubernetes can do for us

- Basic autoscaling

- Blue/green deployment, canary deployment

- Long running services, but also batch (one-off) jobs

- Overcommit our cluster and *evict* low-priority jobs

- Run services with *stateful* data (databases etc.)

- Fine-grained access control defining *what* can be done by *whom* on *which* resources

- Integrating third party services (*service catalog*)

- Automating complex tasks (*operators*)

---

## Kubernetes architecture

---

class: pic

![haha only kidding](k8s-arch1.png)

---

## Kubernetes architecture

- Ha ha ha ha

- OK, I was trying to scare you, it's much simpler than that ❤️

---

class: pic

![that one is more like the real thing](k8s-arch2.png)

---

## Credits

- The first schema is a Kubernetes cluster with storage backed by multi-path iSCSI

  (Courtesy of [Yongbok Kim](https://www.yongbok.net/blog/))

- The second one is a simplified representation of a Kubernetes cluster

  (Courtesy of [Imesh Gunaratne](https://medium.com/containermind/a-reference-architecture-for-deploying-wso2-middleware-on-kubernetes-d4dee7601e8e))

---

## Kubernetes architecture: the master

- The Kubernetes logic (its "brains") is a collection of services:

  - the API server (our point of entry to everything!)
  - core services like the scheduler and controller manager
  - `etcd` (a highly available key/value store; the "database" of Kubernetes)

- Together, these services form what is called the "master"

- These services can run straight on a host, or in containers
  <br/>
  (that's an implementation detail)

- `etcd` can be run on separate machines (first schema) or co-located (second schema)

- We need at least one master, but we can have more (for high availability)

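A quick way to peek at those control plane components on clusters of that era (the first command has since been deprecated) was:

```bash
# Show the health of the scheduler, controller manager, and etcd
kubectl get componentstatuses

# When master components run as pods, they live in the kube-system namespace
kubectl -n kube-system get pods
```
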
---

## Kubernetes architecture: the nodes

- The nodes executing our containers run another collection of services:

  - a container Engine (typically Docker)
  - kubelet (the "node agent")
  - kube-proxy (a necessary but not sufficient network component)

- Nodes were formerly called "minions"

- It is customary to *not* run apps on the node(s) running master components

  (Except when using small development clusters)

---

## Do we need to run Docker at all?

No!

--

- By default, Kubernetes uses the Docker Engine to run containers

- We could also use `rkt` ("Rocket") from CoreOS

- Or leverage other pluggable runtimes through the *Container Runtime Interface*

  (like CRI-O, or containerd)

---

## Do we need to run Docker at all?

Yes!

--

- In this workshop, we run our app on a single node first

- We will need to build images and ship them around

- We can do these things without Docker
  <br/>
  (and get diagnosed with NIH¹ syndrome)

- Docker is still the most stable container engine today
  <br/>
  (but other options are maturing very quickly)

.footnote[¹[Not Invented Here](https://en.wikipedia.org/wiki/Not_invented_here)]

---

## Do we need to run Docker at all?

- On our development environments, CI pipelines ... :

  *Yes, almost certainly*

- On our production servers:

  *Yes (today)*

  *Probably not (in the future)*

.footnote[More information about CRI [on the Kubernetes blog](http://blog.kubernetes.io/2016/12/container-runtime-interface-cri-in-kubernetes.html).]

---

## Kubernetes resources

- The Kubernetes API defines a lot of objects called *resources*

- These resources are organized by type, or `Kind` (in the API)

- A few common resource types are:

  - node (a machine — physical or virtual — in our cluster)
  - pod (group of containers running together on a node)
  - service (stable network endpoint to connect to one or multiple containers)
  - namespace (more-or-less isolated group of things)
  - secret (bundle of sensitive data to be passed to a container)

And much more! (We can see the full list by running `kubectl get`)

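A few concrete examples of listing those resource types (all standard `kubectl` commands):

```bash
kubectl get nodes
kubectl get pods
kubectl get services
kubectl get namespaces

# Most resource names have handy abbreviations, e.g.:
kubectl get no,po,svc,ns
```
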
---

class: pic

![Node, pod, container](k8s-arch3-thanks-weave.png)

(Diagram courtesy of Weave Works, used with permission.)

---

# Declarative vs imperative

- Kubernetes puts a very strong emphasis on being *declarative*

- Declarative:

  *I would like a cup of tea.*

- Imperative:

  *Boil some water. Pour it in a teapot. Add tea leaves. Steep for a while. Serve in cup.*

--

- Declarative seems simpler at first ...

--

- ... As long as you know how to brew tea

---

## Declarative vs imperative

- What declarative would really be:

  *I want a cup of tea, obtained by pouring an infusion¹ of tea leaves in a cup.*

--

  *¹An infusion is obtained by letting the object steep a few minutes in hot² water.*

--

  *²Hot liquid is obtained by pouring it in an appropriate container³ and setting it on a stove.*

--

  *³Ah, finally, containers! Something we know about. Let's get to work, shall we?*

--

.footnote[Did you know there was an [ISO standard](https://en.wikipedia.org/wiki/ISO_3103)
specifying how to brew tea?]

---

## Declarative vs imperative

- Imperative systems:

  - simpler

  - if a task is interrupted, we have to restart from scratch

- Declarative systems:

  - if a task is interrupted (or if we show up to the party half-way through),
    we can figure out what's missing and do only what's necessary

  - we need to be able to *observe* the system

  - ... and compute a "diff" between *what we have* and *what we want*

---

## Declarative vs imperative in Kubernetes

- Virtually everything we create in Kubernetes is created from a *spec*

- Watch for the `spec` fields in the YAML files later!

- The *spec* describes *how we want the thing to be*

- Kubernetes will *reconcile* the current state with the spec
  <br/>(technically, this is done by a number of *controllers*)

- When we want to change some resource, we update the *spec*

- Kubernetes will then *converge* that resource

docs/creatingswarm.md (new file): 364 lines

@@ -0,0 +1,364 @@
# Creating our first Swarm

- The cluster is initialized with `docker swarm init`

- This should be executed on a first, seed node

- .warning[DO NOT execute `docker swarm init` on multiple nodes!]

  You would have multiple disjoint clusters.

.exercise[

- Create our cluster from node1:
  ```bash
  docker swarm init
  ```

]

--

class: advertise-addr

If Docker tells you that it `could not choose an IP address to advertise`, see next slide!

---

class: advertise-addr

## IP address to advertise

- When running in Swarm mode, each node *advertises* its address to the others
  <br/>
  (i.e. it tells them *"you can contact me on 10.1.2.3:2377"*)

- If the node has only one IP address (other than 127.0.0.1), it is used automatically

- If the node has multiple IP addresses, you **must** specify which one to use
  <br/>
  (Docker refuses to pick one randomly)

- You can specify an IP address or an interface name
  <br/>(in the latter case, Docker will read the IP address of the interface and use it)

- You can also specify a port number
  <br/>(otherwise, the default port 2377 will be used)

---

class: advertise-addr

## Which IP address should be advertised?

- If your nodes have only one IP address, it's safe to let autodetection do the job

  .small[(Except if your instances have different private and public addresses, e.g.
  on EC2, and you are building a Swarm involving nodes inside and outside the
  private network: then you should advertise the public address.)]

- If your nodes have multiple IP addresses, pick an address which is reachable
  *by every other node* of the Swarm

- If you are using [play-with-docker](http://play-with-docker.com/), use the IP
  address shown next to the node name

  .small[(This is the address of your node on your private internal overlay network.
  The other address that you might see is the address of your node on the
  `docker_gwbridge` network, which is used for outbound traffic.)]

Examples:

```bash
docker swarm init --advertise-addr 10.0.9.2
docker swarm init --advertise-addr eth0:7777
```

---

class: extra-details

## Using a separate interface for the data path

- You can use different interfaces (or IP addresses) for control and data

- You set the _control plane path_ with `--advertise-addr`

  (This will be used for SwarmKit manager/worker communication, leader election, etc.)

- You set the _data plane path_ with `--data-path-addr`

  (This will be used for traffic between containers)

- Both flags can accept either an IP address, or an interface name

  (When specifying an interface name, Docker will use its first IP address)

---

## Token generation

- In the output of `docker swarm init`, we have a message
  confirming that our node is now the (single) manager:

  ```
  Swarm initialized: current node (8jud...) is now a manager.
  ```

- Docker generated two security tokens (like passphrases or passwords) for our cluster

- The CLI shows us the command to use on other nodes to add them to the cluster using the "worker"
  security token:

  ```
  To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-59fl4ak4nqjmao1ofttrc4eprhrola2l87... \
    172.31.4.182:2377
  ```

---

class: extra-details

## Checking that Swarm mode is enabled

.exercise[

- Run the traditional `docker info` command:
  ```bash
  docker info
  ```

]

The output should include:

```
Swarm: active
 NodeID: 8jud7o8dax3zxbags3f8yox4b
 Is Manager: true
 ClusterID: 2vcw2oa9rjps3a24m91xhvv0c
 ...
```

---

## Running our first Swarm mode command

- Let's retry the exact same command as earlier

.exercise[

- List the nodes (well, the only node) of our cluster:
  ```bash
  docker node ls
  ```

]

The output should look like the following:
```
ID             HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
8jud...ox4b *  node1     Ready   Active        Leader
```

---

## Adding nodes to the Swarm

- A cluster with one node is not a lot of fun

- Let's add `node2`!

- We need the token that was shown earlier

--

- You wrote it down, right?

--

- Don't panic, we can easily see it again 😏

---

## Adding nodes to the Swarm

.exercise[

- Show the token again:
  ```bash
  docker swarm join-token worker
  ```

- Log into `node2`:
  ```bash
  ssh node2
  ```

- Copy-paste the `docker swarm join ...` command
  <br/>(that was displayed just before)

<!-- ```copypaste docker swarm join --token SWMTKN.*?:2377``` -->

]

---

class: extra-details

## Check that the node was added correctly

- Stay on `node2` for now!

.exercise[

- We can still use `docker info` to verify that the node is part of the Swarm:
  ```bash
  docker info | grep ^Swarm
  ```

]

- However, Swarm commands will not work; try, for instance:
  ```bash
  docker node ls
  ```

  ```wait```

- This is because the node that we added is currently a *worker*
- Only *managers* can accept Swarm-specific commands

---

## View our two-node cluster

- Let's go back to `node1` and see what our cluster looks like

.exercise[

- Switch back to `node1`:
  ```keys
  ^D
  ```

- View the cluster from `node1`, which is a manager:
  ```bash
  docker node ls
  ```

]

The output should be similar to the following:
```
ID             HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
8jud...ox4b *  node1     Ready   Active        Leader
ehb0...4fvx    node2     Ready   Active
```


---

class: under-the-hood

## Under the hood: docker swarm init

When we do `docker swarm init`:

- a keypair is created for the root CA of our Swarm

- a keypair is created for the first node

- a certificate is issued for this node

- the join tokens are created

---

class: under-the-hood

## Under the hood: join tokens

There is one token to *join as a worker*, and another to *join as a manager*.

The join tokens have two parts:

- a secret key (preventing unauthorized nodes from joining)

- a fingerprint of the root CA certificate (preventing MITM attacks)

If a token is compromised, it can be rotated instantly with:
```
docker swarm join-token --rotate <worker|manager>
```

---

class: under-the-hood

## Under the hood: docker swarm join

When a node joins the Swarm:

- it is issued its own keypair, signed by the root CA

- if the node is a manager:

  - it joins the Raft consensus
  - it connects to the current leader
  - it accepts connections from worker nodes

- if the node is a worker:

  - it connects to one of the managers (leader or follower)

---

class: under-the-hood

## Under the hood: cluster communication

- The *control plane* is encrypted with AES-GCM; keys are rotated every 12 hours

- Authentication is done with mutual TLS; certificates are rotated every 90 days

  (`docker swarm update` allows us to change this delay or to use an external CA)

- The *data plane* (communication between containers) is not encrypted by default

  (but this can be activated on a by-network basis, using IPSEC,
  leveraging hardware crypto if available)

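As an illustration, enabling data plane encryption on a user-defined overlay network is a single flag at creation time (the network name below is arbitrary):

```bash
# Create an overlay network whose container-to-container traffic is encrypted (IPSEC)
docker network create --driver overlay --opt encrypted my-encrypted-net
```
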
---

class: under-the-hood

## Under the hood: I want to know more!

Revisit SwarmKit concepts:

- Docker 1.12 Swarm Mode Deep Dive Part 1: Topology
  ([video](https://www.youtube.com/watch?v=dooPhkXT9yI))

- Docker 1.12 Swarm Mode Deep Dive Part 2: Orchestration
  ([video](https://www.youtube.com/watch?v=_F6PSP-qhdA))

Some presentations from the Docker Distributed Systems Summit in Berlin:

- Heart of the SwarmKit: Topology Management
  ([slides](https://speakerdeck.com/aluzzardi/heart-of-the-swarmkit-topology-management))

- Heart of the SwarmKit: Store, Topology & Object Model
  ([slides](http://www.slideshare.net/Docker/heart-of-the-swarmkit-store-topology-object-model))
  ([video](https://www.youtube.com/watch?v=EmePhjGnCXY))

And DockerCon Black Belt talks:

.blackbelt[DC17US: Everything You Thought You Already Knew About Orchestration
([video](https://www.youtube.com/watch?v=Qsv-q8WbIZY&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=6))]

.blackbelt[DC17EU: Container Orchestration from Theory to Practice
([video](https://dockercon.docker.com/watch/5fhwnQxW8on1TKxPwwXZ5r))]

docs/daemonset.md (new file): 409 lines

@@ -0,0 +1,409 @@
# Daemon sets

- Remember: we did all that cluster orchestration business for `rng`

- We want one (and exactly one) instance of `rng` per node

- If we just scale `deploy/rng` to 4, nothing guarantees that they spread

- Instead of a `deployment`, we will use a `daemonset`

- Daemon sets are great for cluster-wide, per-node processes:

  - `kube-proxy`
  - `weave` (our overlay network)
  - monitoring agents
  - hardware management tools (e.g. SCSI/FC HBA agents)
  - etc.

- They can also be restricted to run [only on some nodes](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#running-pods-on-only-some-nodes)

---

## Creating a daemon set

- Unfortunately, as of Kubernetes 1.8, the CLI cannot create daemon sets

--

- More precisely: it doesn't have a subcommand to create a daemon set

--

- But any kind of resource can always be created by providing a YAML description:
  ```bash
  kubectl apply -f foo.yaml
  ```

--

- How do we create the YAML file for our daemon set?

--

  - option 1: read the docs

--

  - option 2: `vi` our way out of it

---

## Creating the YAML file for our daemon set

- Let's start with the YAML file for the current `rng` resource

.exercise[

- Dump the `rng` resource in YAML:
  ```bash
  kubectl get deploy/rng -o yaml --export >rng.yml
  ```

- Edit `rng.yml`

]

Note: `--export` will remove "cluster-specific" information, i.e.:
- namespace (so that the resource is not tied to a specific namespace)
- status and creation timestamp (useless when creating a new resource)
- resourceVersion and uid (these would cause... *interesting* problems)

---

## "Casting" a resource to another

- What if we just changed the `kind` field?

  (It can't be that easy, right?)

.exercise[

- Change `kind: Deployment` to `kind: DaemonSet`

- Save, quit

- Try to create our new resource:
  ```bash
  kubectl apply -f rng.yml
  ```

]

--

We all knew this couldn't be that easy, right!

---

## Understanding the problem

- The core of the error is:
  ```
  error validating data:
  [ValidationError(DaemonSet.spec):
  unknown field "replicas" in io.k8s.api.extensions.v1beta1.DaemonSetSpec,
  ...
  ```

--

- *Obviously,* it doesn't make sense to specify a number of replicas for a daemon set

--

- Workaround: fix the YAML

  - remove the `replicas` field
  - remove the `strategy` field (which defines the rollout mechanism for a deployment)
  - remove the `status: {}` line at the end

--

- Or, we could also ...

---

## Use the `--force`, Luke

- We could also tell Kubernetes to ignore these errors and try anyway

- The `--force` flag actual name is `--validate=false`

.exercise[

- Try to load our YAML file and ignore errors:
  ```bash
  kubectl apply -f rng.yml --validate=false
  ```

]

--

🎩✨🐇

--

Wait ... Now, can it be *that* easy?

---

## Checking what we've done

- Did we transform our `deployment` into a `daemonset`?

.exercise[

- Look at the resources that we have now:
  ```bash
  kubectl get all
  ```

]

--

We have both `deploy/rng` and `ds/rng` now!

--

And one too many pods...

---

## Explanation

- You can have different resource types with the same name

  (i.e. a *deployment* and a *daemonset* both named `rng`)

- We still have the old `rng` *deployment*

- But now we have the new `rng` *daemonset* as well

- If we look at the pods, we have:

  - *one pod* for the deployment

  - *one pod per node* for the daemonset

---

## What are all these pods doing?

- Let's check the logs of all these `rng` pods

- All these pods have a `run=rng` label:

  - the first pod, because that's what `kubectl run` does
  - the other ones (in the daemon set), because we
    *copied the spec from the first one*

- Therefore, we can query everybody's logs using that `run=rng` selector

.exercise[

- Check the logs of all the pods having a label `run=rng`:
  ```bash
  kubectl logs -l run=rng --tail 1
  ```

]

--

It appears that *all the pods* are serving requests at the moment.

---

## The magic of selectors

- The `rng` *service* is load balancing requests to a set of pods

- This set of pods is defined as "pods having the label `run=rng`"

.exercise[

- Check the *selector* in the `rng` service definition:
  ```bash
  kubectl describe service rng
  ```

]

When we created additional pods with this label, they were
automatically detected by `svc/rng` and added as *endpoints*
to the associated load balancer.

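One way to see that endpoint list growing (these are standard commands; the exact output will depend on your cluster):

```bash
# The service's endpoints are the IP:port pairs of all pods matching its selector
kubectl describe endpoints rng

# Cross-check with the pods carrying the run=rng label
kubectl get pods -l run=rng -o wide
```
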
---

## Removing the first pod from the load balancer

- What would happen if we removed that pod, with `kubectl delete pod ...`?

--

  The `replicaset` would re-create it immediately.

--

- What would happen if we removed the `run=rng` label from that pod?

--

  The `replicaset` would re-create it immediately.

--

  ... Because what matters to the `replicaset` is the number of pods *matching that selector.*

--

- But but but ... Don't we have more than one pod with `run=rng` now?

--

  The answer lies in the exact selector used by the `replicaset` ...

---

## Deep dive into selectors

- Let's look at the selectors for the `rng` *deployment* and the associated *replica set*

.exercise[

- Show detailed information about the `rng` deployment:
  ```bash
  kubectl describe deploy rng
  ```

- Show detailed information about the `rng` replica set:
  <br/>(The second command doesn't require you to get the exact name of the replica set)
  ```bash
  kubectl describe rs rng-yyyy
  kubectl describe rs -l run=rng
  ```

]

--

The replica set selector also has a `pod-template-hash`, unlike the pods in our daemon set.

---

# Updating a service through labels and selectors

- What if we want to drop the `rng` deployment from the load balancer?

- Option 1:

  - destroy it

- Option 2:

  - add an extra *label* to the daemon set

  - update the service *selector* to refer to that *label*

--

Of course, option 2 offers more learning opportunities. Right?

---

## Add an extra label to the daemon set

- We will update the daemon set "spec"

- Option 1:

  - edit the `rng.yml` file that we used earlier

  - load the new definition with `kubectl apply`

- Option 2:

  - use `kubectl edit`

--

*If you feel like you got this 💕🌈, feel free to try directly.*

*We've included a few hints on the next slides for your convenience!*

---

## We've put resources in your resources all the way down

- Reminder: a daemon set is a resource that creates more resources!

- There is a difference between:

  - the label(s) of a resource (in the `metadata` block in the beginning)

  - the selector of a resource (in the `spec` block)

  - the label(s) of the resource(s) created by the first resource (in the `template` block)

- You need to update the selector and the template (metadata labels are not mandatory)

- The template must match the selector

  (i.e. the resource will refuse to create resources that it will not select)

---

## Adding our label

- Let's add a label `isactive: yes`

- In YAML, `yes` should be quoted; i.e. `isactive: "yes"`

.exercise[

- Update the daemon set to add `isactive: "yes"` to the selector and template label:
  ```bash
  kubectl edit daemonset rng
  ```

- Update the service to add `isactive: "yes"` to its selector:
  ```bash
  kubectl edit service rng
  ```

]

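If you prefer a non-interactive route, the service selector change can also be sketched with `kubectl patch` (shown here for the service only; the daemon set template is easier to change with `kubectl edit`, and selectors become immutable in later API versions):

```bash
# After this, only pods labeled run=rng AND isactive=yes back the rng service
kubectl patch service rng \
        -p '{"spec":{"selector":{"run":"rng","isactive":"yes"}}}'
```
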
---

## Checking what we've done

.exercise[

- Check the logs of all `run=rng` pods to confirm that only 4 of them are now active:
  ```bash
  kubectl logs -l run=rng
  ```

]

The timestamps should give us a hint about how many pods are currently receiving traffic.

.exercise[

- Look at the pods that we have right now:
  ```bash
  kubectl get pods
  ```

]

---

## More labels, more selectors, more problems?

- Bonus exercise 1: clean up the pods of the "old" daemon set

- Bonus exercise 2: what could we have done to avoid creating new pods?

docs/dashboard.md (new file): 181 lines

@@ -0,0 +1,181 @@
# The Kubernetes dashboard

- Kubernetes resources can also be viewed with a web dashboard

- We are going to deploy that dashboard with *three commands:*

  - one to actually *run* the dashboard

  - one to make the dashboard available from outside

  - one to bypass authentication for the dashboard

--

.footnote[.warning[Yes, this will open our cluster to all kinds of shenanigans. Don't do this at home.]]

---

## Running the dashboard

- We need to create a *deployment* and a *service* for the dashboard

- But also a *secret*, a *service account*, a *role* and a *role binding*

- All these things can be defined in a YAML file and created with `kubectl apply -f`

.exercise[

- Create all the dashboard resources, with the following command:
  ```bash
  kubectl apply -f https://goo.gl/Qamqab
  ```

]

The goo.gl URL expands to:
<br/>
.small[https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml]

---

## Making the dashboard reachable from outside

- The dashboard is exposed through a `ClusterIP` service

- We need a `NodePort` service instead

.exercise[

- Edit the service:
  ```bash
  kubectl edit service kubernetes-dashboard
  ```

]

--

`NotFound`?!? Y U NO WORK?!?

---

## Editing the `kubernetes-dashboard` service

- If we look at the YAML that we loaded just before, we'll get a hint

--

- The dashboard was created in the `kube-system` namespace

.exercise[

- Edit the service:
  ```bash
  kubectl -n kube-system edit service kubernetes-dashboard
  ```

- Change `ClusterIP` to `NodePort`, save, and exit

- Check the port that was assigned with `kubectl -n kube-system get services`

]

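Instead of editing interactively, the service type can also be switched with a one-liner (same effect, fewer keystrokes):

```bash
# Switch the dashboard service from ClusterIP to NodePort
kubectl -n kube-system patch service kubernetes-dashboard \
        -p '{"spec":{"type":"NodePort"}}'

# Then check which port was assigned
kubectl -n kube-system get services kubernetes-dashboard
```
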
---

## Connecting to the dashboard

.exercise[

- Connect to https://oneofournodes:3xxxx/

  (You will have to work around the TLS certificate validation warning)

<!-- ```open https://node1:3xxxx/``` -->

]

- We have three authentication options at this point:

  - token (associated with a role that has appropriate permissions)

  - kubeconfig (e.g. using the `~/.kube/config` file from `node1`)

  - "skip" (use the dashboard "service account")

- Let's use "skip": we get a bunch of warnings and don't see much

---

## Granting more rights to the dashboard

- The dashboard documentation [explains how to do this](https://github.com/kubernetes/dashboard/wiki/Access-control#admin-privileges)

- We just need to load another YAML file!

.exercise[

- Grant admin privileges to the dashboard so we can see our resources:
  ```bash
  kubectl apply -f https://goo.gl/CHsLTA
  ```

- Reload the dashboard and enjoy!

]

--

.warning[By the way, we just added a backdoor to our Kubernetes cluster!]

---

# Security implications of `kubectl apply`

- When we do `kubectl apply -f <URL>`, we create arbitrary resources

- Resources can be evil; imagine a `deployment` that ...

--

  - starts bitcoin miners on the whole cluster

--

  - hides in a non-default namespace

--

  - bind-mounts our nodes' filesystem

--

  - inserts SSH keys in the root account (on the node)

--

  - encrypts our data and ransoms it

--

  - ☠️☠️☠️

---

## `kubectl apply` is the new `curl | sh`

- `curl | sh` is convenient

- It's safe if you use HTTPS URLs from trusted sources

--

- `kubectl apply -f` is convenient

- It's safe if you use HTTPS URLs from trusted sources

--

- It introduces new failure modes

- Example: the official setup instructions for most pod networks

docs/dockercon.yml (new file): 182 lines

@@ -0,0 +1,182 @@
chat: "[Slack](https://dockercommunity.slack.com/messages/C7ET1GY4Q)"

exclude:
- self-paced
- snap
- auto-btp
- benchmarking
- elk-manual
- prom-manual

title: "Swarm: from Zero to Hero (DC17EU)"
chapters:
- |
  class: title

  .small[

  Swarm: from Zero to Hero

  .small[.small[

  **Be kind to the WiFi!**

  *Use the 5G network*
  <br/>
  *Don't use your hotspot*
  <br/>
  *Don't stream videos from YouTube, Netflix, etc.
  <br/>(if you're bored, watch local content instead)*

  Also: share the power outlets
  <br/>
  *(with limited power comes limited responsibility?)*
  <br/>
  *(or something?)*

  Thank you!

  ]
  ]
  ]

  ---

  ## Intros

  <!--
  - Hello! We are
    AJ ([@s0ulshake](https://twitter.com/s0ulshake))
    &
    Jérôme ([@jpetazzo](https://twitter.com/jpetazzo))
  -->

  - Hello! We are Jérôme, Lee, Nicholas, and Scott

  <!--
    I am
    Jérôme ([@jpetazzo](https://twitter.com/jpetazzo))
  -->

  --

  - This is our collective Docker knowledge:

    ![we know nothing](aj-containers.jpeg)

  ---

  ## "From zero to hero"

  --

  - It rhymes, but it's a pretty bad title, to be honest

  --

  - None of you is a "zero"

  --

  - None of us is a "hero"

  --

  - None of us should even try to be a hero

  --

  *The hero syndrome is a phenomenon affecting people who seek heroism or recognition,
  usually by creating a desperate situation which they can resolve.
  This can include unlawful acts, such as arson.
  The phenomenon has been noted to affect civil servants,
  such as firefighters, nurses, police officers, and security guards.*

  (Wikipedia page on [hero syndrome](https://en.wikipedia.org/wiki/Hero_syndrome))

  ---

  ## Agenda

  .small[
  - 09:00-09:10 Hello!
  - 09:10-10:30 Part 1
  - 10:30-11:00 coffee break
  - 11:00-12:30 Part 2
  - 12:30-13:30 lunch break
  - 13:30-15:00 Part 3
  - 15:00-15:30 coffee break
  - 15:30-17:00 Part 4
  - 17:00-18:00 Afterhours and Q&A
  ]

  <!--
  - The tutorial will run from 9:00am to 12:20pm

  - This will be fast-paced, but DON'T PANIC!

  - There will be a coffee break at 10:30am
    <br/>
    (please remind me if I forget about it!)
  -->

  - All the content is publicly available (slides, code samples, scripts)

    Upstream URL: https://github.com/jpetazzo/orchestration-workshop

  - Feel free to interrupt for questions at any time

  - Live feedback, questions, help on [Gitter](chat)

    http://container.training/chat

- intro.md
- |
  @@TOC@@
- - prereqs.md
  - versions.md
  - |
    class: title

    All right!
    <br/>
    We're all set.
    <br/>
    Let's do this.
  - sampleapp.md
  - swarmkit.md
  - creatingswarm.md
  - morenodes.md
- - firstservice.md
  - ourapponswarm.md
  - updatingservices.md
  - healthchecks.md
- - operatingswarm.md
  - netshoot.md
  - ipsec.md
  - swarmtools.md
  - security.md
- secrets.md
|
||||
- encryptionatrest.md
|
||||
- leastprivilege.md
|
||||
- apiscope.md
|
||||
- - logging.md
|
||||
- metrics.md
|
||||
- stateful.md
|
||||
- extratips.md
|
||||
- end.md
|
||||
- |
|
||||
class: title
|
||||
|
||||
That's all folks! <br/> Questions?
|
||||
|
||||
.small[.small[
|
||||
|
||||
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@docker](https://twitter.com/docker)
|
||||
|
||||
]]
|
||||
|
||||
<!--
|
||||
Tiffany ([@tiffanyfayj](https://twitter.com/tiffanyfayj))
|
||||
AJ ([@s0ulshake](https://twitter.com/s0ulshake))
|
||||
-->
|
||||
154
docs/encryptionatrest.md
Normal file
@@ -0,0 +1,154 @@
|
||||
## Encryption at rest
|
||||
|
||||
- Swarm data is always encrypted
|
||||
|
||||
- A Swarm cluster can be "locked"
|
||||
|
||||
- When a cluster is "locked", the encryption key is protected with a passphrase
|
||||
|
||||
- Starting or restarting a locked manager requires the passphrase
|
||||
|
||||
- This protects against:
|
||||
|
||||
- theft (stealing a physical machine, a disk, a backup tape...)
|
||||
|
||||
- unauthorized access (to e.g. a remote or virtual volume)
|
||||
|
||||
- some vulnerabilities (like path traversal)
|
||||
|
||||
---
|
||||
|
||||
## Locking a Swarm cluster
|
||||
|
||||
- This is achieved through the `docker swarm update` command
|
||||
|
||||
.exercise[
|
||||
|
||||
- Lock our cluster:
|
||||
```bash
|
||||
docker swarm update --autolock=true
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
This will display the unlock key. Copy-paste it somewhere safe.
|
||||
|
||||
---
|
||||
|
||||
## Locked state
|
||||
|
||||
- If we restart a manager, it will now be locked
|
||||
|
||||
.exercise[
|
||||
|
||||
- Restart the local Engine:
|
||||
```bash
|
||||
sudo systemctl restart docker
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: if you are doing the workshop on your own, using nodes
|
||||
that you [provisioned yourself](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-machine) or with [Play-With-Docker](http://play-with-docker.com/), you might have to use a different method to restart the Engine.
|
||||
|
||||
---
|
||||
|
||||
## Checking that our node is locked
|
||||
|
||||
- Manager commands (requiring access to encrypted data) will fail
|
||||
|
||||
- Other commands are OK
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try a few basic commands:
|
||||
```bash
|
||||
docker ps
|
||||
docker run alpine echo ♥
|
||||
docker node ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
(The last command should fail, and it will tell you how to unlock this node.)
|
||||
|
||||
---
|
||||
|
||||
## Checking the state of the node programmatically
|
||||
|
||||
- The state of the node shows up in the output of `docker info`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check the output of `docker info`:
|
||||
```bash
|
||||
docker info
|
||||
```
|
||||
|
||||
- Can't see it? Too verbose? Grep to the rescue!
|
||||
```bash
|
||||
docker info | grep ^Swarm
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Unlocking a node
|
||||
|
||||
- You will need the secret token that we obtained when enabling auto-lock earlier
|
||||
|
||||
.exercise[
|
||||
|
||||
- Unlock the node:
|
||||
```bash
|
||||
docker swarm unlock
|
||||
```
|
||||
|
||||
- Copy-paste the secret token that we got earlier
|
||||
|
||||
- Check that manager commands now work correctly:
|
||||
```bash
|
||||
docker node ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Managing the secret key
|
||||
|
||||
- If the key is compromised, you can change it and re-encrypt with a new key:
|
||||
```bash
|
||||
docker swarm unlock-key --rotate
|
||||
```
|
||||
|
||||
- If you lose the key, you can retrieve it as long as you have at least one unlocked node:
|
||||
```bash
|
||||
docker swarm unlock-key -q
|
||||
```
|
||||
|
||||
Note: if you rotate the key while some nodes are locked, without saving the previous key, those nodes won't be able to rejoin.
|
||||
|
||||
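A simple precaution is to save the current key right before rotating (a quick sketch; store that file somewhere safer than the node itself):

```bash
# Save the current unlock key, then rotate it
docker swarm unlock-key -q > swarm-unlock-key.backup
docker swarm unlock-key --rotate
```
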
Note: if somebody steals both your disks and your key, .strike[you're doomed! Doooooomed!]
|
||||
<br/>you can block the compromised node with `docker node demote` and `docker node rm`.
|
||||
|
||||
---
|
||||
|
||||
## Unlocking the cluster permanently
|
||||
|
||||
- If you want to remove the secret key, disable auto-lock
|
||||
|
||||
.exercise[
|
||||
|
||||
- Permanently unlock the cluster:
|
||||
```bash
|
||||
docker swarm update --autolock=false
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: if some nodes are in locked state at that moment (or if they are offline/restarting
|
||||
while you disabled autolock), they still need the previous unlock key to get back online.
|
||||
|
||||
For more information about locking, you can check the [upcoming documentation](https://github.com/docker/docker.github.io/pull/694).
|
||||
38
docs/end.md
Normal file
@@ -0,0 +1,38 @@
|
||||
class: title, extra-details
|
||||
|
||||
# What's next?
|
||||
|
||||
## (What to expect in future versions of this workshop)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Implemented and stable, but out of scope
|
||||
|
||||
- [Docker Content Trust](https://docs.docker.com/engine/security/trust/content_trust/) and
|
||||
[Notary](https://github.com/docker/notary) (image signature and verification)
|
||||
|
||||
- Image security scanning (many products available, Docker Inc. and 3rd party)
|
||||
|
||||
- [Docker Cloud](https://cloud.docker.com/) and
|
||||
[Docker Datacenter](https://www.docker.com/products/docker-datacenter)
|
||||
(commercial offering with node management, secure registry, CI/CD pipelines, all the bells and whistles)
|
||||
|
||||
- Network and storage plugins
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Work in progress
|
||||
|
||||
- Demo at least one volume plugin
|
||||
<br/>(bonus points if it's a distributed storage system)
|
||||
|
||||
- ..................................... (your favorite feature here)
|
||||
|
||||
Reminder: there is a tag for each iteration of the content
|
||||
in the GitHub repository.
|
||||
|
||||
It makes it easy to come back later and check what has changed since you did it!
|
||||
|
Before Width: | Height: | Size: 11 KiB After Width: | Height: | Size: 13 KiB |
@@ -1,19 +0,0 @@
|
||||
#!/usr/bin/env python
|
||||
"""
|
||||
Extract and print level 1 and 2 titles from workshop slides.
|
||||
"""
|
||||
|
||||
separators = [
|
||||
"---",
|
||||
"--"
|
||||
]
|
||||
|
||||
slide_count = 1
|
||||
for line in open("index.html"):
|
||||
line = line.strip()
|
||||
if line in separators:
|
||||
slide_count += 1
|
||||
if line.startswith('## '):
|
||||
print slide_count, '# #', line
|
||||
elif line.startswith('# '):
|
||||
print slide_count, line
|
||||
246
docs/extratips.md
Normal file
@@ -0,0 +1,246 @@
|
||||
# Controlling Docker from a container
|
||||
|
||||
- In a local environment, just bind-mount the Docker control socket:
|
||||
```bash
|
||||
docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker
|
||||
```
|
||||
|
||||
- Otherwise, you have to:
|
||||
|
||||
- set `DOCKER_HOST`,
|
||||
- set `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` (if you use TLS),
|
||||
- copy certificates to the container that will need API access.
|
||||
|
||||
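For example, a remote, TLS-protected Engine could be reached like this (host name, port, and certificate path below are placeholders):

```bash
export DOCKER_HOST=tcp://my-remote-engine:2376
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=/certs   # must contain ca.pem, cert.pem, key.pem
docker version                   # quick check that the API is reachable
```
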
More resources on this topic:
|
||||
|
||||
- [Do not use Docker-in-Docker for CI](
|
||||
http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/)
|
||||
- [One container to rule them all](
|
||||
http://jpetazzo.github.io/2016/04/03/one-container-to-rule-them-all/)
|
||||
|
||||
---
|
||||
|
||||
## Bind-mounting the Docker control socket
|
||||
|
||||
- In Swarm mode, bind-mounting the control socket gives you access to the whole cluster
|
||||
|
||||
- You can tell Docker to place a given service on a manager node, using constraints:
|
||||
```bash
|
||||
docker service create \
|
||||
--mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \
|
||||
--name autoscaler --constraint node.role==manager ...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Constraints and global services
|
||||
|
||||
(New in Docker Engine 1.13)
|
||||
|
||||
- By default, global services run on *all* nodes
|
||||
```bash
|
||||
docker service create --mode global ...
|
||||
```
|
||||
|
||||
- You can specify constraints for global services
|
||||
|
||||
- These services will run only on the nodes satisfying the constraints
|
||||
|
||||
- For instance, this service will run on all manager nodes:
|
||||
```bash
|
||||
docker service create --mode global --constraint node.role==manager ...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Constraints and dynamic scheduling
|
||||
|
||||
(New in Docker Engine 1.13)
|
||||
|
||||
- If constraints change, services are started/stopped accordingly
|
||||
|
||||
(e.g., `--constraint node.role==manager` and nodes are promoted/demoted)
|
||||
|
||||
- This is particularly useful with labels:
|
||||
```bash
|
||||
docker node update node1 --label-add defcon=five
|
||||
docker service create --constraint node.labels.defcon==five ...
|
||||
docker node update node2 --label-add defcon=five
|
||||
docker node update node1 --label-rm defcon=five
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Shortcomings of dynamic scheduling
|
||||
|
||||
.warning[If a service becomes "unschedulable" (constraints can't be satisfied):]
|
||||
|
||||
- It won't be scheduled automatically when constraints are satisfiable again
|
||||
|
||||
- You will have to update the service; you can do a no-op update with:
|
||||
```bash
|
||||
docker service update ... --force
|
||||
```
|
||||
|
||||
.warning[Docker will silently ignore attempts to remove a non-existent label or constraint]
|
||||
|
||||
- It won't warn you if you typo when removing a label or constraint!
|
||||
|
||||
---
|
||||
|
||||
# Node management
|
||||
|
||||
- SwarmKit lets you change (almost?) everything on the fly
|
||||
|
||||
- Nothing should require a global restart
|
||||
|
||||
---
|
||||
|
||||
## Node availability
|
||||
|
||||
```bash
|
||||
docker node update <node-name> --availability <active|pause|drain>
|
||||
```
|
||||
|
||||
- Active = schedule tasks on this node (default)
|
||||
|
||||
- Pause = don't schedule new tasks on this node; existing tasks are not affected
|
||||
|
||||
You can use it to troubleshoot a node without disrupting existing tasks
|
||||
|
||||
It can also be used (in conjunction with labels) to reserve resources
|
||||
|
||||
- Drain = don't schedule new tasks on this node; existing tasks are moved away
|
||||
|
||||
This is just like crashing the node, but containers get a chance to shut down cleanly
|
||||
|
||||
---
|
||||
|
||||
## Managers and workers
|
||||
|
||||
- Nodes can be promoted to manager with `docker node promote`
|
||||
|
||||
- Nodes can be demoted to worker with `docker node demote`
|
||||
|
||||
- This can also be done with `docker node update <node> --role <manager|worker>`
|
||||
|
||||
- Reminder: this has to be done from a manager node
|
||||
<br/>(workers cannot promote themselves)
|
||||
|
||||
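For instance (node names are just placeholders):

```bash
docker node promote node2 node3   # node2 and node3 become managers
docker node demote node3          # node3 goes back to being a worker
```
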
---
|
||||
|
||||
## Removing nodes
|
||||
|
||||
- You can leave Swarm mode with `docker swarm leave`
|
||||
|
||||
- Nodes are drained before being removed (i.e. all tasks are rescheduled somewhere else)
|
||||
|
||||
- Managers cannot leave (they have to be demoted first)
|
||||
|
||||
- After leaving, a node still shows up in `docker node ls` (in `Down` state)
|
||||
|
||||
- When a node is `Down`, you can remove it with `docker node rm` (from a manager node)
|
||||
|
||||
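Putting it together, removing a worker is a two-step dance (run each command from the right place):

```bash
# On the node that is leaving:
docker swarm leave

# Then, from a manager:
docker node rm <node-name>
```
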
---
|
||||
|
||||
## Join tokens and automation
|
||||
|
||||
- If you have used Docker 1.12-RC: join tokens are now mandatory!
|
||||
|
||||
- You cannot specify your own token (SwarmKit generates it)
|
||||
|
||||
- If you need to change the token: `docker swarm join-token --rotate ...`
|
||||
|
||||
- To automate cluster deployment:
|
||||
|
||||
- have a seed node do `docker swarm init` if it's not already in Swarm mode
|
||||
|
||||
- propagate the token to the other nodes (secure bucket, facter, ohai...)
|
||||
|
||||
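On the seed node, the join command (including the token) can be obtained like this:

```bash
# Full command that workers should run:
docker swarm join-token worker

# Or just the token itself (handy for scripting):
docker swarm join-token -q worker
```
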
---
|
||||
|
||||
## Disk space management: `docker system df`
|
||||
|
||||
- Shows disk usage for images, containers, and volumes
|
||||
|
||||
- Breaks down between *active* and *reclaimable* categories
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check how much disk space is used at the end of the workshop:
|
||||
```bash
|
||||
docker system df
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: `docker system` is new in Docker Engine 1.13.
|
||||
|
||||
---
|
||||
|
||||
## Reclaiming unused resources: `docker system prune`
|
||||
|
||||
- Removes stopped containers
|
||||
|
||||
- Removes dangling images (that don't have a tag associated anymore)
|
||||
|
||||
- Removes orphaned volumes
|
||||
|
||||
- Removes empty networks
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try it:
|
||||
```bash
|
||||
docker system prune -f
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: `docker system prune -a` will also remove *unused* images.
|
||||
|
||||
---
|
||||
|
||||
## Events
|
||||
|
||||
- You can get a real-time stream of events with `docker events`
|
||||
|
||||
- This will report *local events* and *cluster events*
|
||||
|
||||
- Local events =
|
||||
<br/>
|
||||
all activity related to containers, images, plugins, volumes, networks, *on this node*
|
||||
|
||||
- Cluster events =
|
||||
<br/>Swarm Mode activity related to services, nodes, secrets, configs, *on the whole cluster*
|
||||
|
||||
- `docker events` doesn't report *local events happening on other nodes*
|
||||
|
||||
- Events can be filtered (by type, target, labels...)
|
||||
|
||||
- Events can be formatted with Go's `text/template` or in JSON
|
||||
|
||||
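For instance, to watch only service-related events as a stream of JSON objects (one possible combination of flags; see the documentation linked at the end of this chapter):

```bash
docker events --filter type=service --format '{{ json . }}'
```
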
---
|
||||
|
||||
## Getting *all the events*
|
||||
|
||||
- There is no built-in way to get a stream of *all the events* on *all the nodes*
|
||||
|
||||
- This can be achieved with (for instance) the four following services working together:
|
||||
|
||||
- a Redis container (used as a stateless, fan-in message queue)
|
||||
|
||||
- a global service bind-mounting the Docker socket, pushing local events to the queue
|
||||
|
||||
- a similar singleton service to push global events to the queue
|
||||
|
||||
- a queue consumer fetching events and processing them as you please
|
||||
|
||||
I'm not saying that you should implement it with Shell scripts, but you totally could.
|
||||
|
||||
.small[
|
||||
(It might or might not be one of the initiating rites of the
|
||||
[House of Bash](https://twitter.com/carmatrocity/status/676559402787282944))
|
||||
]
|
||||
|
||||
For more information about event filters and types, check [the documentation](https://docs.docker.com/engine/reference/commandline/events/).
|
||||
474
docs/firstservice.md
Normal file
@@ -0,0 +1,474 @@
|
||||
# Running our first Swarm service
|
||||
|
||||
- How do we run services? Simplified version:
|
||||
|
||||
`docker run` → `docker service create`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create a service featuring an Alpine container pinging Google resolvers:
|
||||
```bash
|
||||
docker service create alpine ping 8.8.8.8
|
||||
```
|
||||
|
||||
- Check the result:
|
||||
```bash
|
||||
docker service ps <serviceID>
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## `--detach` for service creation
|
||||
|
||||
(New in Docker Engine 17.05)
|
||||
|
||||
If you are running Docker 17.05 to 17.09, you will see the following message:
|
||||
|
||||
```
|
||||
Since --detach=false was not specified, tasks will be created in the background.
|
||||
In a future release, --detach=false will become the default.
|
||||
```
|
||||
|
||||
You can ignore that for now; but we'll come back to it in just a few minutes!
|
||||
|
||||
---
|
||||
|
||||
## Checking service logs
|
||||
|
||||
(New in Docker Engine 17.05)
|
||||
|
||||
- Just like `docker logs` shows the output of a specific local container ...
|
||||
|
||||
- ... `docker service logs` shows the output of all the containers of a specific service
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check the output of our ping command:
|
||||
```bash
|
||||
docker service logs <serviceID>
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Flags `--follow` and `--tail` are available, as well as a few others.
|
||||
|
||||
Note: by default, when a container is destroyed (e.g. when scaling down), its logs are lost.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Before Docker Engine 17.05
|
||||
|
||||
- Docker 1.13/17.03/17.04 have `docker service logs` as an experimental feature
|
||||
<br/>(available only when enabling the experimental feature flag)
|
||||
|
||||
- We have to use `docker logs`, which only works on local containers
|
||||
|
||||
- We will have to connect to the node running our container
|
||||
<br/>(unless it was scheduled locally, of course)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Looking up where our container is running
|
||||
|
||||
- The `docker service ps` command told us where our container was scheduled
|
||||
|
||||
.exercise[
|
||||
|
||||
- Look up the `NODE` on which the container is running:
|
||||
```bash
|
||||
docker service ps <serviceID>
|
||||
```
|
||||
|
||||
- If you use Play-With-Docker, switch to that node's tab, or set `DOCKER_HOST`
|
||||
|
||||
- Otherwise, `ssh` into that node or use `eval $(docker-machine env node...)`
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Viewing the logs of the container
|
||||
|
||||
.exercise[
|
||||
|
||||
- See that the container is running and check its ID:
|
||||
```bash
|
||||
docker ps
|
||||
```
|
||||
|
||||
- View its logs:
|
||||
```bash
|
||||
docker logs <containerID>
|
||||
```
|
||||
|
||||
- Go back to `node1` afterwards
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Scale our service
|
||||
|
||||
- Services can be scaled in a pinch with the `docker service update` command
|
||||
|
||||
.exercise[
|
||||
|
||||
- Scale the service to ensure 2 copies per node:
|
||||
```bash
|
||||
docker service update <serviceID> --replicas 10 --detach=true
|
||||
```
|
||||
|
||||
- Check that we have two containers on the current node:
|
||||
```bash
|
||||
docker ps
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## View deployment progress
|
||||
|
||||
(New in Docker Engine 17.05)
|
||||
|
||||
- Commands that create/update/delete services can run with `--detach=false`
|
||||
|
||||
- The CLI will show the status of the command, and exit once it's done working
|
||||
|
||||
.exercise[
|
||||
|
||||
- Scale the service to ensure 3 copies per node:
|
||||
```bash
|
||||
docker service update <serviceID> --replicas 15 --detach=false
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: with Docker Engine 17.10 and later, `--detach=false` is the default.
|
||||
|
||||
With versions older than 17.05, you can use e.g.: `watch docker service ps <serviceID>`
|
||||
|
||||
---
|
||||
|
||||
## Expose a service
|
||||
|
||||
- Services can be exposed, with two special properties:
|
||||
|
||||
- the public port is available on *every node of the Swarm*,
|
||||
|
||||
- requests coming on the public port are load balanced across all instances.
|
||||
|
||||
- This is achieved with option `-p/--publish`; as an approximation:
|
||||
|
||||
`docker run -p → docker service create -p`
|
||||
|
||||
- If you indicate a single port number, it will be mapped on a port
|
||||
starting at 30000
|
||||
<br/>(vs. 32768 for single container mapping)
|
||||
|
||||
- You can indicate two port numbers to set the public port number
|
||||
<br/>(just like with `docker run -p`)
|
||||
|
||||
---
|
||||
|
||||
## Expose ElasticSearch on its default port
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create an ElasticSearch service (and give it a name while we're at it):
|
||||
```bash
|
||||
docker service create --name search --publish 9200:9200 --replicas 7 \
|
||||
--detach=false elasticsearch`:2`
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: don't forget the **:2**!
|
||||
|
||||
The latest version of the ElasticSearch image won't start without mandatory configuration.
|
||||
|
||||
---
|
||||
|
||||
## Tasks lifecycle
|
||||
|
||||
- During the deployment, you will be able to see multiple states:
|
||||
|
||||
- assigned (the task has been assigned to a specific node)
|
||||
|
||||
- preparing (this mostly means "pulling the image")
|
||||
|
||||
- starting
|
||||
|
||||
- running
|
||||
|
||||
- When a task is terminated (stopped, killed...) it cannot be restarted
|
||||
|
||||
(A replacement task will be created)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
## Test our service
|
||||
|
||||
- We mapped port 9200 on the nodes, to port 9200 in the containers
|
||||
|
||||
- Let's try to reach that port!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try the following command:
|
||||
```bash
|
||||
curl localhost:9200
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
(If you get `Connection refused`: congratulations, you are very fast indeed! Just try again.)
|
||||
|
||||
ElasticSearch serves a little JSON document with some basic information
|
||||
about this instance; including a randomly-generated super-hero name.
|
||||
|
||||
---
|
||||
|
||||
## Test the load balancing
|
||||
|
||||
- If we repeat our `curl` command multiple times, we will see different names
|
||||
|
||||
.exercise[
|
||||
|
||||
- Send 10 requests, and see which instances serve them:
|
||||
```bash
|
||||
for N in $(seq 1 10); do
|
||||
curl -s localhost:9200 | jq .name
|
||||
done
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: if you don't have `jq` on your Play-With-Docker instance, just install it:
|
||||
```
|
||||
apk add --no-cache jq
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Load balancing results
|
||||
|
||||
Traffic is handled by our cluster's [TCP routing mesh](
|
||||
https://docs.docker.com/engine/swarm/ingress/).
|
||||
|
||||
Each request is served by one of the 7 instances, in rotation.
|
||||
|
||||
Note: if you try to access the service from your browser,
|
||||
you will probably see the same
|
||||
instance name over and over, because your browser (unlike curl) will try
|
||||
to re-use the same connection.
|
||||
|
||||
---
|
||||
|
||||
## Under the hood of the TCP routing mesh
|
||||
|
||||
- Load balancing is done by IPVS
|
||||
|
||||
- IPVS is a high-performance, in-kernel load balancer
|
||||
|
||||
- It's been around for a long time (merged in the kernel since 2.4)
|
||||
|
||||
- Each node runs a local load balancer
|
||||
|
||||
(Allowing connections to be routed directly to the destination,
|
||||
without extra hops)
|
||||
|
||||
---
|
||||
|
||||
## Managing inbound traffic
|
||||
|
||||
There are many ways to deal with inbound traffic on a Swarm cluster.
|
||||
|
||||
- Put all (or a subset) of your nodes in a DNS `A` record
|
||||
|
||||
- Assign your nodes (or a subset) to an ELB
|
||||
|
||||
- Use a virtual IP and make sure that it is assigned to an "alive" node
|
||||
|
||||
- etc.
|
||||
|
||||
---
|
||||
|
||||
class: btw-labels
|
||||
|
||||
## Managing HTTP traffic
|
||||
|
||||
- The TCP routing mesh doesn't parse HTTP headers
|
||||
|
||||
- If you want to place multiple HTTP services on port 80, you need something more
|
||||
|
||||
- You can set up NGINX or HAProxy on port 80 to do the virtual host switching
|
||||
|
||||
- Docker Universal Control Plane provides its own [HTTP routing mesh](
|
||||
https://docs.docker.com/datacenter/ucp/2.1/guides/admin/configure/use-domain-names-to-access-services/)
|
||||
|
||||
- add a specific label starting with `com.docker.ucp.mesh.http` to your services
|
||||
|
||||
- labels are detected automatically and dynamically update the configuration
|
||||
|
||||
---
|
||||
|
||||
class: btw-labels
|
||||
|
||||
## You should use labels
|
||||
|
||||
- Labels are a great way to attach arbitrary information to services
|
||||
|
||||
- Examples:
|
||||
|
||||
- HTTP vhost of a web app or web service
|
||||
|
||||
- backup schedule for a stateful service
|
||||
|
||||
- owner of a service (for billing, paging...)
|
||||
|
||||
- etc.
|
||||
|
||||
---
|
||||
|
||||
## Pro-tip for ingress traffic management
|
||||
|
||||
- It is possible to use *local* networks with Swarm services
|
||||
|
||||
- This means that you can do something like this:
|
||||
```bash
|
||||
docker service create --network host --mode global traefik ...
|
||||
```
|
||||
|
||||
(This runs the `traefik` load balancer on each node of your cluster, in the `host` network)
|
||||
|
||||
- This gives you native performance (no iptables, no proxy, no nothing!)
|
||||
|
||||
- The load balancer will "see" the clients' IP addresses
|
||||
|
||||
- But: a container cannot simultaneously be in the `host` network and another network
|
||||
|
||||
(You will have to route traffic to containers using exposed ports or UNIX sockets)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Using local networks (`host`, `macvlan` ...) with Swarm services
|
||||
|
||||
- Using the `host` network is fairly straightforward
|
||||
|
||||
(With the caveats described on the previous slide)
|
||||
|
||||
- It is also possible to use drivers like `macvlan`
|
||||
|
||||
- see [this guide](
|
||||
https://docs.docker.com/engine/userguide/networking/get-started-macvlan/
|
||||
) to get started on `macvlan`
|
||||
|
||||
- see [this PR](https://github.com/moby/moby/pull/32981) for more information about local network drivers in Swarm mode
|
||||
|
||||
---
|
||||
|
||||
## Visualize container placement
|
||||
|
||||
- Let's leverage the Docker API!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Get the source code of this simple-yet-beautiful visualization app:
|
||||
```bash
|
||||
cd ~
|
||||
git clone git://github.com/dockersamples/docker-swarm-visualizer
|
||||
```
|
||||
|
||||
- Build and run the Swarm visualizer:
|
||||
```bash
|
||||
cd docker-swarm-visualizer
|
||||
docker-compose up -d
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Connect to the visualization webapp
|
||||
|
||||
- It runs a web server on port 8080
|
||||
|
||||
.exercise[
|
||||
|
||||
- Point your browser to port 8080 of your node1's public IP address
|
||||
|
||||
(If you use Play-With-Docker, click on the (8080) badge)
|
||||
|
||||
<!-- ```open http://node1:8080``` -->
|
||||
|
||||
]
|
||||
|
||||
- The webapp updates the display automatically (you don't need to reload the page)
|
||||
|
||||
- It only shows Swarm services (not standalone containers)
|
||||
|
||||
- It shows when nodes go down
|
||||
|
||||
- It has some glitches (it's not Carrier-Grade Enterprise-Compliant ISO-9001 software)
|
||||
|
||||
---
|
||||
|
||||
## Why This Is More Important Than You Think
|
||||
|
||||
- The visualizer accesses the Docker API *from within a container*
|
||||
|
||||
- This is a common pattern: run container management tools *in containers*
|
||||
|
||||
- Instead of viewing your cluster, this could take care of logging, metrics, autoscaling ...
|
||||
|
||||
- We can run it within a service, too! We won't do it, but the command would look like:
|
||||
|
||||
```bash
|
||||
docker service create \
|
||||
--mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \
|
||||
--name viz --constraint node.role==manager ...
|
||||
```
|
||||
|
||||
Credits: the visualization code was written by
|
||||
[Francisco Miranda](https://github.com/maroshii).
|
||||
<br/>
|
||||
[Mano Marks](https://twitter.com/manomarks) adapted
|
||||
it to Swarm and maintains it.
|
||||
|
||||
---
|
||||
|
||||
## Terminate our services
|
||||
|
||||
- Before moving on, we will remove those services
|
||||
|
||||
- `docker service rm` can accept multiple services names or IDs
|
||||
|
||||
- `docker service ls` can accept the `-q` flag
|
||||
|
||||
- A Shell snippet a day keeps the cruft away
|
||||
|
||||
.exercise[
|
||||
|
||||
- Remove all services with this one liner:
|
||||
```bash
|
||||
docker service ls -q | xargs docker service rm
|
||||
```
|
||||
|
||||
]
|
||||
211
docs/healthchecks.md
Normal file
@@ -0,0 +1,211 @@
|
||||
name: healthchecks
|
||||
|
||||
# Health checks
|
||||
|
||||
(New in Docker Engine 1.12)
|
||||
|
||||
- Commands that are executed on regular intervals in a container
|
||||
|
||||
- Must return 0 or 1 to indicate "all is good" or "something's wrong"
|
||||
|
||||
- Must execute quickly (timeouts = failures)
|
||||
|
||||
- Example:
|
||||
```bash
|
||||
curl -f http://localhost/_ping || false
|
||||
```
|
||||
- the `-f` flag ensures that `curl` returns non-zero for 404 and similar errors
|
||||
- `|| false` ensures that any non-zero exit status gets mapped to 1
|
||||
- `curl` must be installed in the container that is being checked
|
||||
|
||||
---
|
||||
|
||||
## Defining health checks
|
||||
|
||||
- In a Dockerfile, with the [HEALTHCHECK](https://docs.docker.com/engine/reference/builder/#healthcheck) instruction
|
||||
```
|
||||
HEALTHCHECK --interval=1s --timeout=3s CMD curl -f http://localhost/ || false
|
||||
```
|
||||
|
||||
- From the command line, when running containers or services
|
||||
```
|
||||
docker run --health-cmd "curl -f http://localhost/ || false" ...
|
||||
docker service create --health-cmd "curl -f http://localhost/ || false" ...
|
||||
```
|
||||
|
||||
- In Compose files, with a per-service [healthcheck](https://docs.docker.com/compose/compose-file/#healthcheck) section
|
||||
```yaml
|
||||
www:
|
||||
image: hellowebapp
|
||||
healthcheck:
|
||||
test: "curl -f https://localhost/ || false"
|
||||
timeout: 3s
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Using health checks
|
||||
|
||||
- With `docker run`, health checks are purely informative
|
||||
|
||||
- `docker ps` shows health status
|
||||
|
||||
- `docker inspect` has extra details (including health check command output)
|
||||
|
||||
- With `docker service`:
|
||||
|
||||
- unhealthy tasks are terminated (i.e. the service is restarted)
|
||||
|
||||
- failed deployments can be rolled back automatically
|
||||
<br/>(by setting *at least* the flag `--update-failure-action rollback`)
|
||||
|
||||
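To see what `docker inspect` reports for a container that has a health check, you can extract just the relevant fields (the container ID is a placeholder):

```bash
docker inspect --format '{{ json .State.Health }}' <containerID>
```
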
---
|
||||
|
||||
## Automated rollbacks
|
||||
|
||||
Here is a comprehensive example using the CLI:
|
||||
|
||||
```bash
|
||||
docker service update \
|
||||
--update-delay 5s \
|
||||
--update-failure-action rollback \
|
||||
--update-max-failure-ratio .25 \
|
||||
--update-monitor 5s \
|
||||
--update-parallelism 1 \
|
||||
--rollback-delay 5s \
|
||||
--rollback-failure-action pause \
|
||||
--rollback-max-failure-ratio .5 \
|
||||
--rollback-monitor 5s \
|
||||
--rollback-parallelism 0 \
|
||||
--health-cmd "curl -f http://localhost/ || exit 1" \
|
||||
--health-interval 2s \
|
||||
--health-retries 1 \
|
||||
--image yourimage:newversion \
|
||||
yourservice
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implementing auto-rollback in practice
|
||||
|
||||
We will use the following Compose file (`stacks/dockercoins+healthcheck.yml`):
|
||||
|
||||
```yaml
|
||||
...
|
||||
hasher:
|
||||
build: dockercoins/hasher
|
||||
image: ${REGISTRY-127.0.0.1:5000}/hasher:${TAG-latest}
|
||||
deploy:
|
||||
replicas: 7
|
||||
update_config:
|
||||
delay: 5s
|
||||
failure_action: rollback
|
||||
max_failure_ratio: .5
|
||||
monitor: 5s
|
||||
parallelism: 1
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Enabling auto-rollback
|
||||
|
||||
.exercise[
|
||||
|
||||
- Go to the `stacks` directory:
|
||||
```bash
|
||||
cd ~/orchestration-workshop/stacks
|
||||
```
|
||||
|
||||
- Deploy the updated stack:
|
||||
```bash
|
||||
docker stack deploy dockercoins --compose-file dockercoins+healthcheck.yml
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
This will also scale the `hasher` service to 7 instances.
|
||||
|
||||
---
|
||||
|
||||
## Visualizing a rolling update
|
||||
|
||||
First, let's make an "innocent" change and deploy it.
|
||||
|
||||
.exercise[
|
||||
|
||||
- Update the `sleep` delay in the code:
|
||||
```bash
|
||||
sed -i "s/sleep 0.1/sleep 0.2/" dockercoins/hasher/hasher.rb
|
||||
```
|
||||
|
||||
- Build, ship, and run the new image:
|
||||
```bash
|
||||
export TAG=v0.5
|
||||
docker-compose -f dockercoins+healthcheck.yml build
|
||||
docker-compose -f dockercoins+healthcheck.yml push
|
||||
docker service update dockercoins_hasher \
|
||||
--detach=false --image=127.0.0.1:5000/hasher:$TAG
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Visualizing an automated rollback
|
||||
|
||||
And now, a breaking change that will cause the health check to fail:
|
||||
|
||||
.exercise[
|
||||
|
||||
- Change the HTTP listening port:
|
||||
```bash
|
||||
sed -i "s/80/81/" dockercoins/hasher/hasher.rb
|
||||
```
|
||||
|
||||
- Build, ship, and run the new image:
|
||||
```bash
|
||||
export TAG=v0.6
|
||||
docker-compose -f dockercoins+healthcheck.yml build
|
||||
docker-compose -f dockercoins+healthcheck.yml push
|
||||
docker service update dockercoins_hasher \
|
||||
--detach=false --image=127.0.0.1:5000/hasher:$TAG
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Command-line options available for health checks, rollbacks, etc.
|
||||
|
||||
Batteries included, but swappable
|
||||
|
||||
.small[
|
||||
```
|
||||
--health-cmd string Command to run to check health
|
||||
--health-interval duration Time between running the check (ms|s|m|h)
|
||||
--health-retries int Consecutive failures needed to report unhealthy
|
||||
--health-start-period duration Start period for the container to initialize before counting retries towards unstable (ms|s|m|h)
|
||||
--health-timeout duration Maximum time to allow one check to run (ms|s|m|h)
|
||||
--no-healthcheck Disable any container-specified HEALTHCHECK
|
||||
--restart-condition string Restart when condition is met ("none"|"on-failure"|"any")
|
||||
--restart-delay duration Delay between restart attempts (ns|us|ms|s|m|h)
|
||||
--restart-max-attempts uint Maximum number of restarts before giving up
|
||||
--restart-window duration Window used to evaluate the restart policy (ns|us|ms|s|m|h)
|
||||
--rollback Rollback to previous specification
|
||||
--rollback-delay duration Delay between task rollbacks (ns|us|ms|s|m|h)
|
||||
--rollback-failure-action string Action on rollback failure ("pause"|"continue")
|
||||
--rollback-max-failure-ratio float Failure rate to tolerate during a rollback
|
||||
--rollback-monitor duration Duration after each task rollback to monitor for failure (ns|us|ms|s|m|h)
|
||||
--rollback-order string Rollback order ("start-first"|"stop-first")
|
||||
--rollback-parallelism uint Maximum number of tasks rolled back simultaneously (0 to roll back all at once)
|
||||
--update-delay duration Delay between updates (ns|us|ms|s|m|h)
|
||||
--update-failure-action string Action on update failure ("pause"|"continue"|"rollback")
|
||||
--update-max-failure-ratio float Failure rate to tolerate during an update
|
||||
--update-monitor duration Duration after each task update to monitor for failure (ns|us|ms|s|m|h)
|
||||
--update-order string Update order ("start-first"|"stop-first")
|
||||
--update-parallelism uint Maximum number of tasks updated simultaneously (0 to update all at once)
|
||||
```
|
||||
]
|
||||
|
||||
Yup ... That's a lot of batteries!
|
||||
8552
docs/index.html
15
docs/intro-ks.md
Normal file
@@ -0,0 +1,15 @@
|
||||
## About these slides
|
||||
|
||||
- Your one-stop shop to awesomeness:
|
||||
|
||||
http://container.training/
|
||||
|
||||
- The content that you're viewing right now is in a public GitHub repository:
|
||||
|
||||
https://github.com/jpetazzo/orchestration-workshop
|
||||
|
||||
- Typos? Mistakes? Questions? Feel free to hover over the bottom of the slide ...
|
||||
|
||||
--
|
||||
|
||||
.footnote[👇 Try it! The source file will be shown and you can view it on GitHub and fork and edit it.]
|
||||
41
docs/intro.md
Normal file
@@ -0,0 +1,41 @@
|
||||
## A brief introduction
|
||||
|
||||
- This was initially written to support in-person,
|
||||
instructor-led workshops and tutorials
|
||||
|
||||
- You can also follow along on your own, at your own pace
|
||||
|
||||
- We included as much information as possible in these slides
|
||||
|
||||
- We recommend having a mentor to help you ...
|
||||
|
||||
- ... Or be comfortable spending some time reading the Docker
|
||||
[documentation](https://docs.docker.com/) ...
|
||||
|
||||
- ... And looking for answers in the [Docker forums](forums.docker.com),
|
||||
[StackOverflow](http://stackoverflow.com/questions/tagged/docker),
|
||||
and other outlets
|
||||
|
||||
---
|
||||
|
||||
class: self-paced
|
||||
|
||||
## Hands on, you shall practice
|
||||
|
||||
- Nobody ever became a Jedi by spending their lives reading Wookiepedia
|
||||
|
||||
- Likewise, it will take more than merely *reading* these slides
|
||||
to make you an expert
|
||||
|
||||
- These slides include *tons* of exercises
|
||||
|
||||
- They assume that you have access to a cluster of Docker nodes
|
||||
|
||||
- If you are attending a workshop or tutorial:
|
||||
<br/>you will be given specific instructions to access your cluster
|
||||
|
||||
- If you are doing this on your own:
|
||||
<br/>you can use
|
||||
[Play-With-Docker](http://www.play-with-docker.com/) and
|
||||
read [these instructions](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker) for extra
|
||||
details
|
||||
140
docs/ipsec.md
Normal file
@@ -0,0 +1,140 @@
|
||||
# Securing overlay networks
|
||||
|
||||
- By default, overlay networks are using plain VXLAN encapsulation
|
||||
|
||||
(~Ethernet over UDP, using SwarmKit's control plane for ARP resolution)
|
||||
|
||||
- Encryption can be enabled on a per-network basis
|
||||
|
||||
(It will use IPSEC encryption provided by the kernel, leveraging hardware acceleration)
|
||||
|
||||
- This is only for the `overlay` driver
|
||||
|
||||
(Other drivers/plugins will use different mechanisms)
|
||||
|
||||
---
|
||||
|
||||
## Creating two networks: encrypted and not
|
||||
|
||||
- Let's create two networks for testing purposes
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create an "insecure" network:
|
||||
```bash
|
||||
docker network create insecure --driver overlay --attachable
|
||||
```
|
||||
|
||||
- Create a "secure" network:
|
||||
```bash
|
||||
docker network create secure --opt encrypted --driver overlay --attachable
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
.warning[Make sure that you don't typo that option; errors are silently ignored!]
|
||||
|
||||
---
|
||||
|
||||
## Deploying a web server sitting on both networks
|
||||
|
||||
- Let's use good old NGINX
|
||||
|
||||
- We will attach it to both networks
|
||||
|
||||
- We will use a placement constraint to make sure that it is on a different node
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create a web server running somewhere else:
|
||||
```bash
|
||||
docker service create --name web \
|
||||
--network secure --network insecure \
|
||||
--constraint node.hostname!=node1 \
|
||||
nginx
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Sniff HTTP traffic
|
||||
|
||||
- We will use `ngrep`, which lets us grep network traffic
|
||||
|
||||
- We will run it in a container, using host networking to access the host's interfaces
|
||||
|
||||
.exercise[
|
||||
|
||||
- Sniff network traffic and display all packets containing "HTTP":
|
||||
```bash
|
||||
docker run --net host nicolaka/netshoot ngrep -tpd eth0 HTTP
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
Seeing tons of HTTP requests? Shut down your DockerCoins workers:
|
||||
```bash
|
||||
docker service update dockercoins_worker --replicas=0
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Check that we are, indeed, sniffing traffic
|
||||
|
||||
- Let's see if we can intercept our traffic with Google!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Open a new terminal
|
||||
|
||||
- Issue an HTTP request to Google (or anything you like):
|
||||
```bash
|
||||
curl google.com
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
The ngrep container will display one `#` per packet traversing the network interface.
|
||||
|
||||
When you do the `curl`, you should see the HTTP request in clear text in the output.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## If you are using Play-With-Docker, Vagrant, etc.
|
||||
|
||||
- You will probably have *two* network interfaces
|
||||
|
||||
- One interface will be used for outbound traffic (to Google)
|
||||
|
||||
- The other one will be used for internode traffic
|
||||
|
||||
- You might have to adapt/relaunch the `ngrep` command to specify the right one!
|
||||
|
||||
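For instance, you could list the interfaces first, then point `ngrep` at the one carrying internode traffic (`eth1` below is just a guess; adapt it to what you see):

```bash
# List interfaces and their addresses:
ip -o addr show

# Then sniff on the relevant interface:
docker run --net host nicolaka/netshoot ngrep -tpd eth1 HTTP
```
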
---
|
||||
|
||||
## Try to sniff traffic across overlay networks
|
||||
|
||||
- We will run `curl web` through both secure and insecure networks
|
||||
|
||||
.exercise[
|
||||
|
||||
- Access the web server through the insecure network:
|
||||
```bash
|
||||
docker run --rm --net insecure nicolaka/netshoot curl web
|
||||
```
|
||||
|
||||
- Now do the same through the secure network:
|
||||
```bash
|
||||
docker run --rm --net secure nicolaka/netshoot curl web
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
When you run the first command, you will see HTTP fragments.
|
||||
<br/>
|
||||
However, when you run the second one, only `#` will show up.
|
||||
BIN
docs/k8s-arch1.png
Normal file
|
After Width: | Height: | Size: 352 KiB |
BIN
docs/k8s-arch2.png
Normal file
|
After Width: | Height: | Size: 136 KiB |
106
docs/kube.yml
Normal file
@@ -0,0 +1,106 @@
|
||||
exclude:
|
||||
- self-paced
|
||||
- snap
|
||||
|
||||
chat: "[Gitter](https://gitter.im/jpetazzo/workshop-20171026-prague)"
|
||||
|
||||
title: "Deploying and Scaling Microservices with Docker and Kubernetes"
|
||||
|
||||
chapters:
|
||||
- |
|
||||
class: title
|
||||
|
||||
.small[
|
||||
|
||||
Deploying and Scaling Microservices <br/> with Docker and Kubernetes
|
||||
|
||||
.small[.small[
|
||||
|
||||
**Be kind to the WiFi!**
|
||||
|
||||
<!--
|
||||
*Use the 5G network*
|
||||
<br/>
|
||||
-->
|
||||
*Don't use your hotspot*
|
||||
<br/>
|
||||
*Don't stream videos from YouTube, Netflix, etc.
|
||||
<br/>(if you're bored, watch local content instead)*
|
||||
|
||||
Thank you!
|
||||
|
||||
]
|
||||
]
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Intros
|
||||
|
||||
- Hello! We are
|
||||
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo), Docker Inc.)
|
||||
&
|
||||
AJ ([@s0ulshake](https://twitter.com/s0ulshake), Travis CI)
|
||||
|
||||
--
|
||||
|
||||
- This is our first time doing this
|
||||
|
||||
--
|
||||
|
||||
- But ... containers and us go back a long way
|
||||
|
||||
--
|
||||
|
||||

|
||||
|
||||
--
|
||||
|
||||
- In the immortal words of [Chelsea Manning](https://twitter.com/xychelsea): #WeGotThis!
|
||||
|
||||
---
|
||||
|
||||
## Logistics
|
||||
|
||||
- The tutorial will run from 9:00am to 12:15pm
|
||||
|
||||
- There will be a coffee break at 10:30am
|
||||
<br/>
|
||||
(please remind me if I forget about it!)
|
||||
|
||||
- This will be fast-paced, but DON'T PANIC!
|
||||
<br/>
|
||||
(all the content is publicly available)
|
||||
|
||||
- Feel free to interrupt for questions at any time
|
||||
|
||||
- Live feedback, questions, help on @@CHAT@@
|
||||
|
||||
- intro-ks.md
|
||||
- |
|
||||
@@TOC@@
|
||||
- - prereqs-k8s.md
|
||||
- versions-k8s.md
|
||||
- sampleapp.md
|
||||
- - concepts-k8s.md
|
||||
- kubenet.md
|
||||
- kubectlget.md
|
||||
- setup-k8s.md
|
||||
- kubectlrun.md
|
||||
- - kubectlexpose.md
|
||||
- ourapponkube.md
|
||||
- dashboard.md
|
||||
- - kubectlscale.md
|
||||
- daemonset.md
|
||||
- rollout.md
|
||||
- whatsnext.md
|
||||
- |
|
||||
class: title
|
||||
|
||||
That's all folks! <br/> Questions?
|
||||
|
||||
.small[.small[
|
||||
|
||||
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@docker](https://twitter.com/docker)
|
||||
|
||||
]]
|
||||
140
docs/kubectlexpose.md
Normal file
@@ -0,0 +1,140 @@
|
||||
# Exposing containers
|
||||
|
||||
- `kubectl expose` creates a *service* for existing pods
|
||||
|
||||
- A *service* is a stable address for a pod (or a bunch of pods)
|
||||
|
||||
- If we want to connect to our pod(s), we need to create a *service*
|
||||
|
||||
- Once a service is created, `kube-dns` will allow us to resolve it by name
|
||||
|
||||
(i.e. after creating service `hello`, the name `hello` will resolve to something)
|
||||
|
||||
- There are different types of services, detailed on the following slides:
|
||||
|
||||
`ClusterIP`, `NodePort`, `LoadBalancer`, `ExternalName`
|
||||
|
||||
---
|
||||
|
||||
## Basic service types
|
||||
|
||||
- `ClusterIP` (default type)
|
||||
|
||||
- a virtual IP address is allocated for the service (in an internal, private range)
|
||||
- this IP address is reachable only from within the cluster (nodes and pods)
|
||||
- our code can connect to the service using the original port number
|
||||
|
||||
- `NodePort`
|
||||
|
||||
- a port is allocated for the service (by default, in the 30000-32768 range)
|
||||
- that port is made available *on all our nodes* and anybody can connect to it
|
||||
- our code must be changed to connect to that new port number
|
||||
|
||||
These service types are always available.
|
||||
|
||||
Under the hood: `kube-proxy` is using a userland proxy and a bunch of `iptables` rules.
|
||||
|
||||
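If you are curious, you can peek at those rules on a node; the exact chains depend on the `kube-proxy` mode, but they are all prefixed with `KUBE-` (just a sanity check, nothing to change here):

```bash
sudo iptables-save | grep KUBE- | head
```
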
---
|
||||
|
||||
## More service types
|
||||
|
||||
- `LoadBalancer`
|
||||
|
||||
- an external load balancer is allocated for the service
|
||||
- the load balancer is configured accordingly
|
||||
<br/>(e.g.: a `NodePort` service is created, and the load balancer sends traffic to that port)
|
||||
|
||||
- `ExternalName`
|
||||
|
||||
- the DNS entry managed by `kube-dns` will just be a `CNAME` to a provided record
|
||||
- no port, no IP address, no nothing else is allocated
|
||||
|
||||
The `LoadBalancer` type is currently only available on AWS, Azure, and GCE.
|
||||
|
||||
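As an illustration, an `ExternalName` service is little more than a DNS alias; a minimal sketch would look like this (names are placeholders, and we won't use this in the workshop):

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: my-db
spec:
  type: ExternalName
  externalName: db.example.com
EOF
```
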
---
|
||||
|
||||
## Running containers with open ports
|
||||
|
||||
- Since `ping` doesn't have anything to connect to, we'll have to run something else
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start a bunch of ElasticSearch containers:
|
||||
```bash
|
||||
kubectl run elastic --image=elasticsearch:2 --replicas=7
|
||||
```
|
||||
|
||||
- Watch them being started:
|
||||
```bash
|
||||
kubectl get pods -w
|
||||
```
|
||||
|
||||
<!-- ```keys ^C``` -->
|
||||
|
||||
]
|
||||
|
||||
The `-w` option "watches" events happening on the specified resources.
|
||||
|
||||
Note: please DO NOT call the service `search`. It would collide with the TLD.
|
||||
|
||||
---
|
||||
|
||||
## Exposing our deployment
|
||||
|
||||
- We'll create a default `ClusterIP` service
|
||||
|
||||
.exercise[
|
||||
|
||||
- Expose the ElasticSearch HTTP API port:
|
||||
```bash
|
||||
kubectl expose deploy/elastic --port 9200
|
||||
```
|
||||
|
||||
- Look up which IP address was allocated:
|
||||
```bash
|
||||
kubectl get svc
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Services are layer 4 constructs
|
||||
|
||||
- You can assign IP addresses to services, but they are still *layer 4*
|
||||
|
||||
(i.e. a service is not an IP address; it's an IP address + protocol + port)
|
||||
|
||||
- This is caused by the current implementation of `kube-proxy`
|
||||
|
||||
(it relies on mechanisms that don't support layer 3)
|
||||
|
||||
- As a result: you *have to* indicate the port number for your service
|
||||
|
||||
- Running services with arbitrary port (or port ranges) requires hacks
|
||||
|
||||
(e.g. host networking mode)
|
||||
|
||||
---
|
||||
|
||||
## Testing our service
|
||||
|
||||
- We will now send a few HTTP requests to our ElasticSearch pods
|
||||
|
||||
.exercise[
|
||||
|
||||
- Let's obtain the IP address that was allocated for our service, *programmatically:*
|
||||
```bash
|
||||
IP=$(kubectl get svc elastic -o go-template --template '{{ .spec.clusterIP }}')
|
||||
```
|
||||
|
||||
- Send a few requests:
|
||||
```bash
|
||||
curl http://$IP:9200/
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
Our requests are load balanced across multiple pods.
|
||||
234
docs/kubectlget.md
Normal file
@@ -0,0 +1,234 @@
|
||||
# First contact with `kubectl`
|
||||
|
||||
- `kubectl` is (almost) the only tool we'll need to talk to Kubernetes
|
||||
|
||||
- It is a rich CLI tool around the Kubernetes API
|
||||
|
||||
(Everything you can do with `kubectl`, you can do directly with the API)
|
||||
|
||||
- On our machines, there is a `~/.kube/config` file with:
|
||||
|
||||
- the Kubernetes API address
|
||||
|
||||
- the path to our TLS certificates used to authenticate
|
||||
|
||||
- You can also use the `--kubeconfig` flag to pass a config file
|
||||
|
||||
- Or directly `--server`, `--user`, etc.
|
||||
|
||||
- `kubectl` can be pronounced "Cube C T L", "Cube cuttle", "Cube cuddle"...
|
||||
|
||||
---
|
||||
|
||||
## `kubectl get`
|
||||
|
||||
- Let's look at our `Node` resources with `kubectl get`!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Look at the composition of our cluster:
|
||||
```bash
|
||||
kubectl get node
|
||||
```
|
||||
|
||||
- These commands are equivalent:
|
||||
```bash
|
||||
kubectl get no
|
||||
kubectl get node
|
||||
kubectl get nodes
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## From human-readable to machine-readable output
|
||||
|
||||
- `kubectl get` can output JSON, YAML, or be directly formatted
|
||||
|
||||
.exercise[
|
||||
|
||||
- Give us more info about them nodes:
|
||||
```bash
|
||||
kubectl get nodes -o wide
|
||||
```
|
||||
|
||||
- Let's have some YAML:
|
||||
```bash
|
||||
kubectl get no -o yaml
|
||||
```
|
||||
See that `kind: List` at the end? It's the type of our result!
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## (Ab)using `kubectl` and `jq`
|
||||
|
||||
- It's super easy to build custom reports
|
||||
|
||||
.exercise[
|
||||
|
||||
- Show the capacity of all our nodes as a stream of JSON objects:
|
||||
```bash
|
||||
kubectl get nodes -o json |
|
||||
jq ".items[] | {name:.metadata.name} + .status.capacity"
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## What's available?
|
||||
|
||||
- `kubectl` has pretty good introspection facilities
|
||||
|
||||
- We can list all available resource types by running `kubectl get`
|
||||
|
||||
- We can view details about a resource with:
|
||||
```bash
|
||||
kubectl describe type/name
|
||||
kubectl describe type name
|
||||
```
|
||||
|
||||
- We can view the definition for a resource type with:
|
||||
```bash
|
||||
kubectl explain type
|
||||
```
|
||||
|
||||
Each time, `type` can be singular, plural, or abbreviated type name.
|
||||
|
||||
---
|
||||
|
||||
## Services
|
||||
|
||||
- A *service* is a stable endpoint to connect to "something"
|
||||
|
||||
(In the initial proposal, they were called "portals")
|
||||
|
||||
.exercise[
|
||||
|
||||
- List the services on our cluster with one of these commands:
|
||||
```bash
|
||||
kubectl get services
|
||||
kubectl get svc
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
There is already one service on our cluster: the Kubernetes API itself.
|
||||
|
||||
---
|
||||
|
||||
## ClusterIP services
|
||||
|
||||
- A `ClusterIP` service is internal, available from the cluster only
|
||||
|
||||
- This is useful for introspection from within containers
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try to connect to the API:
|
||||
```bash
|
||||
curl -k https://`10.96.0.1`
|
||||
```
|
||||
|
||||
- `-k` is used to skip certificate verification
|
||||
- Make sure to replace 10.96.0.1 with the CLUSTER-IP shown earlier
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
The error that we see is expected: the Kubernetes API requires authentication.
|
||||
|
||||
---
|
||||
|
||||
## Listing running containers
|
||||
|
||||
- Containers are manipulated through *pods*
|
||||
|
||||
- A pod is a group of containers:
|
||||
|
||||
- running together (on the same node)
|
||||
|
||||
- sharing resources (RAM, CPU; but also network, volumes)
|
||||
|
||||
.exercise[
|
||||
|
||||
- List pods on our cluster:
|
||||
```bash
|
||||
kubectl get pods
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
*These are not the pods you're looking for.* But where are they?!?
|
||||
|
||||
---
|
||||
|
||||
## Namespaces
|
||||
|
||||
- Namespaces allow us to segregate resources
|
||||
|
||||
.exercise[
|
||||
|
||||
- List the namespaces on our cluster with one of these commands:
|
||||
```bash
|
||||
kubectl get namespaces
|
||||
kubectl get namespace
|
||||
kubectl get ns
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
*You know what ... This `kube-system` thing looks suspicious.*
|
||||
|
||||
---
|
||||
|
||||
## Accessing namespaces
|
||||
|
||||
- By default, `kubectl` uses the `default` namespace
|
||||
|
||||
- We can switch to a different namespace with the `-n` option
|
||||
|
||||
.exercise[
|
||||
|
||||
- List the pods in the `kube-system` namespace:
|
||||
```bash
|
||||
kubectl -n kube-system get pods
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
*Ding ding ding ding ding!*
|
||||
|
||||
---
|
||||
|
||||
## What are all these pods?
|
||||
|
||||
- `etcd` is our etcd server
|
||||
|
||||
- `kube-apiserver` is the API server
|
||||
|
||||
- `kube-controller-manager` and `kube-scheduler` are other master components
|
||||
|
||||
- `kube-dns` is an additional component (not mandatory but super useful, so it's there)
|
||||
|
||||
- `kube-proxy` is the (per-node) component managing port mappings and such
|
||||
|
||||
- `weave` is the (per-node) component managing the network overlay
|
||||
|
||||
- the `READY` column indicates the number of containers in each pod
|
||||
|
||||
- the pods with a name ending with `-node1` are the master components
|
||||
<br/>
|
||||
(they have been specifically "pinned" to the master node)
|
||||
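To double-check which node each of these pods is running on, `-o wide` adds a NODE column:

```bash
kubectl -n kube-system get pods -o wide
```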
249
docs/kubectlrun.md
Normal file
@@ -0,0 +1,249 @@
|
||||
# Running our first containers on Kubernetes
|
||||
|
||||
- First things first: we cannot run a container
|
||||
|
||||
--
|
||||
|
||||
- We are going to run a pod, and in that pod there will be a single container
|
||||
|
||||
--
|
||||
|
||||
- In that container in the pod, we are going to run a simple `ping` command
|
||||
|
||||
- Then we are going to start additional copies of the pod
|
||||
|
||||
---
|
||||
|
||||
## Starting a simple pod with `kubectl run`
|
||||
|
||||
- We need to specify at least a *name* and the image we want to use
|
||||
|
||||
.exercise[
|
||||
|
||||
- Let's ping `goo.gl`:
|
||||
```bash
|
||||
kubectl run pingpong --image alpine ping goo.gl
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
OK, what just happened?
|
||||
|
||||
---
|
||||
|
||||
## Behind the scenes of `kubectl run`
|
||||
|
||||
- Let's look at the resources that were created by `kubectl run`
|
||||
|
||||
.exercise[
|
||||
|
||||
- List most resource types:
|
||||
```bash
|
||||
kubectl get all
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
We should see the following things:
|
||||
- `deploy/pingpong` (the *deployment* that we just created)
|
||||
- `rs/pingpong-xxxx` (a *replica set* created by the deployment)
|
||||
- `po/pingpong-yyyy` (a *pod* created by the replica set)
|
||||
|
||||
---
|
||||
|
||||
## Deployments, replica sets, and replication controllers
|
||||
|
||||
- A *deployment* is a high-level construct
|
||||
|
||||
- allows scaling, rolling updates, rollbacks
|
||||
|
||||
- multiple deployments can be used together to implement a
|
||||
[canary deployment](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#canary-deployments)
|
||||
|
||||
- delegates pods management to *replica sets*
|
||||
|
||||
- A *replica set* is a low-level construct
|
||||
|
||||
- makes sure that a given number of identical pods are running
|
||||
|
||||
- allows scaling
|
||||
|
||||
- rarely used directly
|
||||
|
||||
- A *replication controller* is the (deprecated) predecessor of a replica set
|
||||
|
||||
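As a quick sketch, this chain of ownership can be explored like this (the replica set and pod names will carry different suffixes on your cluster):

```bash
kubectl describe deploy pingpong
kubectl get rs,pods -l run=pingpong
```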
---
|
||||
|
||||
## Our `pingpong` deployment
|
||||
|
||||
- `kubectl run` created a *deployment*, `deploy/pingpong`
|
||||
|
||||
- That deployment created a *replica set*, `rs/pingpong-xxxx`
|
||||
|
||||
- That replica set created a *pod*, `po/pingpong-yyyy`
|
||||
|
||||
- We'll see later how these folks play together for:
|
||||
|
||||
- scaling
|
||||
|
||||
- high availability
|
||||
|
||||
- rolling updates
|
||||
|
||||
---
|
||||
|
||||
## Viewing container output
|
||||
|
||||
- Let's use the `kubectl logs` command
|
||||
|
||||
- We will pass either a *pod name*, or a *type/name*
|
||||
|
||||
(E.g. if we specify a deployment or replica set, it will get the first pod in it)
|
||||
|
||||
- Unless specified otherwise, it will only show logs of the first container in the pod
|
||||
|
||||
(Good thing there's only one in ours!)
|
||||
|
||||
.exercise[
|
||||
|
||||
- View the result of our `ping` command:
|
||||
```bash
|
||||
kubectl logs deploy/pingpong
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Streaming logs in real time
|
||||
|
||||
- Just like `docker logs`, `kubectl logs` supports convenient options:
|
||||
|
||||
- `-f`/`--follow` to stream logs in real time (à la `tail -f`)
|
||||
|
||||
- `--tail` to indicate how many lines you want to see (from the end)
|
||||
|
||||
- `--since` to get logs only after a given timestamp
|
||||
|
||||
.exercise[
|
||||
|
||||
- View the latest logs of our `ping` command:
|
||||
```bash
|
||||
kubectl logs deploy/pingpong --tail 1 --follow
|
||||
```
|
||||
|
||||
<!--
|
||||
```keys
|
||||
^C
|
||||
```
|
||||
-->
|
||||
|
||||
]
|
||||
|
||||
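The `--since` option mentioned above takes a relative duration; for instance, to only show what was logged during the last minute (a quick sketch):

```bash
kubectl logs deploy/pingpong --since 1m
```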
---
|
||||
|
||||
## Scaling our application
|
||||
|
||||
- We can create additional copies of our container (I mean, our pod) with `kubectl scale`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Scale our `pingpong` deployment:
|
||||
```bash
|
||||
kubectl scale deploy/pingpong --replicas 8
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: what if we tried to scale `rs/pingpong-xxxx`?
|
||||
|
||||
We could! But the *deployment* would notice it right away, and scale back to the initial level.
|
||||
|
||||
---
|
||||
|
||||
## Resilience
|
||||
|
||||
- The *deployment* `pingpong` watches its *replica set*
|
||||
|
||||
- The *replica set* ensures that the right number of *pods* are running
|
||||
|
||||
- What happens if pods disappear?
|
||||
|
||||
.exercise[
|
||||
|
||||
- In a separate window, list pods, and keep watching them:
|
||||
```bash
|
||||
kubectl get pods -w
|
||||
```
|
||||
|
||||
<!--
|
||||
```keys
|
||||
^C
|
||||
```
|
||||
-->
|
||||
|
||||
- Destroy a pod:
|
||||
```bash
|
||||
kubectl delete pod pingpong-yyyy
|
||||
```
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## What if we wanted something different?
|
||||
|
||||
- What if we wanted to start a "one-shot" container that *doesn't* get restarted?
|
||||
|
||||
- We could use `kubectl run --restart=OnFailure` or `kubectl run --restart=Never`
|
||||
|
||||
- These commands would create *jobs* or *pods* instead of *deployments*
|
||||
|
||||
- Under the hood, `kubectl run` invokes "generators" to create resource descriptions
|
||||
|
||||
- We could also write these resource descriptions ourselves (typically in YAML),
|
||||
<br/>and create them on the cluster with `kubectl apply -f` (discussed later)
|
||||
|
||||
- With `kubectl run --schedule=...`, we can also create *cronjobs*
|
||||
|
||||
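As a sketch, a one-shot pod that runs a single command and is never restarted could be started like this (with `--restart=Never`, `kubectl run` creates a bare pod instead of a deployment):

```bash
kubectl run oneshot --image alpine --restart=Never echo hello
```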
---
|
||||
|
||||
## Viewing logs of multiple pods
|
||||
|
||||
- When we specify a deployment name, only a single pod's logs are shown
|
||||
|
||||
- We can view the logs of multiple pods by specifying a *selector*
|
||||
|
||||
- A selector is a logic expression using *labels*
|
||||
|
||||
- Conveniently, when you `kubectl run somename`, the associated objects have a `run=somename` label
|
||||
|
||||
.exercise[
|
||||
|
||||
- View the last line of log from all pods with the `run=pingpong` label:
|
||||
```bash
|
||||
kubectl logs -l run=pingpong --tail 1
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Unfortunately, `--follow` cannot (yet) be used to stream the logs from multiple containers.
|
||||
|
||||
---
|
||||
|
||||
class: title
|
||||
|
||||
.small[
|
||||
Meanwhile, at the Google NOC ...
|
||||
|
||||
.small[
|
||||
Why the hell
|
||||
<br/>
|
||||
are we getting 1000 packets per second
|
||||
<br/>
|
||||
of ICMP ECHO traffic from EC2 ?!?
|
||||
]
|
||||
]
|
||||
24
docs/kubectlscale.md
Normal file
@@ -0,0 +1,24 @@
|
||||
# Scaling a deployment
|
||||
|
||||
- We will start with an easy one: the `worker` deployment
|
||||
|
||||
.exercise[
|
||||
|
||||
- Open two new terminals to check what's going on with pods and deployments:
|
||||
```bash
|
||||
kubectl get pods -w
|
||||
kubectl get deployments -w
|
||||
```
|
||||
|
||||
<!-- ```keys ^C``` -->
|
||||
|
||||
- Now, create more `worker` replicas:
|
||||
```bash
|
||||
kubectl scale deploy/worker --replicas=10
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
After a few seconds, the graph in the web UI should show up.
|
||||
<br/>
|
||||
(And peak at 10 hashes/second, just like when we were running on a single node.)
|
||||
81
docs/kubenet.md
Normal file
@@ -0,0 +1,81 @@
|
||||
# Kubernetes network model
|
||||
|
||||
- TL;DR:
|
||||
|
||||
*Our cluster (nodes and pods) is one big flat IP network.*
|
||||
|
||||
--
|
||||
|
||||
- In detail:
|
||||
|
||||
- all nodes must be able to reach each other, without NAT
|
||||
|
||||
- all pods must be able to reach each other, without NAT
|
||||
|
||||
- pods and nodes must be able to reach each other, without NAT
|
||||
|
||||
- each pod is aware of its IP address (no NAT)
|
||||
|
||||
- Kubernetes doesn't mandate any particular implementation
|
||||
|
||||
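One quick way to see that flat network in action is to list the IP address of every pod (`-o wide` adds IP and node columns):

```bash
kubectl get pods -o wide --all-namespaces
```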
---
|
||||
|
||||
## Kubernetes network model: the good
|
||||
|
||||
- Everything can reach everything
|
||||
|
||||
- No address translation
|
||||
|
||||
- No port translation
|
||||
|
||||
- No new protocol
|
||||
|
||||
- Pods cannot move from a node to another and keep their IP address
|
||||
|
||||
- IP addresses don't have to be "portable" from a node to another
|
||||
|
||||
(We can use e.g. a subnet per node and use a simple routed topology)
|
||||
|
||||
- The specification is simple enough to allow many different implementations
|
||||
|
||||
---
|
||||
|
||||
## Kubernetes network model: the bad and the ugly
|
||||
|
||||
- Everything can reach everything
|
||||
|
||||
- if you want security, you need to add network policies
|
||||
|
||||
- the network implementation that you use needs to support them
|
||||
|
||||
- There are literally dozens of implementations out there
|
||||
|
||||
(15 are listed in the Kubernetes documentation)
|
||||
|
||||
- It *looks like* you have a layer 3 network, but it's only layer 4
|
||||
|
||||
(The spec requires UDP and TCP, but not port ranges or arbitrary IP packets)
|
||||
|
||||
- `kube-proxy` is on the data path when connecting to a pod or container,
|
||||
<br/>and it's not particularly fast (relies on userland proxying or iptables)
|
||||
|
||||
---
|
||||
|
||||
## Kubernetes network model: in practice
|
||||
|
||||
- The nodes that we are using have been set up to use Weave
|
||||
|
||||
- We don't endorse Weave in any particular way, it just Works For Us
|
||||
|
||||
- Don't worry about the warning about `kube-proxy` performance
|
||||
|
||||
- Unless you:
|
||||
|
||||
- routinely saturate 10G network interfaces
|
||||
|
||||
- count packet rates in millions per second
|
||||
|
||||
- run high-traffic VOIP or gaming platforms
|
||||
|
||||
- do weird things that involve millions of simultaneous connections
|
||||
<br/>(in which case you're already familiar with kernel tuning)
|
||||
60
docs/leastprivilege.md
Normal file
@@ -0,0 +1,60 @@
|
||||
# Least privilege model
|
||||
|
||||
- All the important data is stored in the "Raft log"
|
||||
|
||||
- Manager nodes have read/write access to this data
|
||||
|
||||
- Worker nodes have no access to this data
|
||||
|
||||
- Workers only receive the minimum amount of data that they need:
|
||||
|
||||
- which services to run
|
||||
- network configuration information for these services
|
||||
- credentials for these services
|
||||
|
||||
- Compromising a worker node does not give access to the full cluster
|
||||
|
||||
---
|
||||
|
||||
## What can I do if I compromise a worker node?
|
||||
|
||||
- I can enter the containers running on that node
|
||||
|
||||
- I can access the configuration and credentials used by these containers
|
||||
|
||||
- I can inspect the network traffic of these containers
|
||||
|
||||
- I cannot inspect or disrupt the network traffic of other containers
|
||||
|
||||
(network information is provided by manager nodes; ARP spoofing is not possible)
|
||||
|
||||
- I cannot infer the topology of the cluster or its number of nodes
|
||||
|
||||
- I can only learn the IP addresses of the manager nodes
|
||||
|
||||
---
|
||||
|
||||
## Guidelines for workload isolation leveraging least privilege model
|
||||
|
||||
- Define security levels
|
||||
|
||||
- Define security zones
|
||||
|
||||
- Put managers in the highest security zone
|
||||
|
||||
- Enforce that workloads of a given security level run in a given zone
|
||||
|
||||
- Enforcement can be done with [Authorization Plugins](https://docs.docker.com/engine/extend/plugins_authorization/)
|
||||
|
||||
---
|
||||
|
||||
## Learning more about container security
|
||||
|
||||
.blackbelt[DC17US: Securing Containers, One Patch At A Time
|
||||
([video](https://www.youtube.com/watch?v=jZSs1RHwcqo&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=4))]
|
||||
|
||||
.blackbelt[DC17EU: Container-relevant Upstream Kernel Developments
|
||||
([video](https://dockercon.docker.com/watch/7JQBpvHJwjdW6FKXvMfCK1))]
|
||||
|
||||
.blackbelt[DC17EU: What Have Syscalls Done for you Lately?
|
||||
([video](https://dockercon.docker.com/watch/4ZxNyWuwk9JHSxZxgBBi6J))]
|
||||
136
docs/lisa.yml
Normal file
@@ -0,0 +1,136 @@
|
||||
title: "LISA17 T9: Build, Ship, and Run Microservices on a Docker Swarm Cluster"
|
||||
|
||||
chat: "[Gitter](https://gitter.im/jpetazzo/workshop-20171031-sanfrancisco)"
|
||||
|
||||
|
||||
exclude:
|
||||
- self-paced
|
||||
- snap
|
||||
- auto-btp
|
||||
- benchmarking
|
||||
- elk-manual
|
||||
- prom-manual
|
||||
|
||||
chapters:
|
||||
- |
|
||||
class: title
|
||||
|
||||
.small[
|
||||
|
||||
LISA17 T9
|
||||
|
||||
Build, Ship, and Run Microservices on a Docker Swarm Cluster
|
||||
|
||||
.small[.small[
|
||||
|
||||
**Be kind to the WiFi!**
|
||||
|
||||
*Use the 5G network*
|
||||
<br/>
|
||||
*Don't use your hotspot*
|
||||
<br/>
|
||||
*Don't stream videos from YouTube, Netflix, etc.
|
||||
<br/>(if you're bored, watch local content instead)*
|
||||
|
||||
<!--
|
||||
Also: share the power outlets
|
||||
<br/>
|
||||
*(with limited power comes limited responsibility?)*
|
||||
<br/>
|
||||
*(or something?)*
|
||||
-->
|
||||
|
||||
Thank you!
|
||||
|
||||
]
|
||||
]
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Intros
|
||||
|
||||
- Hello! We are
|
||||
AJ ([@s0ulshake](https://twitter.com/s0ulshake), Travis CI)
|
||||
&
|
||||
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo), Docker Inc.)
|
||||
|
||||
--
|
||||
|
||||
- This is our collective Docker knowledge:
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
## Logistics
|
||||
|
||||
- The tutorial will run from 1:30pm to 5:00pm
|
||||
|
||||
- This will be fast-paced, but DON'T PANIC!
|
||||
|
||||
- There will be a coffee break at 3:00pm
|
||||
<br/>
|
||||
(please remind us if we forget about it!)
|
||||
|
||||
- Feel free to interrupt for questions at any time
|
||||
|
||||
- All the content is publicly available (slides, code samples, scripts)
|
||||
|
||||
One URL to remember: http://container.training
|
||||
|
||||
- Live feedback, questions, help on @@CHAT@@
|
||||
|
||||
- intro.md
|
||||
- |
|
||||
@@TOC@@
|
||||
- - prereqs.md
|
||||
- versions.md
|
||||
- |
|
||||
class: title
|
||||
|
||||
All right!
|
||||
<br/>
|
||||
We're all set.
|
||||
<br/>
|
||||
Let's do this.
|
||||
- sampleapp.md
|
||||
- swarmkit.md
|
||||
- creatingswarm.md
|
||||
- morenodes.md
|
||||
- - firstservice.md
|
||||
- ourapponswarm.md
|
||||
- updatingservices.md
|
||||
#- rollingupdates.md
|
||||
#- healthchecks.md
|
||||
- - operatingswarm.md
|
||||
#- netshoot.md
|
||||
#- ipsec.md
|
||||
#- swarmtools.md
|
||||
- security.md
|
||||
#- secrets.md
|
||||
#- encryptionatrest.md
|
||||
- leastprivilege.md
|
||||
- apiscope.md
|
||||
- logging.md
|
||||
- metrics.md
|
||||
#- stateful.md
|
||||
#- extratips.md
|
||||
- end.md
|
||||
- |
|
||||
class: title
|
||||
|
||||
That's all folks! <br/> Questions?
|
||||
|
||||
.small[.small[
|
||||
|
||||
AJ ([@s0ulshake](https://twitter.com/s0ulshake)) — [@TravisCI](https://twitter.com/travisci)
|
||||
|
||||
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@Docker](https://twitter.com/docker)
|
||||
|
||||
]]
|
||||
|
||||
<!--
|
||||
Tiffany ([@tiffanyfayj](https://twitter.com/tiffanyfayj))
|
||||
AJ ([@s0ulshake](https://twitter.com/s0ulshake))
|
||||
-->
|
||||
420
docs/logging.md
Normal file
@@ -0,0 +1,420 @@
|
||||
name: logging
|
||||
|
||||
# Centralized logging
|
||||
|
||||
- We want to send all our container logs to a central place
|
||||
|
||||
- If that place could offer a nice web dashboard too, that'd be nice
|
||||
|
||||
--
|
||||
|
||||
- We are going to deploy an ELK stack
|
||||
|
||||
- It will accept logs over a GELF socket
|
||||
|
||||
- We will update our services to send logs through the GELF logging driver
|
||||
|
||||
---
|
||||
|
||||
# Setting up ELK to store container logs
|
||||
|
||||
*Important foreword: this is not an "official" or "recommended"
|
||||
setup; it is just an example. We used ELK in this demo because
|
||||
it's a popular setup and we keep being asked about it; but you
|
||||
will have equal success with Fluent or other logging stacks!*
|
||||
|
||||
What we will do:
|
||||
|
||||
- Spin up an ELK stack with services
|
||||
|
||||
- Gaze at the spiffy Kibana web UI
|
||||
|
||||
- Manually send a few log entries using one-shot containers
|
||||
|
||||
- Set our containers up to send their logs to Logstash
|
||||
|
||||
---
|
||||
|
||||
## What's in an ELK stack?
|
||||
|
||||
- ELK is three components:
|
||||
|
||||
- ElasticSearch (to store and index log entries)
|
||||
|
||||
- Logstash (to receive log entries from various
|
||||
sources, process them, and forward them to various
|
||||
destinations)
|
||||
|
||||
- Kibana (to view/search log entries with a nice UI)
|
||||
|
||||
- The only component that we will configure is Logstash
|
||||
|
||||
- We will accept log entries using the GELF protocol
|
||||
|
||||
- Log entries will be stored in ElasticSearch,
|
||||
<br/>and displayed on Logstash's stdout for debugging
|
||||
|
||||
---
|
||||
|
||||
class: elk-manual
|
||||
|
||||
## Setting up ELK
|
||||
|
||||
- We need three containers: ElasticSearch, Logstash, Kibana
|
||||
|
||||
- We will place them on a common network, `logging`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create the network:
|
||||
```bash
|
||||
docker network create --driver overlay logging
|
||||
```
|
||||
|
||||
- Create the ElasticSearch service:
|
||||
```bash
|
||||
docker service create --network logging --name elasticsearch elasticsearch:2.4
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: elk-manual
|
||||
|
||||
## Setting up Kibana
|
||||
|
||||
- Kibana exposes the web UI
|
||||
|
||||
- Its default port (5601) needs to be published
|
||||
|
||||
- It needs a tiny bit of configuration: the address of the ElasticSearch service
|
||||
|
||||
- We don't want Kibana logs to show up in Kibana (it would create clutter)
|
||||
<br/>so we tell Logspout to ignore them
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create the Kibana service:
|
||||
```bash
|
||||
docker service create --network logging --name kibana --publish 5601:5601 \
|
||||
-e ELASTICSEARCH_URL=http://elasticsearch:9200 kibana:4.6
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: elk-manual
|
||||
|
||||
## Setting up Logstash
|
||||
|
||||
- Logstash needs some configuration to listen to GELF messages and send them to ElasticSearch
|
||||
|
||||
- We could author a custom image bundling this configuration
|
||||
|
||||
- We can also pass the [configuration](https://github.com/jpetazzo/orchestration-workshop/blob/master/elk/logstash.conf) on the command line
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create the Logstash service:
|
||||
```bash
|
||||
docker service create --network logging --name logstash -p 12201:12201/udp \
|
||||
logstash:2.4 -e "$(cat ~/orchestration-workshop/elk/logstash.conf)"
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: elk-manual
|
||||
|
||||
## Checking Logstash
|
||||
|
||||
- Before proceeding, let's make sure that Logstash started properly
|
||||
|
||||
.exercise[
|
||||
|
||||
- Look up the node running the Logstash container:
|
||||
```bash
|
||||
docker service ps logstash
|
||||
```
|
||||
|
||||
- Connect to that node
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: elk-manual
|
||||
|
||||
## View Logstash logs
|
||||
|
||||
.exercise[
|
||||
|
||||
- View the logs of the logstash service:
|
||||
```bash
|
||||
docker service logs logstash --follow
|
||||
```
|
||||
|
||||
<!-- ```wait "message" => "ok"``` -->
|
||||
<!-- ```keys ^C``` -->
|
||||
|
||||
]
|
||||
|
||||
You should see the heartbeat messages:
|
||||
.small[
|
||||
```json
|
||||
{ "message" => "ok",
|
||||
"host" => "1a4cfb063d13",
|
||||
"@version" => "1",
|
||||
"@timestamp" => "2016-06-19T00:45:45.273Z"
|
||||
}
|
||||
```
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: elk-auto
|
||||
|
||||
## Deploying our ELK cluster
|
||||
|
||||
- We will use a stack file
|
||||
|
||||
.exercise[
|
||||
|
||||
- Build, ship, and run our ELK stack:
|
||||
```bash
|
||||
docker-compose -f elk.yml build
|
||||
docker-compose -f elk.yml push
|
||||
docker stack deploy elk -c elk.yml
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: the *build* and *push* steps are not strictly necessary, but they don't hurt!
|
||||
|
||||
Let's have a look at the [Compose file](
|
||||
https://github.com/jpetazzo/orchestration-workshop/blob/master/stacks/elk.yml).
|
||||
|
||||
---
|
||||
|
||||
class: elk-auto
|
||||
|
||||
## Checking that our ELK stack works correctly
|
||||
|
||||
- Let's view the logs of logstash
|
||||
|
||||
(Who logs the loggers?)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Stream logstash's logs:
|
||||
```bash
|
||||
docker service logs --follow --tail 1 elk_logstash
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
You should see the heartbeat messages:
|
||||
|
||||
.small[
|
||||
```json
|
||||
{ "message" => "ok",
|
||||
"host" => "1a4cfb063d13",
|
||||
"@version" => "1",
|
||||
"@timestamp" => "2016-06-19T00:45:45.273Z"
|
||||
}
|
||||
```
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Testing the GELF receiver
|
||||
|
||||
- In a new window, we will generate a logging message
|
||||
|
||||
- We will use a one-off container, and Docker's GELF logging driver
|
||||
|
||||
.exercise[
|
||||
|
||||
- Send a test message:
|
||||
```bash
|
||||
docker run --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \
|
||||
--rm alpine echo hello
|
||||
```
|
||||
]
|
||||
|
||||
The test message should show up in the logstash container logs.
|
||||
|
||||
---
|
||||
|
||||
## Sending logs from a service
|
||||
|
||||
- We were sending from a "classic" container so far; let's send logs from a service instead
|
||||
|
||||
- We're lucky: the parameters (`--log-driver` and `--log-opt`) are exactly the same!
|
||||
|
||||
|
||||
.exercise[
|
||||
|
||||
- Send a test message:
|
||||
```bash
|
||||
docker service create \
|
||||
--log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \
|
||||
alpine echo hello
|
||||
```
|
||||
|
||||
<!-- ```wait Detected task failure``` -->
|
||||
<!-- ```keys ^C``` -->
|
||||
|
||||
]
|
||||
|
||||
The test message should show up as well in the logstash container logs.
|
||||
|
||||
--
|
||||
|
||||
In fact, *multiple messages will show up, and continue to show up every few seconds!*
|
||||
|
||||
---
|
||||
|
||||
## Restart conditions
|
||||
|
||||
- By default, if a container exits (or is killed with `docker kill`, or runs out of memory ...),
|
||||
the Swarm will restart it (possibly on a different machine)
|
||||
|
||||
- This behavior can be changed by setting the *restart condition* parameter
|
||||
|
||||
.exercise[
|
||||
|
||||
- Change the restart condition so that Swarm doesn't try to restart our container forever:
|
||||
```bash
|
||||
docker service update `xxx` --restart-condition none
|
||||
```
|
||||
]
|
||||
|
||||
Available restart conditions are `none`, `on-failure`, and `any`.
|
||||
|
||||
You can also set `--restart-delay`, `--restart-max-attempts`, and `--restart-window`.
|
||||
|
||||
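For instance, to retry a failed task at most 3 times within a 2-minute window, these flags could be combined as follows (a sketch; replace `<service_name>` with the actual service):

```bash
docker service update <service_name> \
       --restart-condition on-failure \
       --restart-max-attempts 3 \
       --restart-window 2m
```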
---
|
||||
|
||||
## Connect to Kibana
|
||||
|
||||
- The Kibana web UI is exposed on cluster port 5601
|
||||
|
||||
.exercise[
|
||||
|
||||
- Connect to port 5601 of your cluster
|
||||
|
||||
- if you're using Play-With-Docker, click on the (5601) badge above the terminal
|
||||
|
||||
- otherwise, open http://(any-node-address):5601/ with your browser
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## "Configuring" Kibana
|
||||
|
||||
- If you see a status page with a yellow item, wait a minute and reload
|
||||
(Kibana is probably still initializing)
|
||||
|
||||
- Kibana should prompt you to "Configure an index pattern":
|
||||
<br/>in the "Time-field name" drop down, select "@timestamp", and hit the
|
||||
"Create" button
|
||||
|
||||
- Then:
|
||||
|
||||
- click "Discover" (in the top-left corner)
|
||||
- click "Last 15 minutes" (in the top-right corner)
|
||||
- click "Last 1 hour" (in the list in the middle)
|
||||
- click "Auto-refresh" (top-right corner)
|
||||
- click "5 seconds" (top-left of the list)
|
||||
|
||||
- You should see a series of green bars (with one new green bar every minute)
|
||||
|
||||
---
|
||||
|
||||
## Updating our services to use GELF
|
||||
|
||||
- We will now inform our Swarm to add GELF logging to all our services
|
||||
|
||||
- This is done with the `docker service update` command
|
||||
|
||||
- The logging flags are the same as before
|
||||
|
||||
.exercise[
|
||||
|
||||
- Enable GELF logging for the `rng` service:
|
||||
```bash
|
||||
docker service update dockercoins_rng \
|
||||
--log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
After ~15 seconds, you should see the log messages in Kibana.
|
||||
|
||||
---
|
||||
|
||||
## Viewing container logs
|
||||
|
||||
- Go back to Kibana
|
||||
|
||||
- Container logs should be showing up!
|
||||
|
||||
- We can customize the web UI to be more readable
|
||||
|
||||
.exercise[
|
||||
|
||||
- In the left column, move the mouse over the following
|
||||
columns, and click the "Add" button that appears:
|
||||
|
||||
- host
|
||||
- container_name
|
||||
- message
|
||||
|
||||
<!--
|
||||
- logsource
|
||||
- program
|
||||
- message
|
||||
-->
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## .warning[Don't update stateful services!]
|
||||
|
||||
- What would have happened if we had updated the Redis service?
|
||||
|
||||
- When a service changes, SwarmKit replaces existing containers with new ones
|
||||
|
||||
- This is fine for stateless services
|
||||
|
||||
- But if you update a stateful service, its data will be lost in the process
|
||||
|
||||
- If we updated our Redis service, all our DockerCoins would be lost
|
||||
|
||||
---
|
||||
|
||||
## Important afterword
|
||||
|
||||
**This is not a "production-grade" setup.**
|
||||
|
||||
It is just an educational example. We did set up a single
|
||||
ElasticSearch instance and a single Logstash instance.
|
||||
|
||||
In a production setup, you need an ElasticSearch cluster
|
||||
(both for capacity and availability reasons). You also
|
||||
need multiple Logstash instances.
|
||||
|
||||
And if you want to withstand
|
||||
bursts of logs, you need some kind of message queue:
|
||||
Redis if you're cheap, Kafka if you want to make sure
|
||||
that you don't drop messages on the floor. Good luck.
|
||||
|
||||
If you want to learn more about the GELF driver,
|
||||
have a look at [this blog post](
|
||||
http://jpetazzo.github.io/2017/01/20/docker-logging-gelf/).
|
||||
5
docs/loop.sh
Executable file
@@ -0,0 +1,5 @@
|
||||
#!/bin/sh
|
||||
while true; do
|
||||
find . |
|
||||
entr -d sh -c "DEBUG=1 ./markmaker.py < kube.yml > workshop.md"
|
||||
done
|
||||
225
docs/machine.md
Normal file
@@ -0,0 +1,225 @@
|
||||
## Adding nodes using the Docker API
|
||||
|
||||
- We don't have to SSH into the other nodes, we can use the Docker API
|
||||
|
||||
- If you are using Play-With-Docker:
|
||||
|
||||
- the nodes expose the Docker API over port 2375/tcp, without authentication
|
||||
|
||||
- we will connect by setting the `DOCKER_HOST` environment variable
|
||||
|
||||
- Otherwise:
|
||||
|
||||
- the nodes expose the Docker API over port 2376/tcp, with TLS mutual authentication
|
||||
|
||||
- we will use Docker Machine to set the correct environment variables
|
||||
<br/>(the nodes have been suitably pre-configured to be controlled through `node1`)
|
||||
|
||||
---
|
||||
|
||||
# Docker Machine
|
||||
|
||||
- Docker Machine has two primary uses:
|
||||
|
||||
- provisioning cloud instances running the Docker Engine
|
||||
|
||||
- managing local Docker VMs within e.g. VirtualBox
|
||||
|
||||
- Docker Machine is purely optional
|
||||
|
||||
- It makes it easy to create, upgrade, manage... Docker hosts:
|
||||
|
||||
- on your favorite cloud provider
|
||||
|
||||
- locally (e.g. to test clustering, or different versions)
|
||||
|
||||
- across different cloud providers
|
||||
|
||||
---
|
||||
|
||||
class: self-paced
|
||||
|
||||
## If you're using Play-With-Docker ...
|
||||
|
||||
- You won't need to use Docker Machine
|
||||
|
||||
- Instead, to "talk" to another node, we'll just set `DOCKER_HOST`
|
||||
|
||||
- You can skip the exercises telling you to do things with Docker Machine!
|
||||
|
||||
---
|
||||
|
||||
## Docker Machine basic usage
|
||||
|
||||
- We will learn two commands:
|
||||
|
||||
- `docker-machine ls` (list existing hosts)
|
||||
|
||||
- `docker-machine env` (switch to a specific host)
|
||||
|
||||
.exercise[
|
||||
|
||||
- List configured hosts:
|
||||
```bash
|
||||
docker-machine ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
You should see your 5 nodes.
|
||||
|
||||
---
|
||||
|
||||
class: in-person
|
||||
|
||||
## How did we make our 5 nodes show up there?
|
||||
|
||||
*For the curious...*
|
||||
|
||||
- This was done by our VM provisioning scripts
|
||||
|
||||
- After setting up everything else, `node1` adds the 5 nodes
|
||||
to the local Docker Machine configuration
|
||||
(located in `$HOME/.docker/machine`)
|
||||
|
||||
- Nodes are added using [Docker Machine generic driver](https://docs.docker.com/machine/drivers/generic/)
|
||||
|
||||
(It skips machine provisioning and jumps straight to the configuration phase)
|
||||
|
||||
- Docker Machine creates TLS certificates and deploys them to the nodes through SSH
|
||||
|
||||
---
|
||||
|
||||
## Using Docker Machine to communicate with a node
|
||||
|
||||
- To select a node, use `eval $(docker-machine env nodeX)`
|
||||
|
||||
- This sets a number of environment variables
|
||||
|
||||
- To unset these variables, use `eval $(docker-machine env -u)`
|
||||
|
||||
.exercise[
|
||||
|
||||
- View the variables used by Docker Machine:
|
||||
```bash
|
||||
docker-machine env node3
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
(This shows which variables *would* be set by Docker Machine; but it doesn't change them.)
|
||||
|
||||
---
|
||||
|
||||
## Getting the token
|
||||
|
||||
- First, let's store the join token in a variable
|
||||
|
||||
- This must be done from a manager
|
||||
|
||||
.exercise[
|
||||
|
||||
- Make sure we talk to the local node, or `node1`:
|
||||
```bash
|
||||
eval $(docker-machine env -u)
|
||||
```
|
||||
|
||||
- Get the join token:
|
||||
```bash
|
||||
TOKEN=$(docker swarm join-token -q worker)
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Change the node targeted by the Docker CLI
|
||||
|
||||
- We need to set the right environment variables to communicate with `node3`
|
||||
|
||||
.exercise[
|
||||
|
||||
- If you're using Play-With-Docker:
|
||||
```bash
|
||||
export DOCKER_HOST=tcp://node3:2375
|
||||
```
|
||||
|
||||
- Otherwise, use Docker Machine:
|
||||
```bash
|
||||
eval $(docker-machine env node3)
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Checking which node we're talking to
|
||||
|
||||
- Let's use the Docker API to ask the remote node "who are you?"
|
||||
|
||||
.exercise[
|
||||
|
||||
- Extract the node name from the output of `docker info`:
|
||||
```bash
|
||||
docker info | grep ^Name
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
This should tell us that we are talking to `node3`.
|
||||
|
||||
Note: it can be useful to use a [custom shell prompt](
|
||||
https://github.com/jpetazzo/orchestration-workshop/blob/master/prepare-vms/scripts/postprep.rc#L68)
|
||||
reflecting the `DOCKER_HOST` variable.
|
||||
|
||||
---
|
||||
|
||||
## Adding a node through the Docker API
|
||||
|
||||
- We are going to use the same `docker swarm join` command as before
|
||||
|
||||
.exercise[
|
||||
|
||||
- Add `node3` to the Swarm:
|
||||
```bash
|
||||
docker swarm join --token $TOKEN node1:2377
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Going back to the local node
|
||||
|
||||
- We need to revert the environment variable(s) that we had set previously
|
||||
|
||||
.exercise[
|
||||
|
||||
- If you're using Play-With-Docker, just clear `DOCKER_HOST`:
|
||||
```bash
|
||||
unset DOCKER_HOST
|
||||
```
|
||||
|
||||
- Otherwise, use Docker Machine to reset all the relevant variables:
|
||||
```bash
|
||||
eval $(docker-machine env -u)
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
From that point, we are communicating with `node1` again.
|
||||
|
||||
---
|
||||
|
||||
## Checking the composition of our cluster
|
||||
|
||||
- Now that we're talking to `node1` again, we can use management commands
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check that the node is here:
|
||||
```bash
|
||||
docker node ls
|
||||
```
|
||||
|
||||
]
|
||||
168
docs/markmaker.py
Executable file
@@ -0,0 +1,168 @@
|
||||
#!/usr/bin/env python
|
||||
# transforms a YAML manifest into a HTML workshop file
|
||||
|
||||
import glob
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import string
|
||||
import subprocess
|
||||
import sys
|
||||
import yaml
|
||||
|
||||
|
||||
logging.basicConfig(level=os.environ.get("LOG_LEVEL", "INFO"))
|
||||
|
||||
|
||||
class InvalidChapter(ValueError):
|
||||
|
||||
def __init__(self, chapter):
|
||||
ValueError.__init__(self, "Invalid chapter: {!r}".format(chapter))
|
||||
|
||||
|
||||
def anchor(title):
|
||||
title = title.lower().replace(' ', '-')
|
||||
title = ''.join(c for c in title if c in string.ascii_letters+'-')
|
||||
return "toc-" + title
|
||||
|
||||
|
||||
def insertslide(markdown, title):
|
||||
title_position = markdown.find("\n# {}\n".format(title))
|
||||
slide_position = markdown.rfind("\n---\n", 0, title_position+1)
|
||||
logging.debug("Inserting title slide at position {}: {}".format(slide_position, title))
|
||||
|
||||
before = markdown[:slide_position]
|
||||
|
||||
extra_slide = """
|
||||
---
|
||||
|
||||
name: {anchor}
|
||||
class: title
|
||||
|
||||
{title}
|
||||
|
||||
.nav[[Back to table of contents](#{toclink})]
|
||||
|
||||
.debug[(automatically generated title slide)]
|
||||
""".format(anchor=anchor(title), title=title, toclink=title2chapter[title])
|
||||
after = markdown[slide_position:]
|
||||
return before + extra_slide + after
|
||||
|
||||
|
||||
def flatten(titles):
|
||||
for title in titles:
|
||||
if isinstance(title, list):
|
||||
for t in flatten(title):
|
||||
yield t
|
||||
else:
|
||||
yield title
|
||||
|
||||
|
||||
def generatefromyaml(manifest):
|
||||
manifest = yaml.load(manifest)
|
||||
|
||||
markdown, titles = processchapter(manifest["chapters"], "(inline)")
|
||||
logging.debug("Found {} titles.".format(len(titles)))
|
||||
toc = gentoc(titles)
|
||||
markdown = markdown.replace("@@TOC@@", toc)
|
||||
for title in flatten(titles):
|
||||
markdown = insertslide(markdown, title)
|
||||
|
||||
exclude = manifest.get("exclude", [])
|
||||
logging.debug("exclude={!r}".format(exclude))
|
||||
if not exclude:
|
||||
logging.warning("'exclude' is empty.")
|
||||
exclude = ",".join('"{}"'.format(c) for c in exclude)
|
||||
|
||||
html = open("workshop.html").read()
|
||||
html = html.replace("@@MARKDOWN@@", markdown)
|
||||
html = html.replace("@@EXCLUDE@@", exclude)
|
||||
html = html.replace("@@CHAT@@", manifest["chat"])
|
||||
html = html.replace("@@TITLE@@", manifest["title"])
|
||||
return html
|
||||
|
||||
|
||||
title2chapter = {}
|
||||
|
||||
|
||||
def gentoc(titles, depth=0, chapter=0):
|
||||
if not titles:
|
||||
return ""
|
||||
if isinstance(titles, str):
|
||||
title2chapter[titles] = "toc-chapter-1"
|
||||
logging.debug("Chapter {} Title {}".format(chapter, titles))
|
||||
return " "*(depth-2) + "- [{}](#{})\n".format(titles, anchor(titles))
|
||||
if isinstance(titles, list):
|
||||
if depth==0:
|
||||
sep = "\n\n.debug[(auto-generated TOC)]\n---\n\n"
|
||||
head = ""
|
||||
tail = ""
|
||||
elif depth==1:
|
||||
sep = "\n"
|
||||
head = "name: toc-chapter-{}\n\n## Chapter {}\n\n".format(chapter, chapter)
|
||||
tail = ""
|
||||
else:
|
||||
sep = "\n"
|
||||
head = ""
|
||||
tail = ""
|
||||
return head + sep.join(gentoc(t, depth+1, c+1) for (c,t) in enumerate(titles)) + tail
|
||||
|
||||
|
||||
# Arguments:
|
||||
# - `chapter` is a string; if it has multiple lines, it will be used as
|
||||
# a markdown fragment; otherwise it will be considered as a file name
|
||||
# to be recursively loaded and parsed
|
||||
# - `filename` is the name of the file that we're currently processing
|
||||
# (to generate inline comments to facilitate editing)
|
||||
# Returns: (expandedmarkdown, [list of titles])
|
||||
# The list of titles can be nested.
|
||||
def processchapter(chapter, filename):
|
||||
if isinstance(chapter, unicode):
|
||||
return processchapter(chapter.encode("utf-8"), filename)
|
||||
if isinstance(chapter, str):
|
||||
if "\n" in chapter:
|
||||
titles = re.findall("^# (.*)", chapter, re.MULTILINE)
|
||||
slidefooter = ".debug[{}]".format(makelink(filename))
|
||||
chapter = chapter.replace("\n---\n", "\n{}\n---\n".format(slidefooter))
|
||||
chapter += "\n" + slidefooter
|
||||
return (chapter, titles)
|
||||
if os.path.isfile(chapter):
|
||||
return processchapter(open(chapter).read(), chapter)
|
||||
if isinstance(chapter, list):
|
||||
chapters = [processchapter(c, filename) for c in chapter]
|
||||
markdown = "\n---\n".join(c[0] for c in chapters)
|
||||
titles = [t for (m,t) in chapters if t]
|
||||
return (markdown, titles)
|
||||
raise InvalidChapter(chapter)
|
||||
|
||||
# Try to figure out the URL of the repo on GitHub.
|
||||
# This is used to generate "edit me on GitHub"-style links.
|
||||
try:
|
||||
if "REPOSITORY_URL" in os.environ:
|
||||
repo = os.environ["REPOSITORY_URL"]
|
||||
else:
|
||||
repo = subprocess.check_output(["git", "config", "remote.origin.url"])
|
||||
repo = repo.strip().replace("git@github.com:", "https://github.com/")
|
||||
if "BRANCH" in os.environ:
|
||||
branch = os.environ["BRANCH"]
|
||||
else:
|
||||
branch = subprocess.check_output(["git", "status", "--short", "--branch"])
|
||||
branch = branch[3:].split("...")[0]
|
||||
base = subprocess.check_output(["git", "rev-parse", "--show-prefix"])
|
||||
base = base.strip().strip("/")
|
||||
urltemplate = ("{repo}/tree/{branch}/{base}/{filename}"
|
||||
.format(repo=repo, branch=branch, base=base, filename="{}"))
|
||||
except:
|
||||
logging.exception("Could not generate repository URL; generating local URLs instead.")
|
||||
urltemplate = "file://{pwd}/{filename}".format(pwd=os.environ["PWD"], filename="{}")
|
||||
|
||||
def makelink(filename):
|
||||
if os.path.isfile(filename):
|
||||
url = urltemplate.format(filename)
|
||||
return "[{}]({})".format(filename, url)
|
||||
else:
|
||||
return filename
|
||||
|
||||
|
||||
sys.stdout.write(generatefromyaml(sys.stdin))
|
||||
logging.info("Done")
|
||||
1637
docs/metrics.md
Normal file
236
docs/morenodes.md
Normal file
@@ -0,0 +1,236 @@
|
||||
## Adding more manager nodes
|
||||
|
||||
- Right now, we have only one manager (node1)
|
||||
|
||||
- If we lose it, we lose quorum - and that's *very bad!*
|
||||
|
||||
- Containers running on other nodes will be fine ...
|
||||
|
||||
- But we won't be able to get or set anything related to the cluster
|
||||
|
||||
- If the manager is permanently gone, we will have to do a manual repair!
|
||||
|
||||
- Nobody wants to do that ... so let's make our cluster highly available
|
||||
|
||||
---
|
||||
|
||||
class: self-paced
|
||||
|
||||
## Adding more managers
|
||||
|
||||
With Play-With-Docker:
|
||||
|
||||
```bash
|
||||
TOKEN=$(docker swarm join-token -q manager)
|
||||
for N in $(seq 4 5); do
|
||||
export DOCKER_HOST=tcp://node$N:2375
|
||||
docker swarm join --token $TOKEN node1:2377
|
||||
done
|
||||
unset DOCKER_HOST
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
class: in-person
|
||||
|
||||
## Building our full cluster
|
||||
|
||||
- We could SSH to nodes 3, 4, 5; and copy-paste the command
|
||||
|
||||
--
|
||||
|
||||
class: in-person
|
||||
|
||||
- Or we could use the AWESOME POWER OF THE SHELL!
|
||||
|
||||
--
|
||||
|
||||
class: in-person
|
||||
|
||||

|
||||
|
||||
--
|
||||
|
||||
class: in-person
|
||||
|
||||
- No, not *that* shell
|
||||
|
||||
---
|
||||
|
||||
class: in-person
|
||||
|
||||
## Let's form like Swarm-tron
|
||||
|
||||
- Let's get the token, and loop over the remaining nodes with SSH
|
||||
|
||||
.exercise[
|
||||
|
||||
- Obtain the manager token:
|
||||
```bash
|
||||
TOKEN=$(docker swarm join-token -q manager)
|
||||
```
|
||||
|
||||
- Loop over the 3 remaining nodes:
|
||||
```bash
|
||||
for NODE in node3 node4 node5; do
|
||||
ssh $NODE docker swarm join --token $TOKEN node1:2377
|
||||
done
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
[That was easy.](https://www.youtube.com/watch?v=3YmMNpbFjp0)
|
||||
|
||||
---
|
||||
|
||||
## You can control the Swarm from any manager node
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try the following command on a few different nodes:
|
||||
```bash
|
||||
docker node ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
On manager nodes:
|
||||
<br/>you will see the list of nodes, with a `*` denoting
|
||||
the node you're talking to.
|
||||
|
||||
On non-manager nodes:
|
||||
<br/>you will get an error message telling you that
|
||||
the node is not a manager.
|
||||
|
||||
As we saw earlier, you can only control the Swarm through a manager node.
|
||||
|
||||
---
|
||||
|
||||
class: self-paced
|
||||
|
||||
## Play-With-Docker node status icon
|
||||
|
||||
- If you're using Play-With-Docker, you get node status icons
|
||||
|
||||
- Node status icons are displayed left of the node name
|
||||
|
||||
- No icon = no Swarm mode detected
|
||||
- Solid blue icon = Swarm manager detected
|
||||
- Blue outline icon = Swarm worker detected
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
## Dynamically changing the role of a node
|
||||
|
||||
- We can change the role of a node on the fly:
|
||||
|
||||
`docker node promote nodeX` → make nodeX a manager
|
||||
<br/>
|
||||
`docker node demote nodeX` → make nodeX a worker
|
||||
|
||||
.exercise[
|
||||
|
||||
- See the current list of nodes:
|
||||
```
|
||||
docker node ls
|
||||
```
|
||||
|
||||
- Promote any worker node to be a manager:
|
||||
```
|
||||
docker node promote <node_name_or_id>
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## How many managers do we need?
|
||||
|
||||
- 2N+1 nodes can (and will) tolerate N failures
|
||||
<br/>(you can have an even number of managers, but there is no point)
|
||||
|
||||
--
|
||||
|
||||
- 1 manager = no failure
|
||||
|
||||
- 3 managers = 1 failure
|
||||
|
||||
- 5 managers = 2 failures (or 1 failure during 1 maintenance)
|
||||
|
||||
- 7 managers and more = now you might be overdoing it a little bit
|
||||
|
||||
---
|
||||
|
||||
## Why not have *all* nodes be managers?
|
||||
|
||||
- Intuitively, it's harder to reach consensus in larger groups
|
||||
|
||||
- With Raft, writes have to go to (and be acknowledged by) all nodes
|
||||
|
||||
- More nodes = more network traffic
|
||||
|
||||
- Bigger network = more latency
|
||||
|
||||
---
|
||||
|
||||
## What would McGyver do?
|
||||
|
||||
- If some of your machines are more than 10ms away from each other,
|
||||
<br/>
|
||||
try to break them down into multiple clusters
|
||||
(keeping internal latency low)
|
||||
|
||||
- Groups of up to 9 nodes: all of them are managers
|
||||
|
||||
- Groups of 10 nodes and up: pick 5 "stable" nodes to be managers
|
||||
<br/>
|
||||
(Cloud pro-tip: use separate auto-scaling groups for managers and workers)
|
||||
|
||||
- Groups of more than 100 nodes: watch your managers' CPU and RAM
|
||||
|
||||
- Groups of more than 1000 nodes:
|
||||
|
||||
- if you can afford to have fast, stable managers, add more of them
|
||||
- otherwise, break down your nodes in multiple clusters
|
||||
|
||||
---
|
||||
|
||||
## What's the upper limit?
|
||||
|
||||
- We don't know!
|
||||
|
||||
- Internal testing at Docker Inc.: 1000-10000 nodes is fine
|
||||
|
||||
- deployed to a single cloud region
|
||||
|
||||
- one of the main take-aways was *"you're gonna need a bigger manager"*
|
||||
|
||||
- Testing by the community: [4700 heterogeneous nodes all over the 'net](https://sematext.com/blog/2016/11/14/docker-swarm-lessons-from-swarm3k/)
|
||||
|
||||
- it just works
|
||||
|
||||
- more nodes require more CPU; more containers require more RAM
|
||||
|
||||
- scheduling of large jobs (70000 containers) is slow, though (working on it!)
|
||||
|
||||
---
|
||||
|
||||
## Real-life deployment methods
|
||||
|
||||
--
|
||||
|
||||
Running commands manually over SSH
|
||||
|
||||
--
|
||||
|
||||
(lol jk)
|
||||
|
||||
--
|
||||
|
||||
- Using your favorite configuration management tool
|
||||
|
||||
- [Docker for AWS](https://docs.docker.com/docker-for-aws/#quickstart)
|
||||
|
||||
- [Docker for Azure](https://docs.docker.com/docker-for-azure/)
|
||||
236
docs/namespaces.md
Normal file
@@ -0,0 +1,236 @@
|
||||
class: namespaces
|
||||
name: namespaces
|
||||
|
||||
# Improving isolation with User Namespaces
|
||||
|
||||
- *Namespaces* are kernel mechanisms to compartmentalize the system
|
||||
|
||||
- There are different kinds of namespaces: `pid`, `net`, `mnt`, `ipc`, `uts`, and `user`
|
||||
|
||||
- For a primer, see "Anatomy of a Container"
|
||||
([video](https://www.youtube.com/watch?v=sK5i-N34im8))
|
||||
([slides](https://www.slideshare.net/jpetazzo/cgroups-namespaces-and-beyond-what-are-containers-made-from-dockercon-europe-2015))
|
||||
|
||||
- The *user namespace* allows mapping UIDs between the containers and the host
|
||||
|
||||
- As a result, `root` in a container can map to a non-privileged user on the host
|
||||
|
||||
Note: even without user namespaces, `root` in a container cannot go wild on the host.
|
||||
<br/>
|
||||
It is mediated by capabilities, cgroups, namespaces, seccomp, LSMs...
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## User Namespaces in Docker
|
||||
|
||||
- Optional feature added in Docker Engine 1.10
|
||||
|
||||
- Not enabled by default
|
||||
|
||||
- Has to be enabled at Engine startup, and affects all containers
|
||||
|
||||
- When enabled, `UID:GID` in containers are mapped to a different range on the host
|
||||
|
||||
- Safer than switching to a non-root user (with `-u` or `USER`) in the container
|
||||
<br/>
|
||||
(Since with user namespaces, root escalation maps to a non-privileged user)
|
||||
|
||||
- Can be selectively disabled per container by starting them with `--userns=host`
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## User Namespaces Caveats
|
||||
|
||||
When user namespaces are enabled, containers cannot:
|
||||
|
||||
- Use the host's network namespace (with `docker run --network=host`)
|
||||
|
||||
- Use the host's PID namespace (with `docker run --pid=host`)
|
||||
|
||||
- Run in privileged mode (with `docker run --privileged`)
|
||||
|
||||
... Unless user namespaces are disabled for the container, with flag `--userns=host`
|
||||
|
||||
External volume and graph drivers that don't support user mapping might not work.
|
||||
|
||||
All containers are currently mapped to the same UID:GID range.
|
||||
|
||||
Some of these limitations might be lifted in the future!
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Filesystem ownership details
|
||||
|
||||
When enabling user namespaces:
|
||||
|
||||
- the UID:GID on disk (in the images and containers) has to match the *mapped* UID:GID
|
||||
|
||||
- existing images and containers cannot work (their UID:GID would have to be changed)
|
||||
|
||||
For practical reasons, when enabling user namespaces, the Docker Engine places containers and images (and everything else) in a different directory.
|
||||
|
||||
As a result, if you enable user namespaces on an existing installation:
|
||||
|
||||
- all containers and images (and e.g. Swarm data) disappear
|
||||
|
||||
- *if a node is a member of a Swarm, it is then kicked out of the Swarm*
|
||||
|
||||
- everything will re-appear if you disable user namespaces again
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Picking a node
|
||||
|
||||
- We will select a node where we will enable user namespaces
|
||||
|
||||
- This node will have to be re-added to the Swarm
|
||||
|
||||
- All containers and services running on this node will be rescheduled
|
||||
|
||||
- Let's make sure that we do not pick the node running the registry!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check on which node the registry is running:
|
||||
```bash
|
||||
docker service ps registry
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Pick any other node (noted `nodeX` in the next slides).
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Logging into the right Engine
|
||||
|
||||
.exercise[
|
||||
|
||||
- Log into the right node:
|
||||
```bash
|
||||
ssh node`X`
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Configuring the Engine
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create a configuration file for the Engine:
|
||||
```bash
|
||||
echo '{"userns-remap": "default"}' | sudo tee /etc/docker/daemon.json
|
||||
```
|
||||
|
||||
- Restart the Engine:
|
||||
```bash
|
||||
kill $(pidof dockerd)
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Checking that User Namespaces are enabled
|
||||
|
||||
.exercise[
|
||||
- Notice the new Docker path:
|
||||
```bash
|
||||
docker info | grep var/lib
|
||||
```
|
||||
|
||||
- Notice the new UID:GID permissions:
|
||||
```bash
|
||||
sudo ls -l /var/lib/docker
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
You should see a line like the following:
|
||||
```
|
||||
drwx------ 11 296608 296608 4096 Aug 3 05:11 296608.296608
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Add the node back to the Swarm
|
||||
|
||||
.exercise[
|
||||
|
||||
- Get our manager token from another node:
|
||||
```bash
|
||||
ssh node`Y` docker swarm join-token manager
|
||||
```
|
||||
|
||||
- Copy-paste the join command to the node
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Check the new UID:GID
|
||||
|
||||
.exercise[
|
||||
|
||||
- Run a background container on the node:
|
||||
```bash
|
||||
docker run -d --name lockdown alpine sleep 1000000
|
||||
```
|
||||
|
||||
- Look at the processes in this container:
|
||||
```bash
|
||||
docker top lockdown
|
||||
ps faux
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: namespaces
|
||||
|
||||
## Comparing on-disk ownership with/without User Namespaces
|
||||
|
||||
.exercise[
|
||||
|
||||
- Compare the output of the two following commands:
|
||||
```bash
|
||||
docker run alpine ls -l /
|
||||
docker run --userns=host alpine ls -l /
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
class: namespaces
|
||||
|
||||
In the first case, it looks like things belong to `root:root`.
|
||||
|
||||
In the second case, we will see the "real" (on-disk) ownership.
|
||||
|
||||
--
|
||||
|
||||
class: namespaces
|
||||
|
||||
Remember to get back to `node1` when finished!
|
||||
385
docs/netshoot.md
Normal file
@@ -0,0 +1,385 @@
|
||||
class: extra-details
|
||||
|
||||
## Troubleshooting overlay networks
|
||||
|
||||
<!--
|
||||
|
||||
## Finding the real cause of the bottleneck
|
||||
|
||||
- We want to debug our app as we scale `worker` up and down
|
||||
|
||||
-->
|
||||
|
||||
- We want to run tools like `ab` or `httping` on the internal network
|
||||
|
||||
--
|
||||
|
||||
class: extra-details
|
||||
|
||||
- Ah, if only we had created our overlay network with the `--attachable` flag ...
|
||||
|
||||
--
|
||||
|
||||
class: extra-details
|
||||
|
||||
- Oh well, let's use this as an excuse to introduce New Ways To Do Things
|
||||
|
||||
---
|
||||
|
||||
# Breaking into an overlay network
|
||||
|
||||
- We will create a dummy placeholder service on our network
|
||||
|
||||
- Then we will use `docker exec` to run more processes in this container
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start a "do nothing" container using our favorite Swiss-Army distro:
|
||||
```bash
|
||||
docker service create --network dockercoins_default --name debug \
|
||||
--constraint node.hostname==$HOSTNAME alpine sleep 1000000000
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
The `constraint` makes sure that the container will be created on the local node.
|
||||
|
||||
---
|
||||
|
||||
## Entering the debug container
|
||||
|
||||
- Once our container is started (which should be really fast because the alpine image is small), we can enter it (from any node)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Locate the container:
|
||||
```bash
|
||||
docker ps
|
||||
```
|
||||
|
||||
- Enter it:
|
||||
```bash
|
||||
docker exec -ti <containerID> sh
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Labels
|
||||
|
||||
- We can also be fancy and find the ID of the container automatically
|
||||
|
||||
- SwarmKit places labels on containers
|
||||
|
||||
.exercise[
|
||||
|
||||
- Get the ID of the container:
|
||||
```bash
|
||||
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=debug)
|
||||
```
|
||||
|
||||
- And enter the container:
|
||||
```bash
|
||||
docker exec -ti $CID sh
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Installing our debugging tools
|
||||
|
||||
- Ideally, you would author your own image, with all your favorite tools, and use it instead of the base `alpine` image
|
||||
|
||||
- But we can also dynamically install whatever we need
|
||||
|
||||
.exercise[
|
||||
|
||||
- Install a few tools:
|
||||
```bash
|
||||
apk add --update curl apache2-utils drill
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Investigating the `rng` service
|
||||
|
||||
- First, let's check what `rng` resolves to
|
||||
|
||||
.exercise[
|
||||
|
||||
- Use drill or nslookup to resolve `rng`:
|
||||
```bash
|
||||
drill rng
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
This gives us one IP address. It is not the IP address of a container.
|
||||
It is a virtual IP address (VIP) for the `rng` service.
|
||||
|
||||
---
|
||||
|
||||
## Investigating the VIP
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try to ping the VIP:
|
||||
```bash
|
||||
ping rng
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
It *should* ping. (But this might change in the future.)
|
||||
|
||||
With Engine 1.12: VIPs respond to ping if a
|
||||
backend is available on the same machine.
|
||||
|
||||
With Engine 1.13: VIPs respond to ping if a
|
||||
backend is available anywhere.
|
||||
|
||||
(Again: this might change in the future.)
|
||||
|
||||
---
|
||||
|
||||
## What if I don't like VIPs?
|
||||
|
||||
- Services can be published using two modes: VIP and DNSRR.
|
||||
|
||||
- With VIP, you get a virtual IP for the service, and a load balancer
|
||||
based on IPVS
|
||||
|
||||
(By the way, IPVS is totally awesome and if you want to learn more about it in the context of containers,
|
||||
I highly recommend [this talk](https://www.youtube.com/watch?v=oFsJVV1btDU&index=5&list=PLkA60AVN3hh87OoVra6MHf2L4UR9xwJkv) by [@kobolog](https://twitter.com/kobolog) at DC15EU!)
|
||||
|
||||
- With DNSRR, you get the former behavior (from Engine 1.11), where
|
||||
resolving the service yields the IP addresses of all the containers for
|
||||
this service
|
||||
|
||||
- You change this with `docker service create --endpoint-mode [vip|dnsrr]`
|
||||
|
||||
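As a sketch, a DNSRR service on our application network could be created like this (the `nginx` image is only a stand-in for the example):

```bash
docker service create --name web --endpoint-mode dnsrr \
       --network dockercoins_default nginx
```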
---
|
||||
|
||||
## Looking up VIP backends
|
||||
|
||||
- You can also resolve a special name: `tasks.<name>`
|
||||
|
||||
- It will give you the IP addresses of the containers for a given service
|
||||
|
||||
.exercise[
|
||||
|
||||
- Obtain the IP addresses of the containers for the `rng` service:
|
||||
```bash
|
||||
drill tasks.rng
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
This should list 5 IP addresses.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Testing and benchmarking our service
|
||||
|
||||
- We will check that the `rng` service is up with `curl`, then
|
||||
benchmark it with `ab`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Make a test request to the service:
|
||||
```bash
|
||||
curl rng
|
||||
```
|
||||
|
||||
- Open another window, and stop the workers, to test in isolation:
|
||||
```bash
|
||||
docker service update dockercoins_worker --replicas 0
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Wait until the workers are stopped (check with `docker service ls`)
|
||||
before continuing.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Benchmarking `rng`
|
||||
|
||||
We will send 50 requests, but with various levels of concurrency.
|
||||
|
||||
.exercise[
|
||||
|
||||
- Send 50 requests, with a single sequential client:
|
||||
```bash
|
||||
ab -c 1 -n 50 http://rng/10
|
||||
```
|
||||
|
||||
- Send 50 requests, with fifty parallel clients:
|
||||
```bash
|
||||
ab -c 50 -n 50 http://rng/10
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Benchmark results for `rng`
|
||||
|
||||
- When requests are served sequentially, they each take 100ms
|
||||
|
||||
- In the parallel scenario, the latency increases dramatically
|
||||
|
||||
- What about `hasher`?
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Benchmarking `hasher`
|
||||
|
||||
We will do the same tests for `hasher`.
|
||||
|
||||
The command is slightly more complex, since we need to post random data.
|
||||
|
||||
First, we need to put the POST payload in a temporary file.
|
||||
|
||||
.exercise[
|
||||
|
||||
- Generate 10 bytes of random data with `curl`, and save them to a temporary file:
|
||||
```bash
|
||||
curl http://rng/10 >/tmp/random
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Benchmarking `hasher`
|
||||
|
||||
Once again, we will send 50 requests, with different levels of concurrency.
|
||||
|
||||
.exercise[
|
||||
|
||||
- Send 50 requests with a sequential client:
|
||||
```bash
|
||||
ab -c 1 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
|
||||
```
|
||||
|
||||
- Send 50 requests with 50 parallel clients:
|
||||
```bash
|
||||
ab -c 50 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Benchmark results for `hasher`
|
||||
|
||||
- The sequential benchmark takes ~5 seconds to complete
|
||||
|
||||
- The parallel benchmark takes less than 1 second to complete
|
||||
|
||||
- In both cases, each request takes a bit more than 100ms to complete
|
||||
|
||||
- Requests are a bit slower in the parallel benchmark
|
||||
|
||||
- It looks like `hasher` is better equipped to deal with concurrency than `rng`
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, title, benchmarking
|
||||
|
||||
Why?
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Why does everything take (at least) 100ms?
|
||||
|
||||
`rng` code:
|
||||
|
||||

|
||||
|
||||
`hasher` code:
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: extra-details, title, benchmarking
|
||||
|
||||
But ...
|
||||
|
||||
WHY?!?
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, benchmarking
|
||||
|
||||
## Why did we sprinkle this sample app with sleeps?
|
||||
|
||||
- Deterministic performance
|
||||
<br/>(regardless of instance speed, CPUs, I/O...)
|
||||
|
||||
- Actual code sleeps all the time anyway
|
||||
|
||||
- When your code makes a remote API call:
|
||||
|
||||
- it sends a request;
|
||||
|
||||
- it sleeps until it gets the response;
|
||||
|
||||
- it processes the response.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, in-person, benchmarking
|
||||
|
||||
## Why do `rng` and `hasher` behave differently?
|
||||
|
||||

|
||||
|
||||
(Synchronous vs. asynchronous event processing)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Global scheduling → global debugging
|
||||
|
||||
- Traditional approach:
|
||||
|
||||
- log into a node
|
||||
- install our Swiss Army Knife (if necessary)
|
||||
- troubleshoot things
|
||||
|
||||
- Proposed alternative:
|
||||
|
||||
- put our Swiss Army Knife in a container (e.g. [nicolaka/netshoot](https://hub.docker.com/r/nicolaka/netshoot/))
|
||||
- run tests from multiple locations at the same time
|
||||
|
||||
(This becomes very practical with the `docker service logs` command, available since 17.05.)
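As a sketch, the "Swiss Army Knife as a global service" approach could look like this (the service name `toolbox` is arbitrary):

```bash
# Run one netshoot container on every node of the cluster
docker service create --name toolbox --mode global \
       nicolaka/netshoot sleep 1000000

# Then, on any node, enter the local replica of the toolbox
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=toolbox)
docker exec -ti $CID sh
```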
|
||||
|
||||
---
|
||||
|
||||
## More about overlay networks
|
||||
|
||||
.blackbelt[[Deep Dive in Docker Overlay Networks](https://www.youtube.com/watch?v=b3XDl0YsVsg&index=1&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8) by Laurent Bernaille (DC17US)]
|
||||
|
||||
.blackbelt[Deeper Dive in Docker Overlay Networks by Laurent Bernaille (Wednesday 13:30)]
|
||||
18
docs/nodeinfo.md
Normal file
@@ -0,0 +1,18 @@
|
||||
## Getting task information for a given node
|
||||
|
||||
- You can see all the tasks assigned to a node with `docker node ps`
|
||||
|
||||
- It shows the *desired state* and *current state* of each task
|
||||
|
||||
- `docker node ps` shows info about the current node
|
||||
|
||||
- `docker node ps <node_name_or_id>` shows info for another node
|
||||
|
||||
- `docker node ps -f <filter_expression>` lets us select which tasks to show
|
||||
|
||||
```bash
|
||||
# Show only tasks that are supposed to be running
|
||||
docker node ps -f desired-state=running
|
||||
# Show only tasks whose name contains the string "front"
|
||||
docker node ps -f name=front
|
||||
```
|
||||
58
docs/operatingswarm.md
Normal file
@@ -0,0 +1,58 @@
|
||||
class: title, in-person
|
||||
|
||||
Operating the Swarm
|
||||
|
||||
---
|
||||
|
||||
name: part-2
|
||||
|
||||
class: title, self-paced
|
||||
|
||||
Part 2
|
||||
|
||||
---
|
||||
|
||||
class: self-paced
|
||||
|
||||
## Before we start ...
|
||||
|
||||
The following exercises assume that you have a 5-node Swarm cluster.
|
||||
|
||||
If you come here from a previous tutorial and still have your cluster: great!
|
||||
|
||||
Otherwise: check [part 1](#part-1) to learn how to set up your own cluster.
|
||||
|
||||
We pick up exactly where we left you, so we assume that you have:
|
||||
|
||||
- a five-node Swarm cluster,
|
||||
|
||||
- a self-hosted registry,
|
||||
|
||||
- DockerCoins up and running.
|
||||
|
||||
The next slide has a cheat sheet if you need to set that up in a pinch.
|
||||
|
||||
---
|
||||
|
||||
class: self-paced
|
||||
|
||||
## Catching up
|
||||
|
||||
Assuming you have 5 nodes provided by
|
||||
[Play-With-Docker](http://www.play-with-docker.com/), do this from `node1`:
|
||||
|
||||
```bash
|
||||
docker swarm init --advertise-addr eth0
|
||||
TOKEN=$(docker swarm join-token -q manager)
|
||||
for N in $(seq 2 5); do
|
||||
DOCKER_HOST=tcp://node$N:2375 docker swarm join --token $TOKEN node1:2377
|
||||
done
|
||||
git clone git://github.com/jpetazzo/orchestration-workshop
|
||||
cd orchestration-workshop/stacks
|
||||
docker stack deploy --compose-file registry.yml registry
|
||||
docker-compose -f dockercoins.yml build
|
||||
docker-compose -f dockercoins.yml push
|
||||
docker stack deploy --compose-file dockercoins.yml dockercoins
|
||||
```
|
||||
|
||||
You should now be able to connect to port 8000 and see the DockerCoins web UI.
|
||||
357
docs/ourapponkube.md
Normal file
@@ -0,0 +1,357 @@
|
||||
class: title
|
||||
|
||||
Our app on Kube
|
||||
|
||||
---
|
||||
|
||||
## What's on the menu?
|
||||
|
||||
In this part, we will:
|
||||
|
||||
- **build** images for our app,
|
||||
|
||||
- **ship** these images with a registry,
|
||||
|
||||
- **run** deployments using these images,
|
||||
|
||||
- expose these deployments so they can communicate with each other,
|
||||
|
||||
- expose the web UI so we can access it from outside.
|
||||
|
||||
---
|
||||
|
||||
## The plan
|
||||
|
||||
- Build on our control node (`node1`)
|
||||
|
||||
- Tag images so that they are named `$REGISTRY/servicename`
|
||||
|
||||
- Upload them to a registry
|
||||
|
||||
- Create deployments using the images
|
||||
|
||||
- Expose (with a ClusterIP) the services that need to communicate
|
||||
|
||||
- Expose (with a NodePort) the WebUI
|
||||
|
||||
---
|
||||
|
||||
## Which registry do we want to use?
|
||||
|
||||
- We could use the Docker Hub
|
||||
|
||||
- Or a service offered by our cloud provider (GCR, ECR...)
|
||||
|
||||
- Or we could just self-host that registry
|
||||
|
||||
*We'll self-host the registry because it's the most generic solution for this workshop.*
|
||||
|
||||
---
|
||||
|
||||
## Using the open source registry
|
||||
|
||||
- We need to run a `registry:2` container
|
||||
<br/>(make sure you specify tag `:2` to run the new version!)
|
||||
|
||||
- It will store images and layers to the local filesystem
|
||||
<br/>(but you can add a config file to use S3, Swift, etc.)
|
||||
|
||||
- Docker *requires* TLS when communicating with the registry
|
||||
|
||||
- except for registries on `127.0.0.0/8` (i.e. `localhost`)
|
||||
|
||||
- or with the Engine flag `--insecure-registry`
|
||||
|
||||
- Our strategy: publish the registry container on a NodePort,
|
||||
<br/>so that it's available through `127.0.0.1:xxxxx` on each node
|
||||
|
||||
---
|
||||
|
||||
# Deploying a self-hosted registry
|
||||
|
||||
- We will deploy a registry container, and expose it with a NodePort
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create the registry service:
|
||||
```bash
|
||||
kubectl run registry --image=registry:2
|
||||
```
|
||||
|
||||
- Expose it on a NodePort:
|
||||
```bash
|
||||
kubectl expose deploy/registry --port=5000 --type=NodePort
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Connecting to our registry
|
||||
|
||||
- We need to find out which port has been allocated
|
||||
|
||||
.exercise[
|
||||
|
||||
- View the service details:
|
||||
```bash
|
||||
kubectl describe svc/registry
|
||||
```
|
||||
|
||||
- Get the port number programmatically:
|
||||
```bash
|
||||
NODEPORT=$(kubectl get svc/registry -o json | jq .spec.ports[0].nodePort)
|
||||
REGISTRY=127.0.0.1:$NODEPORT
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Testing our registry
|
||||
|
||||
- A convenient Docker registry API route to remember is `/v2/_catalog`
|
||||
|
||||
.exercise[
|
||||
|
||||
- View the repositories currently held in our registry:
|
||||
```bash
|
||||
curl $REGISTRY/v2/_catalog
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
We should see:
|
||||
```json
|
||||
{"repositories":[]}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Testing our local registry
|
||||
|
||||
- We can retag a small image, and push it to the registry
|
||||
|
||||
.exercise[
|
||||
|
||||
- Make sure we have the busybox image, and retag it:
|
||||
```bash
|
||||
docker pull busybox
|
||||
docker tag busybox $REGISTRY/busybox
|
||||
```
|
||||
|
||||
- Push it:
|
||||
```bash
|
||||
docker push $REGISTRY/busybox
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Checking again what's on our local registry
|
||||
|
||||
- Let's use the same endpoint as before
|
||||
|
||||
.exercise[
|
||||
|
||||
- Ensure that our busybox image is now in the local registry:
|
||||
```bash
|
||||
curl $REGISTRY/v2/_catalog
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
The curl command should now output:
|
||||
```json
|
||||
{"repositories":["busybox"]}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Building and pushing our images
|
||||
|
||||
- We are going to use a convenient feature of Docker Compose
|
||||
|
||||
.exercise[
|
||||
|
||||
- Go to the `stacks` directory:
|
||||
```bash
|
||||
cd ~/orchestration-workshop/stacks
|
||||
```
|
||||
|
||||
- Build and push the images:
|
||||
```bash
|
||||
export REGISTRY
|
||||
docker-compose -f dockercoins.yml build
|
||||
docker-compose -f dockercoins.yml push
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Let's have a look at the `dockercoins.yml` file while this is building and pushing.
|
||||
|
||||
---
|
||||
|
||||
```yaml
|
||||
version: "3"
|
||||
|
||||
services:
|
||||
rng:
|
||||
build: dockercoins/rng
|
||||
image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest}
|
||||
deploy:
|
||||
mode: global
|
||||
...
|
||||
redis:
|
||||
image: redis
|
||||
...
|
||||
worker:
|
||||
build: dockercoins/worker
|
||||
image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest}
|
||||
...
|
||||
deploy:
|
||||
replicas: 10
|
||||
```
|
||||
|
||||
.warning[Just in case you were wondering ... Docker "services" are not Kubernetes "services".]
|
||||
|
||||
---
|
||||
|
||||
## Deploying all the things
|
||||
|
||||
- We can now deploy our code (as well as a redis instance)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Deploy `redis`:
|
||||
```bash
|
||||
kubectl run redis --image=redis
|
||||
```
|
||||
|
||||
- Deploy everything else:
|
||||
```bash
|
||||
for SERVICE in hasher rng webui worker; do
|
||||
kubectl run $SERVICE --image=$REGISTRY/$SERVICE
|
||||
done
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Is this working?
|
||||
|
||||
- After waiting for the deployment to complete, let's look at the logs!
|
||||
|
||||
(Hint: use `kubectl get deploy -w` to watch deployment events)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Look at some logs:
|
||||
```bash
|
||||
kubectl logs deploy/rng
|
||||
kubectl logs deploy/worker
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
🤔 `rng` is fine ... But not `worker`.
|
||||
|
||||
--
|
||||
|
||||
💡 Oh right! We forgot to `expose`.
|
||||
|
||||
---
|
||||
|
||||
# Exposing services internally
|
||||
|
||||
- Three deployments need to be reachable by others: `hasher`, `redis`, `rng`
|
||||
|
||||
- `worker` doesn't need to be exposed
|
||||
|
||||
- `webui` will be dealt with later
|
||||
|
||||
.exercise[
|
||||
|
||||
- Expose each deployment, specifying the right port:
|
||||
```bash
|
||||
kubectl expose deployment redis --port 6379
|
||||
kubectl expose deployment rng --port 80
|
||||
kubectl expose deployment hasher --port 80
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Is this working yet?
|
||||
|
||||
- The `worker` runs an infinite loop that retries 10 seconds after an error
|
||||
|
||||
.exercise[
|
||||
|
||||
- Stream the worker's logs:
|
||||
```bash
|
||||
kubectl logs deploy/worker --follow
|
||||
```
|
||||
|
||||
(Give it about 10 seconds to recover)
|
||||
|
||||
<!--
|
||||
```keys
|
||||
^C
|
||||
```
|
||||
-->
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
We should now see the `worker`, well, working happily.
|
||||
|
||||
---
|
||||
|
||||
# Exposing services for external access
|
||||
|
||||
- Now we would like to access the Web UI
|
||||
|
||||
- We will expose it with a `NodePort`
|
||||
|
||||
(just like we did for the registry)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create a `NodePort` service for the Web UI:
|
||||
```bash
|
||||
kubectl expose deploy/webui --type=NodePort --port=80
|
||||
```
|
||||
|
||||
- Check the port that was allocated:
|
||||
```bash
|
||||
kubectl get svc
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Accessing the web UI
|
||||
|
||||
- We can now connect to *any node*, on the allocated node port, to view the web UI
|
||||
|
||||
.exercise[
|
||||
|
||||
- Open the web UI in your browser (http://node-ip-address:3xxxx/)
|
||||
|
||||
<!-- ```open http://node1:3xxxx/``` -->
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
*Alright, we're back to where we started, when we were running on a single node!*
|
||||
979
docs/ourapponswarm.md
Normal file
@@ -0,0 +1,979 @@
|
||||
class: title
|
||||
|
||||
Our app on Swarm
|
||||
|
||||
---
|
||||
|
||||
## What's on the menu?
|
||||
|
||||
In this part, we will:
|
||||
|
||||
- **build** images for our app,
|
||||
|
||||
- **ship** these images with a registry,
|
||||
|
||||
- **run** services using these images.
|
||||
|
||||
---
|
||||
|
||||
## Why do we need to ship our images?
|
||||
|
||||
- When we do `docker-compose up`, images are built for our services
|
||||
|
||||
- These images are present only on the local node
|
||||
|
||||
- We need these images to be distributed on the whole Swarm
|
||||
|
||||
- The easiest way to achieve that is to use a Docker registry
|
||||
|
||||
- Once our images are on a registry, we can reference them when
|
||||
creating our services
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Build, ship, and run, for a single service
|
||||
|
||||
If we had only one service (built from a `Dockerfile` in the
|
||||
current directory), our workflow could look like this:
|
||||
|
||||
```
|
||||
docker build -t jpetazzo/doublerainbow:v0.1 .
|
||||
docker push jpetazzo/doublerainbow:v0.1
|
||||
docker service create jpetazzo/doublerainbow:v0.1
|
||||
```
|
||||
|
||||
We just have to adapt this to our application, which has 4 services!
|
||||
|
||||
---
|
||||
|
||||
## The plan
|
||||
|
||||
- Build on our local node (`node1`)
|
||||
|
||||
- Tag images so that they are named `localhost:5000/servicename`
|
||||
|
||||
- Upload them to a registry
|
||||
|
||||
- Create services using the images
|
||||
|
||||
---
|
||||
|
||||
## Which registry do we want to use?
|
||||
|
||||
.small[
|
||||
|
||||
- **Docker Hub**
|
||||
|
||||
- hosted by Docker Inc.
|
||||
- requires an account (free, no credit card needed)
|
||||
- images will be public (unless you pay)
|
||||
- located in AWS EC2 us-east-1
|
||||
|
||||
- **Docker Trusted Registry**
|
||||
|
||||
- self-hosted commercial product
|
||||
- requires a subscription (free 30-day trial available)
|
||||
- images can be public or private
|
||||
- located wherever you want
|
||||
|
||||
- **Docker open source registry**
|
||||
|
||||
- self-hosted barebones repository hosting
|
||||
- doesn't require anything
|
||||
- doesn't come with anything either
|
||||
- located wherever you want
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Using Docker Hub
|
||||
|
||||
*If we wanted to use the Docker Hub...*
|
||||
|
||||
<!--
|
||||
```meta
|
||||
^{
|
||||
```
|
||||
-->
|
||||
|
||||
- We would log into the Docker Hub:
|
||||
```bash
|
||||
docker login
|
||||
```
|
||||
|
||||
- And in the following slides, we would use our Docker Hub login
|
||||
(e.g. `jpetazzo`) instead of the registry address (i.e. `127.0.0.1:5000`)
|
||||
|
||||
<!--
|
||||
```meta
|
||||
^}
|
||||
```
|
||||
-->
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Using Docker Trusted Registry
|
||||
|
||||
*If we wanted to use DTR, we would...*
|
||||
|
||||
- Make sure we have a Docker Hub account
|
||||
|
||||
- [Activate a Docker Datacenter subscription](
|
||||
https://hub.docker.com/enterprise/trial/)
|
||||
|
||||
- Install DTR on our machines
|
||||
|
||||
- Use `dtraddress:port/user` instead of the registry address
|
||||
|
||||
*This is out of the scope of this workshop!*
|
||||
|
||||
---
|
||||
|
||||
## Using the open source registry
|
||||
|
||||
- We need to run a `registry:2` container
|
||||
<br/>(make sure you specify tag `:2` to run the new version!)
|
||||
|
||||
- It will store images and layers to the local filesystem
|
||||
<br/>(but you can add a config file to use S3, Swift, etc.)
|
||||
|
||||
- Docker *requires* TLS when communicating with the registry
|
||||
|
||||
- except for registries on `127.0.0.0/8` (i.e. `localhost`)
|
||||
|
||||
- or with the Engine flag `--insecure-registry`
|
||||
|
||||
<!-- -->
|
||||
|
||||
- Our strategy: publish the registry container on port 5000,
|
||||
<br/>so that it's available through `127.0.0.1:5000` on each node
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
# Deploying a local registry
|
||||
|
||||
- We will create a single-instance service, publishing its port
|
||||
on the whole cluster
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create the registry service:
|
||||
```bash
|
||||
docker service create --name registry --publish 5000:5000 registry:2
|
||||
```
|
||||
|
||||
- Now try the following command; it should return `{"repositories":[]}`:
|
||||
```bash
|
||||
curl 127.0.0.1:5000/v2/_catalog
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
(If that doesn't work, wait a few seconds and try again.)
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Testing our local registry
|
||||
|
||||
- We can retag a small image, and push it to the registry
|
||||
|
||||
.exercise[
|
||||
|
||||
- Make sure we have the busybox image, and retag it:
|
||||
```bash
|
||||
docker pull busybox
|
||||
docker tag busybox 127.0.0.1:5000/busybox
|
||||
```
|
||||
|
||||
- Push it:
|
||||
```bash
|
||||
docker push 127.0.0.1:5000/busybox
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Checking what's on our local registry
|
||||
|
||||
- The registry API has endpoints to query what's there
|
||||
|
||||
.exercise[
|
||||
|
||||
- Ensure that our busybox image is now in the local registry:
|
||||
```bash
|
||||
curl http://127.0.0.1:5000/v2/_catalog
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
The curl command should now output:
|
||||
```json
|
||||
{"repositories":["busybox"]}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Build, tag, and push our application container images
|
||||
|
||||
- Compose has named our images `dockercoins_XXX` for each service
|
||||
|
||||
- We need to retag them (to `127.0.0.1:5000/XXX:v1`) and push them
|
||||
|
||||
.exercise[
|
||||
|
||||
- Set `REGISTRY` and `TAG` environment variables to use our local registry
|
||||
- And run this little for loop:
|
||||
```bash
|
||||
cd ~/orchestration-workshop/dockercoins
|
||||
REGISTRY=127.0.0.1:5000 TAG=v1
|
||||
for SERVICE in hasher rng webui worker; do
|
||||
docker tag dockercoins_$SERVICE $REGISTRY/$SERVICE:$TAG
|
||||
docker push $REGISTRY/$SERVICE
|
||||
done
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
# Overlay networks
|
||||
|
||||
- SwarmKit integrates with overlay networks
|
||||
|
||||
- Networks are created with `docker network create`
|
||||
|
||||
- Make sure to specify that you want an *overlay* network
|
||||
<br/>(otherwise you will get a local *bridge* network by default)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create an overlay network for our application:
|
||||
```bash
|
||||
docker network create --driver overlay dockercoins
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Viewing existing networks
|
||||
|
||||
- Let's confirm that our network was created
|
||||
|
||||
.exercise[
|
||||
|
||||
- List existing networks:
|
||||
```bash
|
||||
docker network ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Can you spot the differences?
|
||||
|
||||
The networks `dockercoins` and `ingress` are different from the other ones.
|
||||
|
||||
Can you see how?
|
||||
|
||||
--
|
||||
|
||||
class: manual-btp
|
||||
|
||||
- They are using a different kind of ID, reflecting the fact that they
|
||||
are SwarmKit objects instead of "classic" Docker Engine objects.
|
||||
|
||||
- Their *scope* is `swarm` instead of `local`.
|
||||
|
||||
- They are using the overlay driver.
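One quick way to check this (a sketch; the output format may differ slightly across versions):

```bash
# Compare the scope and driver of our overlay networks with the default bridge
docker network inspect --format '{{ .Name }}: scope={{ .Scope }}, driver={{ .Driver }}' \
       dockercoins ingress bridge
```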
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp, extra-details
|
||||
|
||||
## Caveats
|
||||
|
||||
.warning[In Docker 1.12, you cannot join an overlay network with `docker run --net ...`.]
|
||||
|
||||
Starting with version 1.13, you can, if the network was created with the `--attachable` flag.
|
||||
|
||||
*Why is that?*
|
||||
|
||||
Placing a container on a network requires allocating an IP address for this container.
|
||||
|
||||
The allocation must be done by a manager node (worker nodes cannot update Raft data).
|
||||
|
||||
As a result, `docker run --net ...` requires collaboration with manager nodes.
|
||||
|
||||
It alters the code path for `docker run`, so it is allowed only under strict circumstances.
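With Docker 1.13 or later, a minimal sketch of the `--attachable` workflow (the network name is just an example):

```bash
# Create an overlay network that standalone containers are allowed to join
docker network create --driver overlay --attachable debugnet

# This is now accepted (it would be rejected on a non-attachable overlay network)
docker run --rm -ti --net debugnet alpine sh
```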
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Run the application
|
||||
|
||||
- First, create the `redis` service; that one is using a Docker Hub image
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create the `redis` service:
|
||||
```bash
|
||||
docker service create --network dockercoins --name redis redis
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Run the other services
|
||||
|
||||
- Then, start the other services one by one
|
||||
|
||||
- We will use the images pushed previously
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start the other services:
|
||||
```bash
|
||||
REGISTRY=127.0.0.1:5000
|
||||
TAG=v1
|
||||
for SERVICE in hasher rng webui worker; do
|
||||
docker service create --network dockercoins --detach=true \
|
||||
--name $SERVICE $REGISTRY/$SERVICE:$TAG
|
||||
done
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
???
|
||||
|
||||
## Wait for our application to be up
|
||||
|
||||
- We will see later a way to watch progress for all the tasks of the cluster
|
||||
|
||||
- But for now, a scrappy Shell loop will do the trick
|
||||
|
||||
.exercise[
|
||||
|
||||
- Repeatedly display the status of all our services:
|
||||
```bash
|
||||
watch "docker service ls -q | xargs -n1 docker service ps"
|
||||
```
|
||||
|
||||
- Stop it once everything is running
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Expose our application web UI
|
||||
|
||||
- We need to connect to the `webui` service, but it is not publishing any port
|
||||
|
||||
- Let's reconfigure it to publish a port
|
||||
|
||||
.exercise[
|
||||
|
||||
- Update `webui` so that we can connect to it from outside:
|
||||
```bash
|
||||
docker service update webui --publish-add 8000:80 --detach=false
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: to "de-publish" a port, you would have to specify the container port.
|
||||
<br/>(i.e. in that case, `--publish-rm 80`)
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## What happens when we modify a service?
|
||||
|
||||
- Let's find out what happened to our `webui` service
|
||||
|
||||
.exercise[
|
||||
|
||||
- Look at the tasks and containers associated to `webui`:
|
||||
```bash
|
||||
docker service ps webui
|
||||
```
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
class: manual-btp
|
||||
|
||||
The first version of the service (the one that was not exposed) has been shut down.
|
||||
|
||||
It has been replaced by the new version, with port 80 accessible from outside.
|
||||
|
||||
(This will be discussed with more details in the section about stateful services.)
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Connect to the web UI
|
||||
|
||||
- The web UI is now available on port 8000, *on all the nodes of the cluster*
|
||||
|
||||
.exercise[
|
||||
|
||||
- If you're using Play-With-Docker, just click on the `(8000)` badge
|
||||
|
||||
- Otherwise, point your browser to any node, on port 8000
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Scaling the application
|
||||
|
||||
- We can change scaling parameters with `docker service update` as well
|
||||
|
||||
- We will do the equivalent of `docker-compose scale`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Bring up more workers:
|
||||
```bash
|
||||
docker service update worker --replicas 10 --detach=false
|
||||
```
|
||||
|
||||
- Check the result in the web UI
|
||||
|
||||
]
|
||||
|
||||
You should see the performance peaking at 10 hashes/s (like before).
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
# Global scheduling
|
||||
|
||||
- We want to make the best possible use of the entropy generators
|
||||
on our nodes
|
||||
|
||||
- We want to run exactly one `rng` instance per node
|
||||
|
||||
- SwarmKit has a special scheduling mode for that, let's use it
|
||||
|
||||
- We cannot enable/disable global scheduling on an existing service
|
||||
|
||||
- We have to destroy and re-create the `rng` service
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Scaling the `rng` service
|
||||
|
||||
.exercise[
|
||||
|
||||
- Remove the existing `rng` service:
|
||||
```bash
|
||||
docker service rm rng
|
||||
```
|
||||
|
||||
- Re-create the `rng` service with *global scheduling*:
|
||||
```bash
|
||||
docker service create --name rng --network dockercoins --mode global \
|
||||
--detach=false $REGISTRY/rng:$TAG
|
||||
```
|
||||
|
||||
- Look at the result in the web UI
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: extra-details, manual-btp
|
||||
|
||||
## Why do we have to re-create the service to enable global scheduling?
|
||||
|
||||
- Enabling it dynamically would make rolling update semantics very complex
|
||||
|
||||
- This might change in the future (after all, it was possible in 1.12 RC!)
|
||||
|
||||
- As of Docker Engine 17.05, other parameters that require a `rm`/`create` of the service are:
|
||||
|
||||
- service name
|
||||
|
||||
- hostname
|
||||
|
||||
- network
|
||||
|
||||
---
|
||||
|
||||
class: swarm-ready
|
||||
|
||||
## How did we make our app "Swarm-ready"?
|
||||
|
||||
This app was written in June 2015. (One year before Swarm mode was released.)
|
||||
|
||||
What did we change to make it compatible with Swarm mode?
|
||||
|
||||
--
|
||||
|
||||
.exercise[
|
||||
|
||||
- Go to the app directory:
|
||||
```bash
|
||||
cd ~/orchestration-workshop/dockercoins
|
||||
```
|
||||
|
||||
- See modifications in the code:
|
||||
```bash
|
||||
git log -p --since "4-JUL-2015" -- . ':!*.yml*' ':!*.html'
|
||||
```
|
||||
|
||||
<!-- ```wait commit``` -->
|
||||
<!-- ```keys q``` -->
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: swarm-ready
|
||||
|
||||
## What did we change in our app since its inception?
|
||||
|
||||
- Compose files
|
||||
|
||||
- HTML file (it contains an embedded contextual tweet)
|
||||
|
||||
- Dockerfiles (to switch to smaller images)
|
||||
|
||||
- That's it!
|
||||
|
||||
--
|
||||
|
||||
class: swarm-ready
|
||||
|
||||
*We didn't change a single line of code in this app since it was written.*
|
||||
|
||||
--
|
||||
|
||||
class: swarm-ready
|
||||
|
||||
*The images that were [built in June 2015](
|
||||
https://hub.docker.com/r/jpetazzo/dockercoins_worker/tags/)
|
||||
(when the app was written) can still run today ...
|
||||
<br/>... in Swarm mode (distributed across a cluster, with load balancing) ...
|
||||
<br/>... without any modification.*
|
||||
|
||||
---
|
||||
|
||||
class: swarm-ready
|
||||
|
||||
## How did we design our app in the first place?
|
||||
|
||||
- [Twelve-Factor App](https://12factor.net/) principles
|
||||
|
||||
- Service discovery using DNS names
|
||||
|
||||
- Initially implemented as "links"
|
||||
|
||||
- Then "ambassadors"
|
||||
|
||||
- And now "services"
|
||||
|
||||
- Existing apps might require more changes!
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
# Integration with Compose
|
||||
|
||||
- The previous section showed us how to streamline image build and push
|
||||
|
||||
- We will now see how to streamline service creation
|
||||
|
||||
(i.e. get rid of the `for SERVICE in ...; do docker service create ...` part)
|
||||
|
||||
---
|
||||
|
||||
## Compose file version 3
|
||||
|
||||
(New in Docker Engine 1.13)
|
||||
|
||||
- Almost identical to version 2
|
||||
|
||||
- Can be directly used by a Swarm cluster through `docker stack ...` commands
|
||||
|
||||
- Introduces a `deploy` section to pass Swarm-specific parameters
|
||||
|
||||
- Resource limits are moved to this `deploy` section
|
||||
|
||||
- See [here](https://github.com/aanand/docker.github.io/blob/8524552f99e5b58452fcb1403e1c273385988b71/compose/compose-file.md#upgrading) for the complete list of changes
|
||||
|
||||
- Supersedes *Distributed Application Bundles*
|
||||
|
||||
(JSON payload describing an application; could be generated from a Compose file)
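For instance, here is a sketch of what a `deploy` section can look like (the values are purely illustrative):

```yaml
version: "3"

services:
  worker:
    image: 127.0.0.1:5000/worker:v1
    deploy:
      replicas: 10
      resources:
        limits:
          cpus: "0.5"
          memory: 128M
```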
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Removing everything
|
||||
|
||||
- Before deploying using "stacks," let's get a clean slate
|
||||
|
||||
.exercise[
|
||||
|
||||
- Remove *all* the services:
|
||||
```bash
|
||||
docker service ls -q | xargs docker service rm
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Our first stack
|
||||
|
||||
We need a registry to move images around.
|
||||
|
||||
Without a stack file, it would be deployed with the following command:
|
||||
|
||||
```bash
|
||||
docker service create --publish 5000:5000 registry:2
|
||||
```
|
||||
|
||||
Now, we are going to deploy it with the following stack file:
|
||||
|
||||
```yaml
|
||||
version: "3"
|
||||
|
||||
services:
|
||||
registry:
|
||||
image: registry:2
|
||||
ports:
|
||||
- "5000:5000"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Checking our stack files
|
||||
|
||||
- All the stack files that we will use are in the `stacks` directory
|
||||
|
||||
.exercise[
|
||||
|
||||
- Go to the `stacks` directory:
|
||||
```bash
|
||||
cd ~/orchestration-workshop/stacks
|
||||
```
|
||||
|
||||
- Check `registry.yml`:
|
||||
```bash
|
||||
cat registry.yml
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Deploying our first stack
|
||||
|
||||
- All stack manipulation commands start with `docker stack`
|
||||
|
||||
- Under the hood, they map to `docker service` commands
|
||||
|
||||
- Stacks have a *name* (which also serves as a namespace)
|
||||
|
||||
- Stacks are specified with the aforementioned Compose file format version 3
|
||||
|
||||
.exercise[
|
||||
|
||||
- Deploy our local registry:
|
||||
```bash
|
||||
docker stack deploy registry --compose-file registry.yml
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Inspecting stacks
|
||||
|
||||
- `docker stack ps` shows the detailed state of all services of a stack
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check that our registry is running correctly:
|
||||
```bash
|
||||
docker stack ps registry
|
||||
```
|
||||
|
||||
- Confirm that we get the same output with the following command:
|
||||
```bash
|
||||
docker service ps registry_registry
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: manual-btp
|
||||
|
||||
## Specifics of stack deployment
|
||||
|
||||
Our registry is not *exactly* identical to the one deployed with `docker service create`!
|
||||
|
||||
- Each stack gets its own overlay network
|
||||
|
||||
- Services of the stack are connected to this network
|
||||
<br/>(unless specified differently in the Compose file)
|
||||
|
||||
- Services get network aliases matching their name in the Compose file
|
||||
<br/>(just like when Compose brings up an app specified in a v2 file)
|
||||
|
||||
- Services are explicitly named `<stack_name>_<service_name>`
|
||||
|
||||
- Services and tasks also get an internal label indicating which stack they belong to
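For instance, here is one way to look at that label (a sketch; the label name shown is the one used by recent Docker versions and might differ in older releases):

```bash
# Show which stack the registry service belongs to
docker service inspect registry_registry \
       --format '{{ index .Spec.Labels "com.docker.stack.namespace" }}'
```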
|
||||
|
||||
---
|
||||
|
||||
class: auto-btp
|
||||
|
||||
## Testing our local registry
|
||||
|
||||
- Connecting to port 5000 *on any node of the cluster* routes us to the registry
|
||||
|
||||
- Therefore, we can use `localhost:5000` or `127.0.0.1:5000` as our registry
|
||||
|
||||
.exercise[
|
||||
|
||||
- Issue the following API request to the registry:
|
||||
```bash
|
||||
curl 127.0.0.1:5000/v2/_catalog
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
It should return:
|
||||
|
||||
```json
|
||||
{"repositories":[]}
|
||||
```
|
||||
|
||||
If that doesn't work, retry a few times; perhaps the container is still starting.
|
||||
|
||||
---
|
||||
|
||||
class: auto-btp
|
||||
|
||||
## Pushing an image to our local registry
|
||||
|
||||
- We can retag a small image, and push it to the registry
|
||||
|
||||
.exercise[
|
||||
|
||||
- Make sure we have the busybox image, and retag it:
|
||||
```bash
|
||||
docker pull busybox
|
||||
docker tag busybox 127.0.0.1:5000/busybox
|
||||
```
|
||||
|
||||
- Push it:
|
||||
```bash
|
||||
docker push 127.0.0.1:5000/busybox
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: auto-btp
|
||||
|
||||
## Checking what's on our local registry
|
||||
|
||||
- The registry API has endpoints to query what's there
|
||||
|
||||
.exercise[
|
||||
|
||||
- Ensure that our busybox image is now in the local registry:
|
||||
```bash
|
||||
curl http://127.0.0.1:5000/v2/_catalog
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
The curl command should now output:
|
||||
```json
|
||||
"repositories":["busybox"]}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Building and pushing stack services
|
||||
|
||||
- When using Compose file version 2 and above, you can specify *both* `build` and `image`
|
||||
|
||||
- When both keys are present:
|
||||
|
||||
- Compose does "business as usual" (uses `build`)
|
||||
|
||||
- but the resulting image is named as indicated by the `image` key
|
||||
<br/>
|
||||
(instead of `<projectname>_<servicename>:latest`)
|
||||
|
||||
- it can be pushed to a registry with `docker-compose push`
|
||||
|
||||
- Example:
|
||||
|
||||
```yaml
|
||||
webfront:
|
||||
build: www
|
||||
image: myregistry.company.net:5000/webfront
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Using Compose to build and push images
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try it:
|
||||
```bash
|
||||
docker-compose -f dockercoins.yml build
|
||||
docker-compose -f dockercoins.yml push
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Let's have a look at the `dockercoins.yml` file while this is building and pushing.
|
||||
|
||||
---
|
||||
|
||||
```yaml
|
||||
version: "3"
|
||||
|
||||
services:
|
||||
rng:
|
||||
build: dockercoins/rng
|
||||
image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest}
|
||||
deploy:
|
||||
mode: global
|
||||
...
|
||||
redis:
|
||||
image: redis
|
||||
...
|
||||
worker:
|
||||
build: dockercoins/worker
|
||||
image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest}
|
||||
...
|
||||
deploy:
|
||||
replicas: 10
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Deploying the application
|
||||
|
||||
- Now that the images are on the registry, we can deploy our application stack
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create the application stack:
|
||||
```bash
|
||||
docker stack deploy dockercoins --compose-file dockercoins.yml
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
We can now connect to any of our nodes on port 8000, and we will see the familiar hashing speed graph.
|
||||
|
||||
---
|
||||
|
||||
## Maintaining multiple environments
|
||||
|
||||
There are many ways to handle variations between environments.
|
||||
|
||||
- Compose loads `docker-compose.yml` and (if it exists) `docker-compose.override.yml`
|
||||
|
||||
- Compose can load alternate file(s) by setting the `-f` flag or the `COMPOSE_FILE` environment variable
|
||||
|
||||
- Compose files can *extend* other Compose files, selectively including services:
|
||||
|
||||
```yaml
|
||||
web:
|
||||
extends:
|
||||
file: common-services.yml
|
||||
service: webapp
|
||||
```
|
||||
|
||||
See [this documentation page](https://docs.docker.com/compose/extends/) for more details about these techniques.
|
||||
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Good to know ...
|
||||
|
||||
- Compose file version 3 adds the `deploy` section
|
||||
|
||||
- Further versions (3.1, ...) add more features (secrets, configs ...)
|
||||
|
||||
- You can re-run `docker stack deploy` to update a stack
|
||||
|
||||
- You can make manual changes with `docker service update` ...
|
||||
|
||||
- ... But they will be wiped out each time you `docker stack deploy`
|
||||
|
||||
(That's the intended behavior, when one thinks about it!)
|
||||
|
||||
- `extends` doesn't work with `docker stack deploy`
|
||||
|
||||
(But you can use `docker-compose config` to "flatten" your configuration)
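For example, a possible workflow (file names are hypothetical, and depending on your Compose version you may need to adjust the generated file):

```bash
# Merge a base file and an environment-specific override into a single file ...
docker-compose -f docker-compose.yml -f docker-compose.prod.yml config > prod-stack.yml

# ... and deploy the flattened result as a stack
docker stack deploy prod --compose-file prod-stack.yml
```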
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
- We've seen how to set up a Swarm
|
||||
|
||||
- We've used it to host our own registry
|
||||
|
||||
- We've built our app container images
|
||||
|
||||
- We've used the registry to host those images
|
||||
|
||||
- We've deployed and scaled our application
|
||||
|
||||
- We've seen how to use Compose to streamline deployments
|
||||
|
||||
- Awesome job, team!
|
||||
161
docs/prereqs-k8s.md
Normal file
@@ -0,0 +1,161 @@
|
||||
# Pre-requirements
|
||||
|
||||
- Computer with internet connection and a web browser
|
||||
|
||||
- For instructor-led workshops: an SSH client to connect to remote machines
|
||||
|
||||
- on Linux, OS X, FreeBSD... you are probably all set
|
||||
|
||||
- on Windows, get [putty](http://www.putty.org/),
|
||||
Microsoft [Win32 OpenSSH](https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH),
|
||||
[Git BASH](https://git-for-windows.github.io/), or
|
||||
[MobaXterm](http://mobaxterm.mobatek.net/)
|
||||
|
||||
- A tiny little bit of Docker knowledge
|
||||
|
||||
(that's totally OK if you're not a Docker expert!)
|
||||
|
||||
---
|
||||
|
||||
class: in-person, extra-details
|
||||
|
||||
## Nice-to-haves
|
||||
|
||||
- [Mosh](https://mosh.org/) instead of SSH, if your internet connection tends to lose packets
|
||||
<br/>(available with `(apt|yum|brew) install mosh`; then connect with `mosh user@host`)
|
||||
|
||||
- [GitHub](https://github.com/join) account
|
||||
<br/>(if you want to fork the repo)
|
||||
|
||||
- [Slack](https://community.docker.com/registrations/groups/4316) account
|
||||
<br/>(to join the conversation after the workshop)
|
||||
|
||||
- [Docker Hub](https://hub.docker.com) account
|
||||
<br/>(it's one way to distribute images on your cluster)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Extra details
|
||||
|
||||
- This slide should have a little magnifying glass in the top left corner
|
||||
|
||||
(If it doesn't, it's because CSS is hard — Jérôme is only a backend person, alas)
|
||||
|
||||
- Slides with that magnifying glass indicate slides providing extra details
|
||||
|
||||
- Feel free to skip them if you're in a hurry!
|
||||
|
||||
---
|
||||
|
||||
## Hands-on sections
|
||||
|
||||
- The whole workshop is hands-on
|
||||
|
||||
- We will see Docker and Kubernetes in action
|
||||
|
||||
- You are invited to reproduce all the demos
|
||||
|
||||
- All hands-on sections are clearly identified, like the gray rectangle below
|
||||
|
||||
.exercise[
|
||||
|
||||
- This is the stuff you're supposed to do!
|
||||
- Go to [container.training](http://container.training/) to view these slides
|
||||
- Join the chat room on @@CHAT@@
|
||||
|
||||
<!-- ```open http://container.training/``` -->
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: pic, in-person
|
||||
|
||||

|
||||
|
||||
<!--
|
||||
```bash
|
||||
kubectl get all -o name | grep -v services/kubernetes | xargs -n1 kubectl delete
|
||||
```
|
||||
-->
|
||||
|
||||
---
|
||||
|
||||
class: in-person
|
||||
|
||||
## You get five VMs
|
||||
|
||||
- Each person gets 5 private VMs (not shared with anybody else)
|
||||
- Kubernetes has been deployed and pre-configured on these machines
|
||||
- They'll remain up until the day after the tutorial
|
||||
- You should have a little card with login+password+IP addresses
|
||||
- You can automatically SSH from one VM to another
|
||||
|
||||
.exercise[
|
||||
|
||||
<!--
|
||||
```bash
|
||||
for N in $(seq 1 5); do
|
||||
ssh -o StrictHostKeyChecking=no node$N true
|
||||
done
|
||||
```
|
||||
-->
|
||||
|
||||
- Log into the first VM (`node1`) with SSH or MOSH
|
||||
- Check that you can SSH (without password) to `node2`:
|
||||
```bash
|
||||
ssh node2
|
||||
```
|
||||
- Type `exit` or `^D` to come back to node1
|
||||
|
||||
<!-- ```bash exit``` -->
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## We will (mostly) interact with node1 only
|
||||
|
||||
- Unless instructed, **all commands must be run from the first VM, `node1`**
|
||||
|
||||
- We will only checkout/copy the code on `node1`
|
||||
|
||||
- During normal operations, we do not need access to the other nodes
|
||||
|
||||
- If we had to troubleshoot issues, we would use a combination of:
|
||||
|
||||
- SSH (to access system logs, daemon status...)
|
||||
|
||||
- Docker API (to check running containers and container engine status)
|
||||
|
||||
---
|
||||
|
||||
## Terminals
|
||||
|
||||
Once in a while, the instructions will say:
|
||||
<br/>"Open a new terminal."
|
||||
|
||||
There are multiple ways to do this:
|
||||
|
||||
- create a new window or tab on your machine, and SSH into the VM;
|
||||
|
||||
- use screen or tmux on the VM and open a new window from there.
|
||||
|
||||
You are welcome to use the method that you feel the most comfortable with.
|
||||
|
||||
---
|
||||
|
||||
## Tmux cheatsheet
|
||||
|
||||
- Ctrl-b c → creates a new window
|
||||
- Ctrl-b n → go to next window
|
||||
- Ctrl-b p → go to previous window
|
||||
- Ctrl-b " → split window top/bottom
|
||||
- Ctrl-b % → split window left/right
|
||||
- Ctrl-b Alt-1 → rearrange windows in columns
|
||||
- Ctrl-b Alt-2 → rearrange windows in rows
|
||||
- Ctrl-b arrows → navigate to other windows
|
||||
- Ctrl-b d → detach session
|
||||
- tmux attach → reattach to session
|
||||
226
docs/prereqs.md
Normal file
@@ -0,0 +1,226 @@
|
||||
# Pre-requirements
|
||||
|
||||
- Computer with internet connection and a web browser
|
||||
|
||||
- For instructor-led workshops: an SSH client to connect to remote machines
|
||||
|
||||
- on Linux, OS X, FreeBSD... you are probably all set
|
||||
|
||||
- on Windows, get [putty](http://www.putty.org/),
|
||||
Microsoft [Win32 OpenSSH](https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH),
|
||||
[Git BASH](https://git-for-windows.github.io/), or
|
||||
[MobaXterm](http://mobaxterm.mobatek.net/)
|
||||
|
||||
- For self-paced learning: SSH is not necessary if you use
|
||||
[Play-With-Docker](http://www.play-with-docker.com/)
|
||||
|
||||
- Some Docker knowledge
|
||||
|
||||
(but that's OK if you're not a Docker expert!)
|
||||
|
||||
---
|
||||
|
||||
class: in-person, extra-details
|
||||
|
||||
## Nice-to-haves
|
||||
|
||||
- [Mosh](https://mosh.org/) instead of SSH, if your internet connection tends to lose packets
|
||||
<br/>(available with `(apt|yum|brew) install mosh`; then connect with `mosh user@host`)
|
||||
|
||||
- [GitHub](https://github.com/join) account
|
||||
<br/>(if you want to fork the repo)
|
||||
|
||||
- [Slack](https://community.docker.com/registrations/groups/4316) account
|
||||
<br/>(to join the conversation after the workshop)
|
||||
|
||||
- [Docker Hub](https://hub.docker.com) account
|
||||
<br/>(it's one way to distribute images on your cluster)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Extra details
|
||||
|
||||
- This slide should have a little magnifying glass in the top left corner
|
||||
|
||||
(If it doesn't, it's because CSS is hard — Jérôme is only a backend person, alas)
|
||||
|
||||
- Slides with that magnifying glass indicate slides providing extra details
|
||||
|
||||
- Feel free to skip them if you're in a hurry!
|
||||
|
||||
---
|
||||
|
||||
## Hands-on sections
|
||||
|
||||
- The whole workshop is hands-on
|
||||
|
||||
- We will see Docker in action
|
||||
|
||||
- You are invited to reproduce all the demos
|
||||
|
||||
- All hands-on sections are clearly identified, like the gray rectangle below
|
||||
|
||||
.exercise[
|
||||
|
||||
- This is the stuff you're supposed to do!
|
||||
- Go to [container.training](http://container.training/) to view these slides
|
||||
- Join the chat room on @@CHAT@@
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: in-person
|
||||
|
||||
# VM environment
|
||||
|
||||
- To follow along, you need a cluster of five Docker Engines
|
||||
|
||||
- If you are doing this with an instructor, see next slide
|
||||
|
||||
- If you are doing (or re-doing) this on your own, you can:
|
||||
|
||||
- create your own cluster (local or cloud VMs) with Docker Machine
|
||||
([instructions](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-machine))
|
||||
|
||||
- use [Play-With-Docker](http://play-with-docker.com) ([instructions](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker))
|
||||
|
||||
- create a bunch of clusters for you and your friends
|
||||
([instructions](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-vms))
|
||||
|
||||
---
|
||||
|
||||
class: pic, in-person
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
class: in-person
|
||||
|
||||
## You get five VMs
|
||||
|
||||
- Each person gets 5 private VMs (not shared with anybody else)
|
||||
- They'll remain up until the day after the tutorial
|
||||
- You should have a little card with login+password+IP addresses
|
||||
- You can automatically SSH from one VM to another
|
||||
|
||||
.exercise[
|
||||
|
||||
<!--
|
||||
```bash
|
||||
for N in $(seq 1 5); do
|
||||
ssh -o StrictHostKeyChecking=no node$N true
|
||||
done
|
||||
```
|
||||
-->
|
||||
|
||||
- Log into the first VM (`node1`) with SSH or MOSH
|
||||
- Check that you can SSH (without password) to `node2`:
|
||||
```bash
|
||||
ssh node2
|
||||
```
|
||||
- Type `exit` or `^D` to come back to node1
|
||||
|
||||
<!-- ```bash exit``` -->
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## If doing or re-doing the workshop on your own ...
|
||||
|
||||
- Use [Play-With-Docker](http://www.play-with-docker.com/)!
|
||||
|
||||
- Main differences:
|
||||
|
||||
- you don't need to SSH to the machines
|
||||
<br/>(just click on the node that you want to control in the left tab bar)
|
||||
|
||||
- Play-With-Docker automagically detects exposed ports
|
||||
<br/>(and displays them as little badges with port numbers, above the terminal)
|
||||
|
||||
- You can access HTTP services by clicking on the port numbers
|
||||
|
||||
- exposing TCP services requires something like
|
||||
[ngrok](https://ngrok.com/)
|
||||
or [supergrok](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker)
|
||||
|
||||
<!--
|
||||
|
||||
- If you use VMs deployed with Docker Machine:
|
||||
|
||||
- you won't have pre-authorized SSH keys to bounce across machines
|
||||
|
||||
- you won't have host aliases
|
||||
|
||||
-->
|
||||
|
||||
---
|
||||
|
||||
class: self-paced
|
||||
|
||||
## Using Play-With-Docker
|
||||
|
||||
- Open a new browser tab to [www.play-with-docker.com](http://www.play-with-docker.com/)
|
||||
|
||||
- Confirm that you're not a robot
|
||||
|
||||
- Click on "ADD NEW INSTANCE": congratulations, you have your first Docker node!
|
||||
|
||||
- When you need more nodes, just click on "ADD NEW INSTANCE" again
|
||||
|
||||
- Note the countdown in the corner; when it expires, your instances are destroyed
|
||||
|
||||
- If you give your URL to somebody else, they can access your nodes too
|
||||
<br/>
|
||||
(You can use that for pair programming, or to get help from a mentor)
|
||||
|
||||
- Loving it? Not loving it? Tell it to the wonderful authors,
|
||||
[@marcosnils](https://twitter.com/marcosnils) &
|
||||
[@xetorthio](https://twitter.com/xetorthio)!
|
||||
|
||||
---
|
||||
|
||||
## We will (mostly) interact with node1 only
|
||||
|
||||
- Unless instructed, **all commands must be run from the first VM, `node1`**
|
||||
|
||||
- We will only checkout/copy the code on `node1`
|
||||
|
||||
- When we use the other nodes, we will do so mostly through the Docker API
|
||||
|
||||
- We will log into other nodes only for initial setup and a few "out of band" operations
|
||||
<br/>(checking internal logs, debugging...)
|
||||
|
||||
---
|
||||
|
||||
## Terminals
|
||||
|
||||
Once in a while, the instructions will say:
|
||||
<br/>"Open a new terminal."
|
||||
|
||||
There are multiple ways to do this:
|
||||
|
||||
- create a new window or tab on your machine, and SSH into the VM;
|
||||
|
||||
- use screen or tmux on the VM and open a new window from there.
|
||||
|
||||
You are welcome to use the method that you feel the most comfortable with.
|
||||
|
||||
---
|
||||
|
||||
## Tmux cheatsheet
|
||||
|
||||
- Ctrl-b c → creates a new window
|
||||
- Ctrl-b n → go to next window
|
||||
- Ctrl-b p → go to previous window
|
||||
- Ctrl-b " → split window top/bottom
|
||||
- Ctrl-b % → split window left/right
|
||||
- Ctrl-b Alt-1 → rearrange windows in columns
|
||||
- Ctrl-b Alt-2 → rearrange windows in rows
|
||||
- Ctrl-b arrows → navigate to other windows
|
||||
- Ctrl-b d → detach session
|
||||
- tmux attach → reattach to session
|
||||
2
docs/requirements.txt
Normal file
@@ -0,0 +1,2 @@
|
||||
# This is for netlify
|
||||
PyYAML
|
||||
139
docs/rollingupdates.md
Normal file
@@ -0,0 +1,139 @@
|
||||
# Rolling updates
|
||||
|
||||
- Let's change a scaled service: `worker`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Edit `worker/worker.py`
|
||||
|
||||
- Locate the `sleep` instruction and change the delay
|
||||
|
||||
- Build, ship, and run our changes:
|
||||
```bash
|
||||
export TAG=v0.4
|
||||
docker-compose -f dockercoins.yml build
|
||||
docker-compose -f dockercoins.yml push
|
||||
docker stack deploy -c dockercoins.yml dockercoins
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Viewing our update as it rolls out
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check the status of the `dockercoins_worker` service:
|
||||
```bash
|
||||
watch docker service ps dockercoins_worker
|
||||
```
|
||||
|
||||
- Hide the tasks that are shutdown:
|
||||
```bash
|
||||
watch -n1 "docker service ps dockercoins_worker | grep -v Shutdown.*Shutdown"
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
If you had stopped the workers earlier, this will automatically restart them.
|
||||
|
||||
By default, SwarmKit does a rolling upgrade, one instance at a time.
|
||||
|
||||
We should therefore see the workers being updated one by one.
|
||||
|
||||
---
|
||||
|
||||
## Changing the upgrade policy
|
||||
|
||||
- We can set upgrade parallelism (how many instances to update at the same time)
|
||||
|
||||
- And upgrade delay (how long to wait between two batches of instances)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Change the parallelism to 2 and the delay to 5 seconds:
|
||||
```bash
|
||||
docker service update dockercoins_worker \
|
||||
--update-parallelism 2 --update-delay 5s
|
||||
```
|
||||
|
||||
]
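To double-check the update policy currently in effect for a service, something like this works (a sketch; the output is raw JSON):

```bash
docker service inspect dockercoins_worker --format '{{ json .Spec.UpdateConfig }}'
```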
|
||||
|
||||
The current upgrade will continue at a faster pace.
|
||||
|
||||
---
|
||||
|
||||
## Changing the policy in the Compose file
|
||||
|
||||
- The policy can also be updated in the Compose file
|
||||
|
||||
- This is done by adding an `update_config` key under the `deploy` key:
|
||||
|
||||
```yaml
|
||||
deploy:
|
||||
replicas: 10
|
||||
update_config:
|
||||
parallelism: 2
|
||||
delay: 10s
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Rolling back
|
||||
|
||||
- At any time (e.g. before the upgrade is complete), we can rollback:
|
||||
|
||||
- by editing the Compose file and redeploying;
|
||||
|
||||
- or with the special `--rollback` flag
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try to rollback the service:
|
||||
```bash
|
||||
docker service update dockercoins_worker --rollback
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
What happens with the web UI graph?
|
||||
|
||||
---
|
||||
|
||||
## The fine print with rollback
|
||||
|
||||
- Rollback reverts to the previous service definition
|
||||
|
||||
- If we visualize successive updates as a stack:
|
||||
|
||||
- it doesn't "pop" the latest update
|
||||
|
||||
- it "pushes" a copy of the previous update on top
|
||||
|
||||
- ergo, rolling back twice does nothing
|
||||
|
||||
- "Service definition" includes rollout cadence
|
||||
|
||||
- Each `docker service update` command = a new service definition
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Timeline of an upgrade
|
||||
|
||||
- SwarmKit will upgrade N instances at a time
|
||||
<br/>(following the `update-parallelism` parameter)
|
||||
|
||||
- New tasks are created, and their desired state is set to `Ready`
|
||||
<br/>.small[(this pulls the image if necessary, ensures resource availability, creates the container ... without starting it)]
|
||||
|
||||
- If the new tasks fail to get to `Ready` state, go back to the previous step
|
||||
<br/>.small[(SwarmKit will try again and again, until the situation is addressed or desired state is updated)]
|
||||
|
||||
- When the new tasks are `Ready`, it sets the old tasks' desired state to `Shutdown`
|
||||
|
||||
- When the old tasks are `Shutdown`, it starts the new tasks
|
||||
|
||||
- Then it waits for the `update-delay`, and continues with the next batch of instances
|
||||
206
docs/rollout.md
Normal file
@@ -0,0 +1,206 @@
|
||||
# Rolling updates
|
||||
|
||||
- By default (without rolling updates), when a scaled resource is updated:
|
||||
|
||||
- new pods are created
|
||||
|
||||
- old pods are terminated
|
||||
|
||||
- ... all at the same time
|
||||
|
||||
- if something goes wrong, ¯\\\_(ツ)\_/¯
|
||||
|
||||
---
|
||||
|
||||
## Rolling updates
|
||||
|
||||
- With rolling updates, when a resource is updated, it happens progressively
|
||||
|
||||
- Two parameters determine the pace of the rollout: `maxUnavailable` and `maxSurge`
|
||||
|
||||
- They can be specified as an absolute number of pods, or as a percentage of the `replicas` count
|
||||
|
||||
- At any given time ...
|
||||
|
||||
- there will always be at least `replicas`-`maxUnavailable` pods available
|
||||
|
||||
- there will never be more than `replicas`+`maxSurge` pods in total
|
||||
|
||||
- there will therefore be up to `maxUnavailable`+`maxSurge` pods being updated
|
||||
|
||||
- We have the possibility to rollback to the previous version
|
||||
<br/>(if the update fails or is unsatisfactory in any way)
|
||||
|
||||
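To check the values currently in effect for a given deployment (assuming the DockerCoins `worker` deployment is running), a quick sketch:

```bash
kubectl describe deployment worker | grep -i strategy
```
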
---
|
||||
|
||||
## Rolling updates in practice
|
||||
|
||||
- As of Kubernetes 1.8, we can do rolling updates with:
|
||||
|
||||
`deployments`, `daemonsets`, `statefulsets`
|
||||
|
||||
- Editing one of these resources will automatically result in a rolling update
|
||||
|
||||
- Rolling updates can be monitored with the `kubectl rollout` subcommand
|
||||
|
||||
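For instance, assuming the DockerCoins `worker` deployment is running, we could follow a rollout and list its revisions like this (a sketch, not a required step):

```bash
kubectl rollout status deployment worker
kubectl rollout history deployment worker
```
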
---
|
||||
|
||||
## Building a new version of the `worker` service
|
||||
|
||||
.exercise[
|
||||
|
||||
- Go to the `stacks` directory:
|
||||
```bash
|
||||
cd ~/orchestration-workshop/stacks
|
||||
```
|
||||
|
||||
- Edit `dockercoins/worker/worker.py`, update the `sleep` line to sleep 1 second
|
||||
|
||||
- Build a new tag and push it to the registry:
|
||||
```bash
|
||||
#export REGISTRY=localhost:3xxxx
|
||||
export TAG=v0.2
|
||||
docker-compose -f dockercoins.yml build
|
||||
docker-compose -f dockercoins.yml push
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Rolling out the new version of the `worker` service
|
||||
|
||||
.exercise[
|
||||
|
||||
- Let's monitor what's going on by opening a few terminals and running:
|
||||
```bash
|
||||
kubectl get pods -w
|
||||
kubectl get replicasets -w
|
||||
kubectl get deployments -w
|
||||
```
|
||||
|
||||
<!-- ```keys ^C``` -->
|
||||
|
||||
- Update `worker` either with `kubectl edit`, or by running:
|
||||
```bash
|
||||
kubectl set image deploy worker worker=$REGISTRY/worker:$TAG
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
That rollout should be pretty quick. What shows in the web UI?
|
||||
|
||||
---
|
||||
|
||||
## Rolling out a boo-boo
|
||||
|
||||
- What happens if we make a mistake?
|
||||
|
||||
.exercise[
|
||||
|
||||
- Update `worker` by specifying a non-existent image:
|
||||
```bash
|
||||
export TAG=v0.3
|
||||
kubectl set image deploy worker worker=$REGISTRY/worker:$TAG
|
||||
```
|
||||
|
||||
- Check what's going on:
|
||||
```bash
|
||||
kubectl rollout status deploy worker
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
Our rollout is stuck. However, the app is not dead (just 10% slower).
|
||||
|
||||
---
|
||||
|
||||
## Recovering from a bad rollout
|
||||
|
||||
- We could push some `v0.3` image
|
||||
|
||||
(the pod retry logic will eventually catch it and the rollout will proceed)
|
||||
|
||||
- Or we could invoke a manual rollback
|
||||
|
||||
.exercise[
|
||||
|
||||
<!--
|
||||
```keys
|
||||
^C
|
||||
```
|
||||
-->
|
||||
|
||||
- Cancel the deployment and wait for the dust to settle down:
|
||||
```bash
|
||||
kubectl rollout undo deploy worker
|
||||
kubectl rollout status deploy worker
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
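Note that `kubectl rollout undo` goes back to the previous revision by default; to go further back, we can target a specific revision (sketch; revision numbers come from `kubectl rollout history`):

```bash
kubectl rollout history deployment worker
kubectl rollout undo deployment worker --to-revision=1
```
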
---
|
||||
|
||||
## Changing rollout parameters
|
||||
|
||||
- We want to:
|
||||
|
||||
- revert to `v0.1`
|
||||
- be conservative on availability (always have desired number of available workers)
|
||||
- be aggressive on rollout speed (update more than one pod at a time)
|
||||
- give some time to our workers to "warm up" before starting more
|
||||
|
||||
The corresponding changes can be expressed in the following YAML snippet:
|
||||
|
||||
.small[
|
||||
```yaml
|
||||
spec:
|
||||
template:
|
||||
spec:
|
||||
containers:
|
||||
- name: worker
|
||||
image: $REGISTRY/worker:v0.1
|
||||
strategy:
|
||||
rollingUpdate:
|
||||
maxUnavailable: 0
|
||||
maxSurge: 3
|
||||
minReadySeconds: 10
|
||||
```
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Applying changes through a YAML patch
|
||||
|
||||
- We could use `kubectl edit deployment worker`
|
||||
|
||||
- But we could also use `kubectl patch` with the exact YAML shown before
|
||||
|
||||
.exercise[
|
||||
|
||||
.small[
|
||||
|
||||
- Apply all our changes and wait for them to take effect:
|
||||
```bash
|
||||
kubectl patch deployment worker -p "
|
||||
spec:
|
||||
template:
|
||||
spec:
|
||||
containers:
|
||||
- name: worker
|
||||
image: $REGISTRY/worker:v0.1
|
||||
strategy:
|
||||
rollingUpdate:
|
||||
maxUnavailable: 0
|
||||
maxSurge: 3
|
||||
minReadySeconds: 10
|
||||
"
|
||||
kubectl rollout status deployment worker
|
||||
```
|
||||
]
|
||||
|
||||
]
|
||||
477
docs/sampleapp.md
Normal file
@@ -0,0 +1,477 @@
|
||||
# Our sample application
|
||||
|
||||
- Visit the GitHub repository with all the materials of this workshop:
|
||||
<br/>https://github.com/jpetazzo/orchestration-workshop
|
||||
|
||||
- The application is in the [dockercoins](
|
||||
https://github.com/jpetazzo/orchestration-workshop/tree/master/dockercoins)
|
||||
subdirectory
|
||||
|
||||
- Let's look at the general layout of the source code:
|
||||
|
||||
there is a Compose file [docker-compose.yml](
|
||||
https://github.com/jpetazzo/orchestration-workshop/blob/master/dockercoins/docker-compose.yml) ...
|
||||
|
||||
... and 4 other services, each in its own directory:
|
||||
|
||||
- `rng` = web service generating random bytes
|
||||
- `hasher` = web service computing hash of POSTed data
|
||||
- `worker` = background process using `rng` and `hasher`
|
||||
- `webui` = web interface to watch progress
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Compose file format version
|
||||
|
||||
*Particularly relevant if you have used Compose before...*
|
||||
|
||||
- Compose 1.6 introduced support for a new Compose file format (aka "v2")
|
||||
|
||||
- Services are no longer at the top level, but under a `services` section
|
||||
|
||||
- There has to be a `version` key at the top level, with value `"2"` (as a string, not an integer)
|
||||
|
||||
- Containers are placed on a dedicated network, making links unnecessary
|
||||
|
||||
- There are other minor differences, but upgrade is easy and straightforward
|
||||
|
||||
---
|
||||
|
||||
## Links, naming, and service discovery
|
||||
|
||||
- Containers can have network aliases (resolvable through DNS)
|
||||
|
||||
- Compose file version 2+ makes each container reachable through its service name
|
||||
|
||||
- Compose file version 1 did require "links" sections
|
||||
|
||||
- Our code can connect to services using their short name
|
||||
|
||||
(instead of e.g. IP address or FQDN)
|
||||
|
||||
- Network aliases are automatically namespaced
|
||||
|
||||
(i.e. you can have multiple apps declaring and using a service named `database`)
|
||||
|
||||
---
|
||||
|
||||
## Example in `worker/worker.py`
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
## What's this application?
|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
(DockerCoins 2016 logo courtesy of [@XtlCnslt](https://twitter.com/xtlcnslt) and [@ndeloof](https://twitter.com/ndeloof). Thanks!)
|
||||
|
||||
---
|
||||
|
||||
## What's this application?
|
||||
|
||||
- It is a DockerCoin miner! 💰🐳📦🚢
|
||||
|
||||
--
|
||||
|
||||
- No, you can't buy coffee with DockerCoins
|
||||
|
||||
--
|
||||
|
||||
- How DockerCoins works:
|
||||
|
||||
- `worker` asks `rng` to generate a few random bytes
|
||||
|
||||
- `worker` feeds these bytes into `hasher`
|
||||
|
||||
- and repeat forever!
|
||||
|
||||
- every second, `worker` updates `redis` to indicate how many loops were done
|
||||
|
||||
- `webui` queries `redis`, and computes and exposes "hashing speed" in your browser
|
||||
|
||||
---
|
||||
|
||||
## Getting the application source code
|
||||
|
||||
- We will clone the GitHub repository
|
||||
|
||||
- The repository also contains scripts and tools that we will use through the workshop
|
||||
|
||||
.exercise[
|
||||
|
||||
<!--
|
||||
```bash
|
||||
if [ -d orchestration-workshop ]; then
|
||||
mv orchestration-workshop orchestration-workshop.$$
|
||||
fi
|
||||
```
|
||||
-->
|
||||
|
||||
- Clone the repository on `node1`:
|
||||
```bash
|
||||
git clone git://github.com/jpetazzo/orchestration-workshop
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
(You can also fork the repository on GitHub and clone your fork if you prefer that.)
|
||||
|
||||
---
|
||||
|
||||
# Running the application
|
||||
|
||||
Without further ado, let's start our application.
|
||||
|
||||
.exercise[
|
||||
|
||||
- Go to the `dockercoins` directory, in the cloned repo:
|
||||
```bash
|
||||
cd ~/orchestration-workshop/dockercoins
|
||||
```
|
||||
|
||||
- Use Compose to build and run all containers:
|
||||
```bash
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
<!--
|
||||
```wait units of work done```
|
||||
```keys ^C```
|
||||
-->
|
||||
|
||||
]
|
||||
|
||||
Compose tells Docker to build all container images (pulling
|
||||
the corresponding base images), then starts all containers,
|
||||
and displays aggregated logs.
|
||||
|
||||
---
|
||||
|
||||
## Lots of logs
|
||||
|
||||
- The application continuously generates logs
|
||||
|
||||
- We can see the `worker` service making requests to `rng` and `hasher`
|
||||
|
||||
- Let's put that in the background
|
||||
|
||||
.exercise[
|
||||
|
||||
- Stop the application by hitting `^C`
|
||||
|
||||
]
|
||||
|
||||
- `^C` stops all containers by sending them the `TERM` signal
|
||||
|
||||
- Some containers exit immediately, others take longer
|
||||
<br/>(because they don't handle `SIGTERM` and end up being killed after a 10s timeout)
|
||||
|
||||
---
|
||||
|
||||
## Restarting in the background
|
||||
|
||||
- Many flags and commands of Compose are modeled after those of `docker`
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start the app in the background with the `-d` option:
|
||||
```bash
|
||||
docker-compose up -d
|
||||
```
|
||||
|
||||
- Check that our app is running with the `ps` command:
|
||||
```bash
|
||||
docker-compose ps
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
`docker-compose ps` also shows the ports exposed by the application.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Viewing logs
|
||||
|
||||
- The `docker-compose logs` command works like `docker logs`
|
||||
|
||||
.exercise[
|
||||
|
||||
- View all logs since container creation and exit when done:
|
||||
```bash
|
||||
docker-compose logs
|
||||
```
|
||||
|
||||
- Stream container logs, starting at the last 10 lines for each container:
|
||||
```bash
|
||||
docker-compose logs --tail 10 --follow
|
||||
```
|
||||
|
||||
<!--
|
||||
```wait units of work done```
|
||||
```keys ^C```
|
||||
-->
|
||||
|
||||
]
|
||||
|
||||
Tip: use `^S` and `^Q` to pause/resume log output.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Upgrading from Compose 1.6
|
||||
|
||||
.warning[The `logs` command has changed between Compose 1.6 and 1.7!]
|
||||
|
||||
- Up to 1.6
|
||||
|
||||
- `docker-compose logs` is the equivalent of `logs --follow`
|
||||
|
||||
- `docker-compose logs` must be restarted if containers are added
|
||||
|
||||
- Since 1.7
|
||||
|
||||
- `--follow` must be specified explicitly
|
||||
|
||||
- new containers are automatically picked up by `docker-compose logs`
|
||||
|
||||
---
|
||||
|
||||
## Connecting to the web UI
|
||||
|
||||
- The `webui` container exposes a web dashboard; let's view it
|
||||
|
||||
.exercise[
|
||||
|
||||
- With a web browser, connect to `node1` on port 8000
|
||||
|
||||
- Remember: the `nodeX` aliases are valid only on the nodes themselves
|
||||
|
||||
- In your browser, you need to enter the IP address of your node
|
||||
|
||||
<!-- ```open http://node1:8000``` -->
|
||||
|
||||
]
|
||||
|
||||
You should see a speed of approximately 4 hashes/second.
|
||||
|
||||
More precisely: 4 hashes/second, with regular dips down to zero.
|
||||
<br/>This is because Jérôme is incapable of writing good frontend code.
|
||||
<br/>Don't ask. Seriously, don't ask. This is embarrassing.
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Why does the speed seem irregular?
|
||||
|
||||
- The app actually has a constant, steady speed: 3.33 hashes/second
|
||||
<br/>
|
||||
(which corresponds to 1 hash every 0.3 seconds, for *reasons*)
|
||||
|
||||
- The worker doesn't update the counter after every loop, but up to once per second
|
||||
|
||||
- The speed is computed by the browser, checking the counter about once per second
|
||||
|
||||
- Between two consecutive updates, the counter will increase either by 4, or by 0
|
||||
|
||||
- The perceived speed will therefore be 4 - 4 - 4 - 0 - 4 - 4 - etc.
|
||||
|
||||
*We told you to not ask!!!*
|
||||
|
||||
---
|
||||
|
||||
## Scaling up the application
|
||||
|
||||
- Our goal is to make that performance graph go up (without changing a line of code!)
|
||||
|
||||
--
|
||||
|
||||
- Before trying to scale the application, we'll figure out if we need more resources
|
||||
|
||||
(CPU, RAM...)
|
||||
|
||||
- For that, we will use good old UNIX tools on our Docker node
|
||||
|
||||
---
|
||||
|
||||
## Looking at resource usage
|
||||
|
||||
- Let's look at CPU, memory, and I/O usage
|
||||
|
||||
.exercise[
|
||||
|
||||
- run `top` to see CPU and memory usage (you should see idle cycles)
|
||||
|
||||
<!--
|
||||
```bash top```
|
||||
|
||||
```wait Tasks```
|
||||
```keys ^C```
|
||||
-->
|
||||
|
||||
- run `vmstat 1` to see I/O usage (si/so/bi/bo)
|
||||
<br/>(the 4 numbers should be almost zero, except `bo` for logging)
|
||||
|
||||
<!--
|
||||
```bash vmstat 1```
|
||||
|
||||
```wait memory```
|
||||
```keys ^C```
|
||||
-->
|
||||
|
||||
]
|
||||
|
||||
We have available resources.
|
||||
|
||||
- Why?
|
||||
- How can we use them?
|
||||
|
||||
---
|
||||
|
||||
## Scaling workers on a single node
|
||||
|
||||
- Docker Compose supports scaling
|
||||
- Let's scale `worker` and see what happens!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start one more `worker` container:
|
||||
```bash
|
||||
docker-compose scale worker=2
|
||||
```
|
||||
|
||||
- Look at the performance graph (it should show a x2 improvement)
|
||||
|
||||
- Look at the aggregated logs of our containers (`worker_2` should show up)
|
||||
|
||||
- Look at the impact on CPU load with e.g. top (it should be negligible)
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Adding more workers
|
||||
|
||||
- Great, let's add more workers and call it a day, then!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start eight more `worker` containers:
|
||||
```bash
|
||||
docker-compose scale worker=10
|
||||
```
|
||||
|
||||
- Look at the performance graph: does it show a x10 improvement?
|
||||
|
||||
- Look at the aggregated logs of our containers
|
||||
|
||||
- Look at the impact on CPU load and memory usage
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
# Identifying bottlenecks
|
||||
|
||||
- You should have seen a 3x speed bump (not 10x)
|
||||
|
||||
- Adding workers didn't result in linear improvement
|
||||
|
||||
- *Something else* is slowing us down
|
||||
|
||||
--
|
||||
|
||||
- ... But what?
|
||||
|
||||
--
|
||||
|
||||
- The code doesn't have instrumentation
|
||||
|
||||
- Let's use state-of-the-art HTTP performance analysis!
|
||||
<br/>(i.e. good old tools like `ab`, `httping`...)
|
||||
|
||||
---
|
||||
|
||||
## Accessing internal services
|
||||
|
||||
- `rng` and `hasher` are exposed on ports 8001 and 8002
|
||||
|
||||
- This is declared in the Compose file:
|
||||
|
||||
```yaml
|
||||
...
|
||||
rng:
|
||||
build: rng
|
||||
ports:
|
||||
- "8001:80"
|
||||
|
||||
hasher:
|
||||
build: hasher
|
||||
ports:
|
||||
- "8002:80"
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Measuring latency under load
|
||||
|
||||
We will use `httping`.
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check the latency of `rng`:
|
||||
```bash
|
||||
httping -c 10 localhost:8001
|
||||
```
|
||||
|
||||
- Check the latency of `hasher`:
|
||||
```bash
|
||||
httping -c 10 localhost:8002
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
`rng` has a much higher latency than `hasher`.
|
||||
|
||||
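If `ab` (Apache Bench, mentioned earlier) happens to be installed on your node, a similar comparison can be made under a bit of concurrent load (a sketch, not a required step):

```bash
# 100 requests, 10 at a time, against each service
ab -c 10 -n 100 http://localhost:8001/
ab -c 10 -n 100 http://localhost:8002/
```
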
---
|
||||
|
||||
## Let's draw hasty conclusions
|
||||
|
||||
- The bottleneck seems to be `rng`
|
||||
|
||||
- *What if* we don't have enough entropy and can't generate enough random numbers?
|
||||
|
||||
- We need to scale out the `rng` service on multiple machines!
|
||||
|
||||
Note: this is a fiction! We have enough entropy. But we need a pretext to scale out.
|
||||
|
||||
(In fact, the code of `rng` uses `/dev/urandom`, which never runs out of entropy...
|
||||
<br/>
|
||||
...and is [just as good as `/dev/random`](http://www.slideshare.net/PacSecJP/filippo-plain-simple-reality-of-entropy).)
|
||||
|
||||
---
|
||||
|
||||
## Clean up
|
||||
|
||||
- Before moving on, let's remove those containers
|
||||
|
||||
.exercise[
|
||||
|
||||
- Tell Compose to remove everything:
|
||||
```bash
|
||||
docker-compose down
|
||||
```
|
||||
|
||||
]
|
||||
194
docs/secrets.md
Normal file
@@ -0,0 +1,194 @@
|
||||
class: secrets
|
||||
|
||||
## Secret management
|
||||
|
||||
- Docker has a "secret safe" (secure key→value store)
|
||||
|
||||
- You can create as many secrets as you like
|
||||
|
||||
- You can associate secrets to services
|
||||
|
||||
- Secrets are exposed as plain text files, but kept in memory only (using `tmpfs`)
|
||||
|
||||
- Secrets are immutable (at least in Engine 1.13)
|
||||
|
||||
- Secrets have a max size of 500 KB
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Creating secrets
|
||||
|
||||
- Must specify a name for the secret; and the secret itself
|
||||
|
||||
.exercise[
|
||||
|
||||
- Assign [one of the four most commonly used passwords](https://www.youtube.com/watch?v=0Jx8Eay5fWQ) to a secret called `hackme`:
|
||||
```bash
|
||||
echo love | docker secret create hackme -
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
If the secret is in a file, you can simply pass the path to the file.
|
||||
|
||||
(The special path `-` indicates to read from the standard input.)
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Creating better secrets
|
||||
|
||||
- Picking lousy passwords always leads to security breaches
|
||||
|
||||
.exercise[
|
||||
|
||||
- Let's craft a better password, and assign it to another secret:
|
||||
```bash
|
||||
base64 /dev/urandom | head -c16 | docker secret create arewesecureyet -
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: in the latter case, we don't even know the secret at this point. But Swarm does.
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Using secrets
|
||||
|
||||
- Secrets must be handed explicitly to services
|
||||
|
||||
.exercise[
|
||||
|
||||
- Create a dummy service with both secrets:
|
||||
```bash
|
||||
docker service create \
|
||||
--secret hackme --secret arewesecureyet \
|
||||
--name dummyservice --mode global \
|
||||
alpine sleep 1000000000
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
We use a global service to make sure that there will be an instance on the local node.
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Accessing secrets
|
||||
|
||||
- Secrets are materialized on `/run/secrets` (which is an in-memory filesystem)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Find the ID of the container for the dummy service:
|
||||
```bash
|
||||
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=dummyservice)
|
||||
```
|
||||
|
||||
- Enter the container:
|
||||
```bash
|
||||
docker exec -ti $CID sh
|
||||
```
|
||||
|
||||
- Check the files in `/run/secrets`
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Rotating secrets
|
||||
|
||||
- You can't change a secret
|
||||
|
||||
(Sounds annoying at first; but allows clean rollbacks if a secret update goes wrong)
|
||||
|
||||
- You can add a secret to a service with `docker service update --secret-add`
|
||||
|
||||
(This will redeploy the service; it won't add the secret on the fly)
|
||||
|
||||
- You can remove a secret with `docker service update --secret-rm`
|
||||
|
||||
- Secrets can be mapped to different names by expressing them with a micro-format:
|
||||
```bash
|
||||
docker service create --secret source=secretname,target=filename
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Changing our insecure password
|
||||
|
||||
- We want to replace our `hackme` secret with a better one
|
||||
|
||||
.exercise[
|
||||
|
||||
- Remove the insecure `hackme` secret:
|
||||
```bash
|
||||
docker service update dummyservice --secret-rm hackme
|
||||
```
|
||||
|
||||
- Add our better secret instead:
|
||||
```bash
|
||||
docker service update dummyservice \
|
||||
--secret-add source=arewesecureyet,target=hackme
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Wait for the service to be fully updated with e.g. `watch docker service ps dummyservice`.
|
||||
<br/>(With Docker Engine 17.10 and later, the CLI will wait for you!)
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Checking that our password is now stronger
|
||||
|
||||
- We will use the power of `docker exec`!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Get the ID of the new container:
|
||||
```bash
|
||||
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=dummyservice)
|
||||
```
|
||||
|
||||
- Check the contents of the secret files:
|
||||
```bash
|
||||
docker exec $CID grep -r . /run/secrets
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: secrets
|
||||
|
||||
## Secrets in practice
|
||||
|
||||
- Can be (ab)used to hold whole configuration files if needed
|
||||
|
||||
- If you intend to rotate secret `foo`, call it `foo.N` instead, and map it to `foo`
|
||||
|
||||
(N can be a serial, a timestamp...)
|
||||
|
||||
```bash
|
||||
docker service create --secret source=foo.N,target=foo ...
|
||||
```
|
||||
|
||||
- You can update (remove+add) a secret in a single command:
|
||||
|
||||
```bash
|
||||
docker service update ... --secret-rm foo.M --secret-add source=foo.N,target=foo
|
||||
```
|
||||
|
||||
- For more details and examples, [check the documentation](https://docs.docker.com/engine/swarm/secrets/)
|
||||
16
docs/security.md
Normal file
@@ -0,0 +1,16 @@
|
||||
# Secrets management and encryption at rest
|
||||
|
||||
(New in Docker Engine 1.13)
|
||||
|
||||
- Secrets management = selectively and securely bring secrets to services
|
||||
|
||||
- Encryption at rest = protect against storage theft or prying
|
||||
|
||||
- Remember:
|
||||
|
||||
- control plane is authenticated through mutual TLS, certs rotated every 90 days
|
||||
|
||||
- control plane is encrypted with AES-GCM, keys rotated every 12 hours
|
||||
|
||||
- data plane is not encrypted by default (for performance reasons),
|
||||
<br/>but we saw earlier how to enable that with a single flag
|
||||
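As a reminder, that single flag is passed at network creation time; for example (hypothetical network name):

```bash
docker network create --driver overlay --opt encrypted my_encrypted_network
```
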
65
docs/selfpaced.yml
Normal file
@@ -0,0 +1,65 @@
|
||||
exclude:
|
||||
- in-person
|
||||
|
||||
chat: FIXME
|
||||
|
||||
title: Docker Orchestration Workshop
|
||||
|
||||
chapters:
|
||||
- |
|
||||
class: title
|
||||
Docker <br/> Orchestration <br/> Workshop
|
||||
- intro.md
|
||||
- |
|
||||
@@TOC@@
|
||||
- - prereqs.md
|
||||
- versions.md
|
||||
- |
|
||||
class: title
|
||||
|
||||
All right!
|
||||
<br/>
|
||||
We're all set.
|
||||
<br/>
|
||||
Let's do this.
|
||||
- |
|
||||
name: part-1
|
||||
|
||||
class: title, self-paced
|
||||
|
||||
Part 1
|
||||
- sampleapp.md
|
||||
- |
|
||||
class: title
|
||||
|
||||
Scaling out
|
||||
- swarmkit.md
|
||||
- creatingswarm.md
|
||||
- machine.md
|
||||
- morenodes.md
|
||||
- - firstservice.md
|
||||
- ourapponswarm.md
|
||||
- - operatingswarm.md
|
||||
- netshoot.md
|
||||
- swarmnbt.md
|
||||
- ipsec.md
|
||||
- updatingservices.md
|
||||
- rollingupdates.md
|
||||
- healthchecks.md
|
||||
- nodeinfo.md
|
||||
- swarmtools.md
|
||||
- - security.md
|
||||
- secrets.md
|
||||
- leastprivilege.md
|
||||
- namespaces.md
|
||||
- apiscope.md
|
||||
- encryptionatrest.md
|
||||
- logging.md
|
||||
- metrics.md
|
||||
- stateful.md
|
||||
- extratips.md
|
||||
- end.md
|
||||
- |
|
||||
class: title
|
||||
|
||||
Thank you!
|
||||
65
docs/setup-k8s.md
Normal file
@@ -0,0 +1,65 @@
|
||||
# Setting up Kubernetes
|
||||
|
||||
- How did we set up these Kubernetes clusters that we're using?
|
||||
|
||||
--
|
||||
|
||||
- We used `kubeadm` on "fresh" EC2 instances with Ubuntu 16.04 LTS
|
||||
|
||||
1. Install Docker
|
||||
|
||||
2. Install Kubernetes packages
|
||||
|
||||
3. Run `kubeadm init` on the master node
|
||||
|
||||
4. Set up Weave (the overlay network)
|
||||
<br/>
|
||||
(that step is just one `kubectl apply` command; discussed later)
|
||||
|
||||
5. Run `kubeadm join` on the other nodes (with the token produced by `kubeadm init`)
|
||||
|
||||
6. Copy the configuration file generated by `kubeadm init`
|
||||
|
||||
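For reference, steps 3 and 5 boil down to something like the following sketch (the token, IP address, and hash are placeholders; the exact `join` command, with the right flags for your version, is printed by `kubeadm init`):

```bash
# on the master node
kubeadm init

# on each of the other nodes
kubeadm join --token <token> <master-ip>:6443 \
  --discovery-token-ca-cert-hash sha256:<hash>
```
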
---
|
||||
|
||||
## `kubeadm` drawbacks
|
||||
|
||||
- Doesn't set up Docker or any other container engine
|
||||
|
||||
- Doesn't set up the overlay network
|
||||
|
||||
- Scripting is complex
|
||||
<br/>
|
||||
(because extracting the token requires advanced `kubectl` commands)
|
||||
|
||||
- Doesn't set up multi-master (no high availability)
|
||||
|
||||
--
|
||||
|
||||
- It's still twice as many steps as setting up a Swarm cluster 😕
|
||||
|
||||
---
|
||||
|
||||
## Other deployment options
|
||||
|
||||
- If you are on Google Cloud:
|
||||
[GKE](https://cloud.google.com/container-engine/)
|
||||
|
||||
Empirically the best Kubernetes deployment out there
|
||||
|
||||
- If you are on AWS:
|
||||
[kops](https://github.com/kubernetes/kops)
|
||||
|
||||
... But with AWS re:invent just around the corner, expect some changes
|
||||
|
||||
- On a local machine:
|
||||
[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/),
|
||||
[kubespawn](https://github.com/kinvolk/kube-spawn),
|
||||
[Docker4Mac (coming soon)](https://beta.docker.com/)
|
||||
|
||||
- If you want something customizable:
|
||||
[kubicorn](https://github.com/kris-nova/kubicorn)
|
||||
|
||||
Probably the closest to a multi-cloud/hybrid solution so far, but in development
|
||||
|
||||
- Also, many commercial options!
|
||||
BIN
docs/startrek-federation.jpg
Normal file
|
After Width: | Height: | Size: 27 KiB |
344
docs/stateful.md
Normal file
@@ -0,0 +1,344 @@
|
||||
# Dealing with stateful services
|
||||
|
||||
- First of all, you need to make sure that the data files are on a *volume*
|
||||
|
||||
- Volumes are host directories that are mounted to the container's filesystem
|
||||
|
||||
- These host directories can be backed by the ordinary, plain host filesystem ...
|
||||
|
||||
- ... Or by distributed/networked filesystems
|
||||
|
||||
- In the latter scenario, in case of node failure, the data is safe elsewhere ...
|
||||
|
||||
- ... And the container can be restarted on another node without data loss
|
||||
|
||||
---
|
||||
|
||||
## Building a stateful service experiment
|
||||
|
||||
- We will use Redis for this example
|
||||
|
||||
- We will expose it on port 10000 to access it easily
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start the Redis service:
|
||||
```bash
|
||||
docker service create --name stateful -p 10000:6379 redis
|
||||
```
|
||||
|
||||
- Check that we can connect to it:
|
||||
```bash
|
||||
docker run --net host --rm redis redis-cli -p 10000 info server
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Accessing our Redis service easily
|
||||
|
||||
- Typing that whole command is going to be tedious
|
||||
|
||||
.exercise[
|
||||
|
||||
- Define a shell alias to make our lives easier:
|
||||
```bash
|
||||
alias redis='docker run --net host --rm redis redis-cli -p 10000'
|
||||
```
|
||||
|
||||
- Try it:
|
||||
```bash
|
||||
redis info server
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Basic Redis commands
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check that the `foo` key doesn't exist:
|
||||
```bash
|
||||
redis get foo
|
||||
```
|
||||
|
||||
- Set it to `bar`:
|
||||
```bash
|
||||
redis set foo bar
|
||||
```
|
||||
|
||||
- Check that it exists now:
|
||||
```bash
|
||||
redis get foo
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Local volumes vs. global volumes
|
||||
|
||||
- Global volumes exist in a single namespace
|
||||
|
||||
- A global volume can be mounted on any node
|
||||
<br/>.small[(bar some restrictions specific to the volume driver in use; e.g. using an EBS-backed volume on a GCE/EC2 mixed cluster)]
|
||||
|
||||
- Attaching a global volume to a container allows starting the container anywhere
|
||||
<br/>(and retain its data wherever you start it!)
|
||||
|
||||
- Global volumes require extra *plugins* (Flocker, Portworx...)
|
||||
|
||||
- Docker doesn't come with a default global volume driver at this point
|
||||
|
||||
- Therefore, we will fall back on *local volumes*
|
||||
|
||||
---
|
||||
|
||||
## Local volumes
|
||||
|
||||
- We will use the default volume driver, `local`
|
||||
|
||||
- As the name implies, the `local` volume driver manages *local* volumes
|
||||
|
||||
- Since local volumes are (duh!) *local*, we need to pin our container to a specific host
|
||||
|
||||
- We will do that with a *constraint*
|
||||
|
||||
.exercise[
|
||||
|
||||
- Add a placement constraint to our service:
|
||||
```bash
|
||||
docker service update stateful --constraint-add node.hostname==$HOSTNAME
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Where is our data?
|
||||
|
||||
- If we look for our `foo` key, it's gone!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check the `foo` key:
|
||||
```bash
|
||||
redis get foo
|
||||
```
|
||||
|
||||
- Adding a constraint caused the service to be redeployed:
|
||||
```bash
|
||||
docker service ps stateful
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: even if the constraint ends up being a no-op (i.e. not
|
||||
moving the service), the service gets redeployed.
|
||||
This ensures consistent behavior.
|
||||
|
||||
---
|
||||
|
||||
## Setting the key again
|
||||
|
||||
- Since our database was wiped out, let's populate it again
|
||||
|
||||
.exercise[
|
||||
|
||||
- Set `foo` again:
|
||||
```bash
|
||||
redis set foo bar
|
||||
```
|
||||
|
||||
- Check that it's there:
|
||||
```bash
|
||||
redis get foo
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Service updates cause containers to be replaced
|
||||
|
||||
- Let's try to make a trivial update to the service and see what happens
|
||||
|
||||
.exercise[
|
||||
|
||||
- Set a memory limit to our Redis service:
|
||||
```bash
|
||||
docker service update stateful --limit-memory 100M
|
||||
```
|
||||
|
||||
- Try to get the `foo` key one more time:
|
||||
```bash
|
||||
redis get foo
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
The key is blank again!
|
||||
|
||||
---
|
||||
|
||||
## Service volumes are ephemeral by default
|
||||
|
||||
- Let's highlight what's going on with volumes!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check the current list of volumes:
|
||||
```bash
|
||||
docker volume ls
|
||||
```
|
||||
|
||||
- Carry a minor update to our Redis service:
|
||||
```bash
|
||||
docker service update stateful --limit-memory 200M
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Again: all changes trigger the creation of a new task, and therefore a replacement of the existing container;
|
||||
even when it is not strictly technically necessary.
|
||||
|
||||
---
|
||||
|
||||
## The data is gone again
|
||||
|
||||
- What happened to our data?
|
||||
|
||||
.exercise[
|
||||
|
||||
- The list of volumes is slightly different:
|
||||
```bash
|
||||
docker volume ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
(You should see one extra volume.)
|
||||
|
||||
---
|
||||
|
||||
## Assigning a persistent volume to the container
|
||||
|
||||
- Let's add an explicit volume mount to our service, referencing a named volume
|
||||
|
||||
.exercise[
|
||||
|
||||
- Update the service with a volume mount:
|
||||
```bash
|
||||
docker service update stateful \
|
||||
--mount-add type=volume,source=foobarstore,target=/data
|
||||
```
|
||||
|
||||
- Check the new volume list:
|
||||
```bash
|
||||
docker volume ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: the `local` volume driver automatically creates volumes.
|
||||
|
||||
---
|
||||
|
||||
## Checking that persistence actually works across service updates
|
||||
|
||||
.exercise[
|
||||
|
||||
- Store something in the `foo` key:
|
||||
```bash
|
||||
redis set foo barbar
|
||||
```
|
||||
|
||||
- Update the service with yet another trivial change:
|
||||
```bash
|
||||
docker service update stateful --limit-memory 300M
|
||||
```
|
||||
|
||||
- Check that `foo` is still set:
|
||||
```bash
|
||||
redis get foo
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Recap
|
||||
|
||||
- The service must commit its state to disk when being shut down.red[*]
|
||||
|
||||
(Shutdown = being sent a `TERM` signal)
|
||||
|
||||
- The state must be written on files located on a volume
|
||||
|
||||
- That volume must be specified to be persistent
|
||||
|
||||
- If using a local volume, the service must also be pinned to a specific node
|
||||
|
||||
(And losing that node means losing the data, unless there are other backups)
|
||||
|
||||
.footnote[<br/>
|
||||
.red[*]If you customize Redis configuration, make sure you
|
||||
persist data correctly!
|
||||
<br/>
|
||||
It's easy to make that mistake — __Trust me!__]
|
||||
|
||||
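Putting the recap together, the whole setup from this chapter could also be expressed in a single command (a sketch re-using the same names):

```bash
docker service create --name stateful \
  --constraint node.hostname==$HOSTNAME \
  --mount type=volume,source=foobarstore,target=/data \
  --publish 10000:6379 \
  redis
```
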
---
|
||||
|
||||
## Cleaning up
|
||||
|
||||
.exercise[
|
||||
|
||||
- Remove the stateful service:
|
||||
```bash
|
||||
docker service rm stateful
|
||||
```
|
||||
|
||||
- Remove the associated volume:
|
||||
```bash
|
||||
docker volume rm foobarstore
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: we could keep the volume around if we wanted.
|
||||
|
||||
---
|
||||
|
||||
## Should I run stateful services in containers?
|
||||
|
||||
--
|
||||
|
||||
Depending on whom you ask, they'll tell you:
|
||||
|
||||
--
|
||||
|
||||
- certainly not, heathen!
|
||||
|
||||
--
|
||||
|
||||
- we've been running a few thousand PostgreSQL instances in containers ...
|
||||
<br/>for a few years now ... in production ... is that bad?
|
||||
|
||||
--
|
||||
|
||||
- what's a container?
|
||||
|
||||
--
|
||||
|
||||
Perhaps a better question would be:
|
||||
|
||||
*"Should I run stateful services?"*
|
||||
|
||||
--
|
||||
|
||||
- is it critical for my business?
|
||||
- is it my value-add?
|
||||
- or should I find somebody else to run them for me?
|
||||
153
docs/swarmkit.md
Normal file
@@ -0,0 +1,153 @@
|
||||
# SwarmKit
|
||||
|
||||
- [SwarmKit](https://github.com/docker/swarmkit) is an open source
|
||||
toolkit to build multi-node systems
|
||||
|
||||
- It is a reusable library, like libcontainer, libnetwork, vpnkit ...
|
||||
|
||||
- It is a plumbing part of the Docker ecosystem
|
||||
|
||||
--
|
||||
|
||||
.footnote[🐳 Did you know that кит means "whale" in Russian?]
|
||||
|
||||
---
|
||||
|
||||
## SwarmKit features
|
||||
|
||||
- Highly-available, distributed store based on [Raft](
|
||||
https://en.wikipedia.org/wiki/Raft_%28computer_science%29)
|
||||
<br/>(avoids depending on an external store: easier to deploy; higher performance)
|
||||
|
||||
- Dynamic reconfiguration of Raft without interrupting cluster operations
|
||||
|
||||
- *Services* managed with a *declarative API*
|
||||
<br/>(implementing *desired state* and *reconciliation loop*)
|
||||
|
||||
- Integration with overlay networks and load balancing
|
||||
|
||||
- Strong emphasis on security:
|
||||
|
||||
- automatic TLS keying and signing; automatic cert rotation
|
||||
- full encryption of the data plane; automatic key rotation
|
||||
- least privilege architecture (single-node compromise ≠ cluster compromise)
|
||||
- on-disk encryption with optional passphrase
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Where is the key/value store?
|
||||
|
||||
- Many orchestration systems use a key/value store backed by a consensus algorithm
|
||||
<br/>
|
||||
(k8s→etcd→Raft, mesos→zookeeper→ZAB, etc.)
|
||||
|
||||
- SwarmKit implements the Raft algorithm directly
|
||||
<br/>
|
||||
(Nomad is similar; thanks [@cbednarski](https://twitter.com/@cbednarski),
|
||||
[@diptanu](https://twitter.com/diptanu) and others for pointing it out!)
|
||||
|
||||
- Analogy courtesy of [@aluzzardi](https://twitter.com/aluzzardi):
|
||||
|
||||
*It's like B-Trees and RDBMS. They are different layers, often
|
||||
associated. But you don't need to bring up a full SQL server when
|
||||
all you need is to index some data.*
|
||||
|
||||
- As a result, the orchestrator has direct access to the data
|
||||
<br/>
|
||||
(the main copy of the data is stored in the orchestrator's memory)
|
||||
|
||||
- Simpler, easier to deploy and operate; also faster
|
||||
|
||||
---
|
||||
|
||||
## SwarmKit concepts (1/2)
|
||||
|
||||
- A *cluster* will be at least one *node* (preferably more)
|
||||
|
||||
- A *node* can be a *manager* or a *worker*
|
||||
|
||||
- A *manager* actively takes part in the Raft consensus, and keeps the Raft log
|
||||
|
||||
- You can talk to a *manager* using the SwarmKit API
|
||||
|
||||
- One *manager* is elected as the *leader*; other managers merely forward requests to it
|
||||
|
||||
- The *workers* get their instructions from the *managers*
|
||||
|
||||
- Both *workers* and *managers* can run containers
|
||||
|
||||
---
|
||||
|
||||
## Illustration
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
## SwarmKit concepts (2/2)
|
||||
|
||||
- The *managers* expose the SwarmKit API
|
||||
|
||||
- Using the API, you can indicate that you want to run a *service*
|
||||
|
||||
- A *service* is specified by its *desired state*: which image, how many instances...
|
||||
|
||||
- The *leader* uses different subsystems to break down services into *tasks*:
|
||||
<br/>orchestrator, scheduler, allocator, dispatcher
|
||||
|
||||
- A *task* corresponds to a specific container, assigned to a specific *node*
|
||||
|
||||
- *Nodes* know which *tasks* should be running, and will start or stop containers accordingly (through the Docker Engine API)
|
||||
|
||||
You can refer to the [NOMENCLATURE](https://github.com/docker/swarmkit/blob/master/design/nomenclature.md) in the SwarmKit repo for more details.
|
||||
|
||||
---
|
||||
|
||||
## Swarm Mode
|
||||
|
||||
- Since version 1.12, Docker Engine embeds SwarmKit
|
||||
|
||||
- All the SwarmKit features are "asleep" until you enable "Swarm Mode"
|
||||
|
||||
- Examples of Swarm Mode commands:
|
||||
|
||||
- `docker swarm` (enable Swarm mode; join a Swarm; adjust cluster parameters)
|
||||
|
||||
- `docker node` (view nodes; promote/demote managers; manage nodes)
|
||||
|
||||
- `docker service` (create and manage services)
|
||||
|
||||
???
|
||||
|
||||
- The Docker API exposes the same concepts
|
||||
|
||||
- The SwarmKit API is also exposed (on a separate socket)
|
||||
|
||||
---
|
||||
|
||||
## You need to enable Swarm mode to use the new stuff
|
||||
|
||||
- By default, all this new code is inactive
|
||||
|
||||
- Swarm Mode can be enabled, "unlocking" SwarmKit functions
|
||||
<br/>(services, out-of-the-box overlay networks, etc.)
|
||||
|
||||
.exercise[
|
||||
|
||||
- Try a Swarm-specific command:
|
||||
```bash
|
||||
docker node ls
|
||||
```
|
||||
|
||||
<!-- Ignore errors: ```wait ``` -->
|
||||
|
||||
]
|
||||
|
||||
--
|
||||
|
||||
You will get an error message:
|
||||
```
|
||||
Error response from daemon: This node is not a swarm manager. [...]
|
||||
```
|
||||
72
docs/swarmnbt.md
Normal file
@@ -0,0 +1,72 @@
|
||||
class: nbt, extra-details
|
||||
|
||||
## Measuring network conditions on the whole cluster
|
||||
|
||||
- Since we have built-in, cluster-wide discovery, it's relatively straightforward
|
||||
to monitor the whole cluster automatically
|
||||
|
||||
- [Alexandros Mavrogiannis](https://github.com/alexmavr) wrote
|
||||
[Swarm NBT](https://github.com/alexmavr/swarm-nbt), a tool doing exactly that!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start Swarm NBT:
|
||||
```bash
|
||||
docker run --rm -v inventory:/inventory \
|
||||
-v /var/run/docker.sock:/var/run/docker.sock \
|
||||
alexmavr/swarm-nbt start
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Note: in this mode, Swarm NBT connects to the Docker API socket,
|
||||
and issues additional API requests to start all the components it needs.
|
||||
|
||||
---
|
||||
|
||||
class: nbt, extra-details
|
||||
|
||||
## Viewing network conditions with Prometheus
|
||||
|
||||
- Swarm NBT relies on Prometheus to scrape and store data
|
||||
|
||||
- We can directly consume the Prometheus endpoint to view telemetry data
|
||||
|
||||
.exercise[
|
||||
|
||||
- Point your browser to any Swarm node, on port 9090
|
||||
|
||||
(If you're using Play-With-Docker, click on the (9090) badge)
|
||||
|
||||
- In the drop-down, select `icmp_rtt_gauge_seconds`
|
||||
|
||||
- Click on "Graph"
|
||||
|
||||
]
|
||||
|
||||
You are now seeing ICMP latency across your cluster.
|
||||
|
||||
---
|
||||
|
||||
class: nbt, in-person, extra-details
|
||||
|
||||
## Viewing network conditions with Grafana
|
||||
|
||||
- If you are using a "real" cluster (not Play-With-Docker) you can use Grafana
|
||||
|
||||
.exercise[
|
||||
|
||||
- Start Grafana with `docker service create -p 3000:3000 grafana`
|
||||
- Point your browser to Grafana, on port 3000 on any Swarm node
|
||||
- Login with username `admin` and password `admin`
|
||||
- Click on the top-left menu and browse to Data Sources
|
||||
- Create a Prometheus data source with any name
|
||||
- Point it to http://any-node-IP:9090
|
||||
- Set access to "direct" and leave credentials blank
|
||||
- Click on the top-left menu, highlight "Dashboards" and select the "Import" option
|
||||
- Copy-paste [this JSON payload](
|
||||
https://raw.githubusercontent.com/alexmavr/swarm-nbt/master/grafana.json),
|
||||
then use the Prometheus Data Source defined before
|
||||
- Poke around the dashboard that magically appeared!
|
||||
|
||||
]
|
||||
184
docs/swarmtools.md
Normal file
@@ -0,0 +1,184 @@
|
||||
# SwarmKit debugging tools
|
||||
|
||||
- The SwarmKit repository comes with debugging tools
|
||||
|
||||
- They are *low level* tools; not for general use
|
||||
|
||||
- We are going to see two of these tools:
|
||||
|
||||
- `swarmctl`, to communicate directly with the SwarmKit API
|
||||
|
||||
- `swarm-rafttool`, to inspect the content of the Raft log
|
||||
|
||||
---
|
||||
|
||||
## Building the SwarmKit tools
|
||||
|
||||
- We are going to install a Go compiler, then download SwarmKit source and build it
|
||||
|
||||
.exercise[
|
||||
- Download, compile, and install SwarmKit with this one-liner:
|
||||
```bash
|
||||
docker run -v /usr/local/bin:/go/bin golang \
|
||||
go get `-v` github.com/docker/swarmkit/...
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Remove `-v` if you don't like verbose things.
|
||||
|
||||
Shameless promo: for more Go and Docker love, check
|
||||
[this blog post](http://jpetazzo.github.io/2016/09/09/go-docker/)!
|
||||
|
||||
Note: in the unfortunate event of SwarmKit *master* branch being broken,
|
||||
the build might fail. In that case, just skip the Swarm tools section.
|
||||
|
||||
---
|
||||
|
||||
## Getting cluster-wide task information
|
||||
|
||||
- The Docker API doesn't expose this directly (yet)
|
||||
|
||||
- But the SwarmKit API does
|
||||
|
||||
- We are going to query it with `swarmctl`
|
||||
|
||||
- `swarmctl` is an example program showing how to
|
||||
interact with the SwarmKit API
|
||||
|
||||
---
|
||||
|
||||
## Using `swarmctl`
|
||||
|
||||
- The Docker Engine places the SwarmKit control socket in a special path
|
||||
|
||||
- You need root privileges to access it
|
||||
|
||||
.exercise[
|
||||
|
||||
- If you are using Play-With-Docker, set the following alias:
|
||||
```bash
|
||||
alias swarmctl='/lib/ld-musl-x86_64.so.1 /usr/local/bin/swarmctl \
|
||||
--socket /var/run/docker/swarm/control.sock'
|
||||
```
|
||||
|
||||
- Otherwise, set the following alias:
|
||||
```bash
|
||||
alias swarmctl='sudo swarmctl \
|
||||
--socket /var/run/docker/swarm/control.sock'
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## `swarmctl` in action
|
||||
|
||||
- Let's review a few useful `swarmctl` commands
|
||||
|
||||
.exercise[
|
||||
|
||||
- List cluster nodes (that's equivalent to `docker node ls`):
|
||||
```bash
|
||||
swarmctl node ls
|
||||
```
|
||||
|
||||
- View all tasks across all services:
|
||||
```bash
|
||||
swarmctl task ls
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## `swarmctl` notes
|
||||
|
||||
- SwarmKit is vendored into the Docker Engine
|
||||
|
||||
- If you want to use `swarmctl`, you need the exact version of
|
||||
SwarmKit that was used in your Docker Engine
|
||||
|
||||
- Otherwise, you might get some errors like:
|
||||
|
||||
```
|
||||
Error: grpc: failed to unmarshal the received message proto: wrong wireType = 0
|
||||
```
|
||||
|
||||
- With Docker 1.12, the control socket was in `/var/lib/docker/swarm/control.sock`
|
||||
|
||||
---
|
||||
|
||||
## `swarm-rafttool`
|
||||
|
||||
- SwarmKit stores all its important data in a distributed log using the Raft protocol
|
||||
|
||||
(This log is also simply called the "Raft log")
|
||||
|
||||
- You can decode that log with `swarm-rafttool`
|
||||
|
||||
- This is a great tool to understand how SwarmKit works
|
||||
|
||||
- It can also be used in forensics or troubleshooting
|
||||
|
||||
(But consider it as a *very low level* tool!)
|
||||
|
||||
---
|
||||
|
||||
## The powers of `swarm-rafttool`
|
||||
|
||||
With `swarm-rafttool`, you can:
|
||||
|
||||
- view the latest snapshot of the cluster state;
|
||||
|
||||
- view the Raft log (i.e. changes to the cluster state);
|
||||
|
||||
- view specific objects from the log or snapshot;
|
||||
|
||||
- decrypt the Raft data (to analyze it with other tools).
|
||||
|
||||
It *cannot* work on live files, so you must stop Docker or make a copy first.
|
||||
|
||||
---
|
||||
|
||||
## Using `swarm-rafttool`
|
||||
|
||||
- First, let's make a copy of the current Swarm data
|
||||
|
||||
.exercise[
|
||||
|
||||
- If you are using Play-With-Docker, the Docker data directory is `/graph`:
|
||||
```bash
|
||||
cp -r /graph/swarm /swarmdata
|
||||
```
|
||||
|
||||
- Otherwise, it is in the default `/var/lib/docker`:
|
||||
```bash
|
||||
sudo cp -r /var/lib/docker/swarm /swarmdata
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Dumping the Raft log
|
||||
|
||||
- We have to indicate the path holding the Swarm data
|
||||
|
||||
(Otherwise `swarm-rafttool` will try to use the live data, and complain that it's locked!)
|
||||
|
||||
.exercise[
|
||||
|
||||
- If you are using Play-With-Docker, you must use the musl linker:
|
||||
```bash
|
||||
/lib/ld-musl-x86_64.so.1 /usr/local/bin/swarm-rafttool -d /swarmdata/ dump-wal
|
||||
```
|
||||
|
||||
- Otherwise, you don't need the musl linker but you need to get root:
|
||||
```bash
|
||||
sudo swarm-rafttool -d /swarmdata/ dump-wal
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
Reminder: this is a very low-level tool, requiring a knowledge of SwarmKit's internals!
|
||||
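The exact list of subcommands depends on the SwarmKit revision you built, but you can list them and, for instance, dump the latest snapshot as well (same Play-With-Docker caveat as above):

```bash
sudo swarm-rafttool --help
sudo swarm-rafttool -d /swarmdata/ dump-snapshot
```
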
BIN
docs/thanks-weave.png
Normal file
|
After Width: | Height: | Size: 174 KiB |
98
docs/updatingservices.md
Normal file
@@ -0,0 +1,98 @@
|
||||
# Updating services
|
||||
|
||||
- We want to make changes to the web UI
|
||||
|
||||
- The process is as follows:
|
||||
|
||||
- edit code
|
||||
|
||||
- build new image
|
||||
|
||||
- ship new image
|
||||
|
||||
- run new image
|
||||
|
||||
---
|
||||
|
||||
## Updating a single service the hard way
|
||||
|
||||
- To update a single service, we could do the following:
|
||||
```bash
|
||||
REGISTRY=localhost:5000 TAG=v0.3
|
||||
IMAGE=$REGISTRY/dockercoins_webui:$TAG
|
||||
docker build -t $IMAGE webui/
|
||||
docker push $IMAGE
|
||||
docker service update dockercoins_webui --image $IMAGE
|
||||
```
|
||||
|
||||
- Make sure to tag your images properly: update the `TAG` at each iteration
|
||||
|
||||
(When you check which images are running, you want these tags to be uniquely identifiable)
|
||||
|
||||
---
|
||||
|
||||
## Updating services the easy way
|
||||
|
||||
- With the Compose integration, all we have to do is:
|
||||
```bash
|
||||
export TAG=v0.3
|
||||
docker-compose -f composefile.yml build
|
||||
docker-compose -f composefile.yml push
|
||||
docker stack deploy -c composefile.yml nameofstack
|
||||
```
|
||||
|
||||
--
|
||||
|
||||
- That's exactly what we used earlier to deploy the app
|
||||
|
||||
- We don't need to learn new commands!
|
||||
|
||||
---
|
||||
|
||||
## Updating the web UI
|
||||
|
||||
- Let's make the numbers on the Y axis bigger!
|
||||
|
||||
.exercise[
|
||||
|
||||
- Edit the file `webui/files/index.html`:
|
||||
```bash
|
||||
vi dockercoins/webui/files/index.html
|
||||
```
|
||||
|
||||
<!-- ```wait <title>``` -->
|
||||
|
||||
- Locate the `font-size` CSS attribute and increase it (at least double it)
|
||||
|
||||
<!--
|
||||
```keys /font-size```
|
||||
```keys ^J```
|
||||
```keys lllllllllllllcw45px```
|
||||
```keys ^[``` ]
|
||||
```keys :wq```
|
||||
```keys ^J```
|
||||
-->
|
||||
|
||||
- Save and exit
|
||||
|
||||
- Build, ship, and run:
|
||||
```bash
|
||||
export TAG=v0.3
|
||||
docker-compose -f dockercoins.yml build
|
||||
docker-compose -f dockercoins.yml push
|
||||
docker stack deploy -c dockercoins.yml dockercoins
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Viewing our changes
|
||||
|
||||
- Wait at least 10 seconds (for the new version to be deployed)
|
||||
|
||||
- Then reload the web UI
|
||||
|
||||
- Or just mash "reload" frantically
|
||||
|
||||
- ... Eventually the legend on the left will be bigger!
|
||||
41
docs/versions-k8s.md
Normal file
@@ -0,0 +1,41 @@
|
||||
## Brand new versions!
|
||||
|
||||
- Kubernetes 1.8
|
||||
- Docker Engine 17.10
|
||||
- Docker Compose 1.16
|
||||
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check all installed versions:
|
||||
```bash
|
||||
kubectl version
|
||||
docker version
|
||||
docker-compose -v
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Kubernetes and Docker compatibility
|
||||
|
||||
- Kubernetes only validates Docker Engine versions 1.11.2, 1.12.6, 1.13.1, and 17.03.2
|
||||
|
||||
--
|
||||
|
||||
class: extra-details
|
||||
|
||||
- Are we living dangerously?
|
||||
|
||||
--
|
||||
|
||||
class: extra-details
|
||||
|
||||
- "Validates" = continuous integration builds
|
||||
|
||||
- The Docker API is versioned, and offers strong backward-compatibility
|
||||
|
||||
(If a client uses e.g. API v1.25, the Docker Engine will keep behaving the same way)
|
||||
96
docs/versions.md
Normal file
@@ -0,0 +1,96 @@
|
||||
## Brand new versions!
|
||||
|
||||
- Engine 17.10
|
||||
- Compose 1.16
|
||||
- Machine 0.12
|
||||
|
||||
.exercise[
|
||||
|
||||
- Check all installed versions:
|
||||
```bash
|
||||
docker version
|
||||
docker-compose -v
|
||||
docker-machine -v
|
||||
```
|
||||
|
||||
]
|
||||
|
||||
---
|
||||
|
||||
## Wait, what, 17.10 ?!?
|
||||
|
||||
--
|
||||
|
||||
- Docker 1.13 = Docker 17.03 (year.month, like Ubuntu)
|
||||
|
||||
- Every month, there is a new "edge" release (with new features)
|
||||
|
||||
- Every quarter, there is a new "stable" release
|
||||
|
||||
- Docker CE releases are maintained 4+ months
|
||||
|
||||
- Docker EE releases are maintained 12+ months
|
||||
|
||||
- For more details, check the [Docker EE announcement blog post](https://blog.docker.com/2017/03/docker-enterprise-edition/)
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Docker CE vs Docker EE
|
||||
|
||||
- Docker EE:
|
||||
|
||||
- $$$
|
||||
- certification for select distros, clouds, and plugins
|
||||
- advanced management features (fine-grained access control, security scanning...)
|
||||
|
||||
- Docker CE:
|
||||
|
||||
- free
|
||||
- available through Docker Mac, Docker Windows, and major Linux distros
|
||||
- perfect for individuals and small organizations
|
||||
|
||||
---
|
||||
|
||||
class: extra-details
|
||||
|
||||
## Why?
|
||||
|
||||
- More readable for enterprise users
|
||||
|
||||
(i.e. the very nice folks who are kind enough to pay us big $$$ for our stuff)
|
||||
|
||||
- No impact for the community
|
||||
|
||||
(beyond CE/EE suffix and version numbering change)
|
||||
|
||||
- Both trains leverage the same open source components
|
||||
|
||||
(containerd, libcontainer, SwarmKit...)
|
||||
|
||||
- More predictable release schedule (see next slide)
|
||||
|
||||
---
|
||||
|
||||
class: pic
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
## What was added when?
|
||||
|
||||
| Year | Version | Features |
|
||||
| ---- | ----- | --- |
|
||||
| 2015 | 1.9 | Overlay (multi-host) networking, network/IPAM plugins
|
||||
| 2016 | 1.10 | Embedded dynamic DNS
|
||||
| 2016 | 1.11 | DNS round robin load balancing
|
||||
| 2016 | 1.12 | Swarm mode, routing mesh, encrypted networking, healthchecks
|
||||
| 2017 | 1.13 | Stacks, attachable overlays, image squash and compress
|
||||
| 2017 | 1.13 | Windows Server 2016 Swarm mode
|
||||
| 2017 | 17.03 | Secrets
|
||||
| 2017 | 17.04 | Update rollback, placement preferences (soft constraints)
|
||||
| 2017 | 17.05 | Multi-stage image builds, service logs
|
||||
| 2017 | 17.06 | Swarm configs, node/service events
|
||||
| 2017 | 17.06 | Windows Server 2016 Swarm overlay networks, secrets
|
||||
187
docs/whatsnext.md
Normal file
@@ -0,0 +1,187 @@
|
||||
# Next steps
|
||||
|
||||
*Alright, how do I get started and containerize my apps?*
|
||||
|
||||
--
|
||||
|
||||
Suggested containerization checklist:
|
||||
|
||||
.checklist[
|
||||
- write a Dockerfile for one service in one app
|
||||
- write Dockerfiles for the other (buildable) services
|
||||
- write a Compose file for that whole app
|
||||
- make sure that devs are empowered to run the app in containers
|
||||
- set up automated builds of container images from the code repo
|
||||
- set up a CI pipeline using these container images
|
||||
- set up a CD pipeline (for staging/QA) using these images
|
||||
]
|
||||
|
||||
And *then* it is time to look at orchestration!
|
||||
|
||||
---
|
||||
|
||||
## Namespaces
|
||||
|
||||
- Namespaces let you run multiple identical stacks side by side
|
||||
|
||||
- Two namespaces (e.g. `blue` and `green`) can each have their own `redis` service
|
||||
|
||||
- Each of the two `redis` services has its own `ClusterIP`
|
||||
|
||||
- `kube-dns` creates two entries, mapping to these two `ClusterIP` addresses:
|
||||
|
||||
`redis.blue.svc.cluster.local` and `redis.green.svc.cluster.local`
|
||||
|
||||
- Pods in the `blue` namespace get a *search suffix* of `blue.svc.cluster.local`
|
||||
|
||||
- As a result, resolving `redis` from a pod in the `blue` namespace yields the "local" `redis`
|
||||
|
||||
.warning[This does not provide *isolation*! That would be the job of network policies.]
|
||||
|
||||
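A quick (hypothetical) way to see this in action, assuming a `redis` service exists in a `blue` namespace, is to resolve the fully qualified name from a throwaway pod:

```bash
kubectl run dnstest --rm -it --restart=Never --image=alpine \
  -- nslookup redis.blue.svc.cluster.local
```
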
---
|
||||
|
||||
## Stateful services (databases etc.)
|
||||
|
||||
- As a first step, it is wiser to keep stateful services *outside* of the cluster
|
||||
|
||||
- Exposing them to pods can be done with multiple solutions:
|
||||
|
||||
- `ExternalName` services
|
||||
<br/>
|
||||
(`redis.blue.svc.cluster.local` will be a `CNAME` record)
|
||||
|
||||
- `ClusterIP` services with explicit `Endpoints`
|
||||
<br/>
|
||||
(instead of letting Kubernetes generate the endpoints from a selector)
|
||||
|
||||
- Ambassador services
|
||||
<br/>
|
||||
(application-level proxies that can provide credentials injection and more)
|
||||
|
||||
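For instance, an `ExternalName` service can be created straight from the command line (sketch; the external DNS name is hypothetical):

```bash
kubectl create service externalname redis \
  --external-name=redis.mycompany.example.com
```
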
---
|
||||
|
||||
## Stateful services (second take)
|
||||
|
||||
- If you really want to host stateful services on Kubernetes, you can look into:
|
||||
|
||||
- volumes (to carry persistent data)
|
||||
|
||||
- storage plugins
|
||||
|
||||
- persistent volume claims (to ask for specific volume characteristics)
|
||||
|
||||
- stateful sets (pods that are *not* ephemeral)
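
For instance, a persistent volume claim is just a request for storage with given characteristics (the name, size, and access mode below are arbitrary examples):

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```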

---

## HTTP traffic handling

- *Services* are layer 4 constructs

- HTTP is a layer 7 protocol

- It is handled by *ingresses* (a different resource kind)

- *Ingresses* allow:

  - virtual host routing
  - session stickiness
  - URI mapping
  - and much more!

- Check out e.g. [Træfik](https://docs.traefik.io/user-guide/kubernetes/)
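
A sketch of an ingress rule doing virtual host routing (the host and service names are made up, and an ingress controller such as Træfik must be running in the cluster):

```
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: webui
spec:
  rules:
    - host: webui.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: webui
              servicePort: 80
```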

---

## Logging and metrics

- Logging is delegated to the container engine

- Metrics are typically handled with Prometheus

  (Heapster is a popular add-on)
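
As an illustration, Prometheus can discover its scrape targets directly from the Kubernetes API. This is only a fragment of a hypothetical `prometheus.yml`, and it assumes Prometheus runs in-cluster with permission to list nodes:

```
scrape_configs:
  - job_name: kubernetes-nodes
    kubernetes_sd_configs:
      - role: node
```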

---

## Managing the configuration of our applications

- Two constructs are particularly useful: secrets and config maps

- They allow us to expose arbitrary information to our containers

- **Avoid** storing configuration in container images

  (There are some exceptions to that rule, but it's generally a Bad Idea)

- **Never** store sensitive information in container images

  (It's the container equivalent of the password on a post-it note on your screen)
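
Minimal sketches of both objects (names and values are placeholders):

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: webui-config
data:
  HTTP_PORT: "80"
```

```
apiVersion: v1
kind: Secret
metadata:
  name: redis-password
type: Opaque
stringData:             # plain text here; stored base64-encoded by Kubernetes
  password: change-me
```

They can then be exposed to containers as environment variables or as files in a volume.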

---

## Managing stack deployments

- The best deployment tool will vary, depending on:

  - the size and complexity of your stack(s)
  - how often you change it (i.e. add/remove components)
  - the size and skills of your team

- A few examples:

  - shell scripts invoking `kubectl`
  - YAML resource descriptions committed to a repo
  - [Helm](https://github.com/kubernetes/helm) (~package manager)
  - [Spinnaker](https://www.spinnaker.io/) (Netflix's CD platform)

---

## Cluster federation

--

![](starfleet.jpg)

--

Sorry Star Trek fans, this is not the federation you're looking for!

--

(If I add "Your cluster is in another federation," I might get a third fandom wincing!)

---

## Cluster federation

- Kubernetes master operation relies on etcd

- etcd uses the Raft protocol

- Raft recommends low latency between nodes

- What if our cluster spans multiple regions?

--

- Break it down into local clusters

- Regroup them in a *cluster federation*

- Synchronize resources across clusters

- Discover resources across clusters

---

## Developer experience

*I've put this last, but it's pretty important!*

- How do you on-board a new developer?

- What do they need to install to get a dev stack?

- How does a code change make it from dev to prod?

- How does someone add a component to a stack?

168
docs/workshop.css
Normal file
@@ -0,0 +1,168 @@
|
||||
@import url(https://fonts.googleapis.com/css?family=Yanone+Kaffeesatz);
|
||||
@import url(https://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic);
|
||||
@import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic);
|
||||
|
||||
/* For print! Borrowed from https://github.com/gnab/remark/issues/50 */
|
||||
@page {
|
||||
size: 1210px 681px;
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
@media print {
|
||||
.remark-slide-scaler {
|
||||
width: 100% !important;
|
||||
height: 100% !important;
|
||||
transform: scale(1) !important;
|
||||
top: 0 !important;
|
||||
left: 0 !important;
|
||||
}
|
||||
}
|
||||
|
||||
/* put slide numbers in top-right corner instead of bottom-right */
|
||||
div.remark-slide-number {
|
||||
top: 6px;
|
||||
left: unset;
|
||||
bottom: unset;
|
||||
right: 6px;
|
||||
}
|
||||
|
||||
.debug {
|
||||
font-size: 25px;
|
||||
position: absolute;
|
||||
left: 0px;
|
||||
right: 0px;
|
||||
bottom: 0px;
|
||||
font-family: monospace;
|
||||
color: white;
|
||||
}
|
||||
.debug a {
|
||||
color: white;
|
||||
}
|
||||
.debug:hover {
|
||||
background-color: black;
|
||||
}
|
||||
|
||||
body { font-family: 'Droid Serif'; }
|
||||
|
||||
h1, h2, h3 {
|
||||
font-family: 'Yanone Kaffeesatz';
|
||||
font-weight: normal;
|
||||
margin-top: 0.5em;
|
||||
}
|
||||
|
||||
a {
|
||||
text-decoration: none;
|
||||
color: blue;
|
||||
}
|
||||
|
||||
.remark-slide-content { padding: 1em 2.5em 1em 2.5em; }
|
||||
.remark-slide-content { font-size: 25px; }
|
||||
.remark-slide-content h1 { font-size: 50px; }
|
||||
.remark-slide-content h2 { font-size: 50px; }
|
||||
.remark-slide-content h3 { font-size: 25px; }
|
||||
|
||||
.footnote {
|
||||
position: absolute;
|
||||
bottom: 3em;
|
||||
}
|
||||
|
||||
.remark-code { font-size: 25px; }
|
||||
.small .remark-code { font-size: 16px; }
|
||||
.remark-code, .remark-inline-code { font-family: 'Ubuntu Mono'; }
|
||||
.remark-inline-code {
|
||||
background-color: #ccc;
|
||||
}
|
||||
|
||||
.red { color: #fa0000; }
|
||||
.gray { color: #ccc; }
|
||||
.small { font-size: 70%; }
|
||||
.big { font-size: 140%; }
|
||||
.underline { text-decoration: underline; }
|
||||
.strike { text-decoration: line-through; }
|
||||
|
||||
.pic {
|
||||
vertical-align: middle;
|
||||
text-align: center;
|
||||
padding: 0 0 0 0 !important;
|
||||
}
|
||||
img {
|
||||
max-width: 100%;
|
||||
max-height: 550px;
|
||||
}
|
||||
.small img {
|
||||
max-height: 250px;
|
||||
}
|
||||
|
||||
.title {
|
||||
vertical-align: middle;
|
||||
text-align: center;
|
||||
}
|
||||
.title h1 { font-size: 3em; font-family: unset;}
|
||||
.title p { font-size: 3em; }
|
||||
|
||||
.nav {
|
||||
font-size: 25px;
|
||||
position: absolute;
|
||||
left: 0;
|
||||
right: 0;
|
||||
bottom: 2em;
|
||||
}
|
||||
|
||||
.quote {
|
||||
background: #eee;
|
||||
border-left: 10px solid #ccc;
|
||||
margin: 1.5em 10px;
|
||||
padding: 0.5em 10px;
|
||||
quotes: "\201C""\201D""\2018""\2019";
|
||||
font-style: italic;
|
||||
}
|
||||
.quote:before {
|
||||
color: #ccc;
|
||||
content: open-quote;
|
||||
font-size: 4em;
|
||||
line-height: 0.1em;
|
||||
margin-right: 0.25em;
|
||||
vertical-align: -0.4em;
|
||||
}
|
||||
.quote p {
|
||||
display: inline;
|
||||
}
|
||||
|
||||
.blackbelt {
|
||||
background-image: url("blackbelt.png");
|
||||
background-size: 1.5em;
|
||||
background-repeat: no-repeat;
|
||||
padding-left: 2em;
|
||||
}
|
||||
.warning {
|
||||
background-image: url("warning.png");
|
||||
background-size: 1.5em;
|
||||
background-repeat: no-repeat;
|
||||
padding-left: 2em;
|
||||
}
|
||||
.exercise {
|
||||
background-color: #eee;
|
||||
background-image: url("keyboard.png");
|
||||
background-size: 1.4em;
|
||||
background-repeat: no-repeat;
|
||||
background-position: 0.2em 0.2em;
|
||||
border: 2px dotted black;
|
||||
}
|
||||
.exercise:before {
|
||||
content: "Exercise";
|
||||
margin-left: 1.8em;
|
||||
}
|
||||
|
||||
li p { line-height: 1.25em; }
|
||||
|
||||
div.extra-details {
|
||||
background-image: url(extra-details.png);
|
||||
background-position: 0.5% 1%;
|
||||
background-size: 4%;
|
||||
}
|
||||
|
||||
/* This is used only for the history slide (the only table in this doc) */
|
||||
td {
|
||||
padding: 0.1em 0.5em;
|
||||
background: #eee;
|
||||
}
|
||||
35
docs/workshop.html
Normal file
@@ -0,0 +1,35 @@
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>@@TITLE@@</title>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
|
||||
<link rel="stylesheet" href="workshop.css">
|
||||
</head>
|
||||
<body>
|
||||
<!--
|
||||
<div style="position: absolute; left: 20%; right: 20%; top: 30%;">
|
||||
<h1 style="font-size: 3em;">Loading ...</h1>
|
||||
The slides should show up here. If they don't, it might be
|
||||
because you are accessing this file directly from your filesystem.
|
||||
It needs to be served from a web server. You can try this:
|
||||
<pre>
|
||||
docker-compose up -d
|
||||
open http://localhost:8888/workshop.html # on MacOS
|
||||
xdg-open http://localhost:8888/workshop.html # on Linux
|
||||
</pre>
|
||||
Once the slides are loaded, this notice disappears when you
|
||||
go full screen (e.g. by hitting "f").
|
||||
</div>
|
||||
-->
|
||||
<textarea id="source">@@MARKDOWN@@</textarea>
|
||||
<script src="remark.min.js" type="text/javascript">
|
||||
</script>
|
||||
<script type="text/javascript">
|
||||
var slideshow = remark.create({
|
||||
ratio: '16:9',
|
||||
highlightSpans: true,
|
||||
excludedClasses: [@@EXCLUDE@@]
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
5
netlify.toml
Normal file
@@ -0,0 +1,5 @@
|
||||
[build]
|
||||
base = "docs"
|
||||
publish = "docs"
|
||||
command = "./build.sh once"
|
||||
|
||||
@@ -1,9 +1,9 @@
|
||||
aws_display_tags(){
|
||||
aws_display_tags() {
|
||||
# Print all "Name" tags in our region with their instance count
|
||||
echo "[#] [Status] [Token] [Tag]" \
|
||||
| awk '{ printf "%-7s %-12s %-25s %-25s\n", $1, $2, $3, $4}'
|
||||
aws ec2 describe-instances \
|
||||
--query "Reservations[*].Instances[*].[State.Name,ClientToken,Tags[0].Value]" \
|
||||
--query "Reservations[*].Instances[*].[State.Name,ClientToken,Tags[0].Value]" \
|
||||
| tr -d "\r" \
|
||||
| uniq -c \
|
||||
| sort -k 3 \
|
||||
@@ -12,17 +12,17 @@ aws_display_tags(){
|
||||
|
||||
aws_get_tokens() {
|
||||
aws ec2 describe-instances --output text \
|
||||
--query 'Reservations[*].Instances[*].[ClientToken]' \
|
||||
--query 'Reservations[*].Instances[*].[ClientToken]' \
|
||||
| sort -u
|
||||
}
|
||||
|
||||
aws_display_instance_statuses_by_tag() {
|
||||
TAG=$1
|
||||
need_tag $TAG
|
||||
|
||||
|
||||
IDS=$(aws ec2 describe-instances \
|
||||
--filters "Name=tag:Name,Values=$TAG" \
|
||||
--query "Reservations[*].Instances[*].InstanceId" | tr '\t' ' ' )
|
||||
--query "Reservations[*].Instances[*].InstanceId" | tr '\t' ' ')
|
||||
|
||||
aws ec2 describe-instance-status \
|
||||
--instance-ids $IDS \
|
||||
@@ -34,20 +34,20 @@ aws_display_instances_by_tag() {
|
||||
TAG=$1
|
||||
need_tag $TAG
|
||||
result=$(aws ec2 describe-instances --output table \
|
||||
--filter "Name=tag:Name,Values=$TAG" \
|
||||
--query "Reservations[*].Instances[*].[ \
|
||||
--filter "Name=tag:Name,Values=$TAG" \
|
||||
--query "Reservations[*].Instances[*].[ \
|
||||
InstanceId, \
|
||||
State.Name, \
|
||||
Tags[0].Value, \
|
||||
PublicIpAddress, \
|
||||
InstanceType \
|
||||
]"
|
||||
)
|
||||
if [[ -z $result ]]; then
|
||||
die "No instances found with tag $TAG in region $AWS_DEFAULT_REGION."
|
||||
else
|
||||
echo "$result"
|
||||
fi
|
||||
)
|
||||
if [[ -z $result ]]; then
|
||||
die "No instances found with tag $TAG in region $AWS_DEFAULT_REGION."
|
||||
else
|
||||
echo "$result"
|
||||
fi
|
||||
}
|
||||
|
||||
aws_get_instance_ids_by_filter() {
|
||||
@@ -57,7 +57,6 @@ aws_get_instance_ids_by_filter() {
|
||||
--output text | tr "\t" "\n" | tr -d "\r"
|
||||
}
|
||||
|
||||
|
||||
aws_get_instance_ids_by_client_token() {
|
||||
TOKEN=$1
|
||||
need_tag $TOKEN
|
||||
@@ -76,8 +75,8 @@ aws_get_instance_ips_by_tag() {
|
||||
aws ec2 describe-instances --filter "Name=tag:Name,Values=$TAG" \
|
||||
--output text \
|
||||
--query "Reservations[*].Instances[*].PublicIpAddress" \
|
||||
| tr "\t" "\n" \
|
||||
| sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 # sort IPs
|
||||
| tr "\t" "\n" \
|
||||
| sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 # sort IPs
|
||||
}
|
||||
|
||||
aws_kill_instances_by_tag() {
|
||||
|
||||
@@ -31,18 +31,18 @@ sep() {
|
||||
if [ -z "$COLUMNS" ]; then
|
||||
COLUMNS=80
|
||||
fi
|
||||
SEP=$(yes = | tr -d "\n" | head -c $[$COLUMNS - 1])
|
||||
SEP=$(yes = | tr -d "\n" | head -c $(($COLUMNS - 1)))
|
||||
if [ -z "$1" ]; then
|
||||
>/dev/stderr echo $SEP
|
||||
else
|
||||
MSGLEN=$(echo "$1" | wc -c)
|
||||
if [ $[ $MSGLEN +4 ] -gt $COLUMNS ]; then
|
||||
if [ $(($MSGLEN + 4)) -gt $COLUMNS ]; then
|
||||
>/dev/stderr echo "$SEP"
|
||||
>/dev/stderr echo "$1"
|
||||
>/dev/stderr echo "$SEP"
|
||||
else
|
||||
LEFTLEN=$[ ($COLUMNS - $MSGLEN - 2) / 2 ]
|
||||
RIGHTLEN=$[ $COLUMNS - $MSGLEN - 2 - $LEFTLEN ]
|
||||
LEFTLEN=$((($COLUMNS - $MSGLEN - 2) / 2))
|
||||
RIGHTLEN=$(($COLUMNS - $MSGLEN - 2 - $LEFTLEN))
|
||||
LEFTSEP=$(echo $SEP | head -c $LEFTLEN)
|
||||
RIGHTSEP=$(echo $SEP | head -c $RIGHTLEN)
|
||||
>/dev/stderr echo "$LEFTSEP $1 $RIGHTSEP"
|
||||
|
||||
@@ -1,15 +1,15 @@
|
||||
bold() {
|
||||
bold() {
|
||||
echo "$(tput bold)$1$(tput sgr0)"
|
||||
}
|
||||
|
||||
red() {
|
||||
echo "$(tput setaf 1)$1$(tput sgr0)"
|
||||
}
|
||||
}
|
||||
|
||||
green() {
|
||||
red() {
|
||||
echo "$(tput setaf 1)$1$(tput sgr0)"
|
||||
}
|
||||
|
||||
green() {
|
||||
echo "$(tput setaf 2)$1$(tput sgr0)"
|
||||
}
|
||||
|
||||
yellow(){
|
||||
}
|
||||
|
||||
yellow() {
|
||||
echo "$(tput setaf 3)$1$(tput sgr0)"
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
export AWS_DEFAULT_OUTPUT=text
|
||||
|
||||
HELP=""
|
||||
_cmd () {
|
||||
_cmd() {
|
||||
HELP="$(printf "%s\n%-12s %s\n" "$HELP" "$1" "$2")"
|
||||
}
|
||||
|
||||
@@ -39,7 +39,7 @@ _cmd_cards() {
|
||||
need_tag $TAG
|
||||
need_settings $SETTINGS
|
||||
|
||||
aws_get_instance_ips_by_tag $TAG > tags/$TAG/ips.txt
|
||||
aws_get_instance_ips_by_tag $TAG >tags/$TAG/ips.txt
|
||||
|
||||
# Remove symlinks to old cards
|
||||
rm -f ips.html ips.pdf
|
||||
@@ -78,18 +78,18 @@ _cmd_deploy() {
|
||||
>/dev/stderr echo ""
|
||||
|
||||
sep "Deploying tag $TAG"
|
||||
pssh -I tee /tmp/settings.yaml < $SETTINGS
|
||||
pssh -I tee /tmp/settings.yaml <$SETTINGS
|
||||
pssh "
|
||||
sudo apt-get update &&
|
||||
sudo apt-get install -y python-setuptools &&
|
||||
sudo easy_install pyyaml"
|
||||
|
||||
# Copy postprep.py to the remote machines, and execute it, feeding it the list of IP addresses
|
||||
pssh -I tee /tmp/postprep.py < lib/postprep.py
|
||||
pssh --timeout 900 --send-input "python /tmp/postprep.py >>/tmp/pp.out 2>>/tmp/pp.err" < ips.txt
|
||||
pssh -I tee /tmp/postprep.py <lib/postprep.py
|
||||
pssh --timeout 900 --send-input "python /tmp/postprep.py >>/tmp/pp.out 2>>/tmp/pp.err" <ips.txt
|
||||
|
||||
# Install docker-prompt script
|
||||
pssh -I sudo tee /usr/local/bin/docker-prompt < lib/docker-prompt
|
||||
pssh -I sudo tee /usr/local/bin/docker-prompt <lib/docker-prompt
|
||||
pssh sudo chmod +x /usr/local/bin/docker-prompt
|
||||
|
||||
# If /home/docker/.ssh/id_rsa doesn't exist, copy it from node1
|
||||
@@ -116,7 +116,7 @@ _cmd_deploy() {
|
||||
sep "Deployed tag $TAG"
|
||||
info "You may want to run one of the following commands:"
|
||||
info "$0 kube $TAG"
|
||||
info "$0 pull-images $TAG"
|
||||
info "$0 pull_images $TAG"
|
||||
info "$0 cards $TAG $SETTINGS"
|
||||
}
|
||||
|
||||
@@ -206,7 +206,7 @@ _cmd_ips() {
|
||||
}
|
||||
|
||||
_cmd list "List available batches in the current region"
|
||||
_cmd_list(){
|
||||
_cmd_list() {
|
||||
info "Listing batches in region $AWS_DEFAULT_REGION:"
|
||||
aws_display_tags
|
||||
}
|
||||
@@ -259,7 +259,7 @@ _cmd_retag() {
|
||||
if [[ -z "$NEWTAG" ]]; then
|
||||
die "You must specify a new tag to apply."
|
||||
fi
|
||||
aws_tag_instances $OLDTAG $NEWTAG
|
||||
aws_tag_instances $OLDTAG $NEWTAG
|
||||
}
|
||||
|
||||
_cmd start "Start a batch of VMs"
|
||||
@@ -279,8 +279,8 @@ _cmd_start() {
|
||||
# Upload our SSH keys to AWS if needed, to be added to each VM's authorized_keys
|
||||
key_name=$(sync_keys)
|
||||
|
||||
AMI=$(_cmd_ami) # Retrieve the AWS image ID
|
||||
TOKEN=$(get_token) # generate a timestamp token for this batch of VMs
|
||||
AMI=$(_cmd_ami) # Retrieve the AWS image ID
|
||||
TOKEN=$(get_token) # generate a timestamp token for this batch of VMs
|
||||
AWS_KEY_NAME=$(make_key_name)
|
||||
|
||||
sep "Starting instances"
|
||||
@@ -295,7 +295,7 @@ _cmd_start() {
|
||||
--instance-type t2.medium \
|
||||
--client-token $TOKEN \
|
||||
--image-id $AMI)
|
||||
reservation_id=$(echo "$result" | head -1 | awk '{print $2}' )
|
||||
reservation_id=$(echo "$result" | head -1 | awk '{print $2}')
|
||||
info "Reservation ID: $reservation_id"
|
||||
sep
|
||||
|
||||
@@ -317,7 +317,7 @@ _cmd_start() {
|
||||
|
||||
mkdir -p tags/$TAG
|
||||
IPS=$(aws_get_instance_ips_by_tag $TAG)
|
||||
echo "$IPS" > tags/$TAG/ips.txt
|
||||
echo "$IPS" >tags/$TAG/ips.txt
|
||||
link_tag $TAG
|
||||
if [ -n "$SETTINGS" ]; then
|
||||
_cmd_deploy $TAG $SETTINGS
|
||||
@@ -325,16 +325,16 @@ _cmd_start() {
|
||||
info "To deploy or kill these instances, run one of the following:"
|
||||
info "$0 deploy $TAG <settings/somefile.yaml>"
|
||||
info "$0 stop $TAG"
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
_cmd ec2quotas "Check our EC2 quotas (max instances)"
|
||||
_cmd_ec2quotas(){
|
||||
_cmd_ec2quotas() {
|
||||
greet
|
||||
|
||||
max_instances=$(aws ec2 describe-account-attributes \
|
||||
--attribute-names max-instances \
|
||||
--query 'AccountAttributes[*][AttributeValues]')
|
||||
--attribute-names max-instances \
|
||||
--query 'AccountAttributes[*][AttributeValues]')
|
||||
info "In the current region ($AWS_DEFAULT_REGION) you can deploy up to $max_instances instances."
|
||||
|
||||
# Print list of AWS EC2 regions, highlighting ours ($AWS_DEFAULT_REGION) in the list
|
||||
@@ -373,7 +373,7 @@ link_tag() {
|
||||
ln -sf $IPS_FILE ips.txt
|
||||
}
|
||||
|
||||
pull_tag(){
|
||||
pull_tag() {
|
||||
TAG=$1
|
||||
need_tag $TAG
|
||||
link_tag $TAG
|
||||
@@ -405,13 +405,13 @@ wait_until_tag_is_running() {
|
||||
COUNT=$2
|
||||
i=0
|
||||
done_count=0
|
||||
while [[ $done_count -lt $COUNT ]]; do \
|
||||
while [[ $done_count -lt $COUNT ]]; do
|
||||
let "i += 1"
|
||||
info "$(printf "%d/%d instances online" $done_count $COUNT)"
|
||||
done_count=$(aws ec2 describe-instances \
|
||||
--filters "Name=instance-state-name,Values=running" \
|
||||
"Name=tag:Name,Values=$TAG" \
|
||||
--query "Reservations[*].Instances[*].State.Name" \
|
||||
--filters "Name=instance-state-name,Values=running" \
|
||||
"Name=tag:Name,Values=$TAG" \
|
||||
--query "Reservations[*].Instances[*].State.Name" \
|
||||
| tr "\t" "\n" \
|
||||
| wc -l)
|
||||
|
||||
@@ -432,7 +432,7 @@ tag_is_reachable() {
|
||||
test_tag() {
|
||||
ips_file=tags/$TAG/ips.txt
|
||||
info "Picking a random IP address in $ips_file to run tests."
|
||||
n=$[ 1 + $RANDOM % $(wc -l < $ips_file) ]
|
||||
n=$((1 + $RANDOM % $(wc -l <$ips_file)))
|
||||
ip=$(head -n $n $ips_file | tail -n 1)
|
||||
test_vm $ip
|
||||
info "Tests complete."
|
||||
@@ -461,8 +461,8 @@ test_vm() {
|
||||
"env" \
|
||||
"ls -la /home/docker/.ssh"; do
|
||||
sep "$cmd"
|
||||
echo "$cmd" |
|
||||
ssh -A -q \
|
||||
echo "$cmd" \
|
||||
| ssh -A -q \
|
||||
-o "UserKnownHostsFile /dev/null" \
|
||||
-o "StrictHostKeyChecking=no" \
|
||||
$user@$ip sudo -u docker -i \
|
||||
@@ -480,26 +480,26 @@ test_vm() {
|
||||
info "Test VM was $ip."
|
||||
}
|
||||
|
||||
make_key_name(){
|
||||
make_key_name() {
|
||||
SHORT_FINGERPRINT=$(ssh-add -l | grep RSA | head -n1 | cut -d " " -f 2 | tr -d : | cut -c 1-8)
|
||||
echo "${SHORT_FINGERPRINT}-${USER}"
|
||||
}
|
||||
|
||||
sync_keys() {
|
||||
# make sure ssh-add -l contains "RSA"
|
||||
ssh-add -l | grep -q RSA ||
|
||||
die "The output of \`ssh-add -l\` doesn't contain 'RSA'. Start the agent, add your keys?"
|
||||
ssh-add -l | grep -q RSA \
|
||||
|| die "The output of \`ssh-add -l\` doesn't contain 'RSA'. Start the agent, add your keys?"
|
||||
|
||||
AWS_KEY_NAME=$(make_key_name)
|
||||
info "Syncing keys... "
|
||||
if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &> /dev/null; then
|
||||
if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &>/dev/null; then
|
||||
aws ec2 import-key-pair --key-name $AWS_KEY_NAME \
|
||||
--public-key-material "$(ssh-add -L \
|
||||
| grep -i RSA \
|
||||
| head -n1 \
|
||||
| cut -d " " -f 1-2)" &> /dev/null
|
||||
| grep -i RSA \
|
||||
| head -n1 \
|
||||
| cut -d " " -f 1-2)" &>/dev/null
|
||||
|
||||
if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &> /dev/null; then
|
||||
if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &>/dev/null; then
|
||||
die "Somehow, importing the key didn't work. Make sure that 'ssh-add -l | grep RSA | head -n1' returns an RSA key?"
|
||||
else
|
||||
info "Imported new key $AWS_KEY_NAME."
|
||||
@@ -523,4 +523,3 @@ describe_tag() {
|
||||
aws_display_instances_by_tag $TAG
|
||||
aws_display_instance_statuses_by_tag $TAG
|
||||
}
|
||||
|
||||
|
||||
@@ -3,10 +3,10 @@
|
||||
# That way, it can be safely invoked as a function from other scripts.
|
||||
|
||||
find_ubuntu_ami() {
|
||||
(
|
||||
(
|
||||
|
||||
usage() {
|
||||
cat >&2 <<__
|
||||
usage() {
|
||||
cat >&2 <<__
|
||||
usage: find-ubuntu-ami.sh [ <filter>... ] [ <sorting> ] [ <options> ]
|
||||
where:
|
||||
<filter> is pair of key and substring to search
|
||||
@@ -33,66 +33,94 @@ where:
|
||||
protip for Docker orchestration workshop admin:
|
||||
./find-ubuntu-ami.sh -t hvm:ebs -r \$AWS_REGION -v 15.10 -N
|
||||
__
|
||||
exit 1
|
||||
}
|
||||
exit 1
|
||||
}
|
||||
|
||||
args=`getopt hr:n:v:a:t:d:i:k:RNVATDIKq $*`
|
||||
if [ $? != 0 ] ; then
|
||||
echo >&2
|
||||
usage
|
||||
fi
|
||||
args=$(getopt hr:n:v:a:t:d:i:k:RNVATDIKq $*)
|
||||
if [ $? != 0 ]; then
|
||||
echo >&2
|
||||
usage
|
||||
fi
|
||||
|
||||
region=
|
||||
name=
|
||||
version=
|
||||
arch=
|
||||
type=
|
||||
date=
|
||||
image=
|
||||
kernel=
|
||||
region=
|
||||
name=
|
||||
version=
|
||||
arch=
|
||||
type=
|
||||
date=
|
||||
image=
|
||||
kernel=
|
||||
|
||||
sort=date
|
||||
sort=date
|
||||
|
||||
quiet=
|
||||
quiet=
|
||||
|
||||
set -- $args
|
||||
for a ; do
|
||||
case "$a" in
|
||||
-h) usage ;;
|
||||
set -- $args
|
||||
for a; do
|
||||
case "$a" in
|
||||
-h) usage ;;
|
||||
|
||||
-r) region=$2 ; shift ;;
|
||||
-n) name=$2 ; shift ;;
|
||||
-v) version=$2 ; shift ;;
|
||||
-a) arch=$2 ; shift ;;
|
||||
-t) type=$2 ; shift ;;
|
||||
-d) date=$2 ; shift ;;
|
||||
-i) image=$2 ; shift ;;
|
||||
-k) kernel=$2 ; shift ;;
|
||||
|
||||
-R) sort=region ;;
|
||||
-N) sort=name ;;
|
||||
-V) sort=version ;;
|
||||
-A) sort=arch ;;
|
||||
-T) sort=type ;;
|
||||
-D) sort=date ;;
|
||||
-I) sort=image ;;
|
||||
-K) sort=kernel ;;
|
||||
-r)
|
||||
region=$2
|
||||
shift
|
||||
;;
|
||||
-n)
|
||||
name=$2
|
||||
shift
|
||||
;;
|
||||
-v)
|
||||
version=$2
|
||||
shift
|
||||
;;
|
||||
-a)
|
||||
arch=$2
|
||||
shift
|
||||
;;
|
||||
-t)
|
||||
type=$2
|
||||
shift
|
||||
;;
|
||||
-d)
|
||||
date=$2
|
||||
shift
|
||||
;;
|
||||
-i)
|
||||
image=$2
|
||||
shift
|
||||
;;
|
||||
-k)
|
||||
kernel=$2
|
||||
shift
|
||||
;;
|
||||
|
||||
-q) quiet=y ;;
|
||||
|
||||
--) shift ; break ;;
|
||||
*) continue ;;
|
||||
esac
|
||||
shift
|
||||
done
|
||||
-R) sort=region ;;
|
||||
-N) sort=name ;;
|
||||
-V) sort=version ;;
|
||||
-A) sort=arch ;;
|
||||
-T) sort=type ;;
|
||||
-D) sort=date ;;
|
||||
-I) sort=image ;;
|
||||
-K) sort=kernel ;;
|
||||
|
||||
[ $# = 0 ] || usage
|
||||
-q) quiet=y ;;
|
||||
|
||||
fix_json() {
|
||||
tr -d \\n | sed 's/,]}/]}/'
|
||||
}
|
||||
--)
|
||||
shift
|
||||
break
|
||||
;;
|
||||
*) continue ;;
|
||||
esac
|
||||
shift
|
||||
done
|
||||
|
||||
jq_query() { cat <<__
|
||||
[ $# = 0 ] || usage
|
||||
|
||||
fix_json() {
|
||||
tr -d \\n | sed 's/,]}/]}/'
|
||||
}
|
||||
|
||||
jq_query() {
|
||||
cat <<__
|
||||
.aaData | map (
|
||||
{
|
||||
region: .[0],
|
||||
@@ -116,31 +144,31 @@ jq_query() { cat <<__
|
||||
) | sort_by(.$sort) | .[] |
|
||||
"\(.region)|\(.name)|\(.version)|\(.arch)|\(.type)|\(.date)|\(.image)|\(.kernel)"
|
||||
__
|
||||
}
|
||||
|
||||
trim_quotes() {
|
||||
sed 's/^"//;s/"$//'
|
||||
}
|
||||
|
||||
escape_spaces() {
|
||||
sed 's/ /\\\ /g'
|
||||
}
|
||||
|
||||
url=http://cloud-images.ubuntu.com/locator/ec2/releasesTable
|
||||
|
||||
{
|
||||
[ "$quiet" ] || echo REGION NAME VERSION ARCH TYPE DATE IMAGE KERNEL
|
||||
curl -s $url | fix_json | jq "$(jq_query)" | trim_quotes | escape_spaces | tr \| ' '
|
||||
} \
|
||||
| while read region name version arch type date image kernel; do
|
||||
image=${image%<*}
|
||||
image=${image#*>}
|
||||
if [ "$quiet" ]; then
|
||||
echo $image
|
||||
else
|
||||
echo "$region|$name|$version|$arch|$type|$date|$image|$kernel"
|
||||
fi
|
||||
done | column -t -s \|
|
||||
|
||||
)
|
||||
}
|
||||
|
||||
trim_quotes() {
|
||||
sed 's/^"//;s/"$//'
|
||||
}
|
||||
|
||||
escape_spaces() {
|
||||
sed 's/ /\\\ /g'
|
||||
}
|
||||
|
||||
url=http://cloud-images.ubuntu.com/locator/ec2/releasesTable
|
||||
|
||||
{
|
||||
[ "$quiet" ] || echo REGION NAME VERSION ARCH TYPE DATE IMAGE KERNEL
|
||||
curl -s $url | fix_json | jq "`jq_query`" | trim_quotes | escape_spaces | tr \| ' '
|
||||
} |
|
||||
while read region name version arch type date image kernel ; do
|
||||
image=${image%<*}
|
||||
image=${image#*>}
|
||||
if [ "$quiet" ]; then
|
||||
echo $image
|
||||
else
|
||||
echo "$region|$name|$version|$arch|$type|$date|$image|$kernel"
|
||||
fi
|
||||
done | column -t -s \|
|
||||
|
||||
)
|
||||
}
|
||||
@@ -60,7 +60,7 @@ system("echo docker:training | sudo chpasswd")
|
||||
|
||||
# Fancy prompt courtesy of @soulshake.
|
||||
system("""sudo -u docker tee -a /home/docker/.bashrc <<SQRL
|
||||
export PS1='\e[1m\e[31m[\h] \e[32m(\\$(docker-prompt)) \e[34m\u@{}\e[35m \w\e[0m\n$ '
|
||||
export PS1='\e[1m\e[31m[{}] \e[32m(\\$(docker-prompt)) \e[34m\u@\h\e[35m \w\e[0m\n$ '
|
||||
SQRL""".format(ipv4))
|
||||
|
||||
# Custom .vimrc
|
||||
@@ -135,7 +135,9 @@ while addresses:
|
||||
print(cluster)
|
||||
|
||||
mynode = cluster.index(ipv4) + 1
|
||||
system("echo 'node{}' | sudo -u docker tee /tmp/node".format(mynode))
|
||||
system("echo node{} | sudo -u docker tee /tmp/node".format(mynode))
|
||||
system("echo node{} | sudo tee /etc/hostname".format(mynode))
|
||||
system("sudo hostname node{}".format(mynode))
|
||||
system("sudo -u docker mkdir -p /home/docker/.ssh")
|
||||
system("sudo -u docker touch /home/docker/.ssh/authorized_keys")
|
||||
|
||||
|
||||
@@ -1,8 +1,8 @@
|
||||
# This file can be sourced in order to directly run commands on
|
||||
# This file can be sourced in order to directly run commands on
|
||||
# a batch of VMs whose IPs are located in ips.txt of the directory in which
|
||||
# the command is run.
|
||||
|
||||
pssh () {
|
||||
pssh() {
|
||||
HOSTFILE="ips.txt"
|
||||
|
||||
[ -f $HOSTFILE ] || {
|
||||
@@ -14,10 +14,10 @@ pssh () {
|
||||
export PSSH=$(which pssh || which parallel-ssh)
|
||||
|
||||
$PSSH -h $HOSTFILE -l ubuntu \
|
||||
--par 100 \
|
||||
-O LogLevel=ERROR \
|
||||
-O UserKnownHostsFile=/dev/null \
|
||||
-O StrictHostKeyChecking=no \
|
||||
-O ForwardAgent=yes \
|
||||
"$@"
|
||||
--par 100 \
|
||||
-O LogLevel=ERROR \
|
||||
-O UserKnownHostsFile=/dev/null \
|
||||
-O StrictHostKeyChecking=no \
|
||||
-O ForwardAgent=yes \
|
||||
"$@"
|
||||
}
|
||||
|
||||
@@ -50,8 +50,8 @@ check_dependencies() {
|
||||
status=0
|
||||
for dependency in $DEPENDENCIES; do
|
||||
if ! command -v $dependency >/dev/null; then
|
||||
warning "Dependency $dependency could not be found."
|
||||
status=1
|
||||
warning "Dependency $dependency could not be found."
|
||||
status=1
|
||||
fi
|
||||
done
|
||||
return $status
|
||||
@@ -61,11 +61,11 @@ check_image() {
|
||||
docker inspect $TRAINER_IMAGE >/dev/null 2>&1
|
||||
}
|
||||
|
||||
check_envvars ||
|
||||
die "Please set all required environment variables."
|
||||
check_envvars \
|
||||
|| die "Please set all required environment variables."
|
||||
|
||||
check_dependencies ||
|
||||
warning "At least one dependency is missing. Install it or try the image wrapper."
|
||||
check_dependencies \
|
||||
|| warning "At least one dependency is missing. Install it or try the image wrapper."
|
||||
|
||||
# Now check which command was invoked and execute it
|
||||
if [ "$1" ]; then
|
||||
|
||||
35
stacks/dockercoins+healthchecks.yml
Normal file
@@ -0,0 +1,35 @@
|
||||
version: "3"
|
||||
|
||||
services:
|
||||
rng:
|
||||
build: dockercoins/rng
|
||||
image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest}
|
||||
deploy:
|
||||
mode: global
|
||||
|
||||
hasher:
|
||||
build: dockercoins/hasher
|
||||
image: ${REGISTRY-127.0.0.1:5000}/hasher:${TAG-latest}
|
||||
deploy:
|
||||
replicas: 7
|
||||
update_config:
|
||||
delay: 5s
|
||||
failure_action: rollback
|
||||
max_failure_ratio: .5
|
||||
monitor: 5s
|
||||
parallelism: 1
|
||||
|
||||
webui:
|
||||
build: dockercoins/webui
|
||||
image: ${REGISTRY-127.0.0.1:5000}/webui:${TAG-latest}
|
||||
ports:
|
||||
- "8000:80"
|
||||
|
||||
redis:
|
||||
image: redis
|
||||
|
||||
worker:
|
||||
build: dockercoins/worker
|
||||
image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest}
|
||||
deploy:
|
||||
replicas: 10
|
||||