diff --git a/.gitignore b/.gitignore index 23e580b7..40c39306 100644 --- a/.gitignore +++ b/.gitignore @@ -6,3 +6,5 @@ prepare-vms/ips.html prepare-vms/ips.pdf prepare-vms/settings.yaml prepare-vms/tags +docs/*.yml.html +autotest/nextstep diff --git a/README.md b/README.md index 4ada1a07..6db58feb 100644 --- a/README.md +++ b/README.md @@ -8,10 +8,27 @@ non-stop since June 2015. ## Content -- Chapter 1: Getting Started: running apps with docker-compose -- Chapter 2: Scaling out with Swarm Mode -- Chapter 3: Operating the Swarm (networks, updates, logging, metrics) -- Chapter 4: Deeper in Swarm (stateful services, scripting, DAB's) +The workshop introduces a demo app, "DockerCoins," built +around a micro-services architecture. First, we run it +on a single node, using Docker Compose. Then, we pretend +that we need to scale it, and we use an orchestrator +(SwarmKit or Kubernetes) to deploy and scale the app on +a cluster. + +We explain the concepts of the orchestrator. For SwarmKit, +we setup the cluster with `docker swarm init` and `docker swarm join`. +For Kubernetes, we use pre-configured clusters. + +Then, we cover more advanced concepts: scaling, load balancing, +updates, global services or daemon sets. + +There are a number of advanced optional chapters about +logging, metrics, secrets, network encryption, etc. + +The content is very modular: it is broken down in a large +number of Markdown files, that are put together according +to a YAML manifest. This allows to re-use content +between different workshops very easily. ## Quick start (or, "I want to try it!") @@ -32,8 +49,8 @@ own cluster, we have multiple solutions for you! ### Using [play-with-docker](http://play-with-docker.com/) -This method is very easy to get started (you don't need any extra account -or resources!) but will require a bit of adaptation from the workshop slides. +This method is very easy to get started: you don't need any extra account +or resources! It works only for the SwarmKit version of the workshop, though. To get started, go to [play-with-docker](http://play-with-docker.com/), and click on _ADD NEW INSTANCE_ five times. You will get five "docker-in-docker" @@ -44,31 +61,9 @@ the tab corresponding to that node. The nodes are not directly reachable from outside; so when the slides tell you to "connect to the IP address of your node on port XYZ" you will have -to use a different method. - -We suggest to use "supergrok", a container offering a NGINX+ngrok combo to -expose your services. To use it, just start (on any of your nodes) the -`jpetazzo/supergrok` image. The image will output further instructions: - -``` -docker run --name supergrok -d jpetazzo/supergrok -docker logs --follow supergrok -``` - -The logs of the container will give you a tunnel address and explain you -how to connected to exposed services. That's all you need to do! - -We are also working on a native proxy, embedded to Play-With-Docker. -Stay tuned! - - +to use a different method: click on the port number that should appear on +top of the play-with-docker window. This only works for HTTP services, +though. Note that the instances provided by Play-With-Docker have a short lifespan (a few hours only), so if you want to do the workshop over multiple sessions, @@ -119,14 +114,16 @@ check the [prepare-vms](prepare-vms) directory for more information. ## Slide Deck - The slides are in the `docs` directory. -- To view them locally open `docs/index.html` in your browser. It works - offline too. 
-- To view them online open https://jpetazzo.github.io/orchestration-workshop/ - in your browser. -- When you fork this repo, be sure GitHub Pages is enabled in repo Settings - for "master branch /docs folder" and you'll have your own website for them. -- They use https://remarkjs.com to allow simple markdown in a html file that - remark will transform into a presentation in the browser. +- For each slide deck, there is a `.yml` file referencing `.md` files. +- The `.md` files contain Markdown snippets. +- When you run `build.sh once`, it will "compile" all the `.yml` files + into `.yml.html` files that you can open in your browser. +- You can also run `build.sh forever`, which will watch the directory + and rebuild slides automatically when files are modified. +- If needed, you can fine-tune `workshop.css` and `workshop.html` + (respectively the CSS style used, and the boilerplate template). +- The slides use https://remarkjs.com to render Markdown into HTML in + a web browser. ## Sample App: Dockercoins! @@ -181,7 +178,7 @@ want to become an instructor), keep reading!* they need for class. - Typically you create the servers the day before or morning of workshop, and leave them up the rest of day after workshop. If creating hundreds of servers, - you'll likely want to run all these `trainer` commands from a dedicated + you'll likely want to run all these `workshopctl` commands from a dedicated instance you have in same region as instances you want to create. Much faster this way if you're on poor internet. Also, create 2 sets of servers for yourself, and use one during workshop and the 2nd is a backup. @@ -203,7 +200,7 @@ want to become an instructor), keep reading!* ### Creating the VMs -`prepare-vms/trainer` is the script that gets you most of what you need for +`prepare-vms/workshopctl` is the script that gets you most of what you need for setting up instances. See [prepare-vms/README.md](prepare-vms) for all the info on tools and scripts. diff --git a/autotest/autotest.py b/autotest/autotest.py index 8299836e..27001237 100755 --- a/autotest/autotest.py +++ b/autotest/autotest.py @@ -1,15 +1,28 @@ #!/usr/bin/env python +import uuid +import logging import os import re -import signal import subprocess +import sys import time +import uuid -def print_snippet(snippet): - print(78*'-') - print(snippet) - print(78*'-') +logging.basicConfig(level=logging.DEBUG) + + +TIMEOUT = 60 # 1 minute + + +def hrule(): + return "="*int(subprocess.check_output(["tput", "cols"])) + +# A "snippet" is something that the user is supposed to do in the workshop. +# Most of the "snippets" are shell commands. +# Some of them can be key strokes or other actions. +# In the markdown source, they are the code sections (identified by triple- +# quotes) within .exercise[] sections. class Snippet(object): @@ -29,26 +42,22 @@ class Slide(object): def __init__(self, content): Slide.current_slide += 1 self.number = Slide.current_slide + # Remove commented-out slides # (remark.js considers ??? to be the separator for speaker notes) content = re.split("\n\?\?\?\n", content)[0] self.content = content + self.snippets = [] exercises = re.findall("\.exercise\[(.*)\]", content, re.DOTALL) for exercise in exercises: - if "```" in exercise and "
`" in exercise: - print("! Exercise on slide {} has both ``` and
` delimiters, skipping." - .format(self.number)) - print_snippet(exercise) - elif "```" in exercise: + if "```" in exercise: for snippet in exercise.split("```")[1::2]: self.snippets.append(Snippet(self, snippet)) - elif "
`" in exercise: - for snippet in re.findall("
`(.*)`", exercise): - self.snippets.append(Snippet(self, snippet)) else: - print(" Exercise on slide {} has neither ``` or
` delimiters, skipping." - .format(self.number)) + logging.warning("Exercise on slide {} does not have any ``` snippet." + .format(self.number)) + self.debug() def __str__(self): text = self.content @@ -56,136 +65,165 @@ class Slide(object): text = text.replace(snippet.content, ansi("7")(snippet.content)) return text + def debug(self): + logging.debug("\n{}\n{}\n{}".format(hrule(), self.content, hrule())) + def ansi(code): return lambda s: "\x1b[{}m{}\x1b[0m".format(code, s) -slides = [] -with open("index.html") as f: - content = f.read() - for slide in re.split("\n---?\n", content): - slides.append(Slide(slide)) -is_editing_file = False -placeholders = {} +def wait_for_string(s): + logging.debug("Waiting for string: {}".format(s)) + deadline = time.time() + TIMEOUT + while time.time() < deadline: + output = capture_pane() + if s in output: + return + time.sleep(1) + raise Exception("Timed out while waiting for {}!".format(s)) + + +def wait_for_prompt(): + logging.debug("Waiting for prompt.") + deadline = time.time() + TIMEOUT + while time.time() < deadline: + output = capture_pane() + # If we are not at the bottom of the screen, there will be a bunch of extra \n's + output = output.rstrip('\n') + if output[-2:] == "\n$": + return + time.sleep(1) + raise Exception("Timed out while waiting for prompt!") + + +def check_exit_status(): + token = uuid.uuid4().hex + data = "echo {} $?\n".format(token) + logging.debug("Sending {!r} to get exit status.".format(data)) + send_keys(data) + time.sleep(0.5) + wait_for_prompt() + screen = capture_pane() + status = re.findall("\n{} ([0-9]+)\n".format(token), screen, re.MULTILINE) + logging.debug("Got exit status: {}.".format(status)) + if len(status) == 0: + raise Exception("Couldn't retrieve status code {}. Timed out?".format(token)) + if len(status) > 1: + raise Exception("More than one status code {}. I'm seeing double! Shoot them both.".format(token)) + code = int(status[0]) + if code != 0: + raise Exception("Non-zero exit status: {}.".format(code)) + # Otherwise just return peacefully. + + +slides = [] +content = open(sys.argv[1]).read() +for slide in re.split("\n---?\n", content): + slides.append(Slide(slide)) + +actions = [] for slide in slides: for snippet in slide.snippets: content = snippet.content - # Multi-line snippets should be ```highlightsyntax... - # Single-line snippets will be interpreted as shell commands + # Extract the "method" (e.g. bash, keys, ...) + # On multi-line snippets, the method is alone on the first line + # On single-line snippets, the data follows the method immediately if '\n' in content: - highlight, content = content.split('\n', 1) + method, data = content.split('\n', 1) else: - highlight = "bash" - content = content.strip() - # If the previous snippet was a file fragment, and the current - # snippet is not YAML or EDIT, complain. - if is_editing_file and highlight not in ["yaml", "edit"]: - print("! On slide {}, previous snippet was YAML, so what do what do?" - .format(slide.number)) - print_snippet(content) - is_editing_file = False - if highlight == "yaml": - is_editing_file = True - elif highlight == "placeholder": - for line in content.split('\n'): - variable, value = line.split(' ', 1) - placeholders[variable] = value - elif highlight == "bash": - for variable, value in placeholders.items(): - quoted = "`{}`".format(variable) - if quoted in content: - content = content.replace(quoted, value) - del placeholders[variable] - if '`' in content: - print("! 
The following snippet on slide {} contains a backtick:" - .format(slide.number)) - print_snippet(content) - continue - print("_ "+content) - snippet.actions.append((highlight, content)) - elif highlight == "edit": - print(". "+content) - snippet.actions.append((highlight, content)) - elif highlight == "meta": - print("^ "+content) - snippet.actions.append((highlight, content)) - else: - print("! Unknown highlight {!r} on slide {}.".format(highlight, slide.number)) -if placeholders: - print("! Remaining placeholder values: {}".format(placeholders)) + method, data = content.split(' ', 1) + actions.append((slide, snippet, method, data)) -actions = sum([snippet.actions for snippet in sum([slide.snippets for slide in slides], [])], []) -# Strip ^{ ... ^} for now -def strip_curly_braces(actions, in_braces=False): - if actions == []: - return [] - elif actions[0] == ("meta", "^{"): - return strip_curly_braces(actions[1:], True) - elif actions[0] == ("meta", "^}"): - return strip_curly_braces(actions[1:], False) - elif in_braces: - return strip_curly_braces(actions[1:], True) +def send_keys(data): + subprocess.check_call(["tmux", "send-keys", data]) + +def capture_pane(): + return subprocess.check_output(["tmux", "capture-pane", "-p"]) + + +try: + i = int(open("nextstep").read()) + logging.info("Loaded next step ({}) from file.".format(i)) +except Exception as e: + logging.warning("Could not read nextstep file ({}), initializing to 0.".format(e)) + i = 0 + +interactive = True + +while i < len(actions): + with open("nextstep", "w") as f: + f.write(str(i)) + slide, snippet, method, data = actions[i] + + # Remove extra spaces (we don't want them in the terminal) and carriage returns + data = data.strip() + + print(hrule()) + print(slide.content.replace(snippet.content, ansi(7)(snippet.content))) + print(hrule()) + if interactive: + print("[{}/{}] Shall we execute that snippet above?".format(i, len(actions))) + print("(ENTER to execute, 'c' to continue until next error, N to jump to step #N)") + command = raw_input("> ") else: - return [actions[0]] + strip_curly_braces(actions[1:], False) + command = "" -actions = strip_curly_braces(actions) + # For now, remove the `highlighted` sections + # (Make sure to use $() in shell snippets!) 
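+    # (Assumption: in the slide sources, backticks inside a snippet are only
+    # used to highlight part of a command, so removing them should not change
+    # what actually gets typed into the terminal.)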
+ if '`' in data: + logging.info("Stripping ` from snippet.") + data = data.replace('`', '') -background = [] -cwd = os.path.expanduser("~") -env = {} -for current_action, next_action in zip(actions, actions[1:]+[("bash", "true")]): - if current_action[0] == "meta": - continue - print(ansi(7)(">>> {}".format(current_action[1]))) - time.sleep(1) - popen_options = dict(shell=True, cwd=cwd, stdin=subprocess.PIPE, preexec_fn=os.setpgrp) - # The follow hack allows to capture the environment variables set by `docker-machine env` - # FIXME: this doesn't handle `unset` for now - if any([ - "eval $(docker-machine env" in current_action[1], - "DOCKER_HOST" in current_action[1], - "COMPOSE_FILE" in current_action[1], - ]): - popen_options["stdout"] = subprocess.PIPE - current_action[1] += "\nenv" - proc = subprocess.Popen(current_action[1], **popen_options) - proc.cmd = current_action[1] - if next_action[0] == "meta": - print(">>> {}".format(next_action[1])) - time.sleep(3) - if next_action[1] == "^C": - os.killpg(proc.pid, signal.SIGINT) - proc.wait() - elif next_action[1] == "^Z": - # Let the process run - background.append(proc) - elif next_action[1] == "^D": - proc.communicate() - proc.wait() + if command == "c": + # continue until next timeout + interactive = False + elif command.isdigit(): + i = int(command) + elif command == "": + logging.info("Running with method {}: {}".format(method, data)) + if method == "keys": + send_keys(data) + elif method == "bash": + # Make sure that we're ready + wait_for_prompt() + # Strip leading spaces + data = re.sub("\n +", "\n", data) + # Add "RETURN" at the end of the command :) + data += "\n" + # Send command + send_keys(data) + # Force a short sleep to avoid race condition + time.sleep(0.5) + _, _, next_method, next_data = actions[i+1] + if next_method == "wait": + wait_for_string(next_data) + else: + wait_for_prompt() + # Verify return code FIXME should be optional + check_exit_status() + elif method == "copypaste": + screen = capture_pane() + matches = re.findall(data, screen, flags=re.DOTALL) + if len(matches) == 0: + raise Exception("Could not find regex {} in output.".format(data)) + # Arbitrarily get the most recent match + match = matches[-1] + # Remove line breaks (like a screen copy paste would do) + match = match.replace('\n', '') + send_keys(match + '\n') + # FIXME: we should factor out the "bash" method + wait_for_prompt() + check_exit_status() else: - print("! 
Unknown meta action {} after snippet:".format(next_action[1])) - print_snippet(next_action[1]) - print(ansi(7)("<<< {}".format(current_action[1]))) - else: - proc.wait() - if "stdout" in popen_options: - stdout, stderr = proc.communicate() - for line in stdout.split('\n'): - if line.startswith("DOCKER_"): - variable, value = line.split('=', 1) - env[variable] = value - print("=== {}={}".format(variable, value)) - print(ansi(7)("<<< {} >>> {}".format(proc.returncode, current_action[1]))) - if proc.returncode != 0: - print("Got non-zero status code; aborting.") - break - if current_action[1].startswith("cd "): - cwd = os.path.expanduser(current_action[1][3:]) -for proc in background: - print("Terminating background process:") - print_snippet(proc.cmd) - proc.terminate() - proc.wait() + logging.warning("Unknown method {}: {!r}".format(method, data)) + i += 1 + else: + i += 1 + logging.warning("Unknown command {}, skipping to next step.".format(command)) + +# Reset slide counter +with open("nextstep", "w") as f: + f.write(str(0)) diff --git a/autotest/index.html b/autotest/index.html deleted file mode 120000 index f8d31667..00000000 --- a/autotest/index.html +++ /dev/null @@ -1 +0,0 @@ -../www/htdocs/index.html \ No newline at end of file diff --git a/docs/TODO b/docs/TODO new file mode 100644 index 00000000..2329bf59 --- /dev/null +++ b/docs/TODO @@ -0,0 +1,8 @@ +Black belt references that I want to add somewhere: + +What Have Namespaces Done for You Lately? +https://www.youtube.com/watch?v=MHv6cWjvQjM&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=8 + +Cilium: Network and Application Security with BPF and XDP +https://www.youtube.com/watch?v=ilKlmTDdFgk&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=9 + diff --git a/docs/aj-containers.jpeg b/docs/aj-containers.jpeg new file mode 100644 index 00000000..907dec69 Binary files /dev/null and b/docs/aj-containers.jpeg differ diff --git a/docs/apiscope.md b/docs/apiscope.md new file mode 100644 index 00000000..b924a0e0 --- /dev/null +++ b/docs/apiscope.md @@ -0,0 +1,41 @@ +## A reminder about *scope* + +- Out of the box, Docker API access is "all or nothing" + +- When someone has access to the Docker API, they can access *everything* + +- If your developers are using the Docker API to deploy on the dev cluster ... + + ... and the dev cluster is the same as the prod cluster ... + + ... it means that your devs have access to your production data, passwords, etc. 
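+
+  For instance, anyone who can reach that API can read any file on the host
+  with something like this (hypothetical host name; don't try it on a real
+  production cluster):
+
+  ```bash
+  docker -H tcp://prod-node:2375 run --rm -v /:/host alpine cat /host/etc/shadow
+  ```
+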
+ +- This can easily be avoided + +--- + +## Fine-grained API access control + +A few solutions, by increasing order of flexibility: + +- Use separate clusters for different security perimeters + + (And different credentials for each cluster) + +-- + +- Add an extra layer of abstraction (sudo scripts, hooks, or full-blown PAAS) + +-- + +- Enable [authorization plugins] + + - each API request is vetted by your plugin(s) + + - by default, the *subject name* in the client TLS certificate is used as user name + + - example: [user and permission management] in [UCP] + +[authorization plugins]: https://docs.docker.com/engine/extend/plugins_authorization/ +[UCP]: https://docs.docker.com/datacenter/ucp/2.1/guides/ +[user and permission management]: https://docs.docker.com/datacenter/ucp/2.1/guides/admin/manage-users/ diff --git a/docs/blackbelt.png b/docs/blackbelt.png new file mode 100644 index 00000000..d478fd83 Binary files /dev/null and b/docs/blackbelt.png differ diff --git a/docs/build.sh b/docs/build.sh new file mode 100755 index 00000000..a610c0bc --- /dev/null +++ b/docs/build.sh @@ -0,0 +1,33 @@ +#!/bin/sh +case "$1" in +once) + for YAML in *.yml; do + ./markmaker.py < $YAML > $YAML.html || { + rm $YAML.html + break + } + done + ;; + +forever) + # There is a weird bug in entr, at least on MacOS, + # where it doesn't restore the terminal to a clean + # state when exitting. So let's try to work around + # it with stty. + STTY=$(stty -g) + while true; do + find . | entr -d $0 once + STATUS=$? + case $STATUS in + 2) echo "Directory has changed. Restarting.";; + 130) echo "SIGINT or q pressed. Exiting."; break;; + *) echo "Weird exit code: $STATUS. Retrying in 1 second."; sleep 1;; + esac + done + stty $STTY + ;; + +*) + echo "$0 " + ;; +esac diff --git a/docs/chat/index.html b/docs/chat/index.html deleted file mode 100644 index 880a844b..00000000 --- a/docs/chat/index.html +++ /dev/null @@ -1,9 +0,0 @@ - - - - - - -https://dockercommunity.slack.com/messages/docker-mentor - - diff --git a/docs/chat/index.html.sh b/docs/chat/index.html.sh deleted file mode 100755 index a33f3cbe..00000000 --- a/docs/chat/index.html.sh +++ /dev/null @@ -1,16 +0,0 @@ -#!/bin/sh -#LINK=https://gitter.im/jpetazzo/workshop-20170322-sanjose -LINK=https://dockercommunity.slack.com/messages/docker-mentor -#LINK=https://usenix-lisa.slack.com/messages/docker -sed "s,@@LINK@@,$LINK,g" >index.html < - - - - - -$LINK - - -EOF - diff --git a/docs/concepts-k8s.md b/docs/concepts-k8s.md new file mode 100644 index 00000000..65043800 --- /dev/null +++ b/docs/concepts-k8s.md @@ -0,0 +1,296 @@ +# Kubernetes concepts + +- Kubernetes is a container management system + +- It runs and manages containerized applications on a cluster + +-- + +- What does that really mean? + +--- + +## Basic things we can ask Kubernetes to do + +-- + +- Start 5 containers using image `atseashop/api:v1.3` + +-- + +- Place an internal load balancer in front of these containers + +-- + +- Start 10 containers using image `atseashop/webfront:v1.3` + +-- + +- Place a public load balancer in front of these containers + +-- + +- It's Black Friday (or Christmas), traffic spikes, grow our cluster and add containers + +-- + +- New release! 
Replace my containers with the new image `atseashop/webfront:v1.4` + +-- + +- Keep processing requests during the upgrade; update my containers one at a time + +--- + +## Other things that Kubernetes can do for us + +- Basic autoscaling + +- Blue/green deployment, canary deployment + +- Long running services, but also batch (one-off) jobs + +- Overcommit our cluster and *evict* low-priority jobs + +- Run services with *stateful* data (databases etc.) + +- Fine-grained access control defining *what* can be done by *whom* on *which* resources + +- Integrating third party services (*service catalog*) + +- Automating complex tasks (*operators*) + +--- + +## Kubernetes architecture + +--- + +class: pic + +![haha only kidding](k8s-arch1.png) + +--- + +## Kubernetes architecture + +- Ha ha ha ha + +- OK, I was trying to scare you, it's much simpler than that ❤️ + +--- + +class: pic + +![that one is more like the real thing](k8s-arch2.png) + +--- + +## Credits + +- The first schema is a Kubernetes cluster with storage backed by multi-path iSCSI + + (Courtesy of [Yongbok Kim](https://www.yongbok.net/blog/)) + +- The second one is a simplified representation of a Kubernetes cluster + + (Courtesy of [Imesh Gunaratne](https://medium.com/containermind/a-reference-architecture-for-deploying-wso2-middleware-on-kubernetes-d4dee7601e8e)) + +--- + +## Kubernetes architecture: the master + +- The Kubernetes logic (its "brains") is a collection of services: + + - the API server (our point of entry to everything!) + - core services like the scheduler and controller manager + - `etcd` (a highly available key/value store; the "database" of Kubernetes) + +- Together, these services form what is called the "master" + +- These services can run straight on a host, or in containers +
+ (that's an implementation detail) + +- `etcd` can be run on separate machines (first schema) or co-located (second schema) + +- We need at least one master, but we can have more (for high availability) + +--- + +## Kubernetes architecture: the nodes + +- The nodes executing our containers run another collection of services: + + - a container Engine (typically Docker) + - kubelet (the "node agent") + - kube-proxy (a necessary but not sufficient network component) + +- Nodes were formerly called "minions" + +- It is customary to *not* run apps on the node(s) running master components + + (Except when using small development clusters) + +--- + +## Do we need to run Docker at all? + +No! + +-- + +- By default, Kubernetes uses the Docker Engine to run containers + +- We could also use `rkt` ("Rocket") from CoreOS + +- Or leverage other pluggable runtimes through the *Container Runtime Interface* + + (like CRI-O, or containerd) + +--- + +## Do we need to run Docker at all? + +Yes! + +-- + +- In this workshop, we run our app on a single node first + +- We will need to build images and ship them around + +- We can do these things without Docker +
+ (and get diagnosed with NIH¹ syndrome) + +- Docker is still the most stable container engine today +
+ (but other options are maturing very quickly) + +.footnote[¹[Not Invented Here](https://en.wikipedia.org/wiki/Not_invented_here)] + +--- + +## Do we need to run Docker at all? + +- On our development environments, CI pipelines ... : + + *Yes, almost certainly* + +- On our production servers: + + *Yes (today)* + + *Probably not (in the future)* + +.footnote[More information about CRI [on the Kubernetes blog](http://blog.kubernetes.io/2016/12/]container-runtime-interface-cri-in-kubernetes.html). + +--- + +## Kubernetes resources + +- The Kubernetes API defines a lot of objects called *resources* + +- These resources are organized by type, or `Kind` (in the API) + +- A few common resource types are: + + - node (a machine — physical or virtual — in our cluster) + - pod (group of containers running together on a node) + - service (stable network endpoint to connect to one or multiple containers) + - namespace (more-or-less isolated group of things) + - secret (bundle of sensitive data to be passed to a container) + + And much more! (We can see the full list by running `kubectl get`) + +--- + +class: pic + +![Node, pod, container](thanks-weave.png) + +(Diagram courtesy of Weave Works, used with permission.) + +--- + +# Declarative vs imperative + +- Kubernetes puts a very strong emphasis on being *declarative* + +- Declarative: + + *I would like a cup of tea.* + +- Imperative: + + *Boil some water. Pour it in a teapot. Add tea leaves. Steep for a while. Serve in cup.* + +-- + +- Declarative seems simpler at first ... + +-- + +- ... As long as you know how to brew tea + +--- + +## Declarative vs imperative + +- What declarative would really be: + + *I want a cup of tea, obtained by pouring an infusion¹ of tea leaves in a cup.* + +-- + + *¹An infusion is obtained by letting the object steep a few minutes in hot² water.* + +-- + + *²Hot liquid is obtained by pouring it in an appropriate container³ and setting it on a stove.* + +-- + + *³Ah, finally, containers! Something we know about. Let's get to work, shall we?* + +-- + +.footnote[Did you know there was an [ISO standard](https://en.wikipedia.org/wiki/ISO_3103) +specifying how to brew tea?] + +--- + +## Declarative vs imperative + +- Imperative systems: + + - simpler + + - if a task is interrupted, we have to restart from scratch + +- Declarative systems: + + - if a task is interrupted (or if we show up to the party half-way through), + we can figure out what's missing and do only what's necessary + + - we need to be able to *observe* the system + + - ... and compute a "diff" between *what we have* and *what we want* + +--- + +## Declarative vs imperative in Kubernetes + +- Virtually everything we create in Kubernetes is created from a *spec* + +- Watch for the `spec` fields in the YAML files later! + +- The *spec* describes *how we want the thing to be* + +- Kubernetes will *reconcile* the current state with the spec +
(technically, this is done by a number of *controllers*) + +- When we want to change some resource, we update the *spec* + +- Kubernetes will then *converge* that resource diff --git a/docs/creatingswarm.md b/docs/creatingswarm.md new file mode 100644 index 00000000..b6139b07 --- /dev/null +++ b/docs/creatingswarm.md @@ -0,0 +1,364 @@ +# Creating our first Swarm + +- The cluster is initialized with `docker swarm init` + +- This should be executed on a first, seed node + +- .warning[DO NOT execute `docker swarm init` on multiple nodes!] + + You would have multiple disjoint clusters. + +.exercise[ + +- Create our cluster from node1: + ```bash + docker swarm init + ``` + +] + +-- + +class: advertise-addr + +If Docker tells you that it `could not choose an IP address to advertise`, see next slide! + +--- + +class: advertise-addr + +## IP address to advertise + +- When running in Swarm mode, each node *advertises* its address to the others +
+ (i.e. it tells them *"you can contact me on 10.1.2.3:2377"*) + +- If the node has only one IP address (other than 127.0.0.1), it is used automatically + +- If the node has multiple IP addresses, you **must** specify which one to use +
+ (Docker refuses to pick one randomly) + +- You can specify an IP address or an interface name +
(in the latter case, Docker will read the IP address of the interface and use it) + +- You can also specify a port number +
(otherwise, the default port 2377 will be used) + +--- + +class: advertise-addr + +## Which IP address should be advertised? + +- If your nodes have only one IP address, it's safe to let autodetection do the job + + .small[(Except if your instances have different private and public addresses, e.g. + on EC2, and you are building a Swarm involving nodes inside and outside the + private network: then you should advertise the public address.)] + +- If your nodes have multiple IP addresses, pick an address which is reachable + *by every other node* of the Swarm + +- If you are using [play-with-docker](http://play-with-docker.com/), use the IP + address shown next to the node name + + .small[(This is the address of your node on your private internal overlay network. + The other address that you might see is the address of your node on the + `docker_gwbridge` network, which is used for outbound traffic.)] + +Examples: + +```bash +docker swarm init --advertise-addr 10.0.9.2 +docker swarm init --advertise-addr eth0:7777 +``` + +--- + +class: extra-details + +## Using a separate interface for the data path + +- You can use different interfaces (or IP addresses) for control and data + +- You set the _control plane path_ with `--advertise-addr` + + (This will be used for SwarmKit manager/worker communication, leader election, etc.) + +- You set the _data plane path_ with `--data-path-addr` + + (This will be used for traffic between containers) + +- Both flags can accept either an IP address, or an interface name + + (When specifying an interface name, Docker will use its first IP address) + +--- + +## Token generation + +- In the output of `docker swarm init`, we have a message + confirming that our node is now the (single) manager: + + ``` + Swarm initialized: current node (8jud...) is now a manager. + ``` + +- Docker generated two security tokens (like passphrases or passwords) for our cluster + +- The CLI shows us the command to use on other nodes to add them to the cluster using the "worker" + security token: + + ``` + To add a worker to this swarm, run the following command: + docker swarm join \ + --token SWMTKN-1-59fl4ak4nqjmao1ofttrc4eprhrola2l87... \ + 172.31.4.182:2377 + ``` + +--- + +class: extra-details + +## Checking that Swarm mode is enabled + +.exercise[ + +- Run the traditional `docker info` command: + ```bash + docker info + ``` + +] + +The output should include: + +``` +Swarm: active + NodeID: 8jud7o8dax3zxbags3f8yox4b + Is Manager: true + ClusterID: 2vcw2oa9rjps3a24m91xhvv0c + ... +``` + +--- + +## Running our first Swarm mode command + +- Let's retry the exact same command as earlier + +.exercise[ + +- List the nodes (well, the only node) of our cluster: + ```bash + docker node ls + ``` + +] + +The output should look like the following: +``` +ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS +8jud...ox4b * node1 Ready Active Leader +``` + +--- + +## Adding nodes to the Swarm + +- A cluster with one node is not a lot of fun + +- Let's add `node2`! + +- We need the token that was shown earlier + +-- + +- You wrote it down, right? + +-- + +- Don't panic, we can easily see it again 😏 + +--- + +## Adding nodes to the Swarm + +.exercise[ + +- Show the token again: + ```bash + docker swarm join-token worker + ``` + +- Log into `node2`: + ```bash + ssh node2 + ``` + +- Copy-paste the `docker swarm join ...` command +
(that was displayed just before) + + + +] + +--- + +class: extra-details + +## Check that the node was added correctly + +- Stay on `node2` for now! + +.exercise[ + +- We can still use `docker info` to verify that the node is part of the Swarm: + ```bash + docker info | grep ^Swarm + ``` + +] + +- However, Swarm commands will not work; try, for instance: + ```bash + docker node ls + ``` + +```wait``` + +- This is because the node that we added is currently a *worker* +- Only *managers* can accept Swarm-specific commands + +--- + +## View our two-node cluster + +- Let's go back to `node1` and see what our cluster looks like + +.exercise[ + +- Switch back to `node1`: + ```keys + ^D + ``` + +- View the cluster from `node1`, which is a manager: + ```bash + docker node ls + ``` + +] + +The output should be similar to the following: +``` +ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS +8jud...ox4b * node1 Ready Active Leader +ehb0...4fvx node2 Ready Active +``` + + +--- + +class: under-the-hood + +## Under the hood: docker swarm init + +When we do `docker swarm init`: + +- a keypair is created for the root CA of our Swarm + +- a keypair is created for the first node + +- a certificate is issued for this node + +- the join tokens are created + +--- + +class: under-the-hood + +## Under the hood: join tokens + +There is one token to *join as a worker*, and another to *join as a manager*. + +The join tokens have two parts: + +- a secret key (preventing unauthorized nodes from joining) + +- a fingerprint of the root CA certificate (preventing MITM attacks) + +If a token is compromised, it can be rotated instantly with: +``` +docker swarm join-token --rotate +``` + +--- + +class: under-the-hood + +## Under the hood: docker swarm join + +When a node joins the Swarm: + +- it is issued its own keypair, signed by the root CA + +- if the node is a manager: + + - it joins the Raft consensus + - it connects to the current leader + - it accepts connections from worker nodes + +- if the node is a worker: + + - it connects to one of the managers (leader or follower) + +--- + +class: under-the-hood + +## Under the hood: cluster communication + +- The *control plane* is encrypted with AES-GCM; keys are rotated every 12 hours + +- Authentication is done with mutual TLS; certificates are rotated every 90 days + + (`docker swarm update` allows to change this delay or to use an external CA) + +- The *data plane* (communication between containers) is not encrypted by default + + (but this can be activated on a by-network basis, using IPSEC, + leveraging hardware crypto if available) + +--- + +class: under-the-hood + +## Under the hood: I want to know more! 
+ +Revisit SwarmKit concepts: + +- Docker 1.12 Swarm Mode Deep Dive Part 1: Topology + ([video](https://www.youtube.com/watch?v=dooPhkXT9yI)) + +- Docker 1.12 Swarm Mode Deep Dive Part 2: Orchestration + ([video](https://www.youtube.com/watch?v=_F6PSP-qhdA)) + +Some presentations from the Docker Distributed Systems Summit in Berlin: + +- Heart of the SwarmKit: Topology Management + ([slides](https://speakerdeck.com/aluzzardi/heart-of-the-swarmkit-topology-management)) + +- Heart of the SwarmKit: Store, Topology & Object Model + ([slides](http://www.slideshare.net/Docker/heart-of-the-swarmkit-store-topology-object-model)) + ([video](https://www.youtube.com/watch?v=EmePhjGnCXY)) + +And DockerCon Black Belt talks: + +.blackbelt[DC17US: Everything You Thought You Already Knew About Orchestration + ([video](https://www.youtube.com/watch?v=Qsv-q8WbIZY&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=6))] + +.blackbelt[DC17EU: Container Orchestration from Theory to Practice + ([video](https://dockercon.docker.com/watch/5fhwnQxW8on1TKxPwwXZ5r))] + diff --git a/docs/daemonset.md b/docs/daemonset.md new file mode 100644 index 00000000..78563972 --- /dev/null +++ b/docs/daemonset.md @@ -0,0 +1,409 @@ +# Daemon sets + +- Remember: we did all that cluster orchestration business for `rng` + +- We want one (and exactly one) instance of `rng` per node + +- If we just scale `deploy/rng` to 4, nothing guarantees that they spread + +- Instead of a `deployment`, we will use a `daemonset` + +- Daemon sets are great for cluster-wide, per-node processes: + + - `kube-proxy` + - `weave` (our overlay network) + - monitoring agents + - hardware management tools (e.g. SCSI/FC HBA agents) + - etc. + +- They can also be restricted to run [only on some nodes](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#running-pods-on-only-some-nodes) + +--- + +## Creating a daemon set + +- Unfortunately, as of Kubernetes 1.8, the CLI cannot create daemon sets + +-- + +- More precisely: it doesn't have a subcommand to create a daemon set + +-- + +- But any kind of resource can always be created by providing a YAML description: + ```bash + kubectl apply -f foo.yaml + ``` + +-- + +- How do we create the YAML file for our daemon set? + +-- + + - option 1: read the docs + +-- + + - option 2: `vi` our way out of it + +--- + +## Creating the YAML file for our daemon set + +- Let's start with the YAML file for the current `rng` resource + +.exercise[ + +- Dump the `rng` resource in YAML: + ```bash + kubectl get deploy/rng -o yaml --export >rng.yml + ``` + +- Edit `rng.yml` + +] + +Note: `--export` will remove "cluster-specific" information, i.e.: +- namespace (so that the resource is not tied to a specific namespace) +- status and creation timestamp (useless when creating a new resource) +- resourceVersion and uid (these would cause... *interesting* problems) + +--- + +## "Casting" a resource to another + +- What if we just changed the `kind` field? + + (It can't be that easy, right?) + +.exercise[ + +- Change `kind: Deployment` to `kind: DaemonSet` + +- Save, quit + +- Try to create our new resource: + ```bash + kubectl apply -f rng.yml + ``` + +] + +-- + +We all knew this couldn't be that easy, right! + +--- + +## Understanding the problem + +- The core of the error is: + ``` + error validating data: + [ValidationError(DaemonSet.spec): + unknown field "replicas" in io.k8s.api.extensions.v1beta1.DaemonSetSpec, + ... 
+ ``` + +-- + +- *Obviously,* it doesn't make sense to specify a number of replicas for a daemon set + +-- + +- Workaround: fix the YAML + + - remove the `replicas` field + - remove the `strategy` field (which defines the rollout mechanism for a deployment) + - remove the `status: {}` line at the end + +-- + +- Or, we could also ... + +--- + +## Use the `--force`, Luke + +- We could also tell Kubernetes to ignore these errors and try anyway + +- The `--force` flag actual name is `--validate=false` + +.exercise[ + +- Try to load our YAML file and ignore errors: + ```bash + kubectl apply -f rng.yml --validate=false + ``` + +] + +-- + +🎩✨🐇 + +-- + +Wait ... Now, can it be *that* easy? + +--- + +## Checking what we've done + +- Did we transform our `deployment` into a `daemonset`? + +.exercise[ + +- Look at the resources that we have now: + ```bash + kubectl get all + ``` + +] + +-- + +We have both `deploy/rng` and `ds/rng` now! + +-- + +And one too many pods... + +--- + +## Explanation + +- You can have different resource types with the same name + + (i.e. a *deployment* and a *daemonset* both named `rng`) + +- We still have the old `rng` *deployment* + +- But now we have the new `rng` *daemonset* as well + +- If we look at the pods, we have: + + - *one pod* for the deployment + + - *one pod per node* for the daemonset + +--- + +## What are all these pods doing? + +- Let's check the logs of all these `rng` pods + +- All these pods have a `run=rng` label: + + - the first pod, because that's what `kubectl run` does + - the other ones (in the daemon set), because we + *copied the spec from the first one* + +- Therefore, we can query everybody's logs using that `run=rng` selector + +.exercise[ + +- Check the logs of all the pods having a label `run=rng`: + ```bash + kubectl logs -l run=rng --tail 1 + ``` + +] + +-- + +It appears that *all the pods* are serving requests at the moment. + +--- + +## The magic of selectors + +- The `rng` *service* is load balancing requests to a set of pods + +- This set of pods is defined as "pods having the label `run=rng`" + +.exercise[ + +- Check the *selector* in the `rng` service definition: + ```bash + kubectl describe service rng + ``` + +] + +When we created additional pods with this label, they were +automatically detected by `svc/rng` and added as *endpoints* +to the associated load balancer. + +--- + +## Removing the first pod from the load balancer + +- What would happen if we removed that pod, with `kubectl delete pod ...`? + +-- + + The `replicaset` would re-create it immediately. + +-- + +- What would happen if we removed the `run=rng` label from that pod? + +-- + + The `replicaset` would re-create it immediately. + +-- + + ... Because what matters to the `replicaset` is the number of pods *matching that selector.* + +-- + +- But but but ... Don't we have more than one pod with `run=rng` now? + +-- + + The answer lies in the exact selector used by the `replicaset` ... + +--- + +## Deep dive into selectors + +- Let's look at the selectors for the `rng` *deployment* and the associated *replica set* + +.exercise[ + +- Show detailed information about the `rng` deployment: + ```bash + kubectl describe deploy rng + ``` + +- Show detailed information about the `rng` replica: +
(The second command doesn't require you to get the exact name of the replica set) + ```bash + kubectl describe rs rng-yyyy + kubectl describe rs -l run=rng + ``` + +] + +-- + +The replica set selector also has a `pod-template-hash`, unlike the pods in our daemon set. + +--- + +# Updating a service through labels and selectors + +- What if we want to drop the `rng` deployment from the load balancer? + +- Option 1: + + - destroy it + +- Option 2: + + - add an extra *label* to the daemon set + + - update the service *selector* to refer to that *label* + +-- + +Of course, option 2 offers more learning opportunities. Right? + +--- + +## Add an extra label to the daemon set + +- We will update the daemon set "spec" + +- Option 1: + + - edit the `rng.yml` file that we used earlier + + - load the new definition with `kubectl apply` + +- Option 2: + + - use `kubectl edit` + +-- + +*If you feel like you got this💕🌈, feel free to try directly.* + +*We've included a few hints on the next slides for your convenience!* + +--- + +## We've put resources in your resources all the way down + +- Reminder: a daemon set is a resource that creates more resources! + +- There is a difference between: + + - the label(s) of a resource (in the `metadata` block in the beginning) + + - the selector of a resource (in the `spec` block) + + - the label(s) of the resource(s) created by the first resource (in the `template` block) + +- You need to update the selector and the template (metadata labels are not mandatory) + +- The template must match the selector + + (i.e. the resource will refuse to create resources that it will not select) + +--- + +## Adding our label + +- Let's add a label `isactive: yes` + +- In YAML, `yes` should be quoted; i.e. `isactive: "yes"` + +.exercise[ + +- Update the daemon set to add `isactive: "yes"` to the selector and template label: + ```bash + kubectl edit daemonset rng + ``` + +- Update the service to add `isactive: "yes"` to its selector: + ```bash + kubectl edit service rng + ``` + +] + +--- + +## Checking what we've done + +.exercise[ + +- Check the logs of all `run=rng` pods to confirm that only 4 of them are now active: + ```bash + kubectl logs -l run=rng + ``` + +] + +The timestamps should give us a hint about how many pods are currently receiving traffic. + +.exercise[ + +- Look at the pods that we have right now: + ```bash + kubectl get pods + ``` + +] + +--- + +## More labels, more selectors, more problems? + +- Bonus exercise 1: clean up the pods of the "old" daemon set + +- Bonus exercise 2: how could we have done to avoid creating new pods? diff --git a/docs/dashboard.md b/docs/dashboard.md new file mode 100644 index 00000000..21ef54b6 --- /dev/null +++ b/docs/dashboard.md @@ -0,0 +1,181 @@ +# The Kubernetes dashboard + +- Kubernetes resources can also be viewed with a web dashboard + +- We are going to deploy that dashboard with *three commands:* + + - one to actually *run* the dashboard + + - one to make the dashboard available from outside + + - one to bypass authentication for the dashboard + +-- + +.footnote[.warning[Yes, this will open our cluster to all kinds of shenanigans. 
Don't do this at home.]] + +--- + +## Running the dashboard + +- We need to create a *deployment* and a *service* for the dashboard + +- But also a *secret*, a *service account*, a *role* and a *role binding* + +- All these things can be defined in a YAML file and created with `kubectl apply -f` + +.exercise[ + +- Create all the dashboard resources, with the following command: + ```bash + kubectl apply -f https://goo.gl/Qamqab + ``` + +] + +The goo.gl URL expands to: +
+.small[https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml] + +--- + +## Making the dashboard reachable from outside + +- The dashboard is exposed through a `ClusterIP` service + +- We need a `NodePort` service instead + +.exercise[ + +- Edit the service: + ```bash + kubectl edit service kubernetes-dashboard + ``` + +] + +-- + +`NotFound`?!? Y U NO WORK?!? + +--- + +## Editing the `kubernetes-dashboard` service + +- If we look at the YAML that we loaded just before, we'll get a hint + +-- + +- The dashboard was created in the `kube-system` namespace + +.exercise[ + +- Edit the service: + ```bash + kubectl -n kube-system edit service kubernetes-dashboard + ``` + +- Change `ClusterIP` to `NodePort`, save, and exit + +- Check the port that was assigned with `kubectl -n kube-system get services` + +] + +--- + +## Connecting to the dashboard + +.exercise[ + +- Connect to https://oneofournodes:3xxxx/ + + (You will have to work around the TLS certificate validation warning) + + + +] + +- We have three authentication options at this point: + + - token (associated with a role that has appropriate permissions) + + - kubeconfig (e.g. using the `~/.kube/config` file from `node1`) + + - "skip" (use the dashboard "service account") + +- Let's use "skip": we get a bunch of warnings and don't see much + +--- + +## Granting more rights to the dashboard + +- The dashboard documentation [explains how to do](https://github.com/kubernetes/dashboard/wiki/Access-control#admin-privileges) + +- We just need to load another YAML file! + +.exercise[ + +- Grant admin privileges to the dashboard so we can see our resources: + ```bash + kubectl apply -f https://goo.gl/CHsLTA + ``` + +- Reload the dashboard and enjoy! + +] + +-- + +.warning[By the way, we just added a backdoor to our Kubernetes cluster!] + +--- + +# Security implications of `kubectl apply` + +- When we do `kubectl apply -f `, we create arbitrary resources + +- Resources can be evil; imagine a `deployment` that ... + +-- + + - starts bitcoin miners on the whole cluster + +-- + + - hides in a non-default namespace + +-- + + - bind-mounts our nodes' filesystem + +-- + + - inserts SSH keys in the root account (on the node) + +-- + + - encrypts our data and ransoms it + +-- + + - ☠️☠️☠️ + +--- + +## `kubectl apply` is the new `curl | sh` + +- `curl | sh` is convenient + +- It's safe if you use HTTPS URLs from trusted sources + +-- + +- `kubectl apply -f` is convenient + +- It's safe if you use HTTPS URLs from trusted sources + +-- + +- It introduces new failure modes + +- Example: the official setup instructions for most pod networks diff --git a/docs/dockercon.yml b/docs/dockercon.yml new file mode 100644 index 00000000..e7ccf4b5 --- /dev/null +++ b/docs/dockercon.yml @@ -0,0 +1,182 @@ +chat: "[Slack](https://dockercommunity.slack.com/messages/C7ET1GY4Q)" + +exclude: +- self-paced +- snap +- auto-btp +- benchmarking +- elk-manual +- prom-manual + +title: "Swarm: from Zero to Hero (DC17EU)" +chapters: +- | + class: title + + .small[ + + Swarm: from Zero to Hero + + .small[.small[ + + **Be kind to the WiFi!** + + *Use the 5G network* +
+ *Don't use your hotspot* +
+ *Don't stream videos from YouTube, Netflix, etc. +
(if you're bored, watch local content instead)* + + Also: share the power outlets +
+ *(with limited power comes limited responsibility?)* +
+ *(or something?)* + + Thank you! + + ] + ] + ] + + --- + + ## Intros + + + + - Hello! We are Jérôme, Lee, Nicholas, and Scott + + + + -- + + - This is our collective Docker knowledge: + + ![Bell Curve](bell-curve.jpg) + + --- + + ## "From zero to hero" + + -- + + - It rhymes, but it's a pretty bad title, to be honest + + -- + + - None of you is a "zero" + + -- + + - None of us is a "hero" + + -- + + - None of us should even try to be a hero + + -- + + *The hero syndrome is a phenomenon affecting people who seek heroism or recognition, + usually by creating a desperate situation which they can resolve. + This can include unlawful acts, such as arson. + The phenomenon has been noted to affect civil servants, + such as firefighters, nurses, police officers, and security guards.* + + (Wikipedia page on [hero syndrome](https://en.wikipedia.org/wiki/Hero_syndrome)) + + --- + + ## Agenda + + .small[ + - 09:00-09:10 Hello! + - 09:10-10:30 Part 1 + - 10:30-11:00 coffee break + - 11:00-12:30 Part 2 + - 12:30-13:30 lunch break + - 13:30-15:00 Part 3 + - 15:00-15:30 coffee break + - 15:30-17:00 Part 4 + - 17:00-18:00 Afterhours and Q&A + ] + + + + - All the content is publicly available (slides, code samples, scripts) + + Upstream URL: https://github.com/jpetazzo/orchestration-workshop + + - Feel free to interrupt for questions at any time + + - Live feedback, questions, help on [Gitter](chat) + + http://container.training/chat + +- intro.md +- | + @@TOC@@ +- - prereqs.md + - versions.md + - | + class: title + + All right! +
+ We're all set. +
+ Let's do this. + - sampleapp.md + - swarmkit.md + - creatingswarm.md + - morenodes.md +- - firstservice.md + - ourapponswarm.md + - updatingservices.md + - healthchecks.md +- - operatingswarm.md + - netshoot.md + - ipsec.md + - swarmtools.md + - security.md + - secrets.md + - encryptionatrest.md + - leastprivilege.md + - apiscope.md +- - logging.md + - metrics.md + - stateful.md + - extratips.md + - end.md +- | + class: title + + That's all folks!
Questions? + + .small[.small[ + + Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@docker](https://twitter.com/docker) + + ]] + + diff --git a/docs/encryptionatrest.md b/docs/encryptionatrest.md new file mode 100644 index 00000000..4bda6e1b --- /dev/null +++ b/docs/encryptionatrest.md @@ -0,0 +1,154 @@ +## Encryption at rest + +- Swarm data is always encrypted + +- A Swarm cluster can be "locked" + +- When a cluster is "locked", the encryption key is protected with a passphrase + +- Starting or restarting a locked manager requires the passphrase + +- This protects against: + + - theft (stealing a physical machine, a disk, a backup tape...) + + - unauthorized access (to e.g. a remote or virtual volume) + + - some vulnerabilities (like path traversal) + +--- + +## Locking a Swarm cluster + +- This is achieved through the `docker swarm update` command + +.exercise[ + +- Lock our cluster: + ```bash + docker swarm update --autolock=true + ``` + +] + +This will display the unlock key. Copy-paste it somewhere safe. + +--- + +## Locked state + +- If we restart a manager, it will now be locked + +.exercise[ + +- Restart the local Engine: + ```bash + sudo systemctl restart docker + ``` + +] + +Note: if you are doing the workshop on your own, using nodes +that you [provisioned yourself](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-machine) or with [Play-With-Docker](http://play-with-docker.com/), you might have to use a different method to restart the Engine. + +--- + +## Checking that our node is locked + +- Manager commands (requiring access to crypted data) will fail + +- Other commands are OK + +.exercise[ + +- Try a few basic commands: + ```bash + docker ps + docker run alpine echo ♥ + docker node ls + ``` + +] + +(The last command should fail, and it will tell you how to unlock this node.) + +--- + +## Checking the state of the node programmatically + +- The state of the node shows up in the output of `docker info` + +.exercise[ + +- Check the output of `docker info`: + ```bash + docker info + ``` + +- Can't see it? Too verbose? Grep to the rescue! + ```bash + docker info | grep ^Swarm + ``` + +] + +--- + +## Unlocking a node + +- You will need the secret token that we obtained when enabling auto-lock earlier + +.exercise[ + +- Unlock the node: + ```bash + docker swarm unlock + ``` + +- Copy-paste the secret token that we got earlier + +- Check that manager commands now work correctly: + ```bash + docker node ls + ``` + +] + +--- + +## Managing the secret key + +- If the key is compromised, you can change it and re-encrypt with a new key: + ```bash + docker swarm unlock-key --rotate + ``` + +- If you lost the key, you can get it as long as you have at least one unlocked node: + ```bash + docker swarm unlock-key -q + ``` + +Note: if you rotate the key while some nodes are locked, without saving the previous key, those nodes won't be able to rejoin. + +Note: if somebody steals both your disks and your key, .strike[you're doomed! Doooooomed!] +
you can block the compromised node with `docker node demote` and `docker node rm`. + +--- + +## Unlocking the cluster permanently + +- If you want to remove the secret key, disable auto-lock + +.exercise[ + +- Permanently unlock the cluster: + ```bash + docker swarm update --autolock=false + ``` + +] + +Note: if some nodes are in locked state at that moment (or if they are offline/restarting +while you disabled autolock), they still need the previous unlock key to get back online. + +For more information about locking, you can check the [upcoming documentation](https://github.com/docker/docker.github.io/pull/694). diff --git a/docs/end.md b/docs/end.md new file mode 100644 index 00000000..c1a7b9be --- /dev/null +++ b/docs/end.md @@ -0,0 +1,38 @@ +class: title, extra-details + +# What's next? + +## (What to expect in future versions of this workshop) + +--- + +class: extra-details + +## Implemented and stable, but out of scope + +- [Docker Content Trust](https://docs.docker.com/engine/security/trust/content_trust/) and + [Notary](https://github.com/docker/notary) (image signature and verification) + +- Image security scanning (many products available, Docker Inc. and 3rd party) + +- [Docker Cloud](https://cloud.docker.com/) and + [Docker Datacenter](https://www.docker.com/products/docker-datacenter) + (commercial offering with node management, secure registry, CI/CD pipelines, all the bells and whistles) + +- Network and storage plugins + +--- + +class: extra-details + +## Work in progress + +- Demo at least one volume plugin +
(bonus points if it's a distributed storage system) + +- ..................................... (your favorite feature here) + +Reminder: there is a tag for each iteration of the content +in the Github repository. + +It makes it easy to come back later and check what has changed since you did it! diff --git a/docs/extra-details.png b/docs/extra-details.png index 51d6206b..9b138507 100644 Binary files a/docs/extra-details.png and b/docs/extra-details.png differ diff --git a/docs/extract-section-titles.py b/docs/extract-section-titles.py deleted file mode 100755 index a5f27fc0..00000000 --- a/docs/extract-section-titles.py +++ /dev/null @@ -1,19 +0,0 @@ -#!/usr/bin/env python -""" -Extract and print level 1 and 2 titles from workshop slides. -""" - -separators = [ - "---", - "--" -] - -slide_count = 1 -for line in open("index.html"): - line = line.strip() - if line in separators: - slide_count += 1 - if line.startswith('# '): - print slide_count, '# #', line - elif line.startswith('# '): - print slide_count, line diff --git a/docs/extratips.md b/docs/extratips.md new file mode 100644 index 00000000..76b1a3b1 --- /dev/null +++ b/docs/extratips.md @@ -0,0 +1,246 @@ +# Controlling Docker from a container + +- In a local environment, just bind-mount the Docker control socket: + ```bash + docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker + ``` + +- Otherwise, you have to: + + - set `DOCKER_HOST`, + - set `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` (if you use TLS), + - copy certificates to the container that will need API access. + +More resources on this topic: + +- [Do not use Docker-in-Docker for CI]( + http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/) +- [One container to rule them all]( + http://jpetazzo.github.io/2016/04/03/one-container-to-rule-them-all/) + +--- + +## Bind-mounting the Docker control socket + +- In Swarm mode, bind-mounting the control socket gives you access to the whole cluster + +- You can tell Docker to place a given service on a manager node, using constraints: + ```bash + docker service create \ + --mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \ + --name autoscaler --constraint node.role==manager ... + ``` + +--- + +## Constraints and global services + +(New in Docker Engine 1.13) + +- By default, global services run on *all* nodes + ```bash + docker service create --mode global ... + ``` + +- You can specify constraints for global services + +- These services will run only on the node satisfying the constraints + +- For instance, this service will run on all manager nodes: + ```bash + docker service create --mode global --constraint node.role==manager ... + ``` + +--- + +## Constraints and dynamic scheduling + +(New in Docker Engine 1.13) + +- If constraints change, services are started/stopped accordingly + + (e.g., `--constraint node.role==manager` and nodes are promoted/demoted) + +- This is particularly useful with labels: + ```bash + docker node update node1 --label-add defcon=five + docker service create --constraint node.labels.defcon==five ... + docker node update node2 --label-add defcon=five + docker node update node1 --label-rm defcon=five + ``` + +--- + +## Shortcomings of dynamic scheduling + +.warning[If a service becomes "unschedulable" (constraints can't be satisfied):] + +- It won't be scheduled automatically when constraints are satisfiable again + +- You will have to update the service; you can do a no-op udate with: + ```bash + docker service update ... 
--force
  ```

.warning[Docker will silently ignore attempts to remove a non-existent label or constraint]

- It won't warn you if you make a typo when removing a label or constraint!

---

# Node management

- SwarmKit allows changing (almost?) everything on the fly

- Nothing should require a global restart

---

## Node availability

```bash
docker node update --availability <active|pause|drain> <node name>
```

- Active = schedule tasks on this node (default)

- Pause = don't schedule new tasks on this node; existing tasks are not affected

  You can use it to troubleshoot a node without disrupting existing tasks

  It can also be used (in conjunction with labels) to reserve resources

- Drain = don't schedule new tasks on this node; existing tasks are moved away

  This is just like crashing the node, but containers get a chance to shut down cleanly

---

## Managers and workers

- Nodes can be promoted to manager with `docker node promote`

- Nodes can be demoted to worker with `docker node demote`

- This can also be done with `docker node update --role <manager|worker> <node name>`

- Reminder: this has to be done from a manager node
  <br/>
(workers cannot promote themselves) + +--- + +## Removing nodes + +- You can leave Swarm mode with `docker swarm leave` + +- Nodes are drained before being removed (i.e. all tasks are rescheduled somewhere else) + +- Managers cannot leave (they have to be demoted first) + +- After leaving, a node still shows up in `docker node ls` (in `Down` state) + +- When a node is `Down`, you can remove it with `docker node rm` (from a manager node) + +--- + +## Join tokens and automation + +- If you have used Docker 1.12-RC: join tokens are now mandatory! + +- You cannot specify your own token (SwarmKit generates it) + +- If you need to change the token: `docker swarm join-token --rotate ...` + +- To automate cluster deployment: + + - have a seed node do `docker swarm init` if it's not already in Swarm mode + + - propagate the token to the other nodes (secure bucket, facter, ohai...) + +--- + +## Disk space management: `docker system df` + +- Shows disk usage for images, containers, and volumes + +- Breaks down between *active* and *reclaimable* categories + +.exercise[ + +- Check how much disk space is used at the end of the workshop: + ```bash + docker system df + ``` + +] + +Note: `docker system` is new in Docker Engine 1.13. + +--- + +## Reclaiming unused resources: `docker system prune` + +- Removes stopped containers + +- Removes dangling images (that don't have a tag associated anymore) + +- Removes orphaned volumes + +- Removes empty networks + +.exercise[ + +- Try it: + ```bash + docker system prune -f + ``` + +] + +Note: `docker system prune -a` will also remove *unused* images. + +--- + +## Events + +- You can get a real-time stream of events with `docker events` + +- This will report *local events* and *cluster events* + +- Local events = +
+ all activity related to containers, images, plugins, volumes, networks, *on this node* + +- Cluster events = +
Swarm Mode activity related to services, nodes, secrets, configs, *on the whole cluster* + +- `docker events` doesn't report *local events happening on other nodes* + +- Events can be filtered (by type, target, labels...) + +- Events can be formatted with Go's `text/template` or in JSON + +--- + +## Getting *all the events* + +- There is no built-in to get a stream of *all the events* on *all the nodes* + +- This can be achieved with (for instance) the four following services working together: + + - a Redis container (used as a stateless, fan-in message queue) + + - a global service bind-mounting the Docker socket, pushing local events to the queue + + - a similar singleton service to push global events to the queue + + - a queue consumer fetching events and processing them as you please + +I'm not saying that you should implement it with Shell scripts, but you totally could. + +.small[ +(It might or might not be one of the initiating rites of the +[House of Bash](https://twitter.com/carmatrocity/status/676559402787282944)) +] + +For more information about event filters and types, check [the documentation](https://docs.docker.com/engine/reference/commandline/events/). diff --git a/docs/firstservice.md b/docs/firstservice.md new file mode 100644 index 00000000..aa8d2c1f --- /dev/null +++ b/docs/firstservice.md @@ -0,0 +1,474 @@ +# Running our first Swarm service + +- How do we run services? Simplified version: + + `docker run` → `docker service create` + +.exercise[ + +- Create a service featuring an Alpine container pinging Google resolvers: + ```bash + docker service create alpine ping 8.8.8.8 + ``` + +- Check the result: + ```bash + docker service ps + ``` + +] + +--- + +## `--detach` for service creation + +(New in Docker Engine 17.05) + +If you are running Docker 17.05 to 17.09, you will see the following message: + +``` +Since --detach=false was not specified, tasks will be created in the background. +In a future release, --detach=false will become the default. +``` + +You can ignore that for now; but we'll come back to it in just a few minutes! + +--- + +## Checking service logs + +(New in Docker Engine 17.05) + +- Just like `docker logs` shows the output of a specific local container ... + +- ... `docker service logs` shows the output of all the containers of a specific service + +.exercise[ + +- Check the output of our ping command: + ```bash + docker service logs + ``` + +] + +Flags `--follow` and `--tail` are available, as well as a few others. + +Note: by default, when a container is destroyed (e.g. when scaling down), its logs are lost. + +--- + +class: extra-details + +## Before Docker Engine 17.05 + +- Docker 1.13/17.03/17.04 have `docker service logs` as an experimental feature +
(available only when enabling the experimental feature flag) + +- We have to use `docker logs`, which only works on local containers + +- We will have to connect to the node running our container +
(unless it was scheduled locally, of course)

---

class: extra-details

## Looking up where our container is running

- The `docker service ps` command told us where our container was scheduled

.exercise[

- Look up the `NODE` on which the container is running:
  ```bash
  docker service ps <serviceID>
  ```

- If you use Play-With-Docker, switch to that node's tab, or set `DOCKER_HOST`

- Otherwise, `ssh` into that node or use `eval $(docker-machine env node...)`

]

---

class: extra-details

## Viewing the logs of the container

.exercise[

- See that the container is running and check its ID:
  ```bash
  docker ps
  ```

- View its logs:
  ```bash
  docker logs <containerID>
  ```

- Go back to `node1` afterwards

]

---

## Scale our service

- Services can be scaled in a pinch with the `docker service update` command

.exercise[

- Scale the service to ensure 2 copies per node:
  ```bash
  docker service update <serviceID> --replicas 10 --detach=true
  ```

- Check that we have two containers on the current node:
  ```bash
  docker ps
  ```

]

---

## View deployment progress

(New in Docker Engine 17.05)

- Commands that create/update/delete services can run with `--detach=false`

- The CLI will show the status of the command, and exit once it's done working

.exercise[

- Scale the service to ensure 3 copies per node:
  ```bash
  docker service update <serviceID> --replicas 15 --detach=false
  ```

]

Note: with Docker Engine 17.10 and later, `--detach=false` is the default.

With versions older than 17.05, you can use e.g.: `watch docker service ps <serviceID>`

---

## Expose a service

- Services can be exposed, with two special properties:

  - the public port is available on *every node of the Swarm*,

  - requests coming on the public port are load balanced across all instances.

- This is achieved with option `-p/--publish`; as an approximation:

  `docker run -p → docker service create -p`

- If you indicate a single port number, it will be mapped on a port
  starting at 30000
  <br/>
(vs. 32768 for single container mapping) + +- You can indicate two port numbers to set the public port number +
(just like with `docker run -p`) + +--- + +## Expose ElasticSearch on its default port + +.exercise[ + +- Create an ElasticSearch service (and give it a name while we're at it): + ```bash + docker service create --name search --publish 9200:9200 --replicas 7 \ + --detach=false elasticsearch`:2` + ``` + +] + +Note: don't forget the **:2**! + +The latest version of the ElasticSearch image won't start without mandatory configuration. + +--- + +## Tasks lifecycle + +- During the deployment, you will be able to see multiple states: + + - assigned (the task has been assigned to a specific node) + + - preparing (this mostly means "pulling the image") + + - starting + + - running + +- When a task is terminated (stopped, killed...) it cannot be restarted + + (A replacement task will be created) + +--- + +class: extra-details + +![diagram showing what happens during docker service create, courtesy of @aluzzardi](docker-service-create.svg) + +--- + +## Test our service + +- We mapped port 9200 on the nodes, to port 9200 in the containers + +- Let's try to reach that port! + +.exercise[ + +- Try the following command: + ```bash + curl localhost:9200 + ``` + +] + +(If you get `Connection refused`: congratulations, you are very fast indeed! Just try again.) + +ElasticSearch serves a little JSON document with some basic information +about this instance; including a randomly-generated super-hero name. + +--- + +## Test the load balancing + +- If we repeat our `curl` command multiple times, we will see different names + +.exercise[ + +- Send 10 requests, and see which instances serve them: + ```bash + for N in $(seq 1 10); do + curl -s localhost:9200 | jq .name + done + ``` + +] + +Note: if you don't have `jq` on your Play-With-Docker instance, just install it: +``` +apk add --no-cache jq +``` + +--- + +## Load balancing results + +Traffic is handled by our clusters [TCP routing mesh]( +https://docs.docker.com/engine/swarm/ingress/). + +Each request is served by one of the 7 instances, in rotation. + +Note: if you try to access the service from your browser, +you will probably see the same +instance name over and over, because your browser (unlike curl) will try +to re-use the same connection. + +--- + +## Under the hood of the TCP routing mesh + +- Load balancing is done by IPVS + +- IPVS is a high-performance, in-kernel load balancer + +- It's been around for a long time (merged in the kernel since 2.4) + +- Each node runs a local load balancer + + (Allowing connections to be routed directly to the destination, + without extra hops) + +--- + +## Managing inbound traffic + +There are many ways to deal with inbound traffic on a Swarm cluster. + +- Put all (or a subset) of your nodes in a DNS `A` record + +- Assign your nodes (or a subset) to an ELB + +- Use a virtual IP and make sure that it is assigned to an "alive" node + +- etc. 
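
One nice property of the routing mesh is that *any* node answers on the
published port, which makes all of these setups easy to health-check.
Here is a quick sanity check (just a sketch: it assumes the five-node
`node1` ... `node5` setup used in this workshop, and the `search` service
published earlier on port 9200):

```bash
# Every node should answer on the published port, even the nodes
# that are not running an ElasticSearch task themselves.
for NODE in node1 node2 node3 node4 node5; do
  echo -n "$NODE: "
  curl -s --max-time 2 http://$NODE:9200/ | jq -r .name
done
```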
+ +--- + +class: btw-labels + +## Managing HTTP traffic + +- The TCP routing mesh doesn't parse HTTP headers + +- If you want to place multiple HTTP services on port 80, you need something more + +- You can set up NGINX or HAProxy on port 80 to do the virtual host switching + +- Docker Universal Control Plane provides its own [HTTP routing mesh]( + https://docs.docker.com/datacenter/ucp/2.1/guides/admin/configure/use-domain-names-to-access-services/) + + - add a specific label starting with `com.docker.ucp.mesh.http` to your services + + - labels are detected automatically and dynamically update the configuration + +--- + +class: btw-labels + +## You should use labels + +- Labels are a great way to attach arbitrary information to services + +- Examples: + + - HTTP vhost of a web app or web service + + - backup schedule for a stateful service + + - owner of a service (for billing, paging...) + + - etc. + +--- + +## Pro-tip for ingress traffic management + +- It is possible to use *local* networks with Swarm services + +- This means that you can do something like this: + ```bash + docker service create --network host --mode global traefik ... + ``` + + (This runs the `traefik` load balancer on each node of your cluster, in the `host` network) + +- This gives you native performance (no iptables, no proxy, no nothing!) + +- The load balancer will "see" the clients' IP addresses + +- But: a container cannot simultaneously be in the `host` network and another network + + (You will have to route traffic to containers using exposed ports or UNIX sockets) + +--- + +class: extra-details + +## Using local networks (`host`, `macvlan` ...) with Swarm services + +- Using the `host` network is fairly straightforward + + (With the caveats described on the previous slide) + +- It is also possible to use drivers like `macvlan` + + - see [this guide]( +https://docs.docker.com/engine/userguide/networking/get-started-macvlan/ +) to get started on `macvlan` + + - see [this PR](https://github.com/moby/moby/pull/32981) for more information about local network drivers in Swarm mode + +--- + +## Visualize container placement + +- Let's leverage the Docker API! + +.exercise[ + +- Get the source code of this simple-yet-beautiful visualization app: + ```bash + cd ~ + git clone git://github.com/dockersamples/docker-swarm-visualizer + ``` + +- Build and run the Swarm visualizer: + ```bash + cd docker-swarm-visualizer + docker-compose up -d + ``` + +] + +--- + +## Connect to the visualization webapp + +- It runs a web server on port 8080 + +.exercise[ + +- Point your browser to port 8080 of your node1's public ip + + (If you use Play-With-Docker, click on the (8080) badge) + + + +] + +- The webapp updates the display automatically (you don't need to reload the page) + +- It only shows Swarm services (not standalone containers) + +- It shows when nodes go down + +- It has some glitches (it's not Carrier-Grade Enterprise-Compliant ISO-9001 software) + +--- + +## Why This Is More Important Than You Think + +- The visualizer accesses the Docker API *from within a container* + +- This is a common pattern: run container management tools *in containers* + +- Instead of viewing your cluster, this could take care of logging, metrics, autoscaling ... + +- We can run it within a service, too! We won't do it, but the command would look like: + + ```bash + docker service create \ + --mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \ + --name viz --constraint node.role==manager ... 
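  # The trailing "..." above stands for the rest of the options and the image.
  # As a purely hypothetical completion (the image name is an assumption,
  # not taken from this workshop), it could look like:
  #   --publish 8080:8080 dockersamples/visualizer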
+ ``` + +Credits: the visualization code was written by +[Francisco Miranda](https://github.com/maroshii). +
+[Mano Marks](https://twitter.com/manomarks) adapted +it to Swarm and maintains it. + +--- + +## Terminate our services + +- Before moving on, we will remove those services + +- `docker service rm` can accept multiple services names or IDs + +- `docker service ls` can accept the `-q` flag + +- A Shell snippet a day keeps the cruft away + +.exercise[ + +- Remove all services with this one liner: + ```bash + docker service ls -q | xargs docker service rm + ``` + +] diff --git a/docs/healthchecks.md b/docs/healthchecks.md new file mode 100644 index 00000000..dbdd6a57 --- /dev/null +++ b/docs/healthchecks.md @@ -0,0 +1,211 @@ +name: healthchecks + +# Health checks + +(New in Docker Engine 1.12) + +- Commands that are executed on regular intervals in a container + +- Must return 0 or 1 to indicate "all is good" or "something's wrong" + +- Must execute quickly (timeouts = failures) + +- Example: + ```bash + curl -f http://localhost/_ping || false + ``` + - the `-f` flag ensures that `curl` returns non-zero for 404 and similar errors + - `|| false` ensures that any non-zero exit status gets mapped to 1 + - `curl` must be installed in the container that is being checked + +--- + +## Defining health checks + +- In a Dockerfile, with the [HEALTHCHECK](https://docs.docker.com/engine/reference/builder/#healthcheck) instruction + ``` + HEALTHCHECK --interval=1s --timeout=3s CMD curl -f http://localhost/ || false + ``` + +- From the command line, when running containers or services + ``` + docker run --health-cmd "curl -f http://localhost/ || false" ... + docker service create --health-cmd "curl -f http://localhost/ || false" ... + ``` + +- In Compose files, with a per-service [healthcheck](https://docs.docker.com/compose/compose-file/#healthcheck) section + ```yaml + www: + image: hellowebapp + healthcheck: + test: "curl -f https://localhost/ || false" + timeout: 3s + ``` + +--- + +## Using health checks + +- With `docker run`, health checks are purely informative + + - `docker ps` shows health status + + - `docker inspect` has extra details (including health check command output) + +- With `docker service`: + + - unhealthy tasks are terminated (i.e. the service is restarted) + + - failed deployments can be rolled back automatically +
(by setting *at least* the flag `--update-failure-action rollback`) + +--- + +## Automated rollbacks + +Here is a comprehensive example using the CLI: + +```bash +docker service update \ + --update-delay 5s \ + --update-failure-action rollback \ + --update-max-failure-ratio .25 \ + --update-monitor 5s \ + --update-parallelism 1 \ + --rollback-delay 5s \ + --rollback-failure-action pause \ + --rollback-max-failure-ratio .5 \ + --rollback-monitor 5s \ + --rollback-parallelism 0 \ + --health-cmd "curl -f http://localhost/ || exit 1" \ + --health-interval 2s \ + --health-retries 1 \ + --image yourimage:newversion \ + yourservice +``` + +--- + +## Implementing auto-rollback in practice + +We will use the following Compose file (`stacks/dockercoins+healthcheck.yml`): + +```yaml +... + hasher: + build: dockercoins/hasher + image: ${REGISTRY-127.0.0.1:5000}/hasher:${TAG-latest} + deploy: + replicas: 7 + update_config: + delay: 5s + failure_action: rollback + max_failure_ratio: .5 + monitor: 5s + parallelism: 1 +... +``` + +--- + +## Enabling auto-rollback + +.exercise[ + +- Go to the `stacks` directory: + ```bash + cd ~/orchestration-workshop/stacks + ``` + +- Deploy the updated stack: + ```bash + docker stack deploy dockercoins --compose-file dockercoins+healthcheck.yml + ``` + +] + +This will also scale the `hasher` service to 7 instances. + +--- + +## Visualizing a rolling update + +First, let's make an "innocent" change and deploy it. + +.exercise[ + +- Update the `sleep` delay in the code: + ```bash + sed -i "s/sleep 0.1/sleep 0.2/" dockercoins/hasher/hasher.rb + ``` + +- Build, ship, and run the new image: + ```bash + export TAG=v0.5 + docker-compose -f dockercoins+healthcheck.yml build + docker-compose -f dockercoins+healthcheck.yml push + docker service update dockercoins_hasher \ + --detach=false --image=127.0.0.1:5000/hasher:$TAG + ``` + +] + +--- + +## Visualizing an automated rollback + +And now, a breaking change that will cause the health check to fail: + +.exercise[ + +- Change the HTTP listening port: + ```bash + sed -i "s/80/81/" dockercoins/hasher/hasher.rb + ``` + +- Build, ship, and run the new image: + ```bash + export TAG=v0.6 + docker-compose -f dockercoins+healthcheck.yml build + docker-compose -f dockercoins+healthcheck.yml push + docker service update dockercoins_hasher \ + --detach=false --image=127.0.0.1:5000/hasher:$TAG + ``` + +] + +--- + +## Command-line options available for health checks, rollbacks, etc. 
+ +Batteries included, but swappable + +.small[ +``` +--health-cmd string Command to run to check health +--health-interval duration Time between running the check (ms|s|m|h) +--health-retries int Consecutive failures needed to report unhealthy +--health-start-period duration Start period for the container to initialize before counting retries towards unstable (ms|s|m|h) +--health-timeout duration Maximum time to allow one check to run (ms|s|m|h) +--no-healthcheck Disable any container-specified HEALTHCHECK +--restart-condition string Restart when condition is met ("none"|"on-failure"|"any") +--restart-delay duration Delay between restart attempts (ns|us|ms|s|m|h) +--restart-max-attempts uint Maximum number of restarts before giving up +--restart-window duration Window used to evaluate the restart policy (ns|us|ms|s|m|h) +--rollback Rollback to previous specification +--rollback-delay duration Delay between task rollbacks (ns|us|ms|s|m|h) +--rollback-failure-action string Action on rollback failure ("pause"|"continue") +--rollback-max-failure-ratio float Failure rate to tolerate during a rollback +--rollback-monitor duration Duration after each task rollback to monitor for failure (ns|us|ms|s|m|h) +--rollback-order string Rollback order ("start-first"|"stop-first") +--rollback-parallelism uint Maximum number of tasks rolled back simultaneously (0 to roll back all at once) +--update-delay duration Delay between updates (ns|us|ms|s|m|h) +--update-failure-action string Action on update failure ("pause"|"continue"|"rollback") +--update-max-failure-ratio float Failure rate to tolerate during an update +--update-monitor duration Duration after each task update to monitor for failure (ns|us|ms|s|m|h) +--update-order string Update order ("start-first"|"stop-first") +--update-parallelism uint Maximum number of tasks updated simultaneously (0 to update all at once) +``` +] + +Yup ... That's a lot of batteries! diff --git a/docs/index.html b/docs/index.html deleted file mode 100644 index 49ae9475..00000000 --- a/docs/index.html +++ /dev/null @@ -1,8552 +0,0 @@ - - - - - Docker Orchestration Workshop - - - - - - - - - diff --git a/docs/intro-ks.md b/docs/intro-ks.md new file mode 100644 index 00000000..0241f037 --- /dev/null +++ b/docs/intro-ks.md @@ -0,0 +1,15 @@ +## About these slides + +- Your one-stop shop to awesomeness: + + http://container.training/ + +- The content that you're viewing right now is in a public GitHub repository: + + https://github.com/jpetazzo/orchestration-workshop + +- Typos? Mistakes? Questions? Feel free to hover over the bottom of the slide ... + +-- + +.footnote[👇 Try it! The source file will be shown and you can view it on GitHub and fork and edit it.] diff --git a/docs/intro.md b/docs/intro.md new file mode 100644 index 00000000..b2f206ba --- /dev/null +++ b/docs/intro.md @@ -0,0 +1,41 @@ +## A brief introduction + +- This was initially written to support in-person, + instructor-led workshops and tutorials + +- You can also follow along on your own, at your own pace + +- We included as much information as possible in these slides + +- We recommend having a mentor to help you ... + +- ... Or be comfortable spending some time reading the Docker + [documentation](https://docs.docker.com/) ... + +- ... 
And looking for answers in the [Docker forums](https://forums.docker.com/),
  [StackOverflow](http://stackoverflow.com/questions/tagged/docker),
  and other outlets

---

class: self-paced

## Hands on, you shall practice

- Nobody ever became a Jedi by spending their lives reading Wookieepedia

- Likewise, it will take more than merely *reading* these slides
  to make you an expert

- These slides include *tons* of exercises

- They assume that you have access to a cluster of Docker nodes

- If you are attending a workshop or tutorial:
  <br/>
you will be given specific instructions to access your cluster + +- If you are doing this on your own: +
you can use + [Play-With-Docker](http://www.play-with-docker.com/) and + read [these instructions](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker) for extra + details diff --git a/docs/ipsec.md b/docs/ipsec.md new file mode 100644 index 00000000..d1dfbef2 --- /dev/null +++ b/docs/ipsec.md @@ -0,0 +1,140 @@ +# Securing overlay networks + +- By default, overlay networks are using plain VXLAN encapsulation + + (~Ethernet over UDP, using SwarmKit's control plane for ARP resolution) + +- Encryption can be enabled on a per-network basis + + (It will use IPSEC encryption provided by the kernel, leveraging hardware acceleration) + +- This is only for the `overlay` driver + + (Other drivers/plugins will use different mechanisms) + +--- + +## Creating two networks: encrypted and not + +- Let's create two networks for testing purposes + +.exercise[ + +- Create an "insecure" network: + ```bash + docker network create insecure --driver overlay --attachable + ``` + +- Create a "secure" network: + ```bash + docker network create secure --opt encrypted --driver overlay --attachable + ``` + +] + +.warning[Make sure that you don't typo that option; errors are silently ignored!] + +--- + +## Deploying a web server sitting on both networks + +- Let's use good old NGINX + +- We will attach it to both networks + +- We will use a placement constraint to make sure that it is on a different node + +.exercise[ + +- Create a web server running somewhere else: + ```bash + docker service create --name web \ + --network secure --network insecure \ + --constraint node.hostname!=node1 \ + nginx + ``` + +] + +--- + +## Sniff HTTP traffic + +- We will use `ngrep`, which allows to grep for network traffic + +- We will run it in a container, using host networking to access the host's interfaces + +.exercise[ + +- Sniff network traffic and display all packets containing "HTTP": + ```bash + docker run --net host nicolaka/netshoot ngrep -tpd eth0 HTTP + ``` + +] + +-- + +Seeing tons of HTTP request? Shutdown your DockerCoins workers: +```bash +docker service update dockercoins_worker --replicas=0 +``` + +--- + +## Check that we are, indeed, sniffing traffic + +- Let's see if we can intercept our traffic with Google! + +.exercise[ + +- Open a new terminal + +- Issue an HTTP request to Google (or anything you like): + ```bash + curl google.com + ``` + +] + +The ngrep container will display one `#` per packet traversing the network interface. + +When you do the `curl`, you should see the HTTP request in clear text in the output. + +--- + +class: extra-details + +## If you are using Play-With-Docker, Vagrant, etc. + +- You will probably have *two* network interfaces + +- One interface will be used for outbound traffic (to Google) + +- The other one will be used for internode traffic + +- You might have to adapt/relaunch the `ngrep` command to specify the right one! + +--- + +## Try to sniff traffic across overlay networks + +- We will run `curl web` through both secure and insecure networks + +.exercise[ + +- Access the web server through the insecure network: + ```bash + docker run --rm --net insecure nicolaka/netshoot curl web + ``` + +- Now do the same through the secure network: + ```bash + docker run --rm --net secure nicolaka/netshoot curl web + ``` + +] + +When you run the first command, you will see HTTP fragments. +
+However, when you run the second one, only `#` will show up. diff --git a/docs/k8s-arch1.png b/docs/k8s-arch1.png new file mode 100644 index 00000000..6dfa0930 Binary files /dev/null and b/docs/k8s-arch1.png differ diff --git a/docs/k8s-arch2.png b/docs/k8s-arch2.png new file mode 100644 index 00000000..6bb3847e Binary files /dev/null and b/docs/k8s-arch2.png differ diff --git a/docs/kube.yml b/docs/kube.yml new file mode 100644 index 00000000..75c39f97 --- /dev/null +++ b/docs/kube.yml @@ -0,0 +1,106 @@ +exclude: +- self-paced +- snap + +chat: "[Gitter](https://gitter.im/jpetazzo/workshop-20171026-prague)" + +title: "Deploying and Scaling Microservices with Docker and Kubernetes" + +chapters: +- | + class: title + + .small[ + + Deploying and Scaling Microservices
with Docker and Kubernetes + + .small[.small[ + + **Be kind to the WiFi!** + + + *Don't use your hotspot* +
+ *Don't stream videos from YouTube, Netflix, etc. +
(if you're bored, watch local content instead)* + + Thank you! + + ] + ] + ] + + --- + + ## Intros + + - Hello! We are + Jérôme ([@jpetazzo](https://twitter.com/jpetazzo), Docker Inc.) + & + AJ ([@s0ulshake](https://twitter.com/s0ulshake), Travis CI) + + -- + + - This is our first time doing this + + -- + + - But ... containers and us go back a long way + + -- + + ![CONTAINERS, I TELL YOU](aj-containers.jpeg) + + -- + + - In the immortal words of [Chelsea Manning](https://twitter.com/xychelsea): #WeGotThis! + + --- + + ## Logistics + + - The tutorial will run from 9:00am to 12:15pm + + - There will be a coffee break at 10:30am +
+ (please remind me if I forget about it!) + + - This will be fast-paced, but DON'T PANIC! +
+ (all the content is publicly available) + + - Feel free to interrupt for questions at any time + + - Live feedback, questions, help on @@CHAT@@ + +- intro-ks.md +- | + @@TOC@@ +- - prereqs-k8s.md + - versions-k8s.md + - sampleapp.md +- - concepts-k8s.md + - kubenet.md + - kubectlget.md + - setup-k8s.md + - kubectlrun.md +- - kubectlexpose.md + - ourapponkube.md + - dashboard.md +- - kubectlscale.md + - daemonset.md + - rollout.md + - whatsnext.md +- | + class: title + + That's all folks!
Questions? + + .small[.small[ + + Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@docker](https://twitter.com/docker) + + ]] diff --git a/docs/kubectlexpose.md b/docs/kubectlexpose.md new file mode 100644 index 00000000..1010ca18 --- /dev/null +++ b/docs/kubectlexpose.md @@ -0,0 +1,140 @@ +# Exposing containers + +- `kubectl expose` creates a *service* for existing pods + +- A *service* is a stable address for a pod (or a bunch of pods) + +- If we want to connect to our pod(s), we need to create a *service* + +- Once a service is created, `kube-dns` will allow us to resolve it by name + + (i.e. after creating service `hello`, the name `hello` will resolve to something) + +- There are different types of services, detailed on the following slides: + + `ClusterIP`, `NodePort`, `LoadBalancer`, `ExternalName` + +--- + +## Basic service types + +- `ClusterIP` (default type) + + - a virtual IP address is allocated for the service (in an internal, private range) + - this IP address is reachable only from within the cluster (nodes and pods) + - our code can connect to the service using the original port number + +- `NodePort` + + - a port is allocated for the service (by default, in the 30000-32768 range) + - that port is made available *on all our nodes* and anybody can connect to it + - our code must be changed to connect to that new port number + +These service types are always available. + +Under the hood: `kube-proxy` is using a userland proxy and a bunch of `iptables` rules. + +--- + +## More service types + +- `LoadBalancer` + + - an external load balancer is allocated for the service + - the load balancer is configured accordingly +
(e.g.: a `NodePort` service is created, and the load balancer sends traffic to that port) + +- `ExternalName` + + - the DNS entry managed by `kube-dns` will just be a `CNAME` to a provided record + - no port, no IP address, no nothing else is allocated + +The `LoadBalancer` type is currently only available on AWS, Azure, and GCE. + +--- + +## Running containers with open ports + +- Since `ping` doesn't have anything to connect to, we'll have to run something else + +.exercise[ + +- Start a bunch of ElasticSearch containers: + ```bash + kubectl run elastic --image=elasticsearch:2 --replicas=7 + ``` + +- Watch them being started: + ```bash + kubectl get pods -w + ``` + + + +] + +The `-w` option "watches" events happening on the specified resources. + +Note: please DO NOT call the service `search`. It would collide with the TLD. + +--- + +## Exposing our deployment + +- We'll create a default `ClusterIP` service + +.exercise[ + +- Expose the ElasticSearch HTTP API port: + ```bash + kubectl expose deploy/elastic --port 9200 + ``` + +- Look up which IP address was allocated: + ```bash + kubectl get svc + ``` + +] + +--- + +## Services are layer 4 constructs + +- You can assign IP addresses to services, but they are still *layer 4* + + (i.e. a service is not an IP address; it's an IP address + protocol + port) + +- This is caused by the current implementation of `kube-proxy` + + (it relies on mechanisms that don't support layer 3) + +- As a result: you *have to* indicate the port number for your service + +- Running services with arbitrary port (or port ranges) requires hacks + + (e.g. host networking mode) + +--- + +## Testing our service + +- We will now send a few HTTP requests to our ElasticSearch pods + +.exercise[ + +- Let's obtain the IP address that was allocated for our service, *programatically:* + ```bash + IP=$(kubectl get svc elastic -o go-template --template '{{ .spec.clusterIP }}') + ``` + +- Send a few requests: + ```bash + curl http://$IP:9200/ + ``` + +] + +-- + +Our requests are load balanced across multiple pods. diff --git a/docs/kubectlget.md b/docs/kubectlget.md new file mode 100644 index 00000000..a84e64a0 --- /dev/null +++ b/docs/kubectlget.md @@ -0,0 +1,234 @@ +# First contact with `kubectl` + +- `kubectl` is (almost) the only tool we'll need to talk to Kubernetes + +- It is a rich CLI tool around the Kubernetes API + + (Everything you can do with `kubectl`, you can do directly with the API) + +- On our machines, there is a `~/.kube/config` file with: + + - the Kubernetes API address + + - the path to our TLS certificates used to authenticate + +- You can also use the `--kubeconfig` flag to pass a config file + +- Or directly `--server`, `--user`, etc. + +- `kubectl` can be pronounced "Cube C T L", "Cube cuttle", "Cube cuddle"... + +--- + +## `kubectl get` + +- Let's look at our `Node` resources with `kubectl get`! + +.exercise[ + +- Look at the composition of our cluster: + ```bash + kubectl get node + ``` + +- These commands are equivalent: + ```bash + kubectl get no + kubectl get node + kubectl get nodes + ``` + +] + +--- + +## From human-readable to machine-readable output + +- `kubectl get` can output JSON, YAML, or be directly formatted + +.exercise[ + +- Give us more info about them nodes: + ```bash + kubectl get nodes -o wide + ``` + +- Let's have some YAML: + ```bash + kubectl get no -o yaml + ``` + See that `kind: List` at the end? It's the type of our result! 
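
- If you only need a few specific fields, `-o jsonpath` can extract just those
  (a quick extra example, not part of the original exercise):
  ```bash
  kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
  ```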
+ +] + +--- + +## (Ab)using `kubectl` and `jq` + +- It's super easy to build custom reports + +.exercise[ + +- Show the capacity of all our nodes as a stream of JSON objects: + ```bash + kubectl get nodes -o json | + jq ".items[] | {name:.metadata.name} + .status.capacity" + ``` + +] + +--- + +## What's available? + +- `kubectl` has pretty good introspection facilities + +- We can list all available resource types by running `kubectl get` + +- We can view details about a resource with: + ```bash + kubectl describe type/name + kubectl describe type name + ``` + +- We can view the definition for a resource type with: + ```bash + kubectl explain type + ``` + +Each time, `type` can be singular, plural, or abbreviated type name. + +--- + +## Services + +- A *service* is a stable endpoint to connect to "something" + + (In the initial proposal, they were called "portals") + +.exercise[ + +- List the services on our cluster with one of these commands: + ```bash + kubectl get services + kubectl get svc + ``` + +] + +-- + +There is already one service on our cluster: the Kubernetes API itself. + +--- + +## ClusterIP services + +- A `ClusterIP` service is internal, available from the cluster only + +- This is useful for introspection from within containers + +.exercise[ + +- Try to connect to the API: + ```bash + curl -k https://`10.96.0.1` + ``` + + - `-k` is used to skip certificate verification + - Make sure to replace 10.96.0.1 with the CLUSTER-IP shown earlier + +] + +-- + +The error that we see is expected: the Kubernetes API requires authentication. + +--- + +## Listing running containers + +- Containers are manipulated through *pods* + +- A pod is a group of containers: + + - running together (on the same node) + + - sharing resources (RAM, CPU; but also network, volumes) + +.exercise[ + +- List pods on our cluster: + ```bash + kubectl get pods + ``` + +] + +-- + +*These are not the pods you're looking for.* But where are they?!? + +--- + +## Namespaces + +- Namespaces allow to segregate resources + +.exercise[ + +- List the namespaces on our cluster with one of these commands: + ```bash + kubectl get namespaces + kubectl get namespace + kubectl get ns + ``` + +] + +-- + +*You know what ... This `kube-system` thing looks suspicious.* + +--- + +## Accessing namespaces + +- By default, `kubectl` uses the `default` namespace + +- We can switch to a different namespace with the `-n` option + +.exercise[ + +- List the pods in the `kube-system` namespace: + ```bash + kubectl -n kube-system get pods + ``` + +] + +-- + +*Ding ding ding ding ding!* + +--- + +## What are all these pods? + +- `etcd` is our etcd server + +- `kube-apiserver` is the API server + +- `kube-controller-manager` and `kube-scheduler` are other master components + +- `kube-dns` is an additional component (not mandatory but super useful, so it's there) + +- `kube-proxy` is the (per-node) component managing port mappings and such + +- `weave` is the (per-node) component managing the network overlay + +- the `READY` column indicates the number of containers in each pod + +- the pods with a name ending with `-node1` are the master components +
+ (they have been specifically "pinned" to the master node) diff --git a/docs/kubectlrun.md b/docs/kubectlrun.md new file mode 100644 index 00000000..405c0057 --- /dev/null +++ b/docs/kubectlrun.md @@ -0,0 +1,249 @@ +# Running our first containers on Kubernetes + +- First things first: we cannot run a container + +-- + +- We are going to run a pod, and in that pod there will be a single container + +-- + +- In that container in the pod, we are going to run a simple `ping` command + +- Then we are going to start additional copies of the pod + +--- + +## Starting a simple pod with `kubectl run` + +- We need to specify at least a *name* and the image we want to use + +.exercise[ + +- Let's ping `goo.gl`: + ```bash + kubectl run pingpong --image alpine ping goo.gl + ``` + +] + +-- + +OK, what just happened? + +--- + +## Behind the scenes of `kubectl run` + +- Let's look at the resources that were created by `kubectl run` + +.exercise[ + +- List most resource types: + ```bash + kubectl get all + ``` + +] + +-- + +We should see the following things: +- `deploy/pingpong` (the *deployment* that we just created) +- `rs/pingpong-xxxx` (a *replica set* created by the deployment) +- `po/pingpong-yyyy` (a *pod* created by the replica set) + +--- + +## Deployments, replica sets, and replication controllers + +- A *deployment* is a high-level construct + + - allows scaling, rolling updates, rollbacks + + - multiple deployments can be used together to implement a + [canary deployment](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#canary-deployments) + + - delegates pods management to *replica sets* + +- A *replica set* is a low-level construct + + - makes sure that a given number of identical pods are running + + - allows scaling + + - rarely used directly + +- A *replication controller* is the (deprecated) predecessor of a replica set + +--- + +## Our `pingpong` deployment + +- `kubectl run` created a *deployment*, `deploy/pingpong` + +- That deployment created a *replica set*, `rs/pingpong-xxxx` + +- That replica set created a *pod*, `po/pingpong-yyyy` + +- We'll see later how these folks play together for: + + - scaling + + - high availability + + - rolling updates + +--- + +## Viewing container output + +- Let's use the `kubectl logs` command + +- We will pass either a *pod name*, or a *type/name* + + (E.g. if we specify a deployment or replica set, it will get the first pod in it) + +- Unless specified otherwise, it will only show logs of the first container in the pod + + (Good thing there's only one in ours!) + +.exercise[ + +- View the result of our `ping` command: + ```bash + kubectl logs deploy/pingpong + ``` + +] + +--- + +## Streaming logs in real time + +- Just like `docker logs`, `kubectl logs` supports convenient options: + + - `-f`/`--follow` to stream logs in real time (à la `tail -f`) + + - `--tail` to indicate how many lines you want to see (from the end) + + - `--since` to get logs only after a given timestamp + +.exercise[ + +- View the latest logs of our `ping` command: + ```bash + kubectl logs deploy/pingpong --tail 1 --follow + ``` + + + +] + +--- + +## Scaling our application + +- We can create additional copies of our container (I mean, our pod) with `kubectl scale` + +.exercise[ + +- Scale our `pingpong` deployment: + ```bash + kubectl scale deploy/pingpong --replicas 8 + ``` + +] + +Note: what if we tried to scale `rs/pingpong-xxxx`? + +We could! But the *deployment* would notice it right away, and scale back to the initial level. 
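
If you want to see that reconciliation with your own eyes, here is a quick
sketch (the `xxxx` suffix is a placeholder; look up the real name with
`kubectl get rs`):

```bash
# Scale the replica set directly, behind the deployment's back ...
kubectl scale rs/pingpong-xxxx --replicas=3

# ... then watch the deployment controller bring it back to 8 replicas:
kubectl get pods -w
```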
+ +--- + +## Resilience + +- The *deployment* `pingpong` watches its *replica set* + +- The *replica set* ensures that the right number of *pods* are running + +- What happens if pods disappear? + +.exercise[ + +- In a separate window, list pods, and keep watching them: + ```bash + kubectl get pods -w + ``` + + + +- Destroy a pod: + ```bash + kubectl delete pod pingpong-yyyy + ``` +] + +--- + +## What if we wanted something different? + +- What if we wanted to start a "one-shot" container that *doesn't* get restarted? + +- We could use `kubectl run --restart=OnFailure` or `kubectl run --restart=Never` + +- These commands would create *jobs* or *pods* instead of *deployments* + +- Under the hood, `kubectl run` invokes "generators" to create resource descriptions + +- We could also write these resource descriptions ourselves (typically in YAML), +
and create them on the cluster with `kubectl apply -f` (discussed later) + +- With `kubectl run --schedule=...`, we can also create *cronjobs* + +--- + +## Viewing logs of multiple pods + +- When we specify a deployment name, only one single pod's logs are shown + +- We can view the logs of multiple pods by specifying a *selector* + +- A selector is a logic expression using *labels* + +- Conveniently, when you `kubectl run somename`, the associated objects have a `run=somename` label + +.exercise[ + +- View the last line of log from all pods with the `run=pingpong` label: + ```bash + kubectl logs -l run=pingpong --tail 1 + ``` + +] + +Unfortunately, `--follow` cannot (yet) be used to stream the logs from multiple containers. + +--- + +class: title + +.small[ +Meanwhile, at the Google NOC ... + +.small[ +Why the hell +
+are we getting 1000 packets per second +
+of ICMP ECHO traffic from EC2 ?!? +] +] diff --git a/docs/kubectlscale.md b/docs/kubectlscale.md new file mode 100644 index 00000000..d3a82786 --- /dev/null +++ b/docs/kubectlscale.md @@ -0,0 +1,24 @@ +# Scaling a deployment + +- We will start with an easy one: the `worker` deployment + +.exercise[ + +- Open two new terminals to check what's going on with pods and deployments: + ```bash + kubectl get pods -w + kubectl get deployments -w + ``` + + + +- Now, create more `worker` replicas: + ```bash + kubectl scale deploy/worker --replicas=10 + ``` + +] + +After a few seconds, the graph in the web UI should show up. +
+(And peak at 10 hashes/second, just like when we were running on a single one.) \ No newline at end of file diff --git a/docs/kubenet.md b/docs/kubenet.md new file mode 100644 index 00000000..504a69b3 --- /dev/null +++ b/docs/kubenet.md @@ -0,0 +1,81 @@ +# Kubernetes network model + +- TL,DR: + + *Our cluster (nodes and pods) is one big flat IP network.* + +-- + +- In detail: + + - all nodes must be able to reach each other, without NAT + + - all pods must be able to reach each other, without NAT + + - pods and nodes must be able to reach each other, without NAT + + - each pod is aware of its IP address (no NAT) + +- Kubernetes doesn't mandate any particular implementation + +--- + +## Kubernetes network model: the good + +- Everything can reach everything + +- No address translation + +- No port translation + +- No new protocol + +- Pods cannot move from a node to another and keep their IP address + +- IP addresses don't have to be "portable" from a node to another + + (We can use e.g. a subnet per node and use a simple routed topology) + +- The specification is simple enough to allow many various implementations + +--- + +## Kubernetes network model: the bad and the ugly + +- Everything can reach everything + + - if you want security, you need to add network policies + + - the network implementation that you use needs to support them + +- There are literally dozens of implementations out there + + (15 are listed in the Kubernetes documentation) + +- It *looks like* you have a level 3 network, but it's only level 4 + + (The spec requires UDP and TCP, but not port ranges or arbitrary IP packets) + +- `kube-proxy` is on the data path when connecting to a pod or container, +
and it's not particularly fast (relies on userland proxying or iptables) + +--- + +## Kubernetes network model: in practice + +- The nodes that we are using have been set up to use Weave + +- We don't endorse Weave in a particular way, it just Works For Us + +- Don't worry about the warning about `kube-proxy` performance + +- Unless you: + + - routinely saturate 10G network interfaces + + - count packet rates in millions per second + + - run high-traffic VOIP or gaming platforms + + - do weird things that involve millions of simultaneous connections +
(in which case you're already familiar with kernel tuning) diff --git a/docs/leastprivilege.md b/docs/leastprivilege.md new file mode 100644 index 00000000..9bdc6a07 --- /dev/null +++ b/docs/leastprivilege.md @@ -0,0 +1,60 @@ +# Least privilege model + +- All the important data is stored in the "Raft log" + +- Managers nodes have read/write access to this data + +- Workers nodes have no access to this data + +- Workers only receive the minimum amount of data that they need: + + - which services to run + - network configuration information for these services + - credentials for these services + +- Compromising a worker node does not give access to the full cluster + +--- + +## What can I do if I compromise a worker node? + +- I can enter the containers running on that node + +- I can access the configuration and credentials used by these containers + +- I can inspect the network traffic of these containers + +- I cannot inspect or disrupt the network traffic of other containers + + (network information is provided by manager nodes; ARP spoofing is not possible) + +- I cannot infer the topology of the cluster and its number of nodes + +- I can only learn the IP addresses of the manager nodes + +--- + +## Guidelines for workload isolation leveraging least privilege model + +- Define security levels + +- Define security zones + +- Put managers in the highest security zone + +- Enforce workloads of a given security level to run in a given zone + +- Enforcement can be done with [Authorization Plugins](https://docs.docker.com/engine/extend/plugins_authorization/) + +--- + +## Learning more about container security + +.blackbelt[DC17US: Securing Containers, One Patch At A Time +([video](https://www.youtube.com/watch?v=jZSs1RHwcqo&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=4))] + +.blackbelt[DC17EU: Container-relevant Upstream Kernel Developments +([video](https://dockercon.docker.com/watch/7JQBpvHJwjdW6FKXvMfCK1))] + +.blackbelt[DC17EU: What Have Syscalls Done for you Lately? +([video](https://dockercon.docker.com/watch/4ZxNyWuwk9JHSxZxgBBi6J))] diff --git a/docs/lisa.yml b/docs/lisa.yml new file mode 100644 index 00000000..d4e381ea --- /dev/null +++ b/docs/lisa.yml @@ -0,0 +1,136 @@ +title: "LISA17 T9: Build, Ship, and Run Microservices on a Docker Swarm Cluster" + +chat: "[Gitter](https://gitter.im/jpetazzo/workshop-20171031-sanfrancisco)" + + +exclude: +- self-paced +- snap +- auto-btp +- benchmarking +- elk-manual +- prom-manual + +chapters: +- | + class: title + + .small[ + + LISA17 T9 + + Build, Ship, and Run Microservices on a Docker Swarm Cluster + + .small[.small[ + + **Be kind to the WiFi!** + + *Use the 5G network* +
+ *Don't use your hotspot* +
+ *Don't stream videos from YouTube, Netflix, etc. +
(if you're bored, watch local content instead)* + + + + Thank you! + + ] + ] + ] + + --- + + ## Intros + + - Hello! We are + AJ ([@s0ulshake](https://twitter.com/s0ulshake), Travis CI) + & + Jérôme ([@jpetazzo](https://twitter.com/jpetazzo), Docker Inc.) + + -- + + - This is our collective Docker knowledge: + + ![Bell Curve](bell-curve.jpg) + + --- + + ## Logistics + + - The tutorial will run from 1:30pm to 5:00pm + + - This will be fast-paced, but DON'T PANIC! + + - There will be a coffee break at 3:00pm +
+ (please remind us if we forget about it!) + + - Feel free to interrupt for questions at any time + + - All the content is publicly available (slides, code samples, scripts) + + One URL to remember: http://container.training + + - Live feedback, questions, help on @@CHAT@@ + +- intro.md +- | + @@TOC@@ +- - prereqs.md + - versions.md + - | + class: title + + All right! +
+ We're all set. +
+ Let's do this. + - sampleapp.md + - swarmkit.md + - creatingswarm.md + - morenodes.md +- - firstservice.md + - ourapponswarm.md + - updatingservices.md + #- rollingupdates.md + #- healthchecks.md +- - operatingswarm.md + #- netshoot.md + #- ipsec.md + #- swarmtools.md + - security.md + #- secrets.md + #- encryptionatrest.md + - leastprivilege.md + - apiscope.md + - logging.md + - metrics.md + #- stateful.md + #- extratips.md + - end.md +- | + class: title + + That's all folks!
Questions? + + .small[.small[ + + AJ ([@s0ulshake](https://twitter.com/s0ulshake)) — [@TravisCI](https://twitter.com/travisci) + + Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@Docker](https://twitter.com/docker) + + ]] + + diff --git a/docs/logging.md b/docs/logging.md new file mode 100644 index 00000000..0748b853 --- /dev/null +++ b/docs/logging.md @@ -0,0 +1,420 @@ +name: logging + +# Centralized logging + +- We want to send all our container logs to a central place + +- If that place could offer a nice web dashboard too, that'd be nice + +-- + +- We are going to deploy an ELK stack + +- It will accept logs over a GELF socket + +- We will update our services to send logs through the GELF logging driver + +--- + +# Setting up ELK to store container logs + +*Important foreword: this is not an "official" or "recommended" +setup; it is just an example. We used ELK in this demo because +it's a popular setup and we keep being asked about it; but you +will have equal success with Fluent or other logging stacks!* + +What we will do: + +- Spin up an ELK stack with services + +- Gaze at the spiffy Kibana web UI + +- Manually send a few log entries using one-shot containers + +- Set our containers up to send their logs to Logstash + +--- + +## What's in an ELK stack? + +- ELK is three components: + + - ElasticSearch (to store and index log entries) + + - Logstash (to receive log entries from various + sources, process them, and forward them to various + destinations) + + - Kibana (to view/search log entries with a nice UI) + +- The only component that we will configure is Logstash + +- We will accept log entries using the GELF protocol + +- Log entries will be stored in ElasticSearch, +
and displayed on Logstash's stdout for debugging + +--- + +class: elk-manual + +## Setting up ELK + +- We need three containers: ElasticSearch, Logstash, Kibana + +- We will place them on a common network, `logging` + +.exercise[ + +- Create the network: + ```bash + docker network create --driver overlay logging + ``` + +- Create the ElasticSearch service: + ```bash + docker service create --network logging --name elasticsearch elasticsearch:2.4 + ``` + +] + +--- + +class: elk-manual + +## Setting up Kibana + +- Kibana exposes the web UI + +- Its default port (5601) needs to be published + +- It needs a tiny bit of configuration: the address of the ElasticSearch service + +- We don't want Kibana logs to show up in Kibana (it would create clutter) +
so we tell Logspout to ignore them + +.exercise[ + +- Create the Kibana service: + ```bash + docker service create --network logging --name kibana --publish 5601:5601 \ + -e ELASTICSEARCH_URL=http://elasticsearch:9200 kibana:4.6 + ``` + +] + +--- + +class: elk-manual + +## Setting up Logstash + +- Logstash needs some configuration to listen to GELF messages and send them to ElasticSearch + +- We could author a custom image bundling this configuration + +- We can also pass the [configuration](https://github.com/jpetazzo/orchestration-workshop/blob/master/elk/logstash.conf) on the command line + +.exercise[ + +- Create the Logstash service: + ```bash + docker service create --network logging --name logstash -p 12201:12201/udp \ + logstash:2.4 -e "$(cat ~/orchestration-workshop/elk/logstash.conf)" + ``` + +] + +--- + +class: elk-manual + +## Checking Logstash + +- Before proceeding, let's make sure that Logstash started properly + +.exercise[ + +- Lookup the node running the Logstash container: + ```bash + docker service ps logstash + ``` + +- Connect to that node + +] + +--- + +class: elk-manual + +## View Logstash logs + +.exercise[ + +- View the logs of the logstash service: + ```bash + docker service logs logstash --follow + ``` + + + + +] + +You should see the heartbeat messages: +.small[ +```json +{ "message" => "ok", + "host" => "1a4cfb063d13", + "@version" => "1", + "@timestamp" => "2016-06-19T00:45:45.273Z" +} +``` +] + +--- + +class: elk-auto + +## Deploying our ELK cluster + +- We will use a stack file + +.exercise[ + +- Build, ship, and run our ELK stack: + ```bash + docker-compose -f elk.yml build + docker-compose -f elk.yml push + docker stack deploy elk -c elk.yml + ``` + +] + +Note: the *build* and *push* steps are not strictly necessary, but they don't hurt! + +Let's have a look at the [Compose file]( +https://github.com/jpetazzo/orchestration-workshop/blob/master/stacks/elk.yml). + +--- + +class: elk-auto + +## Checking that our ELK stack works correctly + +- Let's view the logs of logstash + + (Who logs the loggers?) + +.exercise[ + +- Stream logstash's logs: + ```bash + docker service logs --follow --tail 1 elk_logstash + ``` + +] + +You should see the heartbeat messages: + +.small[ +```json +{ "message" => "ok", + "host" => "1a4cfb063d13", + "@version" => "1", + "@timestamp" => "2016-06-19T00:45:45.273Z" +} +``` +] + +--- + +## Testing the GELF receiver + +- In a new window, we will generate a logging message + +- We will use a one-off container, and Docker's GELF logging driver + +.exercise[ + +- Send a test message: + ```bash + docker run --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \ + --rm alpine echo hello + ``` +] + +The test message should show up in the logstash container logs. + +--- + +## Sending logs from a service + +- We were sending from a "classic" container so far; let's send logs from a service instead + +- We're lucky: the parameters (`--log-driver` and `--log-opt`) are exactly the same! + + +.exercise[ + +- Send a test message: + ```bash + docker service create \ + --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \ + alpine echo hello + ``` + + + + +] + +The test message should show up as well in the logstash container logs. 
+ +-- + +In fact, *multiple messages will show up, and continue to show up every few seconds!* + +--- + +## Restart conditions + +- By default, if a container exits (or is killed with `docker kill`, or runs out of memory ...), + the Swarm will restart it (possibly on a different machine) + +- This behavior can be changed by setting the *restart condition* parameter + +.exercise[ + +- Change the restart condition so that Swarm doesn't try to restart our container forever: + ```bash + docker service update `xxx` --restart-condition none + ``` +] + +Available restart conditions are `none`, `any`, and `on-error`. + +You can also set `--restart-delay`, `--restart-max-attempts`, and `--restart-window`. + +--- + +## Connect to Kibana + +- The Kibana web UI is exposed on cluster port 5601 + +.exercise[ + +- Connect to port 5601 of your cluster + + - if you're using Play-With-Docker, click on the (5601) badge above the terminal + + - otherwise, open http://(any-node-address):5601/ with your browser + +] + +--- + +## "Configuring" Kibana + +- If you see a status page with a yellow item, wait a minute and reload + (Kibana is probably still initializing) + +- Kibana should offer you to "Configure an index pattern": +
in the "Time-field name" drop down, select "@timestamp", and hit the + "Create" button + +- Then: + + - click "Discover" (in the top-left corner) + - click "Last 15 minutes" (in the top-right corner) + - click "Last 1 hour" (in the list in the middle) + - click "Auto-refresh" (top-right corner) + - click "5 seconds" (top-left of the list) + +- You should see a series of green bars (with one new green bar every minute) + +--- + +## Updating our services to use GELF + +- We will now inform our Swarm to add GELF logging to all our services + +- This is done with the `docker service update` command + +- The logging flags are the same as before + +.exercise[ + +- Enable GELF logging for the `rng` service: + ```bash + docker service update dockercoins_rng \ + --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 + ``` + +] + +After ~15 seconds, you should see the log messages in Kibana. + +--- + +## Viewing container logs + +- Go back to Kibana + +- Container logs should be showing up! + +- We can customize the web UI to be more readable + +.exercise[ + +- In the left column, move the mouse over the following + columns, and click the "Add" button that appears: + + - host + - container_name + - message + + + +] + +--- + +## .warning[Don't update stateful services!] + +- What would have happened if we had updated the Redis service? + +- When a service changes, SwarmKit replaces existing container with new ones + +- This is fine for stateless services + +- But if you update a stateful service, its data will be lost in the process + +- If we updated our Redis service, all our DockerCoins would be lost + +--- + +## Important afterword + +**This is not a "production-grade" setup.** + +It is just an educational example. We did set up a single +ElasticSearch instance and a single Logstash instance. + +In a production setup, you need an ElasticSearch cluster +(both for capacity and availability reasons). You also +need multiple Logstash instances. + +And if you want to withstand +bursts of logs, you need some kind of message queue: +Redis if you're cheap, Kafka if you want to make sure +that you don't drop messages on the floor. Good luck. + +If you want to learn more about the GELF driver, +have a look at [this blog post]( +http://jpetazzo.github.io/2017/01/20/docker-logging-gelf/). diff --git a/docs/loop.sh b/docs/loop.sh new file mode 100755 index 00000000..f724821a --- /dev/null +++ b/docs/loop.sh @@ -0,0 +1,5 @@ +#!/bin/sh +while true; do + find . | + entr -d . sh -c "DEBUG=1 ./markmaker.py < kube.yml > workshop.md" +done diff --git a/docs/machine.md b/docs/machine.md new file mode 100644 index 00000000..967629dd --- /dev/null +++ b/docs/machine.md @@ -0,0 +1,225 @@ +## Adding nodes using the Docker API + +- We don't have to SSH into the other nodes, we can use the Docker API + +- If you are using Play-With-Docker: + + - the nodes expose the Docker API over port 2375/tcp, without authentication + + - we will connect by setting the `DOCKER_HOST` environment variable + +- Otherwise: + + - the nodes expose the Docker API over port 2376/tcp, with TLS mutual authentication + + - we will use Docker Machine to set the correct environment variables +
(the nodes have been suitably pre-configured to be controlled through `node1`) + +--- + +# Docker Machine + +- Docker Machine has two primary uses: + + - provisioning cloud instances running the Docker Engine + + - managing local Docker VMs within e.g. VirtualBox + +- Docker Machine is purely optional + +- It makes it easy to create, upgrade, manage... Docker hosts: + + - on your favorite cloud provider + + - locally (e.g. to test clustering, or different versions) + + - across different cloud providers + +--- + +class: self-paced + +## If you're using Play-With-Docker ... + +- You won't need to use Docker Machine + +- Instead, to "talk" to another node, we'll just set `DOCKER_HOST` + +- You can skip the exercises telling you to do things with Docker Machine! + +--- + +## Docker Machine basic usage + +- We will learn two commands: + + - `docker-machine ls` (list existing hosts) + + - `docker-machine env` (switch to a specific host) + +.exercise[ + +- List configured hosts: + ```bash + docker-machine ls + ``` + +] + +You should see your 5 nodes. + +--- + +class: in-person + +## How did we make our 5 nodes show up there? + +*For the curious...* + +- This was done by our VM provisioning scripts + +- After setting up everything else, `node1` adds the 5 nodes + to the local Docker Machine configuration + (located in `$HOME/.docker/machine`) + +- Nodes are added using [Docker Machine generic driver](https://docs.docker.com/machine/drivers/generic/) + + (It skips machine provisioning and jumps straight to the configuration phase) + +- Docker Machine creates TLS certificates and deploys them to the nodes through SSH + +--- + +## Using Docker Machine to communicate with a node + +- To select a node, use `eval $(docker-machine env nodeX)` + +- This sets a number of environment variables + +- To unset these variables, use `eval $(docker-machine env -u)` + +.exercise[ + +- View the variables used by Docker Machine: + ```bash + docker-machine env node3 + ``` + +] + +(This shows which variables *would* be set by Docker Machine; but it doesn't change them.) + +--- + +## Getting the token + +- First, let's store the join token in a variable + +- This must be done from a manager + +.exercise[ + +- Make sure we talk to the local node, or `node1`: + ```bash + eval $(docker-machine env -u) + ``` + +- Get the join token: + ```bash + TOKEN=$(docker swarm join-token -q worker) + ``` + +] + +--- + +## Change the node targeted by the Docker CLI + +- We need to set the right environment variables to communicate with `node3` + +.exercise[ + +- If you're using Play-With-Docker: + ```bash + export DOCKER_HOST=tcp://node3:2375 + ``` + +- Otherwise, use Docker Machine: + ```bash + eval $(docker-machine env node3) + ``` + +] + +--- + +## Checking which node we're talking to + +- Let's use the Docker API to ask "who are you?" to the remote node + +.exercise[ + +- Extract the node name from the output of `docker info`: + ```bash + docker info | grep ^Name + ``` + +] + +This should tell us that we are talking to `node3`. + +Note: it can be useful to use a [custom shell prompt]( +https://github.com/jpetazzo/orchestration-workshop/blob/master/prepare-vms/scripts/postprep.rc#L68) +reflecting the `DOCKER_HOST` variable. 
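
Here is a minimal sketch of such a prompt (for bash, which expands `$PS1` at display
time by default; the workshop VMs ship their own, fancier version, linked above):

```bash
# Prepend the current DOCKER_HOST (if any) to the bash prompt
export PS1='${DOCKER_HOST:+[$DOCKER_HOST] }\u@\h:\w\$ '
```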
+ +--- + +## Adding a node through the Docker API + +- We are going to use the same `docker swarm join` command as before + +.exercise[ + +- Add `node3` to the Swarm: + ```bash + docker swarm join --token $TOKEN node1:2377 + ``` + +] + +--- + +## Going back to the local node + +- We need to revert the environment variable(s) that we had set previously + +.exercise[ + +- If you're using Play-With-Docker, just clear `DOCKER_HOST`: + ```bash + unset DOCKER_HOST + ``` + +- Otherwise, use Docker Machine to reset all the relevant variables: + ```bash + eval $(docker-machine env -u) + ``` + +] + +From that point, we are communicating with `node1` again. + +--- + +## Checking the composition of our cluster + +- Now that we're talking to `node1` again, we can use management commands + +.exercise[ + +- Check that the node is here: + ```bash + docker node ls + ``` + +] diff --git a/docs/markmaker.py b/docs/markmaker.py new file mode 100755 index 00000000..f8964b6c --- /dev/null +++ b/docs/markmaker.py @@ -0,0 +1,168 @@ +#!/usr/bin/env python +# transforms a YAML manifest into a HTML workshop file + +import glob +import logging +import os +import re +import string +import subprocess +import sys +import yaml + + +logging.basicConfig(level=os.environ.get("LOG_LEVEL", "INFO")) + + +class InvalidChapter(ValueError): + + def __init__(self, chapter): + ValueError.__init__(self, "Invalid chapter: {!r}".format(chapter)) + + +def anchor(title): + title = title.lower().replace(' ', '-') + title = ''.join(c for c in title if c in string.ascii_letters+'-') + return "toc-" + title + + +def insertslide(markdown, title): + title_position = markdown.find("\n# {}\n".format(title)) + slide_position = markdown.rfind("\n---\n", 0, title_position+1) + logging.debug("Inserting title slide at position {}: {}".format(slide_position, title)) + + before = markdown[:slide_position] + + extra_slide = """ +--- + +name: {anchor} +class: title + +{title} + +.nav[[Back to table of contents](#{toclink})] + +.debug[(automatically generated title slide)] +""".format(anchor=anchor(title), title=title, toclink=title2chapter[title]) + after = markdown[slide_position:] + return before + extra_slide + after + + +def flatten(titles): + for title in titles: + if isinstance(title, list): + for t in flatten(title): + yield t + else: + yield title + + +def generatefromyaml(manifest): + manifest = yaml.load(manifest) + + markdown, titles = processchapter(manifest["chapters"], "(inline)") + logging.debug("Found {} titles.".format(len(titles))) + toc = gentoc(titles) + markdown = markdown.replace("@@TOC@@", toc) + for title in flatten(titles): + markdown = insertslide(markdown, title) + + exclude = manifest.get("exclude", []) + logging.debug("exclude={!r}".format(exclude)) + if not exclude: + logging.warning("'exclude' is empty.") + exclude = ",".join('"{}"'.format(c) for c in exclude) + + html = open("workshop.html").read() + html = html.replace("@@MARKDOWN@@", markdown) + html = html.replace("@@EXCLUDE@@", exclude) + html = html.replace("@@CHAT@@", manifest["chat"]) + html = html.replace("@@TITLE@@", manifest["title"]) + return html + + +title2chapter = {} + + +def gentoc(titles, depth=0, chapter=0): + if not titles: + return "" + if isinstance(titles, str): + title2chapter[titles] = "toc-chapter-1" + logging.debug("Chapter {} Title {}".format(chapter, titles)) + return " "*(depth-2) + "- [{}](#{})\n".format(titles, anchor(titles)) + if isinstance(titles, list): + if depth==0: + sep = "\n\n.debug[(auto-generated TOC)]\n---\n\n" + head = "" + tail 
= "" + elif depth==1: + sep = "\n" + head = "name: toc-chapter-{}\n\n## Chapter {}\n\n".format(chapter, chapter) + tail = "" + else: + sep = "\n" + head = "" + tail = "" + return head + sep.join(gentoc(t, depth+1, c+1) for (c,t) in enumerate(titles)) + tail + + +# Arguments: +# - `chapter` is a string; if it has multiple lines, it will be used as +# a markdown fragment; otherwise it will be considered as a file name +# to be recursively loaded and parsed +# - `filename` is the name of the file that we're currently processing +# (to generate inline comments to facilitate edition) +# Returns: (epxandedmarkdown,[list of titles]) +# The list of titles can be nested. +def processchapter(chapter, filename): + if isinstance(chapter, unicode): + return processchapter(chapter.encode("utf-8"), filename) + if isinstance(chapter, str): + if "\n" in chapter: + titles = re.findall("^# (.*)", chapter, re.MULTILINE) + slidefooter = ".debug[{}]".format(makelink(filename)) + chapter = chapter.replace("\n---\n", "\n{}\n---\n".format(slidefooter)) + chapter += "\n" + slidefooter + return (chapter, titles) + if os.path.isfile(chapter): + return processchapter(open(chapter).read(), chapter) + if isinstance(chapter, list): + chapters = [processchapter(c, filename) for c in chapter] + markdown = "\n---\n".join(c[0] for c in chapters) + titles = [t for (m,t) in chapters if t] + return (markdown, titles) + raise InvalidChapter(chapter) + +# Try to figure out the URL of the repo on GitHub. +# This is used to generate "edit me on GitHub"-style links. +try: + if "REPOSITORY_URL" in os.environ: + repo = os.environ["REPOSITORY_URL"] + else: + repo = subprocess.check_output(["git", "config", "remote.origin.url"]) + repo = repo.strip().replace("git@github.com:", "https://github.com/") + if "BRANCH" in os.environ: + branch = os.environ["BRANCH"] + else: + branch = subprocess.check_output(["git", "status", "--short", "--branch"]) + branch = branch[3:].split("...")[0] + base = subprocess.check_output(["git", "rev-parse", "--show-prefix"]) + base = base.strip().strip("/") + urltemplate = ("{repo}/tree/{branch}/{base}/{filename}" + .format(repo=repo, branch=branch, base=base, filename="{}")) +except: + logging.exception("Could not generate repository URL; generating local URLs instead.") + urltemplate = "file://{pwd}/{filename}".format(pwd=os.environ["PWD"], filename="{}") + +def makelink(filename): + if os.path.isfile(filename): + url = urltemplate.format(filename) + return "[{}]({})".format(filename, url) + else: + return filename + + +sys.stdout.write(generatefromyaml(sys.stdin)) +logging.info("Done") diff --git a/docs/metrics.md b/docs/metrics.md new file mode 100644 index 00000000..e0232d71 --- /dev/null +++ b/docs/metrics.md @@ -0,0 +1,1637 @@ +# Metrics collection + +- We want to gather metrics in a central place + +- We will gather node metrics and container metrics + +- We want a nice interface to view them (graphs) + +--- + +## Node metrics + +- CPU, RAM, disk usage on the whole node + +- Total number of processes running, and their states + +- Number of open files, sockets, and their states + +- I/O activity (disk, network), per operation or volume + +- Physical/hardware (when applicable): temperature, fan speed ... + +- ... and much more! 
+ +--- + +## Container metrics + +- Similar to node metrics, but not totally identical + +- RAM breakdown will be different + + - active vs inactive memory + - some memory is *shared* between containers, and accounted specially + +- I/O activity is also harder to track + + - async writes can cause deferred "charges" + - some page-ins are also shared between containers + +For details about container metrics, see: +
+http://jpetazzo.github.io/2013/10/08/docker-containers-metrics/ + +--- + +class: snap, prom + +## Tools + +We will build *two* different metrics pipelines: + +- One based on Intel Snap, + +- Another based on Prometheus. + +If you're using Play-With-Docker, skip the exercises +relevant to Intel Snap (we rely on a SSH server to deploy, +and PWD doesn't have that yet). + +--- + +class: snap + +## First metrics pipeline + +We will use three open source Go projects for our first metrics pipeline: + +- Intel Snap + + Collects, processes, and publishes metrics + +- InfluxDB + + Stores metrics + +- Grafana + + Displays metrics visually + +--- + +class: snap + +## Snap + +- [github.com/intelsdi-x/snap](https://github.com/intelsdi-x/snap) + +- Can collect, process, and publish metric data + +- Doesn’t store metrics + +- Works as a daemon (snapd) controlled by a CLI (snapctl) + +- Offloads collecting, processing, and publishing to plugins + +- Does nothing out of the box; configuration required! + +- Docs: https://github.com/intelsdi-x/snap/blob/master/docs/ + +--- + +class: snap + +## InfluxDB + +- Snap doesn't store metrics data + +- InfluxDB is specifically designed for time-series data + + - CRud vs. CRUD (you rarely if ever update/delete data) + + - orthogonal read and write patterns + + - storage format optimization is key (for disk usage and performance) + +- Snap has a plugin allowing to *publish* to InfluxDB + +--- + +class: snap + +## Grafana + +- Snap cannot show graphs + +- InfluxDB cannot show graphs + +- Grafana will take care of that + +- Grafana can read data from InfluxDB and display it as graphs + +--- + +class: snap + +## Getting and setting up Snap + +- We will install Snap directly on the nodes + +- Release tarballs are available from GitHub + +- We will use a *global service* +
(started on all nodes, including nodes added later) + +- This service will download and unpack Snap in /opt and /usr/local + +- /opt and /usr/local will be bind-mounted from the host + +- This service will effectively install Snap on the hosts + +--- + +class: snap + +## The Snap installer service + +- This will get Snap on all nodes + +.exercise[ + +```bash +docker service create --restart-condition=none --mode global \ + --mount type=bind,source=/usr/local/bin,target=/usr/local/bin \ + --mount type=bind,source=/opt,target=/opt centos sh -c ' +SNAPVER=v0.16.1-beta +RELEASEURL=https://github.com/intelsdi-x/snap/releases/download/$SNAPVER +curl -sSL $RELEASEURL/snap-$SNAPVER-linux-amd64.tar.gz | + tar -C /opt -zxf- +curl -sSL $RELEASEURL/snap-plugins-$SNAPVER-linux-amd64.tar.gz | + tar -C /opt -zxf- +ln -s snap-$SNAPVER /opt/snap +for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done +' # If you copy-paste that block, do not forget that final quote ☺ +``` + +] + +--- + +class: snap + +## First contact with `snapd` + +- The core of Snap is `snapd`, the Snap daemon + +- Application made up of a REST API, control module, and scheduler module + +.exercise[ + +- Start `snapd` with plugin trust disabled and log level set to debug: + ```bash + snapd -t 0 -l 1 + ``` + +] + +- More resources: + + https://github.com/intelsdi-x/snap/blob/master/docs/SNAPD.md + https://github.com/intelsdi-x/snap/blob/master/docs/SNAPD_CONFIGURATION.md + +--- + +class: snap + +## Using `snapctl` to interact with `snapd` + +- Let's load a *collector* and a *publisher* plugins + +.exercise[ + +- Open a new terminal + +- Load the psutil collector plugin: + ```bash + snapctl plugin load /opt/snap/plugin/snap-plugin-collector-psutil + ``` + +- Load the file publisher plugin: + ```bash + snapctl plugin load /opt/snap/plugin/snap-plugin-publisher-mock-file + ``` + +] + +--- + +class: snap + +## Checking what we've done + +- Good to know: Docker CLI uses `ls`, Snap CLI uses `list` + +.exercise[ + +- See your loaded plugins: + ```bash + snapctl plugin list + ``` + +- See the metrics you can collect: + ```bash + snapctl metric list + ``` + +] + +--- + +class: snap + +## Actually collecting metrics: introducing *tasks* + +- To start collecting/processing/publishing metric data, you need to create a *task* + +- A *task* indicates: + + - *what* to collect (which metrics) + - *when* to collect it (e.g. how often) + - *how* to process it (e.g. use it directly, or compute moving averages) + - *where* to publish it + +- Tasks can be defined with manifests written in JSON or YAML + +- Some plugins, such as the Docker collector, allow for wildcards (\*) in the metrics "path" +
(see snap/docker-influxdb.json) + +- More resources: + https://github.com/intelsdi-x/snap/blob/master/docs/TASKS.md + +--- + +class: snap + +## Our first task manifest + +```yaml + version: 1 + schedule: + type: "simple" # collect on a set interval + interval: "1s" # of every 1s + max-failures: 10 + workflow: + collect: # first collect + metrics: # metrics to collect + /intel/psutil/load/load1: {} + config: # there is no configuration + publish: # after collecting, publish + - + plugin_name: "file" # use the file publisher + config: + file: "/tmp/snap-psutil-file.log" # write to this file +``` + +--- + +class: snap + +## Creating our first task + +- The task manifest shown on the previous slide is stored in `snap/psutil-file.yml`. + +.exercise[ + +- Create a task using the manifest: + + ```bash + cd ~/orchestration-workshop/snap + snapctl task create -t psutil-file.yml + ``` + +] + + The output should look like the following: + ``` + Using task manifest to create task + Task created + ID: 240435e8-a250-4782-80d0-6fff541facba + Name: Task-240435e8-a250-4782-80d0-6fff541facba + State: Running + ``` + +--- + +class: snap + +## Checking existing tasks + +.exercise[ + +- This will confirm that our task is running correctly, and remind us of its task ID + + ```bash + snapctl task list + ``` + +] + +The output should look like the following: + ``` + ID NAME STATE HIT MISS FAIL CREATED + 24043...acba Task-24043...acba Running 4 0 0 2:34PM 8-13-2016 + ``` +--- + +class: snap + +## Viewing our task dollars at work + +- The task is using a very simple publisher, `mock-file` + +- That publisher just writes text lines in a file (one line per data point) + +.exercise[ + +- Check that the data is flowing indeed: + ```bash + tail -f /tmp/snap-psutil-file.log + ``` + +] + +To exit, hit `^C` + +--- + +class: snap + +## Debugging tasks + +- When a task is not directly writing to a local file, use `snapctl task watch` + +- `snapctl task watch` will stream the metrics you are collecting to STDOUT + +.exercise[ + +```bash +snapctl task watch +``` + +] + +To exit, hit `^C` + +--- + +class: snap + +## Stopping snap + +- Our Snap deployment has a few flaws: + + - snapd was started manually + + - it is running on a single node + + - the configuration is purely local + +-- + +class: snap + +- We want to change that! + +-- + +class: snap + +- But first, go back to the terminal where `snapd` is running, and hit `^C` + +- All tasks will be stopped; all plugins will be unloaded; Snap will exit + +--- + +class: snap + +## Snap Tribe Mode + +- Tribe is Snap's clustering mechanism + +- When tribe mode is enabled, nodes can join *agreements* + +- When a node in an *agreement* does something (e.g. load a plugin or run a task), +
other nodes of that agreement do the same thing + +- We will use it to load the Docker collector and InfluxDB publisher on all nodes, +
and run a task to use them + +- Without tribe mode, we would have to load plugins and run tasks manually on every node + +- More resources: + https://github.com/intelsdi-x/snap/blob/master/docs/TRIBE.md + +--- + +class: snap + +## Running Snap itself on every node + +- Snap runs in the foreground, so you need to use `&` or start it in tmux + +.exercise[ + +- Run the following command *on every node:* + ```bash + snapd -t 0 -l 1 --tribe --tribe-seed node1:6000 + ``` + +] + +If you're *not* using Play-With-Docker, there is another way to start Snap! + +--- + +class: snap + +## Starting a daemon through SSH + +.warning[Hackety hack ahead!] + +- We will create a *global service* + +- That global service will install a SSH client + +- With that SSH client, the service will connect back to its local node +
(i.e. "break out" of the container, using the SSH key that we provide) + +- Once logged on the node, the service starts snapd with Tribe Mode enabled + +--- + +class: snap + +## Running Snap itself on every node + +- I might go to hell for showing you this, but here it goes ... + +.exercise[ + +- Start Snap all over the place: + ```bash + docker service create --name snapd --mode global \ + --mount type=bind,source=$HOME/.ssh/id_rsa,target=/sshkey \ + alpine sh -c " + apk add --no-cache openssh-client && + ssh -o StrictHostKeyChecking=no -i /sshkey docker@172.17.0.1 \ + sudo snapd -t 0 -l 1 --tribe --tribe-seed node1:6000 + " # If you copy-paste that block, don't forget that final quote :-) + ``` + +] + +Remember: this *does not work* with Play-With-Docker (which doesn't have SSH). + +--- + +class: snap + +## Viewing the members of our tribe + +- If everything went fine, Snap is now running in tribe mode + +.exercise[ + +- View the members of our tribe: + ```bash + snapctl member list + ``` + +] + +This should show the 5 nodes with their hostnames. + +--- + +class: snap + +## Create an agreement + +- We can now create an *agreement* for our plugins and tasks + +.exercise[ + +- Create an agreement; make sure to use the same name all along: + ```bash + snapctl agreement create docker-influxdb + ``` + +] + +The output should look like the following: + +``` + Name Number of Members plugins tasks + docker-influxdb 0 0 0 +``` + +--- + +class: snap + +## Instruct all nodes to join the agreement + +- We dont need another fancy global service! + +- We can join nodes from any existing node of the cluster + +.exercise[ + +- Add all nodes to the agreement: + ```bash + snapctl member list | tail -n +2 | + xargs -n1 snapctl agreement join docker-influxdb + ``` + +] + +The last bit of output should look like the following: +``` + Name Number of Members plugins tasks + docker-influxdb 5 0 0 +``` + +--- + +class: snap + +## Start a container on every node + +- The Docker plugin requires at least one container to be started + +- Normally, at this point, you will have at least one container on each node + +- But just in case you did things differently, let's create a dummy global service + +.exercise[ + +- Create an alpine container on the whole cluster: + ```bash + docker service create --name ping --mode global alpine ping 8.8.8.8 + ``` + +] + +--- + +class: snap + +## Running InfluxDB + +- We will create a service for InfluxDB + +- We will use the official image + +- InfluxDB uses multiple ports: + + - 8086 (HTTP API; we need this) + + - 8083 (admin interface; we need this) + + - 8088 (cluster communication; not needed here) + + - more ports for other protocols (graphite, collectd...) + +- We will just publish the first two + +--- + +class: snap + +## Creating the InfluxDB service + +.exercise[ + +- Start an InfluxDB service, publishing ports 8083 and 8086: + ```bash + docker service create --name influxdb \ + --publish 8083:8083 \ + --publish 8086:8086 \ + influxdb:0.13 + ``` + +] + +Note: this will allow any node to publish metrics data to `localhost:8086`, +and it will allows us to access the admin interface by connecting to any node +on port 8083. + +.warning[Make sure to use InfluxDB 0.13; a few things changed in 1.0 +(like, the name of the default retention policy is now "autogen") and +this breaks a few things.] 
+ +--- + +class: snap + +## Setting up InfluxDB + +- We need to create the "snap" database + +.exercise[ + +- Open port 8083 with your browser + +- Enter the following query in the query box: + ``` + CREATE DATABASE "snap" + ``` + +- In the top-right corner, select "Database: snap" + +] + +Note: the InfluxDB query language *looks like* SQL but it's not. + +??? + +## Setting a retention policy + +- When graduating to 1.0, InfluxDB changed the name of the default policy + +- It used to be "default" and it is now "autogen" + +- Snap still uses "default" and this results in errors + +.exercise[ + +- Create a "default" retention policy by entering the following query in the box: + ``` + CREATE RETENTION POLICY "default" ON "snap" DURATION 1w REPLICATION 1 + ``` + +] + +--- + +class: snap + +## Load Docker collector and InfluxDB publisher + +- We will load plugins on the local node + +- Since our local node is a member of the agreement, all other + nodes in the agreement will also load these plugins + +.exercise[ + +- Load Docker collector: + + ```bash + snapctl plugin load /opt/snap/plugin/snap-plugin-collector-docker + ``` + +- Load InfluxDB publisher: + + ```bash + snapctl plugin load /opt/snap/plugin/snap-plugin-publisher-influxdb + ``` + +] + +--- + +class: snap + +## Start a simple collection task + +- Again, we will create a task on the local node + +- The task will be replicated on other nodes members of the same agreement + +.exercise[ + +- Load a task manifest file collecting a couple of metrics on all containers, +
and sending them to InfluxDB: + ```bash + cd ~/orchestration-workshop/snap + snapctl task create -t docker-influxdb.json + ``` + +] + +Note: the task description sends metrics to the InfluxDB API endpoint +located at 127.0.0.1:8086. Since the InfluxDB container is published +on port 8086, 127.0.0.1:8086 always routes traffic to the InfluxDB +container. + +--- + +class: snap + +## If things go wrong... + +Note: if a task runs into a problem (e.g. it's trying to publish +to a metrics database, but the database is unreachable), the task +will be stopped. + +You will have to restart it manually by running: + +```bash +snapctl task enable +snapctl task start +``` + +This must be done *per node*. Alternatively, you can delete+re-create +the task (it will delete+re-create on all nodes). + +--- + +class: snap + +## Check that metric data shows up in InfluxDB + +- Let's check existing data with a few manual queries in the InfluxDB admin interface + +.exercise[ + +- List "measurements": + ``` + SHOW MEASUREMENTS + ``` + (This should show two generic entries corresponding to the two collected metrics.) + +- View time series data for one of the metrics: + ``` + SELECT * FROM "intel/docker/stats/cgroups/cpu_stats/cpu_usage/total_usage" + ``` + (This should show a list of data points with **time**, **docker_id**, **source**, and **value**.) + +] + +--- + +class: snap + +## Deploy Grafana + +- We will use an almost-official image, `grafana/grafana` + +- We will publish Grafana's web interface on its default port (3000) + +.exercise[ + +- Create the Grafana service: + ```bash + docker service create --name grafana --publish 3000:3000 grafana/grafana:3.1.1 + ``` + +] + +--- + +class: snap + +## Set up Grafana + +.exercise[ + +- Open port 3000 with your browser + +- Identify with "admin" as the username and password + +- Click on the Grafana logo (the orange spiral in the top left corner) + +- Click on "Data Sources" + +- Click on "Add data source" (green button on the right) + +] + +--- + +class: snap + +## Add InfluxDB as a data source for Grafana + +.small[ + +Fill the form exactly as follows: +- Name = "snap" +- Type = "InfluxDB" + +In HTTP settings, fill as follows: +- Url = "http://(IP.address.of.any.node):8086" +- Access = "direct" +- Leave HTTP Auth untouched + +In InfluxDB details, fill as follows: +- Database = "snap" +- Leave user and password blank + +Finally, click on "add", you should see a green message saying "Success - Data source is working". +If you see an orange box (sometimes without a message), it means that you got something wrong. Triple check everything again. + +] + +--- + +class: snap + +![Screenshot showing how to fill the form](grafana-add-source.png) + +--- + +class: snap + +## Create a dashboard in Grafana + +.exercise[ + +- Click on the Grafana logo again (the orange spiral in the top left corner) + +- Hover over "Dashboards" + +- Click "+ New" + +- Click on the little green rectangle that appeared in the top left + +- Hover over "Add Panel" + +- Click on "Graph" + +] + +At this point, you should see a sample graph showing up. + +--- + +class: snap + +## Setting up a graph in Grafana + +.exercise[ + +- Panel data source: select snap +- Click on the SELECT metrics query to expand it +- Click on "select measurement" and pick CPU usage +- Click on the "+" right next to "WHERE" +- Select "docker_id" +- Select the ID of a container of your choice (e.g. 
the one running InfluxDB) +- Click on the "+" on the right of the "SELECT" line +- Add "derivative" +- In the "derivative" option, select "1s" +- In the top right corner, click on the clock, and pick "last 5 minutes" + +] + +Congratulations, you are viewing the CPU usage of a single container! + +--- + +class: snap + +![Screenshot showing the end result](grafana-add-graph.png) + +--- + +class: snap, prom + +## Before moving on ... + +- Leave that tab open! + +- We are going to set up *another* metrics system + +- ... And then compare both graphs side by side + +--- + +class: snap, prom + +## Prometheus vs. Snap + +- Prometheus is another metrics collection system + +- Snap *pushes* metrics; Prometheus *pulls* them + +--- + +class: prom + +## Prometheus components + +- The *Prometheus server* pulls, stores, and displays metrics + +- Its configuration defines a list of *exporter* endpoints +
(that list can be dynamic, using e.g. Consul, DNS, Etcd...) + +- The exporters expose metrics over HTTP using a simple line-oriented format + + (An optimized format using protobuf is also possible) + +--- + +class: prom + +## It's all about the `/metrics` + +- This is was the *node exporter* looks like: + + http://demo.robustperception.io:9100/metrics + +- Prometheus itself exposes its own internal metrics, too: + + http://demo.robustperception.io:9090/metrics + +- A *Prometheus server* will *scrape* URLs like these + + (It can also use protobuf to avoid the overhead of parsing line-oriented formats!) + +--- + +class: prom-manual + +## Collecting metrics with Prometheus on Swarm + +- We will run two *global services* (i.e. scheduled on all our nodes): + + - the Prometheus *node exporter* to get node metrics + + - Google's cAdvisor to get container metrics + +- We will run a Prometheus server to scrape these exporters + +- The Prometheus server will be configured to use DNS service discovery + +- We will use `tasks.` for service discovery + +- All these services will be placed on a private internal network + +--- + +class: prom-manual + +## Creating an overlay network for Prometheus + +- This is the easiest step ☺ + +.exercise[ + +- Create an overlay network: + ```bash + docker network create --driver overlay prom + ``` + +] + +--- + +class: prom-manual + +## Running the node exporter + +- The node exporter *should* run directly on the hosts +- However, it can run from a container, if configured properly +
+ (it needs to access the host's filesystems, in particular /proc and /sys) + +.exercise[ + +- Start the node exporter: + ```bash + docker service create --name node --mode global --network prom \ + --mount type=bind,source=/proc,target=/host/proc \ + --mount type=bind,source=/sys,target=/host/sys \ + --mount type=bind,source=/,target=/rootfs \ + prom/node-exporter \ + --path.procfs /host/proc \ + --path.sysfs /host/proc \ + --collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)" + ``` + +] + +--- + +class: prom-manual + +## Running cAdvisor + +- Likewise, cAdvisor *should* run directly on the hosts + +- But it can run in containers, if configured properly + +.exercise[ + +- Start the cAdvisor collector: + ```bash + docker service create --name cadvisor --network prom --mode global \ + --mount type=bind,source=/,target=/rootfs \ + --mount type=bind,source=/var/run,target=/var/run \ + --mount type=bind,source=/sys,target=/sys \ + --mount type=bind,source=/var/lib/docker,target=/var/lib/docker \ + google/cadvisor:latest + ``` + +] + +--- + +class: prom-manual + +## Configuring the Prometheus server + +This will be our configuration file for Prometheus: + +```yaml +global: + scrape_interval: 10s +scrape_configs: + - job_name: 'prometheus' + static_configs: + - targets: ['localhost:9090'] + - job_name: 'node' + dns_sd_configs: + - names: ['tasks.node'] + type: 'A' + port: 9100 + - job_name: 'cadvisor' + dns_sd_configs: + - names: ['tasks.cadvisor'] + type: 'A' + port: 8080 +``` + +--- + +class: prom-manual + +## Passing the configuration to the Prometheus server + +- We need to provide our custom configuration to the Prometheus server + +- The easiest solution is to create a custom image bundling this configuration + +- We will use a very simple Dockerfile: + ```dockerfile + FROM prom/prometheus:v1.4.1 + COPY prometheus.yml /etc/prometheus/prometheus.yml + ``` + + (The configuration file, and the Dockerfile, are in the `prom` subdirectory) + +- We will build this image, and push it to our local registry + +- Then we will create a service using this image + +Note: it is also possible to use a `config` to inject that configuration file +without having to create this ad-hoc image. + +--- + +class: prom-manual + +## Building our custom Prometheus image + +- We will use the local registry started previously on 127.0.0.1:5000 + +.exercise[ + +- Build the image using the provided Dockerfile: + ```bash + docker build -t 127.0.0.1:5000/prometheus ~/orchestration-workshop/prom + ``` + +- Push the image to our local registry: + ```bash + docker push 127.0.0.1:5000/prometheus + ``` + +] + +--- + +class: prom-manual + +## Running our custom Prometheus image + +- That's the only service that needs to be published + + (If we want to access Prometheus from outside!) 
+ +.exercise[ + +- Start the Prometheus server: + ```bash + docker service create --network prom --name prom \ + --publish 9090:9090 127.0.0.1:5000/prometheus + ``` + +] + +--- + +class: prom-auto + +## Deploying Prometheus on our cluster + +- We will use a stack definition (once again) + +.exercise[ + +- Make sure we are in the stacks directory: + ```bash + cd ~/orchestration-workshop/stacks + ``` + +- Build, ship, and run the Prometheus stack: + ```bash + docker-compose -f prometheus.yml build + docker-compose -f prometheus.yml push + docker stack deploy -c prometheus.yml prometheus + ``` + +] + +--- + +class: prom + +## Checking our Prometheus server + +- First, let's make sure that Prometheus is correctly scraping all metrics + +.exercise[ + +- Open port 9090 with your browser + +- Click on "status", then "targets" + +] + +You should see 11 endpoints (5 cadvisor, 5 node, 1 prometheus). + +Their state should be "UP". + +--- + +class: prom-auto, config + +## Injecting a configuration file + +(New in Docker Engine 17.06) + +- We are creating a custom image *just to inject a configuration* + +- Instead, we could use the base Prometheus image + a `config` + +- A `config` is a blob (usually, a configuration file) that: + + - is created and managed through the Docker API (and CLI) + + - gets persisted into the Raft log (i.e. safely) + + - can be associated to a service +
+ (this injects the blob as a plain file in the service's containers) + +--- + +class: prom-auto, config + +## Differences between `config` and `secret` + +The two are very similar, but ... + +- `configs`: + + - can be injected to any filesystem location + + - can be viewed and extracted using the Docker API or CLI + +- `secrets`: + + - can only be injected into `/run/secrets` + + - are never stored in clear text on disk + + - cannot be viewed or extracted with the Docker API or CLI + +--- + +class: prom-auto, config + +## Deploying Prometheus with a `config` + +- The `config` can be created manually or declared in the Compose file + +- This is what our new Compose file looks like: + +.small[ +```yaml +version: "3.3" + +services: + +prometheus: + image: prom/prometheus:v1.4.1 + ports: + - "9090:9090" + configs: + - source: prometheus + target: /etc/prometheus/prometheus.yml + +... + +configs: + prometheus: + file: ../prom/prometheus.yml +``` +] + +(This is from `prometheus+config.yml`) + +--- + +class: prom-auto, config + +## Specifying a `config` in a Compose file + +- In each service, an optional `configs` section can list as many configs as you want + +- Each config can specify: + + - an optional `target` (path to inject the configuration; by default: root of the container) + + - ownership and permissions (by default, the file will be owned by UID 0, i.e. `root`) + +- These configs reference top-level `configs` elements + +- The top-level configs can be declared as: + + - *external*, meaning that it is supposed to be created before you deploy the stack + + - referencing a file, whose content is used to initialize the config + +--- + +class: prom-auto, config + +## Re-deploying Prometheus with a config + +- We will update the existing stack using `prometheus+config.yml` + +.exercise[ + +- Redeploy the `prometheus` stack: + ```bash + docker stack deploy -c prometheus+config.yml prometheus + ``` + +- Check that Prometheus still works as intended + + (By connecting to any node of the cluster, on port 9090) + +] + +--- + +class: prom-auto, config + +## Accessing the config object from the Docker CLI + +- Config objects can be viewed from the CLI (or API) + +.exercise[ + +- List existing config objects: + ```bash + docker config ls + ``` + +- View details about our config object: + ```bash + docker config inspect prometheus_prometheus + ``` + +] + +Note: the content of the config blob is shown with BASE64 encoding. +
+(It doesn't have to be text; it could be an image or any kind of binary content!) + +--- + +class: prom-auto, config + +## Extracting a config blob + +- Let's retrieve that Prometheus configuration! + +.exercise[ + +- Extract the BASE64 payload with `jq`: + ```bash + docker config inspect prometheus_prometheus | jq -r .[0].Spec.Data + ``` + +- Decode it with `base64 -d`: + ```bash + docker config inspect prometheus_prometheus | jq -r .[0].Spec.Data | base64 -d + ``` + +] + +--- + +class: prom + +## Displaying metrics directly from Prometheus + +- This is easy ... if you are familiar with PromQL + +.exercise[ + +- Click on "Graph", and in "expression", paste the following: + ``` + sum by (container_label_com_docker_swarm_node_id) ( + irate( + container_cpu_usage_seconds_total{ + container_label_com_docker_swarm_service_name="dockercoins_worker" + }[1m] + ) + ) + ``` + +- Click on the blue "Execute" button and on the "Graph" tab just below + +] + +--- + +class: prom + +## Building the query from scratch + +- We are going to build the same query from scratch + +- This doesn't intend to be a detailed PromQL course + +- This is merely so that you (I) can pretend to know how the previous query works +
so that your coworkers (you) can be suitably impressed (or not) + + (Or, so that we can build other queries if necessary, or adapt if cAdvisor, + Prometheus, or anything else changes and requires editing the query!) + +--- + +class: prom + +## Displaying a raw metric for *all* containers + +- Click on the "Graph" tab on top + + *This takes us to a blank dashboard* + +- Click on the "Insert metric at cursor" drop down, and select `container_cpu_usage_seconds_total` + + *This puts the metric name in the query box* + +- Click on "Execute" + + *This fills a table of measurements below* + +- Click on "Graph" (next to "Console") + + *This replaces the table of measurements with a series of graphs (after a few seconds)* + +--- + +class: prom + +## Selecting metrics for a specific service + +- Hover over the lines in the graph + + (Look for the ones that have labels like `container_label_com_docker_...`) + +- Edit the query, adding a condition between curly braces: + + .small[`container_cpu_usage_seconds_total{container_label_com_docker_swarm_service_name="dockercoins_worker"}`] + +- Click on "Execute" + + *Now we should see one line per CPU per container* + +- If you want to select by container ID, you can use a regex match: `id=~"/docker/c4bf.*"` + +- You can also specify multiple conditions by separating them with commas + +--- + +class: prom + +## Turn counters into rates + +- What we see is the total amount of CPU used (in seconds) + +- We want to see a *rate* (CPU time used / real time) + +- To get a moving average over 1 minute periods, enclose the current expression within: + + ``` + rate ( ... { ... } [1m] ) + ``` + + *This should turn our steadily-increasing CPU counter into a wavy graph* + +- To get an instantaneous rate, use `irate` instead of `rate` + + (The time window is then used to limit how far behind to look for data if data points + are missing in case of scrape failure; see [here](https://www.robustperception.io/irate-graphs-are-better-graphs/) for more details!) + + *This should show spikes that were previously invisible because they were smoothed out* + +--- + +class: prom + +## Aggregate multiple data series + +- We have one graph per CPU per container; we want to sum them + +- Enclose the whole expression within: + + ``` + sum ( ... ) + ``` + + *We now see a single graph* + +--- + +class: prom + +## Collapse dimensions + +- If we have multiple containers we can also collapse just the CPU dimension: + + ``` + sum without (cpu) ( ... ) + ``` + + *This shows the same graph, but preserves the other labels* + +- Congratulations, you wrote your first PromQL expression from scratch! + + (I'd like to thank [Johannes Ziemke](https://twitter.com/discordianfish) and + [Julius Volz](https://twitter.com/juliusvolz) for their help with Prometheus!) + +--- + +class: prom, snap + +## Comparing Snap and Prometheus data + +- If you haven't set up Snap, InfluxDB, and Grafana, skip this section + +- If you have closed the Grafana tab, you might have to re-set up a new dashboard + + (Unless you saved it before navigating it away) + +- To re-do the setup, just follow again the instructions from the previous chapter + +--- + +class: prom, snap + +## Add Prometheus as a data source in Grafana + +.exercise[ + +- In a new tab, connect to Grafana (port 3000) + +- Click on the Grafana logo (the orange spiral in the top-left corner) + +- Click on "Data Sources" + +- Click on the green "Add data source" button + +] + +We see the same input form that we filled earlier to connect to InfluxDB. 
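
Before wiring it into Grafana, you can double-check that Prometheus answers queries;
a quick sketch (port 9090 is published, so any node works):

```bash
# Query the built-in "up" metric through the Prometheus HTTP API
curl -s "http://localhost:9090/api/v1/query?query=up"
```

Each scraped target should show up with a value of 1.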
+ +--- + +class: prom, snap + +## Connecting to Prometheus from Grafana + +.exercise[ + +- Enter "prom" in the name field + +- Select "Prometheus" as the source type + +- Enter http://(IP.address.of.any.node):9090 in the Url field + +- Select "direct" as the access method + +- Click on "Save and test" + +] + +Again, we should see a green box telling us "Data source is working." + +Otherwise, double-check every field and try again! + +--- + +class: prom, snap + +## Adding the Prometheus data to our dashboard + +.exercise[ + +- Go back to the the tab where we had our first Grafana dashboard + +- Click on the blue "Add row" button in the lower right corner + +- Click on the green tab on the left; select "Add panel" and "Graph" + +] + +This takes us to the graph editor that we used earlier. + +--- + +class: prom, snap + +## Querying Prometheus data from Grafana + +The editor is a bit less friendly than the one we used for InfluxDB. + +.exercise[ + +- Select "prom" as Panel data source + +- Paste the query in the query field: + ``` + sum without (cpu, id) ( irate ( + container_cpu_usage_seconds_total{ + container_label_com_docker_swarm_service_name="influxdb"}[1m] ) ) + ``` + +- Click outside of the query field to confirm + +- Close the row editor by clicking the "X" in the top right area + +] + +--- + +class: prom, snap + +## Interpreting results + +- The two graphs *should* be similar + +- Protip: align the time references! + +.exercise[ + +- Click on the clock in the top right corner + +- Select "last 30 minutes" + +- Click on "Zoom out" + +- Now press the right arrow key (hold it down and watch the CPU usage increase!) + +] + +*Adjusting units is left as an exercise for the reader.* + +--- + +## More resources on container metrics + +- [Prometheus, a Whirlwind Tour](https://speakerdeck.com/copyconstructor/prometheus-a-whirlwind-tour), + an original overview of Prometheus + +- [Docker Swarm & Container Overview](https://grafana.net/dashboards/609), + a custom dashboard for Grafana + +- [Gathering Container Metrics](http://jpetazzo.github.io/2013/10/08/docker-containers-metrics/), + a blog post about cgroups + +- [The Prometheus Time Series Database](https://www.youtube.com/watch?v=HbnGSNEjhUc), + a talk explaining why custom data storage is necessary for metrics + +.blackbelt[DC17US: Monitoring, the Prometheus Way +([video](https://www.youtube.com/watch?v=PDxcEzu62jk&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=5))] + +.blackbelt[DC17EU: Prometheus 2.0 Storage Engine +([video](https://dockercon.docker.com/watch/NNZ8GXHGomouwSXtXnxb8P))] diff --git a/docs/morenodes.md b/docs/morenodes.md new file mode 100644 index 00000000..848253fe --- /dev/null +++ b/docs/morenodes.md @@ -0,0 +1,236 @@ +## Adding more manager nodes + +- Right now, we have only one manager (node1) + +- If we lose it, we lose quorum - and that's *very bad!* + +- Containers running on other nodes will be fine ... + +- But we won't be able to get or set anything related to the cluster + +- If the manager is permanently gone, we will have to do a manual repair! + +- Nobody wants to do that ... 
so let's make our cluster highly available + +--- + +class: self-paced + +## Adding more managers + +With Play-With-Docker: + +```bash +TOKEN=$(docker swarm join-token -q manager) +for N in $(seq 4 5); do + export DOCKER_HOST=tcp://node$N:2375 + docker swarm join --token $TOKEN node1:2377 +done +unset DOCKER_HOST +``` + +--- + +class: in-person + +## Building our full cluster + +- We could SSH to nodes 3, 4, 5; and copy-paste the command + +-- + +class: in-person + +- Or we could use the AWESOME POWER OF THE SHELL! + +-- + +class: in-person + +![Mario Red Shell](mario-red-shell.png) + +-- + +class: in-person + +- No, not *that* shell + +--- + +class: in-person + +## Let's form like Swarm-tron + +- Let's get the token, and loop over the remaining nodes with SSH + +.exercise[ + +- Obtain the manager token: + ```bash + TOKEN=$(docker swarm join-token -q manager) + ``` + +- Loop over the 3 remaining nodes: + ```bash + for NODE in node3 node4 node5; do + ssh $NODE docker swarm join --token $TOKEN node1:2377 + done + ``` + +] + +[That was easy.](https://www.youtube.com/watch?v=3YmMNpbFjp0) + +--- + +## You can control the Swarm from any manager node + +.exercise[ + +- Try the following command on a few different nodes: + ```bash + docker node ls + ``` + +] + +On manager nodes: +
you will see the list of nodes, with a `*` denoting +the node you're talking to. + +On non-manager nodes: +
you will get an error message telling you that +the node is not a manager. + +As we saw earlier, you can only control the Swarm through a manager node. + +--- + +class: self-paced + +## Play-With-Docker node status icon + +- If you're using Play-With-Docker, you get node status icons + +- Node status icons are displayed left of the node name + + - No icon = no Swarm mode detected + - Solid blue icon = Swarm manager detected + - Blue outline icon = Swarm worker detected + +![Play-With-Docker icons](pwd-icons.png) + +--- + +## Dynamically changing the role of a node + +- We can change the role of a node on the fly: + + `docker node promote nodeX` → make nodeX a manager +
  `docker node demote nodeX` → make nodeX a worker

.exercise[

- See the current list of nodes:
  ```
  docker node ls
  ```

- Promote any worker node (`nodeX` below) to be a manager:
  ```
  docker node promote nodeX
  ```

]

---

## How many managers do we need?

- 2N+1 nodes can (and will) tolerate N failures
  (you can have an even number of managers, but there is no point)

--

- 1 manager = no failure tolerance

- 3 managers = 1 failure

- 5 managers = 2 failures (or 1 failure during 1 maintenance)

- 7 managers or more = now you might be overdoing it a little bit

---

## Why not have *all* nodes be managers?

- Intuitively, it's harder to reach consensus in larger groups

- With Raft, writes are replicated to every node (and must be acknowledged by a majority)

- More nodes = more network traffic

- Bigger network = more latency

---

## What would MacGyver do?

- If some of your machines are more than 10ms away from each other,
  try to break them down into multiple clusters
  (keeping internal latency low)

- Groups of up to 9 nodes: all of them are managers

- Groups of 10 nodes and up: pick 5 "stable" nodes to be managers
+ (Cloud pro-tip: use separate auto-scaling groups for managers and workers) + +- Groups of more than 100 nodes: watch your managers' CPU and RAM + +- Groups of more than 1000 nodes: + + - if you can afford to have fast, stable managers, add more of them + - otherwise, break down your nodes in multiple clusters + +--- + +## What's the upper limit? + +- We don't know! + +- Internal testing at Docker Inc.: 1000-10000 nodes is fine + + - deployed to a single cloud region + + - one of the main take-aways was *"you're gonna need a bigger manager"* + +- Testing by the community: [4700 heterogenous nodes all over the 'net](https://sematext.com/blog/2016/11/14/docker-swarm-lessons-from-swarm3k/) + + - it just works + + - more nodes require more CPU; more containers require more RAM + + - scheduling of large jobs (70000 containers) is slow, though (working on it!) + +--- + +## Real-life deployment methods + +-- + +Running commands manually over SSH + +-- + + (lol jk) + +-- + +- Using your favorite configuration management tool + +- [Docker for AWS](https://docs.docker.com/docker-for-aws/#quickstart) + +- [Docker for Azure](https://docs.docker.com/docker-for-azure/) diff --git a/docs/namespaces.md b/docs/namespaces.md new file mode 100644 index 00000000..7d3712b4 --- /dev/null +++ b/docs/namespaces.md @@ -0,0 +1,236 @@ +class: namespaces +name: namespaces + +# Improving isolation with User Namespaces + +- *Namespaces* are kernel mechanisms to compartimetalize the system + +- There are different kind of namespaces: `pid`, `net`, `mnt`, `ipc`, `uts`, and `user` + +- For a primer, see "Anatomy of a Container" + ([video](https://www.youtube.com/watch?v=sK5i-N34im8)) + ([slides](https://www.slideshare.net/jpetazzo/cgroups-namespaces-and-beyond-what-are-containers-made-from-dockercon-europe-2015)) + +- The *user namespace* allows to map UIDs between the containers and the host + +- As a result, `root` in a container can map to a non-privileged user on the host + +Note: even without user namespaces, `root` in a container cannot go wild on the host. +
+It is mediated by capabilities, cgroups, namespaces, seccomp, LSMs... + +--- + +class: namespaces + +## User Namespaces in Docker + +- Optional feature added in Docker Engine 1.10 + +- Not enabled by default + +- Has to be enabled at Engine startup, and affects all containers + +- When enabled, `UID:GID` in containers are mapped to a different range on the host + +- Safer than switching to a non-root user (with `-u` or `USER`) in the container +
+ (Since with user namespaces, root escalation maps to a non-privileged user) + +- Can be selectively disabled per container by starting them with `--userns=host` + +--- + +class: namespaces + +## User Namespaces Caveats + +When user namespaces are enabled, containers cannot: + +- Use the host's network namespace (with `docker run --network=host`) + +- Use the host's PID namespace (with `docker run --pid=host`) + +- Run in privileged mode (with `docker run --privileged`) + +... Unless user namespaces are disabled for the container, with flag `--userns=host` + +External volume and graph drivers that don't support user mapping might not work. + +All containers are currently mapped to the same UID:GID range. + +Some of these limitations might be lifted in the future! + +--- + +class: namespaces + +## Filesystem ownership details + +When enabling user namespaces: + +- the UID:GID on disk (in the images and containers) has to match the *mapped* UID:GID + +- existing images and containers cannot work (their UID:GID would have to be changed) + +For practical reasons, when enabling user namespaces, the Docker Engine places containers and images (and everything else) in a different directory. + +As a resut, if you enable user namespaces on an existing installation: + +- all containers and images (and e.g. Swarm data) disappear + +- *if a node is a member of a Swarm, it is then kicked out of the Swarm* + +- everything will re-appear if you disable user namespaces again + +--- + +class: namespaces + +## Picking a node + +- We will select a node where we will enable user namespaces + +- This node will have to be re-added to the Swarm + +- All containers and services running on this node will be rescheduled + +- Let's make sure that we do not pick the node running the registry! + +.exercise[ + +- Check on which node the registry is running: + ```bash + docker service ps registry + ``` + +] + +Pick any other node (noted `nodeX` in the next slides). 
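
If your Docker CLI supports the `--format` flag for `docker service ps` (recent
versions do), here is a compact way to display just that node; this is only a
convenience, not required for the exercise:

```bash
# Show only the node(s) where the registry task is scheduled
docker service ps registry --format "{{.Node}}"
```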
+ +--- + +class: namespaces + +## Logging into the right Engine + +.exercise[ + +- Log into the right node: + ```bash + ssh node`X` + ``` + +] + +--- + +class: namespaces + +## Configuring the Engine + +.exercise[ + +- Create a configuration file for the Engine: + ```bash + echo '{"userns-remap": "default"}' | sudo tee /etc/docker/daemon.json + ``` + +- Restart the Engine: + ```bash + kill $(pidof dockerd) + ``` + +] + +--- + +class: namespaces + +## Checking that User Namespaces are enabled + +.exercise[ + - Notice the new Docker path: + ```bash + docker info | grep var/lib + ``` + + - Notice the new UID:GID permissions: + ```bash + sudo ls -l /var/lib/docker + ``` + +] + +You should see a line like the following: +``` +drwx------ 11 296608 296608 4096 Aug 3 05:11 296608.296608 +``` + +--- + +class: namespaces + +## Add the node back to the Swarm + +.exercise[ + +- Get our manager token from another node: + ```bash + ssh node`Y` docker swarm join-token manager + ``` + +- Copy-paste the join command to the node + +] + +--- + +class: namespaces + +## Check the new UID:GID + +.exercise[ + +- Run a background container on the node: + ```bash + docker run -d --name lockdown alpine sleep 1000000 + ``` + +- Look at the processes in this container: + ```bash + docker top lockdown + ps faux + ``` + +] + +--- + +class: namespaces + +## Comparing on-disk ownership with/without User Namespaces + +.exercise[ + +- Compare the output of the two following commands: + ```bash + docker run alpine ls -l / + docker run --userns=host alpine ls -l / + ``` + +] + +-- + +class: namespaces + +In the first case, it looks like things belong to `root:root`. + +In the second case, we will see the "real" (on-disk) ownership. + +-- + +class: namespaces + +Remember to get back to `node1` when finished! diff --git a/docs/netshoot.md b/docs/netshoot.md new file mode 100644 index 00000000..7c8d6cd5 --- /dev/null +++ b/docs/netshoot.md @@ -0,0 +1,385 @@ +class: extra-details + +## Troubleshooting overlay networks + + + +- We want to run tools like `ab` or `httping` on the internal network + +-- + +class: extra-details + +- Ah, if only we had created our overlay network with the `--attachable` flag ... + +-- + +class: extra-details + +- Oh well, let's use this as an excuse to introduce New Ways To Do Things + +--- + +# Breaking into an overlay network + +- We will create a dummy placeholder service on our network + +- Then we will use `docker exec` to run more processes in this container + +.exercise[ + +- Start a "do nothing" container using our favorite Swiss-Army distro: + ```bash + docker service create --network dockercoins_default --name debug \ + --constraint node.hostname==$HOSTNAME alpine sleep 1000000000 + ``` + +] + +The `constraint` makes sure that the container will be created on the local node. 
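
For reference, this is roughly what the `--attachable` approach mentioned earlier
would have looked like (a sketch only, with a made-up network name; it applies to
networks created by hand, not to the one our stack already created):

```bash
# Create an attachable overlay network ...
docker network create --driver overlay --attachable mynet
# ... then ordinary containers can join it directly:
docker run --rm -ti --network mynet alpine sh
```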
+ +--- + +## Entering the debug container + +- Once our container is started (which should be really fast because the alpine image is small), we can enter it (from any node) + +.exercise[ + +- Locate the container: + ```bash + docker ps + ``` + +- Enter it: + ```bash + docker exec -ti sh + ``` + +] + +--- + +## Labels + +- We can also be fancy and find the ID of the container automatically + +- SwarmKit places labels on containers + +.exercise[ + +- Get the ID of the container: + ```bash + CID=$(docker ps -q --filter label=com.docker.swarm.service.name=debug) + ``` + +- And enter the container: + ```bash + docker exec -ti $CID sh + ``` + +] + +--- + +## Installing our debugging tools + +- Ideally, you would author your own image, with all your favorite tools, and use it instead of the base `alpine` image + +- But we can also dynamically install whatever we need + +.exercise[ + +- Install a few tools: + ```bash + apk add --update curl apache2-utils drill + ``` + +] + +--- + +## Investigating the `rng` service + +- First, let's check what `rng` resolves to + +.exercise[ + +- Use drill or nslookup to resolve `rng`: + ```bash + drill rng + ``` + +] + +This give us one IP address. It is not the IP address of a container. +It is a virtual IP address (VIP) for the `rng` service. + +--- + +## Investigating the VIP + +.exercise[ + +- Try to ping the VIP: + ```bash + ping rng + ``` + +] + +It *should* ping. (But this might change in the future.) + +With Engine 1.12: VIPs respond to ping if a +backend is available on the same machine. + +With Engine 1.13: VIPs respond to ping if a +backend is available anywhere. + +(Again: this might change in the future.) + +--- + +## What if I don't like VIPs? + +- Services can be published using two modes: VIP and DNSRR. + +- With VIP, you get a virtual IP for the service, and a load balancer + based on IPVS + + (By the way, IPVS is totally awesome and if you want to learn more about it in the context of containers, + I highly recommend [this talk](https://www.youtube.com/watch?v=oFsJVV1btDU&index=5&list=PLkA60AVN3hh87OoVra6MHf2L4UR9xwJkv) by [@kobolog](https://twitter.com/kobolog) at DC15EU!) + +- With DNSRR, you get the former behavior (from Engine 1.11), where + resolving the service yields the IP addresses of all the containers for + this service + +- You change this with `docker service create --endpoint-mode [VIP|DNSRR]` + +--- + +## Looking up VIP backends + +- You can also resolve a special name: `tasks.` + +- It will give you the IP addresses of the containers for a given service + +.exercise[ + +- Obtain the IP addresses of the containers for the `rng` service: + ```bash + drill tasks.rng + ``` + +] + +This should list 5 IP addresses. + +--- + +class: extra-details, benchmarking + +## Testing and benchmarking our service + +- We will check that the service is up with `rng`, then + benchmark it with `ab` + +.exercise[ + +- Make a test request to the service: + ```bash + curl rng + ``` + +- Open another window, and stop the workers, to test in isolation: + ```bash + docker service update dockercoins_worker --replicas 0 + ``` + +] + +Wait until the workers are stopped (check with `docker service ls`) +before continuing. + +--- + +class: extra-details, benchmarking + +## Benchmarking `rng` + +We will send 50 requests, but with various levels of concurrency. 
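
When you are done with the benchmarks, remember to bring the workers back; a
one-line sketch (adjust the replica count to whatever you were using before):

```bash
# Restore the worker service after benchmarking
docker service update dockercoins_worker --replicas 10
```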
+ +.exercise[ + +- Send 50 requests, with a single sequential client: + ```bash + ab -c 1 -n 50 http://rng/10 + ``` + +- Send 50 requests, with fifty parallel clients: + ```bash + ab -c 50 -n 50 http://rng/10 + ``` + +] + +--- + +class: extra-details, benchmarking + +## Benchmark results for `rng` + +- When serving requests sequentially, they each take 100ms + +- In the parallel scenario, the latency increased dramatically: + +- What about `hasher`? + +--- + +class: extra-details, benchmarking + +## Benchmarking `hasher` + +We will do the same tests for `hasher`. + +The command is slightly more complex, since we need to post random data. + +First, we need to put the POST payload in a temporary file. + +.exercise[ + +- Install curl in the container, and generate 10 bytes of random data: + ```bash + curl http://rng/10 >/tmp/random + ``` + +] + +--- + +class: extra-details, benchmarking + +## Benchmarking `hasher` + +Once again, we will send 50 requests, with different levels of concurrency. + +.exercise[ + +- Send 50 requests with a sequential client: + ```bash + ab -c 1 -n 50 -T application/octet-stream -p /tmp/random http://hasher/ + ``` + +- Send 50 requests with 50 parallel clients: + ```bash + ab -c 50 -n 50 -T application/octet-stream -p /tmp/random http://hasher/ + ``` + +] + +--- + +class: extra-details, benchmarking + +## Benchmark results for `hasher` + +- The sequential benchmarks takes ~5 seconds to complete + +- The parallel benchmark takes less than 1 second to complete + +- In both cases, each request takes a bit more than 100ms to complete + +- Requests are a bit slower in the parallel benchmark + +- It looks like `hasher` is better equiped to deal with concurrency than `rng` + +--- + +class: extra-details, title, benchmarking + +Why? + +--- + +class: extra-details, benchmarking + +## Why does everything take (at least) 100ms? + +`rng` code: + +![RNG code screenshot](delay-rng.png) + +`hasher` code: + +![HASHER code screenshot](delay-hasher.png) + +--- + +class: extra-details, title, benchmarking + +But ... + +WHY?!? + +--- + +class: extra-details, benchmarking + +## Why did we sprinkle this sample app with sleeps? + +- Deterministic performance +
(regardless of instance speed, CPUs, I/O...) + +- Actual code sleeps all the time anyway + +- When your code makes a remote API call: + + - it sends a request; + + - it sleeps until it gets the response; + + - it processes the response. + +--- + +class: extra-details, in-person, benchmarking + +## Why do `rng` and `hasher` behave differently? + +![Equations on a blackboard](equations.png) + +(Synchronous vs. asynchronous event processing) + +--- + +class: extra-details + +## Global scheduling → global debugging + +- Traditional approach: + + - log into a node + - install our Swiss Army Knife (if necessary) + - troubleshoot things + +- Proposed alternative: + + - put our Swiss Army Knife in a container (e.g. [nicolaka/netshoot](https://hub.docker.com/r/nicolaka/netshoot/)) + - run tests from multiple locations at the same time + +(This becomes very practical with the `docker service log` command, available since 17.05.) + +--- + +## More about overlay networks + +.blackbelt[[Deep Dive in Docker Overlay Networks](https://www.youtube.com/watch?v=b3XDl0YsVsg&index=1&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8) by Laurent Bernaille (DC17US)] + +.blackbelt[Deeper Dive in Docker Overlay Networks by Laurent Bernaille (Wednesday 13:30)] diff --git a/docs/nodeinfo.md b/docs/nodeinfo.md new file mode 100644 index 00000000..43de77a3 --- /dev/null +++ b/docs/nodeinfo.md @@ -0,0 +1,18 @@ +## Getting task information for a given node + +- You can see all the tasks assigned to a node with `docker node ps` + +- It shows the *desired state* and *current state* of each task + +- `docker node ps` shows info about the current node + +- `docker node ps ` shows info for another node + +- `docker node ps -f ` allows to select which tasks to show + + ```bash + # Show only tasks that are supposed to be running + docker node ps -f desired-state=running + # Show only tasks whose name contains the string "front" + docker node ps -f name=front + ``` diff --git a/docs/operatingswarm.md b/docs/operatingswarm.md new file mode 100644 index 00000000..8c81c921 --- /dev/null +++ b/docs/operatingswarm.md @@ -0,0 +1,58 @@ +class: title, in-person + +Operating the Swarm + +--- + +name: part-2 + +class: title, self-paced + +Part 2 + +--- + +class: self-paced + +## Before we start ... + +The following exercises assume that you have a 5-nodes Swarm cluster. + +If you come here from a previous tutorial and still have your cluster: great! + +Otherwise: check [part 1](#part-1) to learn how to set up your own cluster. + +We pick up exactly where we left you, so we assume that you have: + +- a five nodes Swarm cluster, + +- a self-hosted registry, + +- DockerCoins up and running. + +The next slide has a cheat sheet if you need to set that up in a pinch. + +--- + +class: self-paced + +## Catching up + +Assuming you have 5 nodes provided by +[Play-With-Docker](http://www.play-with-docker/), do this from `node1`: + +```bash +docker swarm init --advertise-addr eth0 +TOKEN=$(docker swarm join-token -q manager) +for N in $(seq 2 5); do + DOCKER_HOST=tcp://node$N:2375 docker swarm join --token $TOKEN node1:2377 +done +git clone git://github.com/jpetazzo/orchestration-workshop +cd orchestration-workshop/stacks +docker stack deploy --compose-file registry.yml registry +docker-compose -f dockercoins.yml build +docker-compose -f dockercoins.yml push +docker stack deploy --compose-file dockercoins.yml dockercoins +``` + +You should now be able to connect to port 8000 and see the DockerCoins web UI. 
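+
+If you prefer a quick sanity check from the command line before opening a browser, listing the stack's services should show all of them running (something along these lines):
+
+```bash
+docker stack services dockercoins
+```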
diff --git a/docs/ourapponkube.md b/docs/ourapponkube.md new file mode 100644 index 00000000..bb5b6c75 --- /dev/null +++ b/docs/ourapponkube.md @@ -0,0 +1,357 @@ +class: title + +Our app on Kube + +--- + +## What's on the menu? + +In this part, we will: + +- **build** images for our app, + +- **ship** these images with a registry, + +- **run** deployments using these images, + +- expose these deployments so they can communicate with each other, + +- expose the web UI so we can access it from outside. + +--- + +## The plan + +- Build on our control node (`node1`) + +- Tag images so that they are named `$REGISTRY/servicename` + +- Upload them to a registry + +- Create deployments using the images + +- Expose (with a ClusterIP) the services that need to communicate + +- Expose (with a NodePort) the WebUI + +--- + +## Which registry do we want to use? + +- We could use the Docker Hub + +- Or a service offered by our cloud provider (GCR, ECR...) + +- Or we could just self-host that registry + +*We'll self-host the registry because it's the most generic solution for this workshop.* + +--- + +## Using the open source registry + +- We need to run a `registry:2` container +
(make sure you specify tag `:2` to run the new version!) + +- It will store images and layers to the local filesystem +
(but you can add a config file to use S3, Swift, etc.)
+
+- Docker *requires* TLS when communicating with the registry
+
+  - except for registries on `127.0.0.0/8` (i.e. `localhost`)
+
+  - or with the Engine flag `--insecure-registry`
+
+- Our strategy: publish the registry container on a NodePort,
+  <br/>
so that it's available through `127.0.0.1:xxxxx` on each node + +--- + +# Deploying a self-hosted registry + +- We will deploy a registry container, and expose it with a NodePort + +.exercise[ + +- Create the registry service: + ```bash + kubectl run registry --image=registry:2 + ``` + +- Expose it on a NodePort: + ```bash + kubectl expose deploy/registry --port=5000 --type=NodePort + ``` + +] + +--- + +## Connecting to our registry + +- We need to find out which port has been allocated + +.exercise[ + +- View the service details: + ```bash + kubectl describe svc/registry + ``` + +- Get the port number programmatically: + ```bash + NODEPORT=$(kubectl get svc/registry -o json | jq .spec.ports[0].nodePort) + REGISTRY=127.0.0.1:$NODEPORT + ``` + +] + +--- + +## Testing our registry + +- A convenient Docker registry API route to remember is `/v2/_catalog` + +.exercise[ + +- View the repositories currently held in our registry: + ```bash + curl $REGISTRY/v2/_catalog + ``` + +] + +-- + +We should see: +```json +{"repositories":[]} +``` + +--- + +## Testing our local registry + +- We can retag a small image, and push it to the registry + +.exercise[ + +- Make sure we have the busybox image, and retag it: + ```bash + docker pull busybox + docker tag busybox $REGISTRY/busybox + ``` + +- Push it: + ```bash + docker push $REGISTRY/busybox + ``` + +] + +--- + +## Checking again what's on our local registry + +- Let's use the same endpoint as before + +.exercise[ + +- Ensure that our busybox image is now in the local registry: + ```bash + curl $REGISTRY/v2/_catalog + ``` + +] + +The curl command should now output: +```json +{"repositories":["busybox"]} +``` + +--- + +## Building and pushing our images + +- We are going to use a convenient feature of Docker Compose + +.exercise[ + +- Go to the `stacks` directory: + ```bash + cd ~/orchestration-workshop/stacks + ``` + +- Build and push the images: + ```bash + export REGISTRY + docker-compose -f dockercoins.yml build + docker-compose -f dockercoins.yml push + ``` + +] + +Let's have a look at the `dockercoins.yml` file while this is building and pushing. + +--- + +```yaml +version: "3" + +services: + rng: + build: dockercoins/rng + image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest} + deploy: + mode: global + ... + redis: + image: redis + ... + worker: + build: dockercoins/worker + image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest} + ... + deploy: + replicas: 10 +``` + +.warning[Just in case you were wondering ... Docker "services" are not Kubernetes "services".] + +--- + +## Deploying all the things + +- We can now deploy our code (as well as a redis instance) + +.exercise[ + +- Deploy `redis`: + ```bash + kubectl run redis --image=redis + ``` + +- Deploy everything else: + ```bash + for SERVICE in hasher rng webui worker; do + kubectl run $SERVICE --image=$REGISTRY/$SERVICE + done + ``` + +] + +--- + +## Is this working? + +- After waiting for the deployment to complete, let's look at the logs! + + (Hint: use `kubectl get deploy -w` to watch deployment events) + +.exercise[ + +- Look at some logs: + ```bash + kubectl logs deploy/rng + kubectl logs deploy/worker + ``` + +] + +-- + +🤔 `rng` is fine ... But not `worker`. + +-- + +💡 Oh right! We forgot to `expose`. 
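+
+Before fixing that, we can confirm that nothing has been exposed yet: at this point, the listing below should only show the default `kubernetes` service and our `registry`.
+
+```bash
+kubectl get services
+```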
+ +--- + +# Exposing services internally + +- Three deployments need to be reachable by others: `hasher`, `redis`, `rng` + +- `worker` doesn't need to be exposed + +- `webui` will be dealt with later + +.exercise[ + +- Expose each deployment, specifying the right port: + ```bash + kubectl expose deployment redis --port 6379 + kubectl expose deployment rng --port 80 + kubectl expose deployment hasher --port 80 + ``` + +] + +--- + +## Is this working yet? + +- The `worker` has an infinite loop, that retries 10 seconds after an error + +.exercise[ + +- Stream the worker's logs: + ```bash + kubectl logs deploy/worker --follow + ``` + + (Give it about 10 seconds to recover) + + + +] + +-- + +We should now see the `worker`, well, working happily. + +--- + +# Exposing services for external access + +- Now we would like to access the Web UI + +- We will expose it with a `NodePort` + + (just like we did for the registry) + +.exercise[ + +- Create a `NodePort` service for the Web UI: + ```bash + kubectl expose deploy/webui --type=NodePort --port=80 + ``` + +- Check the port that was allocated: + ```bash + kubectl get svc + ``` + +] + +--- + +## Accessing the web UI + +- We can now connect to *any node*, on the allocated node port, to view the web UI + +.exercise[ + +- Open the web UI in your browser (http://node-ip-address:3xxxx/) + + + +] + +-- + +*Alright, we're back to where we started, when we were running on a single node!* \ No newline at end of file diff --git a/docs/ourapponswarm.md b/docs/ourapponswarm.md new file mode 100644 index 00000000..cfe7343c --- /dev/null +++ b/docs/ourapponswarm.md @@ -0,0 +1,979 @@ +class: title + +Our app on Swarm + +--- + +## What's on the menu? + +In this part, we will: + +- **build** images for our app, + +- **ship** these images with a registry, + +- **run** services using these images. + +--- + +## Why do we need to ship our images? + +- When we do `docker-compose up`, images are built for our services + +- These images are present only on the local node + +- We need these images to be distributed on the whole Swarm + +- The easiest way to achieve that is to use a Docker registry + +- Once our images are on a registry, we can reference them when + creating our services + +--- + +class: extra-details + +## Build, ship, and run, for a single service + +If we had only one service (built from a `Dockerfile` in the +current directory), our workflow could look like this: + +``` +docker build -t jpetazzo/doublerainbow:v0.1 . +docker push jpetazzo/doublerainbow:v0.1 +docker service create jpetazzo/doublerainbow:v0.1 +``` + +We just have to adapt this to our application, which has 4 services! + +--- + +## The plan + +- Build on our local node (`node1`) + +- Tag images so that they are named `localhost:5000/servicename` + +- Upload them to a registry + +- Create services using the images + +--- + +## Which registry do we want to use? + +.small[ + +- **Docker Hub** + + - hosted by Docker Inc. 
+ - requires an account (free, no credit card needed) + - images will be public (unless you pay) + - located in AWS EC2 us-east-1 + +- **Docker Trusted Registry** + + - self-hosted commercial product + - requires a subscription (free 30-day trial available) + - images can be public or private + - located wherever you want + +- **Docker open source registry** + + - self-hosted barebones repository hosting + - doesn't require anything + - doesn't come with anything either + - located wherever you want + +] + +--- + +class: extra-details + +## Using Docker Hub + +*If we wanted to use the Docker Hub...* + + + +- We would log into the Docker Hub: + ```bash + docker login + ``` + +- And in the following slides, we would use our Docker Hub login + (e.g. `jpetazzo`) instead of the registry address (i.e. `127.0.0.1:5000`) + + + +--- + +class: extra-details + +## Using Docker Trusted Registry + +*If we wanted to use DTR, we would...* + +- Make sure we have a Docker Hub account + +- [Activate a Docker Datacenter subscription]( + https://hub.docker.com/enterprise/trial/) + +- Install DTR on our machines + +- Use `dtraddress:port/user` instead of the registry address + +*This is out of the scope of this workshop!* + +--- + +## Using the open source registry + +- We need to run a `registry:2` container +
(make sure you specify tag `:2` to run the new version!) + +- It will store images and layers to the local filesystem +
(but you can add a config file to use S3, Swift, etc.) + +- Docker *requires* TLS when communicating with the registry + + - unless for registries on `127.0.0.0/8` (i.e. `localhost`) + + - or with the Engine flag `--insecure-registry` + + + +- Our strategy: publish the registry container on port 5000, +
so that it's available through `127.0.0.1:5000` on each node + +--- + +class: manual-btp + +# Deploying a local registry + +- We will create a single-instance service, publishing its port + on the whole cluster + +.exercise[ + +- Create the registry service: + ```bash + docker service create --name registry --publish 5000:5000 registry:2 + ``` + +- Now try the following command; it should return `{"repositories":[]}`: + ```bash + curl 127.0.0.1:5000/v2/_catalog + ``` + +] + +(If that doesn't work, wait a few seconds and try again.) + +--- + +class: manual-btp + +## Testing our local registry + +- We can retag a small image, and push it to the registry + +.exercise[ + +- Make sure we have the busybox image, and retag it: + ```bash + docker pull busybox + docker tag busybox 127.0.0.1:5000/busybox + ``` + +- Push it: + ```bash + docker push 127.0.0.1:5000/busybox + ``` + +] + +--- + +class: manual-btp + +## Checking what's on our local registry + +- The registry API has endpoints to query what's there + +.exercise[ + +- Ensure that our busybox image is now in the local registry: + ```bash + curl http://127.0.0.1:5000/v2/_catalog + ``` + +] + +The curl command should now output: +```json +{"repositories":["busybox"]} +``` + +--- + +class: manual-btp + +## Build, tag, and push our application container images + +- Compose has named our images `dockercoins_XXX` for each service + +- We need to retag them (to `127.0.0.1:5000/XXX:v1`) and push them + +.exercise[ + +- Set `REGISTRY` and `TAG` environment variables to use our local registry +- And run this little for loop: + ```bash + cd ~/orchestration-workshop/dockercoins + REGISTRY=127.0.0.1:5000 TAG=v1 + for SERVICE in hasher rng webui worker; do + docker tag dockercoins_$SERVICE $REGISTRY/$SERVICE:$TAG + docker push $REGISTRY/$SERVICE + done + ``` + +] + +--- + +class: manual-btp + +# Overlay networks + +- SwarmKit integrates with overlay networks + +- Networks are created with `docker network create` + +- Make sure to specify that you want an *overlay* network +
(otherwise you will get a local *bridge* network by default) + +.exercise[ + +- Create an overlay network for our application: + ```bash + docker network create --driver overlay dockercoins + ``` + +] + +--- + +class: manual-btp + +## Viewing existing networks + +- Let's confirm that our network was created + +.exercise[ + +- List existing networks: + ```bash + docker network ls + ``` + +] + +--- + +class: manual-btp + +## Can you spot the differences? + +The networks `dockercoins` and `ingress` are different from the other ones. + +Can you see how? + +-- + +class: manual-btp + +- They are using a different kind of ID, reflecting the fact that they + are SwarmKit objects instead of "classic" Docker Engine objects. + +- Their *scope* is `swarm` instead of `local`. + +- They are using the overlay driver. + +--- + +class: manual-btp, extra-details + +## Caveats + +.warning[In Docker 1.12, you cannot join an overlay network with `docker run --net ...`.] + +Starting with version 1.13, you can, if the network was created with the `--attachable` flag. + +*Why is that?* + +Placing a container on a network requires allocating an IP address for this container. + +The allocation must be done by a manager node (worker nodes cannot update Raft data). + +As a result, `docker run --net ...` requires collaboration with manager nodes. + +It alters the code path for `docker run`, so it is allowed only under strict circumstances. + +--- + +class: manual-btp + +## Run the application + +- First, create the `redis` service; that one is using a Docker Hub image + +.exercise[ + +- Create the `redis` service: + ```bash + docker service create --network dockercoins --name redis redis + ``` + +] + +--- + +class: manual-btp + +## Run the other services + +- Then, start the other services one by one + +- We will use the images pushed previously + +.exercise[ + +- Start the other services: + ```bash + REGISTRY=127.0.0.1:5000 + TAG=v1 + for SERVICE in hasher rng webui worker; do + docker service create --network dockercoins --detach=true \ + --name $SERVICE $REGISTRY/$SERVICE:$TAG + done + ``` + +] + +??? + +## Wait for our application to be up + +- We will see later a way to watch progress for all the tasks of the cluster + +- But for now, a scrappy Shell loop will do the trick + +.exercise[ + +- Repeatedly display the status of all our services: + ```bash + watch "docker service ls -q | xargs -n1 docker service ps" + ``` + +- Stop it once everything is running + +] + +--- + +class: manual-btp + +## Expose our application web UI + +- We need to connect to the `webui` service, but it is not publishing any port + +- Let's reconfigure it to publish a port + +.exercise[ + +- Update `webui` so that we can connect to it from outside: + ```bash + docker service update webui --publish-add 8000:80 --detach=false + ``` + +] + +Note: to "de-publish" a port, you would have to specify the container port. +
(i.e. in that case, `--publish-rm 80`) + +--- + +class: manual-btp + +## What happens when we modify a service? + +- Let's find out what happened to our `webui` service + +.exercise[ + +- Look at the tasks and containers associated to `webui`: + ```bash + docker service ps webui + ``` +] + +-- + +class: manual-btp + +The first version of the service (the one that was not exposed) has been shutdown. + +It has been replaced by the new version, with port 80 accessible from outside. + +(This will be discussed with more details in the section about stateful services.) + +--- + +class: manual-btp + +## Connect to the web UI + +- The web UI is now available on port 8000, *on all the nodes of the cluster* + +.exercise[ + +- If you're using Play-With-Docker, just click on the `(8000)` badge + +- Otherwise, point your browser to any node, on port 8000 + +] + +--- + +## Scaling the application + +- We can change scaling parameters with `docker update` as well + +- We will do the equivalent of `docker-compose scale` + +.exercise[ + +- Bring up more workers: + ```bash + docker service update worker --replicas 10 --detach=false + ``` + +- Check the result in the web UI + +] + +You should see the performance peaking at 10 hashes/s (like before). + +--- + +class: manual-btp + +# Global scheduling + +- We want to utilize as best as we can the entropy generators + on our nodes + +- We want to run exactly one `rng` instance per node + +- SwarmKit has a special scheduling mode for that, let's use it + +- We cannot enable/disable global scheduling on an existing service + +- We have to destroy and re-create the `rng` service + +--- + +class: manual-btp + +## Scaling the `rng` service + +.exercise[ + +- Remove the existing `rng` service: + ```bash + docker service rm rng + ``` + +- Re-create the `rng` service with *global scheduling*: + ```bash + docker service create --name rng --network dockercoins --mode global \ + --detach=false $REGISTRY/rng:$TAG + ``` + +- Look at the result in the web UI + +] + +--- + +class: extra-details, manual-btp + +## Why do we have to re-create the service to enable global scheduling? + +- Enabling it dynamically would make rolling updates semantics very complex + +- This might change in the future (after all, it was possible in 1.12 RC!) + +- As of Docker Engine 17.05, other parameters requiring to `rm`/`create` the service are: + + - service name + + - hostname + + - network + +--- + +class: swarm-ready + +## How did we make our app "Swarm-ready"? + +This app was written in June 2015. (One year before Swarm mode was released.) + +What did we change to make it compatible with Swarm mode? + +-- + +.exercise[ + +- Go to the app directory: + ```bash + cd ~/orchestration-workshop/dockercoins + ``` + +- See modifications in the code: + ```bash + git log -p --since "4-JUL-2015" -- . ':!*.yml*' ':!*.html' + ``` + + + + +] + +--- + +class: swarm-ready + +## What did we change in our app since its inception? + +- Compose files + +- HTML file (it contains an embedded contextual tweet) + +- Dockerfiles (to switch to smaller images) + +- That's it! + +-- + +class: swarm-ready + +*We didn't change a single line of code in this app since it was written.* + +-- + +class: swarm-ready + +*The images that were [built in June 2015]( +https://hub.docker.com/r/jpetazzo/dockercoins_worker/tags/) +(when the app was written) can still run today ... +
... in Swarm mode (distributed across a cluster, with load balancing) ... +
... without any modification.* + +--- + +class: swarm-ready + +## How did we design our app in the first place? + +- [Twelve-Factor App](https://12factor.net/) principles + +- Service discovery using DNS names + + - Initially implemented as "links" + + - Then "ambassadors" + + - And now "services" + +- Existing apps might require more changes! + +--- + +class: manual-btp + +# Integration with Compose + +- The previous section showed us how to streamline image build and push + +- We will now see how to streamline service creation + + (i.e. get rid of the `for SERVICE in ...; do docker service create ...` part) + +--- + +## Compose file version 3 + +(New in Docker Engine 1.13) + +- Almost identical to version 2 + +- Can be directly used by a Swarm cluster through `docker stack ...` commands + +- Introduces a `deploy` section to pass Swarm-specific parameters + +- Resource limits are moved to this `deploy` section + +- See [here](https://github.com/aanand/docker.github.io/blob/8524552f99e5b58452fcb1403e1c273385988b71/compose/compose-file.md#upgrading) for the complete list of changes + +- Supersedes *Distributed Application Bundles* + + (JSON payload describing an application; could be generated from a Compose file) + +--- + +class: manual-btp + +## Removing everything + +- Before deploying using "stacks," let's get a clean slate + +.exercise[ + +- Remove *all* the services: + ```bash + docker service ls -q | xargs docker service rm + ``` + +] + +--- + +## Our first stack + +We need a registry to move images around. + +Without a stack file, it would be deployed with the following command: + +```bash +docker service create --publish 5000:5000 registry:2 +``` + +Now, we are going to deploy it with the following stack file: + +```yaml +version: "3" + +services: + registry: + image: registry:2 + ports: + - "5000:5000" +``` + +--- + +## Checking our stack files + +- All the stack files that we will use are in the `stacks` directory + +.exercise[ + +- Go to the `stacks` directory: + ```bash + cd ~/orchestration-workshop/stacks + ``` + +- Check `registry.yml`: + ```bash + cat registry.yml + ``` + +] + +--- + +## Deploying our first stack + +- All stack manipulation commands start with `docker stack` + +- Under the hood, they map to `docker service` commands + +- Stacks have a *name* (which also serves as a namespace) + +- Stacks are specified with the aforementioned Compose file format version 3 + +.exercise[ + +- Deploy our local registry: + ```bash + docker stack deploy registry --compose-file registry.yml + ``` + +] + +--- + +## Inspecting stacks + +- `docker stack ps` shows the detailed state of all services of a stack + +.exercise[ + +- Check that our registry is running correctly: + ```bash + docker stack ps registry + ``` + +- Confirm that we get the same output with the following command: + ```bash + docker service ps registry_registry + ``` + +] + +--- + +class: manual-btp + +## Specifics of stack deployment + +Our registry is not *exactly* identical to the one deployed with `docker service create`! + +- Each stack gets its own overlay network + +- Services of the task are connected to this network +
(unless specified differently in the Compose file) + +- Services get network aliases matching their name in the Compose file +
(just like when Compose brings up an app specified in a v2 file) + +- Services are explicitly named `_` + +- Services and tasks also get an internal label indicating which stack they belong to + +--- + +class: auto-btp + +## Testing our local registry + +- Connecting to port 5000 *on any node of the cluster* routes us to the registry + +- Therefore, we can use `localhost:5000` or `127.0.0.1:5000` as our registry + +.exercise[ + +- Issue the following API request to the registry: + ```bash + curl 127.0.0.1:5000/v2/_catalog + ``` + +] + +It should return: + +```json +{"repositories":[]} +``` + +If that doesn't work, retry a few times; perhaps the container is still starting. + +--- + +class: auto-btp + +## Pushing an image to our local registry + +- We can retag a small image, and push it to the registry + +.exercise[ + +- Make sure we have the busybox image, and retag it: + ```bash + docker pull busybox + docker tag busybox 127.0.0.1:5000/busybox + ``` + +- Push it: + ```bash + docker push 127.0.0.1:5000/busybox + ``` + +] + +--- + +class: auto-btp + +## Checking what's on our local registry + +- The registry API has endpoints to query what's there + +.exercise[ + +- Ensure that our busybox image is now in the local registry: + ```bash + curl http://127.0.0.1:5000/v2/_catalog + ``` + +] + +The curl command should now output: +```json +"repositories":["busybox"]} +``` + +--- + +## Building and pushing stack services + +- When using Compose file version 2 and above, you can specify *both* `build` and `image` + +- When both keys are present: + + - Compose does "business as usual" (uses `build`) + + - but the resulting image is named as indicated by the `image` key +
+ (instead of `_:latest`) + + - it can be pushed to a registry with `docker-compose push` + +- Example: + + ```yaml + webfront: + build: www + image: myregistry.company.net:5000/webfront + ``` + +--- + +## Using Compose to build and push images + +.exercise[ + +- Try it: + ```bash + docker-compose -f dockercoins.yml build + docker-compose -f dockercoins.yml push + ``` + +] + +Let's have a look at the `dockercoins.yml` file while this is building and pushing. + +--- + +```yaml +version: "3" + +services: + rng: + build: dockercoins/rng + image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest} + deploy: + mode: global + ... + redis: + image: redis + ... + worker: + build: dockercoins/worker + image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest} + ... + deploy: + replicas: 10 +``` + +--- + +## Deploying the application + +- Now that the images are on the registry, we can deploy our application stack + +.exercise[ + +- Create the application stack: + ```bash + docker stack deploy dockercoins --compose-file dockercoins.yml + ``` + +] + +We can now connect to any of our nodes on port 8000, and we will see the familiar hashing speed graph. + +--- + +## Maintaining multiple environments + +There are many ways to handle variations between environments. + +- Compose loads `docker-compose.yml` and (if it exists) `docker-compose.override.yml` + +- Compose can load alternate file(s) by setting the `-f` flag or the `COMPOSE_FILE` environment variable + +- Compose files can *extend* other Compose files, selectively including services: + + ```yaml + web: + extends: + file: common-services.yml + service: webapp + ``` + +See [this documentation page](https://docs.docker.com/compose/extends/) for more details about these techniques. + + +--- + +class: extra-details + +## Good to know ... + +- Compose file version 3 adds the `deploy` section + +- Further versions (3.1, ...) add more features (secrets, configs ...) + +- You can re-run `docker stack deploy` to update a stack + +- You can make manual changes with `docker service update` ... + +- ... But they will be wiped out each time you `docker stack deploy` + + (That's the intended behavior, when one thinks about it!) + +- `extends` doesn't work with `docker stack deploy` + + (But you can use `docker-compose config` to "flatten" your configuration) + +--- + +## Summary + +- We've seen how to set up a Swarm + +- We've used it to host our own registry + +- We've built our app container images + +- We've used the registry to host those images + +- We've deployed and scaled our application + +- We've seen how to use Compose to streamline deployments + +- Awesome job, team! diff --git a/docs/prereqs-k8s.md b/docs/prereqs-k8s.md new file mode 100644 index 00000000..0550d6bd --- /dev/null +++ b/docs/prereqs-k8s.md @@ -0,0 +1,161 @@ +# Pre-requirements + +- Computer with internet connection and a web browser + +- For instructor-led workshops: an SSH client to connect to remote machines + + - on Linux, OS X, FreeBSD... you are probably all set + + - on Windows, get [putty](http://www.putty.org/), + Microsoft [Win32 OpenSSH](https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH), + [Git BASH](https://git-for-windows.github.io/), or + [MobaXterm](http://mobaxterm.mobatek.net/) + +- A tiny little bit of Docker knowledge + + (that's totally OK if you're not a Docker expert!) + +--- + +class: in-person, extra-details + +## Nice-to-haves + +- [Mosh](https://mosh.org/) instead of SSH, if your internet connection tends to lose packets +
(available with `(apt|yum|brew) install mosh`; then connect with `mosh user@host`) + +- [GitHub](https://github.com/join) account +
(if you want to fork the repo) + +- [Slack](https://community.docker.com/registrations/groups/4316) account +
(to join the conversation after the workshop) + +- [Docker Hub](https://hub.docker.com) account +
(it's one way to distribute images on your cluster) + +--- + +class: extra-details + +## Extra details + +- This slide should have a little magnifying glass in the top left corner + + (If it doesn't, it's because CSS is hard — Jérôme is only a backend person, alas) + +- Slides with that magnifying glass indicate slides providing extra details + +- Feel free to skip them if you're in a hurry! + +--- + +## Hands-on sections + +- The whole workshop is hands-on + +- We will see Docker and Kubernetes in action + +- You are invited to reproduce all the demos + +- All hands-on sections are clearly identified, like the gray rectangle below + +.exercise[ + +- This is the stuff you're supposed to do! +- Go to [container.training](http://container.training/) to view these slides +- Join the chat room on @@CHAT@@ + + + +] + +--- + +class: pic, in-person + +![You get five VMs](you-get-five-vms.jpg) + + + +--- + +class: in-person + +## You get five VMs + +- Each person gets 5 private VMs (not shared with anybody else) +- Kubernetes has been deployed and pre-configured on these machines +- They'll remain up until the day after the tutorial +- You should have a little card with login+password+IP addresses +- You can automatically SSH from one VM to another + +.exercise[ + + + +- Log into the first VM (`node1`) with SSH or MOSH +- Check that you can SSH (without password) to `node2`: + ```bash + ssh node2 + ``` +- Type `exit` or `^D` to come back to node1 + + + +] + +--- + +## We will (mostly) interact with node1 only + +- Unless instructed, **all commands must be run from the first VM, `node1`** + +- We will only checkout/copy the code on `node1` + +- During normal operations, we do not need access to the other nodes + +- If we had to troubleshoot issues, we would use a combination of: + + - SSH (to access system logs, daemon status...) + + - Docker API (to check running containers and container engine status) + +--- + +## Terminals + +Once in a while, the instructions will say: +
"Open a new terminal." + +There are multiple ways to do this: + +- create a new window or tab on your machine, and SSH into the VM; + +- use screen or tmux on the VM and open a new window from there. + +You are welcome to use the method that you feel the most comfortable with. + +--- + +## Tmux cheatsheet + +- Ctrl-b c → creates a new window +- Ctrl-b n → go to next window +- Ctrl-b p → go to previous window +- Ctrl-b " → split window top/bottom +- Ctrl-b % → split window left/right +- Ctrl-b Alt-1 → rearrange windows in columns +- Ctrl-b Alt-2 → rearrange windows in rows +- Ctrl-b arrows → navigate to other windows +- Ctrl-b d → detach session +- tmux attach → reattach to session diff --git a/docs/prereqs.md b/docs/prereqs.md new file mode 100644 index 00000000..19268da9 --- /dev/null +++ b/docs/prereqs.md @@ -0,0 +1,226 @@ +# Pre-requirements + +- Computer with internet connection and a web browser + +- For instructor-led workshops: an SSH client to connect to remote machines + + - on Linux, OS X, FreeBSD... you are probably all set + + - on Windows, get [putty](http://www.putty.org/), + Microsoft [Win32 OpenSSH](https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH), + [Git BASH](https://git-for-windows.github.io/), or + [MobaXterm](http://mobaxterm.mobatek.net/) + +- For self-paced learning: SSH is not necessary if you use + [Play-With-Docker](http://www.play-with-docker.com/) + +- Some Docker knowledge + + (but that's OK if you're not a Docker expert!) + +--- + +class: in-person, extra-details + +## Nice-to-haves + +- [Mosh](https://mosh.org/) instead of SSH, if your internet connection tends to lose packets +
(available with `(apt|yum|brew) install mosh`; then connect with `mosh user@host`) + +- [GitHub](https://github.com/join) account +
(if you want to fork the repo) + +- [Slack](https://community.docker.com/registrations/groups/4316) account +
(to join the conversation after the workshop) + +- [Docker Hub](https://hub.docker.com) account +
(it's one way to distribute images on your cluster) + +--- + +class: extra-details + +## Extra details + +- This slide should have a little magnifying glass in the top left corner + + (If it doesn't, it's because CSS is hard — Jérôme is only a backend person, alas) + +- Slides with that magnifying glass indicate slides providing extra details + +- Feel free to skip them if you're in a hurry! + +--- + +## Hands-on sections + +- The whole workshop is hands-on + +- We will see Docker in action + +- You are invited to reproduce all the demos + +- All hands-on sections are clearly identified, like the gray rectangle below + +.exercise[ + +- This is the stuff you're supposed to do! +- Go to [container.training](http://container.training/) to view these slides +- Join the chat room on @@CHAT@@ + +] + +--- + +class: in-person + +# VM environment + +- To follow along, you need a cluster of five Docker Engines + +- If you are doing this with an instructor, see next slide + +- If you are doing (or re-doing) this on your own, you can: + + - create your own cluster (local or cloud VMs) with Docker Machine + ([instructions](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-machine)) + + - use [Play-With-Docker](http://play-with-docker.com) ([instructions](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker)) + + - create a bunch of clusters for you and your friends + ([instructions](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-vms)) + +--- + +class: pic, in-person + +![You get five VMs](you-get-five-vms.jpg) + +--- + +class: in-person + +## You get five VMs + +- Each person gets 5 private VMs (not shared with anybody else) +- They'll remain up until the day after the tutorial +- You should have a little card with login+password+IP addresses +- You can automatically SSH from one VM to another + +.exercise[ + + + +- Log into the first VM (`node1`) with SSH or MOSH +- Check that you can SSH (without password) to `node2`: + ```bash + ssh node2 + ``` +- Type `exit` or `^D` to come back to node1 + + + +] + +--- + +## If doing or re-doing the workshop on your own ... + +- Use [Play-With-Docker](http://www.play-with-docker.com/)! + +- Main differences: + + - you don't need to SSH to the machines +
(just click on the node that you want to control in the left tab bar) + + - Play-With-Docker automagically detects exposed ports +
(and displays them as little badges with port numbers, above the terminal) + + - You can access HTTP services by clicking on the port numbers + + - exposing TCP services requires something like + [ngrok](https://ngrok.com/) + or [supergrok](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker) + + + +--- + +class: self-paced + +## Using Play-With-Docker + +- Open a new browser tab to [www.play-with-docker.com](http://www.play-with-docker.com/) + +- Confirm that you're not a robot + +- Click on "ADD NEW INSTANCE": congratulations, you have your first Docker node! + +- When you will need more nodes, just click on "ADD NEW INSTANCE" again + +- Note the countdown in the corner; when it expires, your instances are destroyed + +- If you give your URL to somebody else, they can access your nodes too +
+  (You can use that for pair programming, or to get help from a mentor)
+
+- Loving it? Not loving it? Tell it to the wonderful authors,
+  [@marcosnils](https://twitter.com/marcosnils) &
+  [@xetorthio](https://twitter.com/xetorthio)!
+
+---
+
+## We will (mostly) interact with node1 only
+
+- Unless instructed, **all commands must be run from the first VM, `node1`**
+
+- We will only checkout/copy the code on `node1`
+
+- When we use the other nodes, we will mostly do so through the Docker API
+
+- We will log into other nodes only for initial setup and a few "out of band" operations
+  <br/>
(checking internal logs, debugging...) + +--- + +## Terminals + +Once in a while, the instructions will say: +
"Open a new terminal." + +There are multiple ways to do this: + +- create a new window or tab on your machine, and SSH into the VM; + +- use screen or tmux on the VM and open a new window from there. + +You are welcome to use the method that you feel the most comfortable with. + +--- + +## Tmux cheatsheet + +- Ctrl-b c → creates a new window +- Ctrl-b n → go to next window +- Ctrl-b p → go to previous window +- Ctrl-b " → split window top/bottom +- Ctrl-b % → split window left/right +- Ctrl-b Alt-1 → rearrange windows in columns +- Ctrl-b Alt-2 → rearrange windows in rows +- Ctrl-b arrows → navigate to other windows +- Ctrl-b d → detach session +- tmux attach → reattach to session diff --git a/docs/requirements.txt b/docs/requirements.txt new file mode 100644 index 00000000..e857b62a --- /dev/null +++ b/docs/requirements.txt @@ -0,0 +1,2 @@ +# This is for netlify +PyYAML diff --git a/docs/rollingupdates.md b/docs/rollingupdates.md new file mode 100644 index 00000000..235a93b7 --- /dev/null +++ b/docs/rollingupdates.md @@ -0,0 +1,139 @@ +# Rolling updates + +- Let's change a scaled service: `worker` + +.exercise[ + +- Edit `worker/worker.py` + +- Locate the `sleep` instruction and change the delay + +- Build, ship, and run our changes: + ```bash + export TAG=v0.4 + docker-compose -f dockercoins.yml build + docker-compose -f dockercoins.yml push + docker stack deploy -c dockercoins.yml dockercoins + ``` + +] + +--- + +## Viewing our update as it rolls out + +.exercise[ + +- Check the status of the `dockercoins_worker` service: + ```bash + watch docker service ps dockercoins_worker + ``` + +- Hide the tasks that are shutdown: + ```bash + watch -n1 "docker service ps dockercoins_worker | grep -v Shutdown.*Shutdown" + ``` + +] + +If you had stopped the workers earlier, this will automatically restart them. + +By default, SwarmKit does a rolling upgrade, one instance at a time. + +We should therefore see the workers being updated one my one. + +--- + +## Changing the upgrade policy + +- We can set upgrade parallelism (how many instances to update at the same time) + +- And upgrade delay (how long to wait between two batches of instances) + +.exercise[ + +- Change the parallelism to 2 and the delay to 5 seconds: + ```bash + docker service update dockercoins_worker \ + --update-parallelism 2 --update-delay 5s + ``` + +] + +The current upgrade will continue at a faster pace. + +--- + +## Changing the policy in the Compose file + +- The policy can also be updated in the Compose file + +- This is done by adding an `update_config` key under the `deploy` key: + + ```yaml + deploy: + replicas: 10 + update_config: + parallelism: 2 + delay: 10s + ``` + +--- + +## Rolling back + +- At any time (e.g. before the upgrade is complete), we can rollback: + + - by editing the Compose file and redeploying; + + - or with the special `--rollback` flag + +.exercise[ + +- Try to rollback the service: + ```bash + docker service update dockercoins_worker --rollback + ``` + +] + +What happens with the web UI graph? 
+ +--- + +## The fine print with rollback + +- Rollback reverts to the previous service definition + +- If we visualize successive updates as a stack: + + - it doesn't "pop" the latest update + + - it "pushes" a copy of the previous update on top + + - ergo, rolling back twice does nothing + +- "Service definition" includes rollout cadence + +- Each `docker service update` command = a new service definition + +--- + +class: extra-details + +## Timeline of an upgrade + +- SwarmKit will upgrade N instances at a time +
(following the `update-parallelism` parameter) + +- New tasks are created, and their desired state is set to `Ready` +
.small[(this pulls the image if necessary, ensures resource availability, creates the container ... without starting it)] + +- If the new tasks fail to get to `Ready` state, go back to the previous step +
.small[(SwarmKit will try again and again, until the situation is addressed or desired state is updated)] + +- When the new tasks are `Ready`, it sets the old tasks desired state to `Shutdown` + +- When the old tasks are `Shutdown`, it starts the new tasks + +- Then it waits for the `update-delay`, and continues with the next batch of instances diff --git a/docs/rollout.md b/docs/rollout.md new file mode 100644 index 00000000..4d98fc03 --- /dev/null +++ b/docs/rollout.md @@ -0,0 +1,206 @@ +# Rolling updates + +- By default (without rolling updates), when a scaled resource is updated: + + - new pods are created + + - old pods are terminated + + - ... all at the same time + + - if something goes wrong, ¯\\\_(ツ)\_/¯ + +--- + +## Rolling updates + +- With rolling updates, when a resource is updated, it happens progressively + +- Two parameters determine the pace of the rollout: `maxUnavailable` and `maxSurge` + +- They can be specified in absolute number of pods, or percentage of the `replicas` count + +- At any given time ... + + - there will always be at least `replicas`-`maxUnavailable` pods available + + - there will never be more than `replicas`+`maxSurge` pods in total + + - there will therefore be up to `maxUnavailable`+`maxSurge` pods being updated + +- We have the possibility to rollback to the previous version +
(if the update fails or is unsatisfactory in any way) + +--- + +## Rolling updates in practice + +- As of Kubernetes 1.8, we can do rolling updates with: + + `deployments`, `daemonsets`, `statefulsets` + +- Editing one of these resources will automatically result in a rolling update + +- Rolling updates can be monitored with the `kubectl rollout` subcommand + +--- + +## Building a new version of the `worker` service + +.exercise[ + +- Go to the `stack` directory: + ```bash + cd ~/orchestration-workshop/stacks + ``` + +- Edit `dockercoins/worker/worker.py`, update the `sleep` line to sleep 1 second + +- Build a new tag and push it to the registry: + ```bash + #export REGISTRY=localhost:3xxxx + export TAG=v0.2 + docker-compose -f dockercoins.yml build + docker-compose -f dockercoins.yml push + ``` + +] + +--- + +## Rolling out the new version of the `worker` service + +.exercise[ + +- Let's monitor what's going on by opening a few terminals, and run: + ```bash + kubectl get pods -w + kubectl get replicasets -w + kubectl get deployments -w + ``` + + + +- Update `worker` either with `kubectl edit`, or by running: + ```bash + kubectl set image deploy worker worker=$REGISTRY/worker:$TAG + ``` + +] + +-- + +That rollout should be pretty quick. What shows in the web UI? + +--- + +## Rolling out a boo-boo + +- What happens if we make a mistake? + +.exercise[ + +- Update `worker` by specifying a non-existent image: + ```bash + export TAG=v0.3 + kubectl set image deploy worker worker=$REGISTRY/worker:$TAG + ``` + +- Check what's going on: + ```bash + kubectl rollout status deploy worker + ``` + +] + +-- + +Our rollout is stuck. However, the app is not dead (just 10% slower). + +--- + +## Recovering from a bad rollout + +- We could push some `v0.3` image + + (the pod retry logic will eventually catch it and the rollout will proceed) + +- Or we could invoke a manual rollback + +.exercise[ + + + +- Cancel the deployment and wait for the dust to settle down: + ```bash + kubectl rollout undo deploy worker + kubectl rollout status deploy worker + ``` + +] + +--- + +## Changing rollout parameters + +- We want to: + + - revert to `v0.1` + - be conservative on availability (always have desired number of available workers) + - be aggressive on rollout speed (update more than one pod at a time) + - give some time to our workers to "warm up" before starting more + +The corresponding changes can be expressed in the following YAML snippet: + +.small[ +```yaml +spec: + template: + spec: + containers: + - name: worker + image: $REGISTRY/worker:v0.1 + strategy: + rollingUpdate: + maxUnavailable: 0 + maxSurge: 3 + minReadySeconds: 10 +``` +] + +--- + +## Applying changes through a YAML patch + +- We could use `kubectl edit deployment worker` + +- But we could also use `kubectl patch` with the exact YAML shown before + +.exercise[ + +.small[ + +- Apply all our changes and wait for them to take effect: + ```bash + kubectl patch deployment worker -p " + spec: + template: + spec: + containers: + - name: worker + image: $REGISTRY/worker:v0.1 + strategy: + rollingUpdate: + maxUnavailable: 0 + maxSurge: 3 + minReadySeconds: 10 + " + kubectl rollout status deployment worker + ``` + ] + +] diff --git a/docs/sampleapp.md b/docs/sampleapp.md new file mode 100644 index 00000000..07df3049 --- /dev/null +++ b/docs/sampleapp.md @@ -0,0 +1,477 @@ +# Our sample application + +- Visit the GitHub repository with all the materials of this workshop: +
https://github.com/jpetazzo/orchestration-workshop + +- The application is in the [dockercoins]( + https://github.com/jpetazzo/orchestration-workshop/tree/master/dockercoins) + subdirectory + +- Let's look at the general layout of the source code: + + there is a Compose file [docker-compose.yml]( + https://github.com/jpetazzo/orchestration-workshop/blob/master/dockercoins/docker-compose.yml) ... + + ... and 4 other services, each in its own directory: + + - `rng` = web service generating random bytes + - `hasher` = web service computing hash of POSTed data + - `worker` = background process using `rng` and `hasher` + - `webui` = web interface to watch progress + +--- + +class: extra-details + +## Compose file format version + +*Particularly relevant if you have used Compose before...* + +- Compose 1.6 introduced support for a new Compose file format (aka "v2") + +- Services are no longer at the top level, but under a `services` section + +- There has to be a `version` key at the top level, with value `"2"` (as a string, not an integer) + +- Containers are placed on a dedicated network, making links unnecessary + +- There are other minor differences, but upgrade is easy and straightforward + +--- + +## Links, naming, and service discovery + +- Containers can have network aliases (resolvable through DNS) + +- Compose file version 2+ makes each container reachable through its service name + +- Compose file version 1 did require "links" sections + +- Our code can connect to services using their short name + + (instead of e.g. IP address or FQDN) + +- Network aliases are automatically namespaced + + (i.e. you can have multiple apps declaring and using a service named `database`) + +--- + +## Example in `worker/worker.py` + +![Service discovery](service-discovery.png) + +--- + +## What's this application? + +--- + +class: pic + +![DockerCoins logo](dockercoins.png) + +(DockerCoins 2016 logo courtesy of [@XtlCnslt](https://twitter.com/xtlcnslt) and [@ndeloof](https://twitter.com/ndeloof). Thanks!) + +--- + +## What's this application? + +- It is a DockerCoin miner! 💰🐳📦🚢 + +-- + +- No, you can't buy coffee with DockerCoins + +-- + +- How DockerCoins works: + + - `worker` asks to `rng` to generate a few random bytes + + - `worker` feeds these bytes into `hasher` + + - and repeat forever! + + - every second, `worker` updates `redis` to indicate how many loops were done + + - `webui` queries `redis`, and computes and exposes "hashing speed" in your browser + +--- + +## Getting the application source code + +- We will clone the GitHub repository + +- The repository also contains scripts and tools that we will use through the workshop + +.exercise[ + + + +- Clone the repository on `node1`: + ```bash + git clone git://github.com/jpetazzo/orchestration-workshop + ``` + +] + +(You can also fork the repository on GitHub and clone your fork if you prefer that.) + +--- + +# Running the application + +Without further ado, let's start our application. + +.exercise[ + +- Go to the `dockercoins` directory, in the cloned repo: + ```bash + cd ~/orchestration-workshop/dockercoins + ``` + +- Use Compose to build and run all containers: + ```bash + docker-compose up + ``` + + + +] + +Compose tells Docker to build all container images (pulling +the corresponding base images), then starts all containers, +and displays aggregated logs. 
+ +--- + +## Lots of logs + +- The application continuously generates logs + +- We can see the `worker` service making requests to `rng` and `hasher` + +- Let's put that in the background + +.exercise[ + +- Stop the application by hitting `^C` + +] + +- `^C` stops all containers by sending them the `TERM` signal + +- Some containers exit immediately, others take longer +
(because they don't handle `SIGTERM` and end up being killed after a 10s timeout) + +--- + +## Restarting in the background + +- Many flags and commands of Compose are modeled after those of `docker` + +.exercise[ + +- Start the app in the background with the `-d` option: + ```bash + docker-compose up -d + ``` + +- Check that our app is running with the `ps` command: + ```bash + docker-compose ps + ``` + +] + +`docker-compose ps` also shows the ports exposed by the application. + +--- + +class: extra-details + +## Viewing logs + +- The `docker-compose logs` command works like `docker logs` + +.exercise[ + +- View all logs since container creation and exit when done: + ```bash + docker-compose logs + ``` + +- Stream container logs, starting at the last 10 lines for each container: + ```bash + docker-compose logs --tail 10 --follow + ``` + + + +] + +Tip: use `^S` and `^Q` to pause/resume log output. + +--- + +class: extra-details + +## Upgrading from Compose 1.6 + +.warning[The `logs` command has changed between Compose 1.6 and 1.7!] + +- Up to 1.6 + + - `docker-compose logs` is the equivalent of `logs --follow` + + - `docker-compose logs` must be restarted if containers are added + +- Since 1.7 + + - `--follow` must be specified explicitly + + - new containers are automatically picked up by `docker-compose logs` + +--- + +## Connecting to the web UI + +- The `webui` container exposes a web dashboard; let's view it + +.exercise[ + +- With a web browser, connect to `node1` on port 8000 + +- Remember: the `nodeX` aliases are valid only on the nodes themselves + +- In your browser, you need to enter the IP address of your node + + + +] + +You should see a speed of approximately 4 hashes/second. + +More precisely: 4 hashes/second, with regular dips down to zero. +
This is because Jérôme is incapable of writing good frontend code. +
Don't ask. Seriously, don't ask. This is embarrassing. + +--- + +class: extra-details + +## Why does the speed seem irregular? + +- The app actually has a constant, steady speed: 3.33 hashes/second +
+ (which corresponds to 1 hash every 0.3 seconds, for *reasons*) + +- The worker doesn't update the counter after every loop, but up to once per second + +- The speed is computed by the browser, checking the counter about once per second + +- Between two consecutive updates, the counter will increase either by 4, or by 0 + +- The perceived speed will therefore be 4 - 4 - 4 - 0 - 4 - 4 - etc. + +*We told you to not ask!!!* + +--- + +## Scaling up the application + +- Our goal is to make that performance graph go up (without changing a line of code!) + +-- + +- Before trying to scale the application, we'll figure out if we need more resources + + (CPU, RAM...) + +- For that, we will use good old UNIX tools on our Docker node + +--- + +## Looking at resource usage + +- Let's look at CPU, memory, and I/O usage + +.exercise[ + +- run `top` to see CPU and memory usage (you should see idle cycles) + + + +- run `vmstat 1` to see I/O usage (si/so/bi/bo) +
(the 4 numbers should be almost zero, except `bo` for logging) + + + +] + +We have available resources. + +- Why? +- How can we use them? + +--- + +## Scaling workers on a single node + +- Docker Compose supports scaling +- Let's scale `worker` and see what happens! + +.exercise[ + +- Start one more `worker` container: + ```bash + docker-compose scale worker=2 + ``` + +- Look at the performance graph (it should show a x2 improvement) + +- Look at the aggregated logs of our containers (`worker_2` should show up) + +- Look at the impact on CPU load with e.g. top (it should be negligible) + +] + +--- + +## Adding more workers + +- Great, let's add more workers and call it a day, then! + +.exercise[ + +- Start eight more `worker` containers: + ```bash + docker-compose scale worker=10 + ``` + +- Look at the performance graph: does it show a x10 improvement? + +- Look at the aggregated logs of our containers + +- Look at the impact on CPU load and memory usage + +] + +--- + +# Identifying bottlenecks + +- You should have seen a 3x speed bump (not 10x) + +- Adding workers didn't result in linear improvement + +- *Something else* is slowing us down + +-- + +- ... But what? + +-- + +- The code doesn't have instrumentation + +- Let's use state-of-the-art HTTP performance analysis! +
(i.e. good old tools like `ab`, `httping`...) + +--- + +## Accessing internal services + +- `rng` and `hasher` are exposed on ports 8001 and 8002 + +- This is declared in the Compose file: + + ```yaml + ... + rng: + build: rng + ports: + - "8001:80" + + hasher: + build: hasher + ports: + - "8002:80" + ... + ``` + +--- + +## Measuring latency under load + +We will use `httping`. + +.exercise[ + +- Check the latency of `rng`: + ```bash + httping -c 10 localhost:8001 + ``` + +- Check the latency of `hasher`: + ```bash + httping -c 10 localhost:8002 + ``` + +] + +`rng` has a much higher latency than `hasher`. + +--- + +## Let's draw hasty conclusions + +- The bottleneck seems to be `rng` + +- *What if* we don't have enough entropy and can't generate enough random numbers? + +- We need to scale out the `rng` service on multiple machines! + +Note: this is a fiction! We have enough entropy. But we need a pretext to scale out. + +(In fact, the code of `rng` uses `/dev/urandom`, which never runs out of entropy... +
+...and is [just as good as `/dev/random`](http://www.slideshare.net/PacSecJP/filippo-plain-simple-reality-of-entropy).) + +--- + +## Clean up + +- Before moving on, let's remove those containers + +.exercise[ + +- Tell Compose to remove everything: + ```bash + docker-compose down + ``` + +] diff --git a/docs/secrets.md b/docs/secrets.md new file mode 100644 index 00000000..9e558770 --- /dev/null +++ b/docs/secrets.md @@ -0,0 +1,194 @@ +class: secrets + +## Secret management + +- Docker has a "secret safe" (secure key→value store) + +- You can create as many secrets as you like + +- You can associate secrets to services + +- Secrets are exposed as plain text files, but kept in memory only (using `tmpfs`) + +- Secrets are immutable (at least in Engine 1.13) + +- Secrets have a max size of 500 KB + +--- + +class: secrets + +## Creating secrets + +- Must specify a name for the secret; and the secret itself + +.exercise[ + +- Assign [one of the four most commonly used passwords](https://www.youtube.com/watch?v=0Jx8Eay5fWQ) to a secret called `hackme`: + ```bash + echo love | docker secret create hackme - + ``` + +] + +If the secret is in a file, you can simply pass the path to the file. + +(The special path `-` indicates to read from the standard input.) + +--- + +class: secrets + +## Creating better secrets + +- Picking lousy passwords always leads to security breaches + +.exercise[ + +- Let's craft a better password, and assign it to another secret: + ```bash + base64 /dev/urandom | head -c16 | docker secret create arewesecureyet - + ``` + +] + +Note: in the latter case, we don't even know the secret at this point. But Swarm does. + +--- + +class: secrets + +## Using secrets + +- Secrets must be handed explicitly to services + +.exercise[ + +- Create a dummy service with both secrets: + ```bash + docker service create \ + --secret hackme --secret arewesecureyet \ + --name dummyservice --mode global \ + alpine sleep 1000000000 + ``` + +] + +We use a global service to make sure that there will be an instance on the local node. + +--- + +class: secrets + +## Accessing secrets + +- Secrets are materialized on `/run/secrets` (which is an in-memory filesystem) + +.exercise[ + +- Find the ID of the container for the dummy service: + ```bash + CID=$(docker ps -q --filter label=com.docker.swarm.service.name=dummyservice) + ``` + +- Enter the container: + ```bash + docker exec -ti $CID sh + ``` + +- Check the files in `/run/secrets` + +] + +--- + +class: secrets + +## Rotating secrets + +- You can't change a secret + + (Sounds annoying at first; but allows clean rollbacks if a secret update goes wrong) + +- You can add a secret to a service with `docker service update --secret-add` + + (This will redeploy the service; it won't add the secret on the fly) + +- You can remove a secret with `docker service update --secret-rm` + +- Secrets can be mapped to different names by expressing them with a micro-format: + ```bash + docker service create --secret source=secretname,target=filename + ``` + +--- + +class: secrets + +## Changing our insecure password + +- We want to replace our `hackme` secret with a better one + +.exercise[ + +- Remove the insecure `hackme` secret: + ```bash + docker service update dummyservice --secret-rm hackme + ``` + +- Add our better secret instead: + ```bash + docker service update dummyservice \ + --secret-add source=arewesecureyet,target=hackme + ``` + +] + +Wait for the service to be fully updated with e.g. `watch docker service ps dummyservice`. +
(With Docker Engine 17.10 and later, the CLI will wait for you!) + +--- + +class: secrets + +## Checking that our password is now stronger + +- We will use the power of `docker exec`! + +.exercise[ + +- Get the ID of the new container: + ```bash + CID=$(docker ps -q --filter label=com.docker.swarm.service.name=dummyservice) + ``` + +- Check the contents of the secret files: + ```bash + docker exec $CID grep -r . /run/secrets + ``` + +] + +--- + +class: secrets + +## Secrets in practice + +- Can be (ab)used to hold whole configuration files if needed + +- If you intend to rotate secret `foo`, call it `foo.N` instead, and map it to `foo` + + (N can be a serial, a timestamp...) + + ```bash + docker service create --secret source=foo.N,target=foo ... + ``` + +- You can update (remove+add) a secret in a single command: + + ```bash + docker service update ... --secret-rm foo.M --secret-add source=foo.N,target=foo + ``` + +- For more details and examples, [check the documentation](https://docs.docker.com/engine/swarm/secrets/) diff --git a/docs/security.md b/docs/security.md new file mode 100644 index 00000000..49833a11 --- /dev/null +++ b/docs/security.md @@ -0,0 +1,16 @@ +# Secrets management and encryption at rest + +(New in Docker Engine 1.13) + +- Secrets management = selectively and securely bring secrets to services + +- Encryption at rest = protect against storage theft or prying + +- Remember: + + - control plane is authenticated through mutual TLS, certs rotated every 90 days + + - control plane is encrypted with AES-GCM, keys rotated every 12 hours + + - data plane is not encrypted by default (for performance reasons), +
but we saw earlier how to enable that with a single flag diff --git a/docs/selfpaced.yml b/docs/selfpaced.yml new file mode 100644 index 00000000..7b34d3b4 --- /dev/null +++ b/docs/selfpaced.yml @@ -0,0 +1,65 @@ +exclude: +- in-person + +chat: FIXME + +title: Docker Orchestration Workshop + +chapters: +- | + class: title + Docker
Orchestration
Workshop +- intro.md +- | + @@TOC@@ +- - prereqs.md + - versions.md + - | + class: title + + All right! +
+ We're all set. +
+ Let's do this. + - | + name: part-1 + + class: title, self-paced + + Part 1 + - sampleapp.md + - | + class: title + + Scaling out + - swarmkit.md + - creatingswarm.md + - machine.md + - morenodes.md +- - firstservice.md + - ourapponswarm.md +- - operatingswarm.md + - netshoot.md + - swarmnbt.md + - ipsec.md + - updatingservices.md + - rollingupdates.md + - healthchecks.md + - nodeinfo.md + - swarmtools.md +- - security.md + - secrets.md + - leastprivilege.md + - namespaces.md + - apiscope.md + - encryptionatrest.md + - logging.md + - metrics.md + - stateful.md + - extratips.md + - end.md +- | + class: title + + Thank you! diff --git a/docs/setup-k8s.md b/docs/setup-k8s.md new file mode 100644 index 00000000..1bccce18 --- /dev/null +++ b/docs/setup-k8s.md @@ -0,0 +1,65 @@ +# Setting up Kubernetes + +- How did we set up these Kubernetes clusters that we're using? + +-- + +- We used `kubeadm` on "fresh" EC2 instances with Ubuntu 16.04 LTS + + 1. Install Docker + + 2. Install Kubernetes packages + + 3. Run `kubeadm init` on the master node + + 4. Set up Weave (the overlay network) +
+ (that step is just one `kubectl apply` command; discussed later) + + 5. Run `kubeadm join` on the other nodes (with the token produced by `kubeadm init`) + + 6. Copy the configuration file generated by `kubeadm init` + +--- + +## `kubeadm` drawbacks + +- Doesn't set up Docker or any other container engine + +- Doesn't set up the overlay network + +- Scripting is complex +
+ (because extracting the token requires advanced `kubectl` commands) + +- Doesn't set up multi-master (no high availability) + +-- + +- It's still twice as many steps as setting up a Swarm cluster 😕 + +--- + +## Other deployment options + +- If you are on Google Cloud: + [GKE](https://cloud.google.com/container-engine/) + + Empirically the best Kubernetes deployment out there + +- If you are on AWS: + [kops](https://github.com/kubernetes/kops) + + ... But with AWS re:invent just around the corner, expect some changes + +- On a local machine: + [minikube](https://kubernetes.io/docs/getting-started-guides/minikube/), + [kubespawn](https://github.com/kinvolk/kube-spawn), + [Docker4Mac (coming soon)](https://beta.docker.com/) + +- If you want something customizable: + [kubicorn](https://github.com/kris-nova/kubicorn) + + Probably the closest to a multi-cloud/hybrid solution so far, but in development + +- Also, many commercial options! diff --git a/docs/startrek-federation.jpg b/docs/startrek-federation.jpg new file mode 100644 index 00000000..88befb11 Binary files /dev/null and b/docs/startrek-federation.jpg differ diff --git a/docs/stateful.md b/docs/stateful.md new file mode 100644 index 00000000..f8de8570 --- /dev/null +++ b/docs/stateful.md @@ -0,0 +1,344 @@ +# Dealing with stateful services + +- First of all, you need to make sure that the data files are on a *volume* + +- Volumes are host directories that are mounted to the container's filesystem + +- These host directories can be backed by the ordinary, plain host filesystem ... + +- ... Or by distributed/networked filesystems + +- In the latter scenario, in case of node failure, the data is safe elsewhere ... + +- ... And the container can be restarted on another node without data loss + +--- + +## Building a stateful service experiment + +- We will use Redis for this example + +- We will expose it on port 10000 to access it easily + +.exercise[ + +- Start the Redis service: + ```bash + docker service create --name stateful -p 10000:6379 redis + ``` + +- Check that we can connect to it: + ```bash + docker run --net host --rm redis redis-cli -p 10000 info server + ``` + +] + +--- + +## Accessing our Redis service easily + +- Typing that whole command is going to be tedious + +.exercise[ + +- Define a shell alias to make our lives easier: + ```bash + alias redis='docker run --net host --rm redis redis-cli -p 10000' + ``` + +- Try it: + ```bash + redis info server + ``` + +] + +--- + +## Basic Redis commands + +.exercise[ + +- Check that the `foo` key doesn't exist: + ```bash + redis get foo + ``` + +- Set it to `bar`: + ```bash + redis set foo bar + ``` + +- Check that it exists now: + ```bash + redis get foo + ``` + +] + +--- + +## Local volumes vs. global volumes + +- Global volumes exist in a single namespace + +- A global volume can be mounted on any node +
.small[(bar some restrictions specific to the volume driver in use; e.g. using an EBS-backed volume on a GCE/EC2 mixed cluster)] + +- Attaching a global volume to a container allows us to start the container anywhere +
(and retain its data wherever you start it!) + +- Global volumes require extra *plugins* (Flocker, Portworx...) + +- Docker doesn't come with a default global volume driver at this point + +- Therefore, we will fall back on *local volumes* + +--- + +## Local volumes + +- We will use the default volume driver, `local` + +- As the name implies, the `local` volume driver manages *local* volumes + +- Since local volumes are (duh!) *local*, we need to pin our container to a specific host + +- We will do that with a *constraint* + +.exercise[ + +- Add a placement constraint to our service: + ```bash + docker service update stateful --constraint-add node.hostname==$HOSTNAME + ``` + +] + +--- + +## Where is our data? + +- If we look for our `foo` key, it's gone! + +.exercise[ + +- Check the `foo` key: + ```bash + redis get foo + ``` + +- Adding a constraint caused the service to be redeployed: + ```bash + docker service ps stateful + ``` + +] + +Note: even if the constraint ends up being a no-op (i.e. not +moving the service), the service gets redeployed. +This ensures consistent behavior. + +--- + +## Setting the key again + +- Since our database was wiped out, let's populate it again + +.exercise[ + +- Set `foo` again: + ```bash + redis set foo bar + ``` + +- Check that it's there: + ```bash + redis get foo + ``` + +] + +--- + +## Service updates cause containers to be replaced + +- Let's try to make a trivial update to the service and see what happens + +.exercise[ + +- Set a memory limit to our Redis service: + ```bash + docker service update stateful --limit-memory 100M + ``` + +- Try to get the `foo` key one more time: + ```bash + redis get foo + ``` + +] + +The key is blank again! + +--- + +## Service volumes are ephemeral by default + +- Let's highlight what's going on with volumes! + +.exercise[ + +- Check the current list of volumes: + ```bash + docker volume ls + ``` + +- Carry a minor update to our Redis service: + ```bash + docker service update stateful --limit-memory 200M + ``` + +] + +Again: all changes trigger the creation of a new task, and therefore a replacement of the existing container; +even when it is not strictly technically necessary. + +--- + +## The data is gone again + +- What happened to our data? + +.exercise[ + +- The list of volumes is slightly different: + ```bash + docker volume ls + ``` + +] + +(You should see one extra volume.) + +--- + +## Assigning a persistent volume to the container + +- Let's add an explicit volume mount to our service, referencing a named volume + +.exercise[ + +- Update the service with a volume mount: + ```bash + docker service update stateful \ + --mount-add type=volume,source=foobarstore,target=/data + ``` + +- Check the new volume list: + ```bash + docker volume ls + ``` + +] + +Note: the `local` volume driver automatically creates volumes. 
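If you are curious where that data actually lives on the host, you can ask the volume driver (a quick sketch; the exact path depends on your Docker data directory):

```bash
docker volume inspect foobarstore --format '{{ .Mountpoint }}'
```

On a default installation, this should point somewhere under `/var/lib/docker/volumes/`.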
+ +--- + +## Checking that persistence actually works across service updates + +.exercise[ + +- Store something in the `foo` key: + ```bash + redis set foo barbar + ``` + +- Update the service with yet another trivial change: + ```bash + docker service update stateful --limit-memory 300M + ``` + +- Check that `foo` is still set: + ```bash + redis get foo + ``` + +] + +--- + +## Recap + +- The service must commit its state to disk when being shutdown.red[*] + + (Shutdown = being sent a `TERM` signal) + +- The state must be written on files located on a volume + +- That volume must be specified to be persistent + +- If using a local volume, the service must also be pinned to a specific node + + (And losing that node means losing the data, unless there are other backups) + +.footnote[
+.red[*]If you customize Redis configuration, make sure you +persist data correctly! +
It's easy to make that mistake — __Trust me!__] + +--- + +## Cleaning up + +.exercise[ + +- Remove the stateful service: + ```bash + docker service rm stateful + ``` + +- Remove the associated volume: + ```bash + docker volume rm foobarstore + ``` + +] + +Note: we could keep the volume around if we wanted. + +--- + +## Should I run stateful services in containers? + +-- + +Depending on whom you ask, they'll tell you: + +-- + +- certainly not, heathen! + +-- + +- we've been running a few thousand PostgreSQL instances in containers ... +
for a few years now ... in production ... is that bad? + +-- + +- what's a container? + +-- + +Perhaps a better question would be: + +*"Should I run stateful services?"* + +-- + +- is it critical for my business? +- is it my value-add? +- or should I find somebody else to run them for me? diff --git a/docs/swarmkit.md b/docs/swarmkit.md new file mode 100644 index 00000000..207aa69e --- /dev/null +++ b/docs/swarmkit.md @@ -0,0 +1,153 @@ +# SwarmKit + +- [SwarmKit](https://github.com/docker/swarmkit) is an open source + toolkit to build multi-node systems + +- It is a reusable library, like libcontainer, libnetwork, vpnkit ... + +- It is a plumbing part of the Docker ecosystem + +-- + +.footnote[🐳 Did you know that кит means "whale" in Russian?] + +--- + +## SwarmKit features + +- Highly-available, distributed store based on [Raft]( + https://en.wikipedia.org/wiki/Raft_%28computer_science%29) +
(avoids depending on an external store: easier to deploy; higher performance) + +- Dynamic reconfiguration of Raft without interrupting cluster operations + +- *Services* managed with a *declarative API* +
(implementing *desired state* and *reconciliation loop*) + +- Integration with overlay networks and load balancing + +- Strong emphasis on security: + + - automatic TLS keying and signing; automatic cert rotation + - full encryption of the data plane; automatic key rotation + - least privilege architecture (single-node compromise ≠ cluster compromise) + - on-disk encryption with optional passphrase + +--- + +class: extra-details + +## Where is the key/value store? + +- Many orchestration systems use a key/value store backed by a consensus algorithm +
+ (k8s→etcd→Raft, mesos→zookeeper→ZAB, etc.) + +- SwarmKit implements the Raft algorithm directly +
+ (Nomad is similar; thanks [@cbednarski](https://twitter.com/@cbednarski), + [@diptanu](https://twitter.com/diptanu) and others for pointing it out!) + +- Analogy courtesy of [@aluzzardi](https://twitter.com/aluzzardi): + + *It's like B-Trees and RDBMS. They are different layers, often + associated. But you don't need to bring up a full SQL server when + all you need is to index some data.* + +- As a result, the orchestrator has direct access to the data +
+ (the main copy of the data is stored in the orchestrator's memory) + +- Simpler, easier to deploy and operate; also faster + +--- + +## SwarmKit concepts (1/2) + +- A *cluster* will be at least one *node* (preferably more) + +- A *node* can be a *manager* or a *worker* + +- A *manager* actively takes part in the Raft consensus, and keeps the Raft log + +- You can talk to a *manager* using the SwarmKit API + +- One *manager* is elected as the *leader*; other managers merely forward requests to it + +- The *workers* get their instructions from the *managers* + +- Both *workers* and *managers* can run containers + +--- + +## Illustration + +![Illustration](swarm-mode.svg) + +--- + +## SwarmKit concepts (2/2) + +- The *managers* expose the SwarmKit API + +- Using the API, you can indicate that you want to run a *service* + +- A *service* is specified by its *desired state*: which image, how many instances... + +- The *leader* uses different subsystems to break down services into *tasks*: +
orchestrator, scheduler, allocator, dispatcher + +- A *task* corresponds to a specific container, assigned to a specific *node* + +- *Nodes* know which *tasks* should be running, and will start or stop containers accordingly (through the Docker Engine API) + +You can refer to the [NOMENCLATURE](https://github.com/docker/swarmkit/blob/master/design/nomenclature.md) in the SwarmKit repo for more details. + +--- + +## Swarm Mode + +- Since version 1.12, Docker Engine embeds SwarmKit + +- All the SwarmKit features are "asleep" until you enable "Swarm Mode" + +- Examples of Swarm Mode commands: + + - `docker swarm` (enable Swarm mode; join a Swarm; adjust cluster parameters) + + - `docker node` (view nodes; promote/demote managers; manage nodes) + + - `docker service` (create and manage services) + +??? + +- The Docker API exposes the same concepts + +- The SwarmKit API is also exposed (on a separate socket) + +--- + +## You need to enable Swarm mode to use the new stuff + +- By default, all this new code is inactive + +- Swarm Mode can be enabled, "unlocking" SwarmKit functions +
(services, out-of-the-box overlay networks, etc.) + +.exercise[ + +- Try a Swarm-specific command: + ```bash + docker node ls + ``` + + + +] + +-- + +You will get an error message: +``` +Error response from daemon: This node is not a swarm manager. [...] +``` diff --git a/docs/swarmnbt.md b/docs/swarmnbt.md new file mode 100644 index 00000000..9440a2c1 --- /dev/null +++ b/docs/swarmnbt.md @@ -0,0 +1,72 @@ +class: nbt, extra-details + +## Measuring network conditions on the whole cluster + +- Since we have built-in, cluster-wide discovery, it's relatively straightforward + to monitor the whole cluster automatically + +- [Alexandros Mavrogiannis](https://github.com/alexmavr) wrote + [Swarm NBT](https://github.com/alexmavr/swarm-nbt), a tool doing exactly that! + +.exercise[ + +- Start Swarm NBT: + ```bash + docker run --rm -v inventory:/inventory \ + -v /var/run/docker.sock:/var/run/docker.sock \ + alexmavr/swarm-nbt start + ``` + +] + +Note: in this mode, Swarm NBT connects to the Docker API socket, +and issues additional API requests to start all the components it needs. + +--- + +class: nbt, extra-details + +## Viewing network conditions with Prometheus + +- Swarm NBT relies on Prometheus to scrape and store data + +- We can directly consume the Prometheus endpoint to view telemetry data + +.exercise[ + +- Point your browser to any Swarm node, on port 9090 + + (If you're using Play-With-Docker, click on the (9090) badge) + +- In the drop-down, select `icmp_rtt_gauge_seconds` + +- Click on "Graph" + +] + +You are now seeing ICMP latency across your cluster. + +--- + +class: nbt, in-person, extra-details + +## Viewing network conditions with Grafana + +- If you are using a "real" cluster (not Play-With-Docker) you can use Grafana + +.exercise[ + +- Start Grafana with `docker service create -p 3000:3000 grafana` +- Point your browser to Grafana, on port 3000 on any Swarm node +- Login with username `admin` and password `admin` +- Click on the top-left menu and browse to Data Sources +- Create a prometheus datasource with any name +- Point it to http://any-node-IP:9090 +- Set access to "direct" and leave credentials blank +- Click on the top-left menu, highlight "Dashboards" and select the "Import" option +- Copy-paste [this JSON payload]( + https://raw.githubusercontent.com/alexmavr/swarm-nbt/master/grafana.json), + then use the Prometheus Data Source defined before +- Poke around the dashboard that magically appeared! + +] diff --git a/docs/swarmtools.md b/docs/swarmtools.md new file mode 100644 index 00000000..32c63773 --- /dev/null +++ b/docs/swarmtools.md @@ -0,0 +1,184 @@ +# SwarmKit debugging tools + +- The SwarmKit repository comes with debugging tools + +- They are *low level* tools; not for general use + +- We are going to see two of these tools: + + - `swarmctl`, to communicate directly with the SwarmKit API + + - `swarm-rafttool`, to inspect the content of the Raft log + +--- + +## Building the SwarmKit tools + +- We are going to install a Go compiler, then download SwarmKit source and build it + +.exercise[ +- Download, compile, and install SwarmKit with this one-liner: + ```bash + docker run -v /usr/local/bin:/go/bin golang \ + go get `-v` github.com/docker/swarmkit/... + ``` + +] + +Remove `-v` if you don't like verbose things. + +Shameless promo: for more Go and Docker love, check +[this blog post](http://jpetazzo.github.io/2016/09/09/go-docker/)! + +Note: in the unfortunate event of SwarmKit *master* branch being broken, +the build might fail. 
In that case, just skip the Swarm tools section. + +--- + +## Getting cluster-wide task information + +- The Docker API doesn't expose this directly (yet) + +- But the SwarmKit API does + +- We are going to query it with `swarmctl` + +- `swarmctl` is an example program showing how to + interact with the SwarmKit API + +--- + +## Using `swarmctl` + +- The Docker Engine places the SwarmKit control socket in a special path + +- You need root privileges to access it + +.exercise[ + +- If you are using Play-With-Docker, set the following alias: + ```bash + alias swarmctl='/lib/ld-musl-x86_64.so.1 /usr/local/bin/swarmctl \ + --socket /var/run/docker/swarm/control.sock' + ``` + +- Otherwise, set the following alias: + ```bash + alias swarmctl='sudo swarmctl \ + --socket /var/run/docker/swarm/control.sock' + ``` + +] + +--- + +## `swarmctl` in action + +- Let's review a few useful `swarmctl` commands + +.exercise[ + +- List cluster nodes (that's equivalent to `docker node ls`): + ```bash + swarmctl node ls + ``` + +- View all tasks across all services: + ```bash + swarmctl task ls + ``` + +] + +--- + +## `swarmctl` notes + +- SwarmKit is vendored into the Docker Engine + +- If you want to use `swarmctl`, you need the exact version of + SwarmKit that was used in your Docker Engine + +- Otherwise, you might get some errors like: + + ``` + Error: grpc: failed to unmarshal the received message proto: wrong wireType = 0 + ``` + +- With Docker 1.12, the control socket was in `/var/lib/docker/swarm/control.sock` + +--- + +## `swarm-rafttool` + +- SwarmKit stores all its important data in a distributed log using the Raft protocol + + (This log is also simply called the "Raft log") + +- You can decode that log with `swarm-rafttool` + +- This is a great tool to understand how SwarmKit works + +- It can also be used in forensics or troubleshooting + + (But consider it as a *very low level* tool!) + +--- + +## The powers of `swarm-rafttool` + +With `swarm-rafttool`, you can: + +- view the latest snapshot of the cluster state; + +- view the Raft log (i.e. changes to the cluster state); + +- view specific objects from the log or snapshot; + +- decrypt the Raft data (to analyze it with other tools). + +It *cannot* work on live files, so you must stop Docker or make a copy first. + +--- + +## Using `swarm-rafttool` + +- First, let's make a copy of the current Swarm data + +.exercise[ + +- If you are using Play-With-Docker, the Docker data directory is `/graph`: + ```bash + cp -r /graph/swarm /swarmdata + ``` + +- Otherwise, it is in the default `/var/lib/docker`: + ```bash + sudo cp -r /var/lib/docker/swarm /swarmdata + ``` + +] + +--- + +## Dumping the Raft log + +- We have to indicate the path holding the Swarm data + + (Otherwise `swarm-rafttool` will try to use the live data, and complain that it's locked!) + +.exercise[ + +- If you are using Play-With-Docker, you must use the musl linker: + ```bash + /lib/ld-musl-x86_64.so.1 /usr/local/bin/swarm-rafttool -d /swarmdata/ dump-wal + ``` + +- Otherwise, you don't need the musl linker but you need to get root: + ```bash + sudo swarm-rafttool -d /swarmdata/ dump-wal + ``` + +] + +Reminder: this is a very low-level tool, requiring a knowledge of SwarmKit's internals! 
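The dump can be quite verbose. Here is a minimal sketch of how you might narrow it down with standard tools (assuming the non-Play-With-Docker variant above; the service name is just an example):

```bash
# Page through the whole Raft log
sudo swarm-rafttool -d /swarmdata/ dump-wal | less

# Or look for entries mentioning a given service, e.g. "redis"
sudo swarm-rafttool -d /swarmdata/ dump-wal | grep -i redis
```

On Play-With-Docker, use the musl linker invocation shown earlier instead of `sudo`.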
diff --git a/docs/thanks-weave.png b/docs/thanks-weave.png new file mode 100644 index 00000000..1487e60e Binary files /dev/null and b/docs/thanks-weave.png differ diff --git a/docs/updatingservices.md b/docs/updatingservices.md new file mode 100644 index 00000000..98e9e987 --- /dev/null +++ b/docs/updatingservices.md @@ -0,0 +1,98 @@ +# Updating services + +- We want to make changes to the web UI + +- The process is as follows: + + - edit code + + - build new image + + - ship new image + + - run new image + +--- + +## Updating a single service the hard way + +- To update a single service, we could do the following: + ```bash + REGISTRY=localhost:5000 TAG=v0.3 + IMAGE=$REGISTRY/dockercoins_webui:$TAG + docker build -t $IMAGE webui/ + docker push $IMAGE + docker service update dockercoins_webui --image $IMAGE + ``` + +- Make sure to tag properly your images: update the `TAG` at each iteration + + (When you check which images are running, you want these tags to be uniquely identifiable) + +--- + +## Updating services the easy way + +- With the Compose integration, all we have to do is: + ```bash + export TAG=v0.3 + docker-compose -f composefile.yml build + docker-compose -f composefile.yml push + docker stack deploy -c composefile.yml nameofstack + ``` + +-- + +- That's exactly what we used earlier to deploy the app + +- We don't need to learn new commands! + +--- + +## Updating the web UI + +- Let's make the numbers on the Y axis bigger! + +.exercise[ + +- Edit the file `webui/files/index.html`: + ```bash + vi dockercoins/webui/files/index.html + ``` + + + +- Locate the `font-size` CSS attribute and increase it (at least double it) + + + +- Save and exit + +- Build, ship, and run: + ```bash + export TAG=v0.3 + docker-compose -f dockercoins.yml build + docker-compose -f dockercoins.yml push + docker stack deploy -c dockercoins.yml dockercoins + ``` + +] + +--- + +## Viewing our changes + +- Wait at least 10 seconds (for the new version to be deployed) + +- Then reload the web UI + +- Or just mash "reload" frantically + +- ... Eventually the legend on the left will be bigger! diff --git a/docs/versions-k8s.md b/docs/versions-k8s.md new file mode 100644 index 00000000..795248b6 --- /dev/null +++ b/docs/versions-k8s.md @@ -0,0 +1,41 @@ +## Brand new versions! + +- Kubernetes 1.8 +- Docker Engine 17.10 +- Docker Compose 1.16 + + +.exercise[ + +- Check all installed versions: + ```bash + kubectl version + docker version + docker-compose -v + ``` + +] + +--- + +class: extra-details + +## Kubernetes and Docker compatibility + +- Kubernetes only validates Docker Engine versions 1.11.2, 1.12.6, 1.13.1, and 17.03.2 + +-- + +class: extra-details + +- Are we living dangerously? + +-- + +class: extra-details + +- "Validates" = continuous integration builds + +- The Docker API is versioned, and offers strong backward-compatibility + + (If a client uses e.g. API v1.25, the Docker Engine will keep behaving the same way) diff --git a/docs/versions.md b/docs/versions.md new file mode 100644 index 00000000..03e29f53 --- /dev/null +++ b/docs/versions.md @@ -0,0 +1,96 @@ +## Brand new versions! + +- Engine 17.10 +- Compose 1.16 +- Machine 0.12 + +.exercise[ + +- Check all installed versions: + ```bash + docker version + docker-compose -v + docker-machine -v + ``` + +] + +--- + +## Wait, what, 17.10 ?!? 
+ +-- + +- Docker 1.13 = Docker 17.03 (year.month, like Ubuntu) + +- Every month, there is a new "edge" release (with new features) + +- Every quarter, there is a new "stable" release + +- Docker CE releases are maintained 4+ months + +- Docker EE releases are maintained 12+ months + +- For more details, check the [Docker EE announcement blog post](https://blog.docker.com/2017/03/docker-enterprise-edition/) + +--- + +class: extra-details + +## Docker CE vs Docker EE + +- Docker EE: + + - $$$ + - certification for select distros, clouds, and plugins + - advanced management features (fine-grained access control, security scanning...) + +- Docker CE: + + - free + - available through Docker Mac, Docker Windows, and major Linux distros + - perfect for individuals and small organizations + +--- + +class: extra-details + +## Why? + +- More readable for enterprise users + + (i.e. the very nice folks who are kind enough to pay us big $$$ for our stuff) + +- No impact for the community + + (beyond CE/EE suffix and version numbering change) + +- Both trains leverage the same open source components + + (containerd, libcontainer, SwarmKit...) + +- More predictible release schedule (see next slide) + +--- + +class: pic + +![Docker CE/EE release cycle](lifecycle.png) + +--- + +## What was added when? + +|||| +| ---- | ----- | --- | +| 2015 | 1.9 | Overlay (multi-host) networking, network/IPAM plugins +| 2016 | 1.10 | Embedded dynamic DNS +| 2016 | 1.11 | DNS round robin load balancing +| 2016 | 1.12 | Swarm mode, routing mesh, encrypted networking, healthchecks +| 2017 | 1.13 | Stacks, attachable overlays, image squash and compress +| 2017 | 1.13 | Windows Server 2016 Swarm mode +| 2017 | 17.03 | Secrets +| 2017 | 17.04 | Update rollback, placement preferences (soft constraints) +| 2017 | 17.05 | Multi-stage image builds, service logs +| 2017 | 17.06 | Swarm configs, node/service events +| 2017 | 17.06 | Windows Server 2016 Swarm overlay networks, secrets diff --git a/docs/whatsnext.md b/docs/whatsnext.md new file mode 100644 index 00000000..67734b02 --- /dev/null +++ b/docs/whatsnext.md @@ -0,0 +1,187 @@ +# Next steps + +*Alright, how do I get started and containerize my apps?* + +-- + +Suggested containerization checklist: + +.checklist[ +- write a Dockerfile for one service in one app +- write Dockerfiles for the other (buildable) services +- write a Compose file for that whole app +- make sure that devs are empowered to run the app in containers +- set up automated builds of container images from the code repo +- set up a CI pipeline using these container images +- set up a CD pipeline (for staging/QA) using these images +] + +And *then* it is time to look at orchestration! + +--- + +## Namespaces + +- Namespaces let you run multiple identical stacks side by side + +- Two namespaces (e.g. `blue` and `green`) can each have their own `redis` service + +- Each of the two `redis` services has its own `ClusterIP` + +- `kube-dns` creates two entries, mapping to these two `ClusterIP` addresses: + + `redis.blue.svc.cluster.local` and `redis.green.svc.cluster.local` + +- Pods in the `blue` namespace get a *search suffix* of `blue.svc.cluster.local` + +- As a result, resolving `redis` from a pod in the `blue` namespace yields the "local" `redis` + +.warning[This does not provide *isolation*! That would be the job of network policies.] + +--- + +## Stateful services (databases etc.) 
+ +- As a first step, it is wiser to keep stateful services *outside* of the cluster + +- Exposing them to pods can be done with multiple solutions: + + - `ExternalName` services +
+ (`redis.blue.svc.cluster.local` will be a `CNAME` record) + + - `ClusterIP` services with explicit `Endpoints` +
+ (instead of letting Kubernetes generate the endpoints from a selector) + + - Ambassador services +
+ (application-level proxies that can provide credentials injection and more) + +--- + +## Stateful services (second take) + +- If you really want to host stateful services on Kubernetes, you can look into: + + - volumes (to carry persistent data) + + - storage plugins + + - persistent volume claims (to ask for specific volume characteristics) + + - stateful sets (pods that are *not* ephemeral) + +--- + +## HTTP traffic handling + +- *Services* are layer 4 constructs + +- HTTP is a layer 7 protocol + +- It is handled by *ingresses* (a different resource kind) + +- *Ingresses* allow: + + - virtual host routing + - session stickiness + - URI mapping + - and much more! + +- Check out e.g. [Træfik](https://docs.traefik.io/user-guide/kubernetes/) + +--- + +## Logging and metrics + +- Logging is delegated to the container engine + +- Metrics are typically handled with Prometheus + + (Heapster is a popular add-on) + +--- + +## Managing the configuration of our applications + +- Two constructs are particularly useful: secrets and config maps + +- They allow to expose arbitrary information to our containers + +- **Avoid** storing configuration in container images + + (There are some exceptions to that rule, but it's generally a Bad Idea) + +- **Never** store sensitive information in container images + + (It's the container equivalent of the password on a post-it note on your screen) + +--- + +## Managing stack deployments + +- The best deployment tool will vary, depending on: + + - the size and complexity of your stack(s) + - how often you change it (i.e. add/remove components) + - the size and skills of your team + +- A few examples: + + - shell scripts invoking `kubectl` + - YAML resources descriptions committed to a repo + - [Helm](https://github.com/kubernetes/helm) (~package manager) + - [Spinnaker](https://www.spinnaker.io/) (Netflix' CD platform) + +--- + +## Cluster federation + +-- + +![Star Trek Federation](startrek-federation.jpg) + +-- + +Sorry Star Trek fans, this is not the federation you're looking for! + +-- + +(If I add "Your cluster is in another federation" I might get a 3rd fandom wincing!) + +--- + +## Cluster federation + +- Kubernetes master operation relies on etcd + +- etcd uses the Raft protocol + +- Raft recommends low latency between nodes + +- What if our cluster spreads multiple regions? + +-- + +- Break it down in local clusters + +- Regroup them in a *cluster federation* + +- Synchronize resources across clusters + +- Discover resources across clusters + +--- + +## Developer experience + +*I've put this last, but it's pretty important!* + +- How do you on-board a new developer? + +- What do they need to install to get a dev stack? + +- How does a code change make it from dev to prod? + +- How does someone add a component to a stack? diff --git a/docs/workshop.css b/docs/workshop.css new file mode 100644 index 00000000..a2fad6e0 --- /dev/null +++ b/docs/workshop.css @@ -0,0 +1,168 @@ +@import url(https://fonts.googleapis.com/css?family=Yanone+Kaffeesatz); +@import url(https://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic); +@import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic); + +/* For print! 
Borrowed from https://github.com/gnab/remark/issues/50 */ +@page { + size: 1210px 681px; + margin: 0; + } + +@media print { + .remark-slide-scaler { + width: 100% !important; + height: 100% !important; + transform: scale(1) !important; + top: 0 !important; + left: 0 !important; + } +} + +/* put slide numbers in top-right corner instead of bottom-right */ +div.remark-slide-number { + top: 6px; + left: unset; + bottom: unset; + right: 6px; +} + +.debug { + font-size: 25px; + position: absolute; + left: 0px; + right: 0px; + bottom: 0px; + font-family: monospace; + color: white; +} +.debug a { + color: white; +} +.debug:hover { + background-color: black; +} + +body { font-family: 'Droid Serif'; } + +h1, h2, h3 { + font-family: 'Yanone Kaffeesatz'; + font-weight: normal; + margin-top: 0.5em; +} + +a { + text-decoration: none; + color: blue; +} + +.remark-slide-content { padding: 1em 2.5em 1em 2.5em; } +.remark-slide-content { font-size: 25px; } +.remark-slide-content h1 { font-size: 50px; } +.remark-slide-content h2 { font-size: 50px; } +.remark-slide-content h3 { font-size: 25px; } + +.footnote { + position: absolute; + bottom: 3em; +} + +.remark-code { font-size: 25px; } +.small .remark-code { font-size: 16px; } +.remark-code, .remark-inline-code { font-family: 'Ubuntu Mono'; } +.remark-inline-code { + background-color: #ccc; +} + +.red { color: #fa0000; } +.gray { color: #ccc; } +.small { font-size: 70%; } +.big { font-size: 140%; } +.underline { text-decoration: underline; } +.strike { text-decoration: line-through; } + +.pic { + vertical-align: middle; + text-align: center; + padding: 0 0 0 0 !important; +} +img { + max-width: 100%; + max-height: 550px; +} +.small img { + max-height: 250px; +} + +.title { + vertical-align: middle; + text-align: center; +} +.title h1 { font-size: 3em; font-family: unset;} +.title p { font-size: 3em; } + +.nav { + font-size: 25px; + position: absolute; + left: 0; + right: 0; + bottom: 2em; +} + +.quote { + background: #eee; + border-left: 10px solid #ccc; + margin: 1.5em 10px; + padding: 0.5em 10px; + quotes: "\201C""\201D""\2018""\2019"; + font-style: italic; +} +.quote:before { + color: #ccc; + content: open-quote; + font-size: 4em; + line-height: 0.1em; + margin-right: 0.25em; + vertical-align: -0.4em; +} +.quote p { + display: inline; +} + +.blackbelt { + background-image: url("blackbelt.png"); + background-size: 1.5em; + background-repeat: no-repeat; + padding-left: 2em; +} +.warning { + background-image: url("warning.png"); + background-size: 1.5em; + background-repeat: no-repeat; + padding-left: 2em; +} +.exercise { + background-color: #eee; + background-image: url("keyboard.png"); + background-size: 1.4em; + background-repeat: no-repeat; + background-position: 0.2em 0.2em; + border: 2px dotted black; +} +.exercise:before { + content: "Exercise"; + margin-left: 1.8em; +} + +li p { line-height: 1.25em; } + +div.extra-details { + background-image: url(extra-details.png); + background-position: 0.5% 1%; + background-size: 4%; +} + +/* This is used only for the history slide (the only table in this doc) */ +td { + padding: 0.1em 0.5em; + background: #eee; +} diff --git a/docs/workshop.html b/docs/workshop.html new file mode 100644 index 00000000..1800ecd1 --- /dev/null +++ b/docs/workshop.html @@ -0,0 +1,35 @@ + + + + @@TITLE@@ + + + + + + + + + + diff --git a/netlify.toml b/netlify.toml new file mode 100644 index 00000000..f580b319 --- /dev/null +++ b/netlify.toml @@ -0,0 +1,5 @@ +[build] + base = "docs" + publish = "docs" + command = "./build.sh once" + 
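# Note (added commentary, not part of the original file): as configured above,
# Netlify uses `docs/` as both the build base and the publish directory, and
# runs `./build.sh once` there, so the generated `*.yml.html` slide decks are
# served as static files.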
diff --git a/prepare-vms/lib/aws.sh b/prepare-vms/lib/aws.sh index e7b47718..e99abad7 100644 --- a/prepare-vms/lib/aws.sh +++ b/prepare-vms/lib/aws.sh @@ -1,9 +1,9 @@ -aws_display_tags(){ +aws_display_tags() { # Print all "Name" tags in our region with their instance count echo "[#] [Status] [Token] [Tag]" \ | awk '{ printf "%-7s %-12s %-25s %-25s\n", $1, $2, $3, $4}' aws ec2 describe-instances \ - --query "Reservations[*].Instances[*].[State.Name,ClientToken,Tags[0].Value]" \ + --query "Reservations[*].Instances[*].[State.Name,ClientToken,Tags[0].Value]" \ | tr -d "\r" \ | uniq -c \ | sort -k 3 \ @@ -12,17 +12,17 @@ aws_display_tags(){ aws_get_tokens() { aws ec2 describe-instances --output text \ - --query 'Reservations[*].Instances[*].[ClientToken]' \ + --query 'Reservations[*].Instances[*].[ClientToken]' \ | sort -u } aws_display_instance_statuses_by_tag() { TAG=$1 need_tag $TAG - + IDS=$(aws ec2 describe-instances \ --filters "Name=tag:Name,Values=$TAG" \ - --query "Reservations[*].Instances[*].InstanceId" | tr '\t' ' ' ) + --query "Reservations[*].Instances[*].InstanceId" | tr '\t' ' ') aws ec2 describe-instance-status \ --instance-ids $IDS \ @@ -34,20 +34,20 @@ aws_display_instances_by_tag() { TAG=$1 need_tag $TAG result=$(aws ec2 describe-instances --output table \ - --filter "Name=tag:Name,Values=$TAG" \ - --query "Reservations[*].Instances[*].[ \ + --filter "Name=tag:Name,Values=$TAG" \ + --query "Reservations[*].Instances[*].[ \ InstanceId, \ State.Name, \ Tags[0].Value, \ PublicIpAddress, \ InstanceType \ ]" - ) - if [[ -z $result ]]; then - die "No instances found with tag $TAG in region $AWS_DEFAULT_REGION." - else - echo "$result" - fi + ) + if [[ -z $result ]]; then + die "No instances found with tag $TAG in region $AWS_DEFAULT_REGION." + else + echo "$result" + fi } aws_get_instance_ids_by_filter() { @@ -57,7 +57,6 @@ aws_get_instance_ids_by_filter() { --output text | tr "\t" "\n" | tr -d "\r" } - aws_get_instance_ids_by_client_token() { TOKEN=$1 need_tag $TOKEN @@ -76,8 +75,8 @@ aws_get_instance_ips_by_tag() { aws ec2 describe-instances --filter "Name=tag:Name,Values=$TAG" \ --output text \ --query "Reservations[*].Instances[*].PublicIpAddress" \ - | tr "\t" "\n" \ - | sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 # sort IPs + | tr "\t" "\n" \ + | sort -n -t . 
-k 1,1 -k 2,2 -k 3,3 -k 4,4 # sort IPs } aws_kill_instances_by_tag() { diff --git a/prepare-vms/lib/cli.sh b/prepare-vms/lib/cli.sh index aea4e956..b20e6c8c 100644 --- a/prepare-vms/lib/cli.sh +++ b/prepare-vms/lib/cli.sh @@ -31,18 +31,18 @@ sep() { if [ -z "$COLUMNS" ]; then COLUMNS=80 fi - SEP=$(yes = | tr -d "\n" | head -c $[$COLUMNS - 1]) + SEP=$(yes = | tr -d "\n" | head -c $(($COLUMNS - 1))) if [ -z "$1" ]; then >/dev/stderr echo $SEP else MSGLEN=$(echo "$1" | wc -c) - if [ $[ $MSGLEN +4 ] -gt $COLUMNS ]; then + if [ $(($MSGLEN + 4)) -gt $COLUMNS ]; then >/dev/stderr echo "$SEP" >/dev/stderr echo "$1" >/dev/stderr echo "$SEP" else - LEFTLEN=$[ ($COLUMNS - $MSGLEN - 2) / 2 ] - RIGHTLEN=$[ $COLUMNS - $MSGLEN - 2 - $LEFTLEN ] + LEFTLEN=$((($COLUMNS - $MSGLEN - 2) / 2)) + RIGHTLEN=$(($COLUMNS - $MSGLEN - 2 - $LEFTLEN)) LEFTSEP=$(echo $SEP | head -c $LEFTLEN) RIGHTSEP=$(echo $SEP | head -c $RIGHTLEN) >/dev/stderr echo "$LEFTSEP $1 $RIGHTSEP" diff --git a/prepare-vms/lib/colors.sh b/prepare-vms/lib/colors.sh index b20f32a0..dcfeec9b 100644 --- a/prepare-vms/lib/colors.sh +++ b/prepare-vms/lib/colors.sh @@ -1,15 +1,15 @@ -bold() { +bold() { echo "$(tput bold)$1$(tput sgr0)" -} - -red() { - echo "$(tput setaf 1)$1$(tput sgr0)" -} +} -green() { +red() { + echo "$(tput setaf 1)$1$(tput sgr0)" +} + +green() { echo "$(tput setaf 2)$1$(tput sgr0)" -} - -yellow(){ +} + +yellow() { echo "$(tput setaf 3)$1$(tput sgr0)" -} +} diff --git a/prepare-vms/lib/commands.sh b/prepare-vms/lib/commands.sh index 3c41c699..8a4248c4 100644 --- a/prepare-vms/lib/commands.sh +++ b/prepare-vms/lib/commands.sh @@ -1,7 +1,7 @@ export AWS_DEFAULT_OUTPUT=text HELP="" -_cmd () { +_cmd() { HELP="$(printf "%s\n%-12s %s\n" "$HELP" "$1" "$2")" } @@ -39,7 +39,7 @@ _cmd_cards() { need_tag $TAG need_settings $SETTINGS - aws_get_instance_ips_by_tag $TAG > tags/$TAG/ips.txt + aws_get_instance_ips_by_tag $TAG >tags/$TAG/ips.txt # Remove symlinks to old cards rm -f ips.html ips.pdf @@ -78,18 +78,18 @@ _cmd_deploy() { >/dev/stderr echo "" sep "Deploying tag $TAG" - pssh -I tee /tmp/settings.yaml < $SETTINGS + pssh -I tee /tmp/settings.yaml <$SETTINGS pssh " sudo apt-get update && sudo apt-get install -y python-setuptools && sudo easy_install pyyaml" # Copy postprep.py to the remote machines, and execute it, feeding it the list of IP addresses - pssh -I tee /tmp/postprep.py < lib/postprep.py - pssh --timeout 900 --send-input "python /tmp/postprep.py >>/tmp/pp.out 2>>/tmp/pp.err" < ips.txt + pssh -I tee /tmp/postprep.py >/tmp/pp.out 2>>/tmp/pp.err" tags/$TAG/ips.txt + echo "$IPS" >tags/$TAG/ips.txt link_tag $TAG if [ -n "$SETTINGS" ]; then _cmd_deploy $TAG $SETTINGS @@ -325,16 +325,16 @@ _cmd_start() { info "To deploy or kill these instances, run one of the following:" info "$0 deploy $TAG " info "$0 stop $TAG" - fi + fi } _cmd ec2quotas "Check our EC2 quotas (max instances)" -_cmd_ec2quotas(){ +_cmd_ec2quotas() { greet max_instances=$(aws ec2 describe-account-attributes \ - --attribute-names max-instances \ - --query 'AccountAttributes[*][AttributeValues]') + --attribute-names max-instances \ + --query 'AccountAttributes[*][AttributeValues]') info "In the current region ($AWS_DEFAULT_REGION) you can deploy up to $max_instances instances." 
# Print list of AWS EC2 regions, highlighting ours ($AWS_DEFAULT_REGION) in the list @@ -373,7 +373,7 @@ link_tag() { ln -sf $IPS_FILE ips.txt } -pull_tag(){ +pull_tag() { TAG=$1 need_tag $TAG link_tag $TAG @@ -405,13 +405,13 @@ wait_until_tag_is_running() { COUNT=$2 i=0 done_count=0 - while [[ $done_count -lt $COUNT ]]; do \ + while [[ $done_count -lt $COUNT ]]; do let "i += 1" info "$(printf "%d/%d instances online" $done_count $COUNT)" done_count=$(aws ec2 describe-instances \ - --filters "Name=instance-state-name,Values=running" \ - "Name=tag:Name,Values=$TAG" \ - --query "Reservations[*].Instances[*].State.Name" \ + --filters "Name=instance-state-name,Values=running" \ + "Name=tag:Name,Values=$TAG" \ + --query "Reservations[*].Instances[*].State.Name" \ | tr "\t" "\n" \ | wc -l) @@ -432,7 +432,7 @@ tag_is_reachable() { test_tag() { ips_file=tags/$TAG/ips.txt info "Picking a random IP address in $ips_file to run tests." - n=$[ 1 + $RANDOM % $(wc -l < $ips_file) ] + n=$((1 + $RANDOM % $(wc -l <$ips_file))) ip=$(head -n $n $ips_file | tail -n 1) test_vm $ip info "Tests complete." @@ -461,8 +461,8 @@ test_vm() { "env" \ "ls -la /home/docker/.ssh"; do sep "$cmd" - echo "$cmd" | - ssh -A -q \ + echo "$cmd" \ + | ssh -A -q \ -o "UserKnownHostsFile /dev/null" \ -o "StrictHostKeyChecking=no" \ $user@$ip sudo -u docker -i \ @@ -480,26 +480,26 @@ test_vm() { info "Test VM was $ip." } -make_key_name(){ +make_key_name() { SHORT_FINGERPRINT=$(ssh-add -l | grep RSA | head -n1 | cut -d " " -f 2 | tr -d : | cut -c 1-8) echo "${SHORT_FINGERPRINT}-${USER}" } sync_keys() { # make sure ssh-add -l contains "RSA" - ssh-add -l | grep -q RSA || - die "The output of \`ssh-add -l\` doesn't contain 'RSA'. Start the agent, add your keys?" + ssh-add -l | grep -q RSA \ + || die "The output of \`ssh-add -l\` doesn't contain 'RSA'. Start the agent, add your keys?" AWS_KEY_NAME=$(make_key_name) info "Syncing keys... " - if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &> /dev/null; then + if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &>/dev/null; then aws ec2 import-key-pair --key-name $AWS_KEY_NAME \ --public-key-material "$(ssh-add -L \ - | grep -i RSA \ - | head -n1 \ - | cut -d " " -f 1-2)" &> /dev/null + | grep -i RSA \ + | head -n1 \ + | cut -d " " -f 1-2)" &>/dev/null - if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &> /dev/null; then + if ! aws ec2 describe-key-pairs --key-name "$AWS_KEY_NAME" &>/dev/null; then die "Somehow, importing the key didn't work. Make sure that 'ssh-add -l | grep RSA | head -n1' returns an RSA key?" else info "Imported new key $AWS_KEY_NAME." @@ -523,4 +523,3 @@ describe_tag() { aws_display_instances_by_tag $TAG aws_display_instance_statuses_by_tag $TAG } - diff --git a/prepare-vms/lib/find-ubuntu-ami.sh b/prepare-vms/lib/find-ubuntu-ami.sh index 94049908..6a6020bf 100644 --- a/prepare-vms/lib/find-ubuntu-ami.sh +++ b/prepare-vms/lib/find-ubuntu-ami.sh @@ -3,10 +3,10 @@ # That way, it can be safely invoked as a function from other scripts. find_ubuntu_ami() { -( + ( -usage() { - cat >&2 <<__ + usage() { + cat >&2 <<__ usage: find-ubuntu-ami.sh [ ... ] [ ] [ ] where: is pair of key and substring to search @@ -33,66 +33,94 @@ where: protip for Docker orchestration workshop admin: ./find-ubuntu-ami.sh -t hvm:ebs -r \$AWS_REGION -v 15.10 -N __ - exit 1 -} + exit 1 + } -args=`getopt hr:n:v:a:t:d:i:k:RNVATDIKq $*` -if [ $? != 0 ] ; then - echo >&2 - usage -fi + args=$(getopt hr:n:v:a:t:d:i:k:RNVATDIKq $*) + if [ $? 
!= 0 ]; then + echo >&2 + usage + fi -region= -name= -version= -arch= -type= -date= -image= -kernel= + region= + name= + version= + arch= + type= + date= + image= + kernel= -sort=date + sort=date -quiet= + quiet= -set -- $args -for a ; do - case "$a" in - -h) usage ;; + set -- $args + for a; do + case "$a" in + -h) usage ;; - -r) region=$2 ; shift ;; - -n) name=$2 ; shift ;; - -v) version=$2 ; shift ;; - -a) arch=$2 ; shift ;; - -t) type=$2 ; shift ;; - -d) date=$2 ; shift ;; - -i) image=$2 ; shift ;; - -k) kernel=$2 ; shift ;; - - -R) sort=region ;; - -N) sort=name ;; - -V) sort=version ;; - -A) sort=arch ;; - -T) sort=type ;; - -D) sort=date ;; - -I) sort=image ;; - -K) sort=kernel ;; + -r) + region=$2 + shift + ;; + -n) + name=$2 + shift + ;; + -v) + version=$2 + shift + ;; + -a) + arch=$2 + shift + ;; + -t) + type=$2 + shift + ;; + -d) + date=$2 + shift + ;; + -i) + image=$2 + shift + ;; + -k) + kernel=$2 + shift + ;; - -q) quiet=y ;; - - --) shift ; break ;; - *) continue ;; - esac - shift -done + -R) sort=region ;; + -N) sort=name ;; + -V) sort=version ;; + -A) sort=arch ;; + -T) sort=type ;; + -D) sort=date ;; + -I) sort=image ;; + -K) sort=kernel ;; -[ $# = 0 ] || usage + -q) quiet=y ;; -fix_json() { - tr -d \\n | sed 's/,]}/]}/' -} + --) + shift + break + ;; + *) continue ;; + esac + shift + done -jq_query() { cat <<__ + [ $# = 0 ] || usage + + fix_json() { + tr -d \\n | sed 's/,]}/]}/' + } + + jq_query() { + cat <<__ .aaData | map ( { region: .[0], @@ -116,31 +144,31 @@ jq_query() { cat <<__ ) | sort_by(.$sort) | .[] | "\(.region)|\(.name)|\(.version)|\(.arch)|\(.type)|\(.date)|\(.image)|\(.kernel)" __ + } + + trim_quotes() { + sed 's/^"//;s/"$//' + } + + escape_spaces() { + sed 's/ /\\\ /g' + } + + url=http://cloud-images.ubuntu.com/locator/ec2/releasesTable + + { + [ "$quiet" ] || echo REGION NAME VERSION ARCH TYPE DATE IMAGE KERNEL + curl -s $url | fix_json | jq "$(jq_query)" | trim_quotes | escape_spaces | tr \| ' ' + } \ + | while read region name version arch type date image kernel; do + image=${image%<*} + image=${image#*>} + if [ "$quiet" ]; then + echo $image + else + echo "$region|$name|$version|$arch|$type|$date|$image|$kernel" + fi + done | column -t -s \| + + ) } - -trim_quotes() { - sed 's/^"//;s/"$//' -} - -escape_spaces() { - sed 's/ /\\\ /g' -} - -url=http://cloud-images.ubuntu.com/locator/ec2/releasesTable - -{ - [ "$quiet" ] || echo REGION NAME VERSION ARCH TYPE DATE IMAGE KERNEL - curl -s $url | fix_json | jq "`jq_query`" | trim_quotes | escape_spaces | tr \| ' ' -} | - while read region name version arch type date image kernel ; do - image=${image%<*} - image=${image#*>} - if [ "$quiet" ]; then - echo $image - else - echo "$region|$name|$version|$arch|$type|$date|$image|$kernel" - fi - done | column -t -s \| - -) -} \ No newline at end of file diff --git a/prepare-vms/lib/postprep.py b/prepare-vms/lib/postprep.py index 5f13d570..7383e531 100755 --- a/prepare-vms/lib/postprep.py +++ b/prepare-vms/lib/postprep.py @@ -60,7 +60,7 @@ system("echo docker:training | sudo chpasswd") # Fancy prompt courtesy of @soulshake. system("""sudo -u docker tee -a /home/docker/.bashrc </dev/null; then - warning "Dependency $dependency could not be found." - status=1 + warning "Dependency $dependency could not be found." + status=1 fi done return $status @@ -61,11 +61,11 @@ check_image() { docker inspect $TRAINER_IMAGE >/dev/null 2>&1 } -check_envvars || - die "Please set all required environment variables." 
+check_envvars \ + || die "Please set all required environment variables." -check_dependencies || - warning "At least one dependency is missing. Install it or try the image wrapper." +check_dependencies \ + || warning "At least one dependency is missing. Install it or try the image wrapper." # Now check which command was invoked and execute it if [ "$1" ]; then diff --git a/stacks/dockercoins+healthchecks.yml b/stacks/dockercoins+healthchecks.yml new file mode 100644 index 00000000..0f01d557 --- /dev/null +++ b/stacks/dockercoins+healthchecks.yml @@ -0,0 +1,35 @@ +version: "3" + +services: + rng: + build: dockercoins/rng + image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest} + deploy: + mode: global + + hasher: + build: dockercoins/hasher + image: ${REGISTRY-127.0.0.1:5000}/hasher:${TAG-latest} + deploy: + replicas: 7 + update_config: + delay: 5s + failure_action: rollback + max_failure_ratio: .5 + monitor: 5s + parallelism: 1 + + webui: + build: dockercoins/webui + image: ${REGISTRY-127.0.0.1:5000}/webui:${TAG-latest} + ports: + - "8000:80" + + redis: + image: redis + + worker: + build: dockercoins/worker + image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest} + deploy: + replicas: 10
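# Note (added commentary, not part of the original stack file): the
# `update_config` section of the `hasher` service above makes rolling updates
# cautious: tasks are replaced one at a time (`parallelism: 1`), with a 5s
# pause between tasks (`delay`), each new task is watched for 5s (`monitor`),
# and if more than half of the updated tasks fail (`max_failure_ratio: .5`),
# the update is rolled back automatically (`failure_action: rollback`).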