mirror of
https://github.com/jpetazzo/container.training.git
synced 2026-03-05 10:50:33 +00:00
<!DOCTYPE html>
<html>
<head>
  <base target="_blank">
  <title>Docker Orchestration Workshop</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
  <style type="text/css">
    @import url(https://fonts.googleapis.com/css?family=Yanone+Kaffeesatz);
    @import url(https://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic);
    @import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic);

    body { font-family: 'Droid Serif'; }

    h1, h2, h3 {
      font-family: 'Yanone Kaffeesatz';
      font-weight: normal;
      margin-top: 0.5em;
    }
    a {
      text-decoration: none;
      color: blue;
    }
    .remark-slide-content { padding: 1em 2.5em 1em 2.5em; }

    .remark-slide-content { font-size: 25px; }
    .remark-slide-content h1 { font-size: 50px; }
    .remark-slide-content h2 { font-size: 50px; }
    .remark-slide-content h3 { font-size: 25px; }
    .remark-code { font-size: 25px; }
    .small .remark-code { font-size: 16px; }

    .remark-code, .remark-inline-code { font-family: 'Ubuntu Mono'; }
    .red { color: #fa0000; }
    .gray { color: #ccc; }
    .small { font-size: 70%; }
    .big { font-size: 140%; }
    .underline { text-decoration: underline; }
    .pic {
      vertical-align: middle;
      text-align: center;
      padding: 0 0 0 0 !important;
    }
    img {
      max-width: 100%;
      max-height: 550px;
    }
    .title {
      vertical-align: middle;
      text-align: center;
    }
    .title h1 { font-size: 100px; }
    .title p { font-size: 100px; }
    .quote {
      background: #eee;
      border-left: 10px solid #ccc;
      margin: 1.5em 10px;
      padding: 0.5em 10px;
      quotes: "\201C""\201D""\2018""\2019";
      font-style: italic;
    }
    .quote:before {
      color: #ccc;
      content: open-quote;
      font-size: 4em;
      line-height: 0.1em;
      margin-right: 0.25em;
      vertical-align: -0.4em;
    }
    .quote p {
      display: inline;
    }
    .warning {
      background-image: url("warning.png");
      background-size: 1.5em;
      background-repeat: no-repeat;
      padding-left: 2em;
    }
    .exercise {
      background-color: #eee;
      background-image: url("keyboard.png");
      background-size: 1.4em;
      background-repeat: no-repeat;
      background-position: 0.2em 0.2em;
      border: 2px dotted black;
    }
    .exercise::before {
      content: "Exercise";
      margin-left: 1.8em;
    }
    li p { line-height: 1.25em; }
  </style>
</head>
<body>
<textarea id="source">

class: title

Docker <br/> Orchestration <br/> Workshop

---

## Logistics

- Hello! We're `jerome at docker dot com` and `aj at soulshake dot net`

<!--
Reminder, when updating the agenda: when people are told to show
up at 9am, they usually trickle in until 9:30am (except for paid
training sessions). If you're not sure that people will be there
on time, it's a good idea to have a breakfast with the attendees
at e.g. 9am, and start at 9:30.

- Agenda:

.small[
- 08:00-09:00 hello and breakfast
- 09:00-10:25 part 1
- 10:25-10:35 coffee break
- 10:35-12:00 part 2
- 12:00-13:00 lunch break
- 13:00-14:25 part 3
- 14:25-14:35 coffee break
- 14:35-16:00 part 4
]

-->

- The tutorial will run from 1:20pm to 4:40pm

- There will be a break from 3:00pm to 3:15pm

- This will be FAST PACED, but DON'T PANIC!

- All the content is publicly available (slides, code samples, scripts)

<!--
Remember to change:
- the Gitter link below
- the "tweet my speed" hashtag in DockerCoins HTML
-->

- Live feedback, questions, help on
  [Gitter](http://container.training/chat)

---

<!--
grep '^# ' index.html | grep -v '<br' | tr '#' '-'
-->

## Chapter 1: getting started

- Pre-requirements
- VM environment
- Our sample application
- Running the application
- Identifying bottlenecks
- Scaling out
- Connecting to containers on other hosts
- Abstracting remote services with ambassadors

---

## Chapter 2: Swarm setup and deployment

- Dynamic orchestration
- Deploying Swarm
- Picking a key/value store
- Running containers on Swarm
- Resource allocation
- Multi-host networking
- Building images with Swarm
- Deploying a local registry
- Scaling web services with Compose on Swarm

---

## Chapter 3: Docker for Ops

- Logs
- Setting up ELK to store container logs
- Network traffic analysis
- Backups
- Controlling Docker from a container
- Docker events stream
- Security upgrades

---

## Chapter 4: high availability (additional content)

- Distributing Machine credentials
- Highly available Swarm managers
- Highly available containers
- Conclusions

---

# Pre-requirements

- Computer with network connection and SSH client

  - on Linux, OS X, FreeBSD... you are probably all set

  - on Windows, get [putty](http://www.putty.org/),
    [Git BASH](https://msysgit.github.io/), or
    [MobaXterm](http://mobaxterm.mobatek.net/)

- Basic Docker knowledge
  <br/>(but that's OK if you're not a Docker expert!)

---

## Nice-to-haves

- [GitHub](https://github.com/join) account
  <br/>(if you want to fork the repo; also used to join Gitter)

- [Gitter](https://gitter.im/) account
  <br/>(to join the conversation during the workshop)

- [Docker Hub](https://hub.docker.com) account
  <br/>(it's one way to distribute images on your Swarm cluster)

---

## Hands-on sections

- The whole workshop is hands-on

- I will show Docker in action

- I invite you to reproduce what I do

- All hands-on sections are clearly identified, like the gray rectangle below

.exercise[

- This is the stuff you're supposed to do!
- Go to [container.training](http://container.training/) to view these slides
- Join the chat room on
  [Gitter](http://container.training/chat)

]

---

# VM environment

- Each person gets 5 private VMs (not shared with anybody else)
- They'll be up until tonight
- You have a little card with login+password+IP addresses
- You can automatically SSH from one VM to another

.exercise[

<!--
```bash
for N in $(seq 1 5); do
  ssh -o StrictHostKeyChecking=no node$N true
done
for N in $(seq 1 5); do
  (
    docker-machine rm -f node$N
    ssh node$N "docker ps -aq | xargs -r docker rm -f"
    ssh node$N sudo rm -f /etc/systemd/system/docker.service
    ssh node$N sudo systemctl daemon-reload
    echo Restarting node$N.
    ssh node$N sudo systemctl restart docker
    echo Restarted node$N.
  ) &
done
wait
```
-->

- Log into the first VM (`node1`)
- Check that you can SSH (without password) to `node2`:
  ```bash
  ssh node2
  ```
- Type `exit` or `^D` to come back to node1

<!--
```meta
^D
```
-->

]

---

## We will (mostly) interact with node1 only

- Unless instructed, **all commands must be run from the first VM, `node1`**

- We will only checkout/copy the code on `node1`

- When we use the other nodes, we will do it mostly through the Docker API

- We will use SSH only for a few "out of band" operations (mass-removing containers...)

---

## Terminals

Once in a while, the instructions will say:
<br/>"Open a new terminal."

There are multiple ways to do this:

- create a new window or tab on your machine, and SSH into the VM;

- use screen or tmux on the VM and open a new window from there.

You are welcome to use whichever method you feel most comfortable with.

---

## Tmux cheatsheet

- Ctrl-b c → create a new window
- Ctrl-b n → go to next window
- Ctrl-b p → go to previous window
- Ctrl-b " → split window top/bottom
- Ctrl-b % → split window left/right
- Ctrl-b Alt-1 → rearrange windows in columns
- Ctrl-b Alt-2 → rearrange windows in rows
- Ctrl-b arrows → navigate to other windows
- Ctrl-b d → detach session
- tmux attach → reattach to session

---

## Brand new versions!

- Engine 1.11
- Compose 1.7
- Swarm 1.2
- Machine 0.6

.exercise[

- Check all installed versions:
  ```bash
  docker version
  docker-compose -v
  docker run --rm swarm -version
  docker-machine -v
  ```

]

---

## Why are we not using the latest version of Machine?

- The latest version of Machine is 0.7

- The way it deploys Swarm is different from 0.6

- This causes a regression in the strategy that we will use later

- More details later!

---

# Our sample application

- Visit the GitHub repository with all the materials of this workshop:
  <br/>https://github.com/jpetazzo/orchestration-workshop

- The application is in the [dockercoins](
  https://github.com/jpetazzo/orchestration-workshop/tree/master/dockercoins)
  subdirectory

- Let's look at the general layout of the source code:

  there is a Compose file [docker-compose.yml](
  https://github.com/jpetazzo/orchestration-workshop/blob/master/dockercoins/docker-compose.yml) ...

  ... and 4 other services, each in its own directory:

  - `rng` = web service generating random bytes
  - `hasher` = web service computing hash of POSTed data
  - `worker` = background process using `rng` and `hasher`
  - `webui` = web interface to watch progress

---

## Compose file format version

*Particularly relevant if you have used Compose before...*

- Compose 1.6 introduced support for a new Compose file format (aka "v2")

- Services are no longer at the top level, but under a `services` section

- There has to be a `version` key at the top level, with value `"2"` (as a string, not an integer)

- Containers are placed on a dedicated network, making links unnecessary

- There are other minor differences, but upgrade is easy and straightforward
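
Here is what a minimal "v2" file can look like (an illustrative sketch with made-up service names, not the actual DockerCoins file):

```yaml
version: "2"

services:
  redis:
    image: redis
  worker:
    build: worker
```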

---

## Links, naming, and service discovery

- Containers can have network aliases (resolvable through DNS)

- Compose file version 2 makes each container reachable through its service name

- Compose file version 1 requires "links" sections

- Our code can connect to services using their short name

  (instead of e.g. IP address or FQDN)

---

## Example in `worker/worker.py`



---

## What's this application?

---

class: pic



(DockerCoins 2016 logo courtesy of @XtlCnslt and @ndeloof. Thanks!)

---

## What's this application?

- It is a DockerCoin miner! 💰🐳📦🚢

- No, you can't buy coffee with DockerCoins

- How DockerCoins works:

  - `worker` asks `rng` to give it random bytes
  - `worker` feeds those random bytes into `hasher`
  - each hash starting with `0` is a DockerCoin
  - DockerCoins are stored in `redis`
  - `redis` is also updated every second to track speed
  - you can see the progress with the `webui`
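
The mining loop described above can be sketched in a few lines of Python (a simplified stand-in: the real `worker` calls the `rng` and `hasher` services over HTTP, and stores coins in `redis`):

```python
import hashlib
import os

def mine_once():
    """One iteration of the mining loop; returns a coin (a hash) or None."""
    random_bytes = os.urandom(32)                      # stand-in for the rng service
    digest = hashlib.sha256(random_bytes).hexdigest()  # stand-in for the hasher service
    # Each hash starting with "0" is a DockerCoin
    return digest if digest.startswith("0") else None

coins = [c for c in (mine_once() for _ in range(1000)) if c is not None]
print("Found", len(coins), "coins out of 1000 attempts")
```

On average, 1 hexadecimal digest out of 16 starts with `0`, so the worker only finds a coin every few attempts.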

---

## Getting the application source code

- We will clone the GitHub repository

- The repository also contains scripts and tools that we will use throughout the workshop

.exercise[

<!--
```bash
[ -d orchestration-workshop ] && mv orchestration-workshop orchestration-workshop.$$
```
-->

- Clone the repository on `node1`:
  ```bash
  git clone git://github.com/jpetazzo/orchestration-workshop
  ```

]

(You can also fork the repository on GitHub and clone your fork if you prefer that.)

---

# Running the application

Without further ado, let's start our application.

.exercise[

- Go to the `dockercoins` directory, in the cloned repo:
  ```bash
  cd ~/orchestration-workshop/dockercoins
  ```

- Use Compose to build and run all containers:
  ```bash
  docker-compose up
  ```

]

Compose tells Docker to build all container images (pulling
the corresponding base images), then starts all containers,
and displays aggregated logs.

---

## Lots of logs

- The application continuously generates logs

- We can see the `worker` service making requests to `rng` and `hasher`

- Let's put that in the background

.exercise[

- Stop the application by hitting `^C`

<!--
```meta
^C
```
-->

]

- `^C` stops all containers by sending them the `TERM` signal

- Some containers exit immediately, others take longer
  <br/>(because they don't handle `SIGTERM` and end up being killed after a 10s timeout)

---

## Restarting in the background

- Many flags and commands of Compose are modeled after those of `docker`

.exercise[

- Start the app in the background with the `-d` option:
  ```bash
  docker-compose up -d
  ```

- Check that our app is running with the `ps` command:
  ```bash
  docker-compose ps
  ```

]

`docker-compose ps` also shows the ports exposed by the application.

---

## Viewing logs

- The `docker-compose logs` command works like `docker logs`

.exercise[

- View all logs since container creation and exit when done:
  ```bash
  docker-compose logs
  ```

- Stream container logs, starting at the last 10 lines for each container:
  ```bash
  docker-compose logs --tail 10 --follow
  ```

<!--
```meta
^C
```
-->

]

Tip: use `^S` and `^Q` to pause/resume log output.

???

## Upgrading from Compose 1.6

.warning[The `logs` command has changed between Compose 1.6 and 1.7!]

- Up to 1.6

  - `docker-compose logs` is the equivalent of `logs --follow`

  - `docker-compose logs` must be restarted if containers are added

- Since 1.7

  - `--follow` must be specified explicitly

  - new containers are automatically picked up by `docker-compose logs`

---

## Connecting to the web UI

- The `webui` container exposes a web dashboard; let's view it

.exercise[

- Open http://[yourVMaddr]:8000/ (from a browser)

]

- The app actually has a constant, steady speed (3.33 coins/second)

- The speed seems not-so-steady because:

  - the worker doesn't update the counter after every loop, but up to once per second

  - the speed is computed by the browser, checking the counter about once per second

  - between two consecutive updates, the counter will increase either by 4, or by 0

---

## Scaling up the application

- Our goal is to make that performance graph go up (without changing a line of code!)

- Before trying to scale the application, we'll figure out if we need more resources

  (CPU, RAM...)

- For that, we will use good old UNIX tools on our Docker node

<!-- FIXME add reference to cadvisor, snap, ...? -->

---

## Looking at resource usage

- Let's look at CPU, memory, and I/O usage

.exercise[

- run `top` to see CPU and memory usage (you should see idle cycles)

- run `vmstat 3` to see I/O usage (si/so/bi/bo)
  <br/>(the 4 numbers should be almost zero, except `bo` for logging)

]

We have available resources.

- Why?
- How can we use them?

---

## Scaling workers on a single node

- Docker Compose supports scaling
- Let's scale `worker` and see what happens!

.exercise[

- Start one more `worker` container:
  ```bash
  docker-compose scale worker=2
  ```

- Look at the performance graph (it should show a 2x improvement)

- Look at the aggregated logs of our containers (`worker_2` should show up)

- Look at the impact on CPU load with e.g. top (it should be negligible)

]

---

## Adding more workers

- Great, let's add more workers and call it a day, then!

.exercise[

- Start eight more `worker` containers:
  ```bash
  docker-compose scale worker=10
  ```

- Look at the performance graph: does it show a 10x improvement?

- Look at the aggregated logs of our containers

- Look at the impact on CPU load and memory usage

<!--
```bash
sleep 5
killall docker-compose
```
-->

]

---

# Identifying bottlenecks

- You should have seen a 3x speed bump (not 10x)

- Adding workers didn't result in linear improvement

- *Something else* is slowing us down

--

- ... But what?

--

- The code doesn't have instrumentation

- Let's use state-of-the-art HTTP performance analysis!
  <br/>(i.e. good old tools like `ab`, `httping`...)

---

## Measuring latency under load

We will use `httping`.

.exercise[

- Check the latency of `rng`:
  ```bash
  httping -c 10 localhost:8001
  ```

- Check the latency of `hasher`:
  ```bash
  httping -c 10 localhost:8002
  ```

]

`rng` has a much higher latency than `hasher`.

---

## Let's draw hasty conclusions

- The bottleneck seems to be `rng`

- *What if* we don't have enough entropy and can't generate enough random numbers?

- We need to scale out the `rng` service on multiple machines!

Note: this is a fiction! We have enough entropy. But we need a pretext to scale out.
<br/>(In fact, the code of `rng` uses `/dev/urandom`, which doesn't need entropy.)

---

class: title

# Scaling out

---

# Connecting to containers on other hosts

- So far, our whole stack is on a single machine

- We want to scale out (across multiple nodes)

- We will deploy the same stack multiple times

- But we want every stack to use the same Redis
  <br/>(in other words: Redis is our only *stateful* service here)

--

- And remember: we're not allowed to change the code!

  - the code connects to host `redis`
  - `redis` must resolve to the address of our Redis service
  - the Redis service must listen on the default port (6379)

???

## Using custom DNS mapping

- We could set up a Redis server on its default port

- And add a DNS entry mapping `redis` to this server

.exercise[

- See what happens if we run:
  ```bash
  docker run --add-host redis:1.2.3.4 alpine ping redis
  ```

<!--
```meta
^C
```
-->

]

There is a Compose file option for that: `extra_hosts`.

---

# Abstracting remote services with ambassadors

<!--

- What if we can't/won't run Redis on its default port?

- What if we want to be able to move it easily?

-->

- We will use an ambassador

- Redis will be started independently of our stack

- It will run at an arbitrary location (host+port)

- In our stack, we replace `redis` with an ambassador

- The ambassador will connect to Redis

- The ambassador will "act as" Redis in the stack

---

class: pic



---

class: pic



---

class: pic



---

class: pic



---

class: pic



---

class: pic



---

class: pic



---

## Start redis

- Start a standalone Redis container

- Let Docker expose it on a random port

.exercise[

- Run redis with a random public port:
  <br/>`docker run -d -P --name myredis redis`

- Check which port was allocated:
  <br/>`docker port myredis 6379`

]

- Note the IP address of the machine, and this port

---

## Introduction to `jpetazzo/hamba`

- General purpose load balancer and traffic director

- [Source code is available on GitHub](
  https://github.com/jpetazzo/hamba)

- [Public image is available on the Docker Hub](
  https://hub.docker.com/r/jpetazzo/hamba/)

- Generates a configuration file for HAProxy, then starts HAProxy

- Parameters are provided on the command line; for instance:
  ```bash
  docker run -d -p 80 jpetazzo/hamba 80 www1:1234 www2:2345
  docker run -d -p 80 jpetazzo/hamba 80 www1 1234 www2 2345
  ```
  Those two commands do the same thing: they start a load balancer
  listening on port 80, balancing traffic across www1:1234 and www2:2345

---

## Update `docker-compose.yml`

.exercise[

- Replace `redis` with an ambassador using `jpetazzo/hamba`:
  ```yaml
  redis:
    image: jpetazzo/hamba
    command: 6379 `AA.BB.CC.DD:EEEEE`
  ```

<!--
```edit
cat docker-compose.yml-ambassador | sed "s/AA.BB.CC.DD/$(curl myip.enix.org/REMOTE_ADDR)/" | sed "s/EEEEE/$(docker port myredis 6379 | cut -d: -f2)/" > docker-compose.yml
```
-->

]

Shortcut: `docker-compose.yml-ambassador`
<br/>(But you still have to update `AA.BB.CC.DD:EEEEE`!)

---

## Start the stack on the first machine

- Compose will detect the change in the `redis` service

- It will replace `redis` with a `jpetazzo/hamba` instance

.exercise[

- Just tell Compose to do its thing:
  <br/>`docker-compose up -d`

- Check that the stack is up and running:
  <br/>`docker-compose ps`

- Look at the web UI to make sure that it works fine

]

---

## Controlling other Docker Engines

- Many tools in the ecosystem will honor the `DOCKER_HOST` environment variable

- Those tools include (obviously!) the Docker CLI and Docker Compose

- Our training VMs have been set up to accept API requests on port 55555
  <br/>(without authentication - this is very insecure, by the way!)

- We will see later how to set up mutual authentication with certificates

---

## Setting the `DOCKER_HOST` environment variable

.exercise[

- Check how many containers are running on `node1`:
  ```bash
  docker ps
  ```

- Set the `DOCKER_HOST` variable to control `node2`, and compare:
  ```bash
  export DOCKER_HOST=tcp://node2:55555
  docker ps
  ```

]

You shouldn't see any container running on `node2` at this point.

---

## Start the stack on another machine

- We will tell Compose to bring up our stack on the other node

- It will use the local code (we don't need to check out the code on `node2`)

.exercise[

- Start the stack:
  ```bash
  docker-compose up -d
  ```

]

Note: this will build the container images on `node2`, resulting
in potentially different results from `node1`. We will see later
how to use the same images across the whole cluster.

---

## Run the application on every node

- We will repeat the previous step with a little shell loop

  ... but introduce parallelism to save some time

.exercise[

- Deploy one instance of the stack on each node:

  ```bash
  for N in 3 4 5; do
    DOCKER_HOST=tcp://node$N:55555 docker-compose up -d &
  done
  wait
  ```

]

Note: again, this will rebuild the container images on each node.

---

## Scale!

- The app is built (and running!) everywhere

- Scaling can be done very quickly

.exercise[

- Add a bunch of workers all over the place:

  ```bash
  for N in 1 2 3 4 5; do
    DOCKER_HOST=tcp://node$N:55555 docker-compose scale worker=10
  done
  ```

- Admire the result in the web UI!

]

---

## A few words about development volumes

- Try to access the web UI on another node

--

- It doesn't work! Why?

--

- Static assets are masked by an empty volume

--

- We need to comment out the `volumes` section

---

## Why must we comment out the `volumes` section?

- Volumes have multiple uses:

  - storing persistent stuff (database files...)

  - sharing files between containers (logs, configuration...)

  - sharing files between host and containers (source...)

- The `volumes` directive expands to a host path:

  `/home/docker/orchestration-workshop/dockercoins/webui/files`

- This host path exists on the local machine (not on the others)

- This specific volume is used in development (not in production)
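
As an illustration, the development-time declaration looks something like this (a sketch; refer to the actual `docker-compose.yml` for the exact paths):

```yaml
webui:
  build: webui
  ports:
    - "8000:80"
  volumes:
    - "./webui/files/:/files/"
```

The relative path `./webui/files/` resolves to a directory that only exists on the machine where the repository was cloned.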
|
|
|
|
---
|
|
|
|
## Stop the app
|
|
|
|
- Let's use `docker-compose down`
|
|
|
|
- It will stop and remove the DockerCoins app (but leave other containers running)
|
|
|
|
.exercise[
|
|
|
|
- We can do another simple parallel shell loop:
|
|
```bash
|
|
for N in $(seq 1 5); do
|
|
export DOCKER_HOST=tcp://node$N:55555
|
|
docker-compose down &
|
|
done
|
|
wait
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Clean up the redis container
|
|
|
|
- `docker-compose down` only removes containers defined with Compose
|
|
|
|
.exercise[
|
|
|
|
- Check that `myredis` is still there:
|
|
```bash
|
|
unset DOCKER_HOST
|
|
docker ps
|
|
```
|
|
|
|
- Remove it:
|
|
```bash
|
|
docker rm -f myredis
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Considerations about ambassadors
|
|
|
|
"Ambassador" is a design pattern.
|
|
|
|
There are many ways to implement it.
|
|
|
|
Others implementations include:
|
|
|
|
- [interlock](https://github.com/ehazlett/interlock);
|
|
- [registrator](http://gliderlabs.com/registrator/latest/);
|
|
- [smartstack](http://nerds.airbnb.com/smartstack-service-discovery-cloud/);
|
|
- [zuul](https://github.com/Netflix/zuul/wiki);
|
|
- and more!
|
|
|
|
<!--
|
|
|
|
We will present three increasingly complex (but also powerful)
|
|
ways to deploy ambassadors.
|
|
|
|
-->
|
|
|
|
???
|
|
|
|
## Single-tier ambassador deployment
|
|
|
|
- One-shot configuration process
|
|
|
|
- Must be executed manually after each scaling operation
|
|
|
|
- Scans current state, updates load balancer configuration
|
|
|
|
- Pros:
|
|
<br/>- simple, robust, no extra moving part
|
|
<br/>- easy to customize (thanks to simple design)
|
|
<br/>- can deal efficiently with large changes
|
|
|
|
- Cons:
|
|
<br/>- must be executed after each scaling operation
|
|
<br/>- harder to compose different strategies
|
|
|
|
- Example: this workshop
|
|
|
|
???
|
|
|
|
## Two-tier ambassador deployment
|
|
|
|
- Daemon listens to Docker events API
|
|
|
|
- Reacts to container start/stop events
|
|
|
|
- Adds/removes back-ends to load balancers configuration
|
|
|
|
- Pros:
|
|
<br/>- no extra step required when scaling up/down
|
|
|
|
- Cons:
|
|
<br/>- extra process to run and maintain
|
|
<br/>- deals with one event at a time (ordering matters)
|
|
|
|
- Hidden gotcha: load balancer creation
|
|
|
|
- Example: interlock
|
|
|
|
???
|
|
|
|
## Three-tier ambassador deployment
|
|
|
|
|
|
- Daemon listens to Docker events API
|
|
|
|
- Reacts to container start/stop events
|
|
|
|
- Adds/removes scaled services in distributed config DB (Zookeeper, etcd, Consul…)
|
|
|
|
- Another daemon listens to config DB events,
|
|
<br/>adds/removes backends to load balancers configuration
|
|
|
|
- Pros:
|
|
<br/>- more flexibility
|
|
|
|
- Cons:
|
|
<br/>- three extra services to run and maintain
|
|
|
|
- Example: registrator
|
|
|
|
---
|
|
|
|
## Ambassadors and overlay networks
|
|
|
|
- Overlay networks allow direct multi-host communication
|
|
|
|
- Ambassadors are still useful to implement other tasks:
|
|
|
|
- load balancing;
|
|
|
|
- credentials injection;
|
|
|
|
- instrumentation;
|
|
|
|
- fail-over;
|
|
|
|
- etc.
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
# Dynamic orchestration
|
|
|
|
---
|
|
|
|
## Static vs Dynamic
|
|
|
|
- Static
|
|
|
|
- you decide what goes where
|
|
|
|
- simple to describe and implement
|
|
|
|
- seems easy at first but doesn't scale efficiently
|
|
|
|
- Dynamic
|
|
|
|
- the system decides what goes where
|
|
|
|
- requires extra components (HA KV...)
|
|
|
|
- scaling can be finer-grained, more efficient
|
|
|
|
---
|
|
|
|
class: pic
|
|
|
|
## Hands-on Swarm
|
|
|
|

|
|
|
|
---
|
|
|
|
## Swarm (in theory)
|
|
|
|
- Consolidates multiple Docker hosts into a single one
|
|
|
|
- You talk to Swarm using the Docker API
|
|
|
|
→ you can use all existing tools: Docker CLI, Docker Compose, etc.
|
|
|
|
- Swarm talks to your Docker Engines using the Docker API too
|
|
|
|
→ you can use existing Engines without modification
|
|
|
|
- Dispatches (schedules) your containers across the cluster, transparently
|
|
|
|
- Open source and written in Go (like the Docker Engine)
|
|
|
|
- Initial design and implementation by [@aluzzardi](https://twitter.com/aluzzardi) and [@vieux](https://twitter.com/vieux),
|
|
who were also the authors of the first versions of the Docker Engine
|
|
|
|
---
|
|
|
|
## Swarm (in practice)
|
|
|
|
- Stable since November 2015
|
|
|
|
- Easy to setup (compared to other orchestrators)
|
|
|
|
- Tested with 1000 nodes + 50000 containers
|
|
<br/>.small[(without particular tuning; see DockerCon EU opening keynotes!)]
|
|
|
|
- Requires a key/value store for advanced features
|
|
|
|
- Can use Consul, etcd, or Zookeeper
|
|
|
|
---
|
|
|
|
# Deploying Swarm
|
|
|
|
- Components involved:
|
|
|
|
- cluster discovery mechanism
|
|
<br/>(so that the manager can learn about the nodes)
|
|
|
|
- Swarm manager
|
|
<br/>(your frontend to the cluster)
|
|
|
|
- Swarm agent
|
|
<br/>(runs on each node, registers it with service discovery)
|
|
|
|
---
|
|
|
|
## Cluster discovery
|
|
|
|
- Possible backends:
|
|
|
|
- dynamic, self-hosted
|
|
<br/>(requires to run a Consul/etcd/Zookeeper cluster)
|
|
|
|
- static, through command-line or file
|
|
<br/>(great for testing, or for private subnets, see [this article](
|
|
https://medium.com/on-docker/docker-swarm-flat-file-engine-discovery-2b23516c71d4#.6vp94h5wn)
|
|
|
|
- external, token-based
|
|
<br/>(dynamic; nothing to operate; relies on external service operated by Docker Inc.)
|
|
|
|
---
|
|
|
|
## Swarm agent
|
|
|
|
- Used only for dynamic discovery (ZK, etcd, Consul, token)
|
|
|
|
- Must run on each node
|
|
|
|
- Every 20s (by default), tells to the discovery system:
|
|
|
|
*"Hello, there is a Swarm node at A.B.C.D:EFGH"*
|
|
|
|
- Must know the node's IP address
|
|
|
|
(It cannot figure it out by itself, because it doesn't know whether to use public or private addresses)
|
|
|
|
- The node continues to work even if the agent dies
|
|
|
|
---
|
|
|
|
## Swarm manager
|
|
|
|
- Accepts Docker API requests
|
|
|
|
- Communicates with the cluster nodes
|
|
|
|
- Performs healthchecks, scheduling...
|
|
|
|
---
|
|
|
|
# Picking a key/value store
|
|
|
|
- We are going to use a key/value store, and use it for:
|
|
|
|
- cluster membership discovery
|
|
|
|
- overlay networks backend
|
|
|
|
- resilient storage of important credentials
|
|
|
|
- Swarm leader election
|
|
|
|
- We are going to use Consul, and run one Consul instance on each node
|
|
|
|
(That way, we can always access Consul over localhost)
|
|
|
|
---
|
|
|
|
## Do we really need a key/value store?
|
|
|
|
- Cluster membership discovery doesn't *require* a key/value store
|
|
|
|
(We could use the token mechanism instead)
|
|
|
|
- Network overlays don't *require* a key/value store
|
|
|
|
(We could use a plugin like Weave instead)
|
|
|
|
- Credentials can be distributed through other mechanisms
|
|
|
|
(E.g. copying them to a private S3 bucket)
|
|
|
|
- Swarm leader election, however, requires a key/value store
|
|
|
|
---
|
|
|
|
## Why are we using a key/value store, then?
|
|
|
|
- Each aforementioned mechanism requires some reliable, distributed storage
|
|
|
|
- If we don't use our own key/value store, we end up using *something else*:
|
|
|
|
- Docker Inc.'s centralized token discovery service
|
|
|
|
- [Weave's CRDT protocol](https://github.com/weaveworks/weave/wiki/IP-allocation-design)
|
|
|
|
- AWS S3 (or your cloud provider's equivalent, or some other file storage system)
|
|
|
|
- Each of those is one extra potential point of failure
|
|
|
|
- See for instance [Kyle Kingsbury's analysis of Chronos](https://aphyr.com/posts/326-jepsen-chronos) for an illustration of this problem
|
|
|
|
- By operating our own key/value store, we have 1 extra service instead of 3 (or more)
|
|
|
|
---
|
|
|
|
## Should we always use a key/value store?
|
|
|
|
--
|
|
|
|
- No!
|
|
|
|
--
|
|
|
|
- If you don't want to operate your own key/value store, don't do it
|
|
|
|
- You might be more comfortable using tokens + Weave + S3, for instance
|
|
|
|
- You can also use static discovery
|
|
|
|
- Maybe you don't even need overlay networks
|
|
|
|
---
|
|
|
|
## Why Consul?
|
|
|
|
- Consul is not the "official" or best way to do this
|
|
|
|
- This is an arbitrary decision made by Yours Truly
|
|
|
|
- I *personally* find Consul easier to set up for a workshop like this
|
|
|
|
- ... But etcd and Zookeeper will work too!
|
|
|
|
---
|
|
|
|
## Setting up our Swarm cluster
|
|
|
|
We need to:
|
|
|
|
- create certificates,
|
|
|
|
- distribute them on our nodes,
|
|
|
|
- run the Swarm agent on every node,
|
|
|
|
- run the Swarm manager on `node1`,
|
|
|
|
- reconfigure the Engine on each node to add extra flags (for overlay networks).
|
|
|
|
That's a lot of work, so we'll use Docker Machine to automate this.
|
|
|
|
---
|
|
|
|
## Using Docker Machine to setup a Swarm cluster
|
|
|
|
- Docker Machine has two primary uses:
|
|
|
|
- provisioning cloud instances running the Docker Engine
|
|
|
|
- managing local Docker VMs within e.g. VirtualBox
|
|
|
|
- It can also create Swarm clusters, and will:
|
|
|
|
- create and manage certificates
|
|
|
|
- automatically start swarm agent and manager containers
|
|
|
|
- It comes with a special driver, `generic`, to (re)configure existing machines
|
|
|
|
---
|
|
|
|
## Setting up Docker Machine
|
|
|
|
- Install `docker-machine` (single binary download)
|
|
|
|
(This is already done on your VMs!)
|
|
|
|
- Set a few environment variables (cloud credentials)
|
|
```bash
|
|
export AWS_ACCESS_KEY_ID=AKI...
|
|
export AWS_SECRET_ACCESS_KEY=...
|
|
export AWS_DEFAULT_REGION=eu-west-2
|
|
export DIGITALOCEAN_ACCESS_TOKEN=...
|
|
export DIGITALOCEAN_SIZE=2gb
|
|
export AZURE_SUBSCRIPTION_ID=...
|
|
```
|
|
|
|
(We already have 5 nodes, so we don't need to do this!)
|
|
|
|
---
|
|
|
|
## Creating nodes with Docker Machine
|
|
|
|
- The only two mandatory parameters are the driver to use, and the machine name:
|
|
```bash
|
|
docker-machine create -d digitalocean node42
|
|
```
|
|
|
|
- *Tons* of parameters can be specified; see [Docker Machine driver documentation](https://docs.docker.com/machine/drivers/)
|
|
|
|
- To list machines and their status:
|
|
```bash
|
|
docker-machine ls
|
|
```
|
|
|
|
- To destroy a machine:
|
|
```bash
|
|
docker-machine rm node42
|
|
```
|
|
|
|
---
|
|
|
|
## Communicating with nodes managed by Docker Machine
|
|
|
|
- Select a machine for use:
|
|
```bash
|
|
eval $(docker-machine env node42)
|
|
```
|
|
This will set a few environment variables (at least `DOCKER_HOST`).
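For reference, the command being eval'ed prints a few `export` statements along these lines (the values below are made up):

```shell
# Typical output of `docker-machine env node42` (values are examples)
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://203.0.113.42:2376"
export DOCKER_CERT_PATH="$HOME/.docker/machine/machines/node42"
export DOCKER_MACHINE_NAME="node42"
# Once eval'ed, the Docker CLI sends its API calls to that host
echo "$DOCKER_HOST"
```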
|
|
|
|
- Execute regular commands with Docker, Compose, etc.
|
|
|
|
(They will pick up remote host address from environment)
|
|
|
|
- If you need to go under the hood, you can get SSH access:
|
|
```bash
|
|
docker-machine ssh node42
|
|
```
|
|
|
|
---
|
|
|
|
## Docker Machine `generic` driver
|
|
|
|
- Most drivers work the same way:
|
|
|
|
- use cloud API to create instance
|
|
|
|
- connect to instance over SSH
|
|
|
|
- install Docker
|
|
|
|
- The `generic` driver skips the first step
|
|
|
|
- It can install Docker on any machine, as long as you have SSH access
|
|
|
|
- We will use that!
|
|
|
|
---
|
|
|
|
## Setting up Swarm with Docker Machine
|
|
|
|
When invoking Machine, we will provide three sets of parameters:
|
|
|
|
- the machine driver to use (`generic`) and the SSH connection information
|
|
|
|
- Swarm-specific options indicating the cluster membership discovery mechanism
|
|
|
|
- Extra flags to be passed to the Engine, to enable overlay networks
|
|
|
|
---
|
|
|
|
## Provisioning the first node
|
|
|
|
.exercise[
|
|
|
|
- Use the following command to provision the manager node:
|
|
|
|
<!--
|
|
```placeholder
|
|
AA.BB.CC.DD $(getent hosts node1 | awk '{print $1}')
|
|
```
|
|
-->
|
|
|
|
```bash
|
|
docker-machine create --driver generic \
|
|
--engine-opt cluster-store=consul://localhost:8500 \
|
|
--engine-opt cluster-advertise=eth0:2376 \
|
|
--swarm --swarm-master --swarm-discovery consul://localhost:8500 \
|
|
--generic-ssh-user docker --generic-ip-address `AA.BB.CC.DD` node1
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Provisioning the other nodes
|
|
|
|
- The command is almost the same, but without the `--swarm-master` flag
|
|
|
|
- We will use a shell snippet for convenience
|
|
|
|
.exercise[
|
|
|
|
```bash
|
|
grep node[2345] /etc/hosts | grep -v ^127 |
|
|
while read IPADDR NODENAME
|
|
do docker-machine create --driver generic \
|
|
--engine-opt cluster-store=consul://localhost:8500 \
|
|
--engine-opt cluster-advertise=eth0:2376 \
|
|
--swarm --swarm-discovery consul://localhost:8500 \
|
|
--generic-ssh-user docker \
|
|
--generic-ip-address $IPADDR $NODENAME
|
|
done
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Check what we did
|
|
|
|
Let's connect to the first node *individually*.
|
|
|
|
.exercise[
|
|
|
|
- Select the node with Machine
|
|
|
|
```bash
|
|
eval $(docker-machine env node1)
|
|
```
|
|
|
|
- Execute some Docker commands
|
|
|
|
```bash
|
|
docker version
|
|
docker info
|
|
```
|
|
|
|
]
|
|
|
|
In the output of `docker info`, we should see `Cluster store` and `Cluster advertise`.
|
|
|
|
---
|
|
|
|
## Interact with the node
|
|
|
|
Let's try a few basic Docker commands on this node.
|
|
|
|
.exercise[
|
|
|
|
- Run a simple container:
|
|
```bash
|
|
docker run --rm busybox echo hello world
|
|
```
|
|
|
|
- See running containers:
|
|
```bash
|
|
docker ps
|
|
```
|
|
|
|
]
|
|
|
|
Two containers should show up: the agent and the manager.
|
|
|
|
---
|
|
|
|
## Connect to the Swarm cluster
|
|
|
|
Now, let's try the same operations, but when talking to the Swarm manager.
|
|
|
|
.exercise[
|
|
|
|
- Select the Swarm manager with Machine:
|
|
|
|
```bash
|
|
eval $(docker-machine env node1 --swarm)
|
|
```
|
|
|
|
- Execute some Docker commands
|
|
|
|
```bash
|
|
docker version
|
|
docker info
|
|
docker ps
|
|
```
|
|
|
|
]
|
|
|
|
The output is different! Let's review this.
|
|
|
|
---
|
|
|
|
## `docker version`
|
|
|
|
Swarm identifies itself clearly:
|
|
|
|
```
|
|
Client:
|
|
Version: 1.11.1
|
|
API version: 1.23
|
|
Go version: go1.5.4
|
|
Git commit: 5604cbe
|
|
Built: Tue Apr 26 23:38:55 2016
|
|
OS/Arch: linux/amd64
|
|
|
|
Server:
|
|
Version: swarm/1.2.2
|
|
API version: 1.22
|
|
Go version: go1.5.4
|
|
Git commit: 34e3da3
|
|
Built: Mon May 9 17:03:22 UTC 2016
|
|
OS/Arch: linux/amd64
|
|
```
|
|
|
|
---
|
|
|
|
## `docker info`
|
|
|
|
The output of `docker info` on Swarm shows a number of differences from
|
|
the output on a single Engine:
|
|
|
|
.small[
|
|
```
|
|
Containers: 0
|
|
Running: 0
|
|
Paused: 0
|
|
Stopped: 0
|
|
Images: 0
|
|
Server Version: swarm/1.2.2
|
|
Role: primary
|
|
Strategy: spread
|
|
Filters: health, port, containerslots, dependency, affinity, constraint
|
|
Nodes: 0
|
|
Plugins:
|
|
Volume:
|
|
Network:
|
|
Kernel Version: 4.2.0-36-generic
|
|
Operating System: linux
|
|
Architecture: amd64
|
|
CPUs: 0
|
|
Total Memory: 0 B
|
|
Name: node1
|
|
Docker Root Dir:
|
|
Debug mode (client): false
|
|
Debug mode (server): false
|
|
WARNING: No kernel memory limit support
|
|
```
|
|
]
|
|
---
|
|
|
|
## Why zero nodes?
|
|
|
|
- We haven't started Consul yet
|
|
|
|
- Swarm discovery is not operational
|
|
|
|
- Swarm can't discover the nodes
|
|
|
|
Note: Docker will start (and be functional) without a K/V store.
|
|
|
|
This lets us run Consul itself in a container.
|
|
|
|
---
|
|
|
|
## Adding Consul
|
|
|
|
- We will run Consul in containers
|
|
|
|
- We will use the [Consul official image](
|
|
https://hub.docker.com/_/consul/) that was released *very recently*
|
|
|
|
- We will tell Docker to automatically restart it on reboots
|
|
|
|
- To simplify network setup, we will use `host` networking
|
|
|
|
---
|
|
|
|
## A few words about `host` networking
|
|
|
|
- Consul needs to be aware of its actual IP address (seen by other nodes)
|
|
|
|
- It also binds a bunch of different ports
|
|
|
|
- It makes sense (from a security point of view) to have Consul listening on localhost only
|
|
|
|
(and have "users", i.e. Engine, Swarm, etc. connect over localhost)
|
|
|
|
- Therefore, we will use `host` networking!
|
|
|
|
- Also: Docker Machine 0.6 starts the Swarm containers in `host` networking ...
|
|
|
|
- ... but Docker Machine 0.7 doesn't (which is why we stick to 0.6 for now)
|
|
|
|
---
|
|
|
|
## Consul fundamentals (if I must give you just one slide...)
|
|
|
|
- Consul nodes can be "just an agent" or "server"
|
|
|
|
- From the client's perspective, they behave the same
|
|
|
|
- Only servers are members in the Raft consensus / leader election / etc
|
|
|
|
(non-server agents forward requests to a server)
|
|
|
|
- All nodes must be told the address of at least one other node to join
|
|
|
|
(except for the first node, where this is optional)
|
|
|
|
- At least the first nodes must be told how many servers to expect, in order to establish quorum
|
|
|
|
- Consul can have only one "truth" at a time (hence the importance of quorum)
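Quorum here means a strict majority of the server nodes; a quick sketch of the arithmetic (this is why 3- or 5-server clusters are typical):

```shell
# Quorum for N Consul servers is floor(N/2) + 1
for N in 1 2 3 4 5; do
  echo "servers=$N quorum=$(( N / 2 + 1 ))"
done
```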
|
|
|
|
---
|
|
|
|
## Starting our Consul cluster
|
|
|
|
.exercise[
|
|
|
|
- Make sure you're logged into `node1`, and:
|
|
|
|
```bash
|
|
IPADDR=$(ip a ls dev eth0 | sed -n 's,.*inet \(.*\)/.*,\1,p')
|
|
for N in 1 2 3 4 5; do
|
|
ssh node$N -- docker run -d --restart=always --name consul_node$N \
|
|
-e CONSUL_BIND_INTERFACE=eth0 --net host consul \
|
|
agent -server -retry-join $IPADDR -bootstrap-expect 5 \
|
|
-ui -client 0.0.0.0
|
|
done
|
|
```
|
|
|
|
]
|
|
|
|
Note: in production, you probably want to remove `-client 0.0.0.0` since it
|
|
gives public access to your cluster! Also adapt `-bootstrap-expect` to your number of server nodes.
|
|
|
|
---
|
|
|
|
## Check that our Consul cluster is up
|
|
|
|
- With your browser, navigate to any instance on port 8500
|
|
<br/>(in "NODES" you should see the five nodes)
|
|
|
|
- Let's run a couple of useful Consul commands
|
|
|
|
.exercise[
|
|
|
|
- Ask Consul the list of members it knows:
|
|
```bash
|
|
docker run --net host --rm consul members
|
|
```
|
|
|
|
- Ask Consul which node is the current leader:
|
|
```bash
|
|
curl localhost:8500/v1/status/leader
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Check that our Swarm cluster is up
|
|
|
|
.exercise[
|
|
|
|
- Try again the `docker info` from earlier:
|
|
|
|
```bash
|
|
eval $(docker-machine env --swarm node1)
|
|
docker info
|
|
docker ps
|
|
```
|
|
|
|
]
|
|
|
|
All nodes should be visible. (If not, give them a minute or two to register.)
|
|
|
|
The Consul containers should be visible.
|
|
|
|
The Swarm containers, however, are hidden by Swarm (unless you use `docker ps -a`).
|
|
|
|
---
|
|
|
|
# Running containers on Swarm
|
|
|
|
Try to run a few `busybox` containers.
|
|
|
|
Then, let's get serious:
|
|
|
|
.exercise[
|
|
|
|
- Start a Redis service:
|
|
<br/>`docker run -dP redis`
|
|
|
|
- See the service address:
|
|
<br/>`docker port $(docker ps -lq) 6379`
|
|
|
|
]
|
|
|
|
This can be any of your five nodes.
|
|
|
|
---
|
|
|
|
## Scheduling strategies
|
|
|
|
- Random: pick a node at random
|
|
<br/>(but honor resource constraints)
|
|
|
|
- Spread: pick the node with the least containers
|
|
<br/>(including stopped containers)
|
|
|
|
- Binpack: try to maximize resource usage
|
|
<br/>(in other words: use as few hosts as possible)
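To illustrate the difference, here is a toy version of the *spread* decision, picking the node with the fewest containers (the counts are made up):

```shell
# "spread" in one pipeline: sort nodes by container count, take the lowest
printf '%s\n' "node1 3" "node2 1" "node3 2" |
  sort -k2 -n | head -1 | cut -d' ' -f1   # prints: node2
```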
|
|
|
|
---
|
|
|
|
# Resource allocation
|
|
|
|
- Swarm can honor resource reservations
|
|
|
|
- This requires containers to be started with resource limits
|
|
|
|
- Swarm refuses to schedule a container if it cannot honor a reservation
|
|
|
|
.exercise[
|
|
|
|
- Start Redis containers with 1 GB of RAM until Swarm refuses to start more:
|
|
```bash
|
|
docker run -d -m 1G redis
|
|
```
|
|
|
|
]
|
|
|
|
On a cluster of 5 nodes with ~3.8 GB of RAM per node, Swarm will refuse to start the 16th container.
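A rough back-of-the-envelope check, assuming Swarm's default 5% overcommit: each node can schedule about 3.8 × 1.05 ≈ 3.99 GB, i.e. three 1 GB reservations, so:

```shell
# 3 one-gigabyte containers per node, times 5 nodes
echo $(( 3 * 5 ))   # prints: 15  (the 16th container cannot be placed)
```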
|
|
|
|
---
|
|
|
|
## Removing our Redis containers
|
|
|
|
- Let's use a little bit of shell scripting
|
|
|
|
.exercise[
|
|
|
|
- Remove all containers using the redis image:
|
|
```bash
|
|
docker ps | awk '/redis/ {print $1}' | xargs docker rm -f
|
|
```
|
|
|
|
]
|
|
|
|
???
|
|
|
|
## Things to know about resource allocation
|
|
|
|
- `docker info` shows resource allocation for each node
|
|
|
|
- Swarm allows a 5% resource overcommit (tunable)
|
|
|
|
- Containers without resource reservation can always be started
|
|
|
|
- Resources of stopped containers are still counted as being reserved
|
|
|
|
- this guarantees that it will be possible to restart a stopped container
|
|
|
|
- containers have to be deleted to free up their resources
|
|
|
|
- `docker update` can be used to change resource allocation on the fly
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
# Setting up overlay networks
|
|
|
|
---
|
|
|
|
# Multi-host networking
|
|
|
|
- Docker 1.9 has the concept of *networks*
|
|
|
|
- By default, containers are on the default "bridge" network
|
|
|
|
- You can create additional networks
|
|
|
|
- Containers can be on multiple networks
|
|
|
|
- Containers can dynamically join/leave networks
|
|
|
|
- The "overlay" driver lets networks span multiple hosts
|
|
|
|
- Containers can have "network aliases" resolvable through DNS
|
|
|
|
---
|
|
|
|
## Manipulating networks, names, and aliases
|
|
|
|
- The preferred method is to let Compose do the heavy lifting for us
|
|
|
|
(YAML-defined networking!)
|
|
|
|
- But if we really need to, we can use the Docker CLI, with:
|
|
|
|
`docker network ...`
|
|
|
|
`docker run --net ... --net-alias ...`
|
|
|
|
- The following slides illustrate those commands
|
|
|
|
---
|
|
|
|
## Create a few networks and containers
|
|
|
|
.exercise[
|
|
|
|
- Create two networks, *blue* and *green*:
|
|
```bash
|
|
docker network create blue
|
|
docker network create green
|
|
docker network ls
|
|
```
|
|
|
|
- Create containers with names of blue and green
|
|
things, on their respective networks:
|
|
```bash
|
|
docker run -d --net-alias things --name sky --net blue -m 3G redis
|
|
docker run -d --net-alias things --name navy --net blue -m 3G redis
|
|
docker run -d --net-alias things --name grass --net green -m 3G redis
|
|
docker run -d --net-alias things --name forest --net green -m 3G redis
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Check connectivity within networks
|
|
|
|
.exercise[
|
|
|
|
- Check that our containers are on different nodes:
|
|
|
|
```bash
|
|
docker ps
|
|
```
|
|
|
|
- This will work:
|
|
|
|
```bash
|
|
docker run --rm --net blue alpine ping -c 3 navy
|
|
```
|
|
|
|
- This will not:
|
|
|
|
```bash
|
|
docker run --rm --net blue alpine ping -c 3 grass
|
|
```
|
|
|
|
]
|
|
|
|
???
|
|
|
|
## Containers connected to multiple networks
|
|
|
|
- Some colors aren't *quite* blue *nor* green
|
|
|
|
.exercise[
|
|
|
|
- Create a container that we want to be on both networks:
|
|
```bash
|
|
docker run -d --net-alias things --net blue --name turquoise redis
|
|
```
|
|
|
|
- Check connectivity:
|
|
```bash
|
|
docker exec -ti turquoise ping -c 3 navy
|
|
docker exec -ti turquoise ping -c 3 grass
|
|
```
|
|
(First works; second doesn't)
|
|
|
|
]
|
|
|
|
???
|
|
|
|
## Dynamically connecting containers
|
|
|
|
- This is achieved with the command:
|
|
<br/>`docker network connect NETNAME CONTAINER`
|
|
|
|
.exercise[
|
|
|
|
- Dynamically connect to the green network:
|
|
```bash
|
|
docker network connect green turquoise
|
|
```
|
|
|
|
- Check connectivity:
|
|
```bash
|
|
docker exec -ti turquoise ping -c 3 navy
|
|
docker exec -ti turquoise ping -c 3 grass
|
|
```
|
|
(Both commands work now)
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Network aliases
|
|
|
|
- Each container was created with the network alias `things`
|
|
|
|
- Network aliases are scoped by network
|
|
|
|
.exercise[
|
|
|
|
- Resolve the `things` alias from both networks:
|
|
```bash
|
|
docker run --rm --net blue alpine nslookup things
|
|
docker run --rm --net green alpine nslookup things
|
|
```
|
|
|
|
]
|
|
|
|
???
|
|
|
|
## Under the hood
|
|
|
|
- Each network has an interface in the container
|
|
|
|
- There is also an interface for the default gateway
|
|
|
|
.exercise[
|
|
|
|
- View interfaces in our `turquoise` container:
|
|
```bash
|
|
docker exec -ti turquoise ip addr ls
|
|
```
|
|
|
|
]
|
|
|
|
???
|
|
|
|
## Dynamically disconnecting containers
|
|
|
|
- There is a mirror command to `docker network connect`
|
|
|
|
.exercise[
|
|
|
|
- Disconnect the *turquoise* container from *blue*
|
|
(its original network):
|
|
```bash
|
|
docker network disconnect blue turquoise
|
|
```
|
|
|
|
- Check connectivity:
|
|
```bash
|
|
docker exec -ti turquoise ping -c 3 navy
|
|
docker exec -ti turquoise ping -c 3 grass
|
|
```
|
|
(First command fails, second one works)
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Cleaning up
|
|
|
|
.exercise[
|
|
|
|
- Destroy containers:
|
|
|
|
<!--
|
|
```bash
|
|
docker rm -f sky navy grass forest turquoise
|
|
```
|
|
-->
|
|
|
|
```bash
|
|
docker rm -f sky navy grass forest
|
|
```
|
|
|
|
- Destroy networks:
|
|
|
|
```bash
|
|
docker network rm blue
|
|
docker network rm green
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Cleaning up after an outage or a crash
|
|
|
|
- You cannot remove a network if it still has containers
|
|
|
|
- There is no `"rm -f"` for networks
|
|
|
|
- If a network still has stale endpoints, you can use `"disconnect -f"`
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
# Building images with Swarm
|
|
|
|
---
|
|
|
|
## Building images with Swarm
|
|
|
|
- Special care must be taken when building and running images
|
|
|
|
- We *can* build images on Swarm (with `docker build` or `docker-compose build`)
|
|
|
|
- One node will be picked at random, and the build will happen there
|
|
|
|
- At the end of the build, the image will be present *only on that node*
|
|
|
|
---
|
|
|
|
## Building on Swarm can yield inconsistent results
|
|
|
|
- Builds are scheduled on random nodes
|
|
|
|
- Multiple builds and rebuilds can happen on different nodes
|
|
|
|
- If a build happens on a different node, the cache of the previous build cannot be used
|
|
|
|
- Worse: you can have two different images with the same name on your cluster
|
|
|
|
---
|
|
|
|
## Scaling won't work as expected
|
|
|
|
Consider the following scenario:
|
|
|
|
- `docker-compose up`
|
|
<br/>
|
|
→ each service is built on a node, and runs there
|
|
|
|
- `docker-compose scale`
|
|
<br/>
|
|
→ additional containers for this service can only be spawned where the image was built
|
|
|
|
- `docker-compose up` (again)
|
|
<br/>
|
|
→ services might be built (and started) on different nodes
|
|
|
|
- `docker-compose scale`
|
|
<br/>
|
|
→ containers can be spawned with both the new and old images
|
|
|
|
---
|
|
|
|
## Scaling correctly with Swarm
|
|
|
|
- After building an image, it should be distributed to the cluster
|
|
|
|
(Or made available through a registry, so that nodes can download it automatically)
|
|
|
|
- Instead of referencing images with the `:latest` tag, unique tags should be used
|
|
|
|
(Using e.g. timestamps, version numbers, or VCS hashes)
|
|
|
|
---
|
|
|
|
## Why can't Swarm do this automatically for us?
|
|
|
|
- Let's step back and think for a minute ...
|
|
|
|
- What should `docker build` do on Swarm?
|
|
|
|
- build on one machine
|
|
|
|
- build everywhere ($$$)
|
|
|
|
- After the build, what should `docker run` do?
|
|
|
|
- run where we built (how do we know where it is?)
|
|
|
|
- run on any machine that has the image
|
|
|
|
- Could Compose+Swarm solve this automatically?
|
|
|
|
---
|
|
|
|
## A few words about "sane defaults"
|
|
|
|
- *It would be nice if Swarm could pick a node, and build there!*
|
|
|
|
- but which node should it pick?
|
|
- what if the build is very expensive?
|
|
- what if we want to distribute the build across nodes?
|
|
- what if we want to tag some builder nodes?
|
|
- ok but what if no node has been tagged?
|
|
|
|
- *It would be nice if Swarm could automatically push images!*
|
|
|
|
- using the Docker Hub is an easy choice
|
|
<br/>(you just need an account)
|
|
- but some of us can't/won't use Docker Hub
|
|
<br/>(for compliance reasons, or because of restricted network access)
|
|
|
|
.small[("Sane" defaults are nice only if we agree on the definition of "sane")]
|
|
|
|
---
|
|
|
|
## The plan
|
|
|
|
- Build on a single node (`node1`)
|
|
|
|
- Tag images with the current UNIX timestamp (for simplicity)
|
|
|
|
- Upload them to a registry
|
|
|
|
- Update the Compose file to use those images
|
|
|
|
This is all automated with the [`build-tag-push.py` script](https://github.com/jpetazzo/orchestration-workshop/blob/master/bin/build-tag-push.py).
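The tagging step itself is simple; a minimal sketch (the registry address and image name are examples):

```shell
# Build a unique image reference from the registry prefix and a timestamp
DOCKER_REGISTRY=localhost:5000
TAG=$(date +%s)                          # e.g. 1463000000
echo "$DOCKER_REGISTRY/worker:$TAG"      # e.g. localhost:5000/worker:1463000000
```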
|
|
|
|
---
|
|
|
|
## Which registry do we want to use?
|
|
|
|
.small[
|
|
|
|
- **Docker Hub**
|
|
|
|
- hosted by Docker Inc.
|
|
- requires an account (free, no credit card needed)
|
|
- images will be public (unless you pay)
|
|
- located in AWS EC2 us-east-1
|
|
|
|
- **Docker Trusted Registry**
|
|
|
|
- self-hosted commercial product
|
|
- requires a subscription (free 30-day trial available)
|
|
- images can be public or private
|
|
- located wherever you want
|
|
|
|
- **Docker open source registry**
|
|
|
|
- self-hosted barebones repository hosting
|
|
- doesn't require anything
|
|
- doesn't come with anything either
|
|
- located wherever you want
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Using Docker Hub
|
|
|
|
- Set the `DOCKER_REGISTRY` environment variable to your Docker Hub user name
|
|
<br/>(the `build-tag-push.py` script prefixes each image name with that variable)
|
|
|
|
- We will also see how to run the open source registry
|
|
<br/>(so use whatever option you want!)
|
|
|
|
.exercise[
|
|
|
|
<!--
|
|
```meta
|
|
^{
|
|
```
|
|
-->
|
|
|
|
- Set the following environment variable:
|
|
<br/>`export DOCKER_REGISTRY=jpetazzo`
|
|
|
|
- (Use *your* Docker Hub login, of course!)
|
|
|
|
- Log into the Docker Hub:
|
|
<br/>`docker login`
|
|
|
|
<!--
|
|
```meta
|
|
^}
|
|
```
|
|
-->
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Using Docker Trusted Registry
|
|
|
|
If we wanted to use DTR, we would:
|
|
|
|
- make sure we have a Docker Hub account
|
|
- [activate a Docker Datacenter subscription](
|
|
https://hub.docker.com/enterprise/trial/)
|
|
- install DTR on our machines
|
|
- set `DOCKER_REGISTRY` to `dtraddress:port/user`
|
|
|
|
*This is out of the scope of this workshop!*
|
|
|
|
---
|
|
|
|
## Using open source registry
|
|
|
|
- We need to run a `registry:2` container
|
|
<br/>(make sure you specify tag `:2` to run the new version!)
|
|
|
|
- It will store images and layers to the local filesystem
|
|
<br/>(but you can add a config file to use S3, Swift, etc.)
|
|
|
|
- Docker *requires* TLS when communicating with the registry,
|
|
except for registries on `localhost`, or when the Engine
|
|
is started with the flag `--insecure-registry`
|
|
|
|
- Our strategy: run a reverse proxy on `localhost:5000` on each node
|
|
|
|
---
|
|
|
|
## Registry frontends and backend
|
|
|
|

|
|
|
|
---
|
|
|
|
# Deploying a local registry
|
|
|
|
- There is a Compose file for that
|
|
|
|
.exercise[
|
|
|
|
- Go to the `registry` directory in the repository:
|
|
```bash
|
|
cd ~/orchestration-workshop/registry
|
|
```
|
|
|
|
]
|
|
|
|
Let's examine the `docker-compose.yml` file.
|
|
|
|
---
|
|
|
|
## Running a local registry with Compose
|
|
|
|
```yaml
|
|
version: "2"
|
|
|
|
services:
|
|
backend:
|
|
image: registry:2
|
|
frontend:
|
|
image: jpetazzo/hamba
|
|
command: 5000 backend:5000
|
|
ports:
|
|
- "127.0.0.1:5000:5000"
|
|
depends_on:
|
|
- backend
|
|
```
|
|
|
|
- *Backend* is the actual registry.
|
|
- *Frontend* is the ambassador that we deployed earlier.
|
|
<br/>
|
|
It communicates with *backend* using an internal network
|
|
and network aliases.
|
|
|
|
---
|
|
|
|
## Starting a local registry with Compose
|
|
|
|
- We will bring up the registry
|
|
|
|
- Then we will ensure that one *frontend* is running
|
|
on each node by scaling it to our number of nodes
|
|
|
|
.exercise[
|
|
|
|
- Start the registry:
|
|
```bash
|
|
docker-compose up -d
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## "Scaling" the local registry
|
|
|
|
- This is a particular kind of scaling
|
|
|
|
- We just want to ensure that one *frontend*
|
|
is running on every single node of the cluster
|
|
|
|
.exercise[
|
|
|
|
- Scale the registry:
|
|
```bash
|
|
for N in $(seq 1 5); do
|
|
docker-compose scale frontend=$N
|
|
done
|
|
```
|
|
|
|
]
|
|
|
|
Note: Swarm might do that automatically for us in the future.
|
|
|
|
---
|
|
|
|
## Testing our local registry
|
|
|
|
- We can retag a small image, and push it to the registry
|
|
|
|
.exercise[
|
|
|
|
- Make sure we have the busybox image, and retag it:
|
|
```bash
|
|
docker pull busybox
|
|
docker tag busybox localhost:5000/busybox
|
|
```
|
|
|
|
- Push it:
|
|
```bash
|
|
docker push localhost:5000/busybox
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Checking what's on our local registry
|
|
|
|
- The registry API has endpoints to query what's there
|
|
|
|
.exercise[
|
|
|
|
- Ensure that our busybox image is now in the local registry:
|
|
```bash
|
|
curl http://localhost:5000/v2/_catalog
|
|
```
|
|
|
|
]
|
|
|
|
The curl command should output:
|
|
```json
|
|
{"repositories":["busybox"]}
|
|
```
|
|
|
|
---
|
|
|
|
## Adapting our Compose file to run on Swarm
|
|
|
|
- We can get rid of all the `ports` section, except for the web UI
|
|
|
|
.exercise[
|
|
|
|
- Go back to the dockercoins directory:
|
|
```bash
|
|
cd ~/orchestration-workshop/dockercoins
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Our new Compose file
|
|
|
|
.small[
|
|
```yaml
|
|
version: '2'
|
|
|
|
services:
|
|
rng:
|
|
build: rng
|
|
|
|
hasher:
|
|
build: hasher
|
|
|
|
webui:
|
|
build: webui
|
|
ports:
|
|
- "8000:80"
|
|
|
|
redis:
|
|
image: redis
|
|
|
|
worker:
|
|
build: worker
|
|
```
|
|
]
|
|
|
|
Copy-paste this into `docker-compose.yml`
|
|
<br/>(or you can `cp docker-compose.yml-v2 docker-compose.yml`)
|
|
|
|
---
|
|
|
|
## Use images, not builds
|
|
|
|
- We need to replace each `build` with an `image`
|
|
|
|
- We will use the `build-tag-push.py` script for that
|
|
|
|
.exercise[
|
|
|
|
- Set `DOCKER_REGISTRY` to use our local registry
|
|
|
|
- Make sure that you are building on `node1`
|
|
|
|
- Then run the script
|
|
|
|
```bash
|
|
export DOCKER_REGISTRY=localhost:5000
|
|
eval $(docker-machine env node1)
|
|
../bin/build-tag-push.py
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Run the application
|
|
|
|
- At this point, our app is ready to run
|
|
|
|
.exercise[
|
|
|
|
- Start the application:
|
|
```bash
|
|
export COMPOSE_FILE=docker-compose.yml-`NNN`
|
|
eval $(docker-machine env node1 --swarm)
|
|
docker-compose up -d
|
|
```
|
|
|
|
- Observe that it's running on multiple nodes:
|
|
<br/>(each container name is prefixed with the node it's running on)
|
|
```bash
|
|
docker ps
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## View the performance graph
|
|
|
|
- Load up the graph in the browser
|
|
|
|
.exercise[
|
|
|
|
- Check the `webui` service address and port:
|
|
```bash
|
|
docker-compose port webui 80
|
|
```
|
|
|
|
- Open it in your browser
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Scaling workers
|
|
|
|
- Scaling the `worker` service works out of the box
|
|
(like before)
|
|
|
|
.exercise[
|
|
|
|
- Scale `worker`:
|
|
```bash
|
|
docker-compose scale worker=10
|
|
```
|
|
|
|
]
|
|
|
|
Check that workers are on different nodes.
|
|
|
|
However, we hit the same bottleneck as before.
|
|
|
|
How can we address that?
|
|
|
|
---
|
|
|
|
## Finding the real cause of the bottleneck
|
|
|
|
- If time permits, we can benchmark `rng` and `hasher` to find out more
|
|
|
|
- Otherwise, we'll fast-forward a bit
|
|
|
|
---
|
|
|
|
## Benchmarking in isolation
|
|
|
|
- If we want the benchmark to be accurate, we need to make sure that `rng` and `hasher` are not receiving traffic
|
|
|
|
.exercise[
|
|
|
|
- Stop the `worker` containers:
|
|
```bash
|
|
docker-compose kill worker
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## A better benchmarking tool
|
|
|
|
- Instead of `httping`, we will now use `ab` (Apache Bench)
|
|
|
|
- We will install it in an `alpine` container placed on the network used by our application
|
|
|
|
.exercise[
|
|
|
|
- Start an interactive `alpine` container on the `dockercoins_default` network:
|
|
```bash
|
|
docker run -ti --net dockercoins_default alpine sh
|
|
```
|
|
|
|
- Install `ab` with the `apache2-utils` package:
|
|
```bash
|
|
apk add --update apache2-utils
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Benchmarking `rng`
|
|
|
|
We will send 50 requests, but with various levels of concurrency.
|
|
|
|
.exercise[
|
|
|
|
- Send 50 requests, with a single sequential client:
|
|
```bash
|
|
ab -c 1 -n 50 http://rng/10
|
|
```
|
|
|
|
- Send 50 requests, with ten parallel clients:
|
|
```bash
|
|
ab -c 10 -n 50 http://rng/10
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Benchmark results for `rng`
|
|
|
|
- In both cases, the benchmark takes ~5 seconds to complete
|
|
|
|
- When serving requests sequentially, they each take 100ms
|
|
|
|
- In the parallel scenario, the latency increased dramatically:
|
|
|
|
- one request is served in 100ms
|
|
- another is served in 200ms
|
|
- another is served in 300ms
|
|
- ...
|
|
- another is served in 1000ms
|
|
|
|
- What about `hasher`?
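Before that, a quick sanity check on the `rng` numbers: a server that handles one 100 ms request at a time needs the same total time for 50 requests, no matter how many clients are waiting:

```shell
# 50 requests x 100 ms, served strictly one after the other
echo "$(( 50 * 100 )) ms"   # prints: 5000 ms (~5 seconds)
```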
|
|
|
|
---
|
|
|
|
## Benchmarking `hasher`
|
|
|
|
We will do the same tests for `hasher`.
|
|
|
|
The command is slightly more complex, since we need to post random data.
|
|
|
|
First, we need to put the POST payload in a temporary file.
|
|
|
|
.exercise[
|
|
|
|
- Install curl in the container, and generate 10 bytes of random data:
|
|
```bash
|
|
apk add curl
|
|
curl http://rng/10 >/tmp/random
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Benchmarking `hasher`
|
|
|
|
Once again, we will send 50 requests, with different levels of concurrency.
|
|
|
|
.exercise[
|
|
|
|
- Send 50 requests with a sequential client:
|
|
```bash
|
|
ab -c 1 -n 50 -T application/octet-stream \
|
|
-p /tmp/random http://hasher/
|
|
```
|
|
|
|
- Send 50 requests with 10 parallel clients:
|
|
```bash
|
|
ab -c 10 -n 50 -T application/octet-stream \
|
|
-p /tmp/random http://hasher/
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Benchmark results for `hasher`
|
|
|
|
- The sequential benchmark takes ~5 seconds to complete
|
|
|
|
- The parallel benchmark takes less than 1 second to complete
|
|
|
|
- In both cases, each request takes a bit more than 100ms to complete
|
|
|
|
- Requests are a bit slower in the parallel benchmark
|
|
|
|
- It looks like `hasher` is better equipped to deal with concurrency than `rng`
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
Why?
|
|
|
|
---
|
|
|
|
## Why does everything take (at least) 100ms?
|
|
|
|
--
|
|
|
|
`rng` code:
|
|
|
|

|
|
|
|
--
|
|
|
|
`hasher` code:
|
|
|
|

|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
But ...
|
|
|
|
WHY?!?
|
|
|
|
---
|
|
|
|
## Why did we sprinkle this sample app with sleeps?
|
|
|
|
- Deterministic performance
|
|
<br/>(regardless of instance speed, CPUs, I/O...)
|
|
|
|
--
|
|
|
|
- Actual code sleeps all the time anyway
|
|
|
|
--
|
|
|
|
- When your code makes a remote API call:
|
|
|
|
- it sends a request;
|
|
|
|
- it sleeps until it gets the response;
|
|
|
|
- it processes the response.
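This blocking pattern can be simulated in plain shell (a sketch; the `sleep` stands in for network latency):

```shell
# Simulate a blocking remote API call: send, wait, process.
remote_call() {
  sleep 0.1           # "waiting for the response" (the caller is idle)
  echo "response"     # the reply arrives
}
RESULT=$(remote_call) # the caller is blocked for the whole duration
echo "processed: $RESULT"
```

Ten such calls made sequentially take ~1 second, no matter how fast the CPU is.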
|
|
|
|
---
|
|
|
|
## Why do `rng` and `hasher` behave differently?
|
|
|
|

|
|
|
|
--
|
|
|
|
(Synchronous vs. asynchronous event processing)
|
|
|
|
---
|
|
|
|
## How to make `rng` go faster
|
|
|
|
- Obvious solution: comment out the `sleep` instruction
|
|
|
|
--
|
|
|
|
- Unfortunately, in the real world, network latency exists
|
|
|
|
--
|
|
|
|
- More realistic solution: use an asynchronous framework
|
|
<br/>(e.g. use gunicorn with gevent)
|
|
|
|
--
|
|
|
|
- Reminder: we can't change the code!
|
|
|
|
--
|
|
|
|
- Solution: scale out `rng`
|
|
<br/>(dispatch `rng` requests on multiple instances)
|
|
|
|
---
|
|
|
|
# Scaling web services with Compose on Swarm
|
|
|
|
- We *can* scale network services with Compose
|
|
|
|
- The result may or may not be satisfactory, though!
|
|
|
|
.exercise[
|
|
|
|
- Restart the `worker` service:
|
|
```bash
|
|
docker-compose start worker
|
|
```
|
|
|
|
- Scale the `rng` service:
|
|
```bash
|
|
docker-compose scale rng=5
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Results
|
|
|
|
- In the web UI, you might see a performance increase ... or maybe not
|
|
|
|
--
|
|
|
|
- Since Engine 1.11, we get round-robin DNS records
|
|
|
|
(i.e. resolving `rng` will yield the IP addresses of all 5 containers)
|
|
|
|
- Docker randomizes the records it sends
|
|
|
|
- But many resolvers will sort them in unexpected ways
|
|
|
|
- Depending on various factors, you could get:
|
|
|
|
- all traffic on a single container
|
|
- traffic perfectly balanced on all containers
|
|
- traffic unevenly balanced across containers
|
|
|
|
---
|
|
|
|
## Assessing DNS randomness
|
|
|
|
- Let's see how our containers resolve DNS requests
|
|
|
|
.exercise[
|
|
|
|
- On each of our 10 scaled workers, execute 5 ping requests:
|
|
```bash
|
|
for N in $(seq 1 10); do
|
|
echo PING__________$N
|
|
for I in $(seq 1 5); do
|
|
docker exec -ti dockercoins_worker_$N ping -c1 rng
|
|
done
|
|
done | grep PING
|
|
```
|
|
|
|
]
|
|
|
|
(The 7th Might Surprise You!)
|
|
|
|
---
|
|
|
|
## DNS randomness
|
|
|
|
- Other programs can yield different results
|
|
|
|
- Same program on another distro can yield different results
|
|
|
|
- Same source code with another libc or resolver can yield different results
|
|
|
|
- Running the same test at different times can yield different results
|
|
|
|
- Did I mention that Your Results May Vary?
|
|
|
|
---
|
|
|
|
## Implementing fair load balancing
|
|
|
|
- Instead of relying on DNS round robin, let's use a proper load balancer
|
|
|
|
- Use Compose to create multiple copies of the `rng` service
|
|
|
|
- Put a load balancer in front of them
|
|
|
|
- Point other services to the load balancer
|
|
|
|
---
|
|
|
|
## Naming problem
|
|
|
|
- The service is called `rng`
|
|
|
|
- Therefore, it is reachable with the network name `rng`
|
|
|
|
- Our application code (the `worker` service) connects to `rng`
|
|
|
|
- So the name `rng` should resolve to the load balancer
|
|
|
|
- What do‽
|
|
|
|
---
|
|
|
|
## Naming is *per-network*
|
|
|
|
- Solution: put `rng` on its own network
|
|
|
|
- That way, it doesn't take the network name `rng`
|
|
<br/>(at least not on the default network)
|
|
|
|
- Have the load balancer sit on both networks
|
|
|
|
- Add the name `rng` to the load balancer
|
|
|
|
---
|
|
|
|
class: pic
|
|
|
|
Original DockerCoins
|
|
|
|

|
|
|
|
---
|
|
|
|
class: pic
|
|
|
|
Load-balanced DockerCoins
|
|
|
|

|
|
|
|
---
|
|
|
|
## Declaring networks
|
|
|
|
- Networks (other than the default one)
|
|
*must* be declared
|
|
in a top-level `networks` section,
|
|
placed anywhere in the file
|
|
|
|
.exercise[
|
|
|
|
- Add the `rng` network to the Compose file, `docker-compose.yml-NNN`:
|
|
```yaml
|
|
version: '2'
|
|
|
|
networks:
|
|
rng:
|
|
|
|
services:
|
|
rng:
|
|
image: ...
|
|
...
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Putting the `rng` service in its network
|
|
|
|
- Services can have a `networks` section
|
|
|
|
- If they don't: they are placed in the default network
|
|
|
|
- If they do: they are placed only in the mentioned networks
|
|
|
|
.exercise[
|
|
|
|
- Change the `rng` service to put it in its network:
|
|
```yaml
|
|
rng:
|
|
image: localhost:5000/dockercoins_rng:…
|
|
networks:
|
|
rng:
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Adding the load balancer
|
|
|
|
- The load balancer has to be in both networks: `rng` and `default`
|
|
- In the `default` network, it must have the `rng` alias
|
|
- We will use the `jpetazzo/hamba` image
|
|
|
|
.exercise[
|
|
|
|
- Add the `rng-lb` service to the Compose file:
|
|
```yaml
|
|
rng-lb:
|
|
image: jpetazzo/hamba
|
|
command: run
|
|
networks:
|
|
rng:
|
|
default:
|
|
aliases: [ rng ]
|
|
```
|
|
]
|
|
|
|
---
|
|
|
|
## Load balancer initial configuration
|
|
|
|
- We specified `run` as the initial command
|
|
|
|
- This tells `hamba` to wait for an initial configuration
|
|
|
|
- The load balancer will not be operational (until we feed it its configuration)
|
|
|
|
---
|
|
|
|
## Start the application
|
|
|
|
.exercise[
|
|
|
|
- Bring up DockerCoins:
|
|
```bash
|
|
docker-compose up -d
|
|
```
|
|
|
|
- See that `worker` is complaining:
|
|
```bash
|
|
docker-compose logs --tail 100 --follow worker
|
|
```
|
|
]
|
|
|
|
---
|
|
|
|
## Add one backend to the load balancer
|
|
|
|
- Multiple solutions:
|
|
|
|
- lookup the IP address of the `rng` backend
|
|
- use the backend's network name
|
|
- use the backend's container name (easiest!)
|
|
|
|
.exercise[
|
|
|
|
- Configure the load balancer:
|
|
```bash
|
|
docker run --rm --volumes-from dockercoins_rng-lb_1 \
|
|
--net container:dockercoins_rng-lb_1 \
|
|
jpetazzo/hamba reconfigure 80 dockercoins_rng_1 80
|
|
```
|
|
|
|
]
|
|
|
|
The application should now be working correctly.
|
|
|
|
---
|
|
|
|
## Add all backends to the load balancer
|
|
|
|
- The command is similar to the one before
|
|
|
|
- We need to pass the list of all backends
|
|
|
|
.exercise[
|
|
|
|
- Reconfigure the load balancer:
|
|
```bash
|
|
docker run --rm \
|
|
--volumes-from dockercoins_rng-lb_1 \
|
|
--net container:dockercoins_rng-lb_1 \
|
|
jpetazzo/hamba reconfigure 80 \
|
|
$(for N in $(seq 1 5); do
|
|
echo dockercoins_rng_$N:80
|
|
done)
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Automating the process
|
|
|
|
- Nobody loves hand-crafting artisanal YAML
|
|
|
|
- This can be scripted very easily
|
|
|
|
- But can it be fully automated?
|
|
|
|
---
|
|
|
|
## Use DNS to discover the addresses of all the backends
|
|
|
|
- When multiple containers have the same network alias:
|
|
|
|
- Engine 1.10 returns only one of them (the same one across the whole network)
|
|
|
|
- Engine 1.11 returns all of them (in a random order)
|
|
|
|
- A "smart" client can use all records to implement load balancing
|
|
|
|
- We can compose `jpetazzo/hamba` with a special-purpose container,
|
|
which will dynamically generate HAProxy's configuration when
|
|
the DNS records are updated
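To see all the records such a "smart" client would receive, one option is `getent ahosts` from inside a container on the network (a sketch; the container name follows Compose's default naming and is an assumption):

```shell
# Print every address returned by Docker's DNS for the "rng" alias
# (with Engine 1.11, one line per container sharing the alias):
docker exec dockercoins_worker_1 getent ahosts rng
```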
|
|
|
|
---
|
|
|
|
## Introducing `jpetazzo/watchdns`
|
|
|
|
- [100 lines of pure POSIX scriptery](
|
|
https://github.com/jpetazzo/watchdns/blob/master/watchdns)
|
|
|
|
- Resolves a given DNS name every second
|
|
|
|
- Each time the result changes, a new HAProxy configuration is generated
|
|
|
|
- When used together with `--volumes-from` and `jpetazzo/hamba`, it
|
|
updates the configuration of an existing load balancer
|
|
|
|
- Comes with a companion script, [`add-load-balancer-v2.py`](https://github.com/jpetazzo/orchestration-workshop/blob/master/bin/add-load-balancer-v2.py), to update your Compose files
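The core of the pattern can be sketched in a few lines of portable shell (a simplified, hypothetical version; the real script does more bookkeeping):

```shell
# Re-resolve a name and report whether the backend set changed.
# "resolve" is a stub; the real script queries Docker's DNS.
resolve() { getent hosts "$1" 2>/dev/null | awk '{print $1}' | sort; }

check_backends() {
  NAME=$1; STATE=$2
  CURRENT=$(resolve "$NAME")
  PREVIOUS=$(cat "$STATE" 2>/dev/null)
  if [ "$CURRENT" != "$PREVIOUS" ]; then
    printf '%s\n' "$CURRENT" >"$STATE"
    return 0  # changed: time to regenerate the HAProxy configuration
  fi
  return 1    # unchanged: nothing to do
}

# The real loop, roughly:
#   while true; do check_backends rng /state && regenerate; sleep 1; done
```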
|
|
|
|
---
|
|
|
|
## Using `jpetazzo/watchdns`
|
|
|
|
.exercise[
|
|
|
|
- First, revert the Compose file to remove the load balancer
|
|
|
|
- Then, run `add-load-balancer-v2.py`:
|
|
```bash
|
|
../bin/add-load-balancer-v2.py rng
|
|
```
|
|
|
|
- Inspect the resulting Compose file
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Scaling with `watchdns`
|
|
|
|
.exercise[
|
|
|
|
- Start the application with the new sidekick containers:
|
|
```bash
|
|
docker-compose up -d
|
|
```
|
|
|
|
- Scale `rng`:
|
|
```bash
|
|
docker-compose scale rng=10
|
|
```
|
|
|
|
- Check logs:
|
|
```bash
|
|
docker-compose logs rng-wd
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Comments
|
|
|
|
- This is a very crude implementation of the pattern
|
|
|
|
- A Go version would only be a bit longer, but use much less resources
|
|
|
|
- When there are many backends, reacting quickly to change is less important
|
|
|
|
(i.e. it's not necessary to re-resolve records every second!)
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
# All things ops <br/> (logs, backups, and more)
|
|
|
|
---
|
|
|
|
# Logs
|
|
|
|
- Two strategies:
|
|
|
|
- log to plain files on volumes
|
|
|
|
- log to stdout
|
|
<br/>(and use a logging driver)
|
|
|
|
---
|
|
|
|
## Logging to plain files on volumes
|
|
|
|
(Sorry, that part won't be hands-on!)
|
|
|
|
- Start a container with `-v /logs`
|
|
|
|
- Make sure that all log files are in `/logs`
|
|
|
|
- To check logs, run e.g.
|
|
|
|
```bash
|
|
docker run --volumes-from ... ubuntu sh -c "grep WARN /logs/*.log"
|
|
```
|
|
|
|
- Or just go interactive:
|
|
|
|
```bash
|
|
docker run --volumes-from ... -ti ubuntu
|
|
```
|
|
|
|
- You can (should) start a log shipper that way
|
|
|
|
---
|
|
|
|
## Logging to stdout
|
|
|
|
- All containers should write to stdout/stderr
|
|
|
|
- Docker will collect logs and pass them to a logging driver
|
|
|
|
- The logging driver can be specified globally and per container
|
|
<br/>(changing it for a container overrides the global setting)
|
|
|
|
- To change the global logging driver, pass extra flags to the daemon
|
|
<br/>(requires a daemon restart)
|
|
|
|
- To override the logging driver for a container, pass extra flags to `docker run`
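For instance, to send a single container's output to a remote syslog server (the server address is a placeholder):

```shell
# Override the logging driver for this container only:
docker run --log-driver syslog \
       --log-opt syslog-address=udp://logs.example.com:514 \
       alpine echo hello
```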
|
|
|
|
---
|
|
|
|
## Specifying logging flags
|
|
|
|
- `--log-driver`
|
|
|
|
*selects the driver*
|
|
|
|
- `--log-opt key=val`
|
|
|
|
*adds driver-specific options*
|
|
<br/>*(can be repeated multiple times)*
|
|
|
|
- The flags are identical for `docker daemon` and `docker run`
|
|
|
|
---
|
|
|
|
## Logging flags in practice
|
|
|
|
- If you provision your nodes with Docker Machine,
|
|
you can set global logging flags (which will apply to all
|
|
containers started by a given Engine) like this:
|
|
|
|
```bash
|
|
docker-machine create ... --engine-opt log-driver=...
|
|
```
|
|
|
|
- Otherwise, use your favorite method to edit or manage configuration files
|
|
|
|
- You can set per-container logging options in Compose files
|
|
|
|
---
|
|
|
|
## Available drivers
|
|
|
|
- json-file (default)
|
|
|
|
- syslog (can send to UDP, TCP, TCP+TLS, UNIX sockets)
|
|
|
|
- awslogs (AWS CloudWatch)
|
|
|
|
- journald
|
|
|
|
- gelf
|
|
|
|
- fluentd
|
|
|
|
- splunk
|
|
|
|
---
|
|
|
|
## About json-file ...
|
|
|
|
- It doesn't rotate logs by default, so your disks will fill up
|
|
|
|
(Unless you set the `max-size` *and* `max-file` log options.)
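For example, to cap a container's logs at three files of 10 MB each:

```shell
# json-file rotation options (per container; can also be set daemon-wide):
docker run --log-driver json-file \
       --log-opt max-size=10m --log-opt max-file=3 \
       alpine echo hello
```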
|
|
|
|
- It's the only one supporting logs retrieval
|
|
|
|
(If you want to use `docker logs`, `docker-compose logs`,
|
|
or fetch logs from the Docker API, you need json-file!)
|
|
|
|
- This might change in the future
|
|
|
|
(But it's complex since there is no standard protocol
|
|
to *retrieve* log entries.)
|
|
|
|
All about logging in the documentation:
|
|
https://docs.docker.com/reference/logging/overview/
|
|
|
|
---
|
|
|
|
# Setting up ELK to store container logs
|
|
|
|
*Important foreword: this is not an "official" or "recommended"
|
|
setup; it is just an example. We do not endorse ELK, GELF,
|
|
or the other elements of this stack over any alternatives!*
|
|
|
|
What we will do:
|
|
|
|
- Spin up an ELK stack, with Compose
|
|
|
|
- Gaze at the spiffy Kibana web UI
|
|
|
|
- Manually send a few log entries over GELF
|
|
|
|
- Reconfigure our DockerCoins app to send logs to ELK
|
|
|
|
---
|
|
|
|
## What's in an ELK stack?
|
|
|
|
- ELK is three components:
|
|
|
|
- ElasticSearch (to store and index log entries)
|
|
|
|
- Logstash (to receive log entries from various
|
|
sources, process them, and forward them to various
|
|
destinations)
|
|
|
|
- Kibana (to view/search log entries with a nice UI)
|
|
|
|
- The only component that we will configure is Logstash
|
|
|
|
- We will accept log entries using the GELF protocol
|
|
|
|
- Log entries will be stored in ElasticSearch,
|
|
<br/>and displayed on Logstash's stdout for debugging
|
|
|
|
---
|
|
|
|
## Starting our ELK stack
|
|
|
|
- We will use a *separate* Compose file
|
|
|
|
- The Compose file is in the `elk` directory
|
|
|
|
.exercise[
|
|
|
|
- Go to the `elk` directory:
|
|
```bash
|
|
cd ~/orchestration-workshop/elk
|
|
```
|
|
|
|
- Start the ELK stack:
|
|
```bash
|
|
unset COMPOSE_FILE
|
|
docker-compose up -d
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Making sure that each node has a local logstash
|
|
|
|
- We will configure each container to send logs to `localhost:12201`
|
|
|
|
- We need to make sure that each node has a logstash container listening on port 12201
|
|
|
|
.exercise[
|
|
|
|
- Scale the `logstash` service to 5 instances (one per node):
|
|
```bash
|
|
for N in $(seq 1 5); do
|
|
docker-compose scale logstash=$N
|
|
done
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Checking that our ELK stack works
|
|
|
|
- Our default Logstash configuration sends a test
|
|
message every minute
|
|
|
|
- All messages are stored into ElasticSearch,
|
|
but also shown on Logstash stdout
|
|
|
|
.exercise[
|
|
|
|
- Look at Logstash stdout:
|
|
```bash
|
|
docker-compose logs logstash
|
|
```
|
|
|
|
]
|
|
|
|
After less than one minute, you should see a `"message" => "ok"`
|
|
in the output.
|
|
|
|
---
|
|
|
|
## Connect to Kibana
|
|
|
|
- Our ELK stack exposes two public services:
|
|
<br/>the Kibana web server, and the GELF UDP socket
|
|
|
|
- They are both exposed on their default port numbers
|
|
<br/>(5601 for Kibana, 12201 for GELF)
|
|
|
|
.exercise[
|
|
|
|
- Check the address of the node running kibana:
|
|
```bash
|
|
docker-compose ps
|
|
```
|
|
|
|
- Open the UI in your browser: http://instance-address:5601/
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## "Configuring" Kibana
|
|
|
|
- If you see a status page with a yellow item, wait a minute and reload
|
|
(Kibana is probably still initializing)
|
|
|
|
- Kibana should offer you to "Configure an index pattern",
|
|
just click the "Create" button
|
|
|
|
- Then:
|
|
|
|
- click "Discover" (in the top-left corner)
|
|
- click "Last 15 minutes" (in the top-right corner)
|
|
- click "Last 1 hour" (in the list in the middle)
|
|
- click "Auto-refresh" (top-right corner)
|
|
- click "5 seconds" (top-left of the list)
|
|
|
|
- You should see a series of green bars (with one new green bar every minute)
|
|
|
|
---
|
|
|
|

|
|
|
|
---
|
|
|
|
## Sending container output to Kibana
|
|
|
|
- We will create a simple container displaying "hello world"
|
|
|
|
- We will override the container logging driver
|
|
|
|
- The GELF address is `127.0.0.1:12201`, because the Compose file
|
|
explicitly exposes the GELF socket on port 12201
|
|
|
|
.exercise[
|
|
|
|
- Start our one-off container:
|
|
|
|
```bash
|
|
docker run --rm --log-driver gelf \
|
|
--log-opt gelf-address=udp://127.0.0.1:12201 \
|
|
alpine echo hello world
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Visualizing container logs in Kibana
|
|
|
|
- Less than 5 seconds later (the refresh rate of the UI),
|
|
the log line should be visible in the web UI
|
|
|
|
- We can customize the web UI to be more readable
|
|
|
|
.exercise[
|
|
|
|
- In the left column, move the mouse over the following
|
|
columns, and click the "Add" button that appears:
|
|
|
|
- host
|
|
- container_name
|
|
- message
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Switching back to the DockerCoins application
|
|
|
|
.exercise[
|
|
|
|
- Go back to the dockercoins directory:
|
|
```bash
|
|
cd ~/orchestration-workshop/dockercoins
|
|
```
|
|
|
|
- Set the `COMPOSE_FILE` variable:
|
|
```bash
|
|
export COMPOSE_FILE=docker-compose.yml-`NNN`
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
## Add the logging driver to the Compose file
|
|
|
|
- We need to add the logging section to each container
|
|
|
|
.exercise[
|
|
|
|
- Edit the `docker-compose.yml-NNN` file, adding the following lines **to each container**:
|
|
|
|
```yaml
|
|
logging:
|
|
driver: gelf
|
|
options:
|
|
gelf-address: "udp://127.0.0.1:12201"
|
|
```
|
|
|
|
]
|
|
|
|
There is also a script, [`../bin/add-logging.py`](https://github.com/jpetazzo/orchestration-workshop/blob/master/bin/add-logging.py), to do that automatically.
|
|
|
|
---
|
|
|
|
## Update the DockerCoins app
|
|
|
|
.exercise[
|
|
|
|
- Use Compose normally:
|
|
```bash
|
|
docker-compose up -d
|
|
```
|
|
|
|
]
|
|
|
|
If you look in the Kibana web UI, you will see log lines
|
|
refreshed every 5 seconds.
|
|
|
|
Note: to do interesting things (graphs, searches...) we
|
|
would need to create indexes. This is beyond the scope
|
|
of this workshop.
|
|
|
|
---
|
|
|
|
## Logging in production
|
|
|
|
- If we were using an ELK stack:
|
|
|
|
- scale ElasticSearch
|
|
- interpose a Redis or Kafka queue to deal with bursts
|
|
|
|
- Configure your Engines to send all logs to ELK by default
|
|
|
|
- Start the logging containers with a different logging system
|
|
<br/>(to avoid a logging loop)
|
|
|
|
- Make sure you don't end up writing *all logs* on the nodes running Logstash!
|
|
|
|
---
|
|
|
|
# Network traffic analysis
|
|
|
|
- We want to inspect the network traffic entering/leaving `dockercoins_redis_1`
|
|
|
|
- We will use *shared network namespaces* to perform network analysis
|
|
|
|
- Two containers sharing the same network namespace...
|
|
|
|
- have the same IP addresses
|
|
|
|
- have the same network interfaces
|
|
|
|
- `eth0` is therefore the same in both containers
|
|
|
|
---
|
|
|
|
## Install and start `ngrep`
|
|
|
|
Ngrep uses libpcap (like tcpdump) to sniff network traffic.
|
|
|
|
.exercise[
|
|
|
|
<!--
|
|
```meta
|
|
^{
|
|
```
|
|
-->
|
|
|
|
- Start a container with the same network namespace:
|
|
<br/>`docker run --net container:dockercoins_redis_1 -ti alpine sh`
|
|
|
|
- Install ngrep:
|
|
<br/>`apk update && apk add ngrep`
|
|
|
|
- Run ngrep:
|
|
<br/>`ngrep -tpd eth0 -Wbyline . tcp`
|
|
|
|
<!--
|
|
```meta
|
|
^}
|
|
```
|
|
-->
|
|
|
|
]
|
|
|
|
You should see a stream of Redis requests and responses.
|
|
|
|
---
|
|
|
|
# Backups
|
|
|
|
- We want to enable backups for `dockercoins_redis_1`
|
|
|
|
- We don't want to install extra software in this container
|
|
|
|
- We will use a special backup container:
|
|
|
|
- sharing the same volumes
|
|
|
|
- using the same network stack (to connect to it easily)
|
|
|
|
- possibly containing our backup tools
|
|
|
|
- This works because the `redis` container image stores its data on a volume
|
|
|
|
---
|
|
|
|
## Starting the backup container
|
|
|
|
- We will use the `--net container:` option to be able to connect locally
|
|
|
|
- We will use the `--volumes-from` option to access the container's persistent data
|
|
|
|
.exercise[
|
|
|
|
<!--
|
|
```meta
|
|
^{
|
|
```
|
|
-->
|
|
|
|
- Start the container:
|
|
|
|
```bash
|
|
docker run --net container:dockercoins_redis_1 \
|
|
--volumes-from dockercoins_redis_1:ro \
|
|
-v /tmp/myredis:/output \
|
|
-ti alpine sh
|
|
```
|
|
|
|
- Look in `/data` in the container (that's where Redis puts its data dumps)
|
|
]
|
|
|
|
---
|
|
|
|
## Connecting to Redis
|
|
|
|
- We need to tell Redis to perform a data dump *now*
|
|
|
|
.exercise[
|
|
|
|
- Connect to Redis:
|
|
```bash
|
|
telnet localhost 6379
|
|
```
|
|
|
|
- Issue commands `SAVE` then `QUIT`
|
|
|
|
- Look at `/data` again (notice the time stamps)
|
|
|
|
]
|
|
|
|
- There should be a recent dump file now!
|
|
|
|
---
|
|
|
|
## Getting the dump out of the container
|
|
|
|
- We could use many things:
|
|
|
|
- s3cmd to copy to S3
|
|
- SSH to copy to a remote host
|
|
- gzip/bzip/etc before copying
|
|
|
|
- We'll just copy it to the Docker host
|
|
|
|
.exercise[
|
|
|
|
- Copy the file from `/data` to `/output`
|
|
|
|
- Exit the container
|
|
|
|
- Look into `/tmp/myredis` (on the host)
|
|
|
|
<!--
|
|
```meta
|
|
^}
|
|
```
|
|
-->
|
|
|
|
]
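The copy step can be wrapped in a tiny helper (a sketch; `dump.rdb` is Redis' default dump file name, and the directories match the volumes mounted earlier):

```shell
# Copy the Redis dump to the output volume, with a timestamped name.
backup_dump() {
  SRC=${1:-/data}
  DST=${2:-/output}
  cp "$SRC/dump.rdb" "$DST/dump-$(date +%Y%m%d-%H%M%S).rdb"
}
# Inside the backup container, just run: backup_dump
```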
|
|
|
|
---
|
|
|
|
## Scheduling backups
|
|
|
|
In the "old world," we (generally) use cron.
|
|
|
|
With containers, what are our options?
|
|
|
|
--
|
|
|
|
- run `cron` on the Docker host, and put `docker run` in the crontab
|
|
|
|
--
|
|
|
|
- run `cron` in the backup container, and make sure it keeps running
|
|
<br/>(e.g. with `docker run --restart=…`)
|
|
|
|
--
|
|
|
|
- run `cron` in a container, and start backup containers from there
|
|
|
|
--
|
|
|
|
- listen to the Docker events stream, automatically scheduling backups
|
|
<br/>when database containers are started
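With the first option, the host's crontab would contain something like this (a sketch; the backup image name is hypothetical):

```shell
# /etc/crontab entry on the Docker host: nightly Redis backup at 03:00.
# m h dom mon dow user command
0 3 * * * root docker run --rm --volumes-from dockercoins_redis_1:ro \
  -v /backups:/output myorg/redis-backup
```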
|
|
|
|
---
|
|
|
|
# Controlling Docker from a container
|
|
|
|
- In a local environment, just bind-mount the Docker control socket:
|
|
```bash
|
|
docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker
|
|
```
|
|
|
|
- Otherwise, you have to:
|
|
|
|
- set `DOCKER_HOST`,
|
|
- set `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` (if you use TLS),
|
|
- copy certificates to the container that will need API access.
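Putting it together, granting a container remote API access could look like this (a sketch; the node name and certificate path are assumptions based on Machine's defaults):

```shell
# Give the `docker` CLI in a container access to a remote, TLS-protected Engine:
docker run -ti \
       -e DOCKER_HOST=tcp://node1:2376 \
       -e DOCKER_TLS_VERIFY=1 \
       -e DOCKER_CERT_PATH=/certs \
       -v ~/.docker/machine/machines/node1:/certs \
       docker
```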
|
|
|
|
More resources on this topic:
|
|
|
|
- [Do not use Docker-in-Docker for CI](
|
|
http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/)
|
|
- [One container to rule them all](
|
|
http://jpetazzo.github.io/2016/04/03/one-container-to-rule-them-all/)
|
|
|
|
---
|
|
|
|
# Docker events stream
|
|
|
|
- Using the Docker API, we can get real-time
|
|
notifications of everything happening in the Engine:
|
|
|
|
- container creation/destruction
|
|
- container start/stop
|
|
- container exit/signal/out of memory
|
|
- container attach/detach
|
|
- volume creation/destruction
|
|
- network creation/destruction
|
|
- connection/disconnection of containers
|
|
|
|
---
|
|
|
|
## Subscribing to the events stream
|
|
|
|
- This is done with `docker events`
|
|
|
|
.exercise[
|
|
|
|
- Get a stream of events:
|
|
```bash
|
|
docker events
|
|
```
|
|
|
|
<!--
|
|
```meta
|
|
^Z
|
|
```
|
|
-->
|
|
|
|
- In a new terminal, do *anything*:
|
|
```bash
|
|
docker run --rm alpine sleep 10
|
|
```
|
|
|
|
]
|
|
|
|
You should see events for the lifecycle of the
|
|
container, as well as its connection/disconnection
|
|
to the default `bridge` network.
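The stream can be narrowed down with filters, which helps when scripting against it:

```shell
# Only show container start/stop ("die") events:
docker events --filter type=container \
              --filter event=start --filter event=die
```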
|
|
|
|
---
|
|
|
|
## A few tools to use the events stream
|
|
|
|
- [docker-spotter](https://github.com/discordianfish/docker-spotter)
|
|
|
|
Written in Go; simple building block to use directly in Shell scripts
|
|
|
|
- [ahab](https://github.com/instacart/ahab)
|
|
|
|
Written in Python; available as a library; ships with a CLI tool
|
|
|
|
---
|
|
|
|
# Security upgrades
|
|
|
|
- This section is not hands-on
|
|
|
|
- Public Service Announcement
|
|
|
|
- We'll discuss:
|
|
|
|
- how to upgrade the Docker daemon
|
|
|
|
- how to upgrade container images
|
|
|
|
---
|
|
|
|
## Upgrading the Docker daemon
|
|
|
|
- Stop all containers cleanly
|
|
|
|
- Stop the Docker daemon
|
|
|
|
- Upgrade the Docker daemon
|
|
|
|
- Start the Docker daemon
|
|
|
|
- Start all containers
|
|
|
|
- This is like upgrading your Linux kernel, but it will get better
|
|
|
|
(Docker Engine 1.11 is using containerd, which will ultimately allow seamless upgrades.)
|
|
|
|
???
|
|
|
|
## In practice
|
|
|
|
- Keep track of running containers before stopping the Engine:
|
|
```bash
|
|
docker ps --no-trunc -q |
|
|
tee /tmp/running |
|
|
xargs -n1 -P10 docker stop
|
|
```
|
|
|
|
- Restart those containers after the Engine is running again:
|
|
```bash
|
|
xargs docker start < /tmp/running
|
|
```
|
|
<br/>(Run this multiple times if you have linked containers!)
|
|
|
|
---
|
|
|
|
## Upgrading container images
|
|
|
|
- When a vulnerability is announced:
|
|
|
|
- if it affects your base images: make sure they are fixed first
|
|
|
|
- if it affects downloaded packages: make sure they are fixed first
|
|
|
|
- re-pull base images
|
|
|
|
- rebuild
|
|
|
|
- restart containers
|
|
|
|
---
|
|
|
|
## How do we know when to upgrade?
|
|
|
|
- Subscribe to CVE notifications
|
|
|
|
- https://cve.mitre.org/
|
|
|
|
- your distros' security announcements
|
|
|
|
- Check CVE status in official images
|
|
<br/>(tag [cve-tracker](
|
|
https://github.com/docker-library/official-images/labels/cve-tracker)
|
|
in [docker-library/official-images](
|
|
https://github.com/docker-library/official-images/labels/cve-tracker)
|
|
repo)
|
|
|
|
- Use a container vulnerability scanner
|
|
<br/>(e.g. [Docker Security Scanning](https://blog.docker.com/2016/05/docker-security-scanning/))
|
|
|
|
---
|
|
|
|
## Upgrading with Compose
|
|
|
|
Compose makes this particularly easy:
|
|
```bash
|
|
docker-compose build --pull --no-cache
|
|
docker-compose up -d
|
|
```
|
|
|
|
This will automatically:
|
|
|
|
- pull base images;
|
|
- rebuild all container images;
|
|
- bring up the new containers.
|
|
|
|
Remember: Compose will automatically move our
|
|
volumes to the new containers, so data is preserved.
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
# Resiliency <br/> and <br/> high availability
|
|
|
|
---
|
|
|
|
## What are our single points of failure?
|
|
|
|
- The TLS certificates created by Machine are on `node1`
|
|
|
|
- We have only one Swarm manager
|
|
|
|
- If a node (running containers) is down or unreachable,
|
|
our application will be affected
|
|
|
|
---
|
|
|
|
# Distributing Machine credentials
|
|
|
|
- All the credentials (TLS keys and certs) are on node1
|
|
<br/>(the node on which we ran `docker-machine create`)
|
|
|
|
- If we lose node1, we're toast
|
|
|
|
- We need to move (or copy) the credentials somewhere safe
|
|
|
|
- Credentials are regular files, and relatively small
|
|
|
|
- Ah, if only we had a highly available, hierarchical store ...
|
|
|
|
--
|
|
|
|
- Wait a minute, we have one!
|
|
|
|
--
|
|
|
|
(That's Consul, if you were wondering)
|
|
|
|
---
|
|
|
|
## Storing files in Consul
|
|
|
|
- We will use [Benjamin Wester's consulfs](
|
|
https://github.com/bwester/consulfs)
|
|
|
|
- It mounts a Consul key/value store as a local filesystem
|
|
|
|
- Performance will be horrible
|
|
<br/>(don't run a database on top of that!)
|
|
|
|
- But to store files of a few KB, nobody will notice
|
|
|
|
- We will copy/link/sync... `~/.docker/machine` to Consul
|
|
|
|
---
|
|
|
|
## Installing consulfs
|
|
|
|
- Option 1: install Go, git clone, go build ...
|
|
|
|
- Option 2: be lazy and use [jpetazzo/consulfs](
|
|
https://hub.docker.com/r/jpetazzo/consulfs/)
|
|
|
|
.exercise[
|
|
|
|
- Be lazy and use the Docker image:
|
|
```bash
|
|
eval $(docker-machine env node1)
|
|
docker run --rm -v /usr/local/bin:/target jpetazzo/consulfs
|
|
```
|
|
]
|
|
|
|
Note: the `jpetazzo/consulfs` image contains the
|
|
`consulfs` binary.
|
|
|
|
It copies it to `/target` (if `/target` is a volume).
|
|
|
|
---
|
|
|
|
## Can't we run consulfs in a container?
|
|
|
|
- Yes we can!
|
|
|
|
- The filesystem will be mounted in the container
|
|
|
|
- It won't be visible outside of the container (from the host)
|
|
|
|
- We can use *shared mounts* to propagate mounts from containers to Docker
|
|
|
|
- But propagating from Docker to the host requires particular systemd flags
|
|
|
|
- ... So we'll run it on the host for now
|
|
|
|
---
|
|
|
|
## Running consulfs
|
|
|
|
- The `consulfs` binary takes two arguments:
|
|
|
|
- the Consul server address
|
|
- a mount point (that has to be created first)
|
|
|
|
.exercise[
|
|
|
|
- Create a mount point and mount Consul as a local filesystem:
|
|
```bash
|
|
mkdir ~/consul
|
|
consulfs localhost:8500 ~/consul
|
|
```
|
|
|
|
]
|
|
|
|
Leave this running in the foreground.
|
|
|
|
---
|
|
|
|
## Checking our consulfs mount point
|
|
|
|
- All key/values will be visible:
|
|
|
|
- Swarm discovery
|
|
|
|
- overlay networks
|
|
|
|
- ... anything you put in Consul!
|
|
|
|
.exercise[
|
|
|
|
- Check that Consul key/values are visible:
|
|
```bash
|
|
ls -l ~/consul/
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Copying our credentials to Consul
|
|
|
|
- Use standard UNIX commands
|
|
|
|
- Don't try to preserve permissions, though (`consulfs` doesn't store permissions)
|
|
|
|
.exercise[
|
|
|
|
- Copy Machine credentials into Consul:
|
|
```bash
|
|
cp -r ~/.docker/machine/. ~/consul/machine/
|
|
```
|
|
|
|
]
|
|
|
|
(This command can be re-executed to update the copy.)
|
|
|
|
---
|
|
|
|
## Install consulfs on another node
|
|
|
|
- We will repeat the previous steps to install consulfs
|
|
|
|
.exercise[
|
|
|
|
- Connect to node2:
|
|
```bash
|
|
ssh node2
|
|
```
|
|
|
|
- Install `consulfs`:
|
|
```bash
|
|
docker run --rm -v /usr/local/bin:/target jpetazzo/consulfs
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Mount Consul
|
|
|
|
- The procedure is still the same as on the first node
|
|
|
|
.exercise[
|
|
|
|
- Create the mount point:
|
|
```bash
|
|
mkdir ~/consul
|
|
```
|
|
|
|
- Mount the filesystem:
|
|
```bash
|
|
consulfs localhost:8500 ~/consul &
|
|
```
|
|
|
|
]
|
|
|
|
At this point, `ls -l ~/consul` should show `docker` and
|
|
`machine` directories.
|
|
|
|
---
|
|
|
|
## Access the credentials from the other node
|
|
|
|
- We will create a symlink
|
|
|
|
- We could also copy the credentials
|
|
|
|
.exercise[
|
|
|
|
- Create the symlink:
|
|
```bash
|
|
mkdir -p ~/.docker/
|
|
ln -s ~/consul/machine ~/.docker/
|
|
```
|
|
|
|
- Check that all nodes are visible:
|
|
```bash
|
|
docker-machine ls
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## A few words on this strategy
|
|
|
|
- Anyone accessing Consul can control your Docker cluster
|
|
<br/>(to be fair: anyone accessing Consul can wreak
|
|
serious havoc on your cluster anyway)
|
|
|
|
- ConsulFS doesn't support *all* POSIX operations,
|
|
so a few things (like `mv`) will not work
|
|
|
|
- As a consequence, with Machine 0.6, you cannot
|
|
run `docker-machine create` directly on top of ConsulFS
|
|
|
|
---
|
|
|
|
## What if Consul becomes unavailable?
|
|
|
|
- If Consul becomes unavailable (e.g. loses quorum),
|
|
<br/>you won't be able to access your credentials
|
|
|
|
- If Consul becomes unavailable ...
|
|
<br/>your cluster will be in a bad state anyway
|
|
|
|
- You can still access each Docker Engine over the
|
|
local UNIX socket
|
|
<br/>(and repair Consul that way)
|
|
|
|
|
|
---
|
|
|
|
# Highly available Swarm managers
|
|
|
|
- Until now, the Swarm manager was a SPOF
|
|
<br/>(Single Point Of Failure)
|
|
|
|
- Swarm has support for replication
|
|
|
|
- When replication is enabled, you deploy multiple (identical) managers
|
|
|
|
- one will be "primary"
|
|
- the other(s) will be "secondary"
|
|
- this is determined automatically
|
|
<br/>(through *leader election*)
|
|
|
|
---
|
|
|
|
## Swarm leader election
|
|
|
|
- The leader election mechanism relies on a key/value store
|
|
<br/>(Consul, etcd, Zookeeper)
|
|
|
|
- There is no requirement on the number of replicas
|
|
<br/>(the quorum is achieved through the key/value store)
|
|
|
|
- When the leader (or "primary") is unavailable,
|
|
<br/>a new election happens automatically
|
|
|
|
- You can issue API requests to any manager:
|
|
<br/>if you talk to a secondary, it forwards to the primary
|
|
|
|
.warning[There is currently a bug when
|
|
the Consul cluster itself has a leader election;
|
|
<br/>see [docker/swarm#1782](https://github.com/docker/swarm/issues/1782).]
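
The idea can be sketched with a toy example: each candidate attempts an atomic "create if absent" on a well-known key, and the one that succeeds becomes primary. Below, a local directory and `ln -s` (whose creation is atomic) stand in for the key/value store; the real Swarm manager does an atomic check-and-set against Consul/etcd/Zookeeper.

```bash
# Toy leader election: the first candidate to create the
# "leader" key wins; later candidates see the existing key
# and become secondaries.
store=$(mktemp -d)

elect() {  # usage: elect <candidate-name>
  if ln -s "$1" "$store/leader" 2>/dev/null; then
    echo "$1 is now primary"
  else
    echo "$1 is secondary (primary is $(readlink "$store/leader"))"
  fi
}

elect node1   # → node1 is now primary
elect node2   # → node2 is secondary (primary is node1)
```

Deleting `$store/leader` simulates the primary disappearing: the next `elect` call then wins a fresh "election."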
|
|
|
|
---
|
|
|
|
## Swarm replication in practice
|
|
|
|
- We need to give two extra flags to the Swarm manager:
|
|
|
|
- `--replication`
|
|
|
|
*enables replication (duh!)*
|
|
|
|
- `--advertise ip.ad.dr.ess:port`
|
|
|
|
*address and port where this Swarm manager is reachable*
|
|
|
|
- Do you deploy with Docker Machine?
|
|
<br/>Then you can use `--swarm-opt`
|
|
to automatically pass flags to the Swarm manager
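
For reference, with those options the manager container ends up running a command along these lines (the address and TLS paths below are placeholders; Machine fills in the real ones):

```bash
swarm manage \
    --tlsverify --tlscacert=ca.pem --tlscert=cert.pem --tlskey=key.pem \
    -H tcp://0.0.0.0:3376 \
    --replication --advertise 172.16.0.11:3376 \
    consul://localhost:8500
```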
|
|
|
|
---
|
|
|
|
## Cleaning up our current Swarm containers
|
|
|
|
- We will use Docker Machine to re-provision Swarm
|
|
|
|
- We need to:
|
|
|
|
- remove the nodes from the Machine registry
|
|
- remove the Swarm containers
|
|
|
|
.exercise[
|
|
|
|
- Remove the current configuration (remember to go back to node1!):
|
|
```bash
|
|
for N in 1 2 3 4 5; do
|
|
ssh node$N docker rm -f swarm-agent swarm-agent-master
|
|
docker-machine rm -f node$N
|
|
done
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Re-deploy with the new configuration
|
|
|
|
- This time, all nodes can be deployed identically
|
|
<br/>(instead of 1 manager + 4 non-managers)
|
|
|
|
.exercise[
|
|
|
|
```bash
|
|
grep 'node[12345]' /etc/hosts | grep -v ^127 |
|
|
while read IPADDR NODENAME; do
|
|
docker-machine create --driver generic \
|
|
--engine-opt cluster-store=consul://localhost:8500 \
|
|
--engine-opt cluster-advertise=eth0:2376 \
|
|
--swarm --swarm-master \
|
|
--swarm-discovery consul://localhost:8500 \
|
|
--swarm-opt replication --swarm-opt advertise=$IPADDR:3376 \
|
|
--generic-ssh-user docker --generic-ip-address $IPADDR $NODENAME
|
|
done
|
|
```
|
|
|
|
]
|
|
|
|
.small[
|
|
Note: Consul is still running thanks to the `--restart=always` policy.
|
|
Other containers are now stopped, because the engines have been
|
|
reconfigured and restarted.
|
|
]
|
|
|
|
---
|
|
|
|
## Assess our new cluster health
|
|
|
|
- The output of `docker info` will tell us the status
|
|
of the node that we are talking to (primary or replica)
|
|
|
|
- If we talk to a replica, it will tell us who is the primary
|
|
|
|
.exercise[
|
|
|
|
- Talk to a random node, and ask its view of the cluster:
|
|
```bash
|
|
eval $(docker-machine env node3 --swarm)
|
|
docker info | grep -e ^Name -e ^Role -e ^Primary
|
|
```
|
|
|
|
]
|
|
|
|
Note: `docker info` is one of the few commands that


work even when there is no elected primary. This helps


with debugging.
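
To make the output easier to interpret, here is a simulated `docker info` excerpt with that same `grep` applied (the values are made up, and the real output has many more lines):

```bash
# Stand-in for `docker info` as seen on a secondary manager
docker_info() {
  printf '%s\n' \
    'Name: node3' \
    'Role: replica' \
    'Primary: 172.16.0.11:3376' \
    'Strategy: spread' \
    'Nodes: 5'
}

docker_info | grep -e ^Name -e ^Role -e ^Primary
```

Only the `Name`, `Role`, and `Primary` lines survive the filter, which is exactly what we need to locate the primary manager.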
|
|
|
|
---
|
|
|
|
## Test Swarm manager failover
|
|
|
|
- The previous command told us which node was the primary manager
|
|
|
|
- if `Role` is `primary`,
|
|
<br/>then the primary is indicated by `Name`
|
|
|
|
- if `Role` is `replica`,
|
|
<br/>then the primary is indicated by `Primary`
|
|
|
|
.exercise[
|
|
|
|
- Kill the primary manager:
|
|
```bash
|
|
ssh node`N` docker kill swarm-agent-master
|
|
```
|
|
|
|
]
|
|
|
|
Look at the output of `docker info` every few seconds.
|
|
|
|
---
|
|
|
|
# Highly available containers
|
|
|
|
- Swarm has support for *rescheduling* on node failure
|
|
|
|
- It has to be explicitly enabled on a per-container basis
|
|
|
|
- When the primary manager detects that a node goes down,


<br/>the containers that had rescheduling enabled are rescheduled elsewhere
|
|
|
|
- If the containers can't be rescheduled (constraints issue),
|
|
<br/>they are lost (there is no reconciliation loop yet)
|
|
|
|
- In Swarm 1.1, this is an *experimental* feature
|
|
<br/>(To enable it, you must pass the `--experimental` flag when you start Swarm itself!)
|
|
|
|
- In Swarm 1.2, you don't need the `--experimental` flag anymore
|
|
|
|
---
|
|
|
|
## About Swarm generic flags
|
|
|
|
- Some flags like `--experimental` and `--debug` must be *before* the Swarm command
|
|
<br/>(i.e. `docker run swarm --debug manage ...`)
|
|
|
|
- We cannot use Docker Machine to pass that flag ☹
|
|
<br/>(Machine adds flags *after* the Swarm command)
|
|
|
|
- Instead, we can use a custom Swarm image:
|
|
```dockerfile
|
|
FROM swarm
|
|
ENTRYPOINT ["/swarm", "--debug"]
|
|
```
|
|
|
|
- We can tell Machine to use this with `--swarm-image`
|
|
|
|
---
|
|
|
|
## Start a resilient container
|
|
|
|
- By default, containers will not be restarted when their node goes down
|
|
|
|
- You must pass an explicit *rescheduling policy* to make that happen
|
|
|
|
- For now, the only policy is "on-node-failure"
|
|
|
|
.exercise[
|
|
|
|
- Start a container with a rescheduling policy:
|
|
|
|
```bash
|
|
docker run --name highlander -d -e reschedule:on-node-failure nginx
|
|
```
|
|
|
|
]
|
|
|
|
Check that the container is up and running.
|
|
|
|
---
|
|
|
|
## Simulate a node failure
|
|
|
|
- We will reboot the node running this container
|
|
|
|
- Swarm will reschedule it
|
|
|
|
.exercise[
|
|
|
|
- Check on which node the container is running:
|
|
<br/>`NODE=$(docker inspect --format '{{.Node.Name}}' highlander)`
|
|
|
|
- Reboot that node:
|
|
<br/>`ssh $NODE sudo reboot`
|
|
|
|
- Check that the container has been rescheduled:
|
|
<br/>`docker ps -a`
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## Reboots
|
|
|
|
- When rebooting a node, Docker is stopped cleanly, and containers are stopped
|
|
|
|
- Our container is rescheduled, but not started
|
|
|
|
- To simulate a "proper" failure, we can use the Chaos Monkey script instead
|
|
|
|
```bash
|
|
~/orchestration-workshop/bin/chaosmonkey $NODE <connect|disconnect|reboot>
|
|
```
|
|
|
|
---
|
|
|
|
## Cluster reconciliation
|
|
|
|
- After the rebooted node rejoins the cluster, we can end up with duplicate containers
|
|
|
|
.exercise[
|
|
|
|
- Once the node is back, remove one of the extraneous containers:
|
|
```bash
|
|
docker rm -f node`N`/highlander
|
|
```
|
|
|
|
]
|
|
|
|
---
|
|
|
|
## .warning[Caveats]
|
|
|
|
- There are some corner cases when the node is also
|
|
the Swarm leader or the Consul leader; this is being improved
|
|
right now!
|
|
|
|
- The safest way to address this for now is to run the Consul
|
|
servers, the Swarm managers, and your containers on
|
|
different nodes.
|
|
|
|
- Swarm doesn't handle gracefully the fact that after the
|
|
reboot, you have *two* containers named `highlander`,
|
|
and attempts to manipulate the container with its name
|
|
will not work. This will be improved too.
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
# Conclusions
|
|
|
|
---
|
|
|
|
## Swarm cluster deployment
|
|
|
|
- We saw how to use Machine with the `generic` driver to turn
|
|
any set of machines into a Swarm cluster
|
|
|
|
- This can trivially be adapted to provision cloud instances
|
|
on the fly (using "normal" drivers of Docker Machine)
|
|
|
|
- For auto-scaling, you can use e.g.:
|
|
|
|
- private admin-only network
|
|
|
|
- no TLS
|
|
|
|
- static discovery on a /24 to /20 network (depending on your needs)
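
With static discovery, Swarm's `nodes://` backend accepts address ranges, so the whole fleet can be described without a key/value store (the subnet below is hypothetical):

```bash
swarm manage nodes://10.1.2.[1:254]:2375
```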
|
|
|
|
---
|
|
|
|
## Key/value store
|
|
|
|
- We saw an easy deployment method for Consul
|
|
|
|
- This is good for 3 to 9 nodes
|
|
|
|
- Remember: raft write performance *degrades* as you add nodes!
|
|
|
|
- For bigger clusters:
|
|
|
|
- have e.g. 5 "static" server nodes
|
|
|
|
- put them in a round-robin DNS record set (or behind an ELB)
|
|
|
|
- run a normal agent on the other nodes
|
|
|
|
---
|
|
|
|
## App deployment
|
|
|
|
- We saw how to transform a Compose file into a series of build artefacts
|
|
|
|
- using S3 or another object store is trivial
|
|
|
|
- We saw how to programmatically add load balancing, logging
|
|
|
|
- This can be improved further by using variable interpolation for the image tags
|
|
|
|
- Rolling deploys are relatively straightforward, but:
|
|
|
|
- I recommend aiming directly for blue/green (or canary) deploys
|
|
|
|
- In the production stack, abstract stateful services with ambassadors
|
|
|
|
---
|
|
|
|
## Operations
|
|
|
|
- We saw how to set up an ELK stack and send logs to it in record time
|
|
|
|
*Important: this doesn't mean that operating ELK suddenly became an easy thing!*
|
|
|
|
- We saw how to translate a few basic tasks to containerized environments
|
|
|
|
(Backups, network traffic analysis)
|
|
|
|
- Debugging is surprisingly similar to what it used to be:
|
|
|
|
- remember that containerized processes are normal processes running on the host
|
|
|
|
- `docker exec` is your friend
|
|
|
|
- also: `docker run --net host --pid host -v /:/hostfs alpine chroot /hostfs`
|
|
|
|
---
|
|
|
|
## Things we haven't covered
|
|
|
|
- Per-container system metrics (look at cAdvisor, Snap, Prometheus...)
|
|
|
|
- Application metrics (continue to use whatever you were using before)
|
|
|
|
- Supervision (whatever you were using before still works exactly the same way)
|
|
|
|
- Tracking access to credentials and sensitive information (see Vault, Keywhiz...)
|
|
|
|
- ... (tell me what I should cover in future workshops!) ...
|
|
|
|
---
|
|
|
|
## Resilience
|
|
|
|
- We saw how to store important data (credentials) in Consul
|
|
|
|
- We saw how to achieve H/A for Swarm itself
|
|
|
|
- Rescheduling policies give us basic H/A for containers
|
|
|
|
- This will be improved in future releases
|
|
|
|
- Docker in general, and Swarm in particular, move *fast*
|
|
|
|
- Current high availability features are not Chaos-Monkey proof (yet)
|
|
|
|
- We (well, the Swarm team) are working to change that
|
|
|
|
---
|
|
|
|
## What's next?
|
|
|
|
- November 2015: Compose 1.5 + Engine 1.9 =
|
|
<br/>first release with multi-host networking
|
|
|
|
- January 2016: Compose 1.6 + Engine 1.10 =
|
|
<br/>embedded DNS server, experimental high availability
|
|
|
|
- April 2016: Compose 1.7 + Engine 1.11 =
|
|
<br/>round-robin DNS records, huge improvements in HA
|
|
|
|
- Next release: another truckload of features
|
|
|
|
- I will deliver this workshop about twice a month
|
|
|
|
- Check out the GitHub repo for updated content!
|
|
<br/>(there is a tag for each big round of updates)
|
|
|
|
---
|
|
|
|
## Overall complexity
|
|
|
|
- The scripts used here are pretty simple (each is under 100 lines of code)
|
|
|
|
- You can easily rewrite them in your favorite language,
|
|
<br/>and adapt and customize them, in a few hours
|
|
|
|
- FYI: those scripts are smaller and simpler than the
|
|
scripts (cloud-init, etc.) used to deploy the VMs for this
|
|
workshop!
|
|
|
|
- Docker Inc. has commercial products to wrap all this:
|
|
|
|
- Docker Cloud
|
|
<br/>(manage your Docker nodes from a SAAS portal)
|
|
|
|
- Docker Datacenter
|
|
<br/>(buzzword-compliant management solution:
|
|
<br/>turnkey, enterprise-class, on-premise, etc.)
|
|
|
|
---
|
|
|
|
class: title
|
|
|
|
# Thanks! <br/> Questions?
|
|
|
|
## [@jpetazzo](https://twitter.com/jpetazzo) <br/> [@docker](https://twitter.com/docker)
|
|
|
|
</textarea>
|
|
<script src="https://gnab.github.io/remark/downloads/remark-0.13.min.js" type="text/javascript">
|
|
</script>
|
|
<script type="text/javascript">
|
|
var slideshow = remark.create({
|
|
ratio: '16:9',
|
|
highlightSpans: true
|
|
});
|
|
</script>
|
|
</body>
|
|
</html>
|