<!DOCTYPE html>
<html>
<head>
<base target="_blank">
<title>Docker Orchestration Workshop</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<style type="text/css">
  @import url(https://fonts.googleapis.com/css?family=Yanone+Kaffeesatz);
  @import url(https://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic);
  @import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic);

  body { font-family: 'Droid Serif'; font-size: 150%; }

  h1, h2, h3 {
    font-family: 'Yanone Kaffeesatz';
    font-weight: normal;
  }
  a {
    text-decoration: none;
    color: blue;
  }
  .remark-code, .remark-inline-code { font-family: 'Ubuntu Mono'; }
  .red { color: #fa0000; }
  .gray { color: #ccc; }
  .small { font-size: 70%; }
  .big { font-size: 140%; }
  .underline { text-decoration: underline; }
  .footnote {
    position: absolute;
    bottom: 3em;
  }
  .pic {
    vertical-align: middle;
    text-align: center;
    padding: 0 0 0 0 !important;
  }
  img {
    max-width: 100%;
    max-height: 450px;
  }
  .title {
    vertical-align: middle;
    text-align: center;
  }
  .title {
    font-size: 2em;
  }
  .title .remark-slide-number {
    font-size: 0.5em;
  }
  .quote {
    background: #eee;
    border-left: 10px solid #ccc;
    margin: 1.5em 10px;
    padding: 0.5em 10px;
    quotes: "\201C""\201D""\2018""\2019";
    font-style: italic;
  }
  .quote:before {
    color: #ccc;
    content: open-quote;
    font-size: 4em;
    line-height: 0.1em;
    margin-right: 0.25em;
    vertical-align: -0.4em;
  }
  .quote p {
    display: inline;
  }
  .icon img {
    height: 1em;
  }
  .exercise {
    background-color: #eee;
    background-image: url("keyboard.png");
    background-size: 1.4em;
    background-repeat: no-repeat;
    background-position: 0.2em 0.2em;
    border: 2px dotted black;
  }
  .exercise::before {
    content: "Exercise:";
    margin-left: 1.8em;
  }
  li p { line-height: 1.25em; }
</style>
</head>
<body>
<textarea id="source">

class: title

# Docker <br/> Orchestration <br/> Workshop

---

<!-- grep '^# ' index.html | grep -v '<br' | tr '#' '-' -->

## Outline (1/2)

- Pre-requirements
- VM environment
- Our sample application
- Running services independently
- Running the whole app on a single node
- Identifying bottlenecks
- Measuring latency under load
- Scaling HTTP on a single node
- Put a load balancer on it
- Connecting to containers on other hosts
- Abstracting remote services with ambassadors
- Various considerations about ambassadors

---

## Outline (2/2)

- Docker for ops
- Backups
- Logs
- Security upgrades
- Network traffic analysis
- Dynamic orchestration
- Hands-on Swarm
- Deploying Swarm
- Cluster discovery
- Building our app on Swarm
- Network plumbing on Swarm
- Going further

---

# Pre-requirements

- Computer with network connection and SSH client
  <br/>(on Windows, get [putty](http://www.putty.org/)
  or [Git BASH](https://msysgit.github.io/))
- GitHub account (recommended; not mandatory)
- Docker Hub account (only for Swarm hands-on section)
- Basic Docker knowledge

.exercise[

- This is the stuff you're supposed to do!
- Create [GitHub](https://github.com/) and
  [Docker Hub](https://hub.docker.com) accounts now if needed
- Go to [view.dckr.info](http://view.dckr.info) to view these slides

]

---

# VM environment

- Each person gets 5 VMs
- They are *your* VMs
- They'll be up until tomorrow
- You have a little card with login+password+IP addresses
- You can automatically SSH from one VM to another

.exercise[

- Log into the first VM (`node1`)
- Check that you can SSH (without password) to `node2`
- Check the version of Docker with `docker version`

]

.footnote[Note: from now on, unless instructed, **all commands must
be run from the first VM, `node1`**.]

---

## Brand new versions!

- Engine 1.8.2

- Compose 1.4.2

- Swarm 0.4

- Machine 0.4.1

---

# Our sample application

- Let's look at the general layout of the
  [source code](https://github.com/jpetazzo/orchestration-workshop)

- Each directory = 1 microservice
  - `rng` = web service generating random bytes
  - `hasher` = web service computing hash of POSTed data
  - `worker` = background process using `rng` and `hasher`
  - `webui` = web interface to watch progress

.exercise[

- Clone the repository on `node1`:
  <br/>.small[`git clone git://github.com/jpetazzo/orchestration-workshop`]

]

(Bonus points for forking on GitHub and cloning your fork!)

---

## What's this application?

- It is a DockerCoin miner! 💰🐳📦🚢

- No, you can't buy coffee with DockerCoins

- How DockerCoins works:

  - `worker` asks `rng` to give it random bytes
  - `worker` feeds those random bytes into `hasher`
  - each hash starting with `0` is a DockerCoin
  - DockerCoins are stored in `redis`
  - `redis` is also updated every second to track speed
  - you can see the progress with the `webui`

Next: we will inspect components independently.

---

# Running services independently

First, we will run the random number generator (`rng`).

.exercise[

- Go to the `dockercoins` directory (in the cloned repo)

- Run `docker-compose up rng`
  <br/>(Docker will pull `python` and build the microservice)

]

.icon[] Pay attention to the port mapping!

- The container log says:
  <br/>`Running on http://0.0.0.0:80/`

- But if you try `curl localhost:80`, you will get:
  <br/>`Connection refused`

---

## Understanding port mapping

- `node1`, the Docker host, has only one port 80

- If we give the one and only port 80 to the first
  container that asks for it, we are in trouble when
  another container needs it

- Default behavior: containers are not "exposed"
  <br/>(only reachable through their private address)

- Container network services can be exposed:

  - statically (you decide which host port to use)

  - dynamically (Docker allocates a host port)

---

## Declaring port mapping

- Directly with the Docker Engine:
  <br/>`docker run -P redis`
  <br/>`docker run -p 6379 redis`
  <br/>`docker run -p 1234:6379 redis`

- With Docker Compose, in the `docker-compose.yml` file:

  ```
  rng:
    …
    ports:
      - "8001:80"
  ```

→ port 8001 *on the host* maps to
port 80 *in the container*

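
You can check the effective mapping at any time (here, `rng` is the
service name from our Compose file):

```
docker-compose port rng 80
```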

---

## Using the `rng` service

Let's get random bytes of data!

.exercise[

- Open a second terminal and connect to the same VM

- Check that the service is alive:
  <br/>`curl localhost:8001`

- Get 10 bytes of random data:
  <br/>`curl localhost:8001/10`

- If the binary data output messed up your terminal, fix it:
  <br/>`reset`

]

---

## Running the hasher

.exercise[

- Run `docker-compose up hasher`
  <br/>(it will pull `ruby` and do the build)

]

.icon[] Again, pay attention to the port mapping!

The container log says that it's listening on port 80,
but it's mapped to port 8002 on the host.

You can see the mapping in `docker-compose.yml`.

---

## Testing the hasher

.exercise[

- Open a third terminal window, and SSH to `node1`

- Run `curl localhost:8002`
  <br/>(it will say it's alive)

- Posting binary data requires some extra flags:

  ```
  curl \
      -H "Content-type: application/octet-stream" \
      --data-binary hello \
      localhost:8002
  ```

- Check that it computed the right hash:
  <br/>`echo -n hello | sha256sum`

]

---

## Stopping services

We have multiple options:

- Interrupt `docker-compose up` with `^C`

- Stop individual services with `docker-compose stop rng`

- Stop all services with `docker-compose stop`

- Kill all services with `docker-compose kill`
  <br/>(rude, but faster!)

.exercise[

- Use any of those methods to stop `rng` and `hasher`

]

---

# Running the whole app on a single node

.exercise[

- Run `docker-compose up` to start all components

]

- Aggregate output is shown

- Output is verbose
  <br/>(because the worker is constantly hitting other services)

- Now let's use the little web UI to see realtime progress

.exercise[

- Open http://[yourVMaddr]:8000/ (from a browser)

]

---

## Running in the background

- The logs are very verbose (and won't get better)

- Let's put them in the background for now!

.exercise[

- Stop the app (with `^C`)

- Start it again with `docker-compose up -d`

- Check on the web UI that the app is still making progress

]

---

## Looking at resource usage

- Let's look at CPU, memory, and I/O usage

.exercise[

- Run `top` to see CPU and memory usage
  <br/>(you should see idle cycles)

- Run `vmstat 3` to see I/O usage (si/so/bi/bo)
  <br/>(the 4 numbers should be almost zero,
  <br/>except `bo` for logging)

]

We have available resources.

- Why?
- How can we use them?

---

## Scaling workers on a single node

- Docker Compose supports scaling.red[*]
- Let's scale `worker` and see what happens!

.exercise[

- In one SSH session, run `docker-compose logs worker`

- In another, run `docker-compose scale worker=10`

- See the impact on CPU load (with top/htop),
  <br/>and on compute speed (with web UI)

]

You should see the compute speed increase ~3x (not 10x).

.footnote[.red[*]With some limitations, as we'll see later.]

---

# Identifying bottlenecks

- Adding workers didn't result in linear improvement

- *Something else* is slowing us down

--

- ... But what?

--

- The code doesn't have instrumentation

- We will use `ab` for individual load testing

- We will use `httping` to view latency under load

---

## Benchmarking our microservices

We will test microservices in isolation.

.exercise[

- Stop the application:
  `docker-compose kill`

- Remove old containers:
  `docker-compose rm`

- Start `hasher` and `rng`:
  `docker-compose up hasher rng`

]

Now let's hammer them with requests!

---

## Testing `rng`

Let's assess the raw performance of our RNG.

.exercise[

- Test the performance on one big request:
  <br/>`curl -o/dev/null localhost:8001/10000000`
  <br/>(should take ~1s, and show a speed of ~10 MB/s)

]

If we were doing requests of 1000 bytes ...

... Could we get 10k req/s?

Let's test and see what happens!

---

## Concurrent requests

.exercise[

- Test 100 requests of 1000 bytes each:
  <br/>`ab -n 100 localhost:8001/1000`

- Test 100 requests, 10 requests in parallel:
  <br/>`ab -n 100 -c 10 localhost:8001/1000`
  <br/>(look how the latency has increased!)

- Try with 100 requests in parallel:
  <br/>`ab -n 100 -c 100 localhost:8001/1000`

]

--

Whatever we do, we get ~10 requests/second.

Increasing concurrency doesn't help:
it just increases latency.

---

## Discussion

- When serving requests sequentially, they each take 100ms

- When 10 requests arrive at the same time:

  - one request is served in 100ms
  - another is served in 200ms
  - another is served in 300ms
  - ...
  - another is served in 1000ms

- All requests are queued and served by a single thread

- It looks like `rng` doesn't handle concurrent requests

- What about `hasher`?

---

## Save some random data and stop the generator

Before testing the hasher, let's save some random
data that we will feed to the hasher later.

.exercise[

- Run `curl localhost:8001/1000000 > /tmp/random`

]

Now we can stop the generator.

.exercise[

- In the shell where you did `docker-compose up rng`,
  <br/>stop it by hitting `^C`

]

---

## Benchmarking the hasher

We will hash the data that we just got from `rng`.

.exercise[

- Posting binary data requires some extra flags:

  ```
  curl \
      -H "Content-type: application/octet-stream" \
      --data-binary @/tmp/random \
      localhost:8002
  ```

- Compute the hash locally to verify that it works fine:
  <br/>`sha256sum /tmp/random`
  <br/>(it should display the same hash)

]

---

## The hasher under load

The invocation of `ab` will be slightly more complex as well.

.exercise[

- Execute 100 requests in a row:

  ```
  ab -n 100 -T application/octet-stream \
     -p /tmp/random localhost:8002/
  ```

- Execute 100 requests with 10 requests in parallel:

  ```
  ab -c 10 -n 100 -T application/octet-stream \
     -p /tmp/random localhost:8002/
  ```

]

Take note of the performance numbers (requests/s).

---

## Benchmarking the hasher on smaller data

Here we hashed 1,000,000 bytes.

Later we will hash much smaller payloads.

Let's repeat the tests with smaller data.

.exercise[

- Run `truncate --size=10 /tmp/random`
- Repeat the `ab` tests

]

---

# Measuring latency under load

We will use `httping`.

.exercise[

- You need three SSH connections

- In the first one, run `httping localhost:8001`

- In the second one, run `httping localhost:8002`

- In the third one, run `docker-compose up -d`

]

Check the latency numbers.

- `hasher` should be very low (~1ms)
- `rng` should be low, with occasional spikes (10-100ms)

---

## Latency when scaling the worker

We will add workers and see what happens.

.exercise[

- Run `docker-compose scale worker=2`

- Check latency

- Increase the number of workers and repeat

]

What happens?

- `hasher` remains low
- `rng` spikes up until it reaches ~N*100ms
  <br/>(when you have N+1 workers)

---

class: title

Why?

---

## Why does everything take (at least) 100ms?

--

`rng` code:

--

`hasher` code:

---

class: title

But ...

WHY?!?

---

## Why did we sprinkle this sample app with sleeps?

- Deterministic performance
  <br/>(regardless of instance speed, CPUs, I/O...)

--

- Actual code sleeps all the time anyway

--

- When your code makes a remote API call:

  - it sends a request;

  - it sleeps until it gets the response;

  - it processes the response.

---

## Why do `rng` and `hasher` behave differently?

--

(Synchronous vs. asynchronous event processing)

---

## How to make `rng` go faster

- Obvious solution: comment out the `sleep` instruction

--

- Real-world solution: use an asynchronous framework
  <br/>(e.g. use gunicorn with gevent; see the sketch below)
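
  A minimal sketch (assuming the Flask app is importable as `rng:app`,
  and that gevent is installed):

  ```
  gunicorn --worker-class gevent --bind 0.0.0.0:80 rng:app
  ```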

--

- New rule: we can't change the code!

--

- Solution: scale out `rng`
  <br/>(dispatch `rng` requests on multiple instances)

---

# Scaling HTTP on a single node

- We could try to scale with Compose:

  ```
  docker-compose scale rng=3
  ```

- Compose doesn't deal with load balancing

- We would get 3 instances ...

- ... But only the first one would serve traffic
  <br/>(only one of them can bind the host port 8001)

---

## The plan

- Stop the `rng` service first

- Create multiple identical `rng` containers

- Put a load balancer in front of them

- Point other services to the load balancer

---

## Stopping `rng`

- That's the easy part!

.exercise[

- Use `docker-compose` to stop `rng`:

  ```
  docker-compose stop rng
  ```

]

Note: we do this first because we are about to remove
`rng` from the Docker Compose file.

If we don't stop `rng` now, it will keep running,
and Compose will be unaware of its existence!

---

## Scaling `rng`

.exercise[

- Replace the `rng` service with multiple copies of it:

  ```
  rng1:
    build: rng

  rng2:
    build: rng

  rng3:
    build: rng
  ```

]

That's all!

Shortcut: `docker-compose.yml-scaled-rng`

---

## Introduction to `jpetazzo/hamba`

- Public image on the Docker Hub

- Load balancer based on HAProxy

- Expects the following arguments:
  <br/>`FE-port BE1-addr BE1-port BE2-addr BE2-port ...`
  <br/>*or*
  <br/>`FE-addr:FE-port BE1-addr BE1-port BE2-addr BE2-port ...`

- FE=frontend (the thing other services connect to)

- BE=backend (the multiple copies of your scaled service)

.small[
Example: listen on port 80 and balance traffic across www1:1234 and www2:2345

```
docker run -d -p 80 jpetazzo/hamba 80 www1 1234 www2 2345
```
]

---

# Put a load balancer on it

Let's add our load balancer to the Compose file.

.exercise[

- Add the following section to the Compose file:

  ```
  rng0:
    image: jpetazzo/hamba
    links:
      - rng1
      - rng2
      - rng3
    command: 80 rng1 80 rng2 80 rng3 80
    ports:
      - "8001:80"
  ```

]

Shortcut: `docker-compose.yml-scaled-rng`

---

## Point other services to the load balancer

- The only affected service is `worker`

- We have to replace the `rng` link with a link to `rng0`,
  but it should still be named `rng` (so we don't change the code)

.exercise[

- Update the `worker` section as follows:

  ```
  worker:
    build: worker
    links:
      - rng0:rng
      - hasher
      - redis
  ```

]

Shortcut: `docker-compose.yml-scaled-rng`

---

## Start the whole stack

.exercise[

- Run `docker-compose up -d`

- Check worker logs with `docker-compose logs worker`

- Check load balancer logs with `docker-compose logs rng0`

]

If you get errors about port 8001, make sure that
`rng` was stopped correctly and try again.

---

## The good, the bad, the ugly

- The good

  We scaled a service and added a load balancer -
  <br/>without changing a single line of code.

- The bad

  We manually copy-pasted sections in `docker-compose.yml`.

  Improvement: write scripts to transform the YAML file.

- The ugly

  If we scale up/down, we have to restart everything.

  Improvement: reconfigure the load balancer dynamically.

---

# Connecting to containers on other hosts

- So far, our whole stack is on a single machine

- We want to scale out (across multiple nodes)

- We will deploy the same stack multiple times

- But we want every stack to use the same Redis
  <br/>(in other words: Redis is our only *stateful* service here)

--

- And remember: we're not allowed to change the code!

  - the code connects to host `redis`
  - `redis` must resolve to the address of our Redis service
  - the Redis service must listen on the default port (6379)

---

## The plan

- Deploy our Redis service separately

  - use the same `redis` image

  - make sure that the Redis server port (6379) is publicly accessible,
    using port 6379 on the Docker host

- Update our Docker Compose YAML file

  - remove the `redis` section

  - in the `links` section, remove `redis`

  - instead, put a `redis` entry in `extra_hosts`

---

## Making Redis available on its default port

There are two strategies.

- `docker run -p 6379:6379 redis`

  - the container has its own, isolated network stack
  - Docker creates a port mapping rule through iptables
  - slight performance overhead
  - port number is explicit (visible through Docker API)

- `docker run --net host redis`

  - the container uses the network stack of the host
  - when it binds to 6379/tcp, that's 6379/tcp on the host
  - allows raw speed (no overhead due to iptables/bridge)
  - port number is not visible through Docker API

Choose wisely!

---

## Deploy Redis

.exercise[

- Start a new redis container, mapping port 6379 to 6379:

  ```
  docker run -d -p 6379:6379 redis
  ```

- Check that it's running with `docker ps`

- Note the IP address of this Docker host

- Try to connect to it (from anywhere):

  ```
  telnet ip.ad.dr.ess 6379
  ```

]

To exit a telnet session: `Ctrl-] c ENTER`

---

## Update `docker-compose.yml` (1/3)

.exercise[

- Comment out `redis`:

  ```
  #redis:
  #  image: redis
  ```

]

---

## Update `docker-compose.yml` (2/3)

.exercise[

- Update `worker`:

  ```
  worker:
    build: worker
    extra_hosts:
      redis: A.B.C.D
    links:
      - rng0:rng
      - hasher
  ```

]

Replace `A.B.C.D` with the IP address noted earlier.

Shortcut: `docker-compose.yml-extra-hosts`
<br/>(But you still have to replace `A.B.C.D`!)

---

## Update `docker-compose.yml` (3/3)

.exercise[

- Update `webui`:

  ```
  webui:
    build: webui
    extra_hosts:
      redis: A.B.C.D
    ports:
      - "8000:80"
    #volumes:
    #  - "./webui/files/:/files/"
  ```

]

(Replace `A.B.C.D` with the IP address noted earlier)

.icon[] Don't forget to comment out the `volumes` section!

---

## Why did we comment out the `volumes` section?

- Volumes have multiple uses:

  - storing persistent stuff (database files...)

  - sharing files between containers (logs, configuration...)

  - sharing files between host and containers (source...)

- The `volumes` directive expands to a host path
  <br/>.small[(e.g. `/home/docker/orchestration-workshop/dockercoins/webui/files`)]

- This host path exists on the local machine
  <br/>(not on the others)

- This specific volume is used in development
  <br/>(not in production)

---

## Start the stack on the first machine

- Nothing special to do here

- Just bring up the application like we did before

.exercise[

- `docker-compose up -d`

]

- Check in the web browser that it's running correctly

---

## Start the stack on another machine

- We will set the `DOCKER_HOST` variable

- `docker-compose` will detect and use it

- Our Docker hosts are listening on port 55555

.exercise[

- Set the environment variable:
  <br/>`export DOCKER_HOST=tcp://node2:55555`

- Start the stack:
  <br/>`docker-compose up -d`

- Check that it's running:
  <br/>`docker-compose ps`

]

---

## Scale!

.exercise[

- Open the Web UI
  <br/>(on a node where it's deployed)

- Deploy one instance of the stack on each node

]

---

## Cleanup

- Let's remove what we did

.exercise[

- You can use the following scriptlet:

  ```
  for N in $(seq 1 5); do
    export DOCKER_HOST=tcp://node$N:55555
    docker ps -qa | xargs docker rm -f
  done
  unset DOCKER_HOST
  ```

]

---

# Abstracting remote services with ambassadors

- What if we can't/won't run Redis on its default port?

- What if we want to be able to move it more easily?

--

- We will use an ambassador

- Redis will run at an arbitrary location (host+port)

- The ambassador will be part of the scaled stack

- The ambassador will connect to Redis

- The ambassador will "act as" Redis in the stack

---

## Start redis

- This time, we will let Docker pick the port for Redis

.exercise[

- Run redis with a random public port:
  <br/>`docker run -d -P --name myredis redis`

- Check which port was allocated:
  <br/>`docker port myredis 6379`

]

- Note this IP address and port

---

## Update `docker-compose.yml`

.exercise[

- Restore `links` as they were before in `webui` and `worker`

- Replace `redis` with an ambassador using `jpetazzo/hamba`:

  ```
  redis:
    image: jpetazzo/hamba
    command: 6379 AA.BB.CC.DD EEEEE
  ```

]

Shortcut: `docker-compose.yml-ambassador`
<br/>(But you still have to update `AA.BB.CC.DD EEEEE`!)

---

## Start the new stack

.exercise[

- Run `docker-compose up -d`

- Go to the web UI

- Start the stack on another node as previously,
  <br/>and confirm on the web UI that it's picking up

]

---

# Various considerations about ambassadors

- "But, ambassadors are adding an extra hop!"

--

- Yes, but if you need load balancing, you need that hop

- Ambassadors actually *save* one hop
  <br/>(they act as local load balancers)

  - traditional load balancer:
    <br/>client ⇒ external LB ⇒ server (2 physical hops)

  - ambassadors:
    <br/>client → ambassador ⇒ server (1 physical hop)

--

- Ambassadors are more reliable than traditional LBs
  <br/>(they are colocated with their clients)

---

## Drawbacks of ambassadors

- Generic issues
  <br/>(shared with any kind of load balancing / HA setup)

  - extra logical hop (not transparent to the client)

  - must assess backend health

  - one more thing to worry about (!)

- Specific issues

  - load balancing fairness

High-end load balancing solutions will rely on back pressure
from the backends. This addresses the fairness issue.

---

## There are many ways to deploy ambassadors

"Ambassador" is a design pattern.

There are many ways to implement it.

We will present three increasingly complex (but also more powerful)
ways to deploy ambassadors.

---

## Single-tier ambassador deployment

- One-shot configuration process

- Must be executed manually after each scaling operation

- Scans current state, updates load balancer configuration

- Pros:
  <br/>- simple, robust, no extra moving part
  <br/>- easy to customize (thanks to simple design)
  <br/>- can deal efficiently with large changes

- Cons:
  <br/>- must be executed after each scaling operation
  <br/>- harder to compose different strategies

- Example: this workshop

---

## Two-tier ambassador deployment

- Daemon listens to the Docker events API

- Reacts to container start/stop events

- Adds/removes backends in the load balancer configuration

- Pros:
  <br/>- no extra step required when scaling up/down

- Cons:
  <br/>- extra process to run and maintain
  <br/>- deals with one event at a time (ordering matters)

- Hidden gotcha: load balancer creation

- Example: interlock (see the sketch below)
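
A minimal sketch of such a daemon (not interlock itself; `update-lb-config`
is a hypothetical helper that regenerates the load balancer configuration):

```
docker events --filter event=start --filter event=die |
while read event; do
  update-lb-config   # re-scan containers, rewrite backends, reload the LB
done
```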

---

## Three-tier ambassador deployment

- Daemon listens to the Docker events API

- Reacts to container start/stop events

- Adds/removes scaled services in a distributed config DB
  <br/>(zookeeper, etcd, consul…)

- Another daemon listens to config DB events

- Adds/removes backends in the load balancer configuration

- Pros:
  <br/>- more flexibility

- Cons:
  <br/>- three extra services to run and maintain

- Example: registrator

---

## Other multi-host communication mechanisms

- Overlay networks

  - weave, flannel, pipework ...

- Network plugins

  - check out Docker Experimental Engine
    <br/>(should land in stable releases soon)

- Allow a flat network for your containers

- Often requires an extra service to deal with BUM packets
  <br/>(broadcast/unknown/multicast)

- Load balancers and/or failover mechanisms still needed

---

class: title

# Interlude <br/>

# Docker for ops

---

# Backups

- Redis is still running (with name `myredis`)

- We want to enable backups without touching it

- We will use a special backup container:

  - sharing the same volumes

  - linked to it (to connect to it easily)

  - possibly containing our backup tools

- This works because the `redis` container image
  <br/>stores its data on a volume

---

## Starting the backup container

.exercise[

- Start the container:

  ```
  docker run --link myredis:redis \
      --volumes-from myredis \
      -v /tmp/myredis:/output \
      -ti ubuntu
  ```

- Look in `/data` in the container
  <br/>(That's where Redis puts its data dumps)
]

- We need to tell Redis to perform a data dump *now*

---

## Connecting to Redis

.exercise[

- `apt-get update && apt-get install telnet`

- `telnet redis 6379`

- Issue `SAVE` then `QUIT`

- Look at `/data` again

]

- There should be a recent dump file now

---

## Getting the dump out of the container

- We could use many things:

  - s3cmd to copy to S3
  - SSH to copy to a remote host
  - gzip/bzip/etc before copying

- We'll just copy it to the Docker host

.exercise[

- Copy the file from `/data` to `/output`
  <br/>(example below)

- Exit the container

- Look into `/tmp/myredis` (on the host)

]
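
For instance (assuming Redis wrote its dump with the default file name):

```
cp /data/dump.rdb /output/
```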

---

# Logs

- Sorry, this part won't be hands-on

- Two strategies:

  - log to plain files on volumes

  - log to stdout
    <br/>(and use a logging driver)

---

## Logging to plain files on volumes

- Start a container with `-v /logs`

- Make sure that all log files are in `/logs`

- To check logs, run e.g.

  ```
  docker run --volumes-from ... ubuntu sh -c \
      "grep WARN /logs/*.log"
  ```

- Or just go interactive:

  ```
  docker run --volumes-from ... -ti ubuntu
  ```

- You can (and should) start a log shipper that way

---

## Logging to stdout

- All containers should write to stdout/stderr

- Docker will collect logs and pass them to a logging driver

- Available drivers:
  <br/>json-file (default), syslog, journald, gelf, fluentd

- Change the driver by passing the `--log-driver` option to the daemon
  <br/>(on Ubuntu, tweak `DOCKER_OPTS` in `/etc/default/docker`)

- For now, only json-file supports log retrieval
  <br/>(i.e. `docker logs`)

- Warning: json-file doesn't rotate logs by default
  <br/>(but this can be changed with `--log-opt`; see the example below)
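
For instance (a sketch; option names are those of the logging docs linked below):

```
docker run --log-driver syslog redis
docker run --log-opt max-size=10m --log-opt max-file=3 redis
```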

See: https://docs.docker.com/reference/logging/overview/

---

# Security upgrades

- This section is not hands-on

- Public Service Announcement

- We'll discuss:

  - how to upgrade the Docker daemon

  - how to upgrade container images

---

## Upgrading the Docker daemon

- Stop all containers cleanly
  <br/>(`docker ps -q | xargs docker stop`)

- Stop the Docker daemon

- Upgrade the Docker daemon

- Start the Docker daemon

- Start all containers
  <br/>(see the sketch below)

- This is like upgrading your Linux kernel,
  <br/>but it will get better
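
A minimal sketch on Ubuntu (assuming the `docker-engine` package name):

```
docker ps -q | xargs docker stop      # stop running containers cleanly
service docker stop
apt-get update && apt-get install docker-engine
service docker start
docker ps -qa | xargs docker start    # restarts *all* containers; filter as needed
```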

---

## Upgrading container images

- When a vulnerability is announced:

  - if it affects your base images,
    <br/>make sure they are fixed first

  - if it affects downloaded packages,
    <br/>make sure they are fixed first

  - re-pull base images

  - rebuild

  - restart containers

(The procedure is simple and plain, just follow it! A sketch is below.)
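
For our app, that could look like this (a sketch; add any other base images you use):

```
docker pull python && docker pull ruby   # re-pull base images
docker-compose build                     # rebuild
docker-compose up -d                     # restart containers
```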

---

# Network traffic analysis

- We still have `myredis` running

- We will use *shared network namespaces*
  <br/>to perform network analysis

- Two containers sharing the same network namespace...

  - have the same IP addresses

  - have the same network interfaces

  - `eth0` is therefore the same in both containers

---

## Install and start `ngrep`

Ngrep uses libpcap (like tcpdump) to sniff network traffic.

.exercise[

- Start a container with the same network namespace:
  <br/>`docker run --net container:myredis -ti alpine`

- Install ngrep:
  <br/>`apk update && apk add ngrep`

- Run ngrep:
  <br/>`ngrep -tpd eth0 -Wbyline . tcp`

]

You should see a stream of Redis requests and responses.

---

class: title

# Dynamic orchestration

---

## Static vs Dynamic

- Static

  - you decide what goes where

  - simple to describe and implement

  - seems easy at first, but doesn't scale efficiently

- Dynamic

  - the system decides what goes where

  - requires extra components (HA key-value store...)

  - scaling can be finer-grained, more efficient

---

## Mesos (overview)

- First presented in 2009

- Initial goal: resource scheduler
  <br/>(two-level/pessimistic)

  - top-level "master" knows the global cluster state

  - "slave" nodes report status and resources to the master

  - master allocates resources to "frameworks"

- Container support added recently
  <br/>(had to fit the existing model)

- Network and service discovery is complex

---

## Mesos (in practice)

- Easy to set up a test cluster (in containers!)

- Great to accommodate mixed workloads
  <br/>(see Marathon, Chronos, Aurora, and many more)

- "Meh" if you only want to run Docker containers

- In production on clusters of thousands of nodes

- Open source project; commercial support available

---

## Kubernetes (overview)

- 1 year old

- Designed specifically as a platform for containers
  <br/>("greenfield" design)

- "pods" = groups of containers sharing network/storage

- Scaling and HA managed by "replication controllers"

- Extensive use of "labels" instead of e.g. a tree hierarchy

- Initially designed around Docker,
  <br/>but doesn't hesitate to diverge in a few places

---

## Kubernetes (in practice)

- Network and service discovery is powerful, but complex
  <br/>.small[(different mechanisms within a pod, between pods, for inbound traffic...)]

- Initially designed around GCE
  <br/>.small[(currently relies on "native" features for fast networking and persistence)]

- Adaptation is needed when it differs from Docker
  <br/>.small[(need to learn new API, new tooling, new concepts)]

- Great deployment platform ...
  <br/>but no developer experience yet

---

## Swarm (in theory)

- Consolidates multiple Docker hosts into a single one

- "Looks like" a Docker daemon, but it dispatches (schedules)
  your containers on multiple daemons

- Talks the Docker API front and back
  <br/>(leverages the Docker API and ecosystem)

- Open source and written in Go (like Docker)

- Started by two of the original Docker authors
  <br/>([@aluzzardi](https://twitter.com/aluzzardi) and [@vieux](https://twitter.com/vieux))

---

## Swarm (in practice)

- Not stable yet (version 0.4 right now)

- OK for some scenarios (Jenkins, grid...)

- Not OK (yet.red[*]) for Compose build, links...

- We'll see it (briefly) in action

.footnote[.red[*]By "not OK" we mean "requires extra elbow grease."]

---

## PAAS on Docker

- The PAAS workflow: *just push code*
  <br/>(inspired by Heroku, dotCloud...)

- TL;DR: easier for devs, harder for ops,
  <br/>some very opinionated choices

- A few examples:
  <br/>(non-exhaustive list!)

  - Cloud Foundry
  - Deis
  - Dokku
  - Flynn
  - Tsuru

---

## A few other tools

- Flocker

  - manage/migrate stateful containers

- Powerstrip

  - sits in front of the Docker API

  - great to implement your own experiments

- Weave

  - overlay network so that containers can ping each other

... And many more!

---

class: pic

---

## Warning: here be dragons

- So far, we've used stable products (versions 1.X)

- We're going to explore experimental software

- **Use at your own risk**

---

# Hands-on Swarm

---

## Setting up our Swarm cluster

- This can be done manually or with **Docker Machine**

- Manual deployment:

  - with TLS: certificate generation is painful
    <br/>(needs dual-use certs)

  - without TLS: easier, but insecure
    <br/>(unless you run on your internal/private network)

- Docker Machine deployment:

  - generates keys and certificates, and deploys them for you

  - can also create VMs

---

## The Way Of The Machine

- Install `docker-machine` (single binary download)

- Set a few environment variables (cloud credentials)

- Create one or more machines:
  <br/>`docker-machine create -d digitalocean node42`

- List machines and their status:
  <br/>`docker-machine ls`

- Select a machine for use:
  <br/>`eval $(docker-machine env node42)`
  <br/>(this will set a few environment variables)

- Execute regular commands with Docker, Compose, etc.
  <br/>(they will pick up the remote host address from the environment)

---

## Docker Machine `generic` driver

- Most drivers work the same way:

  - use the cloud API to create an instance

  - connect to the instance over SSH

  - install Docker

- The `generic` driver skips the first step

- It can install Docker on any machine,
  <br/>as long as you have SSH access

- We will use that!

---

# Deploying Swarm

- Components involved:

  - service discovery mechanism
    <br/>(we'll use Docker's hosted system)

  - swarm manager
    <br/>(runs on `node1`, exposes the Docker API)

  - swarm agent
    <br/>(runs on each node, registers it with service discovery)

---

# Cluster discovery

- Possible backends:

  - dynamic, self-hosted (zk, etcd, consul)

  - static (command-line or file)

  - hosted by Docker (token)

- We will use the token mechanism

.exercise[

- Run `TOKEN=$(docker run swarm create)`
- Save `$TOKEN` carefully: it's your token
  <br/>(it's the unique identifier for your cluster)

]

---

## Swarm agent

- Used only for dynamic discovery (zk, etcd, consul, token)

- Must run on each node

- Every 20s (by default), tells the discovery system:
  <br/>"Hello, there is a Swarm node at A.B.C.D:EFGH"

- Must know the node's IP address
  <br/>(sorry, it can't figure it out by itself, because
  <br/>it doesn't know whether to use public or private addresses)

- The node continues to work even if the agent dies

- Automatically started by Docker Machine
  <br/>(when the `--swarm` option is passed)

---

## Swarm manager

- Today: must run on the leader node

- Later: can run on multiple nodes, with leader election

- Automatically started by Docker Machine
  <br/>(when the `--swarm-master` option is passed)

.exercise[

- Connect to `node1`

- "Create" a node with Docker Machine

.small[
```
docker-machine create node1 --driver generic \
    --swarm --swarm-master --swarm-discovery token://$TOKEN \
    --generic-ssh-user docker --generic-ip-address 1.2.3.4
```
]

]

(Don't forget to replace 1.2.3.4 with the node IP address!)

---

## Check our node

Let's connect to the node *individually*.

.exercise[

- Select the node with Machine:

  ```
  eval $(docker-machine env node1)
  ```

- Execute some Docker commands:

  ```
  docker version
  docker info
  docker ps
  ```

]

Two containers should show up: the agent and the manager.

---

## Check our (single-node) Swarm cluster

Let's connect to the manager instead.

.exercise[

- Select the Swarm manager with Machine:

  ```
  eval $(docker-machine env node1 --swarm)
  ```

- Execute some Docker commands:

  ```
  docker version
  docker info
  docker ps
  ```

]

The output is different! Let's review this.

---

## `docker version`

Swarm identifies itself clearly:

```
Client:
 Version:      1.8.2
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   0a8c2e3
 Built:        Thu Sep 10 19:19:00 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      swarm/0.4.0
 API version:  1.16
 Go version:   go1.4.2
 Git commit:   d647d82
 Built:
 OS/Arch:      linux/amd64
```

---

## `docker info`

Swarm gives cluster information, showing all nodes:

```
Containers: 3
Images: 6
Role: primary
Strategy: spread
Filters: affinity, health, constraint, port, dependency
Nodes: 1
 node: 52.89.117.68:2376
  └ Containers: 3
  └ Reserved CPUs: 0 / 2
  └ Reserved Memory: 0 B / 3.86 GiB
  └ Labels: executiondriver=native-0.2,
    kernelversion=3.13.0-53-generic,
    operatingsystem=Ubuntu 14.04.2 LTS,
    provider=generic, storagedriver=aufs
CPUs: 2
Total Memory: 3.86 GiB
Name: 2ec2e6c4054e
```

---

## `docker ps`

- This one should show nothing at this point.

- The Swarm containers are hidden.

- This avoids unneeded pollution.

- This also avoids killing them by mistake.

---

## Add other nodes to the cluster

- Let's use *almost* the same command line
  <br/>(but without `--swarm-master`)

.exercise[

- Stay on `node1` (it has keys and certificates now!)

- Add another node with Docker Machine

.small[
```
docker-machine create node2 --driver generic \
    --swarm --swarm-discovery token://$TOKEN \
    --generic-ssh-user docker --generic-ip-address 1.2.3.4
```
]
]

Remember to update the IP address correctly.

Repeat for all 4 nodes.

Pro tip: look for name/address mapping in `/etc/hosts`.

---

## Scripting

To help you a little bit:

```
grep node[2345] /etc/hosts | grep -v ^127 |
while read IPADDR NODENAME
do docker-machine create $NODENAME --driver generic \
    --swarm --swarm-discovery token://$TOKEN \
    --generic-ssh-user docker \
    --generic-ip-address $IPADDR \
    </dev/null
done
```

Fun fact: Machine drains stdin.

That's why we use `</dev/null` here.

<!---
Let's fix Markdown coloring with this one weird trick!
-->

---

## Running containers on Swarm

Try to run a few `busybox` containers.

Then, let's get serious:

.exercise[

- Start a Redis service:
  <br/>`docker run -dP redis`

- See the service address:
  <br/>`docker port $(docker ps -lq) 6379`

]

This can be any of your five nodes.

---

# Building our app on Swarm

- Swarm has partial support for builds

- .icon[] Older versions of Compose would crash on builds

- Try it!

.exercise[

- Run `docker-compose build` once ...

- Run `docker-compose build` twice ...

- What happened?

]

---

## Re-thinking the build process

- Let's step back and think for a minute ...

- What should `docker build` do on Swarm?

  - build on one machine

  - build everywhere ($$$)

- After the build, what should `docker run` do?

  - run where we built (how do we know where it is?)

  - run on any machine that has the image

- What do, what do‽

---

## The plan

- Build locally

- Tag images

- Upload them to the hub

- Update the Compose file to use those images

*That's the purpose of the `build-tag-push.py` script!*

---

## Build, Tag, And Push

Let's inspect the source code of `build-tag-push.py` and run it.

.icon[] It is better to run it against a single node!

(There are some race conditions within Swarm when building+pushing too fast.)

.exercise[

- Point to a single node:
  <br/>`eval $(docker-machine env node1)`

- Run the script (from the `dockercoins` directory):
  <br/>`../build-tag-push.py`

- Inspect the `docker-compose.yml-XXX` file that it created

]

---

## Can we run this now?

Let's try!

.exercise[

- Switch back to the Swarm cluster:
  <br/>`eval $(docker-machine env node1 --swarm)`

- Bring up the application:
  <br/>`docker-compose -f docker-compose.yml-XXX up`

]

--

It won't work, because Compose and Swarm do not collaborate
to establish *placement constraints*.

--

(╯°□°)╯︵ ┻━┻

---

## Simple container dependencies

- Container A has a link to container B

- Compose starts B first, then A

- Swarm translates the link into a placement constraint:

  - *"put A on the same node as B"*

- All good

---

## Complex container dependencies

- Container A has a link to containers B and C

- Compose starts B and C first
  <br/>(but that can be on different nodes!)

- Compose starts A

- Swarm translates the links into placement constraints:

  - *"put A on the same node as B"*
  - *"put A on the same node as C"*

- If B and C are on different nodes, that's impossible

So, what do‽

---

## A word on placement constraints

- Swarm supports constraints

- We could tell Swarm to put all our containers together

- Linking would work

- But all containers would end up on the same node

--

- So having a cluster would be pointless!

---

# Network plumbing on Swarm

- We will use one-tier, dynamic ambassadors
  <br/>(as seen before)

- Other available options:

  - injecting service addresses in environment variables

  - implementing service discovery in the application

  - using Docker Engine Experimental + network plugins
    <br/>(or any other overlay network like Weave or Pipework)

---

## Revisiting `jpetazzo/hamba`

- Configuration is stored in a *volume*

- A watcher process looks for configuration updates,
  <br/>and restarts HAProxy when needed

- It can be started without configuration:

  ```
  docker run --name amba jpetazzo/hamba run
  ```

- There is a helper to inject a new configuration:

  ```
  docker run --rm --volumes-from amba jpetazzo/hamba \
      reconfigure 80 backend1 port1 backend2 port2 ...
  ```

.footnote[Note: configuration validation and error messages
will be logged by the ambassador, not the `reconfigure` container.]

---

## Should we use `links` for our ambassadors?

Technically, we could use links.

- Before starting an app container:

  start the ambassador(s) it needs

- When starting an app container:

  link it to its ambassador(s)

But we wouldn't be able to use `docker-compose scale` anymore.

---

## Network namespaces and `extra_hosts`

This is our plan:

- Replace each `link` with an `extra_host`,
  <br/>pointing to the `127.127.X.X` address space

- Start app containers normally
  <br/>(`docker-compose up`, `docker-compose scale`)

- Start ambassadors after app containers are up:

  - ambassadors bind to `127.127.X.X`

  - they share their client's network namespace

- Reconfigure ambassadors each time something changes

---

## Our plan for service discovery

- Replace all `links` with static `/etc/hosts` entries

- Those entries will map to `127.127.0.X`
  <br/>(with a different `X` for each service)

- Example: `redis` will point to `127.127.0.2`
  <br/>(instead of a container address; see the sketch below)

- Start all services; scale them if we want
  <br/>(at this point, they will all fail to connect)

- Start ambassadors in the services' namespace;
  <br/>each ambassador will listen on the right `127.127.0.X`

- Gather all backend addresses and configure ambassadors

.icon[] Services should try to reconnect!
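
After conversion, the `worker` section could look like this (a sketch;
the exact addresses are allocated by the tooling and may differ):

```
worker:
  build: worker
  extra_hosts:
    redis: 127.127.0.2
    rng: 127.127.0.3
    hasher: 127.127.0.4
```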

---

## "Design for failure," they said
|
||
|
||
- When the containers are started, the network is not ready
|
||
|
||
- First connection attempts **will fail**
|
||
|
||
- App should try to reconnect
|
||
|
||
- It is OK to crash and restart
|
||
|
||
- Exponential back-off is nice
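
A minimal sketch of such a retry loop (assuming `redis-cli` is available
in the image, and that `redis` resolves as described above):

```
delay=1
until redis-cli -h redis ping; do
  sleep $delay
  delay=$((delay * 2))   # exponential back-off
done
```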

---

## Our tools

- `link-to-ambassadors.py`

  - replaces all `links` with `extra_hosts` entries

- `create-ambassadors.py`

  - scans running containers
  - allocates `127.127.X.X` addresses
  - starts (unconfigured) ambassadors

- `configure-ambassadors.py`

  - scans running containers
  - gathers backend addresses
  - sends configuration to ambassadors

---

## Convert links to ambassadors

- When we ran `build-tag-push.py` earlier,
  <br/>it generated a new `docker-compose.yml-XXX` file.

.exercise[

- Run the first script to create a new YAML file:
  <br/>`../link-to-ambassadors.py docker-compose.yml-XXX a.yml`

- Look how the file was modified:
  <br/>`diff docker-compose.yml-XXX a.yml`

]

The script can take one or two file name arguments:

- two arguments indicate the input and output files to use;
- with one argument, the file will be modified in place.

---

## Bring up the application

The application can now be started and scaled.

Remember to use the *new* YAML file!

.exercise[

- Start the application:
  <br/>`docker-compose -f a.yml up -d`

- Scale the application:
  <br/>`docker-compose -f a.yml scale worker=5 rng=10`

]

Note: you can scale everything as you like, *except Redis*,
because it is stateful.

---

## Create the ambassadors

This has to be executed each time you create new services
or scale up existing ones.

The script takes the YAML file as its only argument.

It will scan and compare:

- the list of app containers,
- the list of ambassadors.

It will create missing ambassadors.

.exercise[

- Run the script!
  <br/>`../create-ambassadors.py a.yml`

]

---

## Configure the ambassadors

All ambassadors are created but they still need configuration.

That's the purpose of the last script.

It will gather:

- the list of app backends,
- the list of ambassadors.

Then it configures all ambassadors with all found backends.

.exercise[

- Run it!
  <br/>`../configure-ambassadors.py a.yml`

]

---

## Check what we did

.exercise[

- Find out the address of the web UI:
  <br/>`docker-compose ps webui`

- Point your browser to it

- Check the logs:
  <br/>`docker-compose logs`

]

---

# Going further

Scaling the application (difficulty: easy)

- Run `docker-compose scale`

- Re-create ambassadors

- Re-configure ambassadors

- No downtime (see the commands below)
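
Putting it together (with the scripts we used earlier; the scale factor is just an example):

```
docker-compose -f a.yml scale worker=8
../create-ambassadors.py a.yml
../configure-ambassadors.py a.yml
```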

---

## Going further

Deploying a new version (difficulty: easy)

- Just re-run all the steps!

- However, Compose will re-create the containers

- You will have to re-create ambassadors
  <br/>(and configure them)

- You will have to clean up old ambassadors
  <br/>(left as an exercise for the reader)

- You will experience a little bit of downtime

---

## Going further

Zero-downtime deployment (difficulty: medium)

- Isolate stateful services
  <br/>(like we did earlier for Redis)

- Do blue/green deployment:

  - deploy and scale version N

  - point a "top-level" load balancer to the app
    <br/>(e.g. `jpetazzo/hamba`, as sketched below)

  - deploy and scale version N+1

  - put both apps in the "top-level" balancer

  - slowly switch traffic over to app version N+1
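
With `jpetazzo/hamba`, the top-level balancer could be started like this
(a sketch; `blue.addr`/`green.addr` and port 8000 are placeholders for the
two deployed web UIs):

```
docker run -d -p 80:80 jpetazzo/hamba 80 blue.addr 8000 green.addr 8000
```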

---

## Going further

Harder projects:

- Try two-tier or three-tier ambassador deployments

- Try overlay networking instead of ambassadors

- Try to deploy Mesos or Kubernetes

---

class: title

# Thanks! <br/> Questions?

### [@jpetazzo](https://twitter.com/jpetazzo) <br/> [@docker](https://twitter.com/docker)

</textarea>
<script src="https://gnab.github.io/remark/downloads/remark-0.5.9.min.js" type="text/javascript">
</script>
<script type="text/javascript">
  var slideshow = remark.create();
</script>
</body>
</html>