<!DOCTYPE html>
<html>
<head>
<base target="_blank">
<title>Docker Orchestration Workshop</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<style type="text/css">
@import url(https://fonts.googleapis.com/css?family=Yanone+Kaffeesatz);
@import url(https://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic);
@import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic);
body { font-family: 'Droid Serif'; }
h1, h2, h3 {
font-family: 'Yanone Kaffeesatz';
font-weight: normal;
margin-top: 0.5em;
}
a {
text-decoration: none;
color: blue;
}
.remark-slide-content { padding: 1em 2.5em 1em 2.5em; }
.remark-slide-content { font-size: 25px; }
.remark-slide-content h1 { font-size: 50px; }
.remark-slide-content h2 { font-size: 50px; }
.remark-slide-content h3 { font-size: 25px; }
.remark-code { font-size: 25px; }
.small .remark-code { font-size: 16px; }
.remark-code, .remark-inline-code { font-family: 'Ubuntu Mono'; }
.red { color: #fa0000; }
.gray { color: #ccc; }
.small { font-size: 70%; }
.big { font-size: 140%; }
.underline { text-decoration: underline; }
.pic {
vertical-align: middle;
text-align: center;
padding: 0 0 0 0 !important;
}
img {
max-width: 100%;
max-height: 550px;
}
.title {
vertical-align: middle;
text-align: center;
}
.title h1 { font-size: 100px; }
.title p { font-size: 100px; }
.quote {
background: #eee;
border-left: 10px solid #ccc;
margin: 1.5em 10px;
padding: 0.5em 10px;
quotes: "\201C""\201D""\2018""\2019";
font-style: italic;
}
.quote:before {
color: #ccc;
content: open-quote;
font-size: 4em;
line-height: 0.1em;
margin-right: 0.25em;
vertical-align: -0.4em;
}
.quote p {
display: inline;
}
.warning {
background-image: url("warning.png");
background-size: 1.5em;
background-repeat: no-repeat;
padding-left: 2em;
}
.exercise {
background-color: #eee;
background-image: url("keyboard.png");
background-size: 1.4em;
background-repeat: no-repeat;
background-position: 0.2em 0.2em;
border: 2px dotted black;
}
.exercise::before {
content: "Exercise";
margin-left: 1.8em;
}
li p { line-height: 1.25em; }
</style>
</head>
<body>
<textarea id="source">
class: title
Docker <br/> Orchestration <br/> Workshop
---
## Intros
- Hello! We are:
AJ ([@s0ulshake](https://twitter.com/s0ulshake))
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo))
Tiffany ([@tiffanyfayj](https://twitter.com/tiffanyfayj))
<!--
Reminder, when updating the agenda: when people are told to show
up at 9am, they usually trickle in until 9:30am (except for paid
training sessions). If you're not sure that people will be there
on time, it's a good idea to have a breakfast with the attendees
at e.g. 9am, and start at 9:30.
-->
---
## Agenda
<!--
- Agenda:
.small[
- 09:00-09:15 hello
- 09:15-10:45 part 1
- 10:45-11:00 coffee break
- 11:00-12:30 part 2
- 12:30-13:30 lunch break
- 13:30-15:00 part 3
- 15:00-15:15 coffee break
- 15:15-16:45 part 4
- 16:45-17:30 Q&A
]
-->
- The tutorial will run from 1pm to 5pm
- This will be fast-paced, but DON'T PANIC!
- We will do short breaks for coffee + QA every hour
- All the content is publicly available (slides, code samples, scripts)
- Live feedback, questions, help on
[Gitter](http://container.training/chat)
<!--
Remember to change:
- the link below
- the "tweet my speed" hashtag in DockerCoins HTML
-->
---
<!--
grep '^# ' index.html | grep -v '<br' | tr '#' '-'
-->
## Chapter 1: getting started
- Pre-requirements
- VM environment
- Our sample application
- Running the application
- Identifying bottlenecks
- Introducing SwarmKit
---
## Chapter 2: scaling out our app on Swarm
- Creating our first Swarm
- Running our first Swarm service
- Deploying a local registry
- Overlay networks
---
## Chapter 3: operating the Swarm
- Breaking into an overlay network
- Rolling updates
- Centralized logging
- Setting up ELK to store container logs
---
## Chapter 4: deeper in Swarm
(Additional/optional/bonus content!)
- Dealing with stateful services
- Scripting image building and pushing
- Distributed Application Bundles
- Controlling Docker from a container
- Node management
---
## Chapter 5: metrics
- Setting up Snap to collect and publish metric data
- Using InfluxDB and Grafana for storage and display
---
# Pre-requirements
- Computer with network connection and SSH client
- on Linux, OS X, FreeBSD... you are probably all set
- on Windows, get [putty](http://www.putty.org/),
Microsoft [Win32 OpenSSH](https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH),
[Git BASH](https://git-for-windows.github.io/), or
[MobaXterm](http://mobaxterm.mobatek.net/)
- Some Docker knowledge
(but that's OK if you're not a Docker expert!)
???
## Nice-to-haves
- [GitHub](https://github.com/join) account
<br/>(if you want to fork the repo; also used to join Gitter)
- [Gitter](https://gitter.im/) account
<br/>(to join the conversation during the workshop)
- [Docker Hub](https://hub.docker.com) account
<br/>(it's one way to distribute images on your Swarm cluster)
---
## Hands-on sections
- The whole workshop is hands-on
- I will show Docker 1.12 in action
- I invite you to reproduce what I do
- All hands-on sections are clearly identified, like the gray rectangle below
.exercise[
- This is the stuff you're supposed to do!
- Go to [container.training](http://container.training/) to view these slides
- Join the chat room on
[Gitter](http://container.training/chat)
]
---
# VM environment
- Each person gets 5 private VMs (not shared with anybody else)
- They'll be up until tonight
- You have a little card with login+password+IP addresses
- You can automatically SSH from one VM to another
.exercise[
<!--
```bash
for N in $(seq 1 5); do
ssh -o StrictHostKeyChecking=no node$N true
done
for N in $(seq 1 5); do
(
docker-machine rm -f node$N
ssh node$N "docker ps -aq | xargs -r docker rm -f"
ssh node$N sudo rm -f /etc/systemd/system/docker.service
ssh node$N sudo systemctl daemon-reload
echo Restarting node$N.
ssh node$N sudo systemctl restart docker
echo Restarted node$N.
) &
done
wait
```
-->
- Log into the first VM (`node1`)
- Check that you can SSH (without password) to `node2`:
```bash
ssh node2
```
- Type `exit` or `^D` to come back to node1
<!--
```meta
^D
```
-->
]
---
## We will (mostly) interact with node1 only
- Unless instructed, **all commands must be run from the first VM, `node1`**
- We will only check out (or copy) the code on `node1`
- When we use the other nodes, we will do it mostly through the Docker API
- We will use SSH only for the initial setup and a few "out of band" operations
<br/>(checking internal logs, debugging...)
---
## Terminals
Once in a while, the instructions will say:
<br/>"Open a new terminal."
There are multiple ways to do this:
- create a new window or tab on your machine, and SSH into the VM;
- use screen or tmux on the VM and open a new window from there.
You are welcome to use the method that you feel the most comfortable with.
---
## Tmux cheatsheet
- Ctrl-b c → creates a new window
- Ctrl-b n → go to next window
- Ctrl-b p → go to previous window
- Ctrl-b " → split window top/bottom
- Ctrl-b % → split window left/right
- Ctrl-b Alt-1 → rearrange windows in columns
- Ctrl-b Alt-2 → rearrange windows in rows
- Ctrl-b arrows → navigate to other windows
- Ctrl-b d → detach session
- tmux attach → reattach to session
---
## Brand new versions!
- Engine 1.12.2-rc1
- Compose 1.8.1
.exercise[
- Check all installed versions:
```bash
docker version
docker-compose -v
```
]
---
# Our sample application
- Visit the GitHub repository with all the materials of this workshop:
<br/>https://github.com/jpetazzo/orchestration-workshop
- The application is in the [dockercoins](
https://github.com/jpetazzo/orchestration-workshop/tree/master/dockercoins)
subdirectory
- Let's look at the general layout of the source code:
there is a Compose file [docker-compose.yml](
https://github.com/jpetazzo/orchestration-workshop/blob/master/dockercoins/docker-compose.yml) ...
... and 4 other services, each in its own directory:
- `rng` = web service generating random bytes
- `hasher` = web service computing hash of POSTed data
- `worker` = background process using `rng` and `hasher`
- `webui` = web interface to watch progress
???
## Compose file format version
*Particularly relevant if you have used Compose before...*
- Compose 1.6 introduced support for a new Compose file format (aka "v2")
- Services are no longer at the top level, but under a `services` section
- There has to be a `version` key at the top level, with value `"2"` (as a string, not an integer)
- Containers are placed on a dedicated network, making links unnecessary
- There are other minor differences, but upgrade is easy and straightforward
---
## Links, naming, and service discovery
- Containers can have network aliases (resolvable through DNS)
- Compose file version 2 makes each container reachable through its service name
- Compose file version 1 requires "links" sections
- Our code can connect to services using their short name
(instead of e.g. IP address or FQDN)
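For instance, once the app is running (see "Running the application" below), a quick way to see this name resolution in action is to resolve `rng` from inside the `worker` container; a small sketch, assuming `docker-compose exec` is available (it is, since Compose 1.7):
```bash
# the worker image is Python-based, so we can resolve the service name with it
docker-compose exec worker \
  python -c 'import socket; print(socket.gethostbyname("rng"))'
```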
---
## Example in `worker/worker.py`
![Service discovery](service-discovery.png)
---
## What's this application?
---
class: pic
![DockerCoins logo](dockercoins.png)
(DockerCoins 2016 logo courtesy of [@XtlCnslt](https://twitter.com/xtlcnslt) and [@ndeloof](https://twitter.com/ndeloof). Thanks!)
---
## What's this application?
- It is a DockerCoin miner! 💰🐳📦🚢
- No, you can't buy coffee with DockerCoins
- How DockerCoins works:
- `worker` asks `rng` to give it random bytes
- `worker` feeds those random bytes into `hasher`
- each hash starting with `0` is a DockerCoin
- DockerCoins are stored in `redis`
- `redis` is also updated every second to track speed
- you can see the progress with the `webui`
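To make this concrete, here is a hand-driven sketch of one worker iteration (to try once the app is up; the Compose file publishes `rng` and `hasher` on ports 8001 and 8002, which we will also use later with `httping`):
```bash
# get 10 random bytes from rng, then POST them to hasher;
# if the resulting hash starts with "0", the worker counts one DockerCoin
curl -s localhost:8001/10 > /tmp/random
curl -s -X POST -H "Content-Type: application/octet-stream" \
     --data-binary @/tmp/random localhost:8002/
```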
---
## Getting the application source code
- We will clone the GitHub repository
- The repository also contains scripts and tools that we will use through the workshop
.exercise[
<!--
```bash
[ -d orchestration-workshop ] && mv orchestration-workshop orchestration-workshop.$$
```
-->
- Clone the repository on `node1`:
```bash
git clone git://github.com/jpetazzo/orchestration-workshop
```
]
(You can also fork the repository on GitHub and clone your fork if you prefer that.)
---
# Running the application
Without further ado, let's start our application.
.exercise[
- Go to the `dockercoins` directory, in the cloned repo:
```bash
cd ~/orchestration-workshop/dockercoins
```
- Use Compose to build and run all containers:
```bash
docker-compose up
```
]
Compose tells Docker to build all container images (pulling
the corresponding base images), then starts all containers,
and displays aggregated logs.
---
## Lots of logs
- The application continuously generates logs
- We can see the `worker` service making requests to `rng` and `hasher`
- Let's put that in the background
.exercise[
- Stop the application by hitting `^C`
<!--
```meta
^C
```
-->
]
- `^C` stops all containers by sending them the `TERM` signal
- Some containers exit immediately, others take longer
<br/>(because they don't handle `SIGTERM` and end up being killed after a 10s timeout)
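(If that 10 second grace period is too long for your taste, Compose can shorten it; a small sketch using the `-t` flag of `docker-compose stop`:)
```bash
# stop the app, giving containers only 1 second to exit before being killed
docker-compose stop -t 1
```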
---
## Restarting in the background
- Many flags and commands of Compose are modeled after those of `docker`
.exercise[
- Start the app in the background with the `-d` option:
```bash
docker-compose up -d
```
- Check that our app is running with the `ps` command:
```bash
docker-compose ps
```
]
`docker-compose ps` also shows the ports exposed by the application.
???
## Viewing logs
- The `docker-compose logs` command works like `docker logs`
.exercise[
- View all logs since container creation and exit when done:
```bash
docker-compose logs
```
- Stream container logs, starting at the last 10 lines for each container:
```bash
docker-compose logs --tail 10 --follow
```
<!--
```meta
^C
```
-->
]
Tip: use `^S` and `^Q` to pause/resume log output.
???
## Upgrading from Compose 1.6
.warning[The `logs` command has changed between Compose 1.6 and 1.7!]
- Up to 1.6
- `docker-compose logs` is the equivalent of `logs --follow`
- `docker-compose logs` must be restarted if containers are added
- Since 1.7
- `--follow` must be specified explicitly
- new containers are automatically picked up by `docker-compose logs`
---
## Connecting to the web UI
- The `webui` container exposes a web dashboard; let's view it
.exercise[
- Open http://[yourVMaddr]:8000/ (from a browser)
]
- The app actually has a constant, steady speed (3.33 coins/second)
- The speed seems not-so-steady because:
- the worker doesn't update the counter after every loop, but up to once per second
- the speed is computed by the browser, checking the counter about once per second
- between two consecutive updates, the counter will increase either by 4, or by 0
---
## Scaling up the application
- Our goal is to make that performance graph go up (without changing a line of code!)
???
- Before trying to scale the application, we'll figure out if we need more resources
(CPU, RAM...)
- For that, we will use good old UNIX tools on our Docker node
???
## Looking at resource usage
- Let's look at CPU, memory, and I/O usage
.exercise[
- run `top` to see CPU and memory usage (you should see idle cycles)
- run `vmstat 3` to see I/O usage (si/so/bi/bo)
<br/>(the 4 numbers should be almost zero, except `bo` for logging)
]
We have available resources.
- Why?
- How can we use them?
---
## Scaling workers on a single node
- Docker Compose supports scaling
- Let's scale `worker` and see what happens!
.exercise[
- Start one more `worker` container:
```bash
docker-compose scale worker=2
```
- Look at the performance graph (it should show a x2 improvement)
- Look at the aggregated logs of our containers (`worker_2` should show up)
- Look at the impact on CPU load with e.g. top (it should be negligible)
]
---
## Adding more workers
- Great, let's add more workers and call it a day, then!
.exercise[
- Start eight more `worker` containers:
```bash
docker-compose scale worker=10
```
- Look at the performance graph: does it show a x10 improvement?
- Look at the aggregated logs of our containers
- Look at the impact on CPU load and memory usage
<!--
```bash
sleep 5
killall docker-compose
```
-->
]
---
# Identifying bottlenecks
- You should have seen a 3x speed bump (not 10x)
- Adding workers didn't result in linear improvement
- *Something else* is slowing us down
--
- ... But what?
--
- The code doesn't have instrumentation
- Let's use state-of-the-art HTTP performance analysis!
<br/>(i.e. good old tools like `ab`, `httping`...)
---
## Measuring latency under load
We will use `httping`.
.exercise[
- Check the latency of `rng`:
```bash
httping -c 10 localhost:8001
```
- Check the latency of `hasher`:
```bash
httping -c 10 localhost:8002
```
]
`rng` has a much higher latency than `hasher`.
---
## Let's draw hasty conclusions
- The bottleneck seems to be `rng`
- *What if* we don't have enough entropy and can't generate enough random numbers?
- We need to scale out the `rng` service on multiple machines!
Note: this is a fiction! We have enough entropy. But we need a pretext to scale out.
<br/>(In fact, the code of `rng` uses `/dev/urandom`, which doesn't need entropy.)
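If you are curious, you can peek at the kernel's entropy estimate on the node (purely informational; `/dev/urandom` never blocks, regardless of this number):
```bash
cat /proc/sys/kernel/random/entropy_avail
```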
---
## Clean up
- Before moving on, let's remove those containers
.exercise[
- Tell Compose to remove everything:
```bash
docker-compose down
```
]
---
class: title
# Scaling out
---
# SwarmKit
- [SwarmKit](https://github.com/docker/swarmkit) is an open source
toolkit to build multi-node systems
- It is a reusable library, like libcontainer, libnetwork, vpnkit ...
- It is a plumbing part of the Docker ecosystem
- SwarmKit comes with two examples:
- `swarmctl` (a CLI tool to "speak" the SwarmKit API)
- `swarmd` (an agent that can federate existing Docker Engines into a Swarm)
- SwarmKit/swarmd/swarmctl → libcontainer/containerd/container-ctr
---
## SwarmKit features
- Highly-available, distributed store based on Raft
<br/>(more on next slide)
- *Services* managed with a *declarative API*
<br/>(implementing *desired state* and *reconciliation loop*)
- Automatic TLS keying and signing
- Dynamic promotion/demotion of nodes, allowing you to change
how many nodes are actively part of the Raft consensus
- Integration with overlay networks and load balancing
- And much more!
---
## Where is the key/value store?
- Many other orchestration systems use a key/value store
<br/>
(k8s→etcd, mesos→zookeeper, etc.)
- SwarmKit stores information directly in Raft
<br/>
(Nomad is similar; thanks [@cbednarski](https://twitter.com/@cbednarski),
[@diptanu](https://twitter.com/diptanu) and others for pointing it out!)
- Analogy courtesy of [@aluzzardi](https://twitter.com/aluzzardi):
*It's like B-Trees and RDBMS. They are different layers, often
associated. But you don't need to bring up a full SQL server when
all you need is to index some data.*
- As a result, the orchestrator has direct access to the data
<br/>
(the main copy of the data is stored in the orchestrator's memory)
- Simpler, easier to deploy and operate; also faster
---
## SwarmKit concepts (1/2)
- A *cluster* is made of at least one *node* (preferably more)
- A *node* can be a *manager* or a *worker*
(Note: in SwarmKit, *managers* are also *workers*)
- A *manager* actively takes part in the Raft consensus
- You can talk to a *manager* using the SwarmKit API
- One *manager* is elected as the *leader*; other managers merely forward requests to it
---
## SwarmKit concepts (2/2)
- The *managers* expose the SwarmKit API
- Using the API, you can indicate that you want to run a *service*
- A *service* is specified by its *desired state*: which image, how many instances...
- The *leader* uses different subsystems to break down services into *tasks*:
<br/>orchestrator, scheduler, allocator, dispatcher
- A *task* corresponds to a specific container, assigned to a specific *node*
- *Nodes* know which *tasks* should be running, and will start or stop containers accordingly (through the Docker Engine API)
You can refer to the [NOMENCLATURE](https://github.com/docker/swarmkit/blob/master/design/nomenclature.md) in the SwarmKit repo for more details.
---
## Swarm Mode
- Docker Engine 1.12 features SwarmKit integration
- The Docker CLI features three new commands:
- `docker swarm` (enable Swarm mode; join a Swarm; adjust cluster parameters)
- `docker node` (view nodes; promote/demote managers; manage nodes)
- `docker service` (create and manage services)
- The Docker API exposes the same concepts
- The SwarmKit API is also exposed (on a separate socket)
???
## Illustration
![Illustration](swarm-mode.png)
---
## You need to enable Swarm mode to use the new stuff
- By default, everything runs as usual
- Swarm Mode can be enabled, "unlocking" SwarmKit functions
<br/>(services, out-of-the-box overlay networks, etc.)
.exercise[
- Try a Swarm-specific command:
```
$ docker node ls
Error response from daemon: This node is not a swarm manager. [...]
```
]
---
# Creating our first Swarm
- The cluster is initialized with `docker swarm init`
- This should be executed on a first, seed node
- .warning[DO NOT execute `docker swarm init` on multiple nodes!]
You would have multiple disjoint clusters.
.exercise[
- Create our cluster from node1:
```bash
docker swarm init
```
]
---
## Token generation
- In the output of `docker swarm init`, we have a message
confirming that our node is now the (single) manager:
```
Swarm initialized: current node (8jud...) is now a manager.
```
- Docker generated two security tokens (like passphrases or passwords) for our cluster
- The CLI shows us the command to use on other nodes to add them to the cluster using the "worker"
security token:
```
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-59fl4ak4nqjmao1ofttrc4eprhrola2l87... \
172.31.4.182:2377
```
---
## Checking that Swarm mode is enabled
.exercise[
- Run the traditional `docker info` command:
```bash
docker info
```
]
The output should include:
```
Swarm: active
NodeID: 8jud7o8dax3zxbags3f8yox4b
Is Manager: true
ClusterID: 2vcw2oa9rjps3a24m91xhvv0c
...
```
---
## Running our first Swarm mode command
- Let's retry the exact same command as earlier
.exercise[
- List the nodes (well, the only node) of our cluster:
```bash
docker node ls
```
]
The output should look like the following:
```
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
8jud...ox4b * ip-172-31-4-182 Ready Active Leader
```
---
## Adding nodes to the Swarm
- A cluster with one node is not a lot of fun
- Let's add `node2`!
- We need the token that was shown earlier
--
- You wrote it down, right?
--
- Don't panic, we can easily see it again 😏
---
## Adding nodes to the Swarm
.exercise[
- Show the token again:
```bash
docker swarm join-token worker
```
- Log into `node2`:
```bash
ssh node2
```
- Copy paste the `docker swarm join ...` command
<br/>(that was displayed just before)
]
???
## Check that the node was added correctly
- Stay logged into `node2`!
.exercise[
- We can still use `docker info` to verify that the node is part of the Swarm:
```bash
$ docker info | grep ^Swarm
```
]
- However, Swarm commands will not work; try, for instance:
```
docker node ls
```
- This is because the node that we added is currently a *worker*
- Only *managers* can accept Swarm-specific commands
---
## View our two-node cluster
- Let's go back to `node1` and see what our cluster looks like
.exercise[
- Logout from `node2` (with `exit` or `Ctrl-D` or ...)
- View the cluster from `node1`, which is a manager:
```bash
docker node ls
```
]
The output should be similar to the following:
```
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
8jud...ox4b * ip-172-31-4-182 Ready Active Leader
ehb0...4fvx ip-172-31-4-180 Ready Active
```
---
## Adding nodes using the Docker API
- We don't have to SSH into the other nodes, we can use the Docker API
- Our nodes (for this workshop) expose the Docker API over port 55555,
without authentication (DO NOT DO THIS IN PRODUCTION; FOR EDUCATIONAL USE ONLY)
.exercise[
- Set `DOCKER_HOST` and add `node3` to the Swarm:
```bash
DOCKER_HOST=tcp://node3:55555 docker swarm join \
--token $(docker swarm join-token -q worker) node1:2377
```
- Check that the node is here:
```bash
docker node ls
```
]
---
## Under the hood
When we do `docker swarm init`, a TLS root CA is created. Then a keypair is issued for the first node, and signed by the root CA.
When further nodes join the Swarm, they are issued their own keypair, signed by the root CA, and they also receive the root CA public key and certificate.
All communication is encrypted over TLS.
The node keys and certificates are automatically renewed on regular intervals
<br/>(by default, 90 days; this is tunable with `docker swarm update`).
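For example, a sketch of changing that rotation period (flag name as of Engine 1.12; check `docker swarm update --help` on your version):
```bash
# rotate node certificates every 30 days instead of the default 90
docker swarm update --cert-expiry 720h
```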
---
## Adding more manager nodes
- Right now, we have only one manager (node1)
- If we lose it, we're SOL
- Let's make our cluster highly available
.exercise[
- Add nodes 4 and 5 to the cluster as *managers* (instead of simple *workers*):
```bash
for N in 4 5; do
DOCKER_HOST=tcp://node$N:55555 docker swarm join \
--token $(docker swarm join-token -q manager) node1:2377
done
```
]
---
## You can control the Swarm from any manager node
.exercise[
- Try the following command on a few different nodes:
```bash
ssh nodeX docker node ls
```
]
On manager nodes:
<br/>you will see the list of nodes, with a `*` denoting
the node you're talking to.
On non-manager nodes:
<br/>you will get an error message telling you that
the node is not a manager.
As we saw earlier, you can only control the Swarm through a manager node.
---
## Promoting nodes
- Instead of adding a manager node, we can also promote existing workers
- Nodes can be promoted (and demoted) at any time
.exercise[
- See the current list of nodes:
```
docker node ls
```
- Promote the two worker nodes to be managers:
```
docker node promote XXX YYY
```
]
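Demotion works the same way; a sketch using the same placeholder node IDs:
```bash
# turn those managers back into plain workers if you change your mind
docker node demote XXX YYY
```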
---
# Running our first Swarm service
- How do we run services? Simplified version:
`docker run` → `docker service create`
.exercise[
- Create a service featuring an Alpine container pinging Google resolvers:
```bash
docker service create alpine ping 8.8.8.8
```
- Check where the container was created:
```bash
docker service ps <serviceID>
```
]
---
## Checking container logs
- Right now, there is no direct way to check the logs of our container
<br/>(unless it was scheduled on the current node)
- Look up the `NODE` on which the container is running
<br/>(in the output of the `docker service ps` command)
.exercise[
- Log into the node:
```bash
ssh ip-172-31-XXX-XXX
```
]
---
## Viewing the logs of the container
- We need to be logged into the node running the container
.exercise[
- See that the container is running and check its ID:
```bash
docker ps
```
- View its logs:
```bash
docker logs <containerID>
```
]
Go back to `node1` afterwards.
---
## Scale our service
- Services can be scaled in a pinch with the `docker service update`
command
.exercise[
- Scale the service to 10 replicas (roughly 2 per node on our 5-node cluster):
```bash
docker service update <serviceID> --replicas 10
```
- Check that we have two containers on the current node:
```bash
docker ps
```
]
---
## Expose a service
- Services can be exposed, with two special properties:
- the public port is available on *every node of the Swarm*,
- requests coming on the public port are load balanced across all instances.
- This is achieved with option `-p/--publish`; as an approximation:
`docker run -p → docker service create -p`
- If you indicate a single port number, it will be mapped on a port
starting at 30000
<br/>(vs. 32768 for single container mapping)
- You can indicate two port numbers to set the public port number
<br/>(just like with `docker run -p`)
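To illustrate both forms (hedged examples; `nginx` is just a stand-in image here):
```bash
# published port picked by Swarm in the 30000+ range
docker service create --name web-random -p 80 nginx
# published on port 8080 on every node of the Swarm
docker service create --name web-fixed -p 8080:80 nginx
```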
---
## Expose ElasticSearch on its default port
.exercise[
- Create an ElasticSearch service (and give it a name while we're at it):
```bash
docker service create --name search --publish 9200:9200 --replicas 7 \
elasticsearch
```
- Check what's going on:
```bash
watch docker service ps search
```
]
---
## Tasks lifecycle
- If you are fast enough, you will be able to see multiple states:
- assigned (the task has been assigned to a specific node)
- preparing (right now, this mostly means "pulling the image")
- running
- When a task is terminated (stopped, killed...) it cannot be restarted
(A replacement task will be created)
---
![diagram showing what happens during docker service create, courtesy of @aluzzardi](docker-service-create.svg)
---
## Test our service
- We mapped port 9200 on the nodes, to port 9200 in the containers
- Let's try to reach that port!
.exercise[
- Repeat the following command a few times:
```bash
curl localhost:9200
```
]
Each request should be served by a different ElasticSearch instance.
(You will see each instance advertising a different name.)
---
## Terminate our services
- Before moving on, we will remove those services
- `docker service rm` can accept multiple service names or IDs
- `docker service ls` can accept the `-q` flag
- A Shell snippet a day keeps the cruft away
.exercise[
- Remove all services with this one liner:
```bash
docker service ls -q | xargs docker service rm
```
]
---
class: title
# Our app on Swarm
---
## What's on the menu?
In this part, we will cover:
- building images for our app,
- shipping those images with a registry,
- running them through the services concept,
- enabling inter-container communication with overlay networks.
---
## Why do we need to ship our images?
- When we do `docker-compose up`, images are built for our services
- Those images are present only on the local node
- We need those images to be distributed on the whole Swarm
- The easiest way to achieve that is to use a Docker registry
- Once our images are on a registry, we can reference them when
creating our services
---
## Build, ship, and run, for a single service
If we had only one service (built from a `Dockerfile` in the
current directory), our workflow could look like this:
```
docker build -t jpetazzo/doublerainbow:v0.1 .
docker push jpetazzo/doublerainbow:v0.1
docker service create jpetazzo/doublerainbow:v0.1
```
We just have to adapt this to our application, which has 4 services!
---
## The plan
- Build on our local node (`node1`)
- Tag images with a version number
(timestamp; git hash; semantic...)
- Upload them to a registry
- Create services using the images
---
## Which registry do we want to use?
.small[
- **Docker Hub**
- hosted by Docker Inc.
- requires an account (free, no credit card needed)
- images will be public (unless you pay)
- located in AWS EC2 us-east-1
- **Docker Trusted Registry**
- self-hosted commercial product
- requires a subscription (free 30-day trial available)
- images can be public or private
- located wherever you want
- **Docker open source registry**
- self-hosted barebones repository hosting
- doesn't require anything
- doesn't come with anything either
- located wherever you want
]
???
## Using Docker Hub
- Set the `DOCKER_REGISTRY` environment variable to your Docker Hub user name
<br/>(the `build-tag-push.py` script prefixes each image name with that variable)
- We will also see how to run the open source registry
<br/>(so use whatever option you want!)
.exercise[
<!--
```meta
^{
```
-->
- Set the following environment variable:
<br/>`export DOCKER_REGISTRY=jpetazzo`
- (Use *your* Docker Hub login, of course!)
- Log into the Docker Hub:
<br/>`docker login`
<!--
```meta
^}
```
-->
]
???
## Using Docker Trusted Registry
If we wanted to use DTR, we would:
- make sure we have a Docker Hub account
- [activate a Docker Datacenter subscription](
https://hub.docker.com/enterprise/trial/)
- install DTR on our machines
- set `DOCKER_REGISTRY` to `dtraddress:port/user`
*This is out of the scope of this workshop!*
---
## Using open source registry
- We need to run a `registry:2` container
<br/>(make sure you specify tag `:2` to run the new version!)
- It will store images and layers to the local filesystem
<br/>(but you can add a config file to use S3, Swift, etc.)
<!--
- Docker *requires* TLS when communicating with the registry
- unless for registries on `localhost`
- or with the Engine flag `--insecure-registry`
-->
- Our strategy: publish the registry container on port 5000,
<br/>so that it's available through `localhost:5000` on each node
---
# Deploying a local registry
- We will create a single-instance service, publishing its port
on the whole cluster
.exercise[
- Create the registry service:
```bash
docker service create --name registry --publish 5000:5000 registry:2
```
- Try the following command, until it returns `{"repositories":[]}`:
```bash
curl localhost:5000/v2/_catalog
```
]
(Retry a few times, it might take 10-20 seconds for the container to be started. Patience.)
---
## Testing our local registry
- We can retag a small image, and push it to the registry
.exercise[
- Make sure we have the busybox image, and retag it:
```bash
docker pull busybox
docker tag busybox localhost:5000/busybox
```
- Push it:
```bash
docker push localhost:5000/busybox
```
]
---
## Checking what's on our local registry
- The registry API has endpoints to query what's there
.exercise[
- Ensure that our busybox image is now in the local registry:
```bash
curl http://localhost:5000/v2/_catalog
```
]
The curl command should now output:
```json
{"repositories":["busybox"]}
```
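The registry API can also list the tags of a given repository (standard registry v2 endpoint); it should return something like `{"name":"busybox","tags":["latest"]}`:
```bash
curl http://localhost:5000/v2/busybox/tags/list
```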
---
## Build, tag, and push our application container images
- Scriptery to the rescue!
.exercise[
- Set `DOCKER_REGISTRY` and `TAG` environment variables to use our local registry
- And run this little for loop:
```bash
DOCKER_REGISTRY=localhost:5000
TAG=v0.1
for SERVICE in hasher rng webui worker; do
docker-compose build $SERVICE
docker tag dockercoins_$SERVICE $DOCKER_REGISTRY/dockercoins_$SERVICE:$TAG
docker push $DOCKER_REGISTRY/dockercoins_$SERVICE:$TAG
done
```
]
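As a quick sanity check, the catalog endpoint used earlier should now list the four DockerCoins images next to `busybox`:
```bash
curl http://localhost:5000/v2/_catalog
```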
---
# Overlay networks
- SwarmKit integrates with overlay networks, without requiring
an extra key/value store
- Overlay networks are created the same way as before
.exercise[
- Create an overlay network for our application:
```bash
docker network create --driver overlay dockercoins
```
- Check existing networks:
```bash
docker network ls
```
]
---
## Can you spot the differences?
The networks `dockercoins` and `ingress` are different from the other ones.
Can you see how?
--
- They are using a different kind of ID, reflecting the fact that they
are SwarmKit objects instead of "classic" Docker Engine objects.
- Their *scope* is "swarm" instead of "local".
- They are using the overlay driver.
---
## Caveats
.warning[It is currently not possible to join an overlay network with `docker run --net ...`;
this might or might not change in the future. We will see how to cope
with this limitation.]
*Why is that?*
Placing a container on a network requires allocating an IP address for this container.
The allocation must be done by a manager node (worker nodes cannot update Raft's data structures).
As a result, `docker run --net ...` would only work on manager nodes.
Moreover, it would significantly alter the code path for `docker run`, even in classic mode.
<br/>(That could be a bad thing if it's not done very carefully!)
---
## Run the application
- First, create the `redis` service; that one is using a Docker Hub image
.exercise[
- Create the `redis` service:
```bash
docker service create --network dockercoins --name redis redis
```
]
---
## Run the other services
- Then, start the other services one by one
- We will use the images pushed previously
.exercise[
- Start the other services:
```bash
DOCKER_REGISTRY=localhost:5000
TAG=v0.1
for SERVICE in hasher rng webui worker; do
docker service create --network dockercoins --name $SERVICE \
$DOCKER_REGISTRY/dockercoins_$SERVICE:$TAG
done
```
]
???
## Wait for our application to be up
- We will see later a way to watch progress for all the tasks of the cluster
- But for now, a scrappy Shell loop will do the trick
.exercise[
- Repeatedly display the status of all our services:
```bash
watch "docker service ls -q | xargs -n1 docker service ps"
```
- Stop it once everything is running
]
---
## Expose our application web UI
- We need to connect to the `webui` service, but it is not publishing any port
- Let's re-create the `webui` service, but publish its port 80 this time
.exercise[
- Remove the `webui` service:
```bash
docker service rm webui
```
- Re-create it:
```bash
docker service create --network dockercoins --name webui \
-p 8000:80 $DOCKER_REGISTRY/dockercoins_webui:$TAG
```
]
<!--
- Let's reconfigure it to publish a port
.exercise[
- Update `webui` so that we can connect to it from outside:
```bash
docker service update webui --publish-add 8000:80
```
]
Note: to "de-publish" a port, you would have to specify the container port.
</br>(i.e. in that case, `--publish-rm 80`)
-->
---
## Connect to the web UI
- The web UI is now available on port 8000, *on all the nodes of the cluster*
.exercise[
- Point your browser to any node, on port 8000
]
You might have to wait a bit for the container to be up and running.
Check its status with `docker service ps webui`.
---
## Scaling the application
- We can change scaling parameters with `docker service update` as well
- We will do the equivalent of `docker-compose scale`
.exercise[
- Bring up more workers:
```bash
docker service update worker --replicas 10
```
- Check the result in the web UI
]
You should see the performance peaking at 10 hashes/s (like before).
---
## Global scheduling
- We want to make the best possible use of the entropy generators
on our nodes
- We want to run exactly one `rng` instance per node
- SwarmKit has a special scheduling mode for that, let's use it
- We cannot enable/disable global scheduling on an existing service
- We have to destroy and re-create the `rng` service
---
## Scaling the `rng` service
.exercise[
- Remove the existing `rng` service:
```bash
docker service rm rng
```
- Re-create the `rng` service with *global scheduling*:
```bash
docker service create --name rng --network dockercoins --mode global \
$DOCKER_REGISTRY/dockercoins_rng:$TAG
```
- Look at the result in the web UI
]
Note: if the hash rate goes to zero and doesn't climb back up, try to `rm` and `create` again.
---
## Checkpoint
- We've seen how to setup a Swarm
- We've used it to host our own registry
- We've built our app container images
- We've used the registry to host those images
- We've deployed and scaled our application
Let's treat ourselves to a nice pat on the back!
--
And carry on, we have much more to see and learn!
---
class: title
# Operating the Swarm
---
## Troubleshooting overlay networks
<!--
## Finding the real cause of the bottleneck
- We want to debug our app as we scale `worker` up and down
-->
- We want to run tools like `ab` or `httping` on the internal network
- .warning[This will be very hackish]
(Better techniques and tools might become available in the future!)
---
# Breaking into an overlay network
- We will create a dummy placeholder service on our network
- Then we will use `docker exec` to run more processes in this container
.exercise[
- Start a "do nothing" container using our favorite Swiss-Army distro:
```bash
docker service create --network dockercoins --name debug --mode global \
alpine sleep 1000000000
```
]
Why am I using global scheduling here? Because I'm lazy!
<br/>
With global scheduling, I'm *guaranteed* to have an instance on the local node.
<br/>
I don't need to SSH to another node.
---
## Entering the debug container
- Once our container is started (which should be really fast because the alpine image is small), we can enter it (from any node)
.exercise[
- Locate the container:
```bash
docker ps
```
- Enter it:
```bash
docker exec -ti <containerID> sh
```
]
---
## Labels
- We can also be fancy and find the ID of the container automatically
- SwarmKit places labels on containers
.exercise[
- Get the ID of the container:
```bash
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=debug)
```
- And enter the container:
```bash
docker exec -ti $CID sh
```
]
---
## Installing our debugging tools
- Ideally, you would author your own image, with all your favorite tools, and use it instead of the base `alpine` image
- But we can also dynamically install whatever we need
.exercise[
- Install a few tools:
```bash
apk add --update curl apache2-utils drill
```
]
---
## Investigating the `rng` service
- First, let's check what `rng` resolves to
.exercise[
- Use drill or nslookup to resolve `rng`:
```bash
drill rng
```
]
This gives us one IP address. It is not the IP address of a container.
It is a virtual IP address (VIP) for the `rng` service.
---
## Investigating the VIP
.exercise[
- Try to ping the VIP:
```bash
ping rng
```
]
It *should* ping. (But this might change in the future.)
Current behavior for VIPs is to ping when there is a backend available on the same machine.
(Again: this might change in the future.)
---
## What if I don't like VIPs?
- Services can be published using two modes: VIP and DNSRR.
- With VIP, you get a virtual IP for the service, and a load balancer
based on IPVS
(By the way, IPVS is totally awesome and if you want to learn more about it in the context of containers,
I highly recommend [this talk](https://www.youtube.com/watch?v=oFsJVV1btDU&index=5&list=PLkA60AVN3hh87OoVra6MHf2L4UR9xwJkv) by [@kobolog](https://twitter.com/kobolog) at DC15EU!)
- With DNSRR, you get the former behavior (from Engine 1.11), where
resolving the service yields the IP addresses of all the containers for
this service
- You change this with `docker service create --endpoint-mode [VIP|DNSRR]`
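For example, a sketch of a service created in DNSRR mode (the flag value is lowercase on the CLI; service name and image are purely illustrative):
```bash
# each DNS lookup of "search-dnsrr" returns the containers' IP addresses
docker service create --name search-dnsrr --network dockercoins \
  --endpoint-mode dnsrr elasticsearch
```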
???
## Testing and benchmarking our service
- We will check that the `rng` service is up with `curl`, then
benchmark it with `ab`
.exercise[
- Make a test request to the service:
```bash
curl rng
```
- Open another window, and stop the workers, to test in isolation:
```bash
docker service update worker --replicas 0
```
]
Wait until the workers are stopped (check with `docker service ls`)
before continuing.
???
## Benchmarking `rng`
We will send 50 requests, but with various levels of concurrency.
.exercise[
- Send 50 requests, with a single sequential client:
```bash
ab -c 1 -n 50 http://rng/10
```
- Send 50 requests, with fifty parallel clients:
```bash
ab -c 50 -n 50 http://rng/10
```
]
???
## Benchmark results for `rng`
- When serving requests sequentially, they each take 100ms
- In the parallel scenario, the latency increased dramatically:
- What about `hasher`?
???
## Benchmarking `hasher`
We will do the same tests for `hasher`.
The command is slightly more complex, since we need to post random data.
First, we need to put the POST payload in a temporary file.
.exercise[
- Use `curl` (installed earlier) to generate 10 bytes of random data:
```bash
curl http://rng/10 >/tmp/random
```
]
???
## Benchmarking `hasher`
Once again, we will send 50 requests, with different levels of concurrency.
.exercise[
- Send 50 requests with a sequential client:
```bash
ab -c 1 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
```
- Send 50 requests with 50 parallel clients:
```bash
ab -c 50 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
```
]
???
## Benchmark results for `hasher`
- The sequential benchmark takes ~5 seconds to complete
- The parallel benchmark takes less than 1 second to complete
- In both cases, each request takes a bit more than 100ms to complete
- Requests are a bit slower in the parallel benchmark
- It looks like `hasher` is better equipped to deal with concurrency than `rng`
???
class: title
Why?
???
## Why does everything take (at least) 100ms?
--
`rng` code:
![RNG code screenshot](delay-rng.png)
--
`hasher` code:
![HASHER code screenshot](delay-hasher.png)
???
class: title
But ...
WHY?!?
???
## Why did we sprinkle this sample app with sleeps?
- Deterministic performance
<br/>(regardless of instance speed, CPUs, I/O...)
--
- Actual code sleeps all the time anyway
--
- When your code makes a remote API call:
- it sends a request;
- it sleeps until it gets the response;
- it processes the response.
???
## Why do `rng` and `hasher` behave differently?
![Equations on a blackboard](equations.png)
--
(Synchronous vs. asynchronous event processing)
---
# Rolling updates
- We want to release a new version of the worker
- We will edit the code ...
- ... build the new image ...
- ... push it to the registry ...
- ... update our service to use the new image
???
## But first...
- Restart the workers
.exercise[
- Just scale back to 10 replicas:
```bash
docker service update worker --replicas 10
```
- Check that they're running:
```bash
docker service ps worker
```
]
---
## Making changes
.exercise[
- Edit `~/orchestration-workshop/dockercoins/worker/worker.py`
- Locate the line that has a `sleep` instruction
- Reduce the `sleep` from `0.1` to `0.01`
- Save your changes and exit
]
---
## Building and pushing the new image
.exercise[
- Build the new image:
```bash
IMAGE=localhost:5000/dockercoins_worker:v0.01
docker build -t $IMAGE worker
```
- Push it to the registry:
```bash
docker push $IMAGE
```
]
Note how the build and push were fast (thanks to layer caching).
---
## Watching the deployment process
- We will need to open a new window for this
.exercise[
- Look at our service status:
```bash
watch -n1 "docker service ps worker | grep -v Shutdown.*Shutdown"
```
]
- `docker service ps worker` gives us all tasks
<br/>(including the one whose current or desired state is `Shutdown`)
- Then we filter out the tasks whose current **and** desired state is `Shutdown`
- Future versions might have fancy filters to make that less tinkerish
---
## Updating to our new image
- Keep the `watch ...` command running!
.exercise[
- In the other window, update the service to the new image:
```bash
docker service update worker --image $IMAGE
```
]
By default, SwarmKit does a rolling upgrade, one instance at a time.
---
## Changing the upgrade policy
- We can set upgrade parallelism (how many instances to update at the same time)
- And upgrade delay (how long to wait between two batches of instances)
.exercise[
- Change the parallelism to 2 and the delay to 5 seconds:
```bash
docker service update worker --update-parallelism 2 --update-delay 5s
```
- Rollback to the previous image:
```bash
docker service update worker --image $DOCKER_REGISTRY/dockercoins_worker:v0.1
```
]
---
## Timeline of an upgrade
- SwarmKit will upgrade N instances at a time
<br/>(following the `update-parallelism` parameter)
- New tasks are created, and their desired state is set to `Ready`
<br/>.small[(this pulls the image if necessary, ensures resource availability, creates the container ... without starting it)]
- If the new tasks fail to get to `Ready` state, go back to the previous step
<br/>.small[(SwarmKit will try again and again, until the situation is addressed or desired state is updated)]
- When the new tasks are `Ready`, it sets the old tasks desired state to `Shutdown`
- When the old tasks are `Shutdown`, it starts the new tasks
- Then it waits for the `update-delay`, and continues with the next batch of instances
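To double-check the update parameters currently in effect for a service, a rough sketch (the exact output format varies between versions):
```bash
docker service inspect --pretty worker | grep -i -A2 update
```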
---
## Getting cluster-wide task information
- The Docker API doesn't expose this directly (yet)
- But the SwarmKit API does
- Let's see how to use it
- We will use `swarmctl`
- `swarmctl` is an example program showing how to
interact with the SwarmKit API
- First, we need to install `swarmctl`
---
## Building `swarmctl`
- I thought I would enjoy a 1-minute break at this point
- So we are going to compile SwarmKit (including `swarmctl`)
.exercise[
- Download, compile, install SwarmKit with this one-liner:
```bash
docker run -v /usr/local/bin:/go/bin golang \
go get `-v` github.com/docker/swarmkit/...
```
]
Remove `-v` if you don't like verbose things.
Shameless promo: for more Go and Docker love, check
[this blog post](http://jpetazzo.github.io/2016/09/09/go-docker/)!
---
## Using `swarmctl`
- The Docker Engine places the SwarmKit control socket in a special path
- And you need root privileges to access it
.exercise[
- Set an alias so that swarmctl can run as root and use the right control socket:
```bash
alias \
swarmctl='sudo swarmctl --socket /var/lib/docker/swarm/control.sock'
```
]
---
## `swarmctl` in action
- Let's review a few useful `swarmctl` commands
.exercise[
- List cluster nodes (that's equivalent to `docker node ls`):
```bash
swarmctl node ls
```
- View all tasks across all services:
```bash
swarmctl task ls
```
]
---
## Caveat
- SwarmKit is vendored into the Docker Engine
- If you want to use `swarmctl`, you need the exact version of
SwarmKit that was used in your Docker Engine
- Otherwise, you might get some errors like:
```
Error: grpc: failed to unmarshal the received message proto: wrong wireType = 0
```
---
# Centralized logging
- We want to send all our container logs to a central place
- If that place could offer a nice web dashboard too, that'd be nice
--
- We are going to deploy an ELK stack
- It will accept logs over a GELF socket
- We will update our services to send logs through the GELF logging driver
---
# Setting up ELK to store container logs
*Important foreword: this is not an "official" or "recommended"
setup; it is just an example. We used ELK in this demo because
it's a popular setup and we keep being asked about it; but you
will have equal success with Fluent or other logging stacks!*
What we will do:
- Spin up an ELK stack with services
- Gaze at the spiffy Kibana web UI
- Manually send a few log entries using one-shot containers
- Set up our containers to send their logs to Logstash
---
## What's in an ELK stack?
- ELK is three components:
- ElasticSearch (to store and index log entries)
- Logstash (to receive log entries from various
sources, process them, and forward them to various
destinations)
- Kibana (to view/search log entries with a nice UI)
- The only component that we will configure is Logstash
- We will accept log entries using the GELF protocol
- Log entries will be stored in ElasticSearch,
<br/>and displayed on Logstash's stdout for debugging
---
## Setting up ELK
- We need three containers: ElasticSearch, Logstash, Kibana
- We will place them on a common network, `logging`
.exercise[
- Create the network:
```bash
docker network create --driver overlay logging
```
- Create the ElasticSearch service:
```bash
docker service create --network logging --name elasticsearch elasticsearch
```
]
---
## Setting up Kibana
- Kibana exposes the web UI
- Its default port (5601) needs to be published
- It needs a tiny bit of configuration: the address of the ElasticSearch service
- We don't want Kibana logs to show up in Kibana (it would create clutter)
<br/>so we won't route its logs through GELF later
.exercise[
- Create the Kibana service:
```bash
docker service create --network logging --name kibana --publish 5601:5601 \
-e ELASTICSEARCH_URL=http://elasticsearch:9200 kibana
```
]
---
## Setting up Logstash
- Logstash needs some configuration to listen to GELF messages and send them to ElasticSearch
- We could author a custom image bundling this configuration
- We can also pass the [configuration](https://github.com/jpetazzo/orchestration-workshop/blob/master/elk/logstash.conf) on the command line
.exercise[
- Create the Logstash service:
```bash
docker service create --network logging --name logstash -p 12201:12201/udp \
logstash -e "$(cat ~/orchestration-workshop/elk/logstash.conf)"
```
]
---
## Checking Logstash
- Before proceeding, let's make sure that Logstash started properly
.exercise[
- Lookup the node running the Logstash container:
```bash
docker service ps logstash
```
- Log into that node:
```bash
ssh ip-172-31-XXX-XXX
```
]
---
## View Logstash logs
.exercise[
- Get the ID of the Logstash container:
```bash
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=logstash)
```
- View the logs:
```bash
docker logs --follow $CID
```
]
You should see the heartbeat messages:
.small[
```json
{ "message" => "ok",
"host" => "1a4cfb063d13",
"@version" => "1",
"@timestamp" => "2016-06-19T00:45:45.273Z"
}
```
]
---
## Testing the GELF receiver
- In a new window, we will generate a logging message
- We will use a one-off container, and Docker's GELF logging driver
.exercise[
- Send a test message:
```bash
docker run --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \
--rm alpine echo hello
```
]
The test message should show up in the logstash container logs.
???
## Sending logs from a service
- We were sending from a "classic" container so far; let's send logs from a service instead
- We're lucky: the parameters (`--log-driver` and `--log-opt`) are exactly the same!
- We will use the `--restart-condition` flag so that the container doesn't restart forever
.exercise[
- Send a test message:
```bash
docker service create --restart-condition none \
--log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \
alpine echo hello
```
]
The test message should show up as well in the logstash container logs.
---
## Connect to Kibana
- The Kibana web UI is exposed on cluster port 5601
- Open the UI in your browser: http://instance-address:5601/
(Remember: you can use any instance address!)
---
## "Configuring" Kibana
- If you see a status page with a yellow item, wait a minute and reload
(Kibana is probably still initializing)
- Kibana should offer you to "Configure an index pattern":
<br/>in the "Time-field name" drop down, select "@timestamp", and hit the
"Create" button
- Then:
- click "Discover" (in the top-left corner)
- click "Last 15 minutes" (in the top-right corner)
- click "Last 1 hour" (in the list in the middle)
- click "Auto-refresh" (top-right corner)
- click "5 seconds" (top-left of the list)
- You should see a series of green bars (with one new green bar every minute)
---
## Updating our services to use GELF
- We will now inform our Swarm to add GELF logging to all our services
- This is done with the `docker service update` command
- The logging flags are the same as before
.exercise[
- Enable GELF logging for all our *stateless* services:
```bash
for SERVICE in hasher rng webui worker; do
docker service update $SERVICE \
--log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201
done
```
]
After ~15 seconds, you should see the log messages in Kibana.
---
## Viewing container logs
- Go back to Kibana
- Container logs should be showing up!
- We can customize the web UI to be more readable
.exercise[
- In the left column, move the mouse over the following
columns, and click the "Add" button that appears:
- host
- container_name
- message
<!--
- logsource
- program
- message
-->
]
---
## .warning[Don't update stateful services!]
- Why didn't we update the Redis service as well?
- When a service changes, SwarmKit replaces existing container with new ones
- This is fine for stateless services
- But if you update a stateful service, its data will be lost in the process
- If we updated our Redis service, all our DockerCoins would be lost
---
## Important afterword
**This is not a "production-grade" setup.**
It is just an educational example. We set up a single
ElasticSearch instance and a single Logstash instance.
In a production setup, you need an ElasticSearch cluster
(both for capacity and availability reasons). You also
need multiple Logstash instances.
And if you want to withstand
bursts of logs, you need some kind of message queue:
Redis if you're cheap, Kafka if you want to make sure
that you don't drop messages on the floor. Good luck.
---
class: title
# Additional content
## (Might require unhealthy amounts of coffee and/or Club Mate)
---
# Dealing with stateful services
- First of all, you need to make sure that the data files are on a *volume*
- Volumes are host directories that are mounted to the container's filesystem
- These host directories can be backed by the ordinary, plain host filesystem ...
- ... Or by distributed/networked filesystems
- In the latter scenario, in case of node failure, the data is safe elsewhere ...
- ... And the container can be restarted on another node without data loss
---
## Building a stateful service experiment
- We will use Redis for this example
- We will expose it on port 10000 to access it easily
.exercise[
- Start the Redis service:
```bash
docker service create --name stateful -p 10000:6379 redis
```
- Check that we can connect to it (replace XX.XX.XX.XX with any node's IP address):
```bash
docker run --rm redis redis-cli -h `XX.XX.XX.XX` -p 10000 info server
```
]
---
## Accessing our Redis service easily
- Typing that whole command is going to be tedious
.exercise[
- Define a shell alias to make our lives easier:
```bash
alias redis='docker run --rm redis redis-cli -h `XX.XX.XX.XX` -p 10000'
```
- Try it:
```bash
redis info server
```
]
---
## Basic Redis commands
.exercise[
- Check that the `foo` key doesn't exist:
```bash
redis get foo
```
- Set it to `bar`:
```bash
redis set foo bar
```
- Check that it exists now:
```bash
redis get foo
```
]
---
## Local volumes vs. global volumes
- Global volumes exist in a single namespace
- A global volume can be mounted on any node
<br/>.small[(bar some restrictions specific to the volume driver in use; e.g. using an EBS-backed volume on a GCE/EC2 mixed cluster)]
- Attaching a global volume to a container allows the container to be started anywhere
<br/>(and retain its data wherever you start it!)
- Global volumes require extra *plugins* (Flocker, Portworx...)
- Docker doesn't come with a default global volume driver at this point
- Therefore, we will fall back on *local volumes*
---
## Local volumes
- We will use the default volume driver, `local`
- As the name implies, the `local` volume driver manages *local* volumes
- Since local volumes are (duh!) *local*, we need to pin our container to a specific host
- We will do that with a *constraint*
.exercise[
- Add a placement constraint to our service:
```bash
docker service update stateful --constraint-add node.hostname==$HOSTNAME
```
]
---
## Where is our data?
- If we look for our `foo` key, it's gone!
.exercise[
- Check the `foo` key:
```bash
redis get foo
```
- Adding a constraint caused the service to be redeployed:
```bash
docker service ps stateful
```
]
Note: even if the constraint ends up being a no-op (i.e. not
moving the service), the service gets redeployed.
This ensures consistent behavior.
---
## Setting the key again
- Since our database was wiped out, let's populate it again
.exercise[
- Set `foo` again:
```bash
redis set foo bar
```
- Check that it's there:
```bash
redis get foo
```
]
---
## Service updates cause containers to be replaced
- Let's try to make a trivial update to the service and see what happens
.exercise[
- Set a memory limit on our Redis service:
```bash
docker service update stateful --limit-memory 100M
```
- Try to get the `foo` key one more time:
```bash
redis get foo
```
]
The key is blank again!
---
## Service volumes are ephemeral by default
- Let's highlight what's going on with volumes!
.exercise[
- Check the current list of volumes:
```bash
docker volume ls
```
- Carry out a minor update to our Redis service:
```bash
docker service update stateful --limit-memory 200M
```
]
Again: all changes trigger the creation of a new task, and therefore a replacement of the existing container;
even when it is not strictly technically necessary.
---
## The data is gone again
- What happened to our data?
.exercise[
- The list of volumes is slightly different:
```bash
docker volume ls
```
- And as you can expect, the `foo` key is gone:
```bash
redis get foo
```
]
---
## Assigning a persistent volume to the container
- Let's add an explicit volume mount to our service, referencing a named volume
.exercise[
- Update the service with a volume mount:
```bash
docker service update stateful \
--mount-add type=volume,source=foobarstore,target=/data
```
- Check the new volume list:
```bash
docker volume ls
```
]
Note: the `local` volume driver automatically creates volumes.
---
## Checking that persistence actually works across service updates
.exercise[
- Store something in the `foo` key:
```bash
redis set foo barbar
```
- Update the service with yet another trivial change:
```bash
docker service update stateful --limit-memory 300M
```
- Check that `foo` is still set:
```bash
redis get foo
```
]
---
## Recap
- The service must commit its state to disk when being shut down.red[*]
(Shutdown = being sent a `TERM` signal)
- The state must be written on files located on a volume
- That volume must be specified to be persistent
- If using a local volume, the service must also be pinned to a specific node
(And losing that node means losing the data, unless there are other backups)
.footnote[<br/>.red[*]Until recently, the Redis image didn't automatically
persist data. Beware!]
---
## Cleaning up
.exercise[
- Remove the stateful service:
```bash
docker service rm stateful
```
- Remove the associated volume:
```bash
docker volume rm foobarstore
```
]
Note: we could keep the volume around if we wanted.
---
# Scripting image building and pushing
- Earlier, we used some rather crude shell loops to build and push images
- Compose (and clever environment variables) can help us to make that easier
- When using Compose file version 2, you can specify *both* `build` and `image`:
```yaml
version: "2"
services:
webapp:
build: src/
image: jpetazzo/webapp:${TAG}
```
Note: Compose tolerates empty (or unset) environment variables, but in this example,
`TAG` *must* be set, because `jpetazzo/webapp:` is not a valid image name.
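A minimal usage sketch with the example file above (the tag value is arbitrary):
```bash
# TAG must be set, otherwise the image name would be the invalid "jpetazzo/webapp:"
export TAG=v0.1
docker-compose build    # builds src/ and tags the result as jpetazzo/webapp:v0.1
docker-compose push     # pushes jpetazzo/webapp:v0.1
```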
---
## Updating the Compose file to specify image tags
- Let's update the Compose file for DockerCoins to make it easier to push it to our registry
.exercise[
- Go back to the `dockercoins` directory:
```bash
cd ~/orchestration-workshop/dockercoins
```
- Edit `docker-compose.yml`, and update each service to add an `image` directive as follows:
```yaml
rng:
build: rng
`image: ${REGISTRY_SLASH}rng${COLON_TAG}`
```
]
You can also directly use the file `docker-compose.yml-images`.
---
## Use the new Compose file
- We need to set `REGISTRY_SLASH` and `COLON_TAG` variables
- Then we can use Compose to `build` and `push`
.exercise[
- Set environment variables:
```bash
export REGISTRY_SLASH=localhost:5000/
export COLON_TAG=:v0.01
```
- Build and push with Compose:
```bash
docker-compose build
docker-compose push
```
]
---
## Why the weird variable names?
- It would be more intuitive to have:
```bash
REGISTRY=localhost:5000
TAG=v0.01
```
- But then, when the variables are not set, the image names would be invalid
(they would look like .red[`/rng:`])
- Putting the slash and the colon in the variables lets us use the Compose file
even when the variables are not set
- The variable names (might) remind you that you have to include the trailing slash and leading colon
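- A quick shell illustration of the difference (the values are just examples):
```bash
REGISTRY_SLASH=localhost:5000/ COLON_TAG=:v0.01
echo "${REGISTRY_SLASH}rng${COLON_TAG}"    # -> localhost:5000/rng:v0.01
unset REGISTRY_SLASH COLON_TAG
echo "${REGISTRY_SLASH}rng${COLON_TAG}"    # -> rng (still a valid image name)
```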
---
# Distributed Application Bundles
- The previous section showed us how to streamline image build and push
- We will now see how to streamline service creation
(i.e. get rid of the `for SERVICE in ...; do docker service create ...` part)
.warning[This is experimental and subject to change!]
---
## What is a Distributed Application Bundle?
- Conceptually similar to a Compose file, but for Swarm clusters
- A Distributed Application Bundle is a JSON payload describing the services
- It's typically stored as `<stackname>.dab`
- It's JSON because you're not supposed to edit it manually
- It can be generated by Compose, and consumed by Docker (experimental branch)
- In addition to image names, it contains their exact SHA256
---
## Generating a DAB
- This is done with the Compose `bundle` command
.exercise[
- Create the DAB for the DockerCoins application:
```bash
docker-compose bundle
```
- Inspect the resulting file:
```bash
cat dockercoins.dab
```
]
---
## Using a DAB
- This is done with `docker stack deploy <stackname>`
.exercise[
- Try to deploy the DAB:
```bash
docker stack deploy dockercoins
```
]
--
Oh, right, we need the *experimental* build of Docker!
---
## Installing Docker experimental CLI
- We don't need to upgrade our Docker Engines; we just need an upgraded CLI
- We will download and extract it in a separate directory (to keep the original intact)
.exercise[
- Download and unpack the latest experimental build of Docker:
```bash
curl -sSL \
https://experimental.docker.com/builds/$(uname -s)/$(uname -m)/docker-latest.tgz \
| tar -C ~ -zxf-
```
]
---
## Using Docker experimental CLI
- Just invoke `~/docker/docker` instead of `docker`
.exercise[
- Deploy our app using the DAB file:
```bash
~/docker/docker stack deploy dockercoins
```
- Check the stack deployment:
```bash
~/docker/docker stack ps dockercoins
```
]
---
## Look at the newly deployed stack
.exercise[
- Let's find out which port was allocated for `webui`:
```bash
docker service inspect dockercoins_webui \
--format '{{ (index .Endpoint.Ports 0).PublishedPort }}'
```
- Point your browser to any node's IP address, on that port
]
Note: we can use the "normal" CLI for everything else.
The experimental CLI is only needed for `docker stack`.
---
## Clean up
- Unsurprisingly, there is a `docker stack rm` command
.exercise[
- Clean up the stack we just deployed:
```bash
~/docker/docker stack rm dockercoins
```
]
---
## Scoping
- All resources (service names, network names...) are prefixed with the stack name
- This lets us run multiple instances side by side
(Just like before with Compose's project name parameter)
---
## Some features are not fully supported yet
- Global scheduling
- Scaling
- Fixed port numbers
- Logging options
- ... and much more
You can specify *most* of them in the DAB itself, but Compose can't generate it (yet).
---
# Controlling Docker from a container
- In a local environment, just bind-mount the Docker control socket:
```bash
docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker
```
- Otherwise, you have to:
- set `DOCKER_HOST`,
- set `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` (if you use TLS),
- copy certificates to the container that will need API access.
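For the remote case, it could look like this (host name, port, and certificate location are assumptions; adapt them to your setup):
```bash
docker run -ti \
  -e DOCKER_HOST=tcp://node1:2376 \
  -e DOCKER_TLS_VERIFY=1 \
  -e DOCKER_CERT_PATH=/certs \
  -v $HOME/.docker:/certs:ro \
  docker
```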
More resources on this topic:
- [Do not use Docker-in-Docker for CI](
http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/)
- [One container to rule them all](
http://jpetazzo.github.io/2016/04/03/one-container-to-rule-them-all/)
---
## Bind-mounting the Docker control socket
- In Swarm mode, bind-mounting the control socket gives you access to the whole cluster
- You can tell Docker to place a given service on a manager node, using constraints:
```bash
docker service create \
--mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \
--name autoscaler --constraint node.role==manager ...
```
---
# Node management
- SwarmKit lets us change (almost?) everything on the fly
- Nothing should require a global restart
---
## Node availability
```bash
docker node update <node-name> --availability <active|pause|drain>
```
- Active = schedule tasks on this node (default)
- Pause = don't schedule new tasks on this node; existing tasks are not affected
You can use it to troubleshoot a node without disrupting existing tasks
It can also be used (in conjunction with labels) to reserve resources
- Drain = don't schedule new tasks on this node; existing tasks are moved away
This is just like crashing the node, but containers get a chance to shut down cleanly
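- A quick sketch (the node name is an example):
```bash
docker node update node3 --availability drain    # existing tasks get rescheduled elsewhere
docker node ps node3                             # watch its tasks shut down
docker node update node3 --availability active   # make the node schedulable again
```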
---
## Managers and workers
- Nodes can be promoted to manager with `docker node promote`
- Nodes can be demoted to worker with `docker node demote`
- This can also be done with `docker node update <node> --role <manager|worker>`
- Reminder: this has to be done from a manager node
<br/>(workers cannot promote themselves)
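- Example (run from a manager; the node name is hypothetical):
```bash
docker node promote node2    # node2 becomes a manager
docker node ls               # node2 now shows up as "Reachable" under MANAGER STATUS
docker node demote node2     # node2 goes back to being a worker
```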
---
## Removing nodes
- You can leave Swarm mode with `docker swarm leave`
- Nodes are drained before being removed (i.e. all tasks are rescheduled somewhere else)
- Managers cannot leave (they have to be demoted first)
- After leaving, a node still shows up in `docker node ls` (in `Down` state)
- When a node is `Down`, you can remove it with `docker node rm` (from a manager node)
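- Putting it together (the node name is an example):
```bash
# On the node that is leaving:
docker swarm leave
# Then, from a manager, once the node shows up as Down in `docker node ls`:
docker node rm node5
```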
---
## Join tokens and automation
- If you have used Docker 1.12-RC: join tokens are now mandatory!
- You cannot specify your own token (SwarmKit generates it)
- If you need to change the token: `docker swarm join-token --rotate ...`
- To automate cluster deployment:
- have a seed node do `docker swarm init` if it's not already in Swarm mode
- propagate the token to the other nodes (secure bucket, facter, ohai...)
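- A rough automation sketch (the seed node address is an assumption; distribute the token with the tool of your choice):
```bash
# On the seed node: initialize the Swarm only if it's not active yet
docker info | grep -q 'Swarm: active' || docker swarm init
WORKER_TOKEN=$(docker swarm join-token -q worker)
# On each other node, after receiving $WORKER_TOKEN securely:
docker swarm join --token $WORKER_TOKEN node1:2377
```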
---
class: title
# Metrics
---
## Which metrics will we collect?
- node metrics (e.g. cpu, ram, disk space)
- container metrics (e.g. memory used, processes, network traffic going in and out)
---
## Tools
We will use three open source Go projects for metric collection, publishing, storing, and visualization:
- Intel Snap: telemetry framework to collect, process, and publish metric data
- InfluxDB: database
- Grafana: graphs and dashboards
---
## Snap
- [github.com/intelsdi-x/snap](https://github.com/intelsdi-x/snap)
- Can collect, process, and publish metric data
- Doesn't store metrics
- Works as a daemon
- Offloads collecting, processing, and publishing to plugins
- You have to configure it to use the plugins and collect the metrics you want
- Docs: https://github.com/intelsdi-x/snap/blob/master/docs/
---
## InfluxDB
- Since Snap doesn't have a database, we need one
- InfluxDB is a database designed specifically for time series data
---
## Grafana
- Since neither Snap nor InfluxDB can show graphs, we're using Grafana
---
## Getting and setting up Snap
- This will get Snap on all nodes
.exercise[
```bash
docker service create --restart-condition=none --mode global \
--mount type=bind,source=/usr/local/bin,target=/usr/local/bin \
--mount type=bind,source=/opt,target=/opt centos sh -c '
SNAPVER=v0.16.1-beta
RELEASEURL=https://github.com/intelsdi-x/snap/releases/download/$SNAPVER
curl -sSL $RELEASEURL/snap-$SNAPVER-linux-amd64.tar.gz | tar -C /opt -zxf-
curl -sSL $RELEASEURL/snap-plugins-$SNAPVER-linux-amd64.tar.gz | tar -C /opt -zxf-
ln -s snap-$SNAPVER /opt/snap
for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done'
```
]
---
## `snapd` - Snap daemon
- Application made up of a REST API, control module, and scheduler module
.exercise[
- Start `snapd` with plugin trust disabled and log level set to debug
```bash
snapd -t 0 -l 1
```
]
- More resources:
https://github.com/intelsdi-x/snap/blob/master/docs/SNAPD.md
https://github.com/intelsdi-x/snap/blob/master/docs/SNAPD_CONFIGURATION.md
---
## `snapctl` - loading plugins
- First, open a new window
.exercise[
- Load the psutil collector plugin
```bash
snapctl plugin load /opt/snap/plugin/snap-plugin-collector-psutil
```
- Load the file publisher plugin
```bash
snapctl plugin load /opt/snap/plugin/snap-plugin-publisher-mock-file
```
]
---
## `snapctl` - see what you loaded and can collect
.exercise[
- See your loaded plugins
```bash
snapctl plugin list
```
- See the metrics you can collect
```bash
snapctl metric list
```
]
---
## `snapctl` - tasks
- To start collecting/processing/publishing metric data, you need to create a task
- For this workshop we will be using just the task manifest
- Tasks can be written in JSON or YAML and the metrics you want to collect are listed in the task file
- Some plugins, such as the Docker collector, allow for wildcards, which are denoted by a star (see snap/docker-influxdb.json)
- More resources:
https://github.com/intelsdi-x/snap/blob/master/docs/TASKS.md
---
## `snapctl` - task manifest
```yaml
---
version: 1
schedule:
type: "simple" # collect on a set interval
interval: "1s" # of every 1s
max-failures: 10
workflow:
collect: # first collect
metrics: # metrics to collect
/intel/psutil/load/load1: {}
config: # there is no configuration
publish: # after collecting, publish
-
plugin_name: "file" # use the file publisher
config:
file: "/tmp/snap-psutil-file.log" # write to this file
```
---
## `snapctl` - starting a task
.exercise[
- Using the task manifest in the snap directory, start a task to collect metrics from psutil and publish them to a file.
```bash
cd ~/orchestration-workshop/snap
snapctl task create -t psutil-file.yml
```
]
The output should look like the following:
```
Using task manifest to create task
Task created
ID: 240435e8-a250-4782-80d0-6fff541facba
Name: Task-240435e8-a250-4782-80d0-6fff541facba
State: Running
```
---
## `snapctl` - see the tasks
.exercise[
- List the tasks, and check on the one we just created:
```bash
snapctl task list
```
]
The output should look like the following:
```
ID NAME STATE HIT MISS FAIL CREATED LAST FAILURE
24043...acba Task-24043...acba Running 4 0 0 2:34PM 8-13-2016
```
---
## Check file
.exercise[
- Look at the file that the `file` publisher is writing to:
```bash
tail -f /tmp/snap-psutil-file.log
```
]
To exit, hit `^C`
---
## `snapctl` - watch metrics
- Watch will stream the metrics you are collecting to STDOUT
.exercise[
```bash
snapctl task watch <ID>
```
]
To exit, hit `^C`
---
## `snapctl` - stop the task
.exercise[
- Using the ID name, stop the task
```bash
snapctl task stop <ID>
```
]
---
## Stopping snap
- Just hit `^C` in the terminal window where `snapd` is running; Snap will stop, unload all plugins, and stop all tasks
---
## Snap Tribe Mode
- Tribe is Snap's clustering mechanism
- Nodes can join *agreements*; nodes in the same agreement share the same loaded plugins and running tasks
- We will use it to load the Docker collector and InfluxDB publisher on all nodes and run our task
- If we didn't use Tribe, we would have to go to every node and manually load the plugins and start the task
- More resources:
https://github.com/intelsdi-x/snap/blob/master/docs/TRIBE.md
---
## Start `snapd` with Tribe Mode enabled
- On your first node, start snap in tribe mode
.exercise[
```bash
snapd --tribe -t 0 -l 1
```
]
---
## Create first Tribe agreement
.exercise[
```bash
snapctl agreement create docker-influxdb
```
]
The output should look like the following:
```
Name Number of Members plugins tasks
docker-influxdb 0 0 0
```
---
## Join running snapd to agreement
.exercise[
```bash
snapctl agreement join docker-influxdb $HOSTNAME
```
]
The output should look like the following:
```
Name Number of Members plugins tasks
docker-influxdb 1 0 0
```
---
## Start a container on every node
- The Docker collector plugin requires at least one container to be running; to ensure that, create a global service from node 1 (all nodes need to be part of the Swarm)
- If there is a specific container you'd rather use, feel free to do so
.exercise[
```bash
docker service create --mode global alpine ping 8.8.8.8
```
]
---
## Start InfluxDB and Grafana containers
- Start up containers with InfluxDB and Grafana using docker-compose on node 1
.exercise[
```bash
cd influxdb-grafana
docker-compose up
```
]
---
## Set up InfluxDB
- Go to `http://<NODE1_IP>:8083`
- Create a new database called snap with the query `CREATE DATABASE "snap"`
- Switch to the snap database on the top right
---
## Load Docker collector and InfluxDB publisher
.exercise[
- Load Docker collector
```bash
snapctl plugin load /opt/snap/plugin/snap-plugin-collector-docker
```
- Load InfluxDB publisher
```bash
snapctl plugin load /opt/snap/plugin/snap-plugin-publisher-influxdb
```
]
---
## Start task
.exercise[
- Using a task manifest file, create a task using the Docker collector to gather container metrics and send them to the InfluxDB publisher plugin
- Replace HOST_IP in docker-influxdb.json with the NODE1_IP address
```bash
snapctl task create -t docker-influxdb.json
```
]
---
# Restarting a task
- This is only necessary if the task becomes disabled
.exercise[
- Enable the task
```bash
snapctl task enable <ID>
```
- Start the task
```bash
snapctl task start <ID>
```
]
---
# See metrics in InfluxDB
- To see which metrics are being collected (they should match `snapctl metric list`), use the `SHOW MEASUREMENTS` query
- To see the data points for one of the metrics, use a query like the following (with one of the metric names between the quotes):
```
SELECT * FROM "intel/linux/docker/025fd8c5dc0c/cpu_stats/cpu_usage/total_usage"
```
---
## Set up Grafana
- Go to `http://<NODE1_IP>:3000`
- If it asks for a username/password they're both `admin`
- Click the Grafana logo -> Data Sources -> Add data source
---
## Add Grafana data source
- Change the Type to InfluxDB
- Name : influxdb
- Check the default box
- Url: `http://<NODE1_IP>:8086`
- Access: direct
- Database: snap
---
## Create graphs in Grafana
- Click the Grafana logo -> Dashboards -> new
- Click on a green bar on the left -> add panel -> graph
- Click anywhere on the new line that says SELECT, then click select measurement and pick one of the metrics to display
- You can add the source (this is the hostname of each node) and filter by that if you want
- Click on "Last 6 hours" in the top right and change it to last 5 minutes and the update rate to 5s
---
## Add more nodes to the Tribe
- This will load the plugins from node 1 on the other nodes and start the same task
.exercise[
- Start snapd in tribe mode on all nodes
```bash
for N in 2 3 4 5; do ssh -f node$N snapd --tribe -t 0 -l 1 --log-path /tmp \
--tribe-node-name node$N --tribe-seed node1:6000; done
```
- Join the agreement
```bash
for N in 2 3 4 5; do ssh node$N snapctl agreement join docker-influxdb node$N; \
done
```
]
---
## InfluxDB and Grafana updates
- If you run `SHOW MEASUREMENTS` in InfluxDB again, you should now see metrics coming from the other nodes, and you can add them to your Grafana dashboard
---
class: title
# Thanks! <br/> Questions?
<!--
## [@jpetazzo](https://twitter.com/jpetazzo) <br/> [@docker](https://twitter.com/docker)
-->
## AJ ([@s0ulshake](https://twitter.com/s0ulshake)) <br/> Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) <br/> Tiffany ([@tiffanyfayj](https://twitter.com/tiffanyfayj))
</textarea>
<script src="remark-0.13.min.js" type="text/javascript">
</script>
<script type="text/javascript">
var slideshow = remark.create({
ratio: '16:9',
highlightSpans: true
});
</script>
</body>
</html>