Mirror of https://github.com/jpetazzo/container.training.git (synced 2026-02-16 10:39:55 +00:00)
Compare commits — 3 commits (branch: gitpod, 2017-04-17)

| Author | SHA1 | Date |
|---|---|---|
| | 37242c0c72 | |
| | acab2f9074 | |
| | 4cd6235ab7 | |
docs/blackbelt.png — new binary file (not shown; 16 KiB)
docs/index.html — 339 lines changed
@@ -76,6 +76,12 @@
background-repeat: no-repeat;
padding-left: 2em;
}
+.blackbelt {
+background-image: url("blackbelt.png");
+background-size: 1.5em;
+background-repeat: no-repeat;
+padding-left: 2em;
+}
.exercise {
background-color: #eee;
background-image: url("keyboard.png");
@@ -104,10 +110,14 @@ class: in-person

## Intros

-- Hello! We are
+- Hello!

+<!--
We are
AJ ([@s0ulshake](https://twitter.com/s0ulshake))
&
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo))
+-->

--
@@ -125,20 +135,25 @@ on time, it's a good idea to have a breakfast with the attendees
at e.g. 9am, and start at 9:30.
-->

---

class: in-person

## Agenda

<!--
- Agenda:
-->

.small[
-- 09:00-09:15 hello
-- 09:15-10:45 part 1
-- 10:45-11:00 coffee break
-- 11:00-12:30 part 2
-- 12:30-13:30 lunch break
-- 13:30-15:00 part 3
-- 15:00-15:15 coffee break
-- 15:15-16:45 part 4
-- 16:45-17:00 Q&A
+- 14:00-14:05 hello
+- 14:05-14:50 part 1
+- 14:50-15:00 tea/coffee break + Q&A
+- 15:00-15:50 part 2
+- 15:50-16:00 more tea/coffee + Q&A
+- 16:00-16:50 part 3
+- 16:50-17:00 no tea/coffee, still Q&A
+- 17:00-23:59 kombucha, beers, and more
]

<!--
@@ -149,12 +164,26 @@ at e.g. 9am, and start at 9:30.

- Feel free to interrupt for questions at any time

-- Live feedback, questions, help on [Gitter](chat)
+- Live feedback, questions, help on [Slack](chat)

- All the content is publicly available (slides, code samples, scripts)

.blackbelt[GOTTA CATCH'EM ALL! (The black belt track references!)]

---
## .blackbelt[Tuesday 16:15]

Cgroups in Go...

Namespaces in Go...

Reminder: *Cgroups + Namespaces = Containers*

Liz Rice will craft some artisanal, organic, non-GMO containers in Go! ✨

???

class: in-person

## Disclaimer

@@ -217,8 +246,6 @@ class: self-paced
read [these instructions](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker) for extra
details

???

<!--
grep '^# ' index.html | grep -v '<br' | tr '#' '-'
-->
@@ -267,25 +294,25 @@ class: in-person

class: in-person

-## Chapter 3: operating the Swarm
+## Chapter 3: operating the Swarm (advanced material)

-- Breaking into an overlay network
+- (Breaking into an overlay network)

-- Securing overlay networks
+- (Securing overlay networks)

-- Rolling updates
+- (Rolling updates)

- (Secrets management and encryption at rest)

-- [Centralized logging](#logging)
+- ([Centralized logging](#logging))

-- Metrics collection
+- (Metrics collection)

---

class: in-person

-## Chapter 4: bonus material
+## Chapter 4: useful Swarm-fu

- Dealing with stateful services
@@ -543,8 +570,8 @@ You are welcome to use the method that you feel the most comfortable with.

## Brand new versions!

-- Engine 17.03
-- Compose 1.11
+- Engine 17.05
+- Compose 1.12
- Machine 0.10

.exercise[

@@ -560,7 +587,7 @@ You are welcome to use the method that you feel the most comfortable with.

---

-## Wait, what, 17.03 ?!?
+## Wait, what, 17.05 ?!?

--
@@ -670,9 +697,9 @@ class: extra-details

- Containers can have network aliases (resolvable through DNS)

-- Compose file version 2 makes each container reachable through its service name
+- Compose file version 2+ makes each container reachable through its service name

-- Compose file version 1 requires "links" sections
+- Compose file version 1 required "links" sections

- Our code can connect to services using their short name
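To illustrate the service-name DNS behavior, here is a minimal, hypothetical version 2 Compose file (service and image names are made up for the example); `web` can reach `redis` simply by using the hostname `redis`:

```yaml
version: "2"
services:
  web:
    image: nginx   # resolves "redis" through the network's built-in DNS
  redis:
    image: redis   # reachable by its service name, no "links" section needed
```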
@@ -990,6 +1017,20 @@ killall docker-compose

---

## .blackbelt[Wednesday 13:30]

Do you want to know exactly what your code is doing?

Down to the microsecond?

.small[(I want to say nanosecond but I don't want to be too presumptuous)]

Flame graphs! Performance counters! Kernel tracing!

Brendan Gregg will share with us the secrets of container performance analysis.

---

## Accessing internal services

- `rng` and `hasher` are exposed on ports 8001 and 8002
@@ -1755,6 +1796,20 @@ Some presentations from the Docker Distributed Systems Summit in Berlin:

---

## .blackbelt[Tuesday 14:55]

What is this "quorum" thing exactly?

How does Raft *actually* work?

Can I blow up a Swarm cluster to bits and rebuild it from scratch?

Docker Captain Laura Frank will answer all these questions!

*"I dyed my hair red with the blood of offline manager nodes."*

---

## Adding more manager nodes

- Right now, we have only one manager (node1)
@@ -3484,7 +3539,7 @@ You should now be able to connect to port 8000 and see the DockerCoins web UI.

---

-# Breaking into an overlay network
+# (Breaking into an overlay network)

- We will create a dummy placeholder service on our network

@@ -3908,7 +3963,7 @@ class: in-person

---

-# Securing overlay networks
+# (Securing overlay networks)

- By default, overlay networks are using plain VXLAN encapsulation
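For reference, Docker can encrypt overlay traffic (IPsec) when the network is created, via the `encrypted` driver option; a hedged sketch (the network name is arbitrary, and this needs a running Swarm):

```shell
docker network create --driver overlay --opt encrypted secure-net
```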
@@ -4037,7 +4092,31 @@ However, when you run the second one, only `#` will show up.

---

-# Rolling updates
+## .blackbelt[Tuesday 11:45]

In a galaxy far far away ...

--

The Death Star has a REST API, and the Empire protects it with L7 security policies ...

--

Wait, what?

--

Thomas Graf will present BPF and XDP, which are some *really cool* kernel tech.

Then he'll show how Cilium leverages them to implement L7 security policies.

Also he might or might not blow up Death Stars.

.small[(In other news, BPF and XDP are also used by Facebook to achieve 10x performance improvements over IPVS LB.)]

---

+# (Rolling updates)

- We want to release a new version of the worker
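A rolling update like this is typically driven by a single `docker service update`; the sketch below is hedged — the registry address, image tag, and update parameters are hypothetical values, not taken from this diff:

```shell
docker service update worker \
  --image 127.0.0.1:5000/worker:v0.2 \
  --update-parallelism 2 \
  --update-delay 5s
```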
@@ -4217,8 +4296,6 @@ Note: if you updated the roll-out parallelism, *rollback* will not rollback to t

---

class: swarmctl

## Getting task information for a given node

- You can see all the tasks assigned to a node with `docker node ps`

@@ -4235,7 +4312,7 @@ class: swarmctl

class: swarmtools

-## SwarmKit debugging tools
+# SwarmKit debugging tools

- The SwarmKit repository comes with debugging tools
@@ -4459,6 +4536,26 @@ Reminder: this is a very low-level tool, requiring a knowledge of SwarmKit's int

---

## .blackbelt[Wednesday 11:15]

*"Docker takes security seriously."*

--

Well, *everybody* takes security seriously, don't they?

--

*"We don't give a 💩 about security!"* (Said no company ever)

--

Come see how we reaccommodate security bugs, *one patch at a time*.

With Michael Crosby, maintainer of the Engine, libcontainer, runc, containerd ...

---

class: secrets

## Secret management
@@ -5242,6 +5339,24 @@ http://jpetazzo.github.io/2017/01/20/docker-logging-gelf/).

---

## .blackbelt[Tuesday 17:10]

You want to implement surveillance?

I mean – metrics collection?

Gather CPU, RAM, I/O, etc. for all your nodes and containers?

But also applications and business metrics?

Julius Volz will show you what Prometheus can do for you!

(Spoiler alert: a lot!)

---

class: metrics

# Metrics collection

- We want to gather metrics in a central place
@@ -5252,6 +5367,8 @@ http://jpetazzo.github.io/2017/01/20/docker-logging-gelf/).

---

class: metrics

## Node metrics

- CPU, RAM, disk usage on the whole node

@@ -5268,6 +5385,8 @@ http://jpetazzo.github.io/2017/01/20/docker-logging-gelf/).

---

class: metrics

## Container metrics

- Similar to node metrics, but not totally identical

@@ -5288,6 +5407,8 @@ http://jpetazzo.github.io/2013/10/08/docker-containers-metrics/

---

class: metrics

## Tools

We will build *two* different metrics pipelines:

@@ -5302,6 +5423,8 @@ and PWD doesn't have that yet).

---

class: metrics

## First metrics pipeline

We will use three open source Go projects for our first metrics pipeline:
@@ -5320,6 +5443,8 @@ We will use three open source Go projects for our first metrics pipeline:

---

class: metrics

## Snap

- [github.com/intelsdi-x/snap](https://github.com/intelsdi-x/snap)

@@ -5338,6 +5463,8 @@ We will use three open source Go projects for our first metrics pipeline:

---

class: metrics

## InfluxDB

- Snap doesn't store metrics data

@@ -5354,6 +5481,8 @@ We will use three open source Go projects for our first metrics pipeline:

---

class: metrics

## Grafana

- Snap cannot show graphs

@@ -5366,6 +5495,8 @@ We will use three open source Go projects for our first metrics pipeline:

---

class: metrics

## Getting and setting up Snap

- We will install Snap directly on the nodes

@@ -5383,6 +5514,8 @@ We will use three open source Go projects for our first metrics pipeline:

---

class: metrics

## The Snap installer service

- This will get Snap on all nodes
@@ -5408,6 +5541,8 @@ for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done

---

class: metrics

## First contact with `snapd`

- The core of Snap is `snapd`, the Snap daemon

@@ -5430,6 +5565,8 @@ for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done

---

class: metrics

## Using `snapctl` to interact with `snapd`

- Let's load a *collector* plugin and a *publisher* plugin

@@ -5452,6 +5589,8 @@ for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done

---

class: metrics

## Checking what we've done

- Good to know: Docker CLI uses `ls`, Snap CLI uses `list`

@@ -5472,6 +5611,8 @@ for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done

---

class: metrics

## Actually collecting metrics: introducing *tasks*

- To start collecting/processing/publishing metric data, you need to create a *task*

@@ -5493,6 +5634,8 @@ for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done

---

class: metrics

## Our first task manifest

```yaml
@@ -5515,6 +5658,8 @@ for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done

---

class: metrics

## Creating our first task

- The task manifest shown on the previous slide is stored in `snap/psutil-file.yml`.

@@ -5541,6 +5686,8 @@ for BIN in snapd snapctl; do ln -s /opt/snap/bin/$BIN /usr/local/bin/$BIN; done
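Assuming the `snapctl` CLI from intelsdi-x/snap, creating the task from that manifest should look roughly like this (hedged sketch; check `snapctl task create --help` for the exact flags in your Snap version):

```shell
snapctl task create -t snap/psutil-file.yml
```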
---

class: metrics

## Checking existing tasks

.exercise[

@@ -5560,6 +5707,8 @@ The output should look like the following:

```

---

class: metrics

## Viewing our task dollars at work

- The task is using a very simple publisher, `mock-file`

@@ -5579,6 +5728,8 @@ To exit, hit `^C`

---

class: metrics

## Debugging tasks

- When a task is not directly writing to a local file, use `snapctl task watch`

@@ -5597,6 +5748,8 @@ To exit, hit `^C`

---

class: metrics

## Stopping Snap

- Our Snap deployment has a few flaws:
@@ -5609,16 +5762,22 @@ To exit, hit `^C`

--

class: metrics

- We want to change that!

--

class: metrics

- But first, go back to the terminal where `snapd` is running, and hit `^C`

- All tasks will be stopped; all plugins will be unloaded; Snap will exit

---

class: metrics

## Snap Tribe Mode

- Tribe is Snap's clustering mechanism

@@ -5638,6 +5797,8 @@ To exit, hit `^C`
---

class: metrics

## Running Snap itself on every node

- Snap runs in the foreground, so you need to use `&` or start it in tmux

@@ -5655,6 +5816,8 @@ If you're *not* using Play-With-Docker, there is another way to start Snap!

---

class: metrics

## Starting a daemon through SSH

.warning[Hackety hack ahead!]

@@ -5670,6 +5833,8 @@ If you're *not* using Play-With-Docker, there is another way to start Snap!

---

class: metrics

## Running Snap itself on every node

- I might go to hell for showing you this, but here it goes ...

@@ -5693,6 +5858,8 @@ Remember: this *does not work* with Play-With-Docker (which doesn't have SSH).

---

class: metrics

## Viewing the members of our tribe

- If everything went fine, Snap is now running in tribe mode

@@ -5710,6 +5877,8 @@ This should show the 5 nodes with their hostnames.

---

class: metrics

## Create an agreement

- We can now create an *agreement* for our plugins and tasks

@@ -5732,6 +5901,8 @@ The output should look like the following:

---

class: metrics

## Instruct all nodes to join the agreement

- We don't need another fancy global service!
@@ -5756,6 +5927,8 @@ The last bit of output should look like the following:

---

class: metrics

## Start a container on every node

- The Docker plugin requires at least one container to be started

@@ -5775,6 +5948,8 @@ The last bit of output should look like the following:

---

class: metrics

## Running InfluxDB

- We will create a service for InfluxDB

@@ -5795,6 +5970,8 @@ The last bit of output should look like the following:

---

class: metrics

## Creating the InfluxDB service

.exercise[

@@ -5819,6 +5996,8 @@ this breaks a few things.]

---

class: metrics

## Setting up InfluxDB

- We need to create the "snap" database
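With InfluxDB 1.x, a database can be created through the HTTP API; a sketch, assuming the service is reachable under the hypothetical hostname `influxdb` on the default port:

```shell
# Hypothetical hostname; adjust to wherever the InfluxDB service is published.
curl -XPOST "http://influxdb:8086/query" \
     --data-urlencode 'q=CREATE DATABASE "snap"'
```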
@@ -5859,6 +6038,8 @@ Note: the InfluxDB query language *looks like* SQL but it's not.

---

class: metrics

## Load Docker collector and InfluxDB publisher

- We will load plugins on the local node

@@ -5884,6 +6065,8 @@ Note: the InfluxDB query language *looks like* SQL but it's not.

---

class: metrics

## Start a simple collection task

- Again, we will create a task on the local node

@@ -5907,6 +6090,8 @@ container.

---

class: metrics

## If things go wrong...

Note: if a task runs into a problem (e.g. it's trying to publish

@@ -5925,6 +6110,8 @@ the task (it will delete+re-create on all nodes).
---

class: metrics

## Check that metric data shows up in InfluxDB

- Let's check existing data with a few manual queries in the InfluxDB admin interface

@@ -5947,6 +6134,8 @@ the task (it will delete+re-create on all nodes).

---

class: metrics

## Deploy Grafana

- We will use an almost-official image, `grafana/grafana`

@@ -5964,6 +6153,8 @@ the task (it will delete+re-create on all nodes).

---

class: metrics

## Set up Grafana

.exercise[

@@ -5982,6 +6173,8 @@ the task (it will delete+re-create on all nodes).

---

class: metrics

## Add InfluxDB as a data source for Grafana

.small[

@@ -6006,10 +6199,14 @@ If you see an orange box (sometimes without a message), it means that you got so

---

class: metrics

![influxdb setup](grafana-add-influxdb.png)

---

class: metrics

## Create a dashboard in Grafana

.exercise[

@@ -6032,6 +6229,8 @@ At this point, you should see a sample graph showing up.

---

class: metrics

## Setting up a graph in Grafana

.exercise[

@@ -6053,10 +6252,14 @@ Congratulations, you are viewing the CPU usage of a single container!

---

class: metrics

![graph setup](grafana-add-graph.png)

---

class: metrics

## Before moving on ...

- Leave that tab open!
@@ -6067,6 +6270,8 @@ Congratulations, you are viewing the CPU usage of a single container!

---

class: metrics

## Prometheus

- Prometheus is another metrics collection system

@@ -6084,6 +6289,8 @@ Congratulations, you are viewing the CPU usage of a single container!

---

class: metrics

## It's all about the `/metrics`

- This is what the *node exporter* looks like:

@@ -6100,6 +6307,8 @@ Congratulations, you are viewing the CPU usage of a single container!

---

class: metrics

## Collecting metrics with Prometheus on Swarm

- We will run two *global services* (i.e. scheduled on all our nodes):

@@ -6118,6 +6327,8 @@ Congratulations, you are viewing the CPU usage of a single container!
---

class: metrics

## Creating an overlay network for Prometheus

- This is the easiest step ☺

@@ -6133,6 +6344,8 @@ Congratulations, you are viewing the CPU usage of a single container!

---

class: metrics

## Running the node exporter

- The node exporter *should* run directly on the hosts

@@ -6158,6 +6371,8 @@ Congratulations, you are viewing the CPU usage of a single container!
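A global service for the node exporter might be created along these lines; this is a sketch only — the service name, mount points, and collector flags are assumptions based on common node-exporter deployments of that era, not taken from this diff:

```shell
docker service create --name node-exporter --mode global \
  --mount type=bind,source=/proc,target=/host/proc \
  --mount type=bind,source=/sys,target=/host/sys \
  prom/node-exporter \
  -collector.procfs /host/proc -collector.sysfs /host/sys
```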
---

class: metrics

## Running cAdvisor

- Likewise, cAdvisor *should* run directly on the hosts

@@ -6180,6 +6395,8 @@ Congratulations, you are viewing the CPU usage of a single container!

---

class: metrics

## Configuring the Prometheus server

This will be our configuration file for Prometheus:

@@ -6205,6 +6422,8 @@ scrape_configs:
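The actual file is not reproduced in this diff; a minimal sketch of such a configuration, assuming a node exporter on port 9100, cAdvisor on port 8080, and Swarm's `tasks.<service>` DNS for discovery (the service names are hypothetical):

```yaml
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: node
    dns_sd_configs:
      - names: ["tasks.node-exporter"]   # hypothetical Swarm service name
        type: A
        port: 9100
  - job_name: cadvisor
    dns_sd_configs:
      - names: ["tasks.cadvisor"]        # hypothetical Swarm service name
        type: A
        port: 8080
```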
---

class: metrics

## Passing the configuration to the Prometheus server

- We need to provide our custom configuration to the Prometheus server

@@ -6225,6 +6444,8 @@ scrape_configs:

---

class: metrics

## Building our custom Prometheus image

- We will use the local registry started previously on 127.0.0.1:5000

@@ -6245,6 +6466,8 @@ scrape_configs:
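Building and pushing to a local registry is a standard two-step; a sketch, assuming a Dockerfile embedding the configuration sits in the current directory and the image name is arbitrary:

```shell
docker build -t 127.0.0.1:5000/prometheus .
docker push 127.0.0.1:5000/prometheus
```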
---

class: metrics

## Running our custom Prometheus image

- That's the only service that needs to be published

@@ -6263,6 +6486,8 @@ scrape_configs:
---

class: metrics

## Checking our Prometheus server

- First, let's make sure that Prometheus is correctly scraping all metrics

@@ -6281,6 +6506,8 @@ Their state should be "UP".

---

class: metrics

## Displaying metrics directly from Prometheus

- This is easy ... if you are familiar with PromQL

@@ -6304,6 +6531,8 @@ Their state should be "UP".

---

class: metrics

## Building the query from scratch

- We are going to build the same query from scratch

@@ -6318,6 +6547,8 @@ Their state should be "UP".
---

class: metrics

## Displaying a raw metric for *all* containers

- Click on the "Graph" tab on top

@@ -6338,6 +6569,8 @@ Their state should be "UP".

---

class: metrics

## Selecting metrics for a specific service

- Hover over the lines in the graph

@@ -6358,6 +6591,8 @@ Their state should be "UP".

---

class: metrics

## Turn counters into rates

- What we see is the total amount of CPU used (in seconds)

@@ -6381,6 +6616,8 @@ Their state should be "UP".

---

class: metrics

## Aggregate multiple data series

- We have one graph per CPU; we want to sum them

@@ -6408,6 +6645,8 @@ Their state should be "UP".
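Putting the last two steps together, the kind of PromQL query being built looks like this (the metric and label names assume the cAdvisor exporter; verify them against your own `/metrics` output):

```promql
sum by (container_label_com_docker_swarm_service_name) (
  rate(container_cpu_usage_seconds_total[1m])
)
```

`rate(...[1m])` turns the ever-increasing CPU counter into per-second usage, and `sum by (...)` collapses the per-CPU series into one series per service.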
---

class: metrics

## Comparing Snap and Prometheus data

- If you haven't set up Snap, InfluxDB, and Grafana, skip this section

@@ -6420,6 +6659,8 @@ Their state should be "UP".

---

class: metrics

## Add Prometheus as a data source in Grafana

.exercise[

@@ -6438,6 +6679,8 @@ We see the same input form that we filled earlier to connect to InfluxDB.

---

class: metrics

## Connecting to Prometheus from Grafana

.exercise[

@@ -6460,6 +6703,8 @@ Otherwise, double-check every field and try again!

---

class: metrics

## Adding the Prometheus data to our dashboard

.exercise[

@@ -6476,6 +6721,8 @@ This takes us to the graph editor that we used earlier.

---

class: metrics

## Querying Prometheus data from Grafana

The editor is a bit less friendly than the one we used for InfluxDB.

@@ -6499,6 +6746,8 @@ The editor is a bit less friendly than the one we used for InfluxDB.

---

class: metrics

## Interpreting results

- The two graphs *should* be similar

@@ -6521,6 +6770,8 @@ The editor is a bit less friendly than the one we used for InfluxDB.

---

class: metrics

## More resources on container metrics

- [Docker Swarm & Container Overview](https://grafana.net/dashboards/609),
@@ -7071,11 +7322,31 @@ class: title

---

## .blackbelt[Wednesday 14:25]

Don't get hacked!

.small[(Especially by sketchy Russian groups, since it leads to privilege escalation and compromises Democracy. Just saying.)]

Sign images!

Require multiple keys for signature!

Revoke compromised keys and reissue new ones easily!

Prevent replay attacks and use of obsolete vulnerable software!

Justin Cappos will tell you all about *securing the software supply chain!*

.small[(This is Infosec! It's Very Important! Therefore I Am Allowed To Use Lots Of Exclamation Marks!)]

---

## Work in progress

- Stabilize Compose/Swarm integration

<!--
- Refine Snap deployment
-->

- Healthchecks
@@ -7110,7 +7381,7 @@ class: title

var slideshow = remark.create({
  ratio: '16:9',
  highlightSpans: true,
-  excludedClasses: ["in-person"]
+  excludedClasses: ["self-paced", "extra-details", "metrics", "swarmtools", "secrets", "encryption-at-rest"]
});
</script>
</body>
@@ -66,20 +66,24 @@ aws_display_instances_by_tag() {
    fi
}

aws_get_instance_ids_by_filter() {
    FILTER=$1
    aws ec2 describe-instances --filters $FILTER \
        --query Reservations[*].Instances[*].InstanceId \
        --output text | tr "\t" "\n"
}
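The helper relies on `--output text`, which prints the matching instance IDs tab-separated on one line; the trailing `tr` turns that into one ID per line. The transformation itself can be checked without AWS credentials (the IDs below are made up):

```shell
# Simulate the tab-separated text output of `aws ec2 describe-instances
# --output text` and split it into one instance ID per line, as the helper does.
printf 'i-0aaa\ti-0bbb\ti-0ccc' | tr "\t" "\n"
```

This is why the refactored `aws_get_instance_ids_by_*` functions no longer need the fragile `grep ^INSTANCE | awk '{print $8}'` pipeline: `--query` already extracts exactly the field they want.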
aws_get_instance_ids_by_client_token() {
    TOKEN=$1
    need_tag $TOKEN
-    aws ec2 describe-instances --filters "Name=client-token,Values=$TOKEN" \
-        | grep ^INSTANCE \
-        | awk '{print $8}'
+    aws_get_instance_ids_by_filter Name=client-token,Values=$TOKEN
}

aws_get_instance_ids_by_tag() {
    TAG=$1
    need_tag $TAG
-    aws ec2 describe-instances --filters "Name=tag:Name,Values=$TAG" \
-        | grep ^INSTANCE \
-        | awk '{print $8}'
+    aws_get_instance_ids_by_filter Name=tag:Name,Values=$TAG
}

aws_get_instance_ips_by_tag() {