Files
container.training/slides/containers/Orchestration_Overview.md
2021-10-31 01:03:38 +02:00

443 lines
7.3 KiB
Markdown

# Orchestration, an overview
In this chapter, we will:
* Explain what is orchestration and why we would need it.
* Present (from a high-level perspective) some orchestrators.
---
class: pic
## What's orchestration?
![Joana Carneiro (orchestra conductor)](images/conductor.jpg)
---
## What's orchestration?
According to Wikipedia:
*Orchestration describes the __automated__ arrangement,
coordination, and management of complex computer systems,
middleware, and services.*
--
*[...] orchestration is often discussed in the context of
__service-oriented architecture__, __virtualization__, provisioning,
Converged Infrastructure and __dynamic datacenter__ topics.*
--
What does that really mean?
---
## Example 1: dynamic cloud instances
--
- Q: do we always use 100% of our servers?
--
- A: obviously not!
.center[![Daily variations of traffic](images/traffic-graph.png)]
---
## Example 1: dynamic cloud instances
- Every night, scale down
(by shutting down extraneous replicated instances)
- Every morning, scale up
(by deploying new copies)
- "Pay for what you use"
(i.e. save big $$$ here)
---
## Example 1: dynamic cloud instances
How do we implement this?
- Crontab
- Autoscaling (save even bigger $$$)
That's *relatively* easy.
Now, how are things for our IAAS provider?
---
## Example 2: dynamic datacenter
- Q: what's the #1 cost in a datacenter?
--
- A: electricity!
--
- Q: what uses electricity?
--
- A: servers, obviously
- A: ... and associated cooling
--
- Q: do we always use 100% of our servers?
--
- A: obviously not!
---
## Example 2: dynamic datacenter
- If only we could turn off unused servers during the night...
- Problem: we can only turn off a server if it's totally empty!
(i.e. all VMs on it are stopped/moved)
- Solution: *migrate* VMs and shutdown empty servers
(e.g. combine two hypervisors with 40% load into 80%+0%,
<br/>and shut down the one at 0%)
---
## Example 2: dynamic datacenter
How do we implement this?
- Shut down empty hosts (but keep some spare capacity)
- Start hosts again when capacity gets low
- Ability to "live migrate" VMs
(Xen already did this 10+ years ago)
- Rebalance VMs on a regular basis
- what if a VM is stopped while we move it?
- should we allow provisioning on hosts involved in a migration?
*Scheduling* becomes more complex.
---
## What is scheduling?
According to Wikipedia (again):
*In computing, scheduling is the method by which threads,
processes or data flows are given access to system resources.*
The scheduler is concerned mainly with:
- throughput (total amount of work done per time unit);
- turnaround time (between submission and completion);
- response time (between submission and start);
- waiting time (between job readiness and execution);
- fairness (appropriate times according to priorities).
In practice, these goals often conflict.
**"Scheduling" = decide which resources to use.**
---
## Exercise 1
- You have:
- 5 hypervisors (physical machines)
- Each server has:
- 16 GB RAM, 8 cores, 1 TB disk
- Each week, your team requests:
- one VM with X RAM, Y CPU, Z disk
Scheduling = deciding which hypervisor to use for each VM.
Difficulty: easy!
---
<!-- Warning, two almost identical slides (for img effect) -->
## Exercise 2
- You have:
- 1000+ hypervisors (and counting!)
- Each server has different resources:
- 8-500 GB of RAM, 4-64 cores, 1-100 TB disk
- Multiple times a day, a different team asks for:
- up to 50 VMs with different characteristics
Scheduling = deciding which hypervisor to use for each VM.
Difficulty: ???
---
<!-- Warning, two almost identical slides (for img effect) -->
## Exercise 2
- You have:
- 1000+ hypervisors (and counting!)
- Each server has different resources:
- 8-500 GB of RAM, 4-64 cores, 1-100 TB disk
- Multiple times a day, a different team asks for:
- up to 50 VMs with different characteristics
Scheduling = deciding which hypervisor to use for each VM.
![Troll face](images/trollface.png)
---
## Exercise 3
- You have machines (physical and/or virtual)
- You have containers
- You are trying to put the containers on the machines
- Sounds familiar?
---
class: pic
## Scheduling with one resource
.center[![Not-so-good bin packing](images/binpacking-1d-1.gif)]
## We can't fit a job of size 6 :(
---
class: pic
## Scheduling with one resource
.center[![Better bin packing](images/binpacking-1d-2.gif)]
## ... Now we can!
---
class: pic
## Scheduling with two resources
.center[![2D bin packing](images/binpacking-2d.gif)]
---
class: pic
## Scheduling with three resources
.center[![3D bin packing](images/binpacking-3d.gif)]
---
class: pic
## You need to be good at this
.center[![Tangram](images/tangram.gif)]
---
class: pic
## But also, you must be quick!
.center[![Tetris](images/tetris-1.png)]
---
class: pic
## And be web scale!
.center[![Big tetris](images/tetris-2.gif)]
---
class: pic
## And think outside (?) of the box!
.center[![3D tetris](images/tetris-3.png)]
---
class: pic
## Good luck!
.center[![FUUUUUU face](images/fu-face.jpg)]
---
## TL,DR
* Scheduling with multiple resources (dimensions) is hard.
* Don't expect to solve the problem with a Tiny Shell Script.
* There are literally tons of research papers written on this.
---
## But our orchestrator also needs to manage ...
* Network connectivity (or filtering) between containers.
* Load balancing (external and internal).
* Failure recovery (if a node or a whole datacenter fails).
* Rolling out new versions of our applications.
(Canary deployments, blue/green deployments...)
---
## Some orchestrators
We are going to present briefly a few orchestrators.
There is no "absolute best" orchestrator.
It depends on:
- your applications,
- your requirements,
- your pre-existing skills...
---
## Nomad
- Open Source project by Hashicorp.
- Arbitrary scheduler (not just for containers).
- Great if you want to schedule mixed workloads.
(VMs, containers, processes...)
- Less integration with the rest of the container ecosystem.
---
## Mesos
- Open Source project in the Apache Foundation.
- Arbitrary scheduler (not just for containers).
- Two-level scheduler.
- Top-level scheduler acts as a resource broker.
- Second-level schedulers (aka "frameworks") obtain resources from top-level.
- Frameworks implement various strategies.
(Marathon = long running processes; Chronos = run at intervals; ...)
- Commercial offering through DC/OS by Mesosphere.
---
## Rancher
- Rancher 1 offered a simple interface for Docker hosts.
- Rancher 2 is a complete management platform for Docker and Kubernetes.
- Technically not an orchestrator, but it's a popular option.
---
## Swarm
- Tightly integrated with the Docker Engine.
- Extremely simple to deploy and setup, even in multi-manager (HA) mode.
- Secure by default.
- Strongly opinionated:
- smaller set of features,
- easier to operate.
---
## Kubernetes
- Open Source project initiated by Google.
- Contributions from many other actors.
- *De facto* standard for container orchestration.
- Many deployment options; some of them very complex.
- Reputation: steep learning curve.
- Reality:
- true, if we try to understand *everything*;
- false, if we focus on what matters.
???
:EN:- Orchestration overview
:FR:- Survol de techniques d'orchestration