Compare commits

255 Commits

Author SHA1 Message Date
Jérôme Petazzoni
cbee7484ae update docs for new slide deck generator 2017-10-19 17:08:22 +02:00
Jerome Petazzoni
764d33c884 power outlets are the worst 2017-10-16 09:05:50 +02:00
Jérôme Petazzoni
a4fc5b924f Last update after dry run 2017-10-16 00:56:55 +02:00
Jérôme Petazzoni
b155000d56 hero syndrome (thanks @soulshake) 2017-10-15 23:23:18 +02:00
Jérôme Petazzoni
baf48657d0 Clean up a bunch of titles 2017-10-15 23:13:06 +02:00
Jérôme Petazzoni
b4b22ff47b Add chat variable to workshop YML files 2017-10-15 23:01:46 +02:00
Jérôme Petazzoni
c4b131ae5e Add black belt refs 2017-10-15 22:37:23 +02:00
Jérôme Petazzoni
af1031760b Add blackbelt icon and css 2017-10-15 22:11:13 +02:00
Jérôme Petazzoni
da7c4742bf Netlify is <3 2017-10-14 22:37:28 +02:00
Jérôme Petazzoni
a3cf917100 Add dashboard section + kubectl apply sec talk 2017-10-14 17:42:02 +02:00
Jérôme Petazzoni
96a5cc15ec Add a comment at end of each slide showing origin 2017-10-14 16:56:12 +02:00
Jérôme Petazzoni
117c6c18e9 pull-images -> pull_images 2017-10-14 14:22:46 +02:00
Jérôme Petazzoni
8b0173bc87 Fix build-forever (entr is buggy, yo) 2017-10-14 14:19:21 +02:00
Jérôme Petazzoni
9450ed2057 Improve error reporting (thanks @bretfisher for reporting this) 2017-10-14 14:10:50 +02:00
Romain Degez
994be990f5 Remove extraneous chapter title 2017-10-14 12:31:58 +02:00
Jérôme Petazzoni
e2fd9531ef Add rolling upgrade section and whatsnext 2017-10-13 23:16:02 +02:00
Jérôme Petazzoni
8bb7243aaf Imperative vs declarative; spec 2017-10-13 20:43:31 +02:00
Jérôme Petazzoni
9a067f2064 kube agenda 2017-10-13 20:05:03 +02:00
Jérôme Petazzoni
009bc2089d Backport #91 2017-10-13 19:54:21 +02:00
Jérôme Petazzoni
b3a7e36c37 fixup build.sh script 2017-10-13 19:23:00 +02:00
Jérôme Petazzoni
a9a82ccd1e Rework slide builder + add section on daemonsets 2017-10-12 20:49:59 +02:00
Jérôme Petazzoni
6abbebe00d Reword Compose File v3 explanations 2017-10-12 10:38:16 +02:00
Jérôme Petazzoni
4dec9c43f1 One more round of updates for dc17eu 2017-10-12 10:17:52 +02:00
Jérôme Petazzoni
3369005e06 Revert to single HTML generator and parametrize excludeClasses 2017-10-12 09:53:14 +02:00
Jérôme Petazzoni
c4d76ba367 Backport #92 (thanks @bretfisher 👍🏻) 2017-10-12 09:04:56 +02:00
Jérôme Petazzoni
ba24b66d84 Fix extra-details icon 2017-10-12 00:12:30 +02:00
Jérôme Petazzoni
8d2391e4d6 Add kubespawn 2017-10-12 00:12:19 +02:00
Jérôme Petazzoni
4c68847dd1 Remove PWD reference from kube material 2017-10-11 23:33:13 +02:00
Jérôme Petazzoni
b67371e0ec Helper script based on entr 2017-10-11 23:32:14 +02:00
Jérôme Petazzoni
cd2cf9b3a4 Tweak page number positioning 2017-10-11 23:31:59 +02:00
Jérôme Petazzoni
4eaf2310b6 Add how to run and expose services on kube 2017-10-11 23:31:39 +02:00
Jérôme Petazzoni
20e9517722 Put slide number in top-left corner 2017-10-11 16:00:23 +02:00
Jérôme Petazzoni
553fd6b742 Fix custom prompt 2017-10-11 15:54:56 +02:00
Jérôme Petazzoni
25c8623a81 Add kubectl completion 2017-10-11 15:54:46 +02:00
Jérôme Petazzoni
f787d1b6c3 Add kube concepts + kubectl primer 2017-10-11 15:49:11 +02:00
Jérôme Petazzoni
825257427f Split out selfpaced and dockercon workshops 2017-10-10 17:55:22 +02:00
Jérôme Petazzoni
e28a64c6cf Remove old version 2017-10-09 18:04:44 +02:00
Jérôme Petazzoni
f8888bf16a Split out content to many smaller files
And add markmaker.py to generate workshop.md
2017-10-09 16:56:23 +02:00
Jérôme Petazzoni
ac523e0f14 Add upstream URL 2017-10-09 13:30:38 +02:00
Jérôme Petazzoni
3211c1ba8a Add data-path option 2017-10-07 19:24:07 +02:00
Jérôme Petazzoni
f1aa5d07fa Fix printing 2017-10-07 15:14:46 +02:00
Jérôme Petazzoni
c0e2fc8832 Allow to run workshopctl in a container 2017-10-06 21:40:39 +02:00
Jérôme Petazzoni
08722db23f Major rehaul of trainer script (it is now workshopctl) 2017-10-06 19:01:15 +02:00
Jérôme Petazzoni
11ec3336eb Remove media dir (unused) 2017-10-06 13:10:52 +02:00
Jérôme Petazzoni
42603d6f62 Add host network in Swarm mode 2017-10-05 14:27:23 +02:00
Jérôme Petazzoni
5c825c864c Allow to start+deploy in a single step 2017-10-05 12:55:36 +02:00
Jérôme Petazzoni
186b30a742 Add a couple of slides about events 2017-10-04 17:13:01 +02:00
Jérôme Petazzoni
06b97454c6 Add section about configs 2017-10-04 16:36:48 +02:00
Jérôme Petazzoni
c393d2aa51 Remove older (unused) stacks 2017-10-04 15:21:33 +02:00
Jérôme Petazzoni
3817332905 Remove obsolete scripts 2017-10-04 15:20:01 +02:00
Jérôme Petazzoni
b7dbbd4633 Add kubernetes deployment code (behind cheap feature switch) 2017-10-03 22:15:43 +02:00
Jérôme Petazzoni
b0a34aa106 Remove Swarm classic 2017-10-02 13:33:58 +02:00
Jérôme Petazzoni
36f512a3d3 Backport content from DOD MSP 2017-09-29 23:35:52 +02:00
Jérôme Petazzoni
87cbbd5c35 Backport a few updates from devopscon 2017-09-29 23:17:27 +02:00
Jérôme Petazzoni
2f6689d639 Refactor card generation to use Jinja templates
This makes the card generation process a bit easier to customize.
A few issues with Chrome page breaks were also fixed.
2017-09-29 22:29:08 +02:00
Jérôme Petazzoni
4f7651855e Update version numbers 2017-09-29 19:24:24 +02:00
Jérôme Petazzoni
aea59c757e Add HEALTHCHECK support, courtesy of @bretfisher 2017-09-27 18:07:03 +02:00
Jérôme Petazzoni
af2d82d00a Merge branch 'BretFisher-healthcheck-auto-rollback' 2017-09-27 12:32:43 +02:00
Jérôme Petazzoni
5b8861009d Merge branch 'healthcheck-auto-rollback' of https://github.com/BretFisher/orchestration-workshop into BretFisher-healthcheck-auto-rollback 2017-09-27 12:32:29 +02:00
Jérôme Petazzoni
674bfe82c7 Remove conference hashtag in CTA tweet link (closes #77) 2017-09-27 12:20:08 +02:00
Jérôme Petazzoni
8f61a2fffa If any of the commands of postprep fails, abort
Closes #80
2017-09-27 12:14:55 +02:00
Jérôme Petazzoni
748881d37d Add a fancy table! 2017-09-26 21:55:09 +02:00
Jérôme Petazzoni
d29863a0e0 Merge branch 'ops-feature-history' of https://github.com/BretFisher/orchestration-workshop 2017-09-26 18:42:04 +02:00
Jérôme Petazzoni
acc84729a2 Merge pull request #89 from BretFisher/add-inline-code-bg
Add inline code background color
2017-09-13 14:26:23 -07:00
Bret Fisher
9af9477f65 ugg spacing 2017-09-12 19:09:51 -07:00
Bret Fisher
15cca15ec5 add inline-code grey background
So much grey! All the grey's!
2017-09-12 19:08:36 -07:00
Bret Fisher
685ea653fe adding healthcheck with rollback 2017-09-12 19:03:52 -07:00
Jérôme Petazzoni
bf13657a8f Merge branch 'master' of github.com:jpetazzo/orchestration-workshop 2017-08-03 11:02:45 +02:00
Jérôme Petazzoni
9c7fb40475 Merge branch 'BretFisher-user-namespaces' 2017-08-03 11:02:27 +02:00
Jérôme Petazzoni
b1b8b53a2f Adapt @bretfisher work to match formatting etc 2017-08-03 11:01:31 +02:00
Jérôme Petazzoni
69259c27a1 Merge branch 'user-namespaces' of https://github.com/BretFisher/orchestration-workshop into BretFisher-user-namespaces 2017-08-03 08:40:53 +02:00
Jérôme Petazzoni
7354974ece Merge pull request #87 from lastcoolnameleft/patch-1
1.9.0 does not support docker-compse.yml Version 3
2017-08-01 23:31:53 -07:00
Tommy Falgout
5379619026 1.9.0 does not support docker-compse.yml Version 3 2017-08-01 17:46:21 -05:00
Jérôme Petazzoni
0d7ee1dda0 Merge branch 'alexellis-alexellis-patch-sol' 2017-07-12 13:41:45 +02:00
Jérôme Petazzoni
243d585432 Add a few details about what happens when losing the sole manager 2017-07-12 13:41:37 +02:00
Alex Ellis
f5fe7152f3 Internationalisation
I had no idea what SOL was - had to google this on Urban Dictionary :-/ have put an internationalisation in and retained the colliqualism in brackets.
2017-07-11 19:00:23 +01:00
Jérôme Petazzoni
94d9ad22d0 Add ngrep details when using PWD or Vagrant re/ interface selection (closes #84) 2017-07-11 19:51:00 +02:00
Bret Fisher
59f1b1069d fixed some feature release confusion 2017-06-26 14:34:03 -04:00
Bret Fisher
c30386a73d added ops feature history slide 2017-06-18 20:28:51 -07:00
Jérôme Petazzoni
0af160e0a8 Merge pull request #82 from adulescentulus/fix_visualizer_exercise
(some) wrong instructions
2017-06-17 09:31:31 -07:00
Andreas Groll
1fdb7b8077 added missing stackname 2017-06-12 15:25:35 +02:00
Andreas Groll
d2b67c426e you only can connect to the ip where you started your visualizer 2017-06-12 12:07:59 +02:00
Jérôme Petazzoni
a84cc36cd8 Update installation method 2017-06-09 18:16:29 +02:00
Jerome Petazzoni
c8ecf5a647 PYCON final check! 2017-05-17 18:14:33 -07:00
Jerome Petazzoni
e9ee050386 Explain extra details 2017-05-17 15:56:28 -07:00
Jerome Petazzoni
6e59e2092c Merge branch 'master' of github.com:jpetazzo/orchestration-workshop 2017-05-17 15:00:42 -07:00
Jerome Petazzoni
c7b0fd32bd Add detail about ASGs 2017-05-17 15:00:31 -07:00
Jérôme Petazzoni
ead4e33604 Merge pull request #79 from jliu70/oscon2017
fix typo
2017-05-17 14:31:26 -07:00
Jérôme Petazzoni
96b4f76c67 Backport all changes from OSCON 2017-05-17 00:17:24 -05:00
Jeff Liu
6337d49123 fix typo 2017-05-08 10:21:51 -05:00
Jerome Petazzoni
aec2de848b Rename docker-compose files to keep .yml extension (fixes #69) 2017-05-03 12:44:17 -07:00
Jérôme Petazzoni
91942f22a0 Merge pull request #73 from everett-toews/cd-to-snap
Change to the snap dir first
2017-05-03 14:36:52 -05:00
Jérôme Petazzoni
93cdc9d987 Merge pull request #72 from everett-toews/fix-worker-service-name
Fix the dockercoins_worker service name
2017-05-03 14:36:27 -05:00
Jérôme Petazzoni
13e6283221 Merge pull request #71 from everett-toews/netshoot
Consistent use of the netshoot image
2017-05-03 14:35:54 -05:00
Jerome Petazzoni
e56bea5c16 Update Swarm visualizer information 2017-05-03 12:36:09 -07:00
Jerome Petazzoni
eda499f084 Fix link to Raft (thanks @kchien) - fixes #74 2017-05-03 12:20:45 -07:00
Jerome Petazzoni
ae638b8e89 Minor updates before GOTO 2017-05-03 11:46:35 -07:00
Jerome Petazzoni
5296be32ed Handle untagged resources 2017-05-03 11:26:47 -07:00
Jerome Petazzoni
f1cd3ba7d0 Remove rc.yaml 2017-05-03 10:02:36 -07:00
Jérôme Petazzoni
b307adee91 Last updates
Conflicts:
	docs/index.html
2017-05-03 09:34:42 -07:00
Jérôme Petazzoni
f4540fad78 Update describe-instances for awscli 1.11 (thanks @mikegcoleman for finding that bug!) 2017-05-03 09:15:45 -07:00
Jérôme Petazzoni
70db794111 Simplify stackfiles 2017-04-16 23:56:30 -05:00
Jérôme Petazzoni
abafc0c8ec Add swarm-rafttool 2017-04-16 23:47:56 -05:00
Everett Toews
a7dba759a8 Change to the snap dir first 2017-04-16 14:34:49 -05:00
Everett Toews
b14662490a Fix the dockercoins_worker service name 2017-04-16 13:23:54 -05:00
Everett Toews
9d45168752 Consistent use of the netshoot image 2017-04-16 13:16:02 -05:00
Jérôme Petazzoni
7b3c9cd2c3 Add @alexmavr/swarm-nbt (FTW!) 2017-04-15 18:29:32 -05:00
Jérôme Petazzoni
84d4a367ec Mention --filter for docker service ps 2017-04-15 17:45:24 -05:00
Jérôme Petazzoni
bd6b37b573 Add @manomarks' Swarm viz tool 2017-04-15 17:21:38 -05:00
Jérôme Petazzoni
e1b2a4440d Update docker service logs; --detach=false 2017-04-14 15:39:52 -05:00
Jérôme Petazzoni
1b5365d905 Update settings; add security workshop 2017-04-14 15:39:24 -05:00
Jérôme Petazzoni
27ea268026 Automatically resolve AMI ID to use 2017-04-14 15:32:03 -05:00
Bret Fisher
45402a28e5 updated to preventls accidently registry delete 2017-04-14 02:37:07 -04:00
Bret Fisher
9e97c7a490 adding user namspace change and daemon.json example
also adding .footnote css
2017-04-14 01:34:51 -04:00
Jérôme Petazzoni
b0f566538d Re-add useful self-paced slides 2017-03-31 21:49:57 -05:00
Jerome Petazzoni
e637354d3e Fix TOC and minor tweaks 2017-03-31 21:41:24 -05:00
Jerome Petazzoni
1f8c27b1aa Update deployed versions 2017-03-31 21:40:05 -05:00
Jerome Petazzoni
f7d317d960 Backporting Devoxx updates 2017-03-31 21:39:48 -05:00
Jérôme Petazzoni
a8c54a8afd Update chat links 2017-03-31 21:36:08 -05:00
Jerome Petazzoni
73b3752c7e Change chat links 2017-03-31 21:33:12 -05:00
Jérôme Petazzoni
d60ba2e91e Merge pull request #68 from hknust/master
Service name should be dockercoins_worker not worker
2017-03-30 17:11:37 -05:00
Jérôme Petazzoni
d480f5c26a Clarify node switching commands 2017-03-20 19:30:38 -07:00
Jérôme Petazzoni
540aa91f48 Hotfix JS file 2017-03-10 16:46:51 -06:00
Jérôme Petazzoni
8f3c0da385 Use our custom fork of remark; updates for Docker Birthday 2017-03-10 16:40:48 -06:00
Holger Knust
6610ff178d Fixed typo on slide. Attempts instead of attemps 2017-03-04 23:13:35 -08:00
Holger Knust
9a9e725d5b Service name should be dockercoins_worker not worker 2017-03-04 11:29:01 -08:00
Jérôme Petazzoni
09cabc556e Update for SCALE 15x 2017-03-02 16:38:59 -08:00
Jérôme Petazzoni
44f4017992 Switch from localhost to 127.0.0.1 (to work around some weird DNS issues) 2017-03-02 14:06:59 -08:00
Jérôme Petazzoni
6f85ff7824 Reorganize advanced content for Docker Birthday 2017-02-16 15:16:06 -06:00
Jérôme Petazzoni
514ac69a8f Ship part 1 for Docker Birthday 2017-02-15 00:03:01 -06:00
Jérôme Petazzoni
7418691249 Rework intro for self-guided workshop 2017-02-14 10:15:27 -06:00
Jérôme Petazzoni
4d2289b2d2 Add details about authorization plugins 2017-02-09 12:33:55 -06:00
Jerome Petazzoni
e0956be92c Add link target for logging 2017-01-20 16:24:15 -08:00
Jérôme Petazzoni
d623f76a02 add note on API scope 2017-01-13 19:29:22 -06:00
Jérôme Petazzoni
dd555af795 update section about restart condition 2017-01-13 17:59:57 -06:00
Jérôme Petazzoni
a2da3f417b update secret section 2017-01-13 17:35:45 -06:00
Jérôme Petazzoni
d129b37781 minor updates, including services ps -a flag 2017-01-13 16:22:58 -06:00
Jérôme Petazzoni
849ea6e576 improve LB demo a bit 2017-01-13 16:04:53 -06:00
Jérôme Petazzoni
7ed54eee66 Merge pull request #64 from trapier/slides_comment_format
slides: code block comment formatting on snap install
2016-12-12 17:59:21 -06:00
Trapier Marshall
1dca8e5a7a slides: code block comment formatting
This will make it easier to copy-paste the whole block used for
snap installation
2016-12-12 11:03:30 -05:00
Jérôme Petazzoni
165de1dbb5 Merge pull request #63 from trapier/slides_cosmetic_edits
couple of cosmetic edits to slides
2016-12-11 21:48:57 -06:00
Trapier Marshall
b7afd13012 couple cosmetic corrections to slides 2016-12-11 01:16:30 -05:00
Jerome Petazzoni
e8b64c5e08 Last touch-ups for LISA16! Good to go! 2016-12-05 19:32:39 -08:00
Jerome Petazzoni
9124eb0e07 Add healthchecks in WIP section 2016-12-05 13:32:09 -08:00
Jerome Petazzoni
0bede24e23 Add what's next section 2016-12-05 10:49:31 -08:00
Jerome Petazzoni
ee79e5ba86 Add MOSH instructions 2016-12-05 10:32:29 -08:00
Jerome Petazzoni
9078cfb57d DAB -> Compose v3 2016-12-05 08:53:31 -08:00
Jerome Petazzoni
6854698fe1 Add Fluentd instructions (contrib) 2016-12-04 17:07:48 -08:00
Jerome Petazzoni
16a4dac192 Add "replayability" instructions 2016-12-04 16:40:17 -08:00
Jerome Petazzoni
0029fa47c5 Update secrets and autolock chapters (thanks @diogomonica for feedback and pointers!) 2016-12-04 09:19:09 -08:00
Jerome Petazzoni
a53636340b Tweak 2016-12-03 10:30:29 -08:00
Jerome Petazzoni
c95b88e562 Secrets management and data encryption 2016-12-03 10:28:20 -08:00
Jerome Petazzoni
d438bd624a Merge branch 'master' of github.com:jpetazzo/orchestration-workshop 2016-12-02 17:50:39 -08:00
Jerome Petazzoni
839746831b Improve illustration a bit 2016-12-02 17:50:29 -08:00
Jérôme Petazzoni
0b1b589314 Merge pull request #60 from hubertst/patch-1
Update provisioning.yml
2016-12-02 16:47:54 -08:00
Hubert
61d2709f8f Update provisioning.yml
fix for ansible 2.2
2016-12-02 09:49:52 +01:00
Jerome Petazzoni
1741a7b35a Add encrypted networks 2016-12-01 22:15:42 -08:00
Jerome Petazzoni
e101856dd7 dynamic scheduling 2016-12-01 17:18:00 -08:00
Jerome Petazzoni
d451f9c7bf Add note on docker service update --mode 2016-12-01 15:52:05 -08:00
Jerome Petazzoni
b021b0eec8 Addtl metrics resources 2016-12-01 15:43:49 -08:00
Jerome Petazzoni
e4f824fd07 docker system ... 2016-11-30 15:54:14 -08:00
Jerome Petazzoni
019165e98c Re-enable a few slides (checked all ??? slides) 2016-11-29 13:02:42 -08:00
Jerome Petazzoni
cf5c2d5741 Add PromQL details + side-by-side Prom&Snap comparison 2016-11-29 12:59:28 -08:00
Jerome Petazzoni
971bf85b17 Clarify raft usage 2016-11-28 17:44:15 -08:00
Jerome Petazzoni
83749ade43 Add "what did we change in this app?" section 2016-11-28 17:17:24 -08:00
Jerome Petazzoni
76fb2f2e2c Add prometheus files (fixes #58) 2016-11-28 12:30:56 -08:00
Jerome Petazzoni
6bda8147e4 Merge branch 'lisa16' 2016-11-28 12:28:03 -08:00
Jerome Petazzoni
95751d1ee9 Merge branch 'master' of github.com:jpetazzo/orchestration-workshop 2016-11-23 15:18:12 -08:00
Jerome Petazzoni
12adae107e Update instructions to install Compose in nodes
Closes #51

(Also addresses remarks about using Machine in older EC2 accounts lacking VPC)
2016-11-23 15:18:07 -08:00
Jerome Petazzoni
c652ea08a2 Upgrade to remark 0.14 (closes #38) 2016-11-23 14:45:03 -08:00
Jerome Petazzoni
30008e4af6 Add warning re/ swarmtctl (fixes #35) 2016-11-23 14:34:44 -08:00
Jérôme Petazzoni
bb262e27e8 Merge pull request #55 from stefanlasiewski/master
"Using Docker Machine to communicate with a node" missing the `docker-machine env` command
2016-11-23 12:27:55 -06:00
Jerome Petazzoni
9656d959cc Switch to EBS-based instances; change default instance type to t2.medium 2016-11-21 17:10:07 -08:00
Jerome Petazzoni
46b772b95e First round of updates for LISA 2016-11-21 16:55:47 -08:00
stefanlasiewski
f801e1b9ad Add instructions for VMware Fusion. 2016-11-21 11:44:13 -08:00
stefanlasiewski
1c44d7089a Merge branch 'master' of https://github.com/stefanlasiewski/orchestration-workshop 2016-11-18 14:44:58 -08:00
stefanlasiewski
1f7f4a29ff docker-machine ... should actually be docker-machine env ... in a
couple of places.
2016-11-18 14:44:33 -08:00
Jerome Petazzoni
e16e23e2bd Add supergrok instructions 2016-11-18 10:06:10 -08:00
Jérôme Petazzoni
b5206aa68e Merge pull request #53 from drewmoseley/patch-1
Install pycrypto
2016-11-17 17:24:49 -06:00
Jérôme Petazzoni
8a47bce180 Merge pull request #52 from asziranyi/patch-1
add vagrant-vbguest install link
2016-11-17 17:24:18 -06:00
Drew Moseley
6cd8c32621 Install pycrypto
Not sure if it's somehow unique to my setup but Ansible needed me to install pycrypto as well.
2016-11-17 12:07:42 -05:00
asziranyi
f2f1934940 add vagrant-vbguest installation link 2016-11-17 15:50:47 +01:00
Jerome Petazzoni
8cc388dcb8 add ctrl-p ctrl-q warning 2016-11-14 12:36:57 -08:00
Jerome Petazzoni
a276e72ab0 add ngrok instructions 2016-11-14 11:23:22 -08:00
Jerome Petazzoni
bdb8e1b3df Add instructions for self-paced workshop 2016-11-11 14:28:28 -08:00
Jérôme Petazzoni
66ee4739ed typos 2016-11-07 22:40:59 -06:00
Jérôme Petazzoni
893c7b13c6 Add instructions to create VMs with Docker Machine 2016-11-07 22:38:43 -06:00
Jerome Petazzoni
78b730e4ac Patch up TOC generator 2016-11-01 17:37:48 -07:00
Jerome Petazzoni
e3eb06ddfb Bump up to Compose 1.8.1 and Machine 0.8.2 2016-11-01 17:10:55 -07:00
Jerome Petazzoni
ad29a45191 Add advertise-addr info + small fixups for mentor week 2016-11-01 17:10:36 -07:00
Jerome Petazzoni
e1968beefa Bump to 16.04 LTS AMIs (closes #37)
16.04 doesn't come with Python setuptools, so we have to install that too.
2016-10-18 08:53:53 -07:00
Jerome Petazzoni
b1b3ecb5e9 Add Prometheus section 2016-10-16 17:28:05 -07:00
Jerome Petazzoni
ef60a78998 Pin version numbers used by ELK 2016-10-16 16:30:04 -07:00
Jerome Petazzoni
70064da91c Add Docker Machine; use it to get TLS mutual auth instead of 55555 plain text 2016-10-16 16:27:21 -07:00
Jérôme Petazzoni
0b6a3a1cba Merge pull request #48 from soulshake/typo
Typo fixes
2016-10-08 14:49:16 +02:00
AJ Bowen
e403a005ea 'Set up' when it's a verb, 'setup' when it's a noun. 2016-10-07 17:09:34 +02:00
AJ Bowen
773528fc2b They're --> Their 2016-10-07 16:19:05 +02:00
Jérôme Petazzoni
97af5492f7 Remove InfluxDB password auth 2016-10-04 18:42:32 +02:00
Jérôme Petazzoni
194ce5d7b6 Update Julius info 2016-10-04 14:11:12 +02:00
Jérôme Petazzoni
fafc8fb1ed Update TOC and add slide about Prometheus 2016-10-04 14:10:38 +02:00
Jérôme Petazzoni
4cb37481ba Merge pull request #46 from dragorosson/patch-1
Fix grammar
2016-10-04 03:47:29 +02:00
Drago Rosson
9196b27f0e Fix grammar 2016-10-03 16:21:56 -05:00
Jerome Petazzoni
9ce98430ab Last (hopefully) round of fixes before LinuxCon EU! 2016-10-03 09:20:40 -07:00
tiffany jernigan
4117f079e6 Run InfluxDB and Grafana as services using Docker Hub images. 2016-10-01 18:03:40 -07:00
Jerome Petazzoni
1105c9fa1f Merge remote-tracking branch 'tiffanyfj/metrics' 2016-10-01 08:06:43 -07:00
Jerome Petazzoni
ab7c1bb09a Prepare for LinuxCon EU Berlin 2016-10-01 08:05:55 -07:00
Jérôme Petazzoni
bfcb24c1ca Merge pull request #45 from anonymuse/jesse/docs_linkfix
Fix path for README links
2016-09-30 16:22:08 +02:00
Jesse White
45f410bb49 Fix path for README links 2016-09-29 17:22:55 -04:00
Jérôme Petazzoni
bcd2433fa4 Merge branch 'BretFisher-readme-updates' 2016-09-29 00:25:46 +02:00
Jérôme Petazzoni
1d02ddf271 Mess up with whitespace, because I am OCD like that 2016-09-29 00:25:36 +02:00
Jérôme Petazzoni
4765410393 Merge branch 'readme-updates' of https://github.com/BretFisher/orchestration-workshop-with-docker into BretFisher-readme-updates 2016-09-29 00:22:21 +02:00
tiffany jernigan
6102d21150 Added metrics chapter 2016-09-28 14:18:36 -07:00
Bret Fisher
75caa65973 more trainer info 2016-09-28 01:26:56 -04:00
Bret Fisher
dfd2bf4aeb new example settings file 2016-09-28 01:26:42 -04:00
Bret Fisher
51000b4b4d better swarm image for cards 2016-09-28 01:26:02 -04:00
Bret Fisher
3acd3b078b more info for trainers 2016-09-27 13:06:35 -04:00
Bret Fisher
4b43287c5b more info for trainers 2016-09-27 11:37:42 -04:00
Jerome Petazzoni
c8c745459c Update stateful section 2016-09-19 11:23:23 -07:00
Jerome Petazzoni
04dec2e196 Round of updates for Velocity 2016-09-18 16:20:51 -07:00
Jerome Petazzoni
0f8c189786 Docker Application Bundle -> Distributed Application Bundle 2016-09-18 12:24:47 -07:00
Jerome Petazzoni
81cc14d47b Fix VM card background image 2016-09-18 12:18:05 -07:00
Jérôme Petazzoni
060b2377d5 Merge pull request #34 from everett-toews/fix-link
Fix broken link to nomenclature doc
2016-09-11 12:01:24 -05:00
Everett Toews
1e77736987 Fix broken link to nomenclature doc 2016-09-10 15:49:04 -05:00
Jérôme Petazzoni
bf2b4b7eb7 Merge pull request #32 from everett-toews/github-docs
Move slides to docs for GitHub Pages
2016-09-08 13:56:40 -05:00
Everett Toews
8396f13a4a Move slides to docs for GitHub Pages 2016-08-27 16:12:25 -05:00
Jerome Petazzoni
571097f369 Small fix 2016-08-27 13:55:26 -07:00
Jerome Petazzoni
b1110db8ca Update TOC 2016-08-24 14:01:31 -07:00
Jerome Petazzoni
b73a628f05 Remove old files 2016-08-24 13:52:16 -07:00
Jerome Petazzoni
a07795565d Update tweet message 2016-08-24 13:50:25 -07:00
Jérôme Petazzoni
c4acbfd858 Add diagram 2016-08-24 16:34:32 -04:00
Jerome Petazzoni
ddbda14e14 Reviews/edits 2016-08-24 13:31:00 -07:00
Jerome Petazzoni
ad4ea8659b Node management 2016-08-24 08:04:27 -07:00
Jerome Petazzoni
8d7f27d60d Add Docker Application Bundles
Capitalize Redis consistently
2016-08-24 06:59:15 -07:00
Jerome Petazzoni
9f21c7279c Compose build+push 2016-08-23 14:19:14 -07:00
Jerome Petazzoni
53ae221632 Add stateful service section 2016-08-23 11:03:57 -07:00
Jerome Petazzoni
6719bcda87 Update logging section 2016-08-22 15:51:26 -07:00
Jerome Petazzoni
40e0c96c91 Rolling upgrades 2016-08-22 14:21:00 -07:00
Jerome Petazzoni
2c8664e58d Updated dockercoins deployment instructions 2016-08-12 06:47:30 -07:00
Jerome Petazzoni
1e5cee2456 Updated intro+cluster setup part 2016-08-11 10:01:51 -07:00
Jerome Petazzoni
29b8f53ae0 More typo fixes courtesy of @tiffanyfj 2016-08-11 06:05:43 -07:00
Jérôme Petazzoni
451f68db1d Update instructions to join cluster 2016-08-10 15:50:30 +02:00
Jérôme Petazzoni
5a4d10ed1a Upgrade versions to Engine 1.12 + Compose 1.8 2016-08-10 15:50:10 +02:00
Jérôme Petazzoni
06d5dc7846 Merge pull request #29 from programmerq/pssh-command
detect debian command or upstream command
2016-08-07 15:26:29 +02:00
Jeff Anderson
b63eb0fa40 detect debian command or upstream command 2016-08-01 12:38:12 -06:00
Jérôme Petazzoni
117e2a9ba2 Merge pull request #13 from fiunchinho/master
Version can be set as env variable to be used, instead of generating unix timestamp
2016-07-11 23:57:13 -05:00
Jerome Petazzoni
d2f6e88fd1 Add -v flag for go get swarmit 2016-06-28 16:47:18 -07:00
Jérôme Petazzoni
c742c39ed9 Merge pull request #26 from beenanner/master
Upgrade docker-compose files to v2
2016-06-28 06:44:27 -07:00
Jerome Petazzoni
1f2b931b01 Slack -> Gitter 2016-06-22 11:54:47 -07:00
Jerome Petazzoni
e351ede294 Fix TOC 2016-06-22 11:48:00 -07:00
Jerome Petazzoni
9ffbfacca8 Last words 2016-06-19 11:15:11 -07:00
Jerome Petazzoni
60524d2ff3 Fixes 2016-06-19 00:07:19 -07:00
Jerome Petazzoni
7001c05ec0 DockerCon update 2016-06-18 18:06:15 -07:00
Jonathan Lee
5d4414723d Upgrade docker-compose files to v2 2016-06-13 21:47:59 -04:00
José Armesto
4dad732c15 Removed unnecesary prints 2016-03-19 19:45:03 +01:00
José Armesto
bb7cadf701 Version can be set as env variable to be used, instead of generating unix timestamp 2016-03-15 16:28:46 +01:00
186 changed files with 12916 additions and 6924 deletions

1
.gitignore vendored

@@ -6,3 +6,4 @@ prepare-vms/ips.html
prepare-vms/ips.pdf
prepare-vms/settings.yaml
prepare-vms/tags
docs/*.yml.html

326
README.md

@@ -1,8 +1,250 @@
# Orchestration at scale(s)
# Docker Orchestration Workshop
This is the material for the "Docker orchestration workshop"
written and delivered by Jérôme Petazzoni (and possibly others)
at multiple conferences and events like:
This is the material (slides, scripts, demo app, and other
code samples) for the "Docker orchestration workshop"
written and delivered by Jérôme Petazzoni (and lots of others)
non-stop since June 2015.
## Content
The workshop introduces a demo app, "DockerCoins," built
around a micro-services architecture. First, we run it
on a single node, using Docker Compose. Then, we pretend
that we need to scale it, and we use an orchestrator
(SwarmKit or Kubernetes) to deploy and scale the app on
a cluster.
We explain the concepts of the orchestrator. For SwarmKit,
we set up the cluster with `docker swarm init` and `docker swarm join`.
For Kubernetes, we use pre-configured clusters.
Then, we cover more advanced concepts: scaling, load balancing,
updates, global services or daemon sets.
There are a number of advanced optional chapters about
logging, metrics, secrets, network encryption, etc.
The content is very modular: it is broken down into a large
number of Markdown files, which are put together according
to a YAML manifest. This makes it easy to reuse content
between different workshops.
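As a rough illustration of this modular layout, the sketch below concatenates Markdown snippets in the order given by a manifest. It is not the repository's `markmaker.py`: the manifest is shown as a plain Python list standing in for the parsed YAML, and the function name `assemble_deck`, the file names, and the way chapters are joined with a `---` line are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def assemble_deck(chapters, content_dir):
    """Concatenate Markdown snippets in manifest order.

    `chapters` stands in for the list a YAML manifest would provide
    (hypothetical format); `content_dir` holds the .md files.
    """
    parts = []
    for name in chapters:
        text = (Path(content_dir) / name).read_text(encoding="utf-8")
        parts.append(text.strip())
    # Join chapters with a separator line (an assumption for this sketch).
    return "\n---\n".join(parts) + "\n"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "intro.md").write_text("# Intro\n", encoding="utf-8")
        Path(tmp, "swarm.md").write_text("# SwarmKit\n", encoding="utf-8")
        # Two hypothetical workshops reusing the same intro chapter.
        print(assemble_deck(["intro.md", "swarm.md"], tmp))
        print(assemble_deck(["intro.md"], tmp))
```

The point of the design is that a chapter lives in exactly one file, so two workshops that share it only differ in their manifest.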
## Quick start (or, "I want to try it!")
This workshop is designed to be *hands on*, i.e. to give you a step-by-step
guide where you will build your own Docker cluster, and use it to deploy
a sample application.
The easiest way to follow the workshop is to attend it when it is delivered
by an instructor. In that case, the instructor will generally give you
credentials (IP addresses, login, password) to connect to your own cluster
of virtual machines; and the [slides](http://jpetazzo.github.io/orchestration-workshop)
assume that you have your own cluster indeed.
If you want to follow the workshop on your own, and want to have your
own cluster, we have multiple solutions for you!
### Using [play-with-docker](http://play-with-docker.com/)
This is the easiest way to get started: you don't need any extra account
or resources! It works only for the SwarmKit version of the workshop, though.
To get started, go to [play-with-docker](http://play-with-docker.com/), and
click on _ADD NEW INSTANCE_ five times. You will get five "docker-in-docker"
containers, all on a private network. These are your five nodes for the workshop!
When the instructions in the slides tell you to "SSH on node X", just go to
the tab corresponding to that node.
The nodes are not directly reachable from outside; so when the slides tell
you to "connect to the IP address of your node on port XYZ" you will have
to use a different method: click on the port number that should appear on
top of the play-with-docker window. This only works for HTTP services,
though.
Note that the instances provided by Play-With-Docker have a short lifespan
(a few hours only), so if you want to do the workshop over multiple sessions,
you will have to start over each time, or create your own cluster with
one of the methods described below.
### Using Docker Machine to create your own cluster
This method requires a bit more work to get started, but you get a permanent
cluster, with fewer limitations.
You will need Docker Machine (if you have Docker Mac, Docker Windows, or
the Docker Toolbox, you're all set already). You will also need:
- credentials for a cloud provider (e.g. API keys or tokens),
- or a local install of VirtualBox or VMware (or anything supported
by Docker Machine).
Full instructions are in the [prepare-machine](prepare-machine) subdirectory.
### Using our scripts to mass-create a bunch of clusters
Since we often deliver the workshop during conferences or similar events,
we have scripts to automate the creation of a bunch of clusters using
AWS EC2. If you want to create multiple clusters and have EC2 credits,
check the [prepare-vms](prepare-vms) directory for more information.
## How This Repo is Organized
- **dockercoins**
- Sample App: compose files and source code for the dockercoins sample apps
used throughout the workshop
- **docs**
- Slide Deck: presentation slide deck, works out-of-box with GitHub Pages,
uses https://remarkjs.com
- **prepare-local**
- untested scripts for automating the creation of local VirtualBox VMs
(could use your help validating)
- **prepare-machine**
- instructions explaining how to use Docker Machine to create VMs
- **prepare-vms**
- scripts for automating the creation of AWS instances for students
## Slide Deck
- The slides are in the `docs` directory.
- For each slide deck, there is a `.yml` file referencing `.md` files.
- The `.md` files contain Markdown snippets.
- When you run `build.sh once`, it will "compile" all the `.yml` files
into `.yml.html` files that you can open in your browser.
- You can also run `build.sh forever`, which will watch the directory
and rebuild slides automatically when files are modified.
- If needed, you can fine-tune `workshop.css` and `workshop.html`
(respectively the CSS style used, and the boilerplate template).
- The slides use https://remarkjs.com to render Markdown into HTML in
a web browser.
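The "rebuild when files are modified" idea behind `build.sh forever` can be sketched as a comparison of modification times. This only illustrates the concept and makes no claim about how `build.sh` works internally; `needs_rebuild` and its arguments are hypothetical names.

```python
import os
import tempfile
from pathlib import Path


def needs_rebuild(sources, output):
    """Return True if `output` is missing or older than any source file."""
    output = Path(output)
    if not output.exists():
        return True
    built_at = output.stat().st_mtime
    return any(Path(src).stat().st_mtime > built_at for src in sources)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "chapter.md")
        out = Path(tmp, "deck.yml.html")
        src.write_text("# Slide\n", encoding="utf-8")
        print(needs_rebuild([src], out))  # output has not been built yet
        out.write_text("<html></html>", encoding="utf-8")
        # Pretend the source was edited 10 seconds after the build.
        later = out.stat().st_mtime + 10
        os.utime(src, (later, later))
        print(needs_rebuild([src], out))
```

A watcher would call such a check in a loop (or react to file system events) and run the build only when it returns true.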
## Sample App: Dockercoins!
The sample app is in the `dockercoins` directory. It's used during all chapters
for explaining different concepts of orchestration.
To see it in action:
- `cd dockercoins && docker-compose up -d`
- this will build and start all the services
- the web UI will be available on port 8000
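To confirm that the web UI came up, a small TCP check is often enough. The sketch below assumes the app runs on the local machine (hence `127.0.0.1`) and uses port 8000 as stated above; `port_is_open` is a hypothetical helper, not part of the repository.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8000 is the web UI port mentioned above; 127.0.0.1 assumes a local run.
    print("web UI reachable:", port_is_open("127.0.0.1", 8000))
```

This only shows that something is listening on the port; opening the page in a browser is still the real test.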
*If you just want to run the workshop for yourself, you can stop reading
here. If you want to deliver the workshop for others (i.e. if you
want to become an instructor), keep reading!*
## Running the Workshop
### General timeline of planning a workshop
- Fork the repo and run through the slides, doing the hands-on exercises, to be
sure you understand the different `dockercoins` repos and the steps we go through
to get to a full Swarm Mode cluster of many containers. At a minimum, you'll
update the first few slides and the last slide with your info.
- Your docs directory can use GitHub Pages.
- This workshop expects 5 servers per student. You can get away with as few
as 2 servers per student, but you'll need to change the slide deck to
accommodate. More servers = more fun.
- If you have more than ~20 students, try to get an assistant (TA) to help
people with issues, so you don't have to stop the workshop to help someone
with ssh etc.
- AWS is our most tested process for generating student machines. In
`prepare-vms` you'll find scripts to create EC2 instances, install Docker,
pre-pull images, and even print "cards" to place at each student's seat with
IPs and username/password.
- Test AWS Scripts: Be sure to test creating *all* your needed servers a week
before workshop (just for a few minutes). You'll likely hit AWS limits in the
region closest to your class, and it sometimes takes days to get AWS to raise
those limits with a support ticket.
- Create a https://gitter.im chat room for your workshop and update the slides
with its URL. It's also useful for a TA to monitor this during the workshop. You
can use it before/after to answer questions, and it generally works as a better
answer than "email me that question".
- If you can send an email to students ahead of time, mention how they should
get SSH, and test that SSH works. If they can `ssh github.com` and get
`permission denied (publickey)` then they know it worked, and SSH is properly
installed and they don't have anything blocking it. SSH and a browser are all
they need for class.
- Typically you create the servers the day before or morning of workshop, and
leave them up the rest of day after workshop. If creating hundreds of servers,
you'll likely want to run all these `workshopctl` commands from a dedicated
instance you have in same region as instances you want to create. Much faster
this way if you're on poor internet. Also, create 2 sets of servers for
yourself, and use one during workshop and the 2nd is a backup.
- Remember you'll need to print the "cards" for students, so you'll need to
create instances while you have a way to print them.
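To compare your plan against your AWS instance limits, it helps to work out the total number of servers up front. The helper below is only a rough sketch: the function name is made up for illustration, and it assumes that each of the two instructor sets mentioned above is the same size as a student's set.

```python
def servers_needed(students, servers_per_student=5, instructor_sets=2):
    """Return the total number of servers to create for one workshop.

    Assumption: each instructor set (primary + backup) has the same
    number of servers as a student's set.
    """
    if students < 0 or servers_per_student < 1 or instructor_sets < 0:
        raise ValueError("counts must be non-negative, with at least 1 server per set")
    return (students + instructor_sets) * servers_per_student


# 20 students with the default 5 servers each, plus 2 instructor sets.
print(servers_needed(20))
# The same class with the reduced 2-servers-per-student setup.
print(servers_needed(20, servers_per_student=2))
```

If the result is above your current limit for the region, that is the number to quote in the support ticket.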
### Things That Could Go Wrong
- You hit AWS limits in your region while creating instances, and didn't plan
enough time for support to increase your limits. :(
- Students have technical issues during workshop. Can't get ssh working,
locked-down computer, host firewall, etc.
- Horrible wifi, or SSH port TCP/22 not open on the network! If wifi sucks you
can try using Mosh https://mosh.org, which runs the session over UDP (it still
needs SSH for the initial login). tmux can also prevent you from losing your
place if you get disconnected from servers. https://tmux.github.io
- Forget to print "cards" and cut them up for handing out IPs.
- Forget to have fun and focus on your students!
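A quick way to find out whether the venue network blocks TCP/22 is to attempt a plain TCP connection to one of your servers before the workshop starts. This is a minimal sketch using only the Python standard library; the function name is illustrative and the address in the usage comment is a placeholder for one of your own instances.

```python
import socket


def tcp_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (placeholder address, replace with one of your instances):
#   tcp_port_open("203.0.113.10")
```

A `False` result only says that the handshake failed from where you are standing; it does not tell you whether the network, a host firewall, or the server itself is responsible.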
### Creating the VMs
`prepare-vms/workshopctl` is the script that gets you most of what you need for
setting up instances. See
[prepare-vms/README.md](prepare-vms)
for all the info on tools and scripts.
### Content for Different Workshop Durations
With all the slides, this workshop is a full day long. If you need to deliver
it in shorter timelines, here are some recommendations on what to cut out. You
can replace `---` with `???` which will hide slides. Or leave them there and
add something like `(EXTRA CREDIT)` to the title so students can still view the
content but you also know to skip them during the presentation.
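Tagging titles by hand gets tedious on a long deck. The following is a small sketch of how the `(EXTRA CREDIT)` approach could be automated; it is not part of the repository's tooling, the function name is hypothetical, and it assumes slide titles are Markdown headings (lines starting with `#`).

```python
def mark_extra_credit(deck, titles, marker="(EXTRA CREDIT)"):
    """Append marker to every heading in deck whose text is listed in titles."""
    marked = []
    for line in deck.split("\n"):
        # Heading text without the leading '#' characters and whitespace.
        heading_text = line.lstrip("#").strip()
        if line.startswith("#") and heading_text in titles:
            line = "{} {}".format(line.rstrip(), marker)
        marked.append(line)
    return "\n".join(marked)


deck = "# Intro\n\n---\n\n## Centralized logging\n\n- ELK\n"
print(mark_extra_credit(deck, {"Centralized logging"}))
```

Running it twice is harmless: once a title carries the marker, it no longer matches an entry in `titles`.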
#### 3 Hour Version
- Limit time on debug tools, maybe skip a few. *"Chapter 1:
Identifying bottlenecks"*
- Limit time on Compose, try to have them building the Swarm Mode by 30
minutes in
- Skip most of Chapter 3, Centralized Logging and ELK
- Skip most of Chapter 4, but keep stateful services and DABs if possible
- Mention what DABs are, but make this part optional in case you run out
of time
#### 2 Hour Version
- Skip all the above, and:
- Skip the story arc of debugging dockercoins altogether, skipping the
troubleshooting tools. Just focus on getting them from single-host to
multi-host and multi-container.
- Goal is first 30min on intro and Docker Compose and what dockercoins is,
and getting it up on one node in docker-compose.
- Next 60-75 minutes is getting dockercoins in Swarm Mode services across
servers. Big Win.
- Last 15-30 minutes is for stateful services, DAB files, and questions.
## Past events
Since its inception, this workshop has been delivered dozens of times,
to thousands of people, and has continuously evolved. This is a short
history of the first times it was delivered. Look also in the "tags"
of this repository: they all correspond to successive iterations of
this workshop. If you attended a past version of the workshop, you
can use these tags to see what has changed since then.
- QCON, New York City (2015, June)
- KCDC, Kansas City (2015, June)
@@ -13,80 +255,7 @@ at multiple conferences and events like:
- SCALE, Pasadena (2016, January)
- Zenika, Paris (2016, February)
- Container Solutions, Amsterdam (2016, February)
## Slides
The slides are in the `www/htdocs` directory.
The recommended way to view them is to:
- have a Docker host
- clone this repository to your Docker host
- `cd www && docker-compose up -d`
- this will start a web server on port 80
- point your browser at your Docker host and enjoy
## Sample code
The sample app is in the `dockercoins` directory.
To see it in action:
- `cd dockercoins && docker-compose up -d`
- this will build and start all the services
- the web UI will be available on port 8000
## Running the workshop
WARNING: those instructions are incomplete. Consider
them as notes quickly drafted on a napkin rather than
proper documentation!
### Creating the VMs
I use the `trainctl` script from the `docker-fundamentals`
repository. Sorry if you don't have that!
After starting the VMs, use the `trainctl ips` command
to dump the list of IP addresses into a file named `ips.txt`.
### Generating the printed cards
- Put `ips.txt` file in `prepare-vms` directory.
- Generate HTML file.
- Open it in Chrome.
- Transform to PDF.
- Print it.
### Deploying your SSH key to all the machines
- Make sure that you have SSH keys loaded (`ssh-add -l`).
- Source `rc`.
- Run `pcopykey`.
### Installing extra packages
- Source `postprep.rc`.
(This will install a few extra packages, add entries to
/etc/hosts, generate SSH keys, and deploy them on all hosts.)
### Final touches
- Set two groups of machines for instructor's use.
- You will use the first group during the workshop.
- The second group will run a web server with the slides.
- Log into the first machine of the second group.
- Git clone this repo.
- Put up the web server as instructed above.
- Use cli53 to add an A record for e.g. `view.dckr.info`.
- ... and much more!
# Problems? Bugs? Questions?
@@ -108,3 +277,4 @@ conference or for your company: contact me (jerome
at docker dot com).
Thank you!

View File

@@ -1,42 +0,0 @@
#!/usr/bin/env python
import os
import sys
import yaml
# arg 1 = service name
# arg 2 = number of instances
service_name = sys.argv[1]
desired_instances = int(sys.argv[2])
compose_file = os.environ["COMPOSE_FILE"]
input_file, output_file = compose_file, compose_file
config = yaml.load(open(input_file))
# The ambassadors need to know the service port to use.
# Those ports must be declared here.
ports = yaml.load(open("ports.yml"))
port = str(ports[service_name])
command_line = port
depends_on = []
for n in range(1, 1+desired_instances):
config["services"]["{}{}".format(service_name, n)] = config["services"][service_name]
command_line += " {}{}:{}".format(service_name, n, port)
depends_on.append("{}{}".format(service_name, n))
config["services"][service_name] = {
"image": "jpetazzo/hamba",
"command": command_line,
"depends_on": depends_on,
}
if "networks" in config["services"]["{}1".format(service_name)]:
config["services"][service_name]["networks"] = config["services"]["{}1".format(service_name)]["networks"]
yaml.safe_dump(config, open(output_file, "w"), default_flow_style=False)
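The core of the script above is the command line it builds for the `jpetazzo/hamba` load balancer: the port, followed by one `name:port` pair per backend. The sketch below isolates just that step so it can be checked without a Compose file; the function name is illustrative and not part of the original script.

```python
def hamba_command(service_name, port, instances):
    """Build the load balancer command line: '<port> <svc>1:<port> <svc>2:<port> ...'."""
    backends = [
        "{}{}:{}".format(service_name, n, port)
        for n in range(1, instances + 1)
    ]
    return " ".join([str(port)] + backends)


# Three rng backends behind one frontend on port 80.
print(hamba_command("rng", 80, 3))
```

For `rng` with three instances this yields the same string as the hand-written `command:` of the `rng` service in the removed Compose file later in this diff.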

View File

@@ -1,87 +0,0 @@
#!/usr/bin/env python
import os
import sys
import yaml
def error(msg):
print("ERROR: {}".format(msg))
exit(1)
# arg 1 = service name
service_name = sys.argv[1]
compose_file = os.environ["COMPOSE_FILE"]
input_file, output_file = compose_file, compose_file
config = yaml.load(open(input_file))
version = config.get("version")
if version != "2":
error("Unsupported $COMPOSE_FILE version: {!r}".format(version))
# The load balancers need to know the service port to use.
# Those ports must be declared here.
ports = yaml.load(open("ports.yml"))
port = str(ports[service_name])
if service_name not in config["services"]:
error("service {} not found in $COMPOSE_FILE"
.format(service_name))
lb_name = "{}-lb".format(service_name)
be_name = "{}-be".format(service_name)
wd_name = "{}-wd".format(service_name)
if lb_name in config["services"]:
error("load balancer {} already exists in $COMPOSE_FILE"
.format(lb_name))
if wd_name in config["services"]:
error("dns watcher {} already exists in $COMPOSE_FILE"
.format(wd_name))
service = config["services"][service_name]
if "networks" in service:
error("service {} has custom networks"
.format(service_name))
# Put the service on its own network.
service["networks"] = {service_name: {"aliases": [ be_name ] } }
# Put a label indicating which load balancer is responsible for this service.
if "labels" not in service:
service["labels"] = {}
service["labels"]["loadbalancer"] = lb_name
# Add the load balancer.
config["services"][lb_name] = {
"image": "jpetazzo/hamba",
"command": "{} {} {}".format(port, be_name, port),
"depends_on": [ service_name ],
"networks": {
"default": {
"aliases": [ service_name ],
},
service_name: None,
},
}
# Add the DNS watcher.
config["services"][wd_name] = {
"image": "jpetazzo/watchdns",
"command": "{} {} {}".format(port, be_name, port),
"volumes_from": [ lb_name ],
"networks": {
service_name: None,
},
}
if "networks" not in config:
config["networks"] = {}
if service_name not in config["networks"]:
config["networks"][service_name] = None
yaml.safe_dump(config, open(output_file, "w"), default_flow_style=False)

View File

@@ -1,61 +0,0 @@
#!/usr/bin/env python
from common import ComposeFile
import os
import subprocess
import time
registry = os.environ.get("DOCKER_REGISTRY")
if not registry:
print("Please set the DOCKER_REGISTRY variable, e.g.:")
print("export DOCKER_REGISTRY=jpetazzo # use the Docker Hub")
print("export DOCKER_REGISTRY=localhost:5000 # use a local registry")
exit(1)
# Get the name of the current directory.
project_name = os.path.basename(os.path.realpath("."))
# Generate a Docker image tag, using the UNIX timestamp.
# (i.e. number of seconds since January 1st, 1970)
version = str(int(time.time()))
# Execute "docker-compose build" and abort if it fails.
subprocess.check_call(["docker-compose", "-f", "docker-compose.yml", "build"])
# Load the services from the input docker-compose.yml file.
# TODO: run parallel builds.
compose_file = ComposeFile("docker-compose.yml")
# Iterate over all services that have a "build" definition.
# Tag them, and initiate a push in the background.
push_operations = dict()
for service_name, service in compose_file.services.items():
if "build" in service:
compose_image = "{}_{}".format(project_name, service_name)
registry_image = "{}/{}_{}:{}".format(registry, project_name, service_name, version)
# Re-tag the image so that it can be uploaded to the registry.
subprocess.check_call(["docker", "tag", compose_image, registry_image])
# Spawn "docker push" to upload the image.
push_operations[service_name] = subprocess.Popen(["docker", "push", registry_image])
# Replace the "build" definition by an "image" definition,
# using the name of the image on the registry.
del service["build"]
service["image"] = registry_image
# Wait for push operations to complete.
for service_name, popen_object in push_operations.items():
print("Waiting for {} push to complete...".format(service_name))
popen_object.wait()
print("Done.")
# Write the new docker-compose.yml file.
if "COMPOSE_FILE" not in os.environ:
os.environ["COMPOSE_FILE"] = "docker-compose.yml-{}".format(version)
print("Writing to new Compose file:")
else:
print("Writing to provided Compose file:")
print("COMPOSE_FILE={}".format(os.environ["COMPOSE_FILE"]))
compose_file.save()
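The naming scheme used by the script above can be summarized in a few lines. This is a sketch with illustrative function names, not code from the repository: the tag is the UNIX timestamp at build time, and the image name combines registry, project, and service.

```python
import time


def new_version():
    """Return an image tag based on the current UNIX timestamp."""
    return str(int(time.time()))


def registry_image(registry, project_name, service_name, version):
    """Return '<registry>/<project>_<service>:<version>' as used for pushes."""
    return "{}/{}_{}:{}".format(registry, project_name, service_name, version)


print(registry_image("localhost:5000", "dockercoins", "rng", new_version()))
```

Because the tag changes on every run, each build produces a distinct image reference, and the rewritten Compose file pins exactly the images that were just pushed.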

View File

@@ -1,76 +0,0 @@
import os
import subprocess
import sys
import time
import yaml
def COMPOSE_FILE():
if "COMPOSE_FILE" not in os.environ:
print("The $COMPOSE_FILE environment variable is not set. Aborting.")
exit(1)
return os.environ["COMPOSE_FILE"]
class ComposeFile(object):
def __init__(self, filename=None):
if filename is None:
filename = COMPOSE_FILE()
if not os.path.isfile(filename):
print("File {!r} does not exist. Aborting.".format(filename))
exit(1)
self.data = yaml.load(open(filename))
@property
def services(self):
if self.data.get("version") == "2":
return self.data["services"]
else:
return self.data
def save(self, filename=None):
if filename is None:
filename = COMPOSE_FILE()
with open(filename, "w") as f:
yaml.safe_dump(self.data, f, default_flow_style=False)
# Executes a bunch of commands in parallel, but no more than N at a time.
# This allows to execute concurrently a large number of tasks, without
# turning into a fork bomb.
# `parallelism` is the number of tasks to execute simultaneously.
# `commands` is a list of tasks to execute.
# Each task is itself a list, where the first element is a descriptive
# string, and the following elements are the arguments to pass to Popen.
def parallel_run(commands, parallelism):
    running = []
    # While stuff is running, or we have stuff to run...
    while commands or running:
        # While there is stuff to run, and room in the pipe...
        while commands and len(running) < parallelism:
            command = commands.pop(0)
            print("START {}".format(command[0]))
            # universal_newlines=True makes the captured output a text string.
            popen = subprocess.Popen(
                command[1:], stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                universal_newlines=True)
            popen._desc = command[0]
            running.append(popen)
        must_sleep = True
        # Iterate over a copy, since finished processes are removed from the list.
        for popen in list(running):
            status = popen.poll()
            if status is not None:
                must_sleep = False
                running.remove(popen)
                if status == 0:
                    print("OK {}".format(popen._desc))
                else:
                    print("ERROR {} [Exit status: {}]"
                          .format(popen._desc, status))
                    output = "\n" + popen.communicate()[0].strip()
                    output = output.replace("\n", "\n| ")
                    print(output)
            else:
                print("WAIT ({} running, {} more to run)"
                      .format(len(running), len(commands)))
        if must_sleep:
            time.sleep(1)
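For comparison, the same "at most N at a time" behavior can be obtained from the standard library instead of a hand-written polling loop. This is an alternative sketch, not the author's method: it swaps in `concurrent.futures.ThreadPoolExecutor`, keeps the same task format (description first, then the arguments), and returns results rather than printing progress.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one task; command[0] is a description, command[1:] the argv."""
    description, argv = command[0], command[1:]
    completed = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = completed.stdout.decode(errors="replace")
    return description, completed.returncode, output


def parallel_run(commands, parallelism):
    """Run all tasks, never more than `parallelism` at once; keep input order."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(run_one, commands))


tasks = [
    ["say hi", sys.executable, "-c", "print('hi')"],
    ["fail", sys.executable, "-c", "import sys; sys.exit(3)"],
]
for description, status, output in parallel_run(tasks, 2):
    print(description, status, output.strip())
```

The thread pool bounds concurrency the same way the `running` list does above; the trade-off is that progress messages such as `WAIT` are no longer printed while tasks are in flight.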

View File

@@ -1,69 +0,0 @@
#!/usr/bin/env python
from common import parallel_run
import os
import subprocess
project_name = os.path.basename(os.path.realpath("."))
# Get all services and backends in our compose application.
containers_data = subprocess.check_output([
"docker", "ps",
"--filter", "label=com.docker.compose.project={}".format(project_name),
"--format", '{{ .ID }} '
'{{ .Label "com.docker.compose.service" }} '
'{{ .Ports }}',
])
# Build list of backends.
frontend_ports = dict()
backends = dict()
for container in containers_data.split('\n'):
if not container:
continue
# TODO: support services with multiple ports!
container_id, service_name, port = container.split(' ')
if not port:
continue
backend, frontend = port.split("->")
backend_addr, backend_port = backend.split(':')
frontend_port, frontend_proto = frontend.split('/')
# TODO: deal with udp (mostly skip it?)
assert frontend_proto == "tcp"
# TODO: check inconsistencies between port mappings
frontend_ports[service_name] = frontend_port
if service_name not in backends:
backends[service_name] = []
backends[service_name].append((backend_addr, backend_port))
# Get all existing ambassadors for this application.
ambassadors_data = subprocess.check_output([
"docker", "ps",
"--filter", "label=ambassador.project={}".format(project_name),
"--format", '{{ .ID }} '
'{{ .Label "ambassador.service" }} '
'{{ .Label "ambassador.bindaddr" }}',
])
# Update ambassadors.
operations = []
for ambassador in ambassadors_data.split('\n'):
if not ambassador:
continue
ambassador_id, service_name, bind_address = ambassador.split()
print("Updating configuration for {}/{} -> {}:{} -> {}"
.format(service_name, ambassador_id,
bind_address, frontend_ports[service_name],
backends[service_name]))
command = [
ambassador_id,
"docker", "run", "--rm", "--volumes-from", ambassador_id,
"jpetazzo/hamba", "reconfigure",
"{}:{}".format(bind_address, frontend_ports[service_name])
]
for backend_addr, backend_port in backends[service_name]:
command.extend([backend_addr, backend_port])
operations.append(command)
# Execute all commands in parallel.
parallel_run(operations, 10)
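The least obvious step in the script above is how it takes apart the port column printed by `docker ps`, which for a published TCP port looks like `0.0.0.0:32768->80/tcp`. Here is that parsing step on its own, as a sketch with an illustrative function name. It keeps the script's vocabulary, where "backend" is the published host address and "frontend" is the port inside the container, and like the script it handles a single mapping only.

```python
def parse_port_mapping(mapping):
    """Split '<addr>:<host_port>-><container_port>/<proto>' into its parts."""
    backend, frontend = mapping.split("->")
    # rsplit keeps any extra colons in the address part intact.
    backend_addr, backend_port = backend.rsplit(":", 1)
    frontend_port, frontend_proto = frontend.split("/")
    return backend_addr, int(backend_port), int(frontend_port), frontend_proto


print(parse_port_mapping("0.0.0.0:32768->80/tcp"))
```

A container that publishes several ports produces a comma-separated list in that column, which this sketch (like the original, per its TODO) does not handle.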

View File

@@ -1,71 +0,0 @@
#!/usr/bin/env python
from common import ComposeFile, parallel_run
import os
import subprocess
config = ComposeFile()
project_name = os.path.basename(os.path.realpath("."))
# Get all services in our compose application.
containers_data = subprocess.check_output([
"docker", "ps",
"--filter", "label=com.docker.compose.project={}".format(project_name),
"--format", '{{ .ID }} {{ .Label "com.docker.compose.service" }}',
])
# Get all existing ambassadors for this application.
ambassadors_data = subprocess.check_output([
"docker", "ps",
"--filter", "label=ambassador.project={}".format(project_name),
"--format", '{{ .ID }} '
'{{ .Label "ambassador.container" }} '
'{{ .Label "ambassador.service" }}',
])
# Build a set of existing ambassadors.
ambassadors = dict()
for ambassador in ambassadors_data.split('\n'):
if not ambassador:
continue
ambassador_id, container_id, linked_service = ambassador.split()
ambassadors[container_id, linked_service] = ambassador_id
operations = []
# Start the missing ambassadors.
for container in containers_data.split('\n'):
if not container:
continue
container_id, service_name = container.split()
extra_hosts = config.services[service_name].get("extra_hosts", {})
for linked_service, bind_address in extra_hosts.items():
description = "Ambassador {}/{}/{}".format(
service_name, container_id, linked_service)
ambassador_id = ambassadors.pop((container_id, linked_service), None)
if ambassador_id:
print("{} already exists: {}".format(description, ambassador_id))
else:
print("{} not found, creating it.".format(description))
operations.append([
description,
"docker", "run", "-d",
"--net", "container:{}".format(container_id),
"--label", "ambassador.project={}".format(project_name),
"--label", "ambassador.container={}".format(container_id),
"--label", "ambassador.service={}".format(linked_service),
"--label", "ambassador.bindaddr={}".format(bind_address),
"jpetazzo/hamba", "run"
])
# Destroy extraneous ambassadors.
for ambassador_id in ambassadors.values():
print("{} is not useful anymore, destroying it.".format(ambassador_id))
operations.append([
"rm -f {}".format(ambassador_id),
"docker", "rm", "-f", ambassador_id,
])
# Execute all commands in parallel.
parallel_run(operations, 10)

View File

@@ -1,3 +0,0 @@
#!/bin/sh
docker ps -q --filter label=ambassador.project=dockercoins |
xargs docker rm -f

View File

@@ -1,16 +0,0 @@
#!/bin/sh
# Some tools will choke on the YAML files generated by PyYAML;
# in particular on a section like this one:
#
# service:
# ports:
# - 8000:5000
#
# This script adds two spaces in front of the dash in those files.
# Warning: it is a hack, and probably won't work on some YAML files.
[ -f "$COMPOSE_FILE" ] || {
echo "Cannot find COMPOSE_FILE"
exit 1
}
sed -i 's/^ -/ -/' $COMPOSE_FILE

View File

@@ -1,38 +0,0 @@
#!/usr/bin/env python
from common import ComposeFile
import yaml
config = ComposeFile()
# The ambassadors need to know the service port to use.
# Those ports must be declared here.
ports = yaml.load(open("ports.yml"))
def generate_local_addr():
last_byte = 2
while last_byte<255:
yield "127.127.0.{}".format(last_byte)
last_byte += 1
for service_name, service in config.services.items():
if "links" in service:
for link, local_addr in zip(service["links"], generate_local_addr()):
if link not in ports:
print("Skipping link {} in service {} "
"(no port mapping defined). "
"Your code will probably break."
.format(link, service_name))
continue
if "extra_hosts" not in service:
service["extra_hosts"] = {}
service["extra_hosts"][link] = local_addr
del service["links"]
if "ports" in service:
del service["ports"]
if "volumes" in service:
del service["volumes"]
if service_name in ports:
service["ports"] = [ ports[service_name] ]
config.save()
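The link rewriting above boils down to pairing each link with the next address from a loopback range. The sketch below reproduces that pairing in isolation; `links_to_extra_hosts` is an illustrative name, and the port table passed in stands for the contents of `ports.yml`.

```python
def generate_local_addr():
    """Yield 127.127.0.2 through 127.127.0.254, one address per link."""
    for last_byte in range(2, 255):
        yield "127.127.0.{}".format(last_byte)


def links_to_extra_hosts(links, ports):
    """Map each link with a known port to a local ambassador address."""
    extra_hosts = {}
    for link, local_addr in zip(links, generate_local_addr()):
        if link not in ports:
            # No port mapping defined: the link is skipped, as in the script.
            continue
        extra_hosts[link] = local_addr
    return extra_hosts


print(links_to_extra_hosts(["redis", "rng"], {"redis": 6379, "rng": 80}))
```

Note that a skipped link still consumes an address, because `zip` advances both sequences together; the addresses assigned to later links therefore depend on the order of `links`.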

View File

@@ -1,46 +0,0 @@
#!/usr/bin/env python
# FIXME: hardcoded
PORT="80"
import os
import subprocess
project_name = os.path.basename(os.path.realpath("."))
# Get all existing services for this application.
containers_data = subprocess.check_output([
"docker", "ps",
"--filter", "label=com.docker.compose.project={}".format(project_name),
"--format", '{{ .Label "com.docker.compose.service" }} '
'{{ .Label "com.docker.compose.container-number" }} '
'{{ .Label "loadbalancer" }}',
])
load_balancers = dict()
for line in containers_data.split('\n'):
if not line:
continue
service_name, container_number, load_balancer = line.split(' ')
if load_balancer:
if load_balancer not in load_balancers:
load_balancers[load_balancer] = []
load_balancers[load_balancer].append((service_name, int(container_number)))
for load_balancer, backends in load_balancers.items():
# FIXME: iterate on all load balancers
container_name = "{}_{}_1".format(project_name, load_balancer)
command = [
"docker", "run", "--rm",
"--volumes-from", container_name,
"--net", "container:{}".format(container_name),
"jpetazzo/hamba", "reconfigure", PORT,
]
command.extend(
"{}_{}_{}:{}".format(project_name, backend_name, backend_number, PORT)
for (backend_name, backend_number) in sorted(backends)
)
print("Updating configuration for {} with {} backend(s)..."
.format(container_name, len(backends)))
subprocess.check_output(command)
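The backend arguments passed to `reconfigure` above follow the container naming pattern `<project>_<service>_<number>`, followed by the port. A minimal sketch of that step, with an illustrative function name:

```python
def backend_addresses(project_name, backends, port):
    """Return '<project>_<service>_<number>:<port>' for each backend, sorted."""
    return [
        "{}_{}_{}:{}".format(project_name, backend_name, backend_number, port)
        for backend_name, backend_number in sorted(backends)
    ]


print(backend_addresses("dockercoins", [("rng", 2), ("rng", 1)], "80"))
```

Sorting keeps the argument order stable between runs, so the load balancer configuration only changes when the set of backends does.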

View File

@@ -1,201 +0,0 @@
#!/bin/sh
unset DOCKER_REGISTRY
unset DOCKER_HOST
unset COMPOSE_FILE
SWARM_IMAGE=${SWARM_IMAGE:-swarm}
prepare_1_check_ssh_keys () {
for N in $(seq 1 5); do
ssh node$N true
done
}
prepare_2_compile_swarm () {
cd ~
git clone git://github.com/docker/swarm
cd swarm
[[ -z "$1" ]] && {
echo "Specify which revision to build."
return
}
git checkout "$1" || return
mkdir -p image
docker build -t docker/swarm:$1 .
docker run -i --entrypoint sh docker/swarm:$1 \
-c 'cat $(which swarm)' > image/swarm
chmod +x image/swarm
cat >image/Dockerfile <<EOF
FROM scratch
COPY ./swarm /swarm
ENTRYPOINT ["/swarm", "-debug", "-experimental"]
EOF
docker build -t jpetazzo/swarm:$1 image
docker login
docker push jpetazzo/swarm:$1
docker logout
SWARM_IMAGE=jpetazzo/swarm:$1
}
clean_1_containers () {
for N in $(seq 1 5); do
ssh node$N "docker ps -aq | xargs -r -n1 -P10 docker rm -f"
done
}
clean_2_volumes () {
for N in $(seq 1 5); do
ssh node$N "docker volume ls -q | xargs -r docker volume rm"
done
}
clean_3_images () {
for N in $(seq 1 5); do
ssh node$N "docker images | awk '/dockercoins|jpetazzo/ {print \$1\":\"\$2}' | xargs -r docker rmi -f"
done
}
clean_4_machines () {
rm -rf ~/.docker/machine/
}
clean_all () {
clean_1_containers
clean_2_volumes
clean_3_images
clean_4_machines
}
dm_swarm () {
eval $(docker-machine env node1 --swarm)
}
dm_node1 () {
eval $(docker-machine env node1)
}
setup_1_swarm () {
grep node[12345] /etc/hosts | grep -v ^127 |
while read IPADDR NODENAME; do
docker-machine create --driver generic \
--engine-opt cluster-store=consul://localhost:8500 \
--engine-opt cluster-advertise=eth0:2376 \
--swarm --swarm-master --swarm-image $SWARM_IMAGE \
--swarm-discovery consul://localhost:8500 \
--swarm-opt replication --swarm-opt advertise=$IPADDR:3376 \
--generic-ssh-user docker --generic-ip-address $IPADDR $NODENAME
done
}
setup_2_consul () {
IPADDR=$(ssh node1 ip a ls dev eth0 |
sed -n 's,.*inet \(.*\)/.*,\1,p')
for N in 1 2 3 4 5; do
ssh node$N -- docker run -d --restart=always --name consul_node$N \
-e CONSUL_BIND_INTERFACE=eth0 --net host consul \
agent -server -retry-join $IPADDR -bootstrap-expect 5 \
-ui -client 0.0.0.0
done
}
setup_3_wait () {
# Wait for a Swarm master
dm_swarm
while ! docker ps; do sleep 1; done
# Wait for all nodes to be there
while ! [ "$(docker info | grep "^Nodes:")" = "Nodes: 5" ]; do sleep 1; done
}
setup_4_registry () {
cd ~/orchestration-workshop/registry
dm_swarm
docker-compose up -d
for N in $(seq 2 5); do
docker-compose scale frontend=$N
done
}
setup_5_btp_dockercoins () {
cd ~/orchestration-workshop/dockercoins
dm_node1
export DOCKER_REGISTRY=localhost:5000
cp docker-compose.yml-v2 docker-compose.yml
~/orchestration-workshop/bin/build-tag-push.py | tee /tmp/btp.log
export $(tail -n 1 /tmp/btp.log)
}
setup_6_add_lbs () {
cd ~/orchestration-workshop/dockercoins
~/orchestration-workshop/bin/add-load-balancer-v2.py rng
~/orchestration-workshop/bin/add-load-balancer-v2.py hasher
}
setup_7_consulfs () {
dm_swarm
docker pull jpetazzo/consulfs
for N in $(seq 1 5); do
ssh node$N "docker run --rm -v /usr/local/bin:/target jpetazzo/consulfs"
ssh node$N mkdir -p ~/consul
ssh -f node$N "mountpoint ~/consul || consulfs localhost:8500 ~/consul"
done
}
setup_8_syncmachine () {
while ! mountpoint ~/consul; do
sleep 1
done
cp -r ~/.docker/machine ~/consul/
for N in $(seq 2 5); do
ssh node$N mkdir -p ~/.docker
ssh node$N "[ -L ~/.docker/machine ] || ln -s ~/consul/machine ~/.docker"
done
}
setup_9_elk () {
dm_swarm
cd ~/orchestration-workshop/elk
docker-compose up -d
for N in $(seq 1 5); do
docker-compose scale logstash=$N
done
}
setup_all () {
setup_1_swarm
setup_2_consul
setup_3_wait
setup_4_registry
setup_5_btp_dockercoins
setup_6_add_lbs
setup_7_consulfs
setup_8_syncmachine
dm_swarm
}
force_remove_network () {
dm_swarm
NET="$1"
for CNAME in $(docker network inspect $NET | grep Name | grep -v \"$NET\" | cut -d\" -f4); do
echo $CNAME
docker network disconnect -f $NET $CNAME
done
docker network rm $NET
}
demo_1_compose_up () {
dm_swarm
cd ~/orchestration-workshop/dockercoins
docker-compose up -d
}
grep -qs -- MAGICMARKER "$0" && { # Don't display this line in the function list
echo "You should source this file, then invoke the following functions:"
grep -- '^[a-z].*{$' "$0" | cut -d" " -f1
}
show_swarm_primary () {
dm_swarm
docker info 2>/dev/null | grep -e ^Role -e ^Primary
}

View File

@@ -1,10 +0,0 @@
cadvisor:
  image: google/cadvisor
  ports:
    - "8080:8080"
  volumes:
    - "/:/rootfs:ro"
    - "/var/run:/var/run:rw"
    - "/sys:/sys:ro"
    - "/var/lib/docker/:/var/lib/docker:ro"

View File

@@ -1,19 +0,0 @@
# CEPH on Docker
Note: this doesn't quite work yet.
The OSD containers need to be started twice (the first time, they fail
initializing; second time is a champ).
Also, it looks like you need at least two OSD containers (or the OSD
container should have two disks/directories, whatever).
RadosGw is listening on port 8080.
The `admin` container will create a `docker` user using `radosgw-admin`.
If you run it multiple times, that's OK: further invocations are idempotent.
Last but not least: it looks like AWS CLI uses a new signature format
that doesn't work with RadosGW. After almost two hours trying to figure
out what was wrong, I tried the S3 credentials directly with boto and
it worked immediately (I was able to create a bucket).

View File

@@ -1,53 +0,0 @@
version: "2"
services:
mon:
image: ceph/daemon
command: mon
environment:
CEPH_PUBLIC_NETWORK: 10.33.0.0/16
MON_IP: 10.33.0.2
osd:
image: ceph/daemon
command: osd_directory
depends_on:
- mon
volumes_from:
- mon
volumes:
- /var/lib/ceph/osd
mds:
image: ceph/daemon
command: mds
environment:
CEPHFS_CREATE: 1
depends_on:
- mon
volumes_from:
- mon
rgw:
image: ceph/daemon
command: rgw
depends_on:
- mon
volumes_from:
- mon
environment:
CEPH_OPTS: --verbose
admin:
image: ceph/daemon
entrypoint: radosgw-admin
depends_on:
- mon
volumes_from:
- mon
command: user create --uid=docker --display-name=docker
networks:
default:
ipam:
driver: default
config:
- subnet: 10.33.0.0/16
gateway: 10.33.0.1

View File

@@ -1,12 +0,0 @@
version: "2"
services:
bootstrap:
image: jpetazzo/consul
command: agent -server -bootstrap
container_name: bootstrap
server:
image: jpetazzo/consul
command: agent -server -join bootstrap -join server
client:
image: jpetazzo/consul
command: members -rpc-addr server:8400

View File

@@ -1,30 +1,21 @@
version: "2"
services:
rng1:
build: rng
rng2:
build: rng
rng3:
build: rng
rng:
image: jpetazzo/hamba
command: 80 rng1:80 rng2:80 rng3:80
depends_on:
- rng1
- rng2
- rng3
build: rng
image: ${REGISTRY_SLASH}rng${COLON_TAG}
ports:
- "8001:80"
hasher:
build: hasher
image: ${REGISTRY_SLASH}hasher${COLON_TAG}
ports:
- "8002:80"
webui:
build: webui
image: ${REGISTRY_SLASH}webui${COLON_TAG}
ports:
- "8000:80"
volumes:
@@ -35,4 +26,5 @@ services:
worker:
build: worker
image: ${REGISTRY_SLASH}worker${COLON_TAG}

View File

@@ -1,27 +0,0 @@
version: "2"
services:
rng:
build: rng
ports:
- "8001:80"
hasher:
build: hasher
ports:
- "8002:80"
webui:
build: webui
ports:
- "8000:80"
volumes:
- "./webui/files/:/files/"
redis:
image: jpetazzo/hamba
command: 6379 AA.BB.CC.DD:EEEEE
worker:
build: worker

View File

@@ -1,26 +0,0 @@
version: "2"
services:
rng:
build: rng
ports:
- "80"
hasher:
build: hasher
ports:
- "8002:80"
webui:
build: webui
ports:
- "8000:80"
volumes:
- "./webui/files/:/files/"
redis:
image: redis
worker:
build: worker

View File

@@ -1,20 +0,0 @@
version: '2'
services:
rng:
build: rng
hasher:
build: hasher
webui:
build: webui
ports:
- "8000:80"
redis:
image: redis
worker:
build: worker

View File

@@ -1,7 +1,10 @@
FROM ruby:alpine
RUN apk add --update build-base
RUN apk add --update build-base curl
RUN gem install sinatra
RUN gem install thin
ADD hasher.rb /
CMD ["ruby", "hasher.rb"]
EXPOSE 80
HEALTHCHECK \
--interval=1s --timeout=2s --retries=3 --start-period=1s \
CMD curl http://localhost/ || exit 1

View File

@@ -1,5 +0,0 @@
hasher: 80
redis: 6379
rng: 80
webui: 80

View File

@@ -50,7 +50,7 @@ function refresh () {
points.push({ x: s2.now, y: speed });
}
$("#speed").text("~" + speed.toFixed(1) + " hashes/second");
var msg = ("I'm attending the @docker workshop at @scaleconf, "
var msg = ("I'm attending a @docker orchestration workshop, "
+ "and my #DockerCoins mining rig is crunching "
+ speed.toFixed(1) + " hashes/second! W00T!");
$("#tweet").attr(

8
docs/TODO Normal file
View File

@@ -0,0 +1,8 @@
Black belt references that I want to add somewhere:
What Have Namespaces Done for You Lately?
https://www.youtube.com/watch?v=MHv6cWjvQjM&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=8
Cilium: Network and Application Security with BPF and XDP
https://www.youtube.com/watch?v=ilKlmTDdFgk&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=9

41
docs/apiscope.md Normal file
View File

@@ -0,0 +1,41 @@
## A reminder about *scope*
- Out of the box, Docker API access is "all or nothing"
- When someone has access to the Docker API, they can access *everything*
- If your developers are using the Docker API to deploy on the dev cluster ...
... and the dev cluster is the same as the prod cluster ...
... it means that your devs have access to your production data, passwords, etc.
- This can easily be avoided
---
## Fine-grained API access control
A few solutions, by increasing order of flexibility:
- Use separate clusters for different security perimeters
(And different credentials for each cluster)
--
- Add an extra layer of abstraction (sudo scripts, hooks, or full-blown PAAS)
--
- Enable [authorization plugins]
- each API request is vetted by your plugin(s)
- by default, the *subject name* in the client TLS certificate is used as user name
- example: [user and permission management] in [UCP]
[authorization plugins]: https://docs.docker.com/engine/extend/plugins_authorization/
[UCP]: https://docs.docker.com/datacenter/ucp/2.1/guides/
[user and permission management]: https://docs.docker.com/datacenter/ucp/2.1/guides/admin/manage-users/
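To make "each API request is vetted by your plugin(s)" more concrete, here is a toy decision function of the kind such a plugin might implement. It is only an illustration: the function, its parameters, and the policy (a set of admin users, read-only access for everyone else) are all made up for this sketch, and it does not use the actual authorization plugin protocol or its field names.

```python
def is_request_allowed(user, method, admins):
    """Toy policy: admins may do anything, other users may only read.

    `user` stands for the subject name taken from the client TLS
    certificate; `method` is the HTTP method of the API request.
    """
    if not user:
        # No identity at all: refuse.
        return False
    if user in admins:
        return True
    # Everyone else is limited to read-only requests.
    return method.upper() in ("GET", "HEAD")


admins = {"alice"}
print(is_request_allowed("alice", "POST", admins))
print(is_request_allowed("bob", "POST", admins))
print(is_request_allowed("bob", "GET", admins))
```

A real policy would usually also look at the request path, since a read-only call can still expose sensitive data.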

BIN
docs/bell-curve.jpg Normal file

Binary file not shown (15 KiB).

BIN
docs/blackbelt.png Normal file

Binary file not shown (16 KiB).

30
docs/build.sh Executable file
View File

@@ -0,0 +1,30 @@
#!/bin/sh
case "$1" in
once)
  for YAML in *.yml; do
    ./markmaker.py < $YAML > $YAML.html || rm $YAML.html
  done
  ;;
forever)
  # There is a weird bug in entr, at least on MacOS,
  # where it doesn't restore the terminal to a clean
  # state when exiting. So let's try to work around
  # it with stty.
  STTY=$(stty -g)
  while true; do
    find . | entr -d $0 once
    STATUS=$?
    case $STATUS in
    2) echo "Directory has changed. Restarting.";;
    130) echo "SIGINT or q pressed. Exiting."; break;;
    *) echo "Weird exit code: $STATUS. Retrying in 1 second."; sleep 1;;
    esac
  done
  stty $STTY
  ;;
*)
  echo "$0 <once|forever>"
  ;;
esac

9
docs/chat/index.html Normal file
View File

@@ -0,0 +1,9 @@
<html>
<!-- Generated with index.html.sh -->
<head>
<meta http-equiv="refresh" content="0; URL='https://dockercommunity.slack.com/messages/docker-mentor'" />
</head>
<body>
<a href="https://dockercommunity.slack.com/messages/docker-mentor">https://dockercommunity.slack.com/messages/docker-mentor</a>
</body>
</html>

16
docs/chat/index.html.sh Executable file

@@ -0,0 +1,16 @@
#!/bin/sh
#LINK=https://gitter.im/jpetazzo/workshop-20170322-sanjose
LINK=https://dockercommunity.slack.com/messages/docker-mentor
#LINK=https://usenix-lisa.slack.com/messages/docker
sed "s,@@LINK@@,$LINK,g" >index.html <<EOF
<html>
<!-- Generated with index.html.sh -->
<head>
<meta http-equiv="refresh" content="0; URL='$LINK'" />
</head>
<body>
<a href="$LINK">$LINK</a>
</body>
</html>
EOF

283
docs/concepts-k8s.md Normal file

@@ -0,0 +1,283 @@
# Kubernetes concepts
- Kubernetes is a container management system
- It runs and manages containerized applications on a cluster
--
- What does that really mean?
---
## Basic things we can ask Kubernetes to do
--
- Start 5 containers using image `atseashop/api:v1.3`
--
- Place an internal load balancer in front of these containers
--
- Start 10 containers using image `atseashop/webfront:v1.3`
--
- Place a public load balancer in front of these containers
--
- It's Black Friday (or Christmas), traffic spikes, grow our cluster and add containers
--
- New release! Replace my containers with the new image `atseashop/webfront:v1.4`
--
- Keep processing requests during the upgrade; update my containers one at a time
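For the curious, here is a rough sketch of how those requests could map to `kubectl` commands. The deployment names (`api`, `webfront`), the port numbers, and the `print_plan` helper are assumptions made up for this example; the function only prints the commands, so they can be reviewed before being run against a real cluster.

```bash
#!/bin/sh
# print_plan: hypothetical helper that prints (but does not run) kubectl
# commands matching the requests listed above.
# Deployment names and ports are assumptions, not part of the workshop app.
print_plan() {
    cat <<'EOF'
kubectl create deployment api --image=atseashop/api:v1.3
kubectl scale deployment api --replicas=5
kubectl expose deployment api --port=80
kubectl create deployment webfront --image=atseashop/webfront:v1.3
kubectl scale deployment webfront --replicas=10
kubectl expose deployment webfront --port=80 --type=LoadBalancer
kubectl set image deployment/webfront '*=atseashop/webfront:v1.4'
kubectl rollout status deployment/webfront
EOF
}

print_plan
```

The "grow our cluster" step is left out on purpose: adding nodes depends on the infrastructure provider, not on `kubectl`.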
---
## Other things that Kubernetes can do for us
- Basic autoscaling
- Blue/green deployment, canary deployment
- Long running services, but also batch (one-off) jobs
- Overcommit our cluster and *evict* low-priority jobs
- Run services with *stateful* data (databases etc.)
- Fine-grained access control defining *what* can be done by *whom* on *which* resources
- Integrating third party services (*service catalog*)
- Automating complex tasks (*operators*)
---
## Kubernetes architecture
---
class: pic
![haha only kidding](k8s-arch1.png)
---
## Kubernetes architecture
- Ha ha ha ha
- OK, I was trying to scare you, it's much simpler than that ❤️
---
class: pic
![that one is more like the real thing](k8s-arch2.png)
---
## Credits
- The first schema is a Kubernetes cluster with storage backed by multi-path iSCSI
(Courtesy of [Yongbok Kim](https://www.yongbok.net/blog/))
- The second one is a good simplified representation of a Kubernetes cluster
(Courtesy of [Imesh Gunaratne](https://medium.com/containermind/a-reference-architecture-for-deploying-wso2-middleware-on-kubernetes-d4dee7601e8e))
---
## Kubernetes architecture: the master
- The Kubernetes logic (its "brains") is a collection of services:
- the API server (our point of entry to everything!)
- core services like the scheduler and controller manager
- `etcd` (a highly available key/value store; the "database" of Kubernetes)
- Together, these services form what is called the "master"
- These services can run straight on a host, or in containers
<br/>
(that's an implementation detail)
- `etcd` can be run on separate machines (first schema) or co-located (second schema)
- We need at least one master, but we can have more (for high availability)
---
## Kubernetes architecture: the nodes
- The nodes executing our containers run another collection of services:
- a container engine (typically Docker)
- kubelet (the "node agent")
- kube-proxy (a necessary but not sufficient network component)
- Nodes were formerly called "minions"
- It is customary to *not* run apps on the node(s) running master components
(Except when using small development clusters)
---
## Do we need to run Docker at all?
No!
--
- By default, Kubernetes uses the Docker Engine to run containers
- We could also use `rkt` ("Rocket") from CoreOS
- Or leverage other pluggable runtimes through the *Container Runtime Interface*
(like CRI-O, or containerd)
---
## Do we need to run Docker at all?
Yes!
--
- In this workshop, we run our app on a single node first
- We will need to build images and ship them around
- We can do these things without Docker
<br/>
(and get diagnosed with NIH¹ syndrome)
- Docker is still the most stable container engine today
<br/>
(but other options are maturing very quickly)
.footnote[¹[Not Invented Here](https://en.wikipedia.org/wiki/Not_invented_here)]
---
## Do we need to run Docker at all?
- On our development environments, CI pipelines ... :
*Yes, almost certainly*
- On our production servers:
*Yes (today)*
*Probably not (in the future)*
.footnote[More information about CRI [on the Kubernetes blog](http://blog.kubernetes.io/2016/12/container-runtime-interface-cri-in-kubernetes.html).]
---
## Kubernetes resources
- The Kubernetes API defines a lot of objects called *resources*
- These resources are organized by type, or `Kind` (in the API)
- A few common resource types are:
- node (self-explanatory)
- pod (group of containers running together on a node)
- service (stable network endpoint to connect to one or multiple containers)
- namespace (more-or-less isolated group of things)
- secret (bundle of sensitive data to be passed to a container)
And much more! (We can see the full list by running `kubectl get`)
---
# Declarative vs imperative
- Kubernetes puts a very strong emphasis on being *declarative*
- Declarative:
*I want a cup of tea. Make it happen.*
- Imperative:
*Boil some water. Pour it in a teapot. Add tea leaves. Steep for a while. Serve in cup.*
--
- Declarative seems simpler at first ...
--
- ... As long as you know how to brew tea
---
## Declarative vs imperative
- What declarative would really be:
*I want a cup of tea, obtained by pouring an infusion¹ of tea leaves in a cup.*
--
*¹An infusion is obtained by letting the object steep a few minutes in hot² water.*
--
*²Hot liquid is obtained by pouring it in an appropriate container³ and setting it on a stove.*
--
*³Ah, finally, containers! Something we know about. Let's get to work, shall we?*
---
## Declarative vs imperative
- Imperative systems:
- simpler
- if a task is interrupted, we have to restart from scratch
- Declarative systems:
- if a task is interrupted (or if we show up to the party half-way through),
we can figure out what's missing and do only what's necessary
- we need to be able to *observe* the system
- ... and compute a "diff" between *what we have* and *what we want*
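The "observe, diff, act" idea can be sketched in a few lines of shell. This is a toy model, not how Kubernetes controllers are implemented: `reconcile` is a hypothetical function that takes the desired and the observed number of containers and prints what is left to do.

```bash
#!/bin/sh
# reconcile DESIRED CURRENT
# Toy reconciliation step: compare what we want with what we have,
# and print only the action needed to converge.
reconcile() {
    desired=$1
    current=$2
    if [ "$current" -lt "$desired" ]; then
        echo "start $((desired - current))"
    elif [ "$current" -gt "$desired" ]; then
        echo "stop $((current - desired))"
    else
        echo "nothing to do"
    fi
}

reconcile 5 3
```

Calling it again after a partial failure (say, `reconcile 5 4`) only asks for the missing piece, which is the whole point of the declarative approach.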
---
## Declarative vs imperative in Kubernetes
- Virtually everything we create in Kubernetes is created from a *spec*
- Watch for the `spec` fields in the YAML files later!
- The *spec* describes *how we want the thing to be*
- Kubernetes will *reconcile* the current state with the spec
<br/>(technically, this is done by a number of *controllers*)
- When we want to change some resource, we update the *spec*
- Kubernetes will then *converge* that resource

352
docs/creatingswarm.md Normal file
View File

@@ -0,0 +1,352 @@
# Creating our first Swarm
- The cluster is initialized with `docker swarm init`
- This should be executed on a first, seed node
- .warning[DO NOT execute `docker swarm init` on multiple nodes!]
You would have multiple disjoint clusters.
.exercise[
- Create our cluster from node1:
```bash
docker swarm init
```
]
--
class: advertise-addr
If Docker tells you that it `could not choose an IP address to advertise`, see next slide!
---
class: advertise-addr
## IP address to advertise
- When running in Swarm mode, each node *advertises* its address to the others
<br/>
(i.e. it tells them *"you can contact me on 10.1.2.3:2377"*)
- If the node has only one IP address (other than 127.0.0.1), it is used automatically
- If the node has multiple IP addresses, you **must** specify which one to use
<br/>
(Docker refuses to pick one randomly)
- You can specify an IP address or an interface name
<br/>(in the latter case, Docker will read the IP address of the interface and use it)
- You can also specify a port number
<br/>(otherwise, the default port 2377 will be used)
---
class: advertise-addr
## Which IP address should be advertised?
- If your nodes have only one IP address, it's safe to let autodetection do the job
.small[(Except if your instances have different private and public addresses, e.g.
on EC2, and you are building a Swarm involving nodes inside and outside the
private network: then you should advertise the public address.)]
- If your nodes have multiple IP addresses, pick an address which is reachable
*by every other node* of the Swarm
- If you are using [play-with-docker](http://play-with-docker.com/), use the IP
address shown next to the node name
.small[(This is the address of your node on your private internal overlay network.
The other address that you might see is the address of your node on the
`docker_gwbridge` network, which is used for outbound traffic.)]
Examples:
```bash
docker swarm init --advertise-addr 10.0.9.2
docker swarm init --advertise-addr eth0:7777
```
---
class: extra-details
## Using a separate interface for the data path
- You can use different interfaces (or IP addresses) for control and data
- You set the _control plane path_ with `--advertise-addr`
(This will be used for SwarmKit manager/worker communication, leader election, etc.)
- You set the _data plane path_ with `--data-path-addr`
(This will be used for traffic between containers)
- Both flags can accept either an IP address, or an interface name
(When specifying an interface name, Docker will use its first IP address)
---
## Token generation
- In the output of `docker swarm init`, we have a message
confirming that our node is now the (single) manager:
```
Swarm initialized: current node (8jud...) is now a manager.
```
- Docker generated two security tokens (like passphrases or passwords) for our cluster
- The CLI shows us the command to use on other nodes to add them to the cluster using the "worker"
security token:
```
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-59fl4ak4nqjmao1ofttrc4eprhrola2l87... \
172.31.4.182:2377
```
---
class: extra-details
## Checking that Swarm mode is enabled
.exercise[
- Run the traditional `docker info` command:
```bash
docker info
```
]
The output should include:
```
Swarm: active
NodeID: 8jud7o8dax3zxbags3f8yox4b
Is Manager: true
ClusterID: 2vcw2oa9rjps3a24m91xhvv0c
...
```
---
## Running our first Swarm mode command
- Let's retry the exact same command as earlier
.exercise[
- List the nodes (well, the only node) of our cluster:
```bash
docker node ls
```
]
The output should look like the following:
```
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
8jud...ox4b * node1 Ready Active Leader
```
---
## Adding nodes to the Swarm
- A cluster with one node is not a lot of fun
- Let's add `node2`!
- We need the token that was shown earlier
--
- You wrote it down, right?
--
- Don't panic, we can easily see it again 😏
---
## Adding nodes to the Swarm
.exercise[
- Show the token again:
```bash
docker swarm join-token worker
```
- Switch to `node2`
- Copy-paste the `docker swarm join ...` command
<br/>(that was displayed just before)
]
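If copy-pasting gets old, the same steps can be scripted from `node1`. This is only a sketch under assumptions: `join_worker` is a hypothetical helper, and it assumes that `node1` can SSH to the target node and that the manager address given as second argument is reachable from that node.

```bash
#!/bin/sh
# join_worker NODE MANAGER_ADDR
# Hypothetical helper, meant to run on a manager:
# fetch the worker token, then run "docker swarm join" on NODE over SSH.
join_worker() {
    node=$1
    manager_addr=$2
    # -q prints only the token, without the explanatory text
    token=$(docker swarm join-token -q worker) || return 1
    ssh "$node" docker swarm join --token "$token" "$manager_addr"
}

# Example (not executed here): join_worker node2 node1:2377
```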
---
class: extra-details
## Check that the node was added correctly
- Stay on `node2` for now!
.exercise[
- We can still use `docker info` to verify that the node is part of the Swarm:
```bash
docker info | grep ^Swarm
```
]
- However, Swarm commands will not work; try, for instance:
```
docker node ls
```
- This is because the node that we added is currently a *worker*
- Only *managers* can accept Swarm-specific commands
---
## View our two-node cluster
- Let's go back to `node1` and see what our cluster looks like
.exercise[
- Switch back to `node1`
- View the cluster from `node1`, which is a manager:
```bash
docker node ls
```
]
The output should be similar to the following:
```
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
8jud...ox4b * node1 Ready Active Leader
ehb0...4fvx node2 Ready Active
```
---
class: under-the-hood
## Under the hood: docker swarm init
When we do `docker swarm init`:
- a keypair is created for the root CA of our Swarm
- a keypair is created for the first node
- a certificate is issued for this node
- the join tokens are created
---
class: under-the-hood
## Under the hood: join tokens
There is one token to *join as a worker*, and another to *join as a manager*.
The join tokens have two parts:
- a secret key (preventing unauthorized nodes from joining)
- a fingerprint of the root CA certificate (preventing MITM attacks)
If a token is compromised, it can be rotated instantly with:
```
docker swarm join-token --rotate <worker|manager>
```
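To make the two parts visible, here is a small sketch that splits a token into its pieces. It assumes the `SWMTKN-1-<fingerprint>-<secret>` layout (CA fingerprint first, secret second); `split_join_token` is a hypothetical helper and the token in the example is obviously fake.

```bash
#!/bin/sh
# split_join_token TOKEN
# Hypothetical helper: print the two parts of a version 1 join token.
# Assumes the layout SWMTKN-1-<CA fingerprint>-<secret>.
split_join_token() {
    token=$1
    case $token in
        SWMTKN-1-*-*) ;;
        *) echo "not a version 1 join token" >&2; return 1 ;;
    esac
    rest=${token#SWMTKN-1-}              # drop the prefix and version
    echo "ca_fingerprint=${rest%%-*}"    # everything before the first dash
    echo "secret=${rest#*-}"             # everything after the first dash
}

split_join_token SWMTKN-1-aaaa1111-bbbb2222
```

Rotating a token changes the secret part; the fingerprint stays the same as long as the root CA is not rotated.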
---
class: under-the-hood
## Under the hood: docker swarm join
When a node joins the Swarm:
- it is issued its own keypair, signed by the root CA
- if the node is a manager:
- it joins the Raft consensus
- it connects to the current leader
- it accepts connections from worker nodes
- if the node is a worker:
- it connects to one of the managers (leader or follower)
---
class: under-the-hood
## Under the hood: cluster communication
- The *control plane* is encrypted with AES-GCM; keys are rotated every 12 hours
- Authentication is done with mutual TLS; certificates are rotated every 90 days
(`docker swarm update` lets us change this delay or use an external CA)
- The *data plane* (communication between containers) is not encrypted by default
(but this can be activated on a per-network basis, using IPsec,
leveraging hardware crypto if available)
---
class: under-the-hood
## Under the hood: I want to know more!
Revisit SwarmKit concepts:
- Docker 1.12 Swarm Mode Deep Dive Part 1: Topology
([video](https://www.youtube.com/watch?v=dooPhkXT9yI))
- Docker 1.12 Swarm Mode Deep Dive Part 2: Orchestration
([video](https://www.youtube.com/watch?v=_F6PSP-qhdA))
Some presentations from the Docker Distributed Systems Summit in Berlin:
- Heart of the SwarmKit: Topology Management
([slides](https://speakerdeck.com/aluzzardi/heart-of-the-swarmkit-topology-management))
- Heart of the SwarmKit: Store, Topology & Object Model
([slides](http://www.slideshare.net/Docker/heart-of-the-swarmkit-store-topology-object-model))
([video](https://www.youtube.com/watch?v=EmePhjGnCXY))
And DockerCon Black Belt talks:
.blackbelt[[Everything You Thought You Already Knew About Orchestration](https://www.youtube.com/watch?v=Qsv-q8WbIZY&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=6) by Laura Frank (DC17US)]
.blackbelt[Container Orchestration from Theory to Practice by Laura Frank and Stephen Day (Wednesday 14:25)]

371
docs/daemonset.md Normal file

@@ -0,0 +1,371 @@
# Daemon sets
- Remember: we did all that cluster orchestration business for `rng`
- We want one (and exactly one) instance of `rng` per node
- If we just scale `deploy/rng` to 4, nothing guarantees that the pods will be spread across our nodes
- Instead of a `deployment`, we will use a `daemonset`
- Daemon sets are great for cluster-wide, per-node processes:
- `kube-proxy`
- `weave` (our overlay network)
- monitoring agents
- hardware management tools (e.g. SCSI/FC HBA agents)
- etc.
- They can also be restricted to run [only on some nodes](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#running-pods-on-only-some-nodes)
---
## Creating a daemon set
- Unfortunately, as of Kubernetes 1.8, the CLI cannot create daemon sets
--
- More precisely: it doesn't have a subcommand to create a daemon set
--
- But any kind of resource can always be created by providing a YAML description:
```bash
kubectl apply -f foo.yaml
```
--
- How do we create the YAML file for our daemon set?
--
- option 1: read the docs
--
- option 2: `vi` our way out of it
---
## Creating the YAML file for our daemon set
- Let's start with the YAML file for the current `rng` resource
.exercise[
- Dump the `rng` resource in YAML:
```bash
kubectl get deploy/rng -o yaml --export >rng.yml
```
- Edit `rng.yml`
]
Note: `--export` will remove "cluster-specific" information, i.e.:
- namespace (so that the resource is not tied to a specific namespace)
- status and creation timestamp (useless when creating a new resource)
- resourceVersion and uid (these would cause... *interesting* problems)
---
## "Casting" a resource to another
- What if we just changed the `kind` field?
(It can't be that easy, right?)
.exercise[
- Change `kind: Deployment` to `kind: DaemonSet`
- Save, quit
- Try to create our new resource:
```bash
kubectl apply -f rng.yml
```
]
--
We all knew this couldn't be that easy, right?
---
## Understanding the problem
- The core of the error is:
```
error validating data:
[ValidationError(DaemonSet.spec):
unknown field "replicas" in io.k8s.api.extensions.v1beta1.DaemonSetSpec,
...
```
--
- *Obviously,* it doesn't make sense to specify a number of replicas for a daemon set
--
- Workaround: fix the YAML
- remove the `replicas` field
- remove the `strategy` field (which defines the rollout mechanism for a deployment)
- remove the `status: {}` line at the end
--
- Or, we could also ...
---
## Use the `--force`, Luke
- We could also tell Kubernetes to ignore these errors and try anyway
- The `--force` flag's actual name is `--validate=false`
.exercise[
- Try to load our YAML file and ignore errors:
```bash
kubectl apply -f rng.yml --validate=false
```
]
--
Wait ... Now, can it be *that* easy?
---
## Checking what we've done
- Did we transform our `deployment` into a `daemonset`?
.exercise[
- Look at the resources that we have now:
```bash
kubectl get all
```
]
--
We have both `deploy/rng` and `ds/rng` now!
--
And one too many pods ...
---
## Explanation
- You can have different resource types with the same name
(i.e. a *deployment* and a *daemonset* both named `rng`)
- We still have the old `rng` *deployment*
- But now we have the new `rng` *daemonset* as well
- If we look at the pods, we have:
- *one pod* for the deployment
- *one pod per node* for the daemonset
---
## What are all these pods doing?
- Let's check the logs of all these `rng` pods
- All these pods have a `run=rng` label:
- the first pod, because that's what `kubectl run` does
- the other ones (in the daemon set), because we
*copied the spec from the first one*
- Therefore, we can query everybody's logs using that `run=rng` selector
.exercise[
- Check the logs of all the pods having a label `run=rng`:
```bash
kubectl logs -l run=rng --tail 1
```
]
--
It appears that *all the pods* are serving requests at the moment.
---
## The magic of selectors
- The `rng` *service* is load balancing requests to a set of pods
- This set of pods is defined as "pods having the label `run=rng`"
.exercise[
- Check the *selector* in the `rng` service definition:
```bash
kubectl describe service rng
```
]
When we created additional pods with this label, they were
automatically detected by `svc/rng` and added as *endpoints*
to the associated load balancer.
---
## Removing the first pod from the load balancer
- What would happen if we removed that pod, with `kubectl delete pod ...`?
--
The `replicaset` would re-create it immediately.
--
- What would happen if we removed the `run=rng` label from that pod?
--
The `replicaset` would re-create it immediately.
--
... Because what matters to the `replicaset` is the number of pods *matching that selector.*
--
- But but but ... Don't we have more than one pod with `run=rng` now?
--
The answer lies in the exact selector used by the `replicaset` ...
---
## Deep dive into selectors
- Let's look at the selectors for the `rng` *deployment* and the associated *replica set*
.exercise[
- Show detailed information about the `rng` deployment:
```bash
kubectl describe deploy rng
```
- Show detailed information about the `rng` replica set:
<br/>(The second command doesn't require you to get the exact name of the replica set)
```bash
kubectl describe rs rng-yyyy
kubectl describe rs -l run=rng
```
]
--
The replica set selector also has a `pod-template-hash`, unlike the pods in our daemon set.
---
# Updating a service through labels and selectors
- What if we want to drop the `rng` deployment from the load balancer?
- Option 1:
- destroy it
- Option 2:
- add an extra *label* to the daemon set
- update the service *selector* to refer to that *label*
--
Of course, option 2 offers more learning opportunities. Right?
---
## Add an extra label to the daemon set
- We will update the daemon set "spec"
- Option 1:
- edit the `rng.yml` file that we used earlier
- `kubectl apply -f rng.yml` to load the new definition
- Option 2:
- use `kubectl edit`
.exercise[
- Use one of the two options!
]
---
## A few possible gotchas ...
- There is a difference between:
- the label(s) of a resource (in the `metadata` block in the beginning)
- the selector of a resource (in the `spec` block)
- the label(s) of the resource(s) created by the first resource (in the `template` block)
- You want to update the selector and the template (at least)
- The template must match the selector
(i.e. the resource will refuse to create resources that it will not select)
- In YAML, `yes` should be quoted; i.e. `isactive: "yes"`
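To see where each of these lives, here is an illustrative manifest written to a scratch file. Everything in it is an assumption made for the example: the file name, the `apiVersion` (clusters from the Kubernetes 1.8 era would typically use `extensions/v1beta1`), and the image name. Only the `run=rng` and `isactive` labels come from the slides.

```bash
#!/bin/sh
# Write an illustrative daemon set manifest to a scratch file.
# File name, apiVersion and image are hypothetical; adapt before use.
cat > rng-daemonset-example.yml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: rng
  labels:                 # 1. labels of the daemon set itself
    run: rng
spec:
  selector:               # 2. which pods the daemon set considers its own
    matchLabels:
      run: rng
      isactive: "yes"
  template:
    metadata:
      labels:             # 3. labels put on the pods that get created
        run: rng
        isactive: "yes"   # quoted, otherwise YAML reads it as a boolean
    spec:
      containers:
      - name: rng
        image: example/rng:latest
EOF
```

Note how the labels in block 3 satisfy the selector in block 2; block 1 can differ without breaking anything.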
---
## Wrapping up
.exercise[
- Update the daemon set selector and template label
- Update the service selector
- Check the logs of all `run=rng` pods to check that only 4 of them are now active
- Look at the pods that we have right now
- Bonus exercise 1: clean up the pods of the "old" daemon set
- Bonus exercise 2: what could we have done to avoid creating new pods?
]

179
docs/dashboard.md Normal file

@@ -0,0 +1,179 @@
# The Kubernetes dashboard
- Kubernetes resources can also be viewed with a web dashboard
- We are going to deploy that dashboard with *three commands:*
- one to actually *run* the dashboard
- one to make the dashboard available from outside
- one to bypass authentication for the dashboard
--
.footnote[.warning[Yes, this will open our cluster to all kinds of shenanigans. Don't do this at home.]]
---
## Running the dashboard
- We need to create a *deployment* and a *service* for the dashboard
- But also a *secret*, a *service account*, a *role* and a *role binding*
- All these things can be defined in a YAML file and created with `kubectl apply -f`
.exercise[
- Create all the dashboard resources, with the following command:
```bash
kubectl apply -f https://goo.gl/Qamqab
```
]
The goo.gl URL expands to:
<br/>
.small[https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml]
---
## Making the dashboard reachable from outside
- The dashboard is exposed through a `ClusterIP` service
- We need a `NodePort` service instead
.exercise[
- Edit the service:
```bash
kubectl edit service kubernetes-dashboard
```
]
--
`NotFound`?!? Y U NO WORK?!?
---
## Editing the `kubernetes-dashboard` service
- If we look at the YAML that we loaded just before, we'll get a hint
--
- The dashboard was created in the `kube-system` namespace
.exercise[
- Edit the service:
```bash
kubectl -n kube-system edit service kubernetes-dashboard
```
- Change `ClusterIP` to `NodePort`, save, and exit
- Check the port that was assigned with `kubectl -n kube-system get services`
]
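If an interactive editor is not an option (in a script, for instance), the same change can probably be made with `kubectl patch`. This is a sketch: `expose_dashboard` is a hypothetical helper name, and it assumes that the service has a single port, so that `ports[0]` is the one we want.

```bash
#!/bin/sh
# expose_dashboard: hypothetical helper.
# Switch the dashboard service to NodePort, then print the allocated port.
expose_dashboard() {
    kubectl -n kube-system patch service kubernetes-dashboard \
        -p '{"spec": {"type": "NodePort"}}' &&
    kubectl -n kube-system get service kubernetes-dashboard \
        -o jsonpath='{.spec.ports[0].nodePort}'
}

# Example (not executed here): expose_dashboard
```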
---
## Connecting to the dashboard
.exercise[
- Connect to https://oneofournodes:3xxxx/
(You will have to work around the TLS certificate validation warning)
]
- We have three authentication options at this point:
- token (associated with a role that has appropriate permissions)
- kubeconfig (e.g. using the `~/.kube/config` file from `node1`)
- "skip" (use the dashboard "service account")
- Let's use "skip": we get a bunch of warnings and don't see much
---
## Granting more rights to the dashboard
- The dashboard documentation [explains how to do this](https://github.com/kubernetes/dashboard/wiki/Access-control#admin-privileges)
- We just need to load another YAML file!
.exercise[
- Grant admin privileges to the dashboard so we can see our resources:
```bash
kubectl apply -f https://goo.gl/CHsLTA
```
- Reload the dashboard and enjoy!
]
--
.warning[By the way, we just added a backdoor to our Kubernetes cluster!]
---
# Security implications of `kubectl apply`
- When we do `kubectl apply -f <URL>`, we create arbitrary resources
- Resources can be evil; imagine a `deployment` that ...
--
- starts bitcoin miners on the whole cluster
--
- hides in a non-default namespace
--
- bind-mounts our nodes' filesystem
--
- inserts SSH keys in the root account (on the node)
--
- encrypts our data and ransoms it
--
- ☠️☠️☠️
---
## `kubectl apply` is the new `curl | sh`
- `curl | sh` is convenient
- It's safe if you use HTTPS URLs from trusted sources
--
- `kubectl apply -f` is convenient
- It's safe if you use HTTPS URLs from trusted sources
--
- It introduces new failure modes
- Example: the official setup instructions for most pod networks
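One way to reduce the risk is to download the manifest, look at what it contains, and only then apply the local copy. The sketch below is one possible approach, not an official workflow: `review_then_apply` is a hypothetical helper, and listing the top-level `kind:` lines is a quick sanity check, not a substitute for reading the file.

```bash
#!/bin/sh
# review_then_apply URL
# Hypothetical helper: fetch a manifest, list the resource kinds it
# declares, and apply the downloaded copy only after confirmation.
review_then_apply() {
    url=$1
    manifest=$(mktemp) || return 1
    curl -fsSL "$url" -o "$manifest" || { rm -f "$manifest"; return 1; }
    echo "Resource kinds found in $url:"
    grep -E '^kind:' "$manifest" | sort | uniq -c
    printf 'Apply this manifest? [y/N] '
    read -r answer
    case $answer in
        y|Y) kubectl apply -f "$manifest" ;;
        *)   echo "Not applied." ;;
    esac
    rm -f "$manifest"
}

# Example (not executed here): review_then_apply https://example.com/app.yaml
```

Applying the local file (rather than the URL a second time) guarantees that what we reviewed is what gets created.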

181
docs/dockercon.yml Normal file

@@ -0,0 +1,181 @@
chat: https://dockercommunity.slack.com/messages/C7ET1GY4Q
exclude:
- self-paced
- snap
- auto-btp
- benchmarking
- elk-manual
- prom-manual
chapters:
- |
class: title
.small[
Swarm: from Zero to Hero
.small[.small[
**Be kind to the WiFi!**
*Use the 5G network*
<br/>
*Don't use your hotspot*
<br/>
*Don't stream videos from YouTube, Netflix, etc.
<br/>(if you're bored, watch local content instead)*
Also: share the power outlets
<br/>
*(with limited power comes limited responsibility?)*
<br/>
*(or something?)*
Thank you!
]
]
]
---
## Intros
<!--
- Hello! We are
AJ ([@s0ulshake](https://twitter.com/s0ulshake))
&
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo))
-->
- Hello! We are Jérôme, Lee, Nicholas, and Scott
<!--
I am
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo))
-->
--
- This is our collective Docker knowledge:
![Bell Curve](bell-curve.jpg)
---
## "From zero to hero"
--
- It rhymes, but it's a pretty bad title, to be honest
--
- None of you is a "zero"
--
- None of us is a "hero"
--
- None of us should even try to be a hero
--
*The hero syndrome is a phenomenon affecting people who seek heroism or recognition,
usually by creating a desperate situation which they can resolve.
This can include unlawful acts, such as arson.
The phenomenon has been noted to affect civil servants,
such as firefighters, nurses, police officers, and security guards.*
(Wikipedia page on [hero syndrome](https://en.wikipedia.org/wiki/Hero_syndrome))
---
## Agenda
.small[
- 09:00-09:10 Hello!
- 09:10-10:30 Part 1
- 10:30-11:00 coffee break
- 11:00-12:30 Part 2
- 12:30-13:30 lunch break
- 13:30-15:00 Part 3
- 15:00-15:30 coffee break
- 15:30-17:00 Part 4
- 17:00-18:00 Afterhours and Q&A
]
<!--
- The tutorial will run from 9:00am to 12:20pm
- This will be fast-paced, but DON'T PANIC!
- There will be a coffee break at 10:30am
<br/>
(please remind me if I forget about it!)
-->
- All the content is publicly available (slides, code samples, scripts)
Upstream URL: https://github.com/jpetazzo/orchestration-workshop
- Feel free to interrupt for questions at any time
- Live feedback, questions, help on [Gitter](chat)
http://container.training/chat
- intro.md
- |
@@TOC@@
- - prereqs.md
- versions.md
- |
class: title
All right!
<br/>
We're all set.
<br/>
Let's do this.
- sampleapp.md
- swarmkit.md
- creatingswarm.md
- morenodes.md
- - firstservice.md
- ourapponswarm.md
- updatingservices.md
- healthchecks.md
- - operatingswarm.md
- netshoot.md
- ipsec.md
- swarmtools.md
- security.md
- secrets.md
- encryptionatrest.md
- leastprivilege.md
- apiscope.md
- - logging.md
- metrics.md
- stateful.md
- extratips.md
- end.md
- |
class: title
That's all folks! <br/> Questions?
.small[.small[
Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@docker](https://twitter.com/docker)
]]
<!--
Tiffany ([@tiffanyfayj](https://twitter.com/tiffanyfayj))
AJ ([@s0ulshake](https://twitter.com/s0ulshake))
-->

154
docs/encryptionatrest.md Normal file

@@ -0,0 +1,154 @@
## Encryption at rest
- Swarm data is always encrypted
- A Swarm cluster can be "locked"
- When a cluster is "locked", the encryption key is protected with a passphrase
- Starting or restarting a locked manager requires the passphrase
- This protects against:
- theft (stealing a physical machine, a disk, a backup tape...)
- unauthorized access (to e.g. a remote or virtual volume)
- some vulnerabilities (like path traversal)
---
## Locking a Swarm cluster
- This is achieved through the `docker swarm update` command
.exercise[
- Lock our cluster:
```bash
docker swarm update --autolock=true
```
]
This will display the unlock key. Copy-paste it somewhere safe.
---
## Locked state
- If we restart a manager, it will now be locked
.exercise[
- Restart the local Engine:
```bash
sudo systemctl restart docker
```
]
Note: if you are doing the workshop on your own, using nodes
that you [provisioned yourself](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-machine) or with [Play-With-Docker](http://play-with-docker.com/), you might have to use a different method to restart the Engine.
---
## Checking that our node is locked
- Manager commands (requiring access to encrypted data) will fail
- Other commands are OK
.exercise[
- Try a few basic commands:
```bash
docker ps
docker run alpine echo ♥
docker node ls
```
]
(The last command should fail, and it will tell you how to unlock this node.)
---
## Checking the state of the node programmatically
- The state of the node shows up in the output of `docker info`
.exercise[
- Check the output of `docker info`:
```bash
docker info
```
- Can't see it? Too verbose? Grep to the rescue!
```bash
docker info | grep ^Swarm
```
]
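For scripts, grepping works, but a format template is a bit more robust. The sketch below assumes that the `.Swarm.LocalNodeState` field is available in your Docker version; `swarm_state` and `is_locked` are hypothetical helper names.

```bash
#!/bin/sh
# swarm_state: print the Swarm state of the local node
# (values such as "active", "locked" or "inactive").
swarm_state() {
    docker info --format '{{.Swarm.LocalNodeState}}'
}

# is_locked: succeed only if the local node is waiting for its unlock key.
is_locked() {
    [ "$(swarm_state)" = "locked" ]
}

# Example (not executed here):
#   if is_locked; then echo "This node needs: docker swarm unlock"; fi
```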
---
## Unlocking a node
- You will need the unlock key that we obtained when enabling auto-lock earlier
.exercise[
- Unlock the node:
```bash
docker swarm unlock
```
- Copy-paste the unlock key that we got earlier
- Check that manager commands now work correctly:
```bash
docker node ls
```
]
---
## Managing the secret key
- If the key is compromised, you can change it and re-encrypt with a new key:
```bash
docker swarm unlock-key --rotate
```
- If you lose the key, you can retrieve it as long as you have at least one unlocked node:
```bash
docker swarm unlock-key -q
```
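Since a rotated key is easy to lose track of, rotation can be wrapped in a small script that keeps both the new and the previous key. This is a sketch only: `rotate_unlock_key` is a hypothetical helper, and a plain file is used for simplicity where a password manager or a vault would be a better home for the key.

```bash
#!/bin/sh
# rotate_unlock_key KEYFILE
# Hypothetical helper: rotate the unlock key and store the new one in
# KEYFILE, keeping the previous key (if any) in KEYFILE.previous.
rotate_unlock_key() {
    keyfile=$1
    # -q prints only the key, without the explanatory text
    new_key=$(docker swarm unlock-key --rotate -q) || return 1
    if [ -f "$keyfile" ]; then
        mv "$keyfile" "$keyfile.previous"
    fi
    # Create the file so that only its owner can read it
    ( umask 077; printf '%s\n' "$new_key" > "$keyfile" )
}

# Example (not executed here): rotate_unlock_key "$HOME/swarm-unlock.key"
```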
Note: if you rotate the key while some nodes are locked, without saving the previous key, those nodes won't be able to rejoin.
Note: if somebody steals both your disks and your key, .strike[you're doomed! Doooooomed!]
<br/>you can block the compromised node with `docker node demote` and `docker node rm`.
---
## Unlocking the cluster permanently
- If you want to remove the secret key, disable auto-lock
.exercise[
- Permanently unlock the cluster:
```bash
docker swarm update --autolock=false
```
]
Note: if some nodes are in a locked state at that moment (or if they are offline/restarting
while you disable autolock), they still need the previous unlock key to get back online.
For more information about locking, you can check the [upcoming documentation](https://github.com/docker/docker.github.io/pull/694).

38
docs/end.md Normal file

@@ -0,0 +1,38 @@
class: title, extra-details
# What's next?
## (What to expect in future versions of this workshop)
---
class: extra-details
## Implemented and stable, but out of scope
- [Docker Content Trust](https://docs.docker.com/engine/security/trust/content_trust/) and
[Notary](https://github.com/docker/notary) (image signature and verification)
- Image security scanning (many products available, Docker Inc. and 3rd party)
- [Docker Cloud](https://cloud.docker.com/) and
[Docker Datacenter](https://www.docker.com/products/docker-datacenter)
(commercial offering with node management, secure registry, CI/CD pipelines, all the bells and whistles)
- Network and storage plugins
---
class: extra-details
## Work in progress
- Demo at least one volume plugin
<br/>(bonus points if it's a distributed storage system)
- ..................................... (your favorite feature here)
Reminder: there is a tag for each iteration of the content
in the GitHub repository.
It makes it easy to come back later and check what has changed since you did it!

BIN
docs/extra-details.png Normal file

Binary file not shown.

246
docs/extratips.md Normal file

@@ -0,0 +1,246 @@
# Controlling Docker from a container
- In a local environment, just bind-mount the Docker control socket:
```bash
docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker
```
- Otherwise, you have to:
- set `DOCKER_HOST`,
- set `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` (if you use TLS),
- copy certificates to the container that will need API access.
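The remote case can be sketched as follows. This is only an illustration: the helper name `remote_docker` is hypothetical, port 2376 is merely the conventional port for a TLS-protected Docker API, and the certificates are bind-mounted read-only instead of being copied into the container. The certificate directory is expected to hold `ca.pem`, `cert.pem`, and `key.pem`.

```bash
# Hypothetical helper: run a Docker CLI container against a remote, TLS-protected API.
# Usage: remote_docker HOST CERT_DIR COMMAND...
remote_docker() {
    host="$1"
    cert_dir="$2"
    shift 2
    # The three variables tell the CLI inside the container where the API is
    # and where to find the TLS material.
    docker run --rm \
        -e DOCKER_HOST="tcp://$host:2376" \
        -e DOCKER_TLS_VERIFY=1 \
        -e DOCKER_CERT_PATH=/certs \
        -v "$cert_dir:/certs:ro" \
        docker "$@"
}
```

For instance, `remote_docker node1.example.com "$HOME/.docker" docker version` would show the version of the remote Engine (the host name is a placeholder).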
More resources on this topic:
- [Do not use Docker-in-Docker for CI](
http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/)
- [One container to rule them all](
http://jpetazzo.github.io/2016/04/03/one-container-to-rule-them-all/)
---
## Bind-mounting the Docker control socket
- In Swarm mode, bind-mounting the control socket gives you access to the whole cluster
- You can tell Docker to place a given service on a manager node, using constraints:
```bash
docker service create \
--mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \
--name autoscaler --constraint node.role==manager ...
```
---
## Constraints and global services
(New in Docker Engine 1.13)
- By default, global services run on *all* nodes
```bash
docker service create --mode global ...
```
- You can specify constraints for global services
- These services will run only on the nodes satisfying the constraints
- For instance, this service will run on all manager nodes:
```bash
docker service create --mode global --constraint node.role==manager ...
```
---
## Constraints and dynamic scheduling
(New in Docker Engine 1.13)
- If constraints change, services are started/stopped accordingly
(e.g., `--constraint node.role==manager` and nodes are promoted/demoted)
- This is particularly useful with labels:
```bash
docker node update node1 --label-add defcon=five
docker service create --constraint node.labels.defcon==five ...
docker node update node2 --label-add defcon=five
docker node update node1 --label-rm defcon
```
---
## Shortcomings of dynamic scheduling
.warning[If a service becomes "unschedulable" (constraints can't be satisfied):]
- It won't be scheduled automatically when constraints are satisfiable again
- You will have to update the service; you can do a no-op update with:
```bash
docker service update ... --force
```
.warning[Docker will silently ignore attempts to remove a non-existent label or constraint]
- It won't warn you if you typo when removing a label or constraint!
---
# Node management
- SwarmKit allows us to change (almost?) everything on-the-fly
- Nothing should require a global restart
---
## Node availability
```bash
docker node update <node-name> --availability <active|pause|drain>
```
- Active = schedule tasks on this node (default)
- Pause = don't schedule new tasks on this node; existing tasks are not affected
You can use it to troubleshoot a node without disrupting existing tasks
It can also be used (in conjunction with labels) to reserve resources
- Drain = don't schedule new tasks on this node; existing tasks are moved away
This is just like crashing the node, but containers get a chance to shut down cleanly
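A minimal sketch of a maintenance workflow built on these availability modes. The helper name `with_node_drained` is hypothetical; it only relies on the `docker node update --availability` command shown above.

```bash
# Hypothetical helper: drain a node, run a maintenance command, reactivate the node.
# Usage: with_node_drained NODE COMMAND...
with_node_drained() {
    node="$1"
    shift
    docker node update "$node" --availability drain || return 1
    # Run the maintenance command and remember its exit status.
    "$@"
    status=$?
    # Make the node schedulable again, whatever the outcome.
    docker node update "$node" --availability active
    return "$status"
}
```

Note that the `drain` command returns before the tasks have actually moved away, so a real script would probably wait (e.g. by watching `docker node ps`) before starting the maintenance. Also, switching back to `active` does not bring the previous tasks back; it only allows new tasks to be scheduled there.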
---
## Managers and workers
- Nodes can be promoted to manager with `docker node promote`
- Nodes can be demoted to worker with `docker node demote`
- This can also be done with `docker node update <node> --role <manager|worker>`
- Reminder: this has to be done from a manager node
<br/>(workers cannot promote themselves)
---
## Removing nodes
- You can leave Swarm mode with `docker swarm leave`
- Nodes are drained before being removed (i.e. all tasks are rescheduled somewhere else)
- Managers cannot leave (they have to be demoted first)
- After leaving, a node still shows up in `docker node ls` (in `Down` state)
- When a node is `Down`, you can remove it with `docker node rm` (from a manager node)
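The last step can be automated. Here is a minimal sketch, assuming a Docker version whose `docker node ls` supports `--format`; the function name `remove_down_nodes` is hypothetical.

```bash
# Hypothetical helper: remove every node that shows up as "Down".
# Must be run on a manager node.
remove_down_nodes() {
    # Print one "ID STATUS" pair per line, then filter on the status.
    docker node ls --format '{{.ID}} {{.Status}}' |
    while read -r id status; do
        if [ "$status" = "Down" ]; then
            docker node rm "$id"
        fi
    done
}
```

A manager that is `Down` still has to be demoted before it can be removed, so this sketch is mostly useful for workers.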
---
## Join tokens and automation
- If you have used Docker 1.12-RC: join tokens are now mandatory!
- You cannot specify your own token (SwarmKit generates it)
- If you need to change the token: `docker swarm join-token --rotate ...`
- To automate cluster deployment:
- have a seed node do `docker swarm init` if it's not already in Swarm mode
- propagate the token to the other nodes (secure bucket, facter, ohai...)
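The seed node logic could look like the sketch below. The function name `seed_swarm` is hypothetical, and how the token gets propagated afterwards (secure bucket, facter, ohai...) is deliberately left out.

```bash
# Hypothetical helper for a seed node: enable Swarm mode if needed,
# then print the worker join token on stdout.
seed_swarm() {
    state="$(docker info --format '{{.Swarm.LocalNodeState}}')" || return 1
    if [ "$state" = "inactive" ]; then
        # On nodes with multiple network interfaces, you will probably
        # have to add --advertise-addr here.
        docker swarm init > /dev/null || return 1
    fi
    docker swarm join-token -q worker
}
```

The other nodes can then join with `docker swarm join --token <token> <seed-address>:2377` (2377 being the default cluster management port).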
---
## Disk space management: `docker system df`
- Shows disk usage for images, containers, and volumes
- Breaks down between *active* and *reclaimable* categories
.exercise[
- Check how much disk space is used at the end of the workshop:
```bash
docker system df
```
]
Note: `docker system` is new in Docker Engine 1.13.
---
## Reclaiming unused resources: `docker system prune`
- Removes stopped containers
- Removes dangling images (that don't have a tag associated anymore)
- Removes orphaned volumes
- Removes empty networks
.exercise[
- Try it:
```bash
docker system prune -f
```
]
Note: `docker system prune -a` will also remove *unused* images.
---
## Events
- You can get a real-time stream of events with `docker events`
- This will report *local events* and *cluster events*
- Local events =
<br/>
all activity related to containers, images, plugins, volumes, networks, *on this node*
- Cluster events =
<br/>Swarm Mode activity related to services, nodes, secrets, configs, *on the whole cluster*
- `docker events` doesn't report *local events happening on other nodes*
- Events can be filtered (by type, target, labels...)
- Events can be formatted with Go's `text/template` or in JSON
---
## Getting *all the events*
- There is no built-in way to get a stream of *all the events* on *all the nodes*
- This can be achieved with (for instance) the four following services working together:
- a Redis container (used as a stateless, fan-in message queue)
- a global service bind-mounting the Docker socket, pushing local events to the queue
- a similar singleton service to push global events to the queue
- a queue consumer fetching events and processing them as you please
I'm not saying that you should implement it with Shell scripts, but you totally could.
.small[
(It might or might not be one of the initiating rites of the
[House of Bash](https://twitter.com/carmatrocity/status/676559402787282944))
]
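To give an idea of what the Shell version could look like, here is a sketch of the producer side (the global service pushing local events). Everything about it is an assumption: the function name `push_local_events`, the default list name `docker-events`, the availability of `redis-cli` in the container, and a Docker version whose `docker events` supports the `scope` filter.

```bash
# Hypothetical producer: push local Docker events to a Redis list.
# Usage: push_local_events REDIS_HOST [LIST_NAME]
push_local_events() {
    redis_host="$1"
    list="${2:-docker-events}"
    # One JSON document per line; scope=local keeps cluster events out.
    docker events --filter scope=local --format '{{json .}}' |
    while read -r event; do
        # Append the event at the end of the list.
        redis-cli -h "$redis_host" RPUSH "$list" "$event" > /dev/null
    done
}
```

A consumer could then fetch events one at a time with `redis-cli -h <redis-host> BLPOP docker-events 0`.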
For more information about event filters and types, check [the documentation](https://docs.docker.com/engine/reference/commandline/events/).

472
docs/firstservice.md Normal file

@@ -0,0 +1,472 @@
# Running our first Swarm service
- How do we run services? Simplified version:
`docker run` → `docker service create`
.exercise[
- Create a service featuring an Alpine container pinging Google resolvers:
```bash
docker service create alpine ping 8.8.8.8
```
- Check the result:
```bash
docker service ps <serviceID>
```
]
---
## `--detach` for service creation
(New in Docker Engine 17.05)
If you are running Docker 17.05 to 17.09, you will see the following message:
```
Since --detach=false was not specified, tasks will be created in the background.
In a future release, --detach=false will become the default.
```
You can ignore that for now; but we'll come back to it in just a few minutes!
---
## Checking service logs
(New in Docker Engine 17.05)
- Just like `docker logs` shows the output of a specific local container ...
- ... `docker service logs` shows the output of all the containers of a specific service
.exercise[
- Check the output of our ping command:
```bash
docker service logs <serviceID>
```
]
Flags `--follow` and `--tail` are available, as well as a few others.
Note: by default, when a container is destroyed (e.g. when scaling down), its logs are lost.
---
class: extra-details
## Before Docker Engine 17.05
- Docker 1.13/17.03/17.04 have `docker service logs` as an experimental feature
<br/>(available only when enabling the experimental feature flag)
- We have to use `docker logs`, which only works on local containers
- We will have to connect to the node running our container
<br/>(unless it was scheduled locally, of course)
---
class: extra-details
## Looking up where our container is running
- The `docker service ps` command told us where our container was scheduled
.exercise[
- Look up the `NODE` on which the container is running:
```bash
docker service ps <serviceID>
```
- If you use Play-With-Docker, switch to that node's tab, or set `DOCKER_HOST`
- Otherwise, `ssh` into that node or use `eval $(docker-machine env node...)`
]
---
class: extra-details
## Viewing the logs of the container
.exercise[
- See that the container is running and check its ID:
```bash
docker ps
```
- View its logs:
```bash
docker logs <containerID>
```
- Go back to `node1` afterwards
]
---
## Scale our service
- Services can be scaled in a pinch with the `docker service update` command
.exercise[
- Scale the service to ensure 2 copies per node:
```bash
docker service update <serviceID> --replicas 10 --detach=true
```
- Check that we have two containers on the current node:
```bash
docker ps
```
]
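The exercise above hardcodes 10 replicas, which gives 2 copies per node on a 5-node cluster. As a sketch, the number can be computed from the actual node count; the helper name `scale_per_node` is hypothetical.

```bash
# Hypothetical helper: scale a service to N replicas per node (on average).
# Usage: scale_per_node SERVICE COPIES_PER_NODE
scale_per_node() {
    service="$1"
    per_node="$2"
    # Count the nodes of the cluster (docker node ls -q prints one ID per line).
    nodes="$(docker node ls -q | wc -l)"
    docker service update "$service" \
        --replicas "$(( nodes * per_node ))" --detach=true
}
```

The scheduler tries to spread tasks evenly, but this is not a guarantee; if you need exactly one task on each node, a global service is a better fit.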
---
## View deployment progress
(New in Docker Engine 17.05)
- Commands that create/update/delete services can run with `--detach=false`
- The CLI will show the status of the command, and exit once it's done working
.exercise[
- Scale the service to ensure 3 copies per node:
```bash
docker service update <serviceID> --replicas 15 --detach=false
```
]
Note: with Docker Engine 17.10 and later, `--detach=false` is the default.
With versions older than 17.05, you can use e.g.: `watch docker service ps <serviceID>`
---
## Expose a service
- Services can be exposed, with two special properties:
- the public port is available on *every node of the Swarm*,
- requests coming on the public port are load balanced across all instances.
- This is achieved with option `-p/--publish`; as an approximation:
`docker run -p → docker service create -p`
- If you indicate a single port number, it will be mapped on a port
starting at 30000
<br/>(vs. 32768 for single container mapping)
- You can indicate two port numbers to set the public port number
<br/>(just like with `docker run -p`)
---
## Expose ElasticSearch on its default port
.exercise[
- Create an ElasticSearch service (and give it a name while we're at it):
```bash
docker service create --name search --publish 9200:9200 --replicas 7 \
--detach=false elasticsearch`:2`
```
]
Note: don't forget the **:2**!
The latest version of the ElasticSearch image won't start without mandatory configuration.
---
## Tasks lifecycle
- During the deployment, you will be able to see multiple states:
- assigned (the task has been assigned to a specific node)
- preparing (this mostly means "pulling the image")
- starting
- running
- When a task is terminated (stopped, killed...) it cannot be restarted
(A replacement task will be created)
---
class: extra-details
![diagram showing what happens during docker service create, courtesy of @aluzzardi](docker-service-create.svg)
---
## Test our service
- We mapped port 9200 on the nodes, to port 9200 in the containers
- Let's try to reach that port!
.exercise[
- Try the following command:
```bash
curl localhost:9200
```
]
(If you get `Connection refused`: congratulations, you are very fast indeed! Just try again.)
ElasticSearch serves a little JSON document with some basic information
about this instance; including a randomly-generated super-hero name.
---
## Test the load balancing
- If we repeat our `curl` command multiple times, we will see different names
.exercise[
- Send 10 requests, and see which instances serve them:
```bash
for N in $(seq 1 10); do
curl -s localhost:9200 | jq .name
done
```
]
Note: if you don't have `jq` on your Play-With-Docker instance, just install it:
```bash
apk add --no-cache jq
```
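To see how the requests are spread, the loop above can be extended to count the answers per instance. This is only a sketch; the function name `count_instances` is hypothetical, and it assumes that `curl` and `jq` are installed.

```bash
# Hypothetical helper: send N requests and count the answers per instance name.
# Usage: count_instances URL [N]
count_instances() {
    url="$1"
    n="${2:-10}"
    i=0
    while [ "$i" -lt "$n" ]; do
        # Print the name of the instance that served this request.
        curl -s "$url" | jq -r .name
        i=$(( i + 1 ))
    done | sort | uniq -c
}
```

For example, `count_instances localhost:9200 70` prints one line per instance name, prefixed with the number of requests that it served.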
---
## Load balancing results
Traffic is handled by our cluster's [TCP routing mesh](
https://docs.docker.com/engine/swarm/ingress/).
Each request is served by one of the 7 instances, in rotation.
Note: if you try to access the service from your browser,
you will probably see the same
instance name over and over, because your browser (unlike curl) will try
to re-use the same connection.
---
## Under the hood of the TCP routing mesh
- Load balancing is done by IPVS
- IPVS is a high-performance, in-kernel load balancer
- It's been around for a long time (merged in the kernel since 2.4)
- Each node runs a local load balancer
(Allowing connections to be routed directly to the destination,
without extra hops)
---
## Managing inbound traffic
There are many ways to deal with inbound traffic on a Swarm cluster.
- Put all (or a subset) of your nodes in a DNS `A` record
- Assign your nodes (or a subset) to an ELB
- Use a virtual IP and make sure that it is assigned to an "alive" node
- etc.
---
class: btw-labels
## Managing HTTP traffic
- The TCP routing mesh doesn't parse HTTP headers
- If you want to place multiple HTTP services on port 80, you need something more
- You can set up NGINX or HAProxy on port 80 to do the virtual host switching
- Docker Universal Control Plane provides its own [HTTP routing mesh](
https://docs.docker.com/datacenter/ucp/2.1/guides/admin/configure/use-domain-names-to-access-services/)
- add a specific label starting with `com.docker.ucp.mesh.http` to your services
- labels are detected automatically and dynamically update the configuration
---
class: btw-labels
## You should use labels
- Labels are a great way to attach arbitrary information to services
- Examples:
- HTTP vhost of a web app or web service
- backup schedule for a stateful service
- owner of a service (for billing, paging...)
- etc.
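As a minimal sketch of the "owner" example: assuming that services were created with a label such as `--label owner=billing` (both the label name and its value are made up for this illustration), they can be found again with a filter. The function name `services_owned_by` is hypothetical.

```bash
# Hypothetical helper: list the names of the services owned by a given team.
# Assumes that services carry a label like owner=billing.
services_owned_by() {
    docker service ls --filter "label=owner=$1" --format '{{.Name}}'
}
```

The same label could then be used by a billing or paging script, without having to keep a separate inventory of services.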
---
## Pro-tip for ingress traffic management
- It is possible to use *local* networks with Swarm services
- This means that you can do something like this:
```bash
docker service create --network host --mode global traefik ...
```
(This runs the `traefik` load balancer on each node of your cluster, in the `host` network)
- This gives you native performance (no iptables, no proxy, no nothing!)
- The load balancer will "see" the clients' IP addresses
- But: a container cannot simultaneously be in the `host` network and another network
(You will have to route traffic to containers using exposed ports or UNIX sockets)
---
class: extra-details
## Using local networks (`host`, `macvlan` ...) with Swarm services
- Using the `host` network is fairly straightforward
(With the caveats described on the previous slide)
- It is also possible to use drivers like `macvlan`
- see [this guide](
https://docs.docker.com/engine/userguide/networking/get-started-macvlan/
) to get started on `macvlan`
- see [this PR](https://github.com/moby/moby/pull/32981) for more information about local network drivers in Swarm mode
---
## Visualize container placement
- Let's leverage the Docker API!
.exercise[
- Get the source code of this simple-yet-beautiful visualization app:
```bash
cd ~
git clone https://github.com/dockersamples/docker-swarm-visualizer
```
- Build and run the Swarm visualizer:
```bash
cd docker-swarm-visualizer
docker-compose up -d
```
]
---
## Connect to the visualization webapp
- It runs a web server on port 8080
.exercise[
- Point your browser to port 8080 of your node1's public IP address
(If you use Play-With-Docker, click on the (8080) badge)
]
- The webapp updates the display automatically (you don't need to reload the page)
- It only shows Swarm services (not standalone containers)
- It shows when nodes go down
- It has some glitches (it's not Carrier-Grade Enterprise-Compliant ISO-9001 software)
---
## Why This Is More Important Than You Think
- The visualizer accesses the Docker API *from within a container*
- This is a common pattern: run container management tools *in containers*
- Instead of viewing your cluster, this could take care of logging, metrics, autoscaling ...
- We can run it within a service, too! We won't do it, but the command would look like:
```bash
docker service create \
--mount source=/var/run/docker.sock,type=bind,target=/var/run/docker.sock \
--name viz --constraint node.role==manager ...
```
Credits: the visualization code was written by
[Francisco Miranda](https://github.com/maroshii).
<br/>
[Mano Marks](https://twitter.com/manomarks) adapted
it to Swarm and maintains it.
---
## Terminate our services
- Before moving on, we will remove those services
- `docker service rm` can accept multiple service names or IDs
- `docker service ls` can accept the `-q` flag
- A Shell snippet a day keeps the cruft away
.exercise[
- Remove all services with this one-liner:
```bash
docker service ls -q | xargs docker service rm
```
]

BIN
docs/grafana-add-graph.png Normal file

Binary file not shown.

BIN
docs/grafana-add-source.png Normal file

Binary file not shown.

211
docs/healthchecks.md Normal file

@@ -0,0 +1,211 @@
name: healthchecks
# Health checks
(New in Docker Engine 1.12)
- Commands that are executed at regular intervals in a container
- Must return 0 or 1 to indicate "all is good" or "something's wrong"
- Must execute quickly (timeouts = failures)
- Example:
```bash
curl -f http://localhost/_ping || false
```
- the `-f` flag ensures that `curl` returns non-zero for 404 and similar errors
- `|| false` ensures that any non-zero exit status gets mapped to 1
- `curl` must be installed in the container that is being checked
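The exit status mapping can be illustrated with a tiny wrapper. It is only a sketch (the function name `healthcheck` is hypothetical), but it shows what `|| false` achieves: `curl -f` exits with status 22 on HTTP errors, and the health check still reports 1.

```bash
# Hypothetical wrapper: run any check command, and map its exit status
# to 0 (all is good) or 1 (something's wrong).
healthcheck() {
    "$@" > /dev/null 2>&1 || return 1
}
```

For instance, `healthcheck curl -f http://localhost/_ping` returns 1 whether `curl` fails with status 7 (connection refused) or 22 (HTTP error).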
---
## Defining health checks
- In a Dockerfile, with the [HEALTHCHECK](https://docs.docker.com/engine/reference/builder/#healthcheck) instruction
```
HEALTHCHECK --interval=1s --timeout=3s CMD curl -f http://localhost/ || false
```
- From the command line, when running containers or services
```
docker run --health-cmd "curl -f http://localhost/ || false" ...
docker service create --health-cmd "curl -f http://localhost/ || false" ...
```
- In Compose files, with a per-service [healthcheck](https://docs.docker.com/compose/compose-file/#healthcheck) section
```yaml
www:
  image: hellowebapp
  healthcheck:
    test: "curl -f https://localhost/ || false"
    timeout: 3s
```
---
## Using health checks
- With `docker run`, health checks are purely informative
- `docker ps` shows health status
- `docker inspect` has extra details (including health check command output)
- With `docker service`:
- unhealthy tasks are terminated (i.e. the service is restarted)
- failed deployments can be rolled back automatically
<br/>(by setting *at least* the flag `--update-failure-action rollback`)
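With `docker run`, since health checks are purely informative, it is up to us to act on the status. A minimal sketch, assuming a container that has a health check defined; the function name `wait_until_healthy` and its default of 30 attempts are hypothetical.

```bash
# Hypothetical helper: wait until a container reports "healthy".
# Usage: wait_until_healthy CONTAINER [MAX_ATTEMPTS]
# Returns 0 when healthy, 1 after too many attempts, 2 if inspect fails
# (e.g. when the container doesn't exist or has no health check).
wait_until_healthy() {
    container="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        status="$(docker inspect --format '{{.State.Health.Status}}' "$container")" || return 2
        if [ "$status" = "healthy" ]; then
            return 0
        fi
        tries=$(( tries - 1 ))
        sleep 1
    done
    return 1
}
```

The status is one of `starting`, `healthy`, or `unhealthy`; this sketch keeps polling on both `starting` and `unhealthy`.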
---
## Automated rollbacks
Here is a comprehensive example using the CLI:
```bash
docker service update \
--update-delay 5s \
--update-failure-action rollback \
--update-max-failure-ratio .25 \
--update-monitor 5s \
--update-parallelism 1 \
--rollback-delay 5s \
--rollback-failure-action pause \
--rollback-max-failure-ratio .5 \
--rollback-monitor 5s \
--rollback-parallelism 0 \
--health-cmd "curl -f http://localhost/ || exit 1" \
--health-interval 2s \
--health-retries 1 \
--image yourimage:newversion \
yourservice
```
---
## Implementing auto-rollback in practice
We will use the following Compose file (`stacks/dockercoins+healthcheck.yml`):
```yaml
...
  hasher:
    build: dockercoins/hasher
    image: ${REGISTRY-127.0.0.1:5000}/hasher:${TAG-latest}
    deploy:
      replicas: 7
      update_config:
        delay: 5s
        failure_action: rollback
        max_failure_ratio: .5
        monitor: 5s
        parallelism: 1
...
```
---
## Enabling auto-rollback
.exercise[
- Go to the `stacks` directory:
```bash
cd ~/orchestration-workshop/stacks
```
- Deploy the updated stack:
```bash
docker stack deploy dockercoins --compose-file dockercoins+healthcheck.yml
```
]
This will also scale the `hasher` service to 7 instances.
---
## Visualizing a rolling update
First, let's make an "innocent" change and deploy it.
.exercise[
- Update the `sleep` delay in the code:
```bash
sed -i "s/sleep 0.1/sleep 0.2/" dockercoins/hasher/hasher.rb
```
- Build, ship, and run the new image:
```bash
export TAG=v0.5
docker-compose -f dockercoins+healthcheck.yml build
docker-compose -f dockercoins+healthcheck.yml push
docker service update dockercoins_hasher \
--detach=false --image=127.0.0.1:5000/hasher:$TAG
```
]
---
## Visualizing an automated rollback
And now, a breaking change that will cause the health check to fail:
.exercise[
- Change the HTTP listening port:
```bash
sed -i "s/80/81/" dockercoins/hasher/hasher.rb
```
- Build, ship, and run the new image:
```bash
export TAG=v0.6
docker-compose -f dockercoins+healthcheck.yml build
docker-compose -f dockercoins+healthcheck.yml push
docker service update dockercoins_hasher \
--detach=false --image=127.0.0.1:5000/hasher:$TAG
```
]
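If you want to check the outcome from a script rather than by watching the CLI output, here is a sketch. It assumes that `docker service inspect` exposes an `UpdateStatus.State` field with values such as `completed`, `paused`, or `rollback_completed`; the function name `was_rolled_back` is hypothetical.

```bash
# Hypothetical helper: tell whether the last update of a service was rolled back.
# Returns 0 if a rollback happened, 1 if not, 2 if the state can't be read
# (e.g. when the service has never been updated).
was_rolled_back() {
    state="$(docker service inspect --format '{{.UpdateStatus.State}}' "$1")" || return 2
    case "$state" in
        rollback_*) return 0 ;;
        *)          return 1 ;;
    esac
}
```

After the breaking change above, `was_rolled_back dockercoins_hasher` should succeed once the rollback has kicked in.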
---
## Command-line options available for health checks, rollbacks, etc.
Batteries included, but swappable
.small[
```
--health-cmd string Command to run to check health
--health-interval duration Time between running the check (ms|s|m|h)
--health-retries int Consecutive failures needed to report unhealthy
--health-start-period duration Start period for the container to initialize before counting retries towards unstable (ms|s|m|h)
--health-timeout duration Maximum time to allow one check to run (ms|s|m|h)
--no-healthcheck Disable any container-specified HEALTHCHECK
--restart-condition string Restart when condition is met ("none"|"on-failure"|"any")
--restart-delay duration Delay between restart attempts (ns|us|ms|s|m|h)
--restart-max-attempts uint Maximum number of restarts before giving up
--restart-window duration Window used to evaluate the restart policy (ns|us|ms|s|m|h)
--rollback Rollback to previous specification
--rollback-delay duration Delay between task rollbacks (ns|us|ms|s|m|h)
--rollback-failure-action string Action on rollback failure ("pause"|"continue")
--rollback-max-failure-ratio float Failure rate to tolerate during a rollback
--rollback-monitor duration Duration after each task rollback to monitor for failure (ns|us|ms|s|m|h)
--rollback-order string Rollback order ("start-first"|"stop-first")
--rollback-parallelism uint Maximum number of tasks rolled back simultaneously (0 to roll back all at once)
--update-delay duration Delay between updates (ns|us|ms|s|m|h)
--update-failure-action string Action on update failure ("pause"|"continue"|"rollback")
--update-max-failure-ratio float Failure rate to tolerate during an update
--update-monitor duration Duration after each task update to monitor for failure (ns|us|ms|s|m|h)
--update-order string Update order ("start-first"|"stop-first")
--update-parallelism uint Maximum number of tasks updated simultaneously (0 to update all at once)
```
]
Yup ... That's a lot of batteries!

41
docs/intro.md Normal file

@@ -0,0 +1,41 @@
## A brief introduction
- This was initially written to support in-person,
instructor-led workshops and tutorials
- You can also follow along on your own, at your own pace
- We included as much information as possible in these slides
- We recommend having a mentor to help you ...
- ... Or be comfortable spending some time reading the Docker
[documentation](https://docs.docker.com/) ...
- ... And looking for answers in the [Docker forums](https://forums.docker.com),
[StackOverflow](http://stackoverflow.com/questions/tagged/docker),
and other outlets
---
class: self-paced
## Hands on, you shall practice
- Nobody ever became a Jedi by spending their lives reading Wookieepedia
- Likewise, it will take more than merely *reading* these slides
to make you an expert
- These slides include *tons* of exercises
- They assume that you have access to a cluster of Docker nodes
- If you are attending a workshop or tutorial:
<br/>you will be given specific instructions to access your cluster
- If you are doing this on your own:
<br/>you can use
[Play-With-Docker](http://www.play-with-docker.com/) and
read [these instructions](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker) for extra
details

140
docs/ipsec.md Normal file

@@ -0,0 +1,140 @@
# Securing overlay networks
- By default, overlay networks are using plain VXLAN encapsulation
(~Ethernet over UDP, using SwarmKit's control plane for ARP resolution)
- Encryption can be enabled on a per-network basis
(It will use IPSEC encryption provided by the kernel, leveraging hardware acceleration)
- This is only for the `overlay` driver
(Other drivers/plugins will use different mechanisms)
---
## Creating two networks: encrypted and not
- Let's create two networks for testing purposes
.exercise[
- Create an "insecure" network:
```bash
docker network create insecure --driver overlay --attachable
```
- Create a "secure" network:
```bash
docker network create secure --opt encrypted --driver overlay --attachable
```
]
.warning[Make sure that you don't typo that option; errors are silently ignored!]
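Since a typo would go unnoticed, it can be worth checking that the option was actually recorded. A minimal sketch (the function name `network_is_encrypted` is hypothetical): it looks for the `encrypted` key in the options reported by `docker network inspect`.

```bash
# Hypothetical helper: succeed only if the network has the "encrypted" option.
# Usage: network_is_encrypted NETWORK
network_is_encrypted() {
    # The options are printed as a JSON object; a misspelled option
    # would show up under its misspelled name and fail this test.
    docker network inspect --format '{{json .Options}}' "$1" |
        grep -q '"encrypted"'
}
```

With the two networks above, `network_is_encrypted secure` should succeed, and `network_is_encrypted insecure` should fail.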
---
## Deploying a web server sitting on both networks
- Let's use good old NGINX
- We will attach it to both networks
- We will use a placement constraint to make sure that it is on a different node
.exercise[
- Create a web server running somewhere else:
```bash
docker service create --name web \
--network secure --network insecure \
--constraint node.hostname!=node1 \
nginx
```
]
---
## Sniff HTTP traffic
- We will use `ngrep`, which lets us grep network traffic
- We will run it in a container, using host networking to access the host's interfaces
.exercise[
- Sniff network traffic and display all packets containing "HTTP":
```bash
docker run --net host nicolaka/netshoot ngrep -tpd eth0 HTTP
```
]
--
Seeing tons of HTTP requests? Shut down your DockerCoins workers:
```bash
docker service update dockercoins_worker --replicas=0
```
---
## Check that we are, indeed, sniffing traffic
- Let's see if we can intercept our traffic with Google!
.exercise[
- Open a new terminal
- Issue an HTTP request to Google (or anything you like):
```bash
curl google.com
```
]
The ngrep container will display one `#` per packet traversing the network interface.
When you do the `curl`, you should see the HTTP request in clear text in the output.
---
class: extra-details
## If you are using Play-With-Docker, Vagrant, etc.
- You will probably have *two* network interfaces
- One interface will be used for outbound traffic (to Google)
- The other one will be used for internode traffic
- You might have to adapt/relaunch the `ngrep` command to specify the right one!
---
## Try to sniff traffic across overlay networks
- We will run `curl web` through both secure and insecure networks
.exercise[
- Access the web server through the insecure network:
```bash
docker run --rm --net insecure nicolaka/netshoot curl web
```
- Now do the same through the secure network:
```bash
docker run --rm --net secure nicolaka/netshoot curl web
```
]
When you run the first command, you will see HTTP fragments.
<br/>
However, when you run the second one, only `#` will show up.

BIN
docs/k8s-arch1.png Normal file

Binary file not shown.

BIN
docs/k8s-arch2.png Normal file

Binary file not shown.

89
docs/kube.yml Normal file

@@ -0,0 +1,89 @@
exclude:
- self-paced
- snap
chat: "FIXME"
chapters:
- |
  class: title
  .small[
  Deploying and scaling microservices <br/> with Docker and Kubernetes
  .small[.small[
  **Be kind to the WiFi!**
  *Use the 5G network*
  <br/>
  *Don't use your hotspot*
  <br/>
  *Don't stream videos from YouTube, Netflix, etc.
  <br/>(if you're bored, watch local content instead)*
  Thank you!
  ]
  ]
  ]
  ---
  ## Intros
  - Hello! I am
  Jérôme ([@jpetazzo](https://twitter.com/jpetazzo))
  --
  - This is my first time doing this
  ---
  ## Logistics
  - The tutorial will run from 9:00am to 12:15pm
  - There will be a coffee break at 10:30am
  <br/>
  (please remind me if I forget about it!)
  - All the content is publicly available (slides, code samples, scripts)
  Upstream URL: https://github.com/jpetazzo/orchestration-workshop
  - Feel free to interrupt for questions at any time
  - Live feedback, questions, help on [Gitter](chat)
  http://container.training/chat
- intro.md
- |
  @@TOC@@
- - prereqs-k8s.md
  - versions-k8s.md
  - sampleapp.md
- - concepts-k8s.md
  - kubectlget.md
  - setup-k8s.md
  - kubectlrun.md
- - kubectlexpose.md
  - ourapponkube.md
  - dashboard.md
- - kubectlscale.md
  - daemonset.md
  - rollout.md
  - whatsnext.md
- |
  class: title
  That's all folks! <br/> Questions?
  .small[.small[
  Jérôme ([@jpetazzo](https://twitter.com/jpetazzo)) — [@docker](https://twitter.com/docker)
  ]]

138
docs/kubectlexpose.md Normal file

@@ -0,0 +1,138 @@
# Exposing containers
- `kubectl expose` creates a *service* for existing pods
- A *service* is a stable address for a pod (or a bunch of pods)
- If we want to connect to our pod(s), we need to create a *service*
- Once a service is created, `kube-dns` will allow us to resolve it by name
(i.e. after creating service `hello`, the name `hello` will resolve to something)
- There are different types of services, detailed on the following slides:
`ClusterIP`, `NodePort`, `LoadBalancer`, `ExternalName`
---
## Basic service types
- `ClusterIP` (default type)
- a virtual IP address is allocated for the service (in an internal, private range)
- this IP address is reachable only from within the cluster (nodes and pods)
- our code can connect to the service using the original port number
- `NodePort`
- a port is allocated for the service (by default, in the 30000-32767 range)
- that port is made available *on all our nodes* and anybody can connect to it
- our code must be changed to connect to that new port number
These service types are always available.
Under the hood: `kube-proxy` is using a userland proxy and a bunch of `iptables` rules.
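Since the port of a `NodePort` service is allocated for us, we need a way to find out which one we got. A minimal sketch, assuming a service created with something like `kubectl expose ... --type NodePort`; the function name `node_port_of` is hypothetical.

```bash
# Hypothetical helper: print the node port allocated to a NodePort service.
# Usage: node_port_of SERVICE
node_port_of() {
    # Only the first port of the service is considered here.
    kubectl get svc "$1" -o jsonpath='{.spec.ports[0].nodePort}'
}
```

The result can then be used to build a URL pointing to any node of the cluster, e.g. `curl http://<node-address>:$(node_port_of <service>)/`.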
---
## More service types
- `LoadBalancer`
- an external load balancer is allocated for the service
- the load balancer is configured accordingly
<br/>(e.g.: a `NodePort` service is created, and the load balancer sends traffic to that port)
- `ExternalName`
- the DNS entry managed by `kube-dns` will just be a `CNAME` to a provided record
- no port, no IP address, no nothing else is allocated
The `LoadBalancer` type is currently only available on AWS, Azure, and GCE.
---
## Running containers with open ports
- Since `ping` doesn't have anything to connect to, we'll have to run something else
.exercise[
- Start a bunch of ElasticSearch containers:
```bash
kubectl run elastic --image=elasticsearch:2 --replicas=7
```
- Watch them being started:
```bash
kubectl get pods -w
```
]
The `-w` option "watches" events happening on the specified resources.
Note: please DO NOT call the service `search`. It would collide with the TLD.
---
## Exposing our deployment
- We'll create a default `ClusterIP` service
.exercise[
- Expose the ElasticSearch HTTP API port:
```bash
kubectl expose deploy/elastic --port 9200
```
- Look up which IP address was allocated:
```bash
kubectl get svc
```
]
---
## Services are layer 4 constructs
- You can assign IP addresses to services, but they are still *layer 4*
(i.e. a service is not an IP address; it's an IP address + protocol + port)
- This is caused by the current implementation of `kube-proxy`
(it relies on mechanisms that don't support layer 3)
- As a result: you *have to* indicate the port number for your service
- Running services with arbitrary ports (or port ranges) requires hacks
(e.g. host networking mode)
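The "IP address + protocol + port" idea can be modeled in a few lines of Python. This is a deliberately simplified sketch, not how `kube-proxy` is implemented; `declared_ports` and `is_forwarded` are hypothetical names.

```python
# Simplified model: a service only forwards the (protocol, port) pairs
# that were declared when it was created.
declared_ports = {("TCP", 9200)}


def is_forwarded(protocol, port, declared=declared_ports):
    """Return True if traffic for this protocol and port reaches the pods."""
    return (protocol, port) in declared


if __name__ == "__main__":
    print(is_forwarded("TCP", 9200))  # declared port
    print(is_forwarded("TCP", 9300))  # same IP address, undeclared port
    print(is_forwarded("UDP", 9200))  # same port number, other protocol
```

In this model, knowing the service IP address alone is not enough: a connection only works for a pair that was declared.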
---
## Testing our service
- We will now send a few HTTP requests to our ElasticSearch pods
.exercise[
- Let's obtain the IP address that was allocated for our service, *programmatically:*
```bash
IP=$(kubectl get svc elastic -o go-template --template '{{ .spec.clusterIP }}')
```
- Send a few requests:
```bash
curl http://$IP:9200/
```
]
--
Our requests are load balanced across multiple pods.
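One way to observe the load balancing is to count which node answers each request. The Python sketch below assumes that the ElasticSearch root document carries a `name` field (as ElasticSearch 2.x does); the helper names and the IP address are made up for illustration.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_node_name(url):
    """Fetch the ElasticSearch root document and return its "name" field."""
    with urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))["name"]


def tally_responders(url, count, fetch=fetch_node_name):
    """Send `count` requests and count how many times each node answered."""
    return Counter(fetch(url) for _ in range(count))


if __name__ == "__main__":
    # Stand-in for a real cluster: a fake fetcher cycling over made-up names.
    names = iter(["alpha", "beta", "alpha", "gamma"])
    print(tally_responders("http://10.96.0.42:9200/", 4, fetch=lambda url: next(names)))
```

Against a real cluster, we would call `tally_responders` with the service URL and leave `fetch` at its default; several distinct names in the result indicate that several pods answered.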

234
docs/kubectlget.md Normal file

@@ -0,0 +1,234 @@
# First contact with `kubectl`
- `kubectl` is (almost) the only tool we'll need to talk to Kubernetes
- It is a rich CLI tool around the Kubernetes API
(Everything you can do with `kubectl`, you can do directly with the API)
- On our machines, there is a `~/.kube/config` file with:
- the Kubernetes API address
- the path to our TLS certificates used to authenticate
- You can also use the `--kubeconfig` flag to pass a config file
- Or directly `--server`, `--user`, etc.
- `kubectl` can be pronounced "Cube C T L", "Cube cuttle", "Cube cuddle"...
---
## `kubectl get`
- Let's look at our `Node` resources with `kubectl get`!
.exercise[
- Look at the composition of our cluster:
```bash
kubectl get node
```
- These commands are equivalent:
```bash
kubectl get no
kubectl get node
kubectl get nodes
```
]
---
## From human-readable to machine-readable output
- `kubectl get` can output JSON, YAML, or be directly formatted
.exercise[
- Give us more info about them nodes:
```bash
kubectl get nodes -o wide
```
- Let's have some YAML:
```bash
kubectl get no -o yaml
```
See that `kind: List` at the end? It's the type of our result!
]
---
## (Ab)using `kubectl` and `jq`
- It's super easy to build custom reports
.exercise[
- Show the capacity of all our nodes as a stream of JSON objects:
```bash
kubectl get nodes -o json |
jq ".items[] | {name:.metadata.name} + .status.capacity"
```
]
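For comparison, the same report can be built without `jq`. This Python sketch mirrors the `jq` expression above (the node name merged with the capacity fields); the function name `node_capacities` and the sample values are made up.

```python
import json


def node_capacities(node_list):
    """Yield one dict per node: its name merged with its capacity fields."""
    for item in node_list["items"]:
        report = {"name": item["metadata"]["name"]}
        # Like jq's `+` on objects, keys from the capacity win on conflict.
        report.update(item["status"]["capacity"])
        yield report


if __name__ == "__main__":
    # Made-up sample in the shape of `kubectl get nodes -o json` output.
    sample = {
        "items": [
            {
                "metadata": {"name": "node1"},
                "status": {"capacity": {"cpu": "2", "memory": "3859324Ki", "pods": "110"}},
            }
        ]
    }
    for report in node_capacities(sample):
        print(json.dumps(report))
```

With a real cluster, the input would come from `json.load(sys.stdin)` after piping `kubectl get nodes -o json` into the script.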
---
## What's available?
- `kubectl` has pretty good introspection facilities
- We can list all available resource types by running `kubectl get`
- We can view details about a resource with:
```bash
kubectl describe type/name
kubectl describe type name
```
- We can view the definition for a resource type with:
```bash
kubectl explain type
```
Each time, `type` can be singular, plural, or abbreviated type name.
---
## Services
- A *service* is a stable endpoint to connect to "something"
(In the initial proposal, they were called "portals")
.exercise[
- List the services on our cluster with one of these commands:
```bash
kubectl get services
kubectl get svc
```
]
--
There is already one service on our cluster: the Kubernetes API itself.
---
## ClusterIP services
- A `ClusterIP` service is internal, available from the cluster only
- This is useful for introspection from within containers
.exercise[
- Try to connect to the API:
```bash
curl -k https://`10.96.0.1`
```
- `-k` is used to skip certificate verification
- Make sure to replace 10.96.0.1 with the CLUSTER-IP shown earlier
]
--
The error that we see is expected: the Kubernetes API requires authentication.
---
## Listing running containers
- Containers are manipulated through *pods*
- A pod is a group of containers:
- running together (on the same node)
- sharing resources (RAM, CPU; but also network, volumes)
.exercise[
- List pods on our cluster:
```bash
kubectl get pods
```
]
--
*These are not the pods you're looking for.* But where are they?!?
---
## Namespaces
- Namespaces allow us to segregate resources
.exercise[
- List the namespaces on our cluster with one of these commands:
```bash
kubectl get namespaces
kubectl get namespace
kubectl get ns
```
]
--
*You know what ... This `kube-system` thing looks suspicious.*
---
## Accessing namespaces
- By default, `kubectl` uses the `default` namespace
- We can switch to a different namespace with the `-n` option
.exercise[
- List the pods in the `kube-system` namespace:
```bash
kubectl -n kube-system get pods
```
]
--
*Ding ding ding ding ding!*
---
## What are all these pods?
- `etcd` is our etcd server
- `kube-apiserver` is the API server
- `kube-controller-manager` and `kube-scheduler` are other master components
- `kube-dns` is an additional component (not mandatory but super useful, so it's there)
- `kube-proxy` is the (per-node) component managing port mappings and such
- `weave` is the (per-node) component managing the network overlay
- the `READY` column indicates the number of containers in each pod
- the pods with a name ending with `-ip-172-31-XX-YY` are the master components
<br/>
(they have been specifically "pinned" to the master node)

197
docs/kubectlrun.md Normal file

@@ -0,0 +1,197 @@
# Running our first containers on Kubernetes
- First things first: we cannot run a container
--
- We are going to run a pod, and in that pod there will be a single container
--
- In that container in the pod, we are going to run a simple `ping` command
- Then we are going to start additional copies of the pod
---
## Starting a simple pod with `kubectl run`
- We need to specify at least a *name* and the image we want to use
.exercise[
- Let's ping `goo.gl`:
```bash
kubectl run pingpong --image alpine ping goo.gl
```
]
--
OK, what just happened?
---
## Behind the scenes of `kubectl run`
- Let's look at the resources that were created by `kubectl run`
.exercise[
- List most resource types:
```bash
kubectl get all
```
]
--
We should see the following things:
- `deploy/pingpong` (the *deployment* that we just created)
- `rs/pingpong-xxxx` (a *replica set* created by the deployment)
- `po/pingpong-yyyy` (a *pod* created by the replica set)
---
## Deployments, replica sets, and replication controllers
- A *deployment* is a high-level construct
- allows scaling, rolling updates, rollbacks
- multiple deployments can be used together to implement a
[canary deployment](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#canary-deployments)
- delegates pod management to *replica sets*
- A *replica set* is a low-level construct
- makes sure that a given number of identical pods are running
- allows scaling
- rarely used directly
- A *replication controller* is the (deprecated) predecessor of a replica set
---
## Our `pingpong` deployment
- `kubectl run` created a *deployment*, `deploy/pingpong`
- That deployment created a *replica set*, `rs/pingpong-xxxx`
- That replica set created a *pod*, `po/pingpong-yyyy`
- We'll see later how these folks play together for:
- scaling
- high availability
- rolling updates
---
## Viewing container output
- Let's use the `kubectl logs` command
- We will pass either a *pod name*, or a *type/name*
(E.g. if we specify a deployment or replica set, it will get the first pod in it)
- Unless specified otherwise, it will only show logs of the first container in the pod
(Good thing there's only one in ours!)
.exercise[
- View the result of our `ping` command:
```bash
kubectl logs deploy/pingpong
```
]
---
## Streaming logs in real time
- Just like `docker logs`, `kubectl logs` supports convenient options:
- `-f`/`--follow` to stream logs in real time (à la `tail -f`)
- `--tail` to indicate how many lines you want to see (from the end)
- `--since` to get only the logs newer than a given duration (e.g. `10m`); use `--since-time` for a timestamp
.exercise[
- View the latest logs of our `ping` command:
```bash
kubectl logs deploy/pingpong --tail 1 --follow
```
]
---
## Scaling our application
- We can create additional copies of our container (I mean, our pod) with `kubectl scale`
.exercise[
- Scale our `pingpong` deployment:
```bash
kubectl scale deploy/pingpong --replicas 8
```
]
Note: what if we tried to scale `rs/pingpong-xxxx`?
We could! But the *deployment* would notice it right away, and scale back to the initial level.
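The "scale back" behavior follows from the way controllers compare a desired state with the observed one. Here is a very simplified Python model of that comparison; it is a sketch of the idea, not actual Kubernetes controller code, and `reconcile` is a hypothetical name.

```python
def reconcile(desired_replicas, running_pods):
    """Return how many pods to start (positive) or stop (negative)."""
    return desired_replicas - running_pods


if __name__ == "__main__":
    # The deployment wants 8 pods; someone manually scaled the replica set to 3.
    print(reconcile(8, 3))
    # Someone manually scaled the replica set up to 10 instead.
    print(reconcile(8, 10))
```

As long as the deployment still says 8, any manual change to the replica set is treated as drift and corrected.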
---
## Viewing logs of multiple pods
- When we specify a deployment name, only one single pod's logs are shown
- We can view the logs of multiple pods by specifying a *selector*
- A selector is a logic expression using *labels*
- Conveniently, when you `kubectl run somename`, the associated objects have a `run=somename` label
.exercise[
- View the last line of log from all pods with the `run=pingpong` label:
```bash
kubectl logs -l run=pingpong --tail 1
```
]
Unfortunately, `--follow` cannot (yet) be used to stream the logs from multiple containers.
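To illustrate what a selector like `run=pingpong` does, here is a small Python sketch of equality-based label matching. Real selectors support more operators (such as `!=` or `in`), which this sketch ignores; the function names and sample labels are made up.

```python
def parse_selector(text):
    """Parse "k1=v1,k2=v2" into a dict (equality-based requirements only)."""
    return dict(pair.split("=", 1) for pair in text.split(",") if pair)


def matches(labels, selector):
    """Return True if every key=value pair of `selector` is present in `labels`."""
    return all(labels.get(key) == value for key, value in selector.items())


if __name__ == "__main__":
    # Made-up pods and labels, for illustration only.
    pods = {
        "pingpong-yyyy": {"run": "pingpong", "pod-template-hash": "1234"},
        "elastic-zzzz": {"run": "elastic", "pod-template-hash": "5678"},
    }
    selector = parse_selector("run=pingpong")
    print([name for name, labels in pods.items() if matches(labels, selector)])
```

A pod can carry extra labels and still match: the selector only checks the keys that it mentions.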
---
class: title
.small[
Meanwhile, at the Google NOC ...
.small[
Why the hell
<br/>
are we getting 1000 packets per second
<br/>
of ICMP ECHO traffic from EC2 ?!?
]
]

22
docs/kubectlscale.md Normal file

@@ -0,0 +1,22 @@
# Scaling a deployment
- We will start with an easy one: the `worker` deployment
.exercise[
- Open two new terminals to check what's going on with pods and deployments:
```bash
kubectl get pods -w
kubectl get deployments -w
```
- Now, create more `worker` replicas:
```bash
kubectl scale deploy/worker --replicas=10
```
]
After a few seconds, the graph in the web UI should show up.
<br/>
(And peak at 10 hashes/second, just like when we were running on a single one.)

57
docs/leastprivilege.md Normal file

@@ -0,0 +1,57 @@
# Least privilege model
- All the important data is stored in the "Raft log"
- Manager nodes have read/write access to this data
- Worker nodes have no access to this data
- Workers only receive the minimum amount of data that they need:
- which services to run
- network configuration information for these services
- credentials for these services
- Compromising a worker node does not give access to the full cluster
---
## What can I do if I compromise a worker node?
- I can enter the containers running on that node
- I can access the configuration and credentials used by these containers
- I can inspect the network traffic of these containers
- I cannot inspect or disrupt the network traffic of other containers
(network information is provided by manager nodes; ARP spoofing is not possible)
- I cannot infer the topology of the cluster and its number of nodes
- I can only learn the IP addresses of the manager nodes
---
## Guidelines for workload isolation leveraging least privilege model
- Define security levels
- Define security zones
- Put managers in the highest security zone
- Enforce workloads of a given security level to run in a given zone
- Enforcement can be done with [Authorization Plugins](https://docs.docker.com/engine/extend/plugins_authorization/)
---
## Learning more about container security
.blackbelt[[Securing Containers, One Patch At A Time](https://www.youtube.com/watch?v=jZSs1RHwcqo&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8&index=4) by Michael Crosby (DC17US)]
.blackbelt[Container-relevant Upstream Kernel Developments by Tycho Andersen (Tuesday 14:55)]
.blackbelt[What Have Syscalls Done for you Lately? by Liz Rice (Tuesday 11:45)]

BIN
docs/lifecycle.png Normal file

Binary file not shown.

Size: 41 KiB

431
docs/logging.md Normal file

@@ -0,0 +1,431 @@
name: logging
# Centralized logging
- We want to send all our container logs to a central place
- If that place could offer a nice web dashboard too, that'd be nice
--
- We are going to deploy an ELK stack
- It will accept logs over a GELF socket
- We will update our services to send logs through the GELF logging driver
---
# Setting up ELK to store container logs
*Important foreword: this is not an "official" or "recommended"
setup; it is just an example. We used ELK in this demo because
it's a popular setup and we keep being asked about it; but you
will have equal success with Fluent or other logging stacks!*
What we will do:
- Spin up an ELK stack with services
- Gaze at the spiffy Kibana web UI
- Manually send a few log entries using one-shot containers
- Set our containers up to send their logs to Logstash
---
## What's in an ELK stack?
- ELK is three components:
- ElasticSearch (to store and index log entries)
- Logstash (to receive log entries from various
sources, process them, and forward them to various
destinations)
- Kibana (to view/search log entries with a nice UI)
- The only component that we will configure is Logstash
- We will accept log entries using the GELF protocol
- Log entries will be stored in ElasticSearch,
<br/>and displayed on Logstash's stdout for debugging
---
class: elk-manual
## Setting up ELK
- We need three containers: ElasticSearch, Logstash, Kibana
- We will place them on a common network, `logging`
.exercise[
- Create the network:
```bash
docker network create --driver overlay logging
```
- Create the ElasticSearch service:
```bash
docker service create --network logging --name elasticsearch elasticsearch:2.4
```
]
---
class: elk-manual
## Setting up Kibana
- Kibana exposes the web UI
- Its default port (5601) needs to be published
- It needs a tiny bit of configuration: the address of the ElasticSearch service
- We don't want Kibana logs to show up in Kibana (it would create clutter)
<br/>so we tell Logspout to ignore them
.exercise[
- Create the Kibana service:
```bash
docker service create --network logging --name kibana --publish 5601:5601 \
-e ELASTICSEARCH_URL=http://elasticsearch:9200 kibana:4.6
```
]
---
class: elk-manual
## Setting up Logstash
- Logstash needs some configuration to listen to GELF messages and send them to ElasticSearch
- We could author a custom image bundling this configuration
- We can also pass the [configuration](https://github.com/jpetazzo/orchestration-workshop/blob/master/elk/logstash.conf) on the command line
.exercise[
- Create the Logstash service:
```bash
docker service create --network logging --name logstash -p 12201:12201/udp \
logstash:2.4 -e "$(cat ~/orchestration-workshop/elk/logstash.conf)"
```
]
---
class: elk-manual
## Checking Logstash
- Before proceeding, let's make sure that Logstash started properly
.exercise[
- Look up the node running the Logstash container:
```bash
docker service ps logstash
```
- Connect to that node
]
---
class: elk-manual
## View Logstash logs
.exercise[
- Get the ID of the Logstash container:
```bash
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=logstash)
```
- View the logs:
```bash
docker logs --follow $CID
```
]
You should see the heartbeat messages:
.small[
```json
{ "message" => "ok",
"host" => "1a4cfb063d13",
"@version" => "1",
"@timestamp" => "2016-06-19T00:45:45.273Z"
}
```
]
---
class: elk-auto
## Deploying our ELK cluster
- We will use a stack file
.exercise[
- Build, ship, and run our ELK stack:
```bash
docker-compose -f elk.yml build
docker-compose -f elk.yml push
docker stack deploy elk -c elk.yml
```
]
Note: the *build* and *push* steps are not strictly necessary, but they don't hurt!
Let's have a look at the [Compose file](
https://github.com/jpetazzo/orchestration-workshop/blob/master/stacks/elk.yml).
---
class: elk-auto
## Checking that our ELK stack works correctly
- Let's view the logs of logstash
(Who logs the loggers?)
.exercise[
- Stream logstash's logs:
```bash
docker service logs --follow --tail 1 elk_logstash
```
]
You should see the heartbeat messages:
.small[
```json
{ "message" => "ok",
"host" => "1a4cfb063d13",
"@version" => "1",
"@timestamp" => "2016-06-19T00:45:45.273Z"
}
```
]
---
## Testing the GELF receiver
- In a new window, we will generate a logging message
- We will use a one-off container, and Docker's GELF logging driver
.exercise[
- Send a test message:
```bash
docker run --log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \
--rm alpine echo hello
```
]
The test message should show up in the logstash container logs.
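For the curious, a GELF message is a small JSON document sent in a UDP datagram, usually compressed. The Python sketch below builds such a datagram with the three fields required by GELF 1.1 (`version`, `host`, `short_message`) and zlib compression. The function names and the `test-host` value are made up, and it assumes that the receiver accepts zlib-compressed payloads.

```python
import json
import socket
import zlib


def gelf_datagram(short_message, host):
    """Build a zlib-compressed GELF 1.1 payload for a single UDP datagram."""
    payload = {"version": "1.1", "host": host, "short_message": short_message}
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def send_gelf(short_message, host="test-host", address=("127.0.0.1", 12201)):
    """Send one GELF message over UDP (fire and forget, no acknowledgement)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(gelf_datagram(short_message, host), address)
    finally:
        sock.close()


if __name__ == "__main__":
    # Show what would go on the wire, once decompressed.
    datagram = gelf_datagram("hello", "test-host")
    print(json.loads(zlib.decompress(datagram).decode("utf-8")))
    # Calling send_gelf("hello") would emit it to a listener on 127.0.0.1:12201.
```

Since UDP gives no delivery guarantee, a sender cannot tell whether the message arrived; checking the Logstash output is the only confirmation.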
---
## Sending logs from a service
- We were sending from a "classic" container so far; let's send logs from a service instead
- We're lucky: the parameters (`--log-driver` and `--log-opt`) are exactly the same!
.exercise[
- Send a test message:
```bash
docker service create \
--log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201 \
alpine echo hello
```
]
The test message should show up as well in the logstash container logs.
--
In fact, *multiple messages will show up, and continue to show up every few seconds!*
---
## Restart conditions
- By default, if a container exits (or is killed with `docker kill`, or runs out of memory ...),
the Swarm will restart it (possibly on a different machine)
- This behavior can be changed by setting the *restart condition* parameter
.exercise[
- Change the restart condition so that Swarm doesn't try to restart our container forever:
```bash
docker service update `xxx` --restart-condition none
```
]
Available restart conditions are `none`, `any`, and `on-failure`.
You can also set `--restart-delay`, `--restart-max-attempts`, and `--restart-window`.
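The effect of each restart condition can be summarized with a simplified Python model. This sketch ignores the delay, max attempts, and window settings; `should_restart` is a hypothetical name, and the condition names are the ones accepted by the Docker CLI (`none`, `on-failure`, `any`).

```python
def should_restart(condition, exit_code):
    """Simplified model of the Swarm restart condition for an exited task."""
    if condition == "none":
        return False
    if condition == "any":
        return True
    if condition == "on-failure":
        # Only a non-zero exit status counts as a failure.
        return exit_code != 0
    raise ValueError("unknown restart condition: {!r}".format(condition))


if __name__ == "__main__":
    # `echo hello` exits with status 0 right after printing its message.
    for condition in ("none", "on-failure", "any"):
        print(condition, should_restart(condition, 0))
```

This matches what we observed: with the default condition (`any`), a task that runs `echo hello` and exits successfully is started again and again.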
---
## Connect to Kibana
- The Kibana web UI is exposed on cluster port 5601
.exercise[
- Connect to port 5601 of your cluster
- if you're using Play-With-Docker, click on the (5601) badge above the terminal
- otherwise, open http://(any-node-address):5601/ with your browser
]
---
## "Configuring" Kibana
- If you see a status page with a yellow item, wait a minute and reload
(Kibana is probably still initializing)
- Kibana should prompt you to "Configure an index pattern":
<br/>in the "Time-field name" drop down, select "@timestamp", and hit the
"Create" button
- Then:
- click "Discover" (in the top-left corner)
- click "Last 15 minutes" (in the top-right corner)
- click "Last 1 hour" (in the list in the middle)
- click "Auto-refresh" (top-right corner)
- click "5 seconds" (top-left of the list)
- You should see a series of green bars (with one new green bar every minute)
---
## Updating our services to use GELF
- We will now tell our Swarm to add GELF logging to all our services
- This is done with the `docker service update` command
- The logging flags are the same as before
.exercise[
<!--
- Enable GELF logging for all our *stateless* services:
```bash
for SERVICE in hasher rng webui worker; do
docker service update dockercoins_$SERVICE \
--log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201
done
```
-->
- Enable GELF logging for the `rng` service:
```bash
docker service update dockercoins_rng \
--log-driver gelf --log-opt gelf-address=udp://127.0.0.1:12201
```
]
After ~15 seconds, you should see the log messages in Kibana.
---
## Viewing container logs
- Go back to Kibana
- Container logs should be showing up!
- We can customize the web UI to be more readable
.exercise[
- In the left column, move the mouse over the following
columns, and click the "Add" button that appears:
- host
- container_name
- message
<!--
- logsource
- program
- message
-->
]
---
## .warning[Don't update stateful services!]
- What would have happened if we had updated the Redis service?
- When a service changes, SwarmKit replaces existing containers with new ones
- This is fine for stateless services
- But if you update a stateful service, its data will be lost in the process
- If we updated our Redis service, all our DockerCoins would be lost
---
## Important afterword
**This is not a "production-grade" setup.**
It is just an educational example. We did set up a single
ElasticSearch instance and a single Logstash instance.
In a production setup, you need an ElasticSearch cluster
(both for capacity and availability reasons). You also
need multiple Logstash instances.
And if you want to withstand
bursts of logs, you need some kind of message queue:
Redis if you're cheap, Kafka if you want to make sure
that you don't drop messages on the floor. Good luck.
If you want to learn more about the GELF driver,
have a look at [this blog post](
http://jpetazzo.github.io/2017/01/20/docker-logging-gelf/).

5
docs/loop.sh Executable file

@@ -0,0 +1,5 @@
#!/bin/sh
while true; do
find . |
entr -d . sh -c "DEBUG=1 ./markmaker.py < kube.yml > workshop.md"
done

225
docs/machine.md Normal file

@@ -0,0 +1,225 @@
## Adding nodes using the Docker API
- We don't have to SSH into the other nodes, we can use the Docker API
- If you are using Play-With-Docker:
- the nodes expose the Docker API over port 2375/tcp, without authentication
- we will connect by setting the `DOCKER_HOST` environment variable
- Otherwise:
- the nodes expose the Docker API over port 2376/tcp, with TLS mutual authentication
- we will use Docker Machine to set the correct environment variables
<br/>(the nodes have been suitably pre-configured to be controlled through `node1`)
---
# Docker Machine
- Docker Machine has two primary uses:
- provisioning cloud instances running the Docker Engine
- managing local Docker VMs within e.g. VirtualBox
- Docker Machine is purely optional
- It makes it easy to create, upgrade, manage... Docker hosts:
- on your favorite cloud provider
- locally (e.g. to test clustering, or different versions)
- across different cloud providers
---
class: self-paced
## If you're using Play-With-Docker ...
- You won't need to use Docker Machine
- Instead, to "talk" to another node, we'll just set `DOCKER_HOST`
- You can skip the exercises telling you to do things with Docker Machine!
---
## Docker Machine basic usage
- We will learn two commands:
- `docker-machine ls` (list existing hosts)
- `docker-machine env` (switch to a specific host)
.exercise[
- List configured hosts:
```bash
docker-machine ls
```
]
You should see your 5 nodes.
---
class: in-person
## How did we make our 5 nodes show up there?
*For the curious...*
- This was done by our VM provisioning scripts
- After setting up everything else, `node1` adds the 5 nodes
to the local Docker Machine configuration
(located in `$HOME/.docker/machine`)
- Nodes are added using [Docker Machine generic driver](https://docs.docker.com/machine/drivers/generic/)
(It skips machine provisioning and jumps straight to the configuration phase)
- Docker Machine creates TLS certificates and deploys them to the nodes through SSH
---
## Using Docker Machine to communicate with a node
- To select a node, use `eval $(docker-machine env nodeX)`
- This sets a number of environment variables
- To unset these variables, use `eval $(docker-machine env -u)`
.exercise[
- View the variables used by Docker Machine:
```bash
docker-machine env node3
```
]
(This shows which variables *would* be set by Docker Machine; but it doesn't change them.)
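The output of `docker-machine env` is a list of `export NAME="value"` lines, followed by a comment explaining how to apply them. As an illustration, this Python sketch parses such output into a dictionary. The function name is made up, and the address and path in the sample are placeholders.

```python
import shlex


def parse_env_exports(text):
    """Turn `export NAME="value"` lines into a dict, skipping comments."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("export "):
            continue
        name, _, value = line[len("export "):].partition("=")
        # shlex.split removes the shell quoting around the value.
        parts = shlex.split(value)
        variables[name] = parts[0] if parts else ""
    return variables


if __name__ == "__main__":
    # Sample shaped like `docker-machine env node3` output (made-up values).
    sample = """
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://10.0.0.3:2376"
export DOCKER_CERT_PATH="/home/docker/.docker/machine/machines/node3"
export DOCKER_MACHINE_NAME="node3"
# Run this command to configure your shell:
# eval $(docker-machine env node3)
"""
    for name, value in sorted(parse_env_exports(sample).items()):
        print(name, "=", value)
```

Wrapping the command in `eval $(...)` simply makes the shell execute those `export` lines.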
---
## Getting the token
- First, let's store the join token in a variable
- This must be done from a manager
.exercise[
- Make sure we talk to the local node, or `node1`:
```bash
eval $(docker-machine env -u)
```
- Get the join token:
```bash
TOKEN=$(docker swarm join-token -q worker)
```
]
---
## Change the node targeted by the Docker CLI
- We need to set the right environment variables to communicate with `node3`
.exercise[
- If you're using Play-With-Docker:
```bash
export DOCKER_HOST=tcp://node3:2375
```
- Otherwise, use Docker Machine:
```bash
eval $(docker-machine env node3)
```
]
---
## Checking which node we're talking to
- Let's use the Docker API to ask "who are you?" to the remote node
.exercise[
- Extract the node name from the output of `docker info`:
```bash
docker info | grep ^Name
```
]
This should tell us that we are talking to `node3`.
Note: it can be useful to use a [custom shell prompt](
https://github.com/jpetazzo/orchestration-workshop/blob/master/prepare-vms/scripts/postprep.rc#L68)
reflecting the `DOCKER_HOST` variable.
---
## Adding a node through the Docker API
- We are going to use the same `docker swarm join` command as before
.exercise[
- Add `node3` to the Swarm:
```bash
docker swarm join --token $TOKEN node1:2377
```
]
---
## Going back to the local node
- We need to revert the environment variable(s) that we had set previously
.exercise[
- If you're using Play-With-Docker, just clear `DOCKER_HOST`:
```bash
unset DOCKER_HOST
```
- Otherwise, use Docker Machine to reset all the relevant variables:
```bash
eval $(docker-machine env -u)
```
]
From that point, we are communicating with `node1` again.
---
## Checking the composition of our cluster
- Now that we're talking to `node1` again, we can use management commands
.exercise[
- Check that the node is here:
```bash
docker node ls
```
]

BIN
docs/mario-red-shell.png Normal file

Binary file not shown.

Size: 42 KiB

92
docs/markmaker.py Executable file

@@ -0,0 +1,92 @@
#!/usr/bin/env python
# transforms a YAML manifest into an HTML workshop file
# (this script targets Python 2: it relies on the `unicode` type)

import glob
import logging
import os
import re
import sys
import yaml


if os.environ.get("DEBUG") == "1":
    logging.basicConfig(level=logging.DEBUG)


class InvalidChapter(ValueError):

    def __init__(self, chapter):
        ValueError.__init__(self, "Invalid chapter: {!r}".format(chapter))


def generatefromyaml(manifest):
    manifest = yaml.load(manifest)

    markdown, titles = processchapter(manifest["chapters"], "<inline>")
    logging.debug(titles)
    toc = gentoc(titles)
    markdown = markdown.replace("@@TOC@@", toc)

    exclude = manifest.get("exclude", [])
    logging.debug("exclude={!r}".format(exclude))
    if not exclude:
        logging.warning("'exclude' is empty.")
    exclude = ",".join('"{}"'.format(c) for c in exclude)

    html = open("workshop.html").read()
    html = html.replace("@@MARKDOWN@@", markdown)
    html = html.replace("@@EXCLUDE@@", exclude)
    html = html.replace("@@CHAT@@", manifest["chat"])
    return html


def gentoc(titles, depth=0, chapter=0):
    if not titles:
        return ""
    if isinstance(titles, str):
        return " "*(depth-2) + "- " + titles + "\n"
    if isinstance(titles, list):
        if depth==0:
            sep = "\n\n<!-- auto-generated TOC -->\n---\n\n"
            head = ""
            tail = ""
        elif depth==1:
            sep = "\n"
            head = "## Chapter {}\n\n".format(chapter)
            tail = ""
        else:
            sep = "\n"
            head = ""
            tail = ""
        return head + sep.join(gentoc(t, depth+1, c+1) for (c,t) in enumerate(titles)) + tail


# Arguments:
# - `chapter` is a string; if it has multiple lines, it will be used as
#   a markdown fragment; otherwise it will be considered as a file name
#   to be recursively loaded and parsed
# - `filename` is the name of the file that we're currently processing
#   (to generate inline comments to facilitate editing)
# Returns: (expandedmarkdown,[list of titles])
# The list of titles can be nested.
def processchapter(chapter, filename):
    if isinstance(chapter, unicode):
        return processchapter(chapter.encode("utf-8"), filename)
    if isinstance(chapter, str):
        if "\n" in chapter:
            titles = re.findall("^# (.*)", chapter, re.MULTILINE)
            slidefooter = "<!-- {} -->".format(filename)
            chapter = chapter.replace("\n---\n", "\n{}\n---\n".format(slidefooter))
            chapter += "\n" + slidefooter
            return (chapter, titles)
        if os.path.isfile(chapter):
            return processchapter(open(chapter).read(), chapter)
    if isinstance(chapter, list):
        chapters = [processchapter(c, filename) for c in chapter]
        markdown = "\n---\n".join(c[0] for c in chapters)
        titles = [t for (m,t) in chapters if t]
        return (markdown, titles)
    raise InvalidChapter(chapter)


sys.stdout.write(generatefromyaml(sys.stdin))

1637
docs/metrics.md Normal file

File diff suppressed because it is too large Load Diff

236
docs/morenodes.md Normal file

@@ -0,0 +1,236 @@
## Adding more manager nodes
- Right now, we have only one manager (node1)
- If we lose it, we lose quorum - and that's *very bad!*
- Containers running on other nodes will be fine ...
- But we won't be able to get or set anything related to the cluster
- If the manager is permanently gone, we will have to do a manual repair!
- Nobody wants to do that ... so let's make our cluster highly available
---
class: self-paced
## Adding more managers
With Play-With-Docker:
```bash
TOKEN=$(docker swarm join-token -q manager)
for N in $(seq 4 5); do
export DOCKER_HOST=tcp://node$N:2375
docker swarm join --token $TOKEN node1:2377
done
unset DOCKER_HOST
```
---
class: in-person
## Building our full cluster
- We could SSH to nodes 3, 4, 5; and copy-paste the command
--
class: in-person
- Or we could use the AWESOME POWER OF THE SHELL!
--
class: in-person
![Mario Red Shell](mario-red-shell.png)
--
class: in-person
- No, not *that* shell
---
class: in-person
## Let's form like Swarm-tron
- Let's get the token, and loop over the remaining nodes with SSH
.exercise[
- Obtain the manager token:
```bash
TOKEN=$(docker swarm join-token -q manager)
```
- Loop over the 3 remaining nodes:
```bash
for NODE in node3 node4 node5; do
ssh $NODE docker swarm join --token $TOKEN node1:2377
done
```
]
[That was easy.](https://www.youtube.com/watch?v=3YmMNpbFjp0)
---
## You can control the Swarm from any manager node
.exercise[
- Try the following command on a few different nodes:
```bash
docker node ls
```
]
On manager nodes:
<br/>you will see the list of nodes, with a `*` denoting
the node you're talking to.
On non-manager nodes:
<br/>you will get an error message telling you that
the node is not a manager.
As we saw earlier, you can only control the Swarm through a manager node.
---
class: self-paced
## Play-With-Docker node status icon
- If you're using Play-With-Docker, you get node status icons
- Node status icons are displayed left of the node name
- No icon = no Swarm mode detected
- Solid blue icon = Swarm manager detected
- Blue outline icon = Swarm worker detected
![Play-With-Docker icons](pwd-icons.png)
---
## Dynamically changing the role of a node
- We can change the role of a node on the fly:
`docker node promote nodeX` → make nodeX a manager
<br/>
`docker node demote nodeX` → make nodeX a worker
.exercise[
- See the current list of nodes:
```
docker node ls
```
- Promote any worker node to be a manager:
```
docker node promote <node_name_or_id>
```
]
---
## How many managers do we need?
- 2N+1 nodes can (and will) tolerate N failures
<br/>(you can have an even number of managers, but there is no point)
--
- 1 manager = no failure
- 3 managers = 1 failure
- 5 managers = 2 failures (or 1 failure during 1 maintenance)
- 7 managers and more = now you might be overdoing it a little bit
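The numbers above follow from the majority rule: a cluster of managers keeps working as long as more than half of them are available. Here is that arithmetic as a short Python sketch (the function names are made up).

```python
def quorum(managers):
    """Smallest number of managers that forms a strict majority."""
    return managers // 2 + 1


def tolerated_failures(managers):
    """How many managers can be lost while a majority remains."""
    return managers - quorum(managers)


if __name__ == "__main__":
    for managers in (1, 2, 3, 4, 5, 7):
        print(managers, "managers:", "quorum", quorum(managers),
              "tolerates", tolerated_failures(managers))
```

It also shows why an even number of managers brings nothing: 4 managers tolerate 1 failure, exactly like 3.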
---
## Why not have *all* nodes be managers?
- Intuitively, it's harder to reach consensus in larger groups
- With Raft, writes have to go to all nodes (and be acknowledged by a majority of them)
- More nodes = more network traffic
- Bigger network = more latency
---
## What would MacGyver do?
- If some of your machines are more than 10ms away from each other,
<br/>
try to break them down into multiple clusters
(keeping internal latency low)
- Groups of up to 9 nodes: all of them are managers
- Groups of 10 nodes and up: pick 5 "stable" nodes to be managers
<br/>
(Cloud pro-tip: use separate auto-scaling groups for managers and workers)
- Groups of more than 100 nodes: watch your managers' CPU and RAM
- Groups of more than 1000 nodes:
- if you can afford to have fast, stable managers, add more of them
- otherwise, break down your nodes into multiple clusters
---
## What's the upper limit?
- We don't know!
- Internal testing at Docker Inc.: 1000-10000 nodes is fine
- deployed to a single cloud region
- one of the main take-aways was *"you're gonna need a bigger manager"*
- Testing by the community: [4700 heterogeneous nodes all over the 'net](https://sematext.com/blog/2016/11/14/docker-swarm-lessons-from-swarm3k/)
- it just works
- more nodes require more CPU; more containers require more RAM
- scheduling of large jobs (70000 containers) is slow, though (working on it!)
---
## Real-life deployment methods
--
Running commands manually over SSH
--
(lol jk)
--
- Using your favorite configuration management tool
- [Docker for AWS](https://docs.docker.com/docker-for-aws/#quickstart)
- [Docker for Azure](https://docs.docker.com/docker-for-azure/)

236
docs/namespaces.md Normal file

@@ -0,0 +1,236 @@
class: namespaces
name: namespaces
# Improving isolation with User Namespaces
- *Namespaces* are kernel mechanisms to compartmentalize the system
- There are different kinds of namespaces: `pid`, `net`, `mnt`, `ipc`, `uts`, and `user`
- For a primer, see "Anatomy of a Container"
([video](https://www.youtube.com/watch?v=sK5i-N34im8))
([slides](https://www.slideshare.net/jpetazzo/cgroups-namespaces-and-beyond-what-are-containers-made-from-dockercon-europe-2015))
- The *user namespace* allows us to map UIDs between the containers and the host
- As a result, `root` in a container can map to a non-privileged user on the host
Note: even without user namespaces, `root` in a container cannot go wild on the host.
<br/>
It is mediated by capabilities, cgroups, namespaces, seccomp, LSMs...
---
class: namespaces
## User Namespaces in Docker
- Optional feature added in Docker Engine 1.10
- Not enabled by default
- Has to be enabled at Engine startup, and affects all containers
- When enabled, `UID:GID` in containers are mapped to a different range on the host
- Safer than switching to a non-root user (with `-u` or `USER`) in the container
<br/>
(Since with user namespaces, root escalation maps to a non-privileged user)
- Can be selectively disabled per container by starting them with `--userns=host`
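To make the mapping concrete: the remapped range is described by a `name:start:count` entry in `/etc/subuid` (and `/etc/subgid`). Here is a minimal sketch of the arithmetic, assuming an example entry `dockremap:296608:65536` (your range will differ; `host_uid` is an illustrative helper, not a Docker command).

```bash
# Illustrative only: compute the host UID for a UID inside the container,
# given a subuid entry of the form name:start:count.
host_uid() {
  entry=$1          # e.g. dockremap:296608:65536 (assumed example values)
  container_uid=$2
  start=${entry#*:}; start=${start%%:*}   # second field: first host UID
  count=${entry##*:}                      # third field: size of the range
  if [ "$container_uid" -ge "$count" ]; then
    echo "UID $container_uid is outside the mapped range" >&2
    return 1
  fi
  echo $(( start + container_uid ))
}

host_uid dockremap:296608:65536 0      # root in the container: prints 296608
host_uid dockremap:296608:65536 1000   # an unprivileged container user: prints 297608
```

In other words, `root` in the container is simply the first UID of the range on the host, which has no special privileges there.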
---
class: namespaces
## User Namespaces Caveats
When user namespaces are enabled, containers cannot:
- Use the host's network namespace (with `docker run --network=host`)
- Use the host's PID namespace (with `docker run --pid=host`)
- Run in privileged mode (with `docker run --privileged`)
... Unless user namespaces are disabled for the container, with flag `--userns=host`
External volume and graph drivers that don't support user mapping might not work.
All containers are currently mapped to the same UID:GID range.
Some of these limitations might be lifted in the future!
---
class: namespaces
## Filesystem ownership details
When enabling user namespaces:
- the UID:GID on disk (in the images and containers) has to match the *mapped* UID:GID
- existing images and containers cannot work (their UID:GID would have to be changed)
For practical reasons, when enabling user namespaces, the Docker Engine places containers and images (and everything else) in a different directory.
As a result, if you enable user namespaces on an existing installation:
- all containers and images (and e.g. Swarm data) disappear
- *if a node is a member of a Swarm, it is then kicked out of the Swarm*
- everything will re-appear if you disable user namespaces again
---
class: namespaces
## Picking a node
- We will select a node where we will enable user namespaces
- This node will have to be re-added to the Swarm
- All containers and services running on this node will be rescheduled
- Let's make sure that we do not pick the node running the registry!
.exercise[
- Check on which node the registry is running:
```bash
docker service ps registry
```
]
Pick any other node (noted `nodeX` in the next slides).
---
class: namespaces
## Logging into the right Engine
.exercise[
- Log into the right node:
```bash
ssh node`X`
```
]
---
class: namespaces
## Configuring the Engine
.exercise[
- Create a configuration file for the Engine:
```bash
echo '{"userns-remap": "default"}' | sudo tee /etc/docker/daemon.json
```
- Restart the Engine:
```bash
sudo kill $(pidof dockerd)
```
]
---
class: namespaces
## Checking that User Namespaces are enabled
.exercise[
- Notice the new Docker path:
```bash
docker info | grep var/lib
```
- Notice the new UID:GID permissions:
```bash
sudo ls -l /var/lib/docker
```
]
You should see a line like the following:
```
drwx------ 11 296608 296608 4096 Aug 3 05:11 296608.296608
```
---
class: namespaces
## Add the node back to the Swarm
.exercise[
- Get our manager token from another node:
```bash
ssh node`Y` docker swarm join-token manager
```
- Copy-paste the join command to the node
]
---
class: namespaces
## Check the new UID:GID
.exercise[
- Run a background container on the node:
```bash
docker run -d --name lockdown alpine sleep 1000000
```
- Look at the processes in this container:
```bash
docker top lockdown
ps faux
```
]
---
class: namespaces
## Comparing on-disk ownership with/without User Namespaces
.exercise[
- Compare the output of the two following commands:
```bash
docker run alpine ls -l /
docker run --userns=host alpine ls -l /
```
]
--
class: namespaces
In the first case, it looks like things belong to `root:root`.
In the second case, we will see the "real" (on-disk) ownership.
--
class: namespaces
Remember to get back to `node1` when finished!

385
docs/netshoot.md Normal file

@@ -0,0 +1,385 @@
class: extra-details
## Troubleshooting overlay networks
<!--
## Finding the real cause of the bottleneck
- We want to debug our app as we scale `worker` up and down
-->
- We want to run tools like `ab` or `httping` on the internal network
--
class: extra-details
- Ah, if only we had created our overlay network with the `--attachable` flag ...
--
class: extra-details
- Oh well, let's use this as an excuse to introduce New Ways To Do Things
---
# Breaking into an overlay network
- We will create a dummy placeholder service on our network
- Then we will use `docker exec` to run more processes in this container
.exercise[
- Start a "do nothing" container using our favorite Swiss-Army distro:
```bash
docker service create --network dockercoins_default --name debug \
--constraint node.hostname==$HOSTNAME alpine sleep 1000000000
```
]
The `constraint` makes sure that the container will be created on the local node.
---
## Entering the debug container
- Once our container is started (which should be really fast because the alpine image is small), we can enter it (from any node)
.exercise[
- Locate the container:
```bash
docker ps
```
- Enter it:
```bash
docker exec -ti <containerID> sh
```
]
---
## Labels
- We can also be fancy and find the ID of the container automatically
- SwarmKit places labels on containers
.exercise[
- Get the ID of the container:
```bash
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=debug)
```
- And enter the container:
```bash
docker exec -ti $CID sh
```
]
---
## Installing our debugging tools
- Ideally, you would author your own image, with all your favorite tools, and use it instead of the base `alpine` image
- But we can also dynamically install whatever we need
.exercise[
- Install a few tools:
```bash
apk add --update curl apache2-utils drill
```
]
---
## Investigating the `rng` service
- First, let's check what `rng` resolves to
.exercise[
- Use drill or nslookup to resolve `rng`:
```bash
drill rng
```
]
This gives us one IP address. It is not the IP address of a container.
It is a virtual IP address (VIP) for the `rng` service.
---
## Investigating the VIP
.exercise[
- Try to ping the VIP:
```bash
ping rng
```
]
It *should* ping. (But this might change in the future.)
With Engine 1.12: VIPs respond to ping if a
backend is available on the same machine.
With Engine 1.13: VIPs respond to ping if a
backend is available anywhere.
(Again: this might change in the future.)
---
## What if I don't like VIPs?
- Services can be published using two modes: VIP and DNSRR.
- With VIP, you get a virtual IP for the service, and a load balancer
based on IPVS
(By the way, IPVS is totally awesome and if you want to learn more about it in the context of containers,
I highly recommend [this talk](https://www.youtube.com/watch?v=oFsJVV1btDU&index=5&list=PLkA60AVN3hh87OoVra6MHf2L4UR9xwJkv) by [@kobolog](https://twitter.com/kobolog) at DC15EU!)
- With DNSRR, you get the former behavior (from Engine 1.11), where
resolving the service yields the IP addresses of all the containers for
this service
- You change this with `docker service create --endpoint-mode [VIP|DNSRR]`
---
## Looking up VIP backends
- You can also resolve a special name: `tasks.<name>`
- It will give you the IP addresses of the containers for a given service
.exercise[
- Obtain the IP addresses of the containers for the `rng` service:
```bash
drill tasks.rng
```
]
This should list 5 IP addresses.
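If you would rather count the answers than eyeball them, a small filter can do it. This sketch assumes the usual zone-file style output of `drill` (comment lines start with `;`, answer lines look like `name TTL IN A address`); `count_a_records` is a made-up helper name.

```bash
# Hypothetical helper: count the A records in drill-style output read from stdin.
count_a_records() {
  # Drop comment lines (";; ..."), then count lines holding an IN A record.
  # "|| true" keeps the exit status at 0 when the count is zero.
  grep -v '^;' | grep -cE '[[:space:]]IN[[:space:]]+A[[:space:]]' || true
}
```

From inside the debug container, it would be used as `drill tasks.rng | count_a_records`.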
---
class: extra-details, benchmarking
## Testing and benchmarking our service
- We will check that the service is up with `curl`, then
benchmark it with `ab`
.exercise[
- Make a test request to the service:
```bash
curl rng
```
- Open another window, and stop the workers, to test in isolation:
```bash
docker service update dockercoins_worker --replicas 0
```
]
Wait until the workers are stopped (check with `docker service ls`)
before continuing.
---
class: extra-details, benchmarking
## Benchmarking `rng`
We will send 50 requests, but with various levels of concurrency.
.exercise[
- Send 50 requests, with a single sequential client:
```bash
ab -c 1 -n 50 http://rng/10
```
- Send 50 requests, with fifty parallel clients:
```bash
ab -c 50 -n 50 http://rng/10
```
]
---
class: extra-details, benchmarking
## Benchmark results for `rng`
- When serving requests sequentially, they each take 100ms
- In the parallel scenario, the latency increases dramatically
- What about `hasher`?
---
class: extra-details, benchmarking
## Benchmarking `hasher`
We will do the same tests for `hasher`.
The command is slightly more complex, since we need to post random data.
First, we need to put the POST payload in a temporary file.
.exercise[
- Generate 10 bytes of random data with `curl` (installed earlier), and save them to a temporary file:
```bash
curl http://rng/10 >/tmp/random
```
]
---
class: extra-details, benchmarking
## Benchmarking `hasher`
Once again, we will send 50 requests, with different levels of concurrency.
.exercise[
- Send 50 requests with a sequential client:
```bash
ab -c 1 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
```
- Send 50 requests with 50 parallel clients:
```bash
ab -c 50 -n 50 -T application/octet-stream -p /tmp/random http://hasher/
```
]
---
class: extra-details, benchmarking
## Benchmark results for `hasher`
- The sequential benchmark takes ~5 seconds to complete
- The parallel benchmark takes less than 1 second to complete
- In both cases, each request takes a bit more than 100ms to complete
- Requests are a bit slower in the parallel benchmark
- It looks like `hasher` is better equipped to deal with concurrency than `rng`
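These numbers fit a simple back-of-the-envelope model, assuming each request costs a fixed 100ms and that a server handles either one request at a time or all concurrent requests at once. The helper names are illustrative, and real timings also include network and scheduling overhead.

```bash
# Idealized model: every request takes a fixed number of milliseconds.
# A server that handles one request at a time needs n * ms in total,
# no matter how many clients are connected.
serial_total_ms() {      # usage: serial_total_ms REQUESTS MS_PER_REQUEST
  echo $(( $1 * $2 ))
}

# A server that handles up to c requests concurrently needs
# ceil(n / c) rounds of ms each.
concurrent_total_ms() {  # usage: concurrent_total_ms REQUESTS CONCURRENCY MS_PER_REQUEST
  echo $(( ( ($1 + $2 - 1) / $2 ) * $3 ))
}

serial_total_ms 50 100          # prints 5000: the ~5 seconds of the sequential run
concurrent_total_ms 50 50 100   # prints 100: why the parallel run ends in under a second
```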
---
class: extra-details, title, benchmarking
Why?
---
class: extra-details, benchmarking
## Why does everything take (at least) 100ms?
`rng` code:
![RNG code screenshot](delay-rng.png)
`hasher` code:
![HASHER code screenshot](delay-hasher.png)
---
class: extra-details, title, benchmarking
But ...
WHY?!?
---
class: extra-details, benchmarking
## Why did we sprinkle this sample app with sleeps?
- Deterministic performance
<br/>(regardless of instance speed, CPUs, I/O...)
- Actual code sleeps all the time anyway
- When your code makes a remote API call:
- it sends a request;
- it sleeps until it gets the response;
- it processes the response.
---
class: extra-details, in-person, benchmarking
## Why do `rng` and `hasher` behave differently?
![Equations on a blackboard](equations.png)
(Synchronous vs. asynchronous event processing)
---
class: extra-details
## Global scheduling → global debugging
- Traditional approach:
- log into a node
- install our Swiss Army Knife (if necessary)
- troubleshoot things
- Proposed alternative:
- put our Swiss Army Knife in a container (e.g. [nicolaka/netshoot](https://hub.docker.com/r/nicolaka/netshoot/))
- run tests from multiple locations at the same time
(This becomes very practical with the `docker service logs` command, available since 17.05.)
---
## More about overlay networks
.blackbelt[[Deep Dive in Docker Overlay Networks](https://www.youtube.com/watch?v=b3XDl0YsVsg&index=1&list=PLkA60AVN3hh-biQ6SCtBJ-WVTyBmmYho8) by Laurent Bernaille (DC17US)]
.blackbelt[Deeper Dive in Docker Overlay Networks by Laurent Bernaille (Wednesday 13:30)]

18
docs/nodeinfo.md Normal file

@@ -0,0 +1,18 @@
## Getting task information for a given node
- You can see all the tasks assigned to a node with `docker node ps`
- It shows the *desired state* and *current state* of each task
- `docker node ps` shows info about the current node
- `docker node ps <node_name_or_id>` shows info for another node
- `docker node ps -f <filter_expression>` lets you select which tasks to show
```bash
# Show only tasks that are supposed to be running
docker node ps -f desired-state=running
# Show only tasks whose name contains the string "front"
docker node ps -f name=front
```

58
docs/operatingswarm.md Normal file

@@ -0,0 +1,58 @@
class: title, in-person
Operating the Swarm
---
name: part-2
class: title, self-paced
Part 2
---
class: self-paced
## Before we start ...
The following exercises assume that you have a 5-node Swarm cluster.
If you come here from a previous tutorial and still have your cluster: great!
Otherwise: check [part 1](#part-1) to learn how to set up your own cluster.
We pick up exactly where we left you, so we assume that you have:
- a five-node Swarm cluster,
- a self-hosted registry,
- DockerCoins up and running.
The next slide has a cheat sheet if you need to set that up in a pinch.
---
class: self-paced
## Catching up
Assuming you have 5 nodes provided by
[Play-With-Docker](http://www.play-with-docker.com/), do this from `node1`:
```bash
docker swarm init --advertise-addr eth0
TOKEN=$(docker swarm join-token -q manager)
for N in $(seq 2 5); do
DOCKER_HOST=tcp://node$N:2375 docker swarm join --token $TOKEN node1:2377
done
git clone https://github.com/jpetazzo/orchestration-workshop
cd orchestration-workshop/stacks
docker stack deploy --compose-file registry.yml registry
docker-compose -f dockercoins.yml build
docker-compose -f dockercoins.yml push
docker stack deploy --compose-file dockercoins.yml dockercoins
```
You should now be able to connect to port 8000 and see the DockerCoins web UI.

348
docs/ourapponkube.md Normal file

@@ -0,0 +1,348 @@
class: title
Our app on Kube
---
## What's on the menu?
In this part, we will:
- **build** images for our app,
- **ship** these images with a registry,
- **run** deployments using these images,
- expose these deployments so they can communicate with each other,
- expose the web UI so we can access it from outside.
---
## The plan
- Build on our control node (`node1`)
- Tag images so that they are named `$REGISTRY/servicename`
- Upload them to a registry
- Create deployments using the images
- Expose (with a ClusterIP) the services that need to communicate
- Expose (with a NodePort) the WebUI
---
## Which registry do we want to use?
- We could use the Docker Hub
- Or a service offered by our cloud provider (GCR, ECR...)
- Or we could just self-host that registry
*We'll self-host the registry because it's the most generic solution for this workshop.*
---
## Using the open source registry
- We need to run a `registry:2` container
<br/>(make sure you specify tag `:2` to run the new version!)
- It will store images and layers to the local filesystem
<br/>(but you can add a config file to use S3, Swift, etc.)
- Docker *requires* TLS when communicating with the registry
- except for registries on `127.0.0.0/8` (i.e. `localhost`)
- or with the Engine flag `--insecure-registry`
- Our strategy: publish the registry container on a NodePort,
<br/>so that it's available through `127.0.0.1:xxxxx` on each node
---
# Deploying a self-hosted registry
- We will deploy a registry container, and expose it with a NodePort
.exercise[
- Create the registry service:
```bash
kubectl run registry --image=registry:2
```
- Expose it on a NodePort:
```bash
kubectl expose deploy/registry --port=5000 --type=NodePort
```
]
---
## Connecting to our registry
- We need to find out which port has been allocated
.exercise[
- View the service details:
```bash
kubectl describe svc/registry
```
- Get the port number programmatically:
```bash
NODEPORT=$(kubectl get svc/registry -o json | jq '.spec.ports[0].nodePort')
export REGISTRY=127.0.0.1:$NODEPORT
```
]
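NodePorts are allocated from a fixed range (30000-32767 by default, unless the cluster was configured otherwise), so a quick sanity check on the extracted value can catch an empty or malformed result. A sketch, with `is_nodeport` as a hypothetical helper:

```bash
# Hypothetical helper: succeed only if $1 is a number in the default NodePort range.
is_nodeport() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty or not a plain number (e.g. "null")
  esac
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

# Example with a literal value; in the exercise you would pass "$NODEPORT"
is_nodeport 31234 && echo "looks like a NodePort"
```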
---
## Testing our registry
- A convenient Docker registry API route to remember is `/v2/_catalog`
.exercise[
- View the repositories currently held in our registry:
```bash
curl $REGISTRY/v2/_catalog
```
]
--
We should see:
```json
{"repositories":[]}
```
---
## Testing our local registry
- We can retag a small image, and push it to the registry
.exercise[
- Make sure we have the busybox image, and retag it:
```bash
docker pull busybox
docker tag busybox $REGISTRY/busybox
```
- Push it:
```bash
docker push $REGISTRY/busybox
```
]
---
## Checking again what's on our local registry
- Let's use the same endpoint as before
.exercise[
- Ensure that our busybox image is now in the local registry:
```bash
curl $REGISTRY/v2/_catalog
```
]
The curl command should now output:
```json
{"repositories":["busybox"]}
```
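The same check can be scripted. This is a rough sketch that does a plain text match on the catalog, which is good enough for the flat JSON document returned here (a JSON-aware tool such as `jq` would be more robust); `catalog_has` is a made-up helper name.

```bash
# Hypothetical helper: succeed if the catalog read from stdin lists repository $1.
catalog_has() {
  grep -q "\"$1\""
}

# Example with the literal output shown above
echo '{"repositories":["busybox"]}' | catalog_has busybox && echo "busybox is in the catalog"
```

Against the live registry, it would be used as `curl -s $REGISTRY/v2/_catalog | catalog_has busybox`.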
---
## Building and pushing our images
- We are going to use a convenient feature of Docker Compose
.exercise[
- Go to the `stacks` directory:
```bash
cd ~/orchestration-workshop/stacks
```
- Build and push the images:
```bash
docker-compose -f dockercoins.yml build
docker-compose -f dockercoins.yml push
```
]
Let's have a look at the `dockercoins.yml` file while this is building and pushing.
---
```yaml
version: "3"
services:
rng:
build: dockercoins/rng
image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest}
deploy:
mode: global
...
redis:
image: redis
...
worker:
build: dockercoins/worker
image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest}
...
deploy:
replicas: 10
```
.warning[Just in case you were wondering ... Docker "services" are not Kubernetes "services".]
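The `${REGISTRY-127.0.0.1:5000}` and `${TAG-latest}` placeholders follow the shell's default-value syntax: the text after `-` is used when the variable is unset. Compose performs this substitution itself when it reads the file (using the variables present in its environment, so a variable set in your shell has to be exported), but the behavior can be previewed in a plain shell. The function name and the NodePort value below are only examples.

```bash
# Same expansion rules as in the Compose file above, evaluated by the shell.
image_for() {
  echo "${REGISTRY-127.0.0.1:5000}/$1:${TAG-latest}"
}

unset REGISTRY TAG
image_for rng              # falls back to both defaults: 127.0.0.1:5000/rng:latest

REGISTRY=127.0.0.1:31234   # hypothetical NodePort address
TAG=v1
image_for rng              # uses the values we set: 127.0.0.1:31234/rng:v1
```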
---
## Deploying all the things
- We can now deploy our code (as well as a redis instance)
.exercise[
- Deploy `redis`:
```bash
kubectl run redis --image=redis
```
- Deploy everything else:
```bash
for SERVICE in hasher rng webui worker; do
kubectl run $SERVICE --image=$REGISTRY/$SERVICE
done
```
]
---
## Is this working?
- After waiting for the deployment to complete, let's look at the logs!
(Hint: use `kubectl get deploy -w` to watch deployment events)
.exercise[
- Look at some logs:
```bash
kubectl logs deploy/rng
kubectl logs deploy/worker
```
]
--
🤔 `rng` is fine ... But not `worker`.
--
💡 Oh right! We forgot to `expose`.
---
# Exposing services internally
- Three deployments need to be reachable by others: `hasher`, `redis`, `rng`
- `worker` doesn't need to be exposed
- `webui` will be dealt with later
.exercise[
- Expose each deployment, specifying the right port:
```bash
kubectl expose deployment redis --port 6379
kubectl expose deployment rng --port 80
kubectl expose deployment hasher --port 80
```
]
---
## Is this working yet?
- The `worker` has an infinite loop, that retries 10 seconds after an error
.exercise[
- Stream the worker's logs:
```bash
kubectl logs deploy/worker --follow
```
(Give it about 10 seconds to recover)
]
--
We should now see the `worker`, well, working happily.
---
# Exposing services for external access
- Now we would like to access the Web UI
- We will expose it with a `NodePort`
(just like we did for the registry)
.exercise[
- Create a `NodePort` service for the Web UI:
```bash
kubectl expose deploy/webui --type=NodePort --port=80
```
- Check the port that was allocated:
```bash
kubectl get svc
```
]
---
## Accessing the web UI
- We can now connect to *any node*, on the allocated node port, to view the web UI
.exercise[
- Open the web UI in your browser (http://node-ip-address:3xxxx/)
]
--
*Alright, we're back to where we started, when we were running on a single node!*

976
docs/ourapponswarm.md Normal file

@@ -0,0 +1,976 @@
class: title
Our app on Swarm
---
## What's on the menu?
In this part, we will:
- **build** images for our app,
- **ship** these images with a registry,
- **run** services using these images.
---
## Why do we need to ship our images?
- When we do `docker-compose up`, images are built for our services
- These images are present only on the local node
- We need these images to be distributed on the whole Swarm
- The easiest way to achieve that is to use a Docker registry
- Once our images are on a registry, we can reference them when
creating our services
---
class: extra-details
## Build, ship, and run, for a single service
If we had only one service (built from a `Dockerfile` in the
current directory), our workflow could look like this:
```
docker build -t jpetazzo/doublerainbow:v0.1 .
docker push jpetazzo/doublerainbow:v0.1
docker service create jpetazzo/doublerainbow:v0.1
```
We just have to adapt this to our application, which has 4 services!
---
## The plan
- Build on our local node (`node1`)
- Tag images so that they are named `localhost:5000/servicename`
- Upload them to a registry
- Create services using the images
---
## Which registry do we want to use?
.small[
- **Docker Hub**
- hosted by Docker Inc.
- requires an account (free, no credit card needed)
- images will be public (unless you pay)
- located in AWS EC2 us-east-1
- **Docker Trusted Registry**
- self-hosted commercial product
- requires a subscription (free 30-day trial available)
- images can be public or private
- located wherever you want
- **Docker open source registry**
- self-hosted barebones repository hosting
- doesn't require anything
- doesn't come with anything either
- located wherever you want
]
---
class: extra-details
## Using Docker Hub
*If we wanted to use the Docker Hub...*
<!--
```meta
^{
```
-->
- We would log into the Docker Hub:
```bash
docker login
```
- And in the following slides, we would use our Docker Hub login
(e.g. `jpetazzo`) instead of the registry address (i.e. `127.0.0.1:5000`)
<!--
```meta
^}
```
-->
---
class: extra-details
## Using Docker Trusted Registry
*If we wanted to use DTR, we would...*
- Make sure we have a Docker Hub account
- [Activate a Docker Datacenter subscription](
https://hub.docker.com/enterprise/trial/)
- Install DTR on our machines
- Use `dtraddress:port/user` instead of the registry address
*This is out of the scope of this workshop!*
---
## Using the open source registry
- We need to run a `registry:2` container
<br/>(make sure you specify tag `:2` to run the new version!)
- It will store images and layers to the local filesystem
<br/>(but you can add a config file to use S3, Swift, etc.)
- Docker *requires* TLS when communicating with the registry
- except for registries on `127.0.0.0/8` (i.e. `localhost`)
- or with the Engine flag `--insecure-registry`
<!-- -->
- Our strategy: publish the registry container on port 5000,
<br/>so that it's available through `127.0.0.1:5000` on each node
---
class: manual-btp
# Deploying a local registry
- We will create a single-instance service, publishing its port
on the whole cluster
.exercise[
- Create the registry service:
```bash
docker service create --name registry --publish 5000:5000 registry:2
```
- Now try the following command; it should return `{"repositories":[]}`:
```bash
curl 127.0.0.1:5000/v2/_catalog
```
]
(If that doesn't work, wait a few seconds and try again.)
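Instead of retrying by hand, the "wait and try again" step can be wrapped in a small loop. This is a generic sketch; `retry` is a hypothetical helper, not something that ships with Docker.

```bash
# Hypothetical helper: run a command until it succeeds, at most $1 times,
# sleeping $2 seconds between attempts.
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$(( n + 1 ))
    sleep "$delay"
  done
}
```

For the registry, that could look like `retry 10 2 curl -sf 127.0.0.1:5000/v2/_catalog` (10 attempts, 2 seconds apart).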
---
class: manual-btp
## Testing our local registry
- We can retag a small image, and push it to the registry
.exercise[
- Make sure we have the busybox image, and retag it:
```bash
docker pull busybox
docker tag busybox 127.0.0.1:5000/busybox
```
- Push it:
```bash
docker push 127.0.0.1:5000/busybox
```
]
---
class: manual-btp
## Checking what's on our local registry
- The registry API has endpoints to query what's there
.exercise[
- Ensure that our busybox image is now in the local registry:
```bash
curl http://127.0.0.1:5000/v2/_catalog
```
]
The curl command should now output:
```json
{"repositories":["busybox"]}
```
---
class: manual-btp
## Build, tag, and push our application container images
- Compose has named our images `dockercoins_XXX` for each service
- We need to retag them (to `127.0.0.1:5000/XXX:v1`) and push them
.exercise[
- Set `REGISTRY` and `TAG` environment variables to use our local registry
- And run this little for loop:
```bash
cd ~/orchestration-workshop/dockercoins
REGISTRY=127.0.0.1:5000 TAG=v1
for SERVICE in hasher rng webui worker; do
docker tag dockercoins_$SERVICE $REGISTRY/$SERVICE:$TAG
docker push $REGISTRY/$SERVICE:$TAG
done
```
]
---
class: manual-btp
# Overlay networks
- SwarmKit integrates with overlay networks
- Networks are created with `docker network create`
- Make sure to specify that you want an *overlay* network
<br/>(otherwise you will get a local *bridge* network by default)
.exercise[
- Create an overlay network for our application:
```bash
docker network create --driver overlay dockercoins
```
]
---
class: manual-btp
## Viewing existing networks
- Let's confirm that our network was created
.exercise[
- List existing networks:
```bash
docker network ls
```
]
---
class: manual-btp
## Can you spot the differences?
The networks `dockercoins` and `ingress` are different from the other ones.
Can you see how?
--
class: manual-btp
- They are using a different kind of ID, reflecting the fact that they
are SwarmKit objects instead of "classic" Docker Engine objects.
- Their *scope* is `swarm` instead of `local`.
- They are using the overlay driver.
---
class: manual-btp, extra-details
## Caveats
.warning[In Docker 1.12, you cannot join an overlay network with `docker run --net ...`.]
Starting with version 1.13, you can, if the network was created with the `--attachable` flag.
*Why is that?*
Placing a container on a network requires allocating an IP address for this container.
The allocation must be done by a manager node (worker nodes cannot update Raft data).
As a result, `docker run --net ...` requires collaboration with manager nodes.
It alters the code path for `docker run`, so it is allowed only under strict circumstances.
---
class: manual-btp
## Run the application
- First, create the `redis` service; that one is using a Docker Hub image
.exercise[
- Create the `redis` service:
```bash
docker service create --network dockercoins --name redis redis
```
]
---
class: manual-btp
## Run the other services
- Then, start the other services one by one
- We will use the images pushed previously
.exercise[
- Start the other services:
```bash
REGISTRY=127.0.0.1:5000
TAG=v1
for SERVICE in hasher rng webui worker; do
docker service create --network dockercoins --detach=true \
--name $SERVICE $REGISTRY/$SERVICE:$TAG
done
```
]
???
## Wait for our application to be up
- We will see later a way to watch progress for all the tasks of the cluster
- But for now, a scrappy Shell loop will do the trick
.exercise[
- Repeatedly display the status of all our services:
```bash
watch "docker service ls -q | xargs -n1 docker service ps"
```
- Stop it once everything is running
]
---
class: manual-btp
## Expose our application web UI
- We need to connect to the `webui` service, but it is not publishing any port
- Let's reconfigure it to publish a port
.exercise[
- Update `webui` so that we can connect to it from outside:
```bash
docker service update webui --publish-add 8000:80 --detach=false
```
]
Note: to "de-publish" a port, you would have to specify the container port.
<br/>(i.e. in that case, `--publish-rm 80`)
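A `--publish` value like `8000:80` reads `published:target`: the first number is the port opened on the cluster nodes, the second is the port inside the container, and `--publish-rm` expects the latter. The split can be illustrated with shell parameter expansion (the helper names are made up for this sketch).

```bash
published_port() { echo "${1%%:*}"; }  # text before the first colon
target_port()    { echo "${1##*:}"; }  # text after the last colon

spec=8000:80
echo "nodes listen on $(published_port "$spec"), container listens on $(target_port "$spec")"
```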
---
class: manual-btp
## What happens when we modify a service?
- Let's find out what happened to our `webui` service
.exercise[
- Look at the tasks and containers associated with `webui`:
```bash
docker service ps webui
```
]
--
class: manual-btp
The first version of the service (the one that was not exposed) has been shut down.
It has been replaced by the new version, with port 80 accessible from outside.
(This will be discussed in more detail in the section about stateful services.)
---
class: manual-btp
## Connect to the web UI
- The web UI is now available on port 8000, *on all the nodes of the cluster*
.exercise[
- If you're using Play-With-Docker, just click on the `(8000)` badge
- Otherwise, point your browser to any node, on port 8000
]
---
## Scaling the application
- We can change scaling parameters with `docker service update` as well
- We will do the equivalent of `docker-compose scale`
.exercise[
- Bring up more workers:
```bash
docker service update worker --replicas 10 --detach=false
```
- Check the result in the web UI
]
You should see the performance peaking at 10 hashes/s (like before).
---
class: manual-btp
# Global scheduling
- We want to make the best possible use of the entropy generators
on our nodes
- We want to run exactly one `rng` instance per node
- SwarmKit has a special scheduling mode for that, let's use it
- We cannot enable/disable global scheduling on an existing service
- We have to destroy and re-create the `rng` service
---
class: manual-btp
## Scaling the `rng` service
.exercise[
- Remove the existing `rng` service:
```bash
docker service rm rng
```
- Re-create the `rng` service with *global scheduling*:
```bash
docker service create --name rng --network dockercoins --mode global \
--detach=false $REGISTRY/rng:$TAG
```
- Look at the result in the web UI
]
---
class: extra-details, manual-btp
## Why do we have to re-create the service to enable global scheduling?
- Enabling it dynamically would make rolling update semantics very complex
- This might change in the future (after all, it was possible in 1.12 RC!)
- As of Docker Engine 17.05, other parameters that require removing and re-creating the service (`rm`/`create`) are:
- service name
- hostname
- network
---
class: swarm-ready
## How did we make our app "Swarm-ready"?
This app was written in June 2015. (One year before Swarm mode was released.)
What did we change to make it compatible with Swarm mode?
--
.exercise[
- Go to the app directory:
```bash
cd ~/orchestration-workshop/dockercoins
```
- See modifications in the code:
```bash
git log -p --since "4-JUL-2015" -- . ':!*.yml*' ':!*.html'
```
]
---
class: swarm-ready
## What did we change in our app since its inception?
- Compose files
- HTML file (it contains an embedded contextual tweet)
- Dockerfiles (to switch to smaller images)
- That's it!
--
class: swarm-ready
*We didn't change a single line of code in this app since it was written.*
--
class: swarm-ready
*The images that were [built in June 2015](
https://hub.docker.com/r/jpetazzo/dockercoins_worker/tags/)
(when the app was written) can still run today ...
<br/>... in Swarm mode (distributed across a cluster, with load balancing) ...
<br/>... without any modification.*
---
class: swarm-ready
## How did we design our app in the first place?
- [Twelve-Factor App](https://12factor.net/) principles
- Service discovery using DNS names
- Initially implemented as "links"
- Then "ambassadors"
- And now "services"
- Existing apps might require more changes!
---
class: manual-btp
# Integration with Compose
- The previous section showed us how to streamline image build and push
- We will now see how to streamline service creation
(i.e. get rid of the `for SERVICE in ...; do docker service create ...` part)
---
## Compose file version 3
(New in Docker Engine 1.13)
- Almost identical to version 2
- Can be directly used by a Swarm cluster through `docker stack ...` commands
- Introduces a `deploy` section to pass Swarm-specific parameters
- Resource limits are moved to this `deploy` section
- See [here](https://github.com/aanand/docker.github.io/blob/8524552f99e5b58452fcb1403e1c273385988b71/compose/compose-file.md#upgrading) for the complete list of changes
- Supersedes *Distributed Application Bundles*
(JSON payload describing an application; could be generated from a Compose file)
---
class: manual-btp
## Removing everything
- Before deploying using "stacks," let's get a clean slate
.exercise[
- Remove *all* the services:
```bash
docker service ls -q | xargs docker service rm
```
]
---
## Our first stack
We need a registry to move images around.
Without a stack file, it would be deployed with the following command:
```bash
docker service create --publish 5000:5000 registry:2
```
Now, we are going to deploy it with the following stack file:
```yaml
version: "3"
services:
registry:
image: registry:2
ports:
- "5000:5000"
```
---
## Checking our stack files
- All the stack files that we will use are in the `stacks` directory
.exercise[
- Go to the `stacks` directory:
```bash
cd ~/orchestration-workshop/stacks
```
- Check `registry.yml`:
```bash
cat registry.yml
```
]
---
## Deploying our first stack
- All stack manipulation commands start with `docker stack`
- Under the hood, they map to `docker service` commands
- Stacks have a *name* (which also serves as a namespace)
- Stacks are specified with the aforementioned Compose file format version 3
.exercise[
- Deploy our local registry:
```bash
docker stack deploy registry --compose-file registry.yml
```
]
---
## Inspecting stacks
- `docker stack ps` shows the detailed state of all services of a stack
.exercise[
- Check that our registry is running correctly:
```bash
docker stack ps registry
```
- Confirm that we get the same output with the following command:
```bash
docker service ps registry_registry
```
]
---
class: manual-btp
## Specifics of stack deployment
Our registry is not *exactly* identical to the one deployed with `docker service create`!
- Each stack gets its own overlay network
- Services of the stack are connected to this network
<br/>(unless specified differently in the Compose file)
- Services get network aliases matching their name in the Compose file
<br/>(just like when Compose brings up an app specified in a v2 file)
- Services are explicitly named `<stack_name>_<service_name>`
- Services and tasks also get an internal label indicating which stack they belong to
---
class: auto-btp
## Testing our local registry
- Connecting to port 5000 *on any node of the cluster* routes us to the registry
- Therefore, we can use `localhost:5000` or `127.0.0.1:5000` as our registry
.exercise[
- Issue the following API request to the registry:
```bash
curl 127.0.0.1:5000/v2/_catalog
```
]
It should return:
```json
{"repositories":[]}
```
If that doesn't work, retry a few times; perhaps the container is still starting.
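The "retry a few times" advice can be scripted. This is only a sketch: `wait_for_registry` is a hypothetical helper name, and the default URL and attempt count are assumptions taken from this exercise.

```bash
# Poll the registry catalog until it answers, or give up after N attempts.
# wait_for_registry is a hypothetical helper, not a Docker command.
wait_for_registry() {
  local url="${1:-http://127.0.0.1:5000/v2/_catalog}"
  local attempts="${2:-10}"
  local i
  for i in $(seq 1 "$attempts"); do
    # -f: treat HTTP errors as failures; -sS: quiet, but still show errors
    if curl -fsS "$url"; then
      return 0
    fi
    sleep 1
  done
  echo "registry did not answer after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_registry` with no arguments polls the catalog endpoint used above.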
---
class: auto-btp
## Pushing an image to our local registry
- We can retag a small image, and push it to the registry
.exercise[
- Make sure we have the busybox image, and retag it:
```bash
docker pull busybox
docker tag busybox 127.0.0.1:5000/busybox
```
- Push it:
```bash
docker push 127.0.0.1:5000/busybox
```
]
---
class: auto-btp
## Checking what's on our local registry
- The registry API has endpoints to query what's there
.exercise[
- Ensure that our busybox image is now in the local registry:
```bash
curl http://127.0.0.1:5000/v2/_catalog
```
]
The curl command should now output:
```json
{"repositories":["busybox"]}
```
```
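The catalog only lists repository names. To see which tags a repository holds, the registry HTTP API also has a `/v2/<name>/tags/list` endpoint. A small sketch, where `list_tags` is a hypothetical helper name and the default registry address is the one used in this exercise:

```bash
# list_tags is a hypothetical helper; it queries the registry's tags/list endpoint.
list_tags() {
  local repo="$1"
  local registry="${2:-127.0.0.1:5000}"
  curl -fsS "http://${registry}/v2/${repo}/tags/list"
}
```

For instance, `list_tags busybox` asks the local registry for the tags of the image we just pushed.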
---
## Building and pushing stack services
- When using Compose file version 2 and above, you can specify *both* `build` and `image`
- When both keys are present:
- Compose does "business as usual" (uses `build`)
- but the resulting image is named as indicated by the `image` key
<br/>
(instead of `<projectname>_<servicename>:latest`)
- it can be pushed to a registry with `docker-compose push`
- Example:
```yaml
webfront:
  build: www
  image: myregistry.company.net:5000/webfront
```
---
## Using Compose to build and push images
.exercise[
- Try it:
```bash
docker-compose -f dockercoins.yml build
docker-compose -f dockercoins.yml push
```
]
Let's have a look at the `dockercoins.yml` file while this is building and pushing.
---
```yaml
version: "3"
services:
  rng:
    build: dockercoins/rng
    image: ${REGISTRY-127.0.0.1:5000}/rng:${TAG-latest}
    deploy:
      mode: global
  ...
  redis:
    image: redis
  ...
  worker:
    build: dockercoins/worker
    image: ${REGISTRY-127.0.0.1:5000}/worker:${TAG-latest}
    ...
    deploy:
      replicas: 10
```
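The `${REGISTRY-127.0.0.1:5000}` and `${TAG-latest}` expressions use the same "default if unset" syntax as the shell. The sketch below reproduces the substitution in plain bash; `image_ref` is a hypothetical helper and `registry.example.com:5000` is a made-up address.

```bash
# image_ref mirrors the `image:` lines of the stack file (hypothetical helper).
image_ref() {
  local service="$1"
  # ${VAR-default} expands to "default" only when VAR is unset
  echo "${REGISTRY-127.0.0.1:5000}/${service}:${TAG-latest}"
}

# With nothing set, the defaults apply:
( unset REGISTRY TAG; image_ref worker )
# prints 127.0.0.1:5000/worker:latest

# With both variables exported, they take precedence:
( export REGISTRY=registry.example.com:5000 TAG=v0.2; image_ref worker )
# prints registry.example.com:5000/worker:v0.2
```

This is why exporting `REGISTRY` and `TAG` before running `docker-compose build` and `docker-compose push` is enough to retarget the images.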
---
## Deploying the application
- Now that the images are on the registry, we can deploy our application stack
.exercise[
- Create the application stack:
```bash
docker stack deploy dockercoins --compose-file dockercoins.yml
```
]
We can now connect to any of our nodes on port 8000, and we will see the familiar hashing speed graph.
---
## Maintaining multiple environments
There are many ways to handle variations between environments.
- Compose loads `docker-compose.yml` and (if it exists) `docker-compose.override.yml`
- Compose can load alternate file(s) by setting the `-f` flag or the `COMPOSE_FILE` environment variable
- Compose files can *extend* other Compose files, selectively including services:
```yaml
web:
  extends:
    file: common-services.yml
    service: webapp
```
See [this documentation page](https://docs.docker.com/compose/extends/) for more details about these techniques.
---
class: extra-details
## Good to know ...
- Compose file version 3 adds the `deploy` section
- Later versions (3.1, ...) add more features (secrets, configs ...)
- You can re-run `docker stack deploy` to update a stack
- You can make manual changes with `docker service update` ...
- ... But they will be wiped out each time you `docker stack deploy`
(That's the intended behavior, when one thinks about it!)
- `extends` doesn't work with `docker stack deploy`
(But you can use `docker-compose config` to "flatten" your configuration)
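The "flatten, then deploy" workaround could look like the sketch below. `deploy_flattened` and the `.flat.yml` suffix are hypothetical names, and the `COMPOSE` variable is an assumption added so that another Compose binary can be swapped in. Whether the flattened output is accepted as-is may depend on your Compose and Engine versions.

```bash
# deploy_flattened is a hypothetical helper, not part of Docker or Compose.
deploy_flattened() {
  local stack="$1"
  local source_file="$2"
  local flat="${stack}.flat.yml"
  # Resolve `extends` and variable substitution into a single file...
  ${COMPOSE:-docker-compose} -f "$source_file" config > "$flat" || return 1
  # ...then hand the flattened result to Swarm.
  docker stack deploy "$stack" --compose-file "$flat"
}
```

For example, `deploy_flattened dockercoins dockercoins.yml` would write `dockercoins.flat.yml` and deploy it.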
---
## Summary
- We've seen how to set up a Swarm
- We've used it to host our own registry
- We've built our app container images
- We've used the registry to host those images
- We've deployed and scaled our application
- We've seen how to use Compose to streamline deployments
- Awesome job, team!

169
docs/prereqs-k8s.md Normal file

@@ -0,0 +1,169 @@
# Pre-requirements
- Computer with internet connection and a web browser
- For instructor-led workshops: an SSH client to connect to remote machines
- on Linux, OS X, FreeBSD... you are probably all set
- on Windows, get [putty](http://www.putty.org/),
Microsoft [Win32 OpenSSH](https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH),
[Git BASH](https://git-for-windows.github.io/), or
[MobaXterm](http://mobaxterm.mobatek.net/)
- A tiny little bit of Docker knowledge
(that's totally OK if you're not a Docker expert!)
---
class: in-person, extra-details
## Nice-to-haves
- [Mosh](https://mosh.org/) instead of SSH, if your internet connection tends to lose packets
<br/>(available with `(apt|yum|brew) install mosh`; then connect with `mosh user@host`)
- [GitHub](https://github.com/join) account
<br/>(if you want to fork the repo)
- [Gitter](https://gitter.im/) account
<br/>(to join the conversation during the workshop)
- [Slack](https://community.docker.com/registrations/groups/4316) account
<br/>(to join the conversation after the workshop)
- [Docker Hub](https://hub.docker.com) account
<br/>(it's one way to distribute images on your cluster)
---
class: extra-details
## Extra details
- This slide should have a little magnifying glass in the top left corner
(If it doesn't, it's because CSS is hard — Jérôme is only a backend person, alas)
- Slides with that magnifying glass indicate slides providing extra details
- Feel free to skip them if you're in a hurry!
---
## Hands-on sections
- The whole workshop is hands-on
- We will see Docker and Kubernetes in action
- You are invited to reproduce all the demos
- All hands-on sections are clearly identified, like the gray rectangle below
.exercise[
- This is the stuff you're supposed to do!
- Go to [container.training](http://container.training/) to view these slides
- Join the [chat room](@@CHAT@@)
]
---
class: pic, in-person
![You get five VMs](you-get-five-vms.jpg)
---
class: in-person
## You get five VMs
- Each person gets 5 private VMs (not shared with anybody else)
- Kubernetes has been deployed and pre-configured on these machines
- They'll remain up until the day after the tutorial
- You should have a little card with login+password+IP addresses
- You can automatically SSH from one VM to another
.exercise[
<!--
```bash
for N in $(seq 1 5); do
ssh -o StrictHostKeyChecking=no node$N true
done
for N in $(seq 1 5); do
(
docker-machine rm -f node$N
ssh node$N "docker ps -aq | xargs -r docker rm -f"
ssh node$N sudo rm -f /etc/systemd/system/docker.service
ssh node$N sudo systemctl daemon-reload
echo Restarting node$N.
ssh node$N sudo systemctl restart docker
echo Restarted node$N.
) &
done
wait
```
-->
- Log into the first VM (`node1`) with SSH or MOSH
- Check that you can SSH (without password) to `node2`:
```bash
ssh node2
```
- Type `exit` or `^D` to come back to node1
<!--
```meta
^D
```
-->
]
---
## We will (mostly) interact with node1 only
- Unless instructed otherwise, **all commands must be run from the first VM, `node1`**
- We will only check out/copy the code on `node1`
- When we use the other nodes, we will do it mostly through the Docker API
- We will log into other nodes only for initial setup and a few "out of band" operations
<br/>(checking internal logs, debugging...)
---
## Terminals
Once in a while, the instructions will say:
<br/>"Open a new terminal."
There are multiple ways to do this:
- create a new window or tab on your machine, and SSH into the VM;
- use screen or tmux on the VM and open a new window from there.
You are welcome to use the method that you feel the most comfortable with.
---
## Tmux cheatsheet
- Ctrl-b c → create a new window
- Ctrl-b n → go to next window
- Ctrl-b p → go to previous window
- Ctrl-b " → split the current pane top/bottom
- Ctrl-b % → split the current pane left/right
- Ctrl-b Alt-1 → rearrange panes in columns
- Ctrl-b Alt-2 → rearrange panes in rows
- Ctrl-b arrows → navigate to other panes
- Ctrl-b d → detach session
- tmux attach → reattach to session

245
docs/prereqs.md Normal file

@@ -0,0 +1,245 @@
# Pre-requirements
- Computer with internet connection and a web browser
- For instructor-led workshops: an SSH client to connect to remote machines
- on Linux, OS X, FreeBSD... you are probably all set
- on Windows, get [putty](http://www.putty.org/),
Microsoft [Win32 OpenSSH](https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH),
[Git BASH](https://git-for-windows.github.io/), or
[MobaXterm](http://mobaxterm.mobatek.net/)
- For self-paced learning: SSH is not necessary if you use
[Play-With-Docker](http://www.play-with-docker.com/)
- Some Docker knowledge
(but that's OK if you're not a Docker expert!)
---
class: in-person, extra-details
## Nice-to-haves
- [Mosh](https://mosh.org/) instead of SSH, if your internet connection tends to lose packets
<br/>(available with `(apt|yum|brew) install mosh`; then connect with `mosh user@host`)
- [GitHub](https://github.com/join) account
<br/>(if you want to fork the repo)
- [Gitter](https://gitter.im/) account
<br/>(to join the conversation during the workshop)
- [Slack](https://community.docker.com/registrations/groups/4316) account
<br/>(to join the conversation after the workshop)
- [Docker Hub](https://hub.docker.com) account
<br/>(it's one way to distribute images on your cluster)
---
class: extra-details
## Extra details
- This slide should have a little magnifying glass in the top left corner
(If it doesn't, it's because CSS is hard — Jérôme is only a backend person, alas)
- Slides with that magnifying glass indicate slides providing extra details
- Feel free to skip them if you're in a hurry!
---
## Hands-on sections
- The whole workshop is hands-on
- We will see Docker in action
- You are invited to reproduce all the demos
- All hands-on sections are clearly identified, like the gray rectangle below
.exercise[
- This is the stuff you're supposed to do!
- Go to [container.training](http://container.training/) to view these slides
- Join the [chat room](@@CHAT@@)
]
---
class: in-person
# VM environment
- To follow along, you need a cluster of five Docker Engines
- If you are doing this with an instructor, see next slide
- If you are doing (or re-doing) this on your own, you can:
- create your own cluster (local or cloud VMs) with Docker Machine
([instructions](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-machine))
- use [Play-With-Docker](http://play-with-docker.com) ([instructions](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker))
- create a bunch of clusters for you and your friends
([instructions](https://github.com/jpetazzo/orchestration-workshop/tree/master/prepare-vms))
---
class: pic, in-person
![You get five VMs](you-get-five-vms.jpg)
---
class: in-person
## You get five VMs
- Each person gets 5 private VMs (not shared with anybody else)
- They'll remain up until the day after the tutorial
- You should have a little card with login+password+IP addresses
- You can automatically SSH from one VM to another
.exercise[
<!--
```bash
for N in $(seq 1 5); do
ssh -o StrictHostKeyChecking=no node$N true
done
for N in $(seq 1 5); do
(
docker-machine rm -f node$N
ssh node$N "docker ps -aq | xargs -r docker rm -f"
ssh node$N sudo rm -f /etc/systemd/system/docker.service
ssh node$N sudo systemctl daemon-reload
echo Restarting node$N.
ssh node$N sudo systemctl restart docker
echo Restarted node$N.
) &
done
wait
```
-->
- Log into the first VM (`node1`) with SSH or MOSH
- Check that you can SSH (without password) to `node2`:
```bash
ssh node2
```
- Type `exit` or `^D` to come back to node1
<!--
```meta
^D
```
-->
]
---
## If doing or re-doing the workshop on your own ...
- Use [Play-With-Docker](http://www.play-with-docker.com/)!
- Main differences:
- you don't need to SSH to the machines
<br/>(just click on the node that you want to control in the left tab bar)
- Play-With-Docker automagically detects exposed ports
<br/>(and displays them as little badges with port numbers, above the terminal)
- You can access HTTP services by clicking on the port numbers
- exposing TCP services requires something like
[ngrok](https://ngrok.com/)
or [supergrok](https://github.com/jpetazzo/orchestration-workshop#using-play-with-docker)
<!--
- If you use VMs deployed with Docker Machine:
- you won't have pre-authorized SSH keys to bounce across machines
- you won't have host aliases
-->
---
class: self-paced
## Using Play-With-Docker
- Open a new browser tab to [www.play-with-docker.com](http://www.play-with-docker.com/)
- Confirm that you're not a robot
- Click on "ADD NEW INSTANCE": congratulations, you have your first Docker node!
- When you need more nodes, just click on "ADD NEW INSTANCE" again
- Note the countdown in the corner; when it expires, your instances are destroyed
- If you give your URL to somebody else, they can access your nodes too
<br/>
(You can use that for pair programming, or to get help from a mentor)
- Loving it? Not loving it? Tell it to the wonderful authors,
[@marcosnils](https://twitter.com/marcosnils) &
[@xetorthio](https://twitter.com/xetorthio)!
---
## We will (mostly) interact with node1 only
- Unless instructed otherwise, **all commands must be run from the first VM, `node1`**
- We will only check out/copy the code on `node1`
- When we use the other nodes, we will do it mostly through the Docker API
- We will log into other nodes only for initial setup and a few "out of band" operations
<br/>(checking internal logs, debugging...)
---
## Terminals
Once in a while, the instructions will say:
<br/>"Open a new terminal."
There are multiple ways to do this:
- create a new window or tab on your machine, and SSH into the VM;
- use screen or tmux on the VM and open a new window from there.
You are welcome to use the method that you feel the most comfortable with.
---
## Tmux cheatsheet
- Ctrl-b c → create a new window
- Ctrl-b n → go to next window
- Ctrl-b p → go to previous window
- Ctrl-b " → split the current pane top/bottom
- Ctrl-b % → split the current pane left/right
- Ctrl-b Alt-1 → rearrange panes in columns
- Ctrl-b Alt-2 → rearrange panes in rows
- Ctrl-b arrows → navigate to other panes
- Ctrl-b d → detach session
- tmux attach → reattach to session

BIN
docs/pwd-icons.png Normal file

Binary file not shown.


18
docs/remark-0.14.min.js vendored Normal file

File diff suppressed because one or more lines are too long

21
docs/remark.min.js vendored Normal file

File diff suppressed because one or more lines are too long

2
docs/requirements.txt Normal file

@@ -0,0 +1,2 @@
# This is for netlify
PyYAML

197
docs/rollout.md Normal file

@@ -0,0 +1,197 @@
# Rolling updates
- By default (without rolling updates), when a scaled resource is updated:
- new pods are created
- old pods are terminated
- ... all at the same time
- if something goes wrong, ¯\\\_(ツ)\_/¯
---
## Rolling updates
- With rolling updates, when a resource is updated, it happens progressively
- Two parameters determine the pace of the rollout: `maxUnavailable` and `maxSurge`
- They can be specified as an absolute number of pods, or as a percentage of the `replicas` count
- At any given time ...
- there will always be at least `replicas`-`maxUnavailable` pods available
- there will never be more than `replicas`+`maxSurge` pods in total
- there will therefore be up to `maxUnavailable`+`maxSurge` pods being updated
- We can roll back to the previous version
<br/>(if the update fails or is unsatisfactory in any way)
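As a worked example of these bounds, the sketch below does the arithmetic for absolute values only (percentages are left out). `rollout_bounds` is a hypothetical helper, and the numbers passed to it are arbitrary.

```bash
# rollout_bounds REPLICAS MAX_UNAVAILABLE MAX_SURGE (hypothetical helper)
# Prints the three limits described above, for absolute pod counts.
rollout_bounds() {
  local replicas="$1" max_unavailable="$2" max_surge="$3"
  echo "min_available=$((replicas - max_unavailable))"
  echo "max_total=$((replicas + max_surge))"
  echo "max_in_flight=$((max_unavailable + max_surge))"
}

rollout_bounds 10 1 3
```

With 10 replicas, `maxUnavailable` of 1, and `maxSurge` of 3, this prints `min_available=9`, `max_total=13`, and `max_in_flight=4`.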
---
## Rolling updates in practice
- As of Kubernetes 1.8, we can do rolling updates with:
`deployments`, `daemonsets`, `statefulsets`
- Editing one of these resources will automatically result in a rolling update
- Rolling updates can be monitored with the `kubectl rollout` subcommand
---
## Building a new version of the `worker` service
.exercise[
- Go to the `stacks` directory:
```bash
cd ~/orchestration-workshop/stacks
```
- Edit `dockercoins/worker/worker.py`, update the `sleep` line to sleep 1 second
- Build a new tag and push it to the registry:
```bash
export REGISTRY=localhost:3xxxx TAG=v0.2
docker-compose -f dockercoins.yml build
docker-compose -f dockercoins.yml push
```
]
---
## Rolling out the new version of the `worker` service
.exercise[
- Let's monitor what's going on by opening a few terminals and running one of these commands in each:
```bash
kubectl get pods -w
kubectl get replicasets -w
kubectl get deployments -w
```
- Update `worker` either with `kubectl edit`, or by running:
```bash
kubectl set image deploy worker worker=$REGISTRY/worker:$TAG
```
]
--
That rollout should be pretty quick. What shows in the web UI?
---
## Rolling out a boo-boo
- What happens if we make a mistake?
.exercise[
- Update `worker` by specifying a non-existent image:
```bash
export TAG=v0.3
kubectl set image deploy worker worker=$REGISTRY/worker:$TAG
```
- Check what's going on:
```bash
kubectl rollout status deploy worker
```
]
--
Our rollout is stuck. However, the app is not dead (just 10% slower).
---
## Recovering from a bad rollout
- We could push some `v0.3` image
(the pod retry logic will eventually catch it and the rollout will proceed)
- Or we could invoke a manual rollback
.exercise[
- Cancel the deployment and wait for the dust to settle down:
```bash
kubectl rollout undo deploy worker
kubectl rollout status deploy worker
```
]
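The "wait, and roll back if it gets stuck" logic can be sketched as a small function. `rollout_or_undo` is a hypothetical helper. It relies on the `--timeout` flag of `kubectl rollout status`, which is an assumption about your kubectl version (it may not exist in the 1.8 release mentioned earlier).

```bash
# rollout_or_undo DEPLOYMENT [TIMEOUT] (hypothetical helper)
# Waits for a rollout to finish; if it doesn't finish in time, rolls back.
rollout_or_undo() {
  local deploy="$1"
  local wait_time="${2:-60s}"
  if kubectl rollout status deployment "$deploy" --timeout="$wait_time"; then
    echo "rollout of $deploy completed"
  else
    echo "rollout of $deploy did not complete; rolling back" >&2
    kubectl rollout undo deployment "$deploy"
    # Wait for the rollback itself to settle.
    kubectl rollout status deployment "$deploy"
  fi
}
```

For instance, `rollout_or_undo worker 30s` gives the `worker` rollout 30 seconds before undoing it.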
---
## Changing rollout parameters
- We want to:
- revert to `v0.1`
- be conservative on availability (always have desired number of available workers)
- be aggressive on rollout speed (update more than one pod at a time)
- give some time to our workers to "warm up" before starting more
The corresponding changes can be expressed in the following YAML snippet:
.small[
```yaml
spec:
  template:
    spec:
      containers:
      - name: worker
        image: $REGISTRY/worker:v0.1
  strategy:
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 3
  minReadySeconds: 10
```
]
---
## Applying changes through a YAML patch
- We could use `kubectl edit deployment worker`
- But we could also use `kubectl patch` with the exact YAML shown before
.exercise[
.small[
- Apply all our changes and wait for them to take effect:
```bash
kubectl patch deployment worker -p "
spec:
  template:
    spec:
      containers:
      - name: worker
        image: $REGISTRY/worker:v0.1
  strategy:
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 3
  minReadySeconds: 10
"
kubectl rollout status deployment worker
```
]
]

468
docs/sampleapp.md Normal file

@@ -0,0 +1,468 @@
# Our sample application
- Visit the GitHub repository with all the materials of this workshop:
<br/>https://github.com/jpetazzo/orchestration-workshop
- The application is in the [dockercoins](
https://github.com/jpetazzo/orchestration-workshop/tree/master/dockercoins)
subdirectory
- Let's look at the general layout of the source code:
there is a Compose file [docker-compose.yml](
https://github.com/jpetazzo/orchestration-workshop/blob/master/dockercoins/docker-compose.yml) ...
... and 4 other services, each in its own directory:
- `rng` = web service generating random bytes
- `hasher` = web service computing hash of POSTed data
- `worker` = background process using `rng` and `hasher`
- `webui` = web interface to watch progress
---
class: extra-details
## Compose file format version
*Particularly relevant if you have used Compose before...*
- Compose 1.6 introduced support for a new Compose file format (aka "v2")
- Services are no longer at the top level, but under a `services` section
- There has to be a `version` key at the top level, with value `"2"` (as a string, not an integer)
- Containers are placed on a dedicated network, making links unnecessary
- There are other minor differences, but upgrade is easy and straightforward
---
## Links, naming, and service discovery
- Containers can have network aliases (resolvable through DNS)
- Compose file version 2+ makes each container reachable through its service name
- Compose file version 1 required "links" sections
- Our code can connect to services using their short name
(instead of e.g. IP address or FQDN)
- Network aliases are automatically namespaced
(i.e. you can have multiple apps declaring and using a service named `database`)
---
## Example in `worker/worker.py`
![Service discovery](service-discovery.png)
---
## What's this application?
---
class: pic
![DockerCoins logo](dockercoins.png)
(DockerCoins 2016 logo courtesy of [@XtlCnslt](https://twitter.com/xtlcnslt) and [@ndeloof](https://twitter.com/ndeloof). Thanks!)
---
## What's this application?
- It is a DockerCoin miner! 💰🐳📦🚢
--
- No, you can't buy coffee with DockerCoins
--
- How DockerCoins works:
- `worker` asks `rng` to generate a few random bytes
- `worker` feeds these bytes into `hasher`
- and repeat forever!
- every second, `worker` updates `redis` to indicate how many loops were done
- `webui` queries `redis`, and computes and exposes "hashing speed" in your browser
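The loop described above can be sketched in a few lines of shell. This is not the actual `worker` code: the URLs, the `RNG_URL` and `HASHER_URL` variables, and the `work_once` and `mine` names are assumptions for illustration, and the `redis` update is left out.

```bash
# One iteration: fetch random bytes from rng, feed them to hasher.
# The default URLs are assumptions, not taken from the real worker.
work_once() {
  curl -s "${RNG_URL:-http://rng/32}" |
    curl -s -X POST --data-binary @- "${HASHER_URL:-http://hasher/}"
}

# Run a fixed number of iterations and report how many succeeded.
# (The real worker loops forever and reports its count to redis.)
mine() {
  local loops="$1"
  local count=0
  while [ "$count" -lt "$loops" ]; do
    work_once > /dev/null && count=$((count + 1))
  done
  echo "$count"
}
```

The short names `rng` and `hasher` in the URLs only resolve from inside the application's network, as described in the service discovery slides.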
---
## Getting the application source code
- We will clone the GitHub repository
- The repository also contains scripts and tools that we will use through the workshop
.exercise[
<!--
```bash
[ -d orchestration-workshop ] && mv orchestration-workshop orchestration-workshop.$$
```
-->
- Clone the repository on `node1`:
```bash
git clone https://github.com/jpetazzo/orchestration-workshop
```
]
(You can also fork the repository on GitHub and clone your fork if you prefer that.)
---
# Running the application
Without further ado, let's start our application.
.exercise[
- Go to the `dockercoins` directory, in the cloned repo:
```bash
cd ~/orchestration-workshop/dockercoins
```
- Use Compose to build and run all containers:
```bash
docker-compose up
```
]
Compose tells Docker to build all container images (pulling
the corresponding base images), then starts all containers,
and displays aggregated logs.
---
## Lots of logs
- The application continuously generates logs
- We can see the `worker` service making requests to `rng` and `hasher`
- Let's put that in the background
.exercise[
- Stop the application by hitting `^C`
<!--
```meta
^C
```
-->
]
- `^C` stops all containers by sending them the `TERM` signal
- Some containers exit immediately, others take longer
<br/>(because they don't handle `SIGTERM` and end up being killed after a 10s timeout)
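For comparison, here is a minimal sketch of a process that does handle `SIGTERM` and exits right away. `graceful_worker` is a hypothetical name; the `sleep` in the background followed by `wait` is there so that the shell can run the trap without waiting for `sleep` to finish.

```bash
# A loop that exits cleanly as soon as it receives TERM (hypothetical example).
graceful_worker() {
  trap 'echo "TERM received, shutting down"; exit 0' TERM
  while true; do
    # Sleep in the background and wait on it: `wait` is interrupted by
    # signals, so the trap runs immediately instead of after the sleep.
    sleep 1 &
    wait $!
  done
}
```

A container whose main process behaves like this stops immediately when Compose sends `TERM`, instead of being killed after the timeout.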
---
## Restarting in the background
- Many flags and commands of Compose are modeled after those of `docker`
.exercise[
- Start the app in the background with the `-d` option:
```bash
docker-compose up -d
```
- Check that our app is running with the `ps` command:
```bash
docker-compose ps
```
]
`docker-compose ps` also shows the ports exposed by the application.
---
class: extra-details
## Viewing logs
- The `docker-compose logs` command works like `docker logs`
.exercise[
- View all logs since container creation and exit when done:
```bash
docker-compose logs
```
- Stream container logs, starting at the last 10 lines for each container:
```bash
docker-compose logs --tail 10 --follow
```
<!--
```meta
^C
```
-->
]
Tip: use `^S` and `^Q` to pause/resume log output.
---
class: extra-details
## Upgrading from Compose 1.6
.warning[The `logs` command has changed between Compose 1.6 and 1.7!]
- Up to 1.6
- `docker-compose logs` is the equivalent of `logs --follow`
- `docker-compose logs` must be restarted if containers are added
- Since 1.7
- `--follow` must be specified explicitly
- new containers are automatically picked up by `docker-compose logs`
---
## Connecting to the web UI
- The `webui` container exposes a web dashboard; let's view it
.exercise[
- With a web browser, connect to `node1` on port 8000
- Remember: the `nodeX` aliases are valid only on the nodes themselves
- In your browser, you need to enter the IP address of your node
]
You should see a speed of approximately 4 hashes/second.
More precisely: 4 hashes/second, with regular dips down to zero.
<br/>This is because Jérôme is incapable of writing good frontend code.
<br/>Don't ask. Seriously, don't ask. This is embarrassing.
---
class: extra-details
## Why does the speed seem irregular?
- The app actually has a constant, steady speed: 3.33 hashes/second
<br/>
(which corresponds to 1 hash every 0.3 seconds, for *reasons*)
- The worker doesn't update the counter after every loop, but up to once per second
- The speed is computed by the browser, checking the counter about once per second
- Between two consecutive updates, the counter will increase either by 4, or by 0
- The perceived speed will therefore be 4 - 4 - 4 - 0 - 4 - 4 - etc.
*We told you to not ask!!!*
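If you insist on asking, the pattern can be reproduced with integer arithmetic. The sketch below uses a simplified model, which is an assumption rather than a trace of the real code: the worker flushes 4 hashes to the counter every 1.2 seconds, and the browser samples the counter exactly once per second. `counter_at` and `perceived_speeds` are hypothetical names.

```bash
# counter_at T: counter value after T seconds in the simplified model.
counter_at() {
  local t="$1"
  # 4 hashes are flushed every 1.2 s; use tenths of a second to stay in integers
  echo $(( 4 * (t * 10 / 12) ))
}

# perceived_speeds N: increase of the counter during each of the first N seconds.
perceived_speeds() {
  local seconds="$1" t prev=0 cur out=""
  for t in $(seq 1 "$seconds"); do
    cur=$(counter_at "$t")
    out="$out $((cur - prev))"
    prev=$cur
  done
  echo "${out# }"
}

perceived_speeds 13
# prints 0 4 4 4 4 4 0 4 4 4 4 4 0
```

After the first second, the counter gains 40 over 12 seconds, which averages to the steady 3.33 hashes/second.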
---
## Scaling up the application
- Our goal is to make that performance graph go up (without changing a line of code!)
--
- Before trying to scale the application, we'll figure out if we need more resources
(CPU, RAM...)
- For that, we will use good old UNIX tools on our Docker node
---
## Looking at resource usage
- Let's look at CPU, memory, and I/O usage
.exercise[
- run `top` to see CPU and memory usage (you should see idle cycles)
- run `vmstat 3` to see I/O usage (si/so/bi/bo)
<br/>(the 4 numbers should be almost zero, except `bo` for logging)
]
We have available resources.
- Why?
- How can we use them?
---
## Scaling workers on a single node
- Docker Compose supports scaling
- Let's scale `worker` and see what happens!
.exercise[
- Start one more `worker` container:
```bash
docker-compose scale worker=2
```
- Look at the performance graph (it should show a x2 improvement)
- Look at the aggregated logs of our containers (`worker_2` should show up)
- Look at the impact on CPU load with e.g. top (it should be negligible)
]
---
## Adding more workers
- Great, let's add more workers and call it a day, then!
.exercise[
- Start eight more `worker` containers:
```bash
docker-compose scale worker=10
```
- Look at the performance graph: does it show a x10 improvement?
- Look at the aggregated logs of our containers
- Look at the impact on CPU load and memory usage
<!--
```bash
sleep 5
killall docker-compose
```
-->
]
---
# Identifying bottlenecks
- You should have seen a 3x speed bump (not 10x)
- Adding workers didn't result in linear improvement
- *Something else* is slowing us down
--
- ... But what?
--
- The code doesn't have instrumentation
- Let's use state-of-the-art HTTP performance analysis!
<br/>(i.e. good old tools like `ab`, `httping`...)
---
## Accessing internal services
- `rng` and `hasher` are exposed on ports 8001 and 8002
- This is declared in the Compose file:
```yaml
  ...
  rng:
    build: rng
    ports:
    - "8001:80"
  hasher:
    build: hasher
    ports:
    - "8002:80"
  ...
```
---
## Measuring latency under load
We will use `httping`.
.exercise[
- Check the latency of `rng`:
```bash
httping -c 10 localhost:8001
```
- Check the latency of `hasher`:
```bash
httping -c 10 localhost:8002
```
]
`rng` has a much higher latency than `hasher`.
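If `httping` is not available, a rough equivalent can be built with `curl` and `awk`. This is only a sketch: `avg_latency` is a hypothetical helper, and it measures total request time as reported by curl's `%{time_total}` write-out variable rather than whatever `httping` reports.

```bash
# avg_latency URL [COUNT] (hypothetical helper)
# Prints the mean total request time, in seconds, over COUNT requests.
avg_latency() {
  local url="$1"
  local count="${2:-10}"
  local i
  for i in $(seq 1 "$count"); do
    # Discard the body; print only the total time of the request.
    curl -o /dev/null -s -w '%{time_total}\n' "$url"
  done | awk '{ sum += $1 } END { if (NR) printf "%.3f\n", sum / NR }'
}
```

Comparing `avg_latency localhost:8001` with `avg_latency localhost:8002` should show the same gap between `rng` and `hasher`.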
---
## Let's draw hasty conclusions
- The bottleneck seems to be `rng`
- *What if* we don't have enough entropy and can't generate enough random numbers?
- We need to scale out the `rng` service on multiple machines!
Note: this is a fiction! We have enough entropy. But we need a pretext to scale out.
(In fact, the code of `rng` uses `/dev/urandom`, which never runs out of entropy...
<br/>
...and is [just as good as `/dev/random`](http://www.slideshare.net/PacSecJP/filippo-plain-simple-reality-of-entropy).)
---
## Clean up
- Before moving on, let's remove those containers
.exercise[
- Tell Compose to remove everything:
```bash
docker-compose down
```
]

194
docs/secrets.md Normal file

@@ -0,0 +1,194 @@
class: secrets
## Secret management
- Docker has a "secret safe" (secure key→value store)
- You can create as many secrets as you like
- You can associate secrets to services
- Secrets are exposed as plain text files, but kept in memory only (using `tmpfs`)
- Secrets are immutable (at least in Engine 1.13)
- Secrets have a max size of 500 KB
---
class: secrets
## Creating secrets
- You must specify a name for the secret, and the secret itself
.exercise[
- Assign [one of the four most commonly used passwords](https://www.youtube.com/watch?v=0Jx8Eay5fWQ) to a secret called `hackme`:
```bash
echo love | docker secret create hackme -
```
]
If the secret is in a file, you can simply pass the path to the file.
(The special path `-` tells Docker to read the secret from standard input.)
---
class: secrets
## Creating better secrets
- Picking lousy passwords always leads to security breaches
.exercise[
- Let's craft a better password, and assign it to another secret:
```bash
base64 /dev/urandom | head -c16 | docker secret create arewesecureyet -
```
]
Note: in the latter case, we don't even know the secret at this point. But Swarm does.
---
class: secrets
## Using secrets
- Secrets must be handed explicitly to services
.exercise[
- Create a dummy service with both secrets:
```bash
docker service create \
--secret hackme --secret arewesecureyet \
--name dummyservice --mode global \
alpine sleep 1000000000
```
]
We use a global service to make sure that there will be an instance on the local node.
---
class: secrets
## Accessing secrets
- Secrets are materialized in `/run/secrets` (which is an in-memory filesystem)
.exercise[
- Find the ID of the container for the dummy service:
```bash
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=dummyservice)
```
- Enter the container:
```bash
docker exec -ti $CID sh
```
- Check the files in `/run/secrets`
]
---
class: secrets
## Rotating secrets
- You can't change a secret
(Sounds annoying at first; but allows clean rollbacks if a secret update goes wrong)
- You can add a secret to a service with `docker service update --secret-add`
(This will redeploy the service; it won't add the secret on the fly)
- You can remove a secret with `docker service update --secret-rm`
- Secrets can be mapped to different names by expressing them with a micro-format:
```bash
docker service create --secret source=secretname,target=filename
```
---
class: secrets
## Changing our insecure password
- We want to replace our `hackme` secret with a better one
.exercise[
- Remove the insecure `hackme` secret:
```bash
docker service update dummyservice --secret-rm hackme
```
- Add our better secret instead:
```bash
docker service update dummyservice \
--secret-add source=arewesecureyet,target=hackme
```
]
Wait for the service to be fully updated with e.g. `watch docker service ps dummyservice`.
<br/>(With Docker Engine 17.10 and later, the CLI will wait for you!)
---
class: secrets
## Checking that our password is now stronger
- We will use the power of `docker exec`!
.exercise[
- Get the ID of the new container:
```bash
CID=$(docker ps -q --filter label=com.docker.swarm.service.name=dummyservice)
```
- Check the contents of the secret files:
```bash
docker exec $CID grep -r . /run/secrets
```
]
---
class: secrets
## Secrets in practice
- Can be (ab)used to hold whole configuration files if needed
- If you intend to rotate secret `foo`, call it `foo.N` instead, and map it to `foo`
(N can be a serial, a timestamp...)
```bash
docker service create --secret source=foo.N,target=foo ...
```
- You can update (remove+add) a secret in a single command:
```bash
docker service update ... --secret-rm foo.M --secret-add source=foo.N,target=foo
```
- For more details and examples, [check the documentation](https://docs.docker.com/engine/swarm/secrets/)
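
The `foo.N` convention can be wrapped in a small function. This is only a sketch: `rotate_secret` and its arguments are hypothetical names, and you have to keep track of the old and new versions yourself.

```bash
# Hypothetical helper (not a Docker command):
#   rotate_secret SERVICE NAME OLD_VERSION NEW_VERSION FILE
rotate_secret() {
  rs_service=$1 rs_name=$2 rs_old=$3 rs_new=$4 rs_file=$5
  # Secrets are immutable, so the new value goes into a new, versioned secret.
  docker secret create "$rs_name.$rs_new" "$rs_file" || return 1
  # Swap the old version for the new one in a single service update.
  # The target stays the same, so the file name in /run/secrets does not change.
  docker service update "$rs_service" \
    --secret-rm "$rs_name.$rs_old" \
    --secret-add "source=$rs_name.$rs_new,target=$rs_name"
}
```

Usage: `rotate_secret myservice foo 1 2 ./new-foo.txt` (all values are placeholders).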

16
docs/security.md Normal file
View File

@@ -0,0 +1,16 @@
# Secrets management and encryption at rest
(New in Docker Engine 1.13)
- Secrets management = selectively and securely bring secrets to services
- Encryption at rest = protect against storage theft or prying
- Remember:
- control plane is authenticated through mutual TLS, certs rotated every 90 days
- control plane is encrypted with AES-GCM, keys rotated every 12 hours
- data plane is not encrypted by default (for performance reasons),
<br/>but we saw earlier how to enable that with a single flag

62
docs/selfpaced.yml Normal file
View File

@@ -0,0 +1,62 @@
exclude:
- in-person
chat: FIXME
chapters:
- |
class: title
Docker <br/> Orchestration <br/> Workshop
- intro.md
- |
@@TOC@@
- - prereqs.md
- versions.md
- |
class: title
All right!
<br/>
We're all set.
<br/>
Let's do this.
- |
name: part-1
class: title, self-paced
Part 1
- sampleapp.md
- |
class: title
Scaling out
- swarmkit.md
- creatingswarm.md
- machine.md
- morenodes.md
- - firstservice.md
- ourapponswarm.md
- - operatingswarm.md
- netshoot.md
- swarmnbt.md
- ipsec.md
- updatingservices.md
- healthchecks.md
- nodeinfo.md
- swarmtools.md
- - security.md
- secrets.md
- leastprivilege.md
- namespaces.md
- apiscope.md
- encryptionatrest.md
- logging.md
- metrics.md
- stateful.md
- extratips.md
- end.md
- |
class: title
Thank you!

68
docs/setup-k8s.md Normal file
View File

@@ -0,0 +1,68 @@
# Setting up Kubernetes
- How did we set up these Kubernetes clusters that we're using?
--
- We used `kubeadm` on "fresh" EC2 instances with Ubuntu 16.04 LTS
1. Install Docker
2. Install Kubernetes packages
3. Run `kubeadm init` on the master node
4. Set up Weave (the overlay network)
<br/>
(that step is just one `kubectl apply` command; discussed later)
5. Run `kubeadm join` on the other nodes (with the token produced by `kubeadm init`)
6. Copy the configuration file generated by `kubeadm init`
---
## `kubeadm` drawbacks
- Doesn't set up Docker or any other container engine
- Doesn't set up the overlay network
- Scripting is complex
<br/>
(because extracting the token requires advanced `kubectl` commands)
- Doesn't set up multi-master (no high availability)
--
- It's still twice as many steps as setting up a Swarm cluster 😕
---
## Other deployment options
- If you are on Google Cloud:
[GKE](https://cloud.google.com/container-engine/)
Empirically the best Kubernetes deployment out there
- If you are on AWS:
[kops](https://github.com/kubernetes/kops)
... But with AWS re:Invent just around the corner, expect some changes
- On a local machine:
[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/),
[kube-spawn](https://github.com/kinvolk/kube-spawn)
FIXME
- If you want something customizable:
[kubicorn](https://github.com/kris-nova/kubicorn)
Probably the closest to a multi-cloud/hybrid solution so far, but in development
- Also, many commercial options!
FIXME

344
docs/stateful.md Normal file
View File

@@ -0,0 +1,344 @@
# Dealing with stateful services
- First of all, you need to make sure that the data files are on a *volume*
- Volumes are host directories that are mounted to the container's filesystem
- These host directories can be backed by the ordinary, plain host filesystem ...
- ... Or by distributed/networked filesystems
- In the latter scenario, in case of node failure, the data is safe elsewhere ...
- ... And the container can be restarted on another node without data loss
---
## Building a stateful service experiment
- We will use Redis for this example
- We will expose it on port 10000 to access it easily
.exercise[
- Start the Redis service:
```bash
docker service create --name stateful -p 10000:6379 redis
```
- Check that we can connect to it:
```bash
docker run --net host --rm redis redis-cli -p 10000 info server
```
]
---
## Accessing our Redis service easily
- Typing that whole command is going to be tedious
.exercise[
- Define a shell alias to make our lives easier:
```bash
alias redis='docker run --net host --rm redis redis-cli -p 10000'
```
- Try it:
```bash
redis info server
```
]
---
## Basic Redis commands
.exercise[
- Check that the `foo` key doesn't exist:
```bash
redis get foo
```
- Set it to `bar`:
```bash
redis set foo bar
```
- Check that it exists now:
```bash
redis get foo
```
]
---
## Local volumes vs. global volumes
- Global volumes exist in a single namespace
- A global volume can be mounted on any node
<br/>.small[(bar some restrictions specific to the volume driver in use; e.g. using an EBS-backed volume on a GCE/EC2 mixed cluster)]
- Attaching a global volume to a container allows the container to be started anywhere
<br/>(and retain its data wherever you start it!)
- Global volumes require extra *plugins* (Flocker, Portworx...)
- Docker doesn't come with a default global volume driver at this point
- Therefore, we will fall back on *local volumes*
---
## Local volumes
- We will use the default volume driver, `local`
- As the name implies, the `local` volume driver manages *local* volumes
- Since local volumes are (duh!) *local*, we need to pin our container to a specific host
- We will do that with a *constraint*
.exercise[
- Add a placement constraint to our service:
```bash
docker service update stateful --constraint-add node.hostname==$HOSTNAME
```
]
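
To double-check which constraints are now attached to the service, you can inspect its spec. A minimal sketch (`show_constraints` is a hypothetical helper name):

```bash
# Hypothetical helper: show_constraints SERVICE
# Prints the placement constraints stored in the service spec.
show_constraints() {
  docker service inspect "$1" \
    --format '{{.Spec.TaskTemplate.Placement.Constraints}}'
}
```

Usage: `show_constraints stateful`.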
---
## Where is our data?
- If we look for our `foo` key, it's gone!
.exercise[
- Check the `foo` key:
```bash
redis get foo
```
- Adding a constraint caused the service to be redeployed:
```bash
docker service ps stateful
```
]
Note: even if the constraint ends up being a no-op (i.e. not
moving the service), the service gets redeployed.
This ensures consistent behavior.
---
## Setting the key again
- Since our database was wiped out, let's populate it again
.exercise[
- Set `foo` again:
```bash
redis set foo bar
```
- Check that it's there:
```bash
redis get foo
```
]
---
## Service updates cause containers to be replaced
- Let's try to make a trivial update to the service and see what happens
.exercise[
- Set a memory limit on our Redis service:
```bash
docker service update stateful --limit-memory 100M
```
- Try to get the `foo` key one more time:
```bash
redis get foo
```
]
The key is blank again!
---
## Service volumes are ephemeral by default
- Let's highlight what's going on with volumes!
.exercise[
- Check the current list of volumes:
```bash
docker volume ls
```
- Carry out a minor update to our Redis service:
```bash
docker service update stateful --limit-memory 200M
```
]
Again: all changes trigger the creation of a new task, and therefore a replacement of the existing container;
even when it is not strictly technically necessary.
---
## The data is gone again
- What happened to our data?
.exercise[
- The list of volumes is slightly different:
```bash
docker volume ls
```
]
(You should see one extra volume.)
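
If the extra volume is hard to spot by eye, you can diff the two listings. This is a sketch with hypothetical helper names; it assumes `sort` and `comm` are available on the node.

```bash
# Hypothetical helpers built on `docker volume ls -q` (volume names only).

# snapshot_volumes FILE: saves the sorted list of current volumes to FILE.
snapshot_volumes() {
  docker volume ls -q | sort > "$1"
}

# new_volumes FILE: prints the volumes that exist now
# but were absent from the snapshot saved in FILE.
new_volumes() {
  docker volume ls -q | sort | comm -13 "$1" -
}
```

Usage: run `snapshot_volumes /tmp/volumes.before` before the service update, then `new_volumes /tmp/volumes.before` after it.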
---
## Assigning a persistent volume to the container
- Let's add an explicit volume mount to our service, referencing a named volume
.exercise[
- Update the service with a volume mount:
```bash
docker service update stateful \
--mount-add type=volume,source=foobarstore,target=/data
```
- Check the new volume list:
```bash
docker volume ls
```
]
Note: the `local` volume driver automatically creates volumes.
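
For a fresh deployment, the publish, constraint, and mount options can all be given at creation time instead of through successive updates. A sketch, not meant to be run during this exercise (a service named `stateful` already exists); `create_pinned_redis` is a hypothetical wrapper name:

```bash
# Hypothetical wrapper: create_pinned_redis NODE_HOSTNAME
# Creates the Redis service already pinned to one node,
# with the named volume "foobarstore" mounted on /data.
create_pinned_redis() {
  docker service create --name stateful -p 10000:6379 \
    --constraint "node.hostname==$1" \
    --mount type=volume,source=foobarstore,target=/data \
    redis
}
```

Usage: `create_pinned_redis "$HOSTNAME"`.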
---
## Checking that persistence actually works across service updates
.exercise[
- Store something in the `foo` key:
```bash
redis set foo barbar
```
- Update the service with yet another trivial change:
```bash
docker service update stateful --limit-memory 300M
```
- Check that `foo` is still set:
```bash
redis get foo
```
]
---
## Recap
- The service must commit its state to disk when being shut down.red[*]
(Shutdown = being sent a `TERM` signal)
- The state must be written to files located on a volume
- That volume must be specified to be persistent
- If using a local volume, the service must also be pinned to a specific node
(And losing that node means losing the data, unless there are other backups)
.footnote[<br/>
.red[*]If you customize Redis configuration, make sure you
persist data correctly!
<br/>
It's easy to make that mistake — __Trust me!__]
---
## Cleaning up
.exercise[
- Remove the stateful service:
```bash
docker service rm stateful
```
- Remove the associated volume:
```bash
docker volume rm foobarstore
```
]
Note: we could keep the volume around if we wanted.
---
## Should I run stateful services in containers?
--
Depending on whom you ask, they'll tell you:
--
- certainly not, heathen!
--
- we've been running a few thousand PostgreSQL instances in containers ...
<br/>for a few years now ... in production ... is that bad?
--
- what's a container?
--
Perhaps a better question would be:
*"Should I run stateful services?"*
--
- is it critical for my business?
- is it my value-add?
- or should I find somebody else to run them for me?

4
docs/swarm-mode.svg Normal file

File diff suppressed because one or more lines are too long

151
docs/swarmkit.md Normal file
View File

@@ -0,0 +1,151 @@
# SwarmKit
- [SwarmKit](https://github.com/docker/swarmkit) is an open source
toolkit to build multi-node systems
- It is a reusable library, like libcontainer, libnetwork, vpnkit ...
- It is a plumbing part of the Docker ecosystem
--
.footnote[🐳 Did you know that кит means "whale" in Russian?]
---
## SwarmKit features
- Highly-available, distributed store based on [Raft](
https://en.wikipedia.org/wiki/Raft_%28computer_science%29)
<br/>(avoids depending on an external store: easier to deploy; higher performance)
- Dynamic reconfiguration of Raft without interrupting cluster operations
- *Services* managed with a *declarative API*
<br/>(implementing *desired state* and *reconciliation loop*)
- Integration with overlay networks and load balancing
- Strong emphasis on security:
- automatic TLS keying and signing; automatic cert rotation
- full encryption of the data plane; automatic key rotation
- least privilege architecture (single-node compromise ≠ cluster compromise)
- on-disk encryption with optional passphrase
---
class: extra-details
## Where is the key/value store?
- Many orchestration systems use a key/value store backed by a consensus algorithm
<br/>
(k8s→etcd→Raft, mesos→zookeeper→ZAB, etc.)
- SwarmKit implements the Raft algorithm directly
<br/>
(Nomad is similar; thanks [@cbednarski](https://twitter.com/@cbednarski),
[@diptanu](https://twitter.com/diptanu) and others for pointing it out!)
- Analogy courtesy of [@aluzzardi](https://twitter.com/aluzzardi):
*It's like B-Trees and RDBMS. They are different layers, often
associated. But you don't need to bring up a full SQL server when
all you need is to index some data.*
- As a result, the orchestrator has direct access to the data
<br/>
(the main copy of the data is stored in the orchestrator's memory)
- Simpler, easier to deploy and operate; also faster
---
## SwarmKit concepts (1/2)
- A *cluster* is made of at least one *node* (preferably more)
- A *node* can be a *manager* or a *worker*
- A *manager* actively takes part in the Raft consensus, and keeps the Raft log
- You can talk to a *manager* using the SwarmKit API
- One *manager* is elected as the *leader*; other managers merely forward requests to it
- The *workers* get their instructions from the *managers*
- Both *workers* and *managers* can run containers
---
## Illustration
![Illustration](swarm-mode.svg)
---
## SwarmKit concepts (2/2)
- The *managers* expose the SwarmKit API
- Using the API, you can indicate that you want to run a *service*
- A *service* is specified by its *desired state*: which image, how many instances...
- The *leader* uses different subsystems to break down services into *tasks*:
<br/>orchestrator, scheduler, allocator, dispatcher
- A *task* corresponds to a specific container, assigned to a specific *node*
- *Nodes* know which *tasks* should be running, and will start or stop containers accordingly (through the Docker Engine API)
You can refer to the [NOMENCLATURE](https://github.com/docker/swarmkit/blob/master/design/nomenclature.md) in the SwarmKit repo for more details.
---
## Swarm Mode
- Since version 1.12, Docker Engine embeds SwarmKit
- All the SwarmKit features are "asleep" until you enable "Swarm Mode"
- Examples of Swarm Mode commands:
- `docker swarm` (enable Swarm mode; join a Swarm; adjust cluster parameters)
- `docker node` (view nodes; promote/demote managers; manage nodes)
- `docker service` (create and manage services)
???
- The Docker API exposes the same concepts
- The SwarmKit API is also exposed (on a separate socket)
---
## You need to enable Swarm mode to use the new stuff
- By default, all this new code is inactive
- Swarm Mode can be enabled, "unlocking" SwarmKit functions
<br/>(services, out-of-the-box overlay networks, etc.)
.exercise[
- Try a Swarm-specific command:
```bash
docker node ls
```
]
--
You will get an error message:
```
Error response from daemon: This node is not a swarm manager. [...]
```
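
A script can check for this situation before running Swarm-specific commands. This is a sketch: the function names are hypothetical, and it assumes that `docker info --format` exposes the `.Swarm.ControlAvailable` field (true on managers only).

```bash
# Hypothetical helper: succeeds only if the local Engine is a Swarm manager.
is_swarm_manager() {
  [ "$(docker info --format '{{.Swarm.ControlAvailable}}')" = "true" ]
}

# Only list nodes when the command has a chance to succeed.
list_nodes_if_manager() {
  if is_swarm_manager; then
    docker node ls
  else
    echo "This node is not a Swarm manager" >&2
    return 1
  fi
}
```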

72
docs/swarmnbt.md Normal file
View File

@@ -0,0 +1,72 @@
class: nbt, extra-details
## Measuring network conditions on the whole cluster
- Since we have built-in, cluster-wide discovery, it's relatively straightforward
to monitor the whole cluster automatically
- [Alexandros Mavrogiannis](https://github.com/alexmavr) wrote
[Swarm NBT](https://github.com/alexmavr/swarm-nbt), a tool doing exactly that!
.exercise[
- Start Swarm NBT:
```bash
docker run --rm -v inventory:/inventory \
-v /var/run/docker.sock:/var/run/docker.sock \
alexmavr/swarm-nbt start
```
]
Note: in this mode, Swarm NBT connects to the Docker API socket,
and issues additional API requests to start all the components it needs.
---
class: nbt, extra-details
## Viewing network conditions with Prometheus
- Swarm NBT relies on Prometheus to scrape and store data
- We can directly consume the Prometheus endpoint to view telemetry data
.exercise[
- Point your browser to any Swarm node, on port 9090
(If you're using Play-With-Docker, click on the (9090) badge)
- In the drop-down, select `icmp_rtt_gauge_seconds`
- Click on "Graph"
]
You are now seeing ICMP latency across your cluster.
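
The same data can be fetched without a browser, through the Prometheus HTTP API. A minimal sketch, assuming `curl` is installed; `prom_query` is a hypothetical helper name, and the node address is a placeholder.

```bash
# Hypothetical helper: prom_query NODE_ADDRESS PROMQL_EXPRESSION
# Sends an instant query to the Prometheus HTTP API on port 9090
# and prints the JSON reply.
prom_query() {
  curl -s -G "http://$1:9090/api/v1/query" --data-urlencode "query=$2"
}
```

Usage: `prom_query any-node-IP icmp_rtt_gauge_seconds`.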
---
class: nbt, in-person, extra-details
## Viewing network conditions with Grafana
- If you are using a "real" cluster (not Play-With-Docker) you can use Grafana
.exercise[
- Start Grafana with `docker service create -p 3000:3000 grafana/grafana`
- Point your browser to Grafana, on port 3000 on any Swarm node
- Log in with username `admin` and password `admin`
- Click on the top-left menu and browse to Data Sources
- Create a Prometheus data source with any name
- Point it to http://any-node-IP:9090
- Set access to "direct" and leave credentials blank
- Click on the top-left menu, highlight "Dashboards" and select the "Import" option
- Copy-paste [this JSON payload](
https://raw.githubusercontent.com/alexmavr/swarm-nbt/master/grafana.json),
then use the Prometheus Data Source defined before
- Poke around the dashboard that magically appeared!
]

184
docs/swarmtools.md Normal file
View File

@@ -0,0 +1,184 @@
# SwarmKit debugging tools
- The SwarmKit repository comes with debugging tools
- They are *low level* tools; not for general use
- We are going to see two of these tools:
- `swarmctl`, to communicate directly with the SwarmKit API
- `swarm-rafttool`, to inspect the content of the Raft log
---
## Building the SwarmKit tools
- We are going to install a Go compiler, then download the SwarmKit source and build it
.exercise[
- Download, compile, and install SwarmKit with this one-liner:
```bash
docker run -v /usr/local/bin:/go/bin golang \
go get `-v` github.com/docker/swarmkit/...
```
]
Remove `-v` if you don't like verbose things.
Shameless promo: for more Go and Docker love, check
[this blog post](http://jpetazzo.github.io/2016/09/09/go-docker/)!
Note: in the unfortunate event of SwarmKit *master* branch being broken,
the build might fail. In that case, just skip the Swarm tools section.
---
## Getting cluster-wide task information
- The Docker API doesn't expose this directly (yet)
- But the SwarmKit API does
- We are going to query it with `swarmctl`
- `swarmctl` is an example program showing how to
interact with the SwarmKit API
---
## Using `swarmctl`
- The Docker Engine places the SwarmKit control socket in a special path
- You need root privileges to access it
.exercise[
- If you are using Play-With-Docker, set the following alias:
```bash
alias swarmctl='/lib/ld-musl-x86_64.so.1 /usr/local/bin/swarmctl \
--socket /var/run/docker/swarm/control.sock'
```
- Otherwise, set the following alias:
```bash
alias swarmctl='sudo swarmctl \
--socket /var/run/docker/swarm/control.sock'
```
]
---
## `swarmctl` in action
- Let's review a few useful `swarmctl` commands
.exercise[
- List cluster nodes (that's equivalent to `docker node ls`):
```bash
swarmctl node ls
```
- View all tasks across all services:
```bash
swarmctl task ls
```
]
---
## `swarmctl` notes
- SwarmKit is vendored into the Docker Engine
- If you want to use `swarmctl`, you need the exact version of
SwarmKit that was used in your Docker Engine
- Otherwise, you might get some errors like:
```
Error: grpc: failed to unmarshal the received message proto: wrong wireType = 0
```
- With Docker 1.12, the control socket was in `/var/lib/docker/swarm/control.sock`
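
Since the socket location depends on the Engine version, a script can probe for it. This is a sketch with a hypothetical helper name; it probably needs to run in a root shell, because the directory holding the socket may not be accessible otherwise.

```bash
# Hypothetical helper: find_control_sock [PATH...]
# Prints the first path that exists and is a Unix socket.
# With no arguments, it tries the two locations mentioned in these slides.
find_control_sock() {
  if [ "$#" -eq 0 ]; then
    set -- /var/run/docker/swarm/control.sock \
           /var/lib/docker/swarm/control.sock
  fi
  for fcs_path in "$@"; do
    if [ -S "$fcs_path" ]; then
      echo "$fcs_path"
      return 0
    fi
  done
  return 1
}
```

Usage (as root): `swarmctl --socket "$(find_control_sock)" node ls`.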
---
## `swarm-rafttool`
- SwarmKit stores all its important data in a distributed log using the Raft protocol
(This log is also simply called the "Raft log")
- You can decode that log with `swarm-rafttool`
- This is a great tool to understand how SwarmKit works
- It can also be used in forensics or troubleshooting
(But consider it a *very low level* tool!)
---
## The powers of `swarm-rafttool`
With `swarm-rafttool`, you can:
- view the latest snapshot of the cluster state;
- view the Raft log (i.e. changes to the cluster state);
- view specific objects from the log or snapshot;
- decrypt the Raft data (to analyze it with other tools).
It *cannot* work on live files, so you must stop Docker or make a copy first.
---
## Using `swarm-rafttool`
- First, let's make a copy of the current Swarm data
.exercise[
- If you are using Play-With-Docker, the Docker data directory is `/graph`:
```bash
cp -r /graph/swarm /swarmdata
```
- Otherwise, it is in the default `/var/lib/docker`:
```bash
sudo cp -r /var/lib/docker/swarm /swarmdata
```
]
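
The two cases can be folded into one function that picks whichever data directory exists. A sketch with a hypothetical helper name; outside Play-With-Docker, run it as root, and make sure the destination does not exist yet (otherwise `cp -r` copies into it).

```bash
# Hypothetical helper: copy_swarm_data DEST [DOCKER_DATA_DIR...]
# Copies the "swarm" subdirectory of the first data directory that has one.
# With no data directory given, it tries /graph, then /var/lib/docker.
copy_swarm_data() {
  csd_dest=$1
  shift
  if [ "$#" -eq 0 ]; then
    set -- /graph /var/lib/docker
  fi
  for csd_root in "$@"; do
    if [ -d "$csd_root/swarm" ]; then
      # The function's exit status is the exit status of cp.
      cp -r "$csd_root/swarm" "$csd_dest"
      return
    fi
  done
  echo "no swarm directory found" >&2
  return 1
}
```

Usage: `copy_swarm_data /swarmdata`.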
---
## Dumping the Raft log
- We have to indicate the path holding the Swarm data
(Otherwise `swarm-rafttool` will try to use the live data, and complain that it's locked!)
.exercise[
- If you are using Play-With-Docker, you must use the musl linker:
```bash
/lib/ld-musl-x86_64.so.1 /usr/local/bin/swarm-rafttool -d /swarmdata/ dump-wal
```
- Otherwise, you don't need the musl linker but you need to get root:
```bash
sudo swarm-rafttool -d /swarmdata/ dump-wal
```
]
Reminder: this is a very low-level tool, requiring knowledge of SwarmKit's internals!
