diff --git a/slides/k8s/kubectlexpose.md b/slides/k8s/kubectlexpose.md
index 37e385d1..9234e574 100644
--- a/slides/k8s/kubectlexpose.md
+++ b/slides/k8s/kubectlexpose.md
@@ -18,9 +18,52 @@

 ---

+## ⚠️ Heads up!
+
+- We're going to connect directly to pods and services, using internal addresses
+
+- This will only work:
+
+  - if you're attending a live class with our special lab environment
+
+  - or if you're using our dev containers within Codespaces
+
+- If you're using a "normal" Kubernetes cluster (including minikube, KinD, etc.):
+
+  *you will not be able to access these internal addresses directly!*
+
+- In that case, we suggest that you run an interactive container, e.g.:
+  ```bash
+  kubectl run --rm -ti --image=archlinux myshell
+  ```
+
+- ...And each time you see a `curl` or `ping` command, run it in that container instead
+
+---
+
+class: extra-details
+
+## But, why?
+
+- Internal addresses are only reachable from within the cluster
+
+  (=from a pod, or when logged directly inside a node)
+
+- Our special lab environments and our dev containers let us do it anyways
+
+  (because it's nice and convenient when learning Kubernetes)
+
+- But that doesn't work on "normal" Kubernetes clusters
+
+- Instead, we can use [`kubectl port-forward`][kubectl-port-forward] on these clusters
+
+[kubectl-port-forward]: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_port-forward/
+
+---
+
 ## Running containers with open ports

-- Since `ping` doesn't have anything to connect to, we'll have to run something else
+- Let's run a small web server in a container

 - We are going to use `jpetazzo/color`, a tiny HTTP server written in Go

@@ -68,7 +111,7 @@

 - Send an HTTP request to the Pod:
   ```bash
-  curl http://`IP-ADDRESSS`
+  curl http://`IP-ADDRESS`
   ```

 ]
@@ -77,25 +120,6 @@ You should see a response from the Pod.

 ---

-class: extra-details
-
-## Running with a local cluster
-
-If you're running with a local cluster (Docker Desktop, KinD, minikube...),
-you might get a connection timeout (or a message like "no route to host")
-because the Pod isn't reachable directly from your local machine.
-
-In that case, you can test the connection to the Pod by running a shell
-*inside* the cluster:
-
-```bash
-kubectl run -it --rm my-test-pod --image=fedora
-```
-
-Then run `curl` in that Pod.
-
----
-
 ## The Pod doesn't have a "stable identity"

 - The IP address that we used above isn't "stable"

@@ -171,7 +195,7 @@ class: extra-details

   (i.e. a service is not just an IP address; it's an IP address + protocol + port)

 - As a result: you *have to* indicate the port number for your service
-
+  (with some exceptions, like `ExternalName` or headless services, covered later)

 ---

@@ -210,7 +234,7 @@ class: extra-details

 - Keep sending requests to the Service address:
   ```bash
-  while sleep 0.3; do curl http://$CLUSTER_IP; done
+  while sleep 0.3; do curl -m1 http://$CLUSTER_IP; done
   ```

 - Meanwhile, delete the Pod:
@@ -224,6 +248,8 @@ class: extra-details

 - ...But requests will keep flowing after that (without requiring a manual intervention)

+- The `-m1` option is here to specify a 1-second timeout
+
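+- As a side note, we can also watch the Service's endpoints while the Pod gets
+  deleted and recreated (assuming here that our Service is named `blue`;
+  adjust the name if yours is different):
+  ```bash
+  kubectl get endpoints blue --watch
+  ```
+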
 ---

 ## Load balancing

@@ -262,7 +288,7 @@ class: extra-details

 - Get a shell in a Pod:
   ```bash
-  kubectl run --rm -it --image=fedora test-dns-integration
+  kubectl run --rm -it --image=archlinux test-dns-integration
   ```

 - Try to resolve the `blue` Service from the Pod:
@@ -278,21 +304,73 @@ class: extra-details

 ## Under the hood...

-- Check the content of `/etc/resolv.conf` inside a Pod
+- Let's check the content of `/etc/resolv.conf` inside a Pod

-- It will have `nameserver X.X.X.X` (e.g. 10.96.0.10)
+- It should look approximately like this:
+  ```
+  search default.svc.cluster.local svc.cluster.local cluster.local ...
+  nameserver 10.96.0.10
+  options ndots:5
+  ```

-- Now check `kubectl get service kube-dns --namespace=kube-system`
+- Let's break down what these lines mean...

-- ...It's the same address! 😉
+---

-- The FQDN of a service is actually:
+class: extra-details

-  `<service-name>.<namespace>.svc.<cluster-domain>`
+## `nameserver 10.96.0.10`

-- `<cluster-domain>` defaults to `cluster.local`
+- This is the address of the DNS server used by programs running in the Pod

-- And the `search` includes `<namespace>.svc.<cluster-domain>`
+- The exact address might be different
+
+  (this one is the default one when setting up a cluster with `kubeadm`)
+
+- This address will correspond to a Service on our cluster
+
+- Check what we have in `kube-system`:
+  ```bash
+  kubectl get services --namespace=kube-system
+  ```
+
+- There will typically be a service named `kube-dns` with that exact address
+
+  (that's Kubernetes' internal DNS service!)
+
+---
+
+class: extra-details
+
+## `search default.svc.cluster.local ...`
+
+- This is the "search list"
+
+- When a program tries to resolve `foo`, the resolver will try to resolve:
+
+  `foo.default.svc.cluster.local` (if the Pod is in the `default` Namespace)
+
+  `foo.svc.cluster.local`
+
+  `foo.cluster.local`
+
+  ...(the other entries in the search list)...
+
+  `foo`
+
+- As a result, if there is a Service named `foo` in the Pod's Namespace, we obtain its address!
+
+---
+
+class: extra-details
+
+## Do You Want To Know More?
+
+- If you want even more details about DNS resolution on Kubernetes and Linux...
+
+  check [this blog post][dnsblog]!
+
+[dnsblog]: https://jpetazzo.github.io/2024/05/12/understanding-kubernetes-dns-hostnetwork-dnspolicy-dnsconfigforming/

 ---
diff --git a/slides/k8s/kubenet.md b/slides/k8s/kubenet.md
index abbc18ca..9c9c7435 100644
--- a/slides/k8s/kubenet.md
+++ b/slides/k8s/kubenet.md
@@ -8,17 +8,17 @@

 - In detail:

-  - all nodes must be able to reach each other, without NAT
+  - all nodes can reach each other directly (without NAT)

-  - all pods must be able to reach each other, without NAT
+  - all pods can reach each other directly (without NAT)

-  - pods and nodes must be able to reach each other, without NAT
+  - pods and nodes can reach each other directly (without NAT)

-  - each pod is aware of its IP address (no NAT)
+  - each pod is aware of its IP address (again: no NAT)

-  - pod IP addresses are assigned by the network implementation
+- Most Kubernetes clusters rely on the CNI to configure Pod networking

-- Kubernetes doesn't mandate any particular implementation
+  (allocate IP addresses, create and configure network interfaces, routing...)

 ---

@@ -32,13 +32,15 @@

 - No new protocol

-- The network implementation can decide how to allocate addresses
+- IP addresses are allocated by the network stack, not by the users

-- IP addresses don't have to be "portable" from a node to another
+  (this avoids complex constraints associated with address portability)

-  (We can use e.g. a subnet per node and use a simple routed topology)
+- CNI is very flexible and lends itself to many different models

-- The specification is simple enough to allow many various implementations
+  (switching, routing, tunneling... virtually anything is possible!)
+
+- Example: we could have one subnet per node and use a simple routed topology
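+
+  (purely as an illustration, with made-up addresses, the routing table of a
+  node could then look like this:)
+  ```
+  10.0.1.0/24 dev cni0              # pods hosted on this node
+  10.0.2.0/24 via 192.168.0.12      # pods hosted on node 2
+  10.0.3.0/24 via 192.168.0.13      # pods hosted on node 3
+  ```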

 ---

@@ -46,11 +48,11 @@

 - Everything can reach everything

-  - if you want security, you need to add network policies
+  - if we want network isolation, we need to add network policies

-  - the network implementation that you use needs to support them
+  - some clusters (like AWS EKS) don't include a network policy controller out of the box

-- There are literally dozens of implementations out there
+- There are literally dozens of Kubernetes network implementations out there

   (https://github.com/containernetworking/cni/ lists more than 25 plugins)

@@ -58,67 +60,73 @@

   (Services map to a single UDP or TCP port; no port ranges or arbitrary IP packets)

-- `kube-proxy` is on the data path when connecting to a pod or container,
-  and it's not particularly fast (relies on userland proxying or iptables)
+- The default Kubernetes service proxy, `kube-proxy`, doesn't scale very well
+
+  (although this is improved considerably in [recent versions of kube-proxy][tables-have-turned])
+
+[tables-have-turned]: https://www.youtube.com/watch?v=yOGHb2HjslY
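+
+- To see which mode `kube-proxy` uses on a given cluster, one option (on clusters
+  installed with `kubeadm`; other installers may store this differently) is to look
+  at its ConfigMap:
+  ```bash
+  kubectl get configmap kube-proxy --namespace=kube-system -o yaml | grep "mode:"
+  ```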

 ---

 ## Kubernetes network model: in practice

-- The nodes that we are using have been set up to use [Weave](https://github.com/weaveworks/weave)
+- We don't need to worry about networking in local development clusters

-- We don't endorse Weave in a particular way, it just Works For Us
+  (it's set up automatically for us and we almost never need to change anything)

-- Don't worry about the warning about `kube-proxy` performance
+- We also don't need to worry about it in managed clusters

-- Unless you:
+  (except if we want to reconfigure or replace whatever was installed automatically)

-  - routinely saturate 10G network interfaces
-  - count packet rates in millions per second
-  - run high-traffic VOIP or gaming platforms
-  - do weird things that involve millions of simultaneous connections
-    (in which case you're already familiar with kernel tuning)
+- We *do* need to pick a network stack in all other scenarios:

-- If necessary, there are alternatives to `kube-proxy`; e.g.
-  [`kube-router`](https://www.kube-router.io)
+  - installing Kubernetes on bare metal or on "raw" virtual machines
+
+  - when we manage the control plane ourselves
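+
+- Tip: to find out what an existing cluster uses, we can often look at the
+  DaemonSets running in `kube-system` (or at the CNI configuration directory
+  on a node, usually `/etc/cni/net.d`):
+  ```bash
+  kubectl get daemonsets --namespace=kube-system
+  ```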

 ---

-class: extra-details
+## Which network stack should we use?

-## The Container Network Interface (CNI)
+*It depends!*

-- Most Kubernetes clusters use CNI "plugins" to implement networking
+- [Weave] = super easy to install, no config needed, low footprint...
+
+  *but it's not maintained anymore, alas!*

-- When a pod is created, Kubernetes delegates the network setup to these plugins
+- [Cilium] = very powerful and flexible, some consider it "best in class"...

-  (it can be a single plugin, or a combination of plugins, each doing one task)
+  *but it's based on eBPF, which might make troubleshooting challenging!*

-- Typically, CNI plugins will:
+- Other solid choices include [Calico], [Flannel], [kube-router]

-  - allocate an IP address (by calling an IPAM plugin)
+- And of course, some cloud providers / network vendors have their own solutions

-  - add a network interface into the pod's network namespace
+  (which may or may not be appropriate for your use-case!)

-  - configure the interface as well as required routes etc.
+- Do you want speed? Reliability? Security? Observability?
+
+[Weave]: https://github.com/weaveworks/weave
+[Cilium]: https://cilium.io/
+[Calico]: https://docs.tigera.io/calico/latest/about/
+[Flannel]: https://github.com/flannel-io/flannel
+[kube-router]: https://www.kube-router.io/

 ---

-class: extra-details
-
 ## Multiple moving parts

-- The "pod-to-pod network" or "pod network":
+- The "pod-to-pod network" or "pod network" or "CNI":

   - provides communication between pods and nodes

   - is generally implemented with CNI plugins

-- The "pod-to-service network":
+- The "pod-to-service network" or "Kubernetes service proxy":

   - provides internal communication and load balancing

-  - is generally implemented with kube-proxy (or e.g. kube-router)
+  - implemented with kube-proxy by default

 - Network policies:
diff --git a/slides/k8s/service-types.md b/slides/k8s/service-types.md
index c4c848d9..c83fcff9 100644
--- a/slides/k8s/service-types.md
+++ b/slides/k8s/service-types.md
@@ -61,6 +61,8 @@ class: pic

 - This is available only when the underlying infrastructure provides some kind of "load balancer as a service"

+  (or in some special cases with add-ons like [MetalLB])
+
 - Each service of that type will typically cost a little bit of money

   (e.g. a few cents per hour on AWS or GCE)
@@ -69,6 +71,8 @@ class: pic

 - In practice, it will often flow through a `NodePort` first

+[MetalLB]: https://metallb.io/
+
 ---

 class: pic
@@ -163,11 +167,13 @@ class: pic

 - Our code needs to be changed to connect to that new port number

-- Under the hood: `kube-proxy` sets up a bunch of `iptables` rules on our nodes
+- Under the hood: `kube-proxy` sets up a bunch of port forwarding rules on our nodes

-- Sometimes, it's the only available option for external traffic
+  (using `iptables`, `ipvs`, `nftables`... multiple implementations are available)

-  (e.g. most clusters deployed with kubeadm or on-premises)
+- Very useful option for external traffic when `LoadBalancer` Services aren't available
+
+  (e.g. some clusters deployed on-premises and/or with kubeadm)

 ---
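+
+class: extra-details
+
+## Example: creating a `NodePort` Service
+
+- As a quick sketch (the Deployment name `blue` below is just an example),
+  we could expose a Deployment with a `NodePort` Service and check which
+  port was allocated:
+  ```bash
+  kubectl expose deployment blue --type=NodePort --port=80
+  kubectl get service blue
+  ```
+
+- By default, the allocated node port is in the 30000-32767 range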