🏭️ Refactor Services sections

Make the content suitable to both live classes and recorded content
Jérôme Petazzoni
2025-12-14 19:22:42 -06:00
parent 01b2456e03
commit 93ad45da9b
3 changed files with 167 additions and 75 deletions

View File

@@ -18,9 +18,52 @@
---
## ⚠️ Heads up!
- We're going to connect directly to pods and services, using internal addresses
- This will only work:
  - if you're attending a live class with our special lab environment
  - or if you're using our dev containers within Codespaces
- If you're using a "normal" Kubernetes cluster (including minikube, KinD, etc.):
*you will not be able to access these internal addresses directly!*
- In that case, we suggest that you run an interactive container, e.g.:
```bash
kubectl run --rm -ti --image=archlinux myshell
```
- ...And each time you see a `curl` or `ping` command, run it in that container instead
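- For example, once that container is running, a `curl` step looks like this (the address below is just a placeholder):
```bash
# At the shell prompt inside the "myshell" container, replace X.X.X.X with
# whatever pod or service address the instructions refer to:
curl http://X.X.X.X
```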
---
class: extra-details
## But, why?
- Internal addresses are only reachable from within the cluster
(i.e. from a pod, or when logged in directly on a node)
- Our special lab environments and our dev containers let us do it anyway
(because it's nice and convenient when learning Kubernetes)
- But that doesn't work on "normal" Kubernetes clusters
- Instead, we can use [`kubectl port-forward`][kubectl-port-forward] on these clusters
[kubectl-port-forward]: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_port-forward/
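- For instance, a minimal sketch (the Service name and ports are just an example; `blue` is the Service used later in this section):
```bash
# Forward local port 8080 to port 80 of the "blue" Service
kubectl port-forward service/blue 8080:80
# In another terminal, connect through the forwarded port:
curl http://localhost:8080
```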
---
## Running containers with open ports
- Since `ping` doesn't have anything to connect to, we'll have to run something else
- Let's run a small web server in a container
- We are going to use `jpetazzo/color`, a tiny HTTP server written in Go
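- For instance, one possible way to start it and find its address (a sketch; the exact commands in the following slides may differ):
```bash
# Run a single Pod with the jpetazzo/color image (it listens on port 80)
kubectl run color --image=jpetazzo/color
# Show the Pod's IP address (see the IP column)
kubectl get pod color -o wide
```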
@@ -68,7 +111,7 @@
- Send an HTTP request to the Pod:
```bash
curl http://`IP-ADDRESS`
```
]
@@ -77,25 +120,6 @@ You should see a response from the Pod.
---
class: extra-details
## Running with a local cluster
If you're running with a local cluster (Docker Desktop, KinD, minikube...),
you might get a connection timeout (or a message like "no route to host")
because the Pod isn't reachable directly from your local machine.
In that case, you can test the connection to the Pod by running a shell
*inside* the cluster:
```bash
kubectl run -it --rm my-test-pod --image=fedora
```
Then run `curl` in that Pod.
---
## The Pod doesn't have a "stable identity"
- The IP address that we used above isn't "stable"
@@ -171,7 +195,7 @@ class: extra-details
(i.e. a service is not just an IP address; it's an IP address + protocol + port)
- As a result: you *have to* indicate the port number for your service
(with some exceptions, like `ExternalName` or headless services, covered later)
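- For example, a minimal sketch (assuming a Deployment named `blue` already exists):
```bash
# Create a ClusterIP Service; --port is mandatory,
# since a Service is an IP address + protocol + port
kubectl expose deployment blue --port=80
```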
---
@@ -210,7 +234,7 @@ class: extra-details
- Keep sending requests to the Service address:
```bash
while sleep 0.3; do curl -m1 http://$CLUSTER_IP; done
```
- Meanwhile, delete the Pod:
@@ -224,6 +248,8 @@ class: extra-details
- ...But requests will keep flowing after that (without requiring manual intervention)
- The `-m1` option specifies a 1-second timeout
---
## Load balancing
@@ -262,7 +288,7 @@ class: extra-details
- Get a shell in a Pod:
```bash
kubectl run --rm -it --image=archlinux test-dns-integration
```
- Try to resolve the `blue` Service from the Pod:
@@ -278,21 +304,73 @@ class: extra-details
## Under the hood...
- Let's check the content of `/etc/resolv.conf` inside a Pod
- It should look approximately like this:
```
search default.svc.cluster.local svc.cluster.local cluster.local ...
nameserver 10.96.0.10
options ndots:5
```
- Let's break down what these lines mean...
---
class: extra-details
## `nameserver 10.96.0.10`
- This is the address of the DNS server used by programs running in the Pod
- The exact address might be different
(this one is the default when setting up a cluster with `kubeadm`)
- This address will correspond to a Service on our cluster
- Check what we have in `kube-system`:
```bash
kubectl get services --namespace=kube-system
```
- There will typically be a service named `kube-dns` with that exact address
(that's Kubernetes' internal DNS service!)
---
class: extra-details
## `search default.svc.cluster.local ...`
- This is the "search list"
- When a program tries to resolve `foo`, the resolver will try to resolve:
  - `foo.default.svc.cluster.local` (if the Pod is in the `default` Namespace)
  - `foo.svc.cluster.local`
  - `foo.cluster.local`
  - ...(the other entries in the search list)...
  - `foo`
- As a result, if there is a Service named `foo` in the Pod's Namespace, we obtain its address!
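- To see the search list in action, here's a quick sketch (assuming a Service named `blue` in the `default` Namespace):
```bash
# Run these inside a Pod (getent is part of glibc, so it's available in most images)
getent hosts blue                            # short name, resolved via the search list
getent hosts blue.default.svc.cluster.local  # fully qualified name, same address
```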
---
class: extra-details
## Do You Want To Know More?
- If you want even more details about DNS resolution on Kubernetes and Linux...
check [this blog post][dnsblog]!
[dnsblog]: https://jpetazzo.github.io/2024/05/12/understanding-kubernetes-dns-hostnetwork-dnspolicy-dnsconfigforming/
---

View File

@@ -8,17 +8,17 @@
- In detail:
  - all nodes can reach each other directly (without NAT)
  - all pods can reach each other directly (without NAT)
  - pods and nodes can reach each other directly (without NAT)
  - each pod is aware of its IP address (again: no NAT)
  - pod IP addresses are assigned by the network implementation
- Most Kubernetes clusters rely on the CNI to configure Pod networking
(allocate IP addresses, create and configure network interfaces, routing...)
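- One way to observe this (a sketch; replace POD-NAME with one of your Pods):
```bash
# The IP column shows the address assigned by the network implementation
kubectl get pods -o wide
# The Pod sees that same address from the inside (no NAT)
kubectl exec POD-NAME -- hostname -i
```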
---
@@ -32,13 +32,15 @@
- No new protocol
- IP addresses are allocated by the network stack, not by the users
- IP addresses don't have to be "portable" from a node to another
(this avoids complex constraints associated with address portability)
- CNI is very flexible and lends itself to many different models
(switching, routing, tunneling... virtually anything is possible!)
- Example: we could have one subnet per node and use a simple routed topology
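- As an illustration of that last point, a hypothetical routed setup (all addresses below are made up):
```bash
# Say node1 owns the pod subnet 10.244.1.0/24 and node2 owns 10.244.2.0/24;
# on node1, a single static route is enough to reach the pods on node2:
ip route add 10.244.2.0/24 via 192.168.0.12   # 192.168.0.12 = node2's address
# Packets keep their pod addresses end to end: no NAT involved
```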
---
@@ -46,11 +48,11 @@
- Everything can reach everything
  - if we want network isolation, we need to add network policies (minimal sketch below)
  - the network implementation that we use needs to support them
  - some clusters (like AWS EKS) don't include a network policy controller out of the box
- There are literally dozens of Kubernetes network implementations out there
(https://github.com/containernetworking/cni/ lists more than 25 plugins)
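- A minimal sketch of such a policy (a hypothetical deny-all-ingress policy; it only takes effect if the network implementation supports policies):
```bash
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}   # selects every Pod in the current Namespace
  policyTypes:
  - Ingress         # no ingress rules listed = all inbound traffic is denied
EOF
```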
@@ -58,67 +60,73 @@
(Services map to a single UDP or TCP port; no port ranges or arbitrary IP packets)
- The default Kubernetes service proxy, `kube-proxy`, doesn't scale very well
(although this is improved considerably in [recent versions of kube-proxy][tables-have-turned])
[tables-have-turned]: https://www.youtube.com/watch?v=yOGHb2HjslY
---
## Kubernetes network model: in practice
- We don't need to worry about networking in local development clusters
(it's set up automatically for us and we almost never need to change anything)
- We also don't need to worry about it in managed clusters
(except if we want to reconfigure or replace whatever was installed automatically)
- We *do* need to pick a network stack in all other scenarios:
  - installing Kubernetes on bare metal or on "raw" virtual machines
  - when we manage the control plane ourselves
---
class: extra-details
## The Container Network Interface (CNI)
- Most Kubernetes clusters use CNI "plugins" to implement networking
- When a pod is created, Kubernetes delegates the network setup to these plugins
(it can be a single plugin, or a combination of plugins, each doing one task)
- Typically, CNI plugins will:
  - allocate an IP address (by calling an IPAM plugin)
  - add a network interface into the pod's network namespace
  - configure the interface as well as required routes etc.
---
class: extra-details
## Which network stack should we use?
*It depends!*
- [Weave] = super easy to install, no config needed, low footprint...
*but it's not maintained anymore, alas!*
- [Cilium] = very powerful and flexible, some consider it "best in class"...
*but it's based on eBPF, which might make troubleshooting challenging!*
- Other solid choices include [Calico], [Flannel], [kube-router]
- And of course, some cloud providers / network vendors have their own solutions
(which may or may not be appropriate for your use-case!)
- Do you want speed? Reliability? Security? Observability?
[Weave]: https://github.com/weaveworks/weave
[Cilium]: https://cilium.io/
[Calico]: https://docs.tigera.io/calico/latest/about/
[Flannel]: https://github.com/flannel-io/flannel
[kube-router]: https://www.kube-router.io/
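- To find out which network stack an existing cluster uses, one place to look is the CNI configuration on a node (a sketch; exact file names vary from one implementation to another):
```bash
# CNI configuration files typically live in this directory on each node
ls /etc/cni/net.d/
# Each file describes the chain of plugins used to set up pod networking
cat /etc/cni/net.d/*.conflist
```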
---
class: extra-details
## Multiple moving parts
- The "pod-to-pod network" or "pod network":
- The "pod-to-pod network" or "pod network" or "CNI":
- provides communication between pods and nodes
- is generally implemented with CNI plugins
- The "pod-to-service network":
- The "pod-to-service network" or "Kubernetes service proxy":
- provides internal communication and load balancing
- is generally implemented with kube-proxy (or e.g. kube-router)
- implemented with kube-proxy by default
- Network policies:

View File

@@ -61,6 +61,8 @@ class: pic
- This is available only when the underlying infrastructure provides some kind of
"load balancer as a service"
(or in some special cases with add-ons like [MetalLB])
- Each service of that type will typically cost a little bit of money
(e.g. a few cents per hour on AWS or GCE)
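- For example, a sketch on a cloud cluster (assuming a Deployment named `blue`):
```bash
# Create a LoadBalancer Service (this provisions a cloud load balancer)
kubectl expose deployment blue --port=80 --type=LoadBalancer
# Wait for an address to show up in the EXTERNAL-IP column
kubectl get service blue --watch
```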
@@ -69,6 +71,8 @@ class: pic
- In practice, it will often flow through a `NodePort` first
[MetalLB]: https://metallb.io/
---
class: pic
@@ -163,11 +167,13 @@ class: pic
- Our code needs to be changed to connect to that new port number
- Under the hood: `kube-proxy` sets up a bunch of port forwarding rules on our nodes
(using `iptables`, `ipvs`, `nftables`... multiple implementations are available)
- Very useful option for external traffic when `LoadBalancer` Services aren't available
(e.g. some clusters deployed on-premises and/or with kubeadm)
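- If you're curious, you can inspect those rules on a node (a sketch, assuming `kube-proxy` runs in its default `iptables` mode):
```bash
# Rules dispatching NodePort traffic to the right backends
sudo iptables -t nat -L KUBE-NODEPORTS -n
# Rules handling ClusterIP traffic
sudo iptables -t nat -L KUBE-SERVICES -n | head
```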
---