container.training/slides/namespaces.md
Jérôme Petazzoni 078023058b docs -> slides
2017-11-03 18:31:06 -07:00

class: namespaces name: namespaces

Improving isolation with User Namespaces

  • Namespaces are kernel mechanisms to compartmentalize the system

  • There are different kinds of namespaces: pid, net, mnt, ipc, uts, and user

  • For a primer, see "Anatomy of a Container" (video) (slides)

  • The user namespace makes it possible to map UIDs between the containers and the host

  • As a result, root in a container can map to a non-privileged user on the host

Note: even without user namespaces, root in a container cannot go wild on the host.
It is mediated by capabilities, cgroups, namespaces, seccomp, LSMs...
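
To make the mapping concrete, here is a minimal sketch that assumes a single contiguous mapping in the format of /proc/PID/uid_map (start inside the namespace, start on the host, length). The function name map_uid and the sample range are illustrative and not part of Docker.

```shell
# map_uid UID INSIDE_START OUTSIDE_START LENGTH
# Translate a UID seen inside a user namespace to the UID seen on the host,
# for one mapping line in the format of /proc/PID/uid_map.
map_uid() {
  uid=$1; inside=$2; outside=$3; length=$4
  if [ "$uid" -lt "$inside" ] || [ "$uid" -ge $((inside + length)) ]; then
    echo "unmapped"
    return 1
  fi
  echo $((outside + uid - inside))
}

# Hypothetical mapping: container UIDs 0..65535 -> host UIDs 296608..362143
map_uid 0 0 296608 65536      # root in the container: prints 296608
map_uid 1000 0 296608 65536   # an ordinary container user: prints 297608
```

In this sketch, root in the container (UID 0) is just UID 296608 on the host, which has no special privileges there.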


class: namespaces

User Namespaces in Docker

  • Optional feature added in Docker Engine 1.10

  • Not enabled by default

  • Has to be enabled at Engine startup, and affects all containers

  • When enabled, UID:GID in containers are mapped to a different range on the host

  • Safer than switching to a non-root user (with -u or USER) in the container
    (with user namespaces, even a process that escalates to root in the container maps to a non-privileged user on the host)

  • Can be selectively disabled per container by starting them with --userns=host
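
When remapping is set to "default", the Engine uses a dedicated dockremap user and takes the host range from that user's entries in /etc/subuid and /etc/subgid. The following is a minimal sketch for reading such an entry. The function name subid_range and the sample line are illustrative, and the actual numbers differ from host to host.

```shell
# subid_range NAME [FILE]
# Print "START LENGTH" for NAME from a file in /etc/subuid format
# (one "name:start:count" entry per line). FILE defaults to /etc/subuid.
subid_range() {
  awk -F: -v name="$1" '$1 == name { print $2, $3; exit }' "${2:-/etc/subuid}"
}

# Illustrative entry; on a real host, run: subid_range dockremap
sample=$(mktemp)
echo 'dockremap:296608:65536' > "$sample"
subid_range dockremap "$sample"   # prints: 296608 65536
rm -f "$sample"
```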


class: namespaces

User Namespaces Caveats

When user namespaces are enabled, containers cannot:

  • Use the host's network namespace (with docker run --network=host)

  • Use the host's PID namespace (with docker run --pid=host)

  • Run in privileged mode (with docker run --privileged)

... unless user namespaces are disabled for that container with the --userns=host flag

External volume and graph drivers that don't support user mapping might not work.

All containers are currently mapped to the same UID:GID range.

Some of these limitations might be lifted in the future!
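
As a rough illustration of these rules, the sketch below checks a list of docker run arguments for the conflicting options. The helper name needs_userns_host is hypothetical. It only recognizes the --flag=value spellings and does not distinguish Engine options from arguments passed to the container, so treat it as a starting point rather than a complete check.

```shell
# needs_userns_host ARGS...
# Succeeds (exit 0) if the given "docker run" arguments use an option that
# conflicts with user namespace remapping and do not already opt out.
needs_userns_host() {
  conflict=no
  for arg in "$@"; do
    case "$arg" in
      --userns=host) return 1 ;;
      --network=host|--net=host|--pid=host|--privileged) conflict=yes ;;
    esac
  done
  [ "$conflict" = yes ]
}

if needs_userns_host --pid=host alpine ps; then
  echo "add --userns=host to this command"
fi
```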


class: namespaces

Filesystem ownership details

When enabling user namespaces:

  • the UID:GID on disk (in the images and containers) has to match the mapped UID:GID

  • existing images and containers cannot work (their UID:GID would have to be changed)

For practical reasons, when enabling user namespaces, the Docker Engine places containers and images (and everything else) in a different directory.

As a result, if you enable user namespaces on an existing installation:

  • all containers and images (and e.g. Swarm data) disappear

  • if a node is a member of a Swarm, it is then kicked out of the Swarm

  • everything will re-appear if you disable user namespaces again
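
The "different directory" is a subdirectory of the usual data directory, named after the remapped UID and GID. A minimal sketch of how that path is formed, assuming the default /var/lib/docker location; the helper name remapped_root is hypothetical, and the sample UID:GID is the one used in the listing later in these slides.

```shell
# remapped_root UID GID [BASE]
# Print the directory used by the Engine when user namespaces are enabled.
# BASE defaults to /var/lib/docker.
remapped_root() {
  echo "${3:-/var/lib/docker}/$1.$2"
}

remapped_root 296608 296608   # prints /var/lib/docker/296608.296608
```

The previous content stays in place next to that subdirectory, which is why it re-appears once user namespaces are disabled.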


class: namespaces

Picking a node

  • We will select a node where we will enable user namespaces

  • This node will have to be re-added to the Swarm

  • All containers and services running on this node will be rescheduled

  • Let's make sure that we do not pick the node running the registry!

.exercise[

  • Check on which node the registry is running:
    docker service ps registry
    

]

Pick any other node (referred to as nodeX in the next slides).


class: namespaces

Logging into the right Engine

.exercise[

  • Log into the right node:
    ssh node`X`
    

]


class: namespaces

Configuring the Engine

.exercise[

  • Create a configuration file for the Engine:

    echo '{"userns-remap": "default"}' | sudo tee /etc/docker/daemon.json
    
  • Restart the Engine:

    sudo kill $(pidof dockerd)
    

]
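
Note that writing the file with tee replaces any existing daemon.json. If the node already has Engine settings, a sketch like the following keeps a copy of the old file first. The function name write_userns_config and the .bak suffix are illustrative choices. It does not merge settings, and on a real node it would need to run as root against /etc/docker/daemon.json.

```shell
# write_userns_config FILE
# Write a daemon.json enabling userns-remap, keeping a .bak copy of any
# existing file. This replaces the file; it does not merge settings.
write_userns_config() {
  file=$1
  if [ -e "$file" ]; then
    cp -p "$file" "$file.bak"
  fi
  echo '{"userns-remap": "default"}' > "$file"
}

# Demonstration on a scratch file instead of /etc/docker/daemon.json
demo=$(mktemp)
echo '{"debug": true}' > "$demo"
write_userns_config "$demo"
cat "$demo"        # the new configuration
cat "$demo.bak"    # the previous configuration
rm -f "$demo" "$demo.bak"
```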


class: namespaces

Checking that User Namespaces are enabled

.exercise[

  • Notice the new Docker path:

    docker info | grep var/lib
    
  • Notice the new UID:GID permissions:

    sudo ls -l /var/lib/docker
    

]

You should see a line like the following:

drwx------ 11 296608 296608 4096 Aug  3 05:11 296608.296608
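
Another way to check is to look at the security options reported by the Engine. The sketch below assumes that the Engine lists name=userns there when remapping is active; verify this against the output of your version. The helper name has_userns and the sample value are illustrative.

```shell
# has_userns
# Reads the output of: docker info --format '{{json .SecurityOptions}}'
# on stdin and succeeds if user namespace remapping is listed.
has_userns() {
  grep -q 'name=userns'
}

# Sample value (assumed shape); on a real node, pipe docker info instead:
#   docker info --format '{{json .SecurityOptions}}' | has_userns
if echo '["name=seccomp,profile=default","name=userns"]' | has_userns; then
  echo "user namespaces are enabled"
fi
```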

class: namespaces

Add the node back to the Swarm

.exercise[

  • Get our manager token from another node:

    ssh node`Y` docker swarm join-token manager
    
  • Copy-paste the join command to the node

]


class: namespaces

Check the new UID:GID

.exercise[

  • Run a background container on the node:

    docker run -d --name lockdown alpine sleep 1000000
    
  • Look at the processes in this container:

    docker top lockdown
    ps faux
    

]
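
The kernel also exposes the mapping of each process in /proc/PID/uid_map. Here is a minimal sketch that extracts the host UID corresponding to root in the container. The helper name root_host_uid is hypothetical and the sample values are illustrative.

```shell
# root_host_uid FILE
# FILE uses the /proc/PID/uid_map format: "inside-start outside-start length".
# Print the host UID that UID 0 inside the namespace maps to.
root_host_uid() {
  awk '$1 == 0 { print $2; exit }' "$1"
}

# On the node, the container's main process could be inspected like this:
#   pid=$(docker inspect --format '{{.State.Pid}}' lockdown)
#   root_host_uid /proc/$pid/uid_map
# Demonstration on sample content:
map=$(mktemp)
echo '         0     296608      65536' > "$map"
root_host_uid "$map"   # prints 296608
rm -f "$map"
```

The value printed for the real container should match the UID shown for the sleep process in the ps output.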


class: namespaces

Comparing on-disk ownership with/without User Namespaces

.exercise[

  • Compare the output of the two following commands:
    docker run alpine ls -l /
    docker run --userns=host alpine ls -l /
    

]

--

class: namespaces

In the first case, it looks like things belong to root:root.

In the second case, we will see the "real" (on-disk) ownership.

--

class: namespaces

Remember to get back to node1 when finished!