Container isolation is achieved through a combination of Linux kernel features.
Namespaces
Docker containers are primarily built from two Linux primitives - cgroups (covered last time) and namespaces. Cgroups cover resource management per process - namespaces control what is visible to a process. Since a Docker container is nothing more than a Linux process, we can use namespaces of various types to control visibility of these resources:
- Unix Timesharing System (UTS) - basically hostnames and domain names.
- Process IDs
- Mount points
- Network
- User and group IDs
- Inter-process communications (IPC)
- Control groups (cgroups)
A process is always in one namespace of each type. We can see this immediately with the lsns command:
> lsns
NS TYPE NPROCS PID USER COMMAND
4026531834 time 103 1 root /sbin/init
4026531835 cgroup 103 1 root /sbin/init
4026531836 pid 103 1 root /sbin/init
4026531837 user 103 1 root /sbin/init
4026531838 uts 98 1 root /sbin/init
4026531839 ipc 103 1 root /sbin/init
4026531840 net 103 1 root /sbin/init
4026531841 mnt 94 1 root /sbin/init
Via bash, new namespaces can be created with the unshare command, which creates a subprocess running in a new namespace of the specified type:
> sudo unshare --pid --fork sh
> sudo unshare --mount sh
> sudo unshare --net bash
So the first one gives a process its own view of PIDs, starting from zero; the second its own device mounts; and the third allows for network namespaces which can enable routing between a container’s network namespace and the host’s!
User namespaces are useful for running rootless containers in Docker - as of recently, rootless mode can even allow the Docker daemon (plus containers) to all run as non-root, while still allowing users in containers to be root.
chroot
The last bit of isolation we need, in addition to granular resource management (cgroups) and resource isolation (namespaces), is chroot, e.g. “Change root”. This allows a container to set its root filesystem location / to somewhere else, effectively making everything above that location invisible.
So I can use the following commands on my Ubuntu 24.04 LTS VM to download an aarch64 Alpine distribution, and chroot into it as a new root filesystem, using its binaries - it essentially already feels like an Alpine container at that point!
mkdir newroot
curl -o newroot/alpine.tar.gz https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/aarch64/alpine-minirootfs-3.20.2-aarch64.tar.gz
tar -xzvf newroot/alpine.tar.gz -C newroot
rm -f newroot/alpine.tar.gz
chroot newroot sh