John Montroy

Control groups, a.k.a. cgroups, control the resources that a process or group of processes can use. This includes things like memory, CPU, prioritization, and even the number of PIDs. They were introduced into Linux in 2008 and have two major versions. v2 was officially released in 2016 (Linux 4.5) and is approaching stable adoption.

You can see your cgroups in various ways:

If you want to see all resources that have cgroup controllers:

> cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma misc
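If you only care about one controller, a small check against that file is enough. This is a minimal sketch: has_controller is a made-up helper name, and it assumes a cgroups v2 host where cgroup.controllers exists at the root of the hierarchy.

```shell
# has_controller FILE NAME
# Succeeds (exit 0) if NAME is one of the space-separated entries in FILE.
# has_controller is a hypothetical helper, not a standard tool.
has_controller() {
    for c in $(cat "$1"); do
        [ "$c" = "$2" ] && return 0
    done
    return 1
}

# Only runs the check where the v2 controllers file is actually present.
controllers=/sys/fs/cgroup/cgroup.controllers
if [ -r "$controllers" ]; then
    if has_controller "$controllers" pids; then
        echo "pids controller available"
    else
        echo "pids controller not available"
    fi
fi
```

The comparison is on whole entries, so asking for cpu does not accidentally match cpuset.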

Cgroups can be managed manually by creating directories (i.e. cgroups!) under /sys/fs/cgroup; the kernel populates each new directory with the requisite resource files for management. In practice, it’s probably best to find some tooling to do this for you, like what comes with cgroup-tools.
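To make the manual route concrete, here is a rough sketch of creating a cgroup and capping its process count. The helper name create_pids_cgroup, the cgroup name demo, and the limit of 20 are all made up for illustration. On a real host this needs root, and it assumes the pids controller is enabled for children of the parent cgroup (via the parent's cgroup.subtree_control); otherwise pids.max will not exist.

```shell
# create_pids_cgroup ROOT NAME MAX
# Creates the cgroup directory ROOT/NAME and writes MAX into its pids.max.
# ROOT is normally /sys/fs/cgroup. Hypothetical helper, not a standard tool.
create_pids_cgroup() {
    mkdir -p "$1/$2" || return 1
    echo "$3" > "$1/$2/pids.max"
}

# Example (as root, on a cgroups v2 host):
#   create_pids_cgroup /sys/fs/cgroup demo 20
#   echo $$ > /sys/fs/cgroup/demo/cgroup.procs   # move this shell into the cgroup
#   cat /sys/fs/cgroup/demo/pids.current         # processes currently in it
#   rmdir /sys/fs/cgroup/demo                    # works only once it is empty
```

Writing a PID into cgroup.procs is what actually moves a process into the cgroup; everything it forks afterwards stays there and counts against pids.max.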

Docker and Kubernetes (or rather, Kubernetes via Docker) just use cgroups to enforce the resource limits specified in your Pod config. You can see Docker containers under the system.slice/ cgroup, as well as the Docker daemon and Docker socket cgroups:

> ls /sys/fs/cgroup/system.slice/ | grep docker
docker-1b11d97404bfc1e89c28db8cfa9bc9d33535cd0378ced515ecae5b65d6632a77.scope/
docker.service/
docker.socket/

So, for example, every process you start in the Docker container listed above has its resources managed by that first docker-1b11… scope. As you create processes in the container, you can watch the pids.current resource tick up:

# on host
> cat /sys/fs/cgroup/system.slice/docker-1b11<...>.scope/pids.current
2

# in container
> sleep 100 &

# on host 
> cat /sys/fs/cgroup/system.slice/docker-1b11<...>.scope/pids.current
3
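If you want the count next to its limit, both files can be read together. This is only a sketch: pids_usage is a made-up helper, and <scope> below stands in for the container's docker-<id>.scope directory, which is not filled in here.

```shell
# pids_usage DIR
# Prints "<current>/<max>" for the cgroup directory DIR.
# pids.max holds either a number or the literal string "max" (no limit).
# Hypothetical helper, not a standard tool.
pids_usage() {
    printf '%s/%s\n' "$(cat "$1/pids.current")" "$(cat "$1/pids.max")"
}

# Example on the host:
#   pids_usage /sys/fs/cgroup/system.slice/<scope>
```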

You can also watch pids.current tick up inside the Docker container, where there’s just one top-level cgroup (containers are light!).

Meanwhile, processes started within your login sessions fall under the user.slice/ cgroup, grouped by session. systemctl status shows all this nicely.
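You can also ask a process directly which cgroup it lives in by reading /proc/<pid>/cgroup. A minimal sketch, assuming cgroups v2, where the relevant entry has the form 0::<path>; cgroup_path_v2 is a made-up helper name.

```shell
# cgroup_path_v2 FILE
# Prints the cgroups v2 path from a /proc/<pid>/cgroup-style FILE.
# The v2 entry has hierarchy ID 0 and an empty controller list: "0::<path>".
# Hypothetical helper, not a standard tool.
cgroup_path_v2() {
    sed -n 's/^0:://p' "$1"
}

# Show the cgroup of the process doing the read, where /proc is available.
if [ -r /proc/self/cgroup ]; then
    cgroup_path_v2 /proc/self/cgroup
fi
```

From a terminal in a login session, the printed path should sit somewhere under /user.slice; from inside a service, under /system.slice.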

CGroups V2

Ubuntu 24.04 LTS uses cgroups v2. The main difference between cgroups v1 and v2 is how they approach hierarchy - v1 approaches it from a resources perspective, and v2 approaches it from a processes perspective. From this Medium article (via this KubeCon 2022 talk):

The details aren’t particularly important compared to the big picture - v1 enables “flexibility”, but in practice it’s kind of a mess:

V2 also exposes Pressure Stall Information (PSI) for each cgroup, which basically gets us intel on resource contention over time (how long tasks were stalled waiting on CPU, memory, or I/O), rather than just point-in-time statistics that mask how utilization behaves (bursty? steady increase?).
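PSI files are plain text with a "some" line and a "full" line, each carrying avg10, avg60, and avg300 percentages plus a cumulative total. Here is a small sketch for pulling out the 10-second average; psi_avg10 is a made-up helper, and the paths in the comments assume a kernel with PSI enabled.

```shell
# psi_avg10 FILE [KIND]
# Prints the avg10 value from a PSI file. KIND is "some" (default) or "full".
# Expected line format: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
# Hypothetical helper, not a standard tool.
psi_avg10() {
    awk -v kind="${2:-some}" '$1 == kind { sub(/^avg10=/, "", $2); print $2 }' "$1"
}

# Examples:
#   psi_avg10 /proc/pressure/memory                 # system-wide memory pressure
#   psi_avg10 /sys/fs/cgroup/<cgroup>/memory.pressure full   # one cgroup
```

"some" is the share of time at least one task was stalled on the resource, while "full" is the share of time all non-idle tasks were stalled at once.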

Lastly, V2 enabled resource limits for rootless containers, i.e. containers don’t have to run as root any more to use cgroups! Really great!

#Container-Security