podman rootless (and docker rootless) - bits and bobs
backstory
Recently I helped someone out online (by providing a tiny pointer) with rootless containers and I realized that I might have some useful pointers (bits and bobs). These are somewhat related to my previous blogpost about JFrog - Xray and Insight.
In the last couple of years, I have gained experience running OCI (Open Container Initiative) compliant rootless containers with podman rootless and sometimes with docker rootless. Most docker-compose files and how-tos online focus on the rootful variants, which you definitely do not want. Example (one-liner ‘exploit’) abusing rootful docker: docker run --rm -it --privileged -v /:/host/ ubuntu cat /host/etc/shadow
Anyone who is allowed to talk to the rootful daemon can read the host's password hashes this way.
getting up and running
The podman rootless documentation explains all of this more thoroughly; here I’ll focus on a few specific parts.
To run containers rootless, you need a ‘sub’ UID and GID range (a ‘namespace’). This is basically an entire set of additional UIDs and GIDs belonging to a certain local user. Example (run as root) that adds those for the user schauenburg and installs podman and slirp4netns for networking:
echo "schauenburg:100000:65536" >> /etc/subuid
echo "schauenburg:100000:65536" >> /etc/subgid
apt-get install podman slirp4netns
Note on the rootless networking tools: slirp4netns is the default for podman < 5.0 and pasta is the default for podman >= 5.0.
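Before starting the first container it can help to check that the range really ended up in the file. Below is a minimal sketch; the function name subid_range is made up for this post, and it only assumes the user:start:count layout of /etc/subuid and /etc/subgid shown above.

```shell
#!/bin/sh
# subid_range USER [FILE]
# Print "start count" for USER from a file in /etc/subuid format
# (user:start:count). Returns non-zero when the user has no entry.
# subid_range is a hypothetical helper name, not part of podman.
subid_range() {
    user="$1"
    file="${2:-/etc/subuid}"
    awk -F: -v u="$user" '
        $1 == u { print $2, $3; found = 1; exit }
        END     { exit !found }
    ' "$file"
}
```

Calling subid_range schauenburg and subid_range schauenburg /etc/subgid should each print the start and size of the range that was added.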
docker hub references
Usually my first hurdle is rewriting the references in the docker-compose.yaml for the image key. Most example files assume that docker hub is used, but that’s only one of many container registries. Another popular one is Quay.
So check that your image entries do not look like: image: deluan/navidrome:latest
but are fully qualified: image: docker.io/deluan/navidrome:latest
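The rewrite itself is mechanical, so here is a small sketch of it. The function name qualify_image is hypothetical. It follows the convention docker uses for short names: a first path component containing a dot or a colon (or being localhost) is treated as a registry host, and names without any slash live under library/ on docker hub.

```shell
#!/bin/sh
# qualify_image REF
# Print REF with an explicit docker.io registry when none is given.
# qualify_image is a hypothetical helper name for this post.
qualify_image() {
    ref="$1"
    case "$ref" in
        */*) first="${ref%%/*}" ;;   # text before the first slash
        *)   printf 'docker.io/library/%s\n' "$ref"; return ;;
    esac
    case "$first" in
        # Looks like a registry host: leave the reference alone.
        localhost|*.*|*:*) printf '%s\n' "$ref" ;;
        # Otherwise it is a docker hub user or organisation.
        *)                 printf 'docker.io/%s\n' "$ref" ;;
    esac
}
```

So qualify_image deluan/navidrome:latest gives docker.io/deluan/navidrome:latest, while a Quay reference is left as it is.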
volume mounting
Most issues I have encountered were due to sharing volumes (in my case directories) with my container and the application in the container not being able to access them. The application in the container runs with a certain UID, and the volume that is used has to be chowned to that UID/GID. To find out which UID the application runs as, I usually use one of two ways:
- run the container and use ps
- read the Dockerfile of the container you’re running
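For the second way, a quick sketch that pulls the USER instruction out of a Dockerfile. The function name dockerfile_user is made up. It simply reports the last USER line it sees, so treat the result as a hint: when there is no USER instruction the user is inherited from the base image, which is commonly root.

```shell
#!/bin/sh
# dockerfile_user FILE
# Print the argument of the last USER instruction in FILE.
# Returns non-zero when the file has no USER instruction at all.
# dockerfile_user is a hypothetical helper name for this post.
dockerfile_user() {
    awk '
        toupper($1) == "USER" { user = $2 }   # later lines override earlier ones
        END { if (user == "") exit 1; print user }
    ' "$1"
}
```

For the first way, podman top with the user and huser descriptors (podman top CONTAINER user huser) shows both the user inside the container and the matching user on the host.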
Then it is only a matter of chowning the directories to the correct UID/GID. With podman you use podman unshare (which runs a command inside your rootless user namespace) and with docker you use nsenter (namespace enter). Here’s an example of chowning a volume-mounted directory on the host, using a UID/GID of 1000 inside of the container:
nsenter -U --preserve-credentials -n -t $(pgrep dockerd) chown 1000:1000 var
podman unshare chown 1000:1000 var
Afterwards, the permissions could look like this (viewed from the host and then from the container's perspective):
schauenburg@enterprise:~/ $ ls -ln
drwx------ 19 100999 1000 4096 okt 25 12:18 var
schauenburg@enterprise:~/ $ podman unshare
root@enterprise:~/ # ls -ln
drwx------ 19 1000 0 4096 okt 25 12:18 var
Spoiler/Caveat: UID 0 in the container namespace == the UID of the regular user on the system. So if schauenburg is UID 1000 on the host, it is seen as UID 0 inside of the container namespace.
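The 100999 in the listing above follows from the mapping: container UID 0 is the user's own UID, and container UIDs 1, 2, 3, ... are taken from the subuid range in order. A small sketch of that arithmetic, assuming the default rootless mapping with a single subuid range (the function name host_uid is made up):

```shell
#!/bin/sh
# host_uid CONTAINER_UID OWN_UID SUB_START
# Translate a UID seen inside the user namespace to the UID on the host.
# Assumes the default rootless mapping:
#   container 0    -> OWN_UID (the regular user)
#   container N>=1 -> SUB_START + N - 1
# host_uid is a hypothetical helper name for this post.
host_uid() {
    container_uid="$1"
    own_uid="$2"
    sub_start="$3"
    if [ "$container_uid" -eq 0 ]; then
        echo "$own_uid"
    else
        echo $(( sub_start + container_uid - 1 ))
    fi
}

# UID 1000 in the container, user schauenburg is UID 1000, range starts at 100000
host_uid 1000 1000 100000   # prints 100999
```

The mapping that is really in effect can be read with podman unshare cat /proc/self/uid_map.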
podman-compose vs docker-compose
Usually I use podman-compose and it works fine. Unfortunately it is missing some features around health checks, so if you need those, you might be better off running docker-compose. Upside: if you make sure that podman is executed instead of the docker command, I am confident it will work (I have tested it quickly, though not extensively).
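One way to get docker-compose talking to podman is podman's Docker-compatible API socket. This is a sketch of that route and an assumption on my part, not necessarily the setup tested above. The function name podman_docker_host is made up; the socket path is the usual location for a rootless user socket.

```shell
#!/bin/sh
# podman_docker_host
# Print a DOCKER_HOST value pointing at the rootless podman API socket.
# Falls back to /run/user/<uid> when XDG_RUNTIME_DIR is not set.
# podman_docker_host is a hypothetical helper name for this post.
podman_docker_host() {
    runtime_dir="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
    printf 'unix://%s/podman/podman.sock\n' "$runtime_dir"
}

# Usage (not executed here, needs podman and systemd user services):
#   systemctl --user enable --now podman.socket
#   DOCKER_HOST="$(podman_docker_host)" docker-compose up -d
```

With DOCKER_HOST set like this, docker-compose sends its API calls to podman instead of a docker daemon.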