Docker security
Introduction
Welcome to the Docker security article, which covers the most important security aspects of a Docker environment: host and kernel security, the attack surface of the Docker daemon, and the security of remote communication between containers and Docker registries. Docker images, as a fundamental part of the Docker world, have huge security implications for the whole environment. The article closes with the most important Docker security best practices.
Docker default security
In the golden age of virtualization, security was fairly straightforward. With containers, things are a little different. Virtual machines sit on a hypervisor host and are virtualized at the hardware level, while containers are virtualized at the OS level and share the same kernel. That gets tricky once you picture several containers running with root privileges on one host: compromise one of them and the host is within reach. Yes, containers run as the root user by default, and if that user reaches the host filesystem (through a bind mount or a container escape, for example) it acts there as root.
Not long ago, the Docker daemon could be reached over TCP/IP without any encryption. Fortunately, the daemon now listens by default on a UNIX socket, on which you can set permissions, rather than on a TCP/IP port. Another major security concern is the content of prepackaged Docker images.
You can’t rely on an unknown image that is simply handed to you; there should be some mechanism that validates it. Nor can you ignore the set of kernel capabilities that allow root operations inside a container. By default, Docker also allows privileged ports (<1024) to be bound to a container, which is a huge security concern.
We can look at Docker security from different angles:
- Docker host security
- Kernel security
- Docker daemon security
- Docker registry
Namespaces
Linux namespaces are the bread and butter of Docker’s security story. Namespaces are not a Docker feature; they are a built-in feature of the Linux kernel that Docker uses to isolate containers. Namespaces isolate processes between containers, which means that a process in one container cannot see or affect processes in another container.
The network stack is separated as well, so a container gets no privileged access to the sockets and interfaces of another container. Each container has its own network stack and is reached like any other host on a regular TCP/IP network.
Control groups
Control groups (cgroups) are another built-in Linux kernel feature, responsible for resource shaping. They limit the CPU, memory, storage and network resources of containers, which prevents one container from eating all host resources and helps to prevent denial-of-service attacks.
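As a minimal sketch of how such limits are applied in practice, the wrapper below passes the resource flags --memory, --cpus and --pids-limit to docker run. The function name run_limited and the chosen values are illustrative assumptions, not recommendations.

```shell
# run_limited IMAGE [COMMAND...]
# Hypothetical helper: starts a throwaway container with cgroup-backed limits.
#   --memory      caps the RAM the container may use
#   --cpus        caps the CPU time (0.5 = half of one core)
#   --pids-limit  caps the number of processes (guards against fork bombs)
# The values are assumptions for this sketch; tune them per workload.
run_limited() {
    docker run --rm \
        --memory=256m \
        --cpus=0.5 \
        --pids-limit=100 \
        "$@"
}
```

For example, run_limited alpine sleep 60 starts an Alpine container that cannot use more than 256 MB of memory, half a CPU core or 100 processes.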
Linux kernel capabilities
Linux capabilities split the power of the root user into separate privileges, which ties into the distinction between privileged and unprivileged processes. A process running as root holds all capabilities by default and bypasses all kernel permission checks, while every other process is subject to permission checking based on its UID and GID. You already know that the container user runs as root by default, which is one of the major security concerns. However, root inside a container does not hold all the capabilities of root on the host, because tasks such as cron and logging daemons, kernel module loading, network configuration and SSH are normally handled outside the container.
By default, a container has the following capabilities:
CHOWN, DAC_OVERRIDE, FSETID, FOWNER, MKNOD, NET_RAW, SETGID, SETUID, SETFCAP, SETPCAP, NET_BIND_SERVICE, SYS_CHROOT, KILL, AUDIT_WRITE
Those capabilities are a subset of the capabilities held by root on Linux. To run a container with full privileges, use the --privileged parameter.
Example (Running container with dropped capabilities)
To run a container without the capabilities to change its UID and GID (SETUID, SETGID) and to bypass file ownership checks (FOWNER):
[root@swmanager /]# docker run --cap-drop SETUID --cap-drop SETGID --cap-drop FOWNER centos /bin/sh
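The opposite approach is often easier to reason about: drop every capability and add back only what the service needs. The sketch below assumes the service needs nothing but NET_BIND_SERVICE; the function name run_min_caps is hypothetical.

```shell
# run_min_caps IMAGE [COMMAND...]
# Hypothetical helper: removes all default capabilities, then grants
# only NET_BIND_SERVICE (assumed to be the one capability required here).
run_min_caps() {
    docker run --rm \
        --cap-drop ALL \
        --cap-add NET_BIND_SERVICE \
        "$@"
}
```

Calling run_min_caps nginx would start the container with a single capability instead of the default set listed above.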
Docker daemon security
REST API
The Docker daemon has powerful features, so it is very important to secure its attack surface. In the past, the Docker client talked to the Docker server over TCP/IP even when the two components were on the same host. Nowadays, communication defaults to a standard UNIX socket. This works well when both components are on the same host, but if a client connects to a remote Docker server through the REST API endpoint, secure the endpoint with HTTPS and certificates. An alternative is to connect over SSH.
Docker daemon in rootless mode
It is possible to run the Docker daemon as a non-root user. This does not prevent all security issues, but it can mitigate a large range of potential vulnerabilities. The root user inside a container maps to a range of user IDs on the host; in other words, the privileged root user inside the container maps to an unprivileged range in the parent namespace. At the time of writing, however, rootless mode is experimental and available in nightly builds only. Here are its other limitations:
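A minimal sketch of a client call to a TLS-protected remote daemon, using the docker client options --tlsverify, --tlscacert, --tlscert, --tlskey and -H. The function name docker_tls, the certificate paths under $HOME/.docker and port 2376 are assumptions for this example; the daemon must be started with matching server certificates.

```shell
# docker_tls HOST [DOCKER ARGS...]
# Hypothetical helper: runs a docker client command against a remote daemon
# and verifies both sides with certificates.
# Assumptions: CA, client certificate and key live in $HOME/.docker,
# and the daemon listens on TCP port 2376.
docker_tls() {
    host=$1
    shift
    docker --tlsverify \
        --tlscacert="$HOME/.docker/ca.pem" \
        --tlscert="$HOME/.docker/cert.pem" \
        --tlskey="$HOME/.docker/key.pem" \
        -H "tcp://$host:2376" \
        "$@"
}
```

With this in place, docker_tls build.example.com version would query the remote daemon, and the connection fails if either certificate cannot be verified.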
- Overlay FS not available except in Ubuntu distributions
- Limited network performance
- Can’t listen on network ports below 1024
- No control groups
Mapping root container user to user namespace on host
Another way to tame the root user in a container is to remap it into a user namespace on the host. This approach adds an extra layer of security by running containers under the UIDs and GIDs of a subordinate mapping defined in /etc/subuid and /etc/subgid.
Entries in these files have the following format:
user:start_uid:uid_count
Example
test:50000:10000
We defined the user test, whose mapping starts at host UID 50000. The number 10000 is the number of UIDs available to assign.
Do not forget to grant the user the privileges it needs on directories, files and processes.
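To make the format concrete, the small helper below computes the host ID range that one entry covers. It is only a sketch: the function name subid_range is hypothetical, and it assumes a well-formed user:start_uid:uid_count entry.

```shell
# subid_range ENTRY
# Prints the first and last host ID covered by a subordinate ID entry
# of the form user:start_uid:uid_count.
subid_range() {
    entry=$1
    start=$(printf '%s\n' "$entry" | cut -d: -f2)   # second field: first host ID
    count=$(printf '%s\n' "$entry" | cut -d: -f3)   # third field: number of IDs
    echo "$start-$((start + count - 1))"
}
```

For the entry above, subid_range test:50000:10000 prints 50000-59999, so root (UID 0) in the container maps to UID 50000 on the host.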
Example (Map container root user to host user namespace)
Create the user test on the host if it does not exist:
[root@swmanager /]# useradd test
Create the corresponding files in the /etc folder:
[root@swmanager /]# touch {/etc/subuid,/etc/subgid}
Define the user test with a starting ID of 50000 and 10000 available IDs, using the same entry in both /etc/subuid and /etc/subgid:
[root@swmanager /]# cat /etc/subgid
test:50000:10000
Edit the daemon.json file to add the test user:
[root@swmanager /]# vi /etc/docker/daemon.json
{ "userns-remap": "test" }
Restart the Docker daemon so that it picks up the new setting:
[root@swmanager /]# systemctl restart docker
Check that the folder named after the remapped UID and GID is created:
[root@swmanager docker]# ls -ld /var/lib/docker/50000.50000/
drwx------ 14 50000 50000 182 May 9 23:24 /var/lib/docker/50000.50000/
Check which entries are owned by user and group 50000:
[root@swmanager docker]# ls -l /var/lib/docker/50000.50000/
total 4
drwx------  2 root  root    24 May 9 23:24 builder
drwx------  4 root  root    92 May 9 23:24 buildkit
drwx------  4 50000 50000  150 May 9 23:34 containers
drwx------  3 root  root    22 May 9 23:24 image
drwxr-x---  3 root  root    19 May 9 23:24 network
drwx------ 12 50000 50000 4096 May 9 23:34 overlay2
drwx------  4 root  root    32 May 9 23:24 plugins
drwx------  2 root  root     6 May 9 23:24 runtimes
drwx------  2 root  root     6 May 9 23:24 swarm
drwx------  2 50000 50000    6 May 9 23:34 tmp
drwx------  2 root  root     6 May 9 23:24 trust
drwx------  2 50000 50000   25 May 9 23:24 volumes
Optionally, set file, directory and process permissions for the test user.
Image trust
Docker’s main purpose is to run containers based on images. An image is a prepackaged box that contains whatever service the container will run. But images are created in advance and sit in a remote registry, waiting for Docker to pull them. You don’t know whether an image comes from a secure or an insecure source, what its content is, and so on. By default, Docker pulls and pushes images without any restriction or verification. This is where Docker Content Trust comes in.
Docker Content Trust
Docker can be configured to run only signed images, but this is not enabled by default. Docker Content Trust uses signatures when sending images to and retrieving them from a remote registry. The integrity of Docker images is verified with PKI (a set of public and private keys). When you push a Docker image to a remote registry, Docker Engine signs the image with the publisher’s private key. Pulling the image triggers a check against the publisher’s public key. It is important to note that this method does not protect against a publisher who pushes an image containing malware and signs it. Its main purpose is to prevent man-in-the-middle attacks.
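Content trust is switched on through the DOCKER_CONTENT_TRUST environment variable. The sketch below enables it for a single pull; the function name pull_trusted is hypothetical.

```shell
# pull_trusted IMAGE
# Hypothetical helper: enables Docker Content Trust for this one pull only,
# so an image tag without valid signature data is refused.
pull_trusted() {
    DOCKER_CONTENT_TRUST=1 docker pull "$1"
}
```

To enforce the check for a whole shell session instead, export DOCKER_CONTENT_TRUST=1 once and use docker pull and docker push as usual.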
Additional security features
The host can be secured with the following mechanisms: SELinux, AppArmor, seccomp and custom security modules built on the Linux Security Modules (LSM) framework.
SELinux is a kernel security module that implements Mandatory Access Control (MAC) and is responsible for type enforcement. Basically, you define types and assign privileges to them.
AppArmor is a MAC solution similar to SELinux, but based on file paths instead of types. Docker containers run under a default AppArmor profile named docker-default, which is generated from a default template. The default policy can be overridden with the --security-opt parameter.
Seccomp (Secure Computing) is used to control and forbid system calls. It is active only if Docker was built with seccomp support and the kernel configuration includes CONFIG_SECCOMP. To check:
[root@swmanager /]# grep CONFIG_SECCOMP= /boot/config-$(uname -r)
CONFIG_SECCOMP=y
The default seccomp profile is a whitelist of allowed system calls; it is published with the Docker documentation and source code.
To run a container without the default seccomp profile, use the parameter --security-opt seccomp=unconfined.
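Rather than switching seccomp off, a container can be started with a custom profile by passing a JSON file to the same option. A minimal sketch follows; the function name run_with_seccomp and the profile path in the usage note are assumptions.

```shell
# run_with_seccomp PROFILE IMAGE [COMMAND...]
# Hypothetical helper: starts a container under a custom seccomp profile
# (a JSON file on the host) instead of the default one.
run_with_seccomp() {
    profile=$1
    shift
    docker run --rm --security-opt "seccomp=$profile" "$@"
}
```

For example, run_with_seccomp /etc/docker/seccomp/strict.json alpine sh would apply the rules from that file, assuming such a profile exists at that location.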
Best practices
- Run containers as non-root user if possible
- Use minimal and trusted base images with frequent image scanning
- Verify and sign images
- Use an orchestration tool (Kubernetes or other) that implements Role-Based Access Control (RBAC) to control user privileges
- Read-only mounts for important host folders when using volume bind mounts
- Use capabilities to grant fine-grained privileges
- Don’t bind containers to host ports below 1024
- Enable TLS if using TCP/IP instead of Unix sockets
- Avoid using default Docker bridge network
- Use resource limits to prevent a container from exhausting host resources
- Don’t copy whole directories recursively into an image; copy only the files it needs
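Several of these practices can be combined in a single docker run invocation. The sketch below is one possible baseline, not a complete hardening recipe; the function name run_hardened, the UID/GID 1000:1000 and the memory value are assumptions.

```shell
# run_hardened IMAGE [COMMAND...]
# Hypothetical helper combining several of the practices listed above:
#   --user                            run as a non-root user (UID:GID assumed)
#   --read-only                       mount the container root filesystem read-only
#   --cap-drop ALL                    start without any capabilities
#   --security-opt no-new-privileges  block privilege escalation via setuid binaries
#   --memory                          cap memory so one container cannot starve the host
run_hardened() {
    docker run --rm \
        --user 1000:1000 \
        --read-only \
        --cap-drop ALL \
        --security-opt no-new-privileges \
        --memory=256m \
        "$@"
}
```

A service that needs to write temporary files or bind a low port will need selected exceptions on top of this baseline, such as a writable volume or a single added capability.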