Introduction
Table of Contents
You would expect problem in Docker environment because there are so many factors that can go wrong. This article covers commands, common scenarios and tips how to successfully troubleshoot potential Docker issues. It is hard to cover all scenarios, things can go wrong at so many levels, but for you we will try to distinguish and group common categories – where the logs are stored in the first place, how to enter container, what are the useful command for troubleshoot and many more. Let’s start with Docker troubleshooting.
Docker container logs
First line of defense in Docker troubleshooting is to know where the logs are. If there is no logs things can get complicated. By default, Docker stores its logs in following format:
/var/lib/docker/containers/[container-id]/[container-id]-json. log.
This is default JSON output which stores logs on host machine. This is how you can check the default logging driver:
[root@swmanager ~]# docker info –format ‘{{.LoggingDriver}}’
json-file |
JSON file is just one of the logging drivers you can use: none, local, syslog, journald, gelf, fluentd, splunk, amazon and GCP.
Docker container automation started with specific ID (note the ID):
You can change it by configuring /etc/docker/daemon.json file:
{
“log-driver”: “splunk” } |
This line tells Docker to use splunk as default logging destination.
[root@swmanager ~]# docker run -dit -p 5000:5000 automation
f544bfad10f312c3befbc96cfdb3e2461f55579addff9c5a6b67b11f99c35b2a |
Go to the /var/lib/docker/containers and find directory name with ID above. Logs are available there:
[root@swmanager f544bfad10f312c3befbc96cfdb3e2461f55579addff9c5a6b67b11f99c35b2a]# cat f544bfad10f312c3befbc96cfdb3e2461f55579addff9c5a6b67b11f99c35b2a-json.log
{“log”:” * Serving Flask app \”__init__\” (lazy loading)\r\n”,”stream”:”stdout”,”time”:”2020-08-17T13:22:37.407939247Z”} {“log”:” * Environment: production\r\n”,”stream”:”stdout”,”time”:”2020-08-17T13:22:37.407991902Z”} …… |
Logs can be analyzed with useful command docker logs which shows the command’s STDOUT and STDERR. But if your logging drivers sends logs to remote location or database, docker logs will not work as expected. For Docker service you can use docker service log command.
Start and analyze mysql container logs:
[root@swmanager /]# docker run -dit -p 3306:3306 –name mysql -e MYSQL_ROOT_PASSWORD=password mysql:5.7
72b4567c09c2f9eef3cdd03c3acc0acf472aa337402a59932a4ae5dc8a756311 [root@swmanager /]# docker logs mysql 2020-08-31 12:45:18+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started. 2020-08-31 12:45:18+00:00 [Note] [Entrypoint]: Switching to dedicated user ‘mysql’ 2020-08-31 12:45:18+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started. 2020-08-31 12:45:18+00:00 [Note] [Entrypoint]: Initializing database files … |
There are useful flags you can use like –details (extra details), –follow (follow log output),–since and –until.
Types of logging
Completely different story is how logs are logged. Default way is blocking mode. These mode will write logs to its destination not caring about application work. These can bring latency and performance issues if logs are sent to remote location. If Docker uses default JSON format which stores logs on local system, impact is not high.
Second type of logging (non-blocking) does not have impact on application work because its sends logs to memory ring buffer. This way each process has separated stream and everything looks good. But downside is that memory buffer can be full and logs can leak. Default memory size of memory ring buffer is 1MB.
Because non-blocking mode is not default you have to define parameter –log.opt mode to non-blocking. Also you can tweak memory ring buffer size (our example 2M):
[root@swmanager ~]# docker run -dit –name mysql –log-opt mode=non-blocking –log-opt max-buffer-size=4m -e MYSQL_ROOT_PASSWORD=my-secret-pw mysql:5.7
d765e45169947d6e461374b8ca595de40e5a73ac58d03a52425a5c7150d1bfd5 |
Docker troubleshooting commands
There are few handy commands to troubleshooting potential issues on Docker containers. Command docker container logs <container> will display log output of desired container:
[root@swmanager ~]# docker container logs mysql
2020-09-24 06:01:21+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started. 2020-09-24 06:01:21+00:00 [Note] [Entrypoint]: Switching to dedicated user ‘mysql’ 2020-09-24 06:01:21+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started. 2020-09-24 06:01:21+00:00 [Note] [Entrypoint]: Initializing database files 2020-09-24T06:01:21.799449Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use –explicit_defaults_for_timestamp server option (see documentation for more details). 2020-09-24T06:01:22.560336Z 0 [Warning] InnoDB: New log files created, LSN=45790 2020-09-24T06:01:22.681348Z 0 [Warning] InnoDB: Creating foreign key constraint system tables. 2020-09-24T06:01:22.788340Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 5f85edc3-fe2b-11ea-b777-0242ac110003. 2020-09-24T06:01:22.793316Z 0 [Warning] Gtid table is not ready to be used. Table ‘mysql.gtid_executed’ cannot be opened. 2020-09-24T06:01:25.148931Z 0 [Warning] CA certificate ca.pem is self signed. 2020-09-24T06:01:25.493338Z 1 [Warning] root@localhost is created with an empty password ! Please consider switching off the –initialize-insecure option. |
If you want to see complete container configuration you can use command docker container inspect <container>. In our example we are checking if container is running:
[root@swmanager ~]# docker container inspect mysql | grep -i running
“Status”: “running”, “Running”: true, |
Command docker events covers full logging on operations performed on container, images, daemons, services, volumes and more.
Let’s display all docker events after specified date:
[root@swmanager ~]# docker events –since ‘2020-09-23’
2020-09-24T07:54:44.191241177+02:00 container create 24fb97271fdf5ca46c2f98e1bda66dfdc0265499952ccdcf416b3ae9551ef560 (image=centos, name=awesome_spence, org.label-schema.build-date=20200611, org.label-schema.license=GPLv2, org.label-schema.name=CentOS Base Image, org.label-schema.schema-version=1.0, org.label-schema.vendor=CentOS) 2020-09-24T07:54:44.383378689+02:00 network connect a37fd6c73c1681e55ffabecf0cf6ec46d4524764275107c417041f941944ebe2 (container=24fb97271fdf5ca46c2f98e1bda66dfdc0265499952ccdcf416b3ae9551ef560, name=bridge, type=bridge) |
All events on mysql image since specific date:
[root@swmanager ~]# docker events -f “image”=”mysql:5.7” –since ‘2020-09-23’
2020-09-24T08:01:16.359790543+02:00 container destroy 8e9af83ffde5e08feaadc04e10ae4f14c928c11abd41a289a8f01907fd5a0dc1 (image=mysql:5.7, name=mysql) 2020-09-24T08:01:19.790819485+02:00 container create d765e45169947d6e461374b8ca595de40e5a73ac58d03a52425a5c7150d1bfd5 (image=mysql:5.7, name=mysql) 2020-09-24T08:01:21.343794131+02:00 container start d765e45169947d6e461374b8ca595de40e5a73ac58d03a52425a5c7150d1bfd5 (image=mysql:5.7, name=mysql) |
Specific network can be inspected with:
[root@swmanager ~]# docker network inspect bridge
[ { “Name”: “bridge”, “Id”: “a37fd6c73c1681e55ffabecf0cf6ec46d4524764275107c417041f941944ebe2”, “Created”: “2020-09-24T07:53:06.82258442+02:00”, “Scope”: “local”, “Driver”: “bridge”, |
Docker swarm services can also be inspected with few commands. Please check Docker swarm article how to set up the service. We are running mysql service in one replica:
[root@swmanager ~]# docker service create –name mysql –replicas 1 -e MYSQL_ROOT_PASSWORD=my-secret-pw mysql:5.7
3pktdampz2f5mzt77yek920an overall progress: 1 out of 1 tasks 1/1: running [==================================================>] verify: Service converged |
Command Docker service ps <container> will display basic information of service (on which worker is running, desired states, name, etc.):
[root@swmanager ~]# docker service ps mysql
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS ksdi8jfubdzb mysql.1 mysql:5.7 swworker2 Running Running about a minute ago |
Like we used Docker logs command for plain docker containers, here we can use Docker service logs <container>:
[root@swmanager ~]# docker service logs mysql
mysql.1.ksdi8jfubdzb@swworker2 | 2020-09-24 06:28:51+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started. mysql.1.ksdi8jfubdzb@swworker2 | 2020-09-24 06:28:51+00:00 [Note] [Entrypoint]: Switching to dedicated user ‘mysql’ mysql.1.ksdi8jfubdzb@swworker2 | 2020-09-24 06:28:51+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started. mysql.1.ksdi8jfubdzb@swworker2 | 2020-09-24 06:28:51+00:00 [Note] [Entrypoint]: Initializing database files mysql.1.ksdi8jfubdzb@swworker2 | 2020-09-24T06:28:51.976853Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use –explicit_defaults_for_timestamp server option (see documentation for more details). |
To display swarm members with its basic information use following command:
[root@swmanager ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION uhcwhi00nw008n14pase5twd1 * swmanager.local Ready Active Leader 18.09.1 viuunheq5bm2xcmc5hvlnqwr2 swworker1.local Ready Active 18.09.1 1zgfm15evbtd5l4njk53bu63h swworker2 Ready Active 18.09.1 |
To see complete service configuration:
[root@swmanager ~]# docker service inspect mysql
[ { “ID”: “3pktdampz2f5mzt77yek920an”, “Version”: { “Index”: 890 }, “CreatedAt”: “2020-09-24T06:28:50.503417901Z”, “UpdatedAt”: “2020-09-24T06:28:50.503417901Z”, “Spec”: { “Name”: “mysql”, |
Inside Docker container
Good troubleshooting approach is to connect to container and check what is going on inside. There are several ways to do it: commands exec and attach and SSH into the container.
Run container (in our example CentOS):
[root@swmanager ~]# docker run -it –name centos -d centos /bin/bash |
Check that container is running:
[root@swmanager ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 67056cff24e9 centos “/bin/bash” 4 seconds ago Up 1 second centos |
Attach command
Use attach command to connect to container;
[root@swmanager ~]# docker attach centos
[root@67056cff24e9 /]# |
Check file systems and IP configuration:
[root@67056cff24e9 log]# df –h
Filesystem Size Used Avail Use% Mounted on overlay 6.2G 6.1G 124M 99% / tmpfs 64M 0 64M 0% /dev tmpfs 411M 0 411M 0% /sys/fs/cgroup /dev/mapper/cl-root 6.2G 6.1G 124M 99% /etc/hosts shm 64M 0 64M 0% /dev/shm tmpfs 411M 0 411M 0% /proc/asound tmpfs 411M 0 411M 0% /proc/acpi tmpfs 411M 0 411M 0% /proc/scsi tmpfs 411M 0 411M 0% /sys/firmware [root@67056cff24e9 log]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 20: eth0@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0 valid_lft forever preferred_lft forever |
Run Ctrl P and Ctrl-Q to exit container without stopping it:
[root@67056cff24e9 log]# read escape sequence
[root@swmanager ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 67056cff24e9 centos “/bin/bash” About a minute ago Up About a minute centos 7a24bd538b49 mysql:5.7 “docker-entrypoint.s…” 6 minutes ago Up 6 minutes 3306/tcp, 33060/tcp mysql.1.e14q7a7xqt3h64huwtjc2 |
Exec command
With exec command you can also get into container:
[root@swmanager ~]# docker exec -it centos /bin/bash
[root@67056cff24e9 /]# ls bin dev etc home lib lib64 lost+found media mnt opt proc root run sbin srv sys tmp usr var |
SSH
Check the container ip address:
[root@swmanager ~]# docker inspect centos | grep -i ipaddress
“SecondaryIPAddresses”: null, “IPAddress”: “172.17.0.3”, “IPAddress”: “172.17.0.3”, |
In our example SSH connection is refused because there is no running ssh service inside container:
[root@swmanager ~]# ssh root@172.17.0.3
ssh: connect to host 172.17.0.3 port 22: Connection refused |
Docker copy file from container to host
Docker troubleshooting is all about identifying problems inside container. There is a way to copy files for analysis from container to host and vice versa. Files can be send from running or stopped container. Paths are relative to container root (path) and the current directory on host where docker command is run. User and group ownership is applied from destination (host) side. But with parameter –a you can preserves container user and group rights. Command used for copying files is docker cp:
Run mysql container:
docker run -dit -p 3306:3306 –name mysql -e MYSQL_ROOT_PASSWORD=password mysql:5.7 |
Copy log file from container to current directory:
[root@swmanager tmp]# docker cp mysql:/var/log/dpkg.log .
[root@swmanager tmp]# ls dpkg.log |
If you don’t specify –a parameter destination host will take over ownership
[root@swmanager tmp]# docker exec -it mysql /bin/bash
root@b815e68950a0:/var/log# useradd joe root@b815e68950a0:/var/log# chown joe:joe dpkg.log ls -la -rw-r–r– 1 joe 28636 Sep 10 06:54 dpkg.log |
Owner of dpkg.log was Joe, now is root user:
[root@swmanager tmp]# docker cp mysql.1.x6rxpcur4vj6l4z931kf2ywyt:/var/log/dpkg.log .
[root@swmanager tmp]# ls -la total 1 -rw-r–r– 1 root root 28636 Sep 10 08:54 dpkg.log |
You can rename log file at destination:
[root@swmanager tmp]# docker cp mysql.1.x6rxpcur4vj6l4z931kf2ywyt:/var/log/dpkg.log mysql.log |
Also copy file from host back to container:
[root@swmanager tmp]# docker cp mysql.log mysql.1.x6rxpcur4vj6l4z931kf2ywyt:/var/log/ |
Check that log file is there:
[root@swmanager tmp]# docker exec -it mysql.1.x6rxpcur4vj6l4z931kf2ywyt /bin/bash
root@b815e68950a0:/# ls -la /var/log/mysql.log -rw-r–r– 1 root root 28636 Sep 10 06:54 /var/log/mysql.log |
Troubleshoot the Docker daemon
Docker daemon starts automatically when machine starts. This ensures that dockerd daemon always runs by default. You can manually stop and start docker service:
[root@swmanager ~]# systemctl stop docker |
Typing dockerd will start daemon in foreground.
[root@swmanager ~]# dockerd
INFO[2020-11-02T12:07:52.620216713+01:00] parsed scheme: “unix” module=grpc INFO[2020-11-02T12:07:52.620320249+01:00] scheme “unix” not registered, fallback to default scheme module=grpc INFO[2020-11-02T12:07:52.620393949+01:00] parsed scheme: “unix” module=grpc INFO[2020-11-02T12:07:52.620408835+01:00] scheme “unix” not registered, fallback to default scheme module=grpc …. |
Ctrl-C will terminate service.
Configuration of daemon service can be done with configuration file (preferred) or putting flags when running docker daemon.
Configuration file is located at /etc/docker/daemon.json. Usually if not touched it is not created. Let’s put debug mode on:
{
„debug:“ true } |
Restart docker daemon:
[root@swmanager ~]# systemctl daemon-reload
[root@swmanager ~]# systemctl restart docker [root@swmanager ~]# systemctl stop docker |
Same thing can be achieved with flags:
[root@swmanager ~]# dockerd –debug=true
DEBU[2020-11-02T12:14:37.426522035+01:00] Listener created for HTTP on unix (/var/run/docker.sock) DEBU[2020-11-02T12:14:37.431237114+01:00] Golang’s threads limit set to 5580 |
With flag –help you can see all available options:
[root@swmanager ~]# dockerd –help |
Please make sure that there are no conflicts between docker configuration file and flags which can result in error.
Log level can be set with –log level flag. Values are debug, info, warn, error, fatal. Default is info.
[root@swmanager ~]# dockerd –log-level=”warn”
ERRO[0000] time=”2020-11-02T11:28:42Z” level=info msg=”listening on /run/docker/plugins/sshfs.sock” plugin=113af6aff2fae68d7134adcc71abb652827c37a87433e0689a |
Stack trace
In the events of unresponsive daemon you can force flush of full stack trace with
kill -SIGUSR1 $(pidof dockerd)
ps -aux | grep -i docker # to check dockerd process id |
How to check full stack trace?
journalctl -u docker.service
On CentOS check /var/log/messages. Systemctl status docker |
Out of memory
Containers by default don’t have any resource constraints. That can lead to unexpected situations where hosts can run out of memory, resulting in the system crash. Linux kernel has mechanism to deal with out of memory situations. Kernel throws out of memory exception and starts killing processes which have lowest OOM (Out of memory) priority. Docker daemon has high OOM but individual containers have very low. This can lead to situation where random containers will be shut down but dockerd daemon will continue to run. It is not recommended to tweak default OOM settings.
To prevent memory situations ensure that your app us running on host with adequate resources, limit the container resources. Best approach is limit the service. There are two types of limits hard and soft. Container will never go above hard limit, but will go above soft limit till certain conditions are met like out of memory exceptions.
Override binaries
Large number of images do not contain classic troubleshooting commands like traceroute, netcat, ip, telnet. Then it is hard to troubleshoot container without right commands. But you can always override binaries and mount host binary folders to container binary folder. It will override the default container binary folders like /bin, /sbin and /lib.
Run container in classic way:
[root@swmanager ~]# docker run -dit –name centos centos |
Go inside container:
[root@swmanager ~]# docker attach centos |
Check that command netstat is not available:
[root@6ce63969192d /]# netstat
bash: netstat: command not found |
Now, run same image overriding the binary folder /bin:
[root@swmanager ~]# docker run -it -v /bin:/bin centos /bin/bash |
Command netstat is now available:
[root@84317829b5bd /]# netstat
Active Internet connections (w/o servers) Proto Recv-Q Send-Q Local Address Foreign Address State Active UNIX domain sockets (w/o servers) Proto RefCnt Flags Type State I-Node Path Active Bluetooth connections (w/o servers) Proto Destination Source State PSM DCID SCID IMTU OMTU Security Proto Destination Source State Channel |