
Writing and Managing Application Logs with Docker

From Learn Docker in a Month of Lunches by Elton Stoneman

Manning Publications

--

Logging is usually the most boring part of learning a new technology, but that’s not the case with Docker. The basic principle is simple: you need to make sure your application logs are written to the standard output stream, because that’s where Docker looks for them. This can be achieved in a couple of ways, which we’ll cover in this article, and then the fun begins.

___________________________________________________________

Take 37% off Learn Docker in a Month of Lunches by entering fccstoneman into the discount code box at checkout at manning.com.
___________________________________________________________

Docker has a pluggable logging framework — you need to make sure your application logs are coming out from the container, and then Docker can send them to different places. That lets you build a powerful logging model, where the application logs from all your containers are sent to a central log store with a searchable UI on top of it — all using open source components, all running in containers.

Welcome to stderr and stdout!

A Docker image is a snapshot of a filesystem with all your application binaries and dependencies, plus some metadata telling Docker which process to start when you run a container from the image. That process runs in the foreground, like starting a shell session and then running a command: as long as the command is active it has control of the terminal input and output. Commands write log entries to the standard output and standard error streams (called stdout and stderr), and in a terminal session you see the output in your window. In a container, Docker watches stdout and stderr and collects the output from those streams, and this is the source of the container logs.

TRY IT NOW You can see this easily if you run a timecheck app in a container — the application runs in the foreground and writes log entries to stdout:

# run the container in the foreground:
docker container run diamol/ch15-timecheck:3.0

# exit the container with Ctrl-C when you're done

You’ll see some log lines in your terminal, and you’ll find you can’t enter any more commands — the container is running in the foreground, and it’s like running the app in your terminal. Every few seconds the app writes another timestamp to stdout, and you’ll see another line in your session window. My output is in figure 1:

Figure 1. A container in the foreground takes over the terminal session until it exits

This is the standard operating model for containers: Docker starts a process inside the container and collects the output streams from that process as logs. All the apps we’ve used in this article follow this same pattern: the application process runs in the foreground (it could be a Go binary or the Java runtime), and the application is configured to write logs to stdout (or stderr; Docker treats both streams in the same way). Those application logs are written to the output streams by the runtime, and Docker collects them. Figure 2 shows the interaction between the application, the output streams and Docker:

Figure 2. Docker watches the application process in the container and collects its output streams

Container logs are stored as JSON files, so the log entries are available for detached containers which don’t have a terminal session, and for containers which have exited, where the application process has ended. Docker manages the JSON files for you, and they have the same lifecycle as the container: when the container is removed, the log files are removed too.

TRY IT NOW Run a container from the same image in the background as a detached container and check the logs and then the path to the log file:

# run a detached container
docker container run -d --name timecheck diamol/ch15-timecheck:3.0

# check the most recent log entry:
docker container logs --tail 1 timecheck

# stop the container and check the logs again:
docker container stop timecheck
docker container logs --tail 1 timecheck

# check where Docker stores the container log file:
docker container inspect --format='{{.LogPath}}' timecheck

If you’re using Docker Desktop with Linux containers, remember that the Docker Engine is running inside a VM which Docker manages for you — you can see the path to the log file for the container, but you don’t have access to the VM and you can’t read the file directly. If you’re running Docker CE on Linux or you’re using Windows containers then the path to the log file is on your local machine and you can open the file to see the raw contents. You can see my output (using Windows containers) in figure 3:
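If you are running Docker CE on Linux, a quick way to read the raw file is to pass the inspected path straight to cat. This is a minimal sketch: it assumes your user can run docker commands directly, and sudo is usually needed because the file lives under /var/lib/docker:

# read the raw JSON log file for the container (Docker CE on Linux only):
sudo cat $(docker container inspect --format='{{.LogPath}}' timecheck)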

Figure 3. Docker stores container logs in a JSON file and manages the lifetime of that file

The log file is an implementation detail which you don’t usually need to worry about. The format is simple: there’s a JSON object for each log entry, containing the log string, the name of the stream the log came from (stdout or stderr), and a timestamp. Code listing 1 shows a sample of the logs for my timecheck container:

Code listing 1. The raw format for container logs is a simple JSON object

{"log":"Environment: DEV; version: 3.0; time check: 09:42.56\r\n","stream":"stdout","time":"2019-12-19T09:42:56.814277Z"}
{"log":"Environment: DEV; version: 3.0; time check: 09:43.01\r\n","stream":"stdout","time":"2019-12-19T09:43:01.8162961Z"}

The only time you need to think about the JSON is if you have a container which produces lots of logs and you want to keep all the log entries for a period, but in a manageable file structure. By default Docker creates a single JSON log file for each container and lets it grow to any size (until it fills up your disk). You can configure Docker to use rolling files instead, with a maximum size limit: when the log file fills up, Docker starts writing to a new file. You can also configure how many log files to use, and when they’re all full Docker starts overwriting the first one. You can set those options at the Docker Engine level so they apply to every container, or you can set them for individual containers. Configuring logging options for a specific container is a good way to get small, rotated log files for one application but keep all the logs for other containers.
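As a sketch of the engine-level approach, on a Linux host with systemd you could set defaults in the daemon.json file and restart Docker to apply them. The size and count values here are just examples, and the settings only affect containers created after the restart:

# set engine-wide logging defaults for new containers
# (this overwrites daemon.json - merge the keys if you already have one):
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "5" }
}
EOF

# restart the engine to apply the new defaults:
sudo systemctl restart docker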

TRY IT NOW Run the same app again but this time specifying log options to use three rolling log files with a maximum of 5KB each:

# run with log options and an app setting to write lots of logs:
docker container run -d --name timecheck2 --log-opt max-size=5k --log-opt max-file=3 -e Timer__IntervalSeconds=1 diamol/ch15-timecheck:3.0

# wait for a few minutes

# check the log file path:
docker container inspect --format='{{.LogPath}}' timecheck2

You’ll see that the log path for the container is still shown as a single JSON file, but Docker is rotating log files using that name as the base, with a suffix for the log file number. If you’re running Windows containers or Docker CE on Linux you can list the contents of the directory where the logs are kept and you’ll see those file suffixes; mine are shown in figure 4:

Figure 4. Rolling log files let you keep a known amount of log data per container
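If you want to see the rotation for yourself on Docker CE on Linux, you could list the directory that holds the container’s log path. A minimal sketch, assuming sudo access to the Docker data directory:

# find the log path, then list its directory to see the rotated files:
LOG_PATH=$(docker container inspect --format='{{.LogPath}}' timecheck2)
sudo ls -l $(dirname $LOG_PATH)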

Once the application logs are coming out of the container from stdout, Docker puts them through a collection and processing stage, and that’s where you can configure what it does with the logs. In the last exercise you configured the log processing to control the JSON file structure, but there’s much more you can do with container logs. To take full advantage of that you need to make sure every app is pushing its logs out of the container, and in some cases that takes a bit more work.
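What Docker does with the logs is decided by the logging driver, and the json-file driver you’ve seen in this section is the default. You can check what your engine and your containers are using; this is a quick sketch which assumes you haven’t changed the engine config:

# show the engine's default logging driver:
docker info --format '{{.LoggingDriver}}'

# show the driver a specific container is using:
docker container inspect --format='{{.HostConfig.LogConfig.Type}}' timecheck2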

Relaying logs from other sinks to stdout

Not every app fits nicely with the standard logging model, and when you containerize some apps Docker won’t see any logs in the output streams. Some applications run in the background as Windows Services or Linux daemons, and the container startup process isn’t the application process. Other apps might use an existing logging framework which writes to log files or other locations (called sinks in the logging world), like syslog in Linux or the Windows Event Log. Either way there are no application logs coming from the container start process, and Docker doesn’t see any logs.

TRY IT NOW A new version of the timecheck app for this article writes logs to a file instead of stdout. When you run this version there are no container logs, although the app logs are being stored in the container filesystem:

# run a container from the new image:
docker container run -d --name timecheck3 diamol/ch19-timecheck:4.0

# check - there are no logs coming from stdout:
docker container logs timecheck3

# now connect to the running container, for Linux:
docker container exec -it timecheck3 sh

# OR windows containers:
docker container exec -it timecheck3 cmd

# and read the application log file:
cat /logs/timecheck.log

You’ll see that there are no container logs, even though the application is writing lots of log entries. My output is in figure 5 — I need to connect to the container and read the log file from the container filesystem to see the log entries:

Figure 5. If the app doesn’t write anything to the output streams you won’t see any container logs

This happens because the app is using its own log sink (a file in this exercise), and Docker doesn’t know anything about that sink. Docker only reads logs from stdout and stderr; there’s no way to configure it to read from a different log sink inside the container. The pattern for dealing with apps like this is to run a second process in the container startup command, which reads the log entries from the sink the application uses and writes them to stdout. That process could be a shell script or a simple utility app; it’s the final process in the startup sequence, so Docker reads its output stream and the application logs get relayed as container logs. Figure 6 shows how that works:

Figure 6. You need to package a utility in your container image to relay logs from a file

This isn’t a perfect solution. Your utility process runs in the foreground, and it needs to be robust, because if it fails then your container exits, even if the application is still working in the background. The reverse is also true: if the application fails but the log relay keeps running, your container stays up even though the app is no longer working, so you need a health check in your image to catch that. Lastly, this isn’t an efficient use of disk, especially if your app writes a lot of logs: they fill up a file in the container filesystem and fill up a JSON file on the Docker host machine.
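As an illustration of that health check idea, you could add something like this to the Linux stage of the Dockerfile. It’s a hypothetical sketch: it assumes the base image includes the pgrep tool, which it may not, so treat it as the shape of the check rather than a drop-in line:

# hypothetical health check: fail if the app process has died, even though the
# log relay keeps the container running; the [T] stops pgrep matching the
# health check's own shell command line
HEALTHCHECK --interval=30s --retries=3 CMD pgrep -f "[T]imeCheck.dll" || exit 1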

It’s still a useful pattern to know about. If your app runs in the foreground and you can tweak its config to write logs to stdout instead, that’s a better approach; but if your app has to run in the background then there’s no other option, and it’s better to accept the inefficiency and have your app behave like all the other containers. An update to the timecheck app adds this pattern, building a small utility app to watch the log file and relay the lines to stdout. Code listing 2 shows the final stages of the multi-stage Dockerfile, with different startup commands for Linux and Windows:

Code listing 2. Building and packaging a log-relay utility with your app

# app image
FROM diamol/dotnet-runtime AS base

WORKDIR /app
COPY --from=builder /out/ .
COPY --from=utility /out/ .

# windows
FROM base AS windows
CMD start /B dotnet TimeCheck.dll && dotnet Tail.dll /logs timecheck.log

# linux
FROM base AS linux
CMD dotnet TimeCheck.dll & dotnet Tail.dll /logs timecheck.log

The two CMD instructions achieve the same thing, using different approaches for each operating system. First the .NET application process is started in the background, using the start command in Windows and suffixing the command with a single ampersand (&) in Linux. Then the .NET tail utility starts, configured to read the log file the application writes to. The tail utility watches that file and relays each new line as it gets written, so the logs get surfaced to stdout and become container logs.

TRY IT NOW Run a container from the new image and verify that logs are coming from the container, and that they’re still being written to the file in the container filesystem:

# run a container with the tail utility process:
docker container run -d --name timecheck4 diamol/ch19-timecheck:5.0

# check the logs:
docker container logs timecheck4

# and connect to the container - on Linux:
docker container exec -it timecheck4 sh

# OR with Windows containers:
docker container exec -it timecheck4 cmd

# check the log file:
cat /logs/timecheck.log

Now the logs are coming from the container. It’s a convoluted way to get there, with an extra process running to relay the log file contents to stdout, but once the container is running this is all transparent. The downsides are the extra processing power used by the log relay and the extra disk space used storing the logs twice; you can see my output in figure 7, which shows the log file is still in the container filesystem:

Figure 7. A log relay utility gets the application logs out to Docker, but using twice as much disk space
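You can see the cost of the extra process directly: docker container top lists the processes running inside the container, and you should see both the application and the tail utility (the output format varies between Linux and Windows containers):

# list the processes inside the container - expect the app and the relay:
docker container top timecheck4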

I use a custom utility to relay the log entries in this example because I want the app to work across platforms; I could use the standard Linux tail command instead, but there’s no Windows equivalent. The custom utility approach is also more flexible, because it could read from any sink and relay to stdout. That should cover any scenario where your application logs are locked away somewhere in the container where Docker doesn’t see them.
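For comparison, a Linux-only image could drop the custom utility and use the standard tail command as the relay. This is a sketch of what that final stage might look like, assuming the base image includes GNU tail (the -F option keeps retrying until the app has created the log file); it isn’t the approach used in the cross-platform image:

# linux-only alternative: relay with the standard tail command
FROM base AS linux
CMD dotnet TimeCheck.dll & tail -F /logs/timecheck.log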

When you have all your container images set up to write application logs as container logs, then you can start to make use of Docker’s pluggable logging system and consolidate all the logs coming from all your containers.
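As a taste of what that pluggable system looks like, you can switch an individual container to a different logging driver with the --log-driver flag. This sketch uses the built-in syslog driver as an example; the container name and the server address are placeholders you would point at your own log collector:

# send a container's logs to a syslog collector instead of the local JSON file:
docker container run -d --name timecheck5 \
  --log-driver syslog \
  --log-opt syslog-address=udp://logs.example.com:514 \
  diamol/ch19-timecheck:5.0

# docker container logs won't return anything for this container,
# because the entries are no longer stored in the local JSON file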

That’s all for this article. If you want to learn more about the book, you can check it out on our browser-based liveBook reader here.
