Running Software in Containers: a quick guide

From Docker in Action, Second Edition by Jeff Nickoloff and Stephen Kuenzli

This article delves into running software in Docker containers.

Solved problems and the PID namespace

Every running program — or process — on a Linux machine has a unique number called a process identifier (PID). A PID namespace is a set of unique numbers that identify processes. Linux provides tools to create multiple PID namespaces. Each namespace has a complete set of possible PIDs. This means that each PID namespace will contain its own PID 1, 2, 3, and so on.
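To make PIDs concrete, here is a minimal shell sketch that runs outside Docker. It prints the PID of the current shell and of a child shell and, assuming a Linux system that exposes /proc/self/ns/pid, the identifier of the PID namespace the shell belongs to. The variable names are illustrative.

```shell
# $$ expands to the PID of the current shell.
shell_pid=$$

# A child shell started with sh -c is a separate process with its own PID.
child_pid=$(sh -c 'echo $$')
echo "this shell: $shell_pid, child shell: $child_pid"

# On Linux, this symlink names the PID namespace the process lives in.
# Assumption: /proc is mounted; on other systems the check is skipped.
if [ -e /proc/self/ns/pid ]; then
  pid_ns=$(readlink /proc/self/ns/pid 2>/dev/null || echo unknown)
  echo "PID namespace: $pid_ns"
fi
```

Run inside a container, the same lines would typically report much smaller PIDs and a different namespace identifier.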

Most programs don’t need access to other running processes or to be able to list the other running processes on the system. Docker creates a new PID namespace for each container by default. A container’s PID namespace isolates processes in that container from processes in other containers.

From the perspective of a process in one container with its own namespace, PID 1 might refer to an init system process like runit or supervisord. In a different container, PID 1 might refer to a command shell like bash. Run the following to see it in action:
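The commands for this step might look like the following sketch. The container names namespaceA and namespaceB, the busybox:1.29 image tag, and the command each container runs are assumptions chosen for illustration. Everything is wrapped in a hypothetical pid_namespace_demo function so that nothing runs until you call it on a machine with a running Docker daemon.

```shell
# Hypothetical helper: starts two containers, then lists the processes in each.
pid_namespace_demo() {
  # First container: its main process is a long sleep.
  docker run -d --name namespaceA busybox:1.29 /bin/sh -c "sleep 30000"
  # Second container: its main process is a netcat listener.
  docker run -d --name namespaceB busybox:1.29 /bin/sh -c "nc -l -p 80"

  docker exec namespaceA ps   #1
  docker exec namespaceB ps   #2
}
```

Call pid_namespace_demo to run it. Afterward, docker rm -f namespaceA namespaceB removes the two containers.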

Command #1 above should generate a short process list, and command #2 a slightly different one. In both lists the container’s main process appears as PID 1, followed by the ps process itself.

In this example you use the docker exec command to run additional processes in a running container. In this case, the command you run is ps, which shows all the running processes and their PIDs. From the output it’s clear that each container has a process with PID 1.

Without a PID namespace, the processes running inside a container would share the same ID space as those in other containers or on the host. A process in a container would be able to determine what other processes were running on the host machine. Worse, processes in one container might be able to control processes in other containers. A process that can’t reference any processes outside of its namespace is limited in its ability to perform targeted attacks.

Like most Docker isolation features, the PID namespace is optional: you can create containers without their own PID namespace. This is critical if you’re using a program to perform a system administration task that requires process enumeration from within a container. To do so, set the --pid flag on docker create or docker run to host. Try it yourself with a container running BusyBox Linux and the ps Linux command:
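A sketch of that experiment, assuming the busybox:1.29 image tag and a running Docker daemon. The wrapper name host_pid_demo is hypothetical, and --rm is added here only so the container is removed when ps exits.

```shell
# Hypothetical helper: run ps in a container that shares the host's PID namespace.
host_pid_demo() {
  docker run --rm --pid host busybox:1.29 ps   #A
}
```

Call host_pid_demo and compare its output with the much shorter lists produced by containers that have their own PID namespace.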

#A Should list all processes running on the computer

Because each container has its own PID namespace, a process inside it gains no meaningful insight from examining that namespace, but it can take more static dependencies on it. Suppose a container ran two processes: a server and a local process monitor. That monitor could take a hard dependency on the server’s expected PID and use that to monitor and control the server. This is an example of environment independence.

Consider the previous web-monitoring example. Suppose you weren’t using Docker but were running NGINX directly on your computer. Now suppose you forgot that you’d already started NGINX for another project. When you start NGINX again, the second process won’t be able to access the resources it needs because the first process already has them. This is a basic software conflict example. You can see it in action by trying to run two copies of NGINX in the same container:
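One way to set up that conflict is sketched below. The container name webConflict and the nginx:latest image tag are assumptions for illustration, and port_conflict_demo is a hypothetical wrapper so that nothing runs until you call it.

```shell
# Hypothetical helper: start NGINX in a container, then start it again inside
# the same container so both processes compete for the same port.
port_conflict_demo() {
  docker run -d --name webConflict nginx:latest
  docker logs webConflict                          #A
  docker exec webConflict nginx -g 'daemon off;'   #B
}
```

Call port_conflict_demo to run it, and remove the container afterward with docker rm -f webConflict.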

#A The output should be empty, or show only startup messages (recent NGINX images log their startup to the container output)

#B Start a second nginx process in the same container

The last command should display error messages from NGINX instead of starting a second server.

The second process fails to start properly and reports that the address it needs is already in use. This is called a port conflict, and it’s a common issue in real-world systems where several processes are running on the same computer or multiple people contribute to the same environment. It’s a great example of a conflict problem that Docker simplifies and solves. Run each in a different container, like this:
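A sketch of the two-container version, assuming the container names webA and webB and the nginx:latest image tag. The wrapper name separate_containers_demo is hypothetical.

```shell
# Hypothetical helper: run each NGINX instance in its own container, where
# each one gets its own network stack and therefore its own port 80.
separate_containers_demo() {
  docker run -d --name webA nginx:latest   #A
  docker logs webA                         #B
  docker run -d --name webB nginx:latest   #C
  docker logs webB                         #D
}
```

Call separate_containers_demo to run it. Neither container should report an address-in-use error, and docker rm -f webA webB cleans up afterward.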

#A Start the first nginx instance

#B Verify that it is working; the output should be empty or show only startup messages, with no errors

#C Start the second instance

#D Verify that it is working; the output should be empty or show only startup messages, with no errors

Environment independence gives you the freedom to configure software that depends on scarce system resources without regard for other co-located software with conflicting requirements. Here are some common conflict problems:

  • Two programs want to bind to the same network port.
  • Two programs use the same temporary filename, and file locks prevent them from sharing it.
  • Two programs want to use different versions of some globally installed library.
  • Two processes want to use the same PID file.
  • A second program you installed has modified an environment variable that another program uses. Now the first program breaks.
  • Multiple processes are competing for memory or CPU time.
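The PID file conflict from the list above can be reproduced without Docker. In this minimal sketch, the directory, the file name server.pid, the claim_pidfile function, and the program names are all hypothetical. Two programs try to claim the same PID file, and only the first succeeds.

```shell
# Hypothetical PID file inside a fresh temporary directory.
workdir=$(mktemp -d)
pidfile="$workdir/server.pid"

# claim_pidfile NAME: create the PID file only if it does not exist yet.
# With noclobber set, '>' refuses to overwrite an existing file.
claim_pidfile() {
  if ( set -o noclobber; echo "$1" > "$pidfile" ) 2>/dev/null; then
    echo "$1 claimed $pidfile"
  else
    echo "$1 failed: $pidfile is held by $(cat "$pidfile")"
    return 1
  fi
}

claim_pidfile program-one
claim_pidfile program-two || true   # the second claim is expected to fail
```

If each program had its own filesystem root, as a container provides, both could create a file at the same path without colliding.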

All these conflicts arise when one or more programs have a common dependency but can’t agree to share, or have different needs. As in the earlier port conflict example, Docker solves software conflicts with such tools as Linux namespaces, resource limits, filesystem roots, and virtualized network components. All these tools are used to isolate software inside a Docker container.
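Resource limits, one of the tools named above, are set per container. In this hedged sketch the container name webLimited, the nginx:latest image tag, and the limit values are assumptions, while --memory and --cpus are standard docker run flags. The wrapper name limited_container_demo is hypothetical.

```shell
# Hypothetical helper: start a container capped at 256 MB of memory and half
# a CPU, then print the memory limit Docker recorded for it (in bytes).
limited_container_demo() {
  docker run -d --name webLimited --memory 256m --cpus 0.5 nginx:latest
  docker inspect --format '{{.HostConfig.Memory}}' webLimited
}
```

Call limited_container_demo on a machine with a running Docker daemon, and remove the container afterward with docker rm -f webLimited.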

That’s all for now. If you want to learn more about the book, check it out on liveBook here and see this slide deck.