Configuring Gunicorn for Docker

Gunicorn is a common WSGI server for Python applications, but most Docker images that use it are badly configured. Running in a container isn’t the same as running on a virtual machine or physical server, and there are also Linux-environment differences to take into account.

So to keep your Gunicorn setup healthy and happy, in this article I’ll cover:

  • Preventing slowness due to worker heartbeats.
  • Configuring the number of workers.
  • Logging to stdout.

Why Gunicorn “sometimes hang[s] for half a minute”

Gunicorn’s main process starts one or more worker processes, and restarts them if they die. To ensure the workers are still alive, Gunicorn has a heartbeat system—which works by using a file on the filesystem. Gunicorn therefore recommends that this file be stored in a memory-only part of the filesystem.
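To see why a slow disk can stall workers, here is a toy file-based heartbeat. It is a simplified illustration of the idea, not Gunicorn's actual implementation; the class and method names are made up for this sketch.

```python
import os
import tempfile
import time


class Heartbeat:
    """Toy heartbeat: the worker touches a file, the parent reads its mtime."""

    def __init__(self, directory=None):
        # directory=None lets tempfile pick the default temp dir (usually /tmp).
        self._file = tempfile.TemporaryFile(dir=directory)

    def notify(self):
        # Worker side: a metadata write that goes through the filesystem,
        # so it can block if the backing storage stalls.
        now = time.time()
        os.utime(self._file.fileno(), (now, now))

    def seconds_since_update(self):
        # Parent side: how long since the worker last checked in.
        return time.time() - os.fstat(self._file.fileno()).st_mtime


heartbeat = Heartbeat()
heartbeat.notify()
print(f"last heartbeat {heartbeat.seconds_since_update():.3f}s ago")
```

The point of the sketch is that every heartbeat is a filesystem operation: if the directory lives on slow or stalled storage, the worker waits on it.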

As the Gunicorn FAQ explains, the default directory for the heartbeat file is in /tmp, which in some Linux distributions is stored in memory via a tmpfs filesystem. Docker containers, however, do not have /tmp on tmpfs by default:

$ docker run --rm -it ubuntu:18.04 df
Filesystem       1K-blocks     Used Available Use% Mounted on
overlay           31263648 25656756   3995732  87% /
tmpfs                65536        0     65536   0% /dev
tmpfs              4026608        0   4026608   0% /sys/fs/cgroup
/dev/mapper/root  31263648 25656756   3995732  87% /etc/hosts
shm                  65536        0     65536   0% /dev/shm

As you can see, /tmp is using the standard Docker overlay filesystem: it’s backed by the normal block device or hard drive your computer is using.
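If you would rather check this from Python than read df output, a minimal sketch is to look up the path in a /proc/mounts-style mount table. The function name and the sample table below are illustrative assumptions; inside a real Linux container you would pass in the contents of /proc/mounts instead.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format:
    device, mount point, filesystem type, options, ...
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or (path + "/").startswith(prefix)
        # Prefer the longest matching mount point; on ties, later entries win.
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


# Hypothetical mount table resembling a default Docker container.
SAMPLE_MOUNTS = """\
overlay / overlay rw,relatime 0 0
tmpfs /dev tmpfs rw,nosuid 0 0
shm /dev/shm tmpfs rw,nosuid,nodev,noexec 0 0
"""

print(filesystem_type("/tmp", SAMPLE_MOUNTS))
print(filesystem_type("/dev/shm", SAMPLE_MOUNTS))
```

With this sample table, /tmp resolves to the overlay filesystem while /dev/shm resolves to tmpfs.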

And that can lead to performance problems—to quote the FAQ: “in AWS an EBS root instance volume may sometimes hang for half a minute and during this time Gunicorn workers may completely block.”

Presumably you don’t want your workers blocking for 30 seconds, so what should you do? One option is to mount a tmpfs or ramfs in-memory filesystem onto /tmp, using Docker’s volume support. This will work, but not everywhere: not all environments that run Docker containers support arbitrary volumes.

A more general solution is to tell Gunicorn to store its temporary file elsewhere. In particular, if you look above you’ll see that /dev/shm is mounted as shm: shared memory, an in-memory filesystem.

So all you need to do is tell Gunicorn to use /dev/shm instead of /tmp. (This will be at least somewhat documented in the Gunicorn FAQ when the release after 19.9.0 comes out, but you’ll still have to remember to do it.)

Here’s how you do it on the command line:

$ gunicorn --worker-tmp-dir /dev/shm ...
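The same setting can also live in a Gunicorn configuration file, which is plain Python. The sketch below assumes you want to fall back to Gunicorn's default when /dev/shm is missing or not writable; the helper name pick_worker_tmp_dir is made up for this example, and only worker_tmp_dir is an actual Gunicorn setting.

```python
# gunicorn.conf.py (sketch)
import os


def pick_worker_tmp_dir(candidate="/dev/shm"):
    """Return `candidate` if it is a writable directory, else None.

    None is Gunicorn's default, meaning "use the normal temp directory".
    """
    if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
        return candidate
    return None


# Gunicorn reads this setting; the helper above is ignored by Gunicorn itself.
worker_tmp_dir = pick_worker_tmp_dir()
```

You would then start the server with gunicorn -c gunicorn.conf.py ... instead of passing the flag each time.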

Configuring the number of workers

If you’re running Gunicorn directly on the base hardware or in a virtual machine, you typically want a single Gunicorn instance to make use of all available CPUs. Since Python isn’t great at using multiple CPUs, typically you’d start multiple workers, each a different process, so as to utilize all the CPUs.

When running in a container, however, you are typically in an environment that scales up by running more containers. Heroku, AWS Elastic Beanstalk, Kubernetes: all of them hide the hardware and expect to utilize multiple CPUs by spinning up multiple containers.

So it can be tempting to start Gunicorn with just a single worker. However, many of these systems also include a heartbeat mechanism that checks whether your server is alive by sending it an occasional query.

If you only have one worker, and it’s stuck handling a slow query, the heartbeat query will time out. At that point the load balancer will decide the container is stuck and stop sending it queries. In some environments it might also get restarted. (Thanks to Jeremy Thurgood for telling me about this problem.)
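The effect is easy to reproduce with a toy model. In the sketch below a thread pool stands in for the request slots a server has; this is an analogy, not how Gunicorn is implemented, and the function name and timings are arbitrary choices for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def health_check_latency(slots, slow_seconds=0.5):
    """Submit one slow request, then a health check; return the check's latency."""
    with ThreadPoolExecutor(max_workers=slots) as pool:
        pool.submit(time.sleep, slow_seconds)  # the slow query occupies a slot
        start = time.monotonic()
        pool.submit(lambda: "ok").result()     # the health check needs a free slot
        return time.monotonic() - start


print(f"1 slot:  {health_check_latency(1):.2f}s")
print(f"2 slots: {health_check_latency(2):.2f}s")
```

With a single slot the health check has to wait roughly as long as the slow request takes; with a second slot it is answered almost immediately.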

The solution: start at least two workers, and probably also start a number of threads using the gthread worker backend. That way each worker process can handle multiple queries so long as some of its time is spent waiting (e.g. for a database query to return). The goal here is not scaling but making good use of whatever CPU the container is given, while reducing the chances of being unable to respond to a heartbeat query.

$ gunicorn --workers=2 --threads=4 --worker-class=gthread ...
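If you prefer a configuration file over command-line flags, the equivalent settings look like the sketch below. The values 2 and 4 are the ones from the command above, not a universal recommendation, and max_concurrent_requests is just an illustrative variable, not a Gunicorn setting.

```python
# gunicorn.conf.py (sketch): same settings as the command line above
workers = 2                # at least two processes, so one slow request can't block everything
threads = 4                # threads per worker, useful when requests spend time waiting on I/O
worker_class = "gthread"   # the threaded worker backend

# Illustrative only: upper bound on requests this container handles at once.
max_concurrent_requests = workers * threads
print(max_concurrent_requests)
```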

Logging

Container schedulers typically expect logs to come out on stdout/stderr, so you should configure Gunicorn to do so:

$ gunicorn --log-file=- ...
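In a configuration file, the same option is the errorlog setting. The sketch below also turns on the access log, which is an addition of mine rather than something the command above does; leave that line out if you only want the error log.

```python
# gunicorn.conf.py (sketch)
errorlog = "-"    # equivalent to --log-file=- : "-" sends the error log to stderr
accesslog = "-"   # addition beyond the command above: "-" sends the access log to stdout
```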

You may decide not to bother if you have nginx in front of Gunicorn and you want to use its logs instead.

nginx isn’t always necessary

Speaking of nginx, you don’t always need nginx or another proxy in front of Gunicorn. Many container deployment systems already have an HTTP load balancer/reverse proxy built-in, in which case Gunicorn isn’t being exposed directly to HTTP clients anyway.

Containers are different

Running an application in a container is subtly different than running on a machine or in a VM: you have a different level of control (you typically can’t mount filesystems from inside the running container), different scaling models, and often different networking configurations.

Don’t just copy your old configuration—make sure to customize it appropriately for running in a container.