How to run uWSGI | ionel's codelog

Given the cornucopia of options uWSGI offers, it's really hard to figure out which options and settings are good for your typical web app.

Normally you'd just balk and run something simpler with fewer knobs and dials, like mod_wsgi with Apache, but alas, uWSGI is so flexible and has so many features that mod_wsgi lacks. If only it weren't so tricky to configure...

First off, hands down, this is the most important setting - you should always start your configuration in strict mode. This will save you lots of pain and suffering if you ever fiddle with options.

[uwsgi]
# Error on unknown options (prevents typos)
strict = true

In general the most reliable concurrency model is processes, with no threads:

# Formula: cores * 2 + 2
processes = %(%k * 2 + 2)
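
As a rough illustration of what that formula evaluates to, here is a small Python sketch. It assumes %k matches what os.cpu_count() reports, which may not hold inside containers with CPU limits. worker_count is a hypothetical helper, not part of uWSGI.

```python
import os


def worker_count(cores=None):
    """Mirror the `cores * 2 + 2` formula from the config above."""
    if cores is None:
        # os.cpu_count() can return None when the count is unknown
        cores = os.cpu_count() or 1
    return cores * 2 + 2


if __name__ == "__main__":
    print(worker_count())
```

On a 4-core box this gives 10 workers, so check that your memory budget can take that many copies of the app.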

You could enable threads (the threads option) and use fewer processes, but that can be problematic for code that is CPU-bound or not thread-safe. I wouldn't enable the gevent plugin - you're just asking for trouble with all that monkey-patching. Essentially, with processes you're using more memory to avoid certain problems.

Most of the useful uWSGI features rely on the master process, so it's a pretty much mandatory option:

# Most of uWSGI features depend on the master mode
master = true

So now that we have a master process we can either load the application once in the master or load it in every worker process. If your project has lots of imports and things going on at import time, loading it once is worth considering, but you need to be wary of how you manage external resources (like connections, locks and whatnot).

Basically each worker would be a copy of the master process. While the memory is copy-on-write the resources probably aren't.
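
One common way to cope with this, sketched below under the assumption that the resource is cheap to re-create, is to key it on the process ID so a forked worker never reuses something opened in the master. PerProcess and its factory argument are hypothetical names for illustration only.

```python
import os


class PerProcess:
    """Lazily build a resource once per process (hypothetical helper)."""

    def __init__(self, factory):
        self._factory = factory
        self._pid = None
        self._value = None

    def get(self):
        pid = os.getpid()
        if self._pid != pid:
            # First use in this process, or we are in a forked child:
            # build a fresh resource instead of sharing the parent's.
            self._value = self._factory()
            self._pid = pid
        return self._value
```

You would wrap something like a database or cache client factory in it and call get() wherever the connection is needed. This does nothing for locks held at fork time, which is the harder problem.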

You can deal with shared FDs by marking them as close-on-exec (COE). These options will make uWSGI mark all the FDs as COE before forking a worker, and after forking uWSGI's internal FDs will also be COE (in case you ever want to call fork() in your crazy app).

# Close fds on fork (don't allow subprocess to mess with parent's fds)
close-on-exec = true
close-on-exec2 = true
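
If the close-on-exec flag itself is unfamiliar, this small sketch shows what it looks like from Python on a Unix system. It only illustrates the flag and says nothing about how uWSGI sets it internally. The two helper names are made up.

```python
import fcntl
import os


def is_close_on_exec(fd):
    """Return True if the descriptor is closed automatically on exec()."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def set_close_on_exec(fd):
    """Set FD_CLOEXEC while keeping any other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    os.set_inheritable(read_fd, True)  # clear the flag on purpose
    print(is_close_on_exec(read_fd))
    set_close_on_exec(read_fd)
    print(is_close_on_exec(read_fd))
    os.close(read_fd)
    os.close(write_fd)
```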

Locks can't be dealt with automatically. Well, the stdlib tries to, and even though there have been many bug fixes for logging locks being improperly shared after a fork, you can always get a very sticky surprise. So essentially you need to ask yourself what's more important - speed or correctness.

If you're prepared to have health checks and rolling deployments, you shouldn't care so much about server boot time - I'm pretty sure correctness is what you want, thus you should make uWSGI import your code after it has started all the workers. Slower but safer:

# In case there's some bad global state (pointless to use with need-app = true)
lazy-apps = true

Otherwise, if you load the app before fork, you might as well just make the service fail if it can't load the app at all. You can probably avoid implementing fancy health checks by just using this:

# Exit if no app can be loaded (pointless to use with lazy-apps = true)
need-app = true

You still need to have threading enabled most of the time, for example if you use Sentry:

# Enable threads for sentry, see:
# https://docs.sentry.io/clients/python/advanced/#a-note-on-uwsgi
enable-threads = true

Assuming you want to run a single project certain things can be disabled:

# Avoid multiple interpreters (automatically created in case you need mounts)
single-interpreter = true

Even if you don't run your app in a Docker container this is a good thing to do. Strangely uWSGI doesn't do this by default - a consequence of having too many features and use-cases I guess...

# Respect SIGTERM and do shutdown instead of reload
die-on-term = true

The preferred way to load your app should be module, as it forces you to get your application imported correctly. If you want to keep the configuration file generic you can use an environment variable, for example:

# WSGI module (application callable expected inside)
module = $(DJANGO_PROJECT_NAME).wsgi
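
For reference, the module option expects a Python module that exposes a callable named application. Django generates one for you. The sketch below is a bare WSGI equivalent, with a made-up file path and response body.

```python
# myproject/wsgi.py (hypothetical path, for illustration)


def application(environ, start_response):
    """Smallest possible WSGI callable: always answer 200 with plain text."""
    body = b"Hello from uWSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

With DJANGO_PROJECT_NAME=myproject in the environment, the config line above would resolve to myproject.wsgi.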

A bit of process management is necessary most of the time:

# Respawn processes that take more than ... seconds
harakiri = 300
harakiri-verbose = true

# Respawn processes after serving ... requests
max-requests = 5000

# Respawn if processes are bloated
reload-on-as = 1024
reload-on-rss = 512

# We don't expect abuse so let's have the fastest respawn possible
forkbomb-delay = 0

I wouldn't use the evil reload variants (evil-reload-on-rss and evil-reload-on-as) as they will kill your workers at unexpected points and that job is better left to the Linux OOM killer anyway.

Assuming you'll have an Nginx frontend, the best way to connect them is via a unix domain socket - it has the lowest overhead, and well, it's better to have a file with the wrong perms than a port open on the wrong interface. Assuming you'll start uWSGI as root:

# Assuming we start from root we need to create the socket way early
shared-socket = /var/run/app.uwsgi
chmod-socket = 666
socket = =0

# Change user after binding the socket
uid = app
gid = app

In Nginx all you need is something along these lines.

http {
  # Some fine-tuning
  client_max_body_size 10m;
  client_body_buffer_size 64k;
  large_client_header_buffers 8 32k;

  server {
    location / {
      include /etc/nginx/uwsgi_params;
      uwsgi_pass unix:/var/run/app.uwsgi;

      uwsgi_ignore_client_abort on;
      uwsgi_next_upstream off;
      uwsgi_read_timeout 300;
      # Prevent nginx discarding large responses.
      uwsgi_buffering on;
      # Initial response size (practically headers size)
      uwsgi_buffer_size 64k;
      uwsgi_buffers 8 32k;
    }
  }
}

Why do we need all these buffer tweaks and limits, you wonder? Well, you should strive for compatibility and resilience:

- Allow requests with lots of cookies, should you need cookie session storage. That means big headers, thus we increase some buffer sizes.
- Disallow really large uploads. Most apps don't need to take file uploads larger than 10 MB so that's a good default.
- Prevent getting DoS-ed by slow-client attacks like Slowloris or RUDY. That means the frontend needs to buffer the request body - an acceptable trade-off if we also have a request body limit.

With those settings you should fare pretty well, but you should always test anyway - slowhttptest is available as a Fedora and Ubuntu package.

Note that each worker will access the socket directly (call accept() on that socket) regardless of protocol (TCP or UDS) thus some workloads won't be evenly distributed to the uWSGI workers. So if you have an application that has some slow views and some fast views a good option to consider is this:

# Enable an accept mutex for a more balanced worker load
thunder-lock = true

You're essentially trading a bit of throughput and minimum latency for way better maximum latency. The uWSGI documentation covers this in its article on serializing accept() (the thundering herd problem).

Other useful options:

# Good for debugging/development
auto-procname = true
log-5xx = true
log-zero = true
log-slow = 1000
log-date = [%%Y-%%m-%%d %%H:%%M:%%S]
log-format = %(ftime) "%(method) %(uri)" %(status) %(rsize)+%(hsize) in %(msecs)ms pid:%(pid) worker:%(wid) core:%(core)
log-format-strftime = [%%Y-%%m-%%d %%H:%%M:%%S]

# Enable the stats service for uwsgitop, pip install uwsgitop, and run:
#   uwsgitop /var/run/app.stats
stats = /var/run/app.stats
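
If you'd rather script against the stats socket than watch uwsgitop, a sketch like the following may be enough. It assumes the stats server writes one JSON document and then closes the connection, and that each entry under workers carries id and status keys. Verify both against your uWSGI version. All function names are made up.

```python
import json
import socket


def read_all(sock):
    """Read from a connected socket until the peer closes it."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)


def fetch_stats(path="/var/run/app.stats"):
    """Connect to the stats unix socket and decode the JSON dump."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        return json.loads(read_all(sock))


def busy_workers(stats):
    """Return ids of workers reported as busy (assumed field names)."""
    return [w["id"] for w in stats.get("workers", [])
            if w.get("status") == "busy"]
```

Calling busy_workers(fetch_stats()) from a cron job or a health check gives a crude saturation signal.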

Another problem you might care about, especially if you got used to apachectl -k graceful, is waiting for pending requests at shutdown. uWSGI just kills all the workers by default. You can enable graceful shutdown with this hook:

# See: https://github.com/unbit/uwsgi/issues/849#issuecomment-118869386
# Note that SIGTERM is 15 not 1 :-)
hook-master-start = unix_signal:15 gracefully_kill_them_all

Note that it would make uWSGI always do a graceful shutdown, and you should always have harakiri enabled if you use this. Otherwise shutdowns and restarts can get stuck.

Another way to do this is to use the master fifo and send a graceful shutdown command, e.g.:

# For graceful shutdown you can run: echo q > /var/run/fifo.uwsgi
master-fifo = /var/run/fifo.uwsgi
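
The same command can be sent from a deploy script. The sketch below opens the fifo in non-blocking mode so that it fails fast with an OSError instead of hanging when no uWSGI master is reading from it. send_fifo_command is a hypothetical helper, and the default path simply mirrors the config above.

```python
import os


def send_fifo_command(command, path="/var/run/fifo.uwsgi"):
    """Write a master-fifo command such as "q" (graceful shutdown)."""
    # O_NONBLOCK makes open() raise immediately if nobody reads the fifo
    fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    try:
        os.write(fd, command.encode("ascii"))
    finally:
        os.close(fd)


if __name__ == "__main__":
    send_fifo_command("q")
```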

You can also use this method to do a brutal shutdown/restart and other things.

But what if I don't want to run Nginx?

uWSGI certainly makes this possible but alas, it also makes it very hard to get it right. Remember that we need the frontend to protect the workers from abusive clients?

You'd think that running an HTTP router (the http option) as opposed to having the workers serve HTTP directly (the http-socket option) would protect from Slowloris or RUDY (slow request body attack) but you'd be very wrong.

You can easily test this by running slowhttptest -B. It fails quickly all while Nginx runs like a champ. So is there a way to solve this? Or, how ugly is it? Funnily enough it's possible, and yes it's ugly and contrived:

# Same setup as before, allow starting as root and changing user later by using a shared socket
shared-socket = /var/run/app.uwsgi
chmod-socket = 666
uwsgi-socket = =0

# This is how a request runs with this setup:
#   http request -> http router -> fastrouter -> worker
http-to = /var/run/app.router
http = :8000
fastrouter = /var/run/app.router
fastrouter-use-pattern = /var/run/app.uwsgi

# Buffer in-memory up to 64kb
fastrouter-post-buffering = %(64 * 1024)

# 10Mb request body limit
limit-post = %(10 * 1024 * 1024)

It can't be simpler because the post-buffering option (necessary to prevent the workers getting hosed up by slow requests) doesn't apply to the http router - it applies to the worker. There's no http-post-buffering option thus the only choice is to have the fastrouter as the buffering middleman.

Note that it's best to leave fastrouter-post-buffering to a small value as buffer handling isn't very well done in uWSGI.

Likely you'll need to serve static files as well:

static-map = /static=/var/www/static
# Expire after 24h
static-expires = .* %(24 * 60 * 60)
static-gzip-all = true

The one tricky bit is the static-gzip-all option - uWSGI doesn't gzip on the fly - it expects .gz files around. There's a really easy way to build them using whitenoise. Either run python -m whitenoise.compress or use this Django setting:

# This automatically creates a .gz file for each static file
STATICFILES_STORAGE = "whitenoise.storage.CompressedStaticFilesStorage"
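
To make it concrete what static-gzip-all expects on disk, here is a stdlib-only sketch that writes a .gz sibling next to each compressible file. It is a simplified stand-in, not how whitenoise does it. The extension list and the precompress name are assumptions.

```python
import gzip
import shutil
from pathlib import Path

# Assumed set of extensions worth compressing; adjust to your assets
COMPRESSIBLE = {".css", ".js", ".svg", ".html", ".txt", ".json"}


def precompress(root):
    """Create `<name>.gz` next to every compressible file under root."""
    written = []
    # sorted() materializes the listing first, so the new .gz files
    # are never picked up by the same walk
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in COMPRESSIBLE:
            continue
        target = path.with_name(path.name + ".gz")
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(target)
    return written


if __name__ == "__main__":
    for created in precompress("/var/www/static"):
        print(created)
```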

Now you might wonder why not also gzip responses. There are two ways of doing it - both problematic:

Use http-auto-gzip like in this uWSGI guide. Note that:
- You have to stop sending Content-Length from your application. You'll end up implementing middleware that removes the Content-Length that django.middleware.common.CommonMiddleware adds. No, you should not just remove CommonMiddleware for obvious reasons.
- The uWSGI-Encoding header is not removable with this technique (response-route-run = delheader:uWSGI-Encoding doesn't actually work).
- You cannot tweak the compression ratio (it's hardcoded at 9 - not really that efficient CPU-wise).
Here's an example that would work in general, with the aforementioned tradeoffs:
```
# I wouldn't copy this...
http-auto-gzip = true
collect-header = Content-Type RESPONSE_CONTENT_TYPE
response-route-if = equal:${RESPONSE_CONTENT_TYPE};application/json addheader:uWSGI-Encoding: gzip
response-route-if = startswith:${RESPONSE_CONTENT_TYPE};text/ addheader:uWSGI-Encoding: gzip
```

Use transformations. Although this approach is a bit more flexible, you still cannot tweak the compression ratio (same hardcode at 9 - inefficient CPU-wise) and it's more complex as you can see:

collect-header = Content-Type RESPONSE_CONTENT_TYPE
collect-header = Content-Length RESPONSE_CONTENT_LENGTH
# uWSGI internals are not that smart, thus no content-length means it's 0
response-route-if = empty:${RESPONSE_CONTENT_LENGTH} goto:no-length
# Don't bother compressing 1kb responses, not worth the trouble
response-route-if = islower:${RESPONSE_CONTENT_LENGTH};1024 last:
response-route-label = no-length
# Make sure the client actually wants gzip
response-route-if = contains:${HTTP_ACCEPT_ENCODING};gzip goto:check-response
response-route-run = last:
response-route-label = check-response
# Don't bother compressing non-text stuff, usually not worth it
response-route-if = equal:${RESPONSE_CONTENT_TYPE};application/json goto:apply-gzip
response-route-if = startswith:${RESPONSE_CONTENT_TYPE};text/ goto:apply-gzip
response-route-run = last:
response-route-label = apply-gzip
response-route-run = gzip:
# Why apply this filter too you wonder? The gzip transformation is not smart
# enough to chunk the body or set a Content-Length, thus keepalive will be broken
http-auto-chunked = true

Previously this blog post had response-route-run = chunked: but it appears that http-auto-chunked performs better.
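
Since the goto/label routing rules above are hard to read, here is the same decision restated as a plain Python function. It is only a reading aid and assumes islower means "numerically lower than". should_gzip is a made-up name and nothing here runs inside uWSGI.

```python
def should_gzip(content_type, content_length, accept_encoding):
    """Mirror the response-route rules: return True when gzip is applied."""
    # A missing Content-Length skips the size check (the no-length label)
    if content_length and int(content_length) < 1024:
        return False
    # The client must advertise gzip support
    if "gzip" not in accept_encoding:
        return False
    # Only JSON (exact match, as in the rule) and text/* are compressed
    return content_type == "application/json" or content_type.startswith("text/")
```

Note that the exact match means a response sent as application/json with a charset parameter would not be compressed by these rules.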

TL;DR

I just want to run uWSGI standalone, just give me my copy-pasta config or I'll copy something really bad from SO!

🙄

[uwsgi]
# Error on unknown options (prevents typos)
strict = true

# Formula: cores * 2 + 2
processes = %(%k * 2 + 2)

# Most of uWSGI features depend on the master mode
master = true

# Close fds on fork (don't allow subprocess to mess with parent's fds)
close-on-exec = true
close-on-exec2 = true

# In case there's some bad global state (pointless to use with need-app = true)
lazy-apps = true

# Enable threads for sentry, see:
# https://docs.sentry.io/clients/python/advanced/#a-note-on-uwsgi
enable-threads = true

# Avoid multiple interpreters (automatically created in case you need mounts)
single-interpreter = true

# Respect SIGTERM and do shutdown instead of reload
die-on-term = true

# See: https://github.com/unbit/uwsgi/issues/849#issuecomment-118869386
# Note that SIGTERM is 15 not 1 :-)
hook-master-start = unix_signal:15 gracefully_kill_them_all

# All the commands: https://uwsgi-docs.readthedocs.io/en/latest/MasterFIFO.html
master-fifo = /var/run/app.fifo

# Respawn processes that take more than ... seconds
harakiri = 300
harakiri-verbose = true

# Respawn processes after serving ... requests
max-requests = 5000

# Respawn if processes are bloated
reload-on-as = 1024
reload-on-rss = 512

# We don't expect abuse so let's have the fastest respawn possible
forkbomb-delay = 0

# Enable an accept mutex for a more balanced worker load
thunder-lock = true

# Good for debugging/development
auto-procname = true
log-5xx = true
log-zero = true
log-slow = 1000
log-date = [%%Y-%%m-%%d %%H:%%M:%%S]
log-format = %(ftime) "%(method) %(uri)" %(status) %(rsize)+%(hsize) in %(msecs)ms pid:%(pid) worker:%(wid) core:%(core)
log-format-strftime = [%%Y-%%m-%%d %%H:%%M:%%S]

# Enable the stats service for uwsgitop, pip install uwsgitop, and run:
#   uwsgitop /var/run/app.stats
stats = /var/run/app.stats

# Same setup as before, allow starting as root and changing user later by using a shared socket
shared-socket = /var/run/app.uwsgi
chmod-socket = 666
uwsgi-socket = =0

# Change user after binding the socket
uid = app
gid = app

# This is how a request runs with this setup:
#   http request -> http router -> fastrouter -> worker
http-to = /var/run/app.router
http = :8000
fastrouter = /var/run/app.router
fastrouter-use-pattern = /var/run/app.uwsgi

# Buffer in-memory up to 64kb
fastrouter-post-buffering = %(64 * 1024)

# 10Mb request body limit
limit-post = %(10 * 1024 * 1024)

static-map = /static=/var/www/static
# Expire after 24h
static-expires = .* %(24 * 60 * 60)
# Don't forget to run python -m whitenoise.compress or similar!
static-gzip-all = true

# Apply conditional gzip encoding
collect-header = Content-Type RESPONSE_CONTENT_TYPE
collect-header = Content-Length RESPONSE_CONTENT_LENGTH
# uWSGI internals are not that smart, thus no content-length means it's 0
response-route-if = empty:${RESPONSE_CONTENT_LENGTH} goto:no-length
# Don't bother compressing 1kb responses, not worth the trouble
response-route-if = islower:${RESPONSE_CONTENT_LENGTH};1024 last:
response-route-label = no-length
# Make sure the client actually wants gzip
response-route-if = contains:${HTTP_ACCEPT_ENCODING};gzip goto:check-response
response-route-run = last:
response-route-label = check-response
# Don't bother compressing non-text stuff, usually not worth it
response-route-if = equal:${RESPONSE_CONTENT_TYPE};application/json goto:apply-gzip
response-route-if = startswith:${RESPONSE_CONTENT_TYPE};text/ goto:apply-gzip
response-route-run = last:
response-route-label = apply-gzip
response-route-run = gzip:
# Why apply this filter too you wonder? The gzip transformation is not smart
# enough to chunk the body or set a Content-Length, thus keepalive will be broken
http-auto-chunked = true

Addendum

Note that this example will make uWSGI create several files at /var/run - it should be writable by the app user.