james.walters.click

Thoughts and lessons learned along the path of software development.

What Django Deployment is Really About

By James Walters

Django has this reputation for being hard to deploy. I don't think that's really true.

I think that people haven't taken the time to explain to beginners the concepts you're thinking about when it comes to deployment. We focus so much on teaching people how to build apps in Django that deployment feels like an afterthought. People are good at making a list of steps, and saying, "Well, here's how I do it." But if that solution isn't suitable because it uses a provider that won't work for you, or because the steps that worked yesterday don't work today, or for one of any number of reasons, then our beginner is stuck without any way to move forward.

I think if we just took the time to explain what the fundamental things are that you're trying to achieve in deployment, it would help beginners get their feet planted and be able to start figuring out each piece of the process, and what works for them in each part.

If you're here looking for deployment steps, you're at the wrong place. But if you'd like to get the lay of the land, an overview of what deployment's really about, then I hope this helps.

The Penguin in the Room 🐧️

Despite the fact that I'm a Linux Grandpa, I'm not here to talk about Linux today. I understand a lot of people getting started with Django don't have a lot of familiarity with Linux, and it's a whole new world to get acquainted with, but for the purposes of deployment, it's not so much a concept as it is an implementation detail.

Linux is just an operating system (more accurately, a family of operating systems). It just runs the computer that's going to be serving your site. That may be different than what you're used to, but it's not an entirely new concept. A new concept that's worth considering for deployment would be the idea of skipping the OS layer altogether (ehhhhhhhh, sort of) and deploying with a serverless function. But we're going to count that as outside our purview. If you're capable of considering the pros and cons of going serverless, I think you probably already have a firm grasp of what we're going to get into here. 🙂

I think when it comes to deploying a Django site, you have four overall concerns (and it might be three, depending on what you choose to do):

  1. Static Files
  2. Database
  3. WSGI Server
  4. Web Server

We'll also mention a couple of minor concerns as we go.

1. Static Files 📁

The first thing that tends to throw beginners for a loop is all of a sudden we start worrying about what to do with static files. What's the deal? I thought Django was handling this for me. {% static %}, right?

Here's what's going on with static files. These are assets like your images, CSS stylesheets, and scripts. They're called "static" files because they aren't dynamically generated. Most of your Django project's pages are built with templates that get filled in with information by your program on demand when users request the page. That never happens with your images or CSS files. They're just files sitting on disk, ready to be served.

Web servers (like Apache or nginx) were built for the task of taking files on disk and serving them to users. That's where the web started. But we have a Django application because many pages need to be generated on demand with data from the database; we don't have them sitting on disk. A web server can't do that on its own, so we need an application to generate those pages.

But since these static files aren't generated by your Django app, we don't want Django to handle them. Your Django program is actually way slower than the web server for this job, so we want the web server to do it.

How Django handles that is it stuffs all the static files into one directory (this will be STATIC_ROOT in your settings.py). This makes it easy for the web server to look and find the files it needs to serve. Then, we use {% static %} in our templates to build each file's URL from the STATIC_URL setting. The web server is configured to answer requests under that URL prefix straight from STATIC_ROOT, so those requests never reach Django at all.
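As a rough sketch, the two settings involved might look like this. The project location and the "staticfiles" directory name are assumptions for illustration, not requirements.

```python
from pathlib import Path

# Hypothetical project location. The settings.py that startproject
# generates works this out from the file's own path instead.
BASE_DIR = Path("/srv/mysite")

# URL prefix that {% static %} puts in front of every static file path.
STATIC_URL = "/static/"

# The single directory the web server will read static files from.
STATIC_ROOT = BASE_DIR / "staticfiles"
```

With settings like these and Django's default static files storage, {% static 'css/site.css' %} would render as /static/css/site.css, and the web server would look for that file under STATIC_ROOT.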

Now, the reason all this might seem kind of new is because Django has been handling static files for you in development. The Django development server (manage.py runserver) has to handle those requests, because when you're developing locally it's the only web server available. But when you deploy, you'll have to use a better web server than that, and that server will be quite capable of handling static files on its own.

Configuring a web server to handle static file requests is beyond the scope of this post, but here's a helpful guide on the topic. Give it a try! I was pleasantly surprised at how straightforward it was to write an nginx configuration file.

Once you've set a STATIC_ROOT to stuff your static files in, you can actually stuff them there by running manage.py collectstatic. You'll have to do this anytime you change any of your static files.

WhiteNoise

It might be the case that, for whatever reason, you can't configure the web server for your site. In that case, your best option might be to use a tool called WhiteNoise. This is a middleware that you can use in your Django app. In case you haven't run into them yet, every Django request passes through the middlewares listed in settings.py, and they each get their turn to do something. The WhiteNoise middleware checks to see if the request is for a static file, and if so, it interrupts the rest of the request cycle (which would be slow) and immediately serves up the static file.
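Here's a minimal sketch of what enabling it looks like in settings.py, assuming the whitenoise package is installed. The surrounding middleware entries are just the usual Django defaults, trimmed for space, so your own list will likely be longer.

```python
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # Placed right after SecurityMiddleware so static file requests are
    # answered before the rest of the middleware stack runs.
    "whitenoise.middleware.WhiteNoiseMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    # ... the rest of your middleware
]
```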

You might also opt for this option if you're lazy. But don't—it still isn't as fast as a web server handling this directly, and configuring nginx really isn't that hard. If you're a web developer, you should understand the tools used to run your code.

Media Files

It's worth mentioning that the same concerns apply to user-uploaded content, what Django calls "media files". If you have a site where users can e.g. upload a profile picture, you'll want to configure these as well.

2. Database 💾

The next thing that you'll have to pay some thought to in deployment is your database situation.

In development, you've been using a local SQLite file. SQLite is really nifty, because it keeps everything in one single file. Since you don't "connect" to it, you don't have to handle connection details. Your DATABASES setting in settings.py is pretty simple.

In deployment, SQLite may not be suitable (though it's more capable than you're often led to believe). Generally if you're using some sort of cloud provider, you'll be connecting to a database that's managed for you. This isn't usually different for your app in any material way (though databases differ in both features and limitations), but it is a rather important detail of deployment you need to be aware of.

The important difference will be that you'll need to configure connection details in your settings.py. This will mean adding a host address, port, username and password. It'll wind up looking something like this example from the Django docs:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydatabase',
        'USER': 'mydatabaseuser',
        'PASSWORD': 'mypassword',
        'HOST': '127.0.0.1',
        'PORT': '5432',
    }
}

Note the plaintext user and password—we'll return to this topic later.

Once you've got your deployment database set up, you'll want to run manage.py migrate to set it up for the first time, just like you did way back at the beginning of your project to set up SQLite. From that point onward, you should be good to go.

3. WSGI Server 🥃

Now we get to an important lesson in understanding how your Django app actually runs.

In development, you've been using manage.py to run your app. But as the name would suggest, manage.py is just a management script. How do we run our Django app in a production environment?

You'll recall that we talked about having a web server (like nginx) installed. This is the piece of software that receives HTTP requests and returns HTTP responses. Web servers are great at serving files off of the computer's disk. But we have Django apps (or PHP apps, or web apps written in any other language or framework) to generate web pages on demand. So we have a web server that accepts requests and returns responses, and we have a web application that is capable of generating pages on demand.

How do we get these two things to work together though? The web server just handles files, how do we make it talk to another application?

In older days, a standard was defined called CGI (or Common Gateway Interface). Basically, the way it worked was when a web server received a request that needed to be handled by the web application, it would run a CGI script (these were often written in Perl), passing information about the request through environment variables and the request body over standard input. That script could then call the web application the way it needed to be called and pass it the appropriate info.
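To make that concrete, here's a toy CGI-style handler written in Python rather than Perl. In CGI, request details arrive in environment variables such as QUERY_STRING, and whatever the script prints to standard output becomes the response. The function name and the greeting are made up for illustration.

```python
import os
from urllib.parse import parse_qs


def build_cgi_response(environ):
    """Build the text a CGI script would print for one request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = f"Hello, {name}!\n"
    # A CGI response is headers, then a blank line, then the body.
    return f"Content-Type: text/plain\n\n{body}"


if __name__ == "__main__":
    # The web server sets these variables before launching the script.
    print(build_cgi_response(os.environ), end="")
```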

In Python land, we have a similar standard called WSGI (or Web Server Gateway Interface). It's usually pronounced whizz-ghee. The way this works is the web server passes the request over to a WSGI server (like gunicorn or Waitress). That WSGI server runs a WSGI callable for your application. How on earth do I make that, you ask? Django already did it for you when you ran startproject—it's in the wsgi.py file.
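If you're curious what a WSGI callable actually is, here's a bare one with no Django involved. It's only a sketch of the interface: the server calls it with a dictionary describing the request and a start_response function, and it returns the body as bytes. The call_once helper is a hypothetical stand-in for what a real WSGI server does on each request.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI callable: takes the request environ, returns body chunks."""
    body = f"You asked for {environ['PATH_INFO']}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_once(app, path):
    """Call a WSGI app the way a server would, for one fake GET request."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the other required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

The wsgi.py that Django generates exposes a module-level callable named application that follows this same calling convention, which is why any WSGI server can run your project.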

So, you need to install a WSGI server, and pass it your wsgi.py file to run. If you're setting up a virtual machine, you might write a script to run this on boot, or use a systemd service. Then, you configure the web server to pass requests over to the WSGI server. The WSGI server will run it through your Django app, and after generating the appropriate page, pass that response back to the web server to hand off to your user.
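If you go with gunicorn, its configuration file is itself a Python file, so a minimal setup might look something like the sketch below. The project name mysite, the port, and the worker count are all assumptions to adjust for your own project.

```python
# gunicorn.conf.py
import multiprocessing

# Where to find the WSGI callable, as "module:variable".
# "mysite" is a hypothetical project name.
wsgi_app = "mysite.wsgi:application"

# Listen only on localhost; the web server in front passes requests here.
bind = "127.0.0.1:8000"

# A common starting point for the number of worker processes.
workers = multiprocessing.cpu_count() * 2 + 1
```

You'd then start the server with something like gunicorn -c gunicorn.conf.py, and point your web server at the address in bind.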

ASGI

You might also hear about ASGI (azzz-ghee), or the Asynchronous Server Gateway Interface. As asynchronicity becomes easier and more popular in Python and other programming languages, web frameworks are looking to leverage that. Someday soon, this'll become the standard way to run your web app. As far as I understand, it's designed to be pretty similar to WSGI, just asynchronous. As far as your Django deployments go, it'll probably just mean you point an ASGI server at asgi.py instead of pointing a WSGI server at wsgi.py.
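For comparison with WSGI, here's a sketch of a bare ASGI callable. Instead of returning the body, it awaits a send function once for the status and headers and once for the body. The call_once helper is a hypothetical stand-in for a real ASGI server, just enough to exercise the app.

```python
import asyncio


async def application(scope, receive, send):
    """A bare ASGI callable that answers any HTTP request with plain text."""
    if scope["type"] != "http":
        return  # this sketch ignores websocket and lifespan events
    body = f"You asked for {scope['path']}\n".encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain; charset=utf-8")],
    })
    await send({"type": "http.response.body", "body": body})


async def call_once(app, path):
    """Drive an ASGI app through one fake GET request and collect its messages."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent
```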

4. Web Server 🕸

We've already talked about it a bit, but the last big deployment concern is setting up the web server which will be standing in front of your Django app.

As we alluded to earlier, Apache and nginx are going to be your two main options. Configuring either of them is beyond our scope here. If you don't know which to choose, nginx tends to be more popular these days, on account of its event-driven design: it handles many connections at once without dedicating a thread or process to each one, which tends to make it more efficient, and thus faster. Apache's a venerable old warhorse though, and it's a perfectly acceptable option. It ran much of the internet of the last two decades or more.

Depending on how you choose to deploy though, you might not need to worry about your web server at all. There are a number of routes to go, but most of the time you'll be looking at either a virtual machine (VM/VPS) or platform-as-a-service (PaaS).

A virtual machine (or virtual private server) is basically a rented computer. You can get these from providers like Linode (disclaimer: this is a referral link) for as little as $5 a month. This is basically a Linux server that you can SSH into. It'll be a very manual setup process, with little handholding. But, you'll understand every step of the way. If you go this route, you'll be responsible for everything about the server: restarting it if it goes down, tracking and rotating log files, managing security settings and updates, etc.

A platform-as-a-service is kind of different. Basically, the provider will abstract away the underlying computer, and you just focus on your app. You get your code uploaded, provide some details about hooking up a database, static files, where the script to execute is, etc., and you don't have to worry about the server-y stuff like applying updates, checking logs, or understanding security settings. You also wouldn't have to configure a web server; the provider would handle that for you. You most likely wouldn't even know which web server they're using under the hood. PaaS usually costs a little bit more than a $5 VPS, but if you don't want to become a part-time system administrator, then it's easily worth it.

Heroku is the original platform-as-a-service provider, but they've been in a bit of hot water lately for ending their free tier. I personally don't like it because I think it obfuscates and papers over the issues that I've covered here. Other options abound. I use a PaaS provider called PythonAnywhere. I think they've done a good job at keeping the control panels focused on the details we're talking about in this post. They also have a free tier for those just getting started.

Getting Code onto the Thing 📦

There's a couple of other minor concerns aside from what we've already mentioned, one of which is how you actually get your code onto the thing you're deploying to. There's a lot of flexibility here. In the old days, it would have been rsyncing the folder from your computer to the server. These days, since most developers are using git, they just run git clone on the server to pull down the repository (then, you can git pull anytime you make changes to your program). Your provider might even offer you a point-and-click file manager (PythonAnywhere does) and you can just manually upload your code folder that way.

There's a number of ways to do it, none of which are very hard.

Secrets 🔒

The last thing we should mention is handling secrets. You'll remember earlier in our database config our example had the user and password stored right there in settings.py. We don't really want to keep sensitive info like that in code, because we don't want it to be committed to version control, like git. Even if you have something like a private GitHub repository, there's a risk of someone getting access to your code. Once you've committed a password to git, it's basically impossible (not technically, but it's extremely difficult, and even then it might be too late) to get it out of your repository's commit history. It's better to handle secrets another way.

There are two primary schools of thought on this. The older convention for doing this would be to have two settings.py files, one for development and one for deployment, which is never committed to version control. The way people tend to do this these days is with environment variables, which are pieces of data that are stored in the runtime environment of the computer running your code.

There are a few ways to get at environment variables from within a Python script. Personally, I find the python-decouple package straightforward and easy to use. It allows me to store database credentials and my SECRET_KEY in a file called .env (which, again, should never be committed to version control). Then, I can get at those values with decouple.config(). Options abound though, so find whatever's right for you. Just make sure your secrets aren't committed to version control.
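If you'd rather not add a dependency, the standard library can do the basics. Here's a sketch that rebuilds the earlier database config from environment variables using os.environ. The variable names (DB_NAME and friends) are my own invention, so use whatever names your provider or your .env file gives you.

```python
import os


def database_settings(env=os.environ):
    """Build Django's DATABASES dict from environment variables."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            # Required values: a missing variable raises KeyError right
            # away, instead of failing later with a confusing login error.
            "NAME": env["DB_NAME"],
            "USER": env["DB_USER"],
            "PASSWORD": env["DB_PASSWORD"],
            # Optional values fall back to a local PostgreSQL server.
            "HOST": env.get("DB_HOST", "127.0.0.1"),
            "PORT": env.get("DB_PORT", "5432"),
        }
    }
```

In settings.py this would be used as DATABASES = database_settings(). With python-decouple, each env lookup would become a call like config("DB_PASSWORD") instead.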

Django's Deployment Checklist ✅

It's worth mentioning that Django has its own built-in deployment checklist to help you think through some of the things involved in deployment. To use it, just run manage.py check --deploy, and you'll get feedback on anything that needs attention.
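To give a feel for what it looks at, here are some of the settings the check tends to flag on a fresh project. Treat this as a sketch: the hostname is hypothetical, the HSTS value is just a cautious starting point, and the HTTPS-related settings assume your site is already being served over HTTPS.

```python
# Never run with debug pages turned on in production.
DEBUG = False

# Hostnames this site is allowed to serve (hypothetical domain).
ALLOWED_HOSTS = ["www.example.com"]

# Redirect plain HTTP requests to HTTPS.
SECURE_SSL_REDIRECT = True

# Only send session and CSRF cookies over HTTPS.
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True

# Tell browsers to insist on HTTPS for the next hour (in seconds).
SECURE_HSTS_SECONDS = 3600
```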

django-simple-deploy

Eric Matthes has a wonderful project going called django-simple-deploy, which aims to be a utility that you can run against your code, and it will automatically handle all the steps needed to get your code deployed. The idea is not to abstract away the deployment process, but to be something that helps a newcomer get their code online, then they can compare the changes it made to their code settings and learn how exactly it went about it. As of writing, it supports deploying to Fly.io, Platform.sh, and Heroku. More platforms should be on the way.

Wrapping Up 🎁

There's a lot of other things we could talk about. Should you containerize your app with Docker? Should you deploy with a serverless function? Which database is the best database? These are all good next steps to consider.

At bare minimum, I hope I've shown that these four major concerns (static files, database, WSGI server, and web server) are the places to begin when figuring out how to deploy your Django application. If you can get your head wrapped around these things, you'll be able to get your app online and make it to "Hello World". The rest are details that you'll pick up as you continue to develop and get comfortable putting your work online.