How to Build a Webhook Receiver in Django
A common way to receive data in a web application is with a webhook. The external system pushes data to yours with an HTTP request.
Correctly receiving and processing webhook data can be vital to your application working. In this post we’ll create a Django view to receive incoming webhook data.
Example use case
Imagine our site receives messages via webhook from a system at the infamous Acme Corporation. They follow the convention of sending POST requests with JSON bodies to a path on our site that we provide. They send a header with a secret token which we can use to authenticate their requests.
For the purposes of the example, we’ll ignore what we do with these messages and instead focus on the “scaffolding”.
Message log model
Before we start building a view, we should consider storing all incoming messages. Logging all incoming messages allows us to debug failures, check their structure is as documented, and otherwise audit what’s happening.
We could use any data store for the messages, but the simplest solution is to use a database model. This provides all the benefits of Django’s ORM and our database server’s durability guarantees.
The messages are JSON, so we can store them directly in a JSONField. Since Django 3.1 this works for all database backends.
We should also store the time we received the message, and index it to improve query performance. This will allow us to see the messages in order. We can also use it to clear old messages, avoiding indefinite table growth.
Combining these requirements we get this model:
from django.db import models


class AcmeWebhookMessage(models.Model):
    received_at = models.DateTimeField(help_text="When we received the event.")
    payload = models.JSONField(default=None, null=True)

    class Meta:
        indexes = [
            models.Index(fields=["received_at"]),
        ]
Note we’re using models.Index, the modern way to define indexes.
View
Our view should verify the request, receive the incoming message, store it, process it, and reply with a success response. We can do these steps like so:
import datetime as dt
import json
from secrets import compare_digest

from django.conf import settings
from django.db.transaction import atomic, non_atomic_requests
from django.http import HttpResponse, HttpResponseForbidden
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST
from django.utils import timezone

from example.core.models import AcmeWebhookMessage


@csrf_exempt
@require_POST
@non_atomic_requests
def acme_webhook(request):
    given_token = request.headers.get("Acme-Webhook-Token", "")
    if not compare_digest(given_token, settings.ACME_WEBHOOK_TOKEN):
        return HttpResponseForbidden(
            "Incorrect token in Acme-Webhook-Token header.",
            content_type="text/plain",
        )

    AcmeWebhookMessage.objects.filter(
        received_at__lte=timezone.now() - dt.timedelta(days=7)
    ).delete()
    payload = json.loads(request.body)
    AcmeWebhookMessage.objects.create(
        received_at=timezone.now(),
        payload=payload,
    )
    process_webhook_payload(payload)
    return HttpResponse("Message received okay.", content_type="text/plain")


@atomic
def process_webhook_payload(payload):
    # TODO: business logic
    ...
Note:
- @csrf_exempt disables Django’s default cross-site request forgery (CSRF) protection. Normally we wouldn’t want to accept a POST request without a CSRF token, as it could indicate a user being tricked into submitting a malicious form to our site from another. But for webhooks, we verify requests with different authentication schemes, so we can disable CSRF.
- @require_POST blocks non-POST requests.
- @non_atomic_requests disables ATOMIC_REQUESTS (transaction-per-request) for this view. Using ATOMIC_REQUESTS is normally a good idea, and a straightforward way of adding transactions to your Django application. Here, we’re using direct transaction control (the @atomic on process_webhook_payload) to ensure that if our business logic crashes, we’ve at least saved the AcmeWebhookMessage for debugging. Therefore we don’t want a transaction around the whole view.
- Acme’s system provides some authentication with a token in the Acme-Webhook-Token header. We check this header against the token they should be using, which we store in an environment variable and read in our settings. If the two do not match, we reject the incoming message.
  We use secrets.compare_digest() to perform the comparison. Unlike normal string comparison, this is designed to take the same amount of time regardless of how much of the given token matches. This prevents timing attacks from retrieving our secret token. (Thanks to Florian Apolloner for reminding me to add this protection.)
  Authentication is very important for webhook receivers since they are on the public web, and anyone could potentially discover them. Since there’s no real standard for webhooks, different callers use different authentication methods. If you’re adapting this code, check your caller’s documentation.
- Before storing the new message, we clean up stored messages older than a week. This is a simple way to remove old data.
  If our webhook ends up running frequently, executing this delete query each time may get expensive. In this case we could move the deletion out to a periodic background task, similar to Django’s clearsessions.
- We use json.loads() to load the request body. We do this without any checking of the Content-Type header or error handling if the body isn’t valid JSON. If an error does occur, the view will crash, and our error reporting software (e.g. Sentry) will alert us.
  This is a fine failure mode for our example. Since we’ve verified the message is from Acme, if the body is not JSON, something has gone wrong, and we’d like to know about it.
- We store the data in the AcmeWebhookMessage model before attempting to process it. This ensures we have it logged even if we crash later.
- We call our business logic handler. This has a stub implementation, left empty for the purposes of this example. In a real world application we’d add some code here. That said, deploying a first version with an empty handler is a good way to test messages are being received correctly.
- We return a plain-text OK response from our view. Typically webhook callers check only the status code, so we can keep the body minimal.
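One detail of the token check is worth a sketch: secrets.compare_digest() raises TypeError when given a str containing non-ASCII characters, so an odd header value could crash the view rather than be rejected. A minimal sketch of one way to guard against that, assuming we are happy to compare UTF-8 encoded bytes. The helper name tokens_match is hypothetical and not part of the view above:

```python
from secrets import compare_digest


def tokens_match(given_token: str, expected_token: str) -> bool:
    """Compare two tokens without revealing where they differ.

    Encoding to bytes first avoids the TypeError that compare_digest()
    raises for str arguments containing non-ASCII characters.
    """
    return compare_digest(
        given_token.encode("utf-8"),
        expected_token.encode("utf-8"),
    )


print(tokens_match("abc123", "abc123"))  # True
print(tokens_match("def456", "abc123"))  # False
print(tokens_match("tökén", "abc123"))  # False, and no TypeError
```

In the view, this helper could stand in for the direct compare_digest() call, with settings.ACME_WEBHOOK_TOKEN as the expected token.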
URL
We can add a URL mapping to our view with the standard path():
from django.urls import path

from example.core.views import acme_webhook

urlpatterns = [
    ...,
    path(
        "webhooks/acme/mPnBRC1qxapOAxQpWmjy4NofbgxCmXSj/",
        acme_webhook,
    ),
]
The path contains a random string, generated with a password manager. This adds a little extra security-by-obscurity, since we won’t provide this URL to anyone but Acme. This prevents at least URL enumeration attacks from discovering our receiver.
Random strings in URLs don’t provide real protection. URLs often get copied to insecure places, such as logs, emails, or sticky notes. Unfortunately some webhook callers do not support any authentication mechanism, so this can be the best option.
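If a password manager isn’t to hand, a similar random path segment can be generated with Python’s standard secrets module. This is a minimal sketch. The function name make_path_token is hypothetical, and the 32-character alphanumeric format is an assumption chosen to resemble the example URL:

```python
import secrets
import string

# Letters and digits only, so the result needs no URL escaping.
ALPHABET = string.ascii_letters + string.digits


def make_path_token(length: int = 32) -> str:
    """Return a random alphanumeric string for use as a URL path segment."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


token = make_path_token()
print(f"webhooks/acme/{token}/")
```

The output differs on every run. Generate the value once and paste it into the URL configuration, rather than calling the function at import time.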
Tests
To test our webhook view, we can make requests to it with Django’s test client:
import datetime as dt
from http import HTTPStatus

from django.test import Client, override_settings, TestCase
from django.utils import timezone

from example.core.models import AcmeWebhookMessage


@override_settings(ACME_WEBHOOK_TOKEN="abc123")
class AcmeWebhookTests(TestCase):
    def setUp(self):
        self.client = Client(enforce_csrf_checks=True)

    def test_bad_method(self):
        response = self.client.get("/webhooks/acme/mPnBRC1qxapOAxQpWmjy4NofbgxCmXSj/")

        assert response.status_code == HTTPStatus.METHOD_NOT_ALLOWED

    def test_missing_token(self):
        response = self.client.post(
            "/webhooks/acme/mPnBRC1qxapOAxQpWmjy4NofbgxCmXSj/",
        )

        assert response.status_code == HTTPStatus.FORBIDDEN
        assert (
            response.content.decode() == "Incorrect token in Acme-Webhook-Token header."
        )

    def test_bad_token(self):
        response = self.client.post(
            "/webhooks/acme/mPnBRC1qxapOAxQpWmjy4NofbgxCmXSj/",
            HTTP_ACME_WEBHOOK_TOKEN="def456",
        )

        assert response.status_code == HTTPStatus.FORBIDDEN
        assert (
            response.content.decode() == "Incorrect token in Acme-Webhook-Token header."
        )

    def test_success(self):
        start = timezone.now()
        old_message = AcmeWebhookMessage.objects.create(
            received_at=start - dt.timedelta(days=100),
        )

        response = self.client.post(
            "/webhooks/acme/mPnBRC1qxapOAxQpWmjy4NofbgxCmXSj/",
            HTTP_ACME_WEBHOOK_TOKEN="abc123",
            content_type="application/json",
            data={"this": "is a message"},
        )

        assert response.status_code == HTTPStatus.OK
        assert response.content.decode() == "Message received okay."
        assert not AcmeWebhookMessage.objects.filter(id=old_message.id).exists()
        awm = AcmeWebhookMessage.objects.get()
        assert awm.received_at >= start
        assert awm.payload == {"this": "is a message"}
Note:
- We use @override_settings to replace the token setting for every test in the test case. This means we don’t need to set a value in our test settings, nor use the sensitive real token, which we should not save in our code base.
- To check the @csrf_exempt decorator, we create our test client with the enforce_csrf_checks flag on. The test client would raise a CSRF error if we accidentally removed the decorator from the view.
- We first test the view’s various failure modes before testing its success case. Testing both the missing and bad token cases is not strictly necessary for coverage, but done for completeness in case the code changes.
- When making assertions on the response status codes, we compare them with the HTTPStatus enum from the Python standard library.
- To send the Acme-Webhook-Token header, we have to use the slightly unfriendly HTTP_* syntax.
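The HTTP_* names follow the WSGI convention for request headers: uppercase the header name, replace hyphens with underscores, and add an HTTP_ prefix. A small sketch of that conversion, using a hypothetical helper name wsgi_header_key:

```python
def wsgi_header_key(header_name: str) -> str:
    """Convert an HTTP header name to its WSGI environ key.

    Content-Type and Content-Length are special cases in the WSGI
    specification and take no HTTP_ prefix.
    """
    key = header_name.upper().replace("-", "_")
    if key in {"CONTENT_TYPE", "CONTENT_LENGTH"}:
        return key
    return f"HTTP_{key}"


print(wsgi_header_key("Acme-Webhook-Token"))  # HTTP_ACME_WEBHOOK_TOKEN
```

From Django 4.2, the test client also accepts a headers argument, for example headers={"Acme-Webhook-Token": "abc123"}, which avoids the conversion.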
Further changes
There are many ways we might need to improve our webhook receiver, beyond finishing its business logic. Here are some ideas:
- We might extract more data from the JSON body into separate fields on our AcmeWebhookMessage model. For example, if there are multiple types of message we might want to be able to query them.
- The caller might, by necessity, send us messages more than once. We’d want to guard against reprocessing repeat messages, to behave idempotently. We can do this by querying past messages, but we’d need more fields and maybe an index.
- We could offload the processing of the messages to a background task, so we don’t make the caller wait for our success response. To do this we could extend our AcmeWebhookMessage model with more fields. Background processing could also make us more robust, allowing retries etc.
- We could prevent the caller system from overwhelming us by adding rate-limiting, using django-ratelimit.
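To illustrate the idempotency idea, here is a minimal sketch that derives a stable fingerprint from a payload and skips repeats. It assumes repeat messages carry identical JSON content. The names payload_fingerprint and SeenMessages are hypothetical, and the in-memory set stands in for what would be an indexed field on AcmeWebhookMessage in a real application:

```python
import hashlib
import json


def payload_fingerprint(payload) -> str:
    """Hash a JSON-serializable payload, ignoring key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SeenMessages:
    """In-memory stand-in for a uniquely indexed database column."""

    def __init__(self):
        self._fingerprints = set()

    def is_new(self, payload) -> bool:
        """Record the payload and report whether it was unseen before."""
        fingerprint = payload_fingerprint(payload)
        if fingerprint in self._fingerprints:
            return False
        self._fingerprints.add(fingerprint)
        return True


seen = SeenMessages()
print(seen.is_new({"this": "is a message", "id": 1}))  # True
print(seen.is_new({"id": 1, "this": "is a message"}))  # False, same content
```

If the caller includes its own unique message identifier, that is usually a better key than a content hash, so check the caller’s documentation first.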
Tags: django