Counting Queries: Basic Performance Testing in Django

Filipe Ximenes
January 6, 2020
<p>It's very common to read about testing techniques such as TDD and how to test application business logic. But testing the performance of an application is a whole different issue. There are many ways you can do it, but a common approach is to set up an environment where you can DDoS your application and watch how it behaves. This is an exciting topic, but it's not what I want to talk about in this blog post. Today I want to cover a much simpler kind of test, and it's one that you can do using your default Django unit test setup: testing the number of times your application hits the database.</p><p>This is a simple thing to test, and it's one of the things that can hurt application performance very early on. It's also the very first thing I investigate once something starts running slow. The great news is that there's only one thing you need to know about to start writing this kind of test: the <a href="https://docs.djangoproject.com/en/3.0/topics/testing/tools/#django.test.TransactionTestCase.assertNumQueries">assertNumQueries</a> method. It's quite simple to use; here is an example:</p><pre><code>from django.test import TestCase, Client
from django.urls import reverse

from trucks.models import Truck


class TrucksTestCase(TestCase):
    def test_list_trucks_view_performance(self):
        client = Client()
        Truck.objects.create(...)

        # Fails unless exactly 6 queries run inside the block.
        with self.assertNumQueries(6):
            response = client.get(reverse("trucks:list_trucks"))

        # Make sure the truck actually reached the view's context.
        self.assertEqual(len(response.context["trucks_list"]), 1)
</code></pre><p>The above code asserts that the <code>"trucks:list_trucks"</code> view hits the database exactly 6 times. But there's a little bit more to it. Notice that before running the assertion we first create a new <code>Truck</code> object, and after it we assert that there's one object in the <code>trucks_list</code> context data of the view. This is an essential thing to do in this kind of test because it ensures you are not testing against an empty data set. 
It's important to understand that just creating the <code>Truck</code> instance is not enough; you need to check that it was included in the context. Your view may filter the truck list, so there's a chance that your <code>Truck</code> instance would not be included in the results.</p><p>By doing the above we've already made significant progress, but there's another important step that people often forget about. If we want our view to scale, we need to ensure that its performance will not degrade as the number of items it returns grows. After all, we still have a performance problem if we hit the database 6 times to fetch one item but 106 times when there are 100 items. We want a constant number of database hits, no matter how many items we are returning. Luckily the solution to this is also simple: we add one (or a few) more items to the database and count the number of hits again. Here's the final version of the test:</p><pre><code>from django.test import TestCase, Client
from django.urls import reverse

from trucks.models import Truck


class TrucksTestCase(TestCase):
    def test_list_trucks_view_performance(self):
        client = Client()
        Truck.objects.create(...)

        with self.assertNumQueries(6):
            response = client.get(reverse("trucks:list_trucks"))

        self.assertEqual(len(response.context["trucks_list"]), 1)

        # Add a second truck: the query count must stay the same.
        Truck.objects.create(...)

        with self.assertNumQueries(6):
            response = client.get(reverse("trucks:list_trucks"))

        self.assertEqual(len(response.context["trucks_list"]), 2)
</code></pre><p>Notice that we again check the number of items returned in the context, but in the second run we expect 2 trucks. The reasoning is the same as the first time.</p><p>Ensuring a constant number of database hits as you add data is <strong>more important</strong> than having a low number of total hits.</p><p>The last thing to do is to ensure that your data is as hydrated as possible. 
That means that you also need to create the related data that is going to be used while your view is processed. If you don't do that, there's a risk that your production application hits the database more times than your test expects, even though the test passes. In our example, we would need to create a companion <code>TruckDriver</code> for our <code>Truck</code>.</p><pre><code>from trucks.models import Truck, TruckDriver

...

truck = Truck.objects.create(...)
# Related data the view will touch while rendering the list.
TruckDriver.objects.create(name="Alex", truck=truck)
</code></pre><p>If the number of database hits stops being constant after you do the above, go learn more about the <a href="https://docs.djangoproject.com/en/3.0/ref/models/querysets/#select-related"><code>select_related</code></a> and <a href="https://docs.djangoproject.com/en/3.0/ref/models/querysets/#prefetch-related"><code>prefetch_related</code></a> methods.</p><p>That's all for today. I hope that from now on you start checking the number of database queries early on in your application. It won't take much of your time, and it will prevent a lot of trouble when your application's user base starts growing.</p><p><strong>Looking for more?</strong><br><a href="https://www.vinta.com.br/blog/2017/how-i-test-my-drf-serializers/">How I test my DRF serializers</a><br><a href="https://www.vinta.com.br/blog/2017/dont-forget-stamps-testing-email-content-django/">Don't forget the stamps: testing email content in Django</a></p>
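<p>The constant-query idea discussed in this post can also be tried outside Django. The following is a minimal sketch, not the method used in the tests above: it swaps the Django test client for the standard library's <code>sqlite3</code> module and counts SELECT statements with <code>Connection.set_trace_callback</code>. The two tables and the helper names (<code>count_selects</code>, <code>build_db</code>, <code>list_trucks_naive</code>, <code>list_trucks_joined</code>, <code>select_count</code>) are hypothetical and exist only for illustration.</p>

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def count_selects(conn):
    """Collect every SELECT statement SQLite runs inside the block."""
    statements = []

    def record(sql):
        # Ignore implicit BEGIN/COMMIT and anything that is not a read.
        if sql.lstrip().upper().startswith("SELECT"):
            statements.append(sql)

    conn.set_trace_callback(record)
    try:
        yield statements
    finally:
        conn.set_trace_callback(None)


def build_db(n_trucks):
    """Hypothetical schema: each truck gets exactly one driver."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE truck (id INTEGER PRIMARY KEY, plate TEXT)")
    conn.execute(
        "CREATE TABLE driver (id INTEGER PRIMARY KEY, name TEXT, truck_id INTEGER)"
    )
    for i in range(n_trucks):
        conn.execute("INSERT INTO truck (id, plate) VALUES (?, ?)", (i, f"TRK-{i}"))
        conn.execute(
            "INSERT INTO driver (name, truck_id) VALUES (?, ?)", (f"driver-{i}", i)
        )
    conn.commit()
    return conn


def list_trucks_naive(conn):
    """One query for the trucks, then one more per truck for its driver."""
    trucks = conn.execute("SELECT id, plate FROM truck ORDER BY id").fetchall()
    result = []
    for truck_id, plate in trucks:
        row = conn.execute(
            "SELECT name FROM driver WHERE truck_id = ?", (truck_id,)
        ).fetchone()
        result.append((plate, row[0]))
    return result


def list_trucks_joined(conn):
    """A single JOIN fetches trucks and drivers together."""
    return conn.execute(
        "SELECT truck.plate, driver.name FROM truck "
        "JOIN driver ON driver.truck_id = truck.id ORDER BY truck.id"
    ).fetchall()


def select_count(list_fn, n_trucks):
    """Return how many SELECTs list_fn issues against n_trucks rows."""
    conn = build_db(n_trucks)
    with count_selects(conn) as statements:
        rows = list_fn(conn)
    # Same guard as in the post: never measure an empty result set.
    assert len(rows) == n_trucks
    conn.close()
    return len(statements)
```

<p>Comparing <code>select_count(list_trucks_naive, 1)</code> with <code>select_count(list_trucks_naive, 3)</code> shows the count growing with the data, while <code>list_trucks_joined</code> stays flat. That growth is the kind of problem <code>select_related</code> and <code>prefetch_related</code> are meant to remove in Django.</p>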