How to Filter Django QuerySets - 15 Examples For Beginners

I am going to show you how to filter your Django QuerySets. There are lots of tips and tricks to write more powerful Django queries. By applying the skills in the examples, you will build your confidence using Django ORM and writing better queries for your projects.

What is a QuerySet?

A QuerySet is a Django class that stores the results of database queries.

As stated in the Django documentation, creating QuerySets doesn't actually involve any activity with the database. Instead, Django will only read data from the database until the QuerySet is evaluated. In other words, Django delays any interaction with the database until it needs the data for something else.

How to write queries with Django ORM

Django ORM is a powerful way to query the database without writing SQL.

In this tutorial, I am going to show you how to query the database and filter the results. We are going to cover:

  1. Basic Filter

  2. Filter on Foreign Keys

  3. Selecting values

  4. Converting QuerySets into lists

  5. Sorting QuerySets

  6. Slicing results

  7. Count results

  8. Check if results exist

  9. Greater than / less than

  10. OR conditions (Q objects)

  11. Filter with lists

  12. Get distinct values

  13. Exclude data

  14. Earliest / latest results

  15. First & Last results

When you are confident filtering your QuerySets and want to take them to the next level, then I recommend learning how to annotate them. Annotations are useful for generating statistics from your data. I have a post with examples that include how to create the driver leaderboard and listing the top reasons why drivers don't finish races.

Formula 1 Examples

All the examples I'm going to give in this tutorial come from a Formula 1 stats project. I imported historic data about drivers and races into a SQLite database.

To understand what tables are in the database and how they are related, see the diagram below.

Often, we need to filter across multiple tables. For example, if you want to find all the races that took place in a particular country, then you will need to filter the Race table, but check the country in the Circuit table. This article has lots of examples about how to filter Foreign Key relationships.

1. Basic Filter

Let's start with the most basic kind of filter.

Get all the races that took place in 2021.

Race.objects.filter(year=2021)

Result (formatted for readability):

<QuerySet [
    <Race: 2021 Emilia Romagna Grand Prix>, 
    <Race: 2021 Bahrain Grand Prix>, 
    <Race: 2021 Qatar Grand Prix>, 
    <Race: 2021 Portuguese Grand Prix>, 
    <Race: 2021 Spanish Grand Prix>, 
    <Race: 2021 Monaco Grand Prix>, 
    <Race: 2021 Azerbaijan Grand Prix>, 
    <Race: 2021 Styrian Grand Prix>, 
    <Race: 2021 French Grand Prix>,
    '...(remaining elements truncated)...'
]>

Get all the races that took place in Spain in 2021.

Race.objects.filter(year=2021, circuit__country="Spain")

The country that the race was held in is stored in the Circuit table. We can use a double underscore to use columns in a related table in our query.

Result:

<QuerySet [<Race: 2021 Spanish Grand Prix>]>

3. SELECT fields

Get all the races that happened in 2021. List the date each one happened and the country it took place in.

In SQL, we can use SELECT statements to limit which fields are returned by the query. In Django, we can emulate this using .values() or .values_list().

What is the difference between values and values list?

Both will let you control which fields are selected by the query. .values will return a QuerySet of dictionaries and .values_list will return a QuerySet of tuples.

.values()

Race.objects\
    .filter(year=2021)\
    .values("circuit__country", "date")

Result (formatted for readability):

<QuerySet [
    {'circuit__country': 'Italy', 'date': datetime.date(2021, 4, 18)}, 
    {'circuit__country': 'Bahrain', 'date': datetime.date(2021, 3, 28)}, 
    {'circuit__country': 'Qatar', 'date': datetime.date(2021, 11, 21)}, 
    {'circuit__country': 'Portugal', 'date': datetime.date(2021, 5, 2)}, 
    {'circuit__country': 'Spain', 'date': datetime.date(2021, 5, 9)} '
   ...(remaining elements truncated)...'
]>

.values_list()

Race.objects\
   .filter(year=2021)\
   .values_list("circuit__country", "date")

Result (formatted for readability):

<QuerySet [
    ('Italy', datetime.date(2021, 4, 18)), 
    ('Bahrain', datetime.date(2021, 3, 28)), 
    ('Qatar', datetime.date(2021, 11, 21)), 
    ('Portugal', datetime.date(2021, 5, 2)), 
    ('Spain', datetime.date(2021, 5, 9))'
    ...(remaining elements truncated)...'
]>

4. Convert QuerySets into a List

Here, we will flatten the results into a list-like QuerySet. This only works if you're selecting one field from the table. While the result is technically still a QuerySet, it is very similar to a list.

List all the races that happened in 2021 by name.

If you use values_list and are only selecting one field, then you have the option to flatten the results (no tuples, just values) by adding flat=True.

races = Race.objects\
    .filter(year=2021)\
    .values_list("name", flat=True)

Result (formatted for readability):

<QuerySet [
    'Emilia Romagna Grand Prix', 
    'Bahrain Grand Prix', 
    'Qatar Grand Prix', 
    'Portuguese Grand Prix', 
    'Spanish Grand Prix', 
    'Monaco Grand Prix', 
    'Azerbaijan Grand Prix', 
    'Styrian Grand Prix', 
    'French Grand Prix', 
    'Austrian Grand Prix', 
    'British Grand Prix', 
    'Hungarian Grand Prix', 
    'Belgian Grand Prix', 
    'Dutch Grand Prix', 
    'Italian Grand Prix', 
    '...(remaining elements truncated)...'
]>

QuerySets are iterables so you can loop through them like a list. If you strictly need a list and not a QuerySet, then you can convert it using Python's list()method.

>>> list(races)
[
    'Emilia Romagna Grand Prix', 
    'Bahrain Grand Prix',
    'Qatar Grand Prix', 
    'Portuguese Grand Prix', 
    'Spanish Grand Prix', 
    'Monaco Grand Prix', 
    'Azerbaijan Grand Prix', 
    'Styrian Grand Prix', 
    'French Grand Prix', 
    'Austrian Grand Prix', 
    'British Grand Prix', 
    'Hungarian Grand Prix', 
    'Belgian Grand Prix', 
    'Dutch Grand Prix', 
    'Italian Grand Prix', 
    'Russian Grand Prix', 
    'Turkish Grand Prix', 
    'United States Grand Prix', 
    'Mexico City Grand Prix', 
    'São Paulo Grand Prix', 
    'Saudi Arabian Grand Prix', 
    'Abu Dhabi Grand Prix'
]

5. How to Sort Querysets

List each driver's name and date of birth, youngest first.

We can use .order_by to sort the results. If you include a minus sign before the keyword, results will be ordered in descending order. If the minus sign is omitted, then they will be ordered in ascending order.

Driver.objects\
    .all()\
    .order_by("-date_of_birth")\
    .values_list("first_name", "last_name", "date_of_birth")

Result (formatted for readability):

<QuerySet [
    ('Yuki', 'Tsunoda', datetime.date(2000, 5, 11)), 
    ('Lando', 'Norris', datetime.date(1999, 11, 13)), 
    ('Guanyu', 'Zhou', datetime.date(1999, 5, 30)), 
    ('Mick', 'Schumacher', datetime.date(1999, 3, 22)), 
    ('Nikita', 'Mazepin', datetime.date(1999, 3, 2)),
    '...(remaining elements truncated)...'
]>

6. Limiting (slicing) results

List the names and dates of birth of the 5 oldest drivers.

This is the equivalent of including LIMIT in your SQL queries.

Driver.objects\
    .all()\
    .order_by("date_of_birth")\
    .values_list("first_name", "last_name", "date_of_birth")[:5]

Result:

<QuerySet [
    ('Philippe', 'Étancelin', datetime.date(1896, 12, 28)), 
    ('Luigi', 'Fagioli', datetime.date(1898, 6, 9)), 
    ('Clemente', 'Biondetti', datetime.date(1898, 8, 18)),
    ('Arthur', 'Legat', datetime.date(1898, 11, 1)),
    ('Louis', 'Chiron', datetime.date(1899, 8, 3))
]>

7. Count results

How many races happened in 2021?

Race.objects.filter(year=2021).count()

Why .count() is more efficient than len()

You can use the len method to count the results in your queryset. However, Django will evaluate the queryset (fetches the data for all matching rows from the database). This isn't necessary. Using the .count() method will give you the same result without fetching the contents of all matching rows from the database.

8. Check if results exist

If you call exists() on a QuerySet, it will return True if there is at least one result in the QuerySet and False if there are none.

Did any races occur in 2022?

>>> Race.objects.filter(year=2022).exists()
True

Did any races occur in the year 1896?

>>> Race.objects.filter(year=1896).exists()
False

9. Greater than / less than

We can use __gt, __gte, __lt & __lte to perform 'greater than' and 'less than' queries.

Get all the drivers who were born on or after 1st January 1998, oldest first.

drivers = (
    Driver.objects.filter(date_of_birth__gte="1998-01-01")
    .order_by("date_of_birth")
    .values_list("first_name", "last_name", "date_of_birth")
)

Result (formatted for readability):

<QuerySet [
    ('George', 'Russell', datetime.date(1998, 2, 15)), 
    ('Lance', 'Stroll', datetime.date(1998, 10, 29)), 
    ('Nikita', 'Mazepin', datetime.date(1999, 3, 2)), 
    ('Mick', 'Schumacher', datetime.date(1999, 3, 22)), 
    ('Guanyu', 'Zhou', datetime.date(1999, 5, 30)), 
    ('Lando', 'Norris', datetime.date(1999, 11, 13)), 
    ('Yuki', 'Tsunoda', datetime.date(2000, 5, 11))
]>

This has returned all the drivers who were born since 1st January 1998. At time of writing, there were 7 drivers who met this criteria.

Get all the drivers who were born before 1st January 1900, youngest first.

oldest = (
    Driver.objects.filter(date_of_birth__lt="1900-01-01")
    .order_by("-date_of_birth")
    .values_list("first_name", "last_name", "date_of_birth")
)

Result (formatted for readability):

<QuerySet [
    ('Adolf', 'Brudes', datetime.date(1899, 10, 15)), 
    ('Piero', 'Dusio', datetime.date(1899, 10, 13)), 
    ('Louis', 'Chiron', datetime.date(1899, 8, 3)), 
    ('Arthur', 'Legat', datetime.date(1898, 11, 1)), 
    ('Clemente', 'Biondetti', datetime.date(1898, 8, 18)), 
    ('Luigi', 'Fagioli', datetime.date(1898, 6, 9)), 
    ('Philippe', 'Étancelin', datetime.date(1896, 12, 28))
]>

10. How to add OR conditions to your QuerySets

You can use Q objects to create OR conditions in your QuerySets.

Get the all the 2021 races that happened in France or Spain.

from django.db.models import Q

i = Race.objects.filter(
    Q(circuit__country="France") | Q(circuit__country="Spain"), 
    year=2021
)

Result:

<QuerySet [
    <Race: 2021 Spanish Grand Prix>, 
    <Race: 2021 French Grand Prix>
]>

11. Filter with lists

We can create OR conditions in our QuerySets using Q objects. However, there is a cleaner way if there are lots of conditions.

Get the all the 2021 races that happened in France, Spain, Portugal, Belgium or the Netherlands.

countries = ["France", "Spain", "Portugal", "Belgium", "Netherlands"]

races = Race.objects.filter(
    year=2021, 
    circuit__country__in=countries
)

Result (formatted for readability):

<QuerySet [
    <Race: 2021 Portuguese Grand Prix>, 
    <Race: 2021 Spanish Grand Prix>, 
    <Race: 2021 French Grand Prix>, 
    <Race: 2021 Belgian Grand Prix>, 
    <Race: 2021 Dutch Grand Prix>
]>

12. Get distinct values

The .distinct() method can be used to make sure no value of the supplied field appears in the QuerySet more than once.

List all the countries that hosted races in 2020. List them alphabetically. Make sure no country appears in the list more than once.

countries = (
    Race.objects.filter(year=2020)
    .distinct("circuit__country")
    .order_by("circuit__country")
    .values_list("circuit__country", flat=True)
)

Result (formatted for readability):

<QuerySet [
  'Austria', 
  'Bahrain', 
  'Belgium', 
  'Germany', 
  'Hungary', 
  'Italy', 
  'Portugal', 
  'Russia', 
  'Spain', 
  'Turkey', 
  'UAE', 
  'UK'
]>

13. Exclude data

The .exclude() method can be used to remove results from a QuerySet.

List all the nationalities of all drivers who aren't British or American.

nationalities = (
    Driver.objects.all()
    .exclude(nationality__in=["British", "American"])
    .distinct("nationality")
    .order_by("nationality")
    .values_list("nationality", flat=True)
)

Result:

<QuerySet [
  'American-Italian', 
  'Argentine', 
  'Argentine-Italian', 
  'Australian', 
  'Austrian', 
  'Belgian', 
  'Brazilian', 
  'Canadian', 
  'Chilean', 
  'Chinese', 
  'Colombian', 
  'Czech', 
  'Danish', 
  'Dutch', 
  '...(remaining elements truncated)...'
]>

14. Earliest and Latest

We can use .earliest and .latest on date fields. Instead of returning a QuerySet, they return a single result.

Find the oldest and youngest drivers

>>> oldest = Driver.objects.earliest("date_of_birth")
>>> oldest
<Driver: Philippe Étancelin>
>>> oldest.date_of_birth
datetime.date(1896, 12, 28)

Youngest:

>>> youngest = Driver.objects.latest("date_of_birth")
>>> youngest
<Driver: Yuki Tsunoda>
>>> youngest.date_of_birth
datetime.date(2000, 5, 11)

15. First and Last

.first() and .last() can be used to return the first and last results of a QuerySet.

Get the first and most recent race

>>> first = Race.objects.all().order_by("date").first()
>>> first
<Race: 1950 British Grand Prix>

Last:

>>> last = Race.objects.all().order_by("date").last()
>>> last
<Race: 2022 Abu Dhabi Grand Prix>

First and last still works if the queryset is empty. It will return None.

>>> race_1940 = Race.objects.filter(year=1940).first()
>>> race_1940 is None
True