Custom Python Dictionaries: Inheriting From dict vs UserDict

Creating dictionary-like classes may be a requirement in your Python career. Specifically, you may be interested in making custom dictionaries with modified behavior, new functionalities, or both. In Python, you can do this by inheriting from an abstract base class, by subclassing the built-in dict class directly, or by inheriting from UserDict.

In this tutorial, you’ll learn how to:

  • Create dictionary-like classes by inheriting from the built-in dict class
  • Identify common pitfalls that can happen when inheriting from dict
  • Build dictionary-like classes by subclassing UserDict from the collections module

Additionally, you’ll code a few examples that’ll help you understand the pros and cons of using dict vs UserDict to create your custom dictionary classes.

To get the most out of this tutorial, you should be familiar with Python’s built-in dict class and its standard functionality and features. You’ll also need to know the basics of object-oriented programming and understand how inheritance works in Python.

Creating Dictionary-Like Classes in Python

The built-in dict class provides a valuable and versatile collection data type, the Python dictionary. Dictionaries are everywhere, including in your code and the code of Python itself.

Sometimes, the standard functionality of Python dictionaries isn’t enough for certain use cases. In these situations, you’ll probably have to create a custom dictionary-like class. In other words, you need a class that behaves like a regular dictionary but with modified or new functionality.

You’ll typically find at least two reasons for creating custom dictionary-like classes:

  1. Extending the regular dictionary by adding new functionality
  2. Modifying the standard dictionary’s functionality

Note that you could also face situations in which you need to both extend and modify the dictionary’s standard functionality.

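Depending on your specific needs and skill level, you can choose from a few strategies for creating custom dictionaries. You can:

  • Inherit from an appropriate abstract base class, such as MutableMapping
  • Inherit from the Python built-in dict class directly
  • Subclass UserDict from the collections module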

There are a few key considerations when you’re selecting the appropriate strategy to implement. Keep reading for more details.

Building a Dictionary-Like Class From an Abstract Base Class

This strategy for creating dictionary-like classes requires that you inherit from an abstract base class (ABC), like MutableMapping. This class provides concrete generic implementations of the standard mapping methods, such as .get(), .update(), .setdefault(), and .pop(). The exceptions are .__getitem__(), .__setitem__(), .__delitem__(), .__iter__(), and .__len__(), which you’ll have to implement yourself.

Additionally, suppose you need to customize the functionality of any other standard dictionary method. In that case, you’ll have to override the method at hand and provide a suitable implementation that fulfills your needs.

This process implies a fair amount of work. It’s also error-prone and requires advanced knowledge of Python and its data model. It can also imply performance issues because you’ll be writing the class in pure Python.

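The main advantage of this strategy is that the parent ABC will alert you if you miss any required method in your custom implementation: trying to instantiate a class that leaves an abstract method undefined raises a TypeError.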

For these reasons, you should embrace this strategy only if you need a dictionary-like class that’s fundamentally different from the built-in dictionary.
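To make the ABC strategy concrete, here’s a minimal sketch of a mapping built on MutableMapping. The class names UpperCaseMapping and Incomplete, as well as the ._store attribute, are made up for illustration. The sketch assumes string keys and isn’t meant as a complete dictionary replacement:

```python
from collections.abc import MutableMapping


class UpperCaseMapping(MutableMapping):
    """Hypothetical mapping that stores string keys in uppercase."""

    def __init__(self, *args, **kwargs):
        self._store = {}
        # .update() is inherited from the ABC and goes through __setitem__
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return self._store[key.upper()]

    def __setitem__(self, key, value):
        self._store[key.upper()] = value

    def __delitem__(self, key):
        del self._store[key.upper()]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


class Incomplete(MutableMapping):
    """Leaves out __delitem__, __iter__, and __len__ on purpose."""

    def __getitem__(self, key):
        raise KeyError(key)

    def __setitem__(self, key, value):
        pass


numbers = UpperCaseMapping({"one": 1}, two=2)
print(dict(numbers))       # {'ONE': 1, 'TWO': 2}
print(numbers.get("one"))  # 1

try:
    Incomplete()
except TypeError as error:
    print(f"TypeError: {error}")
```

Note how .get() and .update() come for free once the five required methods exist, while the incomplete class can’t even be instantiated.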

In this tutorial, you’ll focus on creating dictionary-like classes by inheriting from the built-in dict class and the UserDict class, which seem to be the quickest and most practical strategies.

Inheriting From the Python Built-in dict Class

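For a long time, it was impossible to subclass Python types implemented in C. Python 2.2 fixed this issue. Now you can directly subclass built-in types, including dict. This change brings several technical advantages to the subclasses because now they:

  1. Work in APIs that require the original built-in type
  2. Can define new methods on top of the inherited behavior
  3. Can use .__slots__ to restrict their instance attributes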

The first item in this list may be a requirement for C code that expects a Python built-in class. The second item allows you to add new functionality on top of the standard dictionary behavior. Finally, the third item will enable you to restrict the attributes of a subclass to only those attributes predefined in .__slots__.
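The short sketch below illustrates these three points with a made-up Inventory class. The class name, its .total() method, and the .location attribute are assumptions for the example only:

```python
import json


class Inventory(dict):
    """Hypothetical dict subclass used only for illustration."""

    __slots__ = ()  # No per-instance __dict__, so no ad hoc attributes

    def total(self):
        """New method layered on top of the inherited dict behavior."""
        return sum(self.values())


stock = Inventory(apple=2, banana=3)

print(isinstance(stock, dict))  # True
print(json.dumps(stock))        # {"apple": 2, "banana": 3}
print(stock.total())            # 5

try:
    stock.location = "warehouse"
except AttributeError:
    print("Can't set attributes that aren't declared in __slots__")
```

Because stock is a real dict instance, code that checks for the built-in type, such as the json encoder, accepts it without any extra work.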

Even though subclassing built-in types has several advantages, it also has some drawbacks. In the specific case of dictionaries, you’ll find a few annoying pitfalls. For example, say that you want to create a dictionary-like class that automatically stores all its keys as strings where all the letters, if present, are uppercase.

To do this, you can create a subclass of dict that overrides the .__setitem__() method:

Python
>>> class UpperCaseDict(dict):
...     def __setitem__(self, key, value):
...         key = key.upper()
...         super().__setitem__(key, value)
...

>>> numbers = UpperCaseDict()
>>> numbers["one"] = 1
>>> numbers["two"] = 2
>>> numbers["three"] = 3

>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}

Cool! Your custom dictionary seems to work well. However, there are some hidden issues in this class. If you try to create an instance of UpperCaseDict using some initialization data, then you’ll get a surprising and buggy behavior:

Python
>>> numbers = UpperCaseDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'one': 1, 'two': 2, 'three': 3}

What just happened? Why doesn’t your dictionary convert the keys into uppercase letters when you call the class’s constructor? It looks like the class’s initializer, .__init__(), doesn’t call .__setitem__() implicitly to create the dictionary. So, the uppercase conversion never runs.
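You can confirm this suspicion with a small tracing experiment. The TracingDict class and its .calls attribute below are hypothetical names for this sketch, and the behavior shown is the one you get in CPython:

```python
class TracingDict(dict):
    """Hypothetical dict subclass that records every __setitem__ call."""

    def __init__(self, *args, **kwargs):
        self.calls = []
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)


traced = TracingDict({"one": 1, "two": 2})
print(traced)        # {'one': 1, 'two': 2}
print(traced.calls)  # []

traced["three"] = 3
print(traced.calls)  # ['three']
```

The initializer stored two items without recording a single call, while the explicit assignment went through your method.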

Unfortunately, this issue affects other dictionary methods, like .update() and .setdefault(), for example:

Python
>>> numbers = UpperCaseDict()
>>> numbers["one"] = 1
>>> numbers["two"] = 2
>>> numbers["three"] = 3

>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}

>>> numbers.update({"four": 4})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4}

>>> numbers.setdefault("five", 5)
5
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4, 'five': 5}

Again, your uppercase functionality isn’t working well in these examples. To solve this issue, you must provide custom implementations of all the affected methods. For example, to fix the initialization issue, you can write an .__init__() method that looks something like this:

Python
# upper_dict.py

class UpperCaseDict(dict):
    def __init__(self, mapping=None, /, **kwargs):
        if mapping is not None:
            mapping = {
                str(key).upper(): value for key, value in mapping.items()
            }
        else:
            mapping = {}
        if kwargs:
            mapping.update(
                {str(key).upper(): value for key, value in kwargs.items()}
            )
        super().__init__(mapping)

    def __setitem__(self, key, value):
        key = key.upper()
        super().__setitem__(key, value)

Here, .__init__() converts the keys into uppercase letters and then initializes the current instance with the resulting data.

With this update, the initialization process of your custom dictionary should work correctly. Go ahead and give it a try by running the following code:

Python
>>> from upper_dict import UpperCaseDict

>>> numbers = UpperCaseDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}

>>> numbers.update({"four": 4})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4}

Providing your own .__init__() method fixed the initialization issue. However, other methods like .update() continue to work incorrectly, as you can conclude from the "four" key’s not being uppercase.
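If you wanted to stay with dict, then one possible way forward is to override each affected method and route it through .__setitem__(). The following is only a sketch under that assumption, not a complete solution. Lookups remain case-sensitive, and other entry points, such as the |= operator, aren’t covered:

```python
class UpperCaseDict(dict):
    def __init__(self, mapping=(), /, **kwargs):
        super().__init__()
        self.update(mapping, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(str(key).upper(), value)

    def update(self, mapping=(), /, **kwargs):
        # dict() accepts a mapping or an iterable of key-value pairs
        for key, value in dict(mapping, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        # Normalize the key first, then defer to the built-in behavior
        return super().setdefault(str(key).upper(), default)


numbers = UpperCaseDict({"one": 1}, two=2)
numbers.update({"three": 3})
numbers.setdefault("four", 4)
print(numbers)  # {'ONE': 1, 'TWO': 2, 'THREE': 3, 'FOUR': 4}
```

Every method you add this way is more code to write and maintain, which is the cost of subclassing dict when you change its core behavior.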

Why do dict subclasses behave this way? Built-in types were designed and implemented with the open–closed principle in mind. Therefore, they’re open to extension but closed to modification. Allowing modifications to the core features of these classes can potentially break their invariants. So, Python core developers decided to protect them from modifications.

That’s why subclassing the built-in dict class can be a little bit tricky, labor-intensive, and error-prone. Fortunately, you still have alternatives. The UserDict class from the collections module is one of them.

Subclassing UserDict From collections

Starting with Python 1.6, the language has provided UserDict as part of the standard library. This class initially lived in a module named after the class itself. In Python 3, UserDict was moved to the collections module, which is a more intuitive place for it, based on the class’s primary purpose.

UserDict was created back when it was impossible to inherit from Python’s dict directly. Even though the need for this class has been partially supplanted by the possibility of subclassing the built-in dict class directly, UserDict is still available in the standard library, both for convenience and for backward compatibility.

UserDict is a convenient wrapper around a regular dict object. This class provides the same behavior as the built-in dict data type with the additional feature of giving you access to the underlying dictionary through the .data instance attribute. This feature can facilitate the creation of custom dictionary-like classes, as you’ll learn later in this tutorial.

UserDict was specially designed for subclassing purposes rather than for direct instantiation, which means that the class’s main purpose is to allow you to create dictionary-like classes through inheritance.
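Before subclassing it, you can take a quick look at how a plain UserDict instance wraps its data. This short sketch only uses UserDict and MutableMapping from the standard library:

```python
from collections import UserDict
from collections.abc import MutableMapping

numbers = UserDict({"one": 1, "two": 2})

# The wrapped dictionary lives in the .data attribute
print(numbers.data)        # {'one': 1, 'two': 2}
print(type(numbers.data))  # <class 'dict'>

# A UserDict is a mutable mapping, but it isn't a dict instance
print(isinstance(numbers, MutableMapping))  # True
print(isinstance(numbers, dict))            # False
```

The last check matters if you plan to pass your custom dictionary to code that explicitly tests for the built-in dict type.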

There are also other hidden differences. To uncover them, go back to your original implementation of UpperCaseDict and update it like in the code below:

Python
>>> from collections import UserDict

>>> class UpperCaseDict(UserDict):
...     def __setitem__(self, key, value):
...         key = key.upper()
...         super().__setitem__(key, value)
...

This time, instead of inheriting from dict, you’re inheriting from UserDict, which you imported from the collections module. How will this change affect the behavior of your UpperCaseDict class? Check out the following examples:

Python
>>> numbers = UpperCaseDict({"one": 1, "two": 2})

>>> numbers["three"] = 3
>>> numbers.update({"four": 4})
>>> numbers.setdefault("five", 5)
5

>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'FOUR': 4, 'FIVE': 5}

Now UpperCaseDict works correctly all the time. You don’t need to provide custom implementations of .__init__(), .update(), or .setdefault(). The class just works! This is because in UserDict, all the methods that update existing keys or add new ones consistently rely on your .__setitem__() version.

As you learned before, the most notable difference between UserDict and dict is the .data attribute, which holds the wrapped dictionary. Using .data directly can make your code more straightforward because you don’t need to call super() all the time to provide the desired functionality. You can just access .data and work with it as you would with any regular dictionary.
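As a quick illustration, here’s a variant of UpperCaseDict that writes to .data instead of calling super(). It’s a sketch that, like the earlier version, assumes string keys:

```python
from collections import UserDict


class UpperCaseDict(UserDict):
    def __setitem__(self, key, value):
        # Write straight to the wrapped dict, with no super() call
        self.data[key.upper()] = value


numbers = UpperCaseDict({"one": 1})
numbers.update(two=2)
numbers.setdefault("three", 3)

print(numbers)       # {'ONE': 1, 'TWO': 2, 'THREE': 3}
print(numbers.data)  # {'ONE': 1, 'TWO': 2, 'THREE': 3}
```

The initializer, .update(), and .setdefault() all end up in your .__setitem__(), which in turn stores the item in the regular dictionary held by .data.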

Coding Dictionary-Like Classes: Practical Examples

You already know that subclasses of dict don’t call .__setitem__() from methods like .update() and .__init__(). This fact makes subclasses of dict behave differently from a typical Python class with a .__setitem__() method.

To work around this issue, you can inherit from UserDict, which does call .__setitem__() from all the operations that set or update values in the underlying dictionary. Because of this feature, UserDict can make your code safer and more compact.

Admittedly, when you think of creating a dictionary-like class, inheriting from dict is more natural than inheriting from UserDict. This is because all Python developers know about dict, but not all Python developers are aware of the existence of UserDict.

Inheriting from dict often implies certain issues that you can probably fix by using UserDict instead. However, these issues aren’t always relevant. Their relevance very much depends on how you want to customize the dictionary’s functionality.

The bottom line is that UserDict isn’t the right solution all the time. In general, if you want to extend the standard dictionary without affecting its core structure, then it’s totally okay to inherit from dict. On the other hand, if you want to change the core dictionary behavior by overriding its special methods, then UserDict is your best alternative.

In any case, remember that dict is written in C and is highly optimized for performance. In contrast, UserDict is written in pure Python, which can be a significant limitation in terms of performance.

You should consider several factors when deciding whether to inherit from dict or UserDict. These factors include, but aren’t limited to, the following:

  • Amount of work
  • Risk of errors and bugs
  • Ease of use and coding
  • Performance

In the following section, you’ll experience the first three factors in this list by coding a few practical examples. You’ll learn about performance implications a bit later, in the section on performance.

A Dictionary That Accepts British and American Spelling of Keys

As the first example, say you need a dictionary that stores keys in American English and allows key lookup in either American or British English. To code this dictionary, you’ll need to modify at least two special methods, .__setitem__() and .__getitem__().

The .__setitem__() method will allow you to always store keys in American English. The .__getitem__() method will make it possible to retrieve the value associated with a given key, whether it’s spelled in American or British English.

Because you need to modify the core behavior of the dict class, using UserDict would be a better option to code this class. With UserDict, you won’t have to provide custom implementations of .__init__(), .update(), and so on.

When you subclass UserDict, you have two main ways to code your class. You can rely on the .data attribute, which may facilitate the coding, or you can rely on super() and special methods.

Here’s the code that relies on .data:

Python
# spelling_dict.py

from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}

class EnglishSpelledDict(UserDict):
    def __getitem__(self, key):
        try:
            return self.data[key]
        except KeyError:
            pass
        try:
            return self.data[UK_TO_US[key]]
        except KeyError:
            pass
        raise KeyError(key)

    def __setitem__(self, key, value):
        try:
            key = UK_TO_US[key]
        except KeyError:
            pass
        self.data[key] = value

In this example, you first define a constant, UK_TO_US, containing the British words as keys and the matching American words as values.

Then you define EnglishSpelledDict, inheriting from UserDict. The .__getitem__() method looks for the current key. If the key exists, then the method returns the associated value. If the key doesn’t exist, then the method checks if the key was spelled in British English. If that’s the case, then the key is translated to American English and its value is retrieved from the underlying dictionary. If both lookups fail, then the method raises a KeyError.

The .__setitem__() method tries to find the input key in the UK_TO_US dictionary. If the input key exists in UK_TO_US, then it’s translated to American English. Finally, the method assigns the input value to the target key.

Here’s how your EnglishSpelledDict class works in practice:

Python
>>> from spelling_dict import EnglishSpelledDict

>>> likes = EnglishSpelledDict({"color": "blue", "flavour": "vanilla"})

>>> likes
{'color': 'blue', 'flavor': 'vanilla'}

>>> likes["flavour"]
'vanilla'
>>> likes["flavor"]
'vanilla'

>>> likes["behaviour"] = "polite"
>>> likes
{'color': 'blue', 'flavor': 'vanilla', 'behavior': 'polite'}

>>> likes.get("colour")
'blue'
>>> likes.get("color")
'blue'

>>> likes.update({"behaviour": "gentle"})
>>> likes
{'color': 'blue', 'flavor': 'vanilla', 'behavior': 'gentle'}

By subclassing UserDict, you’re saving yourself from writing a lot of code. For example, you don’t have to provide methods like .get(), .update(), or .setdefault(), because their default implementations will automatically rely on your .__getitem__() and .__setitem__() methods.

If you have less code to write, then you’ll have less work to do. More importantly, you’ll be safer because less code often implies a lower risk of bugs and errors.
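Not every operation goes through these two methods, though. UserDict answers membership tests and deletions directly from .data, so "flavour" in likes and del likes["flavour"] won’t translate the key unless you also override .__contains__() and .__delitem__(). Here’s a sketch of one way to cover them. The to_us() helper is a made-up name, and this variant always normalizes the key up front instead of trying the original spelling first:

```python
from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}


def to_us(key):
    """Hypothetical helper: translate a British spelling, else pass through."""
    return UK_TO_US.get(key, key)


class EnglishSpelledDict(UserDict):
    def __getitem__(self, key):
        return self.data[to_us(key)]

    def __setitem__(self, key, value):
        self.data[to_us(key)] = value

    def __delitem__(self, key):
        del self.data[to_us(key)]

    def __contains__(self, key):
        return to_us(key) in self.data


likes = EnglishSpelledDict({"flavour": "vanilla"})
print("flavour" in likes)  # True
print("flavor" in likes)   # True

del likes["flavour"]
print(likes)  # {}
```

Centralizing the translation in a single helper keeps the four special methods short and consistent with each other.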

The main drawback of this implementation is that if you someday decide to update EnglishSpelledDict and make it inherit from dict, then you’ll have to rewrite most of the code to remove the use of .data.

The example below shows how to provide the same functionality as before using super() and some special methods. This time, your custom dictionary is fully compatible with dict, so you can change the parent class anytime you like:

Python
# spelling_dict.py

from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}

class EnglishSpelledDict(UserDict):
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            pass
        try:
            return super().__getitem__(UK_TO_US[key])
        except KeyError:
            pass
        raise KeyError(key)

    def __setitem__(self, key, value):
        try:
            key = UK_TO_US[key]
        except KeyError:
            pass
        super().__setitem__(key, value)

This implementation looks slightly different from the original one but works the same. It could also be harder to code because you’re not using .data anymore. Instead, you’re using super(), .__getitem__(), and .__setitem__(). This code requires certain knowledge of Python’s data model, which is a complex and advanced topic.

The main advantage of this new implementation is that your class is now compatible with dict, so you can change the superclass at any time if you ever need to do so.
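Keep in mind that swapping the parent class keeps the code valid but brings back the dict pitfalls you saw earlier. The sketch below reuses the same methods with dict as the parent and shows the behavior you can expect in CPython:

```python
UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}


class EnglishSpelledDict(dict):  # Parent class swapped from UserDict to dict
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            pass
        try:
            return super().__getitem__(UK_TO_US[key])
        except KeyError:
            pass
        raise KeyError(key)

    def __setitem__(self, key, value):
        try:
            key = UK_TO_US[key]
        except KeyError:
            pass
        super().__setitem__(key, value)


likes = EnglishSpelledDict({"flavour": "vanilla"})
print(likes)  # {'flavour': 'vanilla'}

likes["colour"] = "blue"
print(likes)  # {'flavour': 'vanilla', 'color': 'blue'}

print(likes["colour"])      # blue
print(likes.get("colour"))  # None
```

Item assignment and item lookup still use your methods, but the initializer and .get() bypass them. With dict as the parent, you’d need to override those methods as well.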

It’s often more convenient to extend the standard dictionary functionality by subclassing UserDict than by subclassing dict. The main reason is that the built-in dict has some implementation shortcuts and optimizations that end up forcing you to override methods that you can just inherit if you use UserDict as the parent class.

A Dictionary That Accesses Keys Through Values

Another common requirement for a custom dictionary is to provide additional functionality apart from the standard behavior. For example, say that you want to create a dictionary-like class that provides methods to retrieve the key that maps to a given target value.

You need a method that retrieves the first key that maps to the target value. You also want a method that returns an iterator over all the keys that map to the target value.

Here’s a possible implementation of this custom dictionary:

Python
# value_dict.py

class ValueDict(dict):
    def key_of(self, value):
        for k, v in self.items():
            if v == value:
                return k
        raise ValueError(value)

    def keys_of(self, value):
        for k, v in self.items():
            if v == value:
                yield k

This time, instead of inheriting from UserDict, you’re inheriting from dict. Why? In this example, you’re adding functionality that doesn’t alter the dictionary’s core features. Therefore, inheriting from dict is more appropriate. It’s also more efficient in terms of performance, as you’ll see later in this tutorial.

The .key_of() method iterates over the key-value pairs in the underlying dictionary. The conditional statement checks for values that match the target value. The if code block returns the key of the first matching value. If no key maps to the target value, then the method raises a ValueError.

As a generator method that yields keys on demand, .keys_of() will yield only those keys whose value matches the value provided as an argument in the method call.

Here’s how this dictionary works in practice:

Python
>>> from value_dict import ValueDict

>>> inventory = ValueDict()
>>> inventory["apple"] = 2
>>> inventory["banana"] = 3
>>> inventory.update({"orange": 2})

>>> inventory
{'apple': 2, 'banana': 3, 'orange': 2}

>>> inventory.key_of(2)
'apple'
>>> inventory.key_of(3)
'banana'

>>> list(inventory.keys_of(2))
['apple', 'orange']

Cool! Your ValueDict dictionary works as expected. It inherits the core dictionary’s features from Python’s dict and implements new functionality on top of that.
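Two behaviors aren’t exercised in the session above: the lazy nature of .keys_of() and the error raised by .key_of(). The following self-contained sketch repeats the class definition and checks both:

```python
class ValueDict(dict):
    def key_of(self, value):
        for k, v in self.items():
            if v == value:
                return k
        raise ValueError(value)

    def keys_of(self, value):
        for k, v in self.items():
            if v == value:
                yield k


inventory = ValueDict(apple=2, banana=3, orange=2)

# Calling the generator method doesn't scan the dictionary yet
matches = inventory.keys_of(2)
print(next(matches))  # apple
print(next(matches))  # orange

# No key maps to 10, so .key_of() raises a ValueError
try:
    inventory.key_of(10)
except ValueError as error:
    print(f"ValueError: {error}")  # ValueError: 10
```

Because .keys_of() produces keys on demand, you can stop consuming it as soon as you’ve found what you need.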

In general, you should use UserDict to create a dictionary-like class that acts like the built-in dict class but customizes some of its core functionality, mostly special methods like .__setitem__() and .__getitem__().

On the other hand, if you just need a dictionary-like class with extended functionality that doesn’t affect or modify the core dict behavior, then you’re better off inheriting directly from dict. This practice will be quicker, more natural, and more efficient.

A Dictionary With Additional Functionalities

As a final example of how to implement a custom dictionary with additional features, say that you want to create a dictionary that provides the following methods:

Method          Description
.apply(action)  Takes a callable action as an argument and applies it to all the values in the underlying dictionary
.remove(key)    Removes a given key from the underlying dictionary
.is_empty()     Returns True or False depending on whether the dictionary is empty or not

To implement these three methods, you don’t need to modify the core behavior of the built-in dict class. So, subclassing dict rather than UserDict seems to be the way to go.

Here’s the code that implements the required methods on top of dict:

Python
# extended_dict.py

class ExtendedDict(dict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0

In this example, .apply() takes a callable as an argument and applies it to every value in the underlying dictionary. The transformed value is then reassigned to the original key. The .remove() method uses the del statement to remove the target key from the dictionary. Finally, .is_empty() uses the built-in len() function to find out if the dictionary is empty or not.
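You may wonder whether it’s safe for .apply() to assign to the dictionary while looping over it. Reassigning values of existing keys keeps the dictionary’s size unchanged, so the loop runs fine. Adding keys during iteration is a different story, as this small sketch shows. The "_copy" suffix is just an illustrative choice:

```python
class ExtendedDict(dict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)


numbers = ExtendedDict(one=1, two=2)

# Replacing the values of existing keys is fine
numbers.apply(lambda x: x * 10)
print(numbers)  # {'one': 10, 'two': 20}

# Adding new keys while iterating isn't
error_message = None
try:
    for key in numbers:
        numbers[key + "_copy"] = numbers[key]
except RuntimeError as error:
    error_message = str(error)

print(error_message)  # dictionary changed size during iteration
```

If an action ever needs to add or remove keys, then iterating over a snapshot such as list(self.items()) is a common way around this error.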

Here’s how ExtendedDict works:

Python
>>> from extended_dict import ExtendedDict

>>> numbers = ExtendedDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'one': 1, 'two': 2, 'three': 3}

>>> numbers.apply(lambda x: x**2)
>>> numbers
{'one': 1, 'two': 4, 'three': 9}

>>> numbers.remove("two")
>>> numbers
{'one': 1, 'three': 9}

>>> numbers.is_empty()
False

In these examples, you first create an instance of ExtendedDict using a regular dictionary as an argument. Then you call .apply() on the extended dictionary. This method takes a lambda function as an argument and applies it to every value in the dictionary, transforming the target value into its square.

Then, .remove() takes an existing key as an argument and removes the corresponding key-value pair from the dictionary. Finally, .is_empty() returns False because numbers isn’t empty. It would have returned True if the underlying dictionary was empty.

Considering Performance

Inheriting from UserDict may imply a performance cost because this class is written in pure Python. On the other hand, the built-in dict class is written in C and highly optimized for performance. So, if you need to use a custom dictionary in performance-critical code, then make sure to time your code to find potential performance issues.

To check if performance issues can arise when you inherit from UserDict instead of dict, get back to your ExtendedDict class and copy its code into two different classes, one inheriting from dict and the other inheriting from UserDict.

Your classes should look something like this:

Python
# extended_dicts.py

from collections import UserDict

class ExtendedDict_dict(dict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0

class ExtendedDict_UserDict(UserDict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0

The only difference between these two classes is that ExtendedDict_dict subclasses dict, and ExtendedDict_UserDict subclasses UserDict.

To check their performance, you can start by timing core dictionary operations, such as class instantiation. Run the following code in your Python interactive session:

Python
>>> import timeit
>>> from extended_dicts import ExtendedDict_dict
>>> from extended_dicts import ExtendedDict_UserDict

>>> init_data = dict(zip(range(1000), range(1000)))

>>> dict_initialization = min(
...     timeit.repeat(
...         stmt="ExtendedDict_dict(init_data)",
...         number=1000,
...         repeat=5,
...         globals=globals(),
...     )
... )

>>> user_dict_initialization = min(
...     timeit.repeat(
...         stmt="ExtendedDict_UserDict(init_data)",
...         number=1000,
...         repeat=5,
...         globals=globals(),
...     )
... )

>>> print(
...     f"UserDict is {user_dict_initialization / dict_initialization:.3f}",
...     "times slower than dict",
... )
UserDict is 35.877 times slower than dict

In this code snippet, you use the timeit module along with the min() function to measure the execution time of a piece of code. In this example, the target code consists of instantiating ExtendedDict_dict and ExtendedDict_UserDict.

After running this timing code, you compare both initialization times. In this specific run, instantiating the class based on UserDict is about 36 times slower than instantiating the class derived from dict. The exact ratio will vary with your machine and Python version, but a gap of this size points to a serious performance difference.

Measuring the execution time of new functionalities may also be interesting. For example, you can check the execution time of .apply(). To do this check, go ahead and run the following code:

Python
>>> extended_dict = ExtendedDict_dict(init_data)
>>> dict_apply = min(
...     timeit.repeat(
...         stmt="extended_dict.apply(lambda x: x**2)",
...         number=5,
...         repeat=2,
...         globals=globals(),
...     )
... )

>>> extended_user_dict = ExtendedDict_UserDict(init_data)
>>> user_dict_apply = min(
...     timeit.repeat(
...         stmt="extended_user_dict.apply(lambda x: x**2)",
...         number=5,
...         repeat=2,
...         globals=globals(),
...     )
... )

>>> print(
...     f"UserDict is {user_dict_apply / dict_apply:.3f}",
...     "times slower than dict",
... )
UserDict is 1.704 times slower than dict

The performance difference between the class based on UserDict and the class based on dict isn’t that big this time, but it still exists.

Often, when you create a custom dictionary by subclassing dict, you can expect standard dictionary operations to be more efficient in this class than in a class based on UserDict. On the other hand, new functionality may have similar execution time in both classes. How would you know which is the most efficient way to go? Well, you have to time-measure your code.
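If you end up timing several operations, then a small helper can cut down on the repetition. The best_time() function below is a hypothetical convenience wrapper around timeit.repeat(), and the number and repeat defaults are arbitrary choices. Your timings will differ from machine to machine:

```python
import timeit
from collections import UserDict


def best_time(func, number=200, repeat=5):
    """Return the fastest total time out of `repeat` runs of `number` calls."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


init_data = dict(zip(range(1000), range(1000)))

dict_time = best_time(lambda: dict(init_data))
user_dict_time = best_time(lambda: UserDict(init_data))

print(f"dict:     {dict_time:.4f} s")
print(f"UserDict: {user_dict_time:.4f} s")
print(f"Ratio:    {user_dict_time / dict_time:.1f}x")
```

Taking the minimum of several repetitions is the usual way to reduce noise from other processes running on your computer.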

It’s worth noting that if you’re aiming to modify the core dictionary functionality, then UserDict is probably the way to go because, in this case, you’ll be mostly rewriting the dict class in pure Python.

Conclusion

Now you know how to create custom dictionary-like classes with modified behavior and new functionalities. You’ve learned to do this by subclassing the built-in dict class directly and by inheriting from the UserDict class available in the collections module.

In this tutorial, you learned how to:

  • Create dictionary-like classes by inheriting from the built-in dict class
  • Identify common pitfalls of inheriting the Python built-in dict class
  • Build dictionary-like classes by subclassing UserDict from the collections module

You’ve also written some practical examples that helped you understand the pros and cons of using UserDict vs dict when creating your custom dictionary classes.

You’re now ready to create your custom dictionaries and to leverage the full power of this useful data type in Python in response to your coding needs.


About Leodanis Pozo Ramos

Leodanis is an industrial engineer who loves Python and software development. He's a self-taught Python developer with 6+ years of experience. He's an avid technical writer with a growing number of articles published on Real Python and other sites.
