The power of Python descriptors

There are two features in Python which aren't often needed in everyday programming, but are essential to the inner workings of Piccolo. The first is metaclasses, and the second is the Python descriptor protocol.

In this article we'll look at the Python descriptor protocol, and why it's so powerful. In fact, it underpins many core Python features, including bound methods, classmethod, staticmethod and property.
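To make that claim a bit more concrete, here's a rough sketch of how something like classmethod could be written in pure Python using a descriptor. It's a simplified stand-in, not the real implementation, and the names MyClassMethod, Band and table_name are made up for illustration. Don't worry about the details of __get__ yet, as it's explained in the next section.

```python
from types import MethodType


class MyClassMethod:
    """Simplified, illustrative stand-in for the built-in classmethod."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        # obj is None when accessed on the class, so objtype is used.
        if objtype is None:
            objtype = type(obj)
        # Bind the function to the class rather than to the instance.
        return MethodType(self.func, objtype)


class Band:
    @MyClassMethod
    def table_name(cls):
        return cls.__name__.lower()


print(Band.table_name())    # band
print(Band().table_name())  # band
```

Either way the function receives the class as its first argument, which is exactly the behaviour we expect from a classmethod.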

What is the descriptor protocol?

The descriptor protocol allows us to implement custom logic when an attribute is accessed, or assigned a new value. For example:

class Parent:
    child = Child()

parent = Parent()

# With the descriptor protocol we can run custom logic:

Parent.child       # when it's accessed on the class
parent.child       # when it's accessed on a class instance
parent.child = 1   # when we assign a new value to it

Like many things in Python, it's implemented using magic methods. In this case __get__ and __set__:

class Child:
    def __get__(self, obj, objtype=None):
        print("I was accessed")

    def __set__(self, obj, value):
        print("I was assigned a new value")

Which gives us the following:

parent = Parent()

parent.child
# Prints: I was accessed

parent.child = 1
# Prints: I was assigned a new value

There are lots of interesting use cases. When a value is assigned we could:

  • Store it in an external database.
  • Invalidate a cache.
  • Refresh some UI (it's not too dissimilar to how reactivity is handled in Vue JS).
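As a minimal sketch of the cache invalidation idea, the hypothetical InvalidatingField descriptor below clears a cached result every time a new value is assigned. The Rectangle class and the _area_cache attribute are assumptions made up for this example.

```python
class InvalidatingField:
    """Hypothetical descriptor which clears a cached result when it is set."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created.
        self.storage_name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage_name)

    def __set__(self, obj, value):
        setattr(obj, self.storage_name, value)
        # Anything derived from the old value is now stale.
        obj._area_cache = None


class Rectangle:
    width = InvalidatingField()
    height = InvalidatingField()

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        if self._area_cache is None:
            self._area_cache = self.width * self.height
        return self._area_cache


rect = Rectangle(2, 3)
print(rect.area())  # 6
rect.width = 10     # triggers __set__, which clears the cache
print(rect.area())  # 30
```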

When a value is read we could:

  • Calculate the value dynamically.
  • Fetch the value from an external source.
  • Log the value.
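Here's a small sketch which combines the first and last ideas: the value is calculated on every read, and each read is recorded. FullName, Musician and the reads list are hypothetical names used purely for illustration.

```python
class FullName:
    """Hypothetical descriptor which derives its value on each access."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Calculate the value dynamically from other attributes.
        value = f"{obj.first_name} {obj.last_name}"
        # Keep a simple log of every value that was read.
        obj.reads.append(value)
        return value


class Musician:
    full_name = FullName()

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
        self.reads = []


musician = Musician("Ada", "Lovelace")
print(musician.full_name)   # Ada Lovelace
musician.last_name = "Byron"
print(musician.full_name)   # Ada Byron
print(len(musician.reads))  # 2
```

This is close to what the built-in property gives us, except the logic lives in a reusable class which can be attached to as many attributes as we like.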

Context

What makes the descriptor protocol extra interesting is the obj argument which is provided to the __get__ and __set__ methods.

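In __get__, the obj argument is either None or a class instance. In __set__ it's always an instance, because assigning to the attribute on the class itself replaces the descriptor rather than calling __set__.

  • When obj is None, the attribute was accessed on a class (e.g. Parent.child).
  • When obj is a class instance, the attribute was accessed on that instance (e.g. Parent().child).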

We're able to customise the behaviour depending on where it was called from. A trivial example:

class Child:
    def __get__(self, obj, objtype=None):
        if obj is None:
            print("I was accessed from a class.")
        else:
            print("I was accessed from a class instance.")
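A slightly less trivial sketch: because a single descriptor object is shared by every instance of the class, obj is also what lets us store a separate value per instance. The version of Child below is an illustration only, and uses __set_name__ to learn which attribute name it was assigned to.

```python
class Child:
    def __set_name__(self, owner, name):
        # Remember the attribute name ("child" in this example).
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class, so return the descriptor itself.
            return self
        # Accessed on an instance, so return that instance's value.
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        # Store the value on the instance, not on the shared descriptor.
        obj.__dict__[self.name] = value


class Parent:
    child = Child()


a = Parent()
b = Parent()
a.child = 1
b.child = 2
print(a.child, b.child)             # 1 2
print(type(Parent.child).__name__)  # Child

# Assigning on the class does not call __set__, it replaces the descriptor.
Parent.child = 3
print(Parent.child)                 # 3
```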

In an ORM like Piccolo, having this information is incredibly valuable.

In the example below, the name attribute represents the column type:

class Band(Table):
    name = Varchar()

But when we do a database query, the name attribute returns the value in the database instead.

band: Band = await Band.objects().first()
>>> band.name
'Pythonistas'
>>> type(band.name)
str

Getting the type annotations right was a real head scratcher: how do you annotate an attribute whose type depends on the context it's accessed from?

It turns out we can do this using descriptors:

class Varchar(Column):
    ...

    @typing.overload
    def __get__(self, obj: Table, objtype=None) -> str:
        ...

    @typing.overload
    def __get__(self, obj: None, objtype=None) -> "Varchar":
        ...

    def __get__(self, obj, objtype=None):
        # This is Piccolo specific:
        return obj.__dict__[self._meta.name] if obj is not None else self

Type checkers such as mypy now know when name is a Varchar, and when it's a str.
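For anyone who wants to experiment, here's a self-contained sketch of the same pattern. The Column and Table classes are heavily simplified stand-ins and not Piccolo's real implementation; the value is looked up via self.name (set by __set_name__) instead of self._meta.name, and the __set__ method is an assumption added so that __get__ is consulted even when the instance already holds a value.

```python
import typing


class Column:
    """Simplified stand-in, not Piccolo's real Column."""

    def __set_name__(self, owner, name):
        self.name = name


class Table:
    """Simplified stand-in, not Piccolo's real Table."""

    def __init__(self, **values):
        self.__dict__.update(values)


class Varchar(Column):
    @typing.overload
    def __get__(self, obj: None, objtype: typing.Any = None) -> "Varchar": ...

    @typing.overload
    def __get__(self, obj: Table, objtype: typing.Any = None) -> str: ...

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj: Table, value: str) -> None:
        obj.__dict__[self.name] = value


class Band(Table):
    name = Varchar()


band = Band(name="Pythonistas")
print(type(Band.name).__name__)  # Varchar
print(band.name)                 # Pythonistas
```

Running a type checker over this file and adding reveal_type(Band.name) and reveal_type(band.name) should report Varchar and str respectively, which is the behaviour described above.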

Conclusions

This just scratches the surface of descriptors. As mentioned in the intro, they're not needed every day, but they help us solve really tricky problems, and unlock some interesting design space for Python libraries.

Posted on: 24 Jan 2022