What's New in SQLAlchemy 2.0?


You may have heard that a major version of SQLAlchemy, version 2.0, was released in January 2023. Or maybe you missed the announcement and this is news to you. Either way, I thought you'd be curious to know what's new in it, whether it is worth upgrading, and how difficult the upgrade is.

As with previous software reviews, this is going to be an opinionated overview. I have been using the SQLAlchemy ORM in web projects for a long time, so in this article I will discuss the features that affect my own work, in both positive and negative ways. If you are instead interested in seeing a list of every change that went into this new release, then the official change log is the place to go.

For The Impatient

If you just want to see what a real-world project written for SQLAlchemy 2.0 looks like and don't want to read through the analysis that follows, you can head over to my microblog-api project, which I just upgraded to use the latest SQLAlchemy features.

No More Query Object

The first change I'm going to discuss is the new query interface. To be exact, this feature was introduced in the SQLAlchemy 1.4 releases as a way to help developers transition to 2.0, so you may have already seen it.

The now "legacy" way to issue queries in the SQLAlchemy ORM consisted of using a Query object, available from the Session.query() method or, for those using the Flask-SQLAlchemy extension for Flask, also as Model.query. Here are some examples:

# using native SQLAlchemy
user = session.query(User).filter(User.username == 'susan').first()

# using Flask-SQLAlchemy
user = User.query.filter(User.username == 'susan').first()

In release 2.0 this is considered the "old" way of executing queries. You can still run queries in this way, but the documentation refers to this interface as the "1.x Query API" or the "legacy Query API".

The new Query API has a very clear separation between the queries themselves and the execution environment in which they run. The above query to look for a user by the username attribute can now be written as follows:

query = select(User).where(User.username == 'susan')

This example stores the query in the query variable. At this point the query has not been executed yet, and is not even associated with a session.

To execute the query, it can be passed to the execute() method of a session object:

results = session.execute(query)

The returned value from the execute() call is a Result object, which functions as an iterable that returns Row objects with a similar interface to a named tuple. If you prefer to get the results without iterating over them, there are a few methods that can be called on this object, some of which are:

  • all() to return a list with a row object for each result row.
  • first() to return the first result row.
  • one() to return the first result row, and raise an exception if there are no results or more than one result.
  • one_or_none() to return the first result row, None if there are no results, or raise an exception if there is more than one result.

Working with results as tuples makes sense when each result row can contain multiple values, but when there is a single value per row it can be tedious to extract the values out of single-element tuples. The session has two additional execution methods that make working with single-value rows more convenient to use:

  • scalars() returns a ScalarResult object with the first value of each result row. The methods of Result listed above are also available on this new result object.
  • scalar() returns the first value of the first result row, or None if there are no results.

The example legacy query shown above can be executed as follows using the new interface:

user = session.scalar(query)

If you are used to the old query object it can take some time to become familiar with the new way to issue queries, but I personally find the separation between creating and executing queries a great improvement. Once again keep in mind that the legacy queries can still be used, so you can transition gradually to the new style.

Session Context Managers

Something I'm very excited to see is the introduction of context managers for sessions, which were also included in release 1.4 to help developers with a migration towards 2.0. I have always implemented my own session context managers and even explained how to do it in a video from a few years ago.

In 1.3 and older releases, the scoped session was the main pattern for working with sessions. The Flask-SQLAlchemy extension, for example, embraced it with the db.session variable, which is its signature. To be honest, I have never liked scoped sessions because they tie a session to a thread, which is completely arbitrary. In many cases the life of a session is much shorter than that of a thread, so you have to resort to manual management to get things to work in the intended way.

Now a session can be instantiated with a context manager, so there is a clear start and end. Here is an example:

with Session() as session:
    session.add(user)
    session.commit()

Here the session is closed when the context manager block ends. If an error occurs inside it, any changes that were not committed are discarded when the session closes.

A variant of this pattern can be used to have a session that automatically commits at the end, while still rolling back on errors:

with Session() as session:
    with session.begin():
        session.add(user)

Typing Improvements

Another interesting change introduced in release 2.0 is the option to use typing hints to declare columns and relationships in models. Consider the following definition for a User model:

from sqlalchemy import Column, Integer, String

# Model is assumed to be the declarative base class of the application
class User(Model):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    username = Column(String(64), index=True, unique=True, nullable=False)
    email = Column(String(120), index=True, unique=True, nullable=False)
    password_hash = Column(String(128))
    about_me = Column(String(140))

In 2.0, the column type can be defined with the Mapped typing hint. If there are any additional options, they can be given in a mapped_column() call.

from typing import Optional
import sqlalchemy as sa
import sqlalchemy.orm as so

class User(Model):
    __tablename__ = 'users'

    id: so.Mapped[int] = so.mapped_column(primary_key=True)
    username: so.Mapped[str] = so.mapped_column(sa.String(64), index=True, unique=True)
    email: so.Mapped[str] = so.mapped_column(sa.String(120), index=True, unique=True)
    password_hash: so.Mapped[Optional[str]] = so.mapped_column(sa.String(128))
    about_me: so.Mapped[Optional[str]] = so.mapped_column(sa.String(140))

Relationships are typed in the same way. Here is an example:

class Post(Model):
    # ...
    author: so.Mapped['User'] = so.relationship(back_populates='posts')

Note how the type can be given as a string, which is sometimes necessary when the referenced class is defined later in the module (a forward reference).

Overall this isn't very different, but using typing hints can provide some benefits:

  • If you use an IDE that runs static analysis on your code and provides suggestions as you type, having a strongly typed model will help your IDE understand your code better.
  • A related feature introduced in 2.0 is an integration between models and dataclasses, which also relies on the typing hints.
  • One little benefit that I noticed is that there are fewer symbols that need to be imported from SQLAlchemy. For columns that are numbers, dates, times, or even UUIDs, you can now use the typing hint to define them, without having to import the corresponding type classes from SQLAlchemy (sadly the String() class is still needed because for many databases a maximum length must be given).

As with the queries, the old way of defining columns and relationships continues to be supported.

Write-Only Relationships

If you followed my Flask tutorials, you know that I have always recommended the lazy='dynamic' option for relationships that can be large, as this makes it possible to add pagination, sorting and filtering before SQLAlchemy goes and retrieves the related objects.

Unfortunately dynamic relationships are also considered legacy in SQLAlchemy 2.0, as they are incompatible with the new query interface. Instead, the recommended solution is a new relationship type called "Write-Only". Here is how to define a write-only relationship:

class User(Model):
    # ...
    tokens: so.WriteOnlyMapped['Token'] = so.relationship(back_populates='user')

Why the name "write-only"? The difference with the old dynamic relationship is that the write-only relationship does not load (or read) the related objects, but provides add() and remove() methods to make changes (or write) to it.

How do you get the related objects then? The relationship exposes a select() method that returns a query that you can execute on the session, possibly after adding filters, sorting or pagination. This is less automatic and definitely less magical than the dynamic relationship, but it supports the same use cases.

Here is an example of how to get the token objects associated with a given user, sorted by their expiration date:

tokens = session.scalars(user.tokens.select().order_by(Token.expiration)).all()

To be honest, it took me a while to accept that dynamic relationships are a thing of the past, because I really, really like them. But I do understand that their design is incompatible with the new query interface. I struggled to build an application with SQLAlchemy 1.4 that did not use dynamic relationships, and complaints from me and others on the SQLAlchemy discussion board are what resulted in the addition of the write-only relationship in 2.0.

Async Support

Version 1.4 introduced a beta version of the asyncio extension, which provides async versions of the Engine and Session objects with awaitable methods. In release 2.0 this extension is not considered beta anymore.

The most important thing to keep in mind with regard to the asyncio support in SQLAlchemy 2.0 is that many of the best features of SQLAlchemy are possible because database instructions are often issued implicitly, for example as a result of the application accessing an attribute of a model instance. When using the asynchronous paradigm, implicit I/O as a result of an attribute being accessed is not possible because all database activity must happen while the application is awaiting a function call, as this is what makes concurrency possible.

A big part of setting up an asynchronous solution with SQLAlchemy involves preventing all the ways in which normally there would be implicit database actions, so you will need to have at least a basic understanding of how SQLAlchemy works under the hood, and be prepared to receive obscure error messages if you missed anything. The asyncio extension documentation explains the issues and provides guidance on what needs to be done, so getting a working solution is definitely possible, but it will not be as flexible as what you get under regular Python.

Conclusion

I hope this was a useful review of what I consider the most important changes in SQLAlchemy 2.0. The microblog-api repository contains a complete, non-trivial API project based on Flask, with database support provided by SQLAlchemy 2.0 and my Alchemical extension. Feel free to try it out if you'd like to see the new SQLAlchemy features in action!

Learn SQLAlchemy 2 with my New Book!

That's right, I am working on a new book that will teach you SQLAlchemy 2 by building an actual database project from the very beginning and in small incremental steps. The expected release date of the book is April 30th, 2023. If you would like to secure your copy, I have made it available to pre-order from the Kindle store with a 30% introductory discount.


34 comments
  • #1 Fips said

    Brilliant overview, thank you Miguel! I bet I'll be referring back to this page if/when I decide to upgrade.

  • #2 Abdur-Rahmaan Janhangeer said

    SQLAlchemy 2.0 is the SQLAlchemy i always wanted. Result class and decoupling of queries is just really awesome as these are production gems.

  • #3 ernstl said

    Hi Miguel,

    thx for the nice and quick overview + tipps. Yesterday was 2.0.4 release.

    Lazy feature back :)

    The Session.refresh() method will now immediately load a relationship-bound attribute that is explicitly named within the Session.refresh.attribute_names collection even if it is currently linked to the “select” loader, which normally is a “lazy” loader that does not fire off during a refresh. The “lazy loader” strategy will now detect that the operation is specifically a user-initiated Session.refresh()...

    Rgds

  • #4 Miguel Grinberg said

    @ernstl: I think you have a misunderstanding. The "lazy" feature has never been removed. The 2.0.4 release has a minor improvement in the Session.refresh() method, that's all there is to it. There is nothing important that this changes.

  • #5 Andrew said

    What is the "so" in this line?

    email: so.Mapped[str] = so.mapped_column(String(120), index=True, unique=True)

  • #6 Miguel Grinberg said

    @Andrew: Sorry, I should have included those in the example. I have updated it now, so is an abbreviation for sqlalchemy.orm and sa is an abbreviation for sqlalchemy:

    import sqlalchemy as sa
    import sqlalchemy.orm as so
    
  • #7 CJ said

    Hi Miguel,

    Thank you for this post explaining the SQLAlchemy 2.0. I was having a lot of difficulty reading through the documentation while also trying to follow along to your amazing tutorial.

    I have some questions that I would really appreciate your help with:

    1. I noticed that in the models.py, you imported sqlalchemy as sa, but you don't seem to be using this. Is there a reason for importing sqlalchemy in this module?

    2. I tried using "results = session.execute(query)" in the Flask Shell, but I get a "NameError: name 'session' is not defined". It works if I use db.session.execute(query) instead. How were you able to just use session instead of db.session?

    3. Where did you import the sqlalchemy's select module? I tried to import it in the init, but even then, when I start a flask session from CMD and try to query, I still get a NameError if I don't import the select module from sqlalchemy.

    Thanks so much!

  • #8 Miguel Grinberg said

    @CJ: I think you are assuming the snippets of code I'm showing in this blog post are complete examples that you can run. You are also assuming I'm using Flask-SQLAlchemy, but this post discusses SQLAlchemy in general, not in a Flask context. Answers:

    1. There is no models.py file shown in this article, just a snippet of one. I'm not sure which file you are looking at, but I'm almost certain it is not related to this article.
    2. db.session only makes sense if you are using Flask-SQLAlchemy.
    3. select is a function that you import from SQLAlchemy. Or with Flask-SQLAlchemy you have it as db.select.
  • #9 Salvatore Fusto said

    Hi Miguel, thanks and congrats for your brilliant post. I'd like to know if your book on SQLAlchemy is, or will be, available in PDF format in addition to Kindle.
    Best regards

  • #10 Miguel Grinberg said

    @Salvatore: It will be available both as an e-book and paper book on Amazon. I'm not currently planning to sell it through other channels.

  • #11 Salvatore Fusto said

    Hello, I just discovered this interesting blog, so thanks for it. I have a little question: in your opinion, what is best for developing a Flask app, Flask + SQLAlchemy or Flask + Flask-SQLAlchemy?
    thanks and regards

  • #12 Miguel Grinberg said

    @Salvatore: The difference is not very significant. Flask-SQLAlchemy makes some things a bit more convenient, but not in a major way.

  • #13 Sjoerd said

    Hi Miguel,

    I'm working on an application that is loosely based (the boilerplate) on the mega-tutorial. I'm developing it at home and just use "flask run" to fire up the application. At work I use gunicorn. It works great, but recently the application does not "see" database mutations - I need to restart gunicorn. I added some dirty print statements, and changing an ORM object through a form works fine. After the db.session.commit() I query...first() that object and it seems OK, but when I move away from the page and get back to it, old (e.g. cached?) data is shown. Again, when I fire it up with just "flask run" there are no issues.
    I have the feeling it has something to do with gunicorn starting workers. Would moving to 2.0 fix this issue (which I do not want at this point, to be fair) or do I need to change something?

    Best Regards
    Sjoerd

  • #14 Miguel Grinberg said

    @Sjoerd: I doubt a change to SQLAlchemy 2 will change anything for you. My guess is that this is an issue with your deployment, not with SQLAlchemy.

  • #15 George Bill said

    This is a wonderful post. Thanks for the update.

  • #16 Hendy Xu said

    That's great. This tutorial has been very helpful to me.

  • #17 Jochen Reinholdt said

    Hi Miguel

    Are you planning to update the microblog course + book with SQLAlchemy 2?

  • #18 Miguel Grinberg said

    @Jochen: I'm working on it, yes.

  • #19 Shahrukh said

    Hi Miguel, I just finished reading the book & doing the exercises. I'm trying to look back at the microblog app and figure out how much of the book I can start applying until your (much awaited) new course comes out :) My question is, do you see Alchemical & Flask-Migrate & Flask-SQLAlchemy being part of an app? I just added this to the microblog course and am wondering if installing Alchemical is the way to go instead.
    db = SQLAlchemy(
        metadata=MetaData(naming_convention={
            "ix": 'ix_%(column_0_label)s',
            "uq": "uq_%(table_name)s_%(column_0_name)s",
            "ck": "ck_%(table_name)s_%(constraint_name)s",
            "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
            "pk": "pk_%(table_name)s",
        })
    )

  • #20 Miguel Grinberg said

    @Shahrukh: I think there are two options. You can continue using Flask-SQLAlchemy and Flask-Migrate, or else you can switch to Alchemical. Both options are valid. My microblog-api project uses Alchemical in combination with Flask-Migrate, so that is a good project to get inspiration from if you like Alchemical.

  • #21 Abdurrasheed said

    Hi Miguel, thank you for your great tutorial. Do you have a tutorial for your microblog-api?

  • #22 Miguel Grinberg said

    @Abdurrasheed: Not at this time. You can read the source code to learn how that project works.

  • #23 Kescopee said

    Hi Miguel,

    I'm currently going through your SQLAlchemy 2 book. I have now encountered an error while trying to import the orders from orders.csv. Below is the error:
    "
    sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
    (psycopg2.errors.NotNullViolation) null value in column "product_id" of relation "orders_items" violates not-null constraint
    DETAIL: Failing row contains (null, a1a4b77a-8f57-4d99-a7ad-5a0f7ac49e90, 57.59, 1).

    [SQL: INSERT INTO orders_items (order_id, unit_price, quantity) VALUES (%(order_id)s::UUID, %(unit_price)s, %(quantity)s)]
    [parameters: {'order_id': UUID('a1a4b77a-8f57-4d99-a7ad-5a0f7ac49e90'), 'unit_price': 57.59, 'quantity': 1}]
    (Background on this error at: https://sqlalche.me/e/20/gkpj)
    "
    Also, when I tried the example code on page 147, the targeted entry in the order_items table could not be deleted. I made sure that the 'oi' had an OrderItem instance.

    Can you point me in the right direction, please?

  • #24 Miguel Grinberg said

    @Kescopee: Not sure exactly what the problem is, but the error occurs because the script is trying to insert an order item in which the product is not set. The product is the "null" that appears in the error message:

    DETAIL: Failing row contains (null, a1a4b77a-8f57-4d99-a7ad-5a0f7ac49e90, 57.59, 1)
    

    My guess is that your importer script has a logic error. You may want to download my version of the script from GitHub. Hopefully if you compare your version against mine you will find the mistake.

  • #25 Kescopee said

    Hi Miguel,

    Thank you for the quick response. I've gone through my script but got the same. Even replaced my script with yours (i.e from GitHub retrofun) and still got the same error. In the import script, I added a try-except block to detect where the error is happening using SQLAlchemyError. Below is the section:

    product = all_products.get(row['product1'])
    if product is None:
        print(f"Product before: {product}")
        try:
            product = session.scalar(select(Product).where(Product.name == row['product1']))
        except exc.SQLAlchemyError:
            print(f"No product with name: {row['product1']}")
        print(f"Product after: {product}\n")
        all_products[row['product1']] = product
    

    going through the console output, I noticed that the script could not find some of the product in the products table (i.e product1). Below is the error from the console:

    SAWarning: Column 'orders_items.product_id' is marked as a member of the primary key for table 'orders_items', but has no Python-side or server-side default generator indicated, nor does it indicate 'autoincrement=True' or 'nullable=True', and no explicit value is passed.  Primary key columns typically may not store NULL. Note that as of SQLAlchemy 1.1, 'autoincrement=True' must be indicated explicitly for composite (e.g. multicolumn) primary keys if AUTO_INCREMENT/SERIAL/IDENTITY behavior is expected for one of the columns in the primary key. CREATE TABLE statements are impacted by this change as well on most backends. (This warning originated from the Session 'autoflush' process, which was invoked automatically in response to a user-initiated operation.)
      product = session.scalar(select(Product).where(Product.name == row['product1']))
    No product with name: Orao
    Product after: None
    
    Product before: None
    No product with name: ZX80
    Product after: None
    
    Product before: None
    No product with name: Timex Computer 2048
    Product after: None
    

    Not clear to me what might be causing this.
