Weekly Report, November 22 - 28

This was a week where I tried something new. Instead of cutting through tens of shallow PRs, I focused deeply on a single one. I installed Irit Katriel’s GH-29581 to try out PEP 654’s new except* and ExceptionGroup objects.

Let’s talk about what I found out. We’ll start with a short reminder of how exceptions work today, and then I’ll talk some about asyncio and how except* makes everything so much better (spoiler alert!).

A short review of try: except:

Exceptions are central to Python. I would argue that they are as central as our significant indentation itself. Remove exceptions from the language and what you’re left with won’t feel like Python anymore. We have a very rich built-in exception hierarchy, which the user can further extend by subclassing a given exception.

Sometimes it’s easy to see when user code raises an exception, mostly when it’s using the raise keyword:

value = fetch_dict_from_network(...)
if value is None:
    raise TypeError("can't pass None!")
do_something_with(value)

In this example it’s clear that if the value is None, an exception is raised and so the call to do_something_with(value) on the next line won’t be executed. Instead, the exception bubbles up the function call stack until it’s handled.

But in Python everything can go wrong. All of the following can raise exceptions sometimes, even without conditionals, loops, and explicit raise statements:

name.upper()
function(arg=True)
some_dict["key"]
some_list[12]
1.0 / divisor
file.write(b"data")
Path.home() / ".gitignore"

I’ll leave it as an exercise to the reader to figure out at least one way in which each of those lines can fail with an exception. When it does fail, how do we handle the exception? Using our flexible try:except: construct:

try:
    ...
except SpecificError:
    ...
except OtherSpecificError as ose:
    print(ose)
except Exception:
    ...
else:
    ...
finally:
    ...

A few things to note here:

  • if an exception raised inside the try: block is an instance of the first except: block’s type (or of a subclass of it), that block will execute once and no other blocks will execute;
  • if the raised exception didn’t match, the second except: block is tried, and if that doesn’t match, the third one, and so on;
  • if the raised exception didn’t match any except: blocks, it bubbles up the function call stack where hopefully a better try:except: will handle it;
  • if no exception was raised within the try: block, the else: block will be executed;
  • only after the try: block has executed, followed by either one of the except: blocks or the else: block, does the finally: block execute.

There are some more subtleties to how all this works, let me just name two:

  • the ose name is only available inside the except: block and it is deleted by Python afterwards;
  • users can decide to re-raise the same exception from an except: block, or the same exception with the chained context removed, or even a different exception with or without chained context.
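To make that second point concrete, here’s a small sketch of the three re-raise options. The helper name and the exceptions are made up for illustration:

```python
def reraise_styles(style: str):
    # A hypothetical helper showing the three ways to re-raise from except:.
    try:
        {}["missing"]
    except KeyError as exc:
        if style == "same":
            raise                                  # re-raise the same exception
        elif style == "no-context":
            raise ValueError("bad key") from None  # drop the chained context
        else:
            raise ValueError("bad key") from exc   # chain the context explicitly
```

With `from None` the original KeyError is suppressed from the traceback; with `from exc` it shows up as the explicit cause of the new exception.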

Finally!

As you can see, our humble try:except: is plenty complex. In fact, sometimes it’s not entirely obvious how it will behave. Here’s an example I like to use to demonstrate this:

>>> d = {0: "first", 1: "second"}
>>> for i in range(4):
...     try:
...         msg = d[i]
...     except KeyError as ke:
...         print("ERROR", ke)
...         break
...     else:
...         print(msg)
...     finally:
...         print("finally", i)

The finally: block here is executed three times: after both successful key lookups, and a third time when, after the KeyError, the exception handler uses break to leave the loop. This surprises some users. And if you think: “hey, fair enough, break is like a return, I expected that!”, then change the break in the example to continue. Will the code still suggest to you that the finally: block will be executed? Many users will be a little bewildered by this.
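If you’d like to check for yourself, here’s the continue variant as a plain script; the log list is my addition, recording the execution order so you can see that finally: runs on every one of the four iterations:

```python
d = {0: "first", 1: "second"}
log = []  # records execution order so we can inspect it afterwards
for i in range(4):
    try:
        msg = d[i]
    except KeyError as ke:
        log.append(f"ERROR {ke}")
        continue          # jumps to the next iteration...
    else:
        log.append(msg)
    finally:
        log.append(f"finally {i}")   # ...but finally still runs first, every time

print(log)
```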

But our humble finally: is an important part of Python, too. It helps us track resources responsibly, like in this immortal example:

f = open(filename)
try:
    contents = f.read()
    process(contents)
finally:
    f.close()

This looks innocent to the seasoned Python programmer but presents a few pitfalls to newcomers (or tired experts!). In particular, we have to diligently remember to open the file outside of the try: block because the error conditions of a failed open are different from what we’re trying to protect with finally:. Then we read and process the file in the try: block, and close it in the finally: block afterwards. We close it in finally: in case an exception during reading interrupts the process. But we could argue that process(contents) should actually be moved entirely outside of the try:finally: block, since we no longer need to keep the file open once we’ve read it all, right? As written, we needlessly hold onto the file descriptor.

The point is that this is a bit too verbose and too easy to get wrong. In fact, it was the motivating example of PEP 343, which added the with statement to Python in 2005. With it, the previous example turns into this:

with open(filename) as f:
    contents = f.read()
process(contents)

This is one of the most powerful examples of how notation changes the perception of complexity. This humble with statement is half the length of the previous example, always correctly closes the file, and makes it easier to see that process(contents) could and should happen without keeping the file open. Remember the with statement; we’ll be using it quite a few more times today!
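As a runnable sketch of that last point, here’s the with variant using a temporary file and a hypothetical process() that just measures the contents. By the time process() runs, the file is already closed:

```python
import os
import tempfile

def process(contents: bytes) -> int:
    # Hypothetical processing step: just measure the payload.
    return len(contents)

# Create a throwaway file so the example is self-contained.
fd, filename = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)

with open(filename, "rb") as f:
    contents = f.read()
length = process(contents)  # the file is closed here; no descriptor held
os.remove(filename)
```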

Exception Handling in asyncio Today

(Note: if you have never seen asyncio before, watch this series first.)

Let’s use a real-worldish example for this discussion. Let’s say we have a way to download contents from URLs to disk using asyncio:

import asyncio
from typing import IO

from aiohttp import ClientSession

async def download(url: str, file: IO[bytes]):
    async with ClientSession() as session:
        async with session.get(url) as response:
            file.write(await response.read())

async def get_one(url: str, path: str):
    with open(path, "wb") as f:
        await download(url, f)

Look mom, three separate with statements. Clearly they’re a good idea. We have one to open and close the file, we have one to open an HTTP(S) client, and we have one to handle a single request/response operation. Nothing special as far as asyncio goes, but quite a few things can go wrong here:

  • the output directory might not be there or the user might not have permission to write to it;
  • local disk can be out of space;
  • session.get might fail to connect if there’s no Internet connection;
  • DNS might not work;
  • response from the server might not be 200;
  • response might be big enough for the computer to go out of memory.

All of those will raise exceptions in the respective places of the example code. The nice thing about the async/await syntax is that exceptions will be raised, and can be caught, just like in plain old regular Python code, using regular raise and regular try:except:.
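For instance, wrapping an await in an ordinary try:except: works exactly as you’d expect. In this sketch, flaky() is a hypothetical stand-in for a network call that fails:

```python
import asyncio

async def flaky(url: str) -> bytes:
    # Hypothetical stand-in for a network call that can fail.
    raise ConnectionError(f"could not reach {url}")

async def main() -> str:
    try:
        await flaky("http://example.com")
    except ConnectionError as exc:     # caught just like in synchronous code
        return f"handled: {exc}"
    return "ok"

result = asyncio.run(main())
print(result)
```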

asyncio.gather is easy but limited

But asyncio wouldn’t be of much use to us if we couldn’t make it deal with multiple things at once. So we can extend this example to download multiple files at the same time like this:

async def download_many():
    await asyncio.gather(
          get_one("http://example.com/dl=f1", "f1.zip"),
          get_one("http://example.com/dl=f2", "f2.zip"),
          get_one("http://example.com/dl=f3", "f3.zip"),
    )

Note that thanks to asyncio.gather we only await once on the entire sequence of get_one() coroutines. This is nice but also scary: now up to three things can go wrong at the same time! All the complexity we talked about above is still there but now multiple exceptions might get raised by this one await. Or they should be, but today asyncio.gather will simply raise the first exception and let the other coroutines run as if nothing happened. This isn’t great if we wanted to know whether anything else went wrong, or if we wanted to cancel the other downloads when one of them fails.

You can do something like this instead, but it’s pretty ugly:

async def download_many():
    results = await asyncio.gather(
        get_one("http://example.com/dl=f1", "f1.zip"),
        get_one("http://example.com/dl=f2", "f2.zip"),
        get_one("http://example.com/dl=f3", "f3.zip"),
        return_exceptions=True,
    )
    for result in results:
        if isinstance(result, BaseException):
            ...

Now you’re mixing returned values with exceptions so you need to check each result for whether it’s an exception… and later still filter it to some particular type that you know how to recover from, or log, or retry. It’s reinventing the try:except: syntax, badly.
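Here’s a minimal sketch of that pattern, with a hypothetical work() coroutine where odd inputs fail:

```python
import asyncio

async def work(n: int) -> int:
    # Hypothetical workload: odd inputs fail, even ones succeed.
    if n % 2:
        raise ValueError(f"odd input: {n}")
    return n * 10

async def main():
    results = await asyncio.gather(
        work(0), work(1), work(2),
        return_exceptions=True,
    )
    # Separate values from exceptions, then narrow to a type we can handle:
    values = [r for r in results if not isinstance(r, BaseException)]
    errors = [r for r in results if isinstance(r, ValueError)]
    return values, errors

values, errors = asyncio.run(main())
```

Every consumer of the results has to repeat this isinstance() dance, which is exactly the sort of dispatch try:except: was invented for.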

More importantly, in this scenario the call only returns when everything is done. So in case of long timeouts this point may never1 be reached.

asyncio.wait is flexible but messy

Instead of using asyncio.gather we can use Task objects and wait for them, like this:

async def download_many():
    tasks = []
    for coro in (
        get_one("http://example.com/dl=f1", "f1.zip"),
        get_one("http://example.com/dl=f2", "f2.zip"),
        get_one("http://example.com/dl=f3", "f3.zip"),
    ):
        tasks.append(asyncio.create_task(coro))
    done, pending = await asyncio.wait(
        tasks,
        timeout=None,
        return_when=asyncio.FIRST_EXCEPTION,
    )

This is a close equivalent of the previous example but more verbose to allow configuring timeouts and behavior on exceptions. In this case I chose return_when=asyncio.FIRST_EXCEPTION which means that when await asyncio.wait returns, the done set can possibly contain a task which didn’t complete successfully while pending will possibly contain tasks that are still running.

Interestingly, if you change the call to use return_when=asyncio.ALL_COMPLETED (the default), possibly many tasks in the done set will be unsuccessful with exceptions on them. How do we check for that? Like this:

    for task in pending:
        task.cancel()
    for task in done:
        try:
            result = task.result()
        except asyncio.TimeoutError:
            ...
        except aiohttp.ClientSSLError:
            ...
        except aiohttp.ClientResponseError:
            ...
        else:
            ...

We’re iterating over tasks that are done and calling their result() method to unpack their returned values. If a task wasn’t successful, result() will re-raise the exception the task got before, so we can handle it now.
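Here’s a minimal runnable demonstration of that re-raising behavior, with a hypothetical boom() task:

```python
import asyncio

async def boom():
    # A hypothetical task that always fails.
    raise RuntimeError("failed inside the task")

async def main():
    task = asyncio.create_task(boom())
    done, pending = await asyncio.wait({task})
    for t in done:
        try:
            t.result()                 # re-raises the task's exception here
        except RuntimeError as exc:
            return f"handled: {exc}"

msg = asyncio.run(main())
print(msg)
```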

There are a few problems here. First of all, we have to remember to cancel any pending tasks we don’t intend to keep running. If we forget this step, they might keep running for a possibly long time, maybe interfering with shutdown or with a retry of the workload.

OK, so we marked all pending tasks as cancelled and now we need to iterate over all tasks manually. The happy-case scenario is now relegated to this sad little else: branch at the end of a rather complex try:except: block. But that’s a minor inconvenience compared with the cancellations. The exception logic to write here is so involved that it’s quite easy to forget to handle the cancellations or to do them subtly wrong. In fact, the code above is subtly wrong.

Well, it’s not enough to mark tasks as cancelled. You need to also do one last asyncio.wait round on that pending set with some reasonably short timeout, like this:

    for task in pending:
        task.cancel()
    done, pending = await asyncio.wait(
        pending,
        timeout=2.0,
        return_when=asyncio.ALL_COMPLETED,
    )

You need to do that because those tasks internally might look something like this:

async def some_task():
    try:
        await do_work()
    finally:
        await clean_up()

When a task is cancelled with .cancel(), this means that asyncio will raise a CancelledError from this task’s currently running await. It will do it on the next opportunity when that task is executed by the event loop. If you don’t explicitly give it that opportunity by waiting on the pending tasks one last time, they might never be able to execute their finally: blocks (or the implicit equivalent of that when exiting a with statement block). Since code in those finally: blocks might itself be asynchronous, forgetting to wait one-last-time sometimes leads to frustrating “impossible” bugs.
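Here’s a minimal sketch demonstrating the point (all names are made up): the cleanup in the task’s finally: block is itself asynchronous, and the final asyncio.wait is what gives it a chance to run:

```python
import asyncio

cleaned_up = []

async def some_task(name: str):
    try:
        await asyncio.sleep(3600)      # stands in for do_work()
    finally:
        await asyncio.sleep(0)         # the cleanup itself is asynchronous
        cleaned_up.append(name)

async def main():
    task = asyncio.create_task(some_task("t1"))
    await asyncio.sleep(0)             # give the task a chance to start
    task.cancel()                      # request cancellation...
    # ...and wait one last time so the finally: block actually runs:
    await asyncio.wait({task}, timeout=2.0)

asyncio.run(main())
print(cleaned_up)
```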

Enter except*!

Finally, after a 2,000 word buildup, we reach the actual meat of the post: exception groups! An exception group is a special subclass of Exception that looks something like this:

>>> eg = ExceptionGroup(
...   "Server process exceptions",
...   (
...     ValueError("invalid logger config"),
...     TypeError("bytes expected as payload"),
...     ExceptionGroup(
...       "Fetching current configuration",
...       (
...         ValueError("timeout can't be negative"),
...         asyncio.TimeoutError("task-03 timed out"),
...       ),
...     ),
...   ),
... )

It’s got a name and a sequence of exceptions. Note that one of those exceptions is itself an exception group. So all in all you get an exception tree. In fact, that’s how it looks when you see the traceback:

>>> raise eg
+ Exception Group Traceback (most recent call last):      
|   File "<stdin>", line 1, in <module>                   
| ExceptionGroup: Server process exceptions               
+-+---------------- 1 ----------------                    
  | ValueError: invalid logger config                     
  +---------------- 2 ----------------                    
  | TypeError: bytes expected as payload                  
  +---------------- 3 ----------------                    
  | ExceptionGroup: Fetching current configuration        
  +-+---------------- 1 ----------------                  
    | ValueError: timeout can't be negative               
    +---------------- 2 ----------------                  
    | asyncio.exceptions.TimeoutError: task-03 timed out  
    +------------------------------------                 

Exception groups have a small API: you can print the message, list the exceptions, and, most interestingly, filter the list for some particular interesting type of exception:

>>> eg.message
'Server process exceptions'
>>> eg.exceptions
(ValueError('invalid logger config'),
 TypeError('bytes expected as payload'),
 ExceptionGroup('Fetching current configuration',
     (ValueError("timeout can't be negative"),
      TimeoutError('task-03 timed out'))))
>>> eg.subgroup(asyncio.TimeoutError)
ExceptionGroup('Server process exceptions',
    [ExceptionGroup('Fetching current configuration',
        [TimeoutError('task-03 timed out')])])

Isn’t that all we need?

You might say, OK, this is it. With this construct and API we can handle multiple errors at once. So we don’t need any new syntax in Python, right?

Well, not exactly. If you use a regular try:except: block to catch an exception group, you’d have to filter it for the errors you’re interested in, and re-raise the rest in a new exception group. You’d have to use the eg.subgroup() API (and maybe eg.split() or eg.derive()) every time you’re doing error handling. This would get old real fast.

Enter except*, for real this time!

Instead, look at this glorious example:

>>> try:
...     raise eg
... except* asyncio.TimeoutError as te_group:
...     print("timeouts", te_group.exceptions)
... except* ValueError as ve_group:
...     print("values", ve_group.exceptions)
... except* Exception as others_group:
...     print("others", others_group.exceptions)
...
timeouts (ExceptionGroup('Fetching current configuration',
                         [TimeoutError('task-03 timed out')]),)
values (ValueError('invalid logger config'),
        ExceptionGroup('Fetching current configuration',
                       [ValueError("timeout can't be negative")]))
others (TypeError('bytes expected as payload'),)
>>>

Instead of having to unpack particular exceptions from the group manually, we use the new except* block to do it for us. Notice a very important distinction from the regular try:except: block:

With except*, when a single exception group is raised, each matching block will execute (at most once).

In the example all three blocks ran, exhausting all exceptions within the group. But what if we hadn’t exhausted all of them? A new exception group with the remaining unhandled exceptions would automatically be re-raised. A higher scope can handle it, maybe just as a regular exception, who knows? For example:

>>> try:
...     try:
...         raise eg
...     except* asyncio.TimeoutError as te_group:
...         print("timeouts", te_group.exceptions)
... except Exception as or_else:
...     print("else:", type(or_else), or_else)
...
timeouts (ExceptionGroup('Fetching current configuration',
                         [TimeoutError('task-03 timed out')]),)
else: <class 'ExceptionGroup'> Server process exceptions

As you can see, here we only dealt with the timeouts so everything else was still unhandled and got re-raised. We caught it as a regular Exception and handled it that way.

So exception groups can be treated like regular exceptions in code that doesn’t understand them. The opposite is true as well! If a regular exception is raised in a try: block that uses except* handlers, it will get automatically wrapped in a group so you can handle it alongside the other exceptions without any special casing:

>>> try:
...     raise ValueError("regular exception!")
... except* ValueError as ve_group:
...     print("values", ve_group.exceptions)
...
values (ValueError('regular exception!'),)

Some conclusions from playing with exception groups and except*

This is still very new functionality – the PR isn’t even landed yet – but after spending a week with it, I can already tell you that:

  • exception groups compose very well with the rest of the language, you can mostly treat them like regular exceptions in outer try:except: blocks where you only retry or log errors;
  • try:except*: is a big convenience for when you need to deal with contents of an exception group in a readable way – but it isn’t “viral”, you don’t have to convert your existing try:except: blocks, you won’t have to teach it to third-graders in their first week of Python (just like you’re not teaching them about **kwargs);
  • the except* keyword isn’t a single keyword – the star is a separate token – but the authors recommend this spelling and not except *Exception because in the presence of multiple exception types the latter looks confusing (except *(SomeError, OtherError) is worse than except* (SomeError, OtherError));
  • it’s not only for asyncio! I can easily imagine other frameworks that will use this functionality: multiprocessing, concurrent.futures, atexit, and so on.

The future of asyncio error handling!

Remember our wordy asyncio.wait example, along with the necessary cancellation of pending tasks that runs them to completion, and the ugly loop to retrieve all exceptions one by one? That was a form of resource tracking, so an obvious candidate for a with statement, right?

How about something like this?

async def download_many():
    async with TaskGroup(name="Downloads") as tg:
        for coro in (
            get_one("http://example.com/dl=f1", "f1.zip"),
            get_one("http://example.com/dl=f2", "f2.zip"),
            get_one("http://example.com/dl=f3", "f3.zip"),
        ):
            tg.create_task(coro)

This code is the equivalent of the task handling we had before. It always waits for all tasks to finish, it always cancels things properly, and always gathers multiple exceptions in a group so you can handle them like this in an outer call:

try:
    await download_many()
except* asyncio.TimeoutError:
    ...
except* aiohttp.ClientSSLError:
    ...
except* aiohttp.ClientResponseError:
    ...

I find this so much cleaner. It’s less code, easier to write correctly, and easier to read. So when can you expect that to happen?

Hopefully still for Python 3.11, alongside exception groups and except*. There seem to be some edge cases around yield that need more discussion, but there are already existing implementations of task groups in the form of Trio nurseries, as well as asyncio TaskGroups in EdgeDB.

For sure, our first stop will be to land GH-29581.


  1. For some definition of “never”. asyncio has a very loose concept of “never” and “forever” 😉 

#Programming #Python/Developer-in-Residence