
What's coming in Python 3.8

By Jake Edge
July 17, 2019

The Python 3.8 beta cycle is already underway, with Python 3.8.0b1 released on June 4, followed by the second beta on July 4. That means that Python 3.8 is feature complete at this point, which makes it a good time to see what will be part of it when the final release is made. That is currently scheduled for October, so users don't have that long to wait to start using those new features.

The walrus operator

The headline feature for Python 3.8 is also its most contentious. The process for deciding on PEP 572 ("Assignment Expressions") was a rather bumpy ride that eventually resulted in a new governance model for the language. That model meant that a new steering council would replace longtime benevolent dictator for life (BDFL) Guido van Rossum for decision-making, after Van Rossum stepped down in part due to the "PEP 572 mess".

Out of that came a new operator, however, that is often called the "walrus operator" due to its visual appearance. Using ":=" in an if or while statement allows assigning a value to a variable while testing it. It is intended to simplify things like multiple-pattern matches and the so-called loop and a half, so:

    m = re.match(p1, line)
    if m:
        return m.group(1)
    else:
        m = re.match(p2, line)
        if m:
            return m.group(2)
        else:
            m = re.match(p3, line)
            ...
becomes:
    if m := re.match(p1, line):
        return m.group(1)
    elif m := re.match(p2, line):
        return m.group(2)
    elif m := re.match(p3, line):
        ...
And a loop over a non-iterable object, such as:
    ent = obj.next_entry()
    while ent:
        ...   # process ent
        ent = obj.next_entry()
can become:
    while ent := obj.next_entry():
        ... # process ent
These and other uses (e.g. in list and dict comprehensions) help make the intent of the programmer clearer. It is a feature that many other languages have, but Python has, of course, gone without it for nearly 30 years at this point. In the end, it is actually a fairly small change for all of the uproar it caused.
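
As a minimal sketch of the comprehension case mentioned above, an assignment expression lets a filter test a computed value and reuse it without computing it twice. The parse_int() helper and the sample data here are hypothetical and are not taken from the PEP:

```python
def parse_int(text):
    """Hypothetical helper: return int(text), or None if text is not an integer."""
    try:
        return int(text)
    except ValueError:
        return None


def valid_ints(fields):
    # Each field is parsed once; the walrus binds the result so that the
    # condition can test it and the output expression can reuse it.
    return [n for field in fields if (n := parse_int(field)) is not None]


print(valid_ints(['10', 'x', '-3', '', '0']))   # [10, -3, 0]
```

Without ":=", the comprehension would either call parse_int() twice per field or need to be unrolled into an explicit loop.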

Debug support for f-strings

The f-strings (or formatted strings) added into Python 3.6 are quite useful, but Pythonistas often found that they were using them the same way in debugging output. So Eric V. Smith proposed some additional syntax for f-strings to help with debugging output. The original idea came from Larry Hastings and the syntax has gone through some changes, as documented in two feature-request issues at bugs.python.org. The end result is that instead of the somewhat cumbersome:

    print(f'foo={foo} bar={bar}')
Python 3.8 programmers will be able to do:
    print(f'{foo=} {bar=}')
The new form prints each expression followed by the repr() of its value, so the string keeps its quotes:
    >>> foo = 42
    >>> bar = 'answer ...'
    >>> print(f'{foo=} {bar=}')
    foo=42 bar='answer ...'

Beyond that, some modifiers can be used to change the output: "!s" uses the str() representation rather than the default repr() value, and a format specifier following a colon gives access to the usual formatting controls. They can be used as follows:

    >>> import datetime
    >>> now = datetime.datetime.now()
    >>> print(f'{now=} {now=!s}')
    now=datetime.datetime(2019, 7, 16, 16, 58, 0, 680222) now=2019-07-16 16:58:00.680222

    >>> import math
    >>> print(f'{math.pi=:.2f}')
    math.pi=3.14

One more useful feature, though it is mostly cosmetic (as is the whole feature in some sense), is the preservation of spaces in the f-string "expression":

    >>> a = 37
    >>> print(f'{a = }, {a  =  }')
    a = 37, a  =  37
The upshot of all of that is that users will be able to pretty-print their debugging, log, and other messages more easily. It may seem somewhat trivial in the grand scheme, but it is sure to see a lot of use. F-strings have completely replaced other string interpolation mechanisms for this Python programmer and I suspect I am far from the only one.
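
A short, self-contained sketch of the behavior described in this section follows; the variable names and values are made up for illustration, and it needs Python 3.8 or later to run:

```python
from datetime import date

count = 3
name = 'walrus'
day = date(2019, 7, 17)

# "=" echoes the expression text, then the repr() of its value.
print(f'{count=} {name=}')    # count=3 name='walrus'

# "!s" switches to str(); compare the two renderings of a date.
print(f'{day=} {day=!s}')     # day=datetime.date(2019, 7, 17) day=2019-07-17

# Any expression works, and whitespace around "=" is preserved.
print(f'{count * 2 = }')      # count * 2 = 6
```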

Positional-only parameters

Another change for 3.8 affords pure-Python functions the same options for parameters that those implemented in C already have. PEP 570 ("Python Positional-Only Parameters") introduces new syntax that can be used in function definitions to denote positional-only arguments—parameters that cannot be passed as keyword arguments. For example, the builtin pow() function must be called with bare arguments:

    >>> pow(2, 3)
    8
    >>> pow(x=2, y=3)
    ...
    TypeError: pow() takes no keyword arguments

But if pow() were a pure-Python function, as an alternative Python implementation might want, there is no easy way to force that behavior. A function could accept only *args and **kwargs, then enforce the condition that kwargs is empty, but that obscures what the function is trying to do. There are other reasons described in the PEP, but many, perhaps most, are not things that the majority of Python programmers will encounter very often.

Those that do, however, will probably be pleased that they can write a pure-Python pow() function, which will behave the same as the builtin, as follows:

    def pow(x, y, z=None, /):
        r = x**y
        if z is not None:
            r %= z
        return r
The "/" denotes the end of the positional-only parameters in an argument list. The idea is similar to the "*" that can be used in an argument list to delimit keyword-only arguments (those that must be passed as keyword=...), which was specified in PEP 3102 ("Keyword-Only Arguments"). So a declaration like:
    def fun(a, b, /, c, d, *, e, f):
        ...
says that a and b must be passed positionally, c and d can be passed either positionally or by keyword, and e and f must be passed by keyword. So:
    fun(1, 2, 3, 4, e=5, f=6)          # legal
    fun(1, 2, 3, d=4, e=5, f=6)        # legal
    fun(a=1, b=2, c=3, d=4, e=5, f=6)  # illegal
It seems likely that most Python programmers have not encountered "*"; "/" encounter rates are likely to be similar.
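
To make the contrast concrete, here is a rough sketch that puts the older *args/**kwargs workaround next to the new syntax. The names pow_compat() and accepts() are hypothetical helpers invented for this example, and the error messages are not those of the builtin:

```python
def pow_compat(*args, **kwargs):
    """Hypothetical pre-3.8 emulation of a positional-only signature."""
    if kwargs:
        raise TypeError('pow_compat() takes no keyword arguments')
    if not 2 <= len(args) <= 3:
        raise TypeError(f'pow_compat() expected 2 or 3 arguments, got {len(args)}')
    x, y, z = (*args, None)[:3]   # z falls back to None when omitted
    r = x ** y
    if z is not None:
        r %= z
    return r


def fun(a, b, /, c, d, *, e, f):
    # With "/" and "*", the interpreter enforces the calling rules itself.
    return (a, b, c, d, e, f)


def accepts(func, *args, **kwargs):
    """Return True if the call succeeds, False if it raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return False
    return True


print(accepts(fun, 1, 2, 3, 4, e=5, f=6))            # True
print(accepts(fun, 1, 2, 3, d=4, e=5, f=6))          # True
print(accepts(fun, a=1, b=2, c=3, d=4, e=5, f=6))    # False: a and b are positional-only
print(accepts(fun, 1, 2, 3, 4, 5, 6))                # False: e and f are keyword-only
print(pow_compat(2, 3), accepts(pow_compat, x=2, y=3))   # 8 False
```

The workaround behaves correctly, but its signature says nothing about what it accepts; the "/" version carries that information in the definition, where tools and readers can see it.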

A movable __pycache__

The __pycache__ directory is created by the Python 3 interpreter (starting with 3.2) to hold .pyc files. Those files contain the byte code that is cached after the interpreter compiles .py files. Earlier Python versions simply dropped the .pyc file next to its .py counterpart, but PEP 3147 ("PYC Repository Directories") changed that.

The intent was to support multiple installed versions of Python, along with the possibility that some of those might not be CPython at all (e.g. PyPy). So, for example, standard library files could be compiled and cached by each Python version as needed. Each would write a file of the form "name.interp-version.pyc" into __pycache__. So, for example, on my Fedora system, foo.py will be compiled when it is first used and __pycache__/foo.cpython-37.pyc will be created.

That's great from an efficiency standpoint, but may not be optimal for other reasons. Carl Meyer filed a feature request asking for an environment variable to tell Python where to find (and put) these cache files. He was running into problems with permissions in his system and was disabling cache files as a result. So, he added a PYTHONPYCACHEPREFIX environment variable (also accessible via the -X pycache_prefix=PATH command-line flag) to point the interpreter elsewhere for storing those files.
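
The effect can be inspected from within Python 3.8 without touching the environment, since the setting is exposed as sys.pycache_prefix and importlib.util.cache_from_source() consults it. The following is a sketch only; the source path and the prefix are hypothetical, and the cache_paths() helper is invented for the example:

```python
import importlib.util
import sys


def cache_paths(source, prefix):
    """Return (default, relocated) bytecode cache paths for a source file."""
    saved = sys.pycache_prefix
    try:
        sys.pycache_prefix = None      # default: __pycache__ next to the source
        default = importlib.util.cache_from_source(source)
        sys.pycache_prefix = prefix    # what PYTHONPYCACHEPREFIX would set at startup
        relocated = importlib.util.cache_from_source(source)
    finally:
        sys.pycache_prefix = saved     # leave the interpreter as we found it
    return default, relocated


# Hypothetical paths, chosen only for illustration.
default, relocated = cache_paths('/home/user/project/foo.py', '/var/cache/pyc')
print(default)
print(relocated)
```

On a POSIX system, the first path should sit in a __pycache__ directory beside foo.py, while the second should mirror the source file's absolute path underneath the prefix, with no __pycache__ component.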

And more

Python 3.8 will add a faster calling convention for C extensions based on the existing "fastcall" convention that is used internally by CPython. It is exposed in experimental fashion (i.e. names prefixed with underscores) for Python 3.8, but is expected to be finalized and fully released in 3.9. The configuration handling in the interpreter has also been cleaned up so that the language can be more easily embedded into other programs without having environment variables and other configuration mechanisms interfere with the installed system Python.

There are new features in various standard library modules as well. For example, the ast module for processing Python abstract syntax trees has new features, as do statistics and typing. And on and on. The draft "What's New In Python 3.8" document has lots more information on these changes and many others. It is an excellent reference for what's coming in a few months.

The status of PEP 594 ("Removing dead batteries from the standard library") is not entirely clear, at least to me. The idea of removing old Python standard library modules has been in the works for a while, the PEP was proposed in May, and it was extensively discussed after that. Removing unloved standard library modules is not particularly controversial—at least in the abstract—until your favorite module is targeted, anyway.

The steering council has not made a pronouncement on the PEP, nor has it delegated to a so-called BDFL-delegate. But the PEP is clear that even if it were accepted, the changes for 3.8 would be minimal. Some of the modules may start raising the PendingDeprecationWarning exception (many already do since they have been deemed deprecated for some time), but the main change will be in the documentation. All of the 30 or so modules will be documented as being on their way out, but the actual removal will not happen until the 3.10 release—three or so years from now.
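
For readers who have not run into it, PendingDeprecationWarning is a warning category that Python's default filters suppress, which is why such modules can carry it without most users noticing. A small sketch of how a module might emit it and how a test could surface it follows; load_legacy_module() is a hypothetical stand-in, not one of the modules named in the PEP:

```python
import warnings


def load_legacy_module():
    """Hypothetical stand-in for importing a module slated for removal."""
    warnings.warn(
        'legacy_module is deprecated and slated for removal',
        PendingDeprecationWarning,
        stacklevel=2,
    )
    return 'loaded'


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')    # the default filters hide this category
    result = load_legacy_module()

print(result, [w.category.__name__ for w in caught])
```

Running a program with "python -W default" is another way to see warnings that are normally hidden.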

The future Python release cadence is still under discussion; currently Python 3.9 is scheduled for a June 2020 release, much sooner than usual. Python is on an 18-month cycle, but that is proposed to change to nine months (or perhaps a year). In any case, we can be sure that Python 3.8 will be here with the features above (and plenty more) sometime before Halloween on October 31.


What's coming in Python 3.8

Posted Jul 17, 2019 20:03 UTC (Wed) by iabervon (subscriber, #722) [Link]

It's worth noting that PEP 570 lets you have a function like:
def messagef(fmt, *args, **kwargs):
    ...
    fmt.format(*args, **kwargs)
    ...
without getting problems if someone wants to use fmt as one of the names in the format string, and without fishing it out of the args list and producing your own error if it's missing. Not to mention trying to make pure Python act like "{self}".format(self=6).

What's coming in Python 3.8

Posted Jul 17, 2019 22:52 UTC (Wed) by adobriyan (subscriber, #30858) [Link]

re.match example with returns is better written as
m = re.match(p1, line)
if m:
    return m.group(1)
m = re.match(p2, line)
if m:
    return m.group(2)
m = re.match(p3, line)
which is exactly how you would write it before.

If someone is in PEP writing mood, here is another nice to have thing — bitmasks syntax:

#m[01x_]
Unlike usual anding and comparing it is immediately obvious which bits should and should not be set. Example:
if (p[0] == 0m0xxx_xxxx) {
...
} else if (p[0] == 0m110x_xxxx && p[1] == 0m10xx_xxxx) {
...
} ...
Use 0m in C, C++, Rust, etc. Use #m in Common Lisp. Capital X is not accepted.

Should be doable in C++ with user defined literals: operator""_m8.

In general, Python switching to "democratic" mode is a sad thing to hear. Programming languages are becoming application-like, judged by the feature list, all similar to one another. The 21st century is becoming the century of Programming Language Republics.

What's coming in Python 3.8

Posted Jul 17, 2019 23:10 UTC (Wed) by jake (editor, #205) [Link]

Yes, something more like this would have been a better example:
    m = re.match(p1, line)
    if m:
        # do something with m
    else:
        m = re.match(p2, line)
        if m:
            # do something with m
        else:
            m = re.match(p3, line)
            ...
jake

What's coming in Python 3.8

Posted Jul 26, 2019 10:28 UTC (Fri) by poruid (guest, #15924) [Link]

Indeed, constructs like
if m:
    return m.group(1)
else:
    ....
can be found in lots and lots of code, in many languages.
After a "return" statement, the rest of that branch is never reached, so the "else" is not needed.
It expresses a misapprehension of "return".

I like the walrus operator

Posted Jul 18, 2019 14:00 UTC (Thu) by david.a.wheeler (guest, #72896) [Link]

I like the walrus operator. It makes certain kinds of code much more succinct, and I find it easier to read.

It's true that a language shouldn't pick up an idea just because another one does. But there's a reason this kind of operation is common. The ":=" assignment operator is used in many other languages, so it's also easy to remember for many people.

What's coming in Python 3.8

Posted Jul 21, 2019 6:02 UTC (Sun) by t-v (guest, #112111) [Link]

"In general, Python switching to "democratic" mode is a sad thing to hear. Programming languages are becoming application-like, judged by the feature list, all similar to one another."

The part that bothers me about it is that all the heavy-weight governance makes Python ultimately self-focused instead of developer-focused.

As seen here, headline improvements are for a handful of corner-cases where people thought they can shave off 10 lines of code and 2 indentation levels somewhere.

At the same time, the Python REPL is just clumsy, when that used to be part of why Python is easy to learn and work with. IPython/Jupyter shows the route to a better experience, but it's a shame that it's so bolted-on and is not taken as a source of inspiration for Python improvements.
Similarly, some more serious Typing support (have you ever tried to enforce typing at run time?) would be a great thing - to some that is much more problematic than e.g. speed.
The leaking batteries are another example - your core product might have more features by including them, but the overall experience suffers.

From trying to address one of my tiny pet peeves of that, I first got a prolonged explanation that I'm a complete moron for the improvement I want from the "Get along in the Python community" colleague who might have had a bad day. When I had recovered from that encounter a year later, a core developer looked at it, found "someone else's problem, should be rejected" within less than 15 minutes of first seeing it - the answer sounding to me like he had not tried to understand why I felt it needed to be addressed in Python itself.

It's great to see Python developed, but the development seems to have lost connection to the vision of a "people-centered" programming language that kept me using it for >25 years.

What's coming in Python 3.8

Posted Jul 21, 2019 22:03 UTC (Sun) by flussence (subscriber, #85566) [Link]

That 0m notation is one of the more useful language ideas I've seen in a long time; short, simple, and solves a common problem.

It's a long way from the unified pattern-matching + destructuring assignment in the likes of Erlang, but given how awkward bitmask arithmetic can be in C-family languages I'm surprised it hasn't been done before. Hopefully someone reading this is in a position to adopt it.

What's coming in Python 3.8

Posted Jul 25, 2019 16:59 UTC (Thu) by lordsutch (guest, #53) [Link]

I would imagine in Python this sort of complex bitmask comparison (something more than testing single flag bits) is sufficiently rare that it would be, at best, a stdlib utility function:
def test_bitmask(value: int, mask: str) -> bool:
    mask = mask.replace("_", "")
    testval: int = 1
    for bit in reversed(mask):
        if bit == '1' and not (value & testval):
            return False
        elif bit == '0' and (value & testval):
            return False
        elif bit != "x":
            raise ValueError('Invalid bitmask character: '+bit)
        testval <<= 1
    return True

assert( test_bitmask(0x80, '1xxx_xxxx') == True )
assert( test_bitmask(0x3f, '0xx1_xxxx') == True )
assert( test_bitmask(0x3f, '0xx1_xxx0') == False )
This could probably be optimized a bit more (no pun intended...) but it'd work. I might use '.' instead of 'x' to make the mask look more like a regex, and perhaps avoid confusion with '0x' as a hex number, instead.

What's coming in Python 3.8

Posted Jul 25, 2019 19:21 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> avoid confusion with '0x' as a hex number, instead.

Well, `0.` is no less confusing there :) .

What's coming in Python 3.8

Posted Jul 25, 2019 20:16 UTC (Thu) by johill (subscriber, #25196) [Link]

You have a bug there in the negative case :-) Here's a version with that fixed, and operator overloading:
class Bitmask(object):
    def __init__(self, mask: str):
        self._mask = list(reversed(mask.replace('_', '')))

    def _test(self, value: int) -> bool:
        testval: int = 1
        for bit in self._mask:
            if bit == '1':
                if not (value & testval):
                    return False
            elif bit == '0':
                if value & testval:
                    return False
            elif bit != "x":
                raise ValueError(f'Invalid bitmask character: "{bit}"')
            testval <<= 1
        return True

    def __eq__(self, other):
        if type(other) == Bitmask:
            return other._mask == self._mask
        if type(other) != type(0):
            return NotImplemented
        return self._test(other)

tests = [
    "0x80 == Bitmask('1xxx_xxxx')",
    "0x3f == Bitmask('0xx1_xxxx')",
    "0x3f != Bitmask('0xx1_xxx0')",
    "0x40 != Bitmask('0000_xxxx')",
    "'0x80' == Bitmask('1xxx_xxx')",
    "None == Bitmask('1xxx_xxx')",
    "None != Bitmask('1xxx_xxx')",
    "Bitmask('1xxx_xxx') == Bitmask('1xxx_xxx')",
    "Bitmask('1xxx_xxx') == Bitmask('xxxx_xxx')",
    "Bitmask('1xxx_xxx') == Bitmask('0xxx_xxx')",
]

for test in tests:
    print(f'{test}: {eval(test)}')

What's coming in Python 3.8

Posted Jul 17, 2019 23:41 UTC (Wed) by debacle (subscriber, #7114) [Link]

> F-strings have completely replaced other string interpolation mechanisms for this Python programmer and I suspect I am far from the only one.

I would use f-strings, if I could use them with gettext in a safe and convenient way.

What's coming in Python 3.8

Posted Jul 18, 2019 15:52 UTC (Thu) by barryascott (subscriber, #80640) [Link]

And by design you cannot translate f strings.
That has led to me never using them in any of my code.

What's coming in Python 3.8

Posted Jul 18, 2019 1:09 UTC (Thu) by Kamilion (subscriber, #42576) [Link]

I really hope they make 4.0 the release after 3.9.

Mainly so all of these new operators and syntax introduced over the python3 lifetime become reliable with a single version check.

if int(sys.version_info[0]) >= 4:
    pass  # oh hey I know all the async, await, walrus, and f-string behaviors will be around.

vs

if int(sys.version_info[0]) >= 3:
    if int(sys.version_info[1]) >= 5:
        do_all_the_things() # Go forth!
    else:
        print(u"You need at least python 3.5 to continue.")
        exit(1)

Code many years in the field already does similar for py3, with a 3 instead of a 4 there to make use of py2.7/py3.4+ compat, QT4/5 code is common to see this with. Yeah, there's better ways to do things, I'm sure.

It has already been said in the past that 4.0 won't be a 'breaking' release like 3.0 was.

"Hey, we cleaned up a few sharp edges left in 3.9, updated a few of the included batteries, didn't really add much new this release, here's 4.0."

There's nothing I'd like to see more than a 'ho hum, tying up loose maintenance ends, nothing amazing here' release.

Seeing 3.10, 3.11, 3.20, ugh, that'd just be a pain after 3.0, 3.1, and 3.2 being given the frowning of a lifetime for missing u"".

Please. For the love of sanity. 3.5's already a minimum for anything using await, and it's one of those things that once you've tasted the performance of, you just can't go back. Please. Don't make me go back. It's dark there and the UTF-16 structs are huge and terrifying. (Also, micropython only implements 3.5 compatibility...)

4.0, sooner, rather than later.

What's coming in Python 3.8

Posted Jul 18, 2019 4:48 UTC (Thu) by tbodt (subscriber, #120821) [Link]

if sys.version_info >= (3, 10):

What's coming in Python 3.8

Posted Jul 18, 2019 15:59 UTC (Thu) by smurf (subscriber, #17840) [Link]

What's the problem with comparing sys.version_info >= (3,10) instead of (4,0)?

Contrast with the heap of code that mistakenly checks the major version against ==3 instead of >=3 …

What's coming in Python 3.8

Posted Jul 18, 2019 17:23 UTC (Thu) by karkhaz (subscriber, #99844) [Link]

This probably isn't a mistake. It seems defensive and entirely sensible, given the magnitude of breaking changes that the last major version bump introduced. Similar to licensing your software as GPL 2 rather than GPL 2+ if you don't trust that the FSF won't introduce clauses that you won't like in GPL 3.

What's coming in Python 3.8

Posted Jul 19, 2019 7:38 UTC (Fri) by rschroev (subscriber, #4164) [Link]

> It seems defensive and entirely sensible, given the magnitude of breaking changes that the last major version bump introduced.

Python doesn't introduce large changes willy-nilly, nor are new major versions going to break things like Python 3 did. The changes in Python 3 were a one-time event and were announced a long time in advance. That's not going to happen again anytime soon (probably never).

A movable __pycache__ - vulnerability?

Posted Jul 18, 2019 1:35 UTC (Thu) by amworsley (subscriber, #82049) [Link]

Wonder if there are any vulnerabilities with putting the cache directory on /tmp via an environment variable.

e.g. If you run a python script in a trusted location, but the environment sets the pycache directory to be on /tmp and that directory contains a matching compiled file, will the interpreter now skip the original python file and load the cached one from this /tmp directory? That would allow someone who can add an environment variable to trick a process starting up a python script into running an arbitrary compiled python file, by switching the cache directory to an attacker-controlled directory.

A movable __pycache__ - vulnerability?

Posted Jul 18, 2019 6:09 UTC (Thu) by 0x01 (guest, #112039) [Link]

the privilege level of being able to set an environment variable for a user is the same as being able to run arbitrary code as that user, no?

A movable __pycache__ - vulnerability?

Posted Jul 18, 2019 8:59 UTC (Thu) by FLHerne (guest, #105373) [Link]

I think the concern was that some naive admin might set PYTHONPYCACHEPREFIX to /tmp, perhaps hoping not to accumulate 'stale' cache files, without realizing the vulnerability.

A movable __pycache__ - vulnerability?

Posted Jul 18, 2019 13:43 UTC (Thu) by kiall (guest, #133240) [Link]

> I think the concern was that some naive admin might set PYTHONPYCACHEPREFIX to /tmp, perhaps hoping not to accumulate 'stale' cache files, without realizing the vulnerability.

At the same time a naive admin might `chmod -R ugo+w /`. Having the ability to control where that cache lives is IMO a good thing, even if some admins will make mistakes while using it.

A movable __pycache__ - vulnerability?

Posted Jul 19, 2019 11:11 UTC (Fri) by gdamjan (subscriber, #33634) [Link]

that's why all services should be run with `PrivateTmp=yes` systemd directive

A movable __pycache__ - vulnerability?

Posted Jul 26, 2019 18:26 UTC (Fri) by k8to (guest, #15413) [Link]

There are cases where this is untrue, for better or worse. Env vars may be viewed as a configuration system for setting options, or as a method to provide input (eg the ancient cgi interface). It's sketchy for sure as some env vars are extremely powerful, and some systems try to limit access to some of these. But there are situations in which env vars are expected to not be equivalent to code execution.

Whether this specific case represents a new problem over the other types of env vars that could enable code execution, I'm not really expert enough to comment, but there are blacklist env var approaches in the field, so adding a new such power-env-var can breach those types of defenses.

A movable __pycache__ - vulnerability?

Posted Jul 27, 2019 4:15 UTC (Sat) by flussence (subscriber, #85566) [Link]

An example to the contrary is daemontools' envdir mechanism, where the permissions for modifying a process's environment are entirely disjoint from what user it runs as. Usually the envdir and its contents are root-owned, but that's just a convention.


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds