Python 3.9 is around the corner
Python 3.9.0rc2 was released on September 17, with the final version scheduled for October 5, roughly a year after the release of Python 3.8. Python 3.9 will come with new operators for dictionary unions, a new parser, two string operations meant to eliminate some longstanding confusion, as well as improved time-zone handling and type hinting. Developers may need to do some porting for code coming from Python 3.8 or earlier, as the new release has removed several previously-deprecated features still lingering from Python 2.7.
Python 3.9 marks the start of a new release cadence. Up until now, Python has done releases on an 18-month cycle. Starting with Python 3.9, the language has shifted to an annual release cycle as defined by PEP 602 ("Annual Release Cycle for Python").
A table provided by the project shows how Python performance has changed in a number of areas since Python 3.4. It is interesting to note that Python 3.9 is worse than 3.8 on almost every benchmark in that table, though it does perform generally better than 3.7. That said, it is claimed that several Python constructs such as range, tuple, list, and dict will see improved performance in Python 3.9, though no specific performance benchmarks are given. The boost is credited to the language making more use of a fast-calling protocol for CPython that is described in PEP 590 ("Vectorcall: a fast calling protocol for CPython").
As the PEP explains, Vectorcall replaces the existing tp_call convention, which has poor performance because it must create intermediate objects for a call. While CPython has special-case optimizations to speed up this process for calls to Python and built-in functions, those do not apply to classes or third-party extension objects. Additionally, tp_call does not provide a function pointer per object (only per class), again requiring the creation of several intermediate objects when making calls to classes. Vectorcall is faster because it does not have the same intermediate-object inefficiencies that are found in tp_call. Vectorcall was introduced in Python 3.8, but starting with version 3.9 it is used for the majority of the Python calling conventions.
New operators and methods
Python 3.9 includes new dictionary union operators, | and |=, which we have previously covered; they are used to merge dictionaries. The | operator evaluates as a union of two dictionaries, while the |= operator stores the result of the union in the left-hand side of the operation:
>>> z = {'a' : 1, 'b' : 2, 'c' : 3}
>>> y = {'c' : 'foo', 'd' : 'bar' }
>>> z | y
{'a': 1, 'b': 2, 'c': 'foo', 'd': 'bar'}
>>> z |= y
>>> z
{'a': 1, 'b': 2, 'c': 'foo', 'd': 'bar'}
There are many ways dictionaries can be merged in Python, but Andrew Barnert said that the operator is designed to address the "copying update":
The problem is the copying update. The only way to spell it is to store a copy in a temporary variable and then update that. Which you can’t do in an expression. You can do _almost_ the same thing with {**a, **b}, but not only is this ugly and hard to discover, it also gives you a dict even if a was some other mapping type, so it’s making your code more fragile, and usually not even doing so intentionally.
In situations where the two dictionaries share a common key, the last-seen value for a key "wins" and is included in the merge as shown above for key c. While the standard union operator | only allows unions between dict types, the assignment variety |= can be used to update a dictionary with new key-value pairs from an iterable object:
>>> z = {'a' : 'foo', 'b' : 'bar', 'c' : 'baz'}
>>> y = ((0, 0), (1, 1), (2, 8))
>>> z |= y
>>> z
{'a': 'foo', 'b': 'bar', 'c': 'baz', 0: 0, 1: 1, 2: 8}
PEP 584 ("Add Union Operators To dict") provides complete documentation of the new operators.
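The behaviors described above (right-hand value wins, | only accepts mappings while |= also accepts an iterable of pairs, and | keeps the left operand's mapping type where {**a, **b} does not) can be checked with a short script. This is only an illustrative sketch for Python 3.9 or later; the dictionary contents and variable names are made up for the example.

```python
from collections import OrderedDict

defaults = {"host": "localhost", "port": 8080}
overrides = {"port": 9090, "debug": True}

# "|" builds a new dict; on a shared key the right-hand value wins.
merged = defaults | overrides
print(merged)    # {'host': 'localhost', 'port': 9090, 'debug': True}
print(defaults)  # {'host': 'localhost', 'port': 8080} (left operand untouched)

# "|" keeps the left operand's mapping type; {**a, **b} always gives a dict.
ordered = OrderedDict(a=1)
print(type(ordered | {"b": 2}).__name__)       # OrderedDict
print(type({**ordered, **{"b": 2}}).__name__)  # dict

# "|" refuses a plain iterable of pairs, while "|=" accepts one.
pairs = [("timeout", 30)]
union_error = None
try:
    defaults | pairs
except TypeError as exc:
    union_error = exc
print(type(union_error).__name__)  # TypeError

defaults |= pairs
print(defaults)  # {'host': 'localhost', 'port': 8080, 'timeout': 30}
```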
Two new string methods have also been added in version 3.9: removeprefix() and removesuffix(). These convenience methods make it easy to remove an unwanted prefix or suffix from string data. As described in PEP 616 ("String methods to remove prefixes and suffixes"), these functions are being added to address user confusion regarding the str.lstrip() and str.rstrip() methods, which are often mistaken as a means to remove a prefix or suffix from a string. According to the PEP, the confusion stems from the optional string parameter of str.lstrip() and str.rstrip(), which is interpreted as a set of individual characters to remove, rather than as a single substring. With the additions, the project hopes to provide a "cleaner redirection of users to the desired behavior." Using these new methods is straightforward, as shown below:
>>> a = "PEP-616"
>>> a.removeprefix("PEP-")
'616'
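The character-set behavior that PEP 616 describes is easiest to see when the text after the prefix (or before the suffix) happens to contain the same characters. The following sketch uses made-up strings and needs Python 3.9 or later:

```python
url = "www.wikipedia.org"

# lstrip() treats "www." as the character set {'w', '.'}, so it also eats
# the leading "w" of "wikipedia".
print(url.lstrip("www."))        # ikipedia.org
print(url.removeprefix("www."))  # wikipedia.org

name = "season.json"

# rstrip() strips any trailing run of '.', 'j', 's', 'o', 'n' characters.
print(name.rstrip(".json"))        # sea
print(name.removesuffix(".json"))  # season

# When the prefix is absent, the string comes back unchanged.
print("README".removeprefix("PEP-"))  # README
```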
Deprecation and porting
Developers should be aware of some features that are being deprecated and removed in 3.9, as well as some more deprecations that are coming in 3.10. Many Python 2.7 functions that emit a DeprecationWarning in version 3.8 "have been removed or will be removed soon" starting with version 3.9. The project recommends testing applications with the -W default command-line option, which will show these warnings, before upgrading. As we previously covered, certain backward-compatibility layers, such as the aliases to Abstract Base Classes in the collections module, will remain for one last release before being removed in Python 3.10. The complete listing of removals in version 3.9 is available for interested readers.
Further, the release includes numerous new deprecations of language features that will be removed in a future release. An additional recommendation is to run tests in Python Development Mode using the -X dev option to prepare code bases for future changes.
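As a rough illustration of what running with -W default surfaces, the sketch below enables the same "default" filter action from inside a script and records the warning raised by a deprecated call. The old_api() function is hypothetical and merely stands in for a deprecated standard-library feature; a real application would simply be run as "python -W default app.py" or "python -X dev app.py".

```python
import warnings


def old_api():
    """Hypothetical function standing in for a deprecated feature."""
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return 42


with warnings.catch_warnings(record=True) as caught:
    # The "default" action shows each warning once per call location,
    # which is the action that the -W default option asks for.
    warnings.simplefilter("default")
    result = old_api()

for entry in caught:
    print(f"{entry.category.__name__}: {entry.message}")
```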
Other goodies
As we reported, Python 3.9 ships with a new parsing expression grammar (PEG) parser to replace the current LL(1) parser in version 3.8. According to PEP 617 ("New PEG parser for CPython"), which describes the change, the switch to the PEG parser will eliminate "multiple 'hacks' that exist in the current grammar to circumvent the LL(1)-limitation." This should help the project substantially reduce the maintenance cost for the parser.
Python introduced type hinting in version 3.5; the 3.9 release allows types like List and Dict to be replaced with the built-in list and dict varieties. Type hints in Python are mostly for linters and code checkers, as they are not enforced at run time by CPython. PEP 585 ("Type Hinting Generics In Standard Collections") provides a listing of collections that have become generics. Note that, with version 3.9, importing the types (from typing) that are now built-in is deprecated. It sounds like developers will have plenty of time to update their code, however, as according to the PEP: "the deprecated functionality will be removed from the typing module in the first Python version released 5 years after the release of Python 3.9.0."
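A minimal sketch of the built-in generics in use, assuming Python 3.9 or later; the count_words() function is invented for the example. It also shows that the hints are not checked at run time: passing a tuple where list[str] is annotated works without complaint.

```python
def count_words(lines: list[str]) -> dict[str, int]:
    """Count whitespace-separated words; no import from typing is needed."""
    counts: dict[str, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


print(count_words(["a b", "b c"]))  # {'a': 1, 'b': 2, 'c': 1}

# The subscripted built-in is an ordinary object at run time.
print(list[str])  # list[str]

# CPython does not enforce the annotation: a tuple is accepted too.
print(count_words(("x y",)))  # {'x': 1, 'y': 1}
```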
Thanks to flexible function and variable annotations, as described in PEP 593 ("Flexible function and variable annotations"), Python 3.9 has a new Annotated type. This allows the decoration of existing types with context-specific metadata:
charType = Annotated[int, ctype("char")]
This metadata can be used in either static analysis or at run time; it is ignored entirely if it is unused. It is designed to enable tools like mypy to perform static type checking and provides access to the metadata at run time via get_type_hints(). To provide backward compatibility with version 3.8, a new include_extras parameter has been added to the get_type_hints() function with a default value of False, retaining the same behavior as existed in version 3.8. When include_extras is set to True, get_type_hints() will return the defined Annotated type for use.
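The effect of include_extras can be sketched as follows, assuming Python 3.9 or later. The ctype class here is a hypothetical stand-in for whatever metadata object a project defines (it is not a library API), and put_char() is likewise invented for the example.

```python
from typing import Annotated, get_type_hints


class ctype:
    """Hypothetical metadata marker, mirroring the ctype("char") example."""

    def __init__(self, kind: str) -> None:
        self.kind = kind


CharType = Annotated[int, ctype("char")]


def put_char(value: CharType) -> None:
    ...


# Default behavior (same as 3.8): the metadata is stripped.
plain = get_type_hints(put_char)
print(plain["value"])  # <class 'int'>

# With include_extras=True the Annotated wrapper, and its metadata, survive.
rich = get_type_hints(put_char, include_extras=True)
print(rich["value"].__metadata__[0].kind)  # char
```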
Various other language changes can be expected in Python 3.9. __import__() now raises ImportError instead of ValueError when a relative import goes past the top-level package. Decorators have also been improved as described in PEP 614 ("Relaxing Grammar Restrictions On Decorators"), allowing any valid expression (defined as "anything that's valid as a test in if, elif, and while blocks") to be used to invoke them. In Python 3.8, the expressions available for use to invoke a decorator are limited. While the decorator grammar limitations "were rarely encountered in practice", according to the PEP, they occurred often enough over the years to be worth fixing in 3.9. The PEP has an example showing how PyQt5 currently works around the limitations.
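To give a feel for the relaxed grammar, the sketch below uses a subscript and a conditional expression as decorators, both of which the older grammar rejected with a SyntaxError. All of the names are made up for illustration, and the code needs Python 3.9 or later to parse.

```python
def shout(func):
    """Decorator that upper-cases the wrapped function's string result."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


def identity(func):
    """Decorator that leaves the function unchanged."""
    return func


transforms = {"loud": shout, "plain": identity}
verbose = True


# A subscript expression as a decorator.
@transforms["loud"]
def greet(name):
    return f"hello, {name}"


# A conditional expression as a decorator.
@(shout if verbose else identity)
def farewell(name):
    return f"goodbye, {name}"


print(greet("lwn"))     # HELLO, LWN
print(farewell("lwn"))  # GOODBYE, LWN
```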
Two new modules are provided as part of the Python 3.9 standard library: zoneinfo and graphlib. The zoneinfo module, which we have previously covered, provides support for the IANA time zone database and includes zoneinfo.ZoneInfo, which is a concrete datetime.tzinfo implementation allowing users to load time zone data identified by an IANA name. The graphlib module provides graphlib.TopologicalSorter, a class that implements topological sorting of graphs. In addition to these two new modules, many existing modules were improved in various ways. One notable change involves the asyncio module, which no longer supports the reuse_address parameter of asyncio.loop.create_datagram_endpoint() due to "significant security concerns." The bug report describes a problem when using SO_REUSEADDR on UDP in Linux environments. Setting SO_REUSEADDR allows multiple processes to listen on sockets for the same UDP port, which will pass incoming packets to each randomly; setting reuse_address to True in a Python script would enable this behavior.
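As a brief illustration of the graphlib module mentioned above, the sketch below orders a small, made-up set of build steps with graphlib.TopologicalSorter. The graph maps each node to the set of nodes that must come before it; the step names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Each key depends on the nodes in its value set (its predecessors).
steps = {
    "deploy": {"test", "package"},
    "test": {"build"},
    "package": {"build"},
    "build": set(),
}

order = list(TopologicalSorter(steps).static_order())
# "build" comes first and "deploy" last; the relative order of "test" and
# "package" is not specified.
print(order)

# A cycle cannot be sorted and raises CycleError.
cycle_detected = False
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
except CycleError:
    cycle_detected = True
print(cycle_detected)  # True
```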
There are a lot of interesting things worth checking out in Python 3.9, and the project's "What's new in Python 3.9" document is recommended for all the details. Additionally, the changelog provides an itemized list of changes between release candidates. Since no more release candidates of Python 3.9 are expected before the final version, developers may want to start testing their existing code to get a head start on the final release.
Python 3.9 is around the corner
Posted Sep 22, 2020 22:53 UTC (Tue) by mathstuf (subscriber, #69389) [Link]
>>> a = "PEP-616"
>>> a.removeprefix("PEP-")
'616'
So it returns the result without modifying `a`? I feel like `remove` was a poor choice since I think this usually indicates modification of `self` (list.remove, set.remove). Though strings are immutable, so maybe that's the rationale? I feel like `without` may have been a better choice, but I suppose that ship has sailed by now.
Python 3.9 is around the corner
Posted Sep 23, 2020 0:36 UTC (Wed) by NYKevin (subscriber, #129325) [Link]
At this point, it is fair to assume the average developer is accustomed to str being immutable and having imperative method names.
Python 3.9 is around the corner
Posted Sep 23, 2020 9:38 UTC (Wed) by intgr (subscriber, #39733) [Link]
> Users may benefit from remembering that "strip" means working with sets of characters, while other methods work with substrings, so re-using "strip" here should be avoided.
Python 3.9 is around the corner
Posted Sep 23, 2020 11:21 UTC (Wed) by rurban (guest, #96594) [Link]
Python 3.9 is around the corner
Posted Sep 24, 2020 9:34 UTC (Thu) by LtWorf (subscriber, #124958) [Link]
So if you know python, you are not too surprised that a new string function doesn't suddenly make them mutable.
Python 3.9 is around the corner
Posted Sep 24, 2020 14:02 UTC (Thu) by mathstuf (subscriber, #69389) [Link]
Python 3.9 is around the corner
Posted Sep 22, 2020 23:45 UTC (Tue) by nix (subscriber, #2304) [Link]
Sigh. Haven't they learned their lesson from the whole Python 3 mess? Every time you do this you lose users, mostly for good. Do what Java does. Deprecate but never remove unless there is no alternative at all. (Ideally, the deprecation message should say what it's replaced with, or provide something machine-readable so that Python can do the replacement for you. Even *GCC* has things like this now. It's not rocket science.)
Python 3.9 is around the corner
Posted Sep 23, 2020 0:04 UTC (Wed) by cesarb (subscriber, #6266) [Link]
But Java did exactly that. In Java 11, they removed many APIs which had been deprecated in Java 9 (which had been released one year earlier, and skipped by many people since not only it wasn't a LTS release, but also had many breaking changes).
Python 3.9 is around the corner
Posted Sep 23, 2020 8:50 UTC (Wed) by nix (subscriber, #2304) [Link]
Myself... I have several things I know use APIs that are disappearing, *and* apps that insist on using the very latest features and that work on only one minor version of Python: thankfully (and probably not by chance) most Python libraries work on many minor versions, but if one of them ceases to be maintained, all the apps that use it are in possibly terminal trouble.
I think I can see why virtualenv is popular, but, y'know, it's not a positive on Python's part that people have to crock around its compatibility deficiencies by automating the installation of multiple interpreters. Can you imagine people doing that with a C compiler? Oh yes this is my C++98 compiler and this one over here is for C++17, it's got an incompatible libstdc++ so you have to run all the apps with this wrapper script... yes I know MSVC does that (and even has incompatible mallocs with different VS runtimes) but nobody thinks that's a better situation than the converse, right?
Python 3.9 is around the corner
Posted Sep 23, 2020 11:16 UTC (Wed) by pizza (subscriber, #46) [Link]
Yeah. virtualenv is a strength, but the fact that it's _necessary_ is anything but.
To make things worse, as upstream python has completely abandoned python2, they have started removing features from virtualenv (and pip) that some "older" codebases relied on to set up their environments. What a clusterf***.
Python 3.9 is around the corner
Posted Sep 24, 2020 7:55 UTC (Thu) by ehiggs (subscriber, #90713) [Link]
The biggest reason is that Android targets JDK8 so if you want your library available on Android you also use JDK8.
Python 3.9 is around the corner
Posted Sep 24, 2020 8:50 UTC (Thu) by ehiggs (subscriber, #90713) [Link]
Not sure if your nick is related, but this is exactly how nix (and guix) work(s). You build using your toolchain, install to a directory and then update a symlink to point to your bin which uses an RPath or local libraries to do the work. Then if you need to rollback you update the symlink again.
If your nick isn't related, I recommend checking nix out. It's really good for handling application installs.
Python 3.9 is around the corner
Posted Sep 24, 2020 12:23 UTC (Thu) by nix (subscriber, #2304) [Link]
Python 3.9 is around the corner
Posted Sep 30, 2020 13:46 UTC (Wed) by anton (subscriber, #25547) [Link]
> people have to crock around its compatibility deficiencies by automating the installation of multiple interpreters. Can you imagine people doing that with a C compiler?
Yes, I install as many gcc and clang versions as is practical (and preferably the oldest ones) because newer versions tend to work for fewer programs. Of course, the gcc and clang advocates claim that these programs had it coming. Maybe the solution for the Python advocates is to adopt this kind of attitude rather than talking about removing features. But it probably would not work that well for Python, because 1) there are more alternatives for Python than for C; and 2) You can sell feature removals to many C programmers by appealing to their elitism and promising them speedups (without presenting empirical results); I think this would not work so well for Python programmers.
Python 3.9 is around the corner
Posted Oct 3, 2020 22:44 UTC (Sat) by MrWim (subscriber, #47432) [Link]
A better comparison between C and Python in this regard is Python vs C dynamic libraries. In both languages you'll find 3rd party library authors that are blasé with compatibility - but I think Python 3.9 breaking compatibility would be analogous to backwards-incompatible changes to libc. And of course C also gives the option to statically link.
This isn't an entirely fair comparison of course. libc is minimal, and dependencies are a sufficient hassle in C that they are generally avoided if they can be. Whereas one of the strengths of Python is the fully featured standard library and significant and convenient package ecosystem.
Python 3.9 is around the corner
Posted Sep 23, 2020 2:10 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]
Well, this is just stupid. I was using reuse_port exactly because I wanted multiple processes to load-balance the packets, and in a container environment (so UIDs don't matter).
Python 3.9 is around the corner
Posted Sep 23, 2020 2:22 UTC (Wed) by pizza (subscriber, #46) [Link]
[/grumble]
...FFS, gets(), the patron saint of memory overflows, remains available in spite of it being formally deprecated 21 years ago, and its man page starting out with "_Never use this function_"
Python 3.9 is around the corner
Posted Sep 23, 2020 4:07 UTC (Wed) by dvdeug (subscriber, #10998) [Link]
C did do a lot of problematic quasi-deprecation, with strict aliasing causing a lot of nightmares along the way. You can't compile an older C program on a modern compiler and expect it to work.
Python 3.9 is around the corner
Posted Sep 23, 2020 11:02 UTC (Wed) by pizza (subscriber, #46) [Link]
Actually... you can. All modern C compilers [1] and their libraries allow compilation against older [2] revisions of the C spec; they just default to newer revisions. While strictly speaking that's due to the _implementation_ rather than the _specification_, the fact remains that you're far, far, far more likely to be able to successfully compile a 30-year-old C or C++ codebase than even 5-year-old Python code (even putting aside the still ongoing python2-3 debacle).
[1] Which, these days, pretty much just means GCC and LLVM/Clang as most proprietary compilers are built on the latter these days
[2] Including the original K&R C.
Python 3.9 is around the corner
Posted Sep 23, 2020 17:08 UTC (Wed) by NYKevin (subscriber, #129325) [Link]
I don't think I've ever heard of anyone, in this day and age, targeting a Python significantly older than 2.5, and even that is pretty rare.
Python 3.9 is around the corner
Posted Sep 23, 2020 4:06 UTC (Wed) by NYKevin (subscriber, #129325) [Link]
I read the bug. reuse_port is not reuse_address, so this complaint is misdirected.
There are two APIs we care about here:
* SO_REUSEADDR a/k/a reuse_address - Exists on basically everything with a sockets API. Is mostly sane (albeit with significant differences between *NIX implementations), but on UDP under Linux, it causes very weird behavior (as described in the article) because UDP is connectionless (packets aren't associated with specific connections, so you end up multiplexing all of your incoming traffic with somebody else's socket). On the other hand, its "original" (BSD) meaning is something like "allow sockets to bind to similar non-identical wildcard addresses, and also kill any lingering TCP sessions that conflict" - most people only care about the latter, so SO_REUSEADDR on UDP is (on BSD) usually a mistake. So they (the Python devs) pulled reuse_address out of *just the UDP API.* You can still use this for TCP connections as far as I can tell, and you can still manually set socket options by creating the socket first and handing it off to asyncio once you've configured it. You might (theoretically) want to do that if your UDP protocol establishes (logical) connections by moving to a higher port after exchanging one packet (e.g. TFTP), and you want to load balance it in-kernel. But if you have a multi-packet handshake, or if UDP ports are pre-negotiated over TCP or something, that was never really going to work anyway.
* SO_REUSEPORT a/k/a reuse_port - Exists on most platforms but not Windows. Specifically designed to ask for the "weird" behavior described above, but is stricter than reuse_address, particularly on Linux. They discussed pulling this one altogether because it's "low level" and doesn't exist on Windows, but apparently they decided that breaking backcompat for something that isn't actually a security vulnerability is not worth it, so for now, reuse_port lives on (but its days might be numbered).
See also https://stackoverflow.com/a/14388707 for further discussion of what these flags mean on different platforms.
Python 3.9 is around the corner
Posted Sep 23, 2020 6:56 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]
As far as I remember, neither one was accessible through native Python and I had to do the socketopt call using FFI.
Python 3.9 is around the corner
Posted Sep 23, 2020 10:12 UTC (Wed) by dvrabel (subscriber, #9500) [Link]
You probably want to use SO_REUSEPORT instead (see https://lwn.net/Articles/542629/).
Python 3.9 is around the corner
Posted Sep 23, 2020 8:21 UTC (Wed) by NYKevin (subscriber, #129325) [Link]
That's actually really neat. Lots of interesting problems can be expressed in terms of topological sorting (broadly speaking, problems of the form "A needs to go before B and C, B needs to go before D..."). This is exactly the sort of generic algorithm you would expect to find in a standard library, and I'm shocked I haven't seen it in Python (or other programming languages, to the extent I've looked) before now.
Python 3.9 is around the corner
Posted Sep 23, 2020 20:44 UTC (Wed) by Jandar (subscriber, #85683) [Link]
It is contained (sort of) in shell scripting: tsort(1). The shell standard library is /bin:/usr/bin ;-)
Python 3.9 is around the corner
Posted Sep 23, 2020 21:45 UTC (Wed) by nix (subscriber, #2304) [Link]
Python 3.9 is around the corner
Posted Sep 23, 2020 22:34 UTC (Wed) by Jandar (subscriber, #85683) [Link]
So if there were circular dependencies you had to specify the library more than once: -lfoo -lfoo -lfoo. The traditional Unix linker was very annoying.
That said, tsort(1) was handy for other usages as well.
Python 3.9 is around the corner
Posted Sep 23, 2020 23:03 UTC (Wed) by Wol (subscriber, #4433) [Link]
The linker on Pr1me systems was like that, and it meant you could have multiple subroutines with the same name. You needed to be careful how you planned it, though ...
Cheers,
Wol
Python 3.9 is around the corner
Posted Sep 24, 2020 4:26 UTC (Thu) by NYKevin (subscriber, #129325) [Link]
I'm going to be charitable and assume that nobody had gotten around to inventing namespaces yet. Because otherwise, that sounds a bit silly.
(I know, modern C still does not have namespaces, and is getting along just fine without them, thank you very much. What I mean by "inventing namespaces" is "adding a namespace-like feature to any then-popular and then-widely-used language that was generally available for use on these systems in particular." My reasoning is that, if you want namespaces, and namespaces happen to be available, you should obviously use namespaces rather than this crazy hack.)
Python 3.9 is around the corner
Posted Sep 24, 2020 8:44 UTC (Thu) by Wol (subscriber, #4433) [Link]
Every time you added a library, it would only resolve the references in the already-linked program - any references added by the library had to be resolved by another link. So a library that referenced itself usually had to be linked in twice ...
Cheers,
Wol
Python 3.9 is around the corner
Posted Sep 24, 2020 5:51 UTC (Thu) by neilbrown (subscriber, #359) [Link]
When your PDP-11/40 has 64K address space to be shared between code and data, you might find that separating a task into discrete tools is the difference between "works" and "doesn't work" (though I don't actually *know* that this was the motivation)
Python 3.9 is around the corner
Posted Sep 24, 2020 8:50 UTC (Thu) by Wol (subscriber, #4433) [Link]
That was one of my first jobs as a junior programmer. I got a stack of FORTRAN (II I believe, not IV!) printouts and had to produce flowcharts from the code so we could migrate it to a Pr1me 25/30 with - gasp - 256 KILObytes of main memory. I remember tripling the disk space by adding one of those fridge-sized 300MB removable disk pack drives.
Cheers,
Wol
Python 3.9 is around the corner
Posted Sep 24, 2020 12:20 UTC (Thu) by nix (subscriber, #2304) [Link]
(This would also explain why ld or ranlib didn't just exec tsort to do the job itself, even if it had to be split into a separate tool. What it *doesn't* explain is why this was never fixed, despite the unhelpfulness of the traditional behaviour and the low likelihood of the change breaking anything for anyone who wasn't doing something completely crazy with .a link ordering.)
Python 3.9 is around the corner
Posted Sep 29, 2020 10:26 UTC (Tue) by ras (subscriber, #33059) [Link]
It would only look like a triumph of simplicity over sanity if you haven't tried to make a multiuser OS, shell, compiler, linker and so on run in 128k. tsort arrived with Unix 7. Unix 7 ran happily in 128k of memory, when the name 'tar' made sense and reading that 128k of memory off an rk05 took a second. No one wanted to wait another second for that 2nd pass.
Python 3.9 is around the corner
Posted Sep 24, 2020 12:42 UTC (Thu) by gerdesj (subscriber, #5446) [Link]
That's CoVID-19 reporting grade statistics. Superficial, lazy and nearly worthless. However, at least this quotes its source and is potentially reproducible.