The return of Python dictionary "addition"
Back in March, we looked at a discussion and Python Enhancement Proposal (PEP) for a new dictionary "addition" operator for Python. The discussion back then was lively and voluminous, but the PEP needed some updates and enhancements in order to proceed. That work has now been done and a post about the revised PEP to the python-ideas mailing list has set off another mega-thread.
PEP 584 ("Add + and += operators to the built-in dict class") has gotten a fair amount bigger, even though it has lost the idea of dictionary "subtraction", which never gained significant backing the last time. It also has two authors now, with Brandt Bucher joining Steven D'Aprano, who wrote the original PEP. The basic idea is fairly straightforward; two dictionaries can be joined using the "+" operator or one dictionary can be updated in place with another's contents using "+=". From the PEP:
>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> d + e
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> e + d
{'cheese': 3, 'aardvark': 'Ethel', 'spam': 1, 'eggs': 2}
>>> d += e
>>> d
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
As can be seen, it is effectively an "update" operation (similar to using the .update() method) where the last value for a particular key "wins". That is why "cheese" is "cheddar" for d + e, but it is 3 for e + d. The example also shows that the operation is not commutative, which bothered some commenters even though there are already several such "arithmetic" operators that are not commutative; list "addition" using "+" isn't either, for example.
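For readers who want to try those semantics without the proposed operator, a rough sketch follows. The helper name merged() is purely illustrative and does not come from the PEP; it simply copies the first dictionary and calls .update() with the second, which is the behavior described above.

```python
def merged(first, second):
    """Hypothetical stand-in for the proposed ``first + second``."""
    result = first.copy()   # leave both inputs untouched
    result.update(second)   # on a key clash, the value from ``second`` wins
    return result


d = {'spam': 1, 'eggs': 2, 'cheese': 3}
e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}

d_then_e = merged(d, e)
e_then_d = merged(e, d)

print(d_then_e)
print(e_then_d)

# The two results differ, so the operation is not commutative ...
print(d_then_e == e_then_d)

# ... and neither is list "addition", which already exists today.
print([1] + [2] == [2] + [1])
```

Both comparisons print False, and d and e themselves are left unchanged by the helper.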
There were some objections to removing subtraction, some +1 and -1 responses, and others along those lines, but the biggest chunk of the thread was taken up by the question of how to "spell" the operator. The question seems to boil down to whether to use "|" instead of "+"; that was also part of the discussion back in March and is mentioned in the PEP as well. The operation is seen by some as being analogous to the set union operation, which uses "|".
Richard Musil kicked off a big sub-thread by making the argument for the set-union usage, though he suggested an entirely new operator ("|<") for it. He is concerned about the ambiguity of the + operator in Python and believes that choosing something completely new will ensure that users do not guess incorrectly about what it does. Chris Angelico did not see things that way, however:
But Paul Moore is unsure that there is any real need for a new dictionary addition operator:
He said that he has never needed that kind of operator and suggested that someone do a survey of real-world Python code to see if it would be improved using the new operator, though he did admit to not following the debate closely. It turns out that a big chunk of the PEP is taken up by examples of how the new operator might be used, taking examples from third-party code (including SymPy and Sphinx). Moore was not entirely impressed with them, however, saying that only four out of the roughly 20 examples were improved with the switch, though another few were arguable.
Andrew Barnert thought that Moore's observation actually made a good argument in favor of the proposal; if those who are not in favor of the proposal think that roughly a quarter of the examples are an improvement using it, that's a pretty strong vote in its favor. Beyond that, though, he thinks the need for + (which he calls "copying update") makes for a compelling case, more so than just for the += operator ("mutating update"):
With the "{**a, **b}" example, he is referring to using the dictionary unpacking operator, "**", which is specified in PEP 448 ("Additional Unpacking Generalizations"), to do the update operation. While that "works", it suffers from the drawbacks he mentions; it is also a fairly universally disliked language idiom. Most are fine with the "**" operator itself, but using it in that way is considered rather non-obvious and is quite unpopular.
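As a small illustration of the unpacking idiom, the sketch below compares it with an explicit copy followed by .update(). The variable names are made up for the example. It shows one limitation that is easy to demonstrate: the "{**a, **b}" display always builds a plain dict, whatever the type of its operands, while copy-and-update keeps the type of the left-hand mapping.

```python
from collections import OrderedDict

a = OrderedDict(spam=1, eggs=2)
b = {'eggs': 'scrambled', 'toast': 3}

# The unpacking idiom: later mappings override earlier ones.
unpacked = {**a, **b}

# The explicit two-step alternative.
copied = a.copy()
copied.update(b)

print(unpacked)
print(type(unpacked).__name__)   # a plain dict, even though ``a`` is not
print(type(copied).__name__)     # the copy keeps the type of ``a``
```

The contents of the two results are the same; only the resulting type differs.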
D'Aprano pointed out that adding two dictionaries using + has come up frequently over the years, seemingly independently; "to many people, the use of + for this purpose is obvious and self-explanatory". The thread continued with some arguing for each spelling of the operator; in some sense, the arguments often came down to "taste". There were also some more exotic ideas (spellings other than + or |, providing a "did you mean ... ?" kind of error for + to lead users to |, and so on), but Guido van Rossum said he is "not crazy" about the "did you mean ... ?" idea; he indicated that he sees the field as already having been narrowed down:
1) Add d1 + d2 and d1 += d2 (using similarity with list + and +=)
2) Add d1 | d2 and d1 |= d2 (similar to set | and |=)
3) Do nothing
We're not going to introduce a brand new operator for this purpose, nor are we going to use a different existing operator.
Beyond that, his preference would be to use |, but he is not completely opposed to +:
In the end I'm +0.5 on | and |=, +0 on + and +=, and -0 on doing nothing.
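The two spellings under consideration lean on operators that lists and sets already have. The sketch below uses only those existing operators (nothing from the PEP) to show the copying and in-place forms that each analogy refers to; the variable names are illustrative.

```python
# Option 1 analogy: list + builds a new list, += extends in place.
breakfast = ['spam', 'eggs']
same_list = breakfast             # second name for the same object
combined = breakfast + ['cheese'] # new list; ``breakfast`` is unchanged
breakfast += ['toast']            # mutates the list both names refer to

print(combined)
print(same_list)

# Option 2 analogy: set | builds a new set, |= updates in place.
seen = {'spam', 'eggs'}
same_set = seen
union = seen | {'eggs', 'cheese'} # new set; duplicates collapse
seen |= {'toast'}                 # mutates the set both names refer to

print(sorted(union))
print(sorted(same_set))
```

In both cases the plain operator leaves its operands alone, while the augmented form changes the left-hand object, which is the same split the PEP draws between "copying update" and "mutating update".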
While the discussion went on at length, no real consensus was reached. As is always the case in the Python world, though, it seems, the discussion was never heated or even contentious really; in the end it comes down to personal preferences. As D'Aprano put it, even if the PEP "fails", it will have succeeded at some level:
One would guess that the discussion will move from python-ideas to python-dev before too long and then likely to the steering council for some kind of decision. We know how one member of the council (Van Rossum) is leaning at this point, but we'll have to wait and see how the rest of that group feels as none have been active in the discussion. It seems like a reasonable "addition" to the language, however spelled, though using + seems more likely to head off newbie queries. Lists and dictionaries are much more integral to Python than sets are; those who are new to the language will probably see list "addition" well before they ever meet sets.
The return of Python dictionary "addition"
Posted Oct 29, 2019 18:15 UTC (Tue) by Otus (subscriber, #67685) [Link]
Is that the wrong way around? I would have thought :+ is the mutating one.
The return of Python dictionary "addition"
Posted Oct 29, 2019 18:21 UTC (Tue) by Otus (subscriber, #67685) [Link]
The return of Python dictionary "addition"
Posted Oct 29, 2019 23:45 UTC (Tue) by jake (editor, #205) [Link]
jake
The return of Python dictionary "addition"
Posted Oct 29, 2019 23:53 UTC (Tue) by Anssi (subscriber, #52242) [Link]
And the latter is where most of the need for the new operation is.
The return of Python dictionary "addition"
Posted Oct 30, 2019 2:16 UTC (Wed) by droundy (subscriber, #4559) [Link]
Keeping either the first or second value disambiguates an error case, but doesn't really feel correct or intuitive.
The return of Python dictionary "addition"
Posted Oct 30, 2019 5:17 UTC (Wed) by marcH (subscriber, #57642) [Link]
That would be mixing operations at the dict/set/list level with operations on the elements themselves. I'd find that anything but intuitive.
> Keeping either the first or second value disambiguates an error case, but doesn't really feel correct or intuitive.
"Intuitive" is always somewhat subjective/cultural, however keeping the second value is "correct" because this new operation is a (non-commutative) "update" operation. I'm stopping now to paraphrase the article.
The return of Python dictionary "addition"
Posted Oct 30, 2019 17:14 UTC (Wed) by droundy (subscriber, #4559) [Link]
The return of Python dictionary "addition"
Posted Oct 30, 2019 9:54 UTC (Wed) by embe (subscriber, #46489) [Link]
That is available as collections.Counter:
>>> a = collections.Counter({'a': 1})
>>> b = collections.Counter({'a': 2})
>>> a + b
Counter({'a': 3})
The return of Python dictionary "addition"
Posted Oct 30, 2019 12:45 UTC (Wed) by weberm (subscriber, #131630) [Link]
The return of Python dictionary "addition"
Posted Oct 30, 2019 19:48 UTC (Wed) by NYKevin (subscriber, #129325) [Link]
The return of Python dictionary "addition"
Posted Oct 30, 2019 16:17 UTC (Wed) by pgdx (guest, #119243) [Link]
> is present in both dicts?
Yes, that has been addressed in both the email list (python-ideas) and in
PEP-584. It has been concluded that it's not a good (enough) idea.
Quoting PEP-584:
> Add the values (as Counter does).
>
> Too specialised to be used as the default behaviour.
The return of Python dictionary "addition"
Posted Nov 16, 2019 8:29 UTC (Sat) by iq-0 (subscriber, #36655) [Link]
It’s not always clear that you’re looking at code that is dealing with dictionaries, though the data in them are logically “addable”.
Using set operators would immediately signal to the reader there is no mathematical addition taking place.
The return of Python dictionary "addition"
Posted Oct 30, 2019 22:36 UTC (Wed) by rioting_pacifist (guest, #134765) [Link]
I don't think it's worth it, as "+" is the syntax an average python user would expect (especially given its appeal to non-computer scientists)
The return of Python dictionary "addition"
Posted Oct 31, 2019 6:18 UTC (Thu) by buck (subscriber, #55985) [Link]
> for the sake of "technical correctness" .
Another possibility (and i don't know if this was brought up in
the PEP discussion, so sue me if i'm seeming to plagiarize) is
that making the operator a "|" makes it just non-obvious enough
to lead the "average python programmer" to stop and wonder,
"why didn't they just use '+'?", which might lead him or her to the
realization that it's got the maybe non-obvious behavior of dropping
values for overlapping keys, which might not occur if it was as
natural as seeing it spelled "+" by somebody, trying that out on
your own at some point, having the compiler accept it, and not
realizing you didn't think about key overlap
I.e., setting a trap for the unwary (like me)
A/k/a, leading you up the garden path
Indeed, i find myself even more sympathetic to the line of thinking
that says that the behavior doesn't line up well enough with my
naive notion of either "+" or "|" so why operator-ize it?, but now
i'm sure i must be rehashing the PEP discussion the article had
attempted to distill to the most pertinent highlights, probably
eschewing stuff like i'm now cluttering up the comment trail with
The return of Python dictionary "addition"
Posted Nov 4, 2019 17:11 UTC (Mon) by hkario (subscriber, #94864) [Link]
the |= is used for set, and the keys in dict are a set not a list, so it's more correct to use |= rather than +=, as the behaviour will NOT mirror behaviour from list
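The set-like nature of dictionary keys can be seen with the key views that dictionaries already provide; this sketch reuses the d and e dictionaries from the article and involves nothing from the PEP. It also gives a quick way to spot the overlapping keys whose values an update-style merge would replace.

```python
d = {'spam': 1, 'eggs': 2, 'cheese': 3}
e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}

# Key views support set operators and return ordinary sets.
all_keys = d.keys() | e.keys()
clashing = d.keys() & e.keys()

print(sorted(all_keys))
print(clashing)   # keys whose values a merge of d and e would overwrite
```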
The return of Python dictionary "addition"
Posted Nov 5, 2019 9:17 UTC (Tue) by timrichardson (subscriber, #72836) [Link]
The return of Python dictionary "addition"
Posted Nov 8, 2019 0:51 UTC (Fri) by Pc5Y9sbv (guest, #41328) [Link]
Interestingly, PostgreSQL reuses their concatenation operator for JSONb objects with the same "copy and update" semantics proposed here:
'foo'::text || 'bar'::text produces 'foobar'
'{"a":1, "b":2}'::jsonb || '{"a":3}'::jsonb produces '{"a":3, "b":2}'::jsonb
They also support subtraction of key strings from JSONb objects, but not subtraction of one object from another:
'{"a":1, "b":2}'::jsonb - 'a' produces '{"b":2}'::jsonb
The return of Python dictionary "addition"
Posted Oct 31, 2019 2:28 UTC (Thu) by lsl (guest, #86508) [Link]
So Python goes TIMTOWTDI now?
The unpacking thing appears to follow logically from existing language rules (it's the same as writing out the KV pairs of a and b in order, or at least I hope it is). It's hard to see how the issues with it justify the introduction of a new operator. Seems good enough already.
The return of Python dictionary "addition"
Posted Nov 1, 2019 12:51 UTC (Fri) by sytoka (guest, #38525) [Link]
The return of Python dictionary "addition"
Posted Nov 2, 2019 9:50 UTC (Sat) by rav (subscriber, #89256) [Link]
For the curious, dict literals indeed keep the last value if the same key is specified multiple times, at least according to my test on Python 3.7.4:
>>> {"a": 1, "a": 2}
{'a': 2}
I was actually surprised by this - I had guessed it would be a SyntaxError.
The return of Python dictionary "addition"
Posted Nov 2, 2019 21:02 UTC (Sat) by mathstuf (subscriber, #69389) [Link]
a = 'a'
{a: 1, a: 2, 'a': 3}