What should be in the Python standard library?
Python has always touted itself as a "batteries included" language; its standard library contains lots of useful modules, often more than enough to solve many types of problems quickly. From time to time, though, some have started to rethink that philosophy, to reduce or restructure the standard library, for a variety of reasons. A discussion at the end of November on the python-dev mailing list revived that debate to some extent.
Jonathan Underwood raised the issue, likely unknowingly, when he asked about possibly adding some LZ4 compression library bindings to the standard library. As the project page indicates, it fits in well with the other compression modules already in the standard library. Responses were generally favorable or neutral, though some, like Brett Cannon, wondered if it made sense to broaden the scope a bit to create something similar to hashlib but for compression algorithms. Gregory P. Smith had a different take, however:
If anything, it'd be nice to standardize on some stdlib namespaces that others could plug their modules into. Create a compress in the stdlib with zlib and bz2 in it, and a way for extension modules to add themselves in a managed manner instead of requiring a top level name? Opening up a designated namespace to third party modules is not something we've done as a project in the past though. It requires care. I haven't thought that through.
Steven D'Aprano objected to Smith's assertion about the Python Package Index (PyPI): "PyPI makes getting more algorithms easy for *SOME* people." He noted that in many environments (e.g. schools, companies) users cannot install additional software on the computers they are using, so PyPI is not the panacea it is sometimes characterized as.
That led Cannon to suggest discussing the standard library and its role: "We have never really had a discussion about how we want to guide the stdlib going forward (e.g. how much does PyPI influence things, focus/theme, etc.)." Paul Moore wasn't sure that discussing the matter would really resolve anything, though:
A larger standard library would help those without access to PyPI, Antoine Pitrou argued, while a smaller one does not provide huge benefits: "Python doesn't become magically faster or more powerful by including less in its standard distribution: the best it does is make the distribution slightly smaller." But there are definite downsides to having a large standard library, Benjamin Peterson said:
- The [development] of stdlib modules slows to the rate of the Python release schedule.
- stdlib modules become a permanent maintenance burden to CPython core developers.
- The blessed status of stdlib modules means that users might use a substandard stdlib modules when a better thirdparty alternative exists.
Steve Dower would rather see a smaller standard library with some kind of "standard distribution" of PyPI modules that is curated by the core developers. Later in the thread, he listed numerous different Python distributions as examples of what he meant, but that just highlighted another problem, Moore said: which of those should he recommend to his users? Right now, the standard library provides the base that a Python script can rely on:
Moore acknowledged that maintaining modules in the standard library has a "significant cost" but wondered if moving to the distribution model was simply shifting those costs to users, without users gaining much from it. Nathaniel Smith looked at the list of distributions and came to a different conclusion: the "single-box-of-batteries" model is not really solving the problems it needs to solve.
It's really hard to tell whether specific packages would be good or bad additions to the stdlib, when we don't even know what the stdlib is supposed to be.
But Moore found that to be overstated somewhat. For him (and presumably others), the standard library is what you can expect to find when you have Python installed. That means that various things like StackOverflow answers, tutorials, books, and so on can rely upon those pieces being present, "much like you'd expect every Linux distribution to include grep". In addition, the "batteries included" attribute is likely to have been part of what helped Python grow into one of the most popular languages, D'Aprano said. "The current model for the stdlib seems to be working well, and we mess with it at our peril."
Nathaniel Smith sees some advantages to the "standard distribution" model, though he is not sure that it would really be the best option. "But what I like about it is that it could potentially reduce the conflict between what our different user groups need, instead of playing zero-sum tug-of-war every time this comes up." Others don't see it that way, though; "not every need can be solved by the stdlib", as Pitrou put it. He continued:
Moore concurred: "In exploring alternatives, let's not lose sight of the fact that the stdlib has been a huge success, so we know we *can* deliver an extremely successful distribution based on that model, no matter how much it might trigger regular debates :-)" In any case, as he pointed out, a more concrete proposal (in the form of a PEP) is going to be needed before any real progress can be made. Dower floated some ideas about what a distribution might look like along the way, but, without something like a PEP to discuss, participants are often talking past each other based on their assumptions.
The topic has come up before on the Python mailing lists and at Python Language Summits. In 2015, there was a discussion at the summit on adding the popular Requests module to the standard library. Participants recognized that there were significant barriers—development pace, certificate handling, no asyncio support—to moving it into the standard library. In the end, it made sense for Requests to stay out. At the 2018 summit, Christian Heimes brought up a number of batteries that should perhaps be removed from the set, though the effort to create a PEP listing them seems to have stalled.
No firm conclusions were drawn in the discussion, but part of the underlying problem seems to be a lack of clarity on what the purpose of the standard library is. At the 2015 summit, Cannon suggested an informational PEP be drafted to solidify that; until that happens, there will be wildly differing views on what role the standard library serves. At the moment, though, there is no process to accept or reject a PEP even if one were on offer; that will have to await the new Python Steering Council, which will be elected in early February. One of that group's first orders of business is likely to be addressing the PEP process.
As far as adding LZ4 goes, the overall feeling from the thread is that it would be useful to have it in the standard library, at least for those not looking to change the standard library model. Adding LZ4 also requires a PEP, however, so that process may be stalled by the governance change, as well.
What should be in the Python standard library?
Posted Jan 10, 2019 14:58 UTC (Thu) by mageta (subscriber, #89696) [Link]
What should be in the Python standard library?
Posted Jan 10, 2019 17:38 UTC (Thu) by smurf (subscriber, #17840) [Link]
Also, there are downsides to having a large stdlib. Presumably² the Python developers are able to reason about them and strike a, well, reasonable balance between maintenance burden and compatibility issues, esp. since this would fall flat on its face at script startup time instead of crashing some indeterminate time later.
What should be in the Python standard library?
Posted Jan 10, 2019 23:00 UTC (Thu) by NYKevin (subscriber, #129325) [Link]
I disagree. I think it's pretty clear that Python 3 was an exceptional situation, and that going forward, they intend to adhere to PEP 4 for future deprecations. Furthermore, I'm rather skeptical that a significant number of actually used modules are going to be removed any time soon, even if they are deprecated. (For example, macpath is slated for removal in 3.8, but I doubt anyone cares at this point. On the other hand, there is no indication that they intend to remove optparse any time soon.)
However, I should also point out that the table of contents is getting unwieldy. It might make sense to reorganize it, or to split it into multiple pages. That would not break anyone's old scripts.
What should be in the Python standard library?
Posted Jan 10, 2019 23:26 UTC (Thu) by karkhaz (subscriber, #99844) [Link]
It breaks the flow of navigating to that page and browser-searching for a word related to the module that you want to use, whose name you do not know. For me, this has always been a reliable way of finding the right module.
If you split the ToC into multiple pages, I now need to guess what arbitrary page or category somebody has placed a module into, browse to that, search, discover that my guess was wrong, navigate back up and try again, and by now my concentration is long gone.
The ToC is not broken, there is no need to fix it.
Compressors in the Python standard library?
Posted Jan 10, 2019 15:17 UTC (Thu) by zougloub (subscriber, #46163) [Link]
But the thing is that, as of today, much advanced functionality (e.g. flushing, dictionary handling) isn't exposed or documented even in the zlib module.
In any case given the amount of compressors, moving the various 3-4 letter compressor name words down a namespace would be clearly beneficial (except for compatibility of course, but there could be shims for the main/current compressors).
Compressors in the Python standard library?
Posted Jul 25, 2019 17:17 UTC (Thu) by k8to (guest, #15413) [Link]
Making them more regular would make it more reasonable to "drop in" additional compression algorithms, but that isn't completed work for sure.
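To make the point about regularity concrete, here is a minimal sketch of the incremental interface the existing stdlib compressors already share: each compressor object has `compress()` and `flush()` methods. The helper `compress_stream` is a hypothetical name used only for illustration. Note one irregularity that remains: `zlib` uses a factory function (`compressobj()`), while `bz2` and `lzma` expose classes.

```python
import bz2
import lzma
import zlib


def compress_stream(compressor, chunks):
    """Feed chunks to any object with compress() and flush() methods."""
    out = [compressor.compress(chunk) for chunk in chunks]
    out.append(compressor.flush())  # emit whatever the compressor buffered
    return b"".join(out)


chunks = [b"spam " * 50, b"eggs " * 50]
original = b"".join(chunks)

# Each pair is (incremental compressor, matching one-shot decompressor).
pairs = [
    (zlib.compressobj(), zlib.decompress),
    (bz2.BZ2Compressor(), bz2.decompress),
    (lzma.LZMACompressor(), lzma.decompress),
]
for compressor, decompress in pairs:
    assert decompress(compress_stream(compressor, chunks)) == original
```

A new algorithm whose bindings offered the same two methods could be passed to `compress_stream` unchanged, which is the kind of "drop in" behavior described above.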
What should be in the Python standard library?
Posted Jan 10, 2019 17:27 UTC (Thu) by MatyasSelmeci (subscriber, #86151) [Link]
What should be in the Python standard library?
Posted Jan 10, 2019 18:36 UTC (Thu) by hkario (subscriber, #94864) [Link]
The standard library is a strength of the language, not a burden. Just because it makes the core of the language move slower doesn't mean that the project itself is moving slower.
There are people that do not use core language features too (e.g. generator functions), that doesn't mean we should think about moving them to PyPI.
What should be in the Python standard library?
Posted Jul 25, 2019 17:21 UTC (Thu) by k8to (guest, #15413) [Link]
Granted, sometimes the problem isn't the tools but rather things that are just difficult to deploy like 'cryptography'. But I still struggle with the status quo. Debian packaging tends to just work. I install a package and it runs. With Python packages I get conflicts, build failures, and inscrutable errors that make little sense. I know Python has it a bit harder because it doesn't dictate the ecosystem it runs on, but it feels like some kind of binary package approach would make it vastly more reliable for those cases.
What should be in the Python standard library?
Posted Jan 10, 2019 21:37 UTC (Thu) by iabervon (subscriber, #722) [Link]
What should be in the Python standard library?
Posted Jan 12, 2019 11:16 UTC (Sat) by smcv (subscriber, #53363) [Link]
The Perl standard library has worked like this for a long time (with CPAN as the equivalent of PyPI).
What should be in the Python standard library?
Posted Jan 13, 2019 0:40 UTC (Sun) by ms-tg (subscriber, #89231) [Link]
> The Perl standard library has worked like this for a long time (with CPAN as the equivalent of PyPI).
And the Ruby standard library is going through the same evolutionary path, where bits of the standard library are being extracted to RubyGems, but the language ships with a defined set of “default gems” and pre-installs an additional set of “bundled gems”.
For more information please see
https://stdgems.org/
This is intended to meet the continued interests of a batteries-included common install everywhere, while recognizing that libraries stagnate and tend to go unmaintained in the classic standard library.
What should be in the Python standard library?
Posted Jan 17, 2019 21:34 UTC (Thu) by atnot (subscriber, #124910) [Link]
This is already the case in some places. For example, the python `json` module is an older version of the `simplejson` pypi module.
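Because the two modules share the same basic API, some code uses a guarded import to prefer the PyPI release when it is installed and fall back to the bundled module otherwise. A minimal sketch, assuming only `dumps()` and `loads()` are needed:

```python
try:
    # Third-party package from PyPI; may not be installed.
    import simplejson as json
except ImportError:
    # Fall back to the module shipped with Python.
    import json

text = json.dumps({"stdlib": True}, sort_keys=True)
assert json.loads(text) == {"stdlib": True}
```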
What should be in the Python standard library?
Posted Jan 13, 2019 14:21 UTC (Sun) by nilsmeyer (guest, #122604) [Link]
I wonder to what extent Python should be required to cater to broken corporate (and school) policies?
What should be in the Python standard library?
Posted Jan 13, 2019 20:08 UTC (Sun) by mb (subscriber, #50428) [Link]
https://www.zdnet.com/article/twelve-malicious-python-lib...
What should be in the Python standard library?
Posted Jan 14, 2019 10:23 UTC (Mon) by nilsmeyer (guest, #122604) [Link]
What should be in the Python standard library?
Posted Jan 18, 2019 14:43 UTC (Fri) by flussence (subscriber, #85566) [Link]
What should be in the Python standard library?
Posted Jan 17, 2019 9:40 UTC (Thu) by Wol (subscriber, #4433) [Link]
To what extent do you understand why those policies are in place? Would you like to go to jail?
Dunno how easy it is to do, but the idea of namespaces sounds very interesting to me. Split the stdlib up into modules, each in their own namespace, and allow drop-in replacements for each module.
That way, if the standard implementation stagnates, it's a reasonably easy job for it to be forked, improved, and fed back in.
Cheers,
Wol
What should be in the Python standard library?
Posted Jan 17, 2019 21:50 UTC (Thu) by nybble41 (subscriber, #55106) [Link]
What should be in the Python standard library?
Posted Jan 19, 2019 17:28 UTC (Sat) by jgu (guest, #129944) [Link]
I am not sure I came away feeling that the feeling was positive towards the addition of the lz4 bindings, as the final paragraph suggests - opinion seems very divided on that. I do see merit in the "compresslib" proposal though, and have been giving that some thought and prototyping.