Improving pretty-printing in Python
The python-ideas mailing list is typically used to discuss new features or enhancements for the language; ideas that gain traction will get turned into Python Enhancement Proposals (PEPs) and eventually make their way to python-dev for wider consideration. Steve Jorgensen recently started a discussion of just that sort; he was looking for a way to add customization to the "pretty-print" module (pprint) so that objects could change the way they are displayed. The subsequent thread went in a few different directions that reflect the nature of the mailing list—and the idea itself.
Jorgensen prefaced his thoughts with a disclaimer of sorts: "This is really an idea for an idea [...]". He suggested adding a "dunder" method to Python objects for pretty-printing purposes. Those methods have names that start and end with double underscores (hence "dunder"); they are used internally by Python for a number of standard tasks (e.g. __init__()). A new one might allow objects to represent themselves differently in Unicode streams.
Beyond that, objects might like to control how they are pretty-printed in the general case. pprint provides some amount of customization, in terms of text width, indentation, and traversal depth, but Jorgensen is looking for more than that.
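For reference, the existing pprint knobs mentioned above can be exercised as in the following minimal sketch. The `config` data is made up for illustration; `width`, `indent`, and `depth` are the documented pprint parameters.

```python
import pprint

# Hypothetical sample data, nested deeply enough to show each option.
config = {
    "name": "example",
    "servers": [
        {"host": "a.example.org", "ports": [80, 443]},
        {"host": "b.example.org", "ports": [8080]},
    ],
}

# width: pprint tries to keep each output line within this many characters.
narrow = pprint.pformat(config, width=40)

# indent: amount of indentation added for each nesting level.
indented = pprint.pformat(config, width=40, indent=4)

# depth: containers nested more deeply than this are shown as "...".
shallow = pprint.pformat(config, depth=1)

print(narrow, indented, shallow, sep="\n\n")
```

All three options affect only layout and truncation; none of them lets an object decide how it is rendered, which is the gap the thread is concerned with.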
Guido van Rossum thought the idea had some merit. He suggested that a pprint alternative "that allows classes to have formatting hooks that get passed in some additional information (or perhaps a PrettyPrinter object) that can affect the formatting" might make sense. It would be the type of feature that could be developed as independent modules on the Python Package Index (PyPI), "*except* it would be more effective if there was a standard, rather than several competing such modules, with different APIs for the formatting hooks". He encouraged a discussion on what that API might look like.
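No such API exists yet, but one possible shape for a hook that receives the printer can be sketched by subclassing pprint.PrettyPrinter and overriding its documented format() method. The `__pretty__` name, the `HookedPrinter` class, and the example classes below are all hypothetical, chosen only for illustration; this is not a proposal from the thread.

```python
import pprint


class HookedPrinter(pprint.PrettyPrinter):
    """Sketch: consult a hypothetical __pretty__(printer) hook first."""

    def format(self, obj, context, maxlevels, level):
        hook = getattr(type(obj), "__pretty__", None)  # hypothetical hook name
        if hook is not None:
            # The hook is handed the printer so it could consult its settings.
            # format() must return (text, is_readable, is_recursive).
            return hook(obj, self), False, False
        return super().format(obj, context, maxlevels, level)


class Temperature:
    """Hypothetical class that opts in to custom pretty-printing."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __pretty__(self, printer):
        return f"{self.celsius:.1f} deg C"


class Plain:
    """Hypothetical class without a hook; falls back to repr()."""

    def __repr__(self):
        return "Plain()"


printer = HookedPrinter()
print(printer.pformat(Temperature(21.5)))
print(printer.pformat(Plain()))
```

How objects nested inside containers reach format() depends on pprint internals, so this sketch only exercises top-level objects.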
Jonathan Fine offered up some potential starting points, at least in terms of design, in the reprlib and json modules in the standard library. Eric V. Smith pointed to the @functools.singledispatch decorator as a potential pattern to use; it allows for overloaded functions based on the type of the first argument.
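To illustrate the pattern Smith pointed to, here is a minimal sketch of type-based dispatch with @functools.singledispatch. The `pretty()` function and the formatting choices are assumptions made for the example, not anything proposed in the thread.

```python
from fractions import Fraction
from functools import singledispatch


@singledispatch
def pretty(obj):
    """Fallback: use repr() for types with no registered formatter."""
    return repr(obj)


@pretty.register
def _(obj: float):
    # Arbitrary choice for the example: three significant digits.
    return f"{obj:.3g}"


@pretty.register
def _(obj: list):
    # Recurse so that elements are dispatched on their own types.
    return "[" + ", ".join(pretty(item) for item in obj) + "]"


@pretty.register
def _(obj: Fraction):
    return f"{obj.numerator}/{obj.denominator}"


print(pretty([1, 3.14159, Fraction(1, 2), "text"]))
```

With this approach the formatters live outside the classes they format, so an application can register its own choices without modifying the types themselves.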
But the definition of some putative __pretty__() method on objects could be problematic, Barry Scott said. "Pretty" is "in the eye of the beholder", so he is skeptical that objects can define a one-size-fits-all implementation; for example, internationalization and localization might be required. Instead of driving it from the object side, he would rather have something that takes an object "and returns the pretty version depending on the apps demands/config". Stephen J. Turnbull more or less concurred with that:
If an application wants to make such substitutions, I have no objection to that. But "explicit is better than implicit", and those substitutions should be made at the level of application I/O, not the class level IMO. (Yes, I know those "levels" are ill-defined, but that's still an appropriate informal principle, I think.)
Christopher Barker was concerned that adding a new dunder method for pretty-printing, beyond the existing __str__() and __repr__(), might just lead to the need for more than one version of "pretty". He wondered about updating __str__() for standard types, so that the output was "pretty" by default, but recognized that it would likely break many things: "I imagine a LOT of code out there (doctests, who know what else) does in fact expect the str() of builtins not to change -- so this is probably dead in the water." But beyond the code (and documentation) upheaval, it is far from clear what "pretty" means, as Steven D'Aprano pointed out:
    py> pprint.pprint(list(range(200)))
    [0,
     1,
     2,
     3,
     ...
     198,
     199]
I've inserted the ellipsis for brevity, the real output is 200 rows tall.
When it comes to floats, depending on what I'm doing, I may consider any of these to be "pretty":
- the minimum number of digits which are sufficient to round trip;
- the mathematically exact value, which could take a lot of digits;
- some short number of digits, say, 5, that is "close enough".
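The three notions of a "pretty" float listed above can each be produced with standard-library calls. The following sketch is an illustration rather than something from the thread; the value 1/3 and the choice of five digits are arbitrary.

```python
from decimal import Decimal

x = 1 / 3

# Shortest digit string that round-trips back to the same float.
shortest = repr(x)

# Exact value of the binary float, which takes many more digits.
exact = str(Decimal(x))

# A short, "close enough" rendering with five significant digits.
rounded = format(x, ".5g")

print(shortest, exact, rounded, sep="\n")
```

Which of the three is appropriate depends on the task at hand, which is the point being made: a single default cannot serve every use.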
Turnbull agreed with Barker that doctest-based tests would be affected by a change to str() (which calls __str__() if present), but added that other things would break as well, which is something that the project tries to avoid.
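To make the doctest concern concrete, here is a small sketch of why exact str() output matters: a doctest records the expected text verbatim, so a "prettier" __str__() makes it fail. The `Point` and `PrettierPoint` classes and the `count_failures()` helper are hypothetical, invented for this example.

```python
import doctest

# A doctest-style example that bakes in the exact text produced by str().
EXAMPLE = """
>>> print(Point(1, 2))
Point(1, 2)
"""


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        return f"Point({self.x}, {self.y})"


class PrettierPoint(Point):
    # Hypothetical "improved" output; arguably nicer, but different text.
    def __str__(self):
        return f"Point(x={self.x}, y={self.y})"


def count_failures(point_class):
    """Run EXAMPLE with the name Point bound to point_class."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(EXAMPLE, {"Point": point_class}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts are of interest here.
    results = runner.run(test, out=lambda text: None)
    return results.failed


print(count_failures(Point), count_failures(PrettierPoint))
```

The same fragility would apply to any built-in type whose str() changed, but on the scale of every doctest that prints one.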
Beyond the standard library modules, Alex Hall noted two projects on GitHub that may be of interest: PrettyPrinter and pprint++. Jorgensen said that he is looking at those as well as the others suggested in the thread. He is continuing the discussion, but is now thinking that adding dunder methods is not the right approach.
Instead, he suggested adding a way for objects to register hooks governing how they want to be represented. It is still in the early going for any pretty-printing improvements; Jorgensen posted his initial message on March 15. Any wrangling over an API is still down the road a bit; a PEP and changes to the language, if any, are further out still. But there does seem to be a contingent that favors a feature of this sort, so it may well work its way into, say, Python 3.10, presumably sometime in 2021.
Improving pretty-printing in Python
Posted Mar 19, 2020 1:19 UTC (Thu) by NYKevin (subscriber, #129325) [Link]
Improving pretty-printing in Python
Posted Mar 21, 2020 7:22 UTC (Sat) by divbzero (guest, #137744) [Link]
[1]: https://bugs.python.org/issue27362
[2]: https://mail.python.org/pipermail/python-ideas/2010-July/...
__json__ has not gained traction, probably because JSON serialization can be application dependent and JSONEncoder already provides a flexible way to customize JSON serialization.
__pretty__ seems even more in the eye of the beholder so I’m not surprised to see hesitant reactions to the idea.