Black, the uncompromising Python code formatter, is stable (pypi.org)
500 points by crlees on Jan 29, 2022 | 288 comments



I'm pleased to announce that Black is finally non-beta software! :party:!

Change log: https://black.readthedocs.io/en/latest/change_log.html

Going forward we'll follow our stability policy (https://black.readthedocs.io/en/latest/the_black_code_style/...).

Work continues as usual with bugfixes and enhancements, but style changes are now introduced under our new `--preview` CLI switch. This allows us to evolve Black's style without too much disruption to users that want consistency. The default style is updated yearly.

Thanks to our maintainers for orchestrating the efforts, especially to our most recent reinforcement Batuhan (@isidentical) who was responsible for our match statement support! A hearty thank you to all of our contributors for pushing Black forward, and to our users for being the reason we do it!


Congratulations. I'm a Python developer of 17+ years and Black is truly a huge blessing in the Python ecosystem.

That said, I'm a little sad to see it's gone stable without adding support for tabs, which would be extremely simple to add at this point (cf. https://github.com/jleclanche/tan/commit/e23c038167528bdacdd...). I have a lot of people using this tab-capable fork, that I did not advertise anywhere.

Łukasz seems to have a personal grudge against tabs which may be why the issue for tab support was closed early on, but there's a plethora of good reasons to support it behind a flag. I don't want to rehash those arguments here on HN, but do you think you could rethink the approach a bit?

I'd be happy to do a PR if it's not getting rejected right away with "no discussion allowed" like the last one was (before Black was moved to PSF maintainership).


On the contrary, I think it's fundamentally useful to the ecosystem that there be a winner in the tabs-vs-spaces/how-many-spaces debate. Code snippets become portable, developers don't need to adapt when joining a new team, etc. Indeed, https://www.python.org/dev/peps/pep-0008/#indentation has enshrined 4 spaces as the official recommendation. And when the best-in-class formatter enforces this, that's a good thing.

To be sure, I personally would have preferred that 2-spaces win out for compatibility with the Javascript ecosystem (so I am perhaps the furthest from the parent poster on the tabs-spaces spectrum!) but I abandoned my preference in favor of PEP-8 years ago, and doing so has opened far more doors for team productivity than it's closed.


I don't want to wade in or start a debate, but just share my experiences with formatting over the years. Lacking autoformatters in the past, I've worked with coworkers that preferred 2, 3, 4 & even 8 spaces for indentation.

My personal preference is tabs for indentation, spaces for alignment. The reason for this is the customizability that comes with tabs instead of spaces. Any editor worth using for coding has an easily configurable tab stop that you can set to your desired space count. With tabs and proper editor configs, everyone can be happy with how their code looks while remaining absolutely identical with no tools reformatting on checkout or commit (as I've sometimes seen).

For a long time, I was in the 4-space indentation camp. These days, I prefer 2. The reality is, though, I use whatever the codebase I'm working on has as "standard".


Since Black enforces that indents are exactly four spaces, you could configure your editor to render four spaces at the start of a line as two spaces, getting exactly the behaviour you desire from tabs.

An editor plugin for this wouldn't even have to be syntax-aware, except for some rare cases where you are using spaces to nicely align a multiline string.
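
A rough sketch of the display-side transform such a plugin might apply, assuming the file on disk always uses four-space indents. The function name and parameters are made up for illustration, and it is deliberately not syntax-aware, so indented lines inside a multiline string get narrowed too.

```python
import re


def render_indent(source, stored=4, shown=2):
    """Display-only: show every stored indent level at `shown` columns."""

    def shrink(match):
        # Whole indent levels are narrowed; leftover alignment spaces are kept.
        levels, remainder = divmod(len(match.group(0)), stored)
        return " " * (levels * shown + remainder)

    # Only leading spaces on each line are touched.
    return re.sub(r"^ +", shrink, source, flags=re.MULTILINE)
```

The file itself is never rewritten; only what the editor draws would change.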


“Since black enforces four spaces, all you need to do is reinvent the functionality of tabs”

I know it’s very far from the biggest problem in the world but this does make me chuckle a little.


The proper number of spaces seems to me to be dependent upon the font size. Smaller font, more spaces.

In the olden days, with fixed width fonts on an 80x24 terminal, I think I read somewhere that 3 spaces was optimal (fewest bugs); but that programmers have an aversion to non-multiples of 2. That is why you see 2- or 4-spaces, but not 3. (I think...)


Perhaps my naivete is showing, but I fail to see how an option that allows broader range of use cases suddenly becomes a religious war about spacing.

In the spirit of dev/user freedom, the creator has every right to enforce a standard, regardless of its basis - even if arbitrary - but I find it a little creepy. Then again, the prevailing usage of “opinionated” in dev circles was new to me, as well. What ever happened to design around maximum flexibility AND feature coverage?

Are developers more likely to be “opinionated” in their work if the predominant digital culture of their early career was rewarding of evangelism via self-promotion? Or grew up in an educational era that promoted activism? Are these sorts of issues more common in smaller / person-driven teams vs. corporate behemoths?

e.g., if VS Code were a one-person show vs. a corporate effort, would we risk seeing changelogs like “Insiders 2.20 - lead Architect and Face of the Product removes plug-ins starting with vowels, because they lacked cohesion and product-centered aesthetics. They looked poopy in list format with most fonts”?

Absurd, to be sure, but what if these decisions don’t step on obvious toes? The average supporter is more likely to tolerate slightly warmer water than hop into another pot, right? And if you stayed in the pot through several degree increases, you’ll feel a sense of Boiling Frog Belonging (TM).

And if you credited the original dictatorial decision for this emergent sense of community, you’d be very much correct.


While I completely agree with all your other reasoning, it should be pointed out that PEP 8 is not intended to be a styling guide for user code; it's intended to be a styling guide for the standard library: https://www.python.org/dev/peps/pep-0008/#introduction

It's nice that the broader Python community generally agrees with PEP 8, but it's always a little weird to see it referenced as an authority on styling user code when that is not its intention and it does not claim to be one.


I've had various issues with PEP 8, but mixing underscores and camelcase (in particular) has never sparked joy for me.

And pylint seems to keep adding style checks that I don't like and have to disable.

But the main principle I do agree with is to strive for consistent style within a code base.


One thing I liked about Python was that PEP8 existed and was some sort of fairly aged standard for formatting. I’m not sure why people feel that deviating to suit their personal preferences is more valuable than consistency.


Consistency is not in and of itself valuable. What is valuable is when it makes things clearer, or otherwise easier. Tabs vs spaces has absolutely no effect on my comprehension of the code. Being able to use spaces for alignment would actually help in some cases but oh well, that's not allowed in python. The way black splits [] expressions is actually harmful when it's used in a chain with method calls as is not uncommon in pandas.


To anyone with even the slightest level of OCD (like, say, 95% of software developers?), consistency for the sake of consistency has a lot of value. The reason I like Black is not because of the author's opinions, but because it is consistent (due to the absence of configuration/user preferences).


Environments that prefer tabs just use Tan, or otherwise, those environments just didn't use Black at all.

It's not useful to the ecosystem because those projects not actually using Black and unable to make the switch end up in a worse scenario (hence the fork which at least fixes this). Nobody wins, here; at best, you're unaffected.

Again I invite you to look at Prettier which has had as much of an impact (if not more) on the JS ecosystem as Black did on Python, but does support tab indent (and is opinionated regardless).


The Prettier team has gone on to say they regret giving people several of the configuration options they did though and have a page dedicated to their option philosophy. https://prettier.io/docs/en/option-philosophy.html


Yes, you'll find no mention of regretting adding tab support to prettier on that page.

Key quote: --arrow-parens, --jsx-single-quote, --bracket-same-line and --no-bracket-spacing are not the type of options we’re happy to have.

That's because these are entirely stylistic choices. Tab support is a mechanical one. Worth mentioning as well that Prettier supports multiple languages and tab/spaces is a global option, whereas all of these are very language-specific.


I think one of the great benefits of Black is that it is opinionated. Giving folks the option to select the type of indentation blunts the benefits. If you have a huge code base which mixes tabs and spaces it’s going to be hard to diff and merge code (or even reuse code snippets).


The point of black is that you run it on the same codebase with the same parameters.

Prettier works the exact same way and does have --use-tabs as a parameter. Nobody died. No codebase ended up with mixed tabs and spaces from it. Codebases either do --use-tabs or don't.

Like I said, there are a lot of reasons to allow for this. For one thing, tabs are an accessibility feature, but also it's impossible to use Black in an environment that prefers tabs.

Whereas there's no such thing as "an environment that prefers exactly two spaces after every comma inside tuples", thus you don't need an option for this.


https://prettier.io/docs/en/option-philosophy.html

>Yet the more options Prettier has, the further from the above goal it gets. The debates over styles just turn into debates over which Prettier options to use. Formatting wars break out with renewed vigour: “Which option values are better? Why? Did we make the right choices?”

>And it’s not the only cost options have. To learn more about their downsides, see the issue about resisting adding configuration, which has more 👍s than any option request issue.

>So why are there any options at all?

>A few were added during Prettier’s infancy to make it take off at all.

>A couple were added after “great demand.”

>Some were added for compatibility reasons.


Reading this doc, it feels like the Prettier team is saying options like tabs/spaces were only added early on to get initial adoption; if it were up to them now, there would be far fewer options to configure at all.


The options in question are some of the more.. fancy ones, and the "history" in question is things such as cross-compatibility with ESLint. Certainly not "tab support". (And I've contributed to Prettier a great deal, FWIW)


Yes, that's right.


On the contrary, I want to thank the authors of black for not adding tab support.

I’ve been writing Python code for two decades. I know all the arguments for tabs vs spaces.

Down the path of tabs lies madness.

If you want them that badly, please see git’s smudge/clean filters.

https://stackoverflow.com/questions/2316677/can-git-automati...

There. Now the Python world has a single standard while you are free to use tabs in your checkout. See, you can make all the people happy all the time. :-)
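
For the curious, a minimal sketch of the two text transforms such a filter pair would need, assuming the committed files use four-space indents. Git's clean and smudge filters are commands that read a file on stdin and write the result to stdout; the function names here simply mirror those two directions and the wiring into git config is left out.

```python
import re

INDENT = 4  # assumption: the repository stores Black's four-space indents


def smudge(text):
    """Checkout direction: each full leading indent level becomes one tab."""

    def to_tabs(match):
        levels, remainder = divmod(len(match.group(0)), INDENT)
        return "\t" * levels + " " * remainder

    return re.sub(r"^ +", to_tabs, text, flags=re.MULTILINE)


def clean(text):
    """Commit direction: leading tabs are expanded back to four spaces each."""
    return re.sub(
        r"^\t+",
        lambda match: " " * (INDENT * len(match.group(0))),
        text,
        flags=re.MULTILINE,
    )
```

For space-indented input the round trip `clean(smudge(text))` gives back the original text, which is what keeps the committed history free of whitespace noise.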


At the risk of repeating myself, I forked Black into Tan and just added --use-tabs. The alternative was "don't use Black". Others are in my situation as well and use Tan, somehow finding it despite it not being advertised anywhere. Hell I had people bugging me to update it a few weeks back.

There's extremely clear demand, I didn't just add a random flag to control how much space should be around parentheses. Switching an existing codebase using tabs to spaces is not always feasible.

That said, TIL about a lot of what's in your link, but it looks like a completely inappropriate solution, I think you can agree.


I shared that link in good faith, not aware of your specific use case. I still think it's for the better that black does not allow tabs, and that your fork is a good solution for shops that need tabs for an existing code base.

Black encourages the Python ecosystem to settle on spaces. There should be some friction involved (more than a switch) to use tabs, otherwise we're likely to see new code using tabs too.

$0.02 and all that.


Sorry, I re-read my comment and noticed it sounds way more aggressive than I intended it to be! I love the little hack actually, but it looks like it's really just a hack ;)


And I’m a little sad to see it’s gone stable without adding support for single quotes...

Wait I’m not. As much as I dislike double quotes, if they add an option every time someone is a little sad, we’re just going back to square one. I don’t want to debate styles anymore, ever, and I’m willing to give up my own aesthetics for that.


Just means I can't use it for work, since it turns a 1 line change into a several hundred line commit (all our code has single quotes).

These 'opinionated' formatters are great as long as you agree with the opinions they have. But the inflexibility makes them useless, otherwise.


And one commit converting all single quotes to double quotes is not an option?


Congrats to the black team on this release, and thank you scrollaway for your fork. We are in a similar position where we are (for various reasons) stuck on using tabs for an existing project. Luckily it seems like there might be some movement on this front where the maintainer team is at least more receptive to reopening this conversation:

https://github.com/psf/black/issues/2798


> support for tabs

Tabs are not PEP 8 compliant except for consistency with existing code:

https://www.python.org/dev/peps/pep-0008/#tabs-or-spaces

But if you're going to use a tool like Black in the first place, you're already committed to not preserving different formatting styles in existing code. You want to enforce one consistent style everywhere in the code base. And PEP 8 says that means no tabs.

> a personal grudge against tabs

I don't see why there would have to be any personal grudge given the above.


88 character lines are always pep8 non-compliant :)


Some teams strongly prefer a longer line length. For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the nominal line length from 80 to 100 characters (effectively increasing the maximum length to 99 characters), provided that comments and docstrings are still wrapped at 72 characters.


I know. That’s my point. Religiously following pep8 is absurd.


Dude, what part of "you can have it any color as long as it's black" did you NOT understand?


PEP-8 is the de-facto standard for python. Why would a formatter for python support anything (such as tabs) that deviates from that? There might be code that does not adhere to PEP-8, but to me this is not a justification, just some people distancing themselves from the python core community.


Have you read PEP8? :)

Tab codebases are PEP8 compliant. It merely says spaces are "preferred". What it disallows is mixing the two. What it also says is to use tabs if the codebase uses tabs, which I can't do with Black; how about that.

Incidentally, PEP8 takes a much stronger stance on line length, says they should be no more than 79 characters, and Black has enough sense not to respect that by default (and... offer an option, because line length, much like tabs for indent, is an accessibility feature).


Unfortunately Black goes with PEP8, and PEP8 recommends spaces. Guido made this call arbitrarily years ago for absolutely nonsensical reasoning (basically came down to that some people use shitty editors that don't handle tabs reasonably) and the entire Python community has had to suffer since.


> The default style is updated yearly.

I'm hoping there will be minimal or even zero style based commits, excepting those related to new Python features. It wouldn't be very Black-like to force a commit of a potentially enormous size on users of the library every year. Probably something you're already thinking about.

I was apprehensive about moving our legacy codebase to Black, but zero regrets. Thanks for your work!


I remember we migrated 2+ million LoC to being formatted by Black at Dropbox.

Our Livegrep instance with a custom Git blame implementation always crashed at the commit made to do the migration :-) We had to pause our merge queue because we didn't want to run into conflicts, and I remember the `git push` ended up taking a while.

There was only one change that we had to make to Black to get it working on our codebase - https://github.com/psf/black/commit/024c9cab55da7bd3236fd887...

Glad to see it's now stable.


When I switched the company to Black, I reformatted the whole repository history with it. Every developer had to clone the repository again and reapply their local changes to it, but in the end it was a good choice because "blame" still works perfectly since we don't have one big reformatting commit.


An alternative is to use Git's blame.ignoreRevsFile[1] option to ignore specific commits when calculating blames. The downside is that although you can save the list of commits in the repo, you cannot do the same for the config setting itself, so it calls for some light automation at scale.

[1]: https://git-scm.com/docs/git-blame#Documentation/git-blame.t...
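
A sketch of what that light automation could look like, assuming the list of commits lives in a committed file named `.git-blame-ignore-revs` (a common choice, but an assumption here) and that some bootstrap script runs once per clone. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List

IGNORE_FILE = ".git-blame-ignore-revs"  # assumed file name at the repo root


def ignore_entry(commit_hash: str, note: str) -> str:
    """Text to append to the ignore file: a comment line, then the full hash."""
    return f"# {note}\n{commit_hash}\n"


def config_command(ignore_file: str = IGNORE_FILE) -> List[str]:
    """The per-clone setting that cannot itself be committed."""
    return ["git", "config", "blame.ignoreRevsFile", ignore_file]


def enable_for_clone(repo: Path) -> None:
    """Run once per clone, for example from a bootstrap script."""
    subprocess.run(config_command(), cwd=repo, check=True)
```

The ignore file travels with the repo, so only `enable_for_clone` has to be repeated on each machine.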



looks like a `break` is missing after `has_special_comment` becomes True


`another_really_really_long_element_with_a_unnecessarily_long_name_to_describe_what_it_does_enterprise_style` hahahahaha, I love that test case.


To clarify, the slow git push was likely custom pre-receive hooks being slow.


Fuck I love Black -- makes working with other developers amazing once you stick it in a precommit. The quote from Dusty Phillips on the homepage is perfect and has stuck with me for years. I don't have to debate with developers about their individual preferences over what's best because we can just use Black and be done with it.


I love it even though I dislike quite a few of its formatting rules.

Because the only thing worse than a slightly wonk formatting convention is spending any time at all implementing, arguing about, or otherwise worrying about formatting conventions.


Rob Pike's proverb about Go fmt springs to mind: "Go fmt's style is nobody's favourite, but go fmt is everybody's favourite."


gofmt, on the other hand, gives you a lot more flexibility to wrap lines how you deem appropriate. I like gofmt way more than black.


Exactly, gofmt is actually almost perfect. Ironically, it makes Rob's quote less meaningful.

I think gofmt's brilliance comes down to it not being overly pedantic, it doesn't have an opinion on how all code should be formatted, it only fixes what it sees as clear mistakes.


Word. I got major pushback trying to fix this problem at my last company, which is (in part) why I'm not there.


Yup! This.


Before I introduced Black in my workplace, all code reviews were peppered with stuff like "should be indented", "use single quotes" etc. Made it impossible to talk about the stuff that actually matters.

When introducing it there was some resistance. There's always that one guy who tried it once and it reformatted that perfectly hand-formatted function. My answer to them was suck it up. In a tiny fraction of cases Black might make a suboptimal choice, but the rest of your code is so shit it outweighs this negative by 100x.


Agreed. My only grievance with Black was that it was really slow compared to gofmt or rustfmt, but that’s a perennial problem with Python tooling.


This is now improved. Black is compiled with mypyc, should be roughly 2X faster than before.


Oh, good to know!


Is there a way to pass hooks with a repo, or do you have to give people instructions about how to install them and hope they do it?


Not directly, unfortunately. Sometimes you can hook it into the tooling, like the Makefile or the test runner.

Other than that, enforce it in CI: patches that don’t match the code style can’t be merged
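
A minimal sketch of such a CI step, assuming Black is installed in the CI environment. `--check` makes Black report files it would change instead of rewriting them, and it exits non-zero when anything would be reformatted. The function names are made up for illustration.

```python
import subprocess
import sys


def black_check_command(paths):
    """Build the command; running Black as a module uses the CI job's interpreter."""
    return [sys.executable, "-m", "black", "--check", *paths]


def run_style_gate(paths=(".",)):
    """Return Black's exit code so the CI job can fail on unformatted code."""
    return subprocess.run(black_check_command(paths)).returncode
```

A CI job would call `sys.exit(run_style_gate())` and block the merge on a non-zero status.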


Here here! Same.


NB: It’s “Hear, hear”, not “Here, here”.


Thanks mate.


The problem I have with it is that, to me, these are super readable:

    x = [ 1, 2, 3 ]
    y = { 'a': 1, 'b': 2 }
Whereas these are not:

    x = [1, 2, 3]
    y = {'a': 1, 'b': 2}
To me, declaring inline lists and dicts without the leading and trailing space makes it more difficult to see what the data structure is, and what it contains. This is especially true when you have lists or dicts declared inline as arguments to functions, e.g.:

    my_function_call([ 1, 2, 3 ])
versus:

    my_function_call([1, 2, 3])
While it's generally not good style to declare data structures inside function calls, at least with the space it's still somewhat apparent what's happening, whereas without the space it just looks like three positional arguments. I love the idea of black, but in practice I'm not going to use something that changes my own code into something that's more difficult for me to read.


That’s gotta be a really unpopular opinion since I’ve never seen anyone write “[ 1, 2, 3 ]” instead of “[1, 2, 3]” in the last 10 years or so.

If anything I’m seeing the opposite trend with a few devs writing “[1,2,3]” so that large constant literals don’t overflow the line.


I've recently been working with an old C++ codebase that formats every function call like this:

  Foo<Bar,Baz> value = some.method (1, 2, 3 + 4) ;
  // or with no parameters:
  value = some.method() ;
I've never seen it anywhere else, and it took me a good week to get used to it.


You might not have been writing JavaScript. For objects, spacing properties from curly brackets is pretty much the norm, Prettier does that by default. For lists I see it rarer but I personally use it instinctively for consistency with objects, especially with things like useState destructuring and ([ complex, ...splatting, ...combinations ]) into function calls (which admittedly becomes uncomfortably terse).


And here’re some fresh examples with lists just encountered in the wild: https://developer.matomo.org/guides/tracking-javascript-guid...


FWIW, I do "[ 1,2,3 ]". So we exist, perhaps you just haven't bumped into us :-)


My only response to this would be “get used to it”.

I used to think 2 space tabs made code harder to read than 4. Now I use 2 spaces and I’ve gotten used to it and can read it fine. And I have no doubt you could learn to read the black formatted code just as easily. I can, since that is how I format my code anyway.


Agreed - end of day, code formatting is not that important in the context of how end users utilize the product.

Read a lot of different code and programming languages. They all look pretty much the same after a while, formatting and syntax.


I think it just doesn't matter anymore because almost all editors support code folding and visual markers for it, making it MUCH easier regardless of 2 vs. 4.


> I used to think 2 space tabs made code harder to read than 4. Now I use 2 spaces and I’ve gotten used to it and can read it fine.

I might be in the minority, but it's an empirically testable proposition. Given the thousands of people taking Python quizzes for toptal, triplebyte, etc., it should be possible to see which version results in a lower time to answer or a higher accuracy rate.


And that’s the reason everyone should use tabs. You like 2, I like 4, someone else on the team likes 3, 6, 8, whatever? No problem if tabs are used, everyone can set tabstop to their liking!


Just learn how to make them readable. I have yet to see a single developer who didn’t manage to learn that. Seriously, the worst types are the ones who think they are experienced enough to decide what “is more readable”. It’s all a matter of habit.


I disagree: your style is less readable than Black's. You've moved the data-structure syntax away from the data, and toward the function-call syntax, causing them to blend together.

This would be more readable -- put the space around the structure, not inside it:

    my_function_call( [1, 2, 3] )
Now we can clearly see the separation of the argument from the function call. But I wouldn't bother switching code formatters just to have this.


I'm not a big fan of automatic code formatters, unless they get somewhat more configurable. One simple example is line length. Usually a code formatter will break long lines, for example a function call with some arguments. But what if I have a log call there? Do I want to have that log span 3-6 rows, just because the silly formatter thought it is a long line? Well it is a long line, but I don't want to break it into multiple lines, as that would give that log call waaaay too much space. When the log call spans multiple lines, it distracts from the bits of code between log calls. Another example is, that these code formatters are often configured wrongly in people's code editors to reformat everything in the whole file. That adds lots of changes and people do not afterwards separate their commits for "only reformatting" and the actually important bits of their changes.

As PEP8 already says: "A Foolish Consistency is the Hobgoblin of Little Minds". An automatic code formatter is the epitome of consistency, as it applies the rules everywhere the same way. In many places it might give some benefits, but in others it will ruin the original code formatting. I am experienced enough to format my code in a readable way and I don't need it reformatted, just because someone has to try out some tool. Especially log calls. Those are a pet peeve of mine.
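
For reference, Black does have per-span escape hatches that cover the log-call case: the `# fmt: off` / `# fmt: on` comment pair and the trailing `# fmt: skip` comment. A small sketch, with a made-up logger name and function, of how long log lines could be kept on one line:

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


def process(order_id, customer, total):
    # fmt: off
    logger.info("processing order %s for customer %s with total %s and a deliberately long message", order_id, customer, total)
    # fmt: on
    result = total * 2  # stand-in for the real work
    logger.debug("finished order %s for customer %s with result %s, again kept on a single long line", order_id, customer, result)  # fmt: skip
    return result
```

The trade-off is that these markers are themselves a per-line formatting decision, which is exactly the kind of choice an uncompromising formatter is trying to take away.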


Sure, but I care a LOT more about avoiding style battles with others than not having my ideal code formatting.

And not having to think about it at all is just an additional bonus.

I used to have to make style passes before I would commit code. And get style comments in code reviews. I don't miss those days.


The popularity of black makes it a Schelling point [0]. It's not my favorite code style. At some point before the enum module was added, I picked a habit of using single-quote 'strings' for internal fixed strings and double-quote "strings" for human-readable messages. Black's formatting of strings to all use double quotes removes this distinction. But having some 95% solution can get 95% of people to agree to it, which is enough to avoid arguments that never reach a conclusion.

[0] https://en.wikipedia.org/wiki/Schelling_point


Interesting, the Black docs specifically call out an option for users with that exact use case for mixing single and double-quoted strings (although they do recommend dispensing with it eventually): https://black.readthedocs.io/en/stable/the_black_code_style/...

> If you are adopting Black in a large project with pre-existing string conventions (like the popular “single quotes for data, double quotes for human-readable strings”), you can pass --skip-string-normalization on the command line. This is meant as an adoption helper, avoid using this for new projects.


Thank you, and I hadn't known about that. That sounds sort of like the options for Prettier that xigol mentioned, where they're maintained in order to get a wider user base, but not recommended as they fragment the long-term coding styles. At this point, I've accepted the double-quote strings and am trying to make a habit of using enums instead, but it's good to know that the options exist.

[0] https://news.ycombinator.com/item?id=30132012
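
A small sketch of the enum habit, with made-up names: the internal fixed strings move into an `Enum`, so the "data vs. human-readable" distinction no longer depends on which quote character is used.

```python
from enum import Enum


class Status(Enum):
    """Internal fixed values that used to be single-quoted string literals."""

    PENDING = "pending"
    SHIPPED = "shipped"


def describe(status: Status) -> str:
    # The human-readable message stays an ordinary string.
    return f"Order is {status.value}"
```

A typo such as `Status.SHIPED` now fails with an `AttributeError` instead of silently comparing unequal, which a quoting convention could never catch.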


Oh cool! I hadn't heard of that.


It's a really fun concept that helps explain and understand a lot of politics. That positions held by groups don't need to be internally consistent in the same way that an individual's positions (ideally) should be. For a group, they may instead represent a mutually agreeable compromise between internal divisions of a group. Sort of like how if you're ordering pizza for a group of friends, a safe bet would be to order two pizzas, one pepperoni and one veggie. It's not going to be everybody's favorite pizza, or may not be anybody's favorite pizza, but most groups wouldn't be too put off by the selection.


> Do I want to have that log span 3-6 rows, just because the silly formatter thought it is a long line?

Yes. I do not understand what’s wrong with it.

Code formatting is a political stance. It’s so much easier to adopt that than have everyone have slightly different opinions on how things should be formatted.


So my function of 3-4 lines actual code with log lines in between (so maybe 8 lines) becomes a function of 20 lines, because of the code formatter changing those log lines and putting every argument on a new line. Now the function takes two thirds of my screen and I cannot scan it as quickly with my eyes any longer. I cannot simply skip log lines, but need to check for the end of the log calls instead. No thanks.


This is also one of my gripes with black. I'm pretty sure whoever wrote it was getting paid per line of code. So we got an auto-formatter that optimizes for maximum line count.


Fold the lines with logs. Problem solved.


That's actually an idea. Gonna try that out next time I have this problem. I just hope that there is some way to automatically fold. I will also need to get a good keybinding in place for unfolding, in case I do want to read that log line.


>I just hope that there is some way to automatically fold.

Depends on your editor. Vim can definitely do it.


> Now the function takes two thirds of my screen and I cannot scan it as quickly with my eyes any longer.

You should be able to scan it as fast because ignoring the 3 lines with additional spacing is easy.


> It’s so much easier to adopt that than have everyone have slightly different opinions on how things should be formatted.

Can't this be done with just a linter that enforces some rules, without forcing an auto-format? I really liked just working with a linter to enforce style: as long as it passed, everything was fine. But it still left some room for the coder to make things readable.


“Readable” is equivalent to “I am used to reading code like that” rather than “there is some idealised metric that defines readability”

You are mistaking your own habits for your rationality


I don't think that's true, otherwise I would never be able to improve my code's readability. I have received suggestions that make code better to read and have read others' code that I found to be neater. If it was just my habits, then everything written by other people would always be harder to read.


<salesman voice>Black's strictness getting you down? Try autopep8: The flexible Python formatter!</>

It's fast enough, can be made to run incrementally, has enough configuration options, and accepts a project-specific configuration file. There is room for a solution in-between laissez-faire and black's strictness.


I agree on the need for configurability, but the counterpoint is that on a team of devs where everyone has their own IDE auto formatting activated, applying this to the whole file just for a tiny logical change would create a lot of VCS noise. Autoformatters ensure that what makes it into the master branch has no surprises.


Just have no automatic formatting and have a good consensus agreed upon. I like to write ";" in JS for example, like in old times. I find it more explicit. However, I made a compromise for one project, where people had started out without ";" at the ends of lines. Sometimes you have to make compromises.

If no one is automatically code formatting, then no such noise is created. You could even get a linter telling you about PEP8 violations by underlining stuff for example and configure an exception list for the linter, which is company wide, but not have any automatic change happening. The worst offender is automatic code formatting at commit time. You think you commit one thing, but the formatter commits another.


> Just have no automatic formatting and have a good consensus agreed upon.

Wow, I wish I lived in your world. I've never seen a team larger than 3 people have a consensus on code formatting.


Even if a good consensus is agreed upon, how do you enforce it in a team (of 10 devs, say)? So agree on a consensus black format and run it as a pre-commit check before merging to master.


I think today's Python has long surpassed its PEP8 edicts. There is no longer one and only one obvious way of doing things, which is not just sad but downright becoming a hazard for entry into the language. I cannot imagine being a newcomer anymore, as compared to the days where you'd read a book like Zed Shaw's Learn Python the Hard Way and get started. It's sad.


However, if one is a beginner, I think others in that environment should not put too much emphasis on formatting being exactly correct. Maybe drop a note when talking about the code or mentoring, but don't make it the main point, when there are probably many more significant things to focus on with code of a beginner. Maybe starting out has not become that much different then.


That's the whole point of a code formatter. To avoid these bike shedding discussions and get back to work.

Where you think a long call should be allowed somebody else will think it should be split.

Pick a code formatter, and fuck opinions, we'll do whatever the formatter says we have to do. Done. It is not that important if you put semicolons here or there as long as it is consistent.


The key part of your message is "I am experienced enough to format _my_ code in a readable way".

It's not a tradeoff between "good manual formatting" and "mediocre automated formatting".

It's a tradeoff between "random formatting" and "consistent formatting".

Manual formatting only works when a dictator spends their time enforcing a consistent style. Once projects get large, the value of this is marginal.


Adopting Black made me realize quite how much of my coding thinking capacity had previously been spent thinking about code formatting - I used to really sweat the details about how to break up a long function call, where to put the line breaks, how to indent my dictionary literals...

With Black, I don't spend a single moment thinking about that at all. I estimate I've got a 5-10% productivity boost in my time-spent-writing-code from this!


To add onto this, I also found that Black works as a nice heuristic indicating to split up code when the formatted output isn't "pretty" into separate lines.


It's been at least a decade since I've not used an autoformatter on every piece of code ever. I didn't realize there were people who worked at places without autoformatters still.


Had a similar experience when I used rustfmt (which led me to trying black). I was used to formatters that over-formatted, removing newlines used to delineate sections of code, etc. I tried rustfmt when learning Rust because I figured I'd learn and adopt standard practices.

It was revolutionary. I went from formatting as I went, even for prototyping to throwing code at my editor and letting rustfmt fix it. Big difference in productivity.


One missed opportunity in Black's algorithm is that it currently treats the maximum line length as a literal hard limitation in number of characters. Here is a trivialized example:

    to_add = [item for item in data.new_items if item not in data.old_items]
    to_remove = [
        item for item in data.old_items if item not in data.new_items
    ]

Although the constructs are nearly structurally identical, they can be formatted very differently, which sometimes hinders understanding them.

A different approach would be to instead normalize all words to a certain fixed width. So, "to_add" and "to_remove" would have the same virtual width.

A related issue is that leading indentation counts towards the width limit. This causes refactorings which simply move code around (changing its indentation level) to change the code's shape, even when the code hasn't otherwise changed. This is exacerbated by that one often needs to mold code in such a way that Black formats it in an agreeable way, but this is generally not done during refactorings, so the readability of the code suffers.

I had the opportunity to write a formatter (for SQL, also unconfigurable/opinionated); it seems to successfully avoid these problems: https://github.com/CyberShadow/squelch
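To illustrate the hard-limit point, here is a minimal sketch of a width check in which indentation counts toward the limit. The `fits` helper and the limit of 78 are assumptions made for illustration; this is not Black's actual splitting logic.

```python
def fits(statement: str, indent: int, limit: int) -> bool:
    """Hard-limit check: leading indentation counts toward the line width."""
    return indent + len(statement) <= limit


to_add = "to_add = [item for item in data.new_items if item not in data.old_items]"
to_remove = "to_remove = [item for item in data.old_items if item not in data.new_items]"
limit = 78  # assumed limit, chosen only to reproduce the effect

for indent in (0, 4, 8):
    # At indent 4 only to_add still fits; at indent 8 neither does.
    print(indent, fits(to_add, indent, limit), fits(to_remove, indent, limit))
```

The two statements differ by three characters, so there is a range of indentation levels where one stays on a single line and the other gets split, and moving either of them one level deeper can change its shape.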


Line length is there for a reason, it's to fit everything on the screen. Ignoring leading spaces/indentation or giving a 'fudge factor' doesn't help keep everything on the screen.

Ultimately there has to be a hard limit and it's silly to argue over special cases that should exceed it (because that's just another thing to bikeshed over in code reviews). Black sets a hard limit and enforces it--done, no more discussion.

IMHO it's a code smell to have a bunch of long lines of code that are all visually similar but vary in a tiny and easy to miss way. Here's another way to think about the code you wrote that boils it down to an even smaller and more focused intent:

    new = set(data.new_items)
    old = set(data.old_items)
    to_add = new - old
    to_remove = old - new

It's not exactly the same as what you wrote but you get the idea, and it can be made simpler if you're using set types to start with. It's kind of a nudge that if your intent is to do set-like operations like difference, intersection, etc. then you might want to use the right tools for the job instead of banging out more procedural code.
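A small sketch of where the two versions differ, using made-up sample data: the list comprehension keeps order and duplicates, while the set difference drops both (and needs hashable items).

```python
new_items = ["b", "a", "b", "c"]  # hypothetical sample data
old_items = ["c", "d"]

# List comprehension: preserves input order and repeated items.
as_list = [item for item in new_items if item not in old_items]

# Set difference: unique items only, in no particular order.
as_set = set(new_items) - set(old_items)

print(as_list)          # ['b', 'a', 'b']
print(sorted(as_set))   # ['a', 'b']
```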


To reiterate, the example is trivialized; there isn't always an elegant solution that you just can't see because you didn't think hard enough.

Regarding the hard line width, "fit everything on the screen" is a poor goal to aim for; a more useful goal is to make the best use of the two dimensions you have at your disposal. Squishing things across either axis will make for a poorer experience than occasionally requiring some scrolling for some setups.

You may also find it interesting that Prettier does not interpret it as a hard limit either: https://prettier.io/docs/en/options.html#print-width


Honestly I can see this being a feature of the next gen of code formatters—dynamic reflow, like e-books.


Very much this. I will often times deliberately line up related operations to make the meaning clear. Black then clobbers all over it.


What I like to do is configure black for a maximum line length of 100 characters (although I prefer keeping docstrings at a maximum of 72 characters). That's not as restrictive as black's default of 88 characters, but still compact enough to be readable and to fit at least two files next to each other.

If a line is longer than that, it's most of the time either something which should be wrapped into multiple lines anyway (like a dictionary with multiple keys) or a sign of a code smell.

One of the most common code smells I see leading to longer than necessary lines (and which might make the code harder than necessary to read) is not using early returns.

Also using a trivialized example:

  to_add = []
  for item in data.new_items:
      if item not in data.old_items:
          # code spanning multiple lines with further indentation levels
          …
          
          
  to_add = []
  for item in data.new_items:
      if item in data.old_items:
          continue
  
      # code spanning multiple lines with further indentation levels
      …


Past related threads:

Black – Uncompromising Python code formatter - https://news.ycombinator.com/item?id=19939806 - May 2019 (244 comments)

Linting 400kLOC of Python Code with Black - https://news.ycombinator.com/item?id=18536731 - Nov 2018 (1 comment)

Black: An uncompromising Python code formatter - https://news.ycombinator.com/item?id=17151813 - May 2018 (255 comments)


Personally, I don't understand the appeal of opinionated code formatters. If you don't want to discuss "taste" questions, then just don't, it doesn't matter what option you pick. If it is an important question, then you should debate it.

I think you can convey meaning with subtle code style differences. Empty lines to delineate blocks. Single quotes if the string is a keyword, double quotes if it is for the user. Spaces around operators to make an expression clearer. I spend a minute or two before I commit to make the code tidy (linter and then manual tweaking) and would expect that from everybody on my team - it takes often less time than rebasing and picking good commit names, for example.

But even though it annoys me slightly when I encounter Black (or god forbid, Go) used in a project, I know a lot of people like it a lot, and it is good to have the choice. So congrats on the release! :-)


I think the general consensus of many people, myself included, is that it doesn't matter what the specifics are, as long as they're being applied consistently. Beyond that, it's a question of optimizing the cost/benefit ratio.

The problem with most things that have to be applied manually is that they're also applied inconsistently. It's like Hungarian notation. Over time, it becomes pure noise: you can't rely on it, especially when reading someone else's code, so it's safer to just ignore it. At which point, the convention's cost/benefit is approaching infinity, because its denominator is nearly zero. So I'm not sure I care if the absolute cost is small. It's still an almost complete waste of effort.

The only place I've ever seen manual conventions applied consistently enough to be useful in a sustainable way is when someone's carved out their own private silo of code that is mostly only touched by them.


> convey meaning with subtle code style differences

Why? This is exactly why we need autoformatters :-) if you're on a team, you should be conveying meaning with variable names, comments, and structure. If it doesn't show up in the syntax tree, it doesn't have meaning. Consistency always wins because it elevates everyone's ability to read the code.

There's more to it than the social aspect though: putting effort into manually moving around characters in a text editor is not the best use of mental bandwidth. My experience (and that of many I know) is that formatters help you code faster with fewer bugs by letting you focus on the higher-level structure of the problem, NOT on the minutiae of where each punctuation mark should live. If your code is syntactically correct, the autoformatter will make it look "right" and you don't have to think about consistent formatting at all. If it's not syntactically correct, you set your editor to autoformat on save and you get immediate feedback. So it's not just about consistency, it's also a productivity aid.

I understand having strong opinions about the "right" code style. Personally, I dislike the double quotes in black. So I type single quotes and let black do its thing on save. What I type does not need to be what I commit (so long as they both compile to the same AST)! I don't want to waste time debating formatting details in a PR, and I certainly don't want to interrupt my coding flow to manually perform the labor of shuffling ascii chars around when there is fully automated solution. I can't think of a single scenario where my personal style preference would outweigh the significant advantages I've seen from autoformatters.
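The "same AST" condition can be checked with the standard library. This is a minimal sketch; `same_ast` is a hypothetical helper, not Black's own equivalence check.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """True when two source strings parse to identical syntax trees."""
    # ast.dump() omits line and column numbers by default, so layout
    # and quote style do not affect the comparison.
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


typed = "greeting = 'hello'\nvalues = [1,2,\n  3]\n"
committed = 'greeting = "hello"\nvalues = [1, 2, 3]\n'

print(same_ast(typed, committed))             # True: formatting-only difference
print(same_ast(typed, "greeting = 'bye'\n"))  # False: the program changed
```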


> If it doesn't show up in the syntax tree, it doesn't have meaning.

Why did you use paragraphs to write your comment then? People read the code that's written.


Auto formatters support paragraphs, but they don't support "I'm going to give this paragraph 2 blank lines because it's particularly important/complicated/blah".


Typographic standards typically don't, either. Fortunately, there's more than one way to convey emphasis.

Also, single functions that are long enough to contain something analogous to paragraphs are arguably not Pythonic in the first place.


For paragraph I just mean a series of 1 or more lines separated by a blank line vs no blank lines.


Are there two blank lines between these parts of the code for a reason or did the person who wrote this code make a mistake?

If that part of the code is special for whatever reason, add a comment saying why.


Oh interesting thanks! I couldn't find it in the docs before


Yeah, I don't think it's mentioned because it feels like supporting spaces between words, I’d guess. Just so fundamental a functionality it's assumed.


I think the argument is not that structure should be dispensed with, or that it doesn't matter, but rather that a given style convention is fungible, and we should probably all just pick something good enough and get on with it, because the code will execute regardless of whether we use " or ' around our strings.

Spend some time considering the language standards, but then ship it for god's sake.

See also from Peter Norvig[0]:

* Get involved in a language standardization effort. It could be the ANSI C++ committee, or it could be deciding if your local coding style will have 2 or 4 space indentation levels. Either way, you learn about what other people like in a language, how deeply they feel so, and perhaps even a little about why they feel so.

* Have the good sense to get off the language standardization effort as quickly as possible.

[0]:https://www.norvig.com/21-days.html


I actually had this formatter thing happen at work. We have code that generates more Python code. Now the quotes are changed and we need escapes because a rule is being applied. Escapes are harder to read and look messier.


> So I type single quotes and let black do its thing on save.

It is the one small gripe I have with black. C&P of, for example, some string-indexed dict element as an argument into an f-string sometimes produces duplicate double quotes if it gets into black's hands beforehand. But in the end, black outweighs all these minor things. This particular issue is also probably better considered a tooling task for editors, IDEs etc.


> If you don't want to discuss "taste" questions, then just don't, it doesn't matter what option you pick.

Exactly, it doesn't matter what option you pick, as long as everybody picks the same option. Enter opinionated code formatters, because otherwise how do you do it?


> otherwise how do you do it?

Only reject obviously quite messy code, and only rarely allow formatting rewrites. Learn that anybody’s strong preferences for a specific style or discussion of such aren’t demonstrably beneficial for the bottom line of the business or customers.


This is a recipe for people spending time cleaning up files, back and forth, ad infinitum. Or, just letting the stubborn person win.


My point is you let people have their different styles, reject changes on lines where code isn't changed, and coach the style zealots to calm down. Changing a couple of bytes in a file? Don't change the style. Writing something new? Figure out what's best. Rewriting a big chunk? Make reasonable formatting choices.

Some programmers are zealots who have a hard time working with others; the solution isn't always to force the matter or acquiesce to the squeakiest wheel. A strong organization should be able to enable people to have their style and disable people from forcing their opinions on everybody or wasting everybody's time with low-value debate.


> Changing a couple of bytes in a file? Don't change the style. Writing something new? Figure out what's best. Rewriting a big chunk? Make reasonable formatting choices.

This is exactly the type of arguably useless effort we don't want to spend any time on. Having to think about what is "reasonable", "allowed", or "best" is non-negligible cognitive burden for both the writer and reviewer/reader.


When I use black or other formatter, I have to exert cognitive effort to anticipate how the formatter will format the code so that I can avoid the inevitable stupid formatting decisions.

I often then tweak the code in a way that is not necessarily better, it's merely rendered better by black.

It just feels like I'm fighting against the tool half of the time.


> My point is you let people have their different styles, reject changes on lines where code isn't changed, and coach the style zealots to calm down

Code with style changes, especially within a file, is harder to follow than almost any reasonable consistent style. And adopting a common formatter with common settings is a lot less work than “reject changes on lines where code isn't changed, and coach the style zealots to calm down” even if it was neutral in readability.


I think it's a tell when people get so fussy about style: if you can't get past non-functional differences, how are you going to do on functional differences? "my code is spaghetti and performs poorly, but at least I used a formatter!"


The reason I like black is the most opinionated folks have WEIRD opinions.

You see that here.

Things like a = [ 1, 2, 3 ] when 90% of the world is a = [1, 2, 3]

These folks will spend hours / days trying to force weird approaches.

Sometimes they are obviously wrong. But it's annoying to have to fight over it. Something like black saves you the hassle and the WEIRD local coding rules someone comes up with if you are on a project with them.


I once came in on a project where one dev used "func( arg )" and the other used "func (arg)". They would reformat each other's lines when they changed something, so the diff had a lot of whitespace changes. The two were not on speaking terms any longer, so I was tasked with cleaning up the mess.

First step was to run an autoformatter. I used stock presets so neither of the two would think the other was favoured in the cleanup.


If I worked there I would always do func( arg) or func(arg ) just because.


Haha. Brilliant :) The extra spaces look so weird to me, but yeah, folks have strong opinions.


Chaotic evil


I just don't see why you would care if this extra space is in there or not. If they want to stick it in there, be my guest. I can read either just fine, even if, gasp, it's not 100% consistent through the whole codebase.


Will you change to match their code style? Great.

However, if another person ALSO has a strong opinion about this - nightmare.


Terrific point I never thought of. There's definitely a power law, where 80% of the arguing time is spent on the 20% dumbest shit.


I think I figured out why: it’s easy for everyone to have an opinion about simple things. Complex things that are worth arguing about take more knowledge/experience and at that point fewer folks are qualified to argue about them.



> Personally, I don't understand the appeal of opinionated code formatters. If you don't want to discuss "taste" questions, then just don't, it doesn't matter what option you pick.

I think the appeal is that it avoids the discussions entirely (i.e. if it doesn't matter what option you pick, why even have the option? Particularly if these options will mean you divert away from the standard python style guide, which has been trying to standardize these choices for years so that codebases have similar rules.)

i.e. if your org has inconsistent python formatting between different independently-operating teams, it's easier to enforce PEP 8 than start a load of pointless arguments about what good looks like.


> If you don't want to discuss "taste" questions, then just don't, it doesn't matter what option you pick. If it is an important question, then you should debate it.

I'm personally glad to see Python development come to the same conclusion enforced by Go's opinionated tooling.

That is: style questions are almost always unimportant to producing value, but teams still waste excessive time on them because of the engineering tendency to bikeshed trivia.


> I think you can convey meaning with subtle code style differences.

Yes that's very true, but, and now comes the big but - if you're working in a project with dozens and dozens of other devs, you will necessarily see all kinds of weird formatting. And this is where an opinionated formatter shines. It doesn't matter how many people work on your project, what the churn-rate is or if the guy that just started yesterday already read the code-formatting guidelines or not - the formatter will just take care of it and remove all hassles surrounding that topic. And this only at the price of opinionated formatting which I may or may not like - a price that I likely have to pay either way, because there's no way that everybody shares the same opinion about formatting in the first place.

I recommend giving opinionated formatters a try, if you haven't already. I know plenty of folks who were initially against it but eventually came around to it because it made life a lot easier: no more discussions about formatting on PRs, no more tabs vs spaces wars, everything's already settled and automatically taken care of.


> Personally, I don't understand the appeal of opinionated code formatters.

Opinionated code formatters mean that opinionated coders with different preferences on a team don't produce (or at least produce less) noise in diffs.


I only reformat code I am rewriting, for that reason. I've never seen a tool that was very good at bending its own rules to minimize these accidental changes (e.g., let a function call go to column 83 to keep it on one line).


Sounds like you actually are very opinionated about your coding style, so it's both totally understandable and yet ironic that you'd begrudge that of others. I've tried hard to internalize that opinions are like assholes, everybody has one and they all stink. My own included.


> Empty lines to delineate blocks.

Use comments to describe those blocks and their meaning, don't just assume everyone is thinking exactly the same as you are about the code organization (hint, they probably aren't). Where you see a bunch of logically separated blocks of operations I might be new to the codebase and just see a bunch of unnecessary whitespace. Describe what's being organized and why in comments.


Does it not allow you to do something like this to set up variables?

  def foo(a, b, c):
    x = 40
    y = None

    bar()

    return x * b

I think the blocks of code make it more readable: some variables are set up, some work is done, then it returns something.


I can stop discussing taste issues, but it doesn’t prevent others from debating what the proper style for my PR ought to be, meanwhile I can’t merge my PR.


Ye I don't understand it either. Why would you want some automated Whitespace Hitler telling you what to do. You can't even select between tabs and spaces for indentation.


(Note: I mostly write Java these days, so my viewpoint is colored by this.)

Code is meant to be read by humans. Compilers don't have wetware eyes. Homogenizing code that was hand-formatted for the situation ignores the art and craft of writing software.

I think formatters do have use as a "base" (e.g. Allman or K&R curlies?), especially for junior devs, but experienced developers that care about their software that has their name on it normally put their best foot forward. They would strive to present the most readable software they can put forth.

I've been writing software since 1981 and have yet to meet a code formatter that I like. `// @formatter:off` !


One man's "readable code" is another person's trash. As the number of committers in a codebase increases, the chances that you find at least one person's code to be an irritating annoyance increases exponentially.

The only way to avoid it is to have the political power to call out the team member that's annoying. Easy enough to do if you're the lead and nobody will question your decisions.

The worst and dumbest fights I've encountered is with people with 30 years of experience who have diametrically opposed opinions. I _especially_ despise having to call out very senior engineers when their own code misbehaves.


Yes, very true. One has to prove their point of view. It's easier when the engineer in question has proven themselves in the org (I would say industry, but when you arrive at a shop, you generally need to start over.) I should have made the distinction between "senior in years" vs "senior via software admired by peers".

Since code is skewed heavily to the left of "read:write", the primary issues that formatting should address are [1] muscular eye fatigue induced by "formatting" or non-formatting that leads to excessive eye saccades (i.e. eyes "jumping" from point to point, or obliquely) [2] formatting that obscures logic leading to cognitive fatigue.

Another thought experiment I've used is: Imagine that you replaced [A-Za-z0-9] with squares and made it black and white. Leave the language's keywords alone. Does the shape of the code suggest the logic and flow? etc. That's more in the zone of cognitive load.
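A rough way to try that experiment on Python source (a sketch only; `code_shape` is a made-up helper and it uses Python's keyword list, so other languages would need their own):

```python
import keyword
import re


def code_shape(source: str) -> str:
    """Replace every run of [A-Za-z0-9] with squares, keeping Python keywords."""

    def mask(match):
        word = match.group()
        # Keywords stay readable; everything else becomes a block of squares.
        return word if keyword.iskeyword(word) else "█" * len(word)

    return re.sub(r"[A-Za-z0-9]+", mask, source)


print(code_shape("for item in items:\n    total += item"))
```

Indentation, punctuation, and keywords survive, so what is left is the outline of the control flow.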

That's why it's an art and craft, to me. My thinking is also colored by smaller shops, since a stint as a cog in some large orgs burned up my soul.

And yes, I've been in a few dumb fights ;)


Familiarity improves readability/comprehension.

Black is not ideal but one can live with its output and in time the style becomes familiar to everyone.


Here's the best thing about Black:

Python has significant whitespace, so you can't just click "ignore whitespace changes" in Github diffs and it doesn't matter. You have to put up with every silly whitespace change.

Until Black.


5 upvotes generate a pro tip: have your CI run black --check against the code you want formatted. That way different devs can run the formatting how they like (in tox; in a pre-commit hook; in IDE; on the command line) and CI just enforces that it happened.
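A minimal sketch of such a CI step as a Python wrapper, assuming Black is installed in the CI environment. `is_black_clean` and its `run` parameter are hypothetical names; most setups would simply call `black --check` directly from the CI config.

```python
import subprocess
import sys


def is_black_clean(paths, run=subprocess.run):
    """Return True if `black --check` reports nothing to reformat."""
    command = [sys.executable, "-m", "black", "--check", *paths]
    completed = run(command)
    # Exit code 0: nothing would change; 1: some files would be reformatted.
    # Any other code (for example 123) means Black itself failed.
    if completed.returncode not in (0, 1):
        raise RuntimeError(f"black failed with exit code {completed.returncode}")
    return completed.returncode == 0
```

A CI job could then end with something like `sys.exit(0 if is_black_clean(["src"]) else 1)`, where `src` stands in for whatever paths the project formats.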


Unfortunately Black will reformat code which has only been indented/outdented, causing false diffs even with "ignore whitespace changes".


That's true. What I mean is it won't keep being reformatted when everyone runs their own formatter on the code before they start working on it.


I like Black for normal Python code, but it seems to mangle Pandas / Dask code at times. I still use it extensively cause it doesn't seem like there are other good alternatives.

I wrote a blog post on how to use Black in Jupyter Lab notebooks if anyone is interested: https://coiled.io/blog/code-formatting-jupyter-notebooks-wit...

It's really nice to format a notebook with the click of a button.


It's now enabled by default in IPython, for better or worse: https://github.com/ipython/ipython/pull/13397


“the next release will likely revert it”

— Matthias Bussonnier, https://github.com/ipython/ipython/issues/13463#issuecomment...


Great, I'm sure it will end up with a good solution eventually. Unfortunate that it had to be this heated about this change.

R. Hettinger needed to learn to always be graceful when commenting or criticizing the work of others, especially volunteers.

And we all saw how wild the difference is in engagement for a project like ipython between the "normal" level and the "viral" level. Users are just using it and depending on it, never interacting with the project, but if something attracts attention it can be like a thousand flies flocking to it. Often when something gets negative attention..


There are several main personalities in the python-dev space:

1) People who easily get passionate but are gentlemen when it counts. R. Hettinger is one of these.

2) People who get passionate and are politicians. GvR is one of those and he is above the CoC.

3) Vicious politicians who always remain calm. These are the most dangerous, are often mediocre and climb the ladder at various large corporations. Some of these have not contributed much.

4) Some really nice people. You generally won't see much of them in discussions.

R. Hettinger is independent and honest, unlike the snake pit that runs Python.


After this comment the submission dropped from place 25 to place 56 in less than 10 minutes. Suppression of opinion is a standard PSF politician tactic.


That will be reverted in the next version, i.e. it'll be optional.


I don't love how Black doesn't let you put some clarificatory parentheses -- for instance, they get wiped from `eq_balanced = (left_hand_side == right_hand_side)` But the benefits of never wasting time on discussing inanity outweigh any specific complaints.


You can disable formatting for specific lines or blocks with `# fmt: off`

https://black.readthedocs.io/en/stable/the_black_code_style/...
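For the parentheses case above, a sketch of what that looks like (the variable values are made up):

```python
left_hand_side = 2 + 2
right_hand_side = 4

# fmt: off
eq_balanced = (left_hand_side == right_hand_side)
# fmt: on

print(eq_balanced)  # True
```

Black should leave everything between the two comments untouched, so the clarifying parentheses stay.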


This is a great tool. I wish there was some settings to use single quotes instead of double quotes.


Funny that this is a common reaction against opinionated tools: "I wish there was a configurable option to apply my opinions"

But the whole idea is that you should learn to suppress your ego and let the tool be the one dictating stylistic choices...

Like the sibling comment mentions:

Dusty Phillips, writer:

"Black is opinionated so you don’t have to be."


That made me wonder, so I went and tested edge cases, like "abc\"d\"ef", which black quite properly converts to 'abc"d"ef'. So it does use single quotes where appropriate.
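A simplified sketch of that rule as I understand it: prefer double quotes unless switching would add backslash escapes. It is not Black's implementation, `normalize_quotes` is a hypothetical name, and it assumes a plain one-line literal with no prefix whose only backslashes escape its own quote character.

```python
def normalize_quotes(literal: str) -> str:
    """Pick the quote style with fewer escapes, preferring double quotes on a tie."""
    quote = literal[0]
    other = "'" if quote == '"' else '"'
    body = literal[1:-1]

    # Unescape the current quote character, then escape the new one.
    new_body = body.replace("\\" + quote, quote).replace(other, "\\" + other)

    old_escapes = body.count("\\")
    new_escapes = new_body.count("\\")
    if new_escapes > old_escapes:
        return literal  # switching would introduce more escaping
    if new_escapes == old_escapes and quote == '"':
        return literal  # tie: keep the preferred double quotes
    return other + new_body + other


print(normalize_quotes(r'"abc\"d\"ef"'))  # 'abc"d"ef'
print(normalize_quotes("'hello'"))        # "hello"
```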


I understand that. And this is what I like in Black. But the single quotes thing is really up for debate. When you use it with other tools such as flake8 or pylint, you have to disable a bunch of things to make them work together because no one agrees on this point.


I'm not a full time Python dev so I won't (can't) get into what choice has more merits. But there is this:

https://black.readthedocs.io/en/latest/the_black_code_style/...

Seen from outside, I think the double quotes are just the natural form of strings in lots of other places, at least in those that come from the same family as C (C++, Java, C#, Rust, just to name a few)

Prettier.js (which I'd call the "Black of JavaScript") also settled on double quotes. So we're left with a de facto consensus across programming languages as a whole, which kind of feels nice.


> When you use it with other tools such as flake8 or pylint, you have to disable a bunch of things to make them work together

Using Flake8 with Black requires only two configuration options [1], while Pylint requires three [2]. If you prefer to use Black with --line-length 79, then it's down to a single configuration option for Flake8 or two for Pylint.

[1]: https://black.readthedocs.io/en/latest/guides/using_black_wi...

[2]: https://black.readthedocs.io/en/latest/guides/using_black_wi...


You can have some basic configurable stuff in an opinionated tool though, see Prettier. It allows tab/space, quote style, line length etc., to be modified.


Prettier has some options, but they are not happy about it. It's just how things happened, and now it cannot be "solved", i.e. they cannot remove them, although they would love to be able to do that:

https://prettier.io/docs/en/option-philosophy.html


"My opinion is the correct one"

- Author of some tool


“No opinion is correct, so we should stop wasting time debating it.”

- advocates of code formatters


"let's everyone stop debating it the exact moment after you adopt my preferences"


There is: -S, --skip-string-normalization Don't normalize string quotes or prefixes.


IIRC, -S only prevents single quotes from being switched to double quotes. What I meant was allowing Black to switch double quotes to single quotes automatically.


Try nero from PyPI instead, I’ll update it to this stable release shortly.


There’s also “blue” which I’m the author of and looks similar: https://pypi.org/project/blue/ would love to get your help and thoughts in the issue tracker or by email.


What did you have in mind?


There probably is, if you use another formatter that isn’t Black.


I like yapf.


Congratulations to the Black authors! It's a wonderful tool, probably the first one I install when creating a local Python development environment.


Shameless self-promotion, as a former coworker of Łukasz, Black's creator:

Another coworker and I have created an import sorter that fits well alongside Black, called µsort. It is designed from the ground up to be a safe, stable import sorter that won't move imports in ways that potentially change behavior of the codebase, and without needing developers to litter their code with "skip" directives. We use it in our daily formatting codemods on tens of thousands of source files every day, and just finished our 1.0 release in December.

https://usort.readthedocs.io

Going further, if you like enforcing both formatting and import order in your CI pipeline, I also created the project µfmt, which combines both Black and µsort into a single, atomic formatting step. This ensures there's never any conflict of opinion between the two tools, and any formatting changes are presented as a single diff result.

https://ufmt.readthedocs.io


> that won't move imports in ways that potentially change behavior of the codebase

How can that be the case? Can't Python imports have behaviour that varies on the time of the day if they want?

An import could monkey patch a basic operation one day but not the next.


Absolutely! We're hoping that the vast majority of modules are good citizens, but we also know that reality is not perfect. That's why µsort allows you to configure a list of modules with known import-time side effects, and µsort will then treat those as barriers everywhere.

https://usort.readthedocs.io/en/stable/guide.html#side-effec...


Even better, you can overwrite the default importer to substitute your own with any behavior you want. Used in EVE Online to do their "hide the python code somewhere" scheme.
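
As a minimal sketch of that mechanism (an illustration using Python's standard import hooks, not a description of what EVE Online actually does), a finder placed on `sys.meta_path` can serve modules from somewhere other than the filesystem. The names `HIDDEN_SOURCES`, `InMemoryFinder` and `hidden_mod` are made up for this example.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "hidden" sources, keyed by module name.
HIDDEN_SOURCES = {"hidden_mod": "ANSWER = 42\n"}


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve modules from HIDDEN_SOURCES instead of the filesystem."""

    def find_spec(self, fullname, path, target=None):
        if fullname in HIDDEN_SOURCES:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the normal import machinery handle everything else

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        # Run the stored source inside the new module's namespace.
        exec(HIDDEN_SOURCES[module.__name__], module.__dict__)


# Put the custom finder first so it is consulted before the default finders.
sys.meta_path.insert(0, InMemoryFinder())

import hidden_mod

print(hidden_mod.ANSWER)  # 42
```

Because the finder can run arbitrary code at import time, a static tool has no general way to know what an import will do.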


Interesting. Now that isort has support for black-compatibility, what’s the main difference with usort?


We've written a number of words on "Why µsort", including a comparison to isort: https://usort.readthedocs.io/en/stable/why.html

The biggest reason is µsort's detection of sortable "blocks" of imports, and only sorting within those blocks to limit breaking changes, combined with use of proper parsing and CST manipulation to guarantee valid syntax after sorting.
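
As a rough illustration of the block idea (a simplified sketch working on plain lines, not µsort's actual implementation, which uses proper parsing): treat any non-import line, or an import of a module listed as having side effects, as a barrier, and sort only the imports between barriers. `sort_within_blocks`, `side_effect_modules` and `monkeypatcher` are hypothetical names.

```python
def _module_name(line: str) -> str:
    """Return the module named by a simple 'import x' or 'from x import y' line."""
    parts = line.split()
    return parts[1] if len(parts) > 1 else ""


def _is_import(line: str) -> bool:
    return line.startswith(("import ", "from "))


def sort_within_blocks(lines, side_effect_modules=frozenset()):
    """Sort import lines only inside contiguous blocks.

    Any non-import line, or an import of a module in side_effect_modules,
    acts as a barrier: it stays where it is and ends the current block.
    """
    result, block = [], []
    for line in lines:
        if _is_import(line) and _module_name(line) not in side_effect_modules:
            block.append(line)
        else:
            result.extend(sorted(block, key=_module_name))
            block = []
            result.append(line)
    result.extend(sorted(block, key=_module_name))
    return result


source = [
    "import sys",
    "import os",
    "import monkeypatcher",  # hypothetical module with import-time side effects
    "import json",
    "import abc",
]
sorted_lines = sort_within_blocks(source, side_effect_modules={"monkeypatcher"})
print("\n".join(sorted_lines))
```

Here `os` and `sys` are reordered among themselves, as are `abc` and `json`, but nothing crosses the `monkeypatcher` line.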


FWIW, there's one particular use case where the isort behavior is incredibly useful: in eval-as-you-type data science e.g. when using Hydrogen in Atom you type an import whenever you need it but that leads to a lot of duplication and imports scattered across the file, and it's nice that isort just moves that up top on every save.


That's something we're aware of, and something we've considered, but also potentially dangerous from a safety standpoint, due to Python's dynamic nature, and how much moving imports can affect runtime semantics. Any implementation of this would need to be aware of those dangers, and also be able to provide stable and predictable behavior. At the moment, it's not a huge priority given the availability of tooling to automatically add missing imports, or the limited impact of under-sorting in the context of notebooks.


Very cool. I love the idea of one tool rather than two, I’ll check it out!


Long ago, I worked at a company that decided it needed coding standards. We spent two weeks arguing about coding standards. Nothing else got done.

Ever since then, I am not opinionated about coding standards. The benefits do not match the costs.

Although I do like the approach of "use something like black on precommit", and if you don't like it, you can reformat to your standards on check out, and be happy. It will get fixed on check in.


Sort of why I think Go got it right with gofmt. It's not as strict as it could be, but all Go looks more or less the same, and that’s because it was built into the language.


To me, the point of using code formatters is that you don't have to think/talk about code formatting much at all. As in, there are never annoying comments about formatting in PRs.

A nice bonus is that you end up having uniformly readable code.


Too many people have too strong opinions on the subject. Some things are reasonably good ideas, many more are just arbitrary, strict rules prevent reasonable exceptions, and an enormous amount of time is wasted discussing the matter.


It's the penultimate bike shedding issue.

If your coders can code, it doesn't matter, and if they can't, formatting their poor code isn't going to fix it.


Just an off topic note on language: penultimate means second to (last, final, biggest, etc), i.e. the one before the ultimate. Perhaps an odd choice of word in context that is often misunderstood.

It leads to the question “then what is the ultimate bike shedding issue?”


Yes, ironically the word choice is to avoid bike shedding. If I said it was the ultimate issue, there would be all sorts of "but, but, no, this issue" replies, whereas if it's the second, everyone can have their pet issue be the most important, and we can reach agreement on it and move forward while tabling the discussion of what the ultimate issue is.

Whenever you have one of these bike shed issues in a meeting, do this: give everyone 5 minutes to say their piece, give everyone an opportunity to ask 2 follow-up questions to anyone, and allow a 2 minute response. Go through everyone's opinion, take a vote and move on. Then set a date 1 year in the future to review/change the policy.


The bike shed


Arguing over coding standards is harmful, but consistency is a good thing. Best to adopt a standard style guide and limit any debate to deviations and even better to have a tool like Black make it automatic.


Yep. Even if I personally hate a coding style, I'll stick to it religiously if it's what my team uses. A good coding style is one that everyone adheres to.


Yep, just agree on something, whatever it is, enforce it everywhere and move on. It’s not worth the time arguing. I love gofmt for this.


> Black prefers [i.e. converts] double quotes (" and """) over single quotes (' and ''').

Can someone explain this to me? Why would you ever prefer " over ' in a language where both can be used equally?


It's a tic for me too, as I always prefer '.

But to be fair I think it's because 1. Double quotes are ""universally"" strings in other languages 2. Having apostrophes gets annoying.

The great benefit really is that I can write single word strings just fine, and black will adjust it for me. I get my cake, and so do the others.


Because "don't" is easier to read than 'don\'t'


If you use proper apostrophes instead of ASCII apostrophe / single quote, you don’t have to escape anything and you get better typography. "don’t" is easier to read than both.

If you’re on a Mac, Option-Shift-] will type it.


When you’re embedding SQL in a string, you use ‘ a lot more than “. Not the only use case, but one to give some consideration.


I think that’s only MySQL. Postgres uses double quotes for identifiers and single quotes for literals.

EDIT: I misinterpreted that single quote as a backtick. In any case, both single and double quotes are common in SQL, but single quotes are a bit more common.


Black will use single quotes if your string contains double quotes and no single quotes. It'll reformat

     print("\"Hello\" he said")

to

     print('"Hello" he said')

I think pretty much all formatters for languages that let you use either quote do this. It makes sense.
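
The rule described above can be sketched roughly as follows. This is a simplified illustration that assumes the string body contains no backslashes or newlines; `prefer_double_quotes` is a hypothetical helper, not Black's actual implementation.

```python
def prefer_double_quotes(body: str) -> str:
    """Render an unescaped string body as a Python literal, preferring double quotes.

    Single quotes are used only when the body contains double quotes and
    no single quotes, so that no escaping is needed.
    Assumes the body has no backslashes or newlines.
    """
    if '"' in body and "'" not in body:
        return "'" + body + "'"
    return '"' + body.replace('"', '\\"') + '"'


print(prefer_double_quotes('"Hello" he said'))  # '"Hello" he said'
print(prefer_double_quotes("don't"))            # "don't"
print(prefer_double_quotes("plain"))            # "plain"
```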


And 'do not' is easier to read than both :)


Arguing over Black formatting choices is a little ironic, isn’t it?

The whole point is that they’re all arbitrary so just let the defaults win, and adjust yourself instead.


The presence and depth of the debate about this formatting choice suggests that it isn't arbitrary. Many have provided solid reasons for why one option should be preferred over the other. They aren't merely articulating their opinions; they are also providing objective reasons for why their opinion is valid.


Everything about formatting is arbitrary, as long as it's consistent in a given codebase to the point of predictability (e.g. 90% consistent, not necessarily 100% consistent).

To elaborate a bit, sure, there are arguments for everything talked about here, and they're valid, but the marginal utility of any one choice is dwarfed by opting out of the arguments entirely, and just going with whatever Black has selected as a default.


Because most commonly used languages use double quotes, not single quotes, for string literals. Some languages can use both, either interchangeably or with slightly different semantics, but others either use single quotes for other purposes, or do not use them for anything at all. Therefore, double quotes for strings look more normal, so to speak, in source code.


Except that in the language you're most likely to mix with Python, SQL, strings are in single quotes (double quotes are for identifiers like column names, especially if they're case sensitive). So it's probably a bad choice on Black's part.


(MySQL, notoriously, uses double quotes for strings.)

However, doesn't it make things easier that Python and standard SQL use different quote characters? I.e.

  query = "SELECT column_a FROM a_table WHERE column_b = 'foo';"

compared to this, if single quotes were enforced:

  query = 'SELECT column_a FROM a_table WHERE column_b = \'foo\';'


If it's useful to use the same string literal character "as everyone else", it would be useful to use the same as a language you are very likely to see combined - so, single quote.

On the other hand, if it's useful to have minimum need of escaping, then single quote is still better, because in SQL it's much more common to quote identifier names than to have literal strings.

  query = 'SELECT "columnA" FROM "aTable" WHERE "columnB" = \'foo\';'

compared to

  query = "SELECT \"columnA\" FROM \"aTable\" WHERE \"columnB\" = 'foo';"


> it would be useful to use the same as a language you are very likely to see combined - so, single quote.

Why? It’s obvious that it causes issues, as I showed, and yet you simply claim the opposite without argument? The difference between other programming languages and SQL is, of course, that it’s very common to have embedded SQL code in your program, but not common to have embedded strings of code in another programming language.

> in SQL it's much more common to quote identifier names than to have literal strings.

That is certainly not true. The kind of SQL you quote is only found in auto-generated SQL by ORMs and the like, and those are not likely to exist as strings in code. To use your example, this is the normal SQL way to write that:

  query = "SELECT columnA FROM aTable WHERE columnB = 'foo';"

I.e. no double quotes needed; no need to escape anything.


Using double quotes makes it easier to have single quotes inside the string literals.


This was decided very early in the project, the discussion can be read here: https://github.com/psf/black/issues/51



Did you lock that issue before posting it here? Is this some kind of sore/forbidden topic for the project?

Edit: I suppose the question is superfluous seeing how many replies there are just here.


Yes, and yes. Edit: and yes.


That was the thing that bugged me at first, but other people on my team loved it. Some things bugged other people on my team, even though I love them. If a tool can manage to annoy everyone about the same amount, but in different areas, it’s doing something right.

And while it’s still kind of strange that `a = "foo"` looks different than `repr(a)`, after a while of using Black I don’t notice it anymore.


A couple of possible reasons:

* Double quotes are more visible, especially for triple quotes.

* It makes it easier to have text without having to manually escape apostrophes.


This is simple: the default is picked, so now no one has to argue about it.


Additionally, standardizing all quotes makes searching for a specific string literal across a codebase easier, since you don't need regex to match against both quote types. Once you've decided to standardize strings, it doesn't matter much which one you pick.

Personally, I still type single quotes in almost all cases, and just let Black reformat to double quotes to save shift key presses.
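
To illustrate the search point above: in a codebase with mixed quote styles, finding a literal needs a pattern that accepts either quote character, while a normalized codebase can be searched with a plain substring. The sample source and variable names below are made up for this example.

```python
import re

mixed_source = """
name = 'status'
other = "status"
"""

# Mixed quoting: the pattern must accept either quote character,
# and the backreference requires the closing quote to match the opening one.
either_quote = re.compile(r"""(["'])status\1""")
print(len(either_quote.findall(mixed_source)))  # 2

# After normalizing to double quotes, a plain substring search finds every occurrence.
normalized_source = mixed_source.replace("'status'", '"status"')
print(normalized_source.count('"status"'))  # 2
```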


I prefer them too. They're easier to notice than single ones. It's just a personal preference, there's no big logic behind it.


The other way around (converting double quoted strings to single quoted ones) would be more understandable, as that is what str.__repr__ prefers:

    >>> "string"
    'string'

I sense a bloated ego of the author in Black's decisions -- to contradict with the language's preferred formatting style.
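
For reference, a short sketch of how `repr()` picks quotes in CPython: it defaults to single quotes and switches to double quotes only when the string contains a single quote and no double quote. The `samples` list is just illustrative input.

```python
samples = ["string", "don't", 'say "hi"', "it's \"both\""]
results = [repr(s) for s in samples]

print(results[0])  # 'string'
print(results[1])  # "don't"
print(results[2])  # 'say "hi"'
print(results[3])  # 'it\'s "both"'
```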


It’s easier and faster to differentiate between “ and ` than between ‘ and `.


Uniformity, presumably


Try nero from PyPI instead, I’ll upgrade to stable shortly.


Last commit was from 2019? It sounds like you're not automating the patch for every black release...


Things were refactored a few times, so it wasn’t a consistent patch. Also, I didn’t care for a few of the changes, so I didn’t upgrade yet. However, stable should be a good target for automation.


In some languages single vs double has different functionality.


Does anyone have a foolproof way to reformat all the code in a repo without screwing up history? I've seen some complicated commands which seem too sketchy to a git novice like myself.


Do it all in one commit.

Then put that commit's (full) sha in a file named something like .git-blame-ignore-revs

Then `$ git config blame.ignoreRevsFile .git-blame-ignore-revs`
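
A small sketch of the bookkeeping step, for teams that want to script it. The `record_ignored_rev` helper and the hash in the demo are hypothetical, and the check assumes 40-character SHA-1 hashes; only the file contents (full commit hashes, one per line, with `#` comment lines) and the `blame.ignoreRevsFile` setting come from git itself.

```python
import re
import tempfile
from pathlib import Path

IGNORE_FILE = ".git-blame-ignore-revs"


def record_ignored_rev(repo_dir, sha, note=""):
    """Append a full 40-character commit hash to the ignore-revs file in repo_dir."""
    if not re.fullmatch(r"[0-9a-f]{40}", sha):
        raise ValueError("expected a full 40-character lowercase hex commit hash")
    path = Path(repo_dir) / IGNORE_FILE
    with path.open("a", encoding="utf-8") as f:
        if note:
            f.write(f"# {note}\n")  # git skips comment lines in this file
        f.write(sha + "\n")
    return path


# Demo in a temporary directory with a made-up hash.
with tempfile.TemporaryDirectory() as tmp:
    written = record_ignored_rev(
        tmp, "0123456789abcdef0123456789abcdef01234567", "Reformat with Black"
    )
    contents = written.read_text(encoding="utf-8")
print(contents)
```

Committing the file shares the list of ignored revisions, but the `git config` step above is local, so each clone still has to run it once.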


Thanks, that is helpful, especially since I can google that to get more docs. One issue is that everyone on the team needs to do so, but it's probably worth it.


I'm a huge fan of black, and have been using it in most of my projects for a long time.

That being said, my biggest gripe with it is that it has at some point started reformatting docstrings [0] in addition to the code -- strictly following PEP 257 [1] -- without any way to disable that behaviour. While I understand the desire to standardise, docstrings are not executable code, and should be allowed more flexibility when it comes to formatting.

[0] https://github.com/psf/black/issues/1779

[1] https://www.python.org/dev/peps/pep-0257/


I honestly don't understand why devs are so hung up on formatting.

I know, I know: consistency. I've heard that for years. I disagree that it matters. Where I work, one codebase is formatted with black, another is not. The un-blackened codebase is rife with all sorts of style inconsistencies. This does not make it more taxing to read.

The only downside is the useless PR nits from folks who randomly decide you ought to format something a bit differently. If it were up to me, I would just ban all style related comments and move on with life. But adopting Black seems politically feasible, so I advocate for that.


> I honestly don't understand why devs are so hung up on formatting.

Because easily 98% of the work of any developer is being a "wetware compiler" since "code is written for other people."

So the faster the human can mentally run the code, the faster they can then decide what action to take next

> This does not make it more taxing to read.

Ah, spoken like someone who has not yet worked in a "foreign language" shop, where the developers are new to the language at hand and bring their old "best practices" to the current language


>Because easily 98% of the work of any developer is being a "wetware compiler" since "code is written for other people."

People always say this. But they fail to provide any evidence that consistency helps, and if so, by how much. All they have is personal experience, which is all I have as well. And my personal experience says it just doesn't matter.

> Ah, spoken like someone who has not yet worked in a "foreign language" shop, where the developers are new to the language at hand and bring their old "best practices" to the current language

Quite the assumption. That's where many of the inconsistencies come from. And Black isn't going to help with different language idioms. That matters a lot more than whether or not you use a trailing comma for the last item of a multi-line list or not.


I use black and love it, but only with 120 line lengths. The default of 80 is wayyy too low for the days of 4k monitors. The tricks it uses to split some things up onto new lines actually make it less readable.


I agree 80 is too low, but it doesn't have anything to do with 4k monitors, it's just that 80 is too short. Any longer than 120 is not readable imo. Luckily Black lets you change the line length so it's not an issue.


Maybe it’s different for other people but smaller text is much easier on my eyes with a 4k monitor.


As long as you can easily review code side-by-side with line numbers intact, you're good. The default was chosen for lower resolution 13" laptop monitors to be able to display the Phabricator diff page (think: Github PR review page) without having to wrap any lines.


I agree. I've sometimes found Black's output to be on the borderline of unreadable with the default line length of 80. My team settled on a length of 120 (the only Black config item we changed) and it has largely, though not entirely, solved that problem.

