Modern Python Performance Considerations (lwn.net)
320 points by chmaynard on May 5, 2022 | 244 comments



With JavaScript, these kinds of optimizations in an engine make sense: the web is limited by JavaScript, so speed is a huge factor. With Python, however, if a Python web framework is “too” slow, I would honestly say the problem is using Python at all for a web server. Python shines beautifully as a (somewhat) cross platform scripting language: file reading and writing, environment variables, simple implementations of basic utilities (sort, length, max, etc.) that would be cumbersome in C. The move of Python out of this niche and into practically everything is the issue, and that is what leads us into rabbit holes such as this one, where we use Python, a dynamic scripting language, for things a second-year computer science student should know are not “the right jobs for the tool.”

Instead of performance, I’d like to see more effort in portability, package management, and stability for Python. Since it is often enterprise managed, juggling fifteen versions of Python (3.8.x supports native collection typing annotations, but we use 3.7.x, etc.) is my biggest complaint. Also up there is pip and just the general mess of dependencies and lack of a lock file. Performance doesn’t even make the list.

This is not to discredit anyone’s work. There is a lot of excellent technical work and research discussed in the article. I just honestly think a lot of this effort is wasted on things low on Python’s list of priorities.


On paper, Python is not the right tool for the job, both because of its poor performance characteristics and because it’s so forgiving/flexible/dynamic that it’s tough to maintain large Python codebases with many engineers.

At Google there is an essay recommending that Python be avoided for large projects.

But then there’s the reality that YouTube was written in Python. Instagram is a Django app. Pinterest serves 450M monthly users as a Python app. As far as I know Python was a key language for the backend of some other huge web scale products like Lyft, Uber, and Robinhood.

There’s this interesting dissonance where all the second year CS students and their professors agree it’s the wrong tool for the job yet the most successful products in the world did it anyway.

I guess you could interpret that to mean all these people building these products made a bad choice that succeeded despite using Python but I’d interpret it as another instance of Worse is Better. Just like Linus was told monolithic kernels were the wrong tool for the job but we’re all running Linux anyway.

Sometimes all these “best practices” are just not how things work in reality. In reality, Python is a mission-critical language in many massively important projects, its performance characteristics matter a ton, and efforts to improve them should be lauded rather than scrutinized.


>the most successful products in the world did it anyway

A few successful projects in the world did it. There are likely far more successful products that didn't use it.

The key metric along this line is how often each language allows success to some level and how often they fail (especially when due to the choice of language).

>should be lauded rather than scrutinized

One can do both at the same time.


Instagram has one billion monthly users generating $7 billion a year. There are almost zero products on earth as successful.


Just compare Instagram, written in Python, to Google Wave, Google+, or any of Google's other social media products, written in C++/Java :))))


> Instagram has one billion monthly users generating $7 billion a year.

Doesn't Instagram serve mostly static content that's put together in an appealing way by mobile apps? I'd figure Instagram's CDN has far more impact than whatever Python code it's running somewhere in its entrails.

Cargo cult approaches to tech stacks don't define quality.


The point is that it's still one project. You need to count the failures as well to rule out survivorship bias.


And with $7 billion you can put a lot of effort into tweaking your Python application's performance?


> The key metric along this line is how often each language allows success to some level and how often they fail

How does python score on these key metrics?


> But then there’s the reality that YouTube was written in Python. Instagram is a Django app. Pinterest serves 450M monthly users as a Python app. As far as I know Python was a key language for the backend of some other huge web scale products like Lyft, Uber, and Robinhood.

All those namedrops mean and matter nothing. Hacking together proofs of concept is a time-honoured tradition, as is pushing to production hacky code that's badly stitched up. Who knows if there was any technical analysis to pick Python over any alternative? Who knows how much additional engineering work and how many additional resources were required to keep that Python code from breaking apart in production? I mean, Python has always figured very low in webapp framework benchmarks. Did that change just because <trendy company> claims it used Python?

Also, serving a lot of monthly users says nothing about a tech stack. It says a lot about the engineering that went into developing the platform. If a webapp is architected so that it can scale well to meet its real-world demand, even after paying a premium for the poor choice of tech stack that some guy who is no longer around made in the past for god knows what reason, what would that say about the tech stack?


"All those namedrops mean and matter nothing"

If your goal is to actually ship product then this matters a lot. Many of us have dealt with folks spinning endlessly on "technical analysis" when just moving forward with something like Python would be fine. Facebook is PHP.

I'm actually cautious now when folks are focused too much on the tech and tech analysis instead of product / users / client need.


> All those namedrops mean and matter nothing. ... Who knows if there was any technical analysis to pick Python over any alternative?

Why look at results when you can look at analysis!


> Why look at results when you can look at analysis!

The problem with this mindless cargo culting around frameworks and tech stacks is that these cultists look at business results and somehow believe that they have anything to do with arbitrary choices regarding tech stacks.

It's like looking at professional athletes winning and proceeding to claim the wins are due to the brand of shoes they're wearing, even though the athlete had no say in the choice and was forced to wear whatever was handed to them.


I don't think "I could use tool X for job Y" implies "X was the right tool for job Y". You could commute with a truck to your workplace 300 feet away for 50 years straight and I would still argue you probably used the wrong tool for the job. "Wrong tool" doesn't imply "it is impossible to do this", it just means "there are better options".


The main thing is that Python is often in the top 10 choices for almost every problem on top of being insanely easy to learn and write; it also doesn't fragment its community - its standard library is so ridiculously large there are very few fault lines to break upon.

It's rarely the best choice, or even the fifth best. But if it's OK at a dozen things, then it makes it all but impossible to ignore. The fact that it sucks to write a GUI in is fine as long as I can put some basic "go" buttons and text boxes in front of a web scraper.


IME writing a GUI in Python is pretty easy in a whole bunch of different ways, shipping is the annoying part.


Python is the new BASIC.


Or maybe tech stack really doesn't have that much influence on the success or failure of the business :)


Depends on the margins you have to work with. For high-margin businesses, I agree, the tech stack isn’t crucial to the health of the business. But for low-margin businesses for which compute is a significant cost center, tuning a tech stack to be more cost-efficient can make you a hero.


You should read about MySpace and "Samy is my hero". Or Google vs. AltaVista. Or GeoWorks vs. Microsoft Windows. Or Yahoo Mail and the medireview problem. Or when Danger lost everybody's data. Or THERAC-25. Or Knight Capital's bug.

On the other hand, for many years a lot of Amazon's back office processes were written in Elisp.

With enough thrust you can get a pig to fly but effort can only compensate for bad technical decisions up to a point.


>Elisp

Elisp? Elisp? Are you sure?


I didn't witness it, but Steve Yegge says he did; from https://sites.google.com/site/steveyegge2/tour-de-babel:

> Shel wrote Mailman [not the Python mailing list manager, an Amazon-internal application] in C, and Customer Service wrapped it in Lisp. Emacs-Lisp. You don't know what Mailman is. Not unless you're a longtime Amazon employee, probably non-technical, and you've had to make our customers happy. Not indirectly, because some bullshit feature you wrote broke (because it was in C++) and pissed off our customers, so you had to go and fix it to restore happiness. No, I mean directly; i.e., you had to talk to them. Our lovely, illiterate, eloquent, well-meaning, hopeful, confused, helpful, angry, happy customers, the real ones, the ones buying stuff from us, our customers. Then you know Mailman.

> Mailman was the Customer Service customer-email processing application for ... four, five years? A long time, anyway. It was written in Emacs. Everyone loved it.

> People still love it. To this very day, I still have to listen to long stories from our non-technical folks about how much they miss Mailman. I'm not shitting you. Last Christmas I was at an Amazon party, some party I have no idea how I got invited to, filled with business people, all of them much prettier and more charming than me and the folks I work with here in the Furnace, the Boiler Room of Amazon. Four young women found out I was in Customer Service, cornered me, and talked for fifteen minutes about how much they missed Mailman and Emacs, and how Arizona (the JSP replacement we'd spent years developing) still just wasn't doing it for them.

> It was truly surreal. I think they may have spiked the eggnog.


AFAIK, mailman was the only thing ever wrapped in this way.

I wrote all the rest of the back-office utilities (at least, the initial versions of them), and I have never come across any indication that they got "wrapped" in anything else (they did, no doubt, evolve and mutate into something utterly different over time).

Yegge's quote is also slightly inaccurate in that mailman, like all early amzn software, was written in C++. Shel and I just chose not to use very much of the (rather limited) palette of C++ syntax & semantics.


Aha, the correction is greatly appreciated.

Most days I regret posting to HN. Today is not one of those days.


Well, a proper Emacs module can be set up with menus and a relatively easy interface for everyone.

A good example is Gnus.


> There’s this interesting dissonance where all the second year CS students and their professors agree it’s the wrong tool for the job yet the most successful products in the world did it anyway.

> I guess you could interpret that to mean all these people building these products made a bad choice that succeeded despite using Python but I’d interpret it as another instance of Worse is Better. Just like Linus was told monolithic kernels were the wrong tool for the job but we’re all running Linux anyway.

This isn't the correct perspective or takeaway. The 'tool' for the job when you're talking about building/scaling a website changes over time as the business requirements shift. When you're trying to find market fit, iterating quickly using 'RAD'-style tools is what you need to be doing. Once you've found that fit and you need to scale, those tools will need to be replaced by things that are capable of scaling accordingly.

Evaluating this as a binary right choice / wrong choice only makes sense when qualified with a point in time and/or scale.


Instagram created Cinder https://www.infoworld.com/article/3617913/instagram-open-sou... to address Python performance.

YouTube video processing uses C++. It also uses Go and Java along with Python.

Pinterest makes heavy use of Erlang for scaling. The rate-limiting system for Pinterest’s API and Ads API is written in Elixir and responds faster than its predecessor.

Takeaway: Basically you need to either build your own Python VM/CPython fork for better Python performance, or use another language for the parts that need to scale or run fast.


I think people look ahead. They see how the app evolves (from A to B) and then claim "X is not good," whereas they do not judge the flexibility of X as a tool for moving from A to B. Typically they look at B and make claims from that perspective.

Those companies that succeed with Python usually have a long path, and Python was never successfully removed, even though attempts were most likely made. PL economics is often about stickiness, and it's not easy to propose an absolute measure.


We might be running Linux, yet it has so many virtualization layers on cloud infrastructure, user-space stacks to work around switching into the kernel, and microservices for everything, that it is effectively a monolithic kernel being bent into a microkernel one.

Same thing with Python: those businesses succeeded despite Python, and when they grew it was time to port the code to something else, or spend herculean efforts on the next Python JIT.


+1. Languages that are general purpose get used for everything. Perl for the web. Python for builds. Scala transpiled for web.

Portability has many solutions that are good enough - often only bad because they result in second order issues which are themselves solvable with limited pain. Being able to scale software further without having to solve difficult distributed systems problems is of value.


I want a common language I can work with. Right now, Python is the only tool which fits the bill.

A critical thing is Python does numerics very, very well. With machine learning, data science, and analytics being what they are, there aren't many alternatives. R, Matlab, and Stata won't do web servers. That's not to mention wonderful integrations with OpenCV, torch, etc.

Python is also competent at dev-ops, with tools like ansible, fabric, and similar.

It does lots of niches well. For example, it talks to hardware. If you've got a quadcopter or some embedded thing, Python is often a go-to.

All of these things need to integrate. A system with Ruby+R+Java will be much worse than one which just uses Python. From there, it's network effects. Python isn't the ideal server language, but it beats a language which _just_ does servers.

As a footnote, Python does package management much better than alternatives.

pip+virtualenv >> npm + (some subset of require.js / rollup.js / ES2015 modules / AMD / CommonJS / etc.)

JavaScript has finally gone from a horrible, no-good, bad language to a somewhat competent one with ES2015, but it has at least another 5-10 years before it can start to compete with Python for numerics or hardware. It's a sane choice if you're front-end heavy, or mobile-heavy. If you're back-end heavy (e.g. an ML system) or hardware-heavy (e.g. something which talks to a dozen cameras), Python often is the only sane choice.


> As a footnote, Python does package management much better than alternatives

No offense meant, but that sounds like the assessment of someone that has only experienced really shitty package management systems. PyPI has had their XMLRPC search interface disabled for months (a year?) now, so you can't even easily figure out what to install from the shell and have to use other tools/a browser to figure it out.

Ultimately, I'm moving towards thinking that most scripting languages actually make for fairly poor systems and admin languages. It used to be the ease of development made all the other problems moot, but there's been large advances in compiled language usability.

For scripting languages you're either going to follow the path of Perl or the path of Python, and they both have their problems. For Perl, you get amazing stability at the expense of the language eventually dying out because there aren't enough new features to keep people interested.

For Python, the new features mean that module writers want to use them, and then they do, and you'll find that the system Python you have can't handle what modules need for the things you want to install, and so you're forced to have not just a separate module environment, but fully separate Pythons installed on servers so you can make use of the module ecosystem. For a specific app you're shipping around this is fine, but when maintaining a fleet of servers and trying to provide a consistent environment, this is a big PITA that you don't want to deal with when you've already chosen a major LTS distro to avoid problems like this.

Compiling a scripting language usually doesn't help much either, as that usually results in extremely bloated binaries which have their own packaging and consistency problems.

This is a cyclical problem we've had so far. A language is used for admin and system work, the requirements of administrators grate up against the usage needs of people that use the language for other things, and it fails for non-admin work and loses popularity and gets replaced by something more popular (Perl -> Python), or it fails for admin work because it caters to other uses and eventually gets replaced by something more stable (what I think will happen to Python, and what I think somewhat happened to bash earlier for slightly different reasons).

I'm not a huge fan of Go, but I can definitely see why people switch to it for systems work. It alleviates a decent chunk of the consistency problems, so it's at least better in that respect.


>No offense meant, but that sounds like the assessment of someone that has only experienced really shitty package management systems. PyPI has had their XMLRPC search interface disabled for months (a year?) now, so you can't even easily figure out what to install from the shell and have to use other tools/a browser to figure it out.

Yes, this is, frankly, an absurd situation for python.

And then there is the fact that I end up depending on third-party solutions to manage dependencies. Python is big-time now; stop the amateur hour crap.


Most languages have numerous third-party solutions for managing dependencies, or only recently added native support. Go only recently added modules and was an absolute mess prior to that. JavaScript has npm, yarn, and about a million others. PHP has Composer, but it doesn't cover everything. C/C++ are a mess. Java has Gradle, Maven, sbt, etc.


> As a footnote, Python does package management much better than alternatives.

If you use it as a scripting language, that might very well be the case (it's at least simpler). When you're building libraries or applications, no, definitely not. It's a huge mess, and every 3 years or so we get another new tool that promises to solve it, but just ends up creating a bigger mess.


This is an overly-dire assessment in my opinion. Setuptools + Pip has been more or less stable for years, and no changes in dev environment have been needed since stability was reached. There is a lot of new stuff coming out, for people who want new stuff, but there's nothing wrong with the old stuff if the old stuff works for you, which will remain supported for several more years if not forever.


I think poetry actually does solve it


Oh, there are a half dozen different tools that solve python package management. Unfortunately, they are mutually incompatible and none solve it for all use cases.



It’s somewhat the opposite situation, poetry is a tool that lots of people are adopting rather than one being pushed by a standards committee. I don’t think it set out to unify a dozen standards, only build a good UX and to be reproducible.


Poetry isn't an additional standard, more like an implementation. It is PEP 518 compliant.


> it has at least another 5-10 years before it can start to compete with Python for numerics or hardware

More, given that no language outside of Julia competes with Python at high-level numerics, and numerics in general only adds C++ to the list.


Fortran >:D


For low-level, fair. I only know of people in astronomy academia who actually use it nowadays though.


There are good technical reasons to use Fortran even in 2022. For example, it avoids** aliasing (pointers / references / etc.). This allows for some kinds of optimizations that are impossible in most other languages.

It's used in a bunch of small niches, but it has users beyond just astronomy.

** A million disclaimers apply.


I am aware that hypothetically fortran code is the fastest possible. That said, I am not sure how great this aliasing difference is in practice.

There is a reason why most BLAS implementations have been rewritten into C.


Signal processing stuff is still sometimes written in it.


> R, Matlab, and Stata won't do web servers.

Not unless they're pushed to, like Python was.

>A critical thing is Python does numerics very, very well.

That's not Python doing numerical stuff. That's C code, called from Python.


It's not C code. It calls into a mixture of C, CUDA, Fortran, and a slew of other things. Someone did the work of finding the best library for me, and integrating them.

As for me, I write:

A * B

It multiplies two matrices. C can't do that. In C, I'd have some unreadable matrix64_multiply(a, b). Readability is a big deal. Math should look more-or-less like math. I can handle 2^4, or 2**4, but if you have mpow(2, 4) in the middle of a complex equation, the number of bugs goes way up.

I'd also need to allocate and free memory. Data wrangling is also a disaster in C. Format strings were a really good idea in the seventies, and were a huge step up from BASIC or Python. For 2022?

And for that A * B? If I change data types, things just work. This means I can make large algorithmic changes painlessly.
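
(A minimal sketch of what that looks like, assuming NumPy is the library underneath; the matrices and dtypes here are made up for illustration:)

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    B = np.array([[5.0, 6.0], [7.0, 8.0]])

    C = A @ B   # matrix product, reads like the math
    D = A * B   # elementwise product

    # Swap the dtype and the same expressions still work unchanged.
    C32 = A.astype(np.float32) @ B.astype(np.float32)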

Oh, and I can develop interactively. ipython and jupyter are great calculators. Once the math is right, I can copy it into my program.

I won't even get started on things like help strings and documentation.

Or closures. Closures and modern functional programming are huge. Even in the days of C and C++, I'd rather do math in a Lisp (usually, Scheme).

I used to do numerics in C++, and in C before that. It's at least a 10x difference in programmer productivity stepping up to Python.

Your comment sounds like someone who has never done numerical stuff before, or at least not serious numerical stuff.


> It multiplies two matrices. C can't do that. In C, I'd have some unreadable matrix64_multiply(a, b).

In C you get BLAS, which provides functions like ?gemm, a BLAS level-3 function whose name literally stands for general matrix-matrix product.

Hardly cryptic.

https://www.intel.com/content/www/us/en/develop/documentatio...

Also, anyone doing any remotely serious number crunching/linear algebra work is well aware that you need to have control over which algorithms you use to run these primitive operations, and which data type you're using.

> I used to do numerics in C++, and in C before that. It's at least a 10x difference in programmer productivity stepping up to Python.

I'm rather skeptical of your claim. Eigen is the de-facto standard C++ linear algebra toolkit and it overloads operators for basic arithmetics.

https://eigen.tuxfamily.org/dox-devel/group__TutorialMatrixA...

I'm not sure your appeal to authority is backed up with any relevant experience or authority. It's ok if you like Python and numpy, but don't try to pass off your personal taste for anything with technical merit.


> > It multiplies two matrices. C can't do that. In C, I'd have some unreadable matrix64_multiply(a, b).

> In C you get BLAS, which provides functions like ?gemm, a BLAS level-3 function whose name literally stands for general matrix-matrix product.

> Hardly cryptic.

Seriously? The signature for dgemm is

    void cblas_dgemm(const CBLAS_LAYOUT layout, const CBLAS_TRANSPOSE TransA,
                     const CBLAS_TRANSPOSE TransB, const CBLAS_INT M, const CBLAS_INT N,
                     const CBLAS_INT K, const double alpha, const double  *A,
                     const CBLAS_INT lda, const double  *B, const CBLAS_INT ldb,
                     const double beta, double  *C, const CBLAS_INT ldc)
Maybe you have loads of free time, but I don't want to memorize that function signature when I could just type A @ B.


> Seriously? The signature for dgemm is (...)

The signature of gemm is trivial if you're aware of the basics of handling dense row-major/column-major matrices.

https://www.intel.com/content/www/us/en/develop/documentatio...

https://www.netlib.org/lapack/explore-html/db/dc9/group__sin...

I haven't met a single person who did any number crunching work whatsoever who ever experienced any problem doing basic matrix-matrix products with BLAS. Complaining about flags to handle row-major/column-major matrices while boasting about being an authority on number crunching is something that's not credible at all.


You sound a lot like me, when I was a teenager.


I don't actually mind memorizing these things if I'm doing this 60 hours per week. I used to be quite good with some of the C++ STL numerics libraries. I took the same perverse pride some folks here apparently do in knowing these things inside and out.

However, most of working with code is about being able to read and modify it, not write it. Equations are sometimes hard enough when they look like equations. If you've got a call like that in your code, it's going to be radically harder to understand than A*B.

That's not to mention all the clutter around it, of allocating and deallocating memory.

One of the things C++ programmers don't typically understand are the types of boosts one gets from closures, garbage collection, and functional programming in general, especially for numerics. I recommend this book:

https://en.wikipedia.org/wiki/Structure_and_Interpretation_o...

In it, you'll see code which:

- The user writes a Lagrangian

- The system symbolically computes the Lagrange equations (which are derivatives of the above)

- This is compiled into native code

- A numerical algorithm integrates that into a trajectory of motion

- Which is then plotted

All of this code is readable (equations are written in Lisp, but rendered in LaTeX). None of this is overly hard in a Lisp; this was 1-3 people hacking together, and not even central to what they were doing. It'd be nigh-impossible in C or C++.

Those are the sorts of code structures which C/C++ programmers won't even think of, because they're impossible to express.

(Footnote: Recent versions of C++ introduced closures; I have not used them. People who have express they're not "real closures.")


Interesting... I would think symbolically deriving the Euler-Lagrange equations from the Lagrangian would be quite hard in practice.


That's the shocking thing. It's totally not hard in a Lisp. I could write the code for that symbolic derivation in an evening, tops.

The hard part is the compiler. The somewhat hard part is the efficient numerical integrator (if you want good convergence and rapid integration). The symbolic manipulation is easy, once you know what you're doing. If you want to know what you're doing, it's sufficient to see how other people did it:

https://mitpress.mit.edu/sites/default/files/titles/content/...

And if you haven't seen it:

https://mitpress.mit.edu/sites/default/files/sicp/full-text/...

If you're dumb like me, and can't write a compiler or an efficient numerical integrator in your spare time, having this interpreted and using a naive integrator is still good enough most of the time. Computers are fast. The authors of the above proved the motion of the solar system is chaotic with a very, very long, very, very precise numeric integration on hardware from decades ago, so they have super-fancy code. For the types of things I do, a dumb integration is fine.

Once you've seen how it's done, and don't mind lower speed, that's trivial in any language with closures (e.g. Python or even JavaScript). If you want to swing a double-pendulum, or play around with the motion of a solar system over shorter durations, it's easy.

And using the tool is even easier. Look at SICM (linked book). The things that look like code snippets are literally all the code you need.

The system, if you want it:

https://groups.csail.mit.edu/mac/users/gjs/6946/installation...


Wow, I can't believe I've never heard of this book (SICM, not the wizard book). Seems incredibly cool.

That said, I'll maintain that my initial skepticism was somewhat justified given that the derivation relies on a pre-written symbolic manipulation/solver library (scmutils) so it's not quite 'from scratch' in a Lisp ;) although I believe that you could write such a library yourself, as SICP demonstrates.

As a side note, it looks like I can't go through this book because I have an M1 Mac, which GNU Scheme doesn't support :(


> the number of bugs goes way up

In case you are forced to use the unreadable long-named unintuitively-syntaxed methods, add unit tests, and check that input-output pairs match with whatever formula you started with.


Yet Python (and most of her programmers, data scientists included, of which I am one) stumbles with typing.

    if 0.1 + 0.2 == 0.3:
        print('Data is handled as expected.')
    else:
        print('Ruh roh.')
This fails on Python 3.10 because floats are not decimals, even if we really want them to be. So most folks ignore the complexity (due to naivety or convenience) or architect appropriately after seeing weird bugs. But the "Python is easiest and gets it right" notion that I'm often guilty of has some clear edge cases.


Why would you want decimals for numeric computations though? Rationals might be useful for algebraic computations, but that’d be pretty niche. I’d think decimals would only be useful for presentation and maybe accountancy.


Well, for starters folks tend to code expecting 0.1+0.2=0.3, rather than abs(0.3-0.2-0.1) < tolerance_value

Raw floats don't get you there unfortunately.
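
(A minimal sketch of the usual workaround, using the stdlib's math.isclose; the explicit tolerance value is illustrative:)

    import math

    print(0.1 + 0.2 == 0.3)                 # False: the sum is really 0.30000000000000004
    print(math.isclose(0.1 + 0.2, 0.3))     # True: compares within a relative tolerance
    print(abs(0.3 - 0.2 - 0.1) < 1e-9)      # True: the explicit-tolerance version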


If you want that you should use integers. This seems to be a misalignment of expectations rather than a fault in the language.

Other people have posted other examples but it’s not possible to represent real numbers losslessly in finite space. Mathematicians use symbolic computation but that probably is not what you would want for numerics. I could see a language interpreting decimal input as a decimal value and forcing you to convert it to floating point explicitly just to be true to the textual representation of the number, but it would just be annoying to anyone who wants to use the language for real computation and people who don’t understand floating point would probably still complain.

Edit: I’ll admit I have a pet peeve that people aren’t taught in school that decimal notation is a syntactic convenience and not an inherent property of numbers.


They also expect 1/3 + 1/3 + 1/3 == 1. Decimals won't help with that.


That's slightly different in that most programmers won't read 1/3 as "one third" but instead "one divided by three", and interpret that as three divisions added together, and the expectations are different. Seeing a constant written as a decimal invites people to think of them as decimals, rather than the actual internal representation, which is often "the float that most closely represents or approximates that decimal".



Correct! Many python users don't know about this and similar libraries that assist with data types. Numpy has several as well.


It is not a Python thing, it is a floating-point thing. You need it if you want hardware support (CPU/GPU) for non-integer arithmetic in any language. Otherwise, you have decimal, fractions, sympy, etc modules depending on your needs.

https://docs.python.org/3/tutorial/floatingpoint.html
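
(A quick sketch of the stdlib alternatives mentioned above, which trade hardware arithmetic for exactness; the values are illustrative:)

    from decimal import Decimal
    from fractions import Fraction

    # decimal: exact base-10 arithmetic
    print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))         # True

    # fractions: exact rational arithmetic, so thirds really add up to one
    print(Fraction(1, 3) + Fraction(1, 3) + Fraction(1, 3) == 1)     # True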


This is an issue for accountancy. Many numerical fields have data coming from noisy instruments so being lossy doesn't matter. In the same vein as why GPUs offer f16 typed values.


> That's not Python doing numerical stuff. That's C code, called from Python.

That's sort of a distinction without a difference, isn't it? Python can be good for numeric code in many instances because someone has gone through the effort of implementing wrappers atop C and Fortran code. But I'd rather be using the Python wrappers than C or especially Fortran directly, so it makes at least a little sense to say that Python "does numerics [...] well".

> Not unless they're pushed to, like Python was.

R and Matlab, maybe. A web server in Stata would be a horrible beast to behold. I can't imagine what that would look like. Stata is a terrible general purpose language, excelling only at canned econometrics routines and plotting. I had to write nontrivial Stata code in grad school and it was a painful experience I'd just as soon forget.


You can do web stuff in R, but it's a lot harder than it needs to be. R sucks for string interpolation, and a lot of web related stuff is string interpolation.


Yeah, I'm not surprised by that. The extent of my web experience in R is calling rcurl occasionally, so I've never tried and failed to do anything complicated.


> Not unless they're pushed to, like Python was.

Readability of code and ease of use is a big thing. It's just not about pushing hard till we make it.

edit: formating


I wouldn't want to do a web-server in MATLAB. I like MATLAB, but no, not that.


Or in some cases, FORTRAN code called from Python iirc.


> the problem is using Python at all for a web server

I don't agree with this. Maybe for a web server where performance is really going to matter down to the microsecond, and I've got no other way to scale it. I write server code in both Javascript and Python, and despite all of my efforts I still find that I can spin up a simple site in something like django and then add features to it much more easily than I can with node. It just has less overhead, is simpler, lets me get directly to what I need without having to work too hard. It's not like express is hard per se, but python is such an easy language to work with and it stays out of my way as long as I'm not trying to do exotic things.

And then it pays dividends later, as well, because it's really easy for a python developer to pick up code and maintain it, but for JS it's more dependent on how well the original programmer designed it.


The problem with Django services is the insanely low concurrency level compared to other server frameworks (including node).

Django is single-request-at-a-time with no async. The standard fix is gunicorn worker processes, but then you need the entire server process's memory * N instead of a lightweight per-request thread/struct * N for N requests.

I shudder to think that whenever Django server is doing an HTTP request to a different service or running a DB query, it's just doing nothing while other requests are waiting in the gunicorn queue.

The difference is that if you have an endpoint with queries taking 2s+ for one customer, with Django it might cause the entire service to stall for everybody, whereas with a decent async server framework the other, fast endpoints can make progress while the 2s ones are slow.


Django has async support for everything except the ORM. async db is possible without the ORM or by doing some thread pool/sync to async wrapping. A PR for that was under review last I checked.

Either way, high concurrency websites shouldn't have queries that take multiple seconds and it's still possible to block async processes in most languages if you mix in a blocking sync operation.
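
(A rough sketch of the sync-to-async wrapping mentioned above, using asgiref's sync_to_async; the Article model and the view name are hypothetical:)

    from asgiref.sync import sync_to_async
    from django.http import JsonResponse

    from myapp.models import Article   # hypothetical model

    async def article_count(request):
        # Run the blocking ORM call in a worker thread so the event loop
        # can keep serving other requests while this one waits on the database.
        count = await sync_to_async(Article.objects.count)()
        return JsonResponse({"count": count})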


You can configure gunicorn to use multiple threads to recover quite a bit of concurrency in those scenarios and that is enough for many applications.


What threading/workers configuration do you use?

I'm looking at a page now which recommends 9 concurrent requests for a Django server running on a 4 core computer.

Meanwhile node servers can easily handle hundreds of concurrent requests.


We use the ncpu * 2 + 1 formula for the number of workers that serve API requests.

I don't think in 'handling x concurrent requests' terms because I don't even know what that means. Usually I think in terms of throughput, latency distributions, and the number of connections that can be kept open (for servers that deal with web sockets).

For example, if you have the 4 core computer and you have 4 workers and your requests take around 50ms each, you can get to a throughput of 80 requests per second. If the fraction of request time spent on IO is 50%, you can bump your thread count to try to reach 160 requests per second. Note that in this case each request consumes 25ms of CPU, so you would never be able to get more than 40 requests per second per CPU whether you are using node or Python.
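
(As a sketch, that formula can live directly in the config, since gunicorn.conf.py is just Python; the worker_class and thread count below are illustrative guesses, not our production values:)

    # gunicorn.conf.py
    import multiprocessing

    workers = multiprocessing.cpu_count() * 2 + 1   # the ncpu * 2 + 1 rule of thumb
    worker_class = "gthread"                        # threaded workers
    threads = 2                                     # raise if requests are mostly I/O-bound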


This sounds like sour grapes. Python is a general-purpose language. Languages like Awk and Perl and Bash are clearly domain-specific, but Python is a pretty normal procedural language (with OO bolted on). The fact that it is dynamic and high-level does not mean it is unsuited for applications or the back-end. People use high-level dynamic languages for servers all the time, like Groovy or Ruby or, hell, even Node.js.

What about Python makes it unsuitable for those purposes other than its performance?


Totally agree that performance is not on my top 10 wish list for Python.

But I disagree on "not the right jobs for the tool".

Python is extremely versatile and can be used as a valid tool for a lot of different jobs, as long as it fits the job requirements, performance included.

It doesn't require a CS degree to know that fitting job requirements and other factors like the team expertise, speed, budget, etc, are more important than fitting a theoretical sense of "right jobs for the tool".


> It doesn't require a CS degree to know that fitting job requirements and other factors like the team expertise, speed, budget, etc, are more important than fitting a theoretical sense of "right jobs for the tool".

It requires experience.

A lot of those lessons only come after you've seen how much more expensive it is to maintain a system than to develop one, and how much harder people issues are than technical issues.

A CS degree, or even a junior developer, won't have that.


Experience does not lead to that conclusion.

Whether Python will be easier or harder to maintain depends on numerous factors that vary so much for each job that you cannot generalize upon.

That's something experience shows.

Reaching a conclusion like "Python is not the right tool for a web backend" is just naive.

No matter how experienced a developer is, reality of the world is at least 100x more diverse than what they alone could possibly have learned and experienced.

If one believes to possess the experience of everything to generalize on complex topics like this, it just shows this person could benefit from cultivating a bit more humbleness.


Python can do just about anything... but it will take its time doing it.


And many times this is negligible, which puts this out of the equation.


The world doesn't revolve around web development. It's not the only use case. Scientific Python is huge and benefits tremendously from the language being faster. If Python can be 1% faster, that's a significant force multiplier for scientific research and engineering analysis/design (in both academia and industry).


Because most of the really huge scientific Python libraries are written as wrappers over lower-level language code, I'd be curious to what extent speeding up Python by, say, 10% would speed up "normal" scientific Python code on average. 1%? 5%?


If you are talking about large sets of numbers, then the speed up will be far below 1%.
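
(A back-of-the-envelope Amdahl's law check; the 5% interpreter share is a made-up assumption, real workloads vary widely:)

    python_fraction = 0.05   # assumed share of wall time spent in the interpreter
    speedup = 1.10           # "Python gets 10% faster"

    new_time = (1 - python_fraction) + python_fraction / speedup
    print(f"overall speedup: {1 / new_time - 1:.2%}")   # ~0.46%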


I'm not sure it's very relevant, in a discussion of "how do we improve Python," for the answer to be "don't use Python." People have all kinds of valid reasons to use Python. Let's keep this on topic, please.


The folks that work on performance are not the folks working on packaging. Shall we stop their work until the packaging team gets in gear?


I agree! Here's a related point: Rust seems ideal for web servers, since it's fast and almost as ergonomic as Python for the things you listed as cumbersome in C. So, why do I use Python for web servers instead of Rust? Because of the robust set of tools Django provides. When evaluating a language, fundamentals like syntax and performance are one part. Given that web server bottlenecks are I/O limited (mitigating Python's slowness for many web server uses), and that I'd have to reinvent several wheels in Rust, I use Python for current and future web projects.

Another example, with a different take: MicroPython, on embedded. The only good reason I can think of for this is to appeal to people who've learned Python and don't want to learn another language.


So really, you're not so much writing Python as writing Django, which just so happens to be Python.


>Instead of performance, I’d like to see more effort in portability, package management, and stability for Python. Since it is often enterprise managed, juggling fifteen versions of Python (3.8.x supports native collection typing annotations, but we use 3.7.x, etc.) is my biggest complaint. Also up there is pip and just the general mess of dependencies and lack of a lock file. Performance doesn’t even make the list.

I have been leading a Python project lately and, yes, the tooling is very poor, although it is getting better. I have found poetry to be very good for venv management and lock files, plus having one file for all your config.


Pip has a decent solution for lock files:

https://pip.pypa.io/en/stable/user_guide/#constraints-files


>Python shines beautifully as a (somewhat) cross platform scripting language

Python is much more than just a scripting language. I remember attending this talk[1] a few years ago about JPMorgan's 35 million LOC Python codebase. Python is being used to build seriously large software nowadays, and I don't think performance is ever a minor issue. It should always be in the top 3 concerns for any general purpose language because it directly translates into development speed, time, and money.

[1]https://youtu.be/ZYD9yyMh9Hk


> Also up there is pip and just the general mess of dependencies and lack of a lock file.

You can use pyproject.toml or requirements.txt as lock files; Poetry can use the former, and poetry.lock files as well.


> and lack of a lock file

Is it possible to solve your problem using pip freeze?


Agreed, my only use for Python since version 1.6 is portable shell scripting, or when sh scripts get too complicated.

Anything beyond that, there are compiled languages with REPL available.


What compiled languages do you have in mind? I suppose technically there are repls for C or Rust or Java, but I wouldn't consider them ideal for interactive programming. Functional programming might do a bit better -- Scala and GHCi work fine interactively. Does Go have a repl?


Java, C#, F#, Lisp variants, and C++.

Eclipse has had Java scratchpads for ages, Groovy also works for trying out ideas, and nowadays we have jshell.

F# has a REPL in ML linage, and nowadays C# also shares a REPL with it in Visual Studio.

Lisp variants, going at it for 60 years.

C++, there are hot reload environments, scripting variants, and even C and C++ debuggers can be quite interactive.

I used GDB in 1996, alongside XEmacs, as poor man's REPL while creating a B+Tree library in C.

Yes, there are Go interpreters available,

https://github.com/traefik/yaegi


Particle physicists have been using interpreted C++ for "macros" forever: first using the terrible hack of CINT, now using Cling, which is quite good.


Indeed, although I remember there used to be some commercial ones as well, from ads on The C/C++ Users' Journal and Dr. Dobbs.


> compiled languages

Might be tripping you up. Very few languages require that implementations be compiled or interpreted. For most languages, having a compiler or interpreter is an implementation decision.

I can implement Python as an interpreter (CPython) or as a compiler (mypyc). I can implement Scheme as an interpreter (Chicken Scheme's csi) or as a compiler (Chicken Scheme's csc). The list goes on: Standard ML's Poly/ML implementation ships a compiler and an interpreter; OCaml ships a compiler and an interpreter.

There are interpreted versions of Go like https://github.com/traefik/yaegi. And there are native-, AOT-compiled versions of Java like GraalVM's native-image.

For most languages there need be no relationship at all between compiler vs interpreter, static vs dynamic, strict or no typing.


During Perl’s hegemony as The Glue Language, I feel like the folk wisdom was:

“Performance is a virtue; if Perl ceases to be good enough, or you need to write ‘serious’ software rewrite in C.”

And during Python’s ascension, the common narrative shifted very slightly:

“Performance is a virtue, but developer productivity is a virtue too. Plus, you can drop to C to write performance critical portions.”

Then for our brief all-consuming affair with Ruby, the wisdom shifted more radically:

“Developer productivity is paramount. Any language that delivers computational performance is suspect from a developer productivity standpoint.”

But looking at “high-level” languages (i.e. languages that provide developer productivity enhancing abstraction), we can rewind the clock to look at language families that evolved during more resource-constrained times.

Those languages, the lisps, schemes, smalltalks, etc. are now really, really fast compared to Python, and rarely require developers to shift to alternative paradigms (e.g. dropping to C) just to deliver acceptable performance.

Perl and Python exploded right at the time that Lisp/Scheme hadn’t quite shaken the myth that they were slow, with Python/Perl achieving acceptable performance by having dropped to C most of the time.

Now the adoption moat is the wealth of libraries that exist for Python—and it’s a hell of a big moat. If I were a billionaire, I’d hire a team of software developers to systematically review libraries that were exemplars in various languages, and write / improve idiomatic, performant, stylistically consistent versions in something modern like Racket. I’d like to imagine that someone would use those things :-)


Perl/Python/Ruby grew up in the 90s, the "Bubble economy" of the single-core performance world, the likes of which had never been seen before and probably will never be seen again on the face of the Earth. In the post-Bubble world, throwing out 90% of your performance before you even start writing code, especially when the same dynamic features could be delivered via JIT without the cost, seems crazy.


So true, excellent point! I just do not understand startups choosing Python/Ruby in 2022 when you can get most of the features, type safety, concurrency, async and 5 times more speed in other languages.


I don't think it is such a surprise. The ecosystems around Rails (for Ruby) and numpy/pandas/etc (for python) are orders of magnitude larger than you get in the modern languages. In Rails for example, adding an entire user management system (including niceties like password reset mails and must-haves like proper security for obscure vulnerabilities most people will have never heard of) is literally a single extra line in the gemfile and two console commands. In python the ML and numerics ecosystem are completely beyond anything another language has to offer at the moment, even more so when you compare the time to get started.

In addition, "real" performance is often tricky to measure and may be irrelevant compared to other parts of the system. Yes, Ruby is 10-100x slower than C. But if a user of my web service already has a latency of (say) 200ms to the server then it barely matters if the web service returns a response in 5 ms or in 0.5 ms. Similarly for rendering an email: no user will notice their email arriving half a second earlier. Similarly for a python notebook: if it takes 1 or 2 seconds to prepare some data for a GPU processing job that will take several hours, it doesn't really matter that the data preparation could have been done in 0.1 seconds instead if it had been done in Rust.

Especially for startups where often you're not sure if you're building the right thing in the first place, a big ecosystem of prebuilt libraries is super important. If it turns out people actually want to buy what you've made in sufficient numbers that the inefficiency of Ruby/Python/JS/etc becomes a problem then you can always rewrite the most CPU intensive parts in another language. Most startup code will never have the problem of "too many users" though, so it makes no sense to optimize for that from the start.


> are orders of magnitude larger than you get in the modern languages.

This is plain wrong: Kotlin has access to the whole JVM/Java ecosystem. In fact, in theory you can even access Python libraries via GraalPython. Machine learning is one of the only niches that is language-specific; while it's true that there are major Java deep learning libraries (such as the Stanford parser), they are increasingly obsolete vs Python.


Well, if you choose Python/Ruby you only need a third of the developers you would need with another language.

The productivity gain is so great it outweighs everything else. It's as simple as that.


For startups it's a matter of rapidly developing an MVP to validate your business and iterating quickly.

Developer salaries are way higher than CPU and server costs. Productivity wins out here vs. performance.


> Those languages, the lisps, schemes, smalltalks, etc.

The main reason those languages got fast despite being highly dynamic is because of very complex JIT VM implementations. (See also: JavaScript.)

The cost of that is that a complex VM is much less hackable and makes it harder to evolve the language. (See also: JavaScript.)

Python and Ruby have, I think, reasonably chosen to have slower simpler implementations so that they are able to nimbly respond to user needs and evolve the language without needing massive funding from giant corporations in order to support an implementation. (See also: JavaScript.)

There are other effects at play, too, of course.

Once your implementation's strategy for speed is "drop to C and use FFI", then it gets much harder to optimize the core language with stuff like a JIT and inlining because the FFI system itself gets in the way. Not having an FFI for JS on the web essentially forced JavaScript users to push to make the core language itself faster.


Spending a weekend or two writing a Scheme that beats Python in performance has been a pastime for computer science students for at least a couple decades now. I'm not sure that I believe that a performant Scheme implementation has more complexity than e.g. PyPy. In fact, I'd wager the converse.


Sure, but that's because Python has objects.

If you write an object system on top of your performant hobby Scheme implementation, you'll likely find that the performance of its method dispatch is about as slow as it is in Python. Probably even slower.

Purely procedural Python code isn't as slow as object-oriented Python code.
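
(A rough illustration of that dispatch overhead with timeit; the class and function names are made up, and absolute numbers are machine-dependent, so treat them as indicative only:)

    import timeit

    class Adder:
        def add(self, a, b):
            return a + b

    def add(a, b):
        return a + b

    adder = Adder()

    # Attribute lookup + bound-method call vs. a plain function call.
    print(timeit.timeit("adder.add(1, 2)", globals=globals(), number=1_000_000))
    print(timeit.timeit("add(1, 2)", globals=globals(), number=1_000_000))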


That's fair, but the fact that we're comparing hobby Scheme implementations to two mainstream, extremely popular implementations of Python, and setting up conditions that force (hobby) Scheme to play to Python's relative strengths, is itself telling. :-)

The Python ecosystem has certainly received a lot of developer resources and attention the past couple of decades. Shall we compare the performance of CLOS on SBCL, which again has seen comparatively little developer resources, to Python's performance in dealing with objects? I'd take that performance wager.


This isn’t as much of a gotcha as you think. Python is slow because the language is so dynamic and simply has to do more behind the scenes work on each line. It’s not impressive that a language that does less is faster. What’s impressive is that a language that does more, like JS on V8, is faster.


Is CLOS doing less than Python?

I'm thinking CLOS has more dynamism than Python: they're both dynamically typed, they're both doing a lookup then dispatch, but then CLOS adds dynamism on top of that. It's also looking in the metadata thingy (I'm not a Lisp developer, do they call it the hash? I mean the key-value store on every "atom"; I'm so out of my depth here, is atom the right word?), plus, if I remember right, the way CLOS works you use multiple dispatch, not just single dispatch like Python.


The CLOS is more dynamic than python. You can do things like specialize a method on multiple types based on the runtime types (that is to say, the method conceptually belongs to the intersection of the classes, not any single class).

I like python (especially things like comprehensions), but to say it's more dynamic than common lisp is a little insane.


CL has many constraints that make it less dynamic to "minimize the observable differences between compiled and interpreted programs" (CLHS, 3.2.2.3 Semantic Constraints). For example, if f is defuned in a file, throughout that file you can assume (f x) refers to that f and does not have to be looked up dynamically at runtime.

If I recall correctly, method calls in CLOS are also not syntactically distinguished from regular calls either (like they are in x.f() languages), so there is no motivation to write a method unless you actually want the dynamic dispatch methods provide. Methods are fairly rare compared to regular functions.


> For example, if f is defuned in a file, throughout that file you can assume (f x) refers to that f and does not have to be looked up dynamically at runtime.

Unless you declare them "notinline".

http://www.lispworks.com/documentation/HyperSpec/Body/d_inli...

> so there is no motivation to write a method unless you actually want the dynamic dispatch methods provide.

Methods have some different functionality in CL than in most languages. For example, you can have the methods from the entire inheritance hierarchy (well, the classes that have had that method specialized for them) all be called as, essentially a pipeline of methods.

https://lispcookbook.github.io/cl-cookbook/clos.html#method-...

But this is a discussion about dynamism of the object system, and most of that is defined before runtime. How about changing an object's class at runtime? https://www.cliki.net/site/HyperSpec/Body/stagenfun_change-c...


Smalltalk,

    a become: b
Now all references to a and b across the complete image are changed, invalidating all assumptions made at every call site sending messages to each of them.


You're either exaggerating or the computer science students you're familiar with are wizards. I've never known the student who could write a Scheme implementation, from scratch, in one weekend that is both complete and which beats Python from a performance perspective.


If it's an exaggeration, it's not much of one.

Two parts to your argument:

- Writing a Scheme implementation quickly: Google "Write a Scheme in 48 hours" and "Scheme from scratch." 48 hours to a functioning Scheme implementation seems to be a feat replicated in multiple programming languages.

- Performance: I haven't benchmarked every hobby Scheme, but given the proliferation of Scheme implementations that, despite limited developer resources, beat (pure) Python with its massive pool of developers (CPython, PyPy), I still don't buy the idea that optimizing Scheme is a harder task than optimizing Python. Again, I'd strongly suggest that optimizing Scheme is a much easier task than optimizing Python, simply by virtue of how often the feat has been accomplished.


If you can give me an implementation that implements almost all of R5RS, in 48 hours, beating Python in performance, and all by a single developer, I’ll tip my hat to that guy or gal. But I can’t imagine it’s too commonly done.


Nobody said you can implement a full Scheme implementation in 48 hours or two weeks. That's very much beside the point about how poor CPython performance is.


> Nobody said you can implement a full Scheme implementation in 48 hours or two weeks.

Fair enough, you're right. But if we're only talking about incomplete Scheme implementations it's not a very interesting claim. As I pointed out in another comment, even I could write a fast Scheme implementation in 48 hours if I kept my scope very limited. That doesn't say much about Scheme performance overall or how it relates to Python.


Well let's flip this around: do you think you could write a performant minimal Python in a weekend? Scheme is a very simple and elegant idea. Its power derives from the fact that smart people went to considerable pains to distill computation to a limited set of things. "Complete" (i.e. rXrs) schemes build quite a lot of themselves... in scheme, from a pretty tiny core. I suspect Jeff Bezanson spent more than a weekend writing femtolisp, but that isn't really important. He's one guy who wrote a pretty darned performant lisp that does useful computation as a passion project. Check out his readme; it's fascinating: https://github.com/JeffBezanson/femtolisp

You simply can't say these things about Python (and I generally like Python!). It's truer for PyPy, but PyPy is pretty big and complex itself. Take a look at the source for the scheme or scheme-derived language of your choice sometime. I can't claim to be an expert in any of what's going on in there, but I think you'll be surprised how far down those parens go.

The claim I was responding to asserted that lisps and smalltalks can only be fast because of complex JIT compiling. That is trueish in practice for Smalltalk and certainly modern Javascript... but it simply isn't true for every lisp. Certainly JIT-ed lisps can be extremely fast, but it's not the only path to a performant lisp. In these benchmarks you'll see a diversity of approaches even among the top performers: https://ecraven.github.io/r7rs-benchmarks/

Given how many performant implementations of Scheme there are, I just don't think you can claim it's because of complex implementations by well-resourced groups. To me, I think the logical conclusion is that Scheme (and other lisps for the most part) are intrinsically pretty optimizable compared to Python. If we look at Common Lisp, there are also multiple performant implementations, some approximately competitive with Java which has had enormous resources poured into making it performant.


What is it that you think makes a full Scheme implementation as slow as CPython?


I don’t think a full Scheme implementation is as slow as Python in general. What I’m hung up on is the claim that it’s so absolutely trivial to write a language implementation faster than Python that basically anybody at any skill level could do it in a weekend, and still have time for Sunday afternoon bocce.


Start with,

"An Incremental Approach to Compiler Construction"

http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf


I would not include PyPy in a list of easy to beat implementations.


Nor ones with massive pools of developers.


Compared to most Scheme implementations?


Substitute computer science student with "developer" and it holds for me. Definitely some CS students can do it too. Actually at my school we did have to implement a Scheme compiler. So yeah it's not too big of a stretch to say.

I think people who haven't implemented a language underestimate how slow CPython is. And overestimate how hard it is to build a compiler for a dynamic language.

I think every professional developer or CS student can and should build a compiler for a dynamic language!


But the claim was that a student could write a conformant Scheme implementation in 48 hours that beats Python. Clearly it’s possible for a student to write a Scheme that’s faster than Python, but is it a reasonably complete Scheme done in a single weekend?

Even I, very much a non-computer scientist, could write a fast Scheme quickly if I could keep myself to a very small subset, so that’s not interesting to me.


Conformant is a word you introduced, they didn't say that.


Only if they aren't in degree programs where compiler design is part of the curriculum, which, if that's the case, says something about the quality of the institution.


A reasonably complete, fast language implementation, from scratch, in one weekend, though? By someone who was only introduced to compilers within the last few months? I don’t believe most students would be capable of that, but I’m not a computer scientist so what do I know.


Yes, Scheme is quite simple; no one is talking about implementing full R7RS.

Compared with the current state of affairs in CPython, even basic code-generation algorithms would produce machine code that is fast enough.

An introduction to compilers also means writing one along the way, unless it is a lousy degree.


> Python and Ruby have, I think, reasonably chosen to have slower simpler implementations so that they are able to nimbly respond to user needs and evolve the language without needing massive funding from giant corporations in order to support an implementation. (See also: JavaScript)

I'm not really sure I buy that argument for why JavaScript evolved slowly for a while. It seems more likely to me that it was due to the combination of basically being required to be 100% backwards compatible (no one making a browser wants to be the only one who doesn't work for a certain website), having no single canonical implementation (which means that all the common browsers would have to agree on a spec for something to be added or changed), and not really being used in the server context widely before Node. Python and Ruby both sacrificed backwards compatibility at times in their history, and both had one central implementation which others tended to draw from, so it was easier to get changes made.


Agreed, that isn't the reason JavaScript in particular was essentially motionless for about a decade. As you note, that was largely due to political factors and disagreement about language direction among the various competing implementers.

But compared to say Smalltalk, Scheme, and Lisp, I think Ruby and Python have been able to evolve incrementally in part because of their simpler more hackable implementations.


Similarly, why shouldn't we think that business reasons left Smalltalk motionless after its release from ParcPlace?

Hence traits for Squeak but not for the commercial Smalltalks.


> Python and Ruby have, I think, reasonably chosen to have slower simpler implementations…

?

https://shopify.engineering/yjit-just-in-time-compiler-cruby


Yes, CRuby is slowly moving towards a JIT now because performance is a major blocker for user adoption.

The larger Python ecosystem has tried that a number of times too (Unladen Swallow, PyPy, etc.)

It's quite difficult since both of those languages already lean heavily on C FFI and having frequent hops in and out of FFI code tends to make it harder to get the JIT fast. JITs work best when all of the code is in the host language and can be optimized and inlined together.


Javascript the language seems to have evolved much more than Python despite CPython's very simple implementation.


Hence my point about "massive funding from giant corporations in order to support an implementation". :)


Well almost all the JavaScript language innovation was syntax sugar and was implemented as transforms before the browsers implemented it. I think JavaScript devs mostly were fine to keep using transforms indefinitely and it's just been more convenient that the browsers have moved to implement it.

Python could have done this easily too but evolving as a language just isn't as big a priority (not that I'm saying it should be) and that's completely (or mostly) disconnected from their backend implementation decisions.


This sounds a lot like what some Python package developers are trying with Rust (example being the cryptography package), which also has the unfortunate side effect of limiting support for some less popular platforms.


My remembering of Python is, “developer experience is paramount; if you need more performance use PyPy.”

To increase Python performance, you only need more Python!


Is it gauche to offer my own counterpoint?

Another possibility is that the requirement to "drop to C" is a virtue by de-democratizing access to serious performance. In other words, let the commoners eat Python, while the anointed manage their own memory.

I personally find this argument a bit distasteful / disagree with it, but there was a thread the other day that talked about the, uh, variable quality of code in the Julia ecosystem (Julia being another language where dropping to C isn't important for performance). In Julia, the academics can just write their code and get on with their work—the horror!


This is a great read, and it's fantastic to see all the work being done to evaluate and improve the language!

The dynamic nature of the language is actually something that I had studied a few years back [1]. Particularly the variable and object attribute lookups! My work was just a master's thesis, so we didn't go too deep into more tricky dynamic aspects of the language (e.g. eval, which we restricted entirely). But we did see performance improvements by restricting the language in certain ways that aid in static analysis, which allowed for more performant runtime code. For those interested, the abstract of my thesis [2] gives more insight into what we were evaluating.

Our results showed that restricting dynamic code (code that is constructed at run time from other source code) and dynamic objects (mutation of the structure of classes and objects at run time) significantly improved the performance of our benchmarks.

There was also some great discussion on HN when I had posted our findings as well [3].

[1]: https://github.com/joncatanio/cannoli

[2]: https://digitalcommons.calpoly.edu/theses/1886/

[3]: https://news.ycombinator.com/item?id=17093051


But we did see performance improvements by restricting the language in certain ways that aid in static analysis, which allowed for more performant runtime code.

Well, yes. In Python, one thread can monkey-patch the code in another thread while it is running. That feature is seldom used. In CPython, the data structures are optimized for that: underneath, everything is a dict. This kills most potential optimizations, and even rules out straightforward native-code generation.
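To make that concrete, here's a toy illustration of my own (not from the article) of rebinding behaviour at runtime, even from another thread; both the class and the stdlib module are just dicts underneath:

    import json, threading

    class Greeter:
        def hello(self):
            return "hello"

    def patch():
        # Rebinding a method is just a dict update on the class object.
        Greeter.hello = lambda self: "patched"
        # Even stdlib module attributes are plain entries in the module's dict.
        json.dumps = lambda obj, **kw: '"patched"'

    g = Greeter()
    print(g.hello())             # hello
    t = threading.Thread(target=patch)
    t.start(); t.join()
    print(g.hello())             # patched -- same object, new behaviour
    print(json.dumps({"a": 1}))  # "patched" -- the stdlib function is gone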

It's possible to deal with that efficiently. PyPy has a compiler, an interpreter, and something called the "backup interpreter", which apparently kicks in when the program being run starts doing weird stuff that requires doing everything in dynamic mode.

I proposed adding "freezing", immutable creation, to Python in 2010, as a way to make threads work without a global lock.[1] Guido didn't like it. Threads in Python still don't do much for performance.

[1] http://www.animats.com/papers/languages/pythonconcurrency.ht...


> This kills most potential optimizations, and even rules out straightforward native-code generation.

It doesn’t - this has been a basically solved problem since Self and deoptimisation were invented.


In theory, yes. In CPython, apparently not. In PyPy, yes.[1] PyPy has to do a lot of extra work to permit some unlikely events.

[1] https://carolchen.me/blog/jits-impls/


You’re trying to correct me by posting my own mentee’s blog post at me.


As a very partial, almost unrelated question: is there any Python module that you use day-to-day that you'd like to see a significant speedup in?

I'm thinking of reimplementing some Python modules in Rust, as that seems like the kind of weird thing I'm into. I've done it with some success (using the excellent work of the pyo3 project) professionally, but I'd be interested in doing more.


Definitely matplotlib. Navigating image plots in interactive mode with even just 10000x10000 pixels is painfully slow. While I've picked up some alternatives, they don't feel as clean as matplotlib.


10000% -- matplotlib is incredibly slow for visualizing a lot of the different data I've looked at, especially things like high-res images in machine-learning contexts, even on good computers. It does fine for small vector stuff and render-once-and-save graphs, but it's bad for what a lot of people use it for.


Pydantic is quite a popular library. Its author is doing exactly this: rewriting its core [0] in Rust. It's still WIP, but the readme mentions that "Pydantic-core is currently around 17x faster than Pydantic Standard."

[0] https://github.com/samuelcolvin/pydantic-core


Pydantic is not just popular (and awesome) on its own but serves as the underpinnings of a lot of the FastAPI functionality - faster Pydantic would make a LOT of apps faster


You’d be awesome if you wrote a library for large image processing.

You can make large Numpy arrays fine — eg, 20k x 20k or 500k x 500k, but trying to render that to anything but SVG or manual tilings pukes badly.

That’s my main blocker on rendering high dimensional shapes: you can do the math, but visualizations immediately fall over (unless you do tiling yourself).

There’s probably someone with a more useful idea than “gigapixel rendering” though.


Not working in Python right now, but I have 15 years of Python + Django on the web, and while there are any number of attempts at this (I keep a list at https://pinboard.in/u:tclancy/t:json/t:python/), any improvement in JSON serialization and deserialization speeds is a huge boon to projects. I am trying to think of similar bottlenecks where a drop-in replacement can be a huge performance improvement.


The missing thing last time I looked was a fast python json library that's byte-compatible with stdlib -- same inputs, same outputs. There are good fast options but they tend to add some (perfectly reasonable) limitation like fixed indentation size, for the sake of speed, that blocks them from being dropped into an existing public API.


pandas


It's been done: https://github.com/pola-rs/polars

But I'm sure there's always room for improvement


It’s not like polars is a drop-in replacement, it has a totally different API.


You wrote "it has a totally different API", did you mean "it has an actually sane API?" Because that's what I think of when I compare pandas to polars.


Not if you have legacy apps using pandas everywhere.

Same API means I can import polars as pd and be done with it.


This is a curious reply for me. I would think that there are very few parts of pandas that could be sped up by reimplementing them in a compiled language. Pandas is plenty fast for the built-in methods; it only gets slow when you start interfacing with Python, e.g. by doing an `.apply` with your custom Python method. Obviously this interfacing part is impossible to speed up by reimplementing parts of pandas (you'd need a different API instead).
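As a rough illustration of that boundary (timings will vary by machine; this just shows the shape of the comparison):

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"a": np.random.rand(1_000_000)})

    # Built-in vectorized path: the loop runs in compiled code inside pandas/numpy.
    fast = df["a"] * 2 + 1

    # Same computation via .apply: one Python-level function call per element,
    # which is the part no amount of rewriting pandas internals can remove.
    slow = df["a"].apply(lambda x: x * 2 + 1)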


I remember, when trying to squeeze some performance out of it, that a lot of the overhead came from it trying to infer types.



The answer would then be to have a look at polars.


If you are building server-side applications using Python 3 and async APIs and you aren't using https://github.com/MagicStack/uvloop, you are missing out on performance big time.

Also, if you happen to build microservices, don't forget to try PyPy; that's another easy performance booster (if it's compatible with your app).
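For context, adopting it is typically a one-line change (a minimal sketch; uvloop.install() swaps in the libuv-based event loop for asyncio):

    import asyncio
    import uvloop

    async def main():
        # ... the existing asyncio application code, unchanged ...
        await asyncio.sleep(0)

    uvloop.install()     # replace the default asyncio event loop policy
    asyncio.run(main())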


> if it's compatible with your app

Every time I experiment with PyPy (on a set of non-trivial web services) I encounter at least one incompatibility with PyPy in the dependency tree and leave disappointed.


As I see it, Python is good for glue code and small scripts where performance usually doesn't matter. Even if it would be more performant, it would be a nightmare for large code bases since it's dynamically typed.

I really enjoy Nim which is "slick as Python, fast as C".


You wouldn’t believe how many near-FAANGs have hundreds of large backend services in Python without any issues, dating from times when typing was in docstrings.


Because they have insane amounts of money that they can throw at the machines.


I once had a database-backed website serving 50k unique visitors/day written in Django and hosted on a low-budget VPS. It worked like a charm with very few hiccups.


Right? I do not understand why the comparison of Python is always, "But when you hit 10 million daily users, you are really going to be feeling the scaling pain." You can hit a very serious audience size before ever having to worry about performance characteristics at all. Computers are fast.


I was curious, so I had a bash at comparing the cost of just buying another server to throw at the problem vs. telling a FAANG dev to optimise the code.

A dedicated 40core / 6Tb server is around $2k but will be amortized over the years of its life. It needs power, cooling, someone to install it in a rack, someone to recycle it afterwards, ..., around $175/yr

A FAANG dev varies wildly but $400k seems fair-ish (given how many have TC > 750k).

So that's about 12 hours of time optimising the code vs throwing another 40c / 6Tb machine at the problem for 365 days.

The big cost I'm missing out of both the server and the developer is the building they work in. What's the recharge for a desk at a FAANG, $150k/yr? I have no idea how much a rack slot works out at.

Unless I've screwed up the figures anywhere, we should probably all be looking at replacing Python with Ruby if we can squeeze out more developer productivity!


Adding hardware doesn't improve single-request performance, so slow stacks can require a bunch of optimizing or caching work that wouldn't be needed on a faster one. At some point it also impacts productivity when the test suite is slow, the app takes a long time to restart, etc.


Sure it does: my home-lab servers have single-thread performance approximately half that of a server today.

What’s the ping latency from US East to Europe? 80ms-ish? What’s a roundtrip to postgres with a regular business app type query, 20ms-ish? What’s the latency on a beefy rails app’s request handling, 40ms?

We’re talking 140ms best case for a slow stack. What can you get that down to with tuning work?

When your user comes along on their 4g connection with 800ms latency, will they be able to tell the difference?

Don’t get me wrong, I’d far rather invest time in making the stack efficient but from a business point of view, it might not make sense vs. just throwing hardware at the problem and spending your expensive engineering resources on making it possible to deliver more utility to customers.


> Python can quickly check to see if they are using the dynamic features

I don't understand how this is supposed to be "quickly" verifiable?

Nothing prevents you from doing eval('gl' + 'obals')()['len'] = ...; how is the interpreter supposed to quickly check that this isn't the case when you're calling a function that might not even be in the current module?

Doing this correctly would seem to require a ton of static analysis on the source or bytecode that I imagine will at best be slow, and at worst impossible due to the halting problem.


Hum... You are getting lost on theoretical undecidability.

In the real world, when faced with a generally undecidable problem, we don't run away and lose all hope. We decide the cases that can be decided, and do something safe when they can't be decided.

In your example, Python can just re-optimize everything after an eval. That doesn't stop it from running optimized code if the eval does not happen. It can do even more and only re-optimize things that the eval touched, which has some extra benefits and costs, so it may or may not be better.

Besides, when there isn't an eval in the code, the interpreter can just ignore the possibility entirely.


> You are getting lost on theoretical undecidability. [...] We decide the cases that can be decided, and do something safe when they can't be decided.

I'm not lost on that at all; I'm well aware of that. that's precisely why I wrote

>> [...] require a ton of static analysis on the source or bytecode that I imagine will at best be slow, and at worst impossible due to the halting problem

and not

>> static analysis is impossible in the general case so we run away and lose all hope.

I'm not sure how you read that sentiment from my comment.


Hum... Ok. Then the answer is that most cases do not demand as much analysis time as you expect, and the ones that demand more still can gain something from dynamic behavior analysis in a JIT.

Also, you can combine the two to get something better than any single analysis alone.


I don't think the world is quite so bad.

x86 processors solve this by speculating about what's going on. If you suddenly run into a 1976-era operation, everything slows down dramatically for a bit (but still goes faster than an 8086). If you have a branch or cache miss, things slow down a little bit.

One has a few possibilities:

- A static analysis /proves/ something. print is print. You optimize a lot.

- A static analysis /suggests/ something. print is print, unless redefined in an eval. You just need to go into a slow path in operations like `eval`, so if print is modified, you invalidate the static analysis.

- A static or dynamic analysis suggests something probabilistically. You can make the fast path fast, and the slow path eventually work. If print isn't print, you raise an internal exception, do some recovery, and get back to it.

I'm also okay with this analysis being run in prod and not in dev.

As a footnote, JITs, especially in Java, show that this kind of analysis can be pretty fast. You don't need it to work 100% of the time. The case of a variable being redefined in a dozen places, you just ignore. The case where I call a function from three places which increments an integer each time, I can find with hardly any overhead at all. The latter tends to be where most of the bottlenecks are.


I was reading this as an undetailed description of state available WITHIN the interpreter. Probably there is a table of globals that you can simply check last modification on or something like this. Whether you hit it with eval or some other tricky code, you can't modify a global without the interpreter knowing about it.


If that's what they mean, how would that be any faster than what's going on right now? I thought normally when you hit a callable, the interpreter would just look up its name, check to see if it's a built-in, and then call the built-in if so... whereas in this case you'd still have to look up the name of the callable (is the idea to bypass this somehow? what do they do currently?), check to see if it's different than the built-in you'd expect from the name (i.e. if it's ever been reassigned to), then call that expected built-in if it's not... which seems like the same thing? At best it would seem to convert 1 indirect call to a direct call, which would be negligible for something like Python. Is the current implementation somehow much slower than I'm imagining? What am I missing?


You could do something like a primitive inline cache. Store a "version" of the globals in another variable, and bump the version each time the globals are modified. At each call site, cache what the global name resolved to along with the version of the globals object it was resolved against. Now you can skip name resolution if the version hasn't changed between two executions of that line, and the fast path costs a single, easily predicted compare-and-jump (globals almost never change) instead of a full hash-table lookup.
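A toy sketch of that idea in plain Python (all names here are invented for illustration; CPython's actual implementation tracks dict versions in C rather than like this):

    globals_version = 0      # bumped on every write to the globals mapping
    _site_cache = {}         # call-site id -> (resolved value, version seen)

    def set_global(env, name, value):
        global globals_version
        env[name] = value
        globals_version += 1          # invalidates every cached lookup

    def cached_lookup(site_id, env, name):
        hit = _site_cache.get(site_id)
        if hit is not None and hit[1] == globals_version:
            return hit[0]             # fast path: one compare, no string hashing
        value = env[name]             # slow path: real dict lookup
        _site_cache[site_id] = (value, globals_version)
        return value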


I think the core of the optimization you're mentioning hinges on a normal lookup being a slow hashtable lookup (of a string?)... whereas I imagined the first thing the interpreter would do would be to intern each name and assign it a unique ID (as soon as during parsing, say) and use that thereafter whenever they're not forced to use a string (like with globals()). That integer could literally be a global integer index into a table of interned strings, so you could either avoid hashing entirely (if the table isn't too big) or reduce it to hashing an int, both of which are much faster than hashing a string. Do they not do that already? Any idea why? I feel like that's the real optimization you'd need if checking a key in a hashtable is the slow part (and it's independent of whether the value is being modified).


> I don't understand how this is supposed to be "quickly" verifiable?

You don’t verify, and instead you run assuming no verification is needed. Then if someone wants to violate that assumption, it’s their problem to stop everyone who may have made that assumption, and to ask them to not make it going forward.

You shift the cost to the person who’s doing the metaprogramming and keep it free for everyone who isn’t.

https://chrisseaton.com/truffleruby/deoptimizing/


Python dictionaries now have version counters that track how many times they were modified, so the quick check is to ask "was len not overridden last time, and is the number of modifications to the globals the same as it was last time?"


One possibility is to move the cost to the assignment, so the code that assigns a new value to the global 'len' function is going to track and invalidate all cached lookups. Hopefully you are changing the binding of 'len' less often than you are calling it :)


Cinder does this (invalidation), and both Faster CPython and Pyston use guarding.


Right, of course, guarded devirtualization is a common technique.


Why not switch to making __slots__ in classes the default and then making attribute changes to an object during runtime an opt-in? It will require a long grace period but wouldn't it help optimisation efforts immensely?


That's going to require quite a lot of changes; it's a giant breaking change. Every class would need someone to go around finding all the attributes that are created and adding a __slots__ declaration, to avoid regular attribute initialization in __init__ failing. It's a massive task, and it would completely break backwards compatibility for performance gains that not everybody needs.


That would mean all installed dependencies need to comply with this change as well, which is unlikely to happen in any realistic timeframe.


default __slots__ breaks a lot of monkey patching.

An "easier" change would be to add a class attribute "no__dict__", which says that the __dict__ attribute can't be used, which lets the implementation do whatever it wants. That can be incrementally added to classes.

Another option is a "no__getattr__" attribute, which disables getattr and friends.


Where can I read about what kind of performance improvements `__slots__` brings?


The Python docs themselves is a good place to start: https://docs.python.org/3/reference/datamodel.html#slots

The Python wiki also has some good info about it: https://wiki.python.org/moin/UsingSlots
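A small, standard-Python example of what the restriction looks like in practice:

    class Point:
        __slots__ = ("x", "y")   # fixed attribute layout, no per-instance __dict__

        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = Point(1, 2)
    p.x = 10                     # fine: declared slot
    try:
        p.z = 3                  # not declared, so this raises AttributeError
    except AttributeError as e:
        print(e)
    # The win is mostly memory per instance, plus somewhat faster attribute access.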


Apart from the official docs, this video also explains the low level data layout (in CPython) that slots introduce: https://www.youtube.com/watch?v=Iwf17zsDAnY


"Those techniques are based on the idea that most code "does not use the full dynamic power that it could at any given time" and that Python can quickly check to see if they are using the dynamic features."

If anyone has a burning desire to try to write the next big dynamically-typed scripting language, I've often noodled in my head with the idea of a language that has a dynamically-typed startup phase, but at some point you call "DoneBeingDynamic()" on something (program, module, whatever, some playing would have to be done here) and the dynamic system basically freezes everything into place and becomes a static system. (Or you have an explicit startup phase to your module, or something like that.)

The core observation I'm driving this on is much the same as the quote I give from the article. You generally set up the vast majority of your "dynamicness" once at runtime, e.g., you set up your monkeypatches, you read the tables out of the DB to set up your active classes, you read the config files and munge together the configurations, etc. But then forever after, your dynamic language is constantly reading this stuff, over and over and over and over again, millions, billions, trillions of times, with it never changing. But it has to be read for the language to work.

Combine that with perhaps some work on a system that backs to a struct-like representation of things rather than a hash-like representation, and you might be able to build something that gets, say, 80% of the dynamicness of a 1990s-era dynamic scripting language, while performing at something more like compiled language speeds, albeit with a startup cost. If you could skip over the dozens of operations resolving

     x.y.z.q = 25
a dynamically-typed language like Python needs to properly implement that and get down to a runtime that can do the same thing compiled languages do by pre-computing the offset into a struct and just setting the value, you might get near static-language performance with dynamic typing affordances.

You can also view this as a Lisp-like thing that has an integrated phase where it has macros, but then at some point puts this capability down.

I tend to think it's just fundamentally flawed to take a language that is intrinsically defined as "x.y.z.q" requiring dozens of runtime operations versus trying to define a new one where it is a first-class priority from day one that the system be able to resolve that down to some static understanding of what "x.y.z.q" is. e.g., it's OK if y is a property and z is some fancy override if the runtime can simply hardcode the relevant details instead of having to resolve them every time. You can outrun even JIT-like optimizations if you can get this down to the point where you don't even have to check incoming types, you just know.
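For what it's worth, a crude approximation of the idea is already expressible in today's Python; the sketch below (names and behaviour invented for illustration) only blocks new attributes, whereas the language being imagined would freeze whole modules and lookup paths:

    class Freezable:
        _frozen = False

        @classmethod
        def done_being_dynamic(cls):
            cls._frozen = True

        def __setattr__(self, name, value):
            if self._frozen and not hasattr(self, name):
                raise AttributeError(f"layout frozen; cannot add {name!r}")
            super().__setattr__(name, value)

    class Config(Freezable):
        pass

    cfg = Config()
    cfg.db_url = "sqlite://"      # startup phase: anything goes
    Config.done_being_dynamic()   # the hypothetical DoneBeingDynamic()
    cfg.db_url = "postgres://"    # still fine: existing attribute
    # cfg.timeout = 30            # would now raise AttributeError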


This may be a naive question (I have very little knowledge about building languages and compilers): Would this be possible in Python by introducing a keyword like `final`? Any object, variable, method that is marked final just has to be looked up once by the interpreter, the re-fetching the article describes doesn't have to happen again. Trying to change a final thing results in an exception.


>I've often noodled in my head with the idea of a language that has a dynamically-typed startup phase, but at some point you call "DoneBeingDynamic()" on something (program, module, whatever, some playing would have to be done here) and the dynamic system basically freezes everything into place and becomes a static system. (Or you have an explicit startup phase to your module, or something like that.)

V8 tries to guess that for classes and objects based on runtime information - that's how it gets some of its speed (it still needs checks about whether this is violated at any point, so that it can get rid of the proxy/stub "static" object it guessed).

For a more static guarantee, there are also things like Object.freeze which does about what you describe for dynamic objects in JS (#).

# https://gist.github.com/briancavalier/3772938


I'd be curious to see if a language developed with the idea that this is what it's going to do from scratch could do better than trying to bodge it on afterwards. Rather than pecking around what could be done literally decades after the language is specified, what if you started out with this idea?

I dunno. It's possible the real world would stomp all over this idea in practice, or the resulting language would just be too complex to be usable. It does imply a rather weird bifurcation between being in "the init phase" and "the normal runtime phase", and who knows what other "phases" could emerge. Technically, Perl actually already has this split, although it can generally be ignored because there mostly isn't much utility to having something done in the earlier phase, unlike in this hypothetical language.


It seems that lisp-like macros or more generally multistage compilation is close to what you have in mind.


Yes, it's not a brand-new dimension of programming languages, merely a refinement of existing ideas. However I'm not aware of anything quite like it out there. Lisp could be used to implement it, but, I mean, that's not a very strong statement now is it? Lisp can be used to implement anything. The question is about whether it exists.

Partially I throw this idea out as a bone to those who like dynamic languages. Personally I don't have this problem anymore because I've basically given them up, except in cases where the problem is too small to matter. And if you already know and like Lisp, you don't really have this problem either.

But if you are a devotee of the 1990s dynamic scripting languages, you're getting really squeezed right now by performance issues. You can run 40-50x slower than C, or you can run circa 10x slower than C with an amazing JIT that requires a ton of effort and will forever be very quirky with performance, and in both cases you'll be doing quite a lot of work to use more than one core at a time. Python is just hanging in there with the amazing amount of work being poured into NumPy, and from what I gather from my limited interactions with data scientists, as data sets get larger and the pipelines more complex, the odds you'll fall out of what NumPy can do and fall back to pure Python goes up and the price of that goes up too.

I think a new dynamic scripting language built from the ground up to be multithreadable and high performance via some techniques like this would have some room to run, and while hordes of people will come out of the woodwork promising that one of the existing ones will get there Real Soon Now, just wait, they've almost got it, the reality is I think that the current languages have pretty much been pushed as far as they can be. Unless someone writes this language, dynamic scripting languages are going to continue slowly, quite slowly, but also quite surely, just getting squeezed out of computing entirely. I mean, I'm looking at the road ahead and I'm not sure how Go or C# is going to navigate a world where even low-end CPUs casually have 128 cores on consumer hardware.... Python qua Python is going to face a real uphill battle when the decision to use it entails basically committing to not only using less than 1% of the available cores (without offloading on to the programmer a significant amount of work to get past that), but also using that core ~1.5 orders of magnitude less efficiently than a compiled language. You've always had to pay some to use Python, sure, but that's an awful lot of orders of magnitude for "a nice language". Surely we can have "a nice language" for less price than that.


Good take. Back to Lisp, then? What are we waiting for?


Well, that is why Julia matters.


This approach kind of describes Graal.

Interestingly, GraalPython never seems to come up on these speeding-up-Python articles & benchmarks while TruffleRuby is a heavyweight in the speeding-up-Ruby space.


I tried to benchmark GraalPython for the talk but the compatibility situation was so poor that I wasn't even close to being able to run any benchmarks.


This kind of sounds similar to what a JIT compiler does, except that a JIT will silently fall back to slower code if you do those forbidden dynamic things. I think the most appealing thing about what you're suggesting here is less about the peak performance and more about having better guarantees about startup cost and that performance won't be degraded (prefer failing loudly to chugging along unoptimized). These two areas often aren't the strongest point in JIT-ed systems...


This feels like you are describing julia, startup cost included :).


I disagree. You are just doing those same optimizations by hand, instead of on a JIT. The computer is there to help us, and a lot of the value in a dynamic language comes from being able to override things at any time.

If you just set your structure up and run it statically, you are better with a static language, that can take all kinds of value from that fixed structure.


Great read; vaguely reminds me that someone or other was trying to get CPython going with Cosmopolitan libc. I wonder what that would do for speed.


Why do you think that this would help performance? A quick read says Cosmopolitan is slower than regular libc. Maybe it would be more portable, but not faster.


15 years ago I remember reading Guido van Rossum saying that Python is a connector language and if you need performance, just drop into C and write/use a C module. I thought it was crazy at the time, but now I see that he was absolutely right. It took a while, but now Python has a high-performing C module for pretty much every task.


But these don’t compose right? Each is a black-box to each other? A black-box add and a black-box multiply don’t fuse.


C and Python are not black boxes to each other. The entire Python interpreter is literally a C API. You can create PyObjects, add heterogeneous PyObjects to PyLists, etc. So everything in Python can be introspected from C.

Turned around, Python has arbitrary access into the C programming space (really, the UNIX or Windows process it's running inside); so long as it has access to headers or other type info, it can see C with more than black-box information.

Most Python numerics is implemented in NumPy; the low levels of NumPy are actually (or were) effectively a C API implementing multi-dimensional arrays, with a Python wrapper.


You're talking past chrisseaton's point here. If you want two C extensions to interoperate with bare-metal performance, you can't just do

  from lib1 import makedata
  from lib2 import processdata

  data = makedata()
  print(processdata(data))
Because makedata needs to provide a c->py bridge and processdata needs a py->c bridge, so your process inherently has python in the middle unless lib2 has intimate knowledge of lib1. It can absolutely be done (I've written plenty of c extensions that handle numpy arrays, for example) but if somebody hasn't done the work, you don't get it for free. If your c extension expects a list of lists of floats, the numpy array totally supports that interface... but (last I checked) the overhead there is way slower than calling list(map(list, data)) and throwing that into your numpy-naive c extension.


> C and Python are not black boxes to each other

Yes they are: the Python interpreter knows nothing about what your C extension does. It can't optimise it because all it has is your machine code, with no higher-level logic.


They can! Numpy exposes a C API to other Python programs [0]. It's not hard to write a Cython library that uses the Numpy C API directly and does not cross into Python [1].

[0]: https://numpy.org/doc/stable/reference/c-api/index.html

[1]: https://github.com/kylebarron/pymartini/blob/4774549ffa2051c...


So they can if you use their specific API? It doesn’t naturally compose in conventional Python code?


>The first topic he raised, "why Python is slow", is somewhat divisive

What dynamic, interpreted, single threaded language is fast?


Lua (LuaJIT implementation). Some Smalltalk VMs are also quite fast. For example, see Eliot Miranda work on CogVM.


You're pushing it a bit too far if you say that a JIT is interpreted.

To answer OP, if you replace "dynamic" by "untyped", Forth qualifies. And it actually can go where there's no JIT to save your A from the "just throw more hardware (and software) at the problem" mindset.


I think someone once said dynamic langs must cheat to be performant. Jitted runtimes are just interpreters cheating.


> What dynamic, interpreted, single threaded language is fast?

Javascript. End of list.

The problem is that a Javascript implementation is now so complicated that you can't develop a new one without massive investment of resources.


Interpreted JavaScript is slow. Most modern JavaScript implementations use a JIT.


As far as interpreted languages go, Wren is pretty quick, but still not fast compared to compiled languages.

But for dynamic, single threaded languages, JavaScript is famously fast with a modern JIT compiler like V8.


Practically every other language that ticks those boxes is faster than Python.


Just recently I replaced an internal library (classes for User and a list of users, that need serialization from and to json) with a Python library written in Rust (using pyo3 and maturin).

I did not do it for performance considerations, but because I felt the type system of Rust could give me better guarantees here (and it did: I found multiple bugs this way).

On top of that, the performance gain was noticeable, although I did not bother to benchmark it.

There are still some gotchas in the conversion from Python types to Rust (e.g. a function which takes a Vec<String> will happily accept a str and produce a vector of the str's characters). This is common behaviour in Python, where I tend to do an isinstance(value, str) check and wrap the value in a list to deal with it.
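Roughly this pattern (normalize_names is an invented name, just to show the guard):

    def normalize_names(value):
        # A bare str is itself iterable, so without the guard it would be
        # exploded into single characters.
        if isinstance(value, str):
            value = [value]
        return list(value)

    normalize_names("alice")           # ['alice']
    normalize_names(["alice", "bob"])  # ['alice', 'bob']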


Faster-cpython is not the main topic here, but it's certainly welcome since CPython is the most-used Python implementation. They've done great things so far. Though I remember hearing the promise of a 50% improvement in each of five separate steps :)


Print doesn't have to be re-resolved on every access... Not sure about Python, but many interpreters do a resolution pass that matches declarations and usages (and decides where data lives: stack, heap, virtual register, whatever).


In Python semantics, indeed, 'print' does need to be looked up each time!


Surely in the common case this can be optimized away??? Could you add some detail?


'print' is a global function that can be overridden by user code. The most obvious way to do this would be to define a new function named 'print', but it could also be overridden through less direct means, like from an 'eval' call, manually manipulating the global variables mapping ('globals()'), or by manipulating the '__builtins__' mapping. Maybe this happens in another thread!

Statically reasoning about all this (as would be required for optimization) is difficult. Not totally impossible, but not something that the canonical CPython implementation tries to do.
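A toy illustration of my own of the kinds of rebinding described above (any of these can happen before a given call site runs, possibly from another thread):

    import builtins

    _original_print = builtins.print

    def noisy_print(*args, **kwargs):
        _original_print("[log]", *args, **kwargs)

    print("hello")                 # the normal built-in
    builtins.print = noisy_print   # rebind the builtin itself
    print("hello")                 # now prints "[log] hello"

    globals()["print"] = _original_print
    print("hello")                 # back to normal: a module global now shadows the builtin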


What does "Modern" python even mean?


Focuses on 3.8+, but 3.7 has another year of life in it.


What's wrong with using the right tool for the right job? Python for utility scripts, Javascript for Web frontend, C and C++ for system programming, C# for Web backend, R for statistical stuff and data analysis?

It seems to me that some people learned a language suited to one thing and, instead of learning other languages better suited for other purposes, they push for their one and only language to be used everywhere, resulting in delays and financial losses.

It's not very hard to learn another language. Or, if you are that lazy, you can stay with the language you know and use it for what it was intended for.


Python dominates web backends, statistical work, and data analysis nowadays.


Especially because it allows you to easily write web backends using its awesome scientific and statistical/data-science libraries. There is no other language that lets you build web services on top of dynamic data calculations like that.



