Backend of Meta Threads is built with Python 3.10 (twitter.com/llanga)
480 points by BerislavLopac 10 months ago | 438 comments



For the “Python isn’t fast enough for production backend” crowd from the same company that brought you the largest social network built on PHP + MySQL.


While that is technically correct, they haven't been running vanilla PHP for quite some time - it's jitted into x86 and run natively on the machine. It's also pretty extensively optimized. And the VM that does the jitting is all C++. Their MySQL deployment is also an internal fork that's been heavily modified for scalability (storage/compute separation, sharding, RocksDB based storage engine instead of InnoDB, Raft for synchronous replication, etc.). Lots of really fantastic engineering work has gone into making their backend infra scale.

Some worthwhile references:

[1] https://engineering.fb.com/2016/08/31/core-data/myrocks-a-sp...

[2] https://research.facebook.com/file/529018501538081/hhvm-jit-...

[3] https://research.facebook.com/file/700800348487709/HHVM_ICPE...


Facebook scaled to ~1b daily actives on MySQL + InnoDB. There was lots of engineering work, like schema sharding (denormalization), automation, plenty of bug fixes and patches for MySQL (most or all contributed back to upstream, from what I remember), and of course a massive caching layer; plus throwing crazy hardware at the problem. Nonetheless the underlying engine was something any MySQL user or admin would have recognized. And we backed it all up, every day, in < 24 hours, using an unmodified mysqldump. (FB MySQL team 2009-2012)


> And we backed it all up, every day, in < 24 hours, using an unmodified mysqldump.

But… how? Considering all the transactions in flight, and everything? And did you ever test disaster recovery with that setup?

I’ve worked on relatively big projects, but FAANG engineering is like some entirely different field of software engineering. Fascinating.


(a) make each individual database small, but have a lot of them (b) There are lots of transactions in flight, but there is a well-ordered sequence of mutations (the binlog) that defines what has and has not been committed. So applying a backup means taking the full backup + replaying the binlogs. (c) testing can be done by just bringing up a slave from the backup and then comparing consistency with normal replicas.
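To make (b) concrete, here is a minimal sketch of "full backup + replay the log" in Python. It is a toy model, not how MySQL or Facebook's tooling implements it: the `Mutation` record, the integer log positions, and the `restore` helper are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mutation:
    position: int          # monotonically increasing log position (hypothetical)
    key: str
    value: Optional[str]   # None stands for a delete


def restore(snapshot: dict, snapshot_position: int,
            log: list, target_position: int) -> dict:
    """Rebuild state as of target_position from a snapshot plus the ordered log."""
    state = dict(snapshot)
    for m in log:
        if m.position <= snapshot_position:
            continue   # already reflected in the snapshot
        if m.position > target_position:
            break      # stop at the requested point in time
        if m.value is None:
            state.pop(m.key, None)
        else:
            state[m.key] = m.value
    return state


log = [
    Mutation(1, "alice", "hello"),
    Mutation(2, "bob", "hi"),
    Mutation(3, "alice", "edited"),
    Mutation(4, "bob", None),
]
snapshot = {"alice": "hello", "bob": "hi"}  # full backup taken at position 2

print(restore(snapshot, 2, log, 3))  # state just before bob's row was deleted
print(restore(snapshot, 2, log, 4))  # fully caught up
```

The point of the sketch is that the snapshot only has to be consistent as of one known log position; everything after that comes from the ordered log.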


To expand on this question, I'm wondering how useful daily backups even are for a site like Facebook. I mean, of _course_ you need them, but also, something about reverting all of FB to a state 24 hours ago seems disastrous even if it works. I can't imagine that it's an acceptable thing in anything but an absolute emergency. Imagine every single facebook user got rewound in time to the previous day, every message sent over the past day was lost, etc.

It was a lifetime ago I ever did DB administration (postgres in my case), but the write-ahead-logs being replicated out independently was extremely important for point-in-time recovery, such that you could always take the latest backup, zip forward through the WAL, and recover to any arbitrary point in time you want, so long as the WALs were available. I wonder how much something like this would have been done at FB scale.


We live-streamed the binlogs (what MySQL calls its WAL) to the backup infrastructure, check out the ORC article I posted elsewhere.


What yuliyp wrote is basically it. Although the individual shards weren't really small, even by modern standards.

> Considering all the transactions in flight, and everything?

If I remember, we used --flush-logs --master-data=2 --single-transaction, giving it a consistent point-in-time dump of the schemas, with a recorded starting point for binlog replays, enabling point-in-time and up-to-the-minute restores. Nowadays you have GTIDs so these flags are obsolete (except --single-transaction).
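For anyone who hasn't used those flags, a rough sketch of what such an invocation could look like, built as an argument list in Python. Only the three flags come from the comment above; the `build_dump_command` helper and the database name `shard_0042` are made up for illustration, and connection options are left out. Newer MySQL releases rename `--master-data` to `--source-data`.

```python
import shlex


def build_dump_command(database: str) -> list:
    """Assemble a mysqldump command line for a consistent logical backup."""
    return [
        "mysqldump",
        # Dump inside one transaction so InnoDB tables are read from a
        # consistent snapshot without locking out writers.
        "--single-transaction",
        # Flush the server logs before the dump starts.
        "--flush-logs",
        # Record the binary log coordinates in the dump as a commented-out
        # statement, giving a starting point for later binlog replay.
        "--master-data=2",
        database,
    ]


# Print the command rather than run it; "shard_0042" is a hypothetical name.
print(shlex.join(build_dump_command("shard_0042")))
```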

--single-transaction does put extra load on the database—I think it was undo logs? it's been a minute—which caused some hair-pulling, and I believe they eventually moved to xtrabackup, before RocksDB. But binary backups were too space-hungry when I was there, so we made it work.

Another unexpected advantage of mysqldump vs. xtrabackup, besides size, was when a disk error caused silent data corruption on the underlying file system. Mysqldump often read from InnoDB's buffer cache, which still had the good data in it. Or if the bad block was paged back in, it wouldn't have a valid structure and mysqld would panic, so we knew we had to restore.

> And did you ever test disaster recovery with that setup?

Yes! I wrote the first version of ORC. This blog post is from long after I left, but it's a good summary of how it worked: https://engineering.fb.com/2016/10/28/data-infrastructure/co...

It wasn't the best code (sorry Divij)—the main thing I'm proud of was the recursive acronym and the silly Warcraft theme. But it did the job.

Two things I remember about developing ORC:

1) The first version was an utter disaster. I was just learning Python, and I hit the global interpreter lock super hard, type-error crashes everywhere, etc. I ended up abandoning the project and restarting it a few months later, which became ORC. In the interim I did a few other Python projects and got somewhat better.

2) Another blocker the first version had was that the clients updated their status by doing SELECT...FOR UPDATE to a central table, en masse, which turns out to be really bad practice with MySQL. The database got lock-jammed, and I remember Domas Mituzas walking over to my desk demanding to know what I was doing. Hey, I never said I was a DBA! Anyway, that's why ORC ended up having the Warchief/Peon model—the Peons would push their status to the Warchief (or be polled, I forgot), so there was only a single writer to that table, requiring no hare-brained locking.
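A small sketch of the single-writer idea described in 2), assuming the "push" variant. This is not ORC's code: the queue, the in-memory SQLite table standing in for the central MySQL table, and the status names are all assumptions made for illustration.

```python
import queue
import sqlite3
import threading


def peon(peon_id: int, outbox: queue.Queue) -> None:
    """Report progress to the warchief instead of touching the table directly."""
    for status in ("restoring", "verifying", "done"):
        outbox.put((f"peon-{peon_id}", status))


def warchief(num_peons: int = 8) -> dict:
    """Drain status reports and be the only writer to the status table."""
    outbox = queue.Queue()
    workers = [threading.Thread(target=peon, args=(i, outbox))
               for i in range(num_peons)]
    for w in workers:
        w.start()

    db = sqlite3.connect(":memory:")  # stand-in for the central status table
    db.execute("CREATE TABLE status (peon TEXT PRIMARY KEY, state TEXT)")
    for _ in range(num_peons * 3):    # each peon sends exactly three reports
        name, state = outbox.get()
        db.execute(
            "INSERT OR REPLACE INTO status (peon, state) VALUES (?, ?)",
            (name, state),
        )
    db.commit()

    for w in workers:
        w.join()
    result = dict(db.execute("SELECT peon, state FROM status"))
    db.close()
    return result


print(warchief(4))
```

With one writer there is nothing for row locks to fight over, so no `SELECT ... FOR UPDATE` is needed; the trade-off is that the writer becomes the single place where status can back up.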


More impressive was how Facebook managed so many MySQL instances with such a small DBA team. The average regional bank probably has more Oracle DBAs managing a handful of databases.


It sounds like you were involved in this. Since you were working there so long ago, would you be willing to write up a technical account of the things you did? I'd be interested in learning more about it. I figure the tech from 10 years ago is outdated enough that it wouldn't cause any issues if you made it public.


Appreciate the interest. Honestly, most of the cool stuff was getting to play with toys that all the other talented engineers developed; I had a relatively narrow piece of the pie. I did write up a bit in a sibling reply.


Curiously, VKontakte also started from PHP+MySQL but went another way. PHP is compiled ahead of time with something called KittenPHP. It's open-source. For databases they switched some parts to their own bespoke solutions in C that talk with PHP over the memcached protocol. These are called KittenDB (or "engines") for the simple single-purpose ones, and there was also a more generic MySQL replacement in development when I left in 2016, MeowDB.


I’m not sure why someone of facebook’s scale would want JIT instead of AOT. That’s a lot of servers all JITing the same thing that could be done once at build time.


Facebook actually does some compile-time optimizations, but also ships some of the profiling data for JITing between early deploy phases and the rest of the fleet: https://engineering.fb.com/2021/03/03/developer-tools/hhvm-j...

Also, Facebook used to do ahead of time compilation (HPHPc) but eventually HHVM managed to outperform it.


JIT compilation has the opportunity to do profile-guided optimization at runtime. JIT compilation is also simpler when distributing an application to non-identical servers, as it can optimize for the exact hardware it is running on.


HHVM's predecessor at FB was an AOT PHP to C++ transpiler called HPHPc. There are a lot more optimization opportunities with JIT compilation when dealing with dynamically typed languages like PHP.
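A toy illustration of why runtime information helps with dynamic types. It is only a sketch of the general "profile, then specialize behind a guard" idea, not how HHVM works; the `SpecializingAdd` class and its threshold are invented for the example, and the "fast path" here is ordinary Python rather than generated machine code.

```python
from collections import Counter


class SpecializingAdd:
    """Toy 'JIT': profile argument types, then install a guarded fast path."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.profile = Counter()   # (type, type) -> times seen on the slow path
        self.fast_type = None      # type the fast path is specialized for
        self.fast_hits = 0

    def __call__(self, a, b):
        # Guard: the specialized path is only valid for the profiled type.
        if self.fast_type is not None and type(a) is self.fast_type \
                and type(b) is self.fast_type:
            self.fast_hits += 1
            return a + b           # a real JIT would run type-specific code here
        return self._generic(a, b)

    def _generic(self, a, b):
        # Slow path: handle any types, and record what was actually seen.
        self.profile[(type(a), type(b))] += 1
        (ta, tb), count = self.profile.most_common(1)[0]
        if ta is tb and count >= self.threshold:
            self.fast_type = ta    # "compile" a specialization for the hot case
        return a + b


add = SpecializingAdd(threshold=3)
for _ in range(5):
    add(1, 2)
print(add.fast_type, add.fast_hits)
```

An ahead-of-time compiler has to guess which types will show up; the profile above is only available once the program is running on real traffic.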


Because they first tried to AOT, and a 2nd project with a JIT managed to get better performance out of it, with improved development workflow.


It terrifies me that they probably ran their numbers very carefully and realized that this feat of engineering was still cheaper than rewriting their platform using a more manageable tech stack.


The decision is often driven by two things: A) how nearly impossible it is to migrate the sheer amount of services and code without breaking something, and B) that the performance you need often doesn't require big jumps, so small but constant improvements are the way to go.


While you're technically correct that they aren't running vanilla PHP, the main takeaway here is that if you ultimately want to scale to billions of users then you should probably use vanilla PHP, or maybe Python.


I'm not seeing how that is the takeaway btw, mind elaborating?

I'd think the same or bigger performance benefits can be reaped if you just start with stuff like OCaml / Haskell / Golang / Rust from the get go.


My point is that they started with vanilla PHP and then had the most successful scaling story of all time -- so if someone else wants to follow in their footsteps then they should go with what is tried and true.


Well that might be mistaking the chicken for the egg or vice versa. They made sure it works because they were heavily invested. So if anything, that is a story of a tenacious tech team and nothing much more beyond that.

Arguably picking almost any other tech would have worked as well because they would have doubled down on it as well and made sure that all the pieces that they need are in place and are working.

Thanks for elaborating, though I definitely don't share your conclusion.


> storage/compute separation

Have they really done this for OLTP workloads running in mysql? I know they do it for OLAP though.


Earlyish facebook engineer here. Early days FB php was nothing like the php used to template websites. All kinds of specialized libraries to enable a much more sophisticated programming style. Think functional helpers, and asynchronous execution on thousands of cores, spanning trees across data centers using ssh etc. As a tangent a lot of the good stuff I used was written by Evan Priestley, who also did Phabricator and lots of other strong systems.


Can you speak to the onboarding and learning curve in that environment? What was the training like? What kinds of timelines did you experience regarding expectations as a new hire?

Asking because I’m curious about these early stage mega-tech companies that are working at a scale I can’t fathom.


Six week boot camp, you are pretty fluent by week 4. Notably though this was before mega tech days, we had under 100 engineers.


>Early days FB php was nothing like the php used to template websites. All kinds of specialized libraries to enable a much more sophisticated programming style.

No idea if this was a credible leak of actual 2007-ish FB php, but it doesn't look like what you're describing.

https://gist.github.com/nikcub/3833406


I'm of course speaking from an outsider's point of view, but do you think PHP was a good option, or was it a "fitting a square peg into a round hole" situation?


> but do you think PHP was a good option, or was it a "fitting a square peg into a round hole" situation?

Facebook was founded in 2004. Whatever fancy tech you're thinking would fit the round hole, it didn't exist then. They also never thought it would serve a couple billion users one day, so they built it in whatever they had and knew at the time.

Same for Instagram and Python (they started with Django).


The way Instagram has used Django is to replace pieces of it with custom parts over time, which is something Django's design allows.


It was not a deliberate choice, just it was what the site was built in, so people like Evan and Marcel (Laverdet) made php beautiful for us.


Random question... whatever happened to that visual friend/network graph that used to load on a separate window? I always thought that thing was pretty neat to see


Yep, Vanilla Python isn't fast enough so they have to do more:

"It's running on Instagram's #Cinder fork that includes a JIT, lazy-loaded modules, pre-compiled static modules, and a bunch of other interesting tweaks against vanilla Python 3.10"

From the Tweet.
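For a feel of what "lazy-loaded modules" means, here is a sketch using the standard library's `importlib.util.LazyLoader`, following the recipe in the Python documentation. This is stock CPython, not Cinder's mechanism (which the tweet doesn't detail); the `lazy_import` helper name and the choice of `colorsys` as the example module are just for illustration.

```python
import importlib.util
import sys


def lazy_import(name: str):
    """Return a module object whose code only runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)   # sets up the lazy module; does not execute it yet
    return module


colorsys = lazy_import("colorsys")             # cheap: nothing executed so far
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))      # first access triggers the real import
```

The appeal for a large codebase is startup time: modules that a given request never touches never get executed.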

Personally, I don't care what language people use. We all know that if you have money, you can just throw in more servers. Until your CFO and CTO decide that the next big thing is to do "Cost Saving" (in the face of a recession).

I'm surprised in 2023 people are still debating mainstream programming language "PROD" readiness.

It's like a "cool kid" competition.


It just dawned on me that linking to Twitter is now impossible on the open web. Unless you have an account, you cannot see the linked material.

This is a huge loss for the world, and makes me a bit angry.

Edit: I had falsely assumed that because GP missed out on the JITing etc. described in the linked tweet, that Twitter was still inaccessible to the open web. However, trying again now, Twitter is again viewable without an account. So Twitter's communication about needing an account to view it was lacking, but so was my diligence.


We are also in a transition period when most people and media have not realized this yet.

Yesterday, I read an article about Tour de France stage and it linked to about 4 tweets with videos which are not playable without an account.

I just hope this gets noticed by mainstream media and they change their practices instead of assuming that everyone has a Twitter account.


They removed the requirement to be logged in to view Tweets


> We all know that if you have money, you can just throw in more servers.

Definitely not true at Facebook scale. You need to be smart, fast, _and_ throw more servers at it.


I'm speaking on the aspect of Python as web app.

We both know the issue has always been how you store the data. Be it your main data storage, or the need of multiple storages for different use cases, etc.

The app server itself can be implemented in any languages. It's just a data orchestration + transformation logic anyway...


You mean the company that had to create their two PHP implementations and a translator to C++, to actually scale?


Facebook had hundreds of millions daily active users before writing a compiler.


98% of the people that initially thought (Python for Threads - OMG performance!) work at a company that will tell you straight up they expect 100M RPS in a year or two, but will actually only get to 50k RPS in 5 years before the "Our incredible journey" letter is issued.


I use an unscientific rule of thumb that 10-100x scaling is about the most to plan for except for exceptional cases. Anything beyond that ends up with overcomplicated bloat for a "tomorrow" problem.

Being able to handle spikes and iterate quickly is probably more important.


And lots of nodes not able to keep up, rewriting backend services in C++ to make up for it.


> Facebook had hundreds of millions daily active users before writing a compiler.

In other words, Facebook felt compelled to write a compiler when they had hundreds of millions daily active users.


If you have hundreds of millions of daily active users then needing to write a compiler isn't the end of the world.


> If you have hundreds of millions of daily active users then needing to write a compiler isn't the end of the world.

It's not the end of the world because they have deep pockets and a problem domain where preserving the interface (i.e., keeping everything running on PHP) and optimizing the infrastructure that provides it is more cost effective and less disruptive than switching to a tech stack that is more performant.


Compiling PHP into machine code sounds pretty much impossible, are they using some subset of the language? Does the compiled code use a garbage collector? Or reference counting?


> are they using some subset of the language?

They're using a derived language called Hack.

> Does the compiled code use a garbage collector? Or reference counting?

Not sure but it's open source so I'm sure you can dig up the answers one way or another: https://github.com/facebook/hhvm


[flagged]


Please follow the site guidelines and edit swipes and putdowns out of your comments here. They're not what this site is for, and destroy what it is for.

https://news.ycombinator.com/newsguidelines.html


> This may be shocking to you, but _all_ code is effectively compiled to "machine code" eventually.

I'm not sure you understood the point. It means nothing to claim that deep down it's all opcodes or electrons flowing. What matters is being able to transform the code targeting the high level interface to the lowest level interface in a way that remains usable and regression-free. Sometimes compilers for widely popular languages introduce bugs and weird behavior too.


I don't think you understood the comments above, but I definitely don't understand your point...

Compilers introduce bugs, yes. Compilers are code, and code can have bugs and regressions.

But the comments above make no mention of bugs or regressions. They said compiling php into machine code is "impossible."

Going from high level interfaces to lower level is the easy part. It's what all interpreters and compilers do already.

Having different intermediate forms is also a very common practice. There is nothing special about php here. Except for maybe its undeserved reputation of being an obtuse language.


Such a pedantic response, the archetypical HN comment, I can sense the pretentious smirk on your face while writing it, thanks. Words still mean something, we don't compile a dynamically typed, interpreted scripting language such as PHP for a reason and as it turns out Facebook didn't "compile PHP" either but a dialect of the language.

https://en.wikipedia.org/wiki/Hack_(programming_language)


Please don't respond to a bad comment by breaking the site guidelines yourself. That only makes things worse.

https://news.ycombinator.com/newsguidelines.html


Facebook transpiled it before using the HHVM.

https://en.wikipedia.org/wiki/HipHop_for_PHP


They ended up compiling PHP to assembly! Facebook is a company that will go to ridiculous lengths to avoid rewrites.


I know you're being hyperbolic, but during my time there, your impact was measured only on measurable things like performance gains but never on lines of code. So engineers would always go in the direction of whatever gets you the biggest win.

Sometimes that actually meant rewriting the thing from scratch, and other times that meant adding something else as a middleware to prevent touching the original thing.

For the most public success story, React was a complete rewrite of the thing before it, for example.


I spent five years there, and made that joke constantly.

It's mostly hyperbole but there's definitely a kernel of truth to it. I generally dislike rewrites and I wonder if FBs approach influenced me more than I realized at the time.


I thought they tried it multiple times. The problem is they had so many developers writing new code any rewrite attempt would fall behind super quickly?

So they created their own language, Hack, to allow a slow transfer from PHP to Hack. Which is basically the idea behind Kotlin.


Rewriting in hip-tech-du-jour is for startups with lots of funding and little traction.

Why would a massively successful company toss a stick in their own spokes by tearing the product to ground?


But it's actually a _really bad_ strategy for a startup! It'll be realized the moment you try to hire for hip-tech-du-jour. Most likely, you don't even have the problem that hip-tech-du-jour claims to solve.


TIL what hip-tech-du-jour is lol


Every company should go to ridiculous lengths to avoid rewrites. Rewrites of significant tech (as a proxy, let's say 100+ kloc) spell doom. Doom.


Every company? What about banks that have old codebases running on Cobol?

I'm also very skeptical about big bang rewrites, but there are points where you need to migrate off certain tech. Ideally you can make that transition piece by piece though, but that also introduces its own set of problems (now you have two systems and the new one has to inherit some of the baggage of the old one in order to be compatible).


This is exactly the kind of innovation I see coming out this current AI boom we're in. Take legacy codebases and spit out complete rewritten versions in whatever language that not just translates, but takes it much further and re-interprets it on top of modern architecture designs. It completes it with tests and a basic UI, or whatever makes sense for the project.

Those systems are well understood and well tested, so it's not cost effective at the moment to embark on a complete rewrite. Current AI coding systems are also very unreliable, so it's just a matter of time before those two meet on the graph and voila, another boom in moving the legacy world into safe languages.


It’s definitely not for the timid or mediocre


I remember when I used to rewrite stuff before fully understanding the drawbacks of what I was rewriting it in.


Was there a separate initiative for assembly, or are you thinking about HipHop? HipHop was for C++ and got a lot of attention back when it was released.


Yes, the 2nd rewrite for a VM, where the JIT compiler turned out to be quite good versus HipHop, and development focus switched to HHVM.


I was thinking about HHVM, which I believed was essentially assembly for a VM. Am I wrong on that?


No, you're right. HHVM was the 2nd iteration. I was thinking of the first project, which was for C++. Looking at wikipedia, that was called HPHPc.


Lol. Do you think you are just going to use go and everything is going to scale to FB level size?

Facebook was already at a scale larger than 99.99999 percent of sites before they had HHVM


Remember when Twitter was written in Ruby?


Remember how Shopify is still processing hundreds of millions of transactions in Ruby on Rails without an issue currently?


Except that Shopify is actually paying a couple of engineers to write a Ruby JIT compiler.

Not only that, the community felt the pressure to write several JIT compilers already.

So....


So what? A company is trying to streamline their process and save costs?

Is there any large company at all that hasn't invested in making their tool chain better?

You talk about it like they are a failure. Lol


Costs that wouldn't have been incurred in the first place if a scripting language hadn't been chosen.


What we can’t know is whether they would have been fast enough to tweak their product to their current market without using a scripting language. Companies who do that aren’t sacrificing performance for nothing, scripting languages make a lot of things simpler and closer to the business domain, which cannot be ignored (anybody who’s in a huge project that requires a 30 min. compilation often enough to kill productivity knows what I’m talking about).

Besides, you can even use optimized C and get none of the benefits (but still all the drawbacks) because of the algorithms you’re using or your database or something else entirely such as microservices etc. making differences in language speed negligible.

I guess the point is the language speed on backends doesn’t matter as much as many other things (network, database, architecture setup etc.) and usually the gains you get with your language being faster aren’t high unless you’re in a tight loop or you’re like Amazon where even a few milliseconds gain is more money.


> What we can’t know is whether they would have been fast enough to tweak their product to their current market without using a scripting language.

Exactly. And I am saying the following as a fan of dynamic languages but let's be realistic: this is practically their ONLY true benefit -- quicker time to market. They lose out on pretty much any other metric, with the exception of also slightly quicker iteration in the day-to-day work.


In a startup, time to market is everything.


True, though I've measured my time to prototype with different languages some months ago and the advantages of PHP / Ruby / Python are oversold.

- Ruby on Rails prototype: 2.5 days.

- Golang's Gofiber: 4.5 days.

- Elixir's Phoenix: 7.

- Rust: don't ask (too long lol).

- Python's Django (made by a friend): 3.5 days.

Not to split hairs here but if you are in a situation where 5 days more for a prototype matter then I am not very sure you would have succeeded even short-term after. Putting your foot in the door can be extremely important, absolutely, but after whipping out a quick MVP you'd likely immediately stumble upon the next obstacle which is not guaranteed to be overcome.

As mentioned in my previous comment, the only argument in favor of the fastest-to-prototype languages could be that they allow slightly higher velocity day-to-day as well. But I've worked with PHP and Ruby for a long time, and I've worked with Elixir, Golang and Rust for quite a while now as well and again, that particular advantage of PHP / Ruby / Python (a) does exist, yes, but (b) is not as big as people make it out to be.


So, wait, all apps are trivial in nature and can be built in less than a week?

By your own admission here GO takes nearly twice as long (which is about what I've seen). So if you aren't building a trivial app but instead building something that takes 6 months in Ruby you'd be spending nearly 12 months in go. Sounds like a real significant amount of time if you are rushing to get to market.


Please tell me what language they should have written it in then?

They have an incredibly successful company. What did Ruby do to hurt you personally?

This whole line is completely irrational. You want their webapp written in assembly?


Why resort to a straw man? There are plenty of options between "a language that does not even have strong types" and "machine code"? E.g. Elixir is strongly but dynamically typed and you can take it quite far while reaping most of the benefits of typing and not losing velocity as you would with many statically strongly typed languages.

Nuance matters. I wouldn't pick Ruby for anything except scripts nowadays. You get absolutely zero guarantees and being able to whip out an MVP in 2 days is not as important as people make it out to be. Extremely often being able to have an MVP in 10-20 days is just as viable but then you will have to write fewer tests and have more guarantees from your compiler.


It's not a straw man, I'm asking him where he stands. How far does he want to take it?

You mentioned Elixir. You literally suggested a language that was created 6 years after Shopify began.

You wouldn't pick Ruby, that's your choice. People are still making billions of dollars with it.

So, since you took up the mantle, what language should they have used 17 years ago when they started the company?


> You wouldn't pick Ruby, that's your choice. People are still making billions of dollars with it.

Not arguing with their results here. I am saying that they made it work despite how terrible it is. You have no guarantees about anything and you have to have many more tests than you would have in other languages just so you have basic stuff ensured. Not cool at all and I am saying this as a guy who made good money with Rails for 6.5 years. Not impressed to this day and I am happy I left it behind. It's good for scripting I'll admit, though nowadays I just learned bash/zsh better and use Golang for the same task occasionally.

> what language should they have used 17 years ago when they started the company?

Whoops, you are 100% correct that I ignored the historical timeline (Elixir didn't exist back then, yes), sorry about that, I derped big time.

That being said, all of PHP, Ruby, Python, Java and C# were indeed valid choices back then. But I've been part of successful rewriting efforts (biggest one was about 270k coding lines which I'll admit is much smaller than what Facebook and other big corps are dealing with) and to me the downsides of rewriting are overplayed:

- You need extensive tests? Well, you need them even if you never rewrite.

- You will now have two systems? So what, you already have a load balancer, you will just put a few more rules in it.

- Hard to find engineers for $new_language? That's true if you do it in its first 5 years of life but I've been part of growing ecosystems twice (Elixir and Rust) and it only gets easier with time. Solution: don't be a super early adopter, that's obviously too risky. E.g. both Elixir and Rust are beyond 10 years old at this point and are now a safer choice.

- Have to duplicate the old system's bugs? Tough nut to crack and I partially agree with this one but what my previous teams did was write down these bugs in the docs and made sure to start fixing them after the rewrite was completed. And in all cases fixing the bugs in the original language was planned anyway but was eternally kicked down the road.
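As a rough sketch of the "few more rules in the load balancer" point from the list above: route already-migrated path prefixes to the new system and everything else to the old one. The backend URLs, the prefix list, and the `pick_backend` helper are hypothetical; a real setup would express the same rules in the load balancer's own configuration.

```python
# Hypothetical backends; in practice these would be upstream pools.
LEGACY = "http://legacy.internal"
REWRITE = "http://rewrite.internal"

# Path prefixes that have already been migrated to the new system.
MIGRATED_PREFIXES = ("/api/search", "/api/profile")


def pick_backend(path: str) -> str:
    """Send migrated routes to the rewrite, everything else to the old system."""
    for prefix in MIGRATED_PREFIXES:
        # Match the prefix itself or anything below it, but not look-alikes
        # such as "/api/searching".
        if path == prefix or path.startswith(prefix + "/"):
            return REWRITE
    return LEGACY


for p in ("/api/search/users", "/api/searching", "/checkout"):
    print(p, "->", pick_backend(p))
```

Growing the prefix list one route at a time is what lets the two systems coexist during the migration, at the cost of the compatibility baggage mentioned upthread.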

--

My main point here is that it's OK to admit at one point that the tech you used for a while has outgrown its usefulness. I understand and recognize some of the downsides but to me they are overblown.


> My main point here is that it's OK to admit at one point that the tech you used for a while has outgrown its usefulness. I understand and recognize some of the downsides but to me they are overblown.

My entire complaint is just that you want to say it's outgrown its usefulness. That is such an absolutist view. And the fact you are even bringing Rust into this is crazy. I know C, C++, ASM, Go, and a few other low level languages. Rust is harder to learn than all of them, even ASM. And Rust in some ways can be a dumpster fire when you have to unwrap everything. A very intelligent person can take months to be comfortable in Rust or be similarly comfortable in Go in a couple days.

It's such a dogmatic view to say Ruby has outgrown its usefulness, that's fine though you are entitled to it. Ruby isn't going away, it'll still be there and I'm sure people will continue to make money with it.


I just realized we're chatting in 4 separate spots, heh. So let's unite them.

> My entire complaint is just that you want to say it's outgrown its usefulness. That is such an absolutist view.

I don't speak for all programmers and IT companies -- and neither do you. We are both right at the same time due to the fact that the world is big.

I've been in super small teams (think 3 people) all the way to 50+ and stricter typing is more and more sought after the bigger a team becomes. It's very hard to code confidently the weaker the typing of the language is after a project gets beyond a certain amount of coding lines. Every dev must spend extra time to self-onboard in the context of others' work before being able to meaningfully contribute.

Of course this curve tapers off eventually and some teams become super well-oiled machines -- that's a fact. But we don't live in an ideal reality; people leave, some get sick and are gone for 2-3 weeks, others get reallocated to other projects. Things happen and the ideal stable-ish state of a team is rarely achieved.

Having stronger stricter typing gives peace of mind when coding. It does reduce prototyping (and sometimes the day-to-day) speed but from one scale and up it is worth it and that cost is quickly surpassed by confident and quick delivery of features or bugfixes in the mid-term.

> And the fact you are even bringing rust into this is crazy.

Oh I agree. Rust can be infuriatingly hard and slow to progress with. I've made the mistake to try and prototype one thing with Rust and I lost one small business opportunity because of that. Won't ever do that again. Believe me, I've experienced this craziness first-hand and I agree with you there.

> It's such a dogmatic view to say Ruby has outgrown it's usefulness. Ruby isn't going away, it'll still be there and I'm sure people will continue to make money with it.

Maybe it's dogmatic to you because you imagine that if I was right Ruby would be no longer widely used and since you are not seeing that, you thus think me wrong? Well if so, (1) the world is big and even if Ruby is being gradually shown the door it can take decades until you and I notice, (2) your assertion that Ruby is not going anywhere is 100% true but does not contradict anything I've said because there's a place for everyone in the IT sphere and that still does not mean people are not seeing cracks in Ruby's perfect image (people from Twitter and Shopify have expressed displeasure with it a good amount of times in the past... and now I regret I never kept the archive links because I always get asked for source and can no longer provide it -- sigh).

BTW many people came to Elixir from Ruby and said they're never going back unless they have to build an MVP in a weekend -- they would go back only for that.

> Maybe there wouldn't be a product at all with GO or Elixir. It's hard to say.

I am not claiming either way. Don't imagine me such an extremist, please -- I am not. I am only saying "OK you did it with PHP, Ruby or Python, the tech did the job okay-ish but it's beginning to suck for you -- why are you so averse to admitting that it's time for a change?". I got a very cynical answer and at the risk of you thinking me even more of an extremist I'll name it: a lot of programmers love cozy jobs and they might hate what they do daily but they still love the wages and are not gonna rock the boat. Obviously I can't claim any percentages but I've been around and I've met plenty of such guys and girls. So IMO you should factor that into your analysis.

Inertia and network effects do exist and sadly they often have nothing to do with the quality of the thing they are carrying (think of all the sucky software we all must use if we want to communicate with others -- that's a good example of inertia / network effects not being correlated with the quality of software).

> We can disagree about usefulness I guess. Ultimately Ruby has worked just fine.

Due to tenacious teams. Not due to Ruby's technical virtues, which are not that many. I admire people who can make anything work but I've also drunk beer with some of them and they said that some days they think of committing suicide (I hope they were joking but they genuinely looked and sounded unhappy). I am talking PHP, Python, Ruby, Javascript.

> They didn't report massive issues, and the DB is going to be the bottle neck way before Ruby, so I just don't see an issue.

I remember the times in several apps I consulted for, long ago. We had good metrics systems in all of them and we had DB requests go from 5ms to 50ms, and ActiveRecord itself (when subtracting the DB times) routinely ate 150ms to 250ms, sometimes more. I am aware that these issues were fixed a while ago but it did leave a sour taste in my mouth and I can't just uncritically accept a claim that the DB dominates every latency in the Rails world. Maybe it's true nowadays but during the dark times of Rails 2.x and 3.x it definitely was not.

Now Elixir... at the last companies I contracted with we were talking about 10-100 microseconds of Elixir code and 5+ ms of DB requests. Pretty neat.

...And then I consulted for two Golang projects where we had something like 100 to 400 nanoseconds of app code and 5+ ms of DB requests. Insane.

So I don't disagree with you on the concept but I do disagree somewhat that Ruby / Rails is not a resource hog. At one point it definitely was and it was humanly noticeable even.
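
To put rough numbers on that, here is a small Python sketch of the arithmetic. It takes the upper end of each app-code range quoted above and, for Rails, the slow end of the DB range; picking those particular points is my assumption, and `app_share` is just an illustrative helper name.

```python
def app_share(app_ms: float, db_ms: float) -> float:
    """Fraction of a request's time spent in app code rather than in the DB."""
    return app_ms / (app_ms + db_ms)


# (app code time, DB time) in milliseconds, taken from the figures in this comment.
cases = {
    "Rails 2.x/3.x (ActiveRecord)": (150.0, 50.0),
    "Elixir": (0.1, 5.0),      # 100 microseconds of app code
    "Go": (0.0004, 5.0),       # 400 nanoseconds of app code
}

for name, (app_ms, db_ms) in cases.items():
    print(f"{name}: {app_share(app_ms, db_ms):.3%} of request time in app code")
```

With these inputs the Rails case spends 75% of the request in app code, while the Elixir and Go cases spend well under 2%, which is the sense in which the DB only dominates in the latter two.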

But I can concede that nowadays this is very likely no longer true. I got no beef with any tech, I am simply a guy who is always looking for something better and thus I don't get attached to any tech. Elixir has served me very well in the last 7 years (together with Golang and Rust sprinkled in) but if e.g. Rust gains Erlang's / Elixir's transparent concurrency / parallelism abilities and fault tolerance and speed of development then I'd have zero qualms abandoning Elixir.

> So, wait, all apps are trivial in nature and can be built in less than a week?

Come on now, you are starting to sound like you want to misrepresent what I am saying. :) I was talking about quick MVPs / prototypes. Of course beyond those everything else is very different.

> So if you aren't building a trivial app but instead building something that takes 6 months in Ruby you'd be spending nearly 12 months in go.

You assume the curve of Elixir or Golang is linear; it's not. It's an asymptote.

To give you a contrived example:

- Month 1: Rails app at 20%, Elixir app at 10%

- Month 2: Rails app at 35%, Elixir app at 25%

- Month 3: Rails app at 45%, Elixir app at 40%

- Months 4 to 6: Rails app at 55%, Elixir app at 60%

- Months 7 to 9: Rails app at 80%, Elixir app at 90% and starting to work on final touches.

And so it went the few times I had the privilege of witnessing parallel Rails and Elixir development (including rewriting a few Rails apps to Elixir's Phoenix). The Rails guys knew their stuff pretty well and I liked working with them -- but even they admitted that they are regularly blocked on too many checks in tests or in the controllers themselves due to much weaker typing. (Though there were other reasons as well but hey, this comment became an essay already.)

---

I understand that you are skeptical. A lot of us cope with the fear of missing out by simply denying there's something to miss out on. Plus we can't be everywhere all at once so we eventually find our own corner and become experts there. All of that is completely fine and I am not judging; we can't all chase some theoretical perfection, plus when we hit 32-35 y/o the reality of "this is still just a job even if I like programming" sets in and we are all very excused for not researching and knowing every single alternative of doing things.

If I am reading you correctly, you disagree that the older tech (Ruby included) has outgrown its usefulness. Ultimately I don't think we disagree with each other because we are both right at the same time: maybe where you work, and among the people you communicate with, Ruby / Rails is still deemed super good at what it does, it gets the job done, the product brings enough revenue to drown out the drag that Rails can be and everyone is happy. Cool, more power to you. I however changed sub-careers in programming several times in a row and I am solving tangibly different problems than most Rails companies I met back in the day did. And I've had a lot of financial success with Elixir, Golang and Rust, and my customers were super appreciative of the work done. Even now I am tutoring people who have 30+ years of programming experience and they are very happy with Elixir in particular.

Us being in different bubbles is 100% fine. The world is big and rich and interesting. I am not disparaging your choice. Hopefully I am offering you an alternative take from another vantage point instead.


> I understand that you are skeptical. A lot of us cope with the fear of missing out by simply denying there's something to miss out on.

I'm just not going to respond to this because it's completely disingenuous when you make statements like that.

You are free to go on with your irrational views and I'll go on ignoring them.


I made an effort to explain my views thoroughly, and you resorted to calling me disingenuous and irrational.

The part you quoted was the only piece that could be interpreted as seeking conflict -- though it had a more charitable interpretation that you decided to ignore -- and you latched onto that.

Oh well, I tried. Let the future readers judge for themselves.


>Remember when Twitter was written in Ruby?

Ruby is the exception to the rule.

Friends don't let friends start new Ruby projects in 2023.


Ruby in 2023 is miles ahead of where it was 10 years ago.


Yet it's still super behind tech like Elixir or Golang, which are both faster and give you more compile-time guarantees.


Behind how? There are no good web frameworks for Golang at all, and relatively few sites being built with it.

Elixir? Okay, it's a nice language that is also dynamically typed. It has some advantages but good luck finding anyone who knows how to program in it or who has even used a functional language before.

I don't personally use Ruby these days but to say it's super behind is just silly. Golang is as similar to Ruby as C++ is.


People are starting to recognize benefits of even gradual typing en masse these days, while some of us knew it for 10+ years. Not to be an elitist, I've made plenty of other mistakes and I am trying to not look down on anyone, but to discount types is not a well-informed take IMO.

That's what I mostly meant by saying Ruby is behind Elixir and Golang. And mind you, Elixir is strongly but dynamically typed and I still find it much better than Ruby.


I'm not discounting types at all. I started in strongly typed languages and only gradually moved to dynamic languages. For massive apps generally dynamic languages are not good.

However, Ruby also has a lot of creature comforts that Elixir and GO don't have. Maybe there wouldn't be a product at all with GO or Elixir. It's hard to say.


Respectfully disagree about not starting new Ruby projects now - can we still be friends? It works and is easy to get started for simple stuff, same with PHP for some folks or Django etc.


"Easy to start" is oversold in my experience. At one point you want more guarantees upfront.


More guarantees like how? Genuinely curious. I’ve been working with Nest.js lately and it’s good but rails is still more plug and play in my opinion.


More guarantees as in asserting on exact data shape which also includes some typing guarantees like "function argument 1 is always gonna be this struct" (basically a map with keys guaranteed to be present).

In Ruby you get nothing like that, all your function arguments are just variable names. This increases testing friction a lot. I've been all over the spectrum: from PHP and Ruby through Elixir (combining best of static and dynamic types IMO, though still flawed) to Golang and finally to Rust which is super strict. To me Elixir and Golang are close to perfect. Rust takes it too far and development velocity can suffer a lot until you become an expert (which can take quite a while).

Plug and play is nice but my opinion remains that it's oversold. Quickly whipping out prototypes is not the only virtue of a programming language (though technically that's a feature of Rails, Ruby's killer app, and not of Ruby itself).
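
As a rough illustration of the "exact data shape" guarantee, here is a minimal sketch in Python rather than Elixir, since Python is what this thread is about. `Order` and `total_in_dollars` are hypothetical names, and the runtime `isinstance` check only stands in for what a struct match or a static type checker would enforce for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical struct-like record: both fields must be given at construction."""
    order_id: int
    total_cents: int


def total_in_dollars(order: Order) -> float:
    # Reject bare dicts or anything else that merely "looks like" an order.
    if not isinstance(order, Order):
        raise TypeError(f"expected Order, got {type(order).__name__}")
    return order.total_cents / 100


print(total_in_dollars(Order(order_id=1, total_cents=1999)))

try:
    total_in_dollars({"order_id": 1, "total_cents": 1999})
except TypeError as exc:
    print("rejected:", exc)
```

The point is only that the function body can rely on `total_cents` being present, so tests do not have to cover the "what if the key is missing" cases.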


The problem is that is only so helpful anyway because you have to send the data to the front end and then translate it back, and in my experience that is where most of the issue is. If you are building a front end facing site you are constantly going to be fighting with that.


Yep, agreed, and I am saying that as a guy who still prefers server-side rendering.

Stuff like Elixir's LiveView and its imitators (like PHPx and I think C#'s Blazor?) are where things get better but since I am not interested in frontend, I leave that work to other people.


Why not exactly? What language would you suggest?


Elixir, Golang, maybe even Rust (that one is a hard sell though, there could be a lot of friction at the start and part of it persists even when the project matures).


Literally none of those languages existed when Shopify was launched. Did you expect them to get into a time machine or make that happen or what?


True, I lost track of the original problem, sorry.

Still, at one point it's OK to admit the original tech is no longer as useful and migrate to something else IMO.


I appreciate your admission here.

We can disagree about usefulness I guess. Ultimately Ruby has worked just fine. They didn't report massive issues, and the DB is going to be the bottleneck way before Ruby, so I just don't see an issue.

If I knew I was building a site for a massive load would I use Ruby? No. But I don't think it's a terrible choice either.


LOL, at least Ruby inspired a new generation of languages that we have today.


Of course not. One uses Erlang and scales to Whatsapp scale.


Guess I'll just write my next project in Bash then, after all, if it's a good idea they will come right? Execution can wait until you've got a few hundred mil.


>Guess I'll just write my next project in Bash then, after all, if it's a good idea they will come right? Execution can wait until you've got a few hundred mil.

Yeah, actually. Bash/CGI could handle plenty. If that's your most comfortable language then go for it.


Bash is seriously underrated as a programming language these days.

About 80% of the code I now write is in bash, because usually the problem one is trying to solve has a trivial solution if you just use existing tooling in creative ways.


I don't disagree with your premise and I write plenty of bash and zsh scripts but "the creative ways" very often are a rabbit hole that exposes way too many edge cases.

At one point I wake up two hours later and realize I could have finished this in 15 minutes with Golang or even Ruby.


You're telling me that they chose to build with PHP and had the most successful scaling story of all time?


Quite often, the choice of language is failing to see the forest for the trees. Saving nanoseconds or cycles because you chose C++ over Python pales in comparison to milliseconds spent at network barriers reading from a cache service or database.


Perceivable latency isn't the only consideration. Depending on your business, compute for your proprietary workloads might be one of your biggest expenses. You could see an order of magnitude improvement to resource consumption depending on the type of workload and the language it is written in.

Ultimately these choices are all about trade-offs. Maybe Python is fine for them, maybe they've built themselves into a corner. Time will tell.


When they need a full re-write the specification will be easy: they've already got the pseudocode. (I kid because I love Python)


The difference between C++ and Python is not nanoseconds.

And I have definitely seen projects fail due - in part - to language choice. Of course projects can succeed in almost any language but that doesn't mean the language choice is irrelevant.


Example? I'm curious


Python is generally around 50 times slower than C++. Obviously it varies massively with the benchmark but that's a good ballpark.

Just Google "python vs c++ speed" and you will find hundreds of examples (or "python vs rust speed" - Rust is essentially the same speed as C++).

Here's the first result - they got a 25x speedup:

https://towardsdatascience.com/how-fast-is-c-compared-to-pyt...


Did you read the article and comments? The comments point out a lot of issues with the article. Sadly, I've seen this article referenced before on other discussions.


At Facebook scale those nanoseconds are worth millions in compute, energy, etc.


But do the millions you spend a year on compute cost more or less than the millions you would spend a year in labor finding the increasingly rare breed of C++ developers who can optimize things for instruction or cycle count? Such developers usually have over a decade of experience (if not multiple decades). Python developers are a dime a dozen, comparatively.

Plus, instruction and cycle counting is low hanging fruit compared to memory latency. You can cache-optimize a program in any language so long as the memory representation of some data is relatively transparent.


> But do the millions you spend a year on compute cost more or less than the millions you would spend a year in labor finding the increasingly rare breed of C++ developers who can optimize things for instruction or cycle count?

That's a very valid question but in the case of Facebook they already have them so why not use them for that?

I mean yeah, they made their choice -- use HHVM and it likely served them very well. I am just pointing out that in their case sourcing extra (or even any at all) C++ devs is a non-issue because they already have plenty.

Fully agreed with your memory latency remark.


Yes and no, I still remember the times when several Ruby on Rails apps I consulted for had 5ms DB queries and Rails' ActiveRecord was taking 120ms.

Technically choosing C++ over Python will save you several orders of magnitude more than nanoseconds.

Though C++ might be a bad example. I'd replace Python web app with Elixir or Golang.


It may not be because of Python, but Threads is definitely not ready for the current load. A few hours ago no one could access Zuck's profile page, and he had to take it private to hide the issue.


Tweets load very slowly on my end. So slow I thought there was a js error on my end, you know, like when you get a never ending spinner animation because there was an unhandled exception thrown.


Surely you mean threads.


I think there have been issues where profile pages say “this account is private and has 0 followers.” I saw that yesterday on every account I clicked on (on web). And then a few minutes later it was back to normal. It might not be that he manually took it private to hide an issue.


PHP is actually shockingly fast for a non-JIT'd language these days.


PHP8 has a JIT now but I get what you mean.


For me, it does seem really strange since PHP (currently) is faster than Python and PHP wasn't fast enough for Facebook. But it's interesting to see how it'll play out.


It’s obvious why they went with python. Since Instagram was written in python and Threads is based on instagram, they’re probably using a lot of the libraries they already have for Instagram.


That makes some sense.


Probably Python is deemed as more appealing to hipsters than good old PHP.


Nothing good about old PHP. Been doing PHP in 2004 and Django in 2009. Both were shit by today's standards but PHP was exceptionally so.


Python is older than PHP.


In addition to the Facebook/PHP comments here, an often overlooked one is that Instagram was a big Python/Django shop for years (and probably still is in many parts).


Most of Instagram is Django.


PHP didn't scale for the number of users or the number of developers. Presumably that's why they built their own PHP-like language called Hack with static types and a new runtime.


i don't think anybody really claims python can't be used to build large software

the claim is usually more about how the maintenance costs are just way higher versus a statically typed language


But is that true if the dynamically typed language has type annotations?


They truly and tangibly help somewhat but people give them too much credit.


it's a spectrum, of course, but imo yes -- i don't want types to be opt-in at compile time, i want them to be mandatory


OTOH threads is kinda slow as shit right now


Except PHP actually is fast while Python isn't.


Except it's not. Bindings to C libraries are.


Your semantics are irrelevant. People who write PHP end up with faster performing applications than people who write Python.


Am I seeing two people debating which language is faster without quoting any numbers?


I assumed it was common knowledge at this point.

https://www.techempower.com/benchmarks/#section=data-r21&l=z...


Would this pass their System Design interview?


They didn't use vanilla Python, though.


Pfft. It would be 100x more performant if they'd written it in asm.


They also wouldn’t release this year, and perhaps not next year either.


In the end, everything is written in asm.


makes one wonder what the cost of needing bigger / more machines to make up for the slow runtime is though.


Python is just fine for all the 100 users Threads will manage to acquire. :)


> all the 100 users

That’s off by multiple orders of magnitude, if you believe their reports of 30 million sign ups in under a day.

https://www.theverge.com/2023/7/6/23786108/threads-internal-...


You mean sign-ins?


I’m using their terminology. The distinction seems irrelevant for the conversation.


I believe you should create a separate Threads account even though it uses your Instagram account. It's like OAuth2 single sign-on, but technically you still sign up for a different service.


Any language will be fast enough with enough modern hardware.

But it’s not “just Python”.

The OS state that needed to be loaded is not Python.

The network gear firmware is not just Python.

The databases are not Python.

Where it becomes just Meta's Python is after a whole lot of other necessary work is done by other smarter people (the ones who master the physics of building the machines), not just some dweebs who got pulled the internet onto the host.

These articles are just primate rage bait


They wrote a new app in the language 99.9% of 0-5 year experienced devs know.

Absolutely shocking.


Really, all the performance intensive parts are in various c++ aggregator and recommendation type services. But the webserver is Django, yes.


Thanks for your comment. I'm really interested in this topic. How do you know the web server is Django? I searched but couldn't find this.

Why would they use Django? I did some small projects in it but I assumed it wasn't very fast and wouldn't be suitable for a big app like this. I would like to know the pros and cons. Why didn't they build something in C++ or Rust? Won't Python limit the speed of responses despite the hard stuff being built in compiled languages? Sorry for being so naive, I am an amateur.


I worked at IG on a previous iteration of Threads, with some of the engineers who wrote the current Threads app. It's Django!

(Heavily modded, run on a custom Python JIT, and using an extremely custom FB-developed database, also used for IG and FB.)

It's Django because IG was originally written in Django back in the day. FB's general approach to scaling is keep the interface generally similar, and slowly replace things behind the scenes as necessary — rather than doing big rewrites. It's seemed to work pretty well.

Ultimately the language used for the web server isn't a huge deal compared to the performance of the database, for these kinds of massive social apps. Plus, well, they do have the custom JIT — but that's very new, and when I first joined IG in 2019, we were running vanilla Python in production.


Thank you, awesome answer! HN community is awesome.

Thanks also to every other answer I received for my question, I appreciated all of them.


They deemed getting to the market fast was more important than hardware costs. If their architecture is good, then they were probably really quick and flexible using Python to glue their architecture together, which would have been the development bottleneck to getting the service up and running. All the components of the service can be done by separate teams in whatever language they deem most effective.


It is fairly normal to build web servers in a way so they can be scaled horizontally (more instances rather than larger instances). So they can just have more containers run their Django servers and distribute the load between them.


At least a few years ago, most of Instagram’s server side code was in Python. Python may be slower than C++, but it’s not about raw speed it’s about being “fast enough.”


To answer the first, because I am no longer bound by NDA.

Why Django? Because that's what it originally was. Same with YouTube frontend btw. The apps just grew and grew.


Whether it's in C++ or Rust or Python, almost all of any slowness would be from database waiting anyway.


Not really. Look at Techempower web benchmarks.


It wouldn't. Databases are fast, python is slow.


Network is slow. So you are waiting on the db for most of a request lifecycle.


Network latency within an AWS AZ is <1ms and throughput is in the GB/s range.

What percentage of python webapps do you think are hitting this as their latency and throughput limit?

(Assuming effective DB use of course, i.e. not doing dozens of DB roundtrips to serve a single result or getting megabytes of data and filtering on the client etc.)
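
To show what "dozens of DB roundtrips" versus effective use looks like, here is a minimal sketch using Python's built-in `sqlite3`. The `posts` table and both function names are made up for illustration, and an in-memory SQLite database has no network hop at all, so this only shows the shape of the two access patterns, not their real cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO posts (author_id, body) VALUES (?, ?)",
    [(i % 10, f"post {i}") for i in range(100)],
)


def bodies_many_roundtrips(author_ids):
    """One query per author: N round trips for N authors."""
    out = []
    for author_id in author_ids:
        rows = conn.execute(
            "SELECT body FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        out.extend(row[0] for row in rows)
    return out


def bodies_single_roundtrip(author_ids):
    """One query with an IN clause: a single round trip for all authors."""
    placeholders = ",".join("?" for _ in author_ids)
    rows = conn.execute(
        f"SELECT body FROM posts WHERE author_id IN ({placeholders}) "
        "ORDER BY author_id, id",
        list(author_ids),
    ).fetchall()
    return [row[0] for row in rows]


print(bodies_many_roundtrips([1, 2, 3]) == bodies_single_roundtrip([1, 2, 3]))
```

Against a networked database, the first version pays the per-query latency once per author, the second pays it once in total.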


True... When I measured something similar in a large python app, the biggest chunk of time went into python object serialization/deserialization.
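
A minimal sketch of how one might measure that, using only the standard library. The payload shape is invented, `timed` is a hypothetical helper, and `pickle` is just a stand-in for whatever serializer the app actually used; absolute timings will vary by machine, so none are quoted here.

```python
import pickle
import time

# Invented payload: a list of small dicts, roughly what an ORM or cache layer might move around.
payload = [
    {"id": i, "name": f"user{i}", "tags": ["a", "b", "c"]} for i in range(50_000)
]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


blob, dump_seconds = timed(pickle.dumps, payload)
restored, load_seconds = timed(pickle.loads, blob)

print(f"serialize:   {dump_seconds * 1000:.1f} ms for {len(blob)} bytes")
print(f"deserialize: {load_seconds * 1000:.1f} ms")
```

Comparing those numbers against the measured DB round-trip time for the same request is what tells you which side actually dominates.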


That depends on how much work is done in stored procedures instead of wasting network traffic and client CPU cycles to process the results.



Most of Facebook is built on PHP. I’m surprised they didn’t choose Laravel.


It turns out calling the app "Threads by Instagram" is not just a branding gimmick. The Instagram backend was always Python/Django since before the acquisition.


That's pretty interesting, I wonder if that means that it's built by a team at Instagram, and not Facebook. I'd assume so.

I get that Facebook is still a huge success, but I do find it telling that they opted to put Threads under Instagram, rather than Facebook.


I think it's pretty obvious they went with Instagram branding, as Facebook branding has been tarnished for a while.


I'm confused.

It's Facebook by Meta. Instagram by Meta. Why isn't it Threads by Meta?


Tied to ig account.


Threads by Instagram by Meta ... TIM for short.


Laravel is not a requirement to use PHP effectively.


Oh, I know. But as a joke I chose the most well known framework.

Personally, I even prefer PHP over Python.


Laravel wasn't created until ~2010/11.


The web codebase was once written in PHP, but they moved away from that a decade ago.


It’s still very similar to php. Same syntax, same execution model. If you look at some hack snippets you would think it’s php (because it mostly is).

Switching from Hack to PHP and vice versa should be extremely easy.

That being said, they chose python because Instagram is written in python, they are probably using a lot of existing libraries.


That wouldn't even make sense. They created hack/hhvm to scale and optimize the platform and codebase. Taking a step backwards and refactoring (hack != php now) would be a horrible idea.


Interesting and lovely to hear they decided to use Django. What is your source, though?


It's a heavily modified internal fork of Django. Source: I work on our Python Language Foundation team.


What exactly do you do? This sounds like a JIT/Interpreter/Language dev job. Is this correct?

If so, do you have any recommendations or suggestions for someone getting their feet wet on this?


It's mostly foundational work around developer tooling and infrastructure that isn't already covered by other dedicated teams (eg, there is a dedicated Cinder team). My latest work has focused around formatting and linting, and includes open source work on µsort, our import sorter [1], and fixit, our custom linter [2] that makes it easy for teams to build and deploy their own custom lint rules with autofixes.

Some of my team mates work on driving internal adoption of Cinder, rolling out new versions of Python everywhere, improvements to our build and packaging systems, supporting other Python tooling teams, etc. There's a lot of cross-functional project work, and our primary goal is to improve the Python developer experience, both internally and externally wherever we can.

1: https://usort.rtfd.io

2: https://fixit.rtfd.io


What do people use for the editor? Are they not all standardized on Intellij?


Keep in mind that many/most of these tools also need to work in diff review and CI tools, not just in an editor. That said, we primarily support developers using VS Code (with internal LSP/plugins/integrations) or our internal version of Jupyter notebooks. We also have a non-negligible number of folks that prefer alternative IDEs or editors like pycharm, vim, or emacs. We try to build our tools such that they are accessible by CLI or APIs, so that they can fit into any of these systems as needed.


Are you using the ORM/auth/admin features or is Django just a lightweight router? I can't imagine the ORM being too useful at Facebook's scale.


I'm not familiar enough with Django to say for sure, but I assume at this point it's almost entirely custom ORM and libraries on top of async django routing/response.


I believe the django app would use a python version of the internal ent ORM. You can get a sense for the style of ent/node from this external fork that was written while the author had access to the original ent https://entgo.io/docs/getting-started


How is Django now? I’m a long time python dev (data eng space) but new-ish to web dev. I started a Flask project 2 years ago and found it to be pretty full of footguns, mixed messaging on best practices for scalable apps, and the ecosystem feels overbloated with vapor ware extensions.

Is this just a Flask problem, or does Django have the same issues?


You simply can't beat Django's ORM for general stuff. It's too awesome. This alone makes it so hard to choose anything else.

I know django doesn't have that "shiny factor" to it these days - but it's very reliable.

> mixed messaging on best practices for scalable apps

The WSGI stuff can be kinda confusing and is used across a lot of python frameworks including django and I think flask?

My advice for "simple scaling" is to start with a separate Postgres instance, and then use gunicorn. Use celery immediately if you have any "long lived" tasks such as email. If you containerize your web layer, you'll be able to easily scale with that.

Finally - use redis caching, and most importantly - put Nginx in front! DO NOT serve static content with django!

> the ecosystem feels overbloated with vapor ware extensions.

This still exists to some degree for some more niche stuff, largely because of its age. Although impressively they'll generally still work or work with minimal modifications. It's popular enough and old enough that most normal things you'd want to do have decent extensions or built-in support already.


Not only can you not beat Django's modeling in Python, but I simply cannot find an in-kind solution in JS either. Nothing covers all of the bases the way Django does, or more specifically DRF/graphene-django.

The current state of the art is apparently Prisma, but it covers only a small part of the full picture.


> Use celery immediately if you have any "long lived" tasks such as email

Hey, quick question from a relative newbie who is currently trying to solve this exact problem.

Besides Celery, what are good options for handling long-running requests with Django?

I see 3 options:

- Use Celery or django Q to offload processing to worker nodes (how do you deliver results from the worker node back to the FE client?)

- Use a library called django channels that I think supports all sorts of non-trivial use cases (jobs, websockets, long polling).

- Convert sync Django to use ASGI and async views and run it using uvicorn. This option is super convoluted based on this talk [0], because you have to ensure all middleware supports ASGI, and because the ORM is sync-only, so it seems very easy to shoot yourself in the foot.

The added complication, like I mentioned, is that my long-running requests need to return data back to the client in the browser. Not sure how to make it happen yet -- using a websocket connection, or long polling?
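
For the polling variant, the general shape can be sketched without Django at all. This is only an illustration under assumptions: `submit_job`, `poll_job` and `slow_report` are hypothetical names, a thread pool stands in for Celery workers, and the in-memory `jobs` dict only works inside a single process (with real workers the results would have to live in some shared store).

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)  # stand-in for worker nodes
jobs = {}  # job_id -> Future; single-process only


def slow_report(n):
    """Pretend long-running task."""
    time.sleep(0.05)
    return sum(range(n))


def submit_job(n):
    """What a 'start' endpoint would do: enqueue work, return an id right away."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(slow_report, n)
    return job_id


def poll_job(job_id):
    """What a 'status' endpoint would do: report progress without blocking."""
    future = jobs.get(job_id)
    if future is None:
        return {"status": "unknown"}
    if not future.done():
        return {"status": "pending"}
    return {"status": "done", "result": future.result()}


# Client side: submit once, then poll until the result is ready.
job_id = submit_job(10)
state = poll_job(job_id)
while state["status"] == "pending":
    time.sleep(0.01)
    state = poll_job(job_id)
print(state)
```

A websocket would replace the polling loop with a push from the server, but the submit-then-fetch-by-id split stays the same.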

Sorry I am ambushing you randomly in the comments like this, but it sounds like you know Django well so maybe you have some insights.

---

[0] Async Django by Ivaylo Donchev https://www.youtube.com/watch?v=UJzjdJGS1BM


Use anything except Celery, is my vote. Even if that "anything" is something you roll yourself.

Celery is mature, but has bitten me more than anything else.

For scheduling, there are many libraries, but it's good to keep this separate from Celery IMO.

For background tasks, I think rolling your own solution (using a communication channel and method tailored to your needs) is the way to go. I really do at this point.

It definitely is not using async; I think that will bite you and not be worth the effort.

Huey is worth a look.


Django ORM has supported async syntax for some time now, and it can work fully async starting with Django 4.2 (and psycopg3). There are still a few rough edges (such as not being able to access deferred attributes from async contexts) but there are workarounds.

I usually use `asyncio.create_task` from async views for small, non-critical background tasks. Because they run in-process on the event loop, you will lose them if the service crashes (or Kubernetes decides to restart the pod), but that's fine for some use cases. If you need persistence, use Celery or something similar.
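Roughly, the pattern looks like the sketch below. It is plain asyncio with no Django involved; `fire_and_forget`, `send_email` and `view` are hypothetical stand-ins. The one detail worth copying is keeping a strong reference to each task, since the event loop only holds weak references and an unreferenced task can be garbage-collected before it finishes.

```python
import asyncio

# Strong references to in-flight tasks so they are not garbage-collected.
background_tasks = set()
sent = []  # stands in for a real side effect, for demonstration only


def fire_and_forget(coro):
    """Schedule a coroutine on the running loop without awaiting it."""
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return task


async def send_email(to):
    """Hypothetical slow operation."""
    await asyncio.sleep(0.01)
    sent.append(to)


async def view(user_email):
    """Hypothetical async view: respond immediately, finish the work later."""
    fire_and_forget(send_email(user_email))
    return "202 Accepted"


async def main():
    response = await view("user@example.com")
    # Only so this demo can observe completion; a real server keeps running.
    await asyncio.gather(*background_tasks)
    return response
```

Nothing here survives a process restart, which is exactly the trade-off described above.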

Django combined with an async-ready REST framework such as Django Ninja is very powerful these days.


I use celery/redis, it's perfect for my use case.


Ehh, Django ORM always was one of the worst parts of Django.


True, but it's the best ORM currently available for any language. It might not be the fastest, but it is the one that has the highest level of developer comfort.

Using Django is probably the reason why I can't stand using SQLAlchemy: it's way too complicated for everything I do and it's just not as nice an experience.


> True, but it's the best ORM currently available for any language.

Such a strong assertion!

I've used multiple ORMs (not Django's), including some of Haskell's type-safe ORMs (e.g. Persistent and Beam). I could not imagine going back. What makes Django's ORM so great?


I am not a fan of ORMs personally, so haven’t tried lots of them. But even in Python, SQLAlchemy is way better than Django ORM.

Django ORM is tolerable on small projects, and quickly gets in the way on anything a bit more complicated.


Have you used SQLAlchemy and Peewee extensively? I have found these to be pretty fair when compared to Django.

Anyway, you can use the Django ORM without Django :)


>True, but it's the best ORM currently available for any language.

Better than Entity Framework? Consider me impressed.


It depends on what you consider important, and I haven't used Entity since... 2010 I think. For me, those two don't compare: the simplicity of the Django ORM makes Entity look dated, complex, like something out of the mainframe era.

Entity is way more flexible, in the sense that you can use it in any .NET project really. The Django ORM has no value outside Django; I have yet to see anyone use just the ORM, without the rest of the framework.


Early EF was a complex, GUI-based tool. Recent versions are fine now that they support the "Code-first" API.


Django is the opposite. It holds your hand and tells you how to structure your "app", templates, database, and media assets. And django-admin is a godsend. It's awesome when you are on the beaten path cause everything just works. But if you step off the beaten path you will have to tweak various obscure settings and create weird hooks to override Django's defaults. Flask is the opposite. Just a very well-written web server and it's up to you to decide things like project structure, template engine, ORM, asset manager, etc. Personally I prefer Django since most of my web applications are quite mundane and don't require the flexibility Flask can provide.


I've described that as Django is fantastic for writing Django apps. If the thing you want to do can be completely modeled in the defaults, it's impossible to beat.

If that's only 99% true, you might want to investigate other options.


You can always build 99% of the App (crud, ui stuff) in Django and the 1% be its own service.


I can't give Django enough praise. I never thought I'd be writing Python.

I've mostly used Node, Java, and Rust, with about 12yrs experience now. Only about one year with Python and Django recently. I am so much more productive than anything else I've tried. Using anything else feels like a mistake for web apps or CRUD apis. I also use type hints.

I use Django + django-ninja + django-unicorn for dynamic UIs.

Building sidewaysdata.com with django right now! Govscent is in Django too, the ORM is nice to combine with AI stuff already in python.

I'm also following the Ash framework which looks promising.


You are also running sidewaysdata.com with DEBUG turned on in production, apparently ;)


lol thanks. it's not really ready for use yet anyway :)


fixed


Is django-unicorn similar to htmx?


Unicorn is like htmx specific to Django. It reminds me more of Hotwire or Meteor. Very productive! Not 1.0 yet, but I'm happily using it in prod.


Django is fantastic. It's much bigger than flask, and has one of the best ORMs I've ever used. Also comes with a built in admin panel which is always nice. You will want to use Django REST Framework to make REST APIs, but if getting a site up and running is all you want, it's perfect.

If you want to purely make APIs though, give FastAPI a try.


I use django professionally and for personal projects. I started with Flask and then did FastAPI for a while, but I like django the most since it's the most mature.

The ORM and django admin are killer features out of the box that the other frameworks don't have. I will say though that FastAPI is really nice, especially if you need async support. However, I have found that django ninja [1] brings a lot of FastAPI's nice-to-haves to django, which makes it much more fun to use again.

[1] https://django-ninja.rest-framework.com/


+1 for django ninja. I built a little side project with it a few months back and was super impressed by how easy it was to get up and running. I didn’t want to leave the Django ORM behind so was very pleased to find this project.


Having used Node/express, Rails, Flask, and Django extensively, I have been very impressed with Django. The framework itself is good, but the superpowers come from the ORM and DRF, and a few other great plugins. The stuff you can do with the ORM and DRF is nothing short of incredible. Now, you could argue that the framework shouldn't be tightly coupled to the ORM, and sqlalchemy does provide some stiff competition as an example of that, but in my experience Django ORM edges out sqlalchemy in terms of usability.

There are some footguns around N+1 queries, but they are general to all database interfaces and pretty easy to avoid IMO.
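For anyone who hasn't run into the N+1 pattern, here is a small illustration using only the stdlib `sqlite3` module rather than any ORM. The `author`/`book` schema and the function names are made up for the example. The first function issues one query for the list plus one per row; the second gets the same rows with a single JOIN, which is roughly what Django's `select_related` does for you.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                           author_id INTEGER REFERENCES author(id));
        INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO book VALUES (1, 'A1', 1), (2, 'B1', 2), (3, 'C1', 3);
    """)
    return conn


def titles_n_plus_one(conn):
    """One query for the books, then one more query per book."""
    queries = 1
    rows = []
    books = conn.execute(
        "SELECT title, author_id FROM book ORDER BY id"
    ).fetchall()
    for title, author_id in books:
        name = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()[0]
        queries += 1
        rows.append((title, name))
    return rows, queries


def titles_joined(conn):
    """Same result from a single JOIN."""
    rows = conn.execute(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    ).fetchall()
    return rows, 1
```

With three books the first version runs four queries and the second runs one; the gap grows linearly with the number of rows.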


I've used Flask/SQLAlchemy and Django a bunch in my day, professionally and on the side.

It's a different strokes for different folks sort of deal.

Django is very opinionated, in a way that is "eventually" correct (i.e. they might have had some bad opinions many years ago, but they have generally drifted in a better direction). If the opinions don't align with what you're building, the escape hatches are not generally well-documented or without penalty.

Flask is minimalist and flexible. Once you find a "groove" to building with it that fits your sensibilities, it's quite, quite nice. That being said, the most recent versions of flask have excellent documentation, imo, and the tutorial is a bit more opinionated in a highly productive way. The "patterns" section of the docs is also super useful for productionizing your app.

Personally, I prefer the combo of Flask+SQLAlchemy, and eventually Alembic once you decide that's good for your app's maturity level. I respect Django a lot, I just enjoy the explicitly "less magical" aspects of a Flask stack, which is an opinion-based trade-off imo.


As a minor rebuttal, I get sick of Flask because it is bring-your-own everything. Nothing I write will ever NeedToScale(TM) or be off the beaten path from the Django Way. No interest in plugging in an alternative template language, form validation, email library etc. When I get stuck in Django, I know someone else has experienced the exact same situation. With Flask, I have to pray to deity that Miguel Grinberg has written about a similar enough situation. The majority of documentation foregoes Flask Blueprints.


Same opinion. Django is marvelous for getting stuff done.


I'm a self-taught noob as far as web engineering goes, but between this course (cs50.harvard.edu/web/2020/) and ChatGPT, I got several commercial Django sites live with very little time and effort.


Django is excellent. There are some rough edges and signs of age, but the core ORM is fantastic to develop with.


Instagram devs have given a lot of talks about python and django at conferences. As far as I know IG is still running on django so it makes sense that the dev team would stick with it for something new.


Django is a WSGI/ASGI framework, not a web server. What do they actually use to terminate HTTP?


Not sure about what Meta is using for Threads, but gunicorn and nginx are a common set up for Django in production.

Some will use `python manage.py runserver` in production, and they are using the de facto wrong setup. Don't ever do that.


Most likely a reverse proxy of some sort.


How heavily are type annotations and type checking used in the codebase?


It’s fully typed and nothing can be merged that doesn’t pass the type checker


What are the benefits of python with types vs a statically typed language like java, golang, etc?


The main “benefit” is that Instagram is written in Python and always has been.

It’s millions of lines of code, you can’t just change it to be Java one day.


Correct me if I'm wrong, but Django is not a web server. It's a framework used in conjunction with an app server e.g. gunicorn and a web server e.g. nginx
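To make the split concrete, here is a minimal sketch of what the app server actually talks to: a WSGI callable. This is not Django code, just the bare interface from PEP 3333 that a framework like Django implements for you; `call_app` is a hypothetical helper that plays the role of the server so the example runs without opening a socket.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI app: the piece a framework normally provides."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Stand-in for the app server: build an environ and invoke the app."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

An app server such as gunicorn does the job of `call_app` for real HTTP requests, and a web server like nginx usually sits in front of it as a reverse proxy.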


This would partially explain why it feels so slow. Some page refreshes are taking up to a couple of seconds.

It's probably hurried along python code and fairly simple horizontal scaling. Some population of individual instances are probably pegged, and random requests are frequently landing on their backlogged request queues.

This seems to have been rushed out the door to capture a unique market opportunity, and they'll clean up the engineering once they get engagement.


> It's running on Instagram's #Cinder fork that includes a JIT, lazy-loaded modules, pre-compiled static modules, and a bunch of other interesting tweaks against vanilla Python 3.10.

So not entirely just python 3.1.


I have ported code from Cinder to CPython. The fork has some optimizations that can be easily put in CPython and Facebook is open to porting features. I’m not sure if Facebook wants to continually have a fork but CPython is open to having those features merged in if they make sense.


The Cinder team's long-term goal is to upstream as much as we can, and make the rest available as pip-installable extensions that anyone could theoretically install and use on CPython. The fewer internal changes and patches we need to maintain, the faster we can adopt upstream Python releases and all of the associated performance and tooling wins.


Okay cool thanks for the clarification


I've been following it with updates from Talk Python To Me and Python Bytes.

Great work, much appreciated.


That seems like pretty much the perfect way to go about things. Kudos on that work!


Mind that Python 3.10 is nine major versions ahead of 3.1


*minor


:s/major/minor


No, I meant major. Each of the 3.x releases brings in new features, deprecations, and even sometimes removed features. They are all major releases.

Python doesn't go by semver rules. Semver isn't a definition for all software project versioning systems everywhere.


You are factually wrong.

"To clarify terminology, Python uses a major.minor.micro nomenclature for production-ready releases. So for Python 3.1.2 final, that is a major version of 3, a minor version of 1, and a micro version of 2.

* new major versions are exceptional; they only come when strongly incompatible changes are deemed necessary, and are planned very long in advance;

* new minor versions are feature releases; they get released annually, from the current in-development branch;" — https://devguide.python.org/developer-workflow/development-c...
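A quick way to see why this matters in code: version strings have to be compared component by component, not as floats or plain strings. The `parse_version` helper below is a made-up name for illustration and only handles purely numeric versions.

```python
import sys


def parse_version(text):
    """Turn '3.10' into (3, 10) so comparisons work per component."""
    return tuple(int(part) for part in text.split("."))


# The running interpreter exposes the same major.minor.micro split.
major, minor, micro = sys.version_info[:3]

# Tuples compare element by element, so 3.10 sorts after both 3.1 and 3.9.
newer_than_3_1 = parse_version("3.10") > parse_version("3.1")
newer_than_3_9 = parse_version("3.10") > parse_version("3.9")

# Floats and plain strings both get it wrong.
float_collapses = float("3.10") == float("3.1")
string_misorders = "3.10" < "3.9"
```

All four flags at the bottom come out true, which is the whole 3.10 versus 3.1 confusion in miniature.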


Good point 3.10 != 3.1


Python is a spec not an implementation, for what it is worth.

