
Python cryptography, Rust, and Gentoo


By Jake Edge
February 10, 2021

There is always a certain amount of tension between the goals of those using older, less-popular architectures and the goals of projects targeting more mainstream users and systems. In many ways, our community has been spoiled by the number of architectures supported by GCC, but a lot of new software is not being written in C—and existing software is migrating away from it. The Rust language is often the choice these days for both new and existing code bases, but it is built with LLVM, which supports fewer architectures than GCC supports—and Linux runs on. So the question that arises is how much these older, non-Rusty architectures should be able to hold back future development; the answer, in several places now, has been "not much".

The latest issue came up on the Gentoo development mailing list; Michał Górny noted that the Python cryptography library has started replacing some of its C code with Rust, which is now required to build the library. Since the Gentoo Portage package manager indirectly depends on cryptography, "we will probably have to entirely drop support for architectures that are not supported by Rust". He listed five architectures that are not supported by upstream Rust (alpha, hppa, ia64, m68k, and s390) and an additional five that are supported but do not have Gentoo Rust packages (mips, 32-bit ppc, sparc, s390x, and riscv).

Górny filed a bug in the cryptography GitHub repository, "but apparently upstream considers Rust's 'memory safety' more important than ability to actually use the package". As might be guessed, the developers of the library have a bit of a different way of looking at things. But the enormous comment stream on the bug made it clear that many were taken by surprise by the change that was made in version 3.4 of cryptography, which was released on February 7.

Christian Heimes, one of the contributors to the library, made it clear that they would not be removing the dependence on Rust. He pointed to an FAQ entry on how to disable the Rust dependency when building the library, but noted that doing so will no longer work once cryptography 3.5 is released. He also pointed out that Rust is solely a build-time dependency; there are no run-time dependencies added.
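The workaround in that FAQ entry is an environment variable checked by the cryptography 3.4 build; a sketch of using it when building from source (per the FAQ, this escape hatch goes away in 3.5):

```shell
# Build the 3.4 source distribution without the Rust extension;
# --no-binary forces a local build rather than a prebuilt wheel.
CRYPTOGRAPHY_DONT_BUILD_RUST=1 pip install --no-binary cryptography cryptography==3.4
```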

But multiple people in the bug report complained about how the notice was given that the Rust dependency was being added; some thought that the project followed semantic versioning, which would mean that this kind of change should not come in a minor release. It turns out that the project has its own versioning scheme, which allows this kind of change (as does semantic versioning, actually). But Heimes did indicate that there may not have been sufficient communication about the change. He pointed to a pull request from July 2020 and a December 22 cryptography-dev mailing list announcement, both by Alex Gaynor, as places where the issue surfaced. Following links from those finds more discussion of the idea, but it is clear that news of the upcoming change did not reach far outside of the cryptography community. In part, that may be due to the usual way users get the library, as Heimes explained:

The majority of users either uses binary wheels (macOS x86_64, glibc Linux x86_64 + aarch64, Windows X86 + X86_64) or Linux distro packages. Binary wheels don't require an additional Rust libraries. Only users on Alpine (musl), BSD, other hardware platforms, and distro packagers are affected.

Many of the Alpine Linux users who were affected by the change, some of whom were loudly annoyed in comments on the bug, have continuous-integration and deployment (CI/CD) systems that update and build relevant packages frequently. In this case, though, the missing Rust compiler broke many of those systems. The most recent Alpine versions do have Rust support, though, so the fix there is, or at least may be, fairly straightforward.

But for architectures that do not currently support Rust, and, in truth, likely never will, there is no way forward except perhaps forking cryptography and maintaining the C-based version going forward. Górny suggested that in the gentoo-dev thread and in the bug. Others were similarly inclined, but it is unclear whether there is really enough wherewithal to support such a fork. Python 3.8 and 3.9 release manager Łukasz Langa challenged Górny (and others) to proceed with a fork: "I invite you to do that. Please put your money and time where your mouth is. Report back in a year's time how it went."

Langa also pointed out that the cryptography maintainers are volunteers as well, which means they get to allocate their efforts in whatever direction they wish, even if it makes it inconvenient for other volunteers elsewhere. Beyond that, those changes are being made for a reason:

Before I begin, I'd like to remind you that security is a numbers game. If the cryptography maintainers can help 90% of their users by switching to a modern memory-safe language, then they'd be irresponsible holding back just because among the remaining 10% there exist fringe platforms which can't even run a Rust compiler.

[...] You expect those volunteers to keep their security-focused project dependent on inherently insecure technology because it would make your own job easier. Your goals and requirements might not be matching the goals and plans of the maintainers of cryptography. It might be unfortunate for you but it really is as simple as that.

The bug comments went on at length; there were some real problems that needed addressing in the way the CI/CD systems were handling versions in dependencies like cryptography, for example. But there was plenty of heat directed at the developers for "forcing" their Rust choice on others, and for "breaking" various systems. For their part, the developers have tried to help those with systems that can run Rust, but have shrugged their shoulders about the others.
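The immediate mitigation adopted by many of those CI/CD systems was simply to pin the dependency below the first Rust-requiring release; a hypothetical requirements snippet (not from any particular project):

```
# requirements.txt — hold cryptography below the Rust-requiring 3.4
cryptography<3.4
```

Pinning papers over the break but also freezes the project on a line that will only receive CVE fixes, so it is a stopgap rather than a solution.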

Eventually, things boiled over and commenting was disallowed from anyone other than project contributors. Gaynor, in particular, felt that the problems were unavoidable for these, largely ancient, platforms. Once the thread had closed, he summarized what had been discussed and reiterated that the cryptography developers are not going to be held back by platforms that do not support Rust.

Back in Gentoo-land, it turned out that Portage's cryptography dependency arose because it was using urllib3 and requests. Those two packages in Gentoo are dependent on cryptography, but it turns out that they do not actually need it. A pull request to fix that was merged, so the problem for Portage, which is pretty fundamental to the operation of a Gentoo system, was averted.

At least it was averted for now. Górny is concerned that the trustme TLS test-certificate generator, which is used in the distribution's tests, does need cryptography, so some platforms may not be able to be fully tested. On the other hand, the cryptography developers have decided to create a 3.3 LTS release that will maintain the pre-Rust version of the library until the end of 2021. Only fixes for CVEs will be made in that version, however.

But Górny has a bigger worry. He believes it is possible that some future version of Python itself will require Rust, though it is not entirely clear what he is basing that on. It would be devastating for Gentoo on the architectures that do not have Rust, since the distribution relies heavily on Python. It would seem likely to be problematic for other distributions as well, but the only real solution there is to get LLVM (thus Rust) working for those architectures—or for the gccrs GCC front-end for Rust (or a similar project) to make further progress.

While it may well be that Python itself does not go down that path, it is pretty clear that Rust is becoming more popular with every passing day. It would certainly be wonderful if it could be supported "everywhere", but it is going to take some real work to get there. The LLVM developers have been somewhat leery of taking on new architectures, unless they can be convinced there will be long-term support for them, which is understandable, but makes the problem even worse.

We saw a problem similar to Gentoo's back in 2018 with Debian and librsvg, and we are likely to see it recur—frequently—over the coming years. It is not unreasonable for projects to use new tools, nor for projects to be uninterested in supporting ancient architectures. It is most certainly unfortunate, but we find ourselves between the proverbial rock and its friend, the hard place. Perhaps, with luck, something will change with that predicament.


Index entries for this article
Security/Python
Security/Rust
Python/Cryptography
Python/Other languages



Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 0:27 UTC (Thu) by sthibaul (subscriber, #54477) [Link]

> fringe platforms which can't even run a Rust compiler.

Did he actually try to port Rust to a new platform?

That sentence is completely upside down. It is Rust that can't even build&run on a Posix system.

It really is a beast. And I haven't seen anything in Rust to make it easy. Basically you have to go over all your libc headers and describe them structure by structure, bit by bit to rust. And Rust people don't seem to care about the situation since they have already ported to the only few platforms they care about...

LLVM += new architecture

Posted Feb 11, 2021 1:56 UTC (Thu) by marcH (subscriber, #57642) [Link]

> Basically you have to go over all your libc headers and describe them structure by structure, bit by bit to rust.

"All your libc headers" are not all architecture-specific, are they? Can you elaborate?

> > fringe platforms which can't even run a Rust compiler.

I can't wait for the next LWN article on the real issue: how much effort is this, really? Plural starts at two, so LLVM must have some solid architecture abstractions, no?

> > The LLVM developers have been somewhat leery of taking on new architectures, unless they can be convinced there will be long-term support for them, which is understandable, but makes the problem even worse.

This could/should start in an unofficial branch. Don't distributions routinely tweak toolchains already? This caused most C portability issues across Linux distributions I've met. I understand rebasing such a branch would be more work (how much more?) than merely "tweaking" a toolchain but at least the tooling and workflow should be there already.

You may consider Rust "exotic" but LLVM is of course not.

LLVM += new architecture

Posted Feb 11, 2021 10:58 UTC (Thu) by sthibaul (subscriber, #54477) [Link]

> "All your libc headers" are not all architecture-specific, are they? Can you elaborate?

For a given OS, not all headers are arch-specific, but a large part of them is. Think about dirent structure, poll structure, fenv details, pthread structures, signal structures, stat structures, time structures, etc. ad nauseam.

And for a different OS, it's basically all the headers that need to be ported. Making basic free software libraries depend on rust basically means excluding from the free software game any non-Linux OS unless somebody takes up the daunting task of explaining the libc headers to Rust, while Rust could simply act like all other such languages (perl, python, haskell) do: just interpret the libc headers at configure time.
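The kind of bit-by-bit description being discussed here looks roughly like this in the style of Rust's hand-maintained libc crate (a simplified sketch, not the crate's actual source; the `time_t == long` assumption is for illustration):

```rust
// Per-platform ABI details are spelled out in the Rust source rather
// than read from the C headers at build time.
#[cfg(target_pointer_width = "64")]
pub type c_long = i64;
#[cfg(target_pointer_width = "32")]
pub type c_long = i32;

// A structure like timespec must match the C layout bit for bit.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct timespec {
    pub tv_sec: c_long,  // assumption: time_t is a long on this platform
    pub tv_nsec: c_long,
}

fn main() {
    // The Rust definition is only correct if it agrees with the C ABI;
    // nothing checks that automatically, which is sthibaul's complaint.
    println!("timespec is {} bytes", std::mem::size_of::<timespec>());
}
```

Every such definition has to be audited by hand against the platform's headers, and a new OS or architecture means writing them all again.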

LLVM += new architecture

Posted Feb 26, 2021 13:48 UTC (Fri) by Gaelan (subscriber, #145108) [Link]

For what it's worth, there's bindgen, which automates the process of converting C headers to a Rust interface. It looks like the Rust people aren't using it for libc at the moment, but if you needed it to support a new architecture/libc, you could write a version of the libc crate that used bindgen, or just use bindgen's output as a starting point to add to the hand-written definitions in the libc crate.
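As a rough illustration of that approach (assuming the bindgen crate is available as a build-dependency, and a hypothetical wrapper.h listing the headers to translate), a build script using it looks something like:

```rust
// build.rs — sketch only; the header path and output location are assumptions.
fn main() {
    let bindings = bindgen::Builder::default()
        .header("wrapper.h") // hypothetical header including the libc APIs needed
        .generate()
        .expect("bindgen failed to generate bindings");
    bindings
        .write_to_file("src/bindings.rs")
        .expect("could not write bindings");
}
```

The generated file would then serve as a starting point to be audited and amended by hand, as the comment suggests.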

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 8:52 UTC (Thu) by josh (subscriber, #17465) [Link]

> And Rust people don't seem to care about the situation since they have already ported to the only few platforms they care about...

We care a great deal about portability, and Rust supports quite a few target platforms, including both production and hobby platforms.

None of the mentioned platforms have had hardware manufactured for years.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 11:02 UTC (Thu) by sthibaul (subscriber, #54477) [Link]

> > And Rust people don't seem to care about the situation since they have already ported to the only few platforms they care about...
>
> We care a great deal about portability

Then where is a tool that just picks up the libc headers to determine the sizes of structures etc. rather than having to hardcode everything in the rust repository? I did spend some time at some point to start writing them, copy/pasting from BSD files for a fair part, but everything has to be checked bit for bit to be sure that it's accurate, that is a very bug-prone process.

And... the pace at which Rust goes made me feel that probably what I wrote at the time is simply completely outdated since then and I'd better restart over from scratch next time I happen to find some time on this.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:28 UTC (Thu) by dbaker (guest, #89236) [Link]

Are you talking about bindgen?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:32 UTC (Thu) by farnz (subscriber, #17727) [Link]

The tool that does that is called "bindgen". It's imperfect (necessarily so, because some of the descriptions in libc headers are in the form of human constraints on - e.g. - which #defines can be ORed together) but it does a decent first cut. The need for better is why the libc crate has hand-produced bindings instead, which have been human-read and the applicable extra constraints applied.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:19 UTC (Thu) by sthibaul (subscriber, #54477) [Link]

> The tool that does that is called "bindgen".

Ok, thanks for the hint! At the time (November 2018) there was very little mention of this within rustc, and people had told me that they didn't think there was something like that.

> which #defines can be ORed together

Sure, at some point there is semantic that C doesn't provide. But that semantic is all the same for all Posix platforms...

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:50 UTC (Thu) by farnz (subscriber, #17727) [Link]

Unfortunately, it's not the same for all POSIX platforms - just about every platform has extensions to POSIX that aren't the same as other platforms :-(

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:55 UTC (Thu) by sthibaul (subscriber, #54477) [Link]

> Unfortunately, it's not the same for all POSIX platforms - just about every platform has extensions to POSIX that aren't the same as other platforms :-(

Sure there are some extensions, but that's not much compared to the common Posix API.

Porting rust to a system that provides the Posix interface should just be a matter of running a script that looks for the ABI of the Posix interface (even if only to record it in the source tree). Like perl has been doing for decades.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:57 UTC (Thu) by josh (subscriber, #17465) [Link]

bindgen has been around for a while, but it's fiddly to build. I'm currently working on some things that may improve that.

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 11:39 UTC (Thu) by freem (guest, #121851) [Link]

>> which #defines can be ORed together

> Sure, at some point there is semantic that C doesn't provide. But that semantic is all the same for all Posix platforms...

Is it C which does not, or is it the source code?
In many places, I see long lists of #defines which are only powers of 2. I never understood why they don't use bitfields which, if I am not wrong, are in C since at least C89?

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 17:12 UTC (Thu) by nybble41 (subscriber, #55106) [Link]

> In many places, I see long lists of #defines which are only powers of 2. I never understood why they don't use bitfields which, if I am not wrong, are in C since at least C89?

Bitfields have some downsides, the biggest one being the lack of a standard, portable memory layout. The placement of individual bitfields within a structure is implementation-defined and varies in practice according to the target architecture, especially with respect to byte order. As a result, it is generally considered best practice to avoid directly sharing structures with bitfields between programs which may not share the same code-generation settings—for example, between user-space and kernel-space, or anything related to the layout of bits in a hardware register or persistent storage. Even for local data within the same process there are some ergonomic issues, such as the fact that one cannot take the address of a bitfield, make use of atomic operations, or easily set, clear, or test multiple bitfields within the same structure in the same operation in the same way that one can use bitwise AND/OR operations with a bitmask. The compiler can paper over some of these issues but this relies more heavily on the optimizer to merge together a series of one-bit read-modify-write sequences.

Python cryptography, Rust, and Gentoo

Posted Feb 25, 2021 14:11 UTC (Thu) by myrrlyn (guest, #145084) [Link]

one of the reasons that i, personally, am looking forward to rust pushing out c is that i have a library that covers all of the points about bitfields you just laid out, and makes them trivial to correctly, conveniently, and performantly use in software that needs the space compaction

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:03 UTC (Thu) by josh (subscriber, #17465) [Link]

I would love to see much of the libc crate autogenerated; someone would need to write the necessary code to make that happen. It shouldn't just autogenerate at build time, because that would require an installed version of the libc headers for a target in order to cross-compile for that target (which you don't currently need). That would also make the libc crate dependent on the version of headers you have installed. Today, the libc crate carefully avoids using features that are "too new" for the versions of libc that it supports. (Rust typically supports running with the version of glibc from the previous RHEL or Debian stable, for instance, rather than requiring the latest libc.)

So, autogenerating the libc crate would likely need to happen as part of constructing that crate (before publishing it), or alternatively the libc crate would need to ship with copies of appropriate versions of the libc headers for supported platforms. Either way, this would likely be welcome, but would require some substantial infrastructure to implement. If someone is interested in implementing that, I would suggest starting a conversation on the libc issue tracker about what the architecture and requirements for that would look like.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:11 UTC (Thu) by sthibaul (subscriber, #54477) [Link]

> It shouldn't just autogenerate at build time, because that would require an installed version of the libc headers for a target in order to cross-compile for that target (which you don't currently need).

I don't really understand that argument.

When you cross-build something for another target, you'd just install the libc headers for that other target, yes! I don't see why you wouldn't. Making platform ports and libc bindings upgrade way more difficult just for this reason seems terribly weak to me.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:18 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

The problem is that it adds another axis to your "build environment matrix". In addition to target triple and compiler version, I now need to ask "what libc do you have installed?". And I need to document how to get that information for any given build environment. Oh, you're on macOS trying to cross-compile to Linux? Oof, sorry, try again next life.

Hand-coded bindings are annoying to keep in sync, but finding out someone is trying to target an older libc, a newer libc, or a completely different OS doesn't sound like less maintenance effort to me. Autogeneration is fine, but I'd like *that* generated code committed because it's just too damn important to leave up to the wild west of Random Developer Machine.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:23 UTC (Thu) by sthibaul (subscriber, #54477) [Link]

> Autogeneration is fine, but I'd like *that* generated code committed because it's just too damn important to leave up to the wild west of Random Developer Machine.

Ok, fine. That's actually exactly the approach that perl has been using for decades.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 15:04 UTC (Wed) by iainn (guest, #64312) [Link]

> Autogeneration is fine, but I'd like *that* generated code committed because it's just too damn important to leave up to the wild west of Random Developer Machine.

Isn't that more of an argument to perform the autogeneration in a hermetic environment, like a container? Maintainers might also have Randomish environments.

(I agree the uploaded crate should contain the autogenerated code; not needing libc headers, as discussed, is a good reason.)

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 16:38 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

Even then, you might be missing something due to an `#if` check you're not up-to-date with. I think the kernel providing its ABI via CTF or the like is *far* better in this realm (at least for the Linux-specific bits of the question). Of course, for libc/POSIX/etc., the headers *are* the definition, so that's what one should use there.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:39 UTC (Thu) by josh (subscriber, #17465) [Link]

> When you cross-build something for another target, you'd just install the libc headers for that other target, yes! I don't see why you wouldn't.

Because that makes cross-compilation harder and more fragile. LLVM-based environments typically have all the code generation capabilities for every platform available through the same binary and toolchain. Having to install a specific cross-libc, and having the capabilities of the libc crate depend on your installed libc headers, would make cross-compilation harder, less reproducible, and more fragile.

This doesn't mean we can't do code generation based on those headers, but we'd want to ship either the headers or the generated code with the libc crate.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 8:44 UTC (Mon) by glaubitz (subscriber, #96452) [Link]

> All of the mentioned platforms have not had hardware manufactured for years.

Actually, that's not true. There is regularly new hardware developed for the Amiga, and the Linux kernel even sees support for new Amiga hardware:

> https://lore.kernel.org/lkml/20190811043253.24938-1-max@e...

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 7:38 UTC (Thu) by lysse (guest, #3190) [Link]

Rust supports quite a few modern targets? That's great, but consider the number of targets supported by C: All of them.

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 20:37 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Sure. C is also 50 years old. Rust is what, 10? First, no language is going to show up with the platform support C has right out of the gate. Also, not all platforms are still relevant today. Those that are will gain Rust support if there's enough interest by users of those platforms to use the tools provided by Rust. If they're not, well, they won't get Rust software.

Python cryptography, Rust, and Gentoo

Posted Apr 16, 2021 21:35 UTC (Fri) by wtarreau (subscriber, #51152) [Link]

> First, no language is going to show up with the platform support C has right out of the gate

You're confusing language and compiler. A language is platform-agnostic, it even works with paper and pen.

The problem here is that this so-called language knows a single compiler whose developers are not interested in porting to what they consider irrelevant platforms. It's their right, it's their project. What is sad is that jerks are blindly following this without consideration for their own users. With python cornering itself behind the sole list of platforms supported by rust and betraying its users, we're certain never to ever see python 4.

Python cryptography, Rust, and Gentoo

Posted Apr 16, 2021 22:23 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> What is sad is that jerks are blindly following this without consideration for their own users.

That seems to be a lot of projection to me. If my target audience is mainstream platforms or the project's goal is to implement something securely, what obligations do I have to those who happen to find it useful in their niche without considering to make sure it's within my project's goal or target audience? If someone comes up to me and says "hey, I got it working on the Nintendo Switch last year, your latest release breaks it", what am I to do? I'm not getting (or potentially bricking) a Switch to test it and I never claimed to support such a thing anyways. Patches that don't interfere with other things are fine, but if you're asking me to resurrect some pre-refactoring code just for your setup, sorry. Feel free to fork though.

I suspect this will be used for new drivers. Existing drivers will need a lot of "oomph" to warrant a rewrite. It seems to also have put more interest into the GCC frontend, so if/when that reaches the finish line, it'll be just as good for all the other Linux platforms.

> With python cornering itself behind the sole list of platforms supported by rust and betraying its users, we're certain never to ever see python 4.

Python has done no such thing. The maintainers of cryptography may arguably have done this (I make no claim to either argument on that front here), but they are not the maintainers of the CPython project. Any Python 4 was known to never be a thing because the maintainers recognized something of the trainwreck that the Python3 migration ended up being.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 11, 2021 16:20 UTC (Thu) by clugstj (subscriber, #4020) [Link]

That wording struck me as odd also. It's not the platforms that can't run Rust, it's that Rust doesn't support those platforms.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 11, 2021 22:24 UTC (Thu) by moltonel (guest, #45207) [Link]

Round peg, square hole, semantically one is not more responsible than the other. We tend to feel that the other team should do the job, so do you identify more with the platform user inconvenienced by the new dependency on a language that seems unnecessary, or with the developer inconvenienced by the obsolete platform he'll never use ? Is the incompatibility a bigger issue for the platform or for the language ?

We all have an instinctive answer to those questions but it doesn't really matter, we all agree that the incompatibility is a problem, and it can be approached from either side.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 12, 2021 8:24 UTC (Fri) by sthibaul (subscriber, #54477) [Link]

> Is the incompatibility a bigger issue for the platform or for the language ?

When the dependency is put on librsvg, which is a dependency for gtk, yes it does hurt strongly.
If the dependency is added to python itself, which is a dependency for so much free software, that could as well just kill the platform. So much for the dream of free software.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 12, 2021 17:14 UTC (Fri) by nix (subscriber, #2304) [Link]

The dream is still alive: the software is free, so you are free to port LLVM and Rust to any platform of your choice (and maintain it yourself forever, probably, because I kinda doubt the LLVM maintainers want to drag around m68k support in the upstream tree).

The free software dream was never about *having* stuff. It was always about *being able to do* stuff, and that is still there.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 12, 2021 23:22 UTC (Fri) by Wol (subscriber, #4433) [Link]

Actually, following the llvm mailing list, I think m68k support is being welcomed into the tree.

Granted, it's on the basis "you want it in, you need to make sure it's maintained", but the llvm attitude to architectures seems to be similar to the linux attitude to device drivers - better in than out!

Cheers,
Wol

"fringe platforms which can't even run a Rust compiler"

Posted Feb 14, 2021 22:25 UTC (Sun) by rodgerd (guest, #58896) [Link]

The problem has never been that these things can't be done: it's that the people making noise about them either want someone else to do it, for free; or a concern-trolling because they're looking for an excuse to block a tool that they don't like, while failing to offer a meaningful alternative.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 15, 2021 8:51 UTC (Mon) by glaubitz (subscriber, #96452) [Link]

> The problem has never been that these things can't be done: it's that the people making noise about them either want someone else to do it, for free; or a concern-trolling because they're looking for an excuse to block a tool that they don't like, while failing to offer a meaningful alternative.

That's not true. I'm one of these loud voices and I'm actually also one of the people who did lots of contributions to Rust to make it more portable:

> https://lwn.net/Articles/771355/

"fringe platforms which can't even run a Rust compiler"

Posted Feb 15, 2021 8:49 UTC (Mon) by glaubitz (subscriber, #96452) [Link]

> The dream is still alive: the software is free, so you are free to port LLVM and Rust to any platform of your choice

What would the Rust developers say if the LLVM project hypothetically changed its code in a way that it could no longer be used with Rust?

Would your answer also be "Go maintain your own LLVM fork!"?

Or if the kernel developers decided to drop support for anything but large IBM mainframes and POWER servers?

Would you also say "No problem, I'll maintain my own kernel fork!"?

"fringe platforms which can't even run a Rust compiler"

Posted Feb 15, 2021 11:30 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

> What would the Rust developers say if the LLVM project hypothetically changed its code in a way that it could no longer be used with Rust?
What if GCC tomorrow decides to drop all languages except Ada?

> Would your answer also be "Go maintain your own LLVM fork!"?
That's actually what Rust had been doing for a while. They used to maintain a private fork of LLVM with Rust-specific patches. So yes, "go and maintain your fork".

> Or if the kernel developers decided to drop support for anything but large IBM mainframes and POWER servers?
> Would you also say "No problem, I'll maintain my own kernel fork!"?
Yup.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 12, 2021 17:26 UTC (Fri) by moltonel (guest, #45207) [Link]

Python isn't getting a rust dependency, it's "just" a (pretty important) python package that is. Python *is* dropping support for the s390 platform, as support had already been dropped left and right, including in the Linux kernel.

IMHO the "dream of free software" is intact: all the tools are available to resolve the incompatibilities and we all agree on that goal. We "just" need somebody motivated enough to do or sponsor the work. It's already happening to some extent, but it can't be done overnight and it can't delay much the vast majority of users who are up to date.

I suppose another lens to analyze the situation with is whether these incompatibilities are the cause or the symptom of a platform's death.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 12, 2021 22:14 UTC (Fri) by sthibaul (subscriber, #54477) [Link]

> I suppose another lens to analyze the situation with is whether these incompatibilities are the cause or the symptom of a platform's death.

Yes, but making ports difficult can also kill in the egg possibly interesting new projects.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 13:54 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Sure, any little thing could "kill in the egg possibly interesting new projects". Economic recession, war, etc. Less hyperbolically, I don't know that LLVM, Rust, et al. are actively making ports more difficult, other than raising the bar that needs to be met to qualify as a Supported Platform. If those platforms are not able to support more than C or C++, why do *they* get to hold the rest of the world hostage with such a low bar of offered features?

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 19:38 UTC (Sat) by sthibaul (subscriber, #54477) [Link]

I'm not saying that platform shouldn't bother porting more than a C/C++ compiler.

There is room for a middle ground between "you have to do a lot of work to port Rust to your platform" and "you only need a C compiler and Posix environment to port Rust to your platform".

That said, the free software ecosystem has historically needed only roughly a C compiler to get perl, python, etc., which helped various platform projects a lot in getting stuff working.

If the bar is raised too much, the possibility of interesting projects lowers.

If bash/gcc/gdb/etc. were not easy to port in early 90s, we would just not have Linux today.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 20:48 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

These days it's more like "you need LLVM for your platform". This will automatically give you a C/C++ compiler, Rust and soon Go. If anything, it's actually easier to port non-trivial userspace to new platforms today.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 20:50 UTC (Sat) by sthibaul (subscriber, #54477) [Link]

No, LLVM won't immediately give you Rust; see other comments about the workload of porting libc.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 21:51 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

Rust doesn't need libc.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 21:58 UTC (Sat) by sthibaul (subscriber, #54477) [Link]

But cargo does, and building Rust needs cargo?

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 22:12 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

It depends on what you WANT to do. If you're bringing up a small embedded platform then you don't need to run cargo on it. You just do cross-compilation, which only requires LLVM for the target.

If you want to bring up a full large-scale OS (think Fuchsia or Linux) then you'll need a libc for Python and other tools anyway. You'll need to port the libc crate to that OS as well, but this will be a pretty small task overall compared to the amount of work you have to do.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 22:20 UTC (Sat) by sthibaul (subscriber, #54477) [Link]

> You'll need to port the libc crate to that OS as well, but this will be a pretty small task overall compared to the amount of work you have to do.

As I mentioned in other comments, porting the libc crate is currently *not* a small task: you have to go over all system-provided libc headers to check for precise bits and bytes. This step deserves some automation, like perl/python/etc. have.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 22:29 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

Here's an example of libc for VxWorks (a non-Unix OS): https://github.com/rust-lang/libc/blob/master/src/vxworks...

It's 2000 lines, and most of them can be trivially generated by cut&pasting headers and doing string replaces. You can write it within a day.

Automating it would be nice, but different OSes can have wildly different header conventions.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 22:37 UTC (Sat) by sthibaul (subscriber, #54477) [Link]

> most of them can be trivially generated by cut&pasting headers and doing string replaces.

cut&pasting+string replace from the C .h headers??

> You can write it in within a day.

For somebody that doesn't even know Rust from the start?

> Automating it would be nice, but different OSes can have wildly different header conventions.

I'm not saying to automate string replacements, but simply do like all other languages implementations I have seen do: interpret the C headers in C, i.e. bindgen.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 23:13 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

> cut&pasting+string replace from the C .h headers??
Yup. Get your platform's headers, copy the definitions and then massage them until they are valid Rust.

If you want something minimalistic, you can even skip most of them (I'd estimate that around 500 lines are truly needed).

> For somebody that doesn't even know Rust from the start?
Add one more day for cargo-culting from an existing file. If you've ever had to debug autohell, you're qualified enough to do it.

> I'm not saying to automate string replacements, but simply do like all other languages implementations I have seen do: interpret the C headers in C, i.e. bindgen.
That is possible, but you'd spend quite a bit of time debugging the generated code for your custom platform anyway. And given that these files are pretty much static once they are written, they don't need a lot of maintenance.

In reality, bringing up a non-Unix platform is such a huge undertaking that you'd likely be better off by writing the libc in _Rust_.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 23:20 UTC (Sat) by sthibaul (subscriber, #54477) [Link]

> Get your platform's headers, copy the definitions and then massage them until they are valid Rust.

glibc, for instance, has ~500 headers, and the definitions are split among arch-specific and non-arch-specific files. Just to give an example for the st_dev member of the stat structure: it is actually defined in bits/stat.h, with type __dev_t. Type __dev_t is defined in bits/types.h from __DEV_T_TYPE. That type is defined in bits/typesizes.h from __UQUAD_TYPE. That type is defined in bits/types.h from __uint64_t.

> That is possible, but you'd spend quite a bit of time debugging the generated code for your custom platform anyway.

What is left to debug, when the C interpretation is *by definition* the correct answer?

> In reality, bringing up a non-Unix platform is such a huge undertaking that you'd likely be better off by writing the libc in _Rust_.

Why Rust?

Really, that's exactly the point: it looks like Rust people want to bring the whole world into Rust, and not bother with the rest of the world that isn't Rust.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 23:25 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

> glibc, for instance, has ~500 headers, the definitions are split among arch-specific and non-arch-specific files.
If you have a libc from your device vendor, just do "#include <socket.h>" and pre-process the file. If you are making a completely new arch, then YOU have to write these 500 files.

> What is left to debug, when the C interpretation is *by definition* the correct answer?
For example, Rust has special definitions to navigate the socket headers (these are macros in C, not functions). C also doesn't specify the error returns.

> Why Rust?
Because it's safer than C and faster to write.

> Really, that's exactly the point: it looks like Rust people want to bring the whole world into Rust, and not bother with the rest of the world that isn't Rust.
Honestly, that would be a great outcome for the world.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 14, 2021 0:20 UTC (Sun) by sthibaul (subscriber, #54477) [Link]

> just do "#include <socket.h>" and then pre-process the file.

Yes, but that still doesn't magically collapse all the typedefs. With just socket.h that produces 500 lines of C, which a C compiler and scripts shared by *all* ports would grok much better than doing seds by hand.

> For example, Rust has special definitions to navigate the socket headers (these are macros in C, not functions).

Ok, but that's only a very tiny part of the Posix interface.

> C also doesn't specify the error returns.

You mean the set of errors that a function may return? Yes, and a kernel never makes such a promise as to the exact set of errors it might ever return in the coming decades.

> > Why Rust?
> Because it's safer than C and faster to write.

Let me rephrase: why Rust in particular? And not the myriad of other languages that have been invented by mankind?

> > Really, that's exactly the point: it looks like Rust people want to bring the whole world into Rust, and not bother with the rest of the world that isn't Rust.
> Honestly, that would be a great outcome for the world.

So that's exactly what I wrote in another comment: so much for the free software dream, if you're free to use any language, as long as it is Rust.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 14, 2021 16:05 UTC (Sun) by nix (subscriber, #2304) [Link]

> Yes, but that still doesn't magically collapse all the typedefs. With just socket.h that produces 500 lines of C, that a C compiler and scripts shared by *all* ports would much better grok than doing seds by hand.

Aha, thank you for giving me another use case for CTF! The upcoming CTF format v4 work will include an objdump option that emits a collection of C header files given a .ctf section.

They won't be the same as the inputs, of course (no comments, for starters, and with the default linker options you'll get one big header for non-ambiguous types and then small headers #including the big one for all the ambiguously-defined ones), but it'll be better than nothing -- but until now I didn't have an answer for why this would ever be better than using the original .h files. Now I have one: transformations on the headers to help people doing work like this.

(A transformation that collapses away all typedefs, leaving everything textually defined in terms of the base types, and leaving the typedefs around but with no actual users, should be trivial to implement. It's probably a bad idea to *use* those headers, but for human analysis for FFIs it's likely to be useful.)

"fringe platforms which can't even run a Rust compiler"

Posted Feb 15, 2021 0:28 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

> Ok, but that's only a very tiny part of the Posix interface.
You actually don't need much to bootstrap Rust. The now-removed CloudABI was about 500 lines of code for the libc, which is probably close to the minimum.

> Let me rephrase: why Rust in particular? And not the myriad of other languages that have been invented by mankind?
For stuff like libc you need a language that has no runtime (no GC, no threads created behind your back, etc.), is natively compiled, is memory-safe, and preferably offers additional type-safety-enforced guarantees.

Not many languages fit the bill; of the modern-ish ones, only Rust and Swift do.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 14, 2021 20:25 UTC (Sun) by geert (subscriber, #98403) [Link]

IIUIC, Rust expects to call into snprintf() on VxWorks?

Last time I developed software for VxWorks (15y ago), its C library didn't have snprintf(), so we had to use our own implementation.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 14, 2021 21:57 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

> IIUIC, Rust expects to call into snprintf() on VxWorks?
Nope. Rust has its own string formatting support (in stdlib) that is independent of the OS.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 20:03 UTC (Sat) by rgmoore (✭ supporter ✭, #75) [Link]

Yes, but making ports difficult can also kill in the egg possibly interesting new projects.

The problem in this case is not so much the difficulty of making the port but the lack of resources to do the porting. Many of the platforms in question have been out of production for a decade or more and were in serious decline long before that. The only people supporting them are a handful of hobbyists who like maintaining obsolete hardware. We're far more likely to kill interesting new projects in the egg by making support for those platforms a requirement than by failing to support them.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 13, 2021 20:10 UTC (Sat) by sthibaul (subscriber, #54477) [Link]

> The problem in this case is not so much the difficulty of making the port but the lack of resources to do the porting.

That's the original question of the article, yes. But the same difficulty will be faced by new platforms, the difference being that you won't see the platforms that never appear.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 14, 2021 18:59 UTC (Sun) by laarmen (subscriber, #63948) [Link]

I don't know, this reasoning sounds weird to me. If I make my language hard to port to ENIAC, it will most likely make it harder to port to new platforms that behave like ENIAC, thus limiting the opportunities for building new platforms that differ from the currently supported ones in the same way ENIAC does.

If it's a new platform, shouldn't we assume that it will behave more like current platforms than older ones? Otherwise, why do you need a new platform when the older one already exists? Of course you can say that the new platform is a variant of the older one but running much, much faster -- but that's only because it doesn't exist, and its properties are custom-built to support your argument. An empty set can have any property you want it to.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 15, 2021 22:45 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link]

The difference is that building an interesting new hardware platform is inherently a major undertaking, and it's only going to be done by an organization with substantial resources. That includes the resources to port all the significant languages to work with their new platform. Being able to do that is more or less required for a new platform to be interesting. A platform that can't run the languages people care about isn't very interesting.

"fringe platforms which can't even run a Rust compiler"

Posted Feb 15, 2021 22:47 UTC (Mon) by sthibaul (subscriber, #54477) [Link]

> building an interesting new hardware platform

I'm not talking only about hardware platform, but also OS.

> That includes the resources to port all the significant languages to work with their new platform.

Yes, but that doesn't mean that projects shouldn't make reasonable attempts to make it not too hard.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:44 UTC (Fri) by rodgerd (guest, #58896) [Link]

I guess everyone should have less security so a few nerds can keep running processors that haven't been built this century.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 0:39 UTC (Thu) by BirAdam (guest, #132170) [Link]

This mania for “memory safety” isn’t necessarily bad, but the Rust people are making me hate everyone who complains about C’s lack of memory safety.

First, Rust solves one problem and adds 3 more. It adds backward compatibility breaks. It isn’t as bad as Python at this, but then the Python people are not advocating Python as a systems language. C’s one great strength is that C code is C code. It tends to just keep working over time. The second added problem is precisely this one. Rust is being promoted as a systems language when it doesn’t work on all of the hardware needed by a systems language. The third major issue is that Rust has the cargo system as part of its standard use model. This encourages bad behavior. I do not care how “memory safe” your language is if people regularly include unvetted code from some repo.

The final point that I have yet to hear properly explained is why C is good enough to write other languages in, but not okay for others to use. You’re either admitting that other programmers are “talented enough” to use C and that you are not, or you’re just pawning responsibility off on someone else because you’re too lazy to properly do your job. Either way, C as a tool is blameless of programmer error.

(btw, I know that Rust is not written in C; it was initially OCaml and then rewritten in Rust. Just making a point about the constant screaming that "C is bad because everyone knows C is bad".)

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 0:59 UTC (Thu) by Paf (subscriber, #91811) [Link]

“ The final point that I have yet to hear properly explained is why C is good enough to write other languages in, but not okay for others to use. You’re either admitting that other programmers are “talented enough” to use C and that you are not, or you’re just pawning responsibility off on someone else because you’re too lazy to properly do your job. Either way, C as a tool is blameless of programmer error.”

Come on, you know better. The idea is that it is sometimes necessary or desirable but *difficult and time consuming* to write well/safely/securely in C. The goal here is to reduce the amount of code that needs to be written in it, and I think that’s reasonable. (I’m a file system and kernel dev who makes his living in C, btw.)

I love working in C, I love the simplicity and feeling of precision. But it’s been amply demonstrated that humans are not great at getting memory allocation and pointer arithmetic, etc, right, and if we can remove that as a problem for more code, then that’s desirable. And yeah, sure, some developers are better at it than others. But why should we make anything harder unless we need to?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 5:12 UTC (Thu) by marcH (subscriber, #57642) [Link]

> I love the simplicity and feeling of precision

Emphasis on "feeling"

https://queue.acm.org/detail.cfm?id=3212479

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 11:40 UTC (Fri) by sdalley (subscriber, #18550) [Link]

That was a *really* good article!

The increasingly mind-boggling and foot-shooting complexity of modern C compiler optimizations is the clearest evidence one could wish for that C is not "close to the metal" of any modern mainstream processor. Like a tree growing on top of a pile of buried scrap metal, modern architectures and compilers have had to distort and twist themselves to grow around the need of preserving the illusion that they have flat memory, fixed registers, pointer arithmetic and sequential operation.

What would a useful modern low-level language that treats vectors, co-processors, threads, segments, references and caches as first-class objects look like?

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 12:17 UTC (Fri) by pizza (subscriber, #46) [Link]

The reasons compilers are so re-writingly complex is the same reason that modern CPUs are so re-writingly complex: squeezing every last drop of performance out of _existing_ code.

After the top-line price, raw performance is the only thing that folks actually care about.

(Granted, the tide has begun to shift slightly in favor of "security", but given the choice, folks will choose "faster" over "more secure"... every. single. time.)

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 17:35 UTC (Fri) by anselm (subscriber, #2796) [Link]

The reasons compilers are so re-writingly complex is the same reason that modern CPUs are so re-writingly complex: squeezing every last drop of performance out of _existing_ code.

Also, humans have a better chance of writing working (let alone efficient) code if they don't need to think about “vectors, co-processors, threads, segments, references and caches as first-class objects”. We have compilers so we don't need to worry about all of those (the vast majority of us who aren't working on actual compilers, anyway).

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 22:08 UTC (Fri) by marcH (subscriber, #57642) [Link]

For the PDP-11, C provided an outstanding trade-off: user-friendly programming concepts that mapped really well to the hardware.

While these concepts don't map with the hardware anymore, they stayed familiar and their programmer-friendliness has indeed not regressed. But it hasn't progressed either.

It is a very sad vicious circle to see that programming concepts and hardware keep meeting in a place that does not exist any more. Something like "retpoline" is the absolute irony: still meeting the hardware in that old, fictional place BUT with the knowledge of what hardware really does behind the scenes AND the intention to defeat that! Multiple layers of masquerading; what a carnival.

It's fantastic to see that a new crop of programming languages are at least trying to evolve a bit.

http://worrydream.com/#!/TheFutureOfProgramming (Bret Victor)

C was a great low-level language - for the PDP-11

Posted Feb 15, 2021 9:46 UTC (Mon) by anton (subscriber, #25547) [Link]

The referenced article is not particularly good, just a hodgepodge of pet peeves.

As for the complexity of gcc and clang/LLVM, it is an indication that they have too much budget and want to produce good benchmark results (at the cost of worse usability) to justify that (admittedly they are also doing things that help usability, but they could do that without doing the other nonsense).

As for flat memory and caches (and, mentioned in the paper, cache coherency protocols), that is indeed hardware architecture for speeding up existing software written for a simple memory model, plus being able to run processes with large memory needs. Hardware architects needed a long time to get here, and tried to throw the complexity over to programmers the whole time (and are still doing it, with weak memory consistency): Instead of caches, they wanted us to manage fast memory by software, with the most recent instance being the SPEs of the Cell Broadband Engine (used in the PlayStation 3). Instead of somewhat consistent shared memory, they would rather have given us distributed memory, with software managing the transfer of data from remote to local memory before processing (supercomputers still have this). All this would make general-purpose programming so much harder that the alternatives with more complex hardware won out.

So the architectures provide at least single-threaded programs with a "flat" memory model, and a language that reflects that memory model with, e.g., address arithmetic is a sensible low-level language for that (but note that C as understood by the gcc and clang maintainers is not such a language).

Segments are what I first thought of when you mentioned "flat memory". This has been pretty much eliminated as architectural (mis)feature (and where it is present, it has not been used for a while); having it in an architecture costs extra hardware, and costs extra in software. As to how a low-level language would look that supports it, look at the C standard; it includes many restrictions that cater for these kinds of architectures; and these days the gcc and clang maintainers use these restrictions as justification for miscompiling programs on architectures with flat memory.

As for register renaming (vs. "fixed registers"), Intel has spent billions on IA-64 aka Itanium based on the idea that compilers could rename "fixed registers" and reorder instructions better than the hardware can. In the end it turned out that the hardware with register renaming performs better for most software. The IA-64 approach would also have required more complex compilers to perform well, and the Itanium CPUs are also quite power-hungry even without a register renamer.

Vectors as first-class objects: Look at APL, J, or FP, although I would not call these languages low-level. Still, Backus was not pleased with architecture and programming languages and proposed FP as an alternative programming model. But despite Backus' standing and his high-profile presentation of his critique and alternative, FP/FL have not seen mainstream success nor taken the functional programming community by storm.

On a completely different track, you can look at GNU C's vector extensions, which is pretty low-level.

As for threads, we have seen SMT in mainstream CPUs since 2002 and multi-core CPUs in the mainstream since 2005. The low-level approaches to that have been pthreads and the C++ memory model, but they are hard to program with. By contrast, Unix pipes (a high-level concept) lets me use multiple cores or hardware threads without particular effort (but typically only for rather limited amounts of parallelism).

Occam is a programming language for programming distributed-memory multiprocessors (but even on shared-memory machines, each thread could get its private memory, limiting the memory ordering headaches to the implementation of communications). I think that one other thing that the transputers and Occam did right was to make thread creation, destruction and communications very cheap, so finding the right granularity of parallel processing was not as critical as on current mainstream stuff. Still, I don't see these aspects of Occam being picked up in the mainstream, so maybe they are not as important as I think.

Overall, the problem of making good use of many threads with little burden on the programmers is still unsolved, and that's why architectures with lots of slow threads have not found mainstream success.

C was a great low-level language - for the PDP-11

Posted Feb 15, 2021 12:47 UTC (Mon) by excors (subscriber, #95769) [Link]

> Instead of caches, they wanted us to manage fast memory by software, with the most recent instance being the SPEs of the Cell Broadband Engine (used in the PlayStation 3). Instead of somewhat consistent shared memory, they would rather have given us distributed memory, with software managing the transfer of data from remote to local memory before processing (supercomputers still have this). All this would make general-purpose programming so much harder that the alternatives with more complex hardware won out.

On the other hand GPGPU has risen in popularity, and that often does require the programmer to explicitly handle distributed memory. In OpenCL terminology you have host memory (the system RAM shared with the CPU), global memory (VRAM), local memory (shared by a large group of work-items), and private memory (basically the register file for a single work-item, though with some sharing between nearby work-items). You have to declare where all your data will live in that hierarchy, and write code to copy it between different levels, and partition your work-items to be in the same group/subgroup when they need to share data efficiently, and that can have a massive effect (maybe 1-2 orders of magnitude) on performance.

For serious number-crunching, GPUs won out over CPUs, which I suspect is because their memory model is much more scalable than the CPU's illusion of consistent shared memory, *and* they have a programming model that makes it relatively easy to exploit that memory model (by running many thousands of parallel threads so the programmer can usually ignore memory latency and branch latency - even if 90% of threads are stalled, there's enough runnable threads to keep all the ALUs busy or to saturate memory bandwidth - and by having just enough sharing between threads so they can coordinate on non-trivially-parallelisable problems).

As far as I can see, Cell was somewhere in the middle: it had GPU-like memory (8 SPEs with 256KB of local memory, and 2KB of private memory (/registers) split between 4-16 work-items (/SIMD lanes)) but it had a more traditional CPU-like programming model (just a single thread per SPE, running SIMD instructions, but even worse than regular CPUs at branches). The problem wasn't the distributed memory model, the problem was that it didn't commit hard enough in either direction and so it was beaten by GPUs on one side and traditional CPUs on the other side.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 1:58 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

> The final point that I have yet to hear properly explained is why C is good enough to write other languages in
The Rust compiler is written in Rust (although there's an incomplete alternative reimplementation in C++ for bootstrapping).

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 7:48 UTC (Thu) by rsidd (subscriber, #2582) [Link]

You didn't read the full comment. Last para.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 2:01 UTC (Thu) by marcH (subscriber, #57642) [Link]

> I do not care how “memory safe” your language is if people regularly include unvetted code from some repo.

These are both very serious security issues but unrelated to each other. They seem related only because C sucks at both safety _and_ code re-use.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:23 UTC (Thu) by LtWorf (subscriber, #124958) [Link]

There are lots of C libraries.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:45 UTC (Thu) by rahulsundaram (subscriber, #21946) [Link]

> There are lots of C libraries.

Yes there are some but unlike Rust or Python, there is no single place you can go to look for them and the tooling around installing or updating the dependencies isn't as straightforward.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:55 UTC (Thu) by k3ninho (subscriber, #50375) [Link]

Ex-kernel developer Rusty Russell's library of patterns, the Comprehensive C Archive Network (CCAN), didn't achieve ubiquity.
 
K3n.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:54 UTC (Thu) by logang (subscriber, #127618) [Link]

apt search xyz
apt install libxyz

How is that not straightforward?

The difference between C here and other languages isn't in the straightforwardness of finding and installing libraries but in the difficulty of publishing them. Getting a library into PyPI/whatever requires zero effort, and there is zero quality control. Getting a library into a distribution is a lot harder, and as a result the C libraries there tend to be of higher quality; but the cost of this is that there are fewer choices.

However, I believe this is a good thing. No serious programmer should be choosing to depend on tiny and marginally maintained libraries that often don't care one whit about breaking their consumers. This can create very serious headaches down the road. Thought and care should be put into every dependency. Just because it's trendy these days to do otherwise doesn't mean it's a good idea.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:31 UTC (Thu) by roc (subscriber, #30627) [Link]

Depending on distro libraries is a nightmare for developers. It creates so many problems.

When I make my software depend on a distro library, I now have to worry about:
-- Adding a step before the build that makes sure the library package is installed, e.g. providing instructions *per-distro* to install it manually, making my software harder to build
-- For distros that don't package the library (or package a version of it that's older than I need), providing instructions to build and install that library manually, making my software even harder to build
-- Making sure my software builds and runs with a range of library versions packaged by different distros and distro versions, potentially packaged in different ways with different directory layouts etc across distros
-- On platforms like Windows, iOS and Android (i.e. where almost all users are), where users cannot or will not build the software themselves and I need to provide binaries, and there definitely will not be a "distro package" I can use, I need to vendor the library myself anyway

Once I vendor the library for Windows/mobile, it's usually easier to just use that for Linux too. This is why big projects like Firefox/Chrome vendor everything.

Example: rr uses Cap'n Proto, BLAKE2, and brotli, but we only depend on Cap'n Proto as an external library; we vendor BLAKE2 and brotli. Even the single Cap'n Proto dependency is horrible to deal with. For example, we want to distribute rr binaries that work across distros, which means we want the rr build to support static linking of Cap'n Proto, but many distro Cap'n Proto packages don't support static linking.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:18 UTC (Thu) by logang (subscriber, #127618) [Link]

Many of these problems already have solutions and haven't really been that big of a deal in the past. autoconf and CMake have existed for a long time.
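As an illustration of the kind of system-library detection these tools provide — a minimal CMake sketch, where "xyz" is a placeholder name echoing the apt example earlier in the thread, not a real package:

```cmake
# Minimal sketch: locate a system-installed "xyz" library via pkg-config.
# "xyz", "myapp", and the target names are placeholders for illustration.
cmake_minimum_required(VERSION 3.16)
project(myapp C)

find_package(PkgConfig REQUIRED)
# Creates an imported target PkgConfig::XYZ on success; errors out otherwise.
pkg_check_modules(XYZ REQUIRED IMPORTED_TARGET xyz)

add_executable(myapp main.c)
target_link_libraries(myapp PRIVATE PkgConfig::XYZ)
```

If the library is missing or too old, configuration fails with a clear message, which is the per-distro "install instructions" problem roc describes, pushed to configure time.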

It's hard to avoid the issues with Windows/iOS/Android but if you're developing such an application you are probably not using C. Windows has always had a hellish story for libraries.

Firefox, for one, is distributed by many in an unvendored form.

But the overall theme is that the libraries you are using (or not) need help. If a library you want to use is good, and well maintained but not packaged, help them package it. If an algorithm isn't in a library, find an existing library that is a good fit and add to it (or, in the worst case start a new library, preferably that contains a lot more than just one algorithm). Or maybe the benefits of the latest and greatest compression algorithm are outweighed by older ones due to their accessibility.

Develop with library versions that are commonly available, not the latest and greatest. Wait for features to mature (and possibly help them mature) before depending on them. If distros don't package a static library of something, send a patch so they can.

Ultimately doing all this work allows you to write software that can be included in a distro and that should be the long term goal that is by far the easiest for all your users and easiest for the people that end up maintaining your software.

There is an awfully large amount of well written C software that has been written this way, has stood the test of time and will likely be around for a long time to come.

Yes, this can take more time and may mean you have to do more work in the short term, or wait for new features to percolate through the process. But the long term end result is a more sustainable ecosystem with a lot less work over the entire community. Vendoring something might make less work for you in the moment, but is more work for other people (or even your future self) down the line and doesn't solve anything for other people with the same problems as you.

If you want to write brittle broken software that needs constant attention and maybe doesn't even work at all in a few years, then yes, go ahead and keep doing things this way. Those that engineer things properly will still be around, still making constant progress.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:36 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

> Ultimately doing all this work allows you to write software that can be included in a distro and that should be the long term goal that is by far the easiest for all your users and easiest for the people that end up maintaining your software.
Why should mutilating your development process in order to conform to arbitrary distro whims be your goal?

A goal of an application developer is to provide value to users. Around 99% of users use iOS/Android/Windows/macOS, not classic distros.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:29 UTC (Thu) by roc (subscriber, #30627) [Link]

>If a library you want to use is good, and well maintained but not packaged, help them package it.

For all distros that any of my users might conceivably use? And then wait for several years for users to actually update to distro versions where the new package is present? No, that is completely unreasonable.

It's actually kind of breathtaking what you're suggesting here --- become a member of many different distro communities, learn all their different processes, persuade all of them to accept the library (what if they don't?), and stay engaged long term. All to avoid vendoring one library. I doubt there is a single person who has ever done this.

> Or maybe the benefits of the latest and greatest compression algorithm are outweighed by older ones due to their accessibility. Develop with library versions that are commonly available, not the latest and greatest. Wait for features to mature (and possibly help them mature) before depending on them.

Yes, creating worse performing, less capable software is definitely an option. I prefer not to.

> If you want to write brittle broken software that needs constant attention and maybe doesn't even work at all in a few years, then yes, go ahead and keep doing things this way.

Your preferred approach "needs constant attention and may not even work at all in a few years" --- you require me to pay constant attention to how distros are packaging my dependent libraries and regularly contribute to that process. In fact, because bugs are found and requirements change, any project with external dependencies requires ongoing attention.

My main project Pernosco is in Rust, has tons of dependencies (because it does a lot), and Rust+cargo have done a great job of managing those dependencies over the last five years. I am happy to keep doing things this way.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 14:58 UTC (Fri) by MrWim (subscriber, #47432) [Link]

> But the overall theme is that the libraries you are using (or not) need help.

I agree. I think distro library management incentivises working around bugs, while cargo* incentivises helping upstream libraries.

I don't see this point brought up very often when discussing cargo, but I consider it to be one of the principal advantages of cargo.

> Develop with library versions that are commonly available, not the latest and greatest.

I think this is a sensible approach if you limit yourself to libraries, dynamically linked and available in distros. However I think it demonstrates how distro package managers incentivise *not* helping the libraries you're using.

Imagine you're writing some code and you come across a bug in a library you're using. You can choose to fix the bug upstream, or you can choose to work around it in your downstream code. With cargo you clone your dependency's git repo, fix the bug, push the change to a pull request upstream and update your Cargo.toml dependency to point at your new git revision with:

mydep = { git = "https://github.com/me/mydep.git", rev = "9f35b8e" }

You can leave it pointing at that specific revision until the upstream makes a new release at which point you update your Cargo.toml back to:

mydep = "3"

Fixing the bug (or adding the feature) upstream is the path of least resistance. Once you do it, others who are using the library can benefit at the time of their choosing. In my mind, many small fixes like this **are** the maturation process.

Now what's the process with distro package managers? You're working on your new feature for your software. You come across a bug. You fix it upstream, you wait for it to get accepted upstream, you wait for upstream to make a new release and then you wait a few years for it to get into enterprise distros. Then you upgrade your infrastructure to a new major distro version, and only then can you deploy your new software that depends on this bug-fix/feature to get it in the hands of your users.

No, waiting, waiting and waiting is not going to fly. You want to help upstream but depriving your users of the new feature in your software for years is too high a cost to pay. So you work around the bug in your software and maybe if you've got time left over you also submit a fix upstream.

> Wait for features to mature (and possibly help them mature) before depending on them

I think this is the crux of my argument. cargo makes it easy to help features mature. Limiting yourself to distro repos means you have to wait for them to mature.

* possibly other language package managers too, but I'm not sure. I think cargo is best-in-class in this regard and some of its advantages may not apply to non-compiled languages/languages that don't statically link.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 15:21 UTC (Fri) by MrWim (subscriber, #47432) [Link]

Another way cargo encourages upstream collaboration is standardisation. I believe that the biggest barrier to open-source contribution is actually getting the software built in the first place. It's generally easy with Rust, because it's always the same, and because the compilation model and cargo seem well designed. Check out the source code and:

cargo build

cargo takes care of finding and building the required dependencies. When you want to test your change, it's "cargo test". Finding the git repo for a dependency is easy too: it's linked from its page on crates.io.
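Spelled out, the whole loop the comment describes is just a transcript like this (the repository URL is a placeholder):

```shell
git clone https://github.com/example/some-crate.git
cd some-crate
cargo build   # fetches and compiles every dependency declared in Cargo.toml
cargo test    # runs the crate's test suite
```

No per-project configure step, no hunting for -dev packages; the same four commands work for essentially any crate.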

Note that nothing I've said above is related to Rust as a language; it's all about the tooling and, most importantly, the culture of the Rust community.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:35 UTC (Fri) by roc (subscriber, #30627) [Link]

This is a really good point.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 14:16 UTC (Sat) by LtWorf (subscriber, #124958) [Link]

> Imagine you're writing some code and you come across a bug in a library you're using. You can choose to fix the bug upstream, or you can choose to work around it in your downstream code.

You can also send the patch to the distribution directly, or send it to both parties.

> Fixing the bug (or adding the feature) upstream is the path of least resistance.

You claim that the work of:

* forking
* fixing
* making a pull request to upstream
* going through multiple rounds until your patch is good enough to be included upstream and respects their standard of quality
* monitoring upstream's releases to know when a new release with your fix is out
* change your dependencies back to use upstream

is the path of least resistance

LOL.

It isn't. Want to know what people will do? Fork, patch, and point forever to their out of date fork.

Now THAT is the path of least resistance, it only includes 2 of the steps of the previous list. Of course now all this software might contain security vulnerabilities that will never be fixed.

> Now what's the process with distro package managers?

For a bugfix you can patch a package directly in the distribution.

> You fix it upstream,

Or directly downstream, as I said.

> No, waiting, waiting and waiting is not going to fly.

You assume that distributions and upstream projects are maintained by members of 2 different races. Distribution maintainers can be fast, and upstream maintainers can take months to reply. It depends entirely on the specific project.

Also you are saying loads of incorrect things and forgetting that distributions can and do patch bugs out.

> I don't see this point brought up very often when discussing cargo, but I consider it to be one of the principal advantages of cargo.

As we have seen, your entire assumption of what the "path of least resistance" is, was completely wrong. So was the conclusion :)

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 14:42 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

> You can also send the patch to the distribution directly, or send it to both parties.

Not all patches should have this done. For example, those which change API are certainly not eligible for direct distro inclusion (IMO). Upstream should have a look before someone else ships a new API in their name for sure. Even for bugfixes, I don't know whether my patch is an X/Y problem, i.e., whether I'm actually patching a symptom and not the root cause. Upstream can certainly help improve these patches better than packagers (on average).

> You claim that the work of: … is the path of least resistance

IME? Yes. Because things like PyPI, crates.io, etc. make releases so easy, once it is in, the release shouldn't be *too* hard. Because I can't publish *my* crate to crates.io while pointing to my fork (unless I publish it as a crate of its own on crates.io, but that requires renaming due to collisions…which is then *more* work on my consuming side).

> For a bugfix you can patch a package directly in the distribution.

"the distribution". As if there's only one.

> Distribution maintainers can be fast, and upstream maintainers can take months to reply.

What does this have to do with anything? The reverse is also certainly possible.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 15:45 UTC (Sat) by LtWorf (subscriber, #124958) [Link]

> those which change API are certainly not eligible for direct distro inclusion (IMO).

Those are not eligible to be accepted anywhere.

> Upstream can certainly help improve these patches better than packagers (on average).

There is an amount of software that distribution maintainers fork and become the "new upstream" because the actual upstream completely abandoned the project.

Yes, upstream people abandon projects all the time. See Python 2 in Red Hat.

> IME? Yes. Because things like PyPI, crates.io, etc. make releases so easy, once it is in, the release shouldn't be *too* hard.

You can just point to your commit forever. Your software certainly wouldn't break.

> "the distribution". As if there's only one.

Uhm distributions share patches with each other.

> What does this have to do with anything? The reverse is also certainly possible.

It is possible, but you presented it as the only existing possibility.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 22:56 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

> Those are not eligible to be accepted anywhere.

APIs do change. I'm including things like "add a new enum variant for some new OpenSSL feature" kind of API changes in this. These patches certainly have a place, just not in some distro-specific patch (woe be unto anyone relying on distro packages being representative of upstream decisions in this case). See https://lwn.net/Articles/845448/ for a real-world case of this happening.

> There is an amount of software that distribution maintainers fork and become the "new upstream" because the actual upstream completely abandoned the project.

Why would I select such a project for a new dependency? All you're left with is projects that now need to port off of it (at least that would be my decision, assuming there wasn't a distro-agnostic maintenance process set up). Case in point: scrot in Fedora (maintainer here). giblib and scrot were abandoned by upstream. The community picked up scrot, but left giblib alone. giblib starts to FTBFS (fail to build from source). I don't want to maintain it; it's just a dependency of a project I do care about, and I really don't want questions about it outside of that use. I file an issue upstream to port away from giblib. Still nothing. It's certainly not a patchset I want to maintain. So scrot is currently dead in Fedora because I explicitly do *not* want to become an upstream.

As for something like Python2, yeah, that'll get some distro pickup. giblib? Not worth my time.

> You can just point to your commit forever. Your software certainly wouldn't break.

Not if I want to publish it anywhere (useful); crates.io requires that all of a crate's dependencies also come from crates.io. I imagine PyPI is probably similar, but I don't know.

> Uhm distributions share patches with each other

As if that's typical or even common (I'd like to see evidence). I've had to hunt down distro patches to our project that never got contributed to us, upstream. If they're not sharing upstream (or even filing issues about what they are patching), why would they share with each other? Granted, things have gotten better, but why must upstream be the one prodding here?

> It is possible, but you presented it as the only existing possibility.

Maybe that's MrWim you're thinking of?

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 15:08 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

> It isn't. Want to know what people will do? Fork, patch, and point forever to their out of date fork.
This is true. However, clicking a couple of buttons on a web form and submitting a PR is pretty easy.

> You assume that distributions and upstream projects are maintained by members of 2 different races. Distribution maintainers can be fast, and upstream maintainers can take months to reply. It depends entirely on the specific project.
So your users must depend on the whim of an unpaid maintainer for months to years? That's a nice model.

> Also you are saying loads of incorrect things and forgetting that distributions can and do patch bugs out.
The other poster actually nails most obvious issues with distros. They simply suck for application writers.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 20:36 UTC (Sat) by roc (subscriber, #30627) [Link]

Submitting changes to distro library packagers instead of upstream would only be my last resort, for the case where upstream is completely unhelpful. (In which case I'll be looking to move off that dependency anyway.) It simply doesn't scale given the number of distributions in use. In fact I have never, ever done this.

What I *have* done, many times, is exactly what MrWim proposed: made local changes to a Rust library via a temporary Cargo [patch], and later submitted those changes upstream --- and had them accepted. The former step is indeed the path of least resistance and lets me make progress in my project. The latter step is justified because there is an ongoing maintenance cost to those patches, so reducing the number of them that we're carrying at any one time pays off long term. The review they get upstream is also valuable. I'm working through this right now at https://github.com/rayon-rs/rayon/issues/562 for example.
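For readers unfamiliar with the mechanism: a Cargo [patch] entry overrides a crates.io dependency with a fork everywhere it appears in the dependency graph. A minimal sketch, with a placeholder fork URL and revision (not roc's actual patch):

```toml
# Cargo.toml of the consuming project.
# The normal dependency declaration stays unchanged:
[dependencies]
rayon = "1"

# Temporary override: until the fix lands upstream, every use of rayon
# in the graph is built from this fork. URL and rev are placeholders.
[patch.crates-io]
rayon = { git = "https://github.com/me/rayon.git", rev = "abc1234" }
```

Once upstream releases the fix, the [patch] section is simply deleted and the dependency resolves from crates.io again.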

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 22:10 UTC (Sun) by marcH (subscriber, #57642) [Link]

> made local changes to a Rust library via a temporary Cargo [patch], and later submitted those changes upstream --- and had them accepted. The former step is indeed the path of least resistance and lets me make progress in my project. The latter step is justified because there is an ongoing maintenance cost to those patches, so reducing the number of them that we're carrying at any one time pays off long term.

_This_ is "real" open-source: zero boundary between downloading/using/cloning/branching/forking/experimenting = complete freedom. This is why decentralized version control felt liberating. I would even argue that a project still stuck in centralized/medieval version control cannot really be considered open-source because of the added friction. And don't get me started on directories with sometimes long lists of *.patch files... never heard about branches?

Configuring and building C/C++ code at large is a nightmare, and Linux distributions have been performing an amazing and critical job there. However, to solve this they had to add layers of indirection between software authors and users, which adds friction and delays. So it's really not a surprise to see many authors trying to connect directly with their users. A random and recent example:

git clone some_python_project
pip install --editable .
<hack, test, hack, test>
git push new_pull_request

It should never be more complicated than this.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 6:43 UTC (Tue) by LtWorf (subscriber, #124958) [Link]

Open source has a precise definition that has absolutely nothing to do with your preferred version control system.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 7:38 UTC (Tue) by marcH (subscriber, #57642) [Link]

Next time at least pretend to try to get the point.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 10:54 UTC (Tue) by LtWorf (subscriber, #124958) [Link]

Next time make one that makes sense?

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 8:44 UTC (Wed) by marcH (subscriber, #57642) [Link]

FYI, the usual behavior on this site when you don't understand something is to either ask or ignore.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 10:39 UTC (Wed) by MrWim (subscriber, #47432) [Link]

> This is why decentralized version control felt liberating.

Thanks for this analogy, there's definitely something to it. Something about not having to ask permission before acting, but instead being able to develop using the same tools as anyone, publish the results and have the results be judged instead.

Maybe what git is to a project, cargo is to a super project, or dependency graph. Hmm, that doesn't quite feel right because the versioning is still provided by git. It's the lockfile that extends git semantics to your entire dependency graph. Hmm, not sure if that's right, I'll have to think on it, but the analogy is food for thought.

As it is, I'm a big fan of lockfiles, which can even be applied to whole distros, and I believe that good tooling can unlock functional freedom, although I agree with LtWorf that "cannot really be considered open-source" is rather hyperbolic.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 19:15 UTC (Wed) by rgmoore (✭ supporter ✭, #75) [Link]

> I agree with LtWorf that "cannot really be considered open-source" is rather hyperbolic.

I think it's closer to true than you might expect. The GPL requires that source be provided in the preferred format for making modifications. That was meant to exclude things like generated code (rather than the material used to generate it) and obfuscated source, but it's not that far out to extend it to source being a copy of the version control system rather than just the raw source files.

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 22:19 UTC (Thu) by marcH (subscriber, #57642) [Link]

Imagine a few people on the Internet want to collaborate and start some experimental branch for a project hosted with subversion _without_ bothering the maintainers. They would most likely start by cloning the subversion project to git or similar - and submit back to subversion only in the end (if ever).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 12:29 UTC (Mon) by MrWim (subscriber, #47432) [Link]

> Distribution maintainers can be fast, and upstream maintainers can take months to reply. It depends entirely on the specific project.

I agree. Distro maintainers can be fast, or not, and the same is true of upstream maintainers. Patches may be suitable for inclusion in stable distros, or not. Upstream maintainers may request changes to the patches, maybe in several rounds, or not.

The point is that with cargo the process is both asynchronous and uniform.

# Asynchronous

I don't need to wait for the maintainer to evaluate my patches for me to continue with my development. We can iterate over the best way to implement something at leisure, without me making demands on their time ("Please reply promptly, this is very important to us", etc.).

The other important asynchrony here is that the modifications we make needn't affect other users of the library until both the modifications and the applications are ready. This means that you can be sure that modifications you've made to a dependency don't break, for example, curl, which also uses this dependency. Your changes are isolated until the point they are confirmed good. The way Rust and Cargo work, you can even have multiple versions of the same dependency in your application at the same time.
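As a concrete sketch of that last point (the crate names are hypothetical): when two parts of the dependency graph ask for semver-incompatible versions, Cargo builds both and links each consumer against the version it requested.

```toml
# Application's Cargo.toml
[dependencies]
rand = "0.8"      # the application itself uses rand 0.8
old-widget = "1"  # hypothetical crate whose own manifest declares rand = "0.7"

# Cargo.lock will then contain both rand 0.8.x and rand 0.7.x;
# old-widget compiles against 0.7 while the application code uses 0.8.
```

Neither crate observes the other's version, so an upgrade of one consumer never forces an untested upgrade on the other.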

You might respond that multiple versions of a single dependency on a system is a bad thing - it makes security validation and updates more difficult. I'd agree that the ideal state is that all applications on your system use the same version of a library. What I don't like about the traditional dynamically linked distro model is that this is enforced by technology, rather than policy.

I don't like it because I think it makes upgrading a library needlessly fraught, and places too much responsibility on the shoulders of the library maintainer, rather than spreading it more widely among the application maintainers. The library maintainer can review a patch and refuse it on various grounds: it contains bugs, the code is already working as intended, it's likely to break compatibility, etc. They may also refuse it because they're worried that it will break a dependant package. This level of effort from a library maintainer is reasonable to expect - but it would be unfair to ask that maintainer to do QA on all the packages that depend on their package to confirm the lack of breakage. They may not even know how some of their dependants are supposed to behave, so noticing that they're not behaving correctly after the patch has been applied would be very difficult indeed.

Instead, by removing the technical requirement that the library be upgraded in lockstep across the whole of the system, we can more gracefully upgrade libraries without giving up on the **policy** that there should only be one version. Application maintainers can validate that a dependency upgrade has not broken their application and apply the upgrade then. This is a much narrower task than validating that a library hasn't broken any application, and the task falls on the person best placed to perform it.

The current process works well for patches that are both small and urgent. This applies to most buffer overflow or integer overflow fixes for example. I think a different process would be better for patches that are either not small or not urgent.

I haven't mentioned the importance of uniformity yet, but this reply is already long enough and has taken enough time so I'll worry about that later.

> > You fix it upstream,
>
> Or directly downstream, as I said.

I was responding to logang's comment, specifically: "the libraries you are using (or not) need help." I interpret "the libraries" to mean the upstream library. My point is that it's easier to help upstream libraries when you've got them from cargo rather than from the distro.

It seems that our disagreement is a matter of priorities. My understanding is that you believe in the primacy of the distro, and as such helping the distro has highest priority. Conversely my priorities are my application and users, the upstream library and only then the distro.

I also believe in distros, but principally as distributors/integrators/maintainers of applications, rather than of libraries.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 6:57 UTC (Tue) by LtWorf (subscriber, #124958) [Link]

> I don't need to wait for the maintainer to evaluate my patches for me to continue with my development.

And what if they completely reject your changes or require substantial changes?

This can be for several reasons.

What would you do at that point? You'd need to either give up using the feature you added or refactor your code.

> What I don't like about the traditional dynamically linked distro model is that this is enforced by technology, rather than policy.

The Debian policy does not say anything against static linking vs. dynamic linking. It does mandate that there can exist only one copy of a given source within the archive. Browsers are granted exceptions because they are irreplaceable and do whatever they want.

> This level of effort from a library maintainer is reasonable to expect - but it would be unfair to ask that maintainer to do QA on all the packages that depend on their package to confirm the lack of breakage. They may not even know how some of their dependants are supposed to behave, so noticing that they're not behaving correctly after the patch has been applied would be very difficult indeed.

That's basically the opposite of Torvalds' approach :D

> we can more gracefully upgrade libraries without giving up on the **policy** that there should only be one version.

Libraries that are made to support it are normally available in multiple versions for a period of time. A typical example is Qt: Qt4 was removed only very recently. But point releases replace the previous version, because we don't want to go the Go way of depending on a specific commit.

> It seems that our disagreement is a matter of priorities. My understanding is that you believe in the primacy of the distro, and as such helping the distro has highest priority. Conversely my priorities are my application and users, the upstream library and only then the distro.

And when the user ends up with unpatched vulnerabilities because he downloaded some binary from some website because it wasn't included in a distribution because it violated every existing policy… how is the user served well by this?

He has to hope the author has a system in place to recompile when a CVE appears, that there is a repository set up to get the latest version. Using a distribution this would all be solved already rather than having to be solved 10000 times by every single project.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 8:28 UTC (Tue) by matthias (subscriber, #94967) [Link]

>> This level of effort from a library maintainer is reasonable to expect - but it would be unfair to ask that maintainer to do QA on all the packages that depend on their package to confirm the lack of breakage. They may not even know how some of their dependants are supposed to behave, so noticing that they're not behaving correctly after the patch has been applied would be very difficult indeed.
> That's basically the opposite of Torvalds' approach :D
No, it is exactly Torvalds' approach. The kernel policy is no regressions, yes. But Torvalds does not do the QA for all software that runs on linux. That would be simply impossible. He releases -rc versions of the kernel and asks everyone to test. If regressions are reported, the corresponding patches are reverted (or improved). But it is the job of the users (distros, cloud service providers, etc.) to do the testing and validation.

And it is definitely up to the distros to test whether a new kernel version works for them before they include it into the distribution.

The main difference between the kernel and many other projects is that the kernel developers care much more about the regression reports received from users, and that regression reports have higher priority than new features.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 13:07 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

> And what if they completely reject your changes or require substantial changes?

If we had gone with your proposal, now my distro is stuck with a patch rejected by upstream. Yay?

In this case, I rework the patch and adapt my code when I point to the next proposed patch. Same as step 1, just with a different baseline to diff from.

> Using a distribution this would all be solved already rather than having to be solved 10000 times by every single project.

As if Linux is the only distribution platform for projects these days. You do realize that Linux (and the BSDs) are the oddballs here, right? Pretty much everything else does vendoring or the like to a large extent. And if I want a turnkey release from my website, a tarball with dependencies embedded is the answer even for Linux, without waiting for distros to churn on the new release (which generally takes a month or two to hit the unstable channels for our project; add a release cycle for the stable channel to have a chance).

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 0:46 UTC (Sat) by hunger (subscriber, #36242) [Link]

> Many of these problems already have solutions and haven't really been that big of a deal in the past. autoconf and Cmake have existed for a long time.

I have seen both autoconf and cmake referred to as a problem more often than as a solution:-)

Watch out: the C/C++ world has tooling for dependency management incoming as well. Sooner or later those languages will have similar problems, with projects depending on very specific versions of libraries. Developers want that stuff, so they will write it. In the end each developer gets the tooling she deserves;-)

> If a library you want to use is good, and well maintained but not packaged, help them package it.

Why? The package will either be too old, or have the functionality my program needs stripped out because some packager did not like a dependency it introduced. Or it's crippled or patched to no longer work properly for my application. Or the necessary files for your build system of choice are not installed. Or they can't be found. Or users have crippled the libraries to prevent some imaginary issue or another. I have had to deal with all of the above in bug reports already. It is such a huge pain to debug this kind of issue.

As a developer you need to vendor libraries (at the very least as a fallback if system libraries are not found!), even when running on distributions that have official packages of the required libraries. And you will get bug reports due to incompatible libraries, because some distro packager will unvendor your libraries for you, blissfully ignoring the documented requirements.

> There is an awfully large amount of well written C software that has been written this way, has stood the test of time and will likely be around for a long time to come.

That is survivorship bias... tons of poorly written crap written in C got lost, and good riddance! Undoubtedly a lot of code written in any other language will not stand the test of time either.

> But the long term end result is a more sustainable ecosystem with a lot less work over the entire community.

There is less code because it takes ages to write and maintain. Is that a benefit to the ecosystem as a whole? I doubt it: making something hard does make the persistent people stick around, but many good programmers I know are lazy and easy to distract:-)

> Vendoring something might make less work for you in the moment, but is more work for other people (or even your future self) down the line and doesn't solve anything for other people with the same problems as you.

Not vendoring is more work for me and other people down the line. You get so many more bugs due to library incompatibilities and such. Those produce extra work for the users, the packagers, and the developers. We are just used to this workload, so we ignore it.

> If you want to write brittle broken software that needs constant attention and maybe doesn't even work at all in a few years, then yes, go ahead and keep doing things this way.

If you want brittle software now, then go ahead:-)

Seriously, there is great and well-tested C code out there. That is wonderful! There is also some great Rust code out there. Also awesome! Great code is a wonderful thing to have in any language! In my mind code quality is not related to how hard it is to use third-party libraries in the programming language of choice.

I also do not want to see programming as an activity where you need to recite obscure texts that got cargo-culted down to you by your elders! And it is rare to find a medium-sized project in C or C++ that does not do exactly that -- at the very least in some dark corner of its build system. Rust projects at least have fewer dark corners. Part of the reason is of course that Rust has not accumulated so much historical baggage yet:-)

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:14 UTC (Fri) by LtWorf (subscriber, #124958) [Link]

> Adding a step before the build that makes sure the library package is installed, e.g. providing instructions *per-distro* to install it manually, making my software harder to build

You mean a README file with a list of dependencies? I'm sure people on a certain distribution know how to use their package manager.
If they don't know, they won't be compiling your software anyway, because they don't know how to install a compiler.

> For distros that don't package the library (or package a version of it that's older than I need), providing instructions to build and install that library manually, making my software even harder to build

Users of stable distributions are familiar with the issue.

> Making sure my software builds and runs with a range of library versions packaged by different distros and distro versions, potentially packaged in different ways with different directory layouts etc across distros.

That is incentive to:
1. Do not depend on amateur libraries that change API
2. Use autotools and let it figure out all this stuff

> On platforms like Windows

There are no package managers on Windows. So that is a completely different situation. But anyway, you won't be using the same binary on Linux and Windows.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 13:57 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> There are no package managers on windows.

There are. They're not as mature as anything Linux has AFAIK, but there is at least (in no particular order):

- vcpkg (probably the most useful for the discussion at hand)
- Conan (Python-based)
- chocolatey (binary-based, includes Visual Studio/MSVC packages)
- anaconda (scientific/Python oriented, but has other bits too)

I think there's another, but I can't remember its name. There's also zero surprise from me if there are others I haven't heard of.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:58 UTC (Fri) by roc (subscriber, #30627) [Link]

You really need a single standard package manager, preferably shipped with the OS, so there is a high chance users already have it installed and "install package manager" doesn't just make your installation process longer.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 13:35 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

While true, I think what I'd do is use vcpkg for developer management, bundle everything up into a single package, and ship that via normal means (installer/relocatable zip) and maybe chocolatey, depending on the tool's target audience.

Anaconda would probably be better if you're already in that realm though.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 20:39 UTC (Sat) by roc (subscriber, #30627) [Link]

vcpkg looks cool, thanks for pointing to it. But it looks more like "cargo for C++" than a distro package manager.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 20:41 UTC (Sat) by roc (subscriber, #30627) [Link]

Of course, as such, it may be a good answer to the problem of "how do I consume C third-party libraries" which was the original issue before we got into a discussion of the "use distro packages" non-solution.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:56 UTC (Fri) by roc (subscriber, #30627) [Link]

> You mean a README file with a list of dependencies? I'm sure people on a certain distribution know how to use their package manager.

Sure. One problem is that the package list changes with each distribution, and sometimes between versions of the same distribution. So we only have instructions for Fedora and Ubuntu, and those instructions are wrong for some versions of those distros.

Anyway, every single step makes the software harder to build.

> Users of stable distributions are familiar with the issue.

So? The issues still exist.

> That is incentive to:
> 1. Do not depend on amateur libraries that change API

Distro policies are responsible for varying file layouts and naming conventions. Distros also make varying decisions about library versions and which features they enable at build time.

> 2. Use autotools and let it figure out all this stuff

Autotools are a nightmare and writing autotools feature tests for everything I care about would be a ton of extra work.

> you won't be using the same binary on linux and windows.

Indeed, but once you've done the work to build a Windows binary, you can reuse that work to build a Linux binary with vendored libraries.

People arguing that the Right Way to build C/C++ software is to make it Linux only, use distro libraries, do a ton of extra work, and downgrade the performance and functionality of that software to fit the shipped libraries, are not doing C/C++ any favours.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 8:12 UTC (Sat) by abartlet (subscriber, #3928) [Link]

> Sure. One problem is, that package list changes with each distribution and sometimes within versions of each distribution. So we only have instructions for Fedora and Ubuntu, and those instructions are wrong for some versions of those distros.

This got so bad for Samba (both in the distribution versions to cover and the Samba versions to cover) that we ended up building a massive infrastructure to:
- Create Docker images for CI
- Test every build on all the supported distributions
- Publish an 'install dependencies for Samba' script.

Just look at the table here: https://wiki.samba.org/index.php/Package_Dependencies_Req... to see where this ends up.

Even the source data for those generated scripts, for a single release is quite complex: https://gitlab.com/samba-team/samba/-/blob/master/bootstr...

So for software of any serious size, it is not just a README with a list of dependencies. Furthermore, Samba has found we have to have configure checks looking for each library (otherwise folks complain that their build failed), and to make those checks fail by default (not 'auto-detect' and work around), because otherwise features just silently go missing.

All in all, it is hard to argue that this is really a vision worth aspiring to.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:56 UTC (Thu) by roc (subscriber, #30627) [Link]

Making consumption of third-party libraries extremely painful is not a good way to address whatever downsides there are of depending on third-party libraries. In reality C/C++ programmers react to that pain by either vendoring libraries (with bad tools, which make updates expensive, which creates security and correctness hazards), or by reimplementation (which on average means lower quality because development effort is spread over more implementations).

For example in our Rust project (https://pernos.co) we use cargo-deny in CI to scan our dependencies for known CVEs and break the build if there is one. This is working very well. Nothing like it exists for C because the infrastructure for consuming third-party libraries in C is hopelessly fractured.
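As a sketch of what wiring that up looks like: cargo-deny reads a deny.toml at the workspace root, and `cargo deny check advisories` is the CI invocation. The `[advisories]` table is cargo-deny's real configuration section, though field names have evolved across cargo-deny versions, so treat the values below as illustrative rather than authoritative:

```toml
# deny.toml — consulted when CI runs `cargo deny check advisories`
[advisories]
# Fail the build on dependencies with a known security vulnerability
vulnerability = "deny"
# Warn (but do not fail) on unmaintained crates
unmaintained = "warn"
# Fail on crate versions that were yanked from crates.io
yanked = "deny"
```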

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:16 UTC (Thu) by marcH (subscriber, #57642) [Link]

> apt search xyz
> apt install libxyz
> How is that not straightforward?

It seems straight-forward when you ignore all the work that distributions perform behind the scenes to achieve that result.

It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

It seems straight-forward if you've never debugged CMake or (much worse) autotools.

It seems straight-forward as long as you don't need different packages that require different versions of xyz.

It seems straight-forward as long as you don't try to use a package from another distro because it's missing on yours.

It seems straight-forward as long as you don't try to naively "upgrade" the LTS version of your distro with packages from a newer version of the _same_ distro.

If it's so straight-forward, why have brand-new projects like Flatpak, Snap, etc. just been created?

Code re-use, software distribution, and maintenance are hard, really hard. I'm not claiming Rust or anything else cracked that nut, far from it, and downloading random code from the Internet (in _any_ language) is of course a security disaster [*]. Pretending, on the other hand, that this problem has already been solved is either dishonest or incredibly naive, and probably why the entire industry is still so bad at this. Have you never heard of "DLL Hell"? We should all keep an open mind, take interest in any new approach, and ignore anyone recommending that we keep doing what we've always been doing.

[*] latest and greatest fun: https://www.theregister.com/2021/02/10/library_dependenci...

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:36 UTC (Thu) by logang (subscriber, #127618) [Link]

>It seems straight-forward when you ignore all the work that distributions perform behind the scenes to achieve that result.

Absolutely right. It's a lot of work and a hard problem to deal with dependencies. Which is why we should pool the work in distributions and everyone should use and benefit from it.

>It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

That's an odd statement. I do that multiple times a week on more than a dozen machines.

>It seems straight-forward if you've never debugged CMake or (much worse) autotools.

I've done both. Not that hard.

>It seems straight-forward as long as you don't need different packages that require different versions of xyz.

If libraries are well maintained and care about not breaking their users, and support a range of their own dependencies (instead of essentially vendoring their own dependencies by insisting on a very specific version) this problem tends not to be that bad. Even in python, good well maintained libraries ensure they work on a wide range of python versions and with a range of versions of their own dependencies. But also, in general, long deep dependency trees should be avoided and pushed back against.
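The contrast is mechanical: a range requirement such as ">=1.2.0, <2.0.0" admits a whole family of releases, while an exact pin admits exactly one. A toy checker of the idea (the requirement syntax here is illustrative, not any particular packaging tool's):

```python
def parse(v: str) -> tuple[int, ...]:
    """Turn "1.4.2" into a comparable tuple (1, 4, 2)."""
    return tuple(int(part) for part in v.split("."))

def satisfies(installed: str, requirement: str) -> bool:
    """Check an installed version against ">=X.Y.Z" / "<X.Y.Z" clauses."""
    for clause in requirement.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            if parse(installed) < parse(clause[2:]):
                return False
        elif clause.startswith("<"):
            if parse(installed) >= parse(clause[1:]):
                return False
    return True

print(satisfies("1.4.2", ">=1.2.0, <2.0.0"))  # → True
print(satisfies("2.1.0", ">=1.2.0, <2.0.0"))  # → False
```

A library that works across the whole range lets one installed copy serve many dependents; an exact pin forces duplication.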

>It seems straight-forward as long as you don't try to naively "upgrade" the LTS version of your distro with packages from a newer version of the _same_ distro.

I've done this a lot. For the rare critical package, this is hard and should simply not be done. 9 times out of 10, it is easy.

> If it's so straight-forward, why have brand new projects like flatpak, snap etc. just been created?

No idea. But I avoid those like the plague. They don't solve any of my problems.

> Code re-use, software distribution, and maintenance are hard, really hard. I'm not claiming Rust or anything else cracked that nut, far from it, and downloading random code from the Internet (in _any_ language) is of course a security disaster [*]. Pretending, on the other hand, that this problem has already been solved is either dishonest or incredibly naive, and probably why the entire industry is still so bad at this. Have you never heard of "DLL Hell"? We should all keep an open mind, take interest in any new approach, and ignore anyone recommending that we keep doing what we've always been doing.

Absolutely right. But the new languages don't seem to solve these problems, they just ignore them and try to vendor everything. From a security, maintenance and longevity perspective the distros have been doing far better, which is why I always go to them first and strongly resist the newer trends to vendor everything.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:07 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> Absolutely right. It's a lot of work and a hard problem to deal with dependencies. Which is why we should pool the work in distributions and everyone should use and benefit from it.

You know how many users we have using distro-provided deployment mechanisms? Zero (that I hear from). I hear from distro maintainers, and we work to accommodate building with external deps (because *I* care, and testing against external versions is the easiest way to set off warning bells for API changes coming down the pipe).

Existing deployments are on bespoke machines with oddball dependencies not packaged by distros. They use custom MPI builds that are tuned for the hardware. External libraries compiled against those MPI libraries. And other things too.

I agree that distros do a lot of work and I'm grateful for it, but the "everyone deploys to Linux (or FreeBSD)" mentality (this is especially rampant in the web world too) is short-sighted to me. We vendor the "core" libraries we need. I even made sure we do it properly: no untracked patches to them, mangle the symbols, soname, and header include paths to avoid conflicts with external copies, provide options to *use* external copies, etc. It's a lot of work.

And after all that? I would really rather just drop a `Cargo.lock` file in for stability and have CI churn on new releases to let me know of what's up in the future.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:52 UTC (Thu) by roc (subscriber, #30627) [Link]

> But the new languages don't seem to solve these problems, they just ignore them and try to vendor everything

Effectively Rust wants developers to vendor everything, but a lot of work has gone into Rust+cargo to solve a lot of hard problems. For example:

cargo provides simple commands to update a dependency to the latest version, usually as simple as "cargo update" or "cargo update -p <library>".

cargo makes it easy to override a (possibly indirect) dependency with a patched version, via "[patch]".

The RustSec advisory-db collects CVEs for Rust libraries, and you can configure the cargo-deny tool to automatically break your build if one of your dependencies has an outstanding CVE.

Rust is designed so that by default linking multiple versions of the same library into a single binary works fine (always undesirable, but sometimes a necessary last resort).
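For instance, a `[patch]` section in the top-level Cargo.toml redirects every use of a crate, even deep in the dependency tree, to a patched copy. `[patch.crates-io]` is real Cargo syntax; the crate name and path here are made up for illustration:

```toml
# Cargo.toml of the top-level project
[dependencies]
png = "0.17"

# Override the crates.io "png" crate everywhere in the dependency graph
# with a locally patched checkout (hypothetical path):
[patch.crates-io]
png = { path = "../png-with-cve-fix" }
```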

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 23:31 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]

Effectively Rust wants developers to vendor everything

I don't think this is quite right. As I understand it, "vendoring" means copying the source code of the libraries you use into your own source tree rather than linking to the distribution-provided library at run time. There are a few problems with vendoring:

  • Library fragmentation. When people on the project discover something wrong with the library (a bug or missing feature) there's a tendency to patch it in the local copy rather than pushing the fix upstream. Even if the project attempts to push changes upstream, the project may keep them if upstream is uninterested, resulting in fragmentation of the library.
  • Patch delays. If something upstream gets patched, it takes extra time and effort to push the patch out to all the projects that have vendored the library compared to patching the single distribution provided version. This is annoying with ordinary bugs and a serious danger with security bugs.
  • Hidden copies. It can be difficult even to track down all the projects that have vendored the library to make sure their copy has been fixed. This further slows patch rollout.

What Rust (and many other languages with their own dependency resolution systems) does is slightly different. They incorporate libraries into a statically linked binary, but they still treat the library as an external dependency rather than copying it into the project wholesale. That means they still have problems with patch delays but much less of one with library fragmentation or hidden copies than projects which have truly vendored libraries.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 1:07 UTC (Fri) by marcH (subscriber, #57642) [Link]

> As I understand it "vendoring" means copying the source code of libraries you used into your own source tree [...] tendency to patch it in the local copy rather than pushing the fix to upstream

In other words forking the source.

> They incorporate libraries into a statically linked binary, but they still treat the library as an external dependency rather than copying it into the project wholesale.

In other words forking the binaries but not the source.

There are probably a few other (and incompatible...) "definitions" of vendoring, for instance those that (wrongly) care about where the copy is hosted, but I don't think any other vendoring definition matters besides the two ways of forking above. I suspect we can get rid of that new word and not lose anything - actually gain some clarity. Please prove me wrong!

Duplication is not bad in itself, it's bad only when it leads to Divergence.
https://doc.rust-lang.org/book/ch03-01-variables-and-muta...

I stopped saying "Copy/Paste"; now I say Copy/Paste/Diverge. Even the least technical managers understand the latter.

Examples of Duplication that keeps Divergence under control: cache invalidation, RCU, version control, snapshot isolation, transactional memory,...

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 1:23 UTC (Fri) by roc (subscriber, #30627) [Link]

Yes, I used the term loosely. Sorry about that.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 1:13 UTC (Fri) by marcH (subscriber, #57642) [Link]

> > It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

(I meant "apt upgrade")

> That's an odd statement. I do that multiple times a week on more than a dozen machines.

Then pause once and try to gauge how many people and how many lines of code you trust every time you install or upgrade a few dozen packages. Maybe pip and cargo won't look that bad after all.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:15 UTC (Fri) by LtWorf (subscriber, #124958) [Link]

> It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

Uh?

How is this more risky than having 900 copies of libpng and hoping that all of them will be upgraded when inevitably the next buffer overflow is found?

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 17:03 UTC (Fri) by marcH (subscriber, #57642) [Link]

I'm not the one pretending to know which is more risky.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 22:12 UTC (Fri) by roc (subscriber, #30627) [Link]

Suppose we had a distro where all packages were built with Rust and statically linked the "png" crate, a CVE was issued for that crate, and a new minor version of "png" was available that fixes the bug. It would be very simple to scan the Cargo.lock files of all packages to see which ones are using a vulnerable version of "png". For each affected package, "cargo update -p png" would update to a non-vulnerable version. It would be easy to automate the entire process.

In this hypothetical distro you would also want to run 'cargo-deny' in CI to ensure that every time a package is built, the build fails if there is an outstanding CVE against one of its components.

The big picture here is that Rust+cargo standardize the build process and metadata to make managing dependencies much easier, more consistent and scalable.

(Of course we're ignoring the issue that you will have to do this much less frequently for a Rust PNG library because Rust code isn't prone to buffer overflows...)

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 2:14 UTC (Tue) by dvdeug (subscriber, #10998) [Link]

Let's compare that to what we have right now; we have a CVE in libpng, we upgrade the version of libpng in the distro, and fix all the packages without recompilation. That's already complex enough without literally recompiling almost every program on the system.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 2:55 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

And then some applications randomly break because of an ABI change.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 2:31 UTC (Wed) by dvdeug (subscriber, #10998) [Link]

Recompilation has been known to randomly break applications as well. The art of a good security patch is that it doesn't change anything besides making the security fix.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 4:59 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

> Recompilation has been known to randomly break applications as well.
Uhh.... Whut? Recompilation can't break applications, especially with repeatable builds. A bad fix that changes the API can certainly do that.

But it's way better than random breakages because the ABI has subtly changed.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 8:33 UTC (Wed) by geert (subscriber, #98403) [Link]

If the recompilation is needed due to a change in a dependency, it is not a repeat of the previous build; if it were, there would have been no point in recompiling.
If the compiler has changed, the recompiled application may behave differently, too.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 16:40 UTC (Tue) by foom (subscriber, #14868) [Link]

If we had the ability to cross-compile for slow target architectures, and proper build automation, recompiling everything that depends on libpng wouldn't need to be a problem.

Distributing the update to users might need some adjustments, too, in order to avoid massive bandwidth usage -- a good mechanism to send just binary deltas for the affected files would be more important than it is now.
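A toy model of the delta idea (real tools such as bsdiff are far more sophisticated): ship only the byte ranges where the new binary differs from the old, plus any tail bytes, and let the client reapply them.

```python
def make_delta(old: bytes, new: bytes):
    """Record (offset, replacement) patches where new differs from old."""
    patches = []
    i = 0
    n = min(len(old), len(new))
    while i < n:
        if old[i] != new[i]:
            j = i
            while j < n and old[j] != new[j]:  # extend the differing run
                j += 1
            patches.append((i, new[i:j]))
            i = j
        else:
            i += 1
    # Also ship any bytes past the old binary's end, and the final size
    return patches, new[n:], len(new)

def apply_delta(old: bytes, delta) -> bytes:
    patches, tail, size = delta
    buf = bytearray(old[:size])       # truncate if the new binary is shorter
    for offset, data in patches:
        buf[offset:offset + len(data)] = data
    return bytes(buf) + tail          # append bytes beyond the old length
```

For a rebuild that changes only a few functions, the patches are a tiny fraction of the full binary, which is the bandwidth win the comment is after.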

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 22:50 UTC (Thu) by flussence (subscriber, #85566) [Link]

It's a bit ironic that the software I *most* need cross-compilation for is the stuff most resistant to being cross-compiled…

(Bought more RAM than I thought I'd ever need. The compiler crashes because it runs out of i686 registers now. *sigh*)

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 5:22 UTC (Thu) by roc (subscriber, #30627) [Link]

Rust is incredibly good at avoiding compatibility breaks in practice. They are very rare and generally involve discovering that some code pattern is unsound and needs to be made illegal. My Rust project is nearly 5 years old at this point and we have hardly ever had to deal with compatibility breaks. On the rare occasion there is a break, it almost always manifests as your code failing to build with a new version of the compiler.

On the other hand, in practice C *does* have compatibility breaks. All large C programs contain bugs where they rely on subtle undefined behavior, and periodically a compiler update will change how it handles that behavior. This is worse than Rust because these regressions are generally not caught at compile time; they show up in testing or production.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:23 UTC (Thu) by vstinner (subscriber, #42675) [Link]

CPython is made of 500K lines of C code. I can testify that it breaks at every GCC major release. Each time, we discover new "undefined behavior" which were running fine previously.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:03 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> Each time, we discover new "undefined behavior" which were running fine previously.

Should be:

> Each time, we discover where we were using undefined behavior and just getting lucky previously.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:44 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]

This isn't a terribly meaningful distinction. It doesn't really matter if you want to blame it on C having lax standards that allow undefined behavior or on lazy programmers allowing their programs to depend on it, it shows that C is not such a stable platform for building complex programs in practice. It's not like the CPython team are a bunch of slackers who don't know how to program. If they're running into this kind of problem, it's a problem with the system rather than their specific group.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:15 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Oh, I certainly agree with that. Maybe I got a bit too quip-y here.

I'll note that I just added clang-tidy checking to one of our code bases (it takes almost 2 hours too :( ). Tons of things are ignored because we've been lax for far too long, but getting tools like ASan, UBSan, clang-tidy, and a whole host of others looking at the code is important for C and C++ code bases to keep their sanity in the unfortunately not-always-well-understood corners of the languages, which stick out all over the place.

But it's also a mistake to then turn around and blame the compiler for utilizing the freedom the language gives it, when the real issue is the developer's lack of knowledge in that area (which is why, IMO, the burden of proof is on the developer, not the linter, when masking lint detections). You either have to live with the dish C has been serving all of us for the past 40+ years, with all the rot and flavor-enhancing spices we now have available, or step up, get in the kitchen, and improve things. IMNSHO, Rust developers have been doing that.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:41 UTC (Thu) by Wol (subscriber, #4433) [Link]

But this is what another post on LWN pointed me at - C IS NO LONGER A LOW-LEVEL LANGUAGE.

The whole point behind "undefined" or "implementation-specific" behaviour was that, where CPU behaviour varied, the compiler would do whatever was easiest for the CPU. The logical model behind C and modern processors have diverged so much that there is no longer a simple equivalence between the C language and the processor's machine/assembly code. So "undefined behaviour is whatever the hardware does" no longer makes sense, but that is what it was supposed to mean!

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:44 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

That's fine when you're coding for a given processor. When you're coding a portable program, undefined behavior is just not acceptable (unless someone foolishly decided "whatever C does here" as part of *their* spec).

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:30 UTC (Thu) by Wol (subscriber, #4433) [Link]

Which is why C is not a good language - which is what a lot of posters here are saying.

Personally, I find C a perfectly okay language. I just feel that C, and Unix, and all that are a perfect example of how what matters is not being any good; what matters is being in the right place at the right time. I cut my coding teeth on FORTRAN, and would probably still be using it if I had the opportunity.

As that article said, C is the perfect language for programming a PDP-11. It's just that modern computers behave completely differently to a PDP-11. Again, I cut my teeth on 50-series Pr1mes. Pr1me tried to re-write a large slab of the system in C, and I suspect that was (a small) part of the reason they went under (the bigger part being that microprocessors like the 6502, the 8080, etc. were beginning to eat the minicomputers' lunch). And with the 50-series having a strongly segmented architecture, that code just didn't map onto the microprocessors' way of working.

Someone needs to do a "C", and design a new low-level language for programming x64.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:38 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> Someone needs to do a "C", and design a new low-level language for programming x64.

Just as aarch64 enters the stage in a meaningful way? Seems apt ;) .

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 2:18 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

> Someone needs to do a "C", and design a new low-level language for programming x64.

Well, there's assembly language. Or LLVM IR, if you wanted something a bit more optimized. But I imagine you wanted something higher-level than either of those options.

IMHO the single most significant pain point for C is undefined behavior. You can broadly divide UB into three types:

1. Essential UB - UB that results from stack/heap corruption or other cases where "You can only figure out what will happen if you know exactly how everything is laid out in memory, the order in which threads are executed, etc." It's "essential" because knowing what architecture you're using only gives you a little information about the program's likely behavior.
2. Accidental UB - UB that results from differences in architectural behavior (e.g. how negative numbers are represented, whether trap representations are a thing, whether memory is segmented, etc.). It's "accidental" because many of these instances of UB are artifacts of the state of the market at the time C was standardized, rather than fundamental constraints on what we can predict about program behavior.
3. UB that should always crash - Mostly, this is just "dereferencing NULL, dividing by zero, and anything else that everyone agrees should always immediately trap," but for the sake of completeness, I would define this as any situation where it's possible (on a reasonable, modern system, when running in userspace) to immediately detect the problem and crash, with no meaningful performance penalty for doing so (e.g. the runtime doesn't have to do array bounds checking or similar).

For addressing #3, the answer is obvious: Crash, and don't have it be UB. For #2, the answer is similarly obvious: Either pick "whatever the x86-64 does" or say "it's implementation-defined" (and not UB). But for #1, the only really effective way to remove it is to prevent stack/heap corruption statically, at compile time. And if you go down that road, you will fairly quickly find yourself reinventing the Rust wheel. Alternatively, you can insert bounds checks everywhere, and go down the Java road instead, but then you're not really a "low-level language" anymore.

TL;DR: I am unable to visualize anything that matches your description, but doesn't already exist.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:09 UTC (Fri) by khim (subscriber, #9252) [Link]

> e.g. the runtime doesn't have to do array bounds checking or similar

But even your short list (with two elements) includes two things which are hard to implement on some platforms. Accessing NULL wouldn't be caught on MS-DOS or many other “small” CPUs (and real mode is not dead, if we consider the platforms discussed in the article to be alive… heck, in a world where Windows 3.0 support was added to compilers in the year 2020, it can be considered more alive than the other architectures discussed here). Catching “divide by zero” is not trivial either, e.g., on AArch64 (fp exceptions are optional there, and you need to periodically check whether they happened — which looks more or less like “array bounds checking or similar” to me).

> Alternatively, you can insert bounds checks everywhere, and go down the Java road instead, but then you're not really a "low-level language" anymore.

But you have just said that you should crash instead! Make up your mind, please!

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 17:57 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

> But even your short list (with two elements) includes two things which are hard to implement on some platforms.

Those platforms can use C. I was asked to design a language "for programming x64," so I deliberately neglected to support older platforms.

I also explicitly stated that we were talking about a "modern system." MS-DOS is not a modern system. Windows 3.0 is not a modern system. Please do not snip out parts of my comment and then complain that the snipped out pieces no longer make any sense.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 18:31 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

Java throws an unchecked exception (which is a reasonable, but much more controlled, approximation to "crashes") if you make an out-of-bounds array access.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 1:39 UTC (Sat) by Wol (subscriber, #4433) [Link]

> 3. UB that should always crash - Mostly, this is just "dereferencing NULL, dividing by zero, and anything else that everyone agrees should always immediately trap,"

And here we have to disagree - computers are supposed to do maths, and division by zero is a common mathematical operation. The result is (scalar) infinity, I believe, and it's actually absolutely fundamental to that branch of mathematics known as calculus.

(One of the problems people have with infinity(s) is that there are so many, and you can't mix them ... :-)

One of my early projects that I remember involved a lot of Pythagoras. The problem was, the three vertices of my triangle could easily lie on a straight line, which would result (as far as the maths was concerned) in a "divide by divide-by-zero". To which the answer is zero. As far as the program was concerned, though, it resulted in a crash, and in a load of extra code to trap the fact that computers can't do maths properly :-)

I don't know whether the language people are doing this, but imho they should get rid of both implementation-specific behaviour and undefined behaviour. Let's take the example of shifting by a negative amount. Imho the principle of least surprise says that a negative left shift is a right shift, so if you explicitly ask for the new standard you get the defined behaviour. Unless you also ask explicitly for the old behaviour. If you don't ask for anything it remains implementation-specific (until the compiler default advances to the new standard :-). And fix undefined behaviour the same way - that should only be allowed when asking for something nonsensical :-)

They had this exact problem with FOR/NEXT loops in 1977 :-) FORTRAN did the test at the end, so all loops executed at least once, while Fortran77 did it at the start, so loops could possibly not execute at all. So we had switches to force either new or old behaviour.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 9:02 UTC (Sat) by mkbosmans (subscriber, #65556) [Link]

> And here we have to disagree - computers are supposed to do maths, and division by zero is a common mathematical operation.
> The result is (scalar) infinity, I believe, and it's actually absolutely fundamental to that branch of mathematics known as calculus.

That is not the case at all.
While you can say: lim x→0 n/x = inf, it does not follow that n/0 = inf.

And as for the more general point, calculus deals with real numbers for the most part. Computers operate on floating point and integer numbers. Operations that make sense in one domain don't necessarily translate 1:1 to another.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 10:09 UTC (Sat) by Wol (subscriber, #4433) [Link]

> That is not the case at all.
> While you can say: lim x→0 n/x = inf, it does not follow that n/0 = inf.

But isn't that what the MATHS MEANS? It doesn't follow that x will *reach* 0, but if it does, then n/x *must* equal infinity. (Quite often, x=0 is illegal in the problem domain.)

> And as for the more general point, calculus deals with real numbers for the most part. Computers operate on floating point and integer numbers. Operations that make sense in one domain don't necessarily translate 1:1 to another.

Principle of least surprise. Yes, floating point is quantized while the reals aren't, but given that (I believe) the IEEE definition of floating point includes both NaN and inf, it would be nice if computers actually used them - I believe some popular computers did 40 years ago (DEC VAX), and I guess it's the ubiquity of x86 that killed it :-(

And the whole point of fp is to imitate real. Again, principle of least surprise, the fp model should not crash when fed something that is valid in the real domain. It should get as close as possible.

People are too eager to accept that "digital is better" "because it's maths", and ignore the fact that it's just a model. And people find it hard to challenge a mathematical model, even when it's blatantly wrong, "because the maths says so".

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 11:38 UTC (Sat) by Jonno (subscriber, #49613) [Link]

> While you can say: lim x→0 n/x = inf, it does not follow that n/0 = inf.

No, you can't say that. For n∈ℝ⁺, lim (x⭢0)⁺ (n/x) = ∞, but lim (x⭢0)⁻ (n/x) = -∞, so lim (x⭢0) (n/x) does not exist. [For n∈ℝ⁻, lim (x⭢0)⁺ (n/x) = -∞ and lim (x⭢0)⁻ (n/x) = ∞, so lim (x⭢0) (n/x) does not exist either; but for n∈{0}, lim (x⭢0)⁺ (n/x) = 0 and lim (x⭢0)⁻ (n/x) = 0, and so lim (x⭢0) (n/x) = 0].

> But isn't that what the MATHS MEANS? It doesn't follow that x will *reach* 0, but if it does, then n/x *must* equal infinity. (Quite often, x=0 is illegal in the problem domain.)

No, the maths says that the domain of the divisor does not include zero; that the closer to zero a positive divisor gets, the closer to positive infinity the value gets; and that the closer to zero a negative divisor gets, the closer to negative infinity the value gets.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 13:15 UTC (Sat) by mpr22 (subscriber, #60784) [Link]

x86 floating point is (hardware defects aside) IEEE-754, floating point division by 0.0 is defined in C, and if you compile:

#include <stdio.h>

int main(void)
{
    float f = 1.0f / 0.0f;
    printf("%f\n", f);
    return 0;
}

with gcc or clang and link against glibc, it prints "inf".

Integer division by 0, on the other hand, is undefined under finite-width two's complement (or unsigned) arithmetic.

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 2:47 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

Infinity is neither an integer nor a real number (when both terms are defined in the mathematical sense rather than the computational sense). The real numbers observe something called the "Archimedean property," which states that there are no infinities or infinitesimals (except that zero is infinitely smaller than all non-zero values).

Why do real numbers have this limitation? Well, the blunt fact is, there's only one totally ordered metrically complete field, and it's the real numbers.[1] If you want to introduce an infinity, you have to give up one of the following:

1. The field axioms (which, broadly speaking, say that you can add, subtract, multiply, and divide real numbers, and these operations behave in sensible and familiar ways).
2. The total ordering of the reals (i.e. for any two reals a and b, either a > b, a < b, or a = b, where = is given its ordinary interpretation of "are literally the same mathematical object" rather than something which the ordering is allowed to define).
3. Two additional axioms about how the ordering interacts with the field operations (basically, you can always add the same number to both sides of an inequality without invalidating it, and the product of two positive numbers is always positive).
4. The reals are Dedekind-complete (in simple terms, "you can take limits" - in more precise terms, every non-empty subset of the reals that has an upper bound, has a least upper bound).

For example, IEEE 754:

1. Everything is non-associative, which is not allowed under the field axioms. Also, NaN and ±inf don't have additive inverses.
2. Since 1.0 / 0.0 != 1.0 / -0.0, we cannot have 0.0 and -0.0 be "the same value" (because you get different answers when you try to use them in the same expression). Neither number is greater than the other according to IEEE 754, and so they violate total ordering. Also, all of the NaNs violate total ordering, too.
3. There are cases for which x < y, but x + z == y == y + z (because x is the largest value with exponent n and y is the smallest value with exponent n+1). Also, you can trivially break this with ±inf.
4. Almost satisfied: The set of negative floats has two upper bounds which are incomparable (0.0 and -0.0), so we cannot say which is the "least" upper bound. But I'm pretty sure this is the only counterexample (ignoring trivial alterations such as "negative floats greater than -1.0," etc.) because I can't think of a way to construct a counterexample out of NaN or inf.

Or the extended real numbers, which IEEE 754 is intended to mimic:

1. inf - inf is not defined (rather than giving an NaN), which is not allowed under the field axioms. Also, ±inf don't have additive inverses.
2. Satisfied if you assume that -inf < all reals < inf.
3. Not satisfied because, for finite numbers x and y with x < y, x + inf = y + inf = inf.
4. The extended reals are compact, which in this context is an even stronger property than completeness.

Or the hyperreal numbers, which are explicitly designed to "follow all of the usual rules" (for use in nonstandard analysis):

1. Satisfied by the transfer principle.
2. Satisfied by the transfer principle.
3. Satisfied by the transfer principle.
4. Not satisfied: Consider the set of infinitesimals. This is clearly bounded, but it cannot have a *least* upper bound, or else you could derive contradictions by doubling this least upper bound (which must give you a non-infinitesimal) and reasoning about the relationship between the resulting number and the set of infinitesimals.

TL;DR: If you like doing calculus etc. in the usual way, then you can't have infinities or infinitesimals.

[1]: Every totally ordered metrically complete field is isomorphic to the real numbers. So they're all, effectively, "the same field" with different names for their elements. We need this caveat because, if you really wanted to, you could just start calling 2 ** 128 "infinity." But it would still be 2 ** 128. You could still multiply two by itself 128 times to get to your "infinity." "A rose by any other name" and all that.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 15:59 UTC (Sat) by mafrasi2 (guest, #144830) [Link]

> And here we have to disagree - computers are supposed to do maths, and division by zero is a common mathematical operation. The result is (scalar) infinity, I believe, and it's actually absolutely fundamental to that branch of mathematics known as calculus.

> (One of the problems people have with infinity(s) is that there are so many, and you can't mix them ... :-)

Division by zero is *not* a common mathematical operation. It is literally undefined in mathematics as well.

In fact, it is absolutely fundamental that it is undefined, because otherwise you could do for any number x

x / 0 = infinity
x = infinity * 0

which would mean that any number is equal to any other number (because they all equal infinity * 0 and equality is transitive).

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 23:57 UTC (Sat) by Wol (subscriber, #4433) [Link]

Except that anything multiplied by 0 is zero.

So we have infinity*0 = 5*0 = 4*0 = 3*0 = 2*0 = 1*0 = 0*0.

If we divide each term by 0, does that mean infinity = 5 = 4 = 3 = 2 = 1 = 0?

I think we've fallen foul of Gödel's incompleteness theorem. In order to make the maths work, we need special rules outside of the maths like "divide by zero, you get infinity" and "divide by infinity, you get zero". And a whole lot of physics depends on infinities. I can't give you any examples (or maybe I can), but there are various different types such that quite often infinity != infinity, and the physics doesn't work. And the "is it 10 or 11 dimensions" model of space-time works, I believe, because it just happens to be true that infinity does actually equal infinity.

Infinity and zero are special cases, required by Gödel, that are needed to make everything else work. Take that example I gave of calculating the sides of a triangle - as soon as we accept that "divide by divide-by-zero equals 0" I can use THE SAME maths on any two points in a Cartesian system to calculate the distance between them. Basically I try and construct a right-angle triangle and calculate the hypotenuse, and if the triangle collapses into a line THE SAME maths still works. And it makes sense that it works...

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 1:28 UTC (Sun) by mathstuf (subscriber, #69389) [Link]

> So we have infinity *0 = 5*0 = 4*0 =3*0 = 2*0 = 1*0 =0*0.

inf * 0 is an indeterminate form. It isn't zero, it isn't inf, it isn't a number. Your logic just breaks down here.

> I think we've fallen foul of Godel's incompleteness theorem.

Umm, no. This is way before Gödel gets involved.

> In order to make the maths work, we need special rules outside of the maths like "divide by zero, you get infinity" and "divide by infinity, you get zero".

No, these rules don't exist (in normal mathematics, see later). They may "make sense" in specific instances, but they are nonsense if you try to extrapolate from them. Dividing by zero is not an operation you can do. It isn't inf, nan, or any other "thing", it just can't be done (at least in the axiomatic framework generally used; IEEE is notably lacking in axioms, so sure inf is fine there). I'm sure one could make an algebra where division by zero "makes sense" (cf. modular algebra or surreal numbers for other number systems; surreal *might* have division by zero, but it is…weird), but it might not be as useful as the algebra we use all the time.

> And a whole lot of physics depends on infinities.

I think you mean infinite series or infinitesimals, not infinities.

> I believe, because it just happens to be true that infinity does actually equal infinity.

Maybe you're thinking of the continuum hypothesis? Though I don't think string theory cares about it in particular (its truth is independent of ZF or ZFC). Though I don't know for sure in that specific instance.

> Infinity and zero are special cases, required by Godel,

I feel like you're not understanding Gödel. Gödel states that there are truths that are unprovable in any given proof system that is consistent. Or you can have all truths, but then you gain all falsities as well without the power to tell the difference. There's nothing in it about infinity or zero (as applied to number theory). Those existed before Gödel came along and are fine. I recommend the book Gödel's Proof by Nagel and Newman which is what finally turned the light bulb on for me (after not getting it in Gödel, Escher, Bach by Hofstadter and another reference I can't remember).

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 10:06 UTC (Sun) by Wol (subscriber, #4433) [Link]

As I understand Gödel, a simple way to put it is "you cannot use a system to prove itself correct". So it's easy to prove boolean logic correct, AS LONG AS you don't restrict the proof to using only boolean logic. It's easy to prove number theory correct AS LONG AS you don't restrict the proof to using only number theory. That's why we can't prove logic correct, because we have nothing else to throw into the mix.

So I have no qualms about throwing that infinity stuff into the proof, because otherwise you can't class zero as a number, because it behaves completely differently to all the other numbers. "My logic breaks down". Yes, because my logic (as per Gödel) MUST be either incomplete, or inconsistent. Without that rule, it's inconsistent. With that rule it's incomplete. Pick one ... I've gone for consistency.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 13:08 UTC (Sun) by mathstuf (subscriber, #69389) [Link]

> As I understand Godel, a simple way to put it is "you cannot use a system to prove itself correct".

There are two parts.

The first is that any sufficiently powerful[1] system of arithmetic is incomplete. In this sense it means that there are statements one can make in the system for which no proof exists (of either its truth or falsity).

The second is that in such a system, the consistency of the system itself is one such statement.

It makes no such claim as to which statement is required.

> That's why we can't prove logic correct, because we have nothing else to throw into the mix.

Sure, but *that* system is also not provably correct. So what have you gained? You (claim to have) jumped one rung up a countably infinite ladder among a countably infinite selection of such ladders. Yay? :)

> So I have no qualms about throwing that infinity stuff into the proof, because otherwise you can't class zero as a number, because it behaves completely differently to all the other numbers.

Zero is a number. It works just fine. Division has a singularity at its value, but all kinds of functions have singularities. Do we need something else for tan(π/2)? Why not extend to the complex numbers with sqrt(-1) while we're at it? Quaternions? Octonions? Sedenions? Each of these is a separate algebra, an extension of algebra over the reals. We don't use them in general because we don't need the additional power they offer in day-to-day uses.

> Yes, because my logic (as per Godel) MUST be either incomplete, or inconsistent. Without that rule, it's inconsistent. With that rule it's incomplete. Pick one ... I've gone for consistency.

You're using the wrong definition of "consistency". It isn't consistent as in "all values must be able to take places of all other values in all expressions".[2] It is consistent as in "there are no contradictions between provable statements" which is *way* more important in (useful) mathematics.

[1] Peano arithmetic is sufficiently powerful. Arithmetic with just the natural numbers, addition, and multiplication, I believe, is not.
[2] You're still "inconsistent" in this sense about the square root of negative numbers for example. Why not toss those in? Why stop at trying to make division "consistent" in this sense when you're leaving out the trigonometric functions, square root, and the other infinite singularity-containing or domain-limited functions alone?

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 16:34 UTC (Sun) by nix (subscriber, #2304) [Link]

> Each of these is a separate algebra, an extension of algebra over the reals. We don't use them in general because we don't need the additional power they offer in day-to-day uses.

... and because they gain annoying limitations (lose a useful property of the reals) with every such extension, and by the time you get to the sedenions there's not really very many useful properties they have left (associativity is more or less it).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 7:17 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

> I feel like you're not understanding Gödel.

This is nothing to be ashamed of, by the way. I took an entire college course just focusing on the incompleteness theorems, and I still only have a very loose ability to follow their basic form. The incompleteness theorems are very, *very* hairy math. You cannot simply skim a couple of Wikipedia articles and expect to understand Gödel.

If you insist on trying to figure out what Gödel was saying without spending multiple years of your life studying the surrounding mathematics, then I would suggest starting out with Gödel, Escher, Bach. Yes, that's a very thick book; no, you should not skim it. The main advantage of GEB is that it actually does explain why and how completeness breaks down under arithmetic, using a real (if rather awkward) implementation of PA. For this reason, it is not an easy read, but it's better than an introductory model theory textbook.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 22:46 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link]

GEB is not an easy read, but it is probably as easy and fun a read as any book that makes a serious pretense of explaining Gödel's Incompleteness Theorem is likely to be.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 3:29 UTC (Fri) by zev (subscriber, #88455) [Link]

> C is the perfect language for programming a PDP-11. It's just that modern computers behave completely differently to a PDP-11. [...] Someone needs to do a "C", and design a new low-level language for programming x64.

I see this kind of thing said a lot, and frankly it's never made the slightest bit of sense to me. What exactly about C itself is remotely PDP-specific? It doesn't strike me as terribly specialized for a PDP or any other particular ISA, so much as for the von Neumann model of computation, which was still pretty ubiquitous last time I checked. If we were all doing dataflow on FPGAs or whatnot, then sure, it'd be a poor fit, but we're still fetching and executing instructions (semantically) one at a time that load and store bytes in memory, pretty much just like Ken and Dennis did on their DECs.

"But modern machines have out-of-order execution and branch prediction and multi-level cache hierarchies!" I've seen some people argue...sure, but the whole point of that kind of microarchitectural sophistication is that it's microarchitectural -- it's not even directly visible at the assembly level, let alone in a high-level language. (Itanium exposed bits of its microarchitecture at the ISA level and look what a raging success that was.)

C's not without its shortcomings, but this notion that it's inappropriate for today's machines because it was initially run on a PDP-11 seems rather silly. Some of those shortcomings:

  • unsafety: if a 1970 PDP had been exposed to the variety of hostile inputs today's internet-connected machines are, this would have been just as much an issue.
  • pointer aliasing: C's challenges are much more entangled with modern compilers than they are with our hardware.
  • lack of abstractions/facilities for "programming in the large": pretty clearly unrelated to the underlying hardware.

None of these seem at all connected to its PDP origins. How would a "language for programming x64" differ?

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 4:21 UTC (Fri) by roc (subscriber, #30627) [Link]

Memory accesses are much slower on modern machines relative to other operations, so it is more important than it used to be to avoid redundant loads and stores. Thus, alias analysis has become more important to optimization, and C compilers more aggressive about exploiting whatever assumptions they can get away with (e.g. type-based alias analysis).

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:20 UTC (Fri) by Wol (subscriber, #4433) [Link]

> C's not without its shortcomings, but this notion that it's inappropriate for today's machines because it was initially run on a PDP-11 seems rather silly. Some of those shortcomings:

It's not that it's inappropriate. One of its major failings is that people *think* it's low level, but it doesn't map that well to what modern processors actually DO. In short, we treat it like the low-level language it *was*.

And it's that disconnect between what we think, and what actually happens, that causes all the problems.

Let's take your "unsafety" point, for example. On a PDP-11, I could have easily reasoned about what was ACTUALLY HAPPENING inside the CPU. That's not to say my programming is perfect, but my mental model of reality would have been reasonably close to reality. Nowadays, that's not true AT ALL.

And that's what bites kernel programmers all the time. Especially the noobs: their mental model of what's going on is wildly out of kilter with reality. The compiler takes the code they wrote and massively rewrites it behind their backs. And then, I often get the impression, the CPU effectively runs the object code in an interpreter ...

That's the point of a low-level language. Imho, if you have well-written code in a low-level language, the compiler SHOULD NOT be able to do that much optimisation. That's not a description of modern C !!!

And therein lies our problem.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 19:08 UTC (Fri) by khim (subscriber, #9252) [Link]

> That's the point of a low-level language. Imho, if you have well-written code in a low-level language, the compiler SHOULD NOT be able to do that much optimisation.

If we used that definition, then modern systems wouldn't have *any* low-level languages. Not even machine code conforms: CPUs with speculative execution may make massive changes to what you wrote in your code!

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 23:13 UTC (Fri) by Wol (subscriber, #4433) [Link]

Didn't I say that the CPU was an object code INTERPRETER? :-)

But if I'm unable to REASON LOGICALLY about what the CPU is going to do, how on earth am I going to get deterministic (ie it does what I want it to do) behaviour from my program?

It's turtles all the way down and logic (and the ability to debug!) has just gone down the plughole with the bathwater ...

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 15:28 UTC (Mon) by anton (subscriber, #25547) [Link]

Interestingly, I have seen repeated claims that current widely-used architectures have been designed for C. While I don't think that's what actually happened in most cases, the claim that C is a bad fit for current architectures is grotesque (although, admittedly, C does not have language features for all the architectural features that architectures have; some are reflected in GNU C extensions, e.g., labels-as-values or vector extensions).

Concerning a new low-level language, yes, we need that, not because C (used as a low-level language) is a bad fit for current architectures, but because the gcc and clang maintainers do not want to support C as a low-level language, and the mindset behind that seems to pervade the C compiler community.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 19:44 UTC (Mon) by Wol (subscriber, #4433) [Link]

Thing is, C *is* a bad fit for modern architectures. It has a whole bunch of features that are undefined, or implementation-defined, which are MEANT to be low-level "match the hardware". Except that they aren't.

Let's just get rid of all these so-called "low level" cock-ups, accept that C is now a high-level language and that undefined and implementation-specific behaviours shouldn't exist, and move on.

Someone brought up retpolines - that monstrosity that tries to make sure that the hardware and the software agree on the imaginary interface where CPU microcode and language macrocode meet ... wtf are we doing!

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 9:27 UTC (Tue) by anton (subscriber, #25547) [Link]

I did not mean the language lawyer version of C. That version is a bad fit for any architecture (including the PDP-11). However, it's great for adversarial compiler maintainers who want to do whatever they want (e.g., produce good benchmark results, grudgingly cater to requests by paying customers, and tell other users that their bug reports are invalid), because this version allows them to always blame the programmer for something or other. After all, no terminating C program is a "strictly conformant program", and whenever someone mentions "conformant program" (the only other conformance level for programs defined in the C standard), the adversaries produce advocacy for why we should consider "conformant programs" meaningless (interestingly, they claim that we should take the rest of the C standard at face value).

I mean C as used in many programs, which has a pretty simple correspondence to architectural features (and you see it easily in contexts where optimization does not set in, e.g., when you separately compile a function that performs just one thing).

The adversaries want us to consider C as a high-level language with no correspondence to the bare metal; that makes it easier to blame the programmers and absolve compiler maintainers of responsibility. The question is why any programmer would want that. We have plenty of high-level languages, often better than C in that capacity, but not that many low-level languages; basically C is the only popular one.

Concerning a totally defined C: I think that is at odds with a low-level language for multiple architectures, but as most (all?) C compilers have demonstrated for the first quarter-century of the language, that's no hindrance for implementing C in a benign rather than adversarial way. And for those who don't know how to do that, I have written a paper (which also explains why I consider totally defined C impractical).

I don't know what retpolines have to do with any of that. They are a workaround for a vulnerability in some microarchitectures and they cannot be implemented in C (there are limitations to C's low-level nature). The vulnerability should be fixed at the microarchitecture level, and I expect that the hardware manufacturers will come out with microarchitectures that do not have this vulnerability.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 13:18 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

> I did not mean the language lawyer version of C.

What version of C are you talking about? The ISO standard? The image of C that you have in your head? My head? The C that GCC 2.95 accepted and worked with?

Let's imagine a world where C compilers magically stop doing "magic optimization" steps that tend to break code. What's going to happen is that C programmers who don't know this stuff already are going to have their code pessimized and, presumably, slower in practice. What are they going to do? Start writing their C in the way the compiler was transforming it internally during optimization passes anyway. They'll learn more C (and, I imagine, be less satisfied with it), and hopefully be using linters and tooling to tell them where their NULL checks are ordered after the uses they should guard, and such.

Rereading that, maybe it wouldn't be so bad. Maybe folks would migrate to better languages. Others might actually learn more about how loose C is in practice. The optimization passes could be migrated to the linters rather than the compiler, to explain "hey, you could reorder your code to This Way and gain some performance". Maybe these passes would then gain some prose explaining the what and why of them.

Then again, I have no idea how such a C would be specified at ISO to disallow these optimizations while still not forcing architectures into two's-complement representations or the like because "it's faster/easier for them".

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 15:28 UTC (Tue) by anton (subscriber, #25547) [Link]

As I wrote: "I mean C as used in many programs", and I actually point to a paper where I explain this in more detail. As for "pessimizing", it's certainly the case that advocates of adversarial C compilers claim that the adversarial behaviour is good for performance, invariably without giving any numbers to support these claims; maybe they think that repeating these claims and wishful thinking makes them true.

Wang et al. checked that for gcc-4.7 and clang-3.1 and found that the adversarial "optimizations" produced a minimal speedup on SPECint 2006, and that speedup could also be achieved by small changes to the source code in two places.

Yes, a performance advisor that points out places where changing the source code may improve performance would be a more productive way to spend both the compiler maintainer's and the compiler user's time than writing "optimizations" that break existing code, "sanitizers" to find the places where the "optimizations" break it, and, on the programmer's side, "sanitizing" their code to withstand the latest attacks by "optimizers" (but not the next one). Moreover, such an advisor could point out optimizations that a programmer can do but a conformant compiler cannot (e.g., because there is an alias that even language lawyering cannot explain away). Of course, unlike maintained programs, benchmarks would not benefit from such a performance advisor; that's why no work goes into performance advisors. Conversely, "optimizations" don't break benchmarks (the compiler maintainers revert the "optimization" in that case), unlike other programs, and that's why we see "optimizations".

But what's more, by insisting on a very limited interpretation of what C means, the language lawyers remove opportunities for optimizations that programmers can make in the source code. I have discussed this at length.

I, too, am skeptical that trying to change the C standard is the way to get rid of adversarial C compilers (not least because you won't be able to achieve consensus with the implementors of these compilers on the committee), and I guess that's why advocates of adversarial compilers like to direct the blame for the misdeeds of these compilers at the standard, rather than at the compiler maintainers. It's not the standard that requires them to miscompile existing, tested, and working programs; it's the compiler maintainers' choice, so they alone are responsible.

Concerning architectures with other than two's complement representation of signed numbers, the last new such architecture was introduced over 50 years ago, and descendants of such architectures exist only in very few places and run select programs. There are far fewer of these machines than of the architectures (all two's-complement) that are not LLVM targets and that have brought about the parent article. And coding in a way that makes use of knowledge about the representation is one of the things that you can do for performance in a low-level language (and compilers do not perform all of these optimizations).

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 10:55 UTC (Fri) by khim (subscriber, #9252) [Link]

> undefined behaviour is whatever the hardware does

If that is the definition, then what the heck is implementation-defined behavior?

No, the confusion runs much deeper. “Undefined behavior” has always meant what it means today. And, in fact, most types of undefined behavior don't cause any confusion. Attempts to read through a pointer after calling free, or to read from an uninitialized variable, rarely cause confusion.

Something like this:

#include <stdio.h>

void foo(void) {
  int i;
  i = 42;          /* writes 42 into foo's stack slot */
}

int bar(void) {
  int i;
  return i;        /* reads an uninitialized variable: undefined behavior */
}

int main(void) {
  foo();
  printf("%d\n", bar());  /* "works" (prints 42) only if bar reuses foo's slot */
  return 0;
}

Should code like the above work or not? Clang breaks it even when compiled with -O0 (but gcc with -O0 works, although any other optimization level breaks it).

I don't know any practicing programmer who says compilers should support code like the above example.

Tragedy struck when decisions of the C standards committee clashed with developers' expectations. Because C was designed to create portable programs, lots of things which are, actually, well-defined (yet different!) on many platforms were put into the “undefined behavior, don't use” bucket (instead of the “implementation-defined, use carefully” bucket).

The intention was, of course, to make programs portable, but something completely different happened instead: so many “implementation-defined, use carefully” things were marked “undefined behavior” that developers started thinking that “undefined behavior” means precisely that: whatever the hardware does.

And now we have all that mess.

But no, “undefined behavior” never meant whatever the hardware does. Not even in C89. It was always “something your program should never do.”

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 12:52 UTC (Fri) by Wol (subscriber, #4433) [Link]

I think that mistake proves my point ... :-)

Undefined, implementation dependent, whatever. The point is, it BREAKS THE PROGRAMMER'S MENTAL MODEL.

And however much you want to blame the programmer, if programmers keep on doing it, it's a design fault ...

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 17:38 UTC (Fri) by khim (subscriber, #9252) [Link]

> And however much you want to blame the programmer, if programmers keep on doing it, it's a design fault ...

Got it. So we have issues with C which even Rust doesn't fully address:

— if you put the check outside of the loop, then it wouldn't test all elements of the array.

— if you initialize your variable after it's used, then the program doesn't work.

— if you change a variable, then other variables (which were calculated on the basis of that variable) don't change as they should.

— you need to actually allocate memory for your data structure; just declaring a pointer doesn't mean you can use it.

And I can probably add dozens more.

</sarcasm off>.

Granted: these are the expectations of people who started studying programming about two months ago… but they are very, very common.

Should we do something about them? If yes then what… if no, then why the heck no?…

> The point is, it BREAKS THE PROGRAMMER'S MENTAL MODEL.

Sure — but pretty much anything can break it if the programmer is not taught properly.

C (and C++) suffer mostly from Hyrum's Law: many things which were supposed not to work… actually work — with real-world compilers. And then, later… they stop (even if the documentation always warned not to use them)… that is when trouble happens (think of the glibc story).

That's the only problem with C/C++… but it's pretty severe: the C language on paper and the C language as implemented by typical compilers were different for so long that it's unclear what can be done at this point.

The thing is: I'm not sure switching to Rust (or any other language) would save us. After 10-20-30 years they would be in the same situation, too.

I'm not even really sure what can be done about it. Have just one fixed compiler without any changes? I don't think it would really work.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 22:20 UTC (Fri) by roc (subscriber, #30627) [Link]

> I'm not sure switching to Rust (or any other language) would save us. After 10-20-30 years they would be in the same situation, too.

No they won't.

Rust is designed to eliminate "undefined" or "implementation-defined" behavior outside of explicit "unsafe" blocks. Yes, there will be compiler bugs etc., but really there will be vastly fewer such problematic behaviors in Rust programs than in C and C++ programs.

That means we can expect Rust programs to behave much more consistently over time than C/C++ programs, as hardware and compilers evolve.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 18:13 UTC (Fri) by anselm (subscriber, #2796) [Link]

Tragedy happened when decisions of C standards committee clashed with developer's expectations. Because C was designed to create portable programs lots of things which are, actually, well-defined (yet different!) on many platforms were put into “undefined behavior, don't use” bucket (instead of “implementation-defined, use carefully” bucket).

AFAIR, the C89 standard carefully distinguished between “undefined” and “implementation-defined” behaviour. “Implementation-defined” behaviour is very emphatically not “undefined” behaviour, it's just that it is not defined by the language standard but by the various implementations (or their underlying platforms).

For example, the result of the >> operator applied to a negative signed integer is implementation-defined – many platforms offer a choice between arithmetical and logical right-shift and the compiler writer needs to pick one of the two, but after that, that particular compiler on that platform will always do it that way. (The reason why this particular behaviour was declared implementation-defined is probably that Ritchie didn't stipulate what was desired and by the late 1980s there were enough C implementations doing it one way or the other that nobody could agree anymore on which way was “correct” without making the other half of the industry “wrong”, and breaking programs that relied on the other behaviour.)

With appropriate care, you can exploit implementation-defined behaviour – especially if your set of implementations is small –, but with undefined behaviour, all bets are off. If you're interested in C code that is maximally portable between implementations, implementation-defined behaviour is, of course, something to avoid, but again it is a good idea to flag it as such in the standard so people can be aware of it.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 20:31 UTC (Fri) by khim (subscriber, #9252) [Link]

> AFAIR, the C89 standard carefully distinguished between “undefined” and “implementation-defined” behaviour.

Yes, but that wasn't my point.

You have explained perfectly why right shift of the negative value is “implementation-defined” behavior. All is very logical and proper.

But what about a shift by a negative value? Many (most?) low-level programmers expect that this would be “implementation-defined”, too. After all, most CPUs do something predictable when they get a negative shift count (different ones do different things, but all CPUs I know do something predictable). More or less the same as with a shift of a negative value: there may be different outcomes on different CPUs, yet there would be some outcome, right?

Well… no.

If you actually open the C89 standard, you will see that “the result of a right shift of a negative-valued signed integral type (6.3.7)” is listed in “Appendix G, part 3: Implementation-defined behavior”… yet “an expression is shifted by a negative number or by an amount greater than or equal to the width in bits of the expression being shifted (6.3.7)” is not in part 3… it's in “Appendix G, part 2: Undefined behavior”!

I would love to know why that difference is there. Do some CPUs lock up when faced with a negative shift? Or does something crazy happen (like: it takes so long that DRAM starts losing its contents)? Or maybe some compiler couldn't handle it? Or… maybe the committee just decided that if they declared it “undefined behavior” then people would stop using it and compiler writers could generate better code?

I have no idea, really. But the end result: -1 >> 1 is “implementation-defined behavior” yet 1 >> -1 is “undefined behavior”.

To most low-level guys this is sheer insanity… yet that's how C89 is defined.

> If you're interested in C code that is maximally portable between implementations, implementation-defined behaviour is, of course, something to avoid, but again it is a good idea to flag it as such in the standard so people can be aware of it.

It's actually done in exactly this way. Not only does the C standard distinguish “unspecified behavior”, “implementation-defined behavior”, and “undefined behavior”, it actually has all of them listed in three appendices! To make sure no one would mix them up.

The only problem: actual programmers don't consult these when they are writing code. They guess, based on their mental model. For most programmers, the mental model either says that you can't shift a negative value and can't shift by a negative value either (these are the sorta-lucky ones: they may not write the fastest code, yet they tend to write correct code) or, alternatively, it says you can push anything you want into a shift and get something back… and then they write something like (a >> (i-1)) * i with a comment /* if i == 0 then result is zero and we don't care what a >> (i-1) produces */… only then a modern compiler “looks” at that, notices that i can never be zero (because that would lead to undefined behavior), and happily nukes the if (i == 0) check, removing it as “dead code”.

And that is where shouting starts. C89 standard clearly says that “undefined behavior” could lead to anything at all… yet “advanced programmers” say that “removing code which I specifically wrote there to catch errors is not anything at all in my book”… hilarity ensues.

P.S. I wonder if people who developed C89 are still alive and can say what they think about all that… does anyone know?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 15:58 UTC (Thu) by luto (subscriber, #39314) [Link]

C++ breaks my code with almost every major gcc update due to improved standard compliance.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:55 UTC (Fri) by ceplm (subscriber, #41334) [Link]

> Rust is incredibly good at avoiding compatibility breaks in practice.

Just an anecdotal piece of my personal experience. The only piece of Rust software I was following (for a year or so) was https://github.com/daa84/neovim-gtk/ and it forced me to upgrade my Rust compiler twice, because code for the later versions was no longer compatible with the old one.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 22:29 UTC (Fri) by roc (subscriber, #30627) [Link]

That is a completely different issue.

The original commenter said:
> [Rust] adds backward compatibility breaks. It isn’t as bad as Python at this, but then the Python people are not advocating Python as a systems language. C’s one great strength is that C code is C code. It tends to just keep working over time.

Likewise, Rust code tends to keep working over time: new versions of the compiler can compile old Rust code. That's what we were talking about.

You are talking about a different issue: can old versions of the compiler compile new Rust code? No, not always, because Rust adds features and some Rust projects like to use those new features.

If you're developing software and you want to never upgrade your compiler, that's fine. Just don't, and stick with the set of features it supports.

If you're consuming someone else's software and never want to upgrade your compiler, better pick projects with the above policy.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 18:51 UTC (Sat) by ceplm (subscriber, #41334) [Link]

Of course, technically, you are completely correct, but I don’t have these problems (or at least not that frequently) with programs in most other programming languages (not mentioning C, because that’s really unfair). Either Rust is still too unstable, or older versions were so lacking that programmers are forced to use the latest features from the bleeding-edge compilers. Am I right?

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 22:59 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Forced to use the latest features? Not likely. Allowed them to remove bad patterns, simplify others, or just use new ones? Sure. There's the nightly-only `-Z minimal-versions` flag to use the *minimum* declared versions of dependencies, but discussions about stabilizing it and making it work across the ecosystem haven't gotten very far (and I've submitted over a dozen PRs to make my dependencies work with the flag).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 1:21 UTC (Mon) by roc (subscriber, #30627) [Link]

I don't know why the neovim-gtk authors bumped their MSRV, but most likely they saw some small feature that would be useful, viewed upgrading the compiler as a trivial step (`rustup update`), and so saw it as an easy win. If upgrading the Rust compiler is actually hard for some significant part of their community (I don't know why that would be), let them know!

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 10:25 UTC (Mon) by laarmen (subscriber, #63948) [Link]

I think it has to do with the fact that Rust has really good tooling to manage the toolchain from a developer PoV, along with good backward compatibility. Upgrading the rustc version is assumed to be a trivial step (rustup update and voilà), and if you're an application author you can provide binaries for most platforms, which means the upgrade doesn't concern those users, as there is no need for them to upgrade their runtime.

In contrast, most other languages either have a runtime component, which makes upgrading painful for all users, and/or an upgrade process that was not trivial when the community around the language started forming its habits. I would assume you'll find a similar attitude in the Go ecosystem.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 12:03 UTC (Mon) by ceplm (subscriber, #41334) [Link]

That’s what I call immature environment: if it works on my laptop, it’s perfect, ship it!

Without considering maintenance costs, long-term support, or combining multiple Rust projects in one system (e.g., Linux/Mac OS X distributions).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 13:47 UTC (Mon) by laarmen (subscriber, #63948) [Link]

You're putting words in my mouth.

The fact that application developer using Rust don't particularly care about sticking to a lower version of Rust is indeed partly because the language is still evolving and gaining features, for instance async/await which was not available in the version of rustc originally shipped with Debian (the situation has since changed). But my point is that it also comes from the fact that it is *very* easy for a Rust developer using what is considered the standard way of developing in Rust to install and use multiple versions of a toolchain, and for most users *from their PoV* the version of Rust doesn't matter since either they compile from source and can thus be expected to use standard (for Rust) tooling, or they are using already-compiled binaries with no runtime dependency on the version of Rust.

Saying that maintenance costs, long-term support, and combining multiple Rust projects in one system aren't considered by the Rust community is just plainly false. It's just a matter of perspective: they consider rustc to be just another build-time dependency, and it's okay to require those who build to bump it, which on its face makes sense: you're already updating something in your system (the project itself), so it shouldn't be a big bother to update something else.

I'm not saying this is the absolute right choice, as there clearly is a mismatch with how distros work, but things are not black and white.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 14:32 UTC (Mon) by ceplm (subscriber, #41334) [Link]

> You're putting words in my mouth.

I am not. I didn't mean it as pretending to quote you, and I didn't even claim that you would support this statement; it just seems to me that this attitude is all too present in the Rust community.

> […] the language is still evolving and gaining features, for instance async/await which was not available in the version of rustc originally shipped with Debian (the situation has since changed).

Yes, in other words, the language is still too immature for the projects of the size of Linux distros and similar.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 15:28 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

> and or combining multiple Rust projects in one system (e.g., Linux/Mac OS X distributions).
It works just fine with multiple Rust projects and environments (see https://doc.rust-lang.org/edition-guide/rust-2018/rustup-... ). I have several Rust versions installed on my laptop side-by-side for tests and cross-compilation.

And the fact that you mention Windows/macOS is especially funny, because Rustup has native installers for them, making experimenting with them very easy.

And of course, Cargo makes sure that libraries don't interfere with each other, so each project gets its own dependency closure.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 5:28 UTC (Thu) by roc (subscriber, #30627) [Link]

> Rust is being promoted as a systems language when it doesn’t work on all of the hardware needed by a systems language.

Who gets to define which hardware needs to be supported for a language to be "a systems language"?

When gcc drops support for a CPU architecture, does that mean "systems languages" no longer need to support that hardware? Did someone appoint the gcc maintainers as the guardians who get to define what it means to be "a systems language"?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 6:12 UTC (Thu) by jalla (subscriber, #101175) [Link]

No, but when gcc drops support for an architecture, you're still able to build C software for the target with older releases; brcm still ships gcc4 as the primary toolchain, as an example, and many others do as well. Requiring software that never existed to build software for real systems (like s390x) is preposterous and missing the $ of the market.

What has happened here is the epitome of the Python mindset, which is "if it doesn't impact me, it doesn't matter". I'm not going to take a stance on whether this is right or wrong, but it's actively harmful to users.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 6:54 UTC (Thu) by StillSubjectToChange (subscriber, #128662) [Link]

"Requiring software that never existed to build software for real systems (like s390x) is preposterous and missing the $ of the market."

Rust supports the s390x as a Tier 2 platform, Gentoo just doesn't have packages for it yet. However, Rust does not support the s390 and neither has Linux since 2015. Besides, if a company bought an IBM mainframe then they shouldn't be making *any* complaints about support from open source projects.

"I'm not going to take a stance on if this is right or wrong, but it's actively harmful against users."

Realistically it isn't very many users. If a platform is so anemic that it doesn't have an LLVM backend and isn't implementing one, then it's functionally abandoned.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 9:07 UTC (Thu) by josh (subscriber, #17465) [Link]

> brcm still ships gcc4 as the primary toolchain, as an example

Which means that the same complaints would arise for using any C features that GCC 4 doesn't support, such as most C11 features.

I remember seeing complaints when projects dropped support for pre-C89 compilers. Those complaints don't make it reasonable to keep K&R C support forever.

> Requiring software that never existed to build software for real systems (like s390x)

Rust supports s390x. It sounds like Gentoo didn't ship Rust for that platform.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:07 UTC (Thu) by banana (guest, #144773) [Link]

LLVM supports s390x. It doesn’t support s390, which hasn’t been manufactured in 21 years.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 6:11 UTC (Thu) by StillSubjectToChange (subscriber, #128662) [Link]

"The third major issue is that Rust has the cargo system as part of its standard use model. This encourages bad behavior. I do not care how “memory safe” your language is if people regularly include unvetted code from some repo.
...
Either way, C as a tool is blameless of programmer error."

So C is blameless for its numerous pitfalls, but Rust is responsible for people misusing cargo? No, I don't think that is a reasonable opinion.

"The final point that I have yet to hear properly explained is why C is good enough to write other languages in, but not okay for others to use."

This seems like a complete non sequitur. If your language compiler is written in C then it's only a build time dependency. If your language interpreter/vm is written in C then you must spend a lot of time making sure it's reliable and secure. In either case it is much safer than having everyone write C.

But C is on the way out for implementing new programming languages. LLVM is a C++ based project, they are even implementing their libc in C++ instead of C. GCC is using C++ for more and more of the compiler. The Go toolchain doesn't need C at all. All major JavaScript engines are written in C++. The list goes on, but clearly people don't believe that C is fit for implementing their languages anymore and will choose not to if possible.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 10:41 UTC (Thu) by Sesse (subscriber, #53779) [Link]

Well, Rust is the only language I've seen where you cannot have a global variable without pulling in a crate.

(You can have a global variable, but not reasonably access it without a mutex, and to initialize that mutex, you de facto need the lazy_static crate.)

There are so many things I think Rust has done right. I really want to love the language. But I so dislike that it is yet another language with dependency sprawl and its own package manager that works for its one language only.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 11:51 UTC (Thu) by moltonel (guest, #45207) [Link]

You can use std::sync::Once instead of an external crate, or even just a plain static for basic types.

If you want your global to be mutable after init then of course you need to protect it with a mutex or similar, that's a basic C mistake that Rust is protecting you from.
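
For illustration, on current Rust (1.63 or later, well after this discussion) a mutex-protected global needs no external crate at all, because Mutex::new became a const fn; a minimal sketch:

```rust
use std::sync::Mutex;

// A global, mutable, mutex-guarded value using only std.
// (Before Rust 1.63 this required std::sync::Once, lazy_static,
// or once_cell, since Mutex::new was not yet const.)
static COUNTER: Mutex<u64> = Mutex::new(0);

fn bump() -> u64 {
    let mut n = COUNTER.lock().unwrap();
    *n += 1;
    *n
}

fn main() {
    bump();
    bump();
    println!("{}", bump()); // prints 3
}
```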

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 11:56 UTC (Thu) by Sesse (subscriber, #53779) [Link]

Yes, I wanted it to be mutable after init; it's a cache of state between multiple HTTP requests. So I need a Mutex, and how do you initialize a Mutex safely without lazy_static?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:07 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

You could do what `lazy_static` is doing behind the scenes. It's not compiler magic or anything, but plain Rust code. The crate just packages it up in a nicer API than stamping out the manual code all the time.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:13 UTC (Thu) by Sesse (subscriber, #53779) [Link]

Sure, but when I searched around for this, people said “do not reimplement lazy_static, you'll be doing it wrong, use the crate”.

It's a bit like Turing completeness. It's _possible_ to do without a crate, but it's definitely much harder, and it doesn't really matter what you do as a single developer as long as the entire ecosystem goes the other way.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:25 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Sure, I didn't say it'd be easy. The fact that lazy_static makes it so easy behind its easy-to-use API is a *benefit*, not a downside.

Global mutable state is a tricky thing in C and C++ too. It's a mistake that they make it so easy to *appear* that one got it right. The fact that one can easily add a dependency that handles something as "trivial" as this is a vast improvement. C and C++ dependency management is a PITA on a good day with reasonable projects as dependencies. Rarely do you get one, never mind both.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:44 UTC (Thu) by Sesse (subscriber, #53779) [Link]

What? In C++, I can have a global std::mutex with no external dependencies. There's absolutely no reason why Rust couldn't have a simple way of initializing one in std, without requiring a crate.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 15:37 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Sure, you can make the mutex, but what data is it guarding? Who ensures the mutex is *used* when accessing the guarded data (spoiler: no one)? When does any of it get initialized at runtime? C++ doesn't guarantee any of this stuff (other than "it happens before it's needed" for the initialization). For example, static initializers might only be run when other code in the .o is used. If the guarded data lives in the data section and is looked up in some way other than through the TU that contains the mutex (e.g., `dladdr` or something), the mutex initializer might not be run at the right time, so good luck with that.

In the case of lazy_static specifically, there is another way that looks better and is also more performant[1]: once_cell. It doesn't use a macro and looks cleaner anyways (no weird macro-syntax of `static ref`). So in this specific instance, the stdlib would have gained a subpar API for such a thing anyways. This is, IMO, a vast improvement over the Python way which enshrines bad APIs because "they were available first" and is how one ends up with urllib, urllib2, urllib3, and requests.

[1]https://github.com/async-rs/async-std/issues/406
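
The once_cell pattern has in fact since been absorbed into the standard library as std::sync::OnceLock (stable since Rust 1.70, after this thread); a sketch of the lazily initialized global it enables:

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// Lazily initialized global cache: built on first access,
// thread-safe, no external crate needed on Rust 1.70+.
fn cache() -> &'static Mutex<HashMap<String, u32>> {
    static CACHE: OnceLock<Mutex<HashMap<String, u32>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

fn main() {
    cache().lock().unwrap().insert("answer".to_string(), 42);
    let v = *cache().lock().unwrap().get("answer").unwrap();
    println!("{v}"); // prints 42
}
```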

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 15:58 UTC (Thu) by Sesse (subscriber, #53779) [Link]

> Sure, you can make the mutex, but what data is it guarding? Who ensures the mutex is *used* when accessing the guarded data (spoiler: no one)?

Please stop the whataboutism.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:13 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

You're arguing that Rust makes global mutable state harder to do. Yes, it does. It is in service of helping to point out improper use of such things. That is, IMO, important in such a discussion of comparing the ease of making such declarations in each language.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:30 UTC (Thu) by Sesse (subscriber, #53779) [Link]

No, I am arguing that Rust makes even simple things hard to do _without pulling in crates_. I am notably not making a comparison against C++.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:46 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

I'm arguing that what seems "simple" is not as simple as you might think it is based on how "easy" C and C++ have made it in the past.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:36 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Why do you even need global state? It's an uncommon thing to use, so having users write a bit of code for it is perfectly fine.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:30 UTC (Thu) by marcH (subscriber, #57642) [Link]

If you think global mutable state is a "simple thing" then you have a serious memory safety problem.

It took a very long wait, but in 2011 even C finally got a memory model that recognizes concurrency has to be part of the language:

https://en.wikipedia.org/wiki/C11_(C_standard_revision)

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:41 UTC (Thu) by Sesse (subscriber, #53779) [Link]

I think initializing a mutex should be a simple thing!

I'm giving up this discussion; too many people are interested in arguing against strawmen, and too few people are interested in discussing the actual problem. It's pretty off-putting when a community's reaction to criticism is “who needs to do that anyway”.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:59 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> I think initializing a mutex should be a simple thing!

And it is, mechanically. Semantically, it is *not* a simple thing. These kinds of issues are what Rust is aiming to tackle as a whole.

Could once_cell or lazy_static be added to the stdlib? Sure. Why not yet? Maybe the API isn't sufficiently nailed down, soundness cases considered, etc. enough for the stdlib. Until then, crates.io is a handy place for these things to mature *while getting real world (ab)use*.

Some context for the C++ side of things. Improvements living in some random P paper on the ISO C++ standard committee mailings aren't going to get battle-hardened by anyone other than the author without the heroic work of making them available on existing language versions (e.g., Eric Niebler's Ranges library). This kind of stuff is nigh impossible with language features too. There are still errata coming in for `for (auto i : expr)`, for crying out loud, because this is undefined behavior:

std::vector<std::string> func();
// ...
for (auto i : func()[0]) // oops: you're iterating over a string inside a temporary vector that has already been destroyed
{}

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:58 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

> I think initializing a mutex should be a simple thing!
It's actually not. For example, on some systems mutexes can require a system call.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:14 UTC (Thu) by laarmen (subscriber, #63948) [Link]

Actually, this is relevant to the discussion. A Rust mutex isn't standalone but a container, and it must be able to guarantee that its contents are a valid value for the contained type, whereas a C++ std::mutex doesn't have any knowledge of the data it protects. The former approach makes it possible to guarantee protected access, but it makes the mutex implementation that much more difficult.

I agree that it's a bit of a shame to have to resort to a third-party crate to easily have a global mutex-protected variable, but as pointed out by someone else in the thread, it turns out the approach used by the prominent crate for this might not be the best after all, which makes me glad it has not been imported into std after all :)

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:24 UTC (Thu) by farnz (subscriber, #17727) [Link]

All of those are lists of why what appears to be simple in C++ ("just" create a std::mutex at global scope) is in fact not simple at all once you consider the details. It's just that C++'s (and C's) way of doing things relies on you knowing that it's not that simple, and that you have a whole boatload of other complexity to consider when doing this.

And Rust globals are writable - you have to use the unsafe marker with writes to a global, because writing to a global without sufficient consideration of concurrency results in memory unsafety (in C, Rust, or C++ - this is common to all systems languages). C++ just doesn't force you to confront that front-and-centre, where Rust does.
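
A minimal sketch of that point (names are illustrative): a plain `static mut` can only be touched inside an `unsafe` block, while an atomic makes the concurrency handling explicit and safe.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// A plain mutable global: every access must be marked unsafe,
// because nothing stops two threads from racing on it.
static mut UNSYNCED: u32 = 0;

// The safe equivalent: the atomic type encodes the synchronization.
static SYNCED: AtomicU32 = AtomicU32::new(0);

fn main() {
    unsafe {
        UNSYNCED += 1; // compiles only inside an unsafe block
    }
    SYNCED.fetch_add(1, Ordering::SeqCst);
    assert_eq!(SYNCED.load(Ordering::SeqCst), 1);
}
```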

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:19 UTC (Thu) by josh (subscriber, #17465) [Link]

> Yes, I wanted it to be mutable after init; it's a cache of state between multiple HTTP requests. So I need a Mutex, and how do you initialize a Mutex safely without lazy_static?

To answer your question:

We're currently adding an equivalent to lazy_static in the standard library: https://doc.rust-lang.org/std/lazy/index.html . It's currently available on nightly, and folks are working towards marking it stable.

We're generally very careful before adding something to the Rust standard library. We have strong stability guarantees for anything in the standard library, and once we add something it's subject to those same guarantees. The Python project has the philosophy that "the standard library is where code goes to die", for much the same reason; there are various pieces of the Python standard library for which the standard wisdom is "don't use it, use this third-party module instead". We want to avoid that situation in Rust whenever we can, even if that means that some common functionality requires a crate. It's very easy to add a dependency on a crate from the crates.io ecosystem.

Now, separate from the answer to your question, there are two reasons you might not want to use a global mutex-guarded variable as the cache for your HTTP requests. First, you might want to use a concurrent data structure instead, such as "dashmap", a fast concurrent hashmap. And second, you might consider putting that data structure in one of your library's objects instead, so that you (or other code calling into your code) can use multiple such objects concurrently without global state. All that said, you *can* use a global mutex-guarded variable if you want to, and std::lazy should let you do that using just the standard library.
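
As a sketch of that second suggestion (the Client type and its method names are invented for illustration, not from any real library), the cache can live in an object rather than in a global, so callers can hold several independent instances:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical library object that owns its own mutex-guarded cache,
// instead of stashing it in global state.
struct Client {
    cache: Mutex<HashMap<String, String>>,
}

impl Client {
    fn new() -> Self {
        Client { cache: Mutex::new(HashMap::new()) }
    }

    // Return the cached response, computing and storing it on a miss.
    fn get(&self, url: &str) -> String {
        let mut cache = self.cache.lock().unwrap();
        cache
            .entry(url.to_string())
            .or_insert_with(|| format!("response for {url}"))
            .clone()
    }
}

fn main() {
    let a = Client::new();
    let b = Client::new();
    // Two clients, two independent caches: no shared global state.
    assert_eq!(a.get("https://example.com"), "response for https://example.com");
    assert_eq!(b.get("https://example.com"), "response for https://example.com");
}
```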

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:58 UTC (Thu) by ceplm (subscriber, #41334) [Link]

> So C is blameless for its numerous pitfalls, but Rust is responsible for people misusing cargo?

No, but C has glibc, Rust has ??? Is there a complete standard library for Rust (in the style of glibc or the Python stdlib), or are Rust people the same as Node or Lua ones (although, the latter is more forgivable, because of embedded space): “La, la, la, there is no problem, just pick some stuff from NPM/Luarocks/Cargo.”

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:45 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Rust's stdlib is pretty much equal in features to glibc. It's not like glibc has anything apart from basic IO.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 4:51 UTC (Fri) by zev (subscriber, #88455) [Link]

I dunno, glibc's got, say, strftime(), and a PRNG (or three). And sure, the latter's not cryptographically secure, but it's nice to be able to generate some quick-n-dirty test data without having to take a third-party dependency.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 12:22 UTC (Fri) by khim (subscriber, #9252) [Link]

Unfortunately, glibc has lots of things besides basic I/O: internationalization and authorization, Berkeley DB and elliptic functions.

If you try to look at glibc you'll find a bazillion different things there, most in a half-useful state and not very useful at all.

While glibc does the best job it can, it's really a pile of crazy things which are there just because Unix variants have grown all these warts, and glibc needs to support them all.

I, for one, am very glad that Rust has nothing like glibc.

What I find most ironic is that some of the same people who condemned systemd “because it violates Unix philosophy” (specifically, they claim it violates the principle “do one thing well”) now condemn Rust because it refuses to provide a huge pile of… things in its standard library.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 8:57 UTC (Thu) by josh (subscriber, #17465) [Link]

> It adds backward compatibility breaks.

Rust cares quite a bit about *not* having backwards compatibility issues. A project dropping compatibility with ancient platforms would be a potential backwards compatibility break in that project, not in Rust.

> Rust is being promoted as a systems language when it doesn’t work on all of the hardware needed by a systems language.

Only if you're tautologically defining "all the hardware needed by a systems language" as "every platform that has ever had a C compiler".

Rust runs on most current platforms, and many non-current ones.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:36 UTC (Thu) by farnz (subscriber, #17727) [Link]

In answer to your question about "why is C good enough to write other languages in?": it's not. It's been used because it's what we had when those languages started being written, and good things that exist are better than perfect things that don't exist, and so we now have technical debt to pay down in relation to security and undefined behaviour.

Rust is one path for paying down that technical debt - it's not the only possibility, but it's one that exists now and has found a sweet spot that Agda (theoretically better, but harder to use) and Object Pascal (Lazarus project) have not found. I'm confident that in the future, we will find a new sweet spot language, and will have tech debt written in Rust to pay down, too.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 13:25 UTC (Wed) by kmweber (guest, #114635) [Link]

And, I mean, it's not like it's particularly difficult to write memory-safe code in C. I don't know where this myth that C is an "inherently insecure platform" comes from. It's not. It's exactly as easy to write safe code in C as it is to write unsafe code. It's programmers who choose not to, not the language forcing it on them.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 14:28 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

Sure, writing secure C is *possible*, but I think that, as a whole, we programmers have proven to be pretty shitty at it. If the Linux kernel code review process with all the C veterans can't get it right (just look at the stable kernel patch queue!), what makes you think it's viable for the general coding population to use it? Sure, one can use clang-tidy, sanitizers, valgrind, etc. on it, but I see that as a failing of the language being propped up by expensive tooling rather than a benefit of the language itself.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 15:37 UTC (Wed) by Wol (subscriber, #4433) [Link]

Yup. Different languages, different strengths, different weaknesses. C *encourages* you to play with pointers, which means even experienced programmers use them when they're not necessary. And if you play with knives when you don't need to, you WILL, on average, get cut. Sometimes badly.

I'm sure Rust has its faults. My favourite language, DataBasic, has quite a few. But one of the biggest flaws in a language is using it in an environment for which it is not suited. C *was* brilliant as a low-level systems language. Hardware has evolved. C is no longer low-level. People still use it as a low-level language and get badly sliced by the impedance mismatch between what C thinks the hardware is and what the hardware really is. And it's the easy access to pointers that encourages this dangerous behaviour.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 15:41 UTC (Wed) by pizza (subscriber, #46) [Link]

It's disingenuous to claim that programmers "choose to not write memory-safe" code. Bugs are (almost) never intentional.

But you're far more likely to get cut when playing with knives than with spoons.

Meanwhile, in the world where I spend most of my F/OSS (and often, professional) time, the majority of the code I write is what other languages consider inherently "unsafe". It's probably fair to say that C is the least-worst option.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 1:44 UTC (Thu) by marcH (subscriber, #57642) [Link]

> Those two packages in Gentoo are dependent on cryptography, but it turns out that they do not actually need it. A pull request to fix that was merged, so the problem for Portage, which is pretty fundamental to the operation of a Gentoo system, was averted.

Literally ROFL.

> At least it was averted for now.

Yes, of course. I'm back in my chair, sorry.

> there may not have been sufficient communication about the change.

Fixed!

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 3:28 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

> It turns out that the project has its own versioning scheme, which allows this kind of change (as does semantic versioning, actually).

What's even more interesting is that the semver spec (https://semver.org) gives the following rationale for this rule:

> Software that explicitly depends on the same dependencies as your package should have their own dependency specifications and the author will notice any conflicts.

Obviously, that proved to be a rather optimistic assumption in this case. On the one hand, I'm having a hard time believing that this is somehow the cryptography developers' fault. They are not in a position to tell downstream reusers how to do dependency management. On the other hand, Python's packaging system is... kind of bad at this entire problem space. In Debian, for example, apt-get update && apt-get upgrade is *mostly* safe (if using stable), and unattended-upgrades pointing to (just) the debian-security repos is almost completely safe. But pip install --upgrade can and will make a complete hash of anything and everything if you're not careful, and the simplest alternative, pip freeze >requirements.txt, is pretty much stuck at the opposite extreme of "never upgrade anything ever." So instead you have to do more complicated shenanigans with requirements and/or constraints, which basically forces you to know about every single package in your entire transitive dependency tree. Not ideal.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 7:01 UTC (Thu) by tiran (guest, #94212) [Link]

Christian Heimes here.

The article has two factual errors:

1) I'm not a developer of PyCA cryptography and don't have the commit bit. I'm merely a contributor, supporter, and packager for Fedora & RHEL. Alex and Paul call the shots.

2) The developers have NOT decided to create a 3.3 LTS release yet. So far it's just a proposal I made to appease users. Alex is strongly anti-LTS, but still open to being persuaded.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:31 UTC (Thu) by jake (editor, #205) [Link]

Sorry for the factual errors, Christian ...

> I'm not a developer of PyCA cryptography and don't have the commit bit.

When I looked at your entry in the comment stream on the bug (hovered over "tiran"), it said "Committed to this repository in the last week", so I thought that made you one of the developers of the code. Are we perhaps disagreeing over what "developer" means here?

jake

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 15:38 UTC (Fri) by t-v (guest, #112111) [Link]

Even if you disregard official associations (which in my experience will show up as "member" or "collaborator" or somesuch), I think it is useful to distinguish developer and contributor.
A better way to assess involvement would be

https://github.com/pyca/cryptography/graphs/contributors

This seems to indicate that Christian has 19 commits (at the time of writing this), which would make him a somewhat regular contributor.
At least I would not consider myself "a developer of ..." unless I were more heavily involved. (Of course, there are also contributions outside commits, but if you choose that metric, the above stats page seems a better source than the "committed..." line in the hover.)
One caveat is that it seems to only work with name matches or GitHub IDs, so I've seen it be quite off when people switch (work, typically) email.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 8:06 UTC (Thu) by glaubitz (subscriber, #96452) [Link]

Copying my comment from HN for more visibility:

The Rust portability issue is the reason I have been pushing forward with the new M68k backend for LLVM which is due to be merged shortly:

> https://github.com/M680x0/M680x0-mono-repo

Patches for the M68k backend are discussed on reviews.llvm.org.

I have also started a Rust port for m68k already that depends on the LLVM backend work above:

> https://github.com/glaubitz/rust/tree/m68k-linux

For other architectures, the goal is to get gccrs moving forward and merged upstream:

> https://github.com/philberty/gccrs/

For anyone interested in helping, there are Bountysource campaigns I started to support both efforts:

> https://www.bountysource.com/issues/90829856-llvm-complet...

> https://www.bountysource.com/issues/86138921-rfe-add-a-fr...

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 9:09 UTC (Thu) by josh (subscriber, #17465) [Link]

> For other architectures, the goal is to get gccrs moving forward and merged upstream:

I would much rather see rustc_codegen_gcc upstreamed: https://github.com/antoyo/rustc_codegen_gcc

A GCC backend for rustc would provide all the code generation of GCC without duplicating the Rust frontend.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 11:06 UTC (Thu) by jrtc27 (subscriber, #107748) [Link]

Both have value. A separate backend is the simpler solution, but having an entirely separate frontend means you have an independent implementation which is good for the language, just like we see with GCC and LLVM.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:29 UTC (Thu) by moltonel (guest, #45207) [Link]

It's not as clear-cut as you think.

Looking at the GCC/LLVM/MSVC example, implementations are able to compete on C/C++ language features, build/runtime performance, supported targets, analyzers, etc. The story is similar with JavaScript engines, for example. That's both a frontend and backend win.

But looking at something like Python, none of the alternate implementations compete on language features, just aspects like performance (I put stackless in the performance category rather than feature) or target. Language-wise, they all lag behind CPython. That's a backend win but a frontend loss.

A Rust GCC frontend will always lag behind rustc, like mrustc currently does. Keeping compatibility with gccrs will be a (hopefully small) burden for Rust developers. Is the frontend/backend tradeoff that Python made a good bet for Rust too? It turns out that rustc has an advantage over CPython: it already supports multiple backends (LLVM and Cranelift). It will hopefully support a GCC backend "soon". So Rust could get its backend win while avoiding a frontend loss.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:48 UTC (Thu) by glaubitz (subscriber, #96452) [Link]

> A rust gcc frontend will always lag behind rustc, like mrustc currently does.

Rust will probably (and hopefully) slow down its process of changing the language. At some point, they will want to stabilize, and then alternative implementations will be able to catch up.

It's in Rust's very own interest that alternative, more portable implementations exist. This will help Rust's adoption in the community because portability will no longer be a concern.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:21 UTC (Thu) by moltonel (guest, #45207) [Link]

Python is 30 years old and just added pattern matching. C++17 and C++20 feel like completely new languages. The alternative implementations are constantly lagging. If you mean "stability" as "no new language features" then I don't know of any general-purpose language that reached that stage without being considered dead. OTOH, Rust prides itself on "stability without stagnation" meaning that new features get added without breaking backward compatibility, and they do an exemplary job of it.

Rust clearly needs to reach niche and new platforms faster, to gain wider acceptance and not cause anguish when the next library starts using it. If it can do that without inflicting frontend differences on the user, it'll have done better at portability than C and C++.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:35 UTC (Thu) by glaubitz (subscriber, #96452) [Link]

> Python is 30 years old and just added pattern matching. C++17 and C++20 feel like completely new languages. The alternative implementations are constantly lagging. If you mean "stability" as "no new language features" then I don't know of any general-purpose language that reached that stage without being considered dead.

The difference is that these other two languages have a specification and alternative implementations that can keep up with each other, don't they? Also, C/C++ especially are evolving at a much slower pace than Rust, where almost every new upstream release contains new language features.

I mean, the whole reason this article exists is because Rust is special in this regard and causes problems for downstream projects. You can't really deny that, can you?

> Rust clearly needs to reach niche and new platforms faster, to gain wider acceptance and not cause anguish when the next library starts using it. If it can do that without inflicting frontend differences on the user, it'll have done better at portability than C and C++.

Sure, but the problem is that this hasn't happened yet and one of the main reasons is the fact that the language is moving so fast that alternative implementations have trouble keeping up.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:59 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

I don't know how well the various C++ implementations are keeping up with C++20 (at least). Specifically with Concepts, Coroutines, and Modules implementation progress.

> Also, especially C/C++ are evolving at a much slower pace than Rust where almost every new upstream release contains new language features.

I think they have a longer release cadence, but the difference between Rust c. 2017 and Rust c. 2020 is, IMO, *far* smaller than the difference between C++17 and C++20 (IMO). Rust added async syntax and language features, tweaked the module import spelling (in a backwards compatible way via the editions mechanism), added const compile-time evaluation support, and other things. On the other hand, C++ got consteval, a completely new module system (that needs build systems to chase actual use of it in practice), concepts, coroutines, and many many more things.

> Sure, but the problem is that this hasn't happened yet and one of the main reasons is the fact that the language is moving so fast that alternative implementations have trouble keeping up.

I think the main thing is that Rust is a hard language to implement. mrustc skips lifetime checking, but handles the rest of the language (as of 1.29 or so). The main developer of it went onto other tasks (IIRC, academic pursuits). A community could certainly get behind it to push it further.

I don't know that it's so much "have trouble keeping up" as getting over the initial hurdle to getting a lifetime-tracking system that is independently implemented. Once that happens, I don't think much in the language would be *too* difficult to keep up with (most of the changes I'm seeing in release notes are stdlib, perf changes, or incremental language changes that are likely way easier than C++ incremental updates (as long as we're comparing)).

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:09 UTC (Thu) by roc (subscriber, #30627) [Link]

I would say Rust is adding features at a lower rate than C++ is, currently. C++ just added modules, concepts and coroutines which are each *huge* language features.

I agree with you that multiple popular Rust frontends, in conjunction with a Rust specification, would be a good thing at some point, to clarify implementation bugs vs language features. However it's a huge investment to keep all three C++ frontends going and Rust isn't yet at the point where that could be sustained.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 12:34 UTC (Fri) by khim (subscriber, #9252) [Link]

It just so happened that C++20 got a huge number of new features (thus, I think it will be the same story as with C++11/C++14: many users will just skip C++20 completely and go from C++17 to C++23, where the rough edges of all these new things will be somewhat softened).

But these things were developed for more than ten years! C++ concepts, in particular, were conceived before Rust even got its name!

It just so happened that C++20 is a huge release… time will show whether Rust will ever have such a huge language-bending moment as C++11 and C++20 were.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 14:02 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

And it's not like C++23 or C++26 are looking any smaller either. Networking, modularized stdlib, pattern matching, templated `this` (to avoid rewriting the same body for a const, non-const, ref, and rvalue-ref variants of a method), metaprogramming, reflection, contracts, and others I can't even remember right now are all on the docket for the next two releases.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 17:56 UTC (Fri) by khim (subscriber, #9252) [Link]

The things which you have listed are not larger than what Rust gets in three years, though.

Small additions like these are added to Rust basically every release.

But concepts and especially modules… they change things. They don't just allow you to write less code in some cases; they allow you to do things which weren't possible (or at least weren't feasible) before, at the level of whole-program design.

Even metaprogramming doesn't change C++ as much as modules or concepts: you just get "for free" something which was already available before, just with an IDL and tools like capnproto.

Sure, that's a nice simplification, but it doesn't imply the insane amount of work which would be needed to make modules or concepts work. They would require changes in literally everything, from the standard library to third-party libraries and many other things, before they would become truly useful.

All the things that you listed are, actually, pretty minor in comparison: they only need local changes for you to benefit from them.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 18:41 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> The things which you have listed are not larger than what Rust gets in three years, though.

Maybe not. These are not things that I think any one person's view is going to give the same size and scope for any individual feature as another's view anyways. Personally, these are not "small" things to be adding to C++ (in aggregate).

Pattern matching, at least, adds a decent amount of syntax to the language. Contracts does too (though, thankfully, both are via context-sensitive keywords). Template-this also has some additional syntax (though that feels like a cleanup at least).

Modularized stdlib probably is going to have knock-on effects for projects not using IWYU where types aren't leaking out over entire #include trees.

I'm also aware of the scope of modules pretty intimately: I'm working with the ISO committee to make sure they can be built at all :) .

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:52 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]

Sure, but the problem is that this hasn't happened yet and one of the main reasons is the fact that the language is moving so fast that alternative implementations have trouble keeping up.

I don't think it's as much a matter of alternative implementations having trouble keeping up as it is Rust not yet having reached critical mass to justify an alternative implementation. C and C++ have enough users to justify multiple FOSS implementations and multiple independent proprietary ones. Rust doesn't have that kind of user base yet.

My impression is that the perception of Rust as changing rapidly is driven by the way the compiler is developed. The developers have often used features new to release N when writing release N+1, so anyone who wants to bootstrap from another language will have to build every version of the compiler to get to the current version. That leads to the perception that the language as a whole is developing at a breakneck pace, even though it has a good history of preserving backward compatibility.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:57 UTC (Thu) by glaubitz (subscriber, #96452) [Link]

> That leads to the perception that the language as a whole is developing at a breakneck pace, even though it has a good history of preserving backward compatibility.

It may have backward compatibility, but the main problem is that many upstream projects adopt new language features rather quickly, which means that distributions with longer release cycles such as Debian, RHEL, or SLE will have to backport new versions of the Rust compiler when updating packages like Firefox.

I have not observed this particular problem with C/C++ code while maintaining packages for an LTS distribution.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:01 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Eh, the "core" libraries that "everyone" depends on tend to keep the MSRV (minimum supported Rust version) pretty low, or at least documented. Sure, things get bumped occasionally, and even that pace may not be suitable for LTS distro cadences, but it isn't a mad rush to use the latest-and-greatest across the entire ecosystem.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 4:54 UTC (Fri) by plugwash (subscriber, #29694) [Link]

The reality in Debian is that, due to the huge number of security holes that web browsers accrete and the fact that even the LTS release cycles of Firefox are much shorter than those of a stable release, they are practically forced to upgrade to new release series of Firefox within a stable release.

Before Firefox/Thunderbird started using Rust, this wasn't a huge problem; the firefox package could be updated largely independently of everything else. Nowadays, though, a new release series of Firefox basically means a new release of rustc.

And that breaks things. Sure, the rustc developers have defined a stable subset, but a bunch of crates depend on less stable features. Some do this through the officially endorsed "feature gates" mechanism, others through a certain flag that is not supposed to be used by user code but is anyway, which lets you use "nightly" features on a stable compiler.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 13:52 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> Some of this through the officially endorsed "feature gates" mechanism, others through a certain flag that is not supposed to be used by user code but is anyway that lets you use "nightly" features on a stable compiler.

Feature gates can only be opened on a nightly compiler. The stable compiler does not let you use them (unless you masquerade as the stdlib with some internal flags, but then you're in a "you get to keep both pieces" situation anyways).

What you're probably actually seeing is the crates using the newer language/stdlib features of *stable* rust releases.
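A minimal sketch of what this comment describes (the function is hypothetical, not from any crate in the thread): the code below is plain stable Rust with no feature gate, yet it still fails to build on an older stable compiler, because the `let .. else` syntax it uses only became stable in Rust 1.65.

```rust
// Hypothetical example: no nightly feature gate anywhere, but this compiles
// only on Rust 1.65 or newer, since `let .. else` stabilized in that release.
// A distro pinned to an older stable rustc rejects it all the same.
fn port_from_env(raw: Option<&str>) -> u16 {
    // `let .. else` diverges (here: early-returns) when the pattern fails.
    let Some(s) = raw else {
        return 8080; // default when the variable is unset
    };
    s.parse().unwrap_or(8080) // fall back to the default on garbage input
}

fn main() {
    assert_eq!(port_from_env(None), 8080);
    assert_eq!(port_from_env(Some("443")), 443);
    assert_eq!(port_from_env(Some("junk")), 8080);
    println!("ok");
}
```

This is exactly the "newer language features of *stable* Rust" case: nothing unstable is involved, but the effective MSRV of a crate quietly rises with each such feature its authors adopt.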

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 8:56 UTC (Mon) by glaubitz (subscriber, #96452) [Link]

> What you're probably actually seeing is the crates using the newer language/stdlib features of *stable* rust releases.

I'm pretty sure I have seen crates that required Rust nightly in the past.

Building rustfmt is one such case: https://github.com/rust-lang/rustfmt#installing-from-source

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:29 UTC (Thu) by josh (subscriber, #17465) [Link]

> Rust not yet having reached critical mass to justify an alternative implementation

Rust is permissively licensed, which removes much of the motivation for any alternative implementation.

And personally, I'm hopeful that mrustc will meet any regulatory or similar requirements for "alternative implementation", so that we don't need another one for that reason either.

I hope that we go as long as possible *without* an alternative implementation, to avoid fragmenting the ecosystem by expecting crate authors to use the subset of Rust provided by that implementation.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 23:06 UTC (Thu) by moltonel (guest, #45207) [Link]

> The difference is that these other two languages have a specification and alternative implementations that can keep up with each other, aren't they?

My point is that the alternative implementations are *not* keeping up with each other and with each new feature. I'm sure the spec is useful, but it clearly doesn't solve the problem.

> I mean, the whole reason this article exists is because Rust is special in this regard and causes problems for downstream projects. You can't really deny that, can you?
> Sure, but the problem is that this hasn't happened yet and one of the main reasons is the fact that the language is moving so fast that alternative implementations have trouble keeping up.

The problem is caused by Rust not being available on some platforms. You seem to imply that if it had a spec or evolved more slowly, Rust would have alternate implementations and be available everywhere, but it's not that easy. Specs and a lack of evolution don't give you alternate implementations; time does. The language isn't moving that fast, as others have pointed out. Alternate implementations don't guarantee availability on more platforms (see mrustc). And more platforms can be reached without the need for alternate implementations (see rustc_codegen_gcc).

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:24 UTC (Thu) by josh (subscriber, #17465) [Link]

Rust has a permissively licensed frontend (and backends). Duplicating that code seems like a substantial waste, compared to collaborating on the existing implementation.

There were many competing proprietary C compilers. GCC provided a FOSS C compiler and compiler infrastructure, and there wasn't a strong need for a *second* such infrastructure until LLVM came along with substantially different design goals (permissive license and easy extension).

I don't think there's value in a second frontend, and I think it would cause ecosystem fragmentation.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 9:26 UTC (Thu) by roc (subscriber, #30627) [Link]

That's awesome.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 10:00 UTC (Thu) by glaubitz (subscriber, #96452) [Link]

Feel free to join #llvm-m68k on OFTC IRC and our monthly m68k online meeting: http://m68k.info/

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:38 UTC (Thu) by tlamp (subscriber, #108540) [Link]

So another commenter stated[0] how painful it is to port Rust to another platform. Can you, as someone doing an actual port, please comment on that? (Out of pure interest.)

[0]: https://lwn.net/Articles/845637/

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 8:58 UTC (Mon) by glaubitz (subscriber, #96452) [Link]

> So another commenter stated[0] how painful it is to port Rust to another platform. Can you, as someone doing an actual port, please comment on that? (Out of pure interest.)

If the new platform is just a new Linux architecture, the port is pretty much straightforward and easy, provided that LLVM has a backend for the architecture.

I cannot comment on ports to new operating systems, as I haven't worked on that myself.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:13 UTC (Thu) by flussence (subscriber, #85566) [Link]

I have to wonder why Portage doesn't just use pycurl in that case; it already needs a command-line program for downloading tarballs anyway, might as well be consistent about it.

(Ironically the default download program—wget—isn't supported by Gentoo on m68k, while curl is.)

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 8:15 UTC (Fri) by jak90 (subscriber, #123821) [Link]

That too will only carry you as far as the non-Rust alternatives to libcurl's in-development Rust backends (Hyper, Rustls) continue to enjoy productive support.
At least in that case, upstream is probably more upfront about breaking yesterday's compatibility.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 2:01 UTC (Sat) by flussence (subscriber, #85566) [Link]

You seem to be confused. The dependency chain of pycurl is python, curl, openssl. Rust has no say in this, and it won't for as long as curl touts support for 80+ platforms as a feature.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 13:43 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

libcurl is looking to add backends implemented in Rust. Now, I'm sure these will remain optional, but what is a distro to do: not ship those backends in the libcurl package, or provide a libcurl-with-rust replacement package? (Of course, this is moot if the backends are loaded dynamically and the Rust bits can just go into a new package; I just don't know.)

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 16:37 UTC (Sun) by nix (subscriber, #2304) [Link]

curl backends are largely not pluggable; a given build uses one backend of any given type, and that's it. (Hence a lot of distros ship several curl packages using different crypto backends.)

So it's easy to avoid rust usage in curl on platforms that don't support it: build in some other choice of backends that aren't implemented in Rust. Working on embedded systems with harsher constraints is why a lot of these backends exist in the first place (e.g. the mbedtls backend), so this is nothing new for curl.
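As a sketch of this build-time selection (the flag names come from curl's configure script, but treat the invocations as illustrative rather than a complete build recipe), a distro targeting a platform without Rust simply configures a different TLS backend:

```shell
# Illustrative configure invocations: curl picks its TLS backend at build
# time, so one build has exactly one backend of each type baked in.
./configure --with-openssl     # OpenSSL backend
./configure --with-mbedtls     # mbedTLS, common on constrained embedded targets
# ./configure --with-rustls    # Rustls backend, only where a Rust toolchain exists
```

This is why distros that want several backends ship several differently-configured curl packages, as the comment notes.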

Python cryptography, Rust, and Gentoo

Posted Feb 20, 2021 1:11 UTC (Sat) by KZB (guest, #144978) [Link]

What the heck? Dropping support for platforms because some people tend to use Rust, a language that is less well-known and less used than C?

That doesn't even make sense, even with the security argument.
It's like saying "a complex password is more secure than a long password":
a false sense of security.

Has somebody here been trying to get into Rust via its bad documentation?
And they say "if you tend to use platforms WE DON'T SUPPORT... you have to do it yourself, because WE ARE RUST".

The syntax of Rust and C is horrible. Let's switch to Python, guys. *facepalm*

Also... we should ask ourselves why Rust and these Python people are trying to make it harder to get new platforms (and even older ones) supported. Hm... maybe "market reasons" ;-)


Copyright © 2021, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds