Silkenweb Example: Hackernews Clone

OpenBSD now enforcing no invalid NUL characters in shell scripts

185 points by CTOSian 9 months ago | 154 comments

amiga386 9 months ago
Here's the actual diff:
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/bin/ksh/shf.c....
And it looks like that covers all parsed parts of the shell script or history file, including heredocs. I get the feeling it's going to break all shar archives with binary files (not that they're particularly common). It will stop NULs being in the script itself, but it won't stop them coming from other sources, e.g.
```
    $ var=$(printf '\0hello')
    -bash: warning: command substitution: ignored null byte in input
    $ echo $var
    hello
```
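A minimal sketch of checking a file for NULs ahead of time, assuming only POSIX `tr` and `cmp`. The name `has_nul` is made up for illustration and is not part of the OpenBSD change:

```shell
    # has_nul FILE: succeed if FILE contains at least one NUL byte.
    # (Hypothetical helper, not part of any shell.)
    has_nul() {
        # Strip NULs and compare with the original file;
        # any difference means NULs were present.
        if tr -d '\000' < "$1" | cmp -s - "$1"; then
            return 1
        fi
        return 0
    }
```

Note that this flags NULs anywhere in the file, including after the parsed part of the script, where the commit message says they remain allowed.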
It remains to be seen if this will be adopted by anyone else, or if it'll be another reason to use OpenBSD only as a restricted environment and not as a general computing platform.
> "If there is ONE THING the Unix world needs, it is for bash/ksh/sh to stop diverging further"
> OpenBSD ksh: diverges further
- chasil 9 months ago
  The only thing that is required to happen is that they all obey the rules of the POSIX shell (when called as /bin/sh).
  Otherwise, anything goes.
  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...
  All the userland utilities must have the behavior (and problems) specified here:
  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/
- matrix2003 9 months ago
  Eh - I actually like developing on OpenBSD first, because of restrictions like this. If it runs on OpenBSD, you are likely to have fewer bugs around things like malloc.
  OpenBSD is also really good about upstreaming bug fixes, which is a good thing. Firefox used to be a dumpster fire of core dumps on OpenBSD, and many issues were uncovered and fixed that way.
- jolmg 9 months ago
  > I get the feeling it's going to break all shar archives with binary files
  shar encodes binary files. Here's what it does with a file that has contents: "foo\0bar\n":
```
    sed 's/^X//' << 'SHAR_EOF' | uudecode &&
  begin 600 foo.txt
  (9F]O`&)A<@K.
  `
  end
  SHAR_EOF
```
  Interestingly, passing that heredoc to uudecode in the shell, it produces no output. However, if I pass the whole shar output to unshar, it does produce the file with the correct content.
- raverbashing 9 months ago
  > I get the feeling it's going to break all shar archives with binary files (not that they're particularly common)
  Base64 encode them.
  This is not diverging further, this is bringing sanity to the table
- bell-cot 9 months ago
  > Here's the actual diff:
  Only 8 short, simple lines of C code. Beautiful.
mcculley 9 months ago
"We are in a post-Postel world" is a great way to put it. This needs to be repeated by everyone working with file formats or accepting untrusted input.
- cesarb 9 months ago
  > "We are in a post-Postel world" is a great way to put it.
  See also RFC 9413 (https://www.rfc-editor.org/rfc/rfc9413.html), originally called "draft-thomson-postel-was-wrong" (https://datatracker.ietf.org/doc/draft-thomson-postel-was-wr...).
- nabla9 9 months ago
  Agreed.
  When every implementation in wide use has their own quirks, you must support them all to make your program widely used. Every special case is yet another potential bug to chase down.
  It also allows the "Embrace, extend, and extinguish" strategy that Microsoft used so successfully to assfuck the internet over a decade.
  - pjmlp 9 months ago
    I think you mean Google.
    - nabla9 9 months ago
      No. The Microsoft. MS invented the term. DOJ found that MS used "Embrace, extend, and extinguish" in internal documents.
      Younger people don't know how absolutely ruthless and harmful the Wintel monopoly was under Gates. Java did not work on purpose. JavaScript did not work on purpose.
      <!--[if IE]>
      everywhere.
      They attempted to kill open web in the crib with their blackbird project. Only MSN (The Microsoft Network) for normal people.
- stackghost 9 months ago
  "Postel" is not a term that carries any significance for me, and Googling that word didn't turn anything up that seemed relevant.
  Who or what is a Postel?
  - Ndymium 9 months ago
    It's a reference to Jon Postel who wrote the following in RFC 761[0]:
```
    TCP implementations should follow a general principle of robustness:
    be conservative in what you do, be liberal in what you accept from
    others.
```
    Postel's Law is also known as the Robustness principle. [1]
    [0] https://datatracker.ietf.org/doc/html/rfc761#section-2.10
    [1] https://en.wikipedia.org/wiki/Robustness_principle
    - arcanemachiner 9 months ago
      I've always felt that this was a misguided principle, to be avoided when possible. When designing APIs, I think about this principle a lot.
      My philosophy is more along the lines of "I will begrudgingly give you enough rope to hang yourself, but I won't give you enough to hang everybody else."
    - IshKebab 9 months ago
      Ironically it leads to less robust systems in the long term.
    - thaumasiotes 9 months ago
      > Postel's Law is also known as the Robustness principle.
      Really? It seems like it's obviously just a description of how natural language works.† But in that case, there's an enforcement mechanism (not well understood) that causes everyone to be conservative in what they send.
      We can observe, by the natural language 'analogy', that the consequence of following this principle is that you never have backwards compatibility. Otherwise things generally work.
      † Notably, it has nothing to do with how math works, making it a strange choice for programming.
  - komon 9 months ago
    A reference to Postel's Law: be conservative in what you produce and liberal in what you accept.
    The law references that you should strive to follow all standards in your own output, but you should make a best effort to accept content that may break a standard.
    This is useful in the context of open standards and evolving ecosystems since it allows peers speaking different versions of a protocol to continue to communicate.
    The assertion being made here is that the world has become too fraught with exploiting this attitude for it to continue being a useful rule
    - godshatter 9 months ago
      What would have been the result of Jon Postel advocating for conservative inputs, I wonder? I'm wondering whether, had they all done this, the most common protocols would have been bypassed by other protocols that allowed more liberal inputs.
  - ok123456 9 months ago
    The fact that googling Postel was worthless also indicates we're in a post-google search world.
    - stackghost 9 months ago
      I'm actually astounded at how quickly the quality of Google search results has tanked in recent years.
    - Brian_K_White 9 months ago
      2nd result on kagi was about him but in the form of another critic.
      https://datatracker.ietf.org/doc/draft-thomson-postel-was-wr...
      Hard disagree.
      It's a valid argument, but I say it's merely an argument, not an argument that wins or should win.
      But also, I say that detecting out of spec or unexpected input and handling it in any other way than crashing IS adhering to Postel.
      Refusing to process a request is better than munging the data according to your own creative interpretation of reasonable or likely, and then processing that munged data.
      I consider that to be within Postel to return a nice error (or not if that would be a security divulgence). Failing Postel would be to crash or do anything unintended.
    - skybrian 9 months ago
      Google’s results for “Postel’s law” and “Jon Postel” are fine. “Postel” is ambiguous, a fairly common surname, so websites of unrelated companies show up, and a disambiguating page on wikipedia that links to Jon Postel and several other people.
    - AStonesThrow 9 months ago
      Bing had no trouble at all finding him from my device.
  - runjake 9 months ago
    Jon Postel was instrumental in making the Internet what it is today.
    https://en.wikipedia.org/wiki/Jon_Postel
    The Wikipedia article is kinda unclear and doesn't provide the proper context, so:
    - Ran IANA, which assigned IP addresses for the Internet.
    - Editor of RFCs, which are documents that defined protocols in use by the Internet.
    - He wrote a bunch of important RFCs that defined how some very important protocols should work.
    - Created or helped create SMTP, DNS, TCP/IP, ARPANET, etc.
  - teraflop 9 months ago
    It's a reference to "Postel's law" which is a pretty well-known principle in the networking world, and in software more broadly. Named after Jon Postel, who edited and published many of the RFCs describing core Internet protocols.
    https://en.wikipedia.org/wiki/Robustness_principle
  - CoastalCoder 9 months ago
    Adding to the sibling comments, this is briefly covered in Eric Raymond's wonderful book, "The Art of Unix Programming" [0].
    [0] https://en.wikipedia.org/wiki/The_Art_of_Unix_Programming
- Brian_K_White 9 months ago
  There is no such thing as a post Postel world. But handling the input in any other way than crashing or ub IS perfectly Postel.
  Deciding that nul is invalid data, and refusing to allow it, and refusing to munge the data and proceed based on the munged data that you essentially made up, as long as whatever you did do instead was graceful and intentional, to me that is perfectly Postel.
jrockway 9 months ago
I like the term post-Postel.
There are two reliability constraints that all software faces: security and interoperability. The more lax you are about validation, the more likely interoperability is. "That's weird, I'll just do whatever" is doing SOMETHING, and it's often to the end user's liking. But, you also enter a more and more undefined state inside the software on the other side, and that's where weird things happen. Weird things happening typically manifest as security problems. So the more effort you go to to minimize the possibility of entering a weird state, the more confidence you have that your software is working as specified.
Postel's Law made a lot of sense to me when developing the early Internet. A lot of people were reading imperfect RFCs, and it was nice when your HP server could communicate with a Sun workstation, even though maybe some bit in the TCP header was set wrong. But now? You just gotta get it right and push a hotfix when you realize you messed something up. (Sadly, I don't think it's possible. Middleboxes are getting more and more popular. At work, we make a product where the CLI talks to the server over HTTP/2. We also install Zscaler on every workstation. Zscaler simply blocks HTTP/2. So you can't use our product. Awkward.)
- Thiez 9 months ago
  This is also where Google went right with QUIC: encrypt as much as possible to show middleboxes the least possible. This combats ossification. Then again it seems likely middleboxes will just block QUIC (or UDP in general).
- SAI_Peregrinus 9 months ago
  The Cryptographic Doom Principle (if you have to perform any cryptographic operation before verifying the MAC on a message you’ve received, it will somehow inevitably lead to doom)[1] is a sort of anti-Postel's Law.
  [1] https://moxie.org/2011/12/13/the-cryptographic-doom-principl...
saagarjha 9 months ago
> There appears to be one piece of software which is misinterpreting guidance of this, and trying to depend upon embedded NUL.
Curious what this is
- semiquaver 9 months ago
  I wonder if it’s https://justine.lol/ape.html / cosmopolitan libc
  - tiffanyh 9 months ago
    Just yesterday I asked @jart, here on HN, about Cosmo & OpenBSD.
    https://news.ycombinator.com/item?id=41627889
    APE was mentioned and some interesting tidbits in the GitHub link provided in the HN comment above.
  - chubot 9 months ago
    I'm pretty sure it is, I remember reading something about this
    Yeah I found it here
    https://news.ycombinator.com/item?id=41030960
    2019 bug - https://austingroupbugs.net/view.php?id=1250
    https://justine.lol/cosmo3/
    > This is an idea whose time has come; POSIX even changed their rules about binary in shell scripts specifically to let us do it.
    FWIW I agree with this OpenBSD change, which says more pointedly
    All the shells are written in C, and majority of them use C strings for everything, which means they cannot embed a NUL, so this is not surprising. It is quite unbelievable there are people trying to rewrite history on a lark, and expecting the world to follow alone.
    i.e. it's not worth it to change a bunch of old code in order to allow making code more esoteric.
    We want systems code to be more predictable, reliable, and less esoteric ... not more esoteric
    - asveikau 9 months ago
      > POSIX even changed their rules about binary in shell scripts specifically to let us do it.
      I'd seen this quote around. The fact that the standards were changed to allow it never struck me as a good indication that it should be relied upon. It seems rather backwards of how these standards work.
      I got flamed on HN once for saying cosmopolitan libc shouldn't be used for production because it relies on weird behaviors and implementation quirks that aren't really an ABI.
  - eesmith 9 months ago
    Shouldn't be. See the "exit 1" in your link? That's the end of the shell script, and as the OpenBSD link says;
    > It remains possible to put arbitrary bytes AFTER the parts of the shell script that get parsed & executed (like some Solaris patch files do). But you can't put arbirary bytes in the middle,
    - oguz-ismail 9 months ago
      It is. Binaries generated by cosmocc have NUL in the middle.
sneela 9 months ago
> This was in snapshots for more than 2 months, and only spotted one other program depending on the behaviour (and that test program did not observe that it was therefore depending in incorrect behaviour!!)
Fascinating. I wonder what that program is, and why it depends on the NUL character.
bell-cot 9 months ago
Kudos to OpenBSD!
Similar to the olde-tyme "-o noexec" and "-o nosuid" options for `mount`, there should be easy, no-exceptions ways to blanket ban other types of simply obvious red-flag activity.
parasense 9 months ago
Is this going to murder those fancy shell scripts that self-extract a program appended to the tail, which is really just an encoded blob of some kind, presumably compressed, etc.. ???
- talideon 9 months ago
  Not if it was done competently. Shar files and the likes shouldn't contain NULs, even if they contain compressed data. The appended data should be binary safe.
  - Thiez 9 months ago
    And in case your data does contain NULs, presumably one could add a layer of base64 encoding. Not nice for the filesize, but also much less likely to upset a text editor when the script is opened (even in the absence of NUL bytes).
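    A rough sketch of that extra layer, assuming a `base64` utility that accepts `-d` (as in GNU coreutils; it is not a POSIX utility). The function names are invented for illustration:

```shell
    # Encode arbitrary bytes (including NUL) as ASCII text that is safe inside a script.
    encode_payload() {
        base64
    }

    # Reverse the encoding and write the original bytes to stdout.
    decode_payload() {
        base64 -d
    }

    # A shell variable cannot hold a NUL, but it can hold the base64 text.
    encoded=$(printf 'foo\000bar\n' | encode_payload)

    # Decoding restores the original 8 bytes, NUL included.
    printf '%s\n' "$encoded" | decode_payload | wc -c
```

    The cost is the usual base64 overhead of roughly a third more bytes.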
chasil 9 months ago
I was going to check the status of mksh (the Android system shell), but the project page returns:
"Unavailable For Legal Reasons - Sorry, no detailled error message available."
http://www.mirbsd.org/mksh.htm
The Android system shell is now abandoned? This is also in RHEL 9 BaseOS.
- chaosite 9 months ago
  Looks fine here, maybe they're blocking your IP range for some reason?
- talideon 9 months ago
  Fine for me. I just got a HTTP warning and nothing else.
  ~~I believe Android uses toybox, not mksh.~~ It does use toybox, but toybox doesn't appear to include a shell.
- kbolino 9 months ago
  It's blocked for me too, but only on my home Internet (Xfinity), not my phone (Google Fi/T-Mobile).
  - torstenvl 9 months ago
    Works fine for me on Xfinity Home via WiFi, Xfinity Mobile, T-Mobile, and Visible by Verizon.
    - kbolino 9 months ago
      Whatever the issue was, it seems to have been resolved sometime after I last checked.
  - chasil 9 months ago
    I see it on my T-Mobile device also. Strange.
- fragmede 9 months ago
  What's your browser? The server is using an old TLS version which is no longer supported, and some clients will try https and fail there and not try http.
  - chasil 9 months ago
    I'm using Edge on my corporate desktop.
    Edge first tries TLS and comes back with: "SSL handshake error '-1' sslerr='1' sslerrdesc='error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol' sslerrfunc='607' sslerrreason='258'"
    Setting to http:// results in the above error, along with "httpd/3.30A Server at www.mirbsd.org Port 80" - I think that the target itself is blocking me.
- tux3 9 months ago
  Works from an EU IP, so whatever it is, it's probably not GDPR?
- blueflow 9 months ago
  > Android system shell
  This hurt a little.
chrisfinazzo 9 months ago
Related: The installer for iTunes 12.2.1 included a bug which might recursively delete a volume if the path given as input included incorrectly escaped spaces.
- NewJazz 9 months ago
  Reminds me of this...
  https://hackaday.com/2024/01/20/how-a-steam-bug-once-deleted...
Taikonerd 9 months ago
On a similar note, I sometimes think about how newline characters are allowed in filenames, and how that can break simple...
```
    for each $filename in `ls`
```
loops -- because in many contexts, UNIX treats newlines as a delimiter.
Is there any legitimate use for filenames with newlines?
- bityard 9 months ago
  Well, knowing how to deal with wacky input and corner cases are a requirement of learning ANY programming language. Bourne-style shells are no exception.
  Your example has illegal syntax, but the biggest issue is that you should never parse the output of ls. The shell has built-in globbing. This is how you would loop over all entries (files, dirs, symlinks, etc) in the current directory without getting tripped up by whitespace:
```
    for e in *; do echo "got: $e"; done
```
  - Taikonerd 9 months ago
    > knowing how to deal with wacky input and corner cases are a requirement of learning ANY programming language.
    In general, I agree. But if there's a corner case that occasionally breaks naive code but otherwise doesn't do anything, then I'm going to think, "maybe we should just remove that corner case."
    - zokier 9 months ago
      David Wheeler has been complaining (and suggesting fixes) about this for a long time: https://dwheeler.com/essays/fixing-unix-linux-filenames.html
      safename LSM https://lwn.net/Articles/686789/
    - bell-cot 9 months ago
      Replace "maybe" with "OBVIOUSLY". Keeping useless-but-hazardous "features" in any language is as idiotic as keeping a heap of oily rags in the furniture factory warehouse.
  - dzaima 9 months ago
    But, of course, this wouldn't be shell if that didn't have more footguns; namely, it breaks if run in an empty directory (giving a literal "got: *"), and excludes the arbitrary set of files whose name begins with ".".
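    One way to cover both cases in plain POSIX sh, as a sketch rather than the canonical answer. `list_entries` is a hypothetical name, and the three patterns are the usual idiom for "everything except `.` and `..`":

```shell
    # list_entries DIR: print every entry in DIR, dotfiles included.
    # (Hypothetical helper for illustration.)
    list_entries() {
        for e in "$1"/* "$1"/.[!.]* "$1"/..?*; do
            # A pattern that matches nothing is left as literal text; skip it.
            # The -L test keeps dangling symlinks, which fail the -e test.
            [ -e "$e" ] || [ -L "$e" ] || continue
            printf 'got: %s\n' "${e##*/}"
        done
    }
```

    Run against an empty directory, `list_entries` prints nothing instead of a literal "got: *".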
- chuckadams 9 months ago
  > Is there any legitimate use for filenames with newlines?
  IMHO no, but they can exist, so you need to handle them without blowing up. Also, even spaces are considered delimiters here, which is why it's bad form to parse the output of ls.
```
    $ touch "foo bar baz"
    $ for f in `ls`; do echo $f; done
    foo
    bar
    baz

    # always use double quotes, though they aren't needed here
    $ for f in *; do echo "$f"; done 
    foo bar baz
```
  At least the OS guarantees you won't run into NUL though.
  - unqueued 9 months ago
    There is a pretty good syntax for dealing with nasty filenames, if you must: ANSI-C quoting[1].
    If you have to output in a shellscript in this format, use printf %q
    from man printf:
```
       %q     ARGUMENT is printed in a format that can be reused as shell input, escaping non-printable
              characters with the proposed POSIX $'' syntax.
```
    It is just $'<nasty ansi-c escaped chars>'
        $ touch $'\nHello\tWorld\n'
        $ ls
    One thing I do like about a filesystem that fully supports POSIX filenames is that at the end of the day a filesystem is supposed to represent data. I think it is totally sensible to exclude certain characters, but that it should be done higher up in the stack if possible. Or have a flag that is set at mount time. Perhaps even by subvolume/dataset.
    One thing I haven't seen mentioned is that POSIX filenames are so permissive that they allow you to have bytes as filenames that are invalid UTF-8. That's why the popular ncdu[2] program does NOT use json as its file format, although most think it does. It's actually json but with raw POSIX bytes in filename fields, which is outside of the official json spec. That does not stop folks from using json tools to parse ncdu output though.
    Another standard that is also very permissive with filenames is git. When I started exploring new ways to encode data into a git repo, it was only natural that I encountered issues with limitations of filesystems that I would check out in.
    Try cloning this repo, and see if you are able to check it out: https://github.com/benibela/nasty-files
    It is amazing how many things it breaks.
    If you are writing software that deals with git filenames or POSIX filenames (that includes things like parsing a zip file footer), you can not rely on your standard json encoding function, because the input may contain invalid utf-8. So you may need to do extra encoding/filtering.
    [1]: https://www.gnu.org/s/bash/manual/html_node/ANSI_002dC-Quoti...
    [2]: https://dev.yorhel.nl/ncdu/jsonfmt
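    A small sketch of the kind of check this implies before handing a filename to a json encoder. It assumes an `iconv` that exits non-zero on malformed input (glibc's does); `is_valid_utf8` is a made-up name:

```shell
    # is_valid_utf8 NAME: succeed if NAME is well-formed UTF-8.
    # (Hypothetical helper; relies on iconv rejecting invalid byte sequences.)
    is_valid_utf8() {
        printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 > /dev/null 2>&1
    }
```

    Names that fail the check need some other representation (escaping, hex, base64) before they go into a strict json document.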
  - kstrauser 9 months ago
    I’m not in a place where I can easily check. What happens there if the file name contains a quote?
    - chuckadams 9 months ago
      It's fine, the content of an expanded variable isn't parsed further:
          $ touch "foo \"bar baz"; for f in *; do echo "$f"; done
          foo "bar baz
          # quotes don't affect it either
          $ touch "foo \"bar baz"; for f in *; do echo $f; done
          foo "bar baz
      Though once you start passing args with quotes to other scripts, things get ugly. Rule of thumb is to always pass with "$@", and if that isn't enough to preserve quoting for whatever use case, write them out to a tempfile instead, or don't use a shell script for it in the first place.
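      To make the "$@" rule of thumb concrete, here is a tiny sketch. All three function names are invented for the example:

```shell
    # count_args: report how many arguments were received.
    count_args() {
        echo "$#"
    }

    # Forward arguments intact: each original argument stays one word.
    forward_quoted() {
        count_args "$@"
    }

    # Forward arguments unquoted: they are re-split on whitespace.
    forward_unquoted() {
        count_args $*
    }
```

      With the default IFS, `forward_quoted "foo bar" baz` reports 2 arguments, while `forward_unquoted "foo bar" baz` reports 3.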
- IsTom 9 months ago
  You can also create files named e.g. '--help' (if you're not particularly malicious) and with globbing it'll cause e.g. 'ls *' to print help.
  - jasonjayr 9 months ago
```
    touch -- '-f ..'
```
    (If you want to lay an evil trap)
    Remember that in most option parsing libraries, putting '--' in your arguments stops option parsing, so you can safely run:
```
    rm -- '-f ..'
```
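    In a script, the same idea can be wrapped up as below. This is only a sketch, and `remove_literal` is a hypothetical name:

```shell
    # remove_literal NAME...: delete files by name, even if a name starts with '-'.
    # Everything after -- is treated as an operand, never as an option.
    remove_literal() {
        rm -- "$@"
    }
```

    Another common habit is to glob as `./*` instead of `*`, so that no expanded name can begin with a dash in the first place.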
- Joker_vD 9 months ago
  Sticky notes on the desktop :) Who needs data storage when you can store it all in the metadata?
- fragmede 9 months ago
  A GUI file browser will display the filename with a newline in it as a new line (and an icon above it) so as to be aesthetically pleasing.
- xxpor 9 months ago
  this is why things like `find -print0` exist, which is IMO the easiest way to handle this robustly.
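  A short sketch of why the NUL terminator helps, assuming a `find` that supports `-print0` (GNU and the BSDs do). `count_files` is a made-up helper:

```shell
    # count_files DIR: count regular files under DIR, whatever bytes their names contain.
    count_files() {
        # Each path is terminated by NUL, a byte that cannot appear in a path,
        # so counting NULs counts files. Counting lines would miscount names
        # that contain newlines.
        find "$1" -type f -print0 | tr -dc '\000' | wc -c | tr -d ' '
    }
```

  Piping to `xargs -0` works on the same principle when the names need to be passed on to another command.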
whiterknight 9 months ago
Side note: tell your startup to switch its “hardware with Ubuntu Linux inside” to BSD. You will have a much more stable and simple platform that can last a long time.
- quesera 9 months ago
  The recommendation is solid, but FWIW no one looking for stability would choose Ubuntu, among the Linuxen!
raverbashing 9 months ago
> There appears to be one piece of software which is misinterpreting guidance of this, and trying to depend upon embedded NUL.
Big oof here. Why? How?
> If there is ONE THING the Unix world needs, it is for bash/ksh/sh to stop diverging further by permitting STUPID INPUT that cannot plausibly work in all other shells. We are in a post-Postel world.
Amen
opk 9 months ago
I've always found the fact that zsh copes with NUL characters in variables etc to be really useful. I can see why this approach makes sense for OpenBSD but they can't prevent NULs appearing in certain places like piped input.
lupusreal 9 months ago
Does this break those self-extracting script/tar files? I forget how those are done, I haven't seen one in many years.
- zx2c4 9 months ago
  From the article: "It remains possible to put arbitrary bytes AFTER the parts of the shell script that get parsed & executed (like some Solaris patch files do). "
- jancsika 9 months ago
  If you don't know anything about OpenBSD, here's a fun thing:
  1. Randomly choose "yes" or "no" to this question.
  2. Read the post and get the answer.
  3. Repeat until you begin to get a tingly "Spidey sense" that overrides your random-choice.
  My Spidey sense here was, "Yes, because OpenBSD would have already thought about and covered that use-case." And indeed, toward the end of the post, that contingency is covered and documented.
  Note: if you try this at your job and sense that the company will almost always choose the worst option, you should probably leave that job.
- sneela 9 months ago
  Are you talking about Shar?
  https://en.wikipedia.org/wiki/Shar_(file_format)
  - ape4 9 months ago
    That was a neat idea back in the day but should be disallowed now. Running downloaded executables considered harmful.
    - osmsucks 9 months ago
      > Running downloaded executables considered harmful
      Most executables are downloaded. :)
    - Joker_vD 9 months ago
      Not in the "Installation: just run `docker run kekw/our-shiny-ai-chatbot` in your shell" world we're living today.
- 73kl4453dz 9 months ago
  They were generally uuencoded or similar
klooney 9 months ago
Does this break the self extracting tarball trick, where you have a bootstrap shell script with a binary payload appended?
- oguz-ismail 9 months ago
  No, they still work.
nubinetwork 9 months ago
So I can't bury a tarball inside a shell script anymore?
- josephcsible 9 months ago
  You still can; it just needs to go at the end:
  > It remains possible to put arbitrary bytes AFTER the parts of the shell script that get parsed & executed (like some Solaris patch files do).
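  A rough sketch of that layout, not taken from any real installer. `make_sfx` and the 4-line header are arbitrary choices for illustration:

```shell
    # make_sfx OUT PAYLOAD: write a script to OUT that prints PAYLOAD's bytes when run.
    # (Hypothetical helper; the 4-line header length is an arbitrary choice.)
    make_sfx() {
        # Lines 1-4 are the only part the shell parses; "exit 0" ends execution.
        printf '%s\n' \
            '#!/bin/sh' \
            '# payload starts at line 5' \
            'tail -n +5 "$0"' \
            'exit 0' > "$1"
        # Raw bytes, NULs included, go after the parsed part.
        cat "$2" >> "$1"
    }
```

  Because the shell stops at `exit 0`, it never parses the appended bytes; `tail -n +5 "$0"` reads them straight from the file.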
- volkadav 9 months ago
  Looks like you might be able to at the end of the file, reading the commit message, just not willy-nilly in the middle. :)
soupbowl 9 months ago
I wish FreeBSD replaced /bin/sh with OpenBSD's.
- rollcat 9 months ago
  FreeBSD made many cool moves in the 14.0 release, like finally getting rid of sendmail and adopting DMA (the irony), so perhaps there's a chance?
  But FreeBSD has always been much less focused on polish/cleanliness than OpenBSD; I mean - they have THREE firewalls, wtf.
  - toast0 9 months ago
    > they have THREE firewalls, wtf.
    I've not used ipf, but ipfw and pf have a different model and different features (although in 14.0, there's more overlap). I have to use them both.
chmorgan_ 9 months ago
Wow, they still use CVS...
- IcePic 9 months ago
  This was "answered" in 2013 at the end of this post, https://marc.info/?l=openbsd-misc&m=136724343006024&w=2
  I guess it hasn't changed since.
enriquto 9 months ago
Great. Now forbid spaces in filenames.
- ben_bai 9 months ago
  Funny enough filenames are just byte sequences. So almost anything goes.
  There was just some patch that added '/' protection, because that's the only character that's not allowed in filenames.
  https://github.com/openbsd/src/commit/46f7109a9e03df89b66ada...
sph 9 months ago
Is this in reference to something? Judging from the comments, NUL bytes in shell scripts are such a common occurrence that everybody is celebrating this change as if it were groundbreaking.
I mean, it's a good idea, but I wonder what I am missing here. Also what do they mean by post-Postel?
- BlackFly 9 months ago
  Early spec of TCP had a section on the robustness principle that was generally known as Postel's law (https://datatracker.ietf.org/doc/html/rfc761#section-2.10). At the time and until recently this was considered good design. Nowadays people generally want servers to be stricter in what they accept since decades of experience dealing with diverging interpretations of a specification create problems for interoperability.
  - eesmith 9 months ago
    "until recently"? More than 10 years just going by HN. https://news.ycombinator.com/item?id=5161214
    I think HTML showed the problem with Postel's principle. Quoting "Postel’s Law is not for you" at http://trevorjim.com/postels-law-is-not-for-you/ from 2011
    > The next version of HTML, HTML5, should considerably reduce the problem of browser incompatibilities. It does this, in part, by rejecting Postel’s Law for browser implementors. Instead of allowing browsers to be liberal when dealing with “flawed” markup, HTML5 requires them to parse it exactly as in the HTML5 specification, and that specification is given much more precisely than before, in the form of a deterministic state machine, in fact. HTML5 is trying to give implementors no leeway at all in this, in the name of browser compatibility.
    - cesarb 9 months ago
      > "until recently"? More than 10 years just going by HN.
      The TCP protocol is from the 1970s (according to Wikipedia, it's from 1974, which is 50 years ago). Something which only happened 10 years ago is recent.
- JimDabell 9 months ago
  Postel’s Law, also known as the Robustness Principle:
  > be conservative in what you do, be liberal in what you accept from others
  It’s intended as a way to maximise compatibility, and people have generally followed it when designing protocols and file formats. However it’s led to many security vulnerabilities and has caused a lot of compatibility problems itself. These days a lot of people are realising that it’s more harmful than helpful.
- semiquaver 9 months ago
  Postel’s Law: https://datatracker.ietf.org/doc/html/rfc761#section-2.10
2snakes 9 months ago
Surprised no one has mentioned the Crowdstrike issue, which was due to NUL characters, wasn't it?
- amiga386 9 months ago
  It was not. The Crowdstrike issue was:
  1. Their code was calling a 21-parameter "matcher" function with 20 parameters of data.
  2. They didn't notice, because all the matcher rules had "allow anything" for the 21st parameter and so never looked at it.
  3. They later published the first list of rules with something other than "allow anything" as the 21st parameter, direct to customers.
  4. On customer machines, the first rule with a non "match everything" 21st parameter went to look at the 21st element of the 20 element array. It expected a string pointer, but instead there was random stack data. It tried dereferencing this to read the string it was expecting, which caused the kernel driver to segfault during early startup, putting customer machines in a boot loop.
  https://www.crowdstrike.com/wp-content/uploads/2024/08/Chann...

  > If there is ONE THING the Unix world needs, it is for bash/ksh/sh to
  > stop diverging further by permitting STUPID INPUT that cannot
  > plausibly work in all other shells.  We are in a post-Postel world.
  > 
  > It remains possible to put arbitrary bytes *AFTER* the parts of the
  > shell script that get parsed & executed (like some Solaris patch files
  > do).  But you can't put arbirary bytes in the middle, ahead of shell
  > script parsed lines, because shells can't jump to arbitrary offsets
  > inside the input file, they go THROUGH all the 'valid shell script
  > text lines' to get there.

  So here it is again, an example of OpenBSD making software behavior saner for all of us.

I don't consider use of all caps over a minor issue to be sane behavior. At best it's immaturity (trying to force your point rather than persuade), and at worst it's an emotional imbalance that affects judgement. That said, it's ksh, on OpenBSD, so I couldn't care less what they do.

PufPufPuf 9 months ago
What a weird take. There are just a few emphasized words in the commit message.