My Favourite Diff

205 points by jwatzman 5 years ago | 84 comments
  • kazinator 5 years ago
    Recent GCC can diagnose the situation where a switch case falls through without a /* fallthrough */ comment.[1]

    C++17 has a comment-like language feature for demarcating desired fallthrough.

    Fallthrough in case processing is often useful; it's just not such a great default.

    In the TXR Lisp tree-case[2] construct, any case which returns the : object falls through to the next case.

    This is used all over the place in the library.

    --

    [2] https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Warning-Options...

    [1] https://www.nongnu.org/txr/txr-manpage.html#N-03D834A5

    • adrianmonk 5 years ago
      > not such a great default

      This is a tangent, but it reminds me of a personal policy of mine: if I'm thinking about changing a default, I ask myself if there should be a default at all.

      The consequences of having the wrong default were serious enough that it even came to my attention. So by changing the default, am I potentially exchanging one set of problems for another?

      Sometimes it would be better overall if the user is forced to explicitly state which way they want it. Not always, of course, but I like to ask the question.

      • adrianmsmith 5 years ago
        Right! I had a bug in a SQL statement to fetch tasks “order by priority” and then execute them in that order. A stupid mistake, but I must have looked at that code tens of times and I just couldn’t spot it. The bug was that that statement found then executed the lowest priority items first.

        It’s not clear to me that an “order by” should have a default direction. I don’t know why one direction is more likely to be used than the other. Without a default, I would have been forced to think about the direction and wouldn’t have had that bug.

        • Apanatshka 5 years ago
          A similar example is in a "filtering" operation. I mostly have this in functional programming where you pass a predicate to filter on. But what's the perspective of the filter? Are you filtering poison from your water, or are you sifting the water looking for gold nuggets. I prefer to use functions remove and retain (respectively) to be more explicit.
          • Someone 5 years ago
            That’s an ambiguity in English. “Foo is priority #1, bar priority #2, so foo has the higher priority” is perfectly valid English.

            The only way to ‘solve’ that would be by banning priorities higher than 1, for example by numbering them 1, 0, -1,… or by using floats in the range (0,1] for priorities.

            • brigandish 5 years ago
              Isn't everything usually ABC and 123 by default?

              I suppose countdowns aren't but there's a clue in the name…

            • renox 5 years ago
              It's a good question but in this case, I wouldn't be surprised if 99% of the usage don't fallthrough.
            • johnny-lee 5 years ago
              My perl script which scans C/C++ source code for typos would emit a warning (case 19) for a case statement that fell through to the next case statement. The script handled the 'fall through'-comment situation as well.

              You can read the 1999 Perl Conference paper for the perl script at https://sites.google.com/site/typopl2/afastandeasywaytofindb...

            • letientai299 5 years ago
              _Often my biggest contribution isn’t being the only one who is able to fix something, but being the only one who realises something is fixable._

              I feel relatable. At all the placed I've worked, people realize the problem, sometime find the solution, but most of the time, they won't fix it, unless it directly affect their code/project/product. As a result, technical debt keeps bubble up and we have more and more issues.

              Right now, the only thing I can do is just to keep fixing issues as I encounter them. But that's for the code, I don't have any solution for the "people" problem.

              • xorcist 5 years ago
                I call this the person with taste. One of the first questions I try to answer in new setting is to identity the person "with taste" around. They are usually a good entry point if you want to get something done.
                • nicbou 5 years ago
                  Many problems lack a Designated Person Who Cares. Everyone has something to do, a task to accomplish. There's never someone whose job is to improve the codebase and the tooling, even when it can benefit the team.
                  • auxym 5 years ago
                    I have not worked in software for long (a year, a while ago) but my experience with agile at that time was that even if you realize that you can and should fix something, you still have do political representation to the scrum master, team lead and product manager so that an issue (sorry, user story) is created and gets prioritized (over features and other bugs) in a future sprint.

                    Going rogue and spending an afternoon fixing stuff without an approved US (to charge time on) was not seen well.

                    • julius_set 5 years ago
                      Unfortunately this is often the case at larger companies, unless you find a really cool manager / PM to work under who just trusts you — note this can be achieved over time.

                      I’m talking from personal experience of going rogue :)

                  • xelxebar 5 years ago
                    This reminded me of Raymond Chen's blurb article on his blog, The Old New Thing:

                    https://devblogs.microsoft.com/oldnewthing/20110118-00/?p=11...

                    For whatever reason that heavily influenced me, and now I almost always submit patches together with my bug reports. Sometimes its a whole lot more work, but I feel like this practice pays dividends in the amount you learn as well as the comraderie it fosters with the upstream devs.

                    • MrStonedOne 5 years ago
                      One annoying thing about the war on case fallthrough at the language level is the lack of imagination.

                      `continue` is a thing, compliments switch's existing `break` statement usage and is overall a much better solution for explicitly specifying fallthrough then returning special objects or c#'s abuse of goto.

                      • Spivak 5 years ago
                        I think it's a pretty syntax idea but it means continue works opposite your intuition inside switch statements compared to everywhere else.

                        I mean you could kinda argue that switch could conceptually mean something like

                            for matcher, block in cases:
                                if matcher.matches(subject) or continued:
                                    eval block
                        
                        but then continue still feels magic since it bypasses the test.
                      • JadeNB 5 years ago
                        Since there's not much context here: the article, which I enjoyed, ends with "Be the change you want to see in the world", and shows how the author exemplified this by coding a fix to a persistent source of PHP bugs at $WORK.
                        • 5 years ago
                          • thunderrabbit 5 years ago
                            And it doesn't show an actual diff.
                          • aasasd 5 years ago
                            As a technical aside (I know it's not the main point, but still): with the ubiquity of Git workflows employing pre-merge checking on the platform of choice (e.g. Github or a CI tool), this is rather easily done via a rule in an off-the-shelf linter. No `break` in a `case`? Can't merge it.

                            As a language-design aside, `switch` in general is stinky stuff. Not just with the fall-through: it also violates the regular C-style syntax for no particular reason, having a mini-syntax of its own instead.

                            But the most perverse thing I've seen done with `switch` is using it as `if`:

                                switch (true) {
                                    case ($a == $b): ...
                                    case ($c == $d): ...
                                    case (itsFullMoonToday()): ...
                                    default: ...
                                }
                            • Someone 5 years ago
                              If that’s the most perverse you’ve seen, I guess you haven’t seen Duff’s device, which mixes it with a do…while loop (code copied from http://www.catb.org/~esr/jargon/html/D/Duffs-device.html):

                                 register n = (count + 7) / 8;      /* count > 0 assumed */
                              
                                 switch (count % 8)
                                 {
                                 case 0:        do {  *to = *from++;
                                 case 7:              *to = *from++;
                                 case 6:              *to = *from++;
                                 case 5:              *to = *from++;
                                 case 4:              *to = *from++;
                                 case 3:              *to = *from++;
                                 case 2:              *to = *from++;
                                 case 1:              *to = *from++;
                                                    } while (--n > 0);
                                 }
                              • tzs 5 years ago
                                I have no idea if this still works in modern C, but here's something kind of similar I've seen used in C circa 1986:

                                  switch (foo)
                                  {
                                    case 1:
                                      ...code just for case 1...
                                      if (0)
                                      {
                                    case 2:
                                        ...code just for case 2...
                                      }
                                      ...code for both case 1 and case 2...
                                      break;
                                    case 3:
                                      ...
                                
                                  }
                                
                                used when two cases have a common tail, and you have sufficient space constraints that you do not want to duplicate the common tail code, and you have sufficient time constraints (and maybe space constraints on the stack) that you don't want to put the common tail code in a subroutine and call it from both cases.
                                • fisherjeff 5 years ago
                                  Oh my god. I’ve wondered about a way to do this but... this is just devious.
                                  • X-Istence 5 years ago
                                    This is goto with slightly more code ;-)
                                  • aasasd 5 years ago
                                    That's not perversity, that's suicide by art.
                                  • jwatzman 5 years ago
                                    Post author here.

                                    One of the nice advantages we had in using our static analysis tool for this, with real code intelligence, instead of a simpler linter, was that we could do a much more clever check. The actual rule was "every case must be terminal", and we had a rich analysis for "terminal", so stuff like this would be considered legal:

                                        case blah:
                                          if (cond) {
                                            throw blah;
                                          }
                                          if (cond2) {
                                            some_noreturn_function();
                                          } else {
                                            return 42;
                                          }
                                    
                                    etc etc. My example is a bit contrived, and the wisdom of doing such complicated things in switch/case is questionable, but it was useful when dealing with an existing codebase.

                                    Another advantage was that our tool was designed for extremely fast analysis over the entire codebase, so errors were given to programmers as they were writing code, immediately. (200ms response time to any change. At that level you can run it on every keystroke, which we did.) Traditional linters can in principle do this, but in practise are often not written with this sort of performance requirements in mind.

                                    (And yes, PHP allowing arbitrary expressions in case labels was... probably not a good idea.)

                                    • muglug 5 years ago
                                      FYI, phpcs now has something similar (since roughly 2016), but I imagine your implementation predates that by a significant margin (and yours obviously supports noreturn too).

                                      > PHP allowing arbitrary expressions in case labels was... probably not a good idea.

                                      Yeah, possibly the worst offender is the (somewhat common) "switch (true)" pattern: https://psalm.dev/r/4c168a9c5d

                                      Sadly I don't have the latitude you all do to govern how code gets written by users of my own static analysis tool – fall-through has to work there, too: https://psalm.dev/r/1d4ee59bd6

                                    • edflsafoiewq 5 years ago
                                      A switch is a kind of computed goto. It's a fantastically useful construct when you need it. You don't usually need it. I'm not sure what mini-syntax you're referring to, the syntax is roughly what I would have guessed it would look in C if I hadn't seen it and you just told me its semantics.

                                      Most people treat switch as a peculiar form of match (or cond or whatever), probably because this would usually be much more useful to have than a computed goto, and in many derivative languages they've put in weird additions that seem to be an attempt to drag it halfway to that, things that don't really make any sense in C, like allowing dynamic expressions in the cases (such as your example), or forcing all cases to be at the top level.

                                      (I feel like I write a lot of "in defense of switch" comments on HN...)

                                      • Supermancho 5 years ago
                                        > Most people treat switch as a peculiar form of match (or cond or whatever)

                                        Completely made up switch illustrates how I use it in every language if I can...Go has a succinct version of this.

                                            switch(True) {
                                                case ($x == 1):
                                                    $x++;
                                                    break;
                                                case ($x > 3):
                                                    $x--;
                                                    break;
                                                default:
                                                    break;
                                            }
                                        • edflsafoiewq 5 years ago
                                          Why? It's longer, more complex, and more highly indented than a normal if-elif-else.
                                      • benhoyt 5 years ago
                                        This is actually a feature in Go, but you leave off the "true" condition entirely: https://tour.golang.org/flowcontrol/11

                                        I quite like it, as it cuts down on the amount of brace and if/else if noise.

                                        • kazinator 5 years ago
                                          That could be the work of a Lisp coder rebelling against having cond taken away, and having to fend with if/else/elif.
                                          • aasasd 5 years ago
                                            Oh, no—that was the guy who loved MS and .NET despite working in a Linux-based shop, drove a Chinese-Russian SUV when not riding a roadbike in tight pants, supported Putin and swore about every ten words in both speech and chats, that's all while looking hella dorky. One thing I can't picture him doing is pining for Lisp.

                                            Now, grinding everyone's gears with his abuse of `switch`—that was right up his alley, what with already grinding them with his political commentary outside of code.

                                          • RealityVoid 5 years ago
                                            What language is that? It's not C, because it wouldn't compile.
                                            • JadeNB 5 years ago
                                              Looks kind of like Perl to me, except that Perl uses `given … when …`.
                                              • naniwaduni 5 years ago
                                                The third example is valid C.
                                            • userbinator 5 years ago
                                              On the contrary, I've heard of a story where a team decided to apply some new tool to their codebase which warned about a missing break, inserted the break so the warning disappeared, "fixed" the failing tests that now resulted, and then upon the next release was subsequently screamed at by numerous customers for the unwanted behaviour change.

                                              but in a very high-level language like PHP, there’s really no reason for switch to fall through at all

                                              I disagree. Switch fallthrough is very useful because it can reduce code duplication, especially when the logic has a ladder-like structure. Trying to impose increasingly draconian and such arbitrary rules is only going to lead to a self-fulfilling-prophecy where all the intelligent developers will get fed up and leave, and what's left are those which will continue to create tons of bugs some other way instead.

                                              • jwatzman 5 years ago
                                                Author here.

                                                Yeah, you can't just mechanically insert the breaks :) I carefully went through each instance where my rule tripped, looked at the surrounding code, and figured out the right fix. Trying to automate that would indeed have lead to disaster -- and the lack of ability to automate this is why no one had done this in the past, it sounded like too much work. (It wasn't that much work.)

                                                And I agree switch fallthrough can be useful! We just required it be annotated from now on, to convey the intent, as opposed to doing it by accident.

                                                • nicbou 5 years ago
                                                  I don't think I would quit a company over a more stringent linter.

                                                  To me, this is the same attitude that leads to safety hazards in other industries. "I don't need safety measures because -I- am not an idiot".

                                                  Even the smartest developers make dumb mistakes. If you can eliminate some of those early with minimal friction, it's worth it. The earlier you catch a mistake, the cheaper it is. Linter rules are less of a hassle than bugs in production.

                                                  • earthboundkid 5 years ago
                                                    The kind of person who complains about the linter is the exact reason your company needs a linter.
                                                  • Cyberdog 5 years ago
                                                    It reduces code duplication at the cost of increasing complexity and error-prone-ness, and there are ways to reduce the duplication. For example, given the following snippets:

                                                        switch ($foo) {
                                                          case 123: 
                                                            // do 1
                                                          case 23:
                                                            // do 2
                                                          default:
                                                            // do 3
                                                        }
                                                        
                                                        if ($foo == 123) {
                                                          do1();
                                                          do2();
                                                          do3();
                                                        }
                                                        else if ($foo == 23) {
                                                          do2();
                                                          do3();
                                                        }
                                                        else {
                                                          do3();
                                                        }
                                                    
                                                    Yes, the second one is more verbose and duplicates code, but I think it's also far more clear in terms of intention.
                                                    • dalemyers 5 years ago
                                                      You can even take it a step further:

                                                          if ($foo == 123) {
                                                            do1();
                                                          }
                                                          
                                                          if ($foo == 123 || $foo == 23) {
                                                            do2();
                                                          }
                                                          
                                                          do3();
                                                  • 5 years ago
                                                    • evmar 5 years ago
                                                      I did a similar thing for Chrome back in the day, to have the compiler check an `override` annotation:

                                                      http://neugierig.org/software/chromium/notes/2011/01/clang.h...

                                                      • vortico 5 years ago
                                                        Just a reminder that in most compiled and JIT optimized languages with switches, this

                                                          switch (c) {
                                                            case 4: return 30;
                                                            case 5: return 70;
                                                            default: return 0;
                                                          }
                                                        
                                                        compiles to the same instructions and therefore same performance as

                                                          if (c == 4) {
                                                            return 30;
                                                          }
                                                          else if (c == 5) {
                                                            return 70;
                                                          }
                                                          else {
                                                            return 0;
                                                          }
                                                        
                                                        As a side note, it's easier to see accidental fallthroughs if you write switch statements like this.

                                                          switch (c) {
                                                            case 4: {
                                                              return 30;
                                                            } break;
                                                            case 5: {
                                                              return 70;
                                                            } break;
                                                            default: {
                                                              return 0;
                                                            } break;
                                                        
                                                        But in my opinion, the `if` statement is more clear in both cases, and only one level of indentation instead of 2.
                                                        • comex 5 years ago
                                                          Huh. To my eyes, the switch version is "obviously" much more clear! It's significantly more compact, with less visual noise you have to mentally ignore. And having the possible inputs and outputs lined up vertically (especially with nothing else between them) makes it easy to understand the logic. Plus, using a switch indicates at a glance that the variable being tested is the same in all cases, whereas with an if-else-if chain there could be any weird expression hiding in one of the conditions.

                                                          Eye of the beholder, indeed.

                                                          • vortico 5 years ago
                                                            Perhaps a better comparison to the switch statement is

                                                              if (c == 4) return 30;
                                                              if (c == 5) return 70;
                                                              return 0;
                                                            • comex 5 years ago
                                                              Yep, that's a big improvement in my eyes, making it comparable from a readability perspective. In my opinion, that leaves more minor pros and cons of each approach, depending on the language. For example, in C and C++, as kazinator mentioned elsewhere in the thread, GCC and Clang now have an option to require all fallthroughs to be explicitly annotated, which effectively removes one of the main downsides of switch. And switch is better at detecting duplicate cases, as well as checking for exhaustivity if you're matching against enum variants. On the other hand, if you change the example so that the cases don't return (perhaps they assign to a variable instead), with switch you would need to add a `break;` after each one, increasing visual noise. If statements don't have that problem. Or if you change the example so that one of the values `c` is compared to is not a constant, some languages (C, C++, Java but not JavaScript, PHP) don't allow that with switch statements...
                                                          • bla3 5 years ago
                                                            That's only true if there are <= 3 "clusters" of switch case values, else compilers do smarter things. See https://youtu.be/gMqSinyL8uk
                                                          • enriquto 5 years ago
                                                            Are you sure? I'd guess that if the cases are an interval of consecutive integers it may do a "jmp [c]" sometime.
                                                            • KMag 5 years ago
                                                              You're correct. Many compilers will emit jump tables if the domain is dense enough. Also, even if compiling to a bunch of conditional jumps, they'll usually emit them to require O(log N) conditional jumps (equivalent to a binary tree of nested ifs) instead of the O(N) linear arrangement suggested by the GP.
                                                              • vortico 5 years ago
                                                                Clang compiles both switch statements and `if` statements to the same tree search code. E.g. https://godbolt.org/z/7M_3PU My point is that the optimizations you're describing for switch statements are also applied to a bunch of `if` blocks, at least in LLVM-based languages with an LLVM version from the last several years.
                                                              • vortico 5 years ago
                                                                Looks the same to me. https://godbolt.org/z/CfhD5k Switches were invented mostly because they were an easy hint to the compiler to reduce to a tree search, but because compilers are smart today, they're obsolete unless you just prefer their syntax.

                                                                Even if the cases are consecutive, most compiler optimizers are smart enough to determine that the ASTs are equivalent.

                                                                • enriquto 5 years ago
                                                                  > they're obsolete unless you just prefer their syntax.

                                                                  This is a good point.

                                                                  Sometimes I prefer the (ugly) syntax of switches than that of conditionals. The reason is that a chain of conditionals is non-symmetrical (there's no condition for the first "case"), while a switch is formally invariant to permutation of the cases. I have even seen shit like this:

                                                                      if (false) ;
                                                                      else if (c == 1) { first case ; }
                                                                      else if (c == 2) { second case; }
                                                                      ...
                                                                  
                                                                  just to preserve the symmetry along the cases.
                                                            • joatmon-snoo 5 years ago
                                                              Google applies ErrorProne in the default javac toolchain to mitigate this for Java builds: https://errorprone.info/bugpattern/FallThrough
                                                              • dlbucci 5 years ago
                                                                Recently, my coworker basically asked if he should set a TTL on a password for a job we were running. We said yeah, but our code would need changes to accommodate it before it expired and we didn't have time to make them.

                                                                I joked that we'd make a JIRA ticket that we'd repeatedly deprioritize until our code mysteriously broke in 6 months, but then my coworker actually made an Outlook reminder for when we had a month left!

                                                                I thought that was really smart, so I guess I learned a lesson about actually solving problems instead of just pointing them out like a smart ass.

                                                                • CGamesPlay 5 years ago
                                                                  What did you do in cases where the fall through was desired behavior? Did the resulting code (replacing a fall through with a nested if or something?) look better or worse as a result?

                                                                  The typical way to handle this at companies that don't have a static-type-checker department is to require a comment that says /* falls through */ at the end of the switch. Why was such a simple solution inappropriate here?

                                                                  • mplewis 5 years ago
                                                                    Because code reviews are more fallible than automatic linters, and OP had the luxury of working somewhere that the noisy people didn't fight for an anti-feature.
                                                                    • CGamesPlay 5 years ago
                                                                      I wasn't talking about requiring it in code review though. The OP didn't make an automatic linter that issued a warning when you did this implicitly, he removed the ability to do it entirely. With a linter, you can add a comment, like /* falls through */ to disable the lint warning. With the OP's change you can... use a series of if-else blocks? The alternative isn't listed, which is why I asked.

                                                                      Sorry about your last company, though. Sounds like it was annoying to work there!

                                                                      • jwatzman 5 years ago
                                                                        OP here.

                                                                        Explicitly-annotated fallthrough was still allowed; I codified a particular annotation that the analysis tool understood and allowed. I removed implicit fallthrough, but explicit fallthrough can be quite useful sometimes.

                                                                        • 5 years ago
                                                                      • GhostVII 5 years ago
                                                                        Sounds like that's what they did:

                                                                        The cleanup diffs I’d written that afternoon mostly just annotated clearly-intentional fallthroughs

                                                                      • l0b0 5 years ago
                                                                        Some of my favourite diffs were related to linters:

                                                                        - Linting files different from origin/master in a pre-commit hook.

                                                                        - Linting everything in CI.

                                                                        - Making the linter rules stricter (mypy does a great job of having several orthogonal strictness flags, so you can pick the set appropriate for your familiarity with the tool).

                                                                        - Implementing project-specific lint checks.

                                                                        • ChrisMarshallNY 5 years ago
                                                                          Cool story.

                                                                          I use Swift. It does not have default switch fallthrough. After a couple of versions, I learned about the fallthrough statement.

                                                                          I remember being frustrated by it, at first, but, like so many Swift oddities, it rapidly became second nature; thus, proving out that one of the goals of Swift is to train developers to write code properly.

                                                                          • tcbasche 5 years ago
                                                                            Go also has the same behaviour, with the fallthrough keyword. I find myself using a switch so rarely anyway...
                                                                          • saagarjha 5 years ago
                                                                            I thought the first release of Swift had fallthrough?
                                                                            • ChrisMarshallNY 5 years ago
                                                                              Maybe it did, but I didn’t learn about it for quite some time. I just learned to work around it.

                                                                              Swift is a deep river. I still learn new stuff every day, and I write Swift, seven days a week.

                                                                          • hkai 5 years ago
                                                                            If the author discovered tslint/eslint he'd be surprised that this problem was long solved in the JS world :)
                                                                            • muglug 5 years ago
                                                                              I believe this happened before tslint or eslint existed.
                                                                            • unnouinceput 5 years ago
                                                                              So many cases about this and that, pro and cons of switch fall-through, while Pascal was King from beginning.
                                                                              • T3RMINATED 5 years ago
                                                                                WinMerge is the #1 differ out there so stop spreading lies.