Overhead of Returning Optional Values in Java and Rust (2021)

179 points by krisgenre 2 years ago | 158 comments
  • kasperni 2 years ago
    Optional is being converted to a value class as part of Project Valhalla. I reran the benchmarks with the latest Valhalla branch [1] and added a test that used OptionalLong instead of Optional<Long>.

    Without Valhalla

    - OptionBenchmark.sumSimple avgt 5 328,110 us/op

    - OptionBenchmark.sumNulls avgt 5 570,800 us/op

    - OptionBenchmark.sumOptional avgt 5 2223,887 us/op

    - OptionBenchmark.sumOptionalLong avgt 5 1201,987 us/op

    With Valhalla

    - OptionBenchmark.sumSimpleValhalla avgt 5 327,927 us/op

    - OptionBenchmark.sumNullsValhalla avgt 5 584,967 us/op

    - OptionBenchmark.sumOptionalValhalla avgt 5 572,833 us/op

    - OptionBenchmark.sumOptionalLongValhalla avgt 5 326,949 us/op

    OptionalLong is now as fast as simple sum. And SumOptional is now as fast as SumNulls. So the overhead of using OptionalLong and Optional<Long> seems to have gone away with Valhalla.

    It would be great if boxing could be eliminated as well. But few people write code like what is being benchmarked (in hot loops) in practice.

    [1] https://github.com/openjdk/valhalla
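
    A rough sketch of the three Java variants being compared, for anyone who has not opened the article. This is not the article's code: the class name, the method names and the value of `MAGIC` are assumptions for illustration, and a plain loop like this says nothing about timing (that needs JMH).

    ```java
    import java.util.Optional;
    import java.util.OptionalLong;

    class OptionSumSketch {
        // Hypothetical sentinel: inputs whose low byte equals MAGIC count as "missing".
        static final long MAGIC = 7;

        // Nullable boxed value: null stands for "missing".
        static Long getNullable(long n) {
            long i = n & 0xFF;
            return i == MAGIC ? null : Long.valueOf(i);
        }

        // Boxed Long wrapped in a second object.
        static Optional<Long> getOptional(long n) {
            long i = n & 0xFF;
            return i == MAGIC ? Optional.empty() : Optional.of(i);
        }

        // Primitive specialization: no boxed Long inside the wrapper.
        static OptionalLong getOptionalLong(long n) {
            long i = n & 0xFF;
            return i == MAGIC ? OptionalLong.empty() : OptionalLong.of(i);
        }

        static long sumNullable(long max) {
            long sum = 0;
            for (long n = 0; n < max; n++) {
                Long v = getNullable(n);
                if (v != null) sum += v;
            }
            return sum;
        }

        static long sumOptional(long max) {
            long sum = 0;
            for (long n = 0; n < max; n++) {
                Optional<Long> v = getOptional(n);
                if (v.isPresent()) sum += v.get();
            }
            return sum;
        }

        static long sumOptionalLong(long max) {
            long sum = 0;
            for (long n = 0; n < max; n++) {
                OptionalLong v = getOptionalLong(n);
                if (v.isPresent()) sum += v.getAsLong();
            }
            return sum;
        }

        public static void main(String[] args) {
            // All three agree: 0..255 summed, minus the skipped 7, is 32633.
            System.out.println(sumNullable(256));
            System.out.println(sumOptional(256));
            System.out.println(sumOptionalLong(256));
        }
    }
    ```

    The only difference between the variants is how "missing" is represented, which is what the numbers above are measuring.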

    • kasperni 2 years ago
      Realising that my formatting is difficult to understand if you have not used JMH previously. I've cleaned up the numbers:

                        Pre Valhalla   | Valhalla
        sumSimple       328,110 us/op  | 327,927 us/op
        sumOptionalLong 1201,987 us/op | 326,949 us/op
        sumNulls        570,800 us/op  | 584,967 us/op
        sumOptional     2223,887 us/op | 572,833 us/op
      • malfist 2 years ago
        What unit of measurement is us/op?
        • CodeMage 2 years ago
          Microseconds per operation
      • pestatije 2 years ago
        Those numbers don't support your statement
        • misja111 2 years ago
          I was confused at first too, but there's a '5' after each benchmark which doesn't belong to the benchmark speed but which belongs to the benchmark title (?).
          • kasperni 2 years ago
            It is the number of non-warmup runs for the benchmarking tool (JMH). The original article included them as well, so I thought I would just post the numbers in the same format here. But I totally understand that they are confusing if you are not in the Java ecosystem.
          • Manjuuu 2 years ago
            The important numbers are those related to Optional/OptionalLong; there is a 4x+ improvement after Valhalla.
            • kasperni 2 years ago
              Could you tell me exactly what is the issue?

              sumNullsValhalla and sumOptionalValhalla return 584,967 us/op and 572,833 us/op respectively

              sumSimpleValhalla and sumOptionalLongValhalla return 327,927 us/op and 326,949 us/op respectively

              • rkuska 2 years ago
                Maybe the person missed the difference because of how your numbers are formatted:

                - OptionBenchmark.sumOptional avgt 5 2223,887 us/op

                vs

                - OptionBenchmark.sumOptionalValhalla avgt 5 572,833 us/op

                At least to me, after the first read it seemed like a comparison of two similar 5kish values.

                • 2 years ago
                  • stonemetal12 2 years ago
                    sumOptionalLong 1,201,987 us/op | 326,949 us/op

                    Sum Optional long is about 4x faster. 1.2Million microseconds per operation (pre Valhalla) vs 300K microseconds per operation (post Valhalla).

                • re-thc 2 years ago
                  We need this sooner than later!
                  • Pet_Ant 2 years ago
                    What are you doing? Our code is slow from terrible design choices and endless redirects, this is leagues below being relevant for us.
                    • re-thc 2 years ago
                      Building normal software like everyone else - just with a different mindset. Java doesn't have to be Spring or bloatware.

                      Before the recent generics everyone wrote Golang like Java without boxing and didn't complain, so why not :).

                • marginalia_nu 2 years ago
                  I think in general these types of benchmarks make boxed numbers look bad only because they're being compared to adding integers, which is a ridiculously fast operation to start with. Accomplishing half that speed despite "dereferencing a pointer" is kinda insane in the light of 'Latency Numbers Every Programmer Should Know'.

                  (The reason it works is that Java doesn't actually allocate new Longs for small numbers; it fetches them from a table of constants, so it's always the same 256 objects that are being dereferenced. I don't know their memory layout, but I'd half expect them to be sequential in memory, as that would be a low-hanging-fruit optimization. Optional<Long>'s performance is what you'd expect without these optimizations. Also, in this scenario you really should use OptionalLong instead of Optional<Long>, but that's beside the point ;-)

                  • jillesvangurp 2 years ago
                    I think the real issue is that you probably should worry about this overhead only in the context of really tight loops where you are basically wasting a lot of memory allocation and garbage collection for no good reason. But otherwise, Java is actually pretty good dealing with short lived small objects. The garbage collector is pretty good for that.

                    Normal usage of this stuff is not going to cause any more issues than trigger the OCD of people that obsess about this stuff. And in the rare case that you do have an issue, you can do something more optimal indeed.

                    • pdpi 2 years ago
                      GC performance isn't the only thing you need to care about here. Iterating over an ArrayList<T> in Java already makes you chase pointers all over the place due to Java's lack of value types, which wrecks cache locality. ArrayList<Optional<T>> just makes you chase two pointers for each object instead.
                      • CHY872 2 years ago
                        Java uses thread local allocation buffers such that objects allocated one after another are typically contiguous in RAM. Most modern Java gcs are also compacting, meaning that the heap ends up approximately breadth first ordered after GC.

                        What this means is that in practice, pointer chasing is less of an issue than you’d expect. Even a linked list will end up with decent cache locality.

                        Obviously this won’t always work, but it generally works a lot better than the same structure in a systems language.

                        https://shipilev.net/jvm/anatomy-quarks/11-moving-gc-localit...

                        • vips7L 2 years ago
                          List<Optional<T>> is pure insanity fwiw. Filter out your empties before making the list.
                        • pkolaczk 2 years ago
                          If you're writing a database engine, and you accidentally wrap each table cell in a Java object, you'll very quickly hit a GC performance wall, particularly when using a low pause GC. So we try to write Java without objects. Guess how easy that is. ;)
                          • marginalia_nu 2 years ago
                            Yeah I deal with that a lot with my search engine index too. Honestly it's not that bad once you get used to it.

                            You can get away with referencing the data through (mutable and reusable) pointer objects that reference memory mapped areas yet provide a relatively comfortable higher level interface. This gets rid of object churn while keeping a relatively sane interface.
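
                            A minimal sketch of what I mean, under some loud assumptions: the record layout (8-byte id plus 4-byte score), the class names and the heap `ByteBuffer` standing in for a memory-mapped region are all made up for illustration. A real index would get its buffer from `FileChannel.map`.

                            ```java
                            import java.nio.ByteBuffer;

                            class RecordCursorSketch {
                                // Hypothetical record layout: 8-byte id followed by 4-byte score.
                                static final int RECORD_SIZE = Long.BYTES + Integer.BYTES;

                                // Mutable, reusable view over one record; moving it allocates nothing.
                                static final class Cursor {
                                    private final ByteBuffer data;
                                    private int offset;

                                    Cursor(ByteBuffer data) { this.data = data; }

                                    Cursor moveTo(int index) { offset = index * RECORD_SIZE; return this; }
                                    long id() { return data.getLong(offset); }
                                    int score() { return data.getInt(offset + Long.BYTES); }

                                    void set(long id, int score) {
                                        data.putLong(offset, id);
                                        data.putInt(offset + Long.BYTES, score);
                                    }
                                }

                                // Stand-in for a memory-mapped region, filled with sample records.
                                static ByteBuffer sample(int records) {
                                    ByteBuffer buf = ByteBuffer.allocate(records * RECORD_SIZE);
                                    Cursor c = new Cursor(buf);
                                    for (int i = 0; i < records; i++) c.moveTo(i).set(1000L + i, i * 10);
                                    return buf;
                                }

                                static long totalScore(ByteBuffer data) {
                                    Cursor c = new Cursor(data); // one object reused for the whole scan
                                    int records = data.capacity() / RECORD_SIZE;
                                    long total = 0;
                                    for (int i = 0; i < records; i++) total += c.moveTo(i).score();
                                    return total;
                                }

                                public static void main(String[] args) {
                                    System.out.println(totalScore(sample(4))); // 0 + 10 + 20 + 30 = 60
                                }
                            }
                            ```

                            The scan creates one `Cursor` no matter how many records it visits, which is where the object churn goes away.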

                            • jillesvangurp 2 years ago
                              If you do stuff like that, use a profiler and identify and fix your real performance bottlenecks. As opposed to applying premature optimization blindly. Same with GC tuning. This has gotten easier over the years. But there are still lots of tradeoffs here.

                              There are plenty of fast performing databases and other middleware written in Java. The JVM is a popular platform for that kind of thing for a good reason. Writing good software of course is a bit of a skill. Benchmarks like this are kind of pointless. Doing an expensive thing in a loop is slow. Well duh. Don't do that.

                          • ramblerman 2 years ago
                            The table of constants leads to this fun example:

                            System.out.println(Integer.valueOf(22) == Integer.valueOf(22)); // true

                            System.out.println(Integer.valueOf(2200) == Integer.valueOf(2200)); // false

                            Which is a bit confusing to say the least. I realize one should never be using == with objects, but still.

                            • matsemann 2 years ago
                              That's another quality of life improvement in Kotlin, == calls the equals method on the objects, not comparing their references. Which is what most java programmers want in 99% of cases. And makes the code more readable, a.equals(b) is harder to read than a == b.
                              • marginalia_nu 2 years ago
                                In most cases I'd use equals(a, b) in Java (with a static import of Objects.equals). It's both null safe and more readable.
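
                                Roughly what that looks like; the class and method names are just for illustration. `Objects.equals` compares by value and tolerates null on either side, where `a.equals(b)` would throw if `a` were null.

                                ```java
                                import java.util.Objects;

                                class EqualsSketch {
                                    // Value comparison that is safe when either argument is null.
                                    static boolean sameValue(Integer a, Integer b) {
                                        return Objects.equals(a, b);
                                    }

                                    public static void main(String[] args) {
                                        Integer a = 2200, b = 2200;               // outside the guaranteed cache range
                                        System.out.println(sameValue(a, b));       // true: compared by value
                                        System.out.println(sameValue(null, b));    // false, and no exception
                                        System.out.println(sameValue(null, null)); // true
                                    }
                                }
                                ```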
                              • n4r9 2 years ago
                                Similar thing happens in Python (assuming my knowledge hasn't gone out of date)

                                  x = 2
                                  print(x is 2) // true
                                  x = 200
                                  print(x is 200) // false
                                • masklinn 2 years ago
                                  I think a big difference is that in Java you have to switch between `==` and `Object.equals` depending on whether your type is a primitive (e.g. int) or a reference (e.g. Integer).

                                  In Python the cases where you’d be using `is` are a lot more restricted, and a bit of an optimisation (although most style guides will require it in the cases where it makes sense).

                                  There’s basically 3 cases where you’d use `is` in Python:

                                  - the standard singletons None, True, and False (and for the latter two you’d usually use truthiness anyway)

                                  - placeholder singletons (e.g. for default values when you need something other than None)

                                  - actual identity check between arbitrary objects (which very rarely happens IME)

                                • bzzzt 2 years ago
                                  I kind of like that behaviour. You're comparing object references and get exactly the expected output instead of some magically overloaded equality operator. If you want that, just use 'equals' ;)
                                  • kaba0 2 years ago
                                    This will also be “fixed” once Valhalla arrives.
                                  • jnellis 2 years ago
                                    The cached range is -128 to 127. I just reran these benchmarks with ints instead of longs and the mask changed to 127 (0x7F), and the sumNulls and sumSimple results are exactly the same: 0.6 ns/op. As for the sumOptional method, swapping Optional<Integer> for the primitive variant OptionalInt doesn't change the result much; it's the actual creation of the Optional object itself that dominates the time.

                                    In a nutshell, on my old i5 2500k:

                                    ints 0.6ns/op

                                    cached Integer 0.6ns/op

                                    boxed Integer 1.3ns/op

                                    OptionalInt 3.5ns/op

                                    Optional<Integer> 4.2ns/op (time includes boxing the int)

                                    Where an op is getting the number, checking it, then an addition.

                                    For hot loops inside a jmh benchmark, you can use @OperationsPerInvocation(MAX) and it will spit out the results in this more readable format for the time just inside the loop.

                                    • Groxx 2 years ago
                                      Benchmark structure is one thing I hope more languages copy from Go - hard-coding iterations is pretty silly, doomed to need repeated changes or become problematically imprecise as hardware and runtime changes occur.
                                    • weego 2 years ago
                                      Moreso, non-highly specialised use cases for optionals in this class of language are most commonly around IO ops of some kind (DB, streams, messaging, API etc) so we're into "make sure there's no flies on the elephant when we weigh it" territory.
                                    • Someone 2 years ago
                                      I think part of the problem is that Java’s Optional isn’t a

                                        Either[null, T]
                                      
                                      but a

                                        Either[null, boxed T, boxed null]
                                      
                                      ‘Boxed null’ is what the documentation (https://docs.oracle.com/javase/8/docs/api/java/util/Optional...) calls “an empty Optional”

                                      That means that, for example, an Optional[Byte] can have 258 different values and cannot, in general, be compiled to a ”pointer to byte” because that has only 257 different values.

                                      Edit: reading https://news.ycombinator.com/item?id=35133241, the plan is to change that. I fear that, by the time they get around to that, lots of code will handle the cases null and Optional containing null differently, making that a breaking change.

                                      • moonchild 2 years ago
                                        > reading https://news.ycombinator.com/item?id=35133241, the plan is to change that. I fear that, by the time they get around to that, lots of code will handle the cases null and Optional containing null differently, making that a breaking change

                                        The post your link links to explains exactly how they intend to avoid this problem.

                                        • Someone 2 years ago
                                          The way I read that is that it says they’ll introduce an “inline class” that’s used to implement the “reference class” Optional, not that it will be replaced by it.

                                          IMO, that’s not solving the problem, but doing the best you can once you’ve decided to implement Optional as a reference class now and as a value class at some future time.

                                          I think I would have waited for the proper implementation.

                                        • marginalia_nu 2 years ago
                                          [True|False|FileNotFound] ;-)

                                          For what it's worth, Java's also got this class: https://docs.oracle.com/javase/8/docs/api/java/util/Optional...

                                          Although in practice there isn't much performance difference in my experience.

                                          • adrianmsmith 2 years ago
                                            > I fear that, by the time they get around to that, lots of code will handle the cases null and Optional containing null differently, making that a breaking change.

                                            Yes, I was working with code once which wrapped string IDs into a FooId object (a good idea in principle) and all of the following had different meanings:

                                                FooId x = null;
                                                FooId x = new FooId(null);
                                                FooId x = new FooId("");
                                                FooId x = new FooId(...an actual ID...)
                                            
                                            I think one was for not showing any content at all, one was for showing default content, another was that content was there but the user wasn't allowed to see it, etc.

                                            I'm so glad I left that company...

                                            • nayuki 2 years ago
                                              How is boxed null possible? The documentation for Optional.of(T value) says that it'll throw NullPointerException if value == null. https://docs.oracle.com/javase/8/docs/api/java/util/Optional...
                                              • vanjajaja1 2 years ago
                                                That's because Optional.of(null) == Optional.empty() Optional.empty() is boxed null
                                                • conro1108 2 years ago
                                                  Optional.of(null) throws a NullPointerException ;)

                                                  Optional.ofNullable(null) == Optional.empty()
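
                                                  A small check of both behaviours, as a sketch; the class and method names are made up. I compare with `equals` rather than `==`, since the docs don't promise that empty Optionals are a single shared instance.

                                                  ```java
                                                  import java.util.Optional;

                                                  class OptionalNullSketch {
                                                      // Optional.of rejects null outright.
                                                      static boolean ofThrowsOnNull() {
                                                          try {
                                                              Optional.of(null);
                                                              return false;
                                                          } catch (NullPointerException e) {
                                                              return true;
                                                          }
                                                      }

                                                      // Optional.ofNullable maps null to the empty Optional.
                                                      static boolean ofNullableIsEmpty() {
                                                          Optional<String> o = Optional.ofNullable(null);
                                                          return !o.isPresent() && o.equals(Optional.empty());
                                                      }

                                                      public static void main(String[] args) {
                                                          System.out.println(ofThrowsOnNull());    // true
                                                          System.out.println(ofNullableIsEmpty()); // true
                                                      }
                                                  }
                                                  ```

                                                  So an Optional never holds a null value; the extra state is the Optional reference itself being null.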

                                            • jeroenhd 2 years ago
                                              > A special “I would never write Rust like that” variant that returns the value in a Box on the heap. This is because a friend of mine, after seeing my results, told me I was cheating, because all Rust versions so far used registers/stack to return the value, and Java was at a disadvantage due to returning on the heap (by default). So here you are.

                                              Why would Rust be cheating here? Java cannot make these types of optimizations yet (though they are likely coming with Project Valhalla) but that doesn't mean Rust should be similarly handicapped in benchmarks.

                                              Java has many smart optimizations and advantages over Rust (being garbage collected for one, making it much easier to write code in, and runtime reflection, a blessing and a curse) and with tricks like rearranging objects to make more effective use of CPU caches you can end up writing Java that's very close in performance to native, precompiled code.

                                              However, when it comes to raw performance, you shouldn't expect the standard JVM to come close to Rust. There is inherent overhead in the way the language and the runtime are designed. There is no "cheating" here, the algorithms are the same and some languages just produce more efficient code in these scenarios. You wouldn't slow down the JVM to make the benchmark fair for a Python implementation either!

                                              A more interesting comparison may be compiling Java to native assembly (through Graal for example) so Java too can take advantage of not having to deal with reflection and using SIMD instructions.

                                              Alternatively, a Java vs C# rundown would also be more interesting, as both languages serve similar purposes and solve similar problems. C#'s language-based approach to optional values has the potential to be a lot faster than Java's OOP-based approach, but by how much remains to be seen.

                                              Java vs Kotlin may also be interesting to benchmark to see if the Kotlin compiler can produce faster code than Java's Optional; both run inside the same JVM so the comparison may be even better.

                                              • pjmlp 2 years ago
                                                Kotlin is only syntax sugar, so any bytecode pattern it happens to generate better than javac is also doable in Java.

                                                In fact it is mostly the opposite, all the Kotlin concepts that don't exist in Java (the language), need additional bytecodes to fake their semantics on top of JVM bytecodes optimized for Java semantics.

                                                Like functions, lazy initializations, delegation, or co-routines.

                                                • usrusr 2 years ago
                                                  But it's syntax sugar designed to make you stop worrying and love the null. Would be quite interesting to see how "elvised" Long? would microbenchmark against Optional<Long> and OptionalLong!
                                                  • pjmlp 2 years ago
                                                    Not worth the trouble to be married to JetBrains tooling.
                                                  • TeeWEE 2 years ago
                                                    I dont think this is true for the nullability type... Which are in essence compile time optionals... Without the overhead.
                                                    • tadfisher 2 years ago
                                                      Kotlin has a runtime cost here, by inserting runtime null checks. You can disable them with a compiler flag if you don't have non-Kotlin consumers.
                                                      • pjmlp 2 years ago
                                                        Same can be achieved with static analysis tooling like Sonar, Findbugs, PMD,...
                                                      • tadfisher 2 years ago
                                                        Define "doable". There are many, many bytecode constructs that are possible on the JVM, but are not generated by javac: https://stackoverflow.com/a/23218472

                                                        Do you mean, "javac can also implement them if it is modified to do so"? Because you are also making the case that Kotlin is syntax sugar on top of Java, when it is actually a bytecode-generating compiler in its own right, so I'm not sure how to understand this comment.

                                                        • pjmlp 2 years ago
                                                          Java is syntax sugar for Java Virtual Machine, and the only language that actually matters for the design of Java Virtual Machine Bytecodes.

                                                          Anyone else has to generate boilerplate code to emulate the semantics expected by those bytecodes, as is easily shown by running javap on .class files.

                                                      • kaba0 2 years ago
                                                        I agree with you, though let’s add that different languages have different paradigms/idiomatic patterns, and in the case of Java, Optional is not one, while in the case of Rust it is, and was likely optimized extensively. Of course the niches of the two languages are very different; the whole point of Rust is being a safe low-level language, which can express the wanted functionality more specifically (at the cost of much higher developer complexity).

                                                        So this test is as “unfair” as benchmarking Rust’s allocation performance against Java, for example

                                                        • motoboi 2 years ago
                                                          Graal is tested and produces no different result than base java. That surprised me.
                                                        • ithkuil 2 years ago
                                                          Rust has an interesting optimization for Option<T> when T has enough "room" for encoding a marker for "None": https://google.github.io/comprehensive-rust/std/box-niche.ht...

                                                          e.g. Option<NonZeroU64> is effectively encoded and operated on as u64, but it gives the type system a way to make sure you correctly handle the case where "0" means something special for you

                                                          • chrismorgan 2 years ago
                                                            Just a pity we currently only have number types with the niche at zero; something like NonMaxU32, which represents numbers in the range [0, 2³² − 2], would be useful at least as often, leaving 0xffffffff available for niche optimisations like Option::<NonMaxU32>::None.

                                                            NonMinI32 could also be interesting as a symmetrical number type, representing [−2³¹ + 1, 2³¹ − 1] and leaving the bit pattern 0x80000000 for niche optimisations.

                                                            • dathinab 2 years ago
                                                              When there were a lot of discussions about niche optimizations and integers, the idea came up that if we had const generics we could have `NonXXX` types which use const generics to specify the niche.

                                                              But back then we didn't have const generics, and it was too far off, time-wise, to wait for const generics in any form (including unstable rustc-internal-only usage of it).

                                                              So now that we have const generics, if someone sits down, discusses the technical details on Zulip, then writes an RFC and then writes an implementation, we theoretically could have it soon.

                                                              Though I'm not sure how easy or hard the implementation part would be.

                                                              Some problems to discuss for standardization would be:

                                                              - is there any in-progress work, overlapping RFC etc. (I don't know; there should be older in-progress work, but someone might be working on it right now). There could also be work on more generic niche handling code which would happen to also cover this.

                                                              - should multiple niches be handled, and if so how and with which limitations (there are no variadic generics, and ways to emulate them, like through type nesting, likely would have perf and complexity problems)

                                                              - can it be useful outside of optimizations to have e.g. a range-limited integer

                                                              - if the gap is big enough (i.e. u32 limited to a hypothetical u24), should it interact with packed representation

                                                              - is there any risk of it being confusing/unexpected (should not be the case, but still needs to be evaluated)

                                                              EDIT: There seem to be the following unstable attributes:

                                                              #[rustc_nonnull_optimization_guaranteed] #[rustc_layout_scalar_valid_range_start(...)] #[rustc_layout_scalar_valid_range_end(..)]

                                                              • tialaramex 2 years ago
                                                                Eventually it will be possible to write new types like this in stable Rust, the current approach is Pattern Types.

                                                                Today you can do this in nightly Rust, using a deliberately permanently unstable attribute, that's what my nook crate does to produce e.g. BalancedI8 which is a signed byte from -127 to 127. It will be nice when some day Pattern Types, or an equivalent are stabilized.

                                                                • kibwen 2 years ago
                                                                  I strongly suspect that niches will be stabilized for user-defined types someday. Unlike some other features (e.g. specialization) where there are open questions about how the feature could possibly work, niches are well-understood and mostly just need somebody to champion them.
                                                              • cryptos 2 years ago
                                                              • rocqua 2 years ago
                                                                The rust NonZeroU64 solution has a subtle 'bug' that happens not to matter.

                                                                The function

                                                                    fn get_optional_non_zero(n: u64) -> Option<NonZeroU64> {
                                                                        let i = n & 0xFF;
                                                                        if i == MAGIC { None } else { NonZeroU64::new(i) }
                                                                    }
                                                                
                                                                actually also returns None when n is 0 or any other multiple of 256, because NonZeroU64::new(0) is None.

                                                                The resulting usage in the sum still yields the same result, because skipping zeros in an addition doesn't matter, but it is a subtle difference between this get-function compared to all of the others. It also doubles the number of None cases the code needs to handle.

                                                                • TwentyPosts 2 years ago
                                                                  > A special “I would never write Rust like that” variant that returns the value in a Box on the heap. This is because a friend of mine, after seeing my results, told me I was cheating, because all Rust versions so far used registers/stack to return the value, and Java was at a disadvantage due to returning on the heap (by default).

                                                                  Uhh, okay? This sounds a bit silly to me. It's good to add the additional comparison, sure, but "cheating" is just not the right word. The point of this article is ostensibly to compare the built-in Option types, not heap allocation. The fact that Java allocates any Options on the heap is part of that comparison (and reflects badly on Java, fwiw).

                                                                  Either way, glad to see that Rust is doing a good job eliminating the overhead. I'm not sure if arithmetic is the right kind of benchmark here, but it'd probably be difficult to measure the performance overhead across "real" codebases, so focusing on a tight loop microbenchmark is probably fine.

                                                                  • MrBuddyCasino 2 years ago
                                                                    Handling nulls by wrapping references in Optionals (at least if they can't be optimised away) is IMO strictly inferior to static analysis by the compiler, as in Kotlin, and forcing correct error handling. It's really all you need! The problem is not nullable references (after all, Optionals can contain nulls); the problem is documenting whether a value can be null via the type system and correctly handling those cases.

                                                                    Hoping Java will get this one day, but probably not...

                                                                    • jeroenhd 2 years ago
                                                                      If you set up your project with the right linters and validators, you can use @Nullable and friends to get close to Kotlin's type system. The readability of `public @Nullable frobulateWidget(@NotNull frobber)` may be questionable, but at least it'll work. I believe JetBrains has a library that adds these annotations so its IDE (and probably other tools as well) can judge the nullability of fields. In fact, IntelliJ even has a button that will make the IDE infer nullability and add annotations for an entire class or project.

                                                                      Combine those annotations with a linter + pipeline that marks nullability warnings as errors and you've come pretty close to Kotlin's advantages. Of course, Kotlin also has some more advanced mutability controls and other advantages that Java doesn't get for free.

                                                                      When it comes to simple values, null vs non-null can be solved by using primitives (long) instead of objects (Long), as primitives can never be null.

                                                                      • kaba0 2 years ago
                                                                        > `public @Nullable frobulateWidget(@NotNull frobber)`

                                                                        You can mark the default state. I like to mark everything as NotNull, unless specified otherwise. That way only Nullable annotations are needed at the rare occasion null is a valid value.

                                                                        And I believe it gives you the exact same guarantees as Kotlin, minus the syntactic sugar — nullability is one of the few things that can be statically analyzed.

                                                                        Most linters also know the standard library’s nullability information, so it’s quite good.

                                                                      • iainmerrick 2 years ago
                                                                        I had the same thought. Kotlin and TypeScript have a good approach here. Don’t avoid using nulls at runtime, just beef up the type system so you know at compile time when they might appear. (The only wrinkle in both cases is that you might have to interact with Java/JS code that might not be null-safe.)
                                                                      • lmm 2 years ago
                                                                        > Handling nulls by wrapping references in Optionals (at least if they can't be optimised away) is IMO strictly inferior to static analysis by the compiler as in Kotlin and forcing correct error handling. It's really all you need!

                                                                        Disagree. The Kotlin way of doing it leads to really subtle bugs in generic code, because T? is usually different from T but sometimes it's not. (For example, if you write a generic cache that caches the result of f(x) in a map, it's really easy to accidentally write code that doesn't cache the result if it's null, and not notice).

                                                                        Also a lot of the time you don't actually want Optional, you want Either, because you want to know why the value wasn't present. Either is really limited in Kotlin.
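
                                                                        A rough Java analogue of that cache pitfall (a sketch, not Kotlin; NullCacheDemo and its methods are made-up names): HashMap.computeIfAbsent records no mapping when the function returns null, so a null result gets recomputed on every lookup, while wrapping the result in Optional makes "computed, but absent" something the map can store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration; class and method names are invented for this sketch.
class NullCacheDemo {
    // Naive cache: computeIfAbsent stores nothing when f returns null,
    // so a null result is recomputed on every lookup.
    static <K, V> Function<K, V> naiveCache(Function<K, V> f) {
        Map<K, V> cache = new HashMap<>();
        return k -> cache.computeIfAbsent(k, f);
    }

    // Wrapping the result in Optional turns "no value" into a storable entry.
    static <K, V> Function<K, V> optionalCache(Function<K, V> f) {
        Map<K, Optional<V>> cache = new HashMap<>();
        return k -> cache
                .computeIfAbsent(k, key -> Optional.ofNullable(f.apply(key)))
                .orElse(null);
    }

    // Counts how often the underlying function runs for two lookups of one key.
    static int callsForTwoLookups(boolean useOptional) {
        AtomicInteger calls = new AtomicInteger();
        Function<String, String> alwaysNull = s -> {
            calls.incrementAndGet();
            return null;
        };
        Function<String, String> cached;
        if (useOptional) {
            cached = optionalCache(alwaysNull);
        } else {
            cached = naiveCache(alwaysNull);
        }
        cached.apply("x");
        cached.apply("x");
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("naive: " + callsForTwoLookups(false));   // prints "naive: 2"
        System.out.println("optional: " + callsForTwoLookups(true)); // prints "optional: 1"
    }
}
```

                                                                        Two lookups of the same key run the underlying function twice through naiveCache and only once through optionalCache.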

                                                                        • cryptos 2 years ago
                                                                          At least Oracle is working on JVM stuff that could be used to introduce union types in Java. https://openjdk.org/jeps/8204937
                                                                          • MrBuddyCasino 2 years ago
                                                                            I've never seen this posted before, but this seems incredibly important and interesting. Maybe just needs a catchy name.
                                                                          • kaba0 2 years ago
                                                                            There are actually plans to tackle it in conjunction with Value types: https://news.ycombinator.com/item?id=34700346
                                                                            • haspok 2 years ago
                                                                              The problems with null are:

                                                                              1. They are not composable (can't map or flatmap or fold/reduce them).

                                                                              2. They can only represent one extra value; if you need more, you are back to square one (e.g. you can't return an error value, only the fact that there is no value).

                                                                              Going one step further, one could argue that even optionals are lacking: one should model the possible domain values with sums and products in such a way that no nulls or optionals are required. Do not try this in a language with a type system as basic as Java's or even Kotlin's, though; you will run into its limits almost immediately.
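
                                                                              To make point 1 concrete, here is a minimal Java sketch using java.util.Optional (the lookup tables and all names are hypothetical): the null-based version needs a check after every step, while the Optional version chains the steps with flatMap and map.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; the tables and method names are invented for this sketch.
class OptionalCompositionDemo {
    static final Map<String, String> MANAGER_OF = Map.of("alice", "bob");
    static final Map<String, String> EMAIL_OF = Map.of("bob", "bob@example.com");

    static Optional<String> managerOf(String user) {
        return Optional.ofNullable(MANAGER_OF.get(user));
    }

    static Optional<String> emailOf(String user) {
        return Optional.ofNullable(EMAIL_OF.get(user));
    }

    // Null-based version: every step needs its own explicit check.
    static String managerEmailUpperNullable(String user) {
        String manager = MANAGER_OF.get(user);
        if (manager == null) return null;
        String email = EMAIL_OF.get(manager);
        if (email == null) return null;
        return email.toUpperCase();
    }

    // Optional version: flatMap chains a step that may be absent,
    // map applies a step that always produces a value.
    static Optional<String> managerEmailUpper(String user) {
        return managerOf(user)
                .flatMap(OptionalCompositionDemo::emailOf)
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        System.out.println(managerEmailUpper("alice")); // prints "Optional[BOB@EXAMPLE.COM]"
        System.out.println(managerEmailUpper("bob"));   // prints "Optional.empty"
    }
}
```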

                                                                              • kaba0 2 years ago
                                                                                In modern Java you could do this:

                                                                                  sealed interface Option<T> permits Some, None {}
                                                                                  record Some<T>(T value) implements Option<T> {}
                                                                                  record None<T>() implements Option<T> {
                                                                                    // Generic helper so callers need not spell out the type argument.
                                                                                    static <T> None<T> none() {
                                                                                      return new None<>(); // can also be a single shared instance
                                                                                    }
                                                                                  }
                                                                                
                                                                                The only less than ideal part is that None needs the generic type, but that can be easily circumvented by adding a generic helper method. You can add all the Monad goodies to the Option interface and you will even get exhaustive switch cases with pattern matching. The only thing Java’s type system can’t express is abstracting those Monad goodies, but it can absolutely implement them on a case-by-case basis.
                                                                                • haspok 2 years ago
                                                                                  Thank you for bringing this to my attention, my current project is on Java 11, and I'm really struggling with the restrictions around enums (the poor man's sum type :)) and interfaces... maybe I can push for Java 17 at least!

                                                                                  The fundamental problem in Java is, however, what you stated in your last sentence: you are limited in abstraction, in most cases you have to implement the specifics.

                                                                                  • sn9 2 years ago
                                                                                    Mario Fusco gave a talk showing how to implement monadic patterns in Java years ago.

                                                                                    I never found the talk online, but here's the speaker deck: https://speakerdeck.com/mariofusco/monadic-java

                                                                                  • MrBuddyCasino 2 years ago

                                                                                        1. They are not composable (can't map or flatmap or fold/reduce them).
                                                                                    
                                                                                    Kotlin helpfully added mapNotNull() and similar methods.

                                                                                        2. you can't return an error value, only the fact that there is no value
                                                                                    
                                                                                    Yes I much prefer Rust-like return values, the non-local control flow of exceptions leads to convoluted code and improper error handling.
                                                                                    • dtech 2 years ago
                                                                                      2 is not different for Option(al)/Maybe. 1 is simply not true in Kotlin: value?.let is both map and flatmap for a value with a nullable type. Which one depends on whether you return a nullable type inside the let (flatmap) or not (map).
                                                                                      • tsss 2 years ago
                                                                                        It absolutely is true. The ?. operator is nominally different to map/flatMap. It does not extend to other monadic types and neither can you abstract with map/flatMap over nullable types. Not to mention more advanced type system features like higher kinded data that Kotlin can only dream of. Option can be mapped over types, ? can not. Option can be handled by sop-generic programming, ? requires a special case. Option is bijective, ? is not.
                                                                                    • gjadi 2 years ago
                                                                                      I've never used Kotlin, but there are several systems that provide Null Analysis in Java. For example FindBugs https://findbugs.sourceforge.net/manual/annotations.html

                                                                                      Is Kotlin better because it works out of the box or are there differences in the feature set?

                                                                                      • cryptos 2 years ago
                                                                                        From my experience Kotlin's null handling works better than these external tools. Another point is that also the APIs need some support for it, to be convenient. Kotlin has methods like mapNotNull for example.
                                                                                        • gjadi 2 years ago
                                                                                          I agree that it's better when it's built in, because the whole ecosystem uses it. Whereas in Java you may need to wrap third-party code if it doesn't use the null analysis (or the same tool).

                                                                                          But regarding the feature I imagine it is the same. Or are there cases where the Java Null Analysis fails?

                                                                                      • mrkeen 2 years ago
                                                                                        > The problem is not nullable references

                                                                                        Disagree. If you don't PUT the nulls into the language, you don't need a brigade of PhDs to develop the static analysis to tell you whether you have nulls.

                                                                                        I'm sick of worshipping at the altar of backward compatibility. Just because we used to choose to include nulls doesn't mean we need to keep choosing to include them.

                                                                                        • MrBuddyCasino 2 years ago
                                                                                          You still need to signal the absence of a value. This is not necessarily the same thing as a monadic Result<T, E> type as in Rust, which I quite like and would prefer over exceptions. I think the Kotlin solution is very elegant, considering the restrictions of a host platform that uses exceptions as its error signaling mechanism.
                                                                                          • dgb23 2 years ago
                                                                                            You missed the point entirely. Nulls are just part of proper unions in Kotlin (and other languages). They are just part of the type and they are explicit.

                                                                                            Instead of having to wrap an optional value, you just annotate the type as being the union of something _or_ null. You get the same guarantees, but it actually composes openly instead of having to create a closed, specific construct that enumerates variants.

                                                                                            • mrkeen 2 years ago
                                                                                              > You missed the point entirely.

                                                                                              No.

                                                                                        • diffuse_l 2 years ago
                                                                                          My guess is that C# will be much better, since it has support for Value types.

                                                                                          Wasn't Java supposed to get support for Value types some time ago?

                                                                                          • pjmlp 2 years ago
                                                                                            Project Valhalla has the goal of being ABI compatible with existing JARs, which is why it has taken so long: they want to add value type semantics without breaking Maven Central.
                                                                                            • SideburnsOfDoom 2 years ago
                                                                                              C# can be much better, or it can be much the same. It depends on which implementation you use. Standardisation is an issue here, the system library has the "Nullable<T>" struct (1), but not a standard Result<T, E> class (or struct). Popular libraries such as "OneOf" use a class type. The single-valued "OneOf" type is effectively an "Option<T>" (2)

                                                                                              Nullable<T> itself is not exactly the same as Option<T> since it does not cover types that already allow nulls. It adds nulls rather than removing them.

                                                                                              Many people roll their own Option<T> or Result<T,E> type, since it's easy enough to start, and it's usually a class type.

                                                                                              1) https://learn.microsoft.com/en-us/dotnet/api/system.nullable...

                                                                                              2) https://github.com/mcintyre321/OneOf/blob/master/OneOf/OneOf...

                                                                                              • alkonaut 2 years ago
                                                                                                C# does indeed produce a rather tight assembly

                                                                                                https://sharplab.io/#v2:EYLgtghglgdgNAFxBAzmAPgAgLACgACATAMx...

                                                                                                • orthoxerox 2 years ago
                                                                                                  The compiler takes care of this, but the idiomatic way to write the body of the loop in C# is

                                                                                                      sum += things[i] ?? 0;
                                                                                                  • alkonaut 2 years ago
                                                                                                    You could even argue the most idiomatic C# is things.Sum(x => x ?? 0) and there the real test would be if the compiler/jit would match the speed of the for() loop exactly, i.e. not allocating any enumerators on the heap, not boxing any values and so on.
                                                                                                • kaba0 2 years ago
                                                                                                  Well, Valhalla is described as “requiring 7 PhD’s knitted together”, doing that while remaining backwards compatible is insanely hard.
                                                                                                  • olavgg 2 years ago
                                                                                                    [flagged]
                                                                                                    • pharmakom 2 years ago
                                                                                                      I downvoted because this is not a constructive comment and it makes sweeping generalisations.
                                                                                                      • SideburnsOfDoom 2 years ago
                                                                                                        C# dev here, and I don't bash Java. In fact, I don't tend to comment on languages that I'm not an expert on, because well, not an expert. This covers the vast majority of programming languages and toolkits. Nobody knows them all. Nobody knows more than a small fraction of them.

                                                                                                        I'll make one generalisation though: any dev of $langA that spends their time bashing $langB is doing pathetic insecure gatekeeping, and should give it up. This applies to what the grandparent comment is talking about, and to the grandparent comment itself.

                                                                                                        It's not cool to hate on an out-group, even if it's a community bonding experience.

                                                                                                        I refer the grandparent to Scott Hanselman: https://www.youtube.com/watch?v=IzhQIpT7S50

                                                                                                  • Yujf 2 years ago
                                                                                                    Why is the author talking about null in the intro, which implies using pointers and thus boxed objects, and then running benchmarks on integers? That makes no sense to me.
                                                                                                    • sirwhinesalot 2 years ago
                                                                                                      Because it's a benchmark on Optionals? In Java an Optional<Long> requires boxing, in Rust it does not. You'd expect a "sufficiently smart compiler" to detect this and avoid needless boxing after inlining and escape analysis but clearly that is not the case.

                                                                                                      Note that "Long" in Java can be null because it is boxed, "long" (lowercase) however cannot be null, but it also can't be Optional<long>. Java sucks :)

                                                                                                      EDIT: I'd love to see a C# version of this.

                                                                                                      • winrid 2 years ago
                                                                                                        Java's language philosophy is simple - everything must be an object. This simplifies a lot of things.

                                                                                                        Rust has optionals built into the language. Rust's philosophy is to be a super powerful tool, language complexity be damned.

                                                                                                        I find it hard to say java sucks in this context. Each language is making trade offs that align with their vision.

                                                                                                        • TazeTSchnitzel 2 years ago
                                                                                                          Rust doesn't have optionals built-in. The language has no special support for them (beyond the try operator); just like Java, Rust's optional type is provided by the standard library, but it could be trivially implemented yourself and your implementation would have the same behaviour and performance characteristics. It's literally just:

                                                                                                            enum Option<T> {
                                                                                                                Some(T),
                                                                                                                None,
                                                                                                            }
                                                                                                          
                                                                                                          What makes Rust fast here is that it has value types and can optimise them.
                                                                                                          • sshine 2 years ago
                                                                                                            > Note that "Long" in Java can be null because it is boxed, "long" (lowercase) however cannot be null, but it also can't be Optional<long>. Java sucks :)

                                                                                                            I think not being able to use primitive types as generic type arguments is something that makes Java less ergonomic than C# (where they’re called unmanaged types), whether it is considered justified or necessary.

                                                                                                            To say Java sucks because of this is a bit much. To say Java sucks because you can’t avoid null is definitely warranted. (You can say good things about Java, and not being able to opt out of nulls is not one of them.)

                                                                                                            • u320 2 years ago
                                                                                                              > Rust's philosophy is to be a super powerful tool, language complexity be damned.

                                                                                                              This is not Rust's philosophy at all.

                                                                                                              • jonhohle 2 years ago
                                                                                                                But `Optional` could have been a value type from the start and had effectively zero overhead, especially if it were specialized for primitive types. There are 8 primitive types, so supporting them all with a value-type optional would not have been the end of the world, even if it was only a language-level optimization (e.g. optional becomes a 96- to 128-bit type and the compiler is responsible for ensuring primitives are wrapped/unwrapped specially).

                                                                                                                GNU Trove is a collection library that focuses on optimizing for primitive types and is significantly faster than the Java collections, which require boxing.

                                                                                                                • tpm 2 years ago
                                                                                                                  > Java's language philosophy is simple - everything must be an object.

                                                                                                                  Except primitive types like long in this case, which are not objects. This was a performance-consistency tradeoff made in the early 90s. It made sense at the time and now doesn't make sense to some people, but that's ok. I wouldn't say Java sucks because of that either. Now type erasure, that's a different topic.

                                                                                                                  • pharmakom 2 years ago
                                                                                                                    The Java compiler “sucks” (not the language) because it’s not making optimisations that are safe and a human could do themselves fairly simply.
                                                                                                                    • thaumasiotes 2 years ago
                                                                                                                      > Java's language philosophy is simple - everything must be an object. This simplifies a lot of things.

                                                                                                                      Everything except primitive types, functions, and arrays (of any type). The different status of arrays can be a real pain.

                                                                                                                      Ruby says the same thing, and they're even worse about functions behaving differently than objects do.

                                                                                                                      • komadori 2 years ago
                                                                                                                        Rust doesn't have optionals built into the language except insofar as Option<T> is defined in the standard library. The difference is that Rust allows you to define new value-types, whereas Java has a small fixed set of "primitive" value-types.
                                                                                                                    • Inityx 2 years ago
                                                                                                                      Also from the beginning of the article:

                                                                                                                      > The task was to compute a sum of all the numbers, skipping the number whenever it is equal to a magic constant. The variants differ by the way how skipping is realized:

                                                                                                                      > 1. We return primitive longs and check if we need to skip by performing a comparison with the magic value directly in the summing loop.

                                                                                                                      > 2. We return boxed Longs and we return null whenever we need to skip a number.

                                                                                                                      > 3. We return boxed Longs wrapped in Optional and we return Optional.empty() whenever we need to skip a number.

                                                                                                                      Seems pretty reasonable to me.
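
                                                                                                                      For readers who haven't opened the article, the three variants could look roughly like this in Java. This is a sketch written from the description above, not the article's benchmark code; MAGIC, the method names and the array-backed getters are assumptions.

```java
import java.util.Optional;

// Hypothetical reconstruction of the three variants from the quoted description.
class SumVariantsDemo {
    static final long MAGIC = 7; // assumed magic constant

    // 1. Primitive long; the caller compares against MAGIC in the loop.
    static long getSimple(long[] data, int i) {
        return data[i];
    }

    // 2. Boxed Long; null means "skip this number".
    static Long getNullable(long[] data, int i) {
        if (data[i] == MAGIC) return null;
        return data[i];
    }

    // 3. Boxed Long wrapped in Optional; Optional.empty() means "skip".
    static Optional<Long> getOptional(long[] data, int i) {
        if (data[i] == MAGIC) return Optional.empty();
        return Optional.of(data[i]);
    }

    static long sumSimple(long[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            long v = getSimple(data, i);
            if (v != MAGIC) sum += v;
        }
        return sum;
    }

    static long sumNulls(long[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            Long v = getNullable(data, i);
            if (v != null) sum += v;
        }
        return sum;
    }

    static long sumOptional(long[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += getOptional(data, i).orElse(0L);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] data = {1, 7, 2, 7, 3};
        // All three variants skip the two 7s and print 6.
        System.out.println(sumSimple(data));
        System.out.println(sumNulls(data));
        System.out.println(sumOptional(data));
    }
}
```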

                                                                                                                      • alkonaut 2 years ago
                                                                                                                        And the only one that truly would make sense would of course be Optional<long>, i.e. the optional primitive long...

                                                                                                                        First having to declare the value in the one type of four that makes least sense, then praying that the compiler optimizes the allocation of not one but TWO(!) objects(!) in order to represent "maybe a number" is basically why I ragequit Java almost 20 years ago.

                                                                                                                        • kaba0 2 years ago
                                                                                                                          20 years ago there were no generics, so you couldn’t have implemented it that way. You could have written a class OptionalLong { long value; boolean isSet; } at the time and that would have only a single allocation overhead. Alternatively, have an array of longs and a boolean array marking which ones are set, with a trivial wrapper object over that for essentially zero overhead.

                                                                                                                          Java’s tradeoffs are maintainability in huge teams over multiple years with relatively fast performance even if you write your code very naively, with top notch tooling, observability, etc. In the rare case where you have to optimize a hot loop, you can allow yourself less readable code like the variants I mentioned.
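
                                                                                                                          The parallel-array idea could be sketched like this (OptionalLongArray and its methods are hypothetical names, not an existing library class): one long[] for the values, one boolean[] for presence, and no per-element allocation.

```java
// Hypothetical sketch of the parallel-array layout; all names are invented here.
final class OptionalLongArray {
    private final long[] values;     // payload, meaningful only where present[i] is true
    private final boolean[] present; // marks which slots hold a value

    OptionalLongArray(int size) {
        values = new long[size];
        present = new boolean[size];
    }

    void set(int i, long v) {
        values[i] = v;
        present[i] = true;
    }

    void clear(int i) {
        present[i] = false;
    }

    boolean isPresent(int i) {
        return present[i];
    }

    long getOrDefault(int i, long fallback) {
        return present[i] ? values[i] : fallback;
    }

    // Sums only the slots that are set; no boxing and no allocation in the loop.
    long sum() {
        long sum = 0;
        for (int i = 0; i < values.length; i++) {
            if (present[i]) sum += values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        OptionalLongArray a = new OptionalLongArray(3);
        a.set(0, 5);
        a.set(2, 10);
        System.out.println(a.sum()); // prints "15"
    }
}
```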

                                                                                                                    • layer8 2 years ago
                                                                                                                      Note that Java’s Optional was never intended to be a general-purpose Maybe type or general-purpose replacement for null, unlike Rust’s Option. As Brian Goetz explains in https://stackoverflow.com/a/26328555/623763:

                                                                                                                      > Our intention was to provide a limited mechanism for library method return types where there needed to be a clear way to represent "no result", and using null for such was overwhelmingly likely to cause errors.

                                                                                                                      > For example, you probably should never use it for something that returns an array of results, or a list of results; instead return an empty array or list. You should almost never use it as a field of something or a method parameter.

                                                                                                                      > I think routinely using it as a return value for getters would definitely be over-use.

                                                                                                                      • chriswarbo 2 years ago
                                                                                                                        > For example, you probably should never use it for something that returns an array of results, or a list of results; instead return an empty array or list

                                                                                                                        This applies to null/Maybe as well: both would violate the principle of least surprise (e.g. the AWS DynamoDB SDK has queries return an 'Array<Item>'; but this is 'null' if there are no matches!). It also complicates the domain model, creating two distinct forms of empty value ('None' versus 'Some(List())'; or '[]' versus 'null'), which may not have any semantic difference.
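As a rough illustration of the point above, here is a query method that reports "no matches" as an empty list rather than as null or an empty Optional. `ItemStore` and `findByPrefix` are made-up names for the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory store, only here to show the return-type choice.
final class ItemStore {
    private final List<String> items = new ArrayList<>();

    void add(String item) { items.add(item); }

    // Returns every item starting with the prefix.
    // No matches yields an empty list, never null and never Optional<List<...>>,
    // so callers have exactly one "empty" case to handle.
    List<String> findByPrefix(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String item : items) {
            if (item.startsWith(prefix)) matches.add(item);
        }
        return Collections.unmodifiableList(matches);
    }
}
```

Callers can then loop over the result unconditionally, e.g. `for (String s : store.findByPrefix("x")) { ... }`, without a null check.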

                                                                                                                        > You should almost never use it as a field of something

                                                                                                                        I agree, although it's often preferable to expose methods rather than fields anyway; in which case it's a return value, which seems OK.

                                                                                                                        > or a method parameter

                                                                                                                        Sure, that's what polymorphism/overloading is good for, e.g. instead of `foo(int arg1, Optional<String> arg2)` we can have separate `foo(int arg1, String arg2)` and `foo(int arg1)` definitions (where the latter will probably call the former with some default).
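A small sketch of that overloading pattern. The class name `Repeater`, the default value, and what `foo` actually does are all invented for illustration.

```java
// Hypothetical example: foo repeats arg2 arg1 times.
final class Repeater {
    // Assumed default, used when the caller omits the second argument.
    static final String DEFAULT_WORD = "-";

    // Full form: both arguments are required, neither is wrapped in Optional.
    static String foo(int arg1, String arg2) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < arg1; i++) {
            out.append(arg2);
        }
        return out.toString();
    }

    // Short form: delegates to the full form with the default.
    static String foo(int arg1) {
        return foo(arg1, DEFAULT_WORD);
    }
}
```

The call sites stay plain (`Repeater.foo(3)` or `Repeater.foo(2, "ab")`), and nobody has to write `Optional.empty()` just to say "use the default".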

                                                                                                                        > I think routinely using it as a return value for getters would definitely be over-use

                                                                                                                        I agree, since that would indicate our model is too weak, and missing some domain-relevant information. For example, if many of our 'Order' methods return optional results, there's probably a finer-grained distinction to be made, like 'PendingOrder', 'FulfilledOrder', etc. which don't need the optional qualifiers.
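One way this could look, as a sketch only: the shipping date lives on the fulfilled type, so it never needs to be optional. The fields and the `fulfil` method are assumptions added for the example.

```java
import java.time.LocalDate;

// Common surface shared by every order state.
interface Order {
    String id();
}

// A pending order has no shipping date at all, so there is nothing to make optional.
final class PendingOrder implements Order {
    private final String id;

    PendingOrder(String id) { this.id = id; }

    public String id() { return id; }

    // Moving to the next state produces a value of a different type.
    FulfilledOrder fulfil(LocalDate shippedOn) {
        return new FulfilledOrder(id, shippedOn);
    }
}

// A fulfilled order always has a shipping date, so the accessor returns it directly.
final class FulfilledOrder implements Order {
    private final String id;
    private final LocalDate shippedOn;

    FulfilledOrder(String id, LocalDate shippedOn) {
        this.id = id;
        this.shippedOn = shippedOn;
    }

    public String id() { return id; }

    LocalDate shippedOn() { return shippedOn; }
}
```

On newer Java versions the same idea is usually written with a sealed interface and records, which also lets the compiler check that every state is handled.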

                                                                                                                        (Personally I try to avoid the term "getter": APIs should make sense without reference to their underlying implementation; whether that happens to be "getting" a field, or calling out to some other methods/objects, etc. That's the point of encapsulation :) )

                                                                                                                        • paulddraper 2 years ago
                                                                                                                          > was never intended to be a general-purpose Maybe type

                                                                                                                          Say how a general-purpose Maybe type in Java would be different than Optional.

                                                                                                                          "We made a vehicle with an engine and four wheels but never intended it to be a car."

                                                                                                                          • layer8 2 years ago
                                                                                                                            You wouldn’t have both of:

                                                                                                                              Optional<T> a = null;
                                                                                                                              Optional<T> b = Optional.empty();
                                                                                                                            
                                                                                                                            In other words, Optional would have to be a non-nullable type. Which of course means that, to have it be a reference type, Java would have to support non-nullable reference types. But if Java did support those, you wouldn’t really need Optional in the first place, because then the current nullable types would fulfill that purpose.
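To make the "two empties" problem concrete, here is a sketch of a method that has to tell three states apart when it receives an `Optional`. `TwoEmpties.describe` is a made-up helper.

```java
import java.util.Optional;

final class TwoEmpties {
    // An Optional<String> parameter really has three states, not two:
    // a null reference, an empty Optional, and a present value.
    static String describe(Optional<String> o) {
        if (o == null) {
            return "null reference";
        }
        return o.isPresent() ? "present" : "empty";
    }
}
```

Without the explicit null check, calling `o.isPresent()` on a null reference throws a `NullPointerException`, which is exactly the failure Optional was meant to make less likely.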
                                                                                                                            • vanjajaja1 2 years ago
                                                                                                                              The leaky null abstraction in java is orthogonal to "what does an Optional type express".
                                                                                                                        • dgb23 2 years ago
                                                                                                                          It’s interesting to see how different languages deal with nulls and similar constructs.

                                                                                                                          Some languages, like TS, PHP or Kotlin, have proper unions that you just handle with branching.

                                                                                                                          Rust lets you pattern match against a construct that either holds a value or doesn’t. Option is an actual thing there that you need to unpack in order to look inside.

                                                                                                                          In Clojure nils are everywhere. They tell you that “you’re done” in an eager recursion, or that a map doesn’t have something etc. Many functions return something or nil, and depending on what you’re doing you care about the value vs the logical implication.

                                                                                                                          nils flow naturally through your program and it’s not something you are worried about, as many functions do nil punning. Well, as long as you don’t directly deal with Java - then you have to be more careful.

                                                                                                                          • orthoxerox 2 years ago
                                                                                                                            With void-returning methods and null-punning you can end up skipping critical side effects, which are more common in idiomatic Java than in idiomatic Clojure. C# tries to make this explicit for the receiver with its null-conditional operator `?.` (often nicknamed the Elvis operator) and nullable reference types.

                                                                                                                                foo?.Bar(baz);
                                                                                                                            
                                                                                                                            and

                                                                                                                               var result = foo?.Bar(baz);
                                                                                                                            
                                                                                                                            both do what's expected: skip the method call and return null (or void) if the receiver is null, and the compiler complains if you don't do that when foo is inferred to be nullable.
                                                                                                                          • chrismorgan 2 years ago
                                                                                                                            (2021)

                                                                                                                            Discussion at publication time: https://news.ycombinator.com/item?id=28887908

                                                                                                                            • gjadi 2 years ago
                                                                                                                              Interesting.

                                                                                                                              This article resonates with me, along with Casey Muratori’s recent articles about non-pessimization:

                                                                                                                              Within the realm of the module, don't write pessimized code (avoid boxing), _but_ that doesn't prevent you from providing a safe API. E.g. the result of the loop could be wrapped if that made sense.
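A minimal sketch of that split, assuming a made-up `sumOfPositives` function: the loop works on primitives only, and the single wrapper is created once at the API boundary.

```java
import java.util.OptionalLong;

final class Summing {
    // Public boundary: "no positive values" is explicit in the return type.
    static OptionalLong sumOfPositives(long[] values) {
        // Hot loop: primitives only, no boxing and no per-element allocation.
        long sum = 0;
        boolean found = false;
        for (long v : values) {
            if (v > 0) {
                sum += v;
                found = true;
            }
        }
        // Wrap exactly once, after the loop has finished.
        return found ? OptionalLong.of(sum) : OptionalLong.empty();
    }
}
```

The caller still gets a safe API (`sumOfPositives(xs).orElse(0L)`), while the cost of the wrapper is paid once per call instead of once per element.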