Three Pillars of Reproducible Builds
82 points by spatten 3 years ago | 14 comments
- FartyMcFarter 3 years ago
One of the most fun non-determinism bugs I have worked on was the result of using an associative container with the key type being a pointer (like a std::map<void*, int> or similar), and then iterating over this container.
Since the order and values of dynamically allocated pointers are non-deterministic, this resulted in diverging behaviour at some point.
Better be sure that all your tools used during the build don't do this kind of thing as well.
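A minimal sketch of the pointer-key problem described above, and one common way around it. The `Node` type, its `id` field, and both function names are hypothetical and only for illustration; the assumption is that each object carries a stable identifier derived from the build inputs rather than from the allocator.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical record type. `id` is assumed to come from the build inputs
// (for example a position in a sorted file list), never from the allocator,
// and is assumed to be unique per node.
struct Node {
    std::uint32_t id;
    std::string name;
};

// Fragile: std::map orders its keys, so this walks the nodes in address
// order, which depends on where the allocator happened to place them.
std::vector<std::string> names_in_address_order(
        const std::map<const Node*, int>& counts) {
    std::vector<std::string> out;
    for (const auto& entry : counts) {
        out.push_back(entry.first->name);
    }
    return out;
}

// Stable: re-key on the id before iterating, so the output order depends
// only on the ids and not on pointer values or insertion order.
std::vector<std::string> names_in_id_order(
        const std::vector<const Node*>& nodes) {
    std::map<std::uint32_t, const Node*> by_id;
    for (const Node* node : nodes) {
        assert(node != nullptr);
        by_id.emplace(node->id, node);
    }
    std::vector<std::string> out;
    for (const auto& entry : by_id) {
        out.push_back(entry.second->name);
    }
    return out;
}
```

Looking things up through a pointer-keyed map is harmless. The trouble only starts once the iteration order leaks into the output, for example into the order of symbols or sections in a generated file.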
- aidenn0 3 years ago
With ASLR off, the order and value should be identical between runs on the same malloc implementation, as stochastic allocators are not in common use.
- FartyMcFarter 3 years ago
Not when multi-threading is involved, I would think. That or timing-dependent code making allocations.
- aidenn0 3 years ago
That's true.
- pabs3 3 years ago
These three aren't enough, you also need to take care of not storing build timestamps, hostnames, timezones, sorting and more:
- chriswarbo 3 years ago
Some of that is mentioned, e.g.
> Build steps that use system time to generate timestamps.
> Builds that change behavior based on currently set environment variables but don’t commit environment variable configurations.
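For the timestamp and timezone points, one widely used convention (not mentioned in this thread, so treat it as an assumption here) is the `SOURCE_DATE_EPOCH` environment variable from the reproducible-builds.org project: if it is set, tools embed that value instead of the current time. A rough C++ sketch, with hypothetical function names:

```cpp
#include <cstdlib>
#include <ctime>
#include <string>

// Returns the timestamp to embed in build output. `source_date_epoch` is the
// raw value of the SOURCE_DATE_EPOCH environment variable (may be null).
// Falls back to the wall clock only when no usable value is supplied.
std::time_t build_timestamp(const char* source_date_epoch) {
    if (source_date_epoch != nullptr && *source_date_epoch != '\0') {
        char* end = nullptr;
        const long long value = std::strtoll(source_date_epoch, &end, 10);
        // Accept only a complete, non-negative decimal number.
        if (end != source_date_epoch && *end == '\0' && value >= 0) {
            return static_cast<std::time_t>(value);
        }
    }
    return std::time(nullptr);
}

// Formats in UTC so the result does not depend on the build machine's
// configured timezone.
std::string format_utc(std::time_t t) {
    char buf[32];
    const std::tm* utc = std::gmtime(&t);
    if (utc == nullptr) {
        return std::string();
    }
    const std::size_t n =
        std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", utc);
    return std::string(buf, n);
}
```

A build step would call `format_utc(build_timestamp(std::getenv("SOURCE_DATE_EPOCH")))`. Passing the value in as a parameter, rather than reading the environment inside the function, keeps the environment dependency visible at the call site.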
- jiehong 3 years ago
On the JVM, maven doesn’t make this particularly easy.
It’s possible to try to store dependencies locally instead of shared in a global m2 repository, but it’s difficult to stop maven from adding the current time in jars or wars…
It’s as if all the default settings are the opposite of what they should be for reproducible builds.
Any idea if there is a project to try to improve things with maven or with another JVM tool? (Gradle, sbt, etc.)
- mchmarny 3 years ago
If you have an option to containerize the app, Jib may be what you are looking for. Plugs into Maven, and the same source/content always generates the same image - https://github.com/GoogleContainerTools/jib
- donmcronald 3 years ago
And this is the best explanation of Jib [1], but it’s hard to find via Google. It’s how all builds for every ecosystem should work IMO.
- chriswarbo 3 years ago
> Any idea if there is a project to try to improve things with maven or with another JVM tool? (Gradle, sbt, etc.)
We've found SBT to be less reproducible than Maven. In particular, its "configuration file" (build.sbt) is actually executable Scala code (and highly imperative too, e.g. appending to mutable dependency lists). I've seen projects which choose different dependencies based on env var settings, string matches, etc.
I've also seen projects which add pre/post steps to a test suite, for spinning up and tearing down a mock database (the dynamodb-local SBT plugin). The crazy part is that SBT only becomes aware of the plugin when it's about to execute the test suite; hence it doesn't appear in any dependency lists, so we can't automatically fetch it ahead-of-time. By the way, that plugin itself works by downloading and running a "latest.zip" file from an AWS URL....
- robto 3 years ago
Huawei just published a paper (Towards Build Verifiability for Java-based Systems[0]) on trying to get the JVM ecosystem reproducible. It looks like it's early days, but I'm paying attention.
- zzandd 3 years ago
https://reproducible-builds.org/docs/jvm/ Which links to https://maven.apache.org/guides/mini/guide-reproducible-buil...
Haven't tried this myself as I don't particularly like maven. It should be possible though
- cies 3 years ago
How can you discuss this w/o mentioning Nix (or the likes)?
- _3u10 3 years ago
I guess any stubs the compiler adds will also have to be reproducible, big whoop.