Silkenweb Example: Hackernews Clone

Homomorphically Encrypting CRDTs

277 points by jakelazaroff 3 weeks ago | 74 comments

plopilop 3 weeks ago
As the article mentions, fully homomorphic encryption is insanely slow and inefficient. But I have to say that it is a relatively new field (the first FHE scheme was discovered in 2009), and that the field has immensely progressed over the last decade and a half.
The first FHE scheme required keys of several TB/PB, bootstrapping (an operation that is pivotal in FHE schemes, when too many multiplications are computed) would take thousands of hours. We are now down to keys of "only" 30 MB, and bootstrapping in less than 0.1 second.
Hopefully progress will continue and FHE will become more practical.
- gritzko 3 weeks ago
  The first CRDTs have been remarkably impractical, e.g. WOOT[0]. These days, state-of-the-art CRDT databases are not much different from your regular LSM performance-wise. For example, RDX CRDTs[1,2] are all implemented by a merge-sort-like algorithm, pure O(N). Metadata overheads have been tamed in most implementations.
  [0]: https://github.com/el10savio/woot-crdt
  [1]: https://github.com/gritzko/librdx
  [2]: https://github.com/gritzko/go-rdx
  - sgarland 3 weeks ago
    Do you have benchmarks at scale, ideally compared to other store / DB implementations? I’ve seen other CRDT libraries using Postgres (inadvisedly) bring it to its knees due to the massive amount of chattiness and updates.
    - gritzko 3 weeks ago
      RDX works as a merge operator inside virtually any LSM database. It is strictly O(N), so almost every LSM db benchmark would be relevant here.
      There is an RDX-specific LSM engine BRIX, but that one is good for its Merkle properties.
      Chotki is basically Pebble.
  - meindnoch 3 weeks ago
    >For example, RDX CRDTs
    No affiliation, right?
    - gritzko 3 weeks ago
      I am the author of RDX. I let others brag about thier implementations.
- westurner 3 weeks ago
  Should students trust and run FHE encrypted WASM or JS grading code that contains the answers on their own Chromebooks; for example with JupyterLite and ottergrader?
  On code signing and the SETI@home screensaver
  - westurner 3 weeks ago
    This is at -3! Perhaps I wasn't clear enough:
    Code signing is important when allowing use of computational resources for [FHE] encryption, because there is no good way to trace execution of so obfuscated code.
- 6r17 3 weeks ago
  CRDTs are also crazy slow due to their architecture ; even the best alg out there are costly by design ; so adding homomorphic encryption is even more of a challenge ; tough it really is impressing I'm curious if this can be usable at all;
  edit so i bring some "proof" of my claim: from this very page : `To calculate the new map, the server must go through and merge every single key. After that, it needs to transfer the full map to each peer — because remember, as far as it knows, the entire map is different.`
  - jason_oster 3 weeks ago
    CRDTs are not inherently “crazy slow”. Researchers just don’t succumb to the appeal of premature optimization.
    See: https://josephg.com/blog/crdts-go-brrr/
    (And even these optimizations are nascent. It can still get so much better.)
    The section you quoted describes an effect of homomorphic encryption alone.
    There is the problem that both CRDTs and encryption add some overhead, and the overhead is additive when use together. But I can’t tell if that is the point you are trying to make.
    - josephg 3 weeks ago
      Yep. Author here - that article is out of date now. I should really do a followup. Performance of CRDTs has improved again through a new grab bag of tricks. I’ve also been told the beta of automerge 3 uses a lot of the optimisations in that post, and it’s now much faster as a result.
      A crdt library should be able to handle millions of changes per second. If it’s the bottleneck, something somewhere has gone wrong.
    - hansvm 3 weeks ago
      > additive
      The overhead is usually multiplicative per-item. Let's say you're doing N things. CRDTs make that O(Nk) for some scaling factor k, and adding encryption makes it O(Nkj) for some scaling factor j.
      Give or take some multiplicative log (or worse) factors depending on the implementation.
  - motorest 3 weeks ago
    > CRDTs are also crazy slow due to their architecture ;
    You must back up your extraordinary claim with some extraordinary evidence. There is nothing inherently slow in CRDTs.
    Also, applying changes is hardly on anyone's hot path.
    The only instance where I saw anyone complaining about CRDT performance, it turned out to be from very naive implementations that tried to spam changes with overly chatty implementations. If you come up with any code that requires a full HTTPS connection to send a single character down the wire, the problem is not the algorithm.
  - __MatrixMan__ 3 weeks ago
    Is it the CRDT that's slow there, or is the problem that they've made it one party's job to update everybody?
    By having a server in the mix it feels like we're forcing a hub/spoke model on something that wants to be a partial mesh. Not surprising that the hub is stressed out.
    - svieira 3 weeks ago
      The whole point of Conflict-free Replicated Data Types is that you don't need an authoritative server. You're thinking of Operational Transform which does require an authority.
  - Asraelite 3 weeks ago
    > CRDTs are also crazy slow due to their architecture
    What kinds of CRDTs specifically are you referring to? On its own this statement sounds far too broad to be meaningful. It's like saying "nested for loops are crazy slow".
  - MangoToupe 3 weeks ago
    > CRDTs are also crazy slow
    compared to what? c'mon
qualeed 3 weeks ago
It doesn't actually say it anywhere, so: CRDT = Conflict-free replicated data type.
https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...
- Xeoncross 3 weeks ago
  e.g. commonly used by applications that need to deal with multiple people editing a document.
  - MangoToupe 3 weeks ago
    ...or, really, any kind of conflict-free syncing.
teleforce 3 weeks ago
>Runtime performance is also — to put it lightly — lacking. I benchmarked the unencrypted and encrypted versions of the last write wins register on an M4 MacBook Pro. The unencrypted one averaged a merge time of 0.52 nanoseconds.The encrypted one? 1.06 seconds. That’s not a typo: the homomorphically encrypted merge is two billion times slower.
Ouch!
- sieabahlpark 3 weeks ago
  [dead]
NetRunnerSu 3 weeks ago
FHE is indeed slow, but the progress since 2009 is truly remarkable. Bootstrapping speed alone improved by tens of millions of times, and tfhe-rs already demonstrates homomorphic AES-128. Real-time FHE for AI inference/training feels increasingly plausible.
> https://github.com/sharkbot1/tfhe-aes-128
- ProofHouse 3 weeks ago
  Can I double click on ‘plausible’ ?
  - ketralnis 3 weeks ago
    Sure but it will just select the word in your browser
  - throitallaway 3 weeks ago
    Someone's a Chamath fan.
Joker_vD 3 weeks ago
> Now, however, the server can no longer understand the changes you send. If you want to see your friend’s latest changes, you’ll need to both be online at the same time.
What? No, the server sends you the changes you've not seen yet, you decrypt and merge them, and so you get the latest version of the document. Right?
The homomorphic encryption is a fascinating topic, but it's almost never an answer if you need anything resembling reasonable performance and/or reasonable bandwidth.
I've seen a paper that ingeniuously uses homomorphic encryption to implement arbitrary algorithmic computations, totally secret, by encoding a (custom-crafted) CPU together with RAM and then running "tick a clock" algorithm on them. And it works, so you can borrow some AWS huge instance and run you super-important calculations there — at 1 Hz. I am not kidding, it's literally 1 virtual CPU instruction per second. Well, if you are okay with such speed and costs, you either have very small data — at which point just run your computation locally, or you're really, really rich — at which point just buy your own goddamn hardware and, again, run it locally.
- killerstorm 3 weeks ago
  It's very common for CS and cryptography-adjacent papers to describe something impractical. Even more impractical than what you described - e.g. complexity of an attack is reduced from 2^250 to 2^230.
  The purpose of these papers is to map out what's possible, etc, which might at some point help with actual R&D.
- zawaideh 3 weeks ago
  If the server can't operate on the content, it can't merge it into the CRDT documents. Which means it would need to sending and receiving the entire state of the CRDT with each change.
  If the friend is online then sending operations is possible, because they can be decrypted and merged.
  - ath92 3 weeks ago
    Generally, this is not really true. The point of CRDTs is that as long as all parties receive all messages (in any order), they should be able to recreate the same state.
    So instead of merging changes on the server, all you need is some way of knowing which messages you haven’t received yet. Importantly this does not require the server to be able to actually read those messages. All it needs is some metadata (basically just an id per message), and when reconnecting, it needs to send all the not-yet-received messages to the client, so it’s probably useful to keep track of which client has received which messages, to prevent having to figure that out every time a client connects.
    - josephg 3 weeks ago
      You’re talking past each other. These are both valid descriptions of CRDTs - just different types of CRDTs.
      Generally there’s two categories of CRDTs: state based and operation based CRDTs.
      State based CRDTs are like a variable with is set to a new value each time it changes. (Think couchdb if you’ve used it). In that case, yes, you generally do update the whole value each time.
      Operation based CRDTs - used in things like text editing - are more complex, but like the parent said, deal with editing events. So long as a peer eventually gets all the events, they can merge them together into the resulting document state. CRDTs have a correctness criteria that the same set of operations always merges into the same document, on all peers, regardless of the order you get the messages.
      Anyway, I think the parent comment is right here. If you want efficient E2E encryption, using an operation based crdt is probably a better choice.
    - charcircuit 3 weeks ago
      If it takes 1 seconds per merge as per the article it sounds like a poor user experience for when new people join they have to wait hundreds or thousands of seconds to get to the doc.
  - Joker_vD 3 weeks ago
    I... still can't make heads or tails out of this description. Let me restate how I understand the scheme in TFA: there are two people, editing on the same document using CRDTs. When one person makes an edit, they push an encrypted CRDT to the sync server. Periodically, each of them pulls edits made by the other from the sync server, apply them to their own copy, and push the (encrypted) result back. Because of CRDT's properties, they both end up with the same document.
    This scheme doesn't require them two people to be on-line simultaneously — all updates are mediated via the sync server, after all. So, where am I wrong?
    - eightys3v3n 3 weeks ago
      I think the difference in understanding is that the article implies, as I understand it, that the server is applying the changes to the document when it receives a change message, not the clients. If the clients were applying the changes then we don't need Homomorphic encryption in the first place. The server would just store a log of all changes; cleaning it up once it was sure everyone played the changes if that is possible. Without Homomorphic encryption, the server must store all changes since some full snapshot and a full snapshot of the document. Where as with it, the server only ever stores the most recent copy of the document.
      This could be done to reduce the time required for a client to catch up once it comes online (because it would need to replay all changes that have happened since it last connected to achieve the conflict free modification). But the article also mentions something about keeping the latest version quickly accessible.
    - 3 weeks ago
    - crdrost 3 weeks ago
      So, there is a reason that CRDT researchers would not like this response that you have given, but down-thread from you it's not why the author jakelazaroff didn't like it, but it's worth giving this answer too.
      The reason CRDT researchers don't like the sync server is, that's the very thing that CRDTs are meant to solve. CRDTs are a building-block for theoretically-correct eventual consistency: that's the goal. Which means our one source-of-truth now exists in N replicas, those replicas are getting updated separately, and now: why choose eventual consistency rather than strong consistency? You always want strong consistency if you can get it, but eventually, the cost of syncing the replicas is too high.
      So now we have a sync server like you planned? Well, if we're at the scale where CRDTs make sense then presumably we have data races. Let's assume Alice and Bob both read from the sync server and it's a (synchronous, unencrypted!) last-write-wins register, both Alice and Bob pull down "v1" and Alice writes "v1a" to the register and Bob in parallel writes "v1b" as Alice disconnects and Bob wins because he happens to have the higher user-ID. Sync server acknowledged Alice's write but it got lost until she next comes online. OK so new solution, we need a compare-and-swap register, we need Bob to try to write to the server and get rejected. Well, except in the contention regime that we're anticipating, this means that we're running your sync server as a single-point-of-failure strong consistency node, and we're accepting the occasional loss of availability (CAP theorem) when we can't reach the server.
      Even worse, such a sync server _forces_ you into strong consistency even if you're like "well the replicas can lose connection to the sync server and I'll still let them do stuff, I'll just put up a warning sign that says they're not synced yet." Why? Because they use the sync server as if it is one monolithic thing, but under contention we have to internally scale the sync server to contain multiple replicas so that we can survive crashes etc. ... if the stuff happening inside the sync server is not linearizable (aka strongly consistent) then external systems cannot pretend it is one monolithic thing!
      So it's like, the sync server is basically a sort of GitHub, right? It's operating at a massive scale and so internally it presumably needs to have many Git-clones of the data so that if the primary replica goes down then we can still serve your repo to you and merge a pull request and whatever else. But then it absolutely sucks to merge a PR and find out that afterwards, it's not merged, so you go into panic mode and try to fix things, only for 5 minutes later to discover that the PR is now merged. And if you've got a really active eventually consistent CRDT system that has a lot of buggy potential.
      For the CRDT researcher the idea of "we'll solve this all with a sync server" is a misunderstanding that takes you out of eventual-consistency-land. The CRDT equivalent that lacks this misunderstanding is, "a quorum of nodes will always remain online (or at least will eventually sync up) to make sure that everything eventually gets shared," and your "sync server" is actually just another replica that happens to remain online, but isn't doing anything fundamentally different from any of the other peers in the swarm.
  - blamestross 3 weeks ago
    > Which means it would need to sending and receiving the entire state of the CRDT with each change. > If the friend is online then sending operations is possible, because they can be decrypted and merged.
    Or the user's client can flatten un-acked changes and tell the server to store that instead.
    It can just allways flatten until it hears back from a peer.
    The entire scenario is over-contrived. I wish they had just shown it off instead of making the lie of a justification.
  - clawlor 3 weeks ago
    There are variants of CRDTs where each change is only a state delta, or each change is described in terms of operations performed, which don't require sending the entire state for each change.
hoppp 3 weeks ago
Don't use THFE-rs because Zama requires a commercial license for anything other than prototyping. Their license sucks.
Instead use openFHE (C++) or lattigo (golang) libraries which are both free to use commercially.
somezero 3 weeks ago
FHE is simply the wrong tool here. FHE is for a central server operating on data held/known by another. They want MPC -multiple parties jointly computing on distributed data- and that’s fairly more efficient.
- hoppp 3 weeks ago
  I agree. I also often want to use FHE only to realize there are better tools for the job and better ways to do things.
  The use-case is pretty niche
yusina 3 weeks ago
I'm not sure I get the premise of the article. I know what a CRDT is and how homomorphic encryption works. But why do both parties have to be online at the same time to sync? They could send the updates asynchronously, store-and-forward style, and everything in flight is encrypted. Why does this need server that keeps state (which is kept in encrypted state and modified, as per the proposal)?
- noam_k 3 weeks ago
  I think the issue here is that the server would have to store a copy of the register per peer, as it can't calculate which one is the most recent. Using FHE allows the server to hold a single copy.
  In other words the server could forward and not store if all parties are always online (at the same time).
  - neon_me 3 weeks ago
    Server will store encrypted blob and its hash/etag.
    Client before upload of data, check for hash/etag of blob he originally fetched. If blob on server has different one, it will download it, decrypt, patch new data on existing one, encrypt and reupload.
    Whats the catch?
    AES is hardware accelerated on the most devices - so with all the ops it will be significantly faster than any homomorphic enc nowadays.
    - nixpulvis 3 weeks ago
      I too was wondering the same thing. FHE is cool tech, but this seems to me to be a bad application of it since it will undoubtedly be less efficient.
      FHE is useful when trying to compute on data from various sources who all mutually want to keep some information secret. For example, Apple's use of FHE to categorize photos [1]. In this case all the server is really doing is "compressing" for lack of a better word, the change sets, so each offline client doesn't need to sync every message since they are already merged by the server.
      If all you want is to keep a synchronizing server in the dark, but all clients can be trusted with the unencrypted data, traditional key exchange and symmetric encryption should suffice.
      [1]: https://machinelearning.apple.com/research/homomorphic-encry...
  - yusina 3 weeks ago
    So it's "just" a storage optimization?
meindnoch 3 weeks ago
I like it! The slowness and inefficiency of homomorphic encryption is nicely complemented by the storage bloat of CRDTs.
3 weeks ago
DPDmancul 3 weeks ago
Theoretically only one operation is sufficient: both NAND and NOR are universal gates, meaning all boolean operations are expressible in terms of only NANDs (or equivalently only NORs). The downside is that the expressions became longer.
- zackmorris 3 weeks ago
  You shouldn't have been downvoted.
  In chip design, boolean logic circuits are often reduced to their simplest NAND/NOR representation, optimized, then translated to RTL, DTL or TTL as explained below:
  https://en.wikipedia.org/wiki/NAND_logic
  https://en.wikipedia.org/wiki/NOR_logic
  https://en.wikipedia.org/wiki/De_Morgan%27s_laws
  https://en.wikipedia.org/wiki/Truth_table
  https://en.wikipedia.org/wiki/Resistor–transistor_logic
  https://en.wikipedia.org/wiki/Diode–transistor_logic
  https://en.wikipedia.org/wiki/Transistor–transistor_logic
  https://www.etechnog.com/2021/12/comparison-between-rtl-dtl-...
  https://hackaday.com/2015/06/30/gates-to-fpgas-ttl-electrica...
  I studied this heavily in my electrical and computer engineering (ECE) courses in the late 1990s, and wish that more programmers were exposed to it.
  On that note, I never had a single spreadsheet course. Which is regrettable, as that would have tied functional programming to logic circuits. Then it's a small step to understand mutability, then monads (which are more difficult), how they negatively impact determinism, and how that greatly increases the mental load of tracking state in imperative programming. Leading to why futures, promises, deferreds and async programming in general should be avoided.
  Unfortunately these insights are all but unknown today, which left us with the over-engineered soup of web languages and frameworks that bury us in mountains of unnecessary complexity. Which we tried to get away from when declarative and idempotent HTML/HTTP metaphors were invented late in the desktop computing era, when C++ and object-oriented programming dominated tech.
mihau 3 weeks ago
Sorry for going off-topic, but kudos for UI/UX on your blog!
To name a few: Nice style (colors, font, style), "footnotes" visible on the margin, always on table of contents, interactivity and link previews on hover.
Nice. What's your tech stack?
- jakelazaroff 3 weeks ago
  No worries and thank you! The tech stack is really just Astro with content stored in Markdown files.
  The link previews on hover are powered by Tippy.js [1] (maybe deprecated by now?). I wrote a script to collect all the links from my Markdown files, scrape the open graph tags and commit the data to the repo as JSON files.
  The interactive demos are web components! I wrote a bit about how I built them [2] but maybe should do a more in-depth blog post at some point.
  [1] https://atomiks.github.io/tippyjs/
  [2] https://jakelazaroff.com/words/web-components-will-outlive-y...
xvector 3 weeks ago
> keep_self.select(&self.peer, &other.peer);
> Rather than if or match expressions, we use FheBool’s select method. It returns the first argument if the underlying FheBool value is true, or the second argument if the underlying value is false.
This wasn't super clear to me. I get the computation is performed on encrypted values, but how does ".select" actually work?
Is ".select" just evaluating to "keep_self*self.peer + keep_self*other.peer"?
GRBurst 3 weeks ago
Following the topic of FHE for quite some time and I am always hoping that we get some breakthroughs regarding speed. Seeing how this can be applied to CRDTs another awesome topic, is really cool, even if it still more theory than practice :-D
thrance 3 weeks ago
Another cool use of FHE, allowing to browse wikipedia privately while still online: https://news.ycombinator.com/item?id=31668814
DinoNuggies45 3 weeks ago
This is wild. I’ve seen CRDTs pop up in collaborative editors, but hadn’t considered them alongside FHE before. What kind of performance hit are you seeing in practice?
chrisweekly 3 weeks ago
For the uninitiated:
"In the context of encryption, 'homomorphic' describes methods that enable computations on encrypted data without needing to decrypt it first.
scyclow 3 weeks ago
Encrypt the diffs for the server and write the hash to a blockchain to manage the ordering. Boom, problem solved without HME.
- ergl 3 weeks ago
  You don't need ordering guarantees for diffs: apply them out of order and the final result should be the same (that's one of the key properties of CRDTs).
  - scyclow 3 weeks ago
    So why do you need timestamps? Or, why do you even need a third party server to run HME to begin with? Why not just encrypt the data and let the client figure it out?
867-5309 3 weeks ago
Conflict-Free Replicated Data Type
drob518 3 weeks ago
Love the presentation.
jdefr89 3 weeks ago
“We introduced a new abstraction for no reason! Take a look!!”
kristel100 3 weeks ago
[dead]