Decentralized Syndication – The Missing Internet Protocol

142 points by brisky 5 months ago | 94 comments
  • glenstein 5 months ago
    While everyone is waiting for Atproto to proto, ActivityPub is already here. This is giving me "Sumerians look on in confusion as god creates world" vibes.

    https://theonion.com/sumerians-look-on-in-confusion-as-god-c...

    • echelon 5 months ago
      These are still too centralized. The protocol should look more like BitTorrent.

      - You don't need domain names for identity. Signatures are enough. An optional extension could contain emails and social handles in the payload if desired.

      - You don't need terabytes of storage. All content can be ephemeral. Nodes can have different retention policies, and third party archival services and client-side behavior can provide durable storage, bookmarking/favoriting, etc.

      - The protocols should be P2P-first rather than federated. This prevents centralization and rule by federated cabal. Users can choose their own filtering, clustering, and prioritization.

      • viraptor 5 months ago
        > Nodes can have different retention policies, and third party archival services and client-side behavior can provide durable storage, bookmarking/favoriting, etc.

        That's completely achievable in AP. Most current servers use reasonable retention, extended for boosted posts.

        • MichaelZuo 5 months ago
          Then it is a bit strange why it wasn’t designed to be ‘BitTorrent-like’ from the beginning as the parent suggests.
        • immibis 5 months ago
          There's no known way to make this work well yet, but feel free to invent that. Until that happens, federated is mostly the best we have, because most people don't want to be responsible for their own servers.

          P.S. ActivityPub is a euphemism for Mastodon's protocol, which isn't just ActivityPub.

          • RobotToaster 5 months ago
            Isn't this ipfs?
            • FireInsight 5 months ago
              Isn't this Nostr?
              • thwarted 5 months ago
                Isn't this nntp?
            • remram 5 months ago
              I would love to have an RSS interface where I can republish articles to a number of my own feeds (selectively or automatically). Then I can follow some of my friends' republished feeds.

              I feel like the "one feed" approach of most social platforms is not here to benefit users but to encourage doom-scrolling with FOMO. It would be a lot harder for them to get so much of users' time and tolerance for ads if it were actually organized. But it seems to me that there might not be that much work needed to turn an RSS reader into a very productive social platform for sharing news and articles.

              • James_K 5 months ago
                This interface already exists. It's called RSS. Simply make a feed titled "reposts" and add entries linking to other websites. I already have such a thing on my own website with the precise hope that others will copy it.
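
                A minimal sketch of such a "reposts" feed, built with Python's standard library. The function name, the description text, and the example URLs are illustrative assumptions, not anything taken from the commenter's site.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_reposts_feed(site_url, reposts):
    """Build a minimal RSS 2.0 feed whose items link to other websites.

    `reposts` is a list of dicts with "title", "url" and a timezone-aware
    "shared_at" datetime (a hypothetical shape chosen for this sketch).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "reposts"
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Articles I am resharing"
    for repost in reposts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = repost["title"]
        # The item link points at the original article, not at this site.
        ET.SubElement(item, "link").text = repost["url"]
        ET.SubElement(item, "guid", isPermaLink="true").text = repost["url"]
        ET.SubElement(item, "pubDate").text = format_datetime(repost["shared_at"])
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_reposts_feed(
    "https://example.org",
    [{
        "title": "An article worth sharing",
        "url": "https://example.com/article",
        "shared_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
    }],
)
print(feed_xml)
```

                Any feed reader that understands RSS 2.0 should be able to subscribe to the result like any other feed.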
                • remram 5 months ago
                  At some level yes, but I would like to be able to de-duplicate if multiple people/feeds repost the same article, and it would need a lot more on the discovery side (so I can find friends-of-friends, more feeds from the same friend I follow, etc). Like a web-of-trust type of construct, which I see as necessary with the accelerating rise of bots on all platforms.
                  • James_K 5 months ago
                    Deduping can be done on the reader end. As for a web of trust, you can put a friends list on your website.
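
                    Reader-side deduplication could be sketched roughly as follows. The normalization rules (lowercase scheme and host, drop the fragment and trailing slash) and the feed data shape are assumptions made for illustration; a real reader would likely need more careful URL canonicalization.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Normalize a link so trivially different reposts compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    # Lowercase scheme and host, drop the fragment, keep the query string.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def dedupe_reposts(feeds):
    """Merge entries from several feeds, keeping one row per article.

    `feeds` maps a feed name to a list of (url, title) pairs.
    Each returned row records which feeds shared the article.
    """
    merged = {}
    for feed_name, entries in feeds.items():
        for url, title in entries:
            key = normalize_url(url)
            row = merged.setdefault(key, {"url": key, "title": title, "shared_by": []})
            if feed_name not in row["shared_by"]:
                row["shared_by"].append(feed_name)
    return list(merged.values())


rows = dedupe_reposts({
    "alice": [("https://Example.com/post/", "A post")],
    "bob": [("https://example.com/post#comments", "A post (via Bob)")],
})
print(rows)
```

                    Keeping the list of sharers per article also gives the reader a simple signal for ranking: an article reposted by several friends appears once, with all of their names attached.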
                • fabrice_d 5 months ago
                  That looks close to custom feeds in the ATProto / BlueSky world.
                  • edhelas 5 months ago
                    XMPP XEP-0060 Pubsub is doing that :)

                    I wrote a specific XEP for the social part https://xmpp.org/extensions/xep-0472.html

                    And it's implemented in Movim https://movim.eu/

                    • AndrewDucker 5 months ago
                      This is pretty-much exactly what I use Pinboard for.
                    • openrisk 5 months ago
                      It's not obvious to me that what is missing here is another technical protocol rather than more effective 'social protocols'. If you haven't noticed, the major issues of today are not the scaling of message passing per-se but the moderation of content and violations of the boundary between public and private. These issues are socially defined and cannot be delegated to (possibly algorithmic) protocols.

                      In other words what is missing is rules, regulations and incentives that are adapted to the way people use the digital domain and enforce the decentralized exchange of digital information to stay within a consensus "desired" envelope.

                      Providing capabilities in code and network design is of course a great enabler, but drifting into technosolutionism of the bitcoin type is a dead end. Society is not a static user of technical protocols. If left without matching social protocols any technical protocol will be exploited and fail.

                      The example of abusive hyperscale social media should be a warning: they emerged as a behavior, they were not specified anywhere in the underlying web design. Facebook is just one website after all. Tim Berners-Lee probably did not anticipate that one endpoint would successfully fake being the entire universe.

                      The deeper question is, do we want the shape of digital networks to reflect the observed concentration of real current social and economic networks or do we want to use the leverage of this new technology to shape things in a different (hopefully better) direction?

                      The mess we are in today is not so much failure of technology as it is digital illiteracy, from the casual user all the way to the most influential legal and political roles.

                        • miohtama 5 months ago
                          > The deeper question is, do we want the shape of digital networks to reflect the observed concentration of real current social and economic networks or do we want to use the leverage of this new technology to shape things in a different (hopefully better) direction?

                        Here is a book on the topic - Compliance Industrial Complex;

                        https://www.amazon.com/Compliance-Industrial-Complex-Operati...

                        It's about anti-policies (anti hate, anti money laundering, etc.), securitization of governance (private companies create and enforce what should be law) and pre-crime, using technology to do this instead of addressing underlying social problems.

                          • pessimizer 5 months ago
                            > If you haven't noticed, the major issues of today are not the scaling of message passing per-se but the moderation of content and violations of the boundary between public and private.

                          Are those the major issues of today? Those are the major issues for censors, not for communicators.

                          • pluto_modadic 5 months ago
                            yes, moderation is an issue that doesn't scale. therefore, many technologists ignore it in favor of "oh, fancy serverless architecture". priority should be on building moderation and tools like reply controls (e.g. only mutuals), shared inboxes (for friends to assist cleaning out hate mail), mod appeals and the like. It's a thorny issue that involves /listening/ to community organizers, who go through pains with poorly written software to try to keep a community civil.
                            • openrisk 5 months ago
                              Are spammers and scammers "communicators"? How about organized misinformation campaigns? In what kind of deeply sick ideological la-la-land is any kind of control of information flow "censorship"?
                          • nunobrito 5 months ago
                            NOSTR has solved most of these topics in a simple way. Anyone can generate a private/public key without emails or password, and anyone can send messages that you can verify as truly belonging to the person with that signature.

                            They have hundreds of servers run by volunteers today, and there is little cost of entry since even cellphones can be used as servers (nodes) to keep your private notes or keep the notes from people you follow.

                            There is now a file sharing service called "Blossom" which is decentralized in the same simple manner. I don't think I've seen a way there to specify custom domains; for the moment people can only use the public key to host simple web pages without a server behind them.

                            Many of the topics in your page match what has been implemented there, so it might be a good match for you to improve it further.

                            • brisky 5 months ago
                              Can NOSTR handle 100 million daily active users?
                              • nunobrito 5 months ago
                                Your question rephrased: "Can EMAIL handle 100 million daily users?".

                                The answer is yes.

                                NOSTR is similar to email. Users depend on nostr/email providers but not on any single one of them; what exists is a common agreement (protocol). The overwhelming majority of those providers are free and you can also run your own from a cellphone.

                                Some providers might become commercial like gmail, still many others will still provide access for free. Email is doing just fine nowadays, NOSTR will do fine as well.

                                • Groxx 5 months ago
                                  This is all necessarily true of any "protocol". It is absolutely not true that every protocol scales efficiently to 100 million active users all interacting though, so it is basically a meaningless claim.

                                  E.g. ActivityPub has exactly the same claims, and it's currently handling several million, essentially all interactable. Some parts are working fine, and some parts are DDoSing every link shared on any normally-connected instance.

                            • wmf 5 months ago
                              1. Domain names: good.

                              2. Proof of work time IDs as timestamps: This doesn't work. It's trivial to backdate posts just by picking an earlier ID. (I don't care about this topic personally but people are concerned about backdating not forward-dating.)

                              N. Decentralized instances should be able to host partial data: This is where I got lost. If everybody is hosting their own data, why is anything else needed?

                              • evbogue 5 months ago
                                If the data is a signed hash, why does it need the domain name requirement? One can host self-authenticating content in many places.

                                And one can host many signing keys at a single domain.

                                • catlifeonmars 5 months ago
                                  In the article, the main motivation for requiring a domain name, is to raise the barrier to entry above “free” to mitigate spamming/abuse.
                                  • uzyn 5 months ago
                                    A 1-time fixed cost will not deter spam, it only encourages more spamming to lower the averaged per-spam cost. Email spamming requires some system set up, that's a 1-time fixed cost above $10/year but it does not stop spam.
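
                                    The amortization argument can be checked with simple arithmetic. A minimal sketch, assuming the $10 figure from the comment as the only cost and purely hypothetical message volumes:

```python
def cost_per_message(fixed_cost_usd, messages_sent):
    """Average cost of one message when the only cost is a fixed fee."""
    if messages_sent <= 0:
        raise ValueError("messages_sent must be positive")
    return fixed_cost_usd / messages_sent


# Hypothetical volumes: the more a sender posts, the cheaper each message gets.
for volume in (10, 10_000, 10_000_000):
    print(f"{volume:>10} messages -> ${cost_per_message(10.0, volume):.7f} each")
```

                                    Under this model the fee is a real barrier only for low-volume senders, which is the opposite of the intended effect.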
                                  • wmf 5 months ago
                                    One person per domain is essentially proof of $10.
                                    • hinkley 5 months ago
                                      There was a psychological study that decided that community moderation tends to be self healing if, and only if, punishing others for a perceived infraction comes at a cost to the punisher.

                                      I believe I have the timeline right that this study happened not too long before StackOverflow got the idea that getting upvoted gives you ten points and downvoting someone costs you two. As long as you’re saying something useful occasionally instead of disagreeing with everyone else, your karma continues to rise.

                                  • macawfish 5 months ago
                                    Domain names are fine but they shouldn't be forced onto anyone. Nothing about DID or any other flexible and open decentralized naming/identity protocol will prevent anyone from using domain names if they want to.
                                    • hinkley 5 months ago
                                      Time services can help with these sorts of things. They aren’t notarizing the message. You don’t trust the service to validate who wrote it or who sent it, you just trust that it saw these bytes at this time.
                                      • catlifeonmars 5 months ago
                                        Something that maintains a mapping between a signature+domain and the earliest seen timestamp for that combination? I think at that point the time service becomes a viable aggregated index for readers to use to look for updates. I think this also solves the problem of lowering the cost of participation… since the index would only store a small amount of data per post, and since indexes can be composed by the reader, it could scale cost-effectively.
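
                                        A rough sketch of such an earliest-seen index, kept in memory. The class and method names are hypothetical, signatures are treated as opaque strings, and nothing here verifies them; it only illustrates the mapping described in this comment.

```python
import time


class EarliestSeenIndex:
    """Remember the first time each (domain, signature) pair was observed."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so the index can be tested
        self._first_seen = {}        # (domain, signature) -> timestamp

    def observe(self, domain, signature):
        """Record a sighting and return the earliest known timestamp."""
        key = (domain, signature)
        if key not in self._first_seen:
            self._first_seen[key] = self._clock()
        # Later sightings never overwrite the first one.
        return self._first_seen[key]

    def updates_since(self, timestamp):
        """List (first_seen, domain, signature) entries newer than `timestamp`."""
        return sorted(
            (seen, domain, signature)
            for (domain, signature), seen in self._first_seen.items()
            if seen > timestamp
        )
```

                                        A reader could poll `updates_since` on several independent indexes and merge the results, taking the minimum timestamp whenever two indexes disagree about the same pair.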
                                        • hinkley 5 months ago
                                          I’ve only briefly worked with these but got a rundown from someone more broadly experienced with them. Essentially you treat trust as a checklist. I accept this message (and any subsequent transactions implied by its existence) if it comes from the right person, was emitted during the right time frame (whether I saw it during a separate time frame), and <insert other criteria here>. If I miss the message due to transmission errors or partitioning, I can still honor it later even though it now changes the consequences of some later message I can now determine to have arrived out of order.
                                          • arccy 5 months ago
                                            that's too much tech for a trust problem it can't solve. just use a TimeStamp Authority like https://freetsa.org/index_en.php or https://knowledge.digicert.com/general-information/rfc3161-c...
                                        • brisky 5 months ago
                                          Hi, author here. Regarding backdating it is a valid concern. I did not mention in the article, but in my proposed architecture users could post links of others (consider that a retweet). For links that have reposts there could exist additional security checks implemented to check validity of post time.

                                          Regarding hosting partial data: there should be an option to host just recent data for the past month or other time frames and not full DB of URLs. This would make decentralization better as each instance could have less storage requirements, but total information would be present on the network.

                                          • imglorp 5 months ago
                                            Recent events also taught us that proof of work is a serious problem for the biosphere when serious money is involved and everybody scales up. Instead, it seems proof of stake is more what is required.
                                            • wmf 5 months ago
                                              Yeah, a verifiable delay function is probably better for timestamping.
                                          • hkt 5 months ago
                                            https://en.wikipedia.org/wiki/Syndie was a decent attempt at this which is, I gather, still somewhat alive.
                                            • defanor 5 months ago
                                              AIUI, the "Decentralized" added to RSS here stands for:

                                              - Propagation (via asynchronous notifications). Making it more like NNTP. Though perhaps that is not very different functionally from feed (RSS and Atom) aggregators: those just rely on pulling more than on pushing.

                                              - A domain name per user. This can be problematic: you have to be a relatively tech-savvy person with a stable income and living in an accommodating enough country (no disconnection of financial systems, blocking of registrar websites, etc) to reliably maintain a personal domain name.

                                              - Mandatory signatures. I would prefer OpenPGP over a fixed algorithm though: otherwise it lacks cryptographic agility, and reinvents parts of it (including key distribution). And perhaps to make that optional.

                                              - Bitcoin blockchain.

                                              I do not quite see how those help with decentralization, though propagation may help with discovery, which indeed tends to be problematic in decentralized and distributed systems. But that can be achieved with NNTP or aggregators. While the rest seems to hurt the "Simple" part of RSS.

                                              • James_K 5 months ago
                                                A number of countries actually offer free domain names to citizens. I agree with the rest though. I don't see what this adds to RSS, which already has most of these things given it's served over HTTPS in most cases.
                                                • pluto_modadic 5 months ago
                                                  cryptographic agility is a recipe for JWT shooting you in the foot. Age or Minisign strike good balances by making the cryptography decision for you.
                                                • convolvatron 5 months ago
                                                  a lot of the use cases for this would have been covered by protocol designs suggested by Floyd, Jacobson and Zhang in https://www.icir.org/floyd/papers/adapt-web.pdf

                                                  but it came right at a time when the industry had kind of just stopped listening to that whole group, and it was built on multicast, which was a dying horse.

                                                  but if we had that facility as a widely implemented open standard, things would be much different and arguably much better today.

                                                  • rapnie 5 months ago
                                                    > built on multicast, which was a dying horse.

                                                    There's a fascinating research project Librecast [0], funded by the EU via NLnet, that may boost multicast right into modern tech stacks again.

                                                    [0] https://www.librecast.net/about.html

                                                    • nunobrito 5 months ago
                                                      What is that used for? I was looking at the documentation but I still don't understand the use case they are trying to address.

                                                      Isn't multicasting something already available with UDP or Point-to-Point connections without a single network involved?

                                                      • convolvatron 5 months ago
                                                        by 'multicast' here one really means a facility that's provided by layer 3. So yes, we can build our own multicast overlays. But a generic facility had two big benefits. One is that the spanning distribution tree can be built with a knowledge of the actual topology, and copies can be made in the backbone where they belong (copies in the overlay often mean that the data can traverse the same link more than once).

                                                        The other big one is access. If we all agree on multicast semantics and addressing, and it's built into everyone's operating system, then we can all use that as an equal access facility to effectively publish to everyone, not just people who happen to be part of this particular club and are running this particular flavor of multicast.

                                                  • teddyh 5 months ago
                                                    Is he reinventing USENET netnews?
                                                    • bb88 5 months ago
                                                      Yes and no. I think the issue primarily is that I could never just generate a new newsgroup back when usenet was popular and get it to syndicate with other servers.

                                                      The other issue is who's going to host it? I need a port somehow (CGNAT be damned!).

                                                      • hinkley 5 months ago
                                                        Spam started on Usenet. As did Internet censorship. You can’t just reinvent Usenet. Or we could all just use Usenet.
                                                        • stackghost 5 months ago
                                                          >Or we could all just use Usenet.

                                                          Usenet doesn't scale. The Eternal September taught us that.

                                                          To bring Usenet back into the mainstream would require a major protocol upgrade, to say nothing of the seismic social shift.

                                                          • hinkley 5 months ago
                                                            That’s also my feeling. There’s a space for something that has some of the same goals as Usenet while also learning from the past.

                                                            I don’t think it’s a fruitful or useful comment to say something is “like Usenet” as a dismissal. So what if it is? It was useful as hell when it wasn’t terrible.

                                                      • fiatjaf 5 months ago
                                                        Nostr is kind of what you're looking for.
                                                        • doomroot 5 months ago
                                                          My thought as well.

                                                          ps When is your SC podcast coming back?

                                                        • cyberax 5 months ago
                                                          That is a really great list of requirements.

                                                          One area that is overlooked is commercialization. I believe, that the decentralized protocol needs to support some kind of paid subscription and/or micropayments.

                                                          WebMonetization ( https://webmonetization.org/docs/ ) is a good start, but they're not tackling the actual payment infrastructure setup.

                                                          • jasode 5 months ago
                                                            The blog mentions the "discovery problem" 7 times but this project's particular technology architecture for syndication doesn't seem to actually address that.

                                                            The project's main differentiating factor seems to be not propagating the actual content to the nodes but instead save disk space by only distributing hashes of content.

                                                            However, having a "p2p" decentralized network of hashes doesn't solve the "discovery" problem. The blog lists the following bullet points of metadata but that's not enough to facilitate "content discovery":

                                                            >However it could be possible to build a scalable and fast decentralized infrastructure if instances only kept references to hosted content.

                                                            >Let’s define what could be the absolute minimum structure of decentralized content unit:

                                                            >- Reference to your content — a URL

                                                            >- User ID — A way to identify who posted the content (domain name)

                                                            >- Signature — A way to verify that the user is the actual owner

                                                            >- Content hash — A way to identify if content was changed after publishing

                                                            >- Post time — A way to know when the post was submitted to the platform

                                                            >It is not unreasonable to expect that all this information could fit into roughly 100 bytes.
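
                                                            To get a feel for the size of such a unit, here is a minimal sketch that packs the five quoted fields into bytes. The field sizes are assumptions, not something the article specifies: an 8-byte Unix timestamp, a 32-byte SHA-256 content hash, and a 64-byte signature (the size Ed25519 produces). The all-zero signature is a placeholder, not a real signature.

```python
import hashlib
import struct

# Fixed part: 8-byte time, 32-byte hash, 64-byte signature, two 1-byte lengths.
HEADER_FORMAT = ">Q32s64sBB"


def pack_content_unit(url, domain, content, post_time, signature):
    """Pack post time, content hash, signature, user ID (domain) and URL."""
    if len(signature) != 64:
        raise ValueError("this sketch assumes a 64-byte signature")
    content_hash = hashlib.sha256(content).digest()
    domain_bytes = domain.encode("utf-8")
    url_bytes = url.encode("utf-8")
    # The 1-byte length prefixes limit domain and URL to 255 bytes each.
    header = struct.pack(HEADER_FORMAT, post_time, content_hash, signature,
                         len(domain_bytes), len(url_bytes))
    return header + domain_bytes + url_bytes


record = pack_content_unit(
    url="https://example.org/p/1",
    domain="example.org",
    content=b"hello world",
    post_time=1_700_000_000,
    signature=bytes(64),  # placeholder only
)
print(struct.calcsize(HEADER_FORMAT), len(record))
```

                                                            The printed numbers show how much of the budget the fixed-size fields consume under these assumptions; shorter hashes or signatures would change the result.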

                                                            Those minimal 5 fields of metadata (url+userid+sig+hash+time) are not enough to facilitate content discovery.

                                                            Content discovery of reducing the infinite internet down to a manageable subset requires a lot more metadata. That extra metadata requires scanning the actual content instead of the hashes. This extra metadata based on actual content (e.g. Google's "search index", Twitter's tweets & hashtags, etc) -- is one of the factors that acts as unescapable gravity pulling users towards centralization.

                                                            To the author, what algorithm did you have in mind for decentralized content discovery?

                                                            • 5 months ago
                                                              • brisky 5 months ago
                                                                Thanks for the comment, these concerns are valid. At the core the protocol supports only basic discovery - you can see who is posting right now and the history of everyone who has ever posted. Regarding rich context discovery, where content could be found by specific tags and keywords, this would be implemented by reader platforms that crawl the index.
                                                              • somat 5 months ago
                                                                Ipfs has a pub/sub mechanism.

                                                                As far as I can tell it is stuck in some sort of inefficient prototype stage, which is unfortunate because I think it is one of the neatest, most compelling parts of the whole project. It is very cool to be able to build protocols with no central server.

                                                                Here is my prototype of a video streaming service built on it. I abandoned the idea mainly because I am a poor programmer and could never muster the enthusiasm to get it past the prototype stage, but the idea of a video streaming service that was actually serverless sounded cool at the time.

                                                                http://nl1.outband.net/fossil/ipfs_stream/file?name=ipfs_str...

                                                                • brisky 5 months ago
                                                                  I think both RSDS and IPFS use libp2p pub/sub mechanism
                                                                • neuroelectron 5 months ago
                                                                  I think it's pretty clear they don't want us to have such a protocol. Google's attack on RSS is probably the clearest example of this, but there are also several more foundational issues that prevent multicasts and similar mechanisms from being effective.
                                                                  • pfraze 5 months ago
                                                                    Atproto supports deletes and partial syncs
                                                                    • bshacklett 5 months ago
                                                                      Am I the only one concerned by this?

                                                                      > In RSDS protocol DID public key is hosted on each domain and everyone is free to verify all the posts that were submitted to a decentralized system by that user.

                                                                      DNS seems far too easy to hijack for me to rely on it for any kind of verification. TLS works because the server which an A(AAA) record points to has to have the private key, meaning that you have to take control of that to impersonate the server. I don’t see a similar protection here.

                                                                      • James_K 5 months ago
                                                                        Perhaps this is a little naïve of me, but I really don't understand what this does. Let's say you have a website with an RSS feed; it seems to have everything listed here. I suppose pages don't have signatures, but you could easily include a signature scheme in your website. In fact I think this is possible with existing technologies using a link element with MIME type "application/pkcs7-signature".
                                                                        • blacklion 5 months ago
                                                                          It is funny, how link to text with these words: "Everybody has to host their own content" points to medium.com, not to tautvilas.lt
                                                                        • WorldPeas 5 months ago
                                                                          I think the author here would be happy to learn that secure scuttlebutt (SSB) exists. https://github.com/ssbc/scuttlebutt-protocol-guide
                                                                          • toomim 5 months ago
                                                                            I am working on something like this. If you are, too, please contact me! toomim@gmail.com.
                                                                            • evbogue 5 months ago
                                                                              I'm working on something like this too! I emailed you.
                                                                            • est 5 months ago
                                                                              Pity RSS is one-way. There's no standard way of commenting or doing interactions.
                                                                              • marcus0x62 5 months ago
                                                                                That's supposed to be WebMention, right? It is standard (W3C), but not widely deployed.

                                                                                https://www.w3.org/TR/webmention/
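
                                                                                In Webmention, the sender notifies the target by POSTing two form-encoded parameters, `source` and `target`, to an endpoint that the target page advertises. A minimal sketch that only builds the request, without sending it; the endpoint and page URLs are hypothetical, and endpoint discovery is left out.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention_request(endpoint, source, target):
    """Build (but do not send) a Webmention notification.

    `source` is the page containing the comment or reply,
    `target` is the page being mentioned.
    """
    body = urlencode({"source": source, "target": target}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_webmention_request(
    endpoint="https://example.org/webmention",   # hypothetical endpoint
    source="https://me.example/reply",
    target="https://example.org/post",
)
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

                                                                                Passing the request to `urllib.request.urlopen` would actually send it; the receiver is then expected to fetch `source` and confirm that it links to `target` before displaying anything.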

                                                                                • uzyn 5 months ago
                                                                                  Interaction/comment syndication would be very interesting. This is, I feel, what makes proprietary social media so addictive.
                                                                                  • hinkley 5 months ago
                                                                                    Someone on the NCSA Mosaic team had a similar idea, but after they left nobody remaining knew what to do with it or how it worked.

                                                                                    It took me 20 years to decide maybe they were right. A bunch of Reddits more tightly associated with a set of websites and users than with a centralized ad platform would be fairly good, if you had browser support for handling the syndicated comments. You could have one for your friends or colleagues, and one for watchdog groups to discuss their fact checking or responses to a new campaign by a troublesome company.

                                                                                    • sali0 5 months ago
                                                                                      It's an interesting point. I haven't even read the article yet, but have been reading the comments. Maybe they were the star of the show all along.
                                                                                      • Zak 5 months ago
                                                                                        This comment describes ActivityPub.
                                                                                    • Uptrenda 5 months ago
                                                                                      >Everybody has to host their own content

                                                                                      Yeah, this won't work. Like at all. This idea has been tried over and over on various decentralized apps, and the problem is that as nodes go offline and online, links quickly break...

                                                                                      No offense but this is a very half-assed post to gloss over what has been one of the basic problems in the space. It's a problem that inspired research in DHTs, various attempts at decentralized storage systems, and most recently -- we're getting some interesting hybrid approaches that seem like they will actually work.

                                                                                      >Domain names should be decentralized IDs (DIDs)

                                                                                      This is a hard problem by itself. All the decentralized name systems I've seen suck. People currently try to use DHTs. I'm not sure that a DHT can provide reliability though, and since the name is the root of the entire system, it needs to be 100% reliable. In my own peer-to-peer work I side-step this problem entirely by having a fixed list of root servers. You don't have to try to "decentralize" everything.

                                                                                      >Proof of work time IDs can be used as timestamps

                                                                                      Horribly inefficient for a social feed and orphans are going to screw you even more.

                                                                                      I think you've not thought about this very hard.

                                                                                      • catlifeonmars 5 months ago
                                                                                        > In my own peer-to-peer work I side-step this problem entirely by having a fixed list of root servers. You don't have to try "decentralize" everything.

                                                                                        Not author, but that is what the global domain system is. There are a handful of root name servers that are baked into DNS resolvers.

                                                                                        • Uptrenda 5 months ago
                                                                                          You're exactly right. It seems to work well enough for domains already so I kept the model.
                                                                                      • spacedRepprEXP 5 months ago
                                                                                        > Keeping track of time and operations order is one of the most complicated challenges of a decentralized system.

                                                                                        Only in decentralized systems. In centralized ones, fake timestamps down to the bit all over the motherfucking space. So, basically, quasi, ultimately, so to speak, time and order don't matter in centralized systems, only the Dachshund does.

                                                                                        • 5 months ago
                                                                                          • dang 5 months ago