I Wrote an Activitypub Server in OCaml: Lessons Learnt, Weekends Lost

154 points by gopiandcode 2 years ago | 108 comments
  • mariusor 2 years ago
    The author makes the basic mistake of most of the people implementing ActivityPub services: they want to map the logic of an existing type of web application and contort existing domain objects into encoding/decoding to an "impractically large number" of options. That happens because they want two things in one: a server and a client.

    The ActivityPub specification needs to be read with a goal similar to an email server in mind. It should do one thing: receive JSON-LD objects in an inbox, process them according to the specification, and (maybe) store them on disk.

    The idea of "users", "friends", "posts", "feeds" etc, are concepts that belong to the clients on top of this server, not in the server itself.

    This separation between clients and server will also allow better interop/graceful degradation of object types that the client/server don't specifically understand.

    • still_grokking 2 years ago
      This comment raised a whole bunch of red flags for me.

      First and foremost: Saying that something is like an email server translates for me into "this is an under- and over-specified swamp at the same time, full of quirks, and actually not implementable in any reasonable way". Because that's what email is. I almost can't think of a greater horror than writing an email server from scratch…

      I don't know enough about ActivityPub to judge whether it's really like email. I would strongly hope it isn't, as otherwise it would be a tech you should probably better never touch as a developer.

      The next thing is: If an ActivityPub server only receives and sends some opaque BLOBs what's the whole point of it?

      But when it's not about opaque BLOBs you need to map the structures in the spec to proper types in a statically typed language, as you can't manipulate them otherwise in any meaningful way. If it's not possible to do that because the spec is vague and/or there is no coherent data model behind it, that would be just another reason to not touch this tech. Nobody needs the next underspecified, stringly-typed "email".

      I really hope I'm reading this wrong!

      • mariusor 2 years ago
        The email comparison helps people to understand the directional way ActivityPub works, I don't know enough about email (whichever of SMTP or IMAP/POP3/samd you consider that to be) to make a comparison at protocol level.

        > If [...]receives and sends some opaque BLOBs what's the whole point of it?

        There are some rules about how to have side effects for said blobs. Some of the blobs themselves have side effects. That's mostly what ActivityPub is: rules about how to distribute the blobs in the federated context, rules to what to do with the blobs when they reach your servers (when coming from other servers, or directly from clients).

        The vocabulary that ActivityPub is based upon, is another whole specification, called ActivityStreams, and which didn't originate in the W3C group. This vocabulary has three (*main) types of objects: Activities - which provide the backbone of ActivityPub (Like, Follow, Create, Update), Actors - basically different types of users (these are the entities that operate the activities) and, Objects - whatever the Activities operate on.

        • WorldMaker 2 years ago
          > If an ActivityPub server only receives and sends some opaque BLOBs what's the whole point of it?

          There's still a difference between "try to black-box the incoming data as much as possible" and "treat the incoming data as opaque BLOBs and assume". The data is mostly JSON-LD which is a far cry from "binary large objects". It is always going to be "semi-transparent" as it will always be JSON. Whether or not you like the "-LD" extensions to JSON (they are heavy, they do have a lot of RDF baggage you may not desire), they give you a bunch of guaranteed "baseline schema" for the JSON objects that you can use for static typing that might be "good enough" for a lot of "meaningful manipulations" (such as following links to pick up related objects; LD => linking data) and that is all easily transparent.

          A lot of the schemas beyond "LD" in ActivityPub are client/application-specific beyond most of the JSON-LD basics and should be easy to treat as a black box unless doing client/application-specific tasks. That's not necessarily "stringly typed", it's kind of a classic "serialization onion": The server at best needs to know that it is JSON and it may have JSON-LD metadata for relevant related linked objects (and a few other metadata fields common to "introspection", similar to "headers"). The client can dig deeper and know it is not just "any" JSON object but a more specific schema for a given class of thing the client cares about.

          • still_grokking 2 years ago
            To be honest, this sounds indeed quite like the mess that email is.

            If the server isn't just a "dumb 'BLOB' storage" it will need to handle application logic (sooner or later, as this is actually what servers are for)…

            But given that the application logic seems to be mostly unspecified, kind of wild west, where every client application can do whatever it thinks its users like, this will unavoidably end in all the problems you have with email, where the server needs to know about all the specific details, quirks, and idiosyncrasies of every client ever built.

            The whole concept reads like an implementation of "'Postel's Law' fallacy".

            • mariusor 2 years ago
              Thank you for articulating this very well, I was getting a bit frustrated at OP's contrarianism. :)
            • vidarh 2 years ago
              E-mails are not opaque blobs, and neither are ActivityPub messages. The point is that at its lowest layer an implementation should care about receiving messages addressed to one or more Collections. That's it. It makes implementing a functioning ActivityPub implementation a lot easier.

              The next layer up then specifies some rules for how to process those messages: Like on an e-mail server, if a message is sent to your "local" server intended for onwards delivery, the server must forward it on. Otherwise it is added to an OrderedCollection - effectively a mailbox.

              The spec then sets out a structure for giving the messages an Activity type that determines further fields, and for some of these activities there are rules specifying how the relevant Actors should act when those activities / messages are processed by them.

              You can decide to do that synchronously when receiving the message. Sometimes that may be fine. But you can also strictly layer the implementation and deliver to a collection first and then asynchronously have workers process those messages. What you in either case ought to do for your own sanity is to at least logically separate the low level message pump (inbox/outbox) from the processing of activities.

              For starters, doing this separation cleanly makes writing a scalable implementation far easier.

              > you need to map the structures in the spec to proper types in a statically typed language as you can't manipulate them otherwise in any meaningful way

              This is just not true. You can handle dynamic structures in statically typed languages just fine. It is in any case irrelevant, as ActivityStreams (which ActivityPub is based on) defines a typed vocabulary [1]. An implementation can choose to dynamically process extensions or it can choose to statically type the activities it understands and treat the rest as mostly opaque blobs other than the envelope/addressing -- this is exactly why it's beneficial to apply the layering as suggested with the comparison to e-mail and decouple the message pump from the processing of activities.

              [1] https://www.w3.org/TR/activitystreams-vocabulary/
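The layering described in this comment can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the `Store` class, the handler table, and the `/followers` collection naming are hypothetical, and `type`, `actor`, and `object` are assumed to be plain strings (real ActivityStreams data may use arrays or embedded objects). The pump files every message; a separate step gives a typed treatment to the activities it understands and leaves the rest untouched.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical in-memory store: collection IRI -> list of raw activities."""
    collections: dict = field(default_factory=dict)

    def append(self, collection, item):
        self.collections.setdefault(collection, []).append(item)


def receive(store, inbox, body):
    """Layer 1, the message pump: check the envelope and file the message."""
    activity = json.loads(body)
    if not isinstance(activity, dict) or "type" not in activity:
        raise ValueError("not an ActivityStreams object")
    store.append(inbox, activity)  # payload beyond the envelope stays opaque here
    return activity


@dataclass(frozen=True)
class Follow:
    """Statically typed view of one activity this server understands."""
    actor: str
    target: str


def handle_follow(store, activity):
    # Assumes "actor" and "object" are plain IRI strings, not embedded objects.
    follow = Follow(actor=activity["actor"], target=activity["object"])
    store.append(follow.target + "/followers", {"id": follow.actor})


# Assumes "type" is a single string; anything not listed here is left alone.
HANDLERS = {"Follow": handle_follow}


def process(store, inbox):
    """Layer 2: run handlers for known types; return the types left untouched."""
    skipped = []
    for activity in list(store.collections.get(inbox, [])):
        handler = HANDLERS.get(activity["type"])
        if handler is None:
            skipped.append(activity["type"])  # unknown extension: leave as stored
        else:
            handler(store, activity)
    return skipped
```

Because `process` only reads from the stored collection, it could just as well run later in a worker, which is the asynchronous variant the comment mentions.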

            • MuffinFlavored 2 years ago
              > JSON-LD

              https://json-ld.org/ for anybody else not super familiar

              • cratermoon 2 years ago
                OK, but for someone who wants to build a useful tool that does what the author wants, "interacting with the Fediverse", such as federating with Mastodon, how useful is doing that one thing?
                • jeroenhd 2 years ago
                  It depends on your goal. If your server is just a tool you use, you can ignore a lot of concepts. There is no local timeline, there are no users, all follows belong to a single user, etc.

                  I can't find the link but a while back there was a post on the front page about how to get a findable, read only ActivityPub profile by just uploading some static JSON files. Not exactly a Twitter competitor, but you don't need much to start exchanging messages.

                • vidarh 2 years ago
                  I think this is the wrong way of looking at it. If you're doing the whole stack, sure you will end up implementing quite a few things.

                  But consider that you can write a generic ActivityStreams server without supporting any of the ActivityPub activities. Now you have a generic platform to build on.

                  Tack on a tiny bit of support for e.g. addressing etc. as found in ActivityPub and you have what you need for federation.

                  With that generic platform, doing what you're suggesting is a matter of implementing a handful of Activities that mutate Objects and Collections.

                  What the author did is the equivalent of implementing a mailing-list manager by first writing a mail server from scratch instead of just writing the bits managing the list and sends, because he didn't have that lower level layer to build on.

                  There is indeed a lot of missing tooling to work with ActivityStreams/ActivityPub, which makes it painful now, and unfortunately a lot of ActivityPub implementers take the same tack as the author and build one big monolith instead of first building that lower layer.

                  • mariusor 2 years ago
                    If you want to create one just for yourself, sure. If you want to create something for the rest of the world, probably not very much.

                    I get the "scratch your own itch" mentality, but not if you kneecap all efforts that try to build on top of it. :D

                  • JustSomeNobody 2 years ago
                    Do you know of a small sample project that does this as an example?
                    • mariusor 2 years ago
                      There are no "small sample" projects as far as I know. But if you look in my profile (or other comments in this thread) I did develop a server which only does ActivityPub, client to server and server to server.
                    • iudqnolq 2 years ago
                      (My only knowledge of activitypub comes from reading this article.)

                      To receive JSON-LD messages don't you need to send follow requests? And to do that don't you need to deal with the fact the spec is too complicated and most servers implement inconsistent parts of it?

                      • vidarh 2 years ago
                        To receive JSON-LD messages, someone needs to send them to you. Sending follow requests is perhaps the easiest way to do that, but those follow requests do not need to be initiated by the same code that hosts the inbox.

                        The point is there are several potentially independent layers and modules there: The message pump itself at least can be implemented separately from the decoding of individual message types, and separate from managing followers and following, the same way e.g. a mail server knows nothing about how to follow mailing lists, or decoding email messages past the header.

                        • still_grokking 2 years ago
                          That sounds like a mess.

                          Reading through the other comments here it seems that the spec is in fact a mess…

                    • dahwolf 2 years ago
                      Saw some comments on the protocol being fluffy and typical implementations resource hungry. This is an interesting guy to follow:

                      https://universeodon.com/@supernovae

                      He's the admin of universeodon, a mastodon instance with 13K MAU. He recently shared that in a month's time, 3TB of text was transferred just in ActivityPub events. Images a multiple of it. I don't know what the bill is, but I was pretty shocked by the stats...for "just" 13K users.

                      And the cruel thing is that it still doesn't work properly. Likes/boosts and replies do not properly synchronize.

                      • robga 2 years ago
                        Universeodon uses Fastly as a CDN. My masto admin experience is 80-90% CDN caching though obviously that’s images and static content.

                        Bandwidth is a function of the number of connected servers for remote followers. A popular person like George Takei makes a post and 15000 servers ask universeodon for the text. IMHO some sort of smarter fan out or message relay capability would help, much like email infrastructure.

                        Having said that, you can run 10k Mastodon MAU on a $200pm cloud resource budget, though double that adds headroom, a dynamically scalable architecture, a staging instance, elastic search and translation, more admin tooling, etc. Yes, some instances spend multiples of that per 10k but if you budget $400pm you’ll sleep well.

                        (Thing is, a 1k MAU would cost $100+ if you want to start with a scalable footprint).

                        Universeodon had 70k MAU after the November surge and AIUI is still easily scalable to 100k+.

                        I don’t know about you, but a cloud running cost of $0.02-$0.10 per MAU sounds cheap to me. All large instances can cover it with donations. The “real” cost is moderation and administration labour.

                        > And the cruel thing is that it still doesn't work properly. Likes/boosts and replies do not properly synchronize.

                        It works exactly as designed. Personally I am fine that on someone else’s post I don’t need to see every reply across the world and fully synced like counts, but I can respect the POV that many users expect just that. This is not an activity pub limitation but rather a software choice by Mastodon. Many users, including @supernovae, push for functional changes around this.

                        • latch 2 years ago
                          3TB for the text of 13K users sounds crazy, you're right. But for the bill, strictly speaking about bandwidth, a 10 Mbps unmetered connection gets you roughly that. And 10 Mbps is pretty uncommon now because it's so low. So I'd expect the bandwidth bill to essentially be free (i.e. included).
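The "roughly that" figure can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes a fully saturated link, a 30-day month, and decimal units (1 TB = 10^12 bytes); the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def monthly_transfer_tb(link_mbps, days=30):
    """Data moved by a link running flat out for `days`, in decimal TB."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return bytes_per_second * days * SECONDS_PER_DAY / 1e12


def average_mbps(transfer_tb, days=30):
    """Average rate needed to move `transfer_tb` decimal TB in `days`."""
    bits = transfer_tb * 1e12 * 8
    return bits / (days * SECONDS_PER_DAY) / 1_000_000


print(f"{monthly_transfer_tb(10):.2f} TB per month on a 10 Mbps link")
print(f"{average_mbps(3):.2f} Mbps average for 3 TB per month")
```

Under those assumptions a 10 Mbps link tops out a little above 3 TB per month, so the 3TB figure corresponds to an average of just under 10 Mbps; real traffic is bursty, so peak demand would be higher than the average.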
                          • MR4D 2 years ago
                            Would love to see how this compares to git.
                            • xvilka 2 years ago
                              Because they should have designed a more compact and less talkative protocol. It was a similar problem with XMPP.
                              • still_grokking 2 years ago
                                But people are saying that the Fediverse could replace Twitter and Facebook and Tiktok and Instagram and what not, aren't they?

                                How much hardware would you need for 100 million MAUs? (And that's just a fraction of the current social media users.)

                                If it really "scales" like indicated in the parent post this tech will never provide any alternative to centralized social media sites just for technical reasons no matter what people want or do.

                                Maybe someone experienced in effective distributed systems should start to design an alternative.

                                Otherwise there won't be any viable alternative to the commercial silos no matter how badly people would like one.

                                • dahwolf 2 years ago
                                  Mastodon absolutely does not scale that far and this should be considered fact. Far smaller instance owners sometimes complain about bills of hundreds of USD per month.

                                  So the only way to scale up "indefinitely" is by having many small/medium-sized instances, but not really. In a 100M+ network, instances will suffer due to the wasteful nature of federation plus social media being append-only.

                                  Costs will forever go up and this doesn't even mention the burden and liabilities of moderation.

                                  Amidst all our hate for big social, we've forgotten about all the things they do very well. They are (financially) free. They are reliable and scale up without you noticing. You do not have to generally worry about your entire account and content being gone because some mod gave up. If you don't do anything funny, all your content is preserved, forever. Moderation works reasonably well, even if never perfect.

                                  We've taken all that for granted. But it costs billions, an army of engineers, mods, legal, marketing, UX, top notch infrastructure to make it run and work this smoothly.

                                  The idea that a bunch of enthusiasts can replicate this, is misplaced.

                                  • saurik 2 years ago
                                    > If you don't do anything funny, all your content is preserved, forever.

                                    Are you really not able to think of any dead social networks that deleted all of their content? Hell: Google alone has managed to delete at least two!

                                    • vidarh 2 years ago
                                      > Costs will forever go up and this doesn't even mention the burden and liabilities of moderation.

                                      Storage costs have historically dropped so fast that this just isn't an issue I'd worry about. I ran a mail provider with ~2m accounts around 2000. We had less aggregate storage than my laptop has on a single M.2 SSD now. With redundancy, the cost per GB for us at that time was around 10,000 times higher than it'd have been today. Our total processing power was lower than my laptop. Our bandwidth use was a tiny fraction of my home internet connection.

                                      In other words: If it's an issue today, it soon won't be. Costs are low enough per user today to be viable, and they'll only drop.

                                      > You do not have to generally worry about your entire account and content being gone because some mod gave up.

                                      I do have to worry about what happens when I have to start again because engagement is cratering and I can't just migrate elsewhere or run my own, though (e.g. I get more engagement on Mastodon than on Twitter despite 100x as many followers on one of my Twitter accounts). Since I can (and did) choose to run my own instance, I don't need to worry about that again.

                                      > The idea that a bunch of enthusiasts can replicate this, is misplaced.

                                      This was the kind of argument used against open source 20-30 years ago. It was just as ridiculous an argument then. This is orders of magnitude easier than what was achieved with open source, and it's also flawed because, just like detractors of open source, you're presuming that it will only be enthusiasts, and that e.g. nobody will start and operate commercial services for those who prefer that (some commercial hosters already exist for Mastodon, for example).

                                      • 2 years ago
                                        • rektide 2 years ago
                                          > Mastodon absolutely does not scale that far

                                          Maybe?

                                          > and this should be considered fact.

                                          I see absolutely nothing about ActivityPub that is inherently hard to scale. Adding more points of presence & smart fan-out seems like it could keep scaling indefinitely, from what I can see.

                                          > Costs will forever go up

                                          Let's simplify follower-ship and assume instead a simpler model of bi-directional friending. If you have M users, each of whom has N total friends, the naive fear is this: that costs will keep growing terrifyingly. But I expect it's actually more a sigmoid curve. As you grow into millions and especially many-millions of users, more and more you'll have your people following the same person - which reduces traffic - and more and more of your people followed by multiple people on another server - which reduces traffic. The actual growth curve here is more sigmoidal, with a big variability depending on the number of active fediverse instances out there.
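A tiny Python sketch of the effect described here: if a post is delivered once per remote server rather than once per follower (an assumption; it corresponds to servers accepting one copy on behalf of all their local followers), then the number of deliveries is bounded by the number of distinct servers. The `user@domain` handle format and the function name are illustrative only.

```python
def delivery_targets(followers, local_domain):
    """Distinct remote domains that each need one copy of a post.

    `followers` are handles of the form "user@domain"; followers on the
    local server need no network delivery at all.
    """
    domains = {handle.rsplit("@", 1)[1] for handle in followers}
    return domains - {local_domain}


followers = [
    "ann@big.example",
    "bob@big.example",
    "cat@big.example",
    "dan@small.example",
    "eve@home.example",
]
targets = delivery_targets(followers, local_domain="home.example")
print(f"{len(followers)} followers, {len(targets)} deliveries")
```

As more followers cluster on the same servers, the follower count keeps growing while the delivery count flattens out, which is the shape the comment is arguing for.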

                                          > You do not have to generally worry about your entire account and content being gone because some mod gave up.

                                          To get listed on joinmastodon you have to promise to give at least 3 months' heads-up, during which time users have to be able to activate an account transfer to other systems. There was a case where mastodon.au nearly pulled the plug with a short notice, but someone else stepped up to take over the system.

                                          The potential for this to be a huge problem is absolutely super real, but in practice, I have almost no real concern over this and think it should be broadly disregarded. The server I was on shut down, but the mod gave us almost a full year of continued service, and another year of read-only service during which we can transfer. Trying to drum up fear & doubt over the instability of this system seems irresponsible & premature, given how well things have gone.

                                          > The idea that a bunch of enthusiasts can replicate this, is misplaced.

                                          I agree that there are huge advantages & I think we absolutely have taken much for granted. Right now you're looking at this through shit-colored glasses though, and I think it's naive to wish so hard for this all to fail. It's convenient & easy to say it'll never work.

                                          But there are so many strategies where we can start to turn the scale into something useful. If servers publish WebBundles of users' feeds, users can p2p distribute the signed HTTP content among themselves, perhaps via WebTorrent. Rather than live in fear of our users and our scaling, we can rely on our users to help us tackle scaling. Yes we can, man. It's totally doable. Get off the ledge.

                                        • robga 2 years ago
                                          > How much hardware would you need for 100 million MAUs?

                                          Given an informed yardstick of $.02-.05 per MAU for a 10k instance, possibly $2-5M/month, though much depends on the profile of users per instance.

                                          The support, moderation, legal, administration, and corporate costs would dwarf this cloud/hardware cost.
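The estimate above is a straight multiplication, sketched here in Python. It assumes the per-MAU cloud cost observed on a roughly 10k-MAU instance carries over linearly to 100 million users, which the comment itself treats as uncertain; the function name is made up for illustration.

```python
def monthly_cost_usd(mau, cost_per_mau):
    """Linear extrapolation: monthly cloud cost for `mau` active users."""
    return mau * cost_per_mau


TARGET_MAU = 100_000_000
low = monthly_cost_usd(TARGET_MAU, 0.02)   # lower yardstick, USD per MAU
high = monthly_cost_usd(TARGET_MAU, 0.05)  # upper yardstick, USD per MAU
print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M per month")
```

This covers hosting only; as the comment notes, support, moderation, legal, and administration would come on top.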

                                      • erwinh 2 years ago
                                        A bit off-topic but the post title will probably attract relevant people.

                                        What are the thoughts on OCaml on HN?

                                        • cccbbbaaa 2 years ago
                                          It replaced Python for everything longer than a couple hundred lines for me. Fast language, fast compile times, clean(-ish) syntax, strong typing system, good ecosystem, and now multicore support? Yes please!

                                          I must be more nuanced, though: existing libraries in opam are generally very, very good (I really like cmdliner), but many things may be missing. There is no alternative to Django, for instance. No serious IDE, except emacs. The standard library was so lacking that there is at least an alternative. The situation improved, but there's still missing stuff compared to Python.

                                          • mattpallissard 2 years ago
                                            > There is no alternative to Django, for instance.

                                            https://aantron.github.io/dream/, which is new and used by ocaml.org as well as OP

                                            > No serious IDE, except emacs

                                            and vim, and visual studio, and whatever else supports the LSP protocol via https://github.com/ocaml/ocaml-lsp

                                            > The standard library was so lacking that there is at least an alternative.

                                            While janestreet does have and publish their own stdlib, I personally try to stick to the stdlib whenever possible. Not to knock janestreet. I'm glad they're around and have contributed a bunch.

                                            But overall I agree with you. It's been my favorite language to write in for years now. You can't just reach for off-the-shelf libraries for every little thing. Although the ones that do exist tend to be written halfway decently.

                                            • winrid 2 years ago
                                              Dream is not a Django alternative. Django's powers come from probably one of the best (the best?) ORMs in the industry, along with generated database schema migrations, and generated admin panels, to name a few. There's also the Django REST Framework, which makes putting together REST APIs generated from your models super easy.
                                            • mdaniel 2 years ago
                                              > No serious IDE, except emacs.

                                              https://plugins.jetbrains.com/plugin/9440-reasonml (72k downloads)

                                              https://plugins.jetbrains.com/plugin/18531-ocaml (2k downloads)

                                              I'm not in the ocaml ecosystem enough to evaluate their quality, but anything on top of IJ is for sure a serious IDE

                                              • amelius 2 years ago
                                                Do you make GUIs in OCaml, and which libraries do or would you use?

                                                And how about scientific computing (SciPy), deep learning (PyTorch etc.), or computational geometry (Shapely etc.)?

                                              • still_grokking 2 years ago
                                                I've heard good things about OCaml in general.

                                                But "no serious IDE, except emacs" is a non-starter imho, if it's true.

                                                They should really invest in this. Otherwise the language won't attract any professional developers in the large.

                                                • yw3410 2 years ago
                                                  Vscode works since there is an LSP server.
                                                  • trenchgun 2 years ago
                                                    It is not true, as other commenters have said.
                                                • yodsanklai 2 years ago
                                                  It's my favorite language by far!

                                                  Pros: type safe, GC, fast, (arguably) a simple and practical language if you have a functional mindset (much simpler and pragmatic than Haskell IMHO).

                                                  Cons: it's a niche language, so tooling/libraries/online help aren't on par with more mainstream languages. No canonical standard library (different codebases will use different standard libraries and even disagree on pervasive functions such as List.map). Whenever the code uses monads (e.g. concurrency monad / error handling), I find the language loses its simplicity.

                                                  Maybe it's true of every language, but I'm disappointed by some OCaml codebases where two extremes often cohabit:

                                                  1. people who don't know the language and don't write idiomatic code (like, refusing to write .mli, abusing imperative features)

                                                  2. OCaml experts who over-engineer things and want to use the latest features and make the code hard to read/maintain

                                                  In a professional setting, it can be hard to have these two populations coexisting, and people tend to be quite opinionated when it comes to such languages (love it or hate it -> it's often a source of struggle).

                                                  • jolux 2 years ago
                                                    OCaml is a great language but I would probably choose F# if I had to pick a language for a new project because of libraries.
                                                    • WorldMaker 2 years ago
                                                      I haven't used OCaml much directly, but F# is a common enough tool in my toolbelt at this point. My experience of F# is that overall it's a good language family. The access to .NET's standard library (the BCL) and easy interop with C# are the biggest reasons F# is the tool I more often reach for, as it already fits the ecosystem most of my other development is in, but I'd love to work more directly with OCaml should the need arise.
                                                      • zem 2 years ago
                                                        one of my favourite languages! not so much for its (excellent) technical qualities, but just as a matter of personal taste - it joined ruby and racket in a short list of languages that just feel nice to program in. (i suspect D would join that list too but despite being interested in it for a while i haven't yet had a compelling project to use it for.)
                                                      • SideburnsOfDoom 2 years ago
                                                        My question is this: if I was to try to hack up an ActivityPub server in my platform of choice, how would I know how compliant it is? Is there any compliance test suite to verify this?

                                                        "Try and load it up in a client app" seems suboptimal.

                                                        The "load it up and see" attitude is part of what made parsing and rendering HTML so hairy, and compliance test suites helped.

                                                        • mariusor 2 years ago
                                                          There was a suite of tests, that sadly fell to bitrot. One of the developers in the community created a parallel application that could test implementations, but then this too ended up unmaintained[1].

                                                          [1] https://github.com/go-fed/testsuite

                                                        • nologic01 2 years ago
                                                          I found the post well written and informative. Though I am clueless about OCaml, it feels as though this would be useful for anybody working on a new server implementation in any language ecosystem, as it highlights what needs to be done and potential bottlenecks.

                                                          As for the ActivityPub spec and the currently popular implementations, it doesn't take long exposure to the fediverse to realise there are some rough edges and historical accidents (e.g. Mastodon actually being the de facto interpretation of the standard). Imho, now that there is substantially more mindshare devoted to decentralized social, it would be opportune to revisit these things and, if needed, revise them before they get baked in.

                                                          • mikece 2 years ago
                                                            I'm looking forward to a solid ActivityPub server written in Go or Rust that can run on modest hardware/small resource Docker hosts.
                                                            • SideburnsOfDoom 2 years ago
                                                              > I'm looking forward to a solid ActivityPub server written in Go or Rust that can run on modest hardware/small resource

                                                              The "Lightweight" GoLang ActivityPub server is GoToSocial https://github.com/superseriousbusiness/gotosocial

                                                              The better-known lightweight servers are Pleroma and its fork Akkoma, written in Elixir https://akkoma.dev/AkkomaGang/akkoma/

                                                              Some of this info I got via: https://social.treehouse.systems/@ariadne/110226729543740723

                                                            • jeroenhd 2 years ago
                                                              I think there is (was?) an attempt to rewrite Mastodon into Rust but I haven't heard much about it.

                                                              A single user Mastodon instance takes an unreasonable amount of resources. I don't know if it's just because of Ruby (Gitlab has the same problem, so it might just be) or because everyone is wasting money on expensive servers, but an RSS feed on steroids shouldn't take this much RAM.

                                                              • WorldMaker 2 years ago
                                                                Mastodon itself is designed for "flagship scale" (given that the lead developers run mastodon.social and mastodon.online, two of the biggest instances and the two most "dogfooding" instances), so it bundles an entire cluster of services: background processors (Sidekiq), caches (Redis, I think?), database server (Postgres), optional Elasticsearch, and more. I don't know how much Ruby itself accounts for expensive overhead, but just running all of those other things on a single server vertically for a single user instance is a massive, expensive overhead. (It's clearly built for horizontal scale, where your background services and caches and database servers may all be different clusters of VMs/servers, rather than for vertical stack efficiency when "scaled down" from the "natural" "mastodon.social scale" that Mastodon is most optimized for.)

                                                                It's an interesting optimization problem reminder that scaling factors are different for different needs and not everything scales cleanly to every use case. A single user instance should be able to use a much smaller vertical stack, but scaling down from a wide horizontal stack is not necessarily the best or cheapest place to start when building something like that.

                                                                (There are some interesting projects I've seen to build single user instances with much less overhead, shorter vertical stacks. I'm curious to see where those efforts go. In my own usage of Mastodon my "single user" instance gets the benefits of the horizontal scaling Mastodon was built for because my hosting provider does a bunch of work to make sure that they take advantage of that economy of scale to host many small instances for cheaper than trying to run small instances in one-off VMs.)

                                                                • Xeoncross 2 years ago
                                                                  This is the same problem that plagued mail servers. It takes so many different components which all have their own configurations to optimize (and memory/cpu footprints) that it ends up being too complex for most people to actually run an instance.

                                                                  I'm also completely positive that a Go or Rust version on a $5 VPS with an embedded key-value store (RocksDB, LevelDB, BadgerDB, etc.) could easily handle hundreds of thousands of users and gigs of content each day. It seems like it would be easy to expand that to a distributed database like TiDB, CockroachDB, etc. if you needed to grow beyond a single host.

                                                                • mdaniel 2 years ago
                                                                  https://github.com/rustodon/rustodon#readme which has an awesome name but you're correct it appears that specific repo stalled out. I didn't check on the 41 forks of it
                                                                  • knjllppppp 2 years ago
                                                                    I've had a go at doing it in Go and the ActivityPub spec is so loosely defined that it's just a real challenge if you intend to actually unmarshal the JSON you receive

                                                                    It's not completely impossible but you have to be okay with discarding a lot of unknown options or essentially reverse engineering the objects used by the servers you are federated with

                                                                    That's not to say it's impossible, I was able to crawl the network successfully, but it hints at the reason that Mastodon and Pleroma use dynamic languages

                                                                    I'd be very interested to see a flexible/complete AP implementation in any statically typed language

                                                                    Fwiw WriteFreely is implemented in Go with go-fed but -- correct me if I'm wrong -- that library seemed more limited to me than what Pleroma and Mastodon support
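
                                                                    A minimal Go sketch of the "discard unknown options" approach described above, assuming a property such as actor or object may arrive either as a bare IRI string or as an embedded object with an "id" field. The names Ref, Activity and parseActivity are hypothetical and do not come from go-fed or any other library; arrays and other shapes are simply rejected here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Ref is a hypothetical type for a property that may be encoded either as
// a bare IRI string or as an embedded object carrying an "id" field.
type Ref struct {
	ID  string
	Raw json.RawMessage // copy of the original bytes, so unknown fields are not lost
}

// UnmarshalJSON accepts both encodings and returns an error for anything
// else (an array, a number, ...).
func (r *Ref) UnmarshalJSON(data []byte) error {
	r.Raw = append(json.RawMessage(nil), data...)

	// First try the short form: "https://example.com/users/alice".
	var iri string
	if err := json.Unmarshal(data, &iri); err == nil {
		r.ID = iri
		return nil
	}

	// Then try the long form: {"id": "...", ...other fields...}.
	var obj struct {
		ID string `json:"id"`
	}
	if err := json.Unmarshal(data, &obj); err != nil {
		return fmt.Errorf("ref is neither a string nor an object: %w", err)
	}
	r.ID = obj.ID
	return nil
}

// Activity models only the fields this sketch cares about. encoding/json
// silently ignores every other field in the payload.
type Activity struct {
	Type   string `json:"type"`
	Actor  Ref    `json:"actor"`
	Object Ref    `json:"object"`
}

// parseActivity decodes one activity document.
func parseActivity(data []byte) (Activity, error) {
	var a Activity
	err := json.Unmarshal(data, &a)
	return a, err
}

func main() {
	doc := []byte(`{
		"type": "Like",
		"actor": "https://a.example/users/1",
		"object": {"id": "https://b.example/notes/2", "type": "Note", "sensitive": false}
	}`)
	a, err := parseActivity(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Type, a.Actor.ID, a.Object.ID)
}
```

                                                                    Keeping the raw bytes next to the extracted ID is one way to postpone decisions about fields the program does not understand yet; the cost is that every property with more than one shape needs its own wrapper type like this.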

                                                                    • mariusor 2 years ago
                                                                      I'm surprised you didn't find my library because I managed to create a statically typed vocabulary library for Go that maps the specification verbatim: https://pkg.go.dev/github.com/go-ap/activitypub#Object

                                                                      It wasn't easy indeed, and it locked me out of some options to support execution time vocabulary extensions, but hey, it works and it's relatively easy to use.

                                                                      • knjllppppp 2 years ago
                                                                        I'm surprised I didn't find it, too! My google-fu must be getting rusty. Thanks for the link, I'll have to have a deep dive when I get the time :)
                                                                      • yawaramin 2 years ago
                                                                        Here's the implementation described in OP: https://github.com/Gopiandcode/ocamlot

                                                                        OCaml is a statically-typed language. It falls somewhere between Go and Haskell on the spectrum of type 'strength'.

                                                                        • zimpenfish 2 years ago
                                                                          > I'd be very interested to see a flexible/complete AP implementation in any statically typed language

                                                                          Try Honk[1] or GoToSocial[2]?

                                                                          [1] https://humungus.tedunangst.com/r/honk [2] https://github.com/superseriousbusiness/gotosocial

                                                                          • mariusor 2 years ago
                                                                            Neither is flexible, nor strives for completion. They are both implementations that try to map the ActivityPub vocabulary on an existing web-application domain.

                                                                            They are not ActivityPub servers, but web-apps that use the ActivityPub vocabulary to federate, which is what I meant in the grandparent post when I mentioned the classic mistake of ActivityPub implementers. :D

                                                                        • mariusor 2 years ago
                                                                          Well, there is one already as the reference implementation for a suite of libraries I wrote. You can find it at https://github.com/go-ap/fedbox. (Contributions welcome)
                                                                          • zimpenfish 2 years ago
                                                                            Does it only support C2S as the API? Are there any clients which actually support C2S rather than the Mastodon API?
                                                                            • mariusor 2 years ago
                                                                              It does support server to server, but currently it does not play well with Mastodon due to its limited support of HTTP Signatures algorithms. I didn't get bothered enough by this yet to actually fix it on my side.

                                                                              And there are a number of clients that work with this specific brand of client to server ActivityPub but I wrote all of them. The one that can be seen on the internet is a link aggregator similar to HN and (old) reddit, you can find a demo instance at https://brutalinks.tech.

                                                                          • mxuribe 2 years ago
                                                                            There are several websites out there which hope to list many ActivityPub servers (and clients) in many (programming) languages, and other implementation aspects...Like, here's an oldie but goodie website: https://fediverse.party/en/miscellaneous/ ...There are other websites of course.

                                                                            Just select your desired language and review! Now, of course, it might be early days for some languages (e.g. Rust)...But one reason why some languages are used over others is ease of deploying on VPSes and VPS-like hosts (...historically the land that PHP ruled ;-)

                                                                            Enjoy, and I hope you find what you're looking for!

                                                                            • bgorman 2 years ago
                                                                              OCaml code compiles to native binaries, just like Go/Rust.
                                                                              • yawaramin 2 years ago
                                                                                Why specifically those languages? Others can also target modest hardware/small resource Docker hosts.
                                                                                • sangnoir 2 years ago
                                                                                  Pleroma (written in Elixir) is one of the lighter, Mastodon-compatible AP servers available. I recently read a post (a toot actually, but I hate that term) by a Mastodon administrator observing that Pleroma is often a common thread among problematic Fediverse instances because it can run on cheap VPS boxes on throwaway domains. Spammers/griefers can cause a lot of moderation problems for the same amount as a Twitter Blue subscription.

                                                                                  It is a dubious endorsement, but I think it shows how much more efficient Pleroma is than other popular, easy-to-use-OOTB AP servers: 9 out of 10 price-conscious griefers use and endorse Pleroma.

                                                                                  • Xeoncross 2 years ago
                                                                                    Sure, Zig, Nim, D, Erlang, etc. could also do this, but Go and Rust are both big enough and just about as fast and low memory as anything that is available.

                                                                                    Java can be faster than Go, but not by much, and I've always seen it use 5-10x the memory.

                                                                                    Scripting languages like TypeScript, Python, PHP and Ruby can't hold a candle to the speed of Rust and Go while also using significantly more memory. They also don't natively support multiple cores / threads.

                                                                                    Rust and Go represent the most approachable middle ground on all accounts of familiarity, performance (allocs and calculation speed), and large communities with libraries covering whatever you could want.

                                                                              • throwaway290 2 years ago
                                                                                There's also LitePub, though development seems stalled (?)