Ask HN: Best modern file transfer/synchronization protocol?

70 points by daveidol 1 year ago | 58 comments
Hi HN,

I'm looking to build something to transfer files between two devices on a network (one client and one server, both of which are under my control).

Obviously I could write something bespoke with raw TCP sockets, but rather than reinventing the wheel I'm curious about what existing options people recommend. I assume there are some better technologies than FTP nowadays?

Ideally some kind of built-in fault tolerance would be great, because my plan is to use this on a phone/in an environment where the connection could be interrupted.

Edit: just to clarify - this is something I want to build into an application I am writing, ideally with support across iOS (client), Windows, and macOS (server).

One-way transfer is all I need, and I mostly plan on photos/videos (so multiple files ~3-20 MB in size).

Thanks!

  • orbz 1 year ago
    No need to get fancy, scp or rsync are the tried and true options here.
    • brudgers 1 year ago
      Or on Windows, Xcopy.
      • PopAlongKid 1 year ago
        Or robocopy.
        • jftuga 1 year ago
          Yes. It has many options - one of them being the capability of copying multiple files concurrently with the /MT switch.
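
          For example, a fault-tolerant multi-threaded copy might look something like this sketch (paths are placeholders):

              :: /E recurses, /MT:16 copies 16 files at once, /R and /W retry failures
              robocopy C:\photos \\server\photos /E /MT:16 /R:5 /W:5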
        • krylon 1 year ago
          robocopy is nice.
      • ryukoposting 1 year ago
        I use syncthing for this. It's a little fiddly to set up, and transfer speeds aren't great. However, it's very reliable once configured, and it barely uses any resources after an initial scan of the folder you want to sync.
        • BrandoElFollito 1 year ago
          I ended up with Syncthing after trying everything else (that I can self-host).

          You can ease the eureka moment by remembering that each node is completely independent and decides what comes in, and suggests what goes out.

          You can say that a folder is two-way on one node and read-only on another - and this is great because you make decisions locally that allow you to build nice things.

          The real issue is that you have to check on both sides of the pipe what is allowed.

          It is very robust and good for both LAN and remote sync (the traffic goes out to a relay server). Note that many of these relay servers are also Tor nodes, so they can be flagged by your security systems. You can always run your own.
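
          Concretely, the folder type is a per-node setting in each node's config.xml - something like this sketch (ids and paths made up):

              <folder id="photos" path="/home/me/Photos" type="sendonly"> ... </folder>
              <folder id="photos" path="/data/Photos" type="receiveonly"> ... </folder>

          The first node only ever sends; the second only ever accepts.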

        • 2color 1 year ago
          https://iroh.computer/sendme

          https://iroh.computer/

          Iroh is what you’re looking for. You can embed it into your code, and it handles wire encryption, verification, and NAT hole punching, all over QUIC.
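
          For example, if I remember the sendme CLI correctly (the ticket below stands in for the long string it prints):

              sendme send ./photos       # prints a ticket for the receiver
              sendme receive <ticket>    # pulls the data, verified, over QUIC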

        • pvtmert 1 year ago
          I highly recommend Unison (https://github.com/bcpierce00/unison)

          It allows you to sync between two machines (bidirectionally) over TCP or SSH.

          Note that the plain TCP mode is not encrypted; you can use WireGuard as transport-layer encryption for that purpose.

          You can use an external application to copy when the file size is larger than a threshold you choose (e.g. use rsync for files > 1 GB).
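
          For example, a minimal profile sketch (~/.unison/photos.prf, paths made up; copythreshold is in kilobytes):

              root = /Users/me/Photos
              root = ssh://server//data/photos
              copythreshold = 1000000
              copyprog = rsync --partial --inplace --compress

          Then `unison photos -batch` syncs the two roots non-interactively.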

          • q0uaur 1 year ago
            another vote for Unison, been using it for about a year for practically everything and it's great. takes a moment to grasp the concept, but it's 100% worth it.

            that said, i haven't managed to set it up for my android phone yet - it's not available in termux and i have NO idea where to start if i'd like to package it myself. probably has to be done by someone who knows ocaml, since termux' environment is so different from normal linux.

            unless... maybe i should give proot a shot (chroot in termux, lets you run something closer to linux). but it's another layer of complexity on top of everything....

          • whalesalad 1 year ago
            seconding rsync and syncthing.

            the server could expose an smb or nfs share, the client could mount it, and then sync to that mount.

            rsync over ssh also works, if you do not want to run smb/nfs.
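
            e.g. something like this (paths made up):

                # --partial keeps half-transferred files so an interrupted copy resumes
                rsync -av --partial --progress ./photos/ user@server:/data/photos/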

            this is also a cool tool: https://rclone.org/

            • rmorey 1 year ago
              I cannot recommend rclone enough. Been using it to transfer petabyte-scale datasets flawlessly. Available as librclone as well
              • mongol 1 year ago
                rclone is not a protocol, though
                • tetris11 1 year ago
                  it even has a decent Android client on F-Droid, RCX. I just wish it supported SSHFS
                  • gazby 1 year ago
                    RCX is approaching abandonware status (last release 2y ago). Round Sync is worth a look.
                    • tetris11 1 year ago
                      Thanks for the recommendation! I searched F-Droid but found nothing, and then I realised it was probably on Izzy
                  • ksjskskskkk 1 year ago
                    rclone is garbage if you need performance or are just copying between two points you control.

                    it shines when you need to sprinkle your data over many "clouds"

                    • gazby 1 year ago
                      I'd argue the opposite. Rclone's parallelization options are unmatched.
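
                      For instance (remote name is a placeholder):

                          # 16 files in parallel, 8 streams per big file, -P shows progress
                          rclone copy /data/photos remote:photos \
                              --transfers 16 --multi-thread-streams 8 -P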
                      • ksjskskskkk 1 year ago
                        why does parallelization matter if you are sending from/to points you control? that's usually from one machine's SSD to another's.
                      • esafak 1 year ago
                        So what do you use?
                    • arun-mani-j 1 year ago
                      On a similar note, can someone tell me the fastest (wireless) way to transfer files between two laptops on the same network (i.e. hotspot)?

                      scp, rsync, wormhole give me only 2-3 MB/s.

                      For context, I'm trying to transfer about 50-70 GB of files.

                      What's causing the bottleneck here and what am I missing? Thanks in advance!

                      https://github.com/magic-wormhole/magic-wormhole

                      • Fire-Dragon-DoL 1 year ago
                        Filezilla server + client is very fast and has a resume option
                        • neurostimulant 1 year ago
                          Are you transferring a large amount of small files?
                          • arun-mani-j 1 year ago
                            No, it is a large zip file.
                            • neurostimulant 1 year ago
                              Are speed tests (e.g. speed.cloudflare.com) on both laptops also giving similar numbers?

                              If the speed test result is much faster, the bottleneck could be a CPU incapable of encrypting/decrypting data fast enough with the default encryption method used by scp/rsync. In that case, try an unencrypted file transfer instead. Maybe just serve the file temporarily with `python -m http.server`
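
                              Something like this, with made-up host and file names:

                                  python3 -m http.server 8000 --directory /srv/outbox   # on the sender
                                  curl -C - -O http://192.168.1.10:8000/big.zip         # on the receiver; -C - resumes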

                        • oschrenk 1 year ago
                          If it’s one way (that wasn’t quite clear from the requirements to me), take a look at https://tus.io/ - an open protocol for resumable uploads over HTTP.
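
                          The core protocol is plain HTTP, so a hand-rolled sketch with curl looks roughly like this (URL and sizes made up; creating uploads needs the server to support the creation extension):

                              # create the upload; the server answers 201 with a Location URL
                              curl -X POST https://files.example.com/uploads \
                                  -H "Tus-Resumable: 1.0.0" -H "Upload-Length: 15728640"
                              # send bytes; after an interruption, HEAD returns the Upload-Offset to resume from
                              curl -X PATCH https://files.example.com/uploads/abc123 \
                                  -H "Tus-Resumable: 1.0.0" -H "Upload-Offset: 0" \
                                  -H "Content-Type: application/offset+octet-stream" \
                                  --data-binary @video.mp4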

                          • daveidol 1 year ago
                            This is awesome! I am indeed looking for something one way - so this looks great. Thanks for sharing.
                          • bhaney 1 year ago
                            Yeah, I usually just use rsync for this - in a loop if the network is unreliable.
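
                            i.e. something like:

                                until rsync -av --partial ./photos/ user@server:/photos/; do
                                    sleep 5   # wait, then resume where the last attempt stopped
                                done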
                            • rrix2 1 year ago
                              I built something on top of the Syncthing API this week after using it on its own for years.

                              A local instance of Syncthing can behave as a robust sync tool + inotify API for applications consuming the files: https://docs.syncthing.net/rest/events-get.html#get-rest-eve...
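
                              Polling it is a single authenticated GET (the API key comes from the Syncthing GUI settings):

                                  curl -s -H "X-API-Key: $API_KEY" \
                                      'http://localhost:8384/rest/events?since=0'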

                              i believe there's an embeddable golang library, but if you want something easy to use on android, check out syncthing-fork, which lets you define more granular sync conditions, including "just turn on 5 minutes every hour": https://github.com/Catfriend1/syncthing-android

                              • mynegation 1 year ago
                                What are your latency and bandwidth requirements? How big are the files? If you are already looking past obvious TCP-based choices like HTTP and FTP, you might be interested in FASP/Aspera https://en.m.wikipedia.org/wiki/Fast_and_Secure_Protocol

                                Edit: I’ll leave it here just in case it is useful for others but it may or may not be embeddable into your app, especially on the phone.

                                • jayknight 1 year ago
                                  I would probably start with rsync.
                                  • daveidol 1 year ago
                                    Thanks - is there a reference implementation of rsync that works on Windows and iOS? My impression was that it was more of a CLI tool for Linux/macOS.
                                    • teddyh 1 year ago
                                      • necovek 1 year ago
                                        If you are looking to integrate into another app, maybe check https://librsync.github.io/ out.
                                        • daveguy 1 year ago
                                          Unfortunately one of the caveats about what librsync is not (from the link):

                                          librsync also does not include any network functions for talking to SSH or any other server. To access a remote filesystem, you need to provide your own code or make use of some other virtual filesystem layer.

                                          Having this seems to be one of the primary requirements.
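
                                          For what it's worth, the flow librsync implements is the one its bundled rdiff tool exposes; you'd ship the sig/delta files over whatever transport you build yourself:

                                              rdiff signature old-file sig-file          # receiver: summarize what it has
                                              rdiff delta sig-file new-file delta-file   # sender: compute what's missing
                                              rdiff patch old-file delta-file new-copy   # receiver: rebuild the new file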

                                    • sneak 1 year ago
                                      rsync over ssh for one-shots.

                                      syncthing for continuous use.

                                      • thedaly 1 year ago
                                        I've started using rclone instead of rsync for this application. rclone can do segmented transfers and handles large numbers of files better, at least in my experience.
                                        • xnx 1 year ago
                                          rclone is great! So useful as an alternative/superior client for cloud storage too (Google Drive, OneDrive, etc.)
                                          • sneak 1 year ago
                                            What are segmented transfers?
                                            • daveguy 1 year ago
                                              Send multiple pieces of the file and reconstruct it on the other end. It's more reliable to send smaller chunks, and the transfer can be hash-validated on individual chunks and after reconstruction. Multiple download streams typically perform better too (from single or multiple servers).

                                              e.g. BitTorrent for the multiple servers case.
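
                                              You can see the idea with standard tools (file names made up):

                                                  split -b 8M big.zip chunk.       # cut into 8 MB pieces
                                                  sha256sum chunk.* > sums.txt     # hash every piece
                                                  # ...move chunks + sums.txt across, re-sending any piece that fails...
                                                  sha256sum -c sums.txt            # verify each piece on arrival
                                                  cat chunk.* > big.zip            # reassemble in name order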

                                        • smackeyacky 1 year ago
                                          Grab an S3 bucket on Amazon.

                                          Do a 3-way sync with the s3 command line tool.

                                          That way, you have a neat cloud backup as well. Wouldn't take any more than 20 minutes total to set up.
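
                                          e.g. (bucket name made up):

                                              aws s3 sync ./photos s3://my-backup-bucket/photos       # upload side
                                              aws s3 sync s3://my-backup-bucket/photos /data/photos   # download side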

                                          • jftuga 1 year ago
                                            Respectfully disagree. You get charged 9 cents per GB after the first 100 GB of transfer each month. This doesn't fit in well with the OP's environment, which is a LAN.
                                          • eternityforest 1 year ago
                                            What about Jami? It runs almost everywhere and having an embeddable Jami library would be absolutely amazing, although a fair amount of work.
                                            • tripleo1 1 year ago
                                              1. warpinator, syncthing
                                              2. rclone??
                                              3. self-hosted ipfs in tailscale or something? (that would be cool)

                                              • pluto_modadic 1 year ago
                                                continuous sync - mutagen.io (maybe you could extract some of the libraries)

                                                 depends on whether the files are large or small.

                                                • Helmut10001 1 year ago
                                                  rsync. If you are looking for a long term solution: zfs on both sides with zfs send.
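
                                                  e.g. an incremental send between snapshots (pool and dataset names made up):

                                                      zfs snapshot tank/photos@today
                                                      zfs send -i @yesterday tank/photos@today | \
                                                          ssh server zfs recv backup/photos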
                                                  • yetanother12345 1 year ago
                                                    eh... wget or curl? or not modern enough for you?
                                                    • toomim 1 year ago
                                                      Use HTTP. For fault tolerance, use resumable downloads or resumable uploads. There is work at the IETF on resumable uploads right now: https://datatracker.ietf.org/doc/draft-ietf-httpbis-resumabl...
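
                                                      Resumable downloads already work anywhere Range requests are honored; with curl that is just (URL made up):

                                                          curl -r 1048576- -o file.part https://example.com/big.zip   # bytes from 1 MiB onward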
                                                      • Flimm 1 year ago
                                                        I've had terrible experiences downloading large files over HTTP. I'm not sure why, range requests don't seem to be reliable or well supported. Something like BitTorrent is much better for large files: it divides the large file into chunks and hashes each chunk, and by default the chunks are downloaded in random order. BitTorrent seems much more reliable than range requests.
                                                        • otherme123 1 year ago
                                                          Same here. I downloaded some ~100 GB files, and every time the connection broke, after relaunching the shasum never matched, even when the file size was exactly the same as the server reported.