FFmpeg merges WebRTC support

877 points by Sean-Der 1 month ago | 199 comments
  • Sean-Der 1 month ago
    I am so incredibly excited for WebRTC broadcasting. I wrote up some reasons in the Broadcast Box[0] README and the OBS PR [1]

    Now that GStreamer, OBS and FFmpeg all have WHIP support we finally have a ubiquitous protocol for video broadcasting for all platforms (Mobile, Web, Embedded, Broadcasting Software etc...)

    I have been working on Open Source + WebRTC Broadcasting for years now. This is a huge milestone :)

    [0] https://github.com/Glimesh/broadcast-box?tab=readme-ov-file#...

    [1] https://github.com/obsproject/obs-studio/pull/7926

    • bradly 1 month ago
      That pr is really great work both technically and interpersonally. A fun read for sure. Great work and thank you for your determination.
      • maxmcd 1 month ago
        Thanks for all your work Sean! It's been a delight to use your webrtc libs and see your impact across a broad range of technical efforts.
        • Sean-Der 1 month ago
          Thank you :)

          When are you coming back to the WebRTC space, lots more cool stuff you could be doing :) I really loved [0] it's so cool that a user can access a server behind a firewall/NAT without setting up a VPN or having SSH constantly listening.

          [0] https://github.com/maxmcd/webtty

        • monocularvision 1 month ago
          Your work in this area has been phenomenal. Thank you! I use broadcast-box all the time.
          • echelon 1 month ago
            What sort of infrastructure do you need for scaling WebRTC multicast?

            Are we entering an era where you don't need Amazon's budget to host something like Twitch?

            • Sean-Der 1 month ago
              Yes we are :) When OBS merges the PR [0] things are going to get very interesting.

              Before you needed to run expensive transcoding jobs to be able to support heterogeneous clients. Once we get Simulcast the only cost will be bandwidth.

              With Hetzner I am paying $1 a TB. With AV1 or H265 + Simulcast I am getting 4K for hundreds of users on just a single server.

              We will have some growing pains, but I am not giving up until I can make this accessible to everyone.

              [0] https://github.com/obsproject/obs-studio/pull/10885
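A rough sketch of the bandwidth arithmetic behind the comment above. Only the $1 per TB price comes from the comment; the 8 Mbit/s stream bitrate, the 300 viewers, and the decimal terabyte (10^12 bytes) are assumptions picked purely for illustration, and `egress_cost_usd` is a hypothetical helper name.

```python
def egress_cost_usd(bitrate_mbps: float, viewers: int, hours: float,
                    usd_per_tb: float = 1.0) -> float:
    """Estimate egress cost when one stream is forwarded to every viewer.

    Assumes no transcoding cost (the simulcast case described above) and
    a decimal terabyte of 10**12 bytes.
    """
    # Total bits sent = bitrate (bit/s) * seconds * number of viewers.
    bits = bitrate_mbps * 1_000_000 * 3600 * hours * viewers
    terabytes = bits / 8 / 1e12
    return terabytes * usd_per_tb


if __name__ == "__main__":
    # Assumed numbers: 8 Mbit/s stream, 300 viewers, one hour.
    cost = egress_cost_usd(bitrate_mbps=8, viewers=300, hours=1)
    print(f"{cost:.2f} USD per hour")  # prints: 1.08 USD per hour
```

Under these assumptions the server pushes about 1.08 TB per hour, so the cost scales linearly with viewer count and bitrate.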

          • xmprt 1 month ago
            Working in the events broadcasting space, this opens up OBS to being a viable alternative to professional software like vMix. Especially the P2P support and support for broadcasting multiple scenes seem extremely valuable to have.
            • WhyNotHugo 1 month ago
              Are there any video players which can play a webrtc stream? Last I checked, VLC and other popular tools still don’t support it.
              • numpad0 1 month ago
                [1]:

                  gst-launch-1.0 playbin3 uri="gstwebrtc://localhost:8443?peer-id=<webrtcsink-peer-id>"
                
                WebRTC is normally used in bidirectional use cases like video chat with text options, so I don't think it so odd that VLC doesn't outright support it. VLC does not support dialing into an Asterisk server, either.

                [1] https://gstreamer.freedesktop.org/documentation/rswebrtc/web...

                • RedShift1 1 month ago
                  That's impossible, VLC supports everything. If VLC doesn't support it, it doesn't exist.
                  • carlhjerpe 1 month ago
                    While that might be true I've found mpv more approachable when doing weird inputs
                    • mey 1 month ago
                      XAVC HS 4k 10Bit HEVC 4:2:2 on Windows.

                      Plex and ffmpeg, perfectly fine. VLC is not a fan.

                      • oskenso 1 month ago
                        I wish vlc supported usf, 2sf and minigsf
                        • byteknight 1 month ago
                          Amen.
                        • mortoc 1 month ago
                          I'd guess VLC will get support for it soon now that ffmpeg supports it.
                          • Gormo 1 month ago
                            Possibly, but VLC maintains its own codec libraries and doesn't rely on FFMpeg.
                          • bilekas 1 month ago
                            Maybe I'm wrong but in this case, couldn't you create your own middleware server that could consume the WebRTC stream feed and then stream out as a regular vlc consumable feed? I'm guessing there will be some transcoding on the fly but that should be trivial..
                            • shmerl 1 month ago
                              Should ffplay support it if ffmpeg added support for it in general?
                            • rmoriz 1 month ago
                              Any plans to add multipath/failover-bonding support? e.g. mobile streaming unit connected with several 5G modems. Some people use a modified SRT to send H.265 over multiple links.
                              • Sean-Der 1 month ago
                                Absolutely! Some people have modified libwebrtc to do this today, but it wasn't upstreamed.

                                ICE (protocol for networking) supports this today. It just needs to get into the software.

                              • 1oooqooq 1 month ago
                                i was using vnc for remote dosbox gaming on the phone. now i can sink infinite amount of time trying to do a input handler webapp and using this+obs instead! thanks!
                                • athrun 1 month ago
                                  I've also been trying (and mostly failing) to build such a setup over the last few weeks. What are you thinking in terms of the overall building blocks to get this to work?

                                  I've been struggling to get a proper low-latency screen+audio recording going (on macos) and streaming that over WebRTC. Either the audio gets de-sync, or the streaming latency is too high.

                                  • 1oooqooq 1 month ago
                                    games i plan to play don't care about latency, which solves most of your problems :)

                                    but this+obs+a webapp for input+ydotool to pass the input to dosbox. then i can just open a page on the browser on the phone.

                              • jauntywundrkind 1 month ago
                                Not the SCTP parts! It's implementing WebRTC-HTTP Ingestion Protocol (WHIP), a commonly used low-latency HTTP protocol for talking to a gateway that talks actual WebRTC to peers over WebRTC's SCTP-based protocol. https://www.ietf.org/archive/id/draft-ietf-wish-whip-01.html

                                I hope some day we can switch to a QUIC or WebTransport based p2p protocol, rather than use SCTP. QUIC does the SCTP job very well atop existing UDP, rather than add such wild complexity & variance. One candidate is Media-over-QUIC (MoQ), but the browser doesn't have p2p QUIC & progress on that stalled out years ago. https://quic.video/ https://datatracker.ietf.org/group/moq/about/

                                • Sean-Der 1 month ago
                                  How would you like to see/use the SCTP parts? I am not sure how to expose them since the WHIP IETF draft makes no mention/suggestion of it.

                                  Most 'WHIP Providers' also support DataChannel. But it isn't a standardized thing yet

                                  • jauntywundrkind 1 month ago
                                    Actual WebRTC's complexity is very high. WHIP seems to be the standard path for most apps to integrate, but it does rely on an exterior service to actually do anything.

                                    Hypothetically ffmpeg could be an ICE server for peer-connecting, do SDP for stream negotiation possibly with a side of WHEP (egress protocol) as well, could do SCTP for actual stream transfer. Such that it could sort of act as a standalone peer, rather than offload that work to a gateway service.

                                    Worth noting that gstreamer & OBS also are WHIP based, rely on an external gateway for their WebRTC support. There's not one clear way to do a bunch of the WebRTC layer cake (albeit WHEP is fairly popular I think at this point?), so WHIP is a good way to support sending videos, without having to make a bunch of other decisions that may or may not jibe with how someone wants to implement WebRTC in their system; those decisions are all in the WHIP gateway. It may be better to decouple, not try to do it all, which would require specific opinionated approaches.
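To make the WHIP discussion above concrete, here is a minimal sketch of the signaling step: the client sends its SDP offer to the gateway in a single HTTP POST with `Content-Type: application/sdp`. The endpoint URL, the token, and the `build_whip_offer` helper are hypothetical, and the SDP string is a placeholder rather than a real offer. The sketch only builds the request and does not send it.

```python
import urllib.request
from typing import Optional


def build_whip_offer(endpoint: str, sdp_offer: str,
                     bearer_token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the HTTP POST that starts a WHIP session."""
    headers = {"Content-Type": "application/sdp"}
    if bearer_token:
        # WHIP endpoints commonly authenticate publishers with a bearer token.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return urllib.request.Request(
        endpoint,
        data=sdp_offer.encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint and token; "v=0" is a stand-in, not a valid offer.
    req = build_whip_offer("http://localhost:8080/api/whip", "v=0\r\n", "stream-key")
    print(req.get_method(), req.full_url)
```

A gateway that accepts the offer answers with `201 Created`, the SDP answer in the body, and a `Location` header naming the session resource; an HTTP DELETE on that resource ends the session. Everything after that exchange (ICE, DTLS, media) happens outside HTTP.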

                                • qwertox 1 month ago
                                  What does this mean? That websites could connect directly to an FFmpeg instance and receive an audio- and/or video-stream?

                                  Phoronix has a somewhat more informative page: https://www.phoronix.com/news/FFmpeg-Lands-WHIP-Muxer

                                  • bigfishrunning 1 month ago
                                    It means that programs that use the FFmpeg libraries (looks like libavformat specifically) can consume webrtc streams
                                    • okdood64 1 month ago
                                      I still don't understand any practical use cases. Can you give some examples? (I'm not being obtuse here I'm genuinely curious what this can enable now.)
                                      • darkvertex 1 month ago
                                        WebRTC excels at sub-second latency peer to peer, so you can do near-realtime video, so anywhere that is useful.

                                        Say you wanted to do a virtual portal installation connecting views from two different cities with live audio, you could have ffmpeg feed off a professional cinema or DSLR camera device with a clean audio feed and stream that over WebRTC into a webpage-based live viewer.

                                        Or say you wanna do a webpage that remote controls a drone or rover robot, it would be great for that.

                                        • lmm 1 month ago
                                          My first thought is a nice way to save a stream in whatever format you want (e.g. transcode for watching on an old phone or something on your commute), just ffmpeg -i <stream> and then all your usual video format options, instead of having to download it and then convert it afterwards.

                                          ffmpeg also has some processing abilities of its own, so you could e.g. greenscreen (chroma key) from a stream onto an existing video background.

                                          ffmpeg is a pretty low-level building block and as others have said, it's mostly used as a library - a lot of video players or processing tools can now add support for stream inputs easily, and that's probably where the biggest impact is.

                                          • MintPaw 1 month ago
                                            You can only really get a video stream out of Unreal Engine using WebRTC, so now clients can at least use ffmpeg/avconv instead of something even worse like libdatachannel.
                                            • jcelerier 1 month ago
                                              I want my desktop app https://ossia.io which uses ffmpeg to be able to send & receive video to another computer over internet without having to fiddle with opening ports on each other's routers. This combined with a server like vdo.ninja solves that.
                                              • ninkendo 1 month ago
                                                My guess is you could more easily build an open source client for whatever video conferencing system you want that uses WebRTC (most services like teams, discord, zoom, etc seem to use WebRTC as a fallback for browsers, if not using it wholesale for everything, although there may be countermeasures to block unofficial clients.)
                                              • dark-star 1 month ago
                                                Are there any popular/well-known WebRTC senders (or servers)? I'm pretty sure this is not for YouTube etc., right? So what would I watch through WebRTC?
                                                • Sean-Der 1 month ago
                                                  Twitch supports WHIP today. Lots of WebRTC services support WHIP (Cloudflare, LiveKit, Dolby...)

                                                  webrtcHacks has an article on it[0] kind of old, but captures the spirit of it!

                                                  [0] https://webrtchacks.com/tag/simulcast/

                                                • qwertox 1 month ago
                                                  So it's only the receiving part of WebRTC, now being able to use WHIP in order to ask a server for a stream?
                                            • msgodel 1 month ago
                                              That should make self hosting streams/streaming CDNs way easier.

                                              If you know how to use it ffmpeg is such an amazing stand alone/plug and play piece of media software.

                                              • Sean-Der 1 month ago
                                                It's so exciting.

                                                Especially with Simulcast it will make it SO cheap/easy for people.

                                                I made https://github.com/Glimesh/broadcast-box in a hope to make self-hosting + WebRTC a lot easier :)

                                                • eigenvalue 1 month ago
                                                  LLMs really know how to use it incredibly well. You can ask them to do just about any video related task and they can give you an ffmpeg one liner to do it.
                                                  • rietta 1 month ago
                                                    Wow, you are not wrong. I just asked Gemini "how can I use ffmpeg to apply a lower third image to a video?" and it gave a very detailed explanation of using an overlay filter. Have not tested its answer yet but on its face it looks legit.
                                                    • Ajedi32 1 month ago
                                                      It could very well be legit, but if you "have not tested its answer yet" the fact that it can generate something that looks plausible doesn't really tell you much. Generating plausible-sounding but incorrect answers is like the #1 most common failure mode for LLMs.
                                                      • refulgentis 1 month ago
                                                      It's amazing --- I cut my teeth in software engineering with ffmpeg-related work 15 years ago, LLMs generating CLI commands with filters etc. is right up there with "bash scripts" as things LLMs turned from "theoretically possible, but no thanks unless you're paying me" into fun, easy, and regular.

                                                      Yesterday I asked it for a command to take a 14 minute video, play the first 10 seconds in realtime, and rest at 10x speed. The ffmpeg CLI syntax always seemed to be able to do anything if you could keep it all in your head, but I was still surprised to see that ffmpeg could do it all in one command.
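For illustration, one way such a single command could be assembled. This is a sketch, not the command the commenter received: it handles video only (audio is dropped with `-an`), the file names are placeholders, and `speedup_after` is a hypothetical helper. It relies on ffmpeg's `split`, `trim`, `setpts`, and `concat` filters.

```python
import shlex
from typing import List


def speedup_after(src: str, dst: str, normal_secs: float = 10,
                  factor: float = 10) -> List[str]:
    """Build an ffmpeg argv: first `normal_secs` at 1x, the rest at `factor`x."""
    graph = (
        # Duplicate the video so each half can be trimmed independently.
        "[0:v]split=2[a][b];"
        # Head: keep the first seconds and reset timestamps to start at zero.
        f"[a]trim=start=0:end={normal_secs},setpts=PTS-STARTPTS[head];"
        # Tail: keep the rest and divide timestamps to speed playback up.
        f"[b]trim=start={normal_secs},setpts=(PTS-STARTPTS)/{factor}[tail];"
        # Join the two pieces back into a single video stream.
        "[head][tail]concat=n=2:v=1:a=0[out]"
    )
    return ["ffmpeg", "-i", src, "-filter_complex", graph,
            "-map", "[out]", "-an", dst]


if __name__ == "__main__":
    # Print a shell-quoted command line instead of running it.
    print(shlex.join(speedup_after("input.mp4", "output.mp4")))
```

Building the argv as a list and quoting it with `shlex.join` keeps the filter graph readable and avoids shell escaping mistakes around the brackets and semicolons.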

                                                        • karel-3d 1 month ago
                                                          "Have not tested its answer yet but on its face it looks legit."

                                                          That's LLMs for you

                                                        • 65 1 month ago
                                                          It can't be a Hacker News thread without at least one mention of LLMs, even if the thread is completely unrelated.
                                                        • jmuguy 1 month ago
                                                          It really is, this comic always comes to mind https://xkcd.com/2347/
                                                        • esbeeb 1 month ago
                                                          Gajim, the XMPP client, has been awaiting this for a long time! Their Audio/Video calling features fell into deprecation, and they've been patiently waiting for FFmpeg to make it much easier for them to add Audio/Video calling features back again.
                                                          • dedosk 1 month ago
                                                            Gajim and XMPP is still used out there? I miss the days when I could use pidgin for chat apps.

                                                          Now it is all walled garden/app-per-service.

                                                            • rw_grim 1 month ago
                                                              There's plugins for most of the modern stuff at https://pidgin.im/plugins
                                                              • NicuCalcea 1 month ago
                                                                I'm quite happy with Beeper, it still has some bugs and isn't open source, but it saves me from remembering where different contacts live.
                                                            • matt3210 1 month ago
                                                              I love seeing the Anubis graphics unexpectedly. I’ve seen it at ffmpeg and gnu so far (among others)
                                                              • crabmusket 1 month ago
                                                                I do too, but this time it won't let me in :/
                                                              • autoexec 1 month ago
                                                                Hopefully this doesn't make it more dangerous to keep ffmpeg on our systems. WebRTC security flaws are responsible for a lot of compromises. It's one of the first things I disable after installing a browser
                                                                • Sean-Der 1 month ago
                                                                  What security flaws?

                                                                  This implementation is very small. I feel 100% confident we are giving users the best thing possible.

                                                                  • autoexec 1 month ago
                                                                    most recently: https://cyberpress.org/critical-libvpx-vulnerability-in-fire..., but you can have your pick from any year https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=webrtc

                                                                  You're right that the biggest reason people usually recommend disabling it is to prevent your IP from leaking when using a VPN https://www.techradar.com/vpn/webrtc-leaks but not having to worry about RCE or DoS is a nice bonus

                                                                  I'm not sure how much this will impact ffmpeg users. Considering that WebRTC has a bad track record in terms of security though, I do worry a little that its inclusion in one more place on our systems could increase attack surface.

                                                                  • globie 1 month ago
                                                                    I assume autoexec is referring to the plethora of WebRTC vulnerabilities which have affected browsers, messengers, and any other software which implements WebRTC for client use. Its full implementation is seemingly difficult to get right.

                                                                    Of course, you're right that this implementation is very small. It's very different than a typical client implementation, I don't share the same concerns. It's also only the WHIP portion of WebRTC, and anyone processing user input through ffmpeg is hopefully compiling a version enabling only the features they use, or at least "--disable-muxer=whip" and others at configure time. Or, you know, you could specify everything explicitly at runtime so ffmpeg won't load features based on variable user input.

                                                                    • gruez 1 month ago
                                                                      >I assume autoexec is referring to the plethora of WebRTC vulnerabilities which have affected browsers, messengers, and any other software which implements WebRTC for client use. Its full implementation is seemingly difficult to get right.

                                                                      Like what? I did a quick search and most seem to be stuff like ip leaks and fingerprinting, which isn't relevant in ffmpeg.

                                                                    • codedokode 1 month ago
                                                                      Leaking local IP addresses?
                                                                    • morepedantic 1 month ago
                                                                      ffmpeg is high performance code dealing with esoteric codecs and binary formats in C, so don't sweat it.
                                                                      • dylan604 1 month ago
                                                                      is this something that one could compile with a --without-whip type of argument if you don't want/need? that would be an ideal thing.
                                                                        • marxisttemp 1 month ago
                                                                          Yes, pretty much every bit of ffmpeg can be enabled or disabled when compiling.
                                                                        • mschuster91 1 month ago
                                                                          > Hopefully this doesn't make it more dangerous to keep ffmpeg on our systems.

                                                                        ffmpeg has had so many issues in the past [1], it's best practice anyway to keep it well contained when dealing with user input. Create a docker image with nothing but ffmpeg and its dependencies installed and do a "docker run" for every transcode job you got. Or maybe add ClamAV, OpenOffice and ImageMagick in the image as well if you also need to deal with creating thumbnails of images and documents.

                                                                          And personally, I'd go a step further and keep the servers that deal with user-generated files in more than accepting and serving them in their own, heavily locked down VLAN (or Security Group if you're on AWS).

                                                                          That's not a dumbass criticism of any of these projects mentioned by the way. Security is hard, especially when dealing with binary formats that have inherited a lot of sometimes questionably reverse engineered garbage. It's wise to recognize this before getting fucked over like 4chan was.

                                                                          [1] https://ffmpeg.org/security.html
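A minimal sketch of the "one docker run per transcode job" idea described above. The image name `ffmpeg-sandbox:latest` is hypothetical and is assumed to have `ffmpeg` as its entrypoint; the memory and process limits are arbitrary illustrative values. The function only builds the argument list, so nothing is executed here.

```python
from pathlib import Path
from typing import List


def sandboxed_ffmpeg_cmd(workdir: str, ffmpeg_args: List[str],
                         image: str = "ffmpeg-sandbox:latest") -> List[str]:
    """Build a `docker run` argv for a single, throwaway transcode job."""
    host_dir = Path(workdir).resolve()
    return [
        "docker", "run", "--rm",                 # remove the container afterwards
        "--network", "none",                     # no network access for the job
        "--read-only",                           # read-only root filesystem
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--memory", "1g",                        # assumed limit
        "--pids-limit", "256",                   # assumed limit
        "-v", f"{host_dir}:/work",               # only the job directory is visible
        "-w", "/work",
        image,
        *ffmpeg_args,                            # passed to the assumed ffmpeg entrypoint
    ]


if __name__ == "__main__":
    cmd = sandboxed_ffmpeg_cmd("./job-123", ["-i", "upload.bin", "out.mp4"])
    print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)`. These flags narrow what a compromised process can reach, but as the reply below points out, a container on its own is not a complete security boundary.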

                                                                          • xxpor 1 month ago
                                                                            if you're worried about arbitrary code exec from an ffmpeg vuln, docker is not a sufficient security boundary.
                                                                      • FrostKiwi 1 month ago
                                                                        OMG YEEEEES. I'm building web based remote control and if this allows me to do ffmpeg gdigrab, have that become a WebRTC stream and be consumed by a client without the ExpressJS gymnastics I do right now, I'll be over the moon.
                                                                        • tyre 1 month ago
                                                                          Interesting I keep getting blocked by the bot detection on iOS safari, both from our work WiFi and cellular data.

                                                                          Anubis let me go

                                                                          • jsheard 1 month ago
                                                                            Are you getting the "access denied" page, or an infinite challenge loop?
                                                                            • kairosisme 1 month ago
                                                                              FWIW I also can’t pass the Anubis pass on iOS Safari, even though I can on any other site. I see the Anubis success screen for a moment before it switches to the “invalid response” screen.

                                                                              edit: Trying again a few minutes later worked

                                                                            • __turbobrew__ 1 month ago
                                                                              I got stuck on access denied. Canada IPv4. Safari on iOS.
                                                                            • xena 1 month ago
                                                                              Do you happen to have a dual-stack network?
                                                                              • dyl000 1 month ago
                                                                                Anubis isn’t letting me through ;(
                                                                                • Mofpofjis 1 month ago
                                                                                  A commit that was "co-authored-by" 6+ people and has three thousand lines of code: this is a total wreck of a development workflow. This feature should have been implemented with a series of about 20 patches. Awful.
                                                                                  • Daemon404 1 month ago
                                                                                    (long time FFmpeg dev here)

                                                                                    You are being downvoted, but you are entirely correct. This is also explicitly not allowed in FFmpeg, but this was pushed after many months, with no heads up on the list, no final review sign off, and with some developers expressing (and continuing to express) reservations about its quality on the list and IRC.

                                                                                    • bigfishrunning 1 month ago
                                                                                      That's really unfortunate to hear. I'm a huge fan of Webrtc and Pion, and was very excited to get some ffmpeg integration -- hopefully some of the quality issues will be ironed out before the next ffmpeg release
                                                                                      • Daemon404 1 month ago
                                                                                        There's quite some time until the next release, I believe, so it should be.

                                                                                        The biggest thing missing right now is NACK support, and one of the authors has said they intend to do this (along with fixing old OpenSSL version support, and supporting other libraries). Until that is done, it isn't really "prod ready", so to speak.

                                                                                        For some context, there has been a history of half-supported things being pushed to FFmpeg by companies or people who just need some subset of $thing, in the past, and vendors using that to sell their products with "FFmpeg isn't good enough" marketing, while the feature is either brought up to standard, or in some cases, removed, as the original authors vanish, so it's perhaps a touchy subject for us :) (and why my post was perhaps unnecessarily grumpy).

                                                                                        As for the git / premature push stuff, I strongly believe it is a knock-on effect of mailing list based development - the team working on this support did it elsewhere, and had a designated person send it to the list, meaning every bit of communication is garbled. But that is a whole different can of worms :D.

                                                                                    • jpk 1 month ago
                                                                                      I mean, it probably was a branch that several people contributed commits to that was squashed prior to merge into mainline. Folks sometimes have thoughts about whether there's value in squashing or not, but it's a pretty common and sensible workflow.
                                                                                      • fc417fc802 1 month ago
                                                                                        > common and sensible

                                                                                        Perhaps "common and technically works" would be a better way to put that (similarly for rebase). I suspect people would stop squashing if git gained the ability to tag groups of commits with topics in either a nested or overlapping manner.

                                                                                    • sylware 1 month ago
                                                                                      OMG, this is not completely brain damaged c++ code lost in the middle of one of the web engines from the whatng cartel??? or C code with one billion dependencies with absurd SDKs???

                                                                                      Quick! Quick! I need to find something bad about it... wait... AH!

                                                                                      Does it compile with the latest libressl? Hopefully not (like python _ssl.c) and I can start talking bad about it.

                                                                                      ;P

                                                                                      Ofc, that was irony.

                                                                                    We all know the main issue with webRTC is not its implementations, but webRTC itself.

                                                                                      All that said, it is exactly at this very time twitch.tv chose to break ffmpeg HLS (its current beta HLS streams are completely breaking ffmpeg HLS support...).

                                                                                      • pkz 1 month ago
                                                                                        Does this mean that ffmpeg now can record a Jitsi video meeting audio stream?
                                                                                        • throwpoaster 1 month ago
                                                                                          What’s ffmpeg security auditing like? Seems reactive from their site.
                                                                                          • SeriousM 1 month ago
                                                                                            I can't wait to see this implemented in Jellyfin!
                                                                                          • leland-takamine 1 month ago
                                                                                            Anyone been able to successfully build ffmpeg from source to include whip support? Struggling to figure out the right ./configure options
                                                                                          • cranberryturkey 1 month ago
                                                                                            Can someone ELI5 what this means? I've been using ffmpeg for over a decade.
                                                                                            • esbeeb 1 month ago
                                                                                              WebRTC is very, very hard to code for. But if FFmpeg abstracts that complexity away, then WebRTC becomes much easier to add to a software project wishing to benefit from what WebRTC offers.
                                                                                              • cranberryturkey 1 month ago
                                                                                                I guess I still don't understand. You don't really "code" with ffmpeg. It just is used to transform media formats or publish to a public streaming endpoint.
                                                                                                • marxisttemp 1 month ago
                                                                                                  All of ffmpeg’s functionality is accessible from C (and transitively most other programming languages) via libavformat, libavcodec etc. FFmpeg supporting WebRTC means that projects using these libraries gain support for WebRTC in code.
                                                                                            • shmerl 1 month ago
                                                                                              Does it allow more realtime streaming than SRT on LAN?

                                                                                              I'm still waiting for ffmpeg CLI tool to merge pipewire + xdg-desktop-portal support. You still can't record a screen or window on Wayland with it.

                                                                                              • Sean-Der 1 month ago
                                                                                                With WebRTC you can expect ~100ms with zero optimizations on your LAN.

                                                                                                With bitwhip[0] I got it way lower than that even.

                                                                                                [0] https://github.com/bitwhip/bitwhip

                                                                                                • shmerl 1 month ago
                                                                                                  That's nice. I had a hard time getting low latency with SRT, but managed to get to slightly under one second using gpu-screen-recorder on one end and ffplay on the other end with flags for low latency.
                                                                                              • chompychop 1 month ago
                                                                                                I have a beginner question - Can WebRTC be used as an alternative to sending base64-encoded images to a backend server for image processing? Is this approach recommended?
                                                                                              • ec109685 1 month ago
                                                                                                Why doesn’t a PR of that magnitude come with tests?
                                                                                                • bigfishrunning 1 month ago
                                                                                                  Using Pion no less! very cool!
                                                                                                • wang_zuo 1 month ago
                                                                                                  The author seems to be an undergraduate from China. Very impressive!
                                                                                                  • mrheosuper 1 month ago
                                                                                                    so RTC is real time communication, not real time clock...
                                                                                                    • karlkloss 1 month ago
                                                                                                      "Sadly, you must enable JavaScript to get past this challenge."

                                                                                                      Nope. Get lost. Running random code from websites you don't know is asking for disaster.

                                                                                                      • theobr 1 month ago
                                                                                                        Absolutely huge
                                                                                                        • alexfromapex 1 month ago
                                                                                                          It would be cool to have a chat too
                                                                                                          • quantadev 1 month ago
                                                                                                            Public Service Announcement: There's a subreddit for WebRTC that doesn't get enough action imo! Get in there, y'all...

                                                                                                            https://www.reddit.com/r/WebRTC

                                                                                                            • spartanatreyu 1 month ago
                                                                                                              No.

                                                                                                              Reddit lost their own community's trust when the CEO ejected the community's moderators.

                                                                                                              Information posted there is now far less likely to be high quality compared to other places, so what's the point of going there?

                                                                                                          • MrThoughtful 1 month ago
                                                                                                            I know there are JavaScript ports of FFmpeg and I would love to use them. But so far, I never got it working. I tried it with AI and this prompt:

                                                                                                                Make a simple example of speeding up an mp4
                                                                                                                video in the browser using a version of ffmpeg
                                                                                                                that runs in the browser. Don't use any server
                                                                                                                side tech like node. Make it a single html file.
                                                                                                            
                                                                                                            But so far every LLM I tried failed to come up with a working solution.
                                                                                                            • bastawhiz 1 month ago
                                                                                                              If you visit the ffmpeg.wasm documentation, the first example on the Usage page does almost exactly this:

                                                                                                              https://ffmpegwasm.netlify.app/docs/getting-started/usage

                                                                                                              It transcodes a webm file to MP4, but making it speed up the video is trivial: just add arguments to `ffmpeg.exec()`. Your lack of success in this task comes from trusting an LLM to know about cutting-edge libraries and how to use them, not from a lack of progress in the area.
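
                                                                                                              A minimal sketch of what "just add arguments" might look like, assuming `ffmpeg.exec()` accepts an array of CLI-style arguments as on the usage page above. The helper name `speedUpArgs` and the file names are made up for illustration, and it assumes the input has an audio track; `setpts` and `atempo` are the standard FFmpeg filters for retiming video and audio:

```javascript
// Hypothetical helper (not part of ffmpeg.wasm): builds the CLI-style
// argument array for speeding a video up by `factor`.
function speedUpArgs(input, output, factor) {
  // Keep the factor in the 0.5-2.0 range that a single atempo
  // instance accepts on older FFmpeg builds.
  if (!(factor >= 0.5 && factor <= 2)) {
    throw new RangeError("factor must be between 0.5 and 2");
  }
  return [
    "-i", input,
    "-filter:v", `setpts=PTS/${factor}`, // compress video timestamps
    "-filter:a", `atempo=${factor}`,     // speed up audio by the same factor
    output,
  ];
}

// Prints: -i input.mp4 -filter:v setpts=PTS/2 -filter:a atempo=2 output.mp4
console.log(speedUpArgs("input.mp4", "output.mp4", 2).join(" "));
```

                                                                                                              With an already-loaded instance, the call would presumably be `await ffmpeg.exec(speedUpArgs("input.mp4", "output.mp4", 2))`. The factor is capped at 2 here because older FFmpeg builds limit a single `atempo` instance to the 0.5 to 2.0 range.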

                                                                                                              • MrThoughtful 1 month ago
                                                                                                                The problem is that they don't provide the full code that can run in the browser. I have not managed to get the function they show in the first example to run in the browser.
                                                                                                                • Matheus28 1 month ago
                                                                                                                  You don’t need an LLM to do that. The code in there is almost complete…
                                                                                                                  • bastawhiz 1 month ago
                                                                                                                    That's just wrong. The example is live: you can run it right there on the page. If the code isn't working when you write it, you're probably importing something incorrectly (or you're not running it in an environment with React, which is where the `use*` functions come from). You can even click on the source of the log lines when the example is running (on the right edge of the Chrome console) to jump into the hot-loaded code and see the exact code that's running it.
                                                                                                                    • numpad0 1 month ago
                                                                                                                      I just threw that prompt into the free ChatGPT, looks like it'll have a few versioning issues as well as CORS issues...
                                                                                                                  • simlevesque 1 month ago
                                                                                                                    Don't try to do cutting edge stuff with a brain that doesn't know anything past a certain date.
                                                                                                                    • colechristensen 1 month ago
                                                                                                                      Trying to do things off the beaten path with LLMs is rarely successful, especially if there's a related much more popular option.

                                                                                                                      I'm convinced that programmers' bias towards LLMs is strongly correlated with the weirdness of their work. Very often my strange ideas pushed to LLMs look like solutions but are rather broken and hallucinated attempts which only vaguely represent what needs to be done.

                                                                                                                      • bigfishrunning 1 month ago
                                                                                                                        > I'm convinced that programmers' bias towards LLMs is strongly correlated with the weirdness of their work.

                                                                                                                        This is an extremely astute observation; my work has always been somewhat weird and I've never found LLMs to be more than an interesting party trick.

                                                                                                                      • minimaxir 1 month ago
                                                                                                                        The JS ports of FFmpeg (or WASM port if you want the in-browser approach) are very old and would be more than present in modern LLM training datasets, albeit likely not enough of a proportion for LLMs to understand it well.

                                                                                                                        https://github.com/Kagami/ffmpeg.js/

                                                                                                                        https://github.com/ffmpegwasm/ffmpeg.wasm

                                                                                                                      • rvz 1 month ago
                                                                                                                        > But so far every LLM I tried failed to come up with a working solution.

                                                                                                                        Maybe you need to actually learn how it works instead of deferring to LLMs that have no understanding of what you are specifically requesting.

                                                                                                                        Just read the fine documentation.

                                                                                                                        • prophesi 1 month ago
                                                                                                                          Entered the same prompt with Sonnet 4. Just needed to paste the two errors in the console (trying to load the CDN which won't work since it uses a web worker, and hallucinated an ffmpegWasm function) and it output an HTML file that worked.
                                                                                                                          • MrThoughtful 1 month ago
                                                                                                                            Can you put it on jsfiddle or some other codebin? I would love to see it.
                                                                                                                          • jsheard 1 month ago
                                                                                                                            I'm sorry, but if you give up on something you would "love to use" just because LLMs are unable to oneshot it then you might be a bit too dependent on AI.
                                                                                                                            • minimaxir 1 month ago
                                                                                                                              Time is a finite resource, and there's an opportunity cost. If an easy PoC for a complex project can't be created using AI and it would take hours/days to create a PoC organically that may not even be useful, it's better project management to just do something else entirely if it's not part of a critical path.
                                                                                                                              • bastawhiz 1 month ago
                                                                                                                                I can't disagree with this take more vehemently. This isn't an "easy PoC". This is "copy and paste it from the docs"-level effort:

                                                                                                                                https://ffmpegwasm.netlify.app/docs/getting-started/usage/

                                                                                                                                If you can't be arsed to google the library and read the Usage page and run the _one command_ on the Installation page to come up with a working example (or: tweak the single line of the sample code in the live editor in the docs to do what you want it to do), how do you expect to do anything beyond "an easy PoC"? At what point does your inability/unwillingness to do single-digit-minutes of effort to explore an idea really just mean you aren't the right person for the job? Hell, even just pasting the code sample into the LLM and asking it to change it for you would get you to the right answer.

                                                                                                                            • ch_sm 1 month ago
                                                                                                                              If you're really interested in doing that, I'm certain you can with a bit of effort. There are plenty of docs and examples online.
                                                                                                                              • mort96 1 month ago
                                                                                                                                You know there's ... documentation, right?
                                                                                                                                • ycombinatrix 1 month ago
                                                                                                                                  LLM is my eyes. LLM is my ears. LLM is my documentation. I am LLM.