Real-time action chunking with large models

84 points by pr337h4m 3 weeks ago | 9 comments
  • fennecbutt 3 weeks ago
    Alright, I'm building the robot project I was putting off. This is so fucking cool.

    Excellent work!

    • jauntywundrkind 3 weeks ago
      Anyone have good intro recommendations for VLAs?
    • UltraSane 3 weeks ago
      I love the implications of a robot that can plug in Ethernet cables.
      • lysp 3 weeks ago
        Just need one that can plug in USB-A cables on the first attempt (I average 3 attempts).
        • meepmorp 3 weeks ago
          “Soon, a robot will fix the cables in the server room for me!”
          • LoganDark 3 weeks ago
            New job title: Spaghetti Organizer
        • b0a04gl 3 weeks ago
          rtc handling 300ms+ delay and still pulling off tasks like plugging in ethernet is kinda nuts. what i'm not getting is how it's keeping the control loop stable without retraining. some sort of latent plan caching?
          • kvablack 3 weeks ago
            It uses an inpainting algorithm (adapted from the image generation literature) to produce future actions that are consistent with the current trajectory. It's sort of like warm-starting from a cached plan, although the plan isn't latent; it's directly in action space. Hopefully that answers your question -- there are many more details in the blog post and paper :)
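
For readers who want something concrete, here is a toy sketch of the "warm-start from a cached plan in action space" idea described above. It is not the algorithm from the blog post or paper: the real method applies inpainting during generation, whereas this sketch simply blends two finished chunks. The function names (`blend_weights`, `inpaint_chunk`), the linear decay schedule, and the scalar actions are all assumptions made for illustration. Steps that will execute while inference is still running are frozen to the previous plan, the overlap fades toward the freshly generated actions, and everything past the old plan is taken from the new chunk.

```python
def blend_weights(horizon, delay, prev_len):
    """Weight placed on the previous plan at each step of the new chunk.

    Hypothetical schedule: 1.0 for the first `delay` steps (already
    committed), a linear decay over the rest of the overlap, then 0.0.
    """
    weights = []
    span = prev_len - delay
    for i in range(horizon):
        if i < delay:
            weights.append(1.0)  # frozen: these run during inference
        elif i < prev_len:
            weights.append(1.0 - (i - delay + 1) / (span + 1))
        else:
            weights.append(0.0)  # no previous plan left to respect
    return weights


def inpaint_chunk(prev_remaining, fresh, delay):
    """Blend a fresh chunk with the unexecuted tail of the previous one.

    prev_remaining: actions left over from the previous chunk,
                    starting at the current timestep.
    fresh:          newly generated candidate chunk.
    delay:          control steps that elapse during inference.
    """
    horizon = len(fresh)
    prev_len = min(len(prev_remaining), horizon)
    if delay > prev_len:
        raise ValueError("delay exceeds the remaining previous plan")
    weights = blend_weights(horizon, delay, prev_len)
    blended = []
    for i in range(horizon):
        prev_action = prev_remaining[i] if i < prev_len else 0.0
        blended.append(weights[i] * prev_action + (1.0 - weights[i]) * fresh[i])
    return blended


# Example: 4 actions remain from the old plan, inference takes 2 steps.
chunk = inpaint_chunk([1.0, 1.0, 1.0, 1.0], [0.0] * 6, delay=2)
```

In the example, the first two actions match the old plan exactly, the next two interpolate toward the new chunk, and the last two come entirely from it, so there is no jump at the point where the new chunk takes over.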