Omnio: First AI model that can natively reason over audio

13 points by lukax 8 months ago | 8 comments
  • lukax 8 months ago
    We built Omnio to address the limitations we kept running into with existing audio AI models (we previously built an automatic speech recognition product). Most of them rely heavily on speech-to-text, which strips out things like speaker roles, emotions, and non-verbal cues that are critical in more complex scenarios. Omnio processes audio signals directly to capture that kind of context. It's designed to understand conversations in a way that feels more "human."
    • barrenko 8 months ago
      So this is speech-to-what? You're kinda missing that in your info. Are you guys based in the USA or Ljubljana?
    • LukaFurlan 8 months ago
      insane release
      • sharpshadow 8 months ago
        Can this get me the lyrics of rap songs or not?
        • overlord_tm 8 months ago
          Try it out; you get free credits on signup. It worked surprisingly well for me on this example: https://www.youtube.com/watch?v=DxkeOkaVRLo
          • sharpshadow 8 months ago
            Yes, I saw it, thanks, and will try it out. The introduction blog post said the beta was only for paid developers, but it seems to be free for all.
            • lukax 8 months ago
              Whoops. We've updated the website. Omnio is available to all developers and all accounts receive $5.00 in free credits.
          • easwee 8 months ago
            I tried Eminem - Rap God with a "transcribe word by word" prompt and it did quite well. I just had to set the temperature to 0; otherwise it can get creative.