On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don’t require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

    • bitfucker@programming.dev
      7 months ago

      Someone was bound to make this someday. At least the maker is announcing it, which is decent enough. Actually, I have always thought that if AI can generate images and voices, what is stopping someone from committing identity theft? And BAM, we are now in an age where digital data will soon be unreliable unless we have a protocol in place to prove the origin of the data.

    • duncesplayed@lemmy.one
      7 months ago

      If you pump out enough research papers, maybe Microsoft won’t move you over to the Office team.