Microsoft’s new VASA-1 AI model can turn photos into ‘talking faces’

TL;DR

  • Microsoft introduced VASA-1, an AI model that transforms still images into 'talking faces'.
  • VASA-1 demonstrates impressive lip-sync capability and realistic head movements.
  • While promising for animation and AI filmmaking, Microsoft has no immediate plans for commercial release due to concerns about potential misuse.

Microsoft has provided a glimpse of VASA-1, its new artificial intelligence (AI) model, which can turn still images into ‘talking faces’ to great effect.

The end product can be impressive or terrifying, but the lip-sync in the demos is very realistic. At present, the model is only available as a research preview to Microsoft researchers, but the demos released to the public have created a stir.

It’s Microsoft’s latest move in the ongoing battle for generative AI supremacy. Earlier this week, the company announced a huge AI investment in the UAE, while rival Meta released its AI assistant across its platforms.

The premise is that anyone can upload a photo and a voice sample to create an apparently live talking head of their own face. VASA-1 takes a single photo and a brief audio file and converts them into a convincing talking-face video.

What makes it stand out is the quality of the lip-sync, the head movements and the recognizable facial features.

There will be genuine uses for such a program, but safeguards will be required, as ever with AI, due to the potential for misinformation and malicious use. Microsoft has acknowledged this, admitting that “like other related content generation techniques, (VASA-1) could still potentially be misused for impersonating humans.”

The research report continued, “Given such context, we have no plans to release an online demo, API, product, additional implementation details, or any related offerings until we are certain that the technology will be used responsibly and in accordance with proper regulations.”

What will VASA-1 be used for?

The lip-sync quality of this program needs to be seen to be believed, as shown by the demo of the Mona Lisa rapping. Word perfect? Pretty much. Researchers were reportedly pleasantly surprised by just how well it performed.

VASA-1 appears to be a great fit for animation, from gaming to social media avatars and AI filmmaking, but as stated above, there are no current plans for the project to develop beyond a research demonstration.

That could change, as developers will be very keen to start working with the model.

Image credit: Microsoft

Graeme Hanna
Tech Journalist

Graeme Hanna is a full-time, freelance writer with significant experience in online news as well as content writing. Since January 2021, he has contributed as a football and news writer for several mainstream UK titles including The Glasgow Times, Rangers Review, Manchester Evening News, MyLondon, Give Me Sport, and the Belfast News Letter. Graeme has worked across several briefs including news and feature writing in addition to other significant work experience in professional services. Now a contributing news writer at ReadWrite.com, he is involved with pitching relevant content for publication as well as writing engaging tech news stories.
