Generative AI model Fugatto unveiled, can create never-before-heard sounds

TLDR

  • Nvidia's Fugatto AI creates unique sounds from prompts that mix text and audio.
  • Designed for music, ads, and gaming, it generates and transforms sounds instantly.
  • The full model has 2.5B parameters and was trained on Nvidia DGX systems with H100 GPUs.

Researchers at Nvidia have created a new generative AI model named Fugatto, which can generate ‘entirely new sounds’ from any mix of music, voices, and sounds.

The tool has been likened to a ‘Swiss Army knife’ for sound because it lets users control audio output using text prompts alone.

The Fugatto model, short for Foundational Generative Audio Transformer Opus 1, can generate or transform any mix of music, voices and sounds described with prompts using any combination of text and audio files.

“This thing is wild,” said Ido Zmishlany, a multi-platinum producer and songwriter, and cofounder of One Take Audio, a member of the NVIDIA Inception program for cutting-edge startups, in a blog post announcing the AI model.

“Sound is my inspiration. It’s what moves me to create music. The idea that I can create entirely new sounds on the fly in the studio is incredible.”

Rafael Valle, a manager of applied audio research at the technology giant and one of the dozen-plus people behind Fugatto, said: “We wanted to create a model that understands and generates sound like humans do.

“Fugatto is our first step toward a future where unsupervised multitask learning in audio synthesis and transformation emerges from data and model scale.”

AI model Fugatto can create “new sounds on the fly”

The researchers have listed numerous use cases, including music producers using Fugatto to quickly prototype or edit an idea for a song, trying out different styles, voices and instruments.

In another situation, the team says an ad agency could use the model to quickly target an existing campaign for multiple regions or situations, using different accents and emotions. Video game developers could use Fugatto to modify prerecorded assets in their titles to fit the changing action as users play the game.

The full version of the tool uses 2.5 billion parameters and was trained on a bank of Nvidia DGX systems packing 32 Nvidia H100 Tensor Core GPUs.

The team behind the new model drew on people from around the world, including India, Brazil, China, Jordan and South Korea. This collaboration is said to have strengthened Fugatto’s multi-accent and multilingual capabilities.

Featured Image: AI-generated via Ideogram

Sophie Atkinson
Tech Journalist

Sophie Atkinson is a UK-based journalist and content writer, as well as the founder of a content agency which focuses on storytelling through social media marketing. She kicked off her career with a Print Futures Award, which champions young talent working in print, paper and publishing. Heading straight into a regional newsroom after graduating with a BA (Hons) degree in Journalism, Sophie started by working for Reach PLC. Now, with five years' experience in journalism and many more in content marketing, Sophie works as a freelance writer and marketer. Her areas of specialty span a wide range, including technology, business,…
