Audio Generation Tool, Audiobox, Introduced by Meta

Audiobox empowers users to utilize text description prompts to specify and manipulate sound effects, expanding the range of controllable features.

By Boudhayan Ghosh

December 12, 2023

Meta introduced the Audiobox as their latest foundational research model for audio generation. Within this family of models are specialized versions such as Audiobox Speech and Audiobox Sound.

These models enable the creation of voices and sound effects by amalgamating voice inputs with natural language prompts, catering to diverse audio needs. Underlying these variants is the Audiobox SSL, a self-supervised model forming the common foundation for all Audiobox iterations.

Audiobox further permits users to merge an audio voice input alongside a textual style prompt, facilitating the synthesis of speech in various environments or emotional tones, such as speaking in a cathedral or expressing sadness at a slower pace.

The inclusion of text and voice inputs significantly amplifies Audiobox’s controllability in contrast to other Meta inventions like Voicebox. Audiobox empowers users to utilize text description prompts to specify and manipulate sound effects, expanding the range of controllable features. When combined, the voice input establishes the fundamental timbre, while the text prompt becomes a tool for altering other attributes.

Audiobox inherits Voicebox’s guided audio generation training objective and flow-matching modeling method, enabling audio infilling. This capability permits users to refine sound effects, such as incorporating diverse thunder sounds into a rain soundscape, enhancing the model’s versatility.

Audio Generation Tool, Audiobox, Introduced by Meta

LEAVE A REPLY Cancel reply

Most Popular

Samsung-Tesla AI6 Chip Pact Sparks Machine Learning Revolution

Unlocking Tomorrow: The Future of Artificial Intelligence and Its Impact on Our Lives

TCS to Lay Off Over 12,000 Staff Amid Strategic Realignment—not an AI Cut

Audio Generation Tool, Audiobox, Introduced by Meta

Subscribe to our newsletter

RELATED ARTICLES

TCS to Lay Off Over 12,000 Staff Amid Strategic Realignment—not an AI Cut

Samsung-Tesla AI6 Chip Pact Sparks Machine Learning Revolution

Grok 4: xAI’s Boldest AI Model Yet Brings Voice, Vision, and Reasoning to the Forefront

LEAVE A REPLY Cancel reply

Most Popular

Samsung-Tesla AI6 Chip Pact Sparks Machine Learning Revolution

Unlocking Tomorrow: The Future of Artificial Intelligence and Its Impact on Our Lives

TCS to Lay Off Over 12,000 Staff Amid Strategic Realignment—not an AI Cut