Meta has announced the launch of SAM Audio, an artificial intelligence model designed to segment and isolate specific sounds from complex audio recordings. According to Meta, this new tool allows users to separate elements such as vocals or instruments in a band recording, filter out background noises like traffic, or remove unwanted sounds such as a barking dog from podcasts with minimal effort.
"SAM Audio, the latest addition to our Segment Anything collection, transforms audio processing by making it easy to isolate any sound from complex audio mixtures using text, visual, and time span prompts," Meta stated in its announcement.
The company describes the model as intuitive and accessible for both professionals and casual users. "This intuitive approach mirrors how people naturally engage with sound, making professional-grade audio separation more accessible and easier than ever before. SAM Audio has the potential to transform audio and video editing and drive innovation in areas like music, podcasting, television, film, scientific research, accessibility, and more."
Previously, tools for editing or separating audio were often specialized for narrow use cases. Meta claims that SAM Audio is unique because it serves multiple scenarios within one unified system. "As a unified model, SAM Audio is the first to support use cases that match how people naturally think about audio, and achieves cutting-edge performance across diverse, real-world scenarios."
SAM Audio offers three prompting methods: text prompting (for example, typing “dog barking” or “singing voice” to extract those sounds), visual prompting (clicking on a person or object in a video to isolate the sound it produces), and span prompting (marking the time segments in which the target audio occurs). According to Meta, span prompting is an industry first.
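To make the three prompt types more concrete, here is a minimal Python sketch of how a separation request might be represented. It is purely illustrative and is not Meta's API: the `SeparationPrompt` class, its field names, and the `spans_to_frame_mask` helper are hypothetical names invented for this example, and the frame-level mask is only one plausible way a span prompt could be encoded.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SeparationPrompt:
    """Hypothetical container for a separation request (not Meta's API)."""
    text: Optional[str] = None                  # e.g. "dog barking"
    click_xy: Optional[Tuple[int, int]] = None  # pixel clicked in a video frame
    spans: List[Tuple[float, float]] = field(default_factory=list)  # (start_s, end_s)

    def modalities(self) -> List[str]:
        """List which prompt types are present in this request."""
        used = []
        if self.text:
            used.append("text")
        if self.click_xy is not None:
            used.append("visual")
        if self.spans:
            used.append("span")
        return used


def spans_to_frame_mask(spans, duration_s, frame_rate_hz):
    """Turn (start, end) times in seconds into a per-frame boolean mask.

    Assumption: a span prompt can be expressed as "target present / absent"
    for each fixed-length frame of the recording.
    """
    n_frames = int(round(duration_s * frame_rate_hz))
    mask = [False] * n_frames
    for start, end in spans:
        if end <= start:
            raise ValueError(f"empty or reversed span: ({start}, {end})")
        first = max(0, int(start * frame_rate_hz))
        last = min(n_frames, math.ceil(end * frame_rate_hz))
        for i in range(first, last):
            mask[i] = True
    return mask


# Combine a text prompt with a span prompt for a 4-second clip.
prompt = SeparationPrompt(text="dog barking", spans=[(1.0, 2.0)])
mask = spans_to_frame_mask(prompt.spans, duration_s=4.0, frame_rate_hz=2)
```

In this sketch, a request that sets both `text` and `spans` reports two modalities, and the span from 1.0 to 2.0 seconds marks only the frames that fall inside that window.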
"These prompting methods can be used alone or in any combination, giving you precise and intuitive control over how audio is separated. We see so many potential use cases, including sound isolation, noise filtering, and more to help people bring their creative visions to life," Meta added.
SAM Audio is now available through the Segment Anything Playground platform. Users can try out its features using sample assets provided by Meta or upload their own files; the model is also available for download.
"We’re excited to bring audio to the Segment Anything collection of models and we believe SAM Audio is the all-around best audio separation model available. Learn more about SAM Audio and try it on the Segment Anything Playground today," said Meta.
