Google’s artificial intelligence lab DeepMind is working on new technology that can generate soundtracks and even dialogue to accompany videos. The lab shared progress on its video-to-audio (V2A) project, which can be paired with video creation tools like Google Veo and OpenAI’s Sora. In a blog post, the DeepMind team explains that the system can understand raw pixels and combine that information with text prompts to create sound effects that sync up with what’s happening on the screen. The tool can also be used to create soundtracks for traditional footage, such as silent movies and other videos without sound.
DeepMind researchers trained the technology on videos, audio, and annotations that included detailed AI-generated sound descriptions and transcripts of conversations, so the technology learned to associate specific sounds with visual scenes. The DeepMind team isn’t the first to release an AI tool that can generate sound effects (ElevenLabs also recently did so), and they say they won’t be the last. “Our work goes beyond existing video-to-audio solutions because it can understand raw pixels and adding text prompts is optional,” the team wrote.
Although the text prompt is optional, you can use it to shape and refine the final product so that it is as accurate and realistic as possible. For example, you can enter positive prompts to steer the output toward the sounds you want, and negative prompts to steer it away from sounds you don’t want. In one sample, the team used the prompts “movie, thriller, horror movie, music, tension, atmosphere, footsteps on concrete.”
The researchers acknowledge that they are still trying to address existing limitations of V2A technology, such as the degradation of the output audio quality when the source video is distorted. They are also working on improving the lip sync of the generated dialogue. Moreover, they vow to conduct “rigorous safety evaluation and testing” before releasing the technology to the world.