Google creates AI music generator
Google researchers have developed an artificial intelligence (AI) music generator that uses text prompts to create minutes-long musical pieces, the company revealed in a research paper.
Google says the tool, known as MusicLM, uses a hierarchical sequence-to-sequence model for conditional music generation.
MusicLM produces music at 24kHz that remains consistent over several minutes. It can also transform a whistled or hummed melody into other instruments.
Google shared multiple snippets that were produced using MusicLM. The examples it created include shorter tracks from “rich captions” and several “story mode” clips.
An example of the “rich caption” text prompt Google used to generate the soundtrack for an arcade game is quoted below.
“The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff,” the prompt reads.
“The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.”
The “story mode” clip examples were created using various shorter text prompts, including:
- Time to meditate (0:00-0:15), time to wake up (0:15-0:30), time to run (0:30-0:45), time to give 100% (0:45-0:60)
- Electronic song played in a videogame (0:00-0:15), meditation song played next to a river (0:15-0:30), fire (0:30-0:45), fireworks (0:45-0:60)
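To make the story-mode format concrete, the following is a minimal sketch of how such a sequence of timed prompts could be represented. It does not reflect MusicLM's actual interface: `Segment` and `build_story` are hypothetical names chosen for illustration, and the only figures taken from the examples above are the 24kHz sample rate and the 15-second intervals.

```python
from dataclasses import dataclass

# Output sample rate reported for MusicLM (24kHz).
SAMPLE_RATE_HZ = 24_000


@dataclass(frozen=True)
class Segment:
    """One timed prompt in a story-mode sequence (hypothetical structure)."""
    prompt: str
    start_s: int
    end_s: int

    @property
    def num_samples(self) -> int:
        # Number of audio samples needed to cover this segment at 24kHz.
        return (self.end_s - self.start_s) * SAMPLE_RATE_HZ


def build_story(prompts: list[str], seconds_each: int = 15) -> list[Segment]:
    """Assign consecutive, equal-length time windows to a list of prompts."""
    return [
        Segment(prompt, i * seconds_each, (i + 1) * seconds_each)
        for i, prompt in enumerate(prompts)
    ]


story = build_story([
    "time to meditate",
    "time to wake up",
    "time to run",
    "time to give 100%",
])

for seg in story:
    print(f"{seg.start_s:>2}s-{seg.end_s:>2}s  {seg.prompt}  ({seg.num_samples} samples)")
```

Under these assumptions, each 15-second segment corresponds to 360,000 audio samples, and the full one-minute clip to 1,440,000.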
Google’s MusicLM can also generate music clips from paintings — specifically from descriptions of the artworks — including Salvador Dali’s famous “The Persistence of Memory”.
MusicLM can even simulate human vocals. However, while it seems to get the tone and overall sound of voices right, they don’t sound entirely genuine.
Samples
- Arcade game soundtrack
- Story mode 1
- Story mode 2
- Dali – Persistence of Memory