
Stability AI, one of the world’s most prominent generative AI companies, has joined the likes of Google, Meta and OpenAI in creating a model that generates clips of music and sound. Called “Stable Audio,” the new text-to-music generator was trained on sounds from the music library AudioSparx.
Stability touts Stable Audio as the first music generation product that creates high-quality, 44.1 kHz music for commercial use through a process called “latent diffusion,” a technique first introduced for images through Stable Diffusion, the company’s marquee product. The Stable Audio model is conditioned on text metadata as well as audio file duration and start time, which allows for greater control over the content of the generated audio.
By typing prompts like “post-rock, guitars, drum kit, bass, strings, euphoric, up-lifting, moody, flowing, raw, epic, sentimental, 125 BPM,” users…
