Quick answer

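AI Summary: Presents Jukebox, a generative model that compresses raw audio into discrete tokens with a multi-scale VQ-VAE and models those tokens with autoregressive Transformers, producing minute-long songs with coherent musical structure and sung vocals, conditioned on artist, genre, and lyrics.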

Jukebox: A Generative Model for Music

Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever

ABSTRACT

We introduce Jukebox, a generative model that produces high-fidelity, highly diverse music with singing in the raw audio domain. We model music as a sequence of discrete tokens by using a multi-scale VQ-VAE to compress the extremely high-dimensional raw audio. We then train autoregressive Transformers to generate audio conditioned on artist, genre, and lyrics. The system generates minute-long musical compositions with coherent musical structure, instrumentation, and recognizable vocals. This work demonstrates that massive scale and hierarchical compression can bridge the gap between long-term musical structure and high-frequency audio details.
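
The phrase "a sequence of discrete tokens" refers to vector quantization: each short stretch of encoded audio is replaced by the index of its nearest entry in a learned codebook, and the autoregressive Transformers are then trained on those indices. The toy sketch below illustrates only that lookup step under simplifying assumptions. The names `quantize` and `dequantize`, the three-entry codebook, and the four input frames are all hypothetical and are not taken from the Jukebox code; a real VQ-VAE learns its encoder, decoder, and codebook jointly, and the multi-scale variant repeats this step at several compression rates.

```python
# Illustrative sketch only: a toy vector-quantization lookup, not Jukebox's code.

def quantize(frames, codebook):
    """Map each frame (a list of floats) to the index of its nearest codebook vector."""
    tokens = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook entry.
        distances = [
            sum((x - c) ** 2 for x, c in zip(frame, code))
            for code in codebook
        ]
        tokens.append(distances.index(min(distances)))
    return tokens


def dequantize(tokens, codebook):
    """Replace each token with its codebook vector (a lossy reconstruction)."""
    return [list(codebook[t]) for t in tokens]


# Hypothetical 2-dimensional codebook with three entries (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]
# Hypothetical encoder outputs for four time steps (assumed values).
frames = [[0.1, -0.1], [0.9, 1.2], [-0.8, 0.4], [0.6, 0.7]]

tokens = quantize(frames, codebook)          # discrete sequence a Transformer could model
reconstruction = dequantize(tokens, codebook)
print(tokens)
```

The token list is far shorter and lower-dimensional than the raw waveform it stands for, which is what makes long-range autoregressive modeling tractable; the cost is that the reconstruction only approximates the original frames.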

Review Snapshot

4.4 out of 5 stars (5 ratings)
5 star: 60%
4 star: 20%
3 star: 20%
2 star: 0%
1 star: 0%

Recommendation

100% recommend this content.
