Learning useful representations without supervision remains a key challenge in machine learning. We propose the Vector Quantised-Variational AutoEncoder (VQ-VAE), a simple yet powerful generative model that learns discrete representations of data. Unlike standard VAEs that use continuous latent spaces, VQ-VAE uses vector quantisation to overcome the 'posterior collapse' problem, where latents are ignored when paired with a powerful autoregressive decoder. We show that VQ-VAE can learn highly compressed, discrete latent representations of images, audio, and video, which can then be used by an autoregressive prior (such as PixelCNN or WaveNet) to generate high-quality, coherent samples.
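The core quantisation step described in the abstract maps each continuous encoder output to its nearest vector in a learned codebook. Below is a minimal NumPy sketch of that nearest-neighbour lookup; the function name, toy data, and codebook values are illustrative, not from the paper (which also adds codebook and commitment loss terms, and a straight-through gradient estimator, omitted here).

```python
import numpy as np

def vector_quantise(z_e, codebook):
    """Map each encoder output to its nearest codebook vector.

    z_e:      (N, D) continuous encoder outputs
    codebook: (K, D) embedding vectors e_1..e_K
    Returns the discrete indices and the quantised vectors z_q.
    """
    # Squared Euclidean distance between every z_e row and every code e_k.
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)   # discrete latent codes (what the prior models)
    z_q = codebook[indices]          # quantised representation passed to the decoder
    return indices, z_q

# Toy example: 4 encoder outputs, codebook of 3 two-dimensional codes.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
z_e = np.array([[0.1, -0.1], [0.9, 1.2], [-0.8, 0.9], [0.0, 0.1]])
indices, z_q = vector_quantise(z_e, codebook)
# indices → [0, 1, 2, 0]; an autoregressive prior is then trained over these
# discrete indices, which is how PixelCNN/WaveNet-style priors plug in.
```

Because the quantised latents are discrete symbols, the powerful decoder cannot simply ignore them as in posterior collapse; all information must flow through the codebook indices.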