Quick answer

AI Summary: Introduces GLIDE, a text-conditional diffusion model that popularized classifier-free guidance for photorealistic image generation and demonstrated natural-language image inpainting and editing.

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Alex Nichol · Prafulla Dhariwal · Aditya Ramesh · Pranav Shyam · Pamela Mishkin · Bob McGrew · Ilya Sutskever · Mark Chen

ABSTRACT

Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique that trades off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that classifier-free guidance yields higher-quality images that better capture the text prompts. We present GLIDE, a 3.5 billion parameter text-guided diffusion model. Human evaluators overwhelmingly prefer GLIDE samples over those from DALL-E. Furthermore, we demonstrate that GLIDE can be fine-tuned to perform image inpainting, enabling powerful text-driven image editing via natural language prompts.
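The classifier-free guidance the abstract refers to combines a conditional and an unconditional noise prediction at each sampling step, extrapolating toward the conditional one. A minimal sketch of that combination step (the array values are toy stand-ins; a real diffusion model would produce the two predictions):

```python
import numpy as np

def classifier_free_guidance(eps_cond, eps_uncond, guidance_scale):
    """Combine conditional and unconditional noise predictions.

    guidance_scale = 1 recovers the plain conditional prediction;
    values > 1 push further toward the text condition, trading
    sample diversity for fidelity to the prompt.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy stand-ins for the model's two noise predictions at one timestep.
eps_cond = np.array([1.0, 2.0])
eps_uncond = np.array([0.5, 1.0])

guided = classifier_free_guidance(eps_cond, eps_uncond, guidance_scale=3.0)
# → array([2., 4.]): each component moved 3x the cond-uncond gap from eps_uncond
```

In practice the unconditional prediction is obtained from the same network by feeding an empty text prompt, so one model serves both roles.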

