Quick answer

Paper2025-11-11•Source ↗•10 attns0 checkouts

Claim

DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction

Authors

Discuss with Grok

Yiheng Liu·

Liao Qu·

Huichao Zhang·

Xu Wang·

Yi Jiang·

Yiming Gao·

Hu Ye·

Xian Li·

Shuai Wang·

Daniel K. Du·

Fangmin Chen·

Zehuan Yuan·

Xinglong Wu

ABSTRACT

This paper presents DetailFlow, a coarse-to-fine 1D autoregressive (AR) image generation method that models images through a novel next-detail prediction strategy. By learning a resolution-aware token sequence supervised with progressively degraded images, DetailFlow enables the generation process to start from the global structure and incrementally refine details. This coarse-to-fine 1D token sequence aligns well with the autoregressive inference mechanism, providing a more natural and efficient way for the AR model to generate complex visual content. Our compact 1D AR model achieves high-quality image synthesis with significantly fewer tokens than previous approaches, i.e. VAR/VQGAN. We further propose a parallel inference mechanism with self-correction that accelerates generation speed by approximately 8x while reducing accumulation sampling error inherent in teacher-forcing supervision. On the ImageNet 256x256 benchmark, our method achieves 2.96 gFID with 128 tokens, outperforming VAR (3.3 FID) and FlexVAR (3.05 FID), which both require 680 tokens in their AR models. Moreover, due to the significantly reduced token count and parallel inference mechanism, our method runs nearly 2x faster inference speed compared to VAR and FlexVAR. Extensive experimental results demonstrate DetailFlow's superior generation quality and efficiency compared to existing state-of-the-art methods.

#llm/paper/month/202511 #computer-version/month/202511 #computer-version/year/2025 #llm/paper/year/2025 #computer-version #multimodal-model #deep-learning/month/202511 #llm/paper #deep-learning/from/bytedance-research #deep-learning/year/2025 #llm/year/2025 #world-model #llm/month/202511 #deep-learning #llm ByteDance Research

Review Snapshot

Explore ratings

0.0

★★★★★

0 ratings

5 star

4 star

3 star

2 star

1 star

Recommendation

recommend this content.

Review this content

Share your opinion to help other learners triage faster.

Write a review

Invite a reviewer

Invite someone by email to share an invited review for DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction.

Author Inquiries

Public questions about this content. Attendemia will route your question to the author. Vote on the most important ones. No guarantee of response.

Post an inquiry

Sort by: Most helpful