SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation
Paper • Mar 2, 2025 • arxiv.org • Yuying Ge, Sijie Zhao, Jinguo Zhu, Yixiao Ge, Kun Yi, Lin Song, Chen Li, Xiaohan Ding, Ying Shan
The rapid evolution of multimodal foundation models has brought significant progress in vision-language understanding and generation, e.g., our previous work SEED-LLaMA. However, there remain...