SeedEdit: Align Image Re-Generation to Image Editing · Paper · Yichun Shi, Peng Wang, Weilin Huang · 11/11/2024
Classification Done Right for Vision-Language Pre-Training · Paper · Zilong Huang, Qinghao Ye, Bingyi Kang, Jiashi Feng, Haoqi… · 11/6/2024
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions · Paper · Ziming Li, Qianbo Zang, David Ma, Jiawei Guo, Tuney Zheng… · 11/5/2024
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment · Paper · Xin Xiao, Bohong Wu, Jiacong Wang, Chunyuan Li, Xun Zhou,… · 11/5/2024
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis · Paper · Yuxi Ren, Xin Xia, Yanzuo Lu, Jiacheng Zhang, Jie Wu, Pan… · 11/4/2024
Why Does the Effective Context Length of LLMs Fall Short? · Paper · Chenxin An, Jun Zhang, Ming Zhong, Lei Li, Shansan Gong, … · 10/24/2024
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering · Paper · Ziyu Zhao, Tao Shen, Didi Zhu, Zexi Li, Jing Su, Xuwu Wan… · 10/22/2024
Unveiling the Tapestry of Consistency in Large Vision-Language Models · Paper · Yuan Zhang, Fei Xiao, Tao Huang, Chun-Kai Fan, Hongyuan D… · 10/6/2024
HybridFlow: A Flexible and Efficient RLHF Framework · Paper · Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang… · 10/2/2024
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation · Paper · Ye Bai, Haonan Chen, Jitong Chen, Zhuo Chen, Yi Deng, Xia… · 9/19/2024
An X-ray Significantly Variable, Luminous, Type 2 Quasar at z = 2.99 with a Massive Host Galaxy · Paper · Xiurui Zhao, Stefano Marchesi, Marco Ajello, Francesca Ci… · 9/3/2024
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator · Paper · Hanshu Yan, Xingchao Liu, Jiachun Pan, Jun Hao Liew, Qian… · 9/2/2024
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models · Paper · Feng Li, Renrui Zhang, Hao Zhang, Yuanhan Zhang, Bo Li, W… · 7/28/2024
PixelLM: Pixel Reasoning with Large Multimodal Model · Paper · Zhongwei Ren, Zhicheng Huang, Yunchao Wei, Yao Zhao, Dong… · 7/18/2024
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model · Paper · Yatai Ji, Shilong Zhang, Jie Wu, Peize Sun, Weifeng Chen,… · 7/10/2024
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition · Paper · Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, … · 7/10/2024
Autoregressive Pretraining with Mamba in Vision · Paper · Sucheng Ren, Xianhang Li, Haoqin Tu, Feng Wang, Fangxun S… · 6/11/2024
An Image is Worth 32 Tokens for Reconstruction and Generation · Paper · Qihang Yu, Mark Weber, Xueqing Deng, Xiaohui Shen, Daniel… · 6/11/2024
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models · Paper · Philip Anastassiou, Jiawei Chen, Jitong Chen, Yuanzhe Che… · 6/4/2024