3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting · Paper · Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, G… · 5/28/2024
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining · Paper · Haohe Liu, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, … · 5/11/2024
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation · Paper · Yupeng Zhou, Daquan Zhou, Ming-Ming Cheng, Jiashi Feng, Q… · 5/2/2024
DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises · Paper · Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Mingxuan… · 5/1/2024
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning · Paper · Lin Xu, Yilin Zhao, Daquan Zhou, Zhijie Lin, See Kiong Ng… · 4/29/2024
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing · Paper · Mude Hui, Siwei Yang, Bingchen Zhao, Yichun Shi, Heng Wan… · 4/15/2024
SALMONN: Towards Generic Hearing Abilities for Large Language Models · Paper · Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian… · 4/8/2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data · Paper · Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi… · 4/7/2024
Magic-Me: Identity-Specific Video Customized Diffusion · Paper · Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu L… · 3/20/2024
SDXL-Lightning: Progressive Adversarial Diffusion Distillation · Paper · Shanchuan Lin, Anran Wang, Xiao Yang · 3/2/2024
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs · Paper · Ziheng Jiang, Haibin Lin, Yinmin Zhong, Qi Huang, Yangrui… · 2/23/2024
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning · Paper · Lihua Qian, Mingxuan Wang, Yang Liu, Hao Zhou · 11/29/2023
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model · Paper · Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, J… · 11/27/2023
Make Pixels Dance: High-Dynamic Video Generation · Paper · Yan Zeng, Guoqiang Wei, Jiani Zheng, Jiaxin Zou, Yang Wei… · 11/18/2023
MagicEdit: High-Fidelity and Temporally Coherent Video Editing · Paper · Jun Hao Liew, Hanshu Yan, Jianfeng Zhang, Zhongcong Xu, J… · 8/28/2023
PolyVoice: Language Models for Speech to Speech Translation · Paper · Qianqian Dong, Zhiying Huang, Qiao Tian, Chen Xu, Tom Ko,… · 6/13/2023
Efficient Neural Music Generation · Paper · Max W. Y. Lam, Qiao Tian, Tang Li, Zongyu Yin, Siyuan Fen… · 5/25/2023
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs · Paper · Yujia Zhai, Chengquan Jiang, Leyuan Wang, Xiaoying Jia, S… · 2/20/2023
Cross-modal Contrastive Learning for Speech Translation · Paper · Rong Ye, Mingxuan Wang, Lei Li · 5/5/2022
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation · Paper · Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li · 7/22/2021