
Paper • 2025-03-04


Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts

Authors


Shulai Zhang · Ningxin Zheng · Haibin Lin · Ziheng Jiang · Wenlei Bao · Chengquan Jiang · Qi Hou · Weihao Cui · Size Zheng · Li-Wen Chang · Quan Chen · Xin Liu

ABSTRACT

Mixture-of-experts (MoE) has been extensively employed to scale large language models to trillion-plus parameters while maintaining a fixed computational cost. Developing large MoE models in a distributed setting runs into the problem of large communication overhead: with popular models and frameworks, the inter-device communication of a MoE layer can occupy 47% of the total model execution time. Existing methods therefore pipeline the communication in a MoE layer with the computation so that the two overlap. However, these coarse-grained overlapping schemes notably impair computational efficiency, and their latency hiding is sub-optimal. To this end, we present COMET, an optimized MoE system with fine-grained communication-computation overlapping. Leveraging data dependency analysis and task rescheduling, COMET achieves precise fine-grained overlapping of communication and computation. Through adaptive workload assignment, COMET effectively eliminates fine-grained communication bottlenecks and enhances its adaptability across various scenarios. Our evaluation shows that COMET accelerates the execution of a single MoE layer by $1.96\times$, and for end-to-end execution COMET delivers a $1.71\times$ speedup on average. COMET has been adopted in production clusters at the scale of ten thousand GPUs, achieving savings of millions of GPU hours.
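
To put the 47% figure in perspective, the following is a minimal back-of-the-envelope sketch, not COMET's method or its measurement setup. It assumes a toy model in which execution time splits into a communication share and a compute share, and some fraction of the communication is hidden behind compute. The function name `overlap_speedup_bound` and its parameters are hypothetical and introduced only for illustration.

```python
def overlap_speedup_bound(comm_fraction: float, hidden_fraction: float = 1.0) -> float:
    """Speedup in a toy model where part of the communication is hidden.

    comm_fraction:   share of baseline execution time spent in communication.
    hidden_fraction: share of that communication we try to overlap with compute.
    Assumption (toy model): hidden communication costs nothing extra, and at
    most `compute` worth of communication can be hidden.
    """
    if not 0.0 <= comm_fraction < 1.0:
        raise ValueError("comm_fraction must be in [0, 1)")
    if not 0.0 <= hidden_fraction <= 1.0:
        raise ValueError("hidden_fraction must be in [0, 1]")

    compute = 1.0 - comm_fraction
    # Communication can only hide behind compute that actually exists.
    hidden = min(comm_fraction * hidden_fraction, compute)
    new_total = 1.0 - hidden  # baseline time is normalized to 1.0
    return 1.0 / new_total


if __name__ == "__main__":
    # 0.47 is the communication share quoted in the abstract.
    for hidden_fraction in (0.0, 0.5, 1.0):
        speedup = overlap_speedup_bound(0.47, hidden_fraction)
        print(f"hidden={hidden_fraction:.2f} -> speedup={speedup:.2f}x")
```

Under these assumptions, hiding all of a 47% communication share gives at most 1 / 0.53, or about 1.89×. This is only a rough ceiling for that particular share; the reported speedups come from the paper's own evaluation, which may use different workloads and communication ratios.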

ByteDance Research
