Quick answer

DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance. The first of its key technical breakthroughs is DeepSeek Sparse Attention (DSA), an efficient attention mechanism that substantially reduces computational complexity while preserving model performance in long-context scenarios.

Paper • 2025-12-02

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Authors

DeepSeek-AI · Aixin Liu · Aoxue Mei · Bangcai Lin · Bing Xue · Bingxuan Wang · Bingzheng Xu · Bochao Wu · Bowei Zhang · Chaofan Lin · Chen Dong · Chengda Lu · Chenggang Zhao · Chengqi Deng · Chenhao Xu · Chong Ruan · Damai Dai · Daya Guo · Dejian Yang · Deli Chen · Erhang Li · Fangqi Zhou · Fangyun Lin · Fucong Dai · Guangbo Hao · Guanting Chen · Guowei Li · H. Zhang · Hanwei Xu · Hao Li · Haofen Liang · Haoran Wei · Haowei Zhang · Haowen Luo · Haozhe Ji · Honghui Ding · Hongxuan Tang · Huanqi Cao · Huazuo Gao · Hui Qu · Hui Zeng · Jialiang Huang · Jiashi Li · Jiaxin Xu · Jiewen Hu · Jingchang Chen · Jingting Xiang · Jingyang Yuan · Jingyuan Cheng · Jinhua Zhu · Jun Ran · Junguang Jiang · Junjie Qiu · Junlong Li · Junxiao Song · Kai Dong · Kaige Gao · Kang Guan · Kexin Huang · Kexing Zhou · Kezhao Huang · Kuai Yu · Lean Wang · Lecong Zhang · Lei Wang · Liang Zhao · Liangsheng Yin · Lihua Guo · Lingxiao Luo · Linwang Ma · Litong Wang · Liyue Zhang · M. S. Di · M. Y Xu · Mingchuan Zhang · Minghua Zhang · Minghui Tang · Mingxu Zhou · Panpan Huang · Peixin Cong · Peiyi Wang · Qiancheng Wang · Qihao Zhu · Qingyang Li · Qinyu Chen · Qiushi Du · Ruiling Xu · Ruiqi Ge · Ruisong Zhang · Ruizhe Pan · Runji Wang · Runqiu Yin · Runxin Xu · Ruomeng Shen · Ruoyu Zhang · S. H. Liu · Shanghao Lu · Shangyan Zhou · Shanhuang Chen · Shaofei Cai · Shaoyuan Chen · Shengding Hu · Shengyu Liu · Shiqiang Hu · Shirong Ma · Shiyu Wang · Shuiping Yu · Shunfeng Zhou · Shuting Pan · Songyang Zhou · Tao Ni · Tao Yun · Tian Pei · Tian Ye · Tianyuan Yue · Wangding Zeng · Wen Liu · Wenfeng Liang · Wenjie Pang · Wenjing Luo · Wenjun Gao · Wentao Zhang · Xi Gao · Xiangwen Wang · Xiao Bi · Xiaodong Liu · Xiaohan Wang · Xiaokang Chen · Xiaokang Zhang · Xiaotao Nie · Xin Cheng · Xin Liu · Xin Xie · Xingchao Liu · Xingkai Yu · Xingyou Li · Xinyu Yang · Xinyuan Li · Xu Chen · Xuecheng Su · Xuehai Pan · Xuheng Lin · Xuwei Fu · Y. Q. Wang · Yang Zhang · Yanhong Xu · Yanru Ma · Yao Li · Yao Zhao · Yaofeng Sun · Yaohui Wang · Yi Qian · Yi Yu · Yichao Zhang · Yifan Ding · Yifan Shi · Yiliang Xiong · Ying He · Ying Zhou · Yinmin Zhong · Yishi Piao · Yisong Wang · Yixiao

ABSTRACT

We introduce DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance. The key technical breakthroughs of DeepSeek-V3.2 are as follows: (1) DeepSeek Sparse Attention (DSA): We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance in long-context scenarios. (2) Scalable Reinforcement Learning Framework: By implementing a robust reinforcement learning protocol and scaling post-training compute, DeepSeek-V3.2 performs comparably to GPT-5. Notably, our high-compute variant, DeepSeek-V3.2-Speciale, surpasses GPT-5 and exhibits reasoning proficiency on par with Gemini-3.0-Pro, achieving gold-medal performance in both the 2025 International Mathematical Olympiad (IMO) and the International Olympiad in Informatics (IOI). (3) Large-Scale Agentic Task Synthesis Pipeline: To integrate reasoning into tool-use scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This methodology facilitates scalable agentic post-training, yielding substantial improvements in generalization and instruction-following robustness within complex, interactive environments.
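
The abstract does not say how DSA decides which tokens to attend to, so the following is only a rough illustration of the general idea behind sparse attention, not the paper's mechanism. It assumes a simple top-k rule: each query keeps its `k` highest-scoring visible positions and ignores the rest. The function name `topk_sparse_attention`, the top-k rule, and the toy inputs are all hypothetical choices made for this sketch.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def topk_sparse_attention(queries, keys, values, k):
    """Causal attention in which each query attends to at most k positions.

    Illustrative assumption: positions are chosen by the same scaled
    dot-product score used for attention. This sketch still scores every
    visible position, so only the softmax and the weighted sum are sparse.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        # Causal mask: query t may only see positions 0..t.
        scores = [dot(q, keys[s]) * scale for s in range(t + 1)]
        # Keep the indices of the k largest scores.
        keep = sorted(range(t + 1), key=lambda s: scores[s], reverse=True)[:k]
        weights = softmax([scores[s] for s in keep])
        # Weighted sum over the selected values only.
        out = [sum(w * values[s][d] for w, s in zip(weights, keep))
               for d in range(dim)]
        outputs.append(out)
    return outputs


# Toy example: three tokens with 2-dimensional keys and values.
toy_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
sparse_out = topk_sparse_attention(toy_keys, toy_keys, toy_values, k=1)
dense_out = topk_sparse_attention(toy_keys, toy_keys, toy_values, k=3)
```

With `k=1` each query copies the value at its single best-matching position, and with `k` at least the sequence length the function reduces to ordinary causal attention. In this toy version the saving is limited to the weighted sum, since every score is still computed. A practical sparse attention design would also need a cheaper way to pick candidates, which this sketch does not attempt.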
