Quick answer

AI Summary: Describes OpenAI Five, a large-scale multi-agent reinforcement learning system that defeated the world champion team at the esports game Dota 2 using heavily scaled-up Proximal Policy Optimization (PPO).

Paper • 2019-12-13

Dota 2 with Large Scale Deep Reinforcement Learning

Authors

Christopher Berner · Greg Brockman · Brooke Chan · Vicki Cheung · Przemysław Dębiak · Christy Dennison · David Farhi · Quirin Fischer · Shariq Hashme · Chris Hesse · Ilya Sutskever · et al.

ABSTRACT

We present OpenAI Five, a system of five neural networks that learned to play the highly complex, imperfect-information esports game Dota 2 entirely through self-play. Dota 2 involves long time horizons, partially observed states, and high-dimensional observation and action spaces, making it a grand challenge for AI. By scaling up Proximal Policy Optimization (PPO) to 256 GPUs and 128,000 CPU cores, OpenAI Five achieved expert-level performance, ultimately defeating the world champion esports team OG. The system demonstrates that general reinforcement learning algorithms can scale to solve extremely complex, real-time multi-agent environments without hand-crafted heuristics.
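
For readers unfamiliar with PPO, the clipped surrogate objective at its core can be sketched as follows. This is a minimal single-process illustration in plain Python, not OpenAI Five's distributed implementation. The function name `ppo_clipped_objective`, the `clip_eps` default of 0.2, and the list-based inputs are assumptions made for illustration only.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of sampled actions.

    new_logps:  log-probabilities of the actions under the current policy
    old_logps:  log-probabilities of the same actions under the policy
                that collected the data
    advantages: advantage estimates for those actions
    clip_eps:   clipping range (illustrative default, not taken from the paper)
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The objective is maximized during training. Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that generated the data in a single update.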

#cs-lg #esports company:openai-research #reinforcement-learning #cs-ai

Review Snapshot

4.6 out of 5 stars (5 ratings)

5 star: 60%
4 star: 40%
3 star, 2 star, 1 star: no ratings

Recommendation: 100% recommend this content.
