
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

Ryan Lowe · Yi Wu · Aviv Tamar · Jean Harb · Pieter Abbeel · Igor Mordatch

ABSTRACT

We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. Our Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm utilizes centralized training with decentralized execution, allowing agents to discover complex physical behaviors like cooperation, coordination, and deception.
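The centralized-training, decentralized-execution structure described above can be sketched in a few lines: each actor maps only its own observation to an action, while each agent's critic scores the joint observations and actions of all agents during training. This is a minimal illustration under assumed toy dimensions, not the paper's implementation; the class names, linear parameterizations, and sizes are all made up for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

class Actor:
    """Decentralized policy: maps an agent's OWN observation to an action."""
    def __init__(self, obs_dim, act_dim):
        self.W = rng.normal(scale=0.1, size=(act_dim, obs_dim))

    def act(self, obs):
        # Deterministic action squashed to [-1, 1], in the DDPG style.
        return np.tanh(self.W @ obs)

class CentralizedCritic:
    """Training-time critic: scores the JOINT observations and actions of
    all agents. Conditioning on everyone's actions is what removes the
    non-stationarity each agent would otherwise see from its peers'
    changing policies."""
    def __init__(self, joint_dim):
        self.w = rng.normal(scale=0.1, size=joint_dim)

    def q_value(self, all_obs, all_acts):
        x = np.concatenate(all_obs + all_acts)
        return float(self.w @ x)

# Hypothetical sizes: 3 agents, 4-dim observations, 2-dim actions.
n_agents, obs_dim, act_dim = 3, 4, 2
actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]
critics = [CentralizedCritic(n_agents * (obs_dim + act_dim))
           for _ in range(n_agents)]

obs = [rng.normal(size=obs_dim) for _ in range(n_agents)]
acts = [a.act(o) for a, o in zip(actors, obs)]   # execution: local info only
qs = [c.q_value(obs, acts) for c in critics]     # training: global info
```

At execution time only the actors are needed, so each agent acts from local observations; the critics (and the extra information they consume) are discarded after training.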
