← Home

Quick answer

AI Summary: Presents IMPALA, a scalable, distributed reinforcement learning architecture that uses the V-trace algorithm to efficiently train single agents across massive computer clusters.

Claim

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

Lasse Espeholt·
Hubert Soyer·
Remi Munos·
Karen Simonyan·
Volodymyr Mnih·
Tom Ward·
Yotam Doron·
Vlad Firoiu·
Tim Harley·
Iain Dunning·
Shane Legg·
Koray Kavukcuoglu

ABSTRACT

Scaling reinforcement learning algorithms to utilize thousands of machines efficiently is crucial for tackling complex, visually rich environments. We introduce IMPALA (Importance Weighted Actor-Learner Architecture), a distributed agent architecture that decouples the acting and learning processes to achieve massive throughput. Unlike A3C, which uses gradients to communicate between workers, IMPALA actors communicate trajectories of experience to centralized learners. We introduce the V-trace off-policy actor-critic algorithm to correct for the policy lag caused by this decoupled architecture. IMPALA achieves data throughput that is orders of magnitude higher than previous systems, successfully learning a single policy capable of playing all 30 DMLab-30 levels and 57 Atari games.

Review Snapshot

Explore ratings

4.6
★★★★★
5 ratings
5 star
60%
4 star
40%
3 star
0%
2 star
0%
1 star
0%

Recommendation

100%

recommend this content.

Review this content

Share your opinion to help other learners triage faster.

Write a review

Invite a reviewer

Invite someone by email to share an invited review for IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures.

Author Inquiries

Public questions about this content. Attendemia will route your question to the author. Vote on the most important ones. No guarantee of response.
Post an inquiry
Sort by: Most helpful