Quick answer

AI Summary: Presents Agent57, the first RL agent to score above the human baseline on every game in the 57-game Atari benchmark, including 'hard exploration' games, by combining an episodic intrinsic reward with a meta-controller that adapts exploration during training.

Agent57: Outperforming the Atari Human Benchmark

Adrià Puigdomènech Badia, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Charles Blundell

ABSTRACT

Atari 2600 games are a long-standing benchmark in the reinforcement learning community. Previous algorithms have achieved superhuman performance on average, but they consistently fail on a subset of 'hard exploration' games such as Montezuma's Revenge and Pitfall. We propose Agent57, the first deep reinforcement learning agent to score strictly above the human baseline on all 57 Atari 2600 games. Agent57 achieves this by combining an episodic intrinsic reward for deep exploration with a meta-controller that, over the course of training, dynamically adapts both the exploration-exploitation trade-off and the agent's discount factor, which sets its effective time horizon.
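
To make the meta-controller idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the controller is a sliding-window UCB bandit whose arms are (beta, gamma) pairs, where beta weights the intrinsic reward and gamma is the discount factor. The abstract does not specify the bandit, so that choice, the arm values, the class name `SlidingWindowUCB`, and the helper `mixed_reward` are all illustrative assumptions.

```python
import math
from collections import deque

# Hypothetical arms: (beta, gamma) = (intrinsic-reward weight, discount factor).
# The values are illustrative only, not taken from the paper.
ARMS = [(0.0, 0.997), (0.1, 0.99), (0.3, 0.99)]


class SlidingWindowUCB:
    """Bandit that scores arms using only the most recent `window` episodes."""

    def __init__(self, n_arms, window=50, bonus=1.0):
        self.n_arms = n_arms
        self.bonus = bonus
        # Old (arm, return) pairs fall off the left once the window is full,
        # which lets the preferred arm change as training progresses.
        self.history = deque(maxlen=window)

    def select(self):
        counts = [0] * self.n_arms
        sums = [0.0] * self.n_arms
        for arm, ret in self.history:
            counts[arm] += 1
            sums[arm] += ret
        # Any arm with no sample in the current window is tried first.
        for arm in range(self.n_arms):
            if counts[arm] == 0:
                return arm
        total = len(self.history)
        scores = [
            sums[a] / counts[a]
            + self.bonus * math.sqrt(math.log(total) / counts[a])
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, arm, episode_return):
        self.history.append((arm, episode_return))


def mixed_reward(extrinsic, intrinsic, beta):
    """Assumed reward mix: game reward plus beta-weighted exploration bonus."""
    return extrinsic + beta * intrinsic
```

In use, the agent would call `select()` at the start of each episode, act with the chosen `ARMS[arm]` pair, and then call `update(arm, episode_return)` with the undiscounted game score. Because the window discards old episodes, an arm with a high beta can dominate early, when exploration pays off, and lose out later to a more exploitative arm.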
