Quick answer
AI Summary: Presents Agent57, the first RL algorithm to achieve superhuman performance across the entire 57-game Atari benchmark by mastering 'hard exploration' games through dynamic intrinsic rewards.
AI Summary: Presents Agent57, the first RL algorithm to achieve superhuman performance across the entire 57-game Atari benchmark by mastering 'hard exploration' games through dynamic intrinsic rewards.
Atari 2600 games have been a long-standing benchmark in the reinforcement learning community. While previous algorithms have achieved superhuman performance on average, they consistently fail on a subset of 'hard exploration' games like Montezuma's Revenge and Pitfall. We propose Agent57, the first deep reinforcement learning agent to obtain a score that is strictly above the human baseline on all 57 Atari 2600 games. Agent57 achieves this by combining an episodic intrinsic reward mechanism for deep exploration with a meta-controller that dynamically adapts the exploration-exploitation trade-off and the time horizon of the agent's discount factor over the course of training.
Share your opinion to help other learners triage faster.
Write a reviewInvite someone by email to share an invited review for Agent57: Outperforming the Atari Human Benchmark.