
Asymmetric self-play for automatic goal discovery in robotic manipulation

OpenAI Robotics Team

ABSTRACT

Training robots to solve a wide variety of manipulation tasks typically requires extensive human-engineered reward functions and goal specifications. We introduce a method for automatic goal discovery using asymmetric self-play between two reinforcement learning agents: an 'Alice' agent that discovers novel physical states by interacting with the environment, and a 'Bob' agent that is trained to replicate those states. This competitive dynamic creates an automated curriculum: Alice is incentivized to propose increasingly complex goals, and Bob learns a robust, generalized manipulation policy. We demonstrate that policies trained entirely via this unsupervised self-play transfer successfully to physical robot hands, enabling them to solve complex block-stacking and manipulation tasks zero-shot.
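The Alice-and-Bob dynamic described above can be sketched as a simple loop: Alice interacts with the environment and her final state becomes the goal; Bob then tries to reach that state and the two agents receive opposite rewards. The sketch below is illustrative only, not the paper's implementation: the `LineEnv` toy environment, the random Alice, and the greedy goal-conditioned Bob are all stand-ins for the learned policies and physical simulation the paper uses.

```python
import random

class LineEnv:
    """Toy 1-D integer-line environment standing in for the manipulation setup."""
    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):  # action is -1 or +1
        self.state += action
        return self.state

def alice_propose(env, policy, horizon=5):
    """Alice interacts with the environment; her final state becomes Bob's goal."""
    env.reset()
    for _ in range(horizon):
        env.step(policy())
    return env.state

def bob_attempt(env, goal, policy, budget=10):
    """Bob tries to reproduce Alice's final state within a fixed step budget."""
    state = env.reset()
    for _ in range(budget):
        if state == goal:
            return True
        state = env.step(policy(state, goal))
    return state == goal

def selfplay_round(env):
    # Alice here is a random walker (a stand-in for her learned policy).
    goal = alice_propose(env, policy=lambda: random.choice([-1, 1]))
    # Bob is goal-conditioned; this greedy rule stands in for his learned policy.
    solved = bob_attempt(env, goal, policy=lambda s, g: 1 if g > s else -1)
    # Asymmetric rewards: Alice is paid when Bob fails, Bob when he succeeds.
    alice_reward = 0.0 if solved else 1.0
    bob_reward = 1.0 if solved else 0.0
    return goal, solved, alice_reward, bob_reward
```

In the full method, Alice's reward pressures her toward states Bob cannot yet reach, which is what generates the automatic curriculum; in this toy version the greedy Bob always succeeds, so only the structure of the loop carries over.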
