AI Summary: Employs an adversarial 'Alice and Bob' self-play framework to automatically generate increasingly complex physical goals, allowing robotic hands to learn generalized manipulation skills without human-designed rewards.
Training robots to solve a wide variety of manipulation tasks typically requires extensive human engineering of reward functions and goal specifications. We introduce a method for automatic goal discovery using asymmetric self-play between two reinforcement learning agents: an 'Alice' agent that discovers novel physical states by interacting with the environment, and a 'Bob' agent that is trained to replicate those states. This competitive dynamic creates an automated curriculum: Alice is incentivized to propose increasingly complex goals, and Bob learns a robust, generalized manipulation policy. We demonstrate that policies trained entirely via this unsupervised self-play transfer successfully to physical robot hands, enabling them to solve complex block-stacking and manipulation tasks zero-shot.
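The self-play dynamic described above can be illustrated with a minimal sketch. This is not the paper's implementation (which uses deep RL policies in a physics simulator); it is a toy 1-D environment with hand-coded policies, included only to show the reward structure: Alice's final state becomes Bob's goal, and Alice is rewarded exactly when Bob fails, which pushes her toward harder goals. All names (`alice_rollout`, `bob_attempt`, `self_play_round`) are illustrative assumptions.

```python
import random

def alice_rollout(start, num_steps=5):
    """Alice explores the environment; her final state becomes Bob's goal.
    The 'environment' here is a toy 1-D position with +/-1 actions."""
    state = start
    for _ in range(num_steps):
        state += random.choice([-1, 1])
    return state  # the proposed goal state

def bob_attempt(start, goal, max_steps=3):
    """Bob tries to reproduce Alice's final state with a greedy policy.
    His step budget is smaller than Alice's, so distant goals can fail."""
    state = start
    for _ in range(max_steps):
        if state == goal:
            return True
        state += 1 if goal > state else -1
    return state == goal

def self_play_round(start=0):
    """One round of asymmetric self-play.
    Alice earns reward only when Bob fails (incentive to propose harder
    goals); Bob earns reward only when he reaches the goal."""
    goal = alice_rollout(start)
    bob_success = bob_attempt(start, goal)
    alice_reward = 0 if bob_success else 1
    bob_reward = 1 if bob_success else 0
    return goal, alice_reward, bob_reward
```

In the full method, both agents are learned policies and this round is repeated many times, so the goal distribution grows in difficulty as Bob improves, yielding the automated curriculum the abstract describes.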