Quick answer
AI Summary: Introduces SIMA, a generalist AI agent trained across multiple 3D video games to follow natural language instructions using only visual inputs and generic controls.
We introduce the Scalable Instructable Multiworld Agent (SIMA), an AI agent capable of following natural-language instructions to carry out tasks in a wide variety of 3D virtual environments and video games. SIMA operates using only pixel inputs and keyboard/mouse action outputs, mimicking human interaction. By training the agent across multiple commercial video games and research environments concurrently, we demonstrate that SIMA learns generalizable skills that transfer to unseen games. This research highlights the potential of using diverse simulated worlds to train general-purpose embodied agents that can understand language and execute complex, long-horizon tasks.