Quick answer
AI Summary: Iterative refinement between the agent and its world model leads to more accurate physical interaction modeling.
This paper aims to improve vision-language-action (VLA) policy performance through iterative online interaction. Because collecting real-world rollouts is expensive, we investigate whether a learned simulator, specifically an action-conditioned video generation model, can generate rollout data instead. We propose VLAW, an iterative algorithm that uses real-world rollout data to improve the physical fidelity of a world model, which in turn generates synthetic data to improve the VLA policy. In real-robot experiments, VLAW achieves a 39.2% absolute improvement in success rate over the base policy. We show that the world model learns to capture complex dynamics and failure cases, enabling the policy to progressively overcome execution errors.
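The alternating loop described in the abstract can be sketched in miniature. This is a toy illustration only, not the authors' implementation: the one-step `tanh` dynamics, the scalar "policy", the linear `WorldModel`, and the best-imagined-rollout policy update are all assumptions made for the example, and every name in it (`real_rollout`, `co_improve`, `TARGET`) is hypothetical. The abstract does not specify how the synthetic data is used to update the policy.

```python
import math

TARGET = 0.9  # hypothetical task goal: desired outcome of a one-step "rollout"


def real_rollout(action: float) -> float:
    """Stand-in for an expensive real-robot rollout (toy nonlinear dynamics)."""
    return math.tanh(action)


class WorldModel:
    """Toy learned simulator (assumption): predicts outcome = gain * action."""

    def __init__(self) -> None:
        self.gain = 1.0

    def fit(self, rollouts: list[tuple[float, float]]) -> None:
        # Least-squares fit through the origin on (action, outcome) pairs.
        num = sum(a * o for a, o in rollouts)
        den = sum(a * a for a, _ in rollouts)
        self.gain = num / den

    def predict(self, action: float) -> float:
        return self.gain * action


def co_improve(initial_action: float = 0.5, iterations: int = 8) -> list[float]:
    """Alternate world-model and policy updates; return the real error per round."""
    action = initial_action  # the toy "policy" is a single scalar action
    model = WorldModel()
    errors = [abs(real_rollout(action) - TARGET)]
    for _ in range(iterations):
        # 1. Collect a few real rollouts around the current policy.
        real = [(a, real_rollout(a)) for a in (0.9 * action, action, 1.1 * action)]
        # 2. Refit the world model on the newest real data.
        model.fit(real)
        # 3. Generate many cheap synthetic rollouts inside the world model.
        candidates = [action * (0.5 + 0.05 * i) for i in range(31)]
        synthetic = [(a, model.predict(a)) for a in candidates]
        # 4. Move the policy to the action of the best imagined rollout.
        action = min(synthetic, key=lambda pair: abs(pair[1] - TARGET))[0]
        errors.append(abs(real_rollout(action) - TARGET))
    return errors


if __name__ == "__main__":
    for round_index, error in enumerate(co_improve()):
        print(f"round {round_index}: real-world error = {error:.3f}")
```

In this sketch only three real rollouts are collected per round, while the policy update draws on 31 imagined ones. The world model is only locally accurate, so each new batch of real data taken near the updated policy corrects it where the policy now operates, which is the co-improvement idea in its simplest form.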
Paper title: VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model