AI Summary: Enables realistic physics-based simulations and 'what-if' scenarios from monocular video inputs.
To bridge the simulation-to-real gap, we introduce Mirage2Matter, a physically grounded Gaussian world model that generates high-fidelity embodied training data from multi-view videos. We reconstruct environments into photorealistic scene representations using 3DGS, then leverage generative models to recover physically realistic properties like collision geometry. By integrating this into a simulation environment via a precision calibration target, we ensure accurate scale alignment. Extensive experiments show that VLA models trained on our generated data exhibit strong zero-shot generalization across various manipulation tasks, overcoming the simulation mismatches that usually undermine zero-shot deployment.
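The scale-alignment step described above can be sketched as follows. This is an illustrative example only, not the paper's implementation: the function names, the target size, and the measured extent are all hypothetical. The idea is that a calibration target of known physical size, observed in the 3DGS reconstruction, yields the factor that maps reconstruction units to metres.

```python
import numpy as np

# Hypothetical sketch of metric scale alignment via a calibration target.
# All names and numbers are illustrative, not taken from Mirage2Matter.

def scale_from_target(measured_extent: float, true_extent_m: float) -> float:
    """Scale factor mapping reconstruction units to metres, estimated
    from a calibration target whose physical size is known."""
    return true_extent_m / measured_extent

def align_points(points: np.ndarray, scale: float) -> np.ndarray:
    """Apply the metric scale to reconstructed 3D points of shape (N, 3)."""
    return points * scale

# Example: a target known to be 0.30 m wide spans 1.5 units in the
# reconstruction, so one reconstruction unit corresponds to 0.2 m.
scale = scale_from_target(measured_extent=1.5, true_extent_m=0.30)
points = np.array([[1.0, 2.0, 3.0]])
aligned = align_points(points, scale)  # points now expressed in metres
```

In practice the same factor would be applied to the Gaussian positions and collision geometry before importing the scene into the simulator, so that object sizes and gripper reach match the real world.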