Quick answer
AI Summary: GSR teaches robots to decompose manipulation tasks into logical sub-goals based on object affordances.
We introduce Grounded Scene-graph Reasoning (GSR), a structured reasoning paradigm that explicitly models world-state evolution as transitions over semantically grounded scene graphs. Rather than mapping perception directly to actions, GSR reasons step by step over object states and spatial relations, which makes the preconditions and consequences of each action explicit. To supervise world understanding and action planning, we construct Manip-Cognition-1.6M, a large-scale dataset. Evaluations on RLBench and real-world tasks show that GSR, by treating scene graphs as the primary state space for decision making, significantly improves zero-shot generalization and long-horizon task completion over prompting-based baselines.
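To make the idea of "transitions over scene graphs" concrete, here is a minimal sketch in Python. It is not the paper's implementation: it assumes a STRIPS-style encoding in which a scene graph is a set of (subject, predicate, object) facts and each action lists the facts it requires, adds, and deletes. The names `Fact`, `Action`, `apply_action`, and `rollout`, and the drawer-and-cup scenario, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Assumption: one fact is a (subject, predicate, object) triple. It covers both
# object states ("drawer", "is", "closed") and spatial relations ("cup", "on", "table").
Fact = tuple[str, str, str]


@dataclass(frozen=True)
class Action:
    """Hypothetical action schema: preconditions plus add/delete effects."""
    name: str
    preconditions: frozenset[Fact]
    add: frozenset[Fact]
    delete: frozenset[Fact]


def apply_action(graph: frozenset[Fact], action: Action) -> frozenset[Fact]:
    """Return the successor scene graph, or raise if a precondition is unmet."""
    missing = action.preconditions - graph
    if missing:
        raise ValueError(f"{action.name}: unmet preconditions {sorted(missing)}")
    return (graph - action.delete) | action.add


def rollout(graph: frozenset[Fact], plan: list[Action]) -> frozenset[Fact]:
    """Apply a plan step by step, checking preconditions at every transition."""
    for action in plan:
        graph = apply_action(graph, action)
    return graph


# Illustrative scenario: put a cup into a closed drawer.
initial = frozenset({
    ("drawer", "is", "closed"),
    ("cup", "on", "table"),
    ("gripper", "is", "empty"),
})

open_drawer = Action(
    name="open_drawer",
    preconditions=frozenset({("drawer", "is", "closed"), ("gripper", "is", "empty")}),
    add=frozenset({("drawer", "is", "open")}),
    delete=frozenset({("drawer", "is", "closed")}),
)
pick_cup = Action(
    name="pick_cup",
    preconditions=frozenset({("cup", "on", "table"), ("gripper", "is", "empty")}),
    add=frozenset({("gripper", "holding", "cup")}),
    delete=frozenset({("cup", "on", "table"), ("gripper", "is", "empty")}),
)
place_cup = Action(
    name="place_cup_in_drawer",
    preconditions=frozenset({("gripper", "holding", "cup"), ("drawer", "is", "open")}),
    add=frozenset({("cup", "in", "drawer"), ("gripper", "is", "empty")}),
    delete=frozenset({("gripper", "holding", "cup")}),
)

final = rollout(initial, [open_drawer, pick_cup, place_cup])
```

Under these assumptions, `final` contains the facts that the drawer is open, the cup is in the drawer, and the gripper is empty. Calling `apply_action(initial, place_cup)` raises a `ValueError`, because the drawer is still closed and the gripper holds nothing. That is the kind of precondition check a direct perception-to-action mapping leaves implicit.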