
Claim

Scaling Verification Can Be More Effective than Scaling Policy Learning for VLA

Authors
Jacky Kwok · Xilun Zhang · Marco Pavone · Chelsea Finn

Abstract

We investigate test-time verification as a means to shrink the 'intention-action gap' in embodied AI. We characterize the test-time scaling law for embodied instruction following and demonstrate that jointly scaling the number of rephrased instructions and generated actions increases test-time sample diversity. We present CoVer, a contrastive verifier for vision-language-action alignment, and introduce 'boot-time compute', a hierarchical verification pipeline. Compared to scaling policy pre-training on the same data, our verification approach yields a 22% gain in-distribution and a 45% improvement in real-world experiments, suggesting that verification is a more compute-efficient path to alignment.
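The test-time scaling the abstract describes, sampling actions under multiple rephrased instructions and letting a verifier select the best candidate, amounts to a best-of-N loop. The sketch below is illustrative only: the `policy` and `verifier` callables are hypothetical stand-ins, not the paper's actual VLA policy or the CoVer verifier.

```python
import random

def best_of_n_action(policy, verifier, observation, instructions, n_samples=8):
    """Best-of-N test-time verification (illustrative sketch).

    Jointly scales over rephrased instructions and sampled actions,
    then returns the candidate the verifier scores highest.
    """
    candidates = []
    for instr in instructions:          # rephrased instructions
        for _ in range(n_samples):      # sampled actions per instruction
            action = policy(observation, instr)
            score = verifier(observation, instr, action)
            candidates.append((score, action))
    return max(candidates, key=lambda c: c[0])[1]

# Toy stand-ins: the "policy" proposes noisy scalar actions and the
# "verifier" prefers actions close to 1.0 (a hypothetical target).
policy = lambda obs, instr: random.gauss(1.0, 0.5)
verifier = lambda obs, instr, a: -abs(a - 1.0)

random.seed(0)
instructions = ["pick up the red cup", "grasp the red mug"]  # rephrasings
best = best_of_n_action(policy, verifier, observation=None,
                        instructions=instructions, n_samples=64)
```

With more instructions and samples the best candidate's verifier score improves monotonically, which is the diversity effect the abstract attributes to joint scaling.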

