Quick answer

AI Summary: MiRA introduces a 'divide and conquer' strategy for multimodal tasks by using three distinct agents for vision, text, and final judgment. By forcing the visual and textual agents to reason independently before a 'judge' combines their findings, the framework prevents the 'hallucination leakage' often seen in unified multimodal models.

Paper • 2026-02-20

MiRA: A Zero-Shot Mixture-of-Reasoning Agents Framework

Authors

Sethuraman et al.

AAMAS 2026 Main Track

ABSTRACT

We propose Mixture-of-Reasoning Agents (MiRA), a zero-shot multimodal framework that decomposes reasoning across three specialized agents: Visual Analyzing, Text Comprehending, and Judge. By consolidating multimodal evidence through independent reasoning paths, MiRA achieves 96.0% accuracy on ScienceQA, surpassing all few-shot and zero-shot baselines. This modular approach provides highly accurate, interpretable results without the need for task-specific adaptation, establishing a new paradigm for generalizable AI in multimodal answering.
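The three-agent decomposition described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the names `AgentReport`, `run_pipeline`, and the three stub agents are hypothetical, and the stubs use trivial placeholder rules where a real system would presumably call a vision or language model. The sketch only illustrates the control flow, in which the visual and text agents never see each other's output and the judge alone combines them.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class AgentReport:
    """Output of one agent: which agent, its chosen answer, and why."""
    agent: str
    answer: str
    rationale: str


def run_pipeline(
    image: bytes,
    context: str,
    question: str,
    choices: Sequence[str],
    visual_agent: Callable[[bytes, str, Sequence[str]], AgentReport],
    text_agent: Callable[[str, str, Sequence[str]], AgentReport],
    judge_agent: Callable[[str, Sequence[str], Sequence[AgentReport]], str],
) -> str:
    # Each agent reasons over its own modality only; neither sees the other's report.
    visual_report = visual_agent(image, question, choices)
    text_report = text_agent(context, question, choices)
    # Only the judge sees both reports and produces the final answer.
    return judge_agent(question, choices, [visual_report, text_report])


# --- Hypothetical stubs standing in for model calls (assumptions, not MiRA's agents) ---

def stub_visual_agent(image: bytes, question: str, choices: Sequence[str]) -> AgentReport:
    # Placeholder: always picks the first choice.
    return AgentReport("visual", choices[0], f"placeholder reading of {len(image)} image bytes")


def stub_text_agent(context: str, question: str, choices: Sequence[str]) -> AgentReport:
    # Placeholder: picks the first choice that literally appears in the context.
    for choice in choices:
        if choice.lower() in context.lower():
            return AgentReport("text", choice, f"'{choice}' appears in the context")
    return AgentReport("text", choices[0], "no choice mentioned; defaulted to first")


def stub_judge(question: str, choices: Sequence[str], reports: Sequence[AgentReport]) -> str:
    # Placeholder rule: accept agreement, otherwise side with the text agent.
    # The tie-break is arbitrary and chosen only to keep the example deterministic.
    answers = {r.answer for r in reports}
    if len(answers) == 1:
        return reports[0].answer
    return next(r.answer for r in reports if r.agent == "text")


if __name__ == "__main__":
    final = run_pipeline(
        image=b"\x00",
        context="The compass needle points south.",
        question="Which way does the needle point?",
        choices=["north", "south"],
        visual_agent=stub_visual_agent,
        text_agent=stub_text_agent,
        judge_agent=stub_judge,
    )
    print(final)
```

Passing the agents in as callables keeps the pipeline independent of any particular model, which matches the modular, zero-shot framing of the abstract; how MiRA actually prompts each agent or resolves disagreements is not stated here.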

#zero-shot #multimodal #cs-ma #cs-ai
