CompactRAG: Reducing LLM Calls and Token Overhead in Multi-Hop Question Answering
Paper • Feb 5, 2026 • arxiv.org • Hao Yang, Zhiyu Yang, Xupeng Zhang, Wei Wei, Yunjie Zhang, Lin Yang
Retrieval-augmented generation (RAG) has become a key paradigm for knowledge-intensive question answering. However, existing multi-hop RAG systems remain inefficient, as they alternate between retr...