Topic: Awesome List: ai-agent-papers-2026

Short answer

This page shows the most relevant public items for Awesome List: ai-agent-papers-2026, ranked by trend activity and review signals. Use the weekly view for fast changes, the monthly view for more stable patterns, and the all-time view for evergreen picks.

Emulating Aggregate Human Choice Behavior and Biases with GPT Conversational Agents
Paper • Feb 5, 2026 • arxiv.org • Stephen Pilli, Vivek Nallur
Cognitive biases often shape human decisions. While large language models (LLMs) have been shown to reproduce well-known biases, a more critical question is whether LLMs can predict biases at the i...
TrajAD: Trajectory Anomaly Detection for Trustworthy LLM Agents
Paper • Feb 6, 2026 • arxiv.org • Yibing Liu, Chong Zhang, Zhongyi Han, Hansong Liu, Yong Wang, Yang Yu, Xiaoyan Wang, Yilong Yin
We address the problem of runtime trajectory anomaly detection, a critical capability for enabling trustworthy LLM agents. Current safety measures predominantly focus on static input/output filteri...
Completing Missing Annotation: Multi-Agent Debate for Accurate and Scalable Relevant Assessment for IR Benchmarks
Paper • Feb 6, 2026 • arxiv.org • Minjeong Ban, Jeonghwan Choi, Hyangsuk Min, Nicole Hee-Yeon Kim, Minseok Kim, Jae-Gil Lee, Hwanjun Song
Information retrieval (IR) evaluation remains challenging due to incomplete IR benchmark datasets that contain unlabeled relevant chunks. While LLMs and LLM-human hybrid strategies reduce costly hu...
JADE: Expert-Grounded Dynamic Evaluation for Open-Ended Professional Tasks
Paper • Feb 6, 2026 • arxiv.org • Lanbo Lin, Jiayao Liu, Tianyuan Yang, Li Cai, Yuanwu Xu, Lei Wei, Sicong Xie, Guannan Zhang
Evaluating agentic AI on open-ended professional tasks faces a fundamental dilemma between rigor and flexibility. Static rubrics provide rigorous, reproducible assessment but fail to accommodate di...
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents
Paper • Feb 6, 2026 • arxiv.org • Alisia Lupidi, Bhavul Gauri, Thomas Simon Foster, Bassel Al Omari, Despoina Magka, Alberto Pepe, Alexis Audran-Reiss, Muna Aghamelu, Nicolas Baldwin, Lucia Cipolina-Kun, Jean-Christophe Gagnon-Audet, Chee Hau Leow, Sandra Lefdal, Hossam Mossalam, Abhinav Moudgil, Saba Nazir, Emanuel Tewolde, Isabel Urrego, Jordi Armengol Estape, Amar Budhiraja, Gaurav Chaurasia, Abhishek Charnalia, Derek Dunfield, Karen Hambardzumyan, Daniel Izcovich, Martin Josifoski, Ishita Mediratta, Kelvin Niu, Parth Pathak, Michael Shvartsman, Edan Toledo, Anton Protopopov, Roberta Raileanu, Alexander Miller, Tatiana Shavrina, Jakob Foerster, Yoram Bachrach
LLM agents hold significant promise for advancing scientific research. To accelerate this progress, we introduce AIRS-Bench (the AI Research Science Benchmark), a suite of 20 tasks sourced from sta...
Agentic Uncertainty Reveals Agentic Overconfidence
Paper • Feb 6, 2026 • arxiv.org • Jean Kaddour, Srijan Patel, Gbètondji Dovonon, Leo Richter, Pasquale Minervini, Matt J. Kusner
Can AI agents predict whether they will succeed at a task? We study agentic uncertainty by eliciting success probability estimates before, during, and after task execution. All results exhibit agen...
From Features to Actions: Explainability in Traditional and Agentic AI Systems
Paper • Feb 6, 2026 • arxiv.org • Sindhuja Chaduvula, Jessee Ho, Kina Kim, Aravind Narayanan, Mahshid Alinoori, Muskan Garg, Dhanesh Ramachandram, Shaina Raza
Over the last decade, explainable AI has primarily focused on interpreting individual model predictions, producing post-hoc explanations that relate inputs to outputs under a fixed decision structu...
SimpleMem: Efficient Lifelong Memory for LLM Agents
Paper • Jan 29, 2026 • arxiv.org • Jiaqi Liu, Yaofeng Su, Peng Xia, Siwei Han, Zeyu Zheng, Cihang Xie, Mingyu Ding, Huaxiu Yao
To support long-term interaction in complex environments, LLM agents require memory systems that manage historical experiences. Existing approaches either retain full interaction histories via pass...
HiMeS: Hippocampus-inspired Memory System for Personalized AI Assistants
Paper • Jan 6, 2026 • arxiv.org • Hailong Li, Feifei Li, Wenhui Que, Xingyu Fan
Large language models (LLMs) power many interactive systems such as chatbots, customer-service agents, and personal assistants. In knowledge-intensive scenarios requiring user-specific personalizat...
MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents
Paper • Jan 6, 2026 • arxiv.org • Dongming Jiang, Yi Li, Guanpeng Li, Bingzhe Li
Memory-Augmented Generation (MAG) extends Large Language Models with external memory to support long-context reasoning, but existing approaches largely rely on semantic similarity over monolithic m...
Membox: Weaving Topic Continuity into Long-Range Memory for LLM Agents
Paper • Jan 20, 2026 • arxiv.org • Dehao Tao, Guoliang Ma, Yongfeng Huang, Minghu Jiang
Human-agent dialogues often exhibit topic continuity, a stable thematic frame that evolves through temporally adjacent exchanges, yet most large language model (LLM) agent memory systems fail to pres...
Beyond Static Summarization: Proactive Memory Extraction for LLM Agents
Paper • Jan 8, 2026 • arxiv.org • Chengyuan Yang, Zequn Sun, Wei Wei, Wei Hu
Memory management is vital for LLM agents to handle long-term interaction and personalization. Most research focuses on how to organize and use memory summary, but often overlooks the initial memor...
Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction
Paper • Jan 8, 2026 • arxiv.org • Muzhao Tian, Zisu Huang, Xiaohua Wang, Jingwen Xu, Zhengkang Guo, Qi Qian, Yuanzhe Shen, Kaitao Song, Jiakang Yuan, Changze Lv, Xiaoqing Zheng
As LLM-based agents are increasingly used in long-term interactions, cumulative memory is critical for enabling personalization and maintaining stylistic consistency. However, most existing systems...
PRISMA: Reinforcement Learning Guided Two-Stage Policy Optimization in Multi-Agent Architecture for Open-Domain Multi-Hop Question Answering
Paper • Jan 9, 2026 • arxiv.org • Yu Liu, Wenxiao Zhang, Cong Cao, Wenxuan Lu, Fangfang Yuan, Diandian Guo, Kun Peng, Qiang Sun, Kaiyan Zhang, Yanbing Liu, Jin B. Hong, Bowen Zhou, Zhiyuan Ma
Answering real-world open-domain multi-hop questions over massive corpora is a critical challenge in Retrieval-Augmented Generation (RAG) systems. Recent research employs reinforcement learning (RL...
L-RAG: Balancing Context and Retrieval with Entropy-Based Lazy Loading
Paper • Jan 10, 2026 • arxiv.org • Sergii Voloshyn
Retrieval-Augmented Generation (RAG) has emerged as the predominant paradigm for grounding Large Language Model outputs in factual knowledge, effectively mitigating hallucinations. However, convent...
Amory: Building Coherent Narrative-Driven Agent Memory through Agentic Reasoning
Paper • Jan 9, 2026 • arxiv.org • Yue Zhou, Xiaobo Guo, Belhassen Bayar, Srinivasan H. Sengamedu
Long-term conversational agents face a fundamental scalability challenge as interactions extend over time: repeatedly processing entire conversation histories becomes computationally prohibitive. C...
CIRAG: Construction-Integration Retrieval and Adaptive Generation for Multi-hop Question Answering
Paper • Jan 11, 2026 • arxiv.org • Zili Wei, Xiaocui Yang, Yilin Wang, Zihan Wang, Weidong Bao, Shi Feng, Daling Wang, Yifei Zhang
Triple-based Iterative Retrieval-Augmented Generation (iRAG) mitigates document-level noise for multi-hop question answering. However, existing methods still face limitations: (i) greedy single-pat...
Seeing through the Conflict: Transparent Knowledge Conflict Handling in Retrieval-Augmented Generation
Paper • Jan 11, 2026 • arxiv.org • Hua Ye, Siyuan Chen, Ziqi Zhong, Canran Xiao, Haoliang Zhang, Yuhan Wu, Fei Shen
Large language models (LLMs) equipped with retrieval--the Retrieval-Augmented Generation (RAG) paradigm--should combine their parametric knowledge with external evidence, yet in practice they often...
Relink: Constructing Query-Driven Evidence Graph On-the-Fly for GraphRAG
Paper • Jan 12, 2026 • arxiv.org • Manzong Huang, Chenyang Bu, Yi He, Xingrui Zhuo, Xindong Wu
Graph-based Retrieval-Augmented Generation (GraphRAG) mitigates hallucinations in Large Language Models (LLMs) by grounding them in structured knowledge. However, current GraphRAG methods are const...
Active Context Compression: Autonomous Memory Management in LLM Agents
Paper • Jan 12, 2026 • arxiv.org • Nikhil Verma
Large Language Model (LLM) agents struggle with long-horizon software engineering tasks due to "Context Bloat." As interaction history grows, computational costs explode, latency increases,...

FAQ

What does this Awesome List: ai-agent-papers-2026 page rank?

It ranks public content for Awesome List: ai-agent-papers-2026 using recent discussion, review, and engagement signals so you can triage faster. This guidance is specific to the Awesome List: ai-agent-papers-2026 topic page on Attendemia and is written to make sense without reading other sections on the page.

How should I use weekly vs monthly vs all-time?

Use weekly for fast-moving updates, monthly for stable trend confirmation, and all-time for evergreen references.

How can I discover organizations active in Awesome List: ai-agent-papers-2026?

Use the linked entities section to jump to labs, companies, and experts connected to this topic and explore their timelines.

Can I follow this topic for updates?

Yes. Use the follow button on this page to subscribe and track new high-signal activity.