Topic: cs.AI

Short answer

This page shows the most relevant public items for cs.AI, ranked by trend activity and review signal. Use the weekly view for fast changes, the monthly view for more stable patterns, and the all-time view for evergreen picks.

From Features to Actions: Explainability in Traditional and Agentic AI Systems
Paper • Feb 18, 2026 • arXiv • Moritz Miller, Florent Draye, Bernhard Schölkopf
This paper distinguishes between two paradigms in AI explanation: static prediction and agentic trajectories. In agentic systems, behavior emerges as a sequence of observations, reasoning steps, an...
OpenAI o1 System Card
Paper • Sep 12, 2024 • OpenAI • OpenAI
We introduce OpenAI o1, a new series of large language models trained with reinforcement learning to perform complex reasoning. o1 models are designed to spend more time thinking before they respon...
GPT-4 Technical Report
Paper • Mar 15, 2023 • arXiv • OpenAI
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT...
Training language models to follow instructions with human feedback
Paper • Mar 4, 2022 • arXiv • Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe
Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not he...
A Generalist Agent
Paper • May 12, 2022 • arXiv • Scott Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Nando de Freitas
Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato,...
From Vibe to Verification: Automated Synthesis of Formal Specifications from Agentic Prompts
Paper • Feb 20, 2026 • arXiv • Armando Solar-Lezama, Sumit Gulwani, Elena Rossi
The rapid adoption of natural language programming ('vibe coding') has democratized software creation but introduced unprecedented levels of technical debt and architectural instability. We propose...
KEPO: Knowledge-Enhanced Preference Optimization for Reinforcement Learning with Reasoning
Paper • Feb 4, 2026 • arXiv • Fan Yang, Rui Meng, Trudi Di Qi, Ali Ezzati, Yuxin Wen
Direct Preference Optimization (DPO) and its variants have revolutionized LLM alignment, yet they struggle when the preferred choice requires deep, multi-step reasoning. We introduce KEPO, a framew...
GLM-5: From Vibe Coding to Agentic Engineering
Paper • Feb 17, 2026 • arXiv • Zhipu AI Team, Tsinghua University Researchers
We present GLM-5, a foundation model designed to bridge the gap between human-guided 'vibe coding' and autonomous 'agentic engineering.' GLM-5 introduces DeepSeek-inspired Sparse Attention (DSA) to...
GSR: Learning Structured Reasoning for Embodied Manipulation
Paper • Feb 10, 2026 • arXiv • Kewei Hu, Michael Zhang, Hanwen Kang
We introduce Grounded Scene-graph Reasoning (GSR), a structured reasoning paradigm that explicitly models world-state evolution as transitions over semantically grounded scene graphs. By reasoning ...
Differentiable Modal Logic for Multi-Agent Diagnosis, Orchestration and Communication
Paper • Feb 12, 2026 • arXiv • Antonin Sulc
As multi-agent systems evolve into autonomous swarms, debugging failures requires reasoning about knowledge, belief, and obligation. We present Differentiable Modal Logic (DML), implemented via Mod...
Mirage2Matter: A Physically Grounded Gaussian World Model from Video
Paper • Feb 8, 2026 • arXiv • Zhengqing Gao, Ziwen Li, Xin Wang, Tongliang Liu
To bridge the simulation-to-real gap, we introduce Mirage2Matter, a physically grounded Gaussian world model that generates high-fidelity embodied training data from multi-view videos. We reconstru...
Towards On-Policy SFT: Distribution Discriminant Theory and its Applications
Paper • Feb 13, 2026 • arXiv • Miaosen Zhang, Xu Yang, Qi Dai, Chong Luo
Supervised fine-tuning (SFT) is efficient but often yields inferior generalization compared to RL, a gap driven by RL's use of on-policy data. We propose a framework to bridge this chasm by enablin...
ZEST: Zero-shot Embodied Skill Transfer for Athletic Robot Control
Paper • Feb 8, 2026 • arXiv • Eva Mungai, Zach Nobles, Scott Kuindersma, Yeuhi Abe
We introduce ZEST (Zero-shot Embodied Skill Transfer), a motion-imitation framework that trains policies via RL from diverse sources—mocap, noisy monocular video, and animation—and deploys them to ...
Tiny Recursive Reasoning with Mamba-2 Attention Hybrid
Paper • Feb 12, 2026 • arXiv • Wenlong Wang, Fergal Reid
Recent work demonstrates that tiny networks (7M parameters) can achieve strong performance on abstract reasoning through latent recursion. We investigate whether Mamba-2's state space recurrence, i...
Any House Any Task: Scalable Long-Horizon Planning for Abstract Human Tasks
Paper • Feb 13, 2026 • arXiv • Zhihong Liu, Yang Li, Rengming Huang, Cewu Lu
Open world language conditioned task planning is crucial for robots in large-scale households. A key challenge remains scalability; performance degrades with increasing environment size and plan le...
Scaling Verification Can Be More Effective than Scaling Policy Learning for VLA
Paper • Feb 13, 2026 • arXiv • Jacky Kwok, Xilun Zhang, Marco Pavone, Chelsea Finn
We investigate test-time verification as a means to shrink the 'intention-action gap' in embodied AI. We characterize the test-time scaling law for embodied instruction following and demonstrate th...
Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Models
Paper • Feb 8, 2026 • arXiv • Fan Yang, Rui Meng, Trudi Di Qi, Yuxin Wen
We argue that while replacing data with model parameters characterizes the present of Federated Learning (FL), replacing parameters with preferences represents a more scalable and privacy-preservin...
Inference-Only Prompt Projection for Safe Text-to-Image Generation
Paper • Feb 9, 2026 • arXiv • Minhyuk Lee, Hyekyung Yoon, Myungjoo Kang
Text-to-Image (T2I) diffusion models enable high-quality synthesis, but deployment demands safeguards. We formalize the tension between safety and alignment through a total variation (TV) lens, yie...
STAR: Bridging Statistical and Agentic Reasoning for Large Model Performance Prediction
Paper • Feb 12, 2026 • arXiv • Xiaoxiao Wang, Chunxiao Li, Junying Wang, Zicheng Zhang
The paper introduces STAR (STatistical and Agentic Reasoning), a novel framework designed to predict the performance of large models across diverse benchmarks from limited observations. Existing st...
MEM1: A Constant-Memory RL Framework for Long-Horizon Language Agents
Paper • Feb 12, 2026 • arXiv • Yurong Chen, Yu He, Michael I. Jordan, Fan Yao
Modern language agents must operate over long-horizon, multi-turn interactions, but most rely on full-context prompting which leads to unbounded memory growth. We introduce MEM1, an end-to-end rein...

FAQ

What does this cs.AI page rank?

It ranks public content for cs.AI using recent discussion, review, and engagement signals so you can triage faster. This guidance is specific to the cs.AI topic page on Attendemia and is written so it still makes sense without reading other sections on the page.

How should I use weekly vs monthly vs all-time?

Use weekly for fast-moving updates, monthly for stable trend confirmation, and all-time for evergreen references.

How can I discover organizations active in cs.AI?

Use the linked entities section to jump to labs, companies, and experts connected to this topic and explore their timelines.

Can I follow this topic for updates?

Yes. Use the follow button on this page to subscribe and track new high-signal activity.
