Topic: Reinforcement Learning

Short answer

This page shows the most relevant public items for Reinforcement Learning, ranked by trend activity and review signal. Use the weekly view for fast changes, the monthly view for more stable patterns, and the all-time view for evergreen picks.

Minimax M2.5: Scaling RL for Industrial-Grade Agentic AI
Paper • Feb 16, 2026 • arXiv • MiniMax Research Team
Training agents for industrial-scale deployment requires extreme stability and data throughput. We present Minimax M2.5, a model trained using a novel asynchronous RL architecture designed to proce...

MASPO: Robust and Sample-Efficient LLM Reasoning via Unified Policy Optimization
Paper • Feb 19, 2026 • arXiv • Xiaoliang Fu, Jiaye Lin, Yangyi Fang
Policy optimization for Large Language Models often suffers from gradient instability and reward signal unreliability, particularly in mathematical and verifiable reasoning tasks. We introduce MASP...

KLong: Training LLM Agents for Extremely Long-horizon Tasks
Paper • Feb 19, 2026 • arXiv • Yue Liu, Zhiyuan Hu, Flood Sung
Current LLM agents frequently fail in tasks requiring hundreds of steps due to error accumulation and context overflow. We introduce KLong, an agentic framework that utilizes 'Trajectory-Splitting ...

Trust Region Policy Optimization
Paper • Feb 19, 2015 • arXiv • John Schulman, Sergey Levine, Philipp Moritz, Michael Jordan, Pieter Abbeel
We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement. By making several approximations to the theoretically-justified procedure, we develop a practical ...

Asynchronous Methods for Deep Reinforcement Learning
Paper • Feb 4, 2016 • arXiv • Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present as...

A Generalist Agent
Paper • May 12, 2022 • arXiv • Scott Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Nando de Freitas
Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato,...

FAQ

What does this Reinforcement Learning page rank?

It ranks public content for Reinforcement Learning using recent discussion, review, and engagement signals so you can triage faster. This guidance is specific to the Reinforcement Learning topic page on Attendemia and is written so that it makes sense without reading other sections of the page.

How should I use weekly vs monthly vs all-time?

Use weekly for fast-moving updates, monthly for stable trend confirmation, and all-time for evergreen references.

How can I discover organizations active in Reinforcement Learning?

Use the linked entities section to jump to labs, companies, and experts connected to this topic and explore their timelines.

Can I follow this topic for updates?

Yes. Use the follow button on this page to subscribe and track new high-signal activity.
