Topic: cs.LG

Short answer

This page shows the most relevant public items for cs.LG, ranked by trend activity and review signal. Use the weekly view for fast-moving work, the monthly view for more stable patterns, and the all-time view for evergreen picks.

  1. Diffusion Alignment as Variational Expectation-Maximization

    Paper · Feb 13, 2026 · arXiv · Zijing Ou, Jacob Si, Junyi Zhu, Yingzhen Li

    Diffusion alignment aims to optimize diffusion models for downstream objectives. While existing methods based on RL achieve success, they often suffer from reward over-optimization and mode collaps...

  2. Minimax M2.5: Scaling RL for Industrial-Grade Agentic AI

    Paper · Feb 16, 2026 · arXiv · MiniMax Research Team

    Training agents for industrial-scale deployment requires extreme stability and data throughput. We present Minimax M2.5, a model trained using a novel asynchronous RL architecture designed to proce...

  3. Fast KV Compaction via Attention Matching

    Paper · Feb 18, 2026 · arXiv · Adam Zweiger, Xinghong Fu, Han Guo, MIT Team

    Large Language Models struggle with memory overhead during long-context inference due to the linear growth of the Key-Value (KV) cache. We propose Attention Matching (AM), a framework for high-qual...

  4. KLong: Training LLM Agents for Extremely Long-horizon Tasks

    Paper · Feb 19, 2026 · arXiv · Yue Liu, Zhiyuan Hu, Flood Sung

    Current LLM agents frequently fail in tasks requiring hundreds of steps due to error accumulation and context overflow. We introduce KLong, an agentic framework that utilizes 'Trajectory-Splitting ...

Related Topics

cs.AI (11) · Machine Learning (8) · Reinforcement Learning (4) · q-bio.NC (2) · Preference Optimization (1)