Topic: cs.CL

Short answer

This page shows the most relevant public items for cs.CL, ranked by trend activity and review signal. Use weekly for fast changes, monthly for more stable patterns, and all-time for evergreen picks.

  1. Attention Is All You Need

    Paper · Jun 12, 2017 · arXiv · Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

    The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder ...

  2. AnyTool: Self-Reflective API Generation for Open-Ended Agentic AI

    Paper · Jul 22, 2025 · arXiv · Wei Chen, Yujin Han, Qingwen Bu

    Current Agentic AI systems are constrained by the predefined set of tools provided by developers. We introduce AnyTool, a framework that grants agents the autonomy to dynamically generate, test, an...

  3. TruthfulQA: Measuring How Models Mimic Human Falsehoods

    Paper · Sep 8, 2021 · arXiv · Stephanie Lin, Jacob Hilton, Owain Evans

    We propose TruthfulQA, a benchmark to measure whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including healt...

  4. WebGPT: Browser-assisted question-answering with human feedback

    Paper · Dec 16, 2021 · arXiv · Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman

    We introduce a method for fine-tuning language models to interact with a text-based web browser to answer open-ended questions. This model, WebGPT, searches the web, navigates through links, and sy...

  5. Learning to summarize from human feedback

    Paper · Sep 2, 2020 · arXiv · Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano

    We show that it is possible to significantly improve the quality of text summaries generated by large language models by training them with reinforcement learning from human feedback. We collect a ...

  6. Improving Language Understanding by Generative Pre-Training

    Paper · Jun 11, 2018 · OpenAI · Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever

    Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large un...

  7. Robust Speech Recognition via Large-Scale Weak Supervision

    Paper · Dec 6, 2022 · arXiv · Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever

    We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual and multitask su...

  8. Language Models are Unsupervised Multitask Learners

    Paper · Feb 14, 2019 · OpenAI · Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever

    Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific data...

  9. Learning Transferable Visual Models From Natural Language Supervision

    Paper · Feb 26, 2021 · arXiv · Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

    State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories, restricting their generality. We demonstrate that the simple pre-training task of pre...

  10. Language Models are Few-Shot Learners

    Paper · May 28, 2020 · arXiv · Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei

    Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic i...

  11. Improving language models by retrieving from trillions of tokens

    Paper · Dec 8, 2021 · arXiv · Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, Laurent Sifre

    We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. With a 2 trillion token database, our R...

  12. Training Compute-Optimal Large Language Models

    Paper · Mar 29, 2022 · arXiv · Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Laurent Sifre

    We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. We find that current large language models are significantly under...

  13. Flamingo: a Visual Language Model for Few-Shot Learning

    Paper · Apr 28, 2022 · arXiv · Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Karen Simonyan

    Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research. We introduce Flamingo, a family ...

  14. Agentic Test-Time Scaling for WebAgents

    Paper · Feb 12, 2026 · arXiv · Nicholas Lee, Lutfi Eren Erdogan, Chris Joseph John, Surya Krishnapillai, Kurt Keutzer, Amir Gholami

    Current WebAgents struggle with long-horizon tasks and complex navigation. We propose an agentic scaling framework that increases compute at test-time through iterative trajectory pruning and rewar...

  15. Gemma: Open Models Based on Gemini Research and Technology

    Paper · Feb 21, 2024 · arXiv · Gemma Team, Google DeepMind

    We introduce Gemma, a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Gemma models are offered in two sizes: a 7 bi...

  16. Improving alignment of dialogue agents via targeted human judgements

    Paper · Sep 22, 2022 · arXiv · Amelia Glaese, Nat McAleese, Maja Trebacz, John Aslanides, Vlad Firoiu, Geoffrey Irving

    We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines. We train our model using reinforcement lea...

  17. A-MapReduce: Executing Wide Search via Agentic MapReduce

    Paper · Feb 16, 2026 · arXiv · Mingju Chen, Guibin Zhang, Heng Chang, Yuchen Guo

    Traditional multi-agent systems often struggle with 'search breadth' in unstructured environments, leading to tunnel vision in reasoning. We propose A-MapReduce, a framework that applies the MapRed...

Top Entities In This Topic

Related Topics

FAQ

What does this cs.CL page rank?

It ranks public content for cs.CL using recent discussion, review, and engagement signals so you can triage faster. This guidance is specific to the cs.CL topic page on Attendemia and is written to make sense without reading other sections of the page.

How should I use weekly vs monthly vs all-time?

Use weekly for fast-moving updates, monthly for stable trend confirmation, and all-time for evergreen references.

How can I discover organizations active in cs.CL?

Use the linked entities section to jump to labs, companies, and experts connected to this topic and explore their timelines.

Can I follow this topic for updates?

Yes. Use the follow button on this page to subscribe and track new high-signal activity.