Quick answer
AI Summary: The foundational paper that successfully applied Reinforcement Learning from Human Feedback (RLHF) to language models, significantly improving their ability to generate high-quality, human-preferred text summaries.
We show that it is possible to significantly improve the quality of text summaries generated by large language models by training them with reinforcement learning from human feedback. We collect a dataset of human preferences between pairs of summaries generated by our models, and train a reward model to predict these preferences. We then use Proximal Policy Optimization (PPO) to fine-tune the language model to maximize this reward. Our models trained with human feedback significantly outperform models trained via supervised fine-tuning, with human evaluators strongly preferring the RL-optimized summaries. This demonstrates a scalable approach for aligning models with complex human values.
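For readers who want the mechanics behind the abstract, here is a minimal PyTorch sketch of the two training signals it describes: the pairwise preference loss used to fit the reward model, and the shaped reward handed to PPO. All names (`reward_model`, `beta`, the tensor shapes) are illustrative assumptions, not the authors' code; the KL penalty against the supervised baseline comes from the full paper, where it keeps the RL policy from drifting too far from the fine-tuned model.

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, chosen_tokens, rejected_tokens):
    """Bradley-Terry style loss: the reward model should score the
    human-preferred summary above the rejected one."""
    r_chosen = reward_model(chosen_tokens)      # shape: (batch,)
    r_rejected = reward_model(rejected_tokens)  # shape: (batch,)
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(r_chosen - r_rejected).mean()

def shaped_reward(r_rm, logprob_policy, logprob_sft, beta=0.05):
    """Reward passed to PPO: reward-model score minus a KL penalty
    that keeps the policy close to the supervised baseline.
    `beta` is an illustrative coefficient, not the paper's value."""
    kl = logprob_policy - logprob_sft   # per-token KL estimate, (batch, seq)
    return r_rm - beta * kl.sum(dim=-1)

if __name__ == "__main__":
    # Toy check with a dummy scorer standing in for a trained reward model.
    reward_model = lambda toks: toks.float().mean(dim=-1)
    chosen = torch.randint(0, 100, (4, 16))
    rejected = torch.randint(0, 100, (4, 16))
    print("preference loss:", preference_loss(reward_model, chosen, rejected).item())
```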