Quick answer

AI Summary: Describes WebGPT, a language model fine-tuned with human feedback to browse the web through a text-based browser, retrieve documents, and synthesize answers with cited sources, with the aim of reducing hallucination in LLMs.

Paper • 2021-12-16

WebGPT: Browser-assisted question-answering with human feedback

Authors

Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman

ABSTRACT

We introduce a method for fine-tuning language models to interact with a text-based web browser to answer open-ended questions. This model, WebGPT, searches the web, navigates through links, and synthesizes answers while citing its sources. We train the model using a combination of behavior cloning from human demonstrations and reinforcement learning from human feedback (RLHF) on the ELI5 dataset. WebGPT's answers are preferred by human evaluators over answers written by human demonstrators in 56% of cases, suggesting that teaching a model to use external tools can improve factuality and reduce hallucinations.
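
The search, navigate, and cite loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `ToyBrowser` class, its command names (`search`, `click`, `quote`, `answer`), and the two-page in-memory corpus are hypothetical and are not WebGPT's actual environment or data. In WebGPT a fine-tuned model chooses each action; in this sketch the actions are scripted by hand.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical in-memory "web": page title -> page text (made-up content).
TOY_WEB = {
    "Tides": "Tides are caused mainly by the Moon's gravity pulling on the oceans.",
    "Moon": "The Moon orbits the Earth about once every 27 days.",
}


@dataclass
class ToyBrowser:
    """Illustrative text browser; the command set is an assumption."""
    pages: dict
    current: Optional[str] = None
    quotes: list = field(default_factory=list)  # (title, passage) pairs

    def search(self, query: str) -> list:
        # Return titles of pages whose text mentions any query word.
        words = query.lower().split()
        return [title for title, text in self.pages.items()
                if any(w in text.lower() for w in words)]

    def click(self, title: str) -> str:
        # Navigate to a page and return its text.
        self.current = title
        return self.pages[title]

    def quote(self, passage: str) -> None:
        # Save a passage from the current page so it can be cited later.
        if self.current is None or passage not in self.pages[self.current]:
            raise ValueError("passage not found on current page")
        self.quotes.append((self.current, passage))

    def answer(self, text: str) -> str:
        # Attach numbered references collected while browsing.
        refs = "\n".join(f"[{i}] {title}: {passage}"
                         for i, (title, passage) in enumerate(self.quotes, 1))
        return f"{text}\n\nReferences:\n{refs}"


# Scripted episode: a trained policy would pick these actions itself.
browser = ToyBrowser(TOY_WEB)
results = browser.search("tides")
browser.click(results[0])
browser.quote("Tides are caused mainly by the Moon's gravity")
final = browser.answer("Tides are mostly driven by the Moon's gravity [1].")
print(final)
```

Keeping quotes tied to the page they were taken from is what lets the final answer carry traceable references, which is the property the abstract emphasizes.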

company:openai-research #information-retrieval #cs-ai #rlhf #cs-cl
