Quick answer
AI Summary: Describes WebGPT, a language model trained with RLHF to browse the internet, retrieve documents, and synthesize cited answers, with the aim of reducing hallucination in LLMs.
We introduce a method for fine-tuning language models to interact with a text-based web browser to answer open-ended questions. The resulting model, WebGPT, searches the web, follows links, and synthesizes answers while citing its sources. We train the model on the ELI5 dataset using a combination of behavior cloning from human demonstrations and reinforcement learning from human feedback (RLHF). Human evaluators prefer WebGPT's answers over those written by human demonstrators in 56% of cases, which suggests that teaching a model to use external tools can improve factuality and reduce hallucinations.
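To make the human-feedback step more concrete, here is a minimal sketch of a pairwise preference loss of the kind commonly used to train a reward model from human comparisons. It assumes a Bradley-Terry formulation and is not necessarily the exact objective used for WebGPT; the function names and the scalar reward inputs are hypothetical and chosen for illustration only.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred answer wins.

    Assumption: a Bradley-Terry model, where
    P(preferred beats rejected) = sigmoid(reward_preferred - reward_rejected).
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(margin)) == log(1 + exp(-margin)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs: list[tuple[float, float]]) -> float:
    """Average loss over (preferred_reward, rejected_reward) pairs."""
    return sum(preference_loss(p, r) for p, r in pairs) / len(pairs)
```

The loss is small when the reward model scores the human-preferred answer well above the rejected one, and grows roughly linearly when the ranking is reversed. A trained reward model of this kind can then be used as the signal for reinforcement learning.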