Quick answer
AI Summary: Presents Sparrow, a dialogue agent that pioneered advanced RLHF alignment techniques and integrated internet search to significantly improve the safety and factual correctness of LLMs.
We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines. We train our model using reinforcement learning from human feedback (RLHF), where human participants provide judgements on model responses based on a targeted set of rules. Crucially, Sparrow is augmented with the ability to search the internet, and human raters evaluate the accuracy of the model's claims based on the evidence it retrieves. We show that Sparrow breaks new ground in safety and correctness, providing a comprehensive framework for aligning large dialogue models with complex human values.