Improving alignment of dialogue agents via targeted human judgements
Paper • Sep 22, 2022 • arXiv • Amelia Glaese, Nat McAleese, Maja Trebacz, John Aslanides, Vlad Firoiu, Geoffrey Irving
We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines. We train our model using reinforcement lea...