TruthfulQA: Measuring How Models Mimic Human Falsehoods
Paper • Sep 8, 2021 • arXiv • Stephanie Lin, Jacob Hilton, Owain Evans
We propose TruthfulQA, a benchmark to measure whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including healt...