Quick answer
AI Summary: Introduces the TruthfulQA benchmark, revealing the counterintuitive finding that larger language models are often more prone to hallucinating human falsehoods because they memorize internet misconceptions more effectively.
AI Summary: Introduces the TruthfulQA benchmark, revealing the counterintuitive finding that larger language models are often more prone to hallucinating human falsehoods because they memorize internet misconceptions more effectively.
We propose TruthfulQA, a benchmark to measure whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including health, law, finance and politics. We crafted questions that some humans would answer falsely due to a false belief or misconception (e.g., 'What happens if you crack your knuckles?'). We evaluate several model families, including GPT-3, and show a surprising scaling trend: larger models are generally less truthful than smaller models on this benchmark. This occurs because larger models are better at learning the popular misconceptions and conspiracy theories embedded in their internet training data.
Share your opinion to help other learners triage faster.
Write a reviewInvite someone by email to share an invited review for TruthfulQA: Measuring How Models Mimic Human Falsehoods.