Quick answer
AI Summary: The official technical and safety documentation for OpenAI o1, a breakthrough model that uses reinforcement learning to generate a 'hidden chain of thought,' achieving PhD-level reasoning in STEM domains.
We introduce OpenAI o1, a new series of large language models trained with reinforcement learning to perform complex reasoning. o1 models are designed to spend more time thinking before they respond, generating a hidden chain of thought that allows them to refine their logic, correct mistakes, and break down difficult steps. In our evaluations, o1 achieves performance comparable to human PhD students on challenging benchmark tasks in physics, chemistry, and biology. This system card details our safety evaluations, highlighting how the hidden chain of thought significantly improves the model's robustness against jailbreaks and prompt injections by allowing the model to reason about safety policies in real time.