Topic: Superalignment

Short answer

This page lists the most relevant public items for Superalignment, ranked by trend activity and review signal. Use the weekly view for fast-moving changes, the monthly view for more stable patterns, and the all-time view for evergreen picks.

  1. Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

    Paper | Dec 14, 2023 | arXiv | Collin Burns, Pavel Izmailov, Jan Hendrik Kirchner, Bowen Baker, Leo Gao, Leopold Aschenbrenner, Yining Chen, Adrien Ecoffet, Manas Joglekar, Jan Leike, Ilya Sutskever, Jeff Wu

    As AI models become increasingly capable, we will eventually face the challenge of superalignment: how can humans supervise AI systems that are much smarter than them? To study this empirically tod...

Related Topics

Machine Learning (1), AI Alignment (1), cs.LG (1), company:openai-research (1)