Topic: Multimodal AI


Short answer

This page shows the most relevant public items for Multimodal AI, ranked by trend activity and review signal. Use the weekly ranking for fast changes, the monthly ranking for more stable patterns, and the all-time ranking for evergreen picks.



  1. Scaling Laws for Autoregressive Generative Modeling

    Paper | Oct 28, 2020 | arXiv | Tom Henighan, Jared Kaplan, Mor Katz, Mark Chen, Christopher Hesse, Jacob Jackson, Heewoo Jun, Tom B. Brown, Prafulla Dhariwal, Scott Gray, Chris Hallacy, Benjamin Mann, Alec Radford, Aditya Ramesh, Nick Ryder, Daniel M. Ziegler, John Schulman, Dario Amodei, Sam McCandlish

    Building upon previous work establishing scaling laws for language models, we investigate whether similar power-law scaling relationships hold across other data modalities. We train autoregressive ...

  2. Multimodal Neurons in Artificial Neural Networks

    Paper | Mar 4, 2021 | Distill | Gabriel Goh, Nick Cammarata, Chelsea Voss, Shan Carter, Michael Petrov, Ludwig Schubert, Alec Radford, Chris Olah

    We investigate the internal representations of the CLIP model and discover the presence of 'multimodal neurons'. These neurons fire not only for specific visual features (like a spider) but also fo...

  3. Learning Transferable Visual Models From Natural Language Supervision

    Paper | Feb 26, 2021 | arXiv | Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

    State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories, restricting their generality. We demonstrate that the simple pre-training task of pre...

  4. Perceiver: General Perception with Iterative Attention

    Paper | Mar 4, 2021 | arXiv | Andrew Jaegle, Felix Gimeno, Andrew Brock, Oriol Vinyals, Andrew Zisserman, Joao Carreira

    Biological systems perceive the world by simultaneously processing high-dimensional inputs from modalities as diverse as vision, audition, and touch. We introduce the Perceiver, an architecture tha...

  5. Flamingo: a Visual Language Model for Few-Shot Learning

    Paper | Apr 28, 2022 | arXiv | Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Karen Simonyan

    Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research. We introduce Flamingo, a family ...

Top Entities In This Topic

Related Topics

FAQ

What does this Multimodal AI page rank?

It ranks public content for Multimodal AI using recent discussion, review, and engagement signals so you can triage faster. This guidance is specific to the Multimodal AI topic page on Attendemia and is written so that it still makes sense without reading other sections of the page.

How should I use weekly vs monthly vs all-time?

Use weekly for fast-moving updates, monthly for stable trend confirmation, and all-time for evergreen references.

How can I discover organizations active in Multimodal AI?

Use the linked entities section to jump to labs, companies, and experts connected to this topic and explore their timelines.

Can I follow this topic for updates?

Yes. Use the follow button on this page to subscribe and track new high-signal activity.