Topic: Multimodal AI

Short answer

This page lists the most relevant public items for Multimodal AI, ranked by trend activity and review signal. Use the weekly view for fast-moving changes, the monthly view for more stable patterns, and the all-time view for evergreen picks.


  1. Scaling Laws for Autoregressive Generative Modeling

    Paper · Oct 28, 2020 · arXiv · Tom Henighan, Jared Kaplan, Mor Katz, Mark Chen, Christopher Hesse, Jacob Jackson, Heewoo Jun, Tom B. Brown, Prafulla Dhariwal, Scott Gray, Chris Hallacy, Benjamin Mann, Alec Radford, Aditya Ramesh, Nick Ryder, Daniel M. Ziegler, John Schulman, Dario Amodei, Sam McCandlish

    Building upon previous work establishing scaling laws for language models, we investigate whether similar power-law scaling relationships hold across other data modalities. We train autoregressive ...
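The power-law relationships this paper studies are straightforward to work with numerically: a power law is a straight line in log-log space, so its exponent can be recovered with a linear fit. A minimal sketch, using made-up (compute, loss) points constructed to lie exactly on a hypothetical power law:

```python
import numpy as np

# Hypothetical power law L(C) = a * C**(-b); the data points below are
# synthetic, chosen for illustration only.
compute = np.array([1e15, 1e16, 1e17, 1e18])
loss = 4.0 * compute ** -0.05

# In log-log space: log L = log a - b * log C, so a degree-1 fit
# recovers the exponent and prefactor.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
b, a = -slope, np.exp(intercept)
print(b, a)  # recovers b ≈ 0.05 and a ≈ 4.0
```

The same log-log fit is how scaling exponents are typically estimated from empirical training runs, though real loss curves also carry an irreducible-loss offset that this toy sketch omits.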

  2. Multimodal Neurons in Artificial Neural Networks

    Paper · Mar 4, 2021 · Distill · Gabriel Goh, Nick Cammarata, Chelsea Voss, Shan Carter, Michael Petrov, Ludwig Schubert, Alec Radford, Chris Olah

    We investigate the internal representations of the CLIP model and discover the presence of 'multimodal neurons'. These neurons fire not only for specific visual features (like a spider) but also fo...

  3. Learning Transferable Visual Models From Natural Language Supervision

    Paper · Feb 26, 2021 · arXiv · Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

    State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories, restricting their generality. We demonstrate that the simple pre-training task of pre...
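CLIP's pre-training objective is contrastive: image and caption embeddings are placed in a shared space so that each image's highest cosine similarity is with its own caption. A minimal sketch with toy embeddings (the vectors here are synthetic stand-ins, not real model outputs):

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    # Project rows onto the unit sphere so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Toy setup: 3 image/text pairs in a shared 8-dim space. Text vectors are
# built as noisy copies of the image vectors so matched pairs align.
img = normalize(rng.normal(size=(3, 8)))
txt = normalize(img + 0.1 * rng.normal(size=(3, 8)))

sim = img @ txt.T          # 3x3 cosine-similarity matrix
pred = sim.argmax(axis=1)  # caption retrieved for each image
print(pred)                # the diagonal wins: each image finds its own caption
```

In the actual model, this similarity matrix (scaled by a learned temperature) feeds a symmetric cross-entropy loss over rows and columns; the sketch above shows only the retrieval step that the learned alignment enables.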

  4. Perceiver: General Perception with Iterative Attention

    Paper · Mar 4, 2021 · arXiv · Andrew Jaegle, Felix Gimeno, Andrew Brock, Oriol Vinyals, Andrew Zisserman, Joao Carreira

    Biological systems perceive the world by simultaneously processing high-dimensional inputs from modalities as diverse as vision, audition, and touch. We introduce the Perceiver, an architecture tha...
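The Perceiver's central trick is to have a small, fixed-size latent array cross-attend to an arbitrarily large input array, so compute scales linearly with input length instead of quadratically. A minimal sketch of that cross-attention step (shapes and values are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_attention(latents, inputs):
    # latents: (N, d) queries; inputs: (M, d) keys/values, with M >> N.
    scores = latents @ inputs.T / np.sqrt(latents.shape[-1])
    # Numerically stable softmax over the input axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ inputs  # (N, d): updated latents

inputs = rng.normal(size=(1024, 16))  # e.g. flattened pixels or audio samples
latents = rng.normal(size=(8, 16))    # fixed-size latent bottleneck
out = cross_attention(latents, inputs)
print(out.shape)  # (8, 16): output size is independent of input length
```

Because the expensive attention is always queries-from-latents against keys-from-inputs, the same module can ingest vision, audio, or point clouds without modality-specific architecture; the full model interleaves these cross-attends with self-attention over the latents.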

  5. Flamingo: a Visual Language Model for Few-Shot Learning

    Paper · Apr 28, 2022 · arXiv · Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Karen Simonyan

    Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research. We introduce Flamingo, a family ...

Related Topics

cs.CV (4) · company:openai-research (3) · cs.LG (2) · lab:deep-mind-ai (2) · CLIP (2)