Topic: Computer Vision

Short answer

This page shows the most relevant public items for Computer Vision, ranked by trend activity and review signal. Use weekly for fast changes, monthly for more stable patterns, and all-time for evergreen picks.

OpenClaw 3.0: The End of Brittle DOM Parsing for Web Agents
Blog • Mar 5, 2026 • Medium • Marcus Sterling
For the past year, web automation agents have relied heavily on parsing HTML DOM structures, making them notoriously brittle whenever a website updates its layout. The release of OpenClaw 3.0 this ...
VDC-Agent: When Video Detailed Captioners Evolve Themselves via Agentic Self-Reflection
Paper • Feb 28, 2026 • arXiv • Qiang Wang, Xinyuan Gao, SongLin Dong, Jizhou Han, Jiangyang Li, Yuhang He, Yihong Gong
Generating highly detailed, temporally accurate video captions requires models to understand complex spatial and temporal dynamics. VDC-Agent introduces an autonomous framework where the captioning...
Dive into Deep Learning
Book • Dec 21, 2023 • Amazon • Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola
Interactive learning is often the best way to master complex programming paradigms. This unique book offers an interactive experience with code and math integrated seamlessly. It covers modern conv...
Learning OpenCV 4 Computer Vision with Python 3
Book • Feb 20, 2020 • Amazon • Joseph Howse, Joe Minichino
OpenCV remains the foundational library for high-performance visual manipulation. This book guides developers through image processing and object tracking using Python. It br...
Understanding Deep Learning
Book • Dec 5, 2023 • Amazon • Simon J.D. Prince
Grasping the true mechanics of deep learning requires a clear visualization of complex mathematical transformations. This highly visual textbook covers foundational neural networks and visual appli...
Transformers for Natural Language Processing and Computer Vision
Book • Jan 31, 2024 • Amazon • Denis Rothman
Transformer architectures have evolved far beyond text generation to completely dominate visual tasks. This extensive guide explores how attention mechanisms are reshaping image processing and mult...
Deep Learning for Vision Systems
Book • Oct 13, 2020 • Amazon • Mohamed Elgendy
Understanding how a computer learns to see requires breaking down complex neural networks into intuitive concepts. This text teaches the tools necessary for building intelligent systems that identi...
Computer Vision on AWS
Book • Mar 31, 2023 • Amazon • Lauren Mullennex, Nate Bachmeier, Jay Rao
Developing scalable visual solutions requires a robust cloud infrastructure to handle massive datasets and intensive compute loads. This book demonstrates how to build and deploy real-world visual ...
Practical Machine Learning for Computer Vision
Book • Aug 24, 2021 • Amazon • Valliappa Lakshmanan, Martin Görner, Ryan Gillard
Employing machine learning models to extract information from images can be daunting for software developers. This book provides intuitive explanations of visual architectures alongside practical c...
Modern Computer Vision with PyTorch
Book • Nov 27, 2020 • Amazon • V Kishore Ayyadevara, Yeshwanth Reddy
Deep learning is the driving force behind modern visual applications. This practical guide takes a code-first approach to solving over fifty real-world image problems using PyTorch. Readers will bu...
Computer Vision Algorithms and Applications
Book • Jan 3, 2022 • Amazon • Richard Szeliski
This foundational reference explores the vast variety of techniques used to analyze and interpret images. It takes a deeply scientific approach to formulating visual problems and solving them using...
Foundations of Computer Vision
Book • Apr 16, 2024 • Amazon • Antonio Torralba, Phillip Isola, William T. Freeman
Machine learning has revolutionized how machines perceive the world, but modern methods have deep roots in classic techniques. This comprehensive textbook introduces the mathematical and algorithmic...
Visual Web Navigation Agents: Beyond the DOM
Paper • Jan 22, 2026 • arXiv • John Smith, Alice Chen, Wei Lin
Traditional autonomous web agents rely heavily on parsing underlying website code, which often breaks during dynamic updates. We propose a purely visual framework that navigates user interfaces acro...
Diffusion Models Beat GANs on Image Synthesis
Paper • May 11, 2021 • arXiv • Prafulla Dhariwal, Alex Nichol
We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better archi...
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
Paper • May 22, 2017 • arXiv • Joao Carreira, Andrew Zisserman
Video action recognition is a crucial challenge in computer vision, but progress has been hindered by the lack of large-scale, comprehensive datasets comparable to ImageNet. We introduce the Kineti...
Neural Discrete Representation Learning
Paper • Nov 2, 2017 • arXiv • Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu
Learning useful representations without supervision remains a key challenge in machine learning. We propose the Vector Quantised-Variational AutoEncoder (VQ-VAE), a simple yet powerful generative m...
Human-level control through deep reinforcement learning
Paper • Feb 26, 2015 • Nature • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Demis Hassabis
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural networ...
A homogeneous view of asymptotic giant branch carbon stars as seen by Gaia
Paper • Feb 22, 2026 • arXiv • Tatsunori Hashimoto, Percy Liang, Xinyi Wang
Current web agents rely heavily on underlying HTML DOM structures, making them brittle to website updates and entirely incapable of navigating dynamic, canvas-based, or non-web applications. We pro...
Mirage2Matter: A Physically Grounded Gaussian World Model from Video
Paper • Feb 8, 2026 • arXiv • Zhengqing Gao, Ziwen Li, Xin Wang, Tongliang Liu
To bridge the simulation-to-real gap, we introduce Mirage2Matter, a physically grounded Gaussian world model that generates high-fidelity embodied training data from multi-view videos. We reconstru...
Inference-Only Prompt Projection for Safe Text-to-Image Generation
Paper • Feb 9, 2026 • arXiv • Minhyuk Lee, Hyekyung Yoon, Myungjoo Kang
Text-to-Image (T2I) diffusion models enable high-quality synthesis, but deployment demands safeguards. We formalize the tension between safety and alignment through a total variation (TV) lens, yie...

FAQ

What does this Computer Vision page rank?

It ranks public content for Computer Vision using recent discussion, review, and engagement signals so you can triage faster. This guidance is specific to the Computer Vision topic page on Attendemia and is written so that it still makes sense without reading other sections on the page.

How should I use weekly vs monthly vs all-time?

Use weekly for fast-moving updates, monthly for stable trend confirmation, and all-time for evergreen references.

How can I discover organizations active in Computer Vision?

Use the linked entities section to jump to labs, companies, and experts connected to this topic and explore their timelines.

Can I follow this topic for updates?

Yes. Use the follow button on this page to subscribe and track new high-signal activity.
