ActiVis@CVPR26

Overview

Vision foundation models excel at encoding passive data, yet robots require physically-grounded reasoning about pose, dynamics, and affordances. This workshop bridges the gap between computer vision and robotics, moving beyond simple deployment to pioneer task-driven, co-designed perception-action loops. We aim to translate perceptual abstractions into actionable structures, closing the loop from pixels to torque for robust, real-world systems.

To achieve this, we foster a bidirectional dialogue: enabling vision researchers to incorporate robotic constraints into design, while empowering roboticists to effectively deploy advanced models. Specifically, this workshop focuses on three interactive dimensions and the corresponding stage-wise challenges essential for actionable visual perception.

Topics

Interactive Dimensions

What visual capability is needed for fully autonomous systems

How can the vision community contribute to general-purpose robotic systems

What data modality is critical for generalizable, robust robot control

Stage-Wise Challenges

Data

Dynamic logs with multi-modal feedback.
Teaching models about risk.
Physically accurate data for the "sim-to-real" gap.

Model

3D geometry and physical dynamics.
Differentiable cause-effect relationships.
Conditioned representations over passive observation.

Optimization

Safety and stability in the learning objective.
Downstream task success.
Model confidence for safe real-world deployment.

Evaluation

Closed-loop performance.
Reliability against environmental variability.
Physical task completion and safe interaction.

Call for Papers (Last updated on May 21th)

Submission Instructions

We welcome submissions covering:

Research papers: Long papers (8 pages) showcasing novel findings, methods, or theoretical advancements.
Short/Abstract papers: Features exploratory work (4 pages or 2 pages excluding references) that may be preliminary but presents innovative concepts, early results, or thought-provoking viewpoints that stimulate discussion and future work.
Position papers: Offer critical perspectives on trends and challenges within the filed (no less than 8 pages).
Survey papers: Provide thorough reviews of specific topics, mapping the current research landscape and suggesting directions for future exploration (no less than 8 pages).

All formats allow unlimited references and appendices.

Contributions will be non-archival but hosted on our workshop website, and thus dual submission is allowed where permitted by third parties. We welcome submissions that are under submission or accepted by other conferences. Please mention it in the last sentence of the paper abstract if your paper has been under submission or accepted by other conferences.

Submissions should follow CVPR two-column style and be anonymous; see the CVPR-26 author kit for details.

Submission and Important Dates

Portal: OpenReview Submission Portal
Start Date: March 9th, 2026
Deadline: May 14th, 2026 (23:59PM AOE)
Notification to Authors: May 24th, 2026 (23:59PM AOE)
Camera-ready Due (non-archival): May 28th, 2026 (23:59PM AOE)

Invited Speakers (Last updated on May 21th)

Nathan F. Lepora

University of Bristol

Professor of Robotics & AI

Marco Pavone

Stanford University

Associate Professor

Saurabh Gupta

University of Illinois Urbana-Champaign

Associate Professor

Chelsea Finn

Stanford University

Assistant Professor

Yunzhu Li

Columbia University

Assistant Professor

Ming-Yu Liu

NVIDIA Research

Vice President of Research

Ruoshi Liu

Amazon Frontier AI & Robotics (FAR)

Research Scientist

Schedule (Last updated on May 21th)

To encourage open-ended discussion and maximize in-person engagement, the workshop will feature a mix of structured and interactive formats.

Keynote sessions will consist of 40-minute presentations followed by 5-minute Q&A segments, providing speakers with the opportunity to engage directly with attendees.
Poster sessions and Coffee breaks will showcase recent advancements and novel research while also serving as a platform for networking and idea exchange among researchers from diverse backgrounds and career stages.
Panel discussion will incorporate live polling and audience-submitted questions, fostering dynamic and meaningful dialogue. We will invite leading researchers from both academia and industry to participate in a debate on the most promising directions for future research in embodied AI from a computer vision perspective.

These interactive elements are designed to stimulate lively exchanges, bridge the gap between junior and senior researchers, and cultivate an open, inclusive research community.

Time	Session
8:50 – 9:00	Opening Remarks
9:00 – 9:40	Invited Talk by Saurabh Gupta
9:40 – 10:20	Invited Talk by Ming-Yu Liu
10:20 – 10:30	Coffee Break
10:30 – 11:10	Invited Talk by Nathan F. Lepora
11:10 – 11:50	Invited Talk by Yunzhu Li
11:50 – 14:00	Lunch Break
14:00 – 14:40	Invited Talk by Chelsea Finn
14:40 – 15:20	Invited Talk by Ruoshi Liu
15:20 – 15:30	Coffee Break
15:30 – 16:10	Invited Talk by Marco Pavone
16:10 – 16:20	Closing Remarks