RSS 2021 VLRR Workshop

Visual Learning and Reasoning for Robotics

Full-day workshop at RSS 2021

Virtual Conference

July 13, 2021, Pacific Time (PT)

Welcome! This workshop includes three live events:

Invited Talks (25 min talk + 5 min Q&A)
Spotlight Talks (4 min talk + 2 min Q&A)
Panel Discussion (60 min)

To attend the workshop, please use the PheedLoop platform provided by RSS 2021.

For the panel discussion, you can also post questions at this link.

Schedule

Time (PT)	Invited Speaker	Title
10:15 - 10:30	-	Opening Remarks | Video |
10:30 - 11:00	Andrew Davison (Imperial College London)	Representations for Spatial AI | Video |
11:00 - 11:30	Raquel Urtasun (University of Toronto / Waabi)	Interpretable Neural Motion Planning | Video |
11:30 - 12:00	Spotlight Talks + Q&A	ZePHyR: Zero-shot Pose Hypothesis Rating. Brian Okorn (Carnegie Mellon University); Qiao Gu (Carnegie Mellon University); Martial Hebert (Carnegie Mellon University); David Held (Carnegie Mellon University) | PDF | Video |
		ST-DETR: Spatio-Temporal Object Traces Attention Detection Transformer. Eslam Bakr (Valeo); Ahmad ElSallab (Valeo Deep Learning Research) | PDF | Video |
		Lifelong Interactive 3D Object Recognition for Real-Time Robotic Manipulation. Hamed Ayoobi (University of Groningen); S. Hamidreza Kasaei (University of Groningen); Ming Cao (University of Groningen); Rineke Verbrugge (University of Groningen); Bart Verheij (University of Groningen) | PDF | Video |
		Predicting Diverse and Plausible State Foresight For Robotic Pushing Tasks. Lingzhi Zhang (University of Pennsylvania); Shenghao Zhou (University of Pennsylvania); Jianbo Shi (University of Pennsylvania) | PDF | Video |
		Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos. Haoyu Xiong (University of Toronto, Vector Institute)*; Quanzhou Li (University of Toronto, Vector Institute); Yun-Chun Chen (University of Toronto, Vector Institute); Homanga Bharadhwaj (University of Toronto, Vector Institute); Samarth Sinha (University of Toronto, Vector Institute); Animesh Garg (University of Toronto, Vector Institute, NVIDIA) | PDF | Video |
12:00 - 12:30	Abhinav Gupta (CMU / Facebook AI Research)	No RL, No Simulation | Video |
12:30 - 1:00	Shuran Song (Columbia University)	Unfolding the Unseen: Deformable Cloth Perception and Manipulation | Video |
1:00 - 2:30	-	Break
2:30 - 3:00	Saurabh Gupta (UIUC)	Learning to Move and Moving to Learn | Video |
3:00 - 3:30	Sergey Levine (UC Berkeley / Google)	Scalable Robotic Learning | Video |
3:30 - 4:00	Spotlight Talks + Q&A	3D Neural Scene Representations for Visuomotor Control. Yunzhu Li (MIT); Shuang Li (MIT); Vincent Sitzmann (MIT); Pulkit Agrawal (MIT); Antonio Torralba (MIT) | PDF | Video |
		Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation. Nicklas A Hansen (UC San Diego); Hao Su (UC San Diego); Xiaolong Wang (UC San Diego) | PDF | Video |
		Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers. Ruihan Yang (UC San Diego); Minghao Zhang (Tsinghua University); Nicklas A Hansen (UC San Diego); Huazhe Xu (UC Berkeley); Xiaolong Wang (UC San Diego) | PDF | Video |
		Interaction Prediction and Monte-Carlo Tree Search for Robot Manipulation in Clutter. Baichuan Huang (Rutgers University); Abdeslam Boularias (Rutgers University); Jingjin Yu (Rutgers University) | PDF | Video |
		A Simple Method for Complex In-Hand Manipulation. Tao Chen (MIT)*; Jie Xu (MIT); Pulkit Agrawal (MIT) | PDF | Video |
4:00 - 5:00	Invited Speakers	Panel Discussion | Video |

Introduction

Visual perception is essential for achieving robot autonomy in the real world. To perform complex tasks in unknown environments, a robot needs to actively acquire knowledge through physical interaction and reason in sophisticated ways about the objects it observes. This raises a series of research challenges in developing computational tools that close the perception-action loop. Given recent advances in computer vision and deep learning, we seek new solutions for performing real-world robotic tasks effectively and with computational efficiency.

This workshop focuses on two parallel themes:

How could a robot’s interaction with the physical world facilitate the development of its visual perception?
How could a deep understanding of the physical world, gained through visual learning and reasoning, give rise to effective and robust robotic control?

Call for Papers

We're inviting submissions! If you're interested in (remotely) presenting a spotlight talk, please submit a short paper (or extended abstract) to CMT. We suggest extended abstracts of 2 pages in the RSS format; submissions of up to 4 pages will be considered. References do not count towards the page limit. The review process is double-blind. Significant overlap with work submitted to other venues is acceptable, but it must be explicitly stated at the time of submission.

Important Dates:

Paper Submission: June 20, 2021 (11:59 pm PT)
Review Due: June 29, 2021 (11:59 pm PT)
Author Notification: July 1, 2021 (11:59 pm PT)
Camera-Ready Version: July 8, 2021 (11:59 pm PT)
Conference Date: July 13, 2021

Organizers

Kuan Fang (Stanford University)
David Held (CMU)
Yuke Zhu (UT Austin / NVIDIA)
Dinesh Jayaraman (Univ. of Pennsylvania)
Animesh Garg (Univ. of Toronto / NVIDIA)
Lin Sun (Magic Leap)
Yu Xiang (NVIDIA)
Greg Dudek (McGill / Samsung)