ICRA 2018 Workshop on Representing a Complex World: Perception, Inference, and Learning for Joint Semantic, Geometric, and Physical Understanding

The goal of this workshop is to bring together researchers from robotics, computer vision, machine learning, and neuroscience to examine the challenges and opportunities emerging from the design of environment representations and perception algorithms that unify semantics, geometry, and physics. This goal is motivated by two fundamental observations. First, the development of advanced perception and world understanding is a key requirement for robot autonomy in complex, unstructured environments, and an enabling technology for robot use in transportation, agriculture, mining, construction, security, surveillance, and environmental monitoring. Second, despite the unprecedented progress over the past two decades, there is still a large gap between robot and human perception (e.g., expressiveness of representations, robustness, latency). The workshop aims to bring forward the latest breakthroughs and cutting-edge research on multimodal representations, as well as novel perception, inference, and learning algorithms that can generate such representations.

The workshop will include keynote presentations from established researchers in robotics, machine learning, computer vision, and human and animal perception.
There will be two panel discussions and two poster sessions highlighting contributed papers throughout the day.
There will be a demo session featuring live demos (the best demo takes home a monetary prize; see below).

The workshop is endorsed by the IEEE RAS Technical Committee for Computer & Robot Vision.

To encourage rigorous, innovative submissions, this year we plan to award a monetary prize for the best paper and the best demo presented during the workshop. The quality and impact of the submissions will be evaluated by the program committee. The best workshop paper and best demo awards are sponsored by:

Participants are invited to submit a full paper (following ICRA formatting guidelines) or an extended abstract (up to 2 pages) related to key challenges in unified geometric, semantic, topological, and temporal representations, and associated perception, inference, and learning algorithms. Topics of interest include but are not limited to:

novel representations that combine geometry, semantics, and physics, and allow reasoning over spatial, semantic, and temporal aspects;
contextual inference techniques that produce maximum likelihood estimates over hybrid multi-modal representations;
learning techniques that produce cognitive representations directly from complex sensory inputs;
approaches that combine learning-based techniques with geometric estimation methods;
position papers and unconventional ideas on how to reach human-level performance across the broad spectrum of perceptual problems arising in robotics.

Contributed papers will be reviewed by the organizers and a program committee of invited reviewers. Accepted papers will be published on the workshop website and will be featured in spotlight presentations and poster sessions. We strongly encourage the preparation of live demos to accompany the papers. We plan to select the best submissions and invite the authors of these papers to contribute to a special issue of the IEEE Transactions on Robotics related to the topic of the workshop.

Submission link: https://easychair.org/conferences/?conf=icramrp18

Submission Deadline: April 6, 2018.
Notification of Acceptance: April 30, 2018.
Workshop Date: May 21, 2018.
Time: 09:00 - 17:00.
Room: M1 (Mezzanine Level).

Feel free to post thought-provoking questions and ideas related to joint metric-semantic-physical perception:

Your questions and ideas will be discussed during the panel sessions.

Jana Kosecka (George Mason University)
Dieter Fox / Arun Byravan (University of Washington)
Ian Reid (University of Adelaide)
Srini Srinivasan (Queensland Brain Institute)
Michael Milford (Queensland University of Technology)
Torsten Sattler (ETH Zurich)

Time	Topic
8:45 - 9:00	Registration, welcome, and opening remarks
9:00 - 9:30	Invited talk: Michael Milford (Queensland University of Technology) Adventures in multi-modal, sometimes bio-inspired perception, mapping and navigation for robots and autonomous vehicles Abstract: I'll take the audience on a whirlwind tour of 15 years of research pushing the boundaries on multi-modal perception and sensing in a mapping and navigation context. I'll cover our forays into biologically inspired sensing and mapping, and touch on some of the opportunities and challenges we faced along the way, including translating the work into applied industrial outcomes. The field is still open for more innovation and we hope the workshop will be a great provocation for discussion and collaboration!
9:30 - 10:00	Invited talk: Jana Kosecka (George Mason University) Semantic Understanding for Robot Perception and Navigation Abstract: Advancements in robotic navigation, mapping, object search and recognition rest to a large extent on robust, efficient and scalable semantic understanding of the surrounding environment. In recent years we have developed several approaches for capturing the geometry and semantics of an environment from video, RGB-D data, or simply a single RGB image, focusing on indoor environments relevant for robotics applications. I will demonstrate our work on object detection, semantic segmentation and semantically driven navigation as applicable to find-and-fetch tasks in indoor environments.
10:00 - 10:30	Poster Spotlights: A. Gawel, C. Del Don, R. Siegwart, J. Nieto, C. Cadena "X-View: Graph-Based Semantic Multi-View Localization"; B. Bescos, J. Facil, J. Civera, J. Neira "Detecting, Tracking and Eliminating Dynamic Objects in 3D Mapping using Deep Learning and Inpainting"; A. Loquercio, D. Scaramuzza "Learning to Control Drones in Natural Environments: A Survey"; O. Roesler, A. Aly, T. Taniguchi, Y. Hayashi "A Probabilistic Framework for Comparing Syntactic and Semantic Grounding of Synonyms through Cross-Situational Learning"; A. Milioto, C. Stachniss "Bonnet: An Open-Source Training and Deployment Framework for Semantic Segmentation in Robotics using CNNs"; F. Nardi, B. Della Corte, G. Grisetti "Unified Representation of Heterogeneous Sets of Geometric Primitives"; W. Vega-Brown, N. Roy "Admissible abstractions for near-optimal task and motion planning"; C. Grimm, R. Balasubramanian, M. Sundberg, R. Sherman, A. Kothari, R. Hatton "A Grasping Metric based on Hand-Object Collision"; S. Daftry, Y. Agrawal, L. Matthies "Online Self-supervised Scene Segmentation for Micro Aerial Vehicles"; L. Nicholson, M. Milford, N. Suenderhauf "QuadricSLAM: Constrained Dual Quadrics from Object Detections as Landmarks in Semantic SLAM"; P. Karkus, D. Hsu, W. S. Lee "Particle Filter Networks: End-to-End Probabilistic Localization From Visual Observations"
10:30 - 11:00	Coffee break
11:00 - 11:30	Poster Spotlights: M. Henein, G. Kennedy, V. Ila, R. Mahony "Exploiting Rigid Body Motion for SLAM in Dynamic Environments with Applications in Urban Driving and Extrinsic Calibration of a Multi RGBD Camera System"; J. Park, D. Manocha "Combining Computer Vision and Real Time Motion Planning for Human-Robot Interaction"; R. Mahjourian, M. Wicke, A. Angelova "Unsupervised Learning of Depth and Ego-Motion from Monocular Video in 3D"; J. Weibel, T. Patten, M. Vincze "Geometric Priors from Robot Vision in Deep Networks for 3D Object Classification"; M. Sundermeyer, E. Y. Puang, Z. Marton, M. Durner, R. Triebel "Learning Implicit Representations of 3D Object Orientations from RGB"; A. Inceoglu, G. Ince, Y. Yaslan, S. Sariel "Comparative Assessment of Sensing Modalities on Manipulation Failure Detection"; J. Bowkett, J. Burdick, L. Matthies, R. Detry "Semantic Understanding of Task Outcomes: Visually Identifying Failure Modes Autonomously Discovered in Simulation"; Y. Feldman, V. Indelman "Towards Robust Autonomous Semantic Perception"; T. Mota, M. Sridharan "Incrementally Grounding Expressions for Spatial Relations between Objects"; B. X. Chen, R. Sahdev, D. Wu, X. Zhao, M. Papagelis, J. K. Tsotsos "Scene Classification in Indoor Environments for Robots using Context Based Word Embeddings"; K. Desingh, A. Opipari, O. Jenkins "Analysis of Goal-directed Manipulation in Clutter using Scene Graph Belief Propagation"
11:30 - 12:00	Invited talk: Ian Reid (University of Adelaide) SLAM in the Era of Deep Learning Abstract: In this talk I will discuss progress in my group over the last few years in moving towards an object-based system for localisation and mapping that maintains the key features and merits of geometric SLAM, but which takes advantage of advances in deep learning for detecting objects, performing semantic segmentation, and regressing quantities such as depth from a single view.
12:00 - 12:30	Morning wrap-up: panel discussion
12:30 - 2:00	Lunch break
2:00 - 2:30	Invited talk: Dieter Fox / Arun Byravan (University of Washington) Learning to Predict and Control Objects from Low-level Supervision
2:30 - 3:00	Invited talk: Srini Srinivasan (Queensland Brain Institute) Facets of vision, perception, learning and 'cognition' in a small brain Abstract: Honeybees possess a brain about the size of a sesame seed, and yet display surprisingly sophisticated performance in tasks that involve pattern recognition, maze learning, establishing complex associations and learning abstract concepts. This presentation will highlight some of these capacities, and invite speculation on how these tasks might be implemented effectively in what must be relatively simple neural hardware.
3:00 - 3:30	Coffee break & Poster Session
3:30 - 4:00	Poster and demo session
4:00 - 4:30	Invited talk: Torsten Sattler (ETH Zurich) Challenges in Long-Term Visual Localization Abstract: Visual localization is the problem of estimating the position and orientation from which an image was taken with respect to a 3D model of a known scene. This problem has important applications, including autonomous vehicles (self-driving cars and other robots) and augmented / mixed / virtual reality. While multiple solutions for accurate camera pose estimation exist in both the Robotics and Computer Vision communities, they typically assume that the scene does not change over time. However, this assumption is often invalid in practice, in both indoor and outdoor environments. This talk thus briefly discusses the challenges encountered when trying to localize images over a longer period of time. Next, we show how a combination of 3D scene geometry and higher-level scene understanding can help to enable visual localization in conditions where both classical and recently proposed learning-based approaches struggle.
4:30 - 5:00	Afternoon wrap-up: panel discussion & closing remarks
5:00 - 5:15	Award ceremony

Cesar Cadena (ETHZ)
Kasra Khosoussi (MIT)
Xiaowei Zhou (Zhejiang University)
Torsten Sattler (ETHZ)
Arunkumar Byravan (UW)
Christian Häne (Berkeley)
Niko Sünderhauf (QUT)
Vadim Indelman (Technion)
Roberto Tron (BU)
Karol Hausman (Google Brain)

Nikolay Atanasov <natanasov@ucsd.edu>
Luca Carlone <lcarlone@mit.edu>

Should you have any questions, please do not hesitate to contact the organizers Nikolay Atanasov (natanasov@ucsd.edu) or Luca Carlone (lcarlone@mit.edu). Please include "ICRA 2018 Workshop Submission" in the subject of the email.