To operate and navigate safely in crowded environments, autonomous agents such as mobile social robots should be equipped with a reliable egocentric perception system that perceives and predicts intricate human motions and intentions in their environment while also respecting human social decorum and interactions. In this talk, I will explain how we can 1) design and formulate the different levels of perception tasks necessary for social navigation while ensuring all the modules can be incorporated into a fully end-to-end learning pipeline without any pre- and/or post-processing heuristics, and 2) develop novel learning approaches for neural networks that can accept and predict tensor as well as non-tensor data, e.g. graphs and sets, as indispensable tools for fully end-to-end model learning of such a multi-level, multi-task problem. I will briefly present some of my team's recent works relevant to this topic. I will also introduce JRDB, the first unified and standardised dataset and benchmark for a variety of relevant 2D and 3D visual perception problems for navigation in human environments.
Hamid Rezatofighi is a tenured Assistant Professor at the Faculty of Information Technology, Monash University, Australia. Before that, he was a Senior Research Fellow at the Australian Institute for Machine Learning (AIML), the University of Adelaide, where he worked closely with Prof. Ian Reid. In 2018, he was awarded a prestigious Endeavour Research Fellowship and used this opportunity for a placement at the Stanford Vision Lab (SVL), Stanford University, directed by Silvio Savarese and Li Fei-Fei. He received his PhD from the Australian National University in 2015 under the supervision of Prof. Richard Hartley. He has published over 50 top-tier papers in computer vision, AI and machine learning, robotics, medical imaging and signal processing, and has been awarded several grants, including a recent 2020 ARC Discovery grant. He served as publication chair for ACCV18 and has been serving as an area chair for CVPR20 and WACV21. His research interests include vision-based perception tasks, especially those required for an autonomous robot to navigate in a human environment, such as object/person detection, multiple object/people tracking, social trajectory forecasting, social activity and human pose prediction, and autonomous social robot planning. He also has research expertise in Bayesian filtering, estimation and learning using point processes and finite set statistics, and is a pioneer in an emerging field in machine learning known as set learning using deep neural networks.