This talk is about emotional AI, about the machine learning and computer vision methods developed for various human-centric AI applications, and about the technology behind generating “fake video news”.
Bio: Maja Pantic obtained her PhD degree in computer science in 2001 from Delft University of Technology, the Netherlands. Until 2005, she was an Assistant/Associate Professor at Delft University of Technology. In 2006, she joined Imperial College London, Department of Computing, UK, where she is Professor of Affective & Behavioural Computing and the Head of the iBUG group, working on machine analysis of human non-verbal behaviour. In April 2018, she became the Research Director of the Samsung AI Research Centre in Cambridge. Prof. Pantic is one of the world's leading experts in research on machine understanding of human behaviour, including vision-based detection, tracking, and analysis of behavioural cues such as facial expressions and body gestures, and multimodal analysis of behaviours such as laughter, social signals, and affective states. Prof. Pantic has received various awards for her work, including the BCS Roger Needham Award, given annually to a UK-based researcher for a distinguished research contribution in computer science. She is a Fellow of the UK's Royal Academy of Engineering, an IEEE Fellow, and an IAPR Fellow.
Over the past decade, interventional medicine has experienced a slow but consistent evolution toward the increasing use and sophistication of technology in the operating room. A side effect of these changes is an unprecedented opportunity to transparently capture data on procedures as they are performed. For example, the da Vinci surgical robot now performs over one million procedures per year while capturing stereo video and tool movement – effectively a complete record of each surgical procedure. At scale, millions of such data points create new opportunities to apply a wealth of data-driven learning techniques to understand and improve surgery. In this talk, I'll review some highlights of the past decade of work exploiting recorded surgical data for assessment of human performance, and I'll highlight some key advances and opportunities for creating methods that augment the performance of the surgeon during the procedure. I'll close with some thoughts about how recent advances in simulation and learning present the possibility of automating some elements of interventional medicine in the not-too-distant future.
Bio: Greg Hager is the Mandell Bellmore Professor of Computer Science at Johns Hopkins University and Founding Director of the Malone Center for Engineering in Healthcare. Professor Hager's research interests include computer vision, vision-based and collaborative robotics, time-series analysis of image data, and applications of image analysis and robotics in medicine and in manufacturing. He is a member of the CISE Advisory Committee and the governing board of the International Federation of Robotics Research, and a former member of the Board of Directors of the Computing Research Association. He previously served as Chair of the Computing Community Consortium. In 2014, he was awarded a Hans Fischer Fellowship at the Institute for Advanced Study of the Technical University of Munich, and in 2017 he was named a TUM Ambassador. Professor Hager has served on the editorial boards of IEEE TRO, IEEE PAMI, IJCV, and ACM Transactions on Computing for Healthcare. He is a Fellow of the ACM and IEEE for his contributions to vision-based robotics, and a Fellow of AAAS, the MICCAI Society, and AIMBE for his contributions to imaging and his work on the analysis of surgical technical skill. Professor Hager is a co-founder of Clear Guide Medical and Ready Robotics.
Title: Amazon Go: a peek under the hood (Wednesday, March 4 – 5:30–6:30 PM)
This talk outlines the core technologies behind the custom-built Just Walk Out technology for Amazon Go. We present the algorithmic challenges in building a highly accurate customer-facing application using deep learning and computer vision, and the technical details of the high throughput services for Amazon Go that transfer gigabytes of video from stores to cloud systems.
Time permitting, I will also present a method for synthesizing natural-looking images of multiple people interacting in a specific scenario.
Bio: Professor Gérard Medioni received the Diplôme d'Ingénieur from ENST, Paris in 1977, and an M.S. and Ph.D. from the University of Southern California in 1980 and 1983, respectively. He is Vice President/Distinguished Scientist at Amazon, where he leads the research for Amazon Go, and Professor Emeritus of Computer Science at USC, where he served as Chairman of the Computer Science Department from 2001 to 2007. Professor Medioni has made significant contributions to the field of computer vision. His research covers a broad spectrum of the field, including edge detection, stereo and motion analysis, shape inference and description, and system integration. He has published 4 books, over 80 journal papers, and 200 conference articles, and holds 24 international patents. He is the editor, with Sven Dickinson, of the Computer Vision series of books for Morgan & Claypool.
Prof. Medioni serves on the advisory board of the IEEE Transactions on PAMI journal and as an associate editor of the Pattern Recognition and Image Analysis journal. He is Vice President of the Computer Vision Foundation (CVF).
Prof. Medioni served as program co-chair of the 1991 IEEE CVPR Conference in Hawaii and of the 1995 IEEE Symposium on Computer Vision in Miami, as general co-chair of many CVPR Conferences (1997, 2001, 2007, 2009, 2020), and as conference co-chair of ICPR (1998, 2014), WACV (2009, 2011, 2013, 2015, 2017, 2019, 2021), and ICCV (2017, 2019).
Prof. Medioni has been a consultant to several companies and startups (DXO, Poseidon, Opti-copy, Geometrix, Symah Vision, KLA-Tencor, PrimeSense) prior to joining Amazon.
He is a Fellow of IAPR, a Fellow of the IEEE, and a Fellow of AAAI.