Suvrit Sra (MIT)
Nov 12, 2021

Title and Abstract
Some surprising gaps between optimization theory and ML practice

It is well known that there are large gaps between optimization theory and machine learning practice. Even more surprising, however, are some gaps that have persisted at the fundamental level of how we analyze non-convex optimization in machine learning. In this talk I will discuss two such gaps. The first pertains to ignoring the elephant in the room: non-differentiable non-convex optimization (e.g., training a deep ReLU network); the second is a surprising non-convergence phenomenon in the training of deep networks, which challenges existing theory and suggests a new direction of research. I will describe some of our theoretical progress towards addressing these gaps, while highlighting important research questions. This talk is based on the PhD thesis of Jingzhao Zhang (MIT).

Bio
Suvrit Sra is an Associate Professor in the EECS Department at MIT, a core faculty member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society (IDSS), and a member of the MIT Machine Learning and Statistics groups. He obtained his PhD in Computer Science from the University of Texas at Austin. Before moving to MIT, he was a Senior Research Scientist at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. He held visiting faculty positions at UC Berkeley (EECS) and Carnegie Mellon University (Machine Learning Department) during 2013–2014. His research bridges a number of mathematical areas, such as differential geometry, matrix analysis, convex analysis, probability theory, and optimization, with machine learning. He founded the OPT (Optimization for Machine Learning) series of workshops, held at the NeurIPS (erstwhile NIPS) conference from 2008 through 2017, and co-edited a book of the same name (MIT Press, 2011). He is also a co-founder and chief scientist of macro-eyes, a global healthcare+AI startup.