Yingbin Liang (Ohio State)

Nov 1, 11am-12pm, 293 Cory.

Title and Abstract

SGD Converges to Global Minimum in Deep Learning via Star-Convex Path
Stochastic gradient descent (SGD) has been found to be remarkably effective in training a variety of deep neural networks. However, there is still a lack of understanding of how and why SGD can train these complex networks to a global minimum. In this talk, I will present our recent result establishing the convergence of SGD to a global minimum when training deep neural networks. Our argument exploits two important properties: 1) the training loss can (approximately) reach zero, which has been widely observed in deep learning; 2) SGD follows a star-convex path, which is verified by various experiments in our work. Building on these two facts, our analysis shows that SGD, although long regarded as a randomized algorithm, converges to a global minimum in an intrinsically deterministic manner.
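As background for the second property, the following is the standard star-convexity inequality, written in our own notation as a rough guide (here x_k denotes the SGD iterate at step k, x^* a global minimizer at which the training loss is approximately zero, and \ell_{\xi_k} the loss of the mini-batch sampled at step k; the precise path-wise condition analyzed in the talk may differ in its details):

\[
\ell_{\xi_k}(x_k) - \ell_{\xi_k}(x^*) \;\le\; \big\langle \nabla \ell_{\xi_k}(x_k),\, x_k - x^* \big\rangle .
\]

When an inequality of this form holds along the trajectory, each SGD step with a suitably small stepsize moves the iterate no farther from x^*, which is roughly the mechanism behind the deterministic-style convergence described in the abstract.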

Bio

Dr. Yingbin Liang received her Ph.D. degree in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2005. She is now an Associate Professor in the Department of Electrical and Computer Engineering at the Ohio State University. Dr. Liang's research interests include machine learning, statistical signal processing, optimization, information theory, and wireless communication and networks. Dr. Liang received the M. E. Van Valkenburg Graduate Research Award from the University of Illinois at Urbana-Champaign in 2005, the Vodafone-U.S. Foundation Fellows Initiative Research Merit Award in 2005, the National Science Foundation CAREER Award in 2009, and the State of Hawaii Governor Innovation Award in 2009. Her paper received the EURASIP Best Paper Award in 2014. She served as an Associate Editor for Shannon Theory of the IEEE Transactions on Information Theory during 2013-2015.