R. Srikant (UIUC)
Mar 12, 2021

Title and Abstract

Sample Complexity and Overparameterization Bounds for Neural Temporal Difference Learning

We will consider the dynamics of projection-free temporal-difference learning with neural network-based value function approximation over a general state space, and present bounds on the sample complexity and overparameterization required to achieve a desired accuracy. Our bounds are based on a Lyapunov drift analysis of the network parameters as a stopped random process. Joint work with Semih Cayci, Siddhartha Satpathi and Niao He.

Bio

R. Srikant is the Fredric G. and Elizabeth H. Nearing Endowed Professor of ECE and the Coordinated Science Lab at UIUC. He is also one of two Co-Directors of the C3.ai Digital Transformation Institute, jointly headquartered at UIUC and Berkeley, which is a consortium of universities (Stanford, MIT, CMU, UChicago, Princeton, KTH, Berkeley and UIUC) and industries (C3.ai and Microsoft) aimed at promoting research on AI, ML, IoT and cloud computing for the betterment of society. His research interests are in machine learning, communication networks and applied probability.

He is the winner of the 2019 IEEE Koji Kobayashi Computers and Communications Award and the 2015 IEEE INFOCOM Achievement Award. He has won several best paper awards, including the 2017 Applied Probability Society’s Best Publication Award, the 2015 INFOCOM Best Paper Award and the 2015 WiOpt Best Paper Award. He also won the Distinguished Alumnus Award from the Indian Institute of Technology, Madras in 2015.