Ayush Sekhari (MIT)

March 15, 2023

Title and Abstract

On the Complexity of Adversarial Decision Making

A central problem in online learning and decision making – from bandits to reinforcement learning – is to understand what modeling assumptions lead to sample-efficient learning guarantees. We consider a general adversarial decision-making framework that encompasses (structured) bandit problems with adversarial rewards and reinforcement learning problems with adversarial dynamics. Our main result is to show – via new upper and lower bounds – that the Decision-Estimation Coefficient, a complexity measure introduced by Foster et al. in the stochastic counterpart to our setting, is necessary and sufficient to obtain low regret for adversarial decision-making. However, compared to the stochastic setting, one must apply the Decision-Estimation Coefficient to the convex hull of the class of models (or, hypotheses) under consideration. This establishes that the price of accommodating adversarial rewards or dynamics is governed by the behavior of the model class under convexification, and recovers a number of existing results – both positive and negative.

Bio

Ayush is a postdoc researcher working with Prof. Sasha Rakhlin at MIT. He completed his Ph.D. from the Computer Science department at Cornell University, advised by Professor Karthik Sridharan and Professor Robert D. Kleinberg. His research interests span optimization, online learning, reinforcement learning and control, and the interplay between them. Before coming to Cornell, he spent a year at Google as a part of the Brain residency program. Before Google, he completed his undergraduate studies in computer science at IIT Kanpur in India where he was awarded the President's gold medal. Ayush has also been a winner of a student best paper award at COLT 2019