Ananya Kumar (Stanford)

March 1, 2023

Title and Abstract

Foundation Models for Robustness to Distribution Shifts

When ML systems are deployed, they often face test examples that differ from the training data, which leads to a large drop in accuracy. The foundation model paradigm (pretraining general-purpose representations from broad unlabeled data, and then adapting to a variety of tasks we care about) has emerged as one of the most effective ways to improve robustness to novel test examples. But how should we adapt good foundation models robustly, and how should we pretrain good models? (1, Adaptation) In the first part of the talk, we will explain why the standard approach of fine-tuning all model parameters can distort good pretrained representations and underperform out-of-distribution. The theory leads to practical insights and better methods for fine-tuning. Our methods have led to state-of-the-art accuracies on ImageNet and in applications such as satellite remote sensing, wildlife conservation, and radiology. (2, Pretraining) Next, we will examine how foundation models can learn good representations. We show that contrastive pretraining on unlabeled data from many domains, and then transferring to labeled data from one domain, improves accuracy even on the domains where we had no labels. We explain why pretraining can work differently from some classical domain adaptation intuitions. Our theory predicts phenomena on real datasets, and leads to improved algorithms. (3, Future Work) Finally, we discuss some exciting future research directions on foundation models.

Bio

Ananya is a final-year PhD student at Stanford University, advised by Percy Liang and Tengyu Ma. His PhD work focuses on representation learning, foundation models, and reliable machine learning.