Predicting What You Already Know Helps: Provable Self-Supervised Learning
Speaker: Qi Lei, Postdoc at Princeton (https://cecilialeiqi.github.io/)
Abstract: Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) that require no labeled data in order to learn semantic representations. These pretext tasks are created solely from the input features, for example predicting a missing image patch, recovering the color channels of an image from context, or predicting missing words. Yet predicting this already-known information helps in learning representations that are effective for downstream prediction tasks. In this talk, we posit a mechanism based on approximate conditional independence to formalize how solving certain pretext tasks can learn representations that provably decrease the sample complexity of downstream supervised tasks. Formally, we quantify how approximate independence between the components of the pretext task (conditional on the label and latent variables) allows us to learn representations that solve the downstream task with drastically reduced sample complexity, simply by training a linear layer on top of the learned representation.
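As a rough illustration of the mechanism in the abstract (a toy sketch, not the speaker's code or experiments), the NumPy example below builds two views `x1` and `x2` that are independent given a binary label `y`, learns a representation `psi` by regressing `x2` on `x1` without labels, and then fits a linear layer on `psi(x1)` with a small labeled set. The Gaussian data model, the dimensions, the sample sizes, the use of a linear pretext predictor, and all names (`sample`, `psi`, `u1`, `u2`) are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2 = 50, 5  # dimensions of the two views (illustrative choice)

# Hypothetical class-mean directions for each view, scaled to norm 2.
u1 = rng.normal(size=d1)
u1 *= 2.0 / np.linalg.norm(u1)
u2 = rng.normal(size=d2)
u2 *= 2.0 / np.linalg.norm(u2)


def sample(n):
    """Draw (x1, x2, y) such that x1 and x2 are independent given y."""
    y = rng.choice([-1.0, 1.0], size=n)
    x1 = y[:, None] * u1 + rng.normal(size=(n, d1))
    x2 = y[:, None] * u2 + rng.normal(size=(n, d2))
    return x1, x2, y


# Pretext task: predict x2 from x1 using unlabeled data only.
x1_unlab, x2_unlab, _ = sample(20000)
W, *_ = np.linalg.lstsq(x1_unlab, x2_unlab, rcond=None)


def psi(x1):
    """Learned representation: the (linear) pretext prediction of x2."""
    return x1 @ W


# Downstream task: fit a linear layer on psi(x1) with few labels.
x1_lab, _, y_lab = sample(40)
w, *_ = np.linalg.lstsq(psi(x1_lab), y_lab, rcond=None)

# Evaluate the linear probe on fresh data.
x1_test, _, y_test = sample(2000)
probe_acc = float(np.mean(np.sign(psi(x1_test) @ w) == y_test))
print(f"linear-probe accuracy with 40 labels: {probe_acc:.3f}")
```

In this toy setting the pretext predictor depends on `x1` essentially only through the direction that carries the label, which is why a handful of labeled examples can suffice for the linear layer. The talk's results concern the more general, approximate version of this conditional independence.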
Bio: Qi Lei is a postdoctoral scholar in the EE department at Princeton, advised by Prof. Jason Lee. She received her Ph.D. from the Oden Institute for Computational Engineering & Sciences at UT Austin in May 2020, where she was also a member of the Center for Big Data Analytics and the Wireless Networking & Communications Group. She was a visitor at the Institute for Advanced Study (IAS) for the Theoretical Machine Learning Program in 2019-2020. Before that, she was a research fellow at the Simons Institute for the Foundations of Deep Learning Program. Her main research interests are machine learning, deep learning, and optimization. Qi has received several awards, including two years of the Computing Innovative Fellowship and the Simons-Berkeley Research Fellowship. She also holds several patents.