Statistics Seminar

Inference and Learning in Infinite Dimensions: Insights from Optimal Transport by Tengyuan Liang

Event Details

Wednesday, November 16, 2022
4-5 p.m.

Date: November 16, 2022

Speaker: Tengyuan Liang


Title: Inference and Learning in Infinite Dimensions: Insights from Optimal Transport


I plan to discuss two papers on inference and learning in infinite dimensions that draw on insights from optimal transport theory.

The first paper introduces a new simulation-based inference procedure to model and sample from multi-dimensional probability distributions given access to i.i.d. samples, circumventing the usual approaches of explicitly modeling the density function or designing Markov chain Monte Carlo. Motivated by the seminal work on distance and isomorphism between metric measure spaces by Mémoli (2011) and Sturm (2012), we propose a new notion called the Reversible Gromov-Monge (RGM) distance and study how RGM can be used to design new transform samplers to perform simulation-based inference. Our RGM sampler can also estimate optimal alignments between two heterogeneous metric measure spaces $(X, \mu, d_X)$ and $(Y, \nu, d_Y)$ from empirical data sets, with estimated maps that approximately push forward one measure $\mu$ to the other $\nu$, and vice versa. We study the analytic properties of the RGM distance and establish that, under mild conditions, RGM equals the classic Gromov-Wasserstein distance. Curiously, drawing a connection to Brenier's polar factorization and a result by Brenier and Gangbo (2003), we show that the RGM sampler induces bias towards strong isomorphism with proper choices of $c_{\mathcal{X}}$ and $c_{\mathcal{Y}}$. We also study statistical rates of convergence, representation, and optimization questions for the induced sampler, and demonstrate its effectiveness on synthetic and real-world examples.
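
For orientation, the classic Gromov-Wasserstein distance mentioned above is commonly written as follows. This is the standard quadratic form from the literature, given here as a reference sketch rather than the speaker's exact formulation; normalizing constants and the exponent vary between authors, and $\Pi(\mu, \nu)$ (the set of couplings of $\mu$ and $\nu$) is notation assumed for this illustration.

```latex
\[
\mathrm{GW}(\mu, \nu)
  = \inf_{\pi \in \Pi(\mu, \nu)}
    \left(
      \iint \bigl| d_X(x, x') - d_Y(y, y') \bigr|^{2}
      \, \mathrm{d}\pi(x, y) \, \mathrm{d}\pi(x', y')
    \right)^{1/2}
\]
```

A Monge-type variant restricts the couplings to those induced by a map $F$ with $F_{\#}\mu = \nu$. As the abstract describes, RGM works with maps in both directions; its exact objective is not stated here, so it is not reproduced.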

Motivated by robust dynamic resource allocation in operations research, the second paper studies the Online Learning to Transport (OLT) problem where the decision variable is a probability measure, an infinite-dimensional object. We draw connections between online learning, optimal transport, and partial differential equations through an insight called the minimal selection principle, initially studied in the Wasserstein gradient flow setting by Ambrosio, Gigli and Savaré (2008). This allows us to seamlessly extend the standard online learning framework to the infinite-dimensional setting. Based on our framework, we derive a novel method called the minimal selection or exploration (MSoE) algorithm to solve OLT problems using mean-field approximation and discretization techniques. In the displacement convex setting, the main theoretical message underpinning our approach is that minimizing transport cost over time (via the minimal selection principle) ensures optimal cumulative regret upper bounds. On the algorithmic side, our MSoE algorithm applies beyond the displacement convex setting, making the mathematical theory of optimal transport practically relevant to non-convex settings common in dynamic resource allocation.
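
To make "cumulative regret" concrete, one standard way to write it when the decision variable is a probability measure is sketched below. All notation is assumed for illustration and is not taken from the paper: $\ell_t$ is the loss functional revealed in round $t$, $\mu_t$ is the measure the learner plays, and $\mathcal{P}$ is the feasible set of probability measures.

```latex
\[
\mathrm{Regret}_T
  = \sum_{t=1}^{T} \ell_t(\mu_t)
  - \inf_{\mu \in \mathcal{P}} \sum_{t=1}^{T} \ell_t(\mu)
\]
```

Under this reading, a regret upper bound that grows sublinearly in $T$ means the learner's average loss approaches that of the best fixed measure in hindsight.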