Statistics Seminar

Statistical Inference for Matrix Data: Robustness and Extremes presented by Joshua Cape

Event Details

Date
Thursday, April 9, 2026
Time
1-2 p.m.
Location
7560 Morgridge Hall
Description

Abstract: Matrices are ubiquitous for representing, manipulating, and analyzing large datasets across diverse domains. This talk highlights a recent line of work on developing statistical inference guarantees for high-dimensional problem settings where observable data matrices are large. Our theory and methods bring together tools from nonparametric statistics, extreme value theory, and random matrix theory, enabling hypothesis testing and uncertainty quantification for spectral-based embeddings and test statistics in low-rank latent variable models. Our contributions, under the umbrella of modern multivariate statistics, leverage fine-grained perturbation analysis in matrix models where signals are corrupted by noise. In this talk, we focus on two key directions: robustness and extremes. For models with Gaussian noise, our study of extremes considers the maximum Euclidean row norm of the difference between the leading sample and population singular vectors. We establish that this statistic is asymptotically Gumbel in certain high-dimensional regimes, enabling novel and high-resolution hypothesis testing for singular subspace structure. For models with general, possibly heavy-tailed continuous noise, our study of robustness considers ordinal rank-transformed data representations. Among our results, we establish that the leading eigenvalue of matrices of normalized rank statistics is asymptotically Gaussian in certain high-dimensional regimes, enabling novel inference for community detection and principal submatrix detection. Collectively, these results inform rigorous, inferential spectral-based methodology that is applicable throughout statistical machine learning and data science.
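As a rough illustration of the kind of statistic discussed in the abstract's "extremes" direction (this is a hypothetical sketch, not the speaker's code or model), one can simulate a rank-one signal-plus-Gaussian-noise matrix and compute the maximum row-wise deviation between the leading sample and population singular vectors; the spike size, dimensions, and sign-alignment step below are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (assumptions, not the talk's exact setup): observe
# M = S + E, where S is a rank-one population signal and E is Gaussian noise,
# then compare the leading sample singular vector to the population one.
rng = np.random.default_rng(0)
n, p = 500, 500
u = rng.normal(size=n); u /= np.linalg.norm(u)   # population left singular vector
v = rng.normal(size=p); v /= np.linalg.norm(v)   # population right singular vector
signal_strength = 200.0                          # assumed spike size, well above threshold
S = signal_strength * np.outer(u, v)             # rank-one population signal
E = rng.normal(size=(n, p))                      # i.i.d. standard Gaussian noise

U, s, Vt = np.linalg.svd(S + E, full_matrices=False)
u_hat = U[:, 0]
if u_hat @ u < 0:                                # resolve the sign ambiguity of the SVD
    u_hat = -u_hat

# Maximum Euclidean row norm of the difference between sample and population
# singular vectors; for a single vector each "row" is a scalar, so this is the
# maximum absolute entrywise deviation. (For rank r, one would align U_hat to U
# by an orthogonal matrix and take the max norm over rows of the difference.)
T = np.max(np.abs(u_hat - u))
print(T)
```

Under the Gaussian regimes described in the abstract, suitably centered and scaled versions of statistics like `T` converge to a Gumbel limit, which is what enables the high-resolution tests for singular subspace structure mentioned in the talk.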

Cost
Free