Seminar Talk: Toward Combining Causal Inference and Databases for Interpretable Data Science
Sudeepa Roy: Associate Professor, Computer Science, Duke University
ABSTRACT: Causal inference, which estimates the effect of a treatment on an outcome, is critical for sound data analysis. It provides a means to estimate the impact of an intervention on the world, something that correlation, association, or model-based prediction analysis cannot provide. As a result, causal inference is indispensable in health, public policy, and other domains. In this talk, I will discuss how traditional database research can help causal inference research in AI and statistics, and vice versa. First, I will discuss an interpretable and scalable "matching" framework for causal inference that achieves estimation accuracy comparable to that of black-box machine learning methods and uses efficient SQL query evaluation. Next, I will discuss extending standard causal inference to relational data, and using causal inference to explain SQL query answers. I will conclude with an overview of our ongoing and future work on actionable and interpretable methods for different stages of the data analysis pipeline.
BIO: Sudeepa Roy is an Associate Professor in Computer Science at Duke University. She works broadly in data management and data science, including causal inference and explanations for big data, query debugging, data repair, query optimization, data provenance, and uncertain data. Before joining Duke in 2015, she was a postdoc at the University of Washington, and she obtained her Ph.D. from the University of Pennsylvania. She is a recipient of the VLDB Early Career Research Contributions Award, an NSF CAREER Award, and a Google Ph.D. fellowship in structured data. She is a co-director of the Almost Matching Exactly (AME) lab for interpretable causal inference at Duke (https://almost-matching-exactly.github.io/).