Extension of differential topic analysis for microbiome data
Presented by Pratheepa Jeganathan, Department of Mathematics and Statistics, McMaster University
Abstract: High-throughput sequencing generates massive molecular microbial datasets that pose several statistical challenges. Statistical methods have therefore been developed to address contaminant sequences from reagents, unequal sampling, strain switching (a taxon present as one strain in one set of specimens and as a close but distinct strain in the other set), sparsity, and heterogeneity. An important goal in microbiome research is to find taxonomic differences across environments or groups. In this talk, we will demonstrate differential topic analysis, which facilitates inference on latent microbial communities when strain switching can be an impediment. First, in the presence of DNA contamination, we quantify true abundance using Bayesian reference analysis. Next, we present a method based on a data similarity matrix to detect strain switching. Finally, we show how topic models provide useful aggregates, so that differential abundance analysis can be based on topics rather than individual strains, using the R package diffTop, available on GitHub.