Title: Robust Statistical Methods for Noisy Complex Network Data
Presented by Wenrui Li, Assistant Professor, University of Connecticut
Date: Thursday, May 7, 2026; Time: 1:00 – 2:00 pm; Location: 1681 LGRT
Bio: Wenrui Li is an Assistant Professor in the Department of Statistics at the University of Connecticut. Prior to this, Prof. Li was a postdoctoral researcher in the Department of Biostatistics, Epidemiology and Informatics at the University of Pennsylvania, mentored by Qi Long. Prof. Li received a Ph.D. in Statistics from Boston University under the direction of Eric D. Kolaczyk, Daniel L. Sussman and Laura F. White, an MS degree in Statistics from the University of Washington, and a BS degree in Statistics from Shandong University. Prof. Li's research interests include statistics for network data, causal inference under interference, high-dimensional data analysis (e.g., -omics data and neuroimaging data), and statistical methods for infectious disease transmission and surveillance.
Abstract:
In recent years there has been an explosion of network data from seemingly all corners of science. Such data present unique analytical challenges. In particular, practitioners widely recognize that measurement error is associated with most network data, including, but not limited to, social networks and biological networks and pathways (e.g., gene regulatory pathways and metabolic networks). By "measurement error" we mean true edges being observed as non-edges, and vice versa. However, little attention has been given, at the level of statistical theory and methods, to the problem of noisy networks. In this talk, I will introduce our recent efforts to account for network noise in two settings: Bayesian modeling of high-dimensional data with noisy network information, and causal inference on noisy networks.
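As a toy illustration of the notion of measurement error described above, the following Python sketch flips edges of a simulated graph and compares the observed edge density with the true one. It is not the speaker's method. The function name add_edge_noise, the error rates alpha (false-positive rate) and beta (false-negative rate), the independent-flip noise model, and the random "true" graph are all assumptions made for illustration. The final line applies a simple moment-style correction that presumes alpha and beta are known, which is rarely the case in practice.

```python
import random
from itertools import combinations


def add_edge_noise(nodes, true_edges, alpha, beta, rng):
    """Return a noisy copy of an undirected graph's edge set.

    Assumed noise model (hypothetical): each true non-edge is observed
    as an edge with probability alpha, and each true edge is observed
    as a non-edge with probability beta, independently across pairs.
    """
    observed = set()
    for pair in combinations(sorted(nodes), 2):
        if pair in true_edges:
            if rng.random() >= beta:   # true edge survives
                observed.add(pair)
        elif rng.random() < alpha:     # spurious edge appears
            observed.add(pair)
    return observed


def density(n_nodes, edges):
    """Fraction of node pairs that are connected."""
    return len(edges) / (n_nodes * (n_nodes - 1) / 2)


rng = random.Random(0)
n = 200
nodes = range(n)

# Hypothetical "true" graph: each pair is an edge with probability 0.05.
true_edges = {p for p in combinations(nodes, 2) if rng.random() < 0.05}

alpha, beta = 0.005, 0.30
observed_edges = add_edge_noise(nodes, true_edges, alpha, beta, rng)

d_true = density(n, true_edges)
d_obs = density(n, observed_edges)

# Under the assumed model, E[d_obs] = (1 - beta) * d_true + alpha * (1 - d_true).
d_expected = (1 - beta) * d_true + alpha * (1 - d_true)

# Solving that moment equation for d_true, with alpha and beta taken as known.
d_corrected = (d_obs - alpha) / (1 - alpha - beta)

print(f"true={d_true:.4f} observed={d_obs:.4f} "
      f"expected={d_expected:.4f} corrected={d_corrected:.4f}")
```

Even for a summary as simple as edge density, the observed value is systematically shifted away from the truth under this noise model, which is the kind of distortion that noise-aware methods aim to account for.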
In the first project, we propose a graph-guided Bayesian modeling framework to account for network noise in regression models involving structured high-dimensional predictors. Our approach uses two sources of network information, namely, the noisy graph extracted from existing databases and the graph estimated for predictors in the data at hand, to inform the latent true graph. We demonstrate the advantages of our method over existing methods in simulations, and through analyses of -omics datasets for Alzheimer's disease.
In the second project, we quantify biases and variances of standard estimators of average causal effects in noisy networks and develop a general framework for estimation of true average causal effects. We employ method-of-moments techniques to derive estimators and establish their asymptotic unbiasedness, consistency, and normality. Simulations in the context of social contact networks in British secondary schools suggest that our estimators can achieve substantial inferential accuracy in networks of even modest size when nontrivial noise is present.