Interdisciplinary Seminar: Fan Li, Duke University

Overlap Weighting for Causal Inference

Presented by Dr. Fan Li, Duke University

Friday, October 1
12 p.m. ET

Covariate balance is crucial for causal comparisons. Weighting is a common strategy to balance covariates in observational studies. We propose a general class of weights, the balancing weights, that balance the weighted distributions of the covariates between treatment groups. These weights incorporate the propensity score to weight each group to an analyst-selected target population. This class unifies existing weighting methods and includes commonly used weights, such as inverse-probability weights, as special cases. Within the class, we highlight the overlap weighting method, which has been widely adopted in applied research. The overlap weight of each unit is proportional to the probability of that unit being assigned to the opposite group. The overlap weights are bounded and minimize the asymptotic variance of the weighted average treatment effect among the class of balancing weights. The overlap weights also possess a desirable exact balance property. Extensions of overlap weighting to multiple treatments, survival outcomes, and subgroup analysis will also be discussed.
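
As a rough illustration of the abstract above (not the speaker's own code), the sketch below computes overlap weights on simulated data. It assumes a binary treatment and a propensity score estimated by logistic regression fitted by maximum likelihood, which is the setting in which the exact balance property is known to hold. All names (fit_logistic, overlap_weights, weighted_mean_diff) and the simulated data are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_logistic(X, z, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson (X must include an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (z - p)                       # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def overlap_weights(e, z):
    """Weight each unit by the probability of assignment to the opposite group."""
    return np.where(z == 1, 1.0 - e, e)

def weighted_mean_diff(x, z, w):
    """Difference in weighted covariate means, treated minus control."""
    t = z == 1
    return (np.average(x[t], weights=w[t], axis=0)
            - np.average(x[~t], weights=w[~t], axis=0))

# Simulated observational data (illustrative assumption, not from the talk).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
true_e = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] - 0.8 * x[:, 1])))
z = rng.binomial(1, true_e)

# Estimate the propensity score with a logistic model on the same covariates.
X = np.column_stack([np.ones(n), x])
beta_hat = fit_logistic(X, z)
e_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))

w = overlap_weights(e_hat, z)
diff_raw = weighted_mean_diff(x, z, np.ones(n))  # unweighted imbalance
diff_ow = weighted_mean_diff(x, z, w)            # imbalance after overlap weighting

print("unweighted mean difference:", diff_raw)
print("overlap-weighted mean difference:", diff_ow)
```

In this sketch the overlap weights stay between 0 and 1 by construction, whereas inverse-probability weights (1/e for treated units, 1/(1-e) for controls) can grow without bound as the propensity score approaches 0 or 1. The overlap-weighted mean differences come out at zero up to numerical error because the logistic score equations force the weighted covariate sums to match across groups.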

Speaker Bio:

Dr. Fan Li is a Professor in the Department of Statistical Science, with a secondary appointment in the Department of Biostatistics and Bioinformatics, at Duke University. Her main research interest is causal inference, specifically designs and analyses for evaluating treatments and interventions in randomized experiments and observational studies, and their applications to health studies (also known as comparative effectiveness research) and computational social science. Dr. Li also works on the interface between causal inference and machine learning. She has developed methods for propensity scores, clinical trials, randomized experiments (e.g., A/B testing), difference-in-differences, regression discontinuity designs, representation learning, and Bayesian methods. She has also worked on statistical methods for missing data, as well as Bayesian graphical modeling for genomics and neuroimaging data.