Joint Colloquium with UConn Health: SyNPar: A Data-Preservation Framework for High-Power False Discovery Rate Control in High-Dimensional Variable Selection, Jingyi (Jessica) Li, UCLA

SyNPar: A Data-Preservation Framework for High-Power False Discovery Rate Control in High-Dimensional Variable Selection

Presented by Jingyi Jessica Li, Professor, Department of Biostatistics, UCLA Fielding School of Public Health

Wednesday, March 12, 2025, 4:00 PM, AUST 202
Coffee will be served at 3:30 PM in the Noether Lounge (AUST 326)
Webex Meeting Link

Abstract: Balancing false discovery rate (FDR) control and statistical power is a fundamental challenge in high-dimensional variable selection. Existing FDR control methods often perturb the original data, either by concatenating knockoff variables or by splitting the data, which can compromise power. In this paper, we introduce SyNPar, a novel framework that controls the FDR in high-dimensional variable selection while preserving the integrity of the original data. SyNPar generates synthetic null data using an inference model under the null hypothesis and identifies false positives through a numerical analog of the likelihood ratio test. We provide theoretical guarantees for FDR control at any desired level and show that SyNPar achieves asymptotically optimal power. The framework is versatile, straightforward to implement, and applicable to a wide range of statistical models, including high-dimensional linear regression, generalized linear models (GLMs), Cox models, and Gaussian graphical models. Through extensive simulations and real-world data applications, we demonstrate that SyNPar consistently outperforms state-of-the-art methods, such as knockoffs and data-splitting techniques, in terms of FDR control, statistical power, and computational efficiency.

Speaker Bio:
Jingyi Jessica Li, Professor of Statistics and Data Science (also affiliated with Biostatistics, Computational Medicine, and Human Genetics), leads a research group called the Junction of Statistics and Biology at UCLA. Dr. Li focuses on developing interpretable statistical methods for biomedical data, particularly single-cell and spatial transcriptomic data, with a strong emphasis on statistical rigor through the use of synthetic controls. She has received multiple awards, including the NSF CAREER Award, Sloan Research Fellowship, ISCB Overton Prize, and COPSS Emerging Leaders Award, and her contributions are widely recognized in the fields of computational biology and statistics.