Paper of the Month: April 2023

Once a month during the academic year, the statistics faculty select a paper for our students to read and discuss. Papers are selected based on their impact or historical value, or because they contain useful techniques or results.

Meng, X. L. (1994). Multiple-Imputation Inferences with Uncongenial Sources of Input. Statistical Science, 538-558.

Friday, April 21, 12:00 pm
AUST 326

Notes Preparer: Jung Wun Lee, PhD Student

Motivation for choosing the paper: Incomplete data, including nonresponse and missing values, are a prevalent issue in data science practice. As statisticians in the statistical consulting service, my colleagues and I have faced many challenges related to incomplete data. It was interesting to realize that there are no short answers for dealing with missing values, even though the questions are simple, such as “My data set has some missing values, so what should I do?” I chose this paper to share some of what I have learned during my Ph.D. training and to invite everyone to the world of missing value problems.

Partial abstract of the paper: This paper reviews the old controversies over the validity of multiple-imputation (MI) inference when a procedure for analyzing multiply imputed data sets cannot be derived from (is "uncongenial" to) the model adopted for multiple imputation.

Given sensible imputations and complete-data analysis procedures, inferences from standard MI combining rules are typically superior to, and thus different from, analysts’ incomplete-data analyses. The latter may suffer from serious nonresponse biases because such analyses often must rely on convenient but unrealistic assumptions about the nonresponse mechanism. When it is desirable to conduct inferences under models for nonresponse other than the original imputation model, a possible alternative to recreating imputations is to incorporate appropriate importance weights into the standard combining rules.

These points are reviewed and explored by simple examples and general theory, from both Bayesian and frequentist perspectives, particularly from the randomization perspective. Some convenient terms are suggested for facilitating communication among researchers from different perspectives when evaluating MI inferences with uncongenial sources of input.
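
For readers new to the topic, the "standard MI combining rules" mentioned in the abstract are commonly understood to be Rubin's rules. They pool the point estimates and variance estimates from m completed data sets into one estimate and one total variance. The block below is a minimal sketch, not code from the paper: the function name `combine_mi` and its arguments are hypothetical, and it covers only a scalar estimand without the importance-weighted variant the abstract discusses.

```python
from statistics import mean, variance


def combine_mi(estimates, variances):
    """Pool m complete-data results for a scalar estimand (Rubin's rules).

    estimates: point estimates, one per imputed data set
    variances: the matching complete-data variance estimates
    (Both argument names are illustrative assumptions, not from the paper.)
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, u_bar, b, t
```

The total variance adds the between-imputation spread, inflated by 1 + 1/m for using finitely many imputations, to the average within-imputation variance. That second term is how the extra uncertainty due to the missing values enters the final inference.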