Statistics Colloquium: Jae-Kwang Kim, Iowa State University

This event is part of the Fall 2022 Statistics Colloquium


Multiple Bias Calibration for Valid Statistical Inference With Selection Bias

Presented by Jae-Kwang Kim, LAS Dean's Professor, Department of Statistics, Iowa State University

Wednesday, September 7
4:00 p.m. ET
Austin Building, Room 434

The Wilcoxon rank-sum/Mann-Whitney test is one of the most popular distribution-free procedures for testing the equality of two univariate probability distributions. One of the main reasons for its popularity can be attributed to the remarkable result of Hodges and Lehmann (1956), which shows that the asymptotic relative efficiency of Wilcoxon’s test with respect to Student’s t-test, under location alternatives, never falls below 0.864, despite the former being exactly distribution-free in finite samples. Even more striking is the result of Chernoff and Savage (1958), which shows that the efficiency of a Gaussian score transformed Wilcoxon’s test, against the t-test, is lower bounded by 1. In this talk we will discuss multivariate versions of these celebrated results, by considering distribution-free analogues of the Hotelling T²-test based on optimal transport. The proposed tests are consistent against a general class of alternatives and satisfy Hodges-Lehmann and Chernoff-Savage-type efficiency lower bounds over various natural families of multivariate distributions, despite being entirely agnostic to the underlying data generating mechanism. Analogous results for independence testing will also be presented. Finally, we will discuss how optimal transport based multivariate ranks can be used to obtain distribution-free kernel two-sample tests, which are universally consistent, computationally efficient, and have non-trivial asymptotic efficiency. (Based on joint work with Nabarun Deb and Bodhisattva Sen.)

Key references:

  • Qin, J., Leung, D. and Shao, J. (2002), ‘Estimation with survey data under non-ignorable nonresponse or informative sampling’, Journal of the American Statistical Association 97, 193–200.
  • Morikawa, K. and Kim, J.K. (2021), ‘Semiparametric optimal estimation with nonignorable nonresponse data’, The Annals of Statistics 49(5), 2991–3014.
  • Han, P. and Wang, L. (2013), ‘Estimation with missing data: Beyond double robustness’, Biometrika 100, 417–430.

Speaker Bio:

Dr. Jae-Kwang Kim is an LAS Dean’s Professor in the Department of Statistics at Iowa State University (ISU). He received his Ph.D. from ISU in 2000 under the supervision of Prof. Wayne Fuller and then worked at several institutions (including Westat and Yonsei University in Korea) before joining ISU in 2008. He is a fellow of ASA and IMS and the president-elect of KISS (Korean International Statistical Society). He has extensive research and consulting experience in survey sampling and missing data analysis. He is also a co-author, with Jun Shao, of the book “Statistical Methods for Handling Incomplete Data” (2nd edn).
