Once a month during the academic year, the statistics faculty select a paper for our students to read and discuss. Papers are selected based on their impact or historical value, or because they contain useful techniques or results.
Golub, G. H., Heath, M., & Wahba, G. (1979). Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter. Technometrics, 21(2), 215–223.
Notes preparer: HaiYing Wang
Cross-validation is widely used to assess the quality of model fit in statistics and machine learning, and it is often adopted for choosing tuning parameters in regularization methods such as the widely used LASSO. However, ordinary cross-validation may not perform well in certain scenarios, for example when the design matrix is close to being column-orthogonal. In the context of ridge regression, the authors proposed the method of generalized cross-validation (GCV), a rotation-invariant version of ordinary cross-validation. GCV improves on ordinary cross-validation: it does not require estimating the model error variance, and it remains applicable when the number of predictors is larger than the sample size. It is worth mentioning that this paper was joint work by researchers from computer science and statistics almost forty years ago. Modern applications of GCV extend far beyond ridge regression, but the fundamental ideas are in the original paper.
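To make the idea concrete, here is a minimal sketch of choosing a ridge parameter by minimizing the GCV criterion, in the spirit of the paper: with hat matrix A(λ) = X(XᵀX + nλI)⁻¹Xᵀ, one minimizes V(λ) = (1/n)‖(I − A(λ))y‖² / [(1/n) tr(I − A(λ))]². The function name `gcv_score`, the grid of λ values, and the simulated data are illustrative assumptions, not from the paper.

```python
import numpy as np

def gcv_score(X, y, lam):
    """GCV criterion V(lambda) for ridge regression.

    Uses the ridge hat matrix A(lambda) = X (X'X + n*lam*I)^{-1} X'.
    """
    n, p = X.shape
    # Solve (X'X + n*lam*I) B = X' rather than forming an explicit inverse
    A = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
    resid = y - A @ y                           # (I - A) y
    denom = (np.trace(np.eye(n) - A) / n) ** 2  # [(1/n) tr(I - A)]^2
    return (resid @ resid / n) / denom

# Illustrative simulated data and a grid search for the GCV-minimizing lambda
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]
y = X @ beta + rng.standard_normal(n)

grid = np.logspace(-4, 1, 30)
best = min(grid, key=lambda lam: gcv_score(X, y, lam))
```

Note that because (XᵀX + nλI) is invertible for any λ > 0, the same code runs unchanged when p > n, which is one of the advantages the authors highlight.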