Once a month during the academic year, the statistics faculty select a paper for our students to read and discuss. Papers are selected based on their impact or historical value, or because they contain useful techniques or results.
Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004) Least angle regression. The Annals of Statistics, 32, 407–499.
Notes preparer: Kun Chen
In many problems, we have a large collection of predictors from which we hope to select a parsimonious set for accurate prediction of a response variable. Classical model selection algorithms include forward selection, backward elimination, stepwise selection, and their stagewise variants. In the era of big data, methods under the “regularization + optimization” learning scheme, as exemplified by the Lasso, have undergone exciting development. Given that these seemingly very different approaches often produce strikingly similar models, a natural question arises: is there any intrinsic connection between them?
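For concreteness, the Lasso mentioned above solves a penalized least-squares problem; in its standard formulation (the notation here is ours, not quoted from the paper),

\[
\hat{\beta}(\lambda) = \arg\min_{\beta} \; \tfrac{1}{2}\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1, \qquad \lambda \ge 0,
\]

where the tuning parameter \(\lambda\) controls the amount of shrinkage: larger values drive more coefficients exactly to zero, which is what makes the Lasso a model selection method as well as an estimation method.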
Efron et al. (2004) proposed Least Angle Regression (LARS), a novel model selection algorithm, to bridge the Lasso and forward stagewise regression (FSR). They showed that the solution path of either the Lasso or FSR can be produced efficiently by the LARS algorithm with simple modifications. Ever since this work, there has been a revival of interest in so-called stagewise learning. Generally speaking, a stagewise algorithm builds a model from scratch and gradually increases the model complexity in a sequence of simple learning steps. It was subsequently realized that LARS and stagewise learning also connect to various optimization and machine learning approaches, such as steepest descent, boosting, and path-following algorithms. As noted in one of the paper’s discussion pieces, “the LARS–Lasso–boosting relationship opens the door for new insights on existing methods’ underlying statistical mechanisms and for the development of new and promising methodology.”
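To make the stagewise idea concrete, below is a minimal Python sketch of epsilon-style forward stagewise regression. The function name, the step size eps, and the fixed step count are illustrative choices for this note, not the paper's exact formulation; the core loop, nudging the coefficient of the predictor most correlated with the current residual by a tiny amount, is the mechanism being described.

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=1000):
    """Minimal forward stagewise regression sketch.

    Assumes the columns of X are standardized and y is centered.
    Each step nudges the coefficient of the predictor most
    correlated with the current residual by eps in the direction
    of that correlation, and records the full coefficient path.
    """
    n, p = X.shape
    beta = np.zeros(p)
    path = [beta.copy()]
    residual = y.copy()
    for _ in range(n_steps):
        corr = X.T @ residual                 # current correlations with residual
        j = np.argmax(np.abs(corr))           # most correlated predictor
        delta = eps * np.sign(corr[j])        # tiny step toward the response
        beta[j] += delta
        residual -= delta * X[:, j]           # update residual incrementally
        path.append(beta.copy())
    return np.array(path)

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize predictors
y = X[:, 0] * 3.0 - X[:, 2] * 1.5 + rng.standard_normal(100)
y = y - y.mean()                              # center the response
path = forward_stagewise(X, y)
print(path[-1])                               # final coefficient estimates
```

As the step size eps shrinks toward zero, the sequence of fitted models traces out the limiting stagewise path that Efron et al. (2004) connect to the LARS and Lasso solution paths.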
We will discuss the main idea of LARS, the intriguing connections between regularized estimation and stagewise learning, and some recent advances.