Abstract
Summary. We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such a problem arises naturally in many practical situations with the multifactor analysis-of-variance problem as the most important and well-known example. Instead of selecting factors by stepwise backward elimination, we focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection. The lasso, the LARS algorithm and the non-negative garrotte are recently proposed regression methods that can be used to select individual variables. We study and propose efficient algorithms for the extensions of these methods for factor selection and show that these extensions give superior performance to the traditional stepwise backward elimination method in factor selection problems. We study the similarities and the differences between these methods. Simulations and real examples are used to illustrate the methods.
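To make the idea of selecting whole factors concrete, here is a minimal sketch of group-wise soft thresholding, the shrinkage step usually associated with the group lasso extension. It is not code from this record. It assumes an orthonormal design, so that each factor's least squares coefficients can be shrunk independently, and a penalty weight of `lam * sqrt(group size)`. The function names `group_soft_threshold` and `group_lasso_orthonormal` are hypothetical.

```python
import math


def group_soft_threshold(coefs, threshold):
    """Shrink a coefficient vector toward zero as a block.

    The whole vector is set to zero when its Euclidean norm is at most
    `threshold`; otherwise every entry is scaled by the same factor.
    """
    norm = math.sqrt(sum(c * c for c in coefs))
    if norm <= threshold:
        return [0.0] * len(coefs)
    scale = 1.0 - threshold / norm
    return [scale * c for c in coefs]


def group_lasso_orthonormal(ols_by_group, lam):
    """Apply block shrinkage to each factor's least squares coefficients.

    Assumption: the design is orthonormal, so groups do not interact.
    The threshold for a group of size p is lam * sqrt(p).
    """
    return {
        name: group_soft_threshold(coefs, lam * math.sqrt(len(coefs)))
        for name, coefs in ols_by_group.items()
    }


# Illustrative (made-up) coefficients for two factors.
fit = group_lasso_orthonormal({"a": [3.0, 4.0], "b": [0.3, 0.4]}, lam=1.0)
```

In this toy input, factor `b` has a small norm and is dropped as a unit, while factor `a` is kept with all of its coefficients shrunk by a common factor. That all-or-nothing behaviour per factor is what distinguishes factor selection from selecting individual variables.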
References
14
Referenced
4,906
- Bakin (1999). Adaptive regression and model selection in data mining problems.
- Breiman (1995). Better subset regression using the nonnegative garrote. Technometrics. DOI: 10.1080/00401706.1995.10484371
- Efron (2004). Least angle regression. Ann. Statist. DOI: 10.1214/009053604000000067
- Fan (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Statist. Ass. DOI: 10.1198/016214501753382273
- Foster (1994). The risk inflation criterion for multiple regression. Ann. Statist. DOI: 10.1214/aos/1176325766
- Fu (1999). Penalized regressions: the bridge versus the lasso. J. Comput. Graph. Statist. DOI: 10.1080/10618600.1998.10474784
- George (2000). Calibration and empirical Bayes variable selection. Biometrika. DOI: 10.1093/biomet/87.4.731
- George (1993). Variable selection via Gibbs sampling. J. Am. Statist. Ass. DOI: 10.1080/01621459.1993.10476353
- Hosmer (1989). Applied Logistic Regression.
- Lin (2003). Technical Report 1072.
- Rosset (2004). Technical Report.
- Shen (2002). Adaptive model selection. J. Am. Statist. Ass. DOI: 10.1198/016214502753479356
- Tibshirani (1996). Regression shrinkage and selection via the lasso. J. R. Statist. Soc. B. DOI: 10.1111/j.2517-6161.1996.tb02080.x
- Yuan (2005). Statistics Discussion Paper 2005-25.
Dates
Type | When |
---|---|
Created | Dec. 21, 2005, 6:07 a.m. |
Deposited | Jan. 6, 2025, 12:40 p.m. |
Indexed | Aug. 21, 2025, 2:28 p.m. |
Issued | Dec. 21, 2005 |
Published | Dec. 21, 2005 |
Published Online | Dec. 21, 2005 |
Published Print | Feb. 1, 2006 |
Funders
1
National Science Foundation
10.13039/100000001
Region: Americas
gov (National government)
Labels
4
- U.S. National Science Foundation
- NSF
- US NSF
- USA NSF
Awards
1
- DMS-0134987
@article{Yuan_2005,
  title={Model Selection and Estimation in Regression with Grouped Variables},
  volume={68},
  ISSN={1467-9868},
  url={http://dx.doi.org/10.1111/j.1467-9868.2005.00532.x},
  DOI={10.1111/j.1467-9868.2005.00532.x},
  number={1},
  journal={Journal of the Royal Statistical Society Series B: Statistical Methodology},
  publisher={Oxford University Press (OUP)},
  author={Yuan, Ming and Lin, Yi},
  year={2005},
  month=dec,
  pages={49–67}
}