Discussion of “Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons”
R. Mazumder. Statistical Science, volume 35, pages 602-608. Published 2020-11-01. DOI: 10.1214/20-sts807 (https://doi.org/10.1214/20-sts807)
I warmly congratulate the authors Hastie, Tibshirani and Tibshirani (HTT), and Bertsimas, Pauphilet and Van Parys (BPV), for their excellent contributions and important perspectives on sparse regression. Due to space constraints, and my greater familiarity with the content and context of HTT (I have had numerous fruitful discussions with the authors regarding their work), I will focus my discussion on the HTT paper. HTT nicely articulate the relative merits of three canonical estimators in sparse regression: L0, L1 and (forward) stepwise selection. I am humbled that a premise of their work is an article I wrote with Bertsimas and King [4] (BKM). BKM showed that current Mixed Integer Optimization (MIO) algorithms allow us to compute best subsets solutions for problem instances (p ≈ 1000 features) much larger than a previous benchmark (software for best subsets in the R package leaps) that could only handle instances with p ≈ 30. HTT, by extending and refining the experiments performed by BKM, have helped clarify and deepen our understanding of L0, L1 and stepwise regression. They raise several intriguing questions that perhaps deserve further attention from the wider statistics and optimization communities. In this commentary, I will focus on some of the key points discussed in HTT, with a bias toward some of the recent work I have been involved in. There is a large and rich body of work in high-dimensional statistics and related optimization techniques that I will not be able to discuss within the limited scope of my commentary.
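To make the contrast between L0 (best subset) and forward stepwise selection concrete, here is a minimal sketch on a small synthetic problem. It is an illustration under stated assumptions, not the method of BKM, HTT, or the leaps package: best subset is computed here by brute-force enumeration of all size-k subsets (BKM use MIO), and the function names `rss`, `best_subset`, `forward_stepwise`, the data sizes, and the coefficients are all hypothetical choices made for this example.

```python
from itertools import combinations

import numpy as np


def rss(X, y, subset):
    """Residual sum of squares of least squares restricted to `subset` columns."""
    if len(subset) == 0:
        return float(y @ y)
    Xs = X[:, list(subset)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta
    return float(r @ r)


def best_subset(X, y, k):
    """L0 estimator by exhaustive search: the size-k subset with smallest RSS.

    Brute force is only feasible for small p; it stands in for the MIO
    approach, which is not reproduced here.
    """
    p = X.shape[1]
    return min(combinations(range(p), k), key=lambda s: rss(X, y, s))


def forward_stepwise(X, y, k):
    """Greedy selection: at each step add the feature that most reduces RSS."""
    p = X.shape[1]
    active = []
    for _ in range(k):
        candidates = [j for j in range(p) if j not in active]
        best = min(candidates, key=lambda j: rss(X, y, active + [j]))
        active.append(best)
    return tuple(sorted(active))


# Synthetic data (assumed for illustration): 3 true features out of 8.
rng = np.random.default_rng(0)
n, p, k = 50, 8, 3
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

bs = best_subset(X, y, k)
fs = forward_stepwise(X, y, k)
print("best subset:     ", bs, rss(X, y, bs))
print("forward stepwise:", fs, rss(X, y, fs))
```

By construction, the best subset solution never has a larger residual sum of squares than the forward stepwise solution of the same size, since the greedy path is one of the subsets the exhaustive search considers. The cost is that the number of candidate subsets grows combinatorially in p, which is why enumeration stops being practical long before p ≈ 1000.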
About the journal:
The central purpose of Statistical Science is to convey the richness, breadth and unity of the field by presenting the full range of contemporary statistical thought at a moderate technical level, accessible to the wide community of practitioners, researchers and students of statistics and probability.