{"title":"Discussion of “Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons”","authors":"R. Mazumder","doi":"10.1214/20-sts807","DOIUrl":"https://doi.org/10.1214/20-sts807","url":null,"abstract":"I warmly congratulate the authors Hastie, Tibshirani and Tibshirani (HTT), and Bertsimas, Pauphilet and Van Parys (BPV), for their excellent contributions and important perspectives on sparse regression. Due to space constraints, and my greater familiarity with the content and context of HTT (I have had numerous fruitful discussions with the authors regarding their work), I will focus my discussion on the HTT paper. HTT nicely articulate the relative merits of three canonical estimators in sparse regression: L0, L1 and (forward) stepwise selection. I am humbled that a premise of their work is an article I wrote with Bertsimas and King [4] (BKM). BKM showed that current Mixed Integer Optimization (MIO) algorithms allow us to compute best subsets solutions for problem instances (p ≈ 1000 features) much larger than a previous benchmark (software for best subsets in the R package leaps) that could only handle instances with p ≈ 30. HTT, by extending and refining the experiments performed by BKM, have helped clarify and deepen our understanding of L0, L1 and stepwise regression. They raise several intriguing questions that perhaps deserve further attention from the wider statistics and optimization communities. In this commentary, I will focus on some of the key points discussed in HTT, with a bias toward some of the recent work I have been involved in. There is a large and rich body of work in high-dimensional statistics and related optimization techniques that I will not be able to discuss within the limited scope of my commentary.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"602-608"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47846338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Conversation with J. Stuart (Stu) Hunter","authors":"R. D. Veaux","doi":"10.1214/19-sts766","DOIUrl":"https://doi.org/10.1214/19-sts766","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"663-671"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43654798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rejoinder: Sparse Regression: Scalable Algorithms and Empirical Performance","authors":"D. Bertsimas, J. Pauphilet, Bart P. G. Van Parys","doi":"10.1214/20-sts701rej","DOIUrl":"https://doi.org/10.1214/20-sts701rej","url":null,"abstract":"their","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46343056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameter Restrictions for the Sake of Identification: Is There Utility in Asserting That Perhaps a Restriction Holds?","authors":"P. Gustafson","doi":"10.1214/23-sts885","DOIUrl":"https://doi.org/10.1214/23-sts885","url":null,"abstract":"Statistical modeling can involve a tension between assumptions and statistical identification. The law of the observable data may not uniquely determine the value of a target parameter without invoking a key assumption, and, while plausible, this assumption may not be obviously true in the scientific context at hand. Moreover, there are many instances of key assumptions which are untestable, hence we cannot rely on the data to resolve the question of whether the target is legitimately identified. Working in the Bayesian paradigm, we consider the grey zone of situations where a key assumption, in the form of a parameter space restriction, is scientifically reasonable but not incontrovertible for the problem being tackled. Specifically, we investigate statistical properties that ensue if we structure a prior distribution to assert that 'maybe' or 'perhaps' the assumption holds. Technically this simply devolves to using a mixture prior distribution putting just some prior weight on the assumption, or one of several assumptions, holding. However, while the construct is straightforward, there is very little literature discussing situations where Bayesian model averaging is employed across a mix of fully identified and partially identified models.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48381504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Causal Effects Within Principal Strata Using Auxiliary Variables","authors":"Zhichao Jiang, Peng Ding","doi":"10.1214/20-sts810","DOIUrl":"https://doi.org/10.1214/20-sts810","url":null,"abstract":"In causal inference, principal stratification is a framework for dealing with a posttreatment intermediate variable between a treatment and an outcome, in which the principal strata are defined by the joint potential values of the intermediate variable. Because the principal strata are not fully observable, the causal effects within them, also known as the principal causal effects, are not identifiable without additional assumptions. Several previous empirical studies leveraged auxiliary variables to improve the inference of principal causal effects. We establish a general theory for identification and estimation of the principal causal effects with auxiliary variables, which provides a solid foundation for statistical inference and more insights for model building in empirical research. In particular, we consider two commonly used strategies for principal stratification problems: principal ignorability, and the conditional independence between the auxiliary variable and the outcome given principal strata and covariates. For these two strategies, we give non-parametric and semi-parametric identification results without modeling assumptions on the outcome. When the assumptions for neither strategy are plausible, we propose a large class of flexible parametric and semi-parametric models for identifying principal causal effects. Our theory not only ensures formal identification results of several models that have been used in previous empirical studies but also generalizes them to allow for different types of outcomes and intermediate variables.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48107960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comment: Diagnostics and Kernel-based Extensions for Linear Mixed Effects Models with Endogenous Covariates","authors":"Hunyong Cho, Joshua P. Zitovsky, Xinyi Li, Minxin Lu, K. Shah, John Sperger, Matthew C. B. Tsilimigras, M. Kosorok","doi":"10.1214/20-sts782","DOIUrl":"https://doi.org/10.1214/20-sts782","url":null,"abstract":"We discuss “Linear mixed models with endogenous covariates: modeling sequential treatment effects with application to a mobile health study” by Qian, Klasnja and Murphy. In this discussion, we study when the linear mixed effects models with endogenous covariates are feasible to use by providing examples and diagnostic tools as well as discussing potential extensions. This includes evaluating feasibility of partial likelihood-based inference, checking the conditional independence assumption, estimation of marginal effects, and kernel extensions of the model.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"396-399"},"PeriodicalIF":5.7,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47084522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comment: On the Potential for Misuse of Outcome-Wide Study Designs, and Ways to Prevent It","authors":"S. Vansteelandt, O. Dukes","doi":"10.1214/20-sts769","DOIUrl":"https://doi.org/10.1214/20-sts769","url":null,"abstract":"We congratulate the authors, VanderWeele, T.J., Mathur, M.B. and Chen, Y. (2020) (hereafter referred to as VMC), for making an interesting and important proposal, and thank the Editor for the opportunity to comment on it. We agree with VMC that outcome-wide epidemiology has the potential to overcome many of the weaknesses of the traditional epidemiological approach. Scientific reports that express the effects of exposure on a variety of different outcomes provide a more complete view on the exposure impact, while lessening the risk of selective analysis and reporting. We see much value in it, though caution is warranted. In this commentary, we highlight a number of key limitations, which will in turn suggest preferred analysis strategies that we find important to consider in addition to (or instead of) those described by VMC.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48148964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comment: Matching Methods for Observational Studies Derived from Large Administrative Databases","authors":"F. Sävje","doi":"10.1214/19-sts739","DOIUrl":"https://doi.org/10.1214/19-sts739","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45336677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rejoinder: A Nonparametric Superefficient Estimator of the Average Treatment Effect","authors":"D. Benkeser, Weixian Cai, M. J. Laan","doi":"10.1214/20-sts789","DOIUrl":"https://doi.org/10.1214/20-sts789","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"511-517"},"PeriodicalIF":5.7,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42917060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}