Statistical Science | Pub Date: 2021-02-01 | Epub Date: 2020-12-21 | DOI: 10.1214/19-sts749
Corwin M Zigler, Georgia Papadogeorgou
{"title":"Bipartite Causal Inference with Interference","authors":"Corwin M Zigler, Georgia Papadogeorgou","doi":"10.1214/19-sts749","DOIUrl":"https://doi.org/10.1214/19-sts749","url":null,"abstract":"<p>Statistical methods to evaluate the effectiveness of interventions are increasingly challenged by the inherent interconnectedness of units. Specifically, a recent flurry of methods research has addressed the problem of <i>interference</i> between observations, which arises when one observational unit's outcome depends not only on its treatment but also the treatment assigned to other units. We introduce the setting of <i>bipartite causal inference with interference,</i> which arises when 1) treatments are defined on observational units that are distinct from those at which outcomes are measured and 2) there is <i>interference</i> between units in the sense that outcomes for some units depend on the treatments assigned to many other units. The focus of this work is to formulate definitions and several possible causal estimands for this setting, highlighting similarities and differences with more commonly considered settings of causal inference with interference. Towards an empirical illustration, an inverse probability of treatment weighted estimator is adapted from existing literature to estimate a subset of simplified, but interesting, estimands. The estimators are deployed to evaluate how interventions to reduce air pollution from 473 power plants in the U.S. causally affect cardiovascular hospitalization among Medicare beneficiaries residing at 18,807 zip code locations.</p>","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"36 1","pages":"109-123"},"PeriodicalIF":5.7,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8048152/pdf/nihms-1056137.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38804958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introduction to the Special Section","authors":"Yihong Wu, Harrison H. Zhou","doi":"10.1214/20-sts361ed","DOIUrl":"https://doi.org/10.1214/20-sts361ed","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42468701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Conversation with Tze Leung Lai","authors":"Ying Lu, Dylan S. Small, Z. Ying","doi":"10.1214/20-sts775","DOIUrl":"https://doi.org/10.1214/20-sts775","url":null,"abstract":"This conversation began in June 2015 in the Department of Statistics at Columbia University during Lai’s visit to his alma mater where he celebrated his seventieth birthday. It continued in the subsequent years at Columbia and Stanford. Lai was born on June 28, 1945, in Hong Kong, where he grew up and attended The University of Hong Kong, receiving his B.A. degree (First Class Honors) in Mathematics in 1967. He went to Columbia University in 1968 for graduate study in statistics and received his Ph.D. degree in 1971. He stayed on the faculty at Columbia and was appointed Higgins Professor of Mathematical Statistics in 1986. A year later he moved to Stanford, where he is currently Ray Lyman Wilbur Professor of Statistics, and by courtesy, also of Biomedical Data Science and Computational and Mathematical Engineering. He is a fellow of the Institute of Mathematical Statistics, the American Statistical Association and an elected member of Academia Sinica in Taiwan. He was the third recipient of the COPSS Award which he won in 1983. He has been married to Letitia Chow since 1975, and they have two sons and two grandchildren.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2021-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42132267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Two Frameworks for Analyzing Longitudinal Data","authors":"Jie Zhou, Xiao Zhou, Liuquan Sun","doi":"10.1214/20-sts813","DOIUrl":"https://doi.org/10.1214/20-sts813","url":null,"abstract":"Under the random design of longitudinal data, observation times are irregular, and there are mainly two frameworks for analyzing such kind of longitudinal data. One is the clustered data framework and the other is the counting process framework. In this paper, we give a thorough comparison of these two frameworks in terms of data structure, model assumptions and estimation procedures. We find that modeling the observation times in the counting process framework will not gain any efficiency when the observation times are correlated with covariates but independent of the longitudinal response given covariates. Some simulation studies are conducted to compare the finite sample behaviors of the related estimators, and a real data analysis of the Alzheimer’s disease study is implemented for further comparison.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"1 1","pages":""},"PeriodicalIF":5.7,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"66085640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Confidence as Likelihood","authors":"Y. Pawitan, Youngjo Lee","doi":"10.1214/20-sts811","DOIUrl":"https://doi.org/10.1214/20-sts811","url":null,"abstract":"Confidence and likelihood are fundamental statistical concepts with distinct technical interpretation and usage. Confidence is a meaningful concept of uncertainty within the context of confidence-interval procedure, while likelihood has been used predominantly as a tool for statistical modelling and inference given observed data. Here we show that confidence is in fact an extended likelihood, thus giving a much closer correspondence between the two concepts. This result gives the confidence concept an external meaning outside the confidence-interval context, and vice versa, it gives the confidence interpretation to the likelihood. In addition to the obvious interpretation purposes, this connection suggests two-way transfers of technical information. For example, the extended likelihood theory gives a clear way to update or combine confidence information. On the other hand, the confidence connection gives the extended likelihood direct access to the frequentist probability, an objective certification not directly available to the classical likelihood. This implies that intervals derived from the extended likelihood have the same logical status as confidence intervals, thus simplifying the terminology in the inference of random parameters.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"1 1","pages":""},"PeriodicalIF":5.7,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"66085566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gambler’s Ruin and the ICM","authors":"P. Diaconis, S. Ethier","doi":"10.1214/21-sts826","DOIUrl":"https://doi.org/10.1214/21-sts826","url":null,"abstract":"Consider gambler's ruin with three players, 1, 2, and 3, having initial capitals $A$, $B$, and $C$. At each round a pair of players is chosen (uniformly at random) and a fair coin flip is made resulting in the transfer of one unit between these two players. Eventually, one of the players is eliminated and the game continues with the remaining two. Let $\\sigma \\in S_3$ be the elimination order (e.g., $\\sigma=132$ means player 1 is eliminated first, player 3 is eliminated second, and player 2 is left with $A+B+C$). We seek approximations (and exact formulas) for the probabilities $P_{A,B,C}(\\sigma)$. One frequently used approximation, the independent chip model (ICM), is shown to be inadequate. A regression adjustment is proposed, which seems to give good approximations to the players' elimination order probabilities.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48579330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methods to Compute Prediction Intervals: A Review and New Results","authors":"Qinglong Tian, D. Nordman, W. Meeker","doi":"10.1214/21-sts842","DOIUrl":"https://doi.org/10.1214/21-sts842","url":null,"abstract":"This paper reviews two main types of prediction interval methods under a parametric framework. First, we describe methods based on an (approximate) pivotal quantity. Examples include the plug-in, pivotal, and calibration methods. Then we describe methods based on a predictive distribution (sometimes derived based on the likelihood). Examples include Bayesian, fiducial, and direct-bootstrap methods. Several examples involving continuous distributions along with simulation studies to evaluate coverage probability properties are provided. We provide specific connections among different prediction interval methods for the (log-)location-scale family of distributions. This paper also discusses general prediction interval methods for discrete data, using the binomial and Poisson distributions as examples. We also overview methods for dependent data, with application to time series, spatial data, and Markov random fields, for example.","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2020-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46757725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Look at Robustness and Stability of $\\ell_{1}$- versus $\\ell_{0}$-Regularization: Discussion of Papers by Bertsimas et al. and Hastie et al.","authors":"Yuansi Chen, Armeen Taeb, P. Bühlmann","doi":"10.1214/20-sts809","DOIUrl":"https://doi.org/10.1214/20-sts809","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"614-622"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45899156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rejoinder: Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons","authors":"T. Hastie, R. Tibshirani, R. Tibshirani","doi":"10.1214/20-sts733rej","DOIUrl":"https://doi.org/10.1214/20-sts733rej","url":null,"abstract":"","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"579-592"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44684797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modern Variable Selection in Action: Comment on the Papers by HTT and BPV","authors":"E. George","doi":"10.1214/20-sts808","DOIUrl":"https://doi.org/10.1214/20-sts808","url":null,"abstract":"Let me begin by congratulating the authors of these two papers, hereafter HTT and BPV, for their superb contributions to the comparisons of methods for variable selection problems in high dimensional regression. The methods considered are truly some of today’s leading contenders for coping with the size and complexity of big data problems of so much current importance. Not surprisingly, there is no clear winner here because the terrain of comparisons is so vast and complex, and no single method can dominate across all situations. The considered setups vary greatly in terms of the number of observations n, the number of predictors p, the number and relative sizes of the underlying nonzero regression coefficients, predictor correlation structures and signal-to-noise ratios (SNRs). And even these only scratch the surface of the infinite possibilities. Further, there is the additional issue as to which performance measure is most important. Is the goal of an analysis exact variable selection or prediction or both? And what about computational speed and scalability? All these considerations would naturally depend on the practical application at hand. The methods compared by HTT and BPV have been unleashed by extraordinary developments in computational speed, and so it is tempting to distinguish them primarily by their novel implementation algorithms. In particular, the recent integer optimization related algorithms for variable selection differ in fundamental ways from the now widely adopted coordinate ascent algorithms for the lasso related methods. Undoubtedly, the impressive improvements in computational speed unleashed by these algorithms are critical for the feasibility of practical applications. However, the more fundamental story behind the performance differences has to do with the differences between the criteria that their algorithms are seeking to optimize. In an important sense, they are being guided by different solutions to the general variable selection problem. Focusing first on the paper of HTT, its main thrust appears to have been kindled by the computational breakthrough of Bertsimas, King and Mazumder (2016) (hereafter BKM), which had proposed a mixed integer opti-","PeriodicalId":51172,"journal":{"name":"Statistical Science","volume":"35 1","pages":"609-613"},"PeriodicalIF":5.7,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45250262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}