{"title":"Technical Validation of Plot Designs by Use of Deep Learning","authors":"Anne Helby Petersen, Claus Ekstrøm","doi":"10.1080/00031305.2023.2270649","DOIUrl":"https://doi.org/10.1080/00031305.2023.2270649","url":null,"abstract":"AbstractWhen does inspecting a certain graphical plot allow for an investigator to reach the right statistical conclusion? Visualizations are commonly used for various tasks in statistics – including model diagnostics and exploratory data analysis – and though attractive due to its intuitive nature, the lack of available methods for validating plots is a major drawback. We propose a new technical validation method for visual reasoning. Our method trains deep neural networks to distinguish between plots simulated under two different data generating mechanisms (null or alternative), and we use the classification accuracy as a technical validation score (TVS). The TVS measures the information content in the plots, and TVS values can be used to compare different plots or different choices of data generating mechanisms, thereby providing a meaningful scale that new visual reasoning procedures can be validated against. We apply the method to three popular diagnostic plots for linear regression, namely scatter plots, quantile-quantile plots and residual plots. We consider various types and degrees of misspecification, as well as different within-plot sample sizes. Our method produces TVSs that increase with increasing sample size and decrease with increasing difficulty, and hence the TVS is a meaningful measure of validity.Keywords: Deep learninggraphical inferencelinear regressionneural networkmodel diagnosticsvisualizationDisclaimerAs a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135854309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Phistogram","authors":"Adriana Verónica Blanc","doi":"10.1080/00031305.2023.2267639","DOIUrl":"https://doi.org/10.1080/00031305.2023.2267639","url":null,"abstract":"AbstractThis article introduces a new kind of histogram-based representation for univariate random variables, named the phistogram because of its perceptual qualities. The technique relies on shifted groupings of data, creating a color-gradient zone that evidences the uncertainty from smoothing and highlights sampling issues. In this way, the phistogram offers a deep and visually appealing perspective on the finite sample peculiarities, being capable of depicting the underlying distribution as well, thus becoming an useful complement to histograms and other statistical summaries. Although not limited to it, the present construction is derived from the equal-area histogram, a variant that differs conceptually from the traditional one. As such a distinction is not greatly emphasized in the literature, the graphical fundamentals are described in detail, and an alternative terminology is proposed to separate some concepts. Additionally, a compact notation is adopted to integrate the representation’s metadata into the graphic itself.Keywords: statistical graphicdata visualization toolperceptioncolor-gradient techniquesmoothing uncertaintyequal-area histogramDisclaimerAs a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135141198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Note on Monte Carlo Integration in High Dimensions","authors":"Yanbo Tang","doi":"10.1080/00031305.2023.2267637","DOIUrl":"https://doi.org/10.1080/00031305.2023.2267637","url":null,"abstract":"Monte Carlo integration is a commonly used technique to compute intractable integrals and is typically thought to perform poorly for very high-dimensional integrals. To show that this is not always the case, we examine Monte Carlo integration using techniques from the high-dimensional statistics literature by allowing the dimension of the integral to increase. In doing so, we derive non-asymptotic bounds for the relative and absolute error of the approximation for some general classes of functions through concentration inequalities. We provide concrete examples in which the magnitude of the number of points sampled needed to guarantee a consistent estimate varies between polynomial to exponential, and show that in theory arbitrarily fast or slow rates are possible. This demonstrates that the behaviour of Monte Carlo integration in high dimensions is not uniform. Through our methods we also obtain non-asymptotic confidence intervals which are valid regardless of the number of points sampled.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135141528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"One-step weighting to generalize and transport treatment effect estimates to a target population*","authors":"Ambarish Chattopadhyay, Eric R. Cohn, José R. Zubizarreta","doi":"10.1080/00031305.2023.2267598","DOIUrl":"https://doi.org/10.1080/00031305.2023.2267598","url":null,"abstract":"AbstractThe problems of generalization and transportation of treatment effect estimates from a study sample to a target population are central to empirical research and statistical methodology. In both randomized experiments and observational studies, weighting methods are often used with this objective. Traditional methods construct the weights by separately modeling the treatment assignment and study selection probabilities and then multiplying functions (e.g., inverses) of their estimates. In this work, we provide a justification and an implementation for weighting in a single step. We show a formal connection between this one-step method and inverse probability and inverse odds weighting. We demonstrate that the resulting estimator for the target average treatment effect is consistent, asymptotically Normal, multiply robust, and semiparametrically efficient. We evaluate the performance of the one-step estimator in a simulation study. We illustrate its use in a case study on the effects of physician racial diversity on preventive healthcare utilization among Black men in California. We provide R code implementing the methodology.Keywords: Causal inferenceGeneralizationTransportationRandomized experimentsObservational studiesWeighting methodsDisclaimerAs a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135141677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal quartets: Different ways to attain the same average treatment effect*","authors":"Andrew Gelman, Jessica Hullman, Lauren Kennedy","doi":"10.1080/00031305.2023.2267597","DOIUrl":"https://doi.org/10.1080/00031305.2023.2267597","url":null,"abstract":"AbstractThe average causal effect can often be best understood in the context of its variation. We demonstrate with two sets of four graphs, all of which represent the same average effect but with much different patterns of heterogeneity. As with the famous correlation quartet of Anscombe (1973), these graphs dramatize the way in which real-world variation can be more complex than simple numerical summaries. The graphs also give insight into why the average effect is often much smaller than anticipated.DisclaimerAs a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134975755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ANOVA and Mixed Models: A Short Introduction Using RLukas Meier, Boca Raton, FL: Chapman & Hall/CRC Press, 2023, xiv + 187 pp., $66.95(P), ISBN: 978-0-367-70420-9.","authors":"Brady T. West","doi":"10.1080/00031305.2023.2261817","DOIUrl":"https://doi.org/10.1080/00031305.2023.2261817","url":null,"abstract":"","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135902691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Missing data imputation with high-dimensional data","authors":"Alberto Brini, Edwin R. van den Heuvel","doi":"10.1080/00031305.2023.2259962","DOIUrl":"https://doi.org/10.1080/00031305.2023.2259962","url":null,"abstract":"AbstractImputation of missing data in high-dimensional datasets with more variables P than samples N, P≫N, is hampered by the data dimensionality. For multivariate imputation, the covariance matrix is ill conditioned and cannot be properly estimated. For fully conditional imputation, the regression models for imputation cannot include all the variables. Thus, the high dimension requires special imputation approaches. In this paper, we provide an overview and realistic comparisons of imputation approaches for high-dimensional data when applied to a linear mixed modelling (LMM) framework. We examine approaches from three different classes using simulation studies: multiple imputation with penalized regression, multiple imputation with recursive partitioning and predictive mean matching and multiple imputation with Principal Component Analysis (PCA). We illustrate the methods on a real case study where a multivariate outcome, i.e., an extracted set of correlated biomarkers from human urine samples, was collected and monitored over time and we discuss the proposed methods with more standard imputation techniques that could be applied by ignoring either the multivariate or the longitudinal dimension. Our simulations demonstrate the superiority of the recursive partitioning and predictive mean matching algorithm over the other methods in terms of bias, mean squared error and coverage of the LMM parameter estimates when compared to those obtained from a data analysis without missingness, although it comes at the expense of high computational costs. It is worthwhile reconsidering much faster methodologies like the one relying on PCA.Keywords: high-dimensional datalongitudinal datalinear mixed modelsmissing datamultiple imputationprincipal component analysispenalized regressionrecursive partitioningDisclaimerAs a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135829922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A First Course in Linear Model Theory, 2nd ed.Nalini Ravishanker, Zhiyi Chi, and Dipak K. Dey, Boca Raton, FL: Chapman & Hall/CRC Press, 2022, xvi + 513 pp., $110.00(H), ISBN: 978-1-439-85805-9.","authors":"Carlos Cinelli","doi":"10.1080/00031305.2023.2261819","DOIUrl":"https://doi.org/10.1080/00031305.2023.2261819","url":null,"abstract":"","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135902689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Modeling and Computation in PythonOsvaldo A. Martin, Ravin Kumar, and Junpeng Lao, Boca Raton, FL: Chapman & Hall/CRC Press, 2022, xxii + 398 pp., $99.95(H), ISBN: 978-0-367-89436-8.","authors":"P. Richard Hahn","doi":"10.1080/00031305.2023.2261818","DOIUrl":"https://doi.org/10.1080/00031305.2023.2261818","url":null,"abstract":"","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135902687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Application of the Likelihood Ratio Test and the Cochran-Mantel-Haenszel Test to Discrimination Cases","authors":"Weiwen Miao, Joseph L. Gastwirth","doi":"10.1080/00031305.2023.2259969","DOIUrl":"https://doi.org/10.1080/00031305.2023.2259969","url":null,"abstract":"ABSTRACTIn practice, the ultimate outcome of many important discrimination cases, e.g. the Wal-Mart, Nike and Goldman-Sachs equal pay cases, is determined at the stage when the plaintiffs request that the case be certified as a class action. The primary statistical issue at this time is whether the employment practice in question leads to a common pattern of outcomes disadvantaging most plaintiffs. However, there are no formal procedures or government guidelines for checking whether an employment practice results in a common pattern of disparity. This paper proposes using the slightly modified likelihood ratio test and the one-sided Cochran-Mantel-Haenszel (CMH) test to examine data relevant to deciding whether this commonality requirement is satisfied. Data considered at the class certification stage from several actual cases are analyzed by the proposed procedures. The results often show that the employment practice at issue created a common pattern of disparity, however, based on the evidence presented to the courts, the class action requests were denied.KEYWORDS: Class actionCochran-Mantel-Haenszel testDisparate impactEmployment discriminationLikelihood ratio testStratified dataDisclaimerAs a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.","PeriodicalId":342642,"journal":{"name":"The American Statistician","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135394720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}