Technical Validation of Plot Designs by Use of Deep Learning

The American Statistician. Pub Date: 2023-10-13. DOI: 10.1080/00031305.2023.2270649

Anne Helby Petersen, Claus Ekstrøm


Citation count: 0

Abstract

When does inspecting a certain graphical plot allow an investigator to reach the right statistical conclusion? Visualizations are commonly used for various tasks in statistics, including model diagnostics and exploratory data analysis, and though they are attractive due to their intuitive nature, the lack of available methods for validating plots is a major drawback. We propose a new technical validation method for visual reasoning. Our method trains deep neural networks to distinguish between plots simulated under two different data generating mechanisms (null or alternative), and we use the classification accuracy as a technical validation score (TVS). The TVS measures the information content in the plots, and TVS values can be used to compare different plots or different choices of data generating mechanisms, thereby providing a meaningful scale that new visual reasoning procedures can be validated against. We apply the method to three popular diagnostic plots for linear regression, namely scatter plots, quantile-quantile plots and residual plots. We consider various types and degrees of misspecification, as well as different within-plot sample sizes. Our method produces TVSs that increase with increasing sample size and decrease with increasing difficulty, and hence the TVS is a meaningful measure of validity.

Keywords: Deep learning; graphical inference; linear regression; neural network; model diagnostics; visualization

Disclaimer: As a service to authors and researchers we are providing this version of an accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proofs will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to these versions also.
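The simulate, classify, and score idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a plain logistic classifier stands in for the deep neural networks, the plots are coarse occupancy grids rather than rendered images, and the data generating mechanisms (a linear null, a quadratic alternative), the function names, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def render_residual_plot(rng, n_points, effect, bins=12):
    """Simulate one dataset, fit a straight line, and rasterize the
    residuals-vs-fitted plot as a flattened bins x bins occupancy grid.
    effect = 0 is the (assumed) null; effect != 0 adds a quadratic term."""
    x = rng.uniform(-1.0, 1.0, n_points)
    y = 1.0 + 2.0 * x + effect * x**2 + rng.normal(0.0, 0.5, n_points)
    design = np.column_stack([np.ones(n_points), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ beta
    resid = y - fitted

    def unit(v):
        # Rescale to [0, 1] so axis scales carry no information,
        # mimicking a plot drawn with automatic axis limits.
        return (v - v.min()) / (v.max() - v.min())

    counts, _, _ = np.histogram2d(
        unit(fitted), unit(resid), bins=bins, range=[[0, 1], [0, 1]]
    )
    return (counts > 0).astype(float).ravel()


def simulate_plots(rng, n_plots, n_points, effect):
    """Draw plots with random labels: 0 = null, 1 = alternative."""
    labels = rng.integers(0, 2, n_plots)
    images = np.array(
        [render_residual_plot(rng, n_points, effect * lab) for lab in labels]
    )
    return images, labels


def technical_validation_score(n_points=200, effect=2.0,
                               n_train=2000, n_test=500, seed=0):
    """Held-out classification accuracy of a logistic classifier
    (a stand-in for the deep network) on null vs. alternative plots."""
    rng = np.random.default_rng(seed)
    x_train, y_train = simulate_plots(rng, n_train, n_points, effect)
    x_test, y_test = simulate_plots(rng, n_test, n_points, effect)

    # Center pixels using training data only.
    pixel_mean = x_train.mean(axis=0)
    x_train = x_train - pixel_mean
    x_test = x_test - pixel_mean

    weights = np.zeros(x_train.shape[1])
    bias = 0.0
    for _ in range(300):  # plain gradient descent on the logistic loss
        logits = np.clip(x_train @ weights + bias, -30.0, 30.0)
        error = 1.0 / (1.0 + np.exp(-logits)) - y_train
        weights -= 0.1 * (x_train.T @ error) / n_train
        bias -= 0.1 * error.mean()

    predicted = (x_test @ weights + bias > 0).astype(int)
    return float((predicted == y_test).mean())


if __name__ == "__main__":
    print("TVS, strong misspecification:", technical_validation_score(effect=2.0))
    print("TVS, no misspecification:   ", technical_validation_score(effect=0.0))
```

In this sketch, a score near 0.5 means the classifier cannot tell null plots from alternative plots, and a score near 1 means the plots carry enough information to separate the two mechanisms almost perfectly. Varying `n_points` or `effect` gives a rough way to explore how the score responds to sample size and difficulty under these assumed mechanisms.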

