Comparing overlapping data distributions using visualization

IF 1.8; Region 4 (Computer Science); JCR Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Information Visualization; Pub Date: 2023-05-23; DOI: 10.1177/14738716231173731

Eric Newburger, Niklas Elmqvist


Citations: 0


Comparing overlapping data distributions using visualization

We present results from a preregistered and crowdsourced user study where we asked members of the general population to determine whether two samples represented using different forms of data visualizations are drawn from the same or different populations. Such a task reduces to assessing whether the overlap between the two visualized samples is large enough to suggest similar or different origins. When using idealized normal curves fitted on the samples, it is essentially a graphical formulation of the classic Student’s t-test. However, we speculate that using more sophisticated visual representations, such as bar histograms, Wilkinson dot plots, strip plots, or Tukey boxplots will both allow people to be more accurate at this task as well as better understand its meaning. In other words, the purpose of our study is to explore which visualization best scaffolds novices in making graphical inferences about data. However, our results indicate that the more abstracted idealized bell curve representation of the task yields more accuracy.
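As a rough numerical counterpart to the graphical task described in the abstract, the sketch below computes the classic Student's t statistic for two samples and the overlapping area of normal curves fitted to them. It is a minimal illustration using only the Python standard library, not the authors' study procedure; the function names `pooled_t_statistic` and `fitted_overlap` and the sample values are hypothetical.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic using a pooled variance estimate."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err


def fitted_overlap(a, b):
    """Overlapping area (0.0 to 1.0) of normal curves fitted to each sample."""
    curve_a = statistics.NormalDist.from_samples(a)
    curve_b = statistics.NormalDist.from_samples(b)
    return curve_a.overlap(curve_b)


if __name__ == "__main__":
    # Illustrative samples only (assumed values, not data from the study).
    sample_a = [4.8, 5.1, 5.3, 4.9, 5.0, 5.2]
    sample_b = [5.6, 5.9, 5.4, 6.0, 5.7, 5.8]
    print("t statistic:", pooled_t_statistic(sample_a, sample_b))
    print("overlap of fitted normals:", fitted_overlap(sample_a, sample_b))
```

A large absolute t statistic and a small overlap area both point toward different populations; a viewer looking at two fitted bell curves is in effect estimating that overlap by eye.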


Source journal

Information Visualization (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

CiteScore

5.40

Self-citation rate

0.00%

Articles published

Review time

>12 weeks

About the journal: Information Visualization is essential reading for researchers and practitioners of information visualization and is of interest to computer scientists and data analysts working on related specialisms. This journal is an international, peer-reviewed journal publishing articles on fundamental research and applications of information visualization. The journal acts as a dedicated forum for the theories, methodologies, techniques and evaluations of information visualization and its applications. The journal is a core vehicle for developing a generic research agenda for the field by identifying and developing the unique and significant aspects of information visualization. Emphasis is placed on interdisciplinary material and on the close connection between theory and practice. This journal is a member of the Committee on Publication Ethics (COPE).