Exploring Exploratory Data Analysis: An Empirical Test of Run Chart Utility

IF 1.4 Q4 ENGINEERING, INDUSTRIAL
Matthew Barsalou, Pedro Manuel Saraiva, Roberto Henriques
Management Systems in Production Engineering, vol. 4, pp. 442-448. Published 2023-12-01. Journal article. DOI: 10.2478/mspe-2023-0050 (https://doi.org/10.2478/mspe-2023-0050)
Citation count: 0

Abstract

This paper explores Exploratory Data Analysis (EDA). EDA uses graphical methods to gain insights, and these insights can be useful for forming tentative hypotheses when performing a root cause analysis (RCA). The topic of EDA is well addressed in the literature; however, empirical studies of its efficacy are lacking. We therefore aim to evaluate EDA by comparing one group of students identifying salient features in a table against a second group attempting to identify salient features in the same data presented as a run chart, and then drawing conclusions from the comparison. Two groups of students were randomly selected to receive the data, either in the form of a table or a run chart. They were then tasked with visually identifying any data points that stood out as interesting. The number of correctly identified values and the time taken to find them were each evaluated with a two-sample t-test to determine whether there was a statistically significant difference. The participants with a graph found the correct values that stood out in the data much more quickly than those who used a table. Those working from the table took much longer and failed to identify values that stood out. However, those with a graph also had far more false positives. Much has been written on EDA, but an empirical evaluation of this common methodology has been lacking; this paper provides empirical evidence of the effectiveness of EDA.
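
The comparison described in the abstract can be illustrated with a minimal sketch of a two-sample t-test on time-to-find values. The abstract does not say whether a pooled-variance or unequal-variance test was used, so the Welch (unequal-variance) form here is an assumption. The function name `welch_t` and all numbers are hypothetical, invented only for illustration; they are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n)
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical times in seconds (NOT from the paper)
table_times = [95.0, 120.0, 110.0, 130.0, 105.0]  # group given a table
chart_times = [40.0, 55.0, 35.0, 60.0, 50.0]      # group given a run chart

t_stat, dof = welch_t(table_times, chart_times)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive t statistic here means the table group's mean time is larger. Converting the statistic to a p-value requires the t distribution, which the Python standard library does not provide; a statistics package would normally be used for that step.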
Source journal metrics: CiteScore 4.30; self-citation rate 13.30%; articles published 48; review time 10 weeks.