Assessing Task Difficulty in Software Testing using Biometric Measures
Daryl Camilleri, C. Porter, Mark Micallef
Electronic Workshops in Computing, published 2022-07-01
DOI: 10.14236/ewic/hci2022.5 (https://doi.org/10.14236/ewic/hci2022.5)
Abstract
In this paper, we investigate the extent to which task difficulty in the software testing domain can be classified using psycho-physiological sensors. Following a literature review, we selected the work of Fritz et al. (2014), conducted among software developers, and adapted it to the testing domain. We present the results of a study in which 16 professional software testers carried out predefined tasks in a lab setting while we collected eye tracking, electroencephalogram (EEG) and electrodermal activity (EDA) data. On average, each participant took part in a two-hour data-collection session. Throughout the study, we captured approximately 14Gb of biometric data, consisting of more than 120 million data points. Using this data, we trained 21 naïve Bayes classifiers to predict task difficulty from three perspectives (by participant, by task, by participant-task), using each of the seven possible combinations of sensors. Our results show that we can predict task difficulty for a new tester with a precision of 74.4% and a recall of 72.5% using just an eye tracker, and for a new task with a precision of 72.2% and a recall of 70.0% using eye tracking and electrodermal activity. These results are largely consistent with those of Fritz et al. (2014). We conclude by providing insights as to which combinations of sensors provide the best results, and how this work could be used to enhance well-being and workflow support tools in an industry setting.
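The figure of 21 classifiers in the abstract follows from crossing the three prediction perspectives with every non-empty subset of the three sensors (2^3 - 1 = 7). The short Python sketch below only enumerates that grid; the identifiers (`SENSORS`, `PERSPECTIVES`, `sensor_combinations`, `classifier_grid`) are illustrative assumptions and are not taken from the authors' tooling, and no training or data handling is shown.

```python
from itertools import combinations

# Illustrative labels (assumed names) for the three sensors and the
# three prediction perspectives mentioned in the abstract.
SENSORS = ("eye_tracking", "EEG", "EDA")
PERSPECTIVES = ("by_participant", "by_task", "by_participant_task")


def sensor_combinations(sensors):
    """Return every non-empty subset of the sensors, smallest subsets first."""
    return [
        combo
        for size in range(1, len(sensors) + 1)
        for combo in combinations(sensors, size)
    ]


def classifier_grid(perspectives, sensors):
    """Pair each perspective with each sensor combination."""
    return [
        (perspective, combo)
        for perspective in perspectives
        for combo in sensor_combinations(sensors)
    ]


grid = classifier_grid(PERSPECTIVES, SENSORS)
print(len(sensor_combinations(SENSORS)))  # 7 sensor combinations
print(len(grid))                          # 21 classifier configurations
```

Each entry in `grid` corresponds to one classifier configuration, for example `("by_task", ("eye_tracking", "EDA"))` for the new-task setting that the abstract reports with eye tracking and electrodermal activity.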