Automated quality assessment for crowdsourced test reports of mobile applications

Xin Chen, He Jiang, Xiaochen Li, Tieke He, Zhenyu Chen
{"title":"Automated quality assessment for crowdsourced test reports of mobile applications","authors":"Xin Chen, He Jiang, Xiaochen Li, Tieke He, Zhenyu Chen","doi":"10.1109/SANER.2018.8330224","DOIUrl":null,"url":null,"abstract":"In crowdsourced mobile application testing, crowd workers help developers perform testing and submit test reports for unexpected behaviors. These submitted test reports usually provide critical information for developers to understand and reproduce the bugs. However, due to the poor performance of workers and the inconvenience of editing on mobile devices, the quality of test reports may vary sharply. At times developers have to spend a significant portion of their available resources to handle the low-quality test reports, thus heavily decreasing their efficiency. In this paper, to help developers predict whether a test report should be selected for inspection within limited resources, we propose a new framework named TERQAF to automatically model the quality of test reports. TERQAF defines a series of quantifiable indicators to measure the desirable properties of test reports and aggregates the numerical values of all indicators to determine the quality of test reports by using step transformation functions. Experiments conducted over five crowdsourced test report datasets of mobile applications show that TERQAF can correctly predict the quality of test reports with accuracy of up to 88.06% and outperform baselines by up to 23.06%. Meanwhile, the experimental results also demonstrate that the four categories of measurable indicators have positive impacts on TERQAF in evaluating the quality of test reports.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"1 1","pages":"368-379"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SANER.2018.8330224","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

In crowdsourced mobile application testing, crowd workers help developers perform testing and submit test reports for unexpected behaviors. These submitted test reports usually provide critical information for developers to understand and reproduce the bugs. However, due to the varying capabilities of workers and the inconvenience of editing on mobile devices, the quality of test reports can vary sharply. Developers often have to spend a significant portion of their available resources handling low-quality test reports, which heavily reduces their efficiency. In this paper, to help developers predict whether a test report should be selected for inspection within limited resources, we propose a new framework named TERQAF to automatically model the quality of test reports. TERQAF defines a series of quantifiable indicators that measure the desirable properties of test reports and aggregates the numerical values of all indicators, using step transformation functions, to determine the quality of each test report. Experiments conducted over five crowdsourced test report datasets of mobile applications show that TERQAF can correctly predict the quality of test reports with an accuracy of up to 88.06%, outperforming the baselines by up to 23.06%. Meanwhile, the experimental results also demonstrate that all four categories of measurable indicators have a positive impact on TERQAF's ability to evaluate the quality of test reports.
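The abstract describes TERQAF's pipeline only at a high level: score a report on several quantifiable indicators, map each raw value through a step transformation function, and aggregate the results into a quality label. A minimal sketch of that idea is given below; all indicator names, thresholds, and the aggregation rule are illustrative assumptions, not the paper's actual definitions.

```python
# A minimal sketch of the indicator-plus-step-function idea from the
# abstract. The indicators, thresholds, and aggregation rule here are
# illustrative assumptions, not TERQAF's actual formulas.

from dataclasses import dataclass


@dataclass
class TestReport:
    description: str      # free-text bug description
    steps: list[str]      # reproduction steps, if any
    screenshots: int      # number of attached screenshots


def step_transform(value: float, thresholds: list[float]) -> float:
    """Map a raw indicator value onto a discrete quality level.

    A step transformation function increases the level each time the
    value crosses one of the (sorted) thresholds; the result is
    normalized to [0, 1].
    """
    level = sum(1 for t in thresholds if value >= t)
    return level / len(thresholds)


def assess(report: TestReport) -> str:
    # Hypothetical indicators standing in for the paper's measurable
    # indicator categories (which are not reproduced here).
    indicators = {
        "text_length": step_transform(len(report.description), [20, 80, 200]),
        "has_steps":   step_transform(len(report.steps), [1, 3, 5]),
        "screenshots": step_transform(report.screenshots, [1, 2, 3]),
    }
    score = sum(indicators.values()) / len(indicators)
    return "good" if score >= 0.5 else "bad"   # binary quality label


if __name__ == "__main__":
    report = TestReport(
        description="App crashes when rotating the screen on the login page.",
        steps=["Open app", "Go to login", "Rotate device"],
        screenshots=2,
    )
    print(assess(report))   # -> "good"
```

The step functions make the framework robust to outliers in raw indicator values (a 5,000-character description scores no higher than a 200-character one), which is one plausible motivation for discretizing before aggregation.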