Industrial adoption of machine learning techniques for early identification of invalid bug reports

IF 3.6 2区计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Empirical Software Engineering Pub Date : 2024-07-31 DOI:10.1007/s10664-024-10502-3

Muhammad Laiq, Nauman bin Ali, Jürgen Börstler, Emelie Engström

{"title":"Industrial adoption of machine learning techniques for early identification of invalid bug reports","authors":"Muhammad Laiq, Nauman bin Ali, Jürgen Börstler, Emelie Engström","doi":"10.1007/s10664-024-10502-3","DOIUrl":null,"url":null,"abstract":"<p>Despite the accuracy of machine learning (ML) techniques in predicting invalid bug reports, as shown in earlier research, and the importance of early identification of invalid bug reports in software maintenance, the adoption of ML techniques for this task in industrial practice is yet to be investigated. In this study, we used a technology transfer model to guide the adoption of an ML technique at a company for the early identification of invalid bug reports. In the process, we also identify necessary conditions for adopting such techniques in practice. We followed a case study research approach with various design and analysis iterations for technology transfer activities. We collected data from bug repositories, through focus groups, a questionnaire, and a presentation and feedback session with an expert. As expected, we found that an ML technique can identify invalid bug reports with acceptable accuracy at an early stage. However, the technique’s accuracy drops over time in its operational use due to changes in the product, the used technologies, or the development organization. Such changes may require retraining the ML model. During validation, practitioners highlighted the need to understand the ML technique’s predictions to trust the predictions. We found that a visual (using a state-of-the-art ML interpretation framework) and descriptive explanation of the prediction increases the trustability of the technique compared to just presenting the results of the validity predictions. We conclude that trustability, integration with the existing toolchain, and maintaining the techniques’ accuracy over time are critical for increasing the likelihood of adoption.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"42 1","pages":""},"PeriodicalIF":3.6000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Empirical Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10664-024-10502-3","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

Abstract

Despite the accuracy of machine learning (ML) techniques in predicting invalid bug reports, as shown in earlier research, and the importance of early identification of invalid bug reports in software maintenance, the adoption of ML techniques for this task in industrial practice is yet to be investigated. In this study, we used a technology transfer model to guide the adoption of an ML technique at a company for the early identification of invalid bug reports. In the process, we also identify necessary conditions for adopting such techniques in practice. We followed a case study research approach with various design and analysis iterations for technology transfer activities. We collected data from bug repositories, through focus groups, a questionnaire, and a presentation and feedback session with an expert. As expected, we found that an ML technique can identify invalid bug reports with acceptable accuracy at an early stage. However, the technique’s accuracy drops over time in its operational use due to changes in the product, the used technologies, or the development organization. Such changes may require retraining the ML model. During validation, practitioners highlighted the need to understand the ML technique’s predictions to trust the predictions. We found that a visual (using a state-of-the-art ML interpretation framework) and descriptive explanation of the prediction increases the trustability of the technique compared to just presenting the results of the validity predictions. We conclude that trustability, integration with the existing toolchain, and maintaining the techniques’ accuracy over time are critical for increasing the likelihood of adoption.

Abstract Image

查看原文本刊更多论文

工业界采用机器学习技术，及早识别无效错误报告

尽管早先的研究表明，机器学习（ML）技术在预测无效错误报告方面具有很高的准确性，而且早期识别无效错误报告在软件维护中也非常重要，但在工业实践中采用 ML 技术来完成这项任务还有待研究。在本研究中，我们使用技术转移模型来指导一家公司采用 ML 技术来早期识别无效错误报告。在此过程中，我们还确定了在实践中采用此类技术的必要条件。我们采用案例研究的方法，对技术转让活动进行了各种设计和分析迭代。我们从错误库中收集数据，通过焦点小组、问卷调查以及与专家的演示和反馈会议。不出所料，我们发现人工智能技术可以在早期阶段以可接受的准确度识别出无效的错误报告。然而，随着时间的推移，由于产品、所用技术或开发组织的变化，该技术的准确性会在实际使用中下降。这种变化可能需要重新训练 ML 模型。在验证过程中，实践者强调需要理解 ML 技术的预测，以便信任其预测结果。我们发现，对预测进行可视化（使用最先进的 ML 解释框架）和描述性解释，比仅仅展示有效性预测结果更能提高技术的可信度。我们的结论是，可信任度、与现有工具链的整合以及长期保持技术的准确性对于提高采用的可能性至关重要。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Empirical Software Engineering 工程技术-计算机：软件工程

CiteScore

8.50

自引率

12.20%

发文量

169

审稿时长

>12 weeks

期刊介绍： Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories. The journal also offers industrial experience reports detailing the application of software technologies - processes, methods, or tools - and their effectiveness in industrial settings. Empirical Software Engineering promotes the publication of industry-relevant research, to address the significant gap between research and practice.