Demo: Automatically Retrainable Self Improving Model for the Automated Classification of Software Incidents into Multiple Classes

2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), Pub Date: 2021-07-01, DOI: 10.1109/ICDCS51616.2021.00113

Badal Agrawal, Mohit Mishra


Citations: 2

Abstract

Developers in most organizations still classify software bug reports by hand. Bug reports often contain text and other information that is characteristic of a particular type of bug. This information can be extracted with Natural Language Processing techniques and combined with the manual classifications that developers have already made to create a properly labelled data set, which is then used to train a supervised learning model that automatically classifies bug reports into their respective categories. Previous studies have focused only on binary classification of software incident reports as bug or non-bug. Our novel approach achieves an accuracy of 76.94% on a 10-class classification problem using the bug repository created by the Microsoft Dynamics 365 team. In addition, we propose a novel method for automatically retraining the model and updating it with developer feedback when a report is misclassified, which will significantly reduce maintenance cost and effort.
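The abstract does not name the features, the model, or the retraining mechanism, so the following is only a minimal sketch of the general idea: text features from bug reports, a supervised multi-class classifier, and a feedback hook that retrains when a developer corrects a label. It assumes a bag-of-words multinomial Naive Bayes model as a stand-in. The class `RetrainableBugClassifier`, its methods, and the example categories are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a report and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class RetrainableBugClassifier:
    """Bag-of-words multinomial Naive Bayes (an assumed stand-in model,
    not the model used in the paper)."""

    def __init__(self):
        self.examples = []  # (text, label) pairs; grows with developer feedback
        self._fit()

    def _fit(self):
        # Rebuild all counts from scratch from the stored labelled examples.
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in self.examples:
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)

    def train(self, labelled_reports):
        """Train on reports that developers have already classified by hand."""
        self.examples = list(labelled_reports)
        self._fit()

    def predict(self, text):
        """Return the label with the highest log-posterior score."""
        if not self.examples:
            raise ValueError("classifier has no training data")
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            score = math.log(self.label_counts[label] / total)
            # Add-one (Laplace) smoothing over the shared vocabulary.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def add_feedback(self, text, correct_label):
        """Store a developer's correction and retrain on the enlarged data set."""
        self.examples.append((text, correct_label))
        self._fit()


# Hypothetical seed data; the categories are illustrative only.
SEED_REPORTS = [
    ("login page crashes with null pointer exception", "crash"),
    ("app crashes on startup after update", "crash"),
    ("report takes too long to load, slow query", "performance"),
    ("dashboard slow and high latency", "performance"),
    ("button misaligned on settings screen", "ui"),
    ("font overlaps label in dialog", "ui"),
]

clf = RetrainableBugClassifier()
clf.train(SEED_REPORTS)
print(clf.predict("slow query on dashboard"))

# A developer flags a misclassified report; the model retrains immediately.
report = "payment gateway timeout during checkout"
print(clf.predict(report))
clf.add_feedback(report, "integration")
print(clf.predict(report))
```

Retraining from scratch on every correction keeps the sketch short. A production system would more plausibly batch corrections and retrain on a schedule, but the abstract does not say which strategy the authors use.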

