An Empirical Investigation to Overcome Class-Imbalance in Inspection Reviews

Maninder Singh, G. Walia, Anurag Goswami
Published in: 2017 International Conference on Machine Learning and Data Science (MLDS), 2017-12-01
DOI: 10.1109/MLDS.2017.15 (https://doi.org/10.1109/MLDS.2017.15)
Citations: 7

Abstract

Background: Software inspection produces reviews that report the presence of faults. The requirements author must manually read through the reviews and differentiate between true faults and false positives. Problem: Post-inspection decisions (fault or non-fault) are difficult and time-consuming. It is difficult to apply machine learning (ML) techniques directly to raw (unstructured) data because of the class-imbalance problem and possible fault slippage through misclassification of faults. Aim: This research aims to solve this problem using an ensemble approach and priority analysis, achieving significant accuracy in identifying true-fault and false-positive reviews without losing any listed fault. Method: We conducted an empirical experiment using two trained models (trained on reviews from the inspection domain vs. the movie domain) to address the class-imbalance problem. Our approach uses ensemble methods to develop a classification confidence for inspection reviews and assigns them to an appropriate priority class. Results: The model trained on movie reviews performed better than the model trained on inspection reviews and restricted any possible fault slippage.
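The idea of turning ensemble agreement into a classification confidence and then into a priority class can be illustrated with a minimal sketch. The abstract does not specify the classifiers, the confidence measure, or the priority thresholds, so everything below is an assumption for illustration only: the keyword-based `TOY_CLASSIFIERS`, the vote-fraction confidence, the 2/3 threshold, and the class names `high`, `medium`, and `low` are hypothetical and are not the authors' method. Every review receives a priority and none is discarded, which mirrors the stated goal of not losing any listed fault.

```python
from fractions import Fraction
from typing import Callable, List, Sequence, Tuple

# A classifier maps review text to True ("true fault") or False ("false positive").
Classifier = Callable[[str], bool]

# Hypothetical stand-ins for trained models; real ones would be learned from data.
TOY_CLASSIFIERS: List[Classifier] = [
    lambda text: "missing" in text.lower(),
    lambda text: "ambiguous" in text.lower(),
    lambda text: "undefined" in text.lower(),
]


def fault_confidence(review: str, classifiers: Sequence[Classifier]) -> Fraction:
    """Fraction of ensemble members that vote 'true fault' for this review."""
    votes = sum(1 for clf in classifiers if clf(review))
    return Fraction(votes, len(classifiers))


def priority_class(confidence: Fraction, high: Fraction = Fraction(2, 3)) -> str:
    """Map a confidence to a priority class (threshold is an assumed value)."""
    if confidence >= high:
        return "high"
    if confidence > 0:
        return "medium"
    return "low"


def triage(reviews: Sequence[str],
           classifiers: Sequence[Classifier]) -> List[Tuple[str, str]]:
    """Label every review with a priority; no review is dropped."""
    return [(r, priority_class(fault_confidence(r, classifiers))) for r in reviews]
```

With this design, a review on which the ensemble disagrees is ranked lower for manual reading but is not filtered out, so a misclassified fault can still be caught by the requirements author.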