Still Confusing for Bug-Component Triaging? Deep Feature Learning and Ensemble Setting to Rescue

Yanqi Su, Zheming Han, Zhipeng Gao, Zhenchang Xing, Qinghua Lu, Xiwei Xu
2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC), published 2023-05-01. DOI: 10.1109/ICPC58990.2023.00046 (https://doi.org/10.1109/ICPC58990.2023.00046)

Abstract

To speed up the bug-fixing process, it is essential to triage bugs into the right components as soon as possible. Given the large number of bugs filed every day, a reliable and effective bug-component triaging tool is needed to assist with this task. LR-BKG is the state-of-the-art toolkit for this purpose. However, its suboptimal performance in recommending the right component at the first position (low Top-1 accuracy) limits its use in practice. We thoroughly investigate the limitations of LR-BKG and find that the gap between its manual feature design and the characteristics of bug reports causes this suboptimal performance. To fill this gap, we propose an approach, DEEPTRIAG, which uses large-scale pre-trained models to automatically extract deep features from bug reports (including the bug summary and description). DEEPTRIAG transforms bug-component triaging into a multi-class classification task (CodeBERT-Classifier) and a generation task (CodeT5-Generator). We then ensemble their prediction results to further improve the performance of bug-component triaging. Extensive experimental results demonstrate the superior performance of DEEPTRIAG over LR-BKG on bug-component triaging. In particular, the overall Top-1 accuracy improves from 56.2% to 68.3% on the Mozilla dataset and from 51.3% to 64.1% on the Eclipse dataset, which verifies the effectiveness and generalizability of our approach in improving the practical usability of bug-component triaging.
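
The abstract does not say how the two models' predictions are ensembled, so the following is only a minimal sketch under stated assumptions: each model is assumed to emit a score per candidate component, the scores are combined with a weighted sum, and Top-k accuracy is the fraction of bug reports whose true component appears among the first k recommendations. The function names, the `weight` parameter, and the component names used in the usage note are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def ensemble_rank(classifier_scores, generator_scores, weight=0.5):
    """Rank components by a weighted sum of two score maps.

    Assumption: a weighted sum is used here purely for illustration;
    the paper's actual ensemble rule is not described in the abstract.
    """
    combined = defaultdict(float)
    for component, score in classifier_scores.items():
        combined[component] += weight * score
    for component, score in generator_scores.items():
        combined[component] += (1.0 - weight) * score
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(combined, key=lambda c: (-combined[c], c))


def top_k_accuracy(ranked_lists, true_components, k=1):
    """Fraction of bug reports whose true component is in the top k."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_lists, true_components)
        if truth in ranked[:k]
    )
    return hits / len(true_components)
```

With hypothetical scores such as `{"Networking": 0.5, "DOM": 0.3, "Layout": 0.2}` from the classifier and `{"DOM": 0.6, "Networking": 0.3, "Graphics": 0.1}` from the generator, `ensemble_rank` places `DOM` first even though the classifier alone preferred `Networking`, which is the kind of correction an ensemble is meant to provide for Top-1 recommendations.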