Still Confusing for Bug-Component Triaging? Deep Feature Learning and Ensemble Setting to Rescue
Authors: Yanqi Su, Zheming Han, Zhipeng Gao, Zhenchang Xing, Qinghua Lu, Xiwei Xu
Venue: 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)
Publication date: 2023-05-01
DOI: 10.1109/ICPC58990.2023.00046 (https://doi.org/10.1109/ICPC58990.2023.00046)
Citations: 0
Abstract
To speed up the bug-fixing process, it is essential to triage bugs into the right components as soon as possible. Given the large number of bugs filed every day, a reliable and effective bug-component triaging tool is needed to assist with this task. LR-BKG is the state-of-the-art toolkit for doing this. However, its suboptimal performance in recommending the right component at the first position (low Top-1 accuracy) limits its usage in practice. We thoroughly investigate the limitations of LR-BKG and find that the gap between its manual feature design and the characteristics of bug reports causes this suboptimal performance. To fill this gap, we propose an approach, DEEPTRIAG, which uses large-scale pre-trained models to automatically extract deep features from bug reports (including the bug summary and description). DEEPTRIAG transforms bug-component triaging into a multi-class classification task (CodeBERT-Classifier) and a generation task (CodeT5-Generator). We then ensemble their prediction results to further improve the performance of bug-component triaging. Extensive experimental results demonstrate the superior performance of DEEPTRIAG over LR-BKG on bug-component triaging. In particular, the overall Top-1 accuracy improves from 56.2% to 68.3% on the Mozilla dataset and from 51.3% to 64.1% on the Eclipse dataset, which verifies the effectiveness and generalizability of our approach in improving the practical usability of bug-component triaging.
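
The abstract does not say how the two models' predictions are combined, so the following is only a minimal sketch of one common option: a weighted sum of per-component scores, followed by a Top-k accuracy check. The function names (`ensemble_rank`, `top_k_accuracy`), the `weight` parameter, the component names, and the scores are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def ensemble_rank(classifier_scores, generator_scores, weight=0.5):
    """Rank components by a weighted sum of two score dictionaries.

    Assumption: each model yields a {component: score} mapping. The paper's
    actual ensemble rule may differ; this is an illustrative choice.
    """
    combined = defaultdict(float)
    for component, score in classifier_scores.items():
        combined[component] += weight * score
    for component, score in generator_scores.items():
        combined[component] += (1.0 - weight) * score
    # Highest combined score first; ties are broken alphabetically.
    return sorted(combined, key=lambda c: (-combined[c], c))


def top_k_accuracy(ranked_lists, gold_components, k=1):
    """Fraction of bug reports whose true component is in the first k ranks."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_lists, gold_components)
        if gold in ranked[:k]
    )
    return hits / len(gold_components)


# Hypothetical scores for a single bug report.
classifier = {"Networking": 0.5, "DOM": 0.25, "Layout": 0.25}
generator = {"DOM": 0.75, "Networking": 0.25}

ranking = ensemble_rank(classifier, generator)
print(ranking)
```

In this toy case the classifier alone would place "Networking" first, while the combined scores place "DOM" first, which is the kind of change an ensemble can make to the Top-1 recommendation.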