Deep learning or classical machine learning? An empirical study on line-level software defect prediction

IF 1.7 · Zone 4 (Computer Science) · Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Journal of Software-Evolution and Process Pub Date : 2024-06-02 DOI:10.1002/smr.2696

Yufei Zhou, Xutong Liu, Zhaoqiang Guo, Yuming Zhou, Corey Zhang, Junyan Qian

Volume 36, Issue 10

Citations: 0

Abstract

Background

Line-level software defect prediction (LL-SDP) serves as a valuable tool for developers to detect defective lines with minimal human effort. Recently, GLANCE was proposed as a readily implementable baseline for assessing the efficacy of newly proposed LL-SDP models.

Problem

While DeepLineDP, a cutting-edge LL-SDP model rooted in deep learning, has demonstrated state-of-the-art performance, it has not yet been compared against GLANCE.

Objective

We aim to empirically compare DeepLineDP with GLANCE to obtain a comprehensive understanding of how deep learning contributes to solving the LL-SDP challenge.

Method

We compare GLANCE against DeepLineDP to assess the extent to which DeepLineDP surpasses GLANCE in predicting defective files and identifying defective lines. To obtain a reliable conclusion, we use the same dataset and performance metrics utilized by DeepLineDP.

Result

Our experimental findings indicate that DeepLineDP does not outperform GLANCE in LL-SDP. This suggests that the application of deep learning, in this context, does not yield the anticipated significant improvements.

Conclusion

This finding underscores the need for further research in deep learning-based LL-SDP to attain the state-of-the-art performance that remains elusive for less advanced techniques.

Source journal

Journal of Software-Evolution and Process (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Self-citation rate

10.00%

Articles published

109