Refining software defect prediction through attentive neural models for code understanding

IF 3.7 2区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING
Mona Nashaat , James Miller
{"title":"Refining software defect prediction through attentive neural models for code understanding","authors":"Mona Nashaat ,&nbsp;James Miller","doi":"10.1016/j.jss.2024.112266","DOIUrl":null,"url":null,"abstract":"<div><div>Identifying defects through manual software testing is a resource-intensive task in software development. To alleviate this, software defect prediction identifies code segments likely to contain faults using data-driven methods. Traditional techniques rely on static code metrics, which often fail to reflect the deeper syntactic and semantic features of the code. This paper introduces a novel framework that utilizes transformer-based networks with attention mechanisms to predict software defects. The framework encodes input vectors to develop meaningful representations of software modules. A bidirectional transformer encoder is employed to model programming languages, followed by fine-tuning with labeled data to detect defects. The performance of the framework is assessed through experiments across various software projects and compared against baseline techniques. Additionally, statistical hypothesis testing and an ablation study are performed to assess the impact of different parameter choices. The empirical findings indicate that the proposed approach can increase classification accuracy by an average of 15.93% and improve the F1 score by up to 44.26% compared to traditional methods.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"220 ","pages":"Article 112266"},"PeriodicalIF":3.7000,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Systems and Software","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0164121224003108","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

Abstract

Identifying defects through manual software testing is a resource-intensive task in software development. To alleviate this, software defect prediction identifies code segments likely to contain faults using data-driven methods. Traditional techniques rely on static code metrics, which often fail to reflect the deeper syntactic and semantic features of the code. This paper introduces a novel framework that utilizes transformer-based networks with attention mechanisms to predict software defects. The framework encodes input vectors to develop meaningful representations of software modules. A bidirectional transformer encoder is employed to model programming languages, followed by fine-tuning with labeled data to detect defects. The performance of the framework is assessed through experiments across various software projects and compared against baseline techniques. Additionally, statistical hypothesis testing and an ablation study are performed to assess the impact of different parameter choices. The empirical findings indicate that the proposed approach can increase classification accuracy by an average of 15.93% and improve the F1 score by up to 44.26% compared to traditional methods.

Abstract Image

通过专注于代码理解的神经模型完善软件缺陷预测
在软件开发过程中,通过人工软件测试识别缺陷是一项资源密集型任务。为了缓解这一问题,软件缺陷预测使用数据驱动方法识别可能包含缺陷的代码段。传统技术依赖于静态代码度量,而静态代码度量往往无法反映代码更深层次的语法和语义特征。本文介绍了一种新颖的框架,它利用基于变压器的网络和注意机制来预测软件缺陷。该框架对输入向量进行编码,以开发软件模块的有意义表征。采用双向变压器编码器对编程语言进行建模,然后利用标注数据进行微调以检测缺陷。通过在各种软件项目中进行实验,对该框架的性能进行了评估,并与基准技术进行了比较。此外,还进行了统计假设检验和消融研究,以评估不同参数选择的影响。实证研究结果表明,与传统方法相比,所提出的方法可将分类准确率平均提高 15.93%,并将 F1 分数最多提高 44.26%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of Systems and Software
Journal of Systems and Software 工程技术-计算机:理论方法
CiteScore
8.60
自引率
5.70%
发文量
193
审稿时长
16 weeks
期刊介绍: The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to: • Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution • Agile, model-driven, service-oriented, open source and global software development • Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems • Human factors and management concerns of software development • Data management and big data issues of software systems • Metrics and evaluation, data mining of software development resources • Business and economic aspects of software development processes The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信