基于嵌入式静态度量的深度语义特征学习用于软件缺陷预测

2019 26th Asia-Pacific Software Engineering Conference (APSEC) Pub Date : 2019-12-01 DOI:10.1109/APSEC48747.2019.00041

Guisheng Fan, Xuyang Diao, Huiqun Yu, Kang Yang, Liqiong Chen

{"title":"基于嵌入式静态度量的深度语义特征学习用于软件缺陷预测","authors":"Guisheng Fan, Xuyang Diao, Huiqun Yu, Kang Yang, Liqiong Chen","doi":"10.1109/APSEC48747.2019.00041","DOIUrl":null,"url":null,"abstract":"Software defect prediction, which locates defective code snippets, can assist developers in finding potential bugs and assigning their testing efforts. Traditional defect prediction features are static code metrics, which only contain statistic information of programs and fail to capture semantics in programs, leading to the degradation of defect prediction performance. To take full advantage of the semantics and static metrics of programs, we propose a framework called Defect Prediction via Attention Mechanism (DP-AM) in this paper. Specifically, DPAM first extracts vectors which are then encoded as digital vectors by mapping and word embedding from abstract syntax trees (ASTs) of programs. Then it feeds these numerical vectors into Recurrent Neural Network to automatically learn semantic features of programs. After that, it applies self-attention mechanism to further build relationship among these features. Furthermore, it employs global attention mechanism to generate significant features among them. Finally, we combine these semantic features with traditional static metrics for accurate software defect prediction. We evaluate our method in terms of F1-measure on seven open-source Java projects in Apache. Our experimental results show that DP-AM improves F1-measure by 11% in average, compared with the state-of-the-art methods.","PeriodicalId":325642,"journal":{"name":"2019 26th Asia-Pacific Software Engineering Conference (APSEC)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"Deep Semantic Feature Learning with Embedded Static Metrics for Software Defect Prediction\",\"authors\":\"Guisheng Fan, Xuyang Diao, Huiqun Yu, Kang Yang, Liqiong Chen\",\"doi\":\"10.1109/APSEC48747.2019.00041\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Software defect prediction, which locates defective code snippets, can assist developers in finding potential bugs and assigning their testing efforts. Traditional defect prediction features are static code metrics, which only contain statistic information of programs and fail to capture semantics in programs, leading to the degradation of defect prediction performance. To take full advantage of the semantics and static metrics of programs, we propose a framework called Defect Prediction via Attention Mechanism (DP-AM) in this paper. Specifically, DPAM first extracts vectors which are then encoded as digital vectors by mapping and word embedding from abstract syntax trees (ASTs) of programs. Then it feeds these numerical vectors into Recurrent Neural Network to automatically learn semantic features of programs. After that, it applies self-attention mechanism to further build relationship among these features. Furthermore, it employs global attention mechanism to generate significant features among them. Finally, we combine these semantic features with traditional static metrics for accurate software defect prediction. We evaluate our method in terms of F1-measure on seven open-source Java projects in Apache. Our experimental results show that DP-AM improves F1-measure by 11% in average, compared with the state-of-the-art methods.\",\"PeriodicalId\":325642,\"journal\":{\"name\":\"2019 26th Asia-Pacific Software Engineering Conference (APSEC)\",\"volume\":\"79 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 26th Asia-Pacific Software Engineering Conference (APSEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/APSEC48747.2019.00041\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 26th Asia-Pacific Software Engineering Conference (APSEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSEC48747.2019.00041","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 17

摘要

软件缺陷预测，定位有缺陷的代码片段，可以帮助开发人员发现潜在的错误并分配他们的测试工作。传统的缺陷预测特征是静态的代码度量，仅包含程序的统计信息，无法捕捉程序中的语义，导致缺陷预测性能下降。为了充分利用程序的语义和静态度量，本文提出了一个基于注意机制的缺陷预测框架(DP-AM)。具体来说，DPAM首先从程序的抽象语法树(ast)中提取向量，然后通过映射和词嵌入将向量编码为数字向量。然后将这些数值向量输入到递归神经网络中，自动学习程序的语义特征。然后，应用自关注机制进一步建立这些特征之间的关系。此外，它还利用全局注意机制来生成其中的显著特征。最后，我们将这些语义特征与传统的静态度量相结合，以实现准确的软件缺陷预测。我们在Apache中的7个开源Java项目中根据f1度量来评估我们的方法。我们的实验结果表明，与最先进的方法相比，DP-AM平均提高了11%的f1测量。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Deep Semantic Feature Learning with Embedded Static Metrics for Software Defect Prediction

Software defect prediction, which locates defective code snippets, can assist developers in finding potential bugs and assigning their testing efforts. Traditional defect prediction features are static code metrics, which only contain statistic information of programs and fail to capture semantics in programs, leading to the degradation of defect prediction performance. To take full advantage of the semantics and static metrics of programs, we propose a framework called Defect Prediction via Attention Mechanism (DP-AM) in this paper. Specifically, DPAM first extracts vectors which are then encoded as digital vectors by mapping and word embedding from abstract syntax trees (ASTs) of programs. Then it feeds these numerical vectors into Recurrent Neural Network to automatically learn semantic features of programs. After that, it applies self-attention mechanism to further build relationship among these features. Furthermore, it employs global attention mechanism to generate significant features among them. Finally, we combine these semantic features with traditional static metrics for accurate software defect prediction. We evaluate our method in terms of F1-measure on seven open-source Java projects in Apache. Our experimental results show that DP-AM improves F1-measure by 11% in average, compared with the state-of-the-art methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 26th Asia-Pacific Software Engineering Conference (APSEC)

自引率

0.00%

发文量