Guisheng Fan, Xuyang Diao, Huiqun Yu, Kang Yang, Liqiong Chen
{"title":"Deep Semantic Feature Learning with Embedded Static Metrics for Software Defect Prediction","authors":"Guisheng Fan, Xuyang Diao, Huiqun Yu, Kang Yang, Liqiong Chen","doi":"10.1109/APSEC48747.2019.00041","DOIUrl":null,"url":null,"abstract":"Software defect prediction, which locates defective code snippets, can assist developers in finding potential bugs and assigning their testing efforts. Traditional defect prediction features are static code metrics, which only contain statistic information of programs and fail to capture semantics in programs, leading to the degradation of defect prediction performance. To take full advantage of the semantics and static metrics of programs, we propose a framework called Defect Prediction via Attention Mechanism (DP-AM) in this paper. Specifically, DPAM first extracts vectors which are then encoded as digital vectors by mapping and word embedding from abstract syntax trees (ASTs) of programs. Then it feeds these numerical vectors into Recurrent Neural Network to automatically learn semantic features of programs. After that, it applies self-attention mechanism to further build relationship among these features. Furthermore, it employs global attention mechanism to generate significant features among them. Finally, we combine these semantic features with traditional static metrics for accurate software defect prediction. We evaluate our method in terms of F1-measure on seven open-source Java projects in Apache. Our experimental results show that DP-AM improves F1-measure by 11% in average, compared with the state-of-the-art methods.","PeriodicalId":325642,"journal":{"name":"2019 26th Asia-Pacific Software Engineering Conference (APSEC)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 26th Asia-Pacific Software Engineering Conference (APSEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSEC48747.2019.00041","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 17
Abstract
Software defect prediction, which locates defective code snippets, can assist developers in finding potential bugs and assigning their testing efforts. Traditional defect prediction features are static code metrics, which only contain statistic information of programs and fail to capture semantics in programs, leading to the degradation of defect prediction performance. To take full advantage of the semantics and static metrics of programs, we propose a framework called Defect Prediction via Attention Mechanism (DP-AM) in this paper. Specifically, DPAM first extracts vectors which are then encoded as digital vectors by mapping and word embedding from abstract syntax trees (ASTs) of programs. Then it feeds these numerical vectors into Recurrent Neural Network to automatically learn semantic features of programs. After that, it applies self-attention mechanism to further build relationship among these features. Furthermore, it employs global attention mechanism to generate significant features among them. Finally, we combine these semantic features with traditional static metrics for accurate software defect prediction. We evaluate our method in terms of F1-measure on seven open-source Java projects in Apache. Our experimental results show that DP-AM improves F1-measure by 11% in average, compared with the state-of-the-art methods.