Semantic Feature Learning based on Double Sequences Structure for Software Defect Number Prediction
Tao Wang, Chuanqi Tao, Hongjing Guo, Lijin Tang
2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), 2022-12-01
DOI: 10.1109/QRS57517.2022.00026 (https://doi.org/10.1109/QRS57517.2022.00026)
Software defect prediction (SDP) identifies defective code areas, such as files, code blocks, and code lines, and helps developers and testers allocate test resources before the testing phase. Software defect number prediction (SDNP) is an important research direction within SDP. Previous studies mostly used regression-based methods or various neural networks to mine the semantic features contained in abstract syntax trees (ASTs), but their code representations were relatively simple. In this article, we propose a framework that represents semantic features as sequences of nodes with a double sequence structure, obtained by analyzing the ASTs and the changes in code blocks between adjacent versions. In addition, to incorporate statistical metric information, we propose a model that uses a gated fusion mechanism to dynamically determine the ratio of semantic features to traditional metric features during training, and we use it to perform SDNP. In the experimental part, we select 10 open-source Java projects as training and test sets and conduct extensive comparative experiments. The experimental results demonstrate the superiority of our proposed method over the baseline approach.
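The abstract does not give the exact form of the gated fusion mechanism, so the following is only a minimal sketch of one common formulation, not the authors' model. It assumes the semantic and metric feature vectors have already been projected to the same length, and that a learned gate in (0, 1) decides, per dimension, how much of each source to keep. The function name `gated_fusion` and the parameters `w_sem`, `w_met`, and `bias` are hypothetical names introduced for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(semantic, metric, w_sem, w_met, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed formulation (not taken from the paper):
        gate[i]  = sigmoid(w_sem[i] . semantic + w_met[i] . metric + bias[i])
        fused[i] = gate[i] * semantic[i] + (1 - gate[i]) * metric[i]

    w_sem and w_met are square weight matrices given as lists of rows;
    in a real model they would be learned during training.
    """
    if len(semantic) != len(metric):
        raise ValueError("semantic and metric vectors must have equal length")

    fused, gates = [], []
    for i in range(len(semantic)):
        # Pre-activation of the gate for dimension i.
        z = bias[i]
        z += sum(w * s for w, s in zip(w_sem[i], semantic))
        z += sum(w * m for w, m in zip(w_met[i], metric))
        g = sigmoid(z)
        gates.append(g)
        # Convex combination: g weights the semantic feature,
        # (1 - g) weights the traditional metric feature.
        fused.append(g * semantic[i] + (1.0 - g) * metric[i])
    return fused, gates


# With all weights and biases at zero the gate is 0.5 in every dimension,
# so the fused vector is the plain average of the two inputs.
zeros = [[0.0, 0.0], [0.0, 0.0]]
fused, gates = gated_fusion([1.0, 2.0], [3.0, 4.0], zeros, zeros, [0.0, 0.0])
print(fused, gates)
```

Because the gate is computed from both inputs, the mixing ratio can differ per sample and per dimension, which is one way to read the abstract's "dynamically determines the ratio" of semantic to metric features.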