Defect Prediction via LSTM Based on Sequence and Tree Structure
Xuan Zhou, Lu Lu
2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS), published 2020-12-01
DOI: https://doi.org/10.1109/QRS51102.2020.00055
With the ever-expanding spread of contemporary software, software defect prediction (SDP) is attracting increasing attention. However, the sequential networks used in previous studies weaken syntactic information and fail to capture long-distance dependencies. To solve these problems, we develop a long short-term memory network based on bidirectional and tree structure (LSTM-BT). Specifically, LSTM-BT combines bidirectional long short-term memory networks (BI-LSTM) and tree long short-term memory networks (Tree-LSTM) to capture semantic and syntactic features from source code. First, token vectors are extracted from the abstract syntax tree (AST). Second, an embedding layer is used to extract the semantic information hidden inside the AST nodes. Finally, the features are fed to LSTM-BT, which predicts defect-proneness. To validate our method, we carried out experiments on 8 pairs of Java open-source projects, and the results show that LSTM-BT outperforms several state-of-the-art defect prediction models.
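The first two steps above (extracting tokens from the AST and preparing them for an embedding layer) can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: Python's standard `ast` module stands in for a Java parser, every node's type name is used as its token (the paper may select only some node types), and the helper names `flatten_ast`, `build_vocab`, and `encode` are hypothetical. The sketch produces both views a model such as LSTM-BT would need: a flat token sequence for a bidirectional LSTM and a child-index list for a Tree-LSTM.

```python
import ast


def flatten_ast(tree):
    """Pre-order traversal of an AST.

    Returns (tokens, children): tokens[i] is the type name of node i, and
    children[i] lists the indices of node i's direct children. The token
    list is the sequence view; the children list is the tree view.
    """
    tokens, children = [], []

    def visit(node):
        idx = len(tokens)
        tokens.append(type(node).__name__)  # assumption: node type name as token
        children.append([])
        for child in ast.iter_child_nodes(node):
            children[idx].append(visit(child))
        return idx

    visit(tree)
    return tokens, children


def build_vocab(sequences):
    """Assign each distinct token an integer id starting at 1 (0 is reserved)."""
    vocab = {}
    for seq in sequences:
        for tok in seq:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def encode(seq, vocab, length):
    """Map tokens to ids, then truncate or zero-pad to a fixed length.

    Unknown tokens also map to 0 in this sketch, so they share the padding id.
    """
    ids = [vocab.get(tok, 0) for tok in seq][:length]
    return ids + [0] * (length - len(ids))


# Small demonstration on a toy function.
source = "def f(x):\n    return x + 1\n"
demo_tokens, demo_children = flatten_ast(ast.parse(source))
demo_vocab = build_vocab([demo_tokens])
demo_ids = encode(demo_tokens, demo_vocab, 16)
print(demo_tokens)
print(demo_children)
print(demo_ids)
```

The integer ids are what an embedding layer would look up to obtain dense vectors. Keeping the child-index list alongside the flat sequence means the syntactic structure is preserved rather than being discarded when the tree is linearized.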