IET Software: Latest Articles

Research and Application of Firewall Log and Intrusion Detection Log Data Visualization System
IF 1.5, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-08-13. DOI: 10.1049/2024/7060298
Ma Mingze
Abstract: This paper tackles current challenges in network security analysis by proposing an information gain-based feature selection algorithm and using visualization techniques to develop a network security log data visualization system. The system's key functions include raw data collection for firewall logs and intrusion detection logs, data preprocessing, database management, data manipulation, data logic processing, and data visualization. Through statistical analysis of log data and the construction of visualization models, the system presents analysis results in diverse graphical formats while offering interactive capabilities. The system integrates data generation, processing, analysis, and display. In experimental evaluations it reaches 98.3% accuracy, 92.1% precision, 97.5% recall, a 98.1% F1 score, and 91.2% real-time performance. The proposed method significantly enhances real-time prediction of network security status and the monitoring efficiency of network devices, providing a robust security assurance tool.
Citations: 0
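The abstract above names information gain as the basis for feature selection but gives no details. As a rough illustration only, the sketch below ranks discrete features by the standard information-gain formula (label entropy minus conditional entropy after splitting on the feature). The record fields `protocol` and `port_class` and the toy labels are hypothetical and are not taken from the paper; the paper's actual algorithm may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature name, gain) pairs sorted by gain, highest first."""
    gains = {name: information_gain([r[name] for r in rows], labels) for name in rows[0]}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy log records (assumed for illustration, not from the paper).
rows = [
    {"protocol": "tcp", "port_class": "high"},
    {"protocol": "tcp", "port_class": "low"},
    {"protocol": "udp", "port_class": "high"},
    {"protocol": "udp", "port_class": "low"},
]
labels = ["attack", "attack", "normal", "normal"]
ranking = rank_features(rows, labels)
```

In this toy data `protocol` separates the two classes perfectly while `port_class` carries no information, so a selector that keeps the top-ranked features would retain only `protocol`.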
Segmented Frequency-Domain Correlation Prediction Model for Long-Term Time Series Forecasting Using Transformer
IF 1.5, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-07-08. DOI: 10.1049/2024/2920167
Haozhuo Tong, Lingyun Kong, Jie Liu, Shiyan Gao, Yilu Xu, Yuezhe Chen
Abstract: Long-term time series forecasting has received significant attention from researchers in recent years, and Transformer-based approaches have emerged as promising solutions. Nevertheless, most existing methods rely on point-by-point self-attention mechanisms or employ transformations, decompositions, and reconstructions of the entire sequence to capture dependencies. Point-by-point self-attention becomes impractical for long-term time series forecasting because its complexity is quadratic in the length of the series. Decomposition and reconstruction methods may introduce information loss, leading to performance bottlenecks. In this paper, we propose a Transformer-based forecasting model called NPformer. Our method introduces a novel multiscale segmented Fourier attention mechanism. By segmenting the long-term time series and performing discrete Fourier transforms on different segments, we aim to identify frequency-domain correlations between these segments, which allows us to capture dependencies more effectively. In addition, we incorporate a normalization module and a desmoothing factor into the model. These components address the oversmoothing problem that arises in sequence decomposition methods. Furthermore, we introduce an isometry convolution method to enhance the prediction accuracy of the model. The experimental results demonstrate that NPformer outperforms other Transformer-based methods in long-term time series forecasting.
Citations: 0
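The abstract above describes segmenting a long series, applying a discrete Fourier transform to each segment, and looking for frequency-domain correlations between segments. The sketch below illustrates only that general idea, using a plain DFT and cosine similarity between magnitude spectra. It is not NPformer's attention mechanism; the function names and the choice of cosine similarity are assumptions made for the example.

```python
import cmath
import math


def dft_magnitudes(segment):
    """Magnitudes of the discrete Fourier transform of a real-valued segment."""
    n = len(segment)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(segment)))
        for k in range(n)
    ]


def split_segments(series, segment_len):
    """Split a series into consecutive, non-overlapping segments of equal length."""
    last_start = len(series) - segment_len
    return [series[i:i + segment_len] for i in range(0, last_start + 1, segment_len)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def segment_spectrum_similarity(series, segment_len):
    """Pairwise cosine similarity between the magnitude spectra of the segments."""
    spectra = [dft_magnitudes(s) for s in split_segments(series, segment_len)]
    return [[cosine_similarity(a, b) for b in spectra] for a in spectra]


# Toy series: two segments of the same slow sine wave, then one faster sine wave.
seg_len = 16
slow = [math.sin(2 * math.pi * t / seg_len) for t in range(seg_len)]
fast = [math.sin(2 * math.pi * 4 * t / seg_len) for t in range(seg_len)]
similarity = segment_spectrum_similarity(slow + slow + fast, seg_len)
```

The two slow segments have near-identical spectra, so their similarity is close to 1, while the slow and fast segments concentrate energy in different frequency bins and score close to 0. Working per segment also keeps each transform short, which is the motivation the abstract gives for avoiding whole-sequence operations.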
Accounting Management and Optimizing Production Based on Distributed Semantic Recognition
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-06-18. DOI: 10.1049/2024/8425877
Ruina Guo, Shu Wang, Guangsen Wei
Abstract: Accounting management and production optimization are core components of enterprise management. However, conventional methods that rely on manual input suffer from low recognition accuracy and excessive memory consumption. To address these challenges, semantic recognition technology based on voice signals has emerged as a solution across various industries. Building on this premise, this paper introduces a distributed semantic recognition-based algorithm for accounting management and production optimization. The proposed algorithm comprises multiple modules, including a front-end feature extraction module, a channel transmission module, and a voice quality vector quantization module. Additionally, a semantic recognition module processes the voice signals and generates prediction results. By learning from extensive accounting management and production data, the algorithm automatically uncovers patterns and regularities within the data and extracts valuable information. To validate the proposed algorithm, this study uses a dataset from the UCI machine learning repository. The experimental findings show that the algorithm outperforms alternative methods: it achieves a 9.3% improvement in comprehensive recognition accuracy and reduces memory usage by 34.4%. These results highlight the algorithm's efficacy in helping companies understand and analyze customer needs, market trends, competitors, and other pertinent commercial information.
Citations: 0
Modeling Chandy–Lamport Distributed Snapshot Algorithm Using Colored Petri Net
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-06-07. DOI: 10.1049/2024/6582682
Saeid Pashazadeh, Basheer Zuhair Jaafar Al-Basseer, Jafar Tanha
Abstract: Distributed global snapshot (DGS) is one of the fundamental protocols in distributed systems. It is used for applications such as collecting information from a distributed system and taking checkpoints for process rollback. The Chandy–Lamport protocol (CLP) is a well-known protocol for taking a DGS. Its main aim is to generate consistent cuts without interrupting the regular operation of the distributed system, and it inspired many later protocols. The first aim of this paper is to propose a novel formal hierarchical parametric colored Petri net model of CLP, in which the number of constituent processes is a parameter. The second aim is to automatically generate a message sequence chart (MSC) that shows the detailed steps of each simulation run of the snapshot protocol. The third aim is to model check the proposed formal model to verify the correctness of CLP and of the colored Petri net model. Such tools greatly help in testing the correct operation of newly proposed distributed snapshot protocols. The proposed model of CLP can easily be used to visually test the correct operation of DGS protocols under development, and it also permits formal verification of their correct operation. The model can serve as a simple, powerful, and visual tool for step-by-step runs of CLP, for model checking, and for teaching the protocol to postgraduate students. The same approach applies to similarly complicated distributed protocols.
Citations: 0
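For readers unfamiliar with the protocol discussed in the abstract above, the following is a minimal single-threaded simulation of the Chandy–Lamport marker rules over FIFO channels. It is a teaching sketch under stated assumptions, not the paper's colored Petri net model: the `Process` and `Network` classes, the bank-transfer workload, and the fixed delivery order are all invented for the example.

```python
from collections import deque

MARKER = "MARKER"  # control message separating pre- and post-snapshot traffic


class Process:
    def __init__(self, pid, balance, peers):
        self.pid = pid
        self.balance = balance
        self.peers = peers
        self.recorded_state = None   # local state captured by the snapshot
        self.channel_state = {}      # sender id -> messages recorded as in transit
        self.recording = set()       # incoming channels still being recorded


class Network:
    """Processes connected pairwise by FIFO channels keyed by (sender, receiver)."""

    def __init__(self, balances):
        ids = list(balances)
        self.procs = {p: Process(p, b, [q for q in ids if q != p])
                      for p, b in balances.items()}
        self.channels = {(p, q): deque() for p in ids for q in ids if p != q}

    def send(self, src, dst, amount):
        """Application message: move `amount` from src to dst."""
        self.procs[src].balance -= amount
        self.channels[(src, dst)].append(amount)

    def _record_and_send_markers(self, proc):
        proc.recorded_state = proc.balance
        proc.recording = set(proc.peers)
        proc.channel_state = {q: [] for q in proc.peers}
        for q in proc.peers:
            self.channels[(proc.pid, q)].append(MARKER)

    def start_snapshot(self, pid):
        self._record_and_send_markers(self.procs[pid])

    def deliver(self, src, dst):
        """Deliver the oldest message on channel src -> dst."""
        msg = self.channels[(src, dst)].popleft()
        proc = self.procs[dst]
        if msg == MARKER:
            if proc.recorded_state is None:   # first marker: record local state
                self._record_and_send_markers(proc)
            proc.recording.discard(src)       # stop recording this channel
        else:
            proc.balance += msg
            if src in proc.recording:         # arrived before src's marker: in transit
                proc.channel_state[src].append(msg)

    def run_until_quiet(self):
        """Deliver messages in a fixed channel order until every channel is empty."""
        while any(self.channels.values()):
            for (src, dst), queue in self.channels.items():
                if queue:
                    self.deliver(src, dst)

    def snapshot_total(self):
        recorded = sum(p.recorded_state for p in self.procs.values())
        in_transit = sum(sum(msgs) for p in self.procs.values()
                         for msgs in p.channel_state.values())
        return recorded + in_transit


net = Network({"A": 100, "B": 100, "C": 100})
net.send("A", "B", 10)
net.start_snapshot("A")
net.send("B", "A", 20)
net.send("C", "A", 5)
net.run_until_quiet()
```

The snapshot is consistent if the recorded local balances plus the recorded in-transit transfers add up to the money in the system (300 here), even though no process ever stopped sending. This conservation check is the kind of property a model checker would verify over all interleavings rather than the single run shown.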
Software Defect Prediction Using Deep Q-Learning Network-Based Feature Extraction
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-05-30. DOI: 10.1049/2024/3946655
Qinhe Zhang, Jiachen Zhang, Tie Feng, Jialang Xue, Xinxin Zhu, Ningyang Zhu, Zhiheng Li
Abstract: Machine learning-based software defect prediction (SDP) approaches have been commonly proposed to help deliver high-quality software. Unfortunately, previous research conducted without effective feature reduction suffers from high-dimensional data, leading to unsatisfactory prediction performance. Moreover, without proper feature reduction, the interpretability and generalization ability of machine learning models in SDP may be compromised, hindering their practical utility in diverse software development environments. In this paper, an SDP approach using deep Q-learning network (DQN)-based feature extraction is proposed to eliminate irrelevant, redundant, and noisy features and improve classification performance. In the data preprocessing phase, the BalanceCascade undersampling method is applied to divide the original datasets. As the first step of feature extraction, the weight ranking of all metric elements is calculated according to the expected cross-entropy. Then, the relation matrix is constructed by applying random matrix theory. After that, the reward principle for computing the Q value of Q-learning is defined based on the weight ranking, the relation matrix, and the number of errors. According to this principle, a convolutional neural network model is trained on the datasets until sequences of metric pairs are generated for all datasets, which act as the revised feature set. Experiments were conducted on 11 NASA and 11 PROMISE repository datasets. Sensitivity analysis experiments show that binary classification algorithms in SDP approaches that use the DQN-based feature extraction outperform those that do not. We also compared our approach with four state-of-the-art approaches on common datasets; the results show that our approach is superior in precision, F-measure, area under the receiver operating characteristic curve, and Matthews correlation coefficient.
Citations: 0
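The abstract above reports results in terms of precision, F-measure, and the Matthews correlation coefficient (MCC). For reference, the sketch below computes F-measure and MCC from a binary confusion matrix using their standard definitions. The counts in the example are hypothetical and do not come from the paper.

```python
import math


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1); 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient in [-1, 1]; 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts for a defect classifier.
tp, tn, fp, fn = 40, 45, 5, 10
example_f = f_measure(tp, fp, fn)
example_mcc = mcc(tp, tn, fp, fn)
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside F-measure on the class-imbalanced datasets typical of defect prediction.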
Balanced Adversarial Tight Matching for Cross-Project Defect Prediction
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-05-16. DOI: 10.1049/2024/1561351
Siyu Jiang, Jiapeng Zhang, Feng Guo, Teng Ouyang, Jing Li
Abstract: Cross-project defect prediction (CPDP) is an attractive research area in software testing. It identifies defects in projects with limited labeled data (target projects) by using predictive models from data-rich projects (source projects). Existing CPDP methods based on transfer learning mainly assume a unimodal distribution, that is, a feature distribution with one obvious peak. In practice, however, the feature distribution of project samples often exhibits multiple peaks that cannot be ignored. This multimodal distribution makes it challenging to align distributions between different projects. To address this issue, we propose a balanced adversarial tight-matching model for CPDP. Specifically, the method employs multilinear conditioning to obtain the cross-covariance of features and classifier predictions, capturing the multimodal distribution of the features. Reducing the captured multimodal distribution differences requires pseudo-labels, which are uncertain. We therefore add an auxiliary classifier and generate pseudo-labels using a pseudo-label strategy with less uncertainty. Finally, the feature generator and the two classifiers undergo adversarial training to align the multimodal distributions of different projects. This method outperforms the state-of-the-art CPDP model on the benchmark dataset.
Citations: 0
An Empirical Study on Downstream Dependency Package Groups in Software Packaging Ecosystems
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-04-30. DOI: 10.1049/2024/4488412
Qing Qi, Jian Cao
Abstract: Focal packages, the packages on which other packages depend, are crucial for the development of an entire packaging ecosystem. However, the evolution of dependency groups in packaging ecosystems has not been systematically investigated. In this study, we examine the downstream dependency package groups (DDGs) in three typical packaging ecosystems (Cargo for Rust, the Comprehensive Perl Archive Network for Perl, and RubyGems for Ruby) to identify their features and evolution. We also identify and analyze a special type of DDG, the collaborative downstream dependency package group (CDDG), which requires shared contributors. Our findings show that the overall development of DDGs, particularly CDDGs, is consistent with the status of the whole ecosystem, and that the sizes of DDGs and CDDGs follow a power law distribution. Furthermore, the interaction mechanisms between focal packages and downstream packages differ between ecosystems, but focal packages always play a leading role in the development of DDGs and CDDGs. Finally, we investigate models that predict the development of CDDGs in the next stage based on their features; our results show that random forest and Gradient Boosting Regression Tree achieve acceptable prediction accuracy. We provide the raw data and scripts used for our analysis at https://github.com/onion616/DDG.
Citations: 0
Exploiting DBSCAN and Combination Strategy to Prioritize the Test Suite in Regression Testing
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-04-04. DOI: 10.1049/2024/9942959
Zikang Zhang, Jinfu Chen, Yuechao Gu, Zhehao Li, Rexford Nii Ayitey Sosu
Abstract: Test case prioritization techniques improve the fault detection rate by adjusting the execution sequence of test cases. Existing static black-box test case prioritization methods generally improve the fault detection rate by increasing the early diversity of execution sequences based on string distance differences. However, such methods have a high time overhead and are less stable. This paper proposes a novel test case prioritization method (DC-TCP) based on density-based spatial clustering of applications with noise (DBSCAN) and a combination strategy. The combination strategy models the inputs to generate a mapping model, so that the test inputs are mapped to consistent types to improve generality. The DBSCAN method is then used to further refine the classification of test cases, and finally, the Firefly search strategy is introduced to improve the effectiveness of sequence merging. Extensive experimental results demonstrate that the proposed DC-TCP method outperforms other methods in terms of the average percentage of faults detected and is more time-efficient than several existing static black-box prioritization methods.
Citations: 0
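The abstract above evaluates prioritization by the average percentage of faults detected (APFD). As a reference point, the sketch below computes APFD with the commonly used formula APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the 1-based position of the first test that reveals fault i. The test and fault identifiers are hypothetical, and the function assumes every fault is revealed by at least one test in the ordering.

```python
def apfd(order, faults_detected_by):
    """APFD of a test ordering.

    order: list of test ids in execution order.
    faults_detected_by: dict mapping test id -> set of fault ids it reveals.
    """
    n = len(order)
    all_faults = set().union(*faults_detected_by.values())
    m = len(all_faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one fault")

    # Position (1-based) of the first test in the ordering that reveals each fault.
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in faults_detected_by.get(test, set()):
            first_position.setdefault(fault, position)

    if len(first_position) != m:
        raise ValueError("some faults are never revealed by the ordered tests")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Hypothetical fault matrix: which test reveals which fault.
faults = {"t1": {"f1"}, "t2": set(), "t3": {"f2", "f3"}, "t4": {"f1"}}
early = apfd(["t3", "t1", "t2", "t4"], faults)   # fault-revealing tests run first
late = apfd(["t2", "t4", "t1", "t3"], faults)    # fault-revealing tests run last
```

An ordering that runs the fault-revealing tests earlier yields a higher APFD, which is the quantity a prioritization method such as the one described tries to maximize.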
An Expository Examination of Temporally Evolving Graph-Based Approaches for the Visual Investigation of Autonomous Driving
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-03-20. DOI: 10.1049/2024/5802816
Li Wan, Wenzhi Cheng
Abstract: With the continuous advancement of autonomous driving technology, visual analysis techniques have emerged as a prominent research topic. The data generated by autonomous driving is large-scale and time-varying, yet existing visual analytics methods are insufficient to deal with such complex data effectively. Time-varying graphs can be used to model and visualize the dynamic relationships in various complex systems and can visually describe the data trends in autonomous driving systems. To this end, this paper introduces a time-varying graph-based method for visual analysis in autonomous driving. The proposed method employs a graph structure to represent the relative positional relationships between the target and obstacle interferences. By incorporating the time dimension, a time-varying graph model is constructed. The method explores how node characteristics in the graph change across time instances, establishing feature expressions that differentiate target and obstacle motion patterns. The analysis demonstrates that the feature vector centrality in the time-varying graph effectively captures the distinctions in motion patterns between targets and obstacles. These features can be used for accurate target and obstacle recognition. To evaluate the proposed method, a comparative study is conducted against a traditional visual analysis method (frame differencing) and an advanced one (visual lidar odometry and mapping). Robustness, accuracy, and resource consumption experiments are performed on the publicly available KITTI dataset to compare the three methods. The experimental results show that the proposed time-varying graph-based method exhibits superior accuracy and robustness. This study offers insights for the deep integration of intelligent networked vehicles and intelligent transportation, and provides a reference for advancing intelligent transportation systems and their integration with autonomous driving technologies.
Citations: 0
Cross-Project Defect Prediction Using Transfer Learning with Long Short-Term Memory Networks
IF 1.6, Zone 4, Computer Science
IET Software, Volume 2024, Issue 1. Pub Date: 2024-03-18. DOI: 10.1049/2024/5550801
Hongwei Tao, Lianyou Fu, Qiaoling Cao, Xiaoxu Niu, Haoran Chen, Songtao Shang, Yang Xian
Abstract: With the increasing number of software projects, within-project defect prediction (WPDP) can no longer meet demand, and cross-project defect prediction (CPDP) is playing an increasingly significant role in software engineering. Classic CPDP methods mainly apply metric features to predict defects. However, these approaches fail to consider rich semantic information, which usually captures the relationship between software defects and their context, and their performance is therefore often unsatisfactory. In this paper, a transfer long short-term memory (TLSTM) network model is proposed. Transfer semantic features are extracted by adding a transfer learning algorithm to the long short-term memory (LSTM) network. The traditional metric features and the semantic features are then combined for CPDP. First, abstract syntax trees (ASTs) are generated from the source code. Second, the AST node contents are converted into integer vectors as inputs to the TLSTM model, which extracts the semantic features of the program. In parallel, transferable metric features are extracted by transfer component analysis (TCA). Finally, the semantic features and metric features are combined and fed into a logistic regression (LR) classifier for training. According to results on several open-source projects from the PROMISE repository, the TLSTM model performs better on the F-measure than other machine learning and deep learning models. The TLSTM model built with a single feature type achieves 0.7% and 2.1% improvements on Log4j-1.2 and Xalan-2.7, respectively. The model trained with combined features is called transfer long short-term memory for defect prediction (DPTLSTM); it achieves 2.9% and 5% improvements on Synapse-1.2 and Xerces-1.4.4, respectively. Both results demonstrate the superiority of the proposed model on the CPDP task. This is because LSTM captures long-term dependencies in sequence data and extracts features that contain source code structure and context information. It can be concluded that (1) the TLSTM model has the advantage of preserving information, which better retains the semantic features related to software defects; and (2) compared with a CPDP model trained with traditional metric features alone, performance can be effectively enhanced by combining semantic features and metric features.
Citations: 0
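The abstract above describes converting AST node contents into integer vectors before feeding them to a sequence model. The sketch below shows one plausible version of that preprocessing step. It uses Python's standard `ast` module on Python snippets purely for convenience; the projects named in the abstract are not Python code, and the paper's own tokenization, vocabulary, and padding scheme are not specified here, so the helper names and the padding convention (0 for padding and unknown tokens) are assumptions.

```python
import ast


def ast_node_tokens(source):
    """Node-type names of the AST of `source`, in depth-first pre-order."""
    tokens = []

    def visit(node):
        tokens.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def build_vocabulary(token_sequences):
    """Map each distinct token to a positive integer id (0 is reserved)."""
    vocab = {}
    for tokens in token_sequences:
        for token in tokens:
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def encode(tokens, vocab, length):
    """Integer vector of fixed length: truncate or right-pad with 0.

    Tokens missing from the vocabulary are also mapped to 0 in this sketch.
    """
    ids = [vocab.get(token, 0) for token in tokens][:length]
    return ids + [0] * (length - len(ids))


# Hypothetical source files from a "project".
sources = [
    "x = 1",
    "def add(a, b):\n    return a + b",
]
token_sequences = [ast_node_tokens(s) for s in sources]
vocab = build_vocabulary(token_sequences)
vectors = [encode(tokens, vocab, length=12) for tokens in token_sequences]
```

The fixed-length integer vectors are the form an embedding layer followed by an LSTM would typically consume. Building the vocabulary from the source project and reusing it on the target project is one simple way to keep the two encodings compatible in a cross-project setting.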