Guest Editorial: Machine learning applied to quality and security in software systems
Authors: Honghao Gao, Walayat Hussain, Ramón J. Durán Barroso, Junaid Arshad, Yuyu Yin
Journal: IET Software, volume 17, issue 4, pages 345-347
Publication date: 2023-07-25
Publication type: Journal Article
DOI: 10.1049/sfw2.12141
URL: https://onlinelibrary.wiley.com/doi/10.1049/sfw2.12141
Citations: 0
Abstract
During the development of software systems, problems with quality and security occur even with advance planning. These defects can threaten program development and maintenance. Machine learning can be used to control and minimise them, and thereby to improve the quality and security of software systems. This special issue focuses on recent advances in architecture, algorithms, optimisation, and models for machine learning applied to quality and security in software systems. After a rigorous review based on relevance, originality, technical novelty, and presentation quality, we selected four manuscripts. The accepted papers are summarised below.
In the first paper, “Robust Malware Identification via Deep Temporal Convolutional Network with Symmetric Cross Entropy Learning” by Sun et al., the authors propose a robust malware identification method based on a temporal convolutional network (TCN). Word embedding techniques are commonly used to capture the contextual relationship between input operation codes (opcodes) and application programming interface (API) function names. Here, given the many unlabelled samples in practical intelligent environments, the authors pre-train the TCN model on an unlabelled set using a word embedding method, namely word2vec. In the experiments, the proposed method is compared with several traditional statistical methods and more recent neural networks on a synthetic malware dataset and a real-world dataset. The comparisons show that the proposed method performs better and is more robust to noise, reaching the best identification accuracy of 98.75% in real-world scenarios.
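The editorial does not spell out the loss used by Sun et al. As a rough illustration of what "symmetric cross entropy" usually means in the noisy-label literature, the sketch below combines ordinary cross entropy with a reverse cross entropy term. The function name, the weights `alpha` and `beta`, and the `log_zero` constant are assumptions made for this example, not values taken from the paper.

```python
import math


def symmetric_cross_entropy(probs, label, alpha=1.0, beta=1.0, log_zero=-4.0):
    """Illustrative symmetric cross entropy for one sample.

    probs    -- predicted class probabilities (should sum to 1)
    label    -- index of the given (possibly noisy) label
    alpha    -- weight of the ordinary cross entropy term (assumed value)
    beta     -- weight of the reverse cross entropy term (assumed value)
    log_zero -- finite stand-in for log(0) on the one-hot label (assumed value)
    """
    eps = 1e-7  # guards against log(0) on the prediction side
    # Ordinary cross entropy: -log of the probability given to the label.
    ce = -math.log(max(probs[label], eps))
    # Reverse cross entropy: -sum_k p_k * log(q_k), with q the one-hot label.
    # log(q_k) is 0 for the labelled class and log_zero for every other class.
    rce = -sum(p * log_zero for k, p in enumerate(probs) if k != label)
    return alpha * ce + beta * rce


loss_agree = symmetric_cross_entropy([0.7, 0.2, 0.1], label=0)
loss_disagree = symmetric_cross_entropy([0.7, 0.2, 0.1], label=2)
```

The reverse term is bounded by `-log_zero`, which is one common argument for why this kind of loss tolerates mislabelled samples better than plain cross entropy.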
In the second paper, “Just-In-Time Defect Prediction Enhanced by the Joint Method of Line Label Fusion and File Filtering” by Zhang et al., the authors propose JIT-FF, a Just-in-Time defect prediction model enhanced by the joint method of line label Fusion and file Filtering. First, to distinguish added and removed lines while preserving the original software change information, the authors represent the code changes as original, added, and removed code according to line labels. Second, to obtain a semantics-enhanced code representation, the authors propose a cross-attention-based line label fusion method that performs complementary feature enhancement. Third, to generate code changes containing fewer defect-irrelevant files, the authors formalise file filtering as a sequential decision problem and propose a reinforcement learning-based file filtering method. Finally, based on the generated code changes, CodeBERT-based commit representation and multi-layer perceptron-based defect prediction are performed to identify defective software changes. The experiments demonstrate that JIT-FF predicts defective software changes more effectively.
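The first step, splitting a change by line label, can be pictured with a small sketch. It assumes unified-diff input and assumes that the "original" view is the full change with markers stripped. Both are guesses for illustration, and the function name `split_by_line_label` is hypothetical rather than taken from JIT-FF.

```python
def split_by_line_label(diff_lines):
    """Split unified-diff lines into three views keyed by line label.

    "original" keeps every changed and context line in order (assumption),
    "added" keeps lines marked '+', and "removed" keeps lines marked '-'.
    """
    views = {"original": [], "added": [], "removed": []}
    for line in diff_lines:
        # Skip file headers and hunk markers. This simple check would also
        # skip a removed line whose content itself starts with "--".
        if line.startswith(("+++", "---", "@@")):
            continue
        marker, text = line[:1], line[1:]
        views["original"].append(text)
        if marker == "+":
            views["added"].append(text)
        elif marker == "-":
            views["removed"].append(text)
    return views


diff = [
    "--- a/f.py",
    "+++ b/f.py",
    "@@ -1,2 +1,2 @@",
    " def f(x):",
    "-    return x",
    "+    return x + 1",
]
views = split_by_line_label(diff)
```

In a pipeline like the one described, each of the three views would then be encoded separately before any fusion step; that part is not sketched here.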
In the third paper, “Android Malware Detection via Efficient API Call Sequences Extraction and Machine Learning Classifiers” by Wang et al., the authors propose a novel Android malware detection framework, contributing an efficient API call sequence extraction algorithm and an investigation of different types of classifiers. For API call sequence extraction, the authors propose an algorithm that transforms the function call graph from a multigraph into a directed simple graph, which avoids unnecessary repetitive path searching. The authors also propose a pruning search, which further reduces the number of paths to be searched. The resulting algorithm greatly reduces the time complexity. The authors then use the transition matrix as classification features and investigate three types of machine learning classifiers for the malware detection task. The experiments are performed on real-world APKs, and the results demonstrate that the proposed method reduces the running time and achieves high detection accuracy.
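Two of the ideas in this summary can be sketched in a few lines: collapsing parallel call edges so the multigraph becomes a directed simple graph, and turning API call sequences into a transition matrix. The sketch assumes the matrix holds row-normalised first-order transition frequencies between consecutive calls; the editorial does not define it, and the function names and sample API names are hypothetical. The pruning search is not shown.

```python
def to_simple_digraph(call_edges):
    """Collapse repeated (caller, callee) pairs into a single directed edge."""
    graph = {}
    for caller, callee in call_edges:
        graph.setdefault(caller, set()).add(callee)
        graph.setdefault(callee, set())  # make sure leaf nodes appear too
    return graph


def transition_matrix(sequences, apis):
    """Row-normalised counts of consecutive API pairs (assumed definition)."""
    index = {api: i for i, api in enumerate(apis)}
    n = len(apis)
    counts = [[0] * n for _ in range(n)]
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[index[src]][index[dst]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no outgoing transitions stay all zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


edges = [("main", "open"), ("main", "open"), ("main", "send"), ("open", "send")]
graph = to_simple_digraph(edges)

apis = ["open", "read", "send"]
matrix = transition_matrix([["open", "read", "send"], ["open", "send"]], apis)
```

Flattening `matrix` row by row gives a fixed-length feature vector per app, which is one plausible way to feed it to a conventional classifier.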
In the fourth paper, “Selecting Reliable Blockchain Peers via Hybrid Blockchain Reliability Prediction” by Zheng et al., the authors propose H-BRP, a Hybrid Blockchain Reliability Prediction model that extracts blockchain reliability factors and then makes a personalised prediction for each user. Connecting to unreliable blockchain peers can waste resources and can even cause loss of cryptocurrency through repeated transactions. The proposed model primarily aims to select reliable blockchain peers by evaluating and predicting their reliability. Comprehensive experiments conducted on 100 blockchain requesters and 200 blockchain peers demonstrate the effectiveness of the proposed H-BRP model. Furthermore, the implementation and a dataset of 2,000,000 test cases have been released.
The Guest Editors would like to express their deep gratitude to all the authors who submitted their valuable contributions, and to the numerous and highly qualified anonymous reviewers. We think that the selected contributions, which represent the current state of the art in the field, will be of great interest to the community. We would also like to thank the IET Software publication staff members for their continuous support and dedication. We particularly appreciate the unfailing support and encouragement given to us by Prof. Hana Chockler, the Editor-in-Chief of IET Software.
About the journal:
IET Software publishes papers on all aspects of the software lifecycle, including design, development, implementation and maintenance. The focus of the journal is on the methods used to develop and maintain software, and their practical application.
Authors are especially encouraged to submit papers on the following topics, although papers on all aspects of software engineering are welcome:
Software and systems requirements engineering
Formal methods, design methods, practice and experience
Software architecture, aspect and object orientation, reuse and re-engineering
Testing, verification and validation techniques
Software dependability and measurement
Human systems engineering and human-computer interaction
Knowledge engineering; expert and knowledge-based systems, intelligent agents
Information systems engineering
Application of software engineering in industry and commerce
Software engineering technology transfer
Management of software development
Theoretical aspects of software development
Machine learning
Big data and big code
Cloud computing
Current Special Issue. Call for papers:
Knowledge Discovery for Software Development - https://digital-library.theiet.org/files/IET_SEN_CFP_KDSD.pdf
Big Data Analytics for Sustainable Software Development - https://digital-library.theiet.org/files/IET_SEN_CFP_BDASSD.pdf