Convolutional neural networks on assembly code for predicting software defects
Anh Viet Phan, Minh le Nguyen
2017 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES), 2017-11-01
DOI: 10.1109/IESYS.2017.8233558 (https://doi.org/10.1109/IESYS.2017.8233558)
Citations: 28
Abstract
Software defect prediction is one of the most attractive research topics in software engineering. The task is to predict whether a program contains semantic bugs. Previous studies apply conventional machine learning techniques to software metrics, or deep learning to tree representations of source code called abstract syntax trees. This paper formulates an approach to software defect prediction in which source code is first compiled into assembly code, and a multi-view convolutional neural network is then applied to automatically learn defect features from the assembly instruction sequences. Experimental results on four real-world datasets indicate that exploiting assembly code is beneficial for detecting semantic bugs. Our approach significantly outperforms baselines based on software metrics and abstract syntax trees.
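
The preprocessing step implied by the abstract, turning assembly text into instruction sequences that a convolutional network can consume, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the abstract does not say which compiler, assembly syntax, tokenization, or sequence length the authors use. The snippet `ASM` is a hand-written x86-64 fragment of the kind produced by a compiler's assembly output (for example `gcc -S`), and `instruction_sequence`, `build_vocab`, and `encode` are hypothetical helper names. The network itself is not shown.

```python
# Hypothetical assembly fragment (AT&T syntax), standing in for compiler output.
ASM = """
main:
    pushq %rbp
    movq %rsp, %rbp
    movl $0, -4(%rbp)   # i = 0
    addl $1, -4(%rbp)
    popq %rbp
    ret
"""


def instruction_sequence(asm_text):
    """Return opcode mnemonics in order, skipping labels, directives, and comments."""
    ops = []
    for line in asm_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.endswith(":") or line.startswith("."):
            continue  # blank line, label, or assembler directive
        ops.append(line.split()[0])  # first token is the mnemonic
    return ops


def build_vocab(sequences):
    """Map each distinct mnemonic to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(seq, vocab, length):
    """Convert a mnemonic sequence to a fixed-length list of ids (truncate or pad)."""
    ids = [vocab[tok] for tok in seq][:length]
    return ids + [0] * (length - len(ids))


ops = instruction_sequence(ASM)
vocab = build_vocab([ops])
ids = encode(ops, vocab, length=8)
```

The fixed-length id sequences would then typically be passed through an embedding layer before any convolution. This sketch keeps only the opcode and discards operands, which is a simplification and not necessarily what the paper does.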