Authors: Neil Balram, G. Hsieh, Christian McFall
Published in: 2019 International Conference on Computational Science and Computational Intelligence (CSCI), 2019-12-01
DOI: https://doi.org/10.1109/CSCI49370.2019.00022
Static Malware Analysis Using Machine Learning Algorithms on APT1 Dataset with String and PE Header Features
Static malware analysis examines executable files without running their code to determine whether a file is malicious. Data analytics and machine learning techniques are increasingly used to help process the large number of malware files circulating in the wild and to detect new attacks. In this paper, we present the design and implementation of six different machine learning classifiers and two distinct categories of features statically extracted from the executables: strings and Portable Executable (PE) header information. A total of twelve malware detectors were implemented, pairing each of the six classifiers with each of the two feature categories separately. The classifiers and feature extraction algorithms were implemented in Python using the scikit-learn machine learning library. The detection accuracy and required processing time of the twelve malware detectors were compared and analyzed.
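To make the two feature categories concrete, the following is a minimal sketch of static extraction using only the Python standard library. It is not the authors' implementation: the function names (`extract_strings`, `parse_coff_header`, `build_demo_image`), the minimum string length of 4, and the synthetic demo buffer are all assumptions made for illustration. The sketch stops at feature extraction and does not cover the classification step, which the paper performs with scikit-learn.

```python
import re
import struct
from typing import Dict, List

# Field names of the COFF file header that follows the "PE\0\0" signature.
COFF_FIELDS = (
    "Machine",
    "NumberOfSections",
    "TimeDateStamp",
    "PointerToSymbolTable",
    "NumberOfSymbols",
    "SizeOfOptionalHeader",
    "Characteristics",
)


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len long (assumed threshold)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def parse_coff_header(data: bytes) -> Dict[str, int]:
    """Read the COFF file header fields of a PE image without executing it."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    values = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return dict(zip(COFF_FIELDS, values))


def build_demo_image() -> bytes:
    """Build a tiny synthetic buffer (hypothetical, not a real executable)."""
    dos_header = bytearray(0x40)
    dos_header[:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x014C, 2, 0x5DE30000, 0, 0, 0xE0, 0x0102)
    tail = b"\x00\x01kernel32.dll\x00ab\x00CreateFileA\x00"
    return bytes(dos_header) + b"PE\x00\x00" + coff + tail


if __name__ == "__main__":
    image = build_demo_image()
    print(extract_strings(image))
    print(parse_coff_header(image))
```

In a full detector, the extracted strings and header fields would still need to be converted into numeric feature vectors before being passed to a classifier. That encoding step is left out here because the abstract does not describe it.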