Malware API Sequence Detection Model based on Pre-trained BERT in Professional domain
Rongheng Xu, Jilin Zhang, Li Zhou
2022 9th International Conference on Dependable Systems and Their Applications (DSA), published 2022-08-01
DOI: https://doi.org/10.1109/DSA56465.2022.00162
As the Internet has developed, information security has become increasingly important. In malware detection, increasingly severe code deformation and obfuscation pose great challenges to traditional detection methods. Traditional methods such as feature databases struggle to detect viruses that have not yet been entered into the database, and they require costly expert effort. With the development of artificial intelligence, machine learning and deep learning methods are now widely applied to tasks in the computer field. In dynamic detection, the API call sequences generated by malicious software are widely used as features for software classification, because these sequences represent the software's behavior. However, traditional methods cannot capture the global relationships within API sequences. We use the Transformer-based BERT model to learn these global relationships and add a Windows API corpus to the pre-training model.
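To make the idea of pre-training on API call sequences more concrete, the following is a minimal sketch of how such sequences might be prepared for BERT-style masked-token pre-training. It is not the authors' pipeline: the vocabulary scheme (one token per API name), the helper names `build_vocab`, `encode` and `mask_for_mlm`, the example traces, and the simplified masking rule (always substituting `[MASK]`, without the random-replacement variants BERT normally uses) are all assumptions made for illustration.

```python
import random

SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]


def build_vocab(sequences):
    """Map every special token and every API name seen in the corpus to an integer id."""
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for seq in sequences:
        for api in seq:
            vocab.setdefault(api, len(vocab))
    return vocab


def encode(seq, vocab, max_len):
    """Wrap a sequence in [CLS] ... [SEP], truncate it, and pad it to max_len ids."""
    tokens = ["[CLS]"] + list(seq)[: max_len - 2] + ["[SEP]"]
    ids = [vocab.get(tok, vocab["[UNK]"]) for tok in tokens]
    return ids + [vocab["[PAD]"]] * (max_len - len(ids))


def mask_for_mlm(ids, vocab, rng, mask_prob=0.15):
    """Replace a random subset of API tokens with [MASK]; labels keep the original ids.

    Positions that are not masked get label -100, a common "ignore this position" value.
    Simplification (assumption): masked tokens are always replaced by [MASK].
    """
    special_ids = {vocab[tok] for tok in SPECIAL_TOKENS}
    masked, labels = list(ids), [-100] * len(ids)
    for pos, tok_id in enumerate(ids):
        if tok_id not in special_ids and rng.random() < mask_prob:
            masked[pos] = vocab["[MASK]"]
            labels[pos] = tok_id
    return masked, labels


# Hypothetical traces: the API names are real Windows functions, the sequences are made up.
corpus = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"],
]
vocab = build_vocab(corpus)
input_ids = encode(corpus[0], vocab, max_len=8)
masked_ids, labels = mask_for_mlm(input_ids, vocab, random.Random(0), mask_prob=0.5)
```

A model trained to recover the masked API names must use context from both sides of each gap, which is one way a Transformer encoder can pick up sequence-wide relationships between calls. The resulting encoder could then be fine-tuned with a classification head on labeled traces, although the abstract does not specify that step.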