Deceiving Deep Neural Networks-Based Binary Code Matching with Adversarial Programs

W. Wong, Huaijin Wang, Pingchuan Ma, Shuai Wang, Mingyue Jiang, T. Chen, Qiyi Tang, Sen Nie, Shi Wu

2022 IEEE International Conference on Software Maintenance and Evolution (ICSME), October 2022. DOI: 10.1109/ICSME55016.2022.00019
Deep neural networks (DNNs) have achieved major success in solving challenging tasks such as social network analysis and image classification. Despite the rapid development of DNNs, recent research has demonstrated the feasibility of exploiting them with adversarial examples, in which a small distortion added to the input data largely misleads the DNN's predictions. Determining the similarity of two binary code snippets is the foundation for many reverse engineering, re-engineering, and security applications. Currently, the majority of binary code matching tools are based on DNNs, whose dependability has not been thoroughly studied. In this research, we present an attack that perturbs software in executable format to deceive DNN-based binary code matching. Unlike prior attacks, which mostly change non-functional code components to generate adversarial programs, our approach designs several semantics-preserving transformations that act directly on the control flow graph of the binary code, making it particularly effective at deceiving DNNs. To speed up the process, we design a framework that leverages gradient-based or hill-climbing-based optimizations to generate adversarial examples in both white-box and black-box settings. We evaluated our attack against two popular DNN-based binary code matching tools, asm2vec and ncc, and achieved reasonably high success rates. Our attack on an industrial-strength DNN-based binary code matching service, BinaryAI, shows that the proposed attack can fool remote APIs in challenging black-box settings with an average success rate of over 16.2%. Furthermore, we show that the generated adversarial programs can be used to augment the robustness of two white-box models, asm2vec and ncc, reducing attack success rates by 17.3% and 6.8%, respectively, while preserving stable, if not better, standard accuracy.
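The attack the abstract describes can be pictured as a search over semantics-preserving rewrites of a program's control flow graph (CFG). Below is a minimal Python sketch of the black-box hill-climbing variant; the toy CFG representation, the two transformations, and the `similarity` oracle are illustrative assumptions for this sketch, not the paper's actual data structures or API.

```python
import copy
import random

# A toy control flow graph: block label -> (instructions, successor labels).
# This representation is an assumption made for illustration only.
CFG = dict[str, tuple[list[str], list[str]]]


def split_block(cfg: CFG) -> CFG:
    """Split a random basic block in two, linked by a fall-through edge.

    The executed instruction sequence is unchanged (semantics preserved),
    but the graph structure the DNN encodes is different.
    """
    cfg = copy.deepcopy(cfg)
    label = random.choice(list(cfg))
    insns, succs = cfg[label]
    if len(insns) < 2:
        return cfg  # nothing to split
    cut = random.randrange(1, len(insns))
    tail = f"{label}.split{random.randrange(10**6)}"
    cfg[label] = (insns[:cut], [tail])
    cfg[tail] = (insns[cut:], succs)
    return cfg


def add_opaque_branch(cfg: CFG) -> CFG:
    """Add a never-taken conditional branch to a fresh dead block.

    `cmp x, x` followed by `jne` never jumps, so run-time behavior is
    unchanged, yet the static CFG gains a node and an edge.
    """
    cfg = copy.deepcopy(cfg)
    label = random.choice(list(cfg))
    insns, succs = cfg[label]
    dead = f"dead.{random.randrange(10**6)}"
    cfg[dead] = (["nop"], list(succs))  # unreachable at run time
    cfg[label] = (insns + ["cmp x, x", f"jne {dead}"], succs + [dead])
    return cfg


TRANSFORMS = [split_block, add_opaque_branch]


def hill_climb(cfg: CFG, similarity, budget: int = 200):
    """Greedy black-box search in the spirit of the paper's attack.

    `similarity(cfg)` is a stand-in for a query to the DNN-based matcher
    (e.g., a remote API scoring the program against the target binary);
    a transformation is kept only if the matcher's score drops.
    """
    best, best_score = cfg, similarity(cfg)
    for _ in range(budget):
        candidate = random.choice(TRANSFORMS)(best)
        score = similarity(candidate)
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score
```

A white-box variant would presumably rank or guide candidate transformations using gradients of the similarity score rather than sampling them at random, which matches the abstract's distinction between its gradient-based and hill-climbing-based modes.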