BinCodex: A comprehensive and multi-level dataset for evaluating binary code similarity detection techniques

BenchCouncil Transactions on Benchmarks, Standards and Evaluations. Pub Date: 2024-06-01. DOI: 10.1016/j.tbench.2024.100163

Peihua Zhang, Chenggang Wu, Zhe Wang


Citations: 0


BinCodex: A comprehensive and multi-level dataset for evaluating binary code similarity detection techniques

Binary code similarity detection (BCSD) quantitatively measures the differences between two given binaries and reports matches at a predefined granularity (e.g., function). It is widely used in software vulnerability search, security patch analysis, malware detection, and code clone detection. With the help of deep learning, BCSD techniques have achieved high accuracy in their own evaluations. However, because no standard dataset exists, these high accuracy figures are hard to tell apart and do not reveal the techniques' actual abilities. Moreover, since binary code can be easily changed, it is essential to gain a holistic understanding of the underlying transformations, including default optimization options, non-default optimization options, and commonly used code obfuscations, and to assess their impact on the accuracy and adaptability of BCSD techniques. This paper presents our observations on the diversity of BCSD datasets and proposes a comprehensive dataset for BCSD. We evaluate various BCSD works and present detailed results, applying different classifications to different types of BCSD tasks, including pure function pairing and vulnerable code detection. Our results show that most BCSD works cope well with default compiler options but perform unsatisfactorily on non-default compiler options and code obfuscation. We take a layered perspective on the BCSD task and point to opportunities for future optimization of the techniques we consider.
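To make the idea of function-level similarity scoring concrete, the following is a minimal sketch of a toy baseline: it compares functions by the cosine similarity of their opcode histograms. This is not a method from the paper or from any evaluated BCSD work. The function names, the instruction listings, and the choice of opcode histograms as features are all assumptions made purely for illustration.

```python
from collections import Counter
from math import sqrt


def opcode_histogram(instructions):
    """Count opcode mnemonics, taking the first token of each instruction."""
    return Counter(ins.split()[0] for ins in instructions if ins.strip())


def cosine_similarity(a, b):
    """Cosine similarity between two opcode histograms (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, candidates):
    """Return the best-scoring candidate name and all scores for a query."""
    q = opcode_histogram(query)
    scores = {
        name: cosine_similarity(q, opcode_histogram(body))
        for name, body in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical listings: the same two-argument add routine as it might
# look without optimization, versus an optimized build and an unrelated loop.
add_unoptimized = [
    "push rbp", "mov rbp, rsp", "mov eax, edi",
    "add eax, esi", "pop rbp", "ret",
]
candidates = {
    "add_optimized": ["lea eax, [rdi+rsi]", "ret"],
    "unrelated_loop": [
        "xor eax, eax", "cmp byte [rdi+rax], 0", "je done",
        "inc rax", "jmp loop", "ret",
    ],
}

name, scores = best_match(add_unoptimized, candidates)
print(name, scores)
```

In this toy setting the optimized build still ranks first, but its score is low because optimization rewrote most of the instructions. That sensitivity to compiler settings is the kind of effect a dataset covering default options, non-default options, and obfuscation is meant to expose.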

