QueryX: Symbolic Query on Decompiled Code for Finding Bugs in COTS Binaries

HyungSeok Han, JeongOh Kyea, Yonghwi Jin, Jinoh Kang, Brian Pak, Insu Yun
Published in: 2023 IEEE Symposium on Security and Privacy (SP), 2023-05-01
DOI: 10.1109/SP46215.2023.10179314 (https://doi.org/10.1109/SP46215.2023.10179314)
Citations: 1

Abstract

Extensible static checking tools, such as Sys and CodeQL, have successfully discovered bugs in source code. These tools allow analysts to write application-specific rules, referred to as queries. Queries let analysts apply their domain knowledge, making the analysis more accurate and scalable. However, most of these tools cannot be applied to binary-only analysis. One exception, joern, translates binary code into decompiled code and feeds it into an ordinary C code analyzer. This approach is not precise enough for symbolic analysis, because it overlooks the unique characteristics of decompiled code. Binary analysis platforms such as angr do support symbolic analysis, but analysts must understand their intermediate representations (IRs) even though they mostly work with decompiled code.

In this paper, we propose a precise and scalable symbolic analysis, called fearless symbolic analysis, that uses intuitive queries for binary code, and we implement it in QueryX. To make queries intuitive, QueryX lets analysts write them on top of decompiled code instead of IRs. In particular, QueryX supports callbacks on decompiled code, which analysts can use to control symbolic analysis and discover bugs. For precision, we lift decompiled code into our IR, named DNR, and perform symbolic analysis on DNR while accounting for the characteristics of decompiled code. Notably, DNR is used only internally, so analysts can write queries without working with DNR directly. For scalability, QueryX automatically reduces control-flow graphs using the callbacks, and the ordering dependencies between callbacks, that are specified in the queries. We applied QueryX to the Windows kernel, the Windows system service, and an automotive binary. As a result, we found 15 unique bugs, including 10 CVEs, and earned $180,000 from the Microsoft bug bounty program.
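
The idea of reducing a control-flow graph with callbacks and their ordering dependencies can be illustrated with a small sketch. The abstract does not describe QueryX's actual algorithm or API, so everything below is an assumption made for illustration: the graph is a plain adjacency dictionary, `first_sites` and `second_sites` are hypothetical sets of nodes where two callbacks would fire, and the single ordering dependency is "the first callback must fire before the second". Under those assumptions, one plausible reduction keeps only the nodes that lie on some path from the entry, through a first site, to a second site.

```python
from collections import deque


def reachable(graph, starts):
    """Return every node reachable from any node in `starts` (inclusive)."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def reverse(graph):
    """Build the graph with every edge flipped."""
    rev = {}
    for node, succs in graph.items():
        rev.setdefault(node, [])
        for succ in succs:
            rev.setdefault(succ, []).append(node)
    return rev


def reduce_cfg(graph, entry, first_sites, second_sites):
    """Keep nodes on some path entry -> first site -> second site.

    Hypothetical helper, not QueryX's API. Edges between kept nodes are
    retained, so the result may slightly over-approximate the set of paths.
    """
    rev = reverse(graph)
    from_entry = reachable(graph, [entry])
    to_second = reachable(rev, list(second_sites))
    # First sites that the entry can reach and that can reach a second site.
    live_first = {n for n in first_sites if n in from_entry and n in to_second}
    after_first = reachable(graph, list(live_first))
    # Second sites that actually come after a live first site.
    live_second = {n for n in second_sites if n in after_first}
    prefix = from_entry & reachable(rev, list(live_first))
    suffix = after_first & reachable(rev, list(live_second))
    keep = prefix | suffix
    return {n: [s for s in graph.get(n, ()) if s in keep] for n in keep}


# Toy CFG: only the entry -> a -> alloc -> b -> free path matters for an
# "alloc before free" ordering; x, y, and exit are pruned.
cfg = {
    "entry": ["a", "x"],
    "a": ["alloc"],
    "alloc": ["b"],
    "b": ["free", "y"],
    "free": ["exit"],
    "x": ["exit"],
    "y": ["exit"],
    "exit": [],
}
reduced = reduce_cfg(cfg, "entry", {"alloc"}, {"free"})
```

In this toy example the reduced graph contains five of the eight nodes. A symbolic executor restricted to it would never explore the `x` and `y` branches, which cannot contribute to a path where both callbacks fire in the required order.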