用于Python二进制反编译的法证等效转换

A. Ahad, Chi-Gon Jung, Ammar Askar, Doowon Kim, Taesoo Kim, Yonghwi Kwon
{"title":"用于Python二进制反编译的法证等效转换","authors":"A. Ahad, Chi-Gon Jung, Ammar Askar, Doowon Kim, Taesoo Kim, Yonghwi Kwon","doi":"10.1109/SP46215.2023.10179370","DOIUrl":null,"url":null,"abstract":"Decompilation is a crucial capability in forensic analysis, facilitating analysis of unknown binaries. The recent rise of Python malware has brought attention to Python decompilers that aim to obtain source code representation from a Python binary. However, Python decompilers fail to handle various binaries, limiting their capabilities in forensic analysis.This paper proposes a novel solution that transforms a decompilation error-inducing Python binary into a decompilable binary. Our key intuition is that we can resolve the decompilation errors by transforming error-inducing code blocks in the input binary into another form. The core of our approach is the concept of Forensically Equivalent Transformation (FET) which allows non-semantic preserving transformation in the context of forensic analysis. We carefully define the FETs to minimize their undesirable consequences while fixing various error-inducing instructions that are difficult to solve when preserving the exact semantics. We evaluate the prototype of our approach with 17,117 real-world Python malware samples causing decompilation errors in five popular decompilers. It successfully identifies and fixes 77,022 errors. Our approach also handles anti-analysis techniques, including opcode remapping, and helps migrate Python 3.9 binaries to 3.8 binaries.","PeriodicalId":439989,"journal":{"name":"2023 IEEE Symposium on Security and Privacy (SP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Pyfet: Forensically Equivalent Transformation for Python Binary Decompilation\",\"authors\":\"A. Ahad, Chi-Gon Jung, Ammar Askar, Doowon Kim, Taesoo Kim, Yonghwi Kwon\",\"doi\":\"10.1109/SP46215.2023.10179370\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Decompilation is a crucial capability in forensic analysis, facilitating analysis of unknown binaries. The recent rise of Python malware has brought attention to Python decompilers that aim to obtain source code representation from a Python binary. However, Python decompilers fail to handle various binaries, limiting their capabilities in forensic analysis.This paper proposes a novel solution that transforms a decompilation error-inducing Python binary into a decompilable binary. Our key intuition is that we can resolve the decompilation errors by transforming error-inducing code blocks in the input binary into another form. The core of our approach is the concept of Forensically Equivalent Transformation (FET) which allows non-semantic preserving transformation in the context of forensic analysis. We carefully define the FETs to minimize their undesirable consequences while fixing various error-inducing instructions that are difficult to solve when preserving the exact semantics. We evaluate the prototype of our approach with 17,117 real-world Python malware samples causing decompilation errors in five popular decompilers. It successfully identifies and fixes 77,022 errors. Our approach also handles anti-analysis techniques, including opcode remapping, and helps migrate Python 3.9 binaries to 3.8 binaries.\",\"PeriodicalId\":439989,\"journal\":{\"name\":\"2023 IEEE Symposium on Security and Privacy (SP)\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE Symposium on Security and Privacy (SP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SP46215.2023.10179370\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE Symposium on Security and Privacy (SP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SP46215.2023.10179370","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

反编译是取证分析中的一项关键功能,有助于对未知二进制文件进行分析。最近Python恶意软件的兴起引起了人们对旨在从Python二进制文件中获取源代码表示的Python反编译器的关注。然而,Python反编译器无法处理各种二进制文件,限制了它们在取证分析方面的能力。本文提出了一种新的解决方案,将导致反编译错误的Python二进制文件转换为可反编译的二进制文件。我们的关键直觉是,我们可以通过将输入二进制中的导致错误的代码块转换为另一种形式来解决反编译错误。我们方法的核心是法医等效转换(FET)的概念,它允许在法医分析的背景下进行非语义保留转换。我们仔细定义了fet,以尽量减少其不良后果,同时修复了在保持精确语义时难以解决的各种错误诱导指令。我们用17117个真实的Python恶意软件样本来评估我们方法的原型,这些样本在五种流行的反编译器中导致反编译错误。它成功地识别并修复了77,022个错误。我们的方法还处理反分析技术,包括操作码重新映射,并帮助将Python 3.9二进制文件迁移到3.8二进制文件。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Pyfet: Forensically Equivalent Transformation for Python Binary Decompilation
Decompilation is a crucial capability in forensic analysis, facilitating analysis of unknown binaries. The recent rise of Python malware has brought attention to Python decompilers that aim to obtain source code representation from a Python binary. However, Python decompilers fail to handle various binaries, limiting their capabilities in forensic analysis.This paper proposes a novel solution that transforms a decompilation error-inducing Python binary into a decompilable binary. Our key intuition is that we can resolve the decompilation errors by transforming error-inducing code blocks in the input binary into another form. The core of our approach is the concept of Forensically Equivalent Transformation (FET) which allows non-semantic preserving transformation in the context of forensic analysis. We carefully define the FETs to minimize their undesirable consequences while fixing various error-inducing instructions that are difficult to solve when preserving the exact semantics. We evaluate the prototype of our approach with 17,117 real-world Python malware samples causing decompilation errors in five popular decompilers. It successfully identifies and fixes 77,022 errors. Our approach also handles anti-analysis techniques, including opcode remapping, and helps migrate Python 3.9 binaries to 3.8 binaries.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信