部分求值的低级代码模块化反编译

M. Gómez-Zamalloa, E. Albert, G. Puebla
{"title":"部分求值的低级代码模块化反编译","authors":"M. Gómez-Zamalloa, E. Albert, G. Puebla","doi":"10.1109/SCAM.2008.35","DOIUrl":null,"url":null,"abstract":"Decompiling low-level code to a high-level intermediate representation facilitates the development of analyzers, model checkers, etc. which reason about properties of the low-level code (e.g., bytecode, .NET). Interpretive decompilation consists in partially evaluating an interpreter for the low-level language (written in the high-level language) w.r.t. the code to be decompiled. There have been proofs-of-concept that interpretive decompilation is feasible, butt here remain important open issues when it comes to decompile a real language: does the approach scale up? is the quality of decompiled programs comparable to that obtained by ad-hoc decompilers? do decompiled programs preserve the structure of the original programs? This paper addresses these issues by presenting, to the best of our knowledge, the first modular scheme to enable interpretive decompilation of low-level code to a high-level representation, namely, we decompile bytecode into PROLOG. We introduce two notions of optimality. The first one requires that each method/block is decompiled just once. The second one requires that each program point is traversed at most once during decompilation. We demonstrate the impact of our modular approach and optimality issues on a series of realistic benchmarks. Decompilation times and decompiled program sizes are linear with the size of the input bytecode program. This demostrates empirically the scalability of modular decompilation of low-level code by partial evaluation.","PeriodicalId":433693,"journal":{"name":"2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Modular Decompilation of Low-Level Code by Partial Evaluation\",\"authors\":\"M. Gómez-Zamalloa, E. Albert, G. Puebla\",\"doi\":\"10.1109/SCAM.2008.35\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Decompiling low-level code to a high-level intermediate representation facilitates the development of analyzers, model checkers, etc. which reason about properties of the low-level code (e.g., bytecode, .NET). Interpretive decompilation consists in partially evaluating an interpreter for the low-level language (written in the high-level language) w.r.t. the code to be decompiled. There have been proofs-of-concept that interpretive decompilation is feasible, butt here remain important open issues when it comes to decompile a real language: does the approach scale up? is the quality of decompiled programs comparable to that obtained by ad-hoc decompilers? do decompiled programs preserve the structure of the original programs? This paper addresses these issues by presenting, to the best of our knowledge, the first modular scheme to enable interpretive decompilation of low-level code to a high-level representation, namely, we decompile bytecode into PROLOG. We introduce two notions of optimality. The first one requires that each method/block is decompiled just once. The second one requires that each program point is traversed at most once during decompilation. We demonstrate the impact of our modular approach and optimality issues on a series of realistic benchmarks. Decompilation times and decompiled program sizes are linear with the size of the input bytecode program. This demostrates empirically the scalability of modular decompilation of low-level code by partial evaluation.\",\"PeriodicalId\":433693,\"journal\":{\"name\":\"2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation\",\"volume\":\"76 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-10-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SCAM.2008.35\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SCAM.2008.35","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

将低级代码反编译为高级中间表示形式,有助于分析程序、模型检查程序等的开发,这些程序可以推断低级代码的属性(例如,字节码、。net)。解释性反编译包括部分地计算解释器对低级语言(用高级语言编写)而不是要反编译的代码的解释器。已经有概念证明解释式反编译是可行的,但是当涉及到反编译真正的语言时,这里仍然存在重要的开放问题:这种方法是否可以扩展?反编译程序的质量是否可与特设反编译器获得的质量相媲美?反编译程序是否保留原始程序的结构?本文通过提出这些问题,据我们所知,第一个模块化方案能够将低级代码的解释性反编译为高级表示,也就是说,我们将字节码反编译为PROLOG。我们引入两个最优性的概念。第一个要求每个方法/块只反编译一次。第二种方法要求每个程序点在反编译期间最多遍历一次。我们在一系列现实的基准测试中展示了我们的模块化方法和最优性问题的影响。反编译时间和反编译程序大小与输入字节码程序的大小成线性关系。这从经验上证明了通过部分求值对低级代码进行模块化反编译的可伸缩性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Modular Decompilation of Low-Level Code by Partial Evaluation
Decompiling low-level code to a high-level intermediate representation facilitates the development of analyzers, model checkers, etc. which reason about properties of the low-level code (e.g., bytecode, .NET). Interpretive decompilation consists in partially evaluating an interpreter for the low-level language (written in the high-level language) w.r.t. the code to be decompiled. There have been proofs-of-concept that interpretive decompilation is feasible, butt here remain important open issues when it comes to decompile a real language: does the approach scale up? is the quality of decompiled programs comparable to that obtained by ad-hoc decompilers? do decompiled programs preserve the structure of the original programs? This paper addresses these issues by presenting, to the best of our knowledge, the first modular scheme to enable interpretive decompilation of low-level code to a high-level representation, namely, we decompile bytecode into PROLOG. We introduce two notions of optimality. The first one requires that each method/block is decompiled just once. The second one requires that each program point is traversed at most once during decompilation. We demonstrate the impact of our modular approach and optimality issues on a series of realistic benchmarks. Decompilation times and decompiled program sizes are linear with the size of the input bytecode program. This demostrates empirically the scalability of modular decompilation of low-level code by partial evaluation.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信