Static binary rewriting without supplemental information: Overcoming the tradeoff between coverage and correctness

2013 20th Working Conference on Reverse Engineering (WCRE). Pub Date: 2013-11-21. DOI: 10.1109/WCRE.2013.6671280

M. Smithson, Khaled Elwazeer, K. Anand, A. Kotha, R. Barua


Citations: 41

Abstract

Binary rewriting is the process of transforming executables while preserving the original binary's functionality and improving it in one or more metrics, such as energy use, memory use, security, or reliability. Although several technologies for rewriting binaries exist, static rewriting allows arbitrarily complex transformations, whereas other technologies, such as dynamic or minimally-invasive rewriting, are limited in their transformation ability. We have designed the first static binary rewriter that guarantees 100% code coverage without the need for relocation or symbolic information. A key challenge in static rewriting is content classification (i.e., deciding which portions of the code segment are code and which are data). Our contributions are (i) handling portions of the code segment with uncertain classification by speculatively disassembling them in case they are code, while retaining the original binary in case they are data; (ii) drastically limiting the number of possible speculative sequences using a new technique called binary characterization; and (iii) avoiding the need for relocation or symbolic information by using call translation at the usage points of code pointers (i.e., indirect control transfers), rather than changing addresses at address creation points. Extensive evaluation using stripped binaries for the entire SPEC 2006 benchmark suite (with over 1.9 million lines of code) demonstrates the robustness of the scheme.
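To make contribution (iii) concrete, the following is a minimal Python sketch of translating code pointers at their usage points instead of patching them where they are created. It is an illustration under stated assumptions, not the authors' implementation: the class `CallTranslator`, the helper `indirect_call`, the addresses, and the choice to pass unknown targets through unchanged are all hypothetical and do not come from the paper.

```python
from typing import Callable, Dict


class CallTranslator:
    """Maps original code addresses to their rewritten counterparts (hypothetical)."""

    def __init__(self, address_map: Dict[int, int]) -> None:
        # address_map: {original_address: rewritten_address}
        self._address_map = dict(address_map)

    def translate(self, target: int) -> int:
        # Translation happens where the pointer is USED, so the pointer value
        # stored in memory can stay exactly as the original binary created it.
        # Assumption of this sketch: unknown targets pass through unchanged.
        return self._address_map.get(target, target)


def indirect_call(
    translator: CallTranslator,
    rewritten_code: Dict[int, Callable[..., int]],
    pointer: int,
    *args: int,
) -> int:
    """Simulate an indirect call through a pointer that holds an original address."""
    target = translator.translate(pointer)
    return rewritten_code[target](*args)


# Hypothetical layout: two functions in the original and the rewritten binary.
ORIG_ADD, ORIG_MUL = 0x1000, 0x1040
NEW_ADD, NEW_MUL = 0x8000, 0x8100

translator = CallTranslator({ORIG_ADD: NEW_ADD, ORIG_MUL: NEW_MUL})
rewritten_code = {
    NEW_ADD: lambda a, b: a + b,
    NEW_MUL: lambda a, b: a * b,
}

# A function-pointer table in the data section still holds ORIGINAL addresses;
# the rewriter never has to find or patch it.
function_table = [ORIG_ADD, ORIG_MUL]

results = [
    indirect_call(translator, rewritten_code, ptr, 3, 4) for ptr in function_table
]
print(results)
```

Because the stored pointers are never modified, this style of translation does not need to know where addresses are created, which is the information that relocation or symbol tables would otherwise supply.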

