派生:一个自动逆向工程指令编码的工具

D. Engler, Wilson C. Hsieh
{"title":"派生:一个自动逆向工程指令编码的工具","authors":"D. Engler, Wilson C. Hsieh","doi":"10.1145/351397.351409","DOIUrl":null,"url":null,"abstract":"Many binary tools, such as disassemblers, dynamic code generation systems, and executable code rewriters, need to understand how machine instructions are encoded. Unfortunately, specifying such encodings is tedious and error-prone. Users must typically specify thousands of details of instruction layout, such as opcode and field locations values, legal operands, and jump offset encodings. We have built a tool called DERIVE that extracts these details from existing software: the system assembler. Users need only provide the assembly syntax for the instructions for which they want encodings. DERIVE automatically reverse-engineers instruction encoding knowledge from the assembler by feeding it permutations of instructions and doing equation solving on the output.\nDERIVE is robust and general. It derives instruction encodings for SPARC, MIPS, Alpha, PowerPC, ARM, and x86. In the last case, it handles variable-sized instructions, large instructions, instruction encodings determined by operand size, and other CISC features. DERIVE is also remarkably simple: it is a factor of 6 smaller than equivalent, more traditional systems. Finally, its declarative specifications eliminate the mis-specification errors that plague previous approaches, such as illegal registers used as operands or incorrect field offsets and sizes. This paper discusses our current DERIVE prototype, explains how it computes instruction encodings, and also discusses the more general implications of the ability to extract functionality from installed software.","PeriodicalId":261161,"journal":{"name":"Workshop on Dynamic and Adaptive Compilation and Optimization","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Derive: a tool that automatically reverse-engineers instruction encodings\",\"authors\":\"D. Engler, Wilson C. Hsieh\",\"doi\":\"10.1145/351397.351409\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Many binary tools, such as disassemblers, dynamic code generation systems, and executable code rewriters, need to understand how machine instructions are encoded. Unfortunately, specifying such encodings is tedious and error-prone. Users must typically specify thousands of details of instruction layout, such as opcode and field locations values, legal operands, and jump offset encodings. We have built a tool called DERIVE that extracts these details from existing software: the system assembler. Users need only provide the assembly syntax for the instructions for which they want encodings. DERIVE automatically reverse-engineers instruction encoding knowledge from the assembler by feeding it permutations of instructions and doing equation solving on the output.\\nDERIVE is robust and general. It derives instruction encodings for SPARC, MIPS, Alpha, PowerPC, ARM, and x86. In the last case, it handles variable-sized instructions, large instructions, instruction encodings determined by operand size, and other CISC features. DERIVE is also remarkably simple: it is a factor of 6 smaller than equivalent, more traditional systems. Finally, its declarative specifications eliminate the mis-specification errors that plague previous approaches, such as illegal registers used as operands or incorrect field offsets and sizes. This paper discusses our current DERIVE prototype, explains how it computes instruction encodings, and also discusses the more general implications of the ability to extract functionality from installed software.\",\"PeriodicalId\":261161,\"journal\":{\"name\":\"Workshop on Dynamic and Adaptive Compilation and Optimization\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Workshop on Dynamic and Adaptive Compilation and Optimization\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/351397.351409\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Workshop on Dynamic and Adaptive Compilation and Optimization","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/351397.351409","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

摘要

许多二进制工具,如反汇编器、动态代码生成系统和可执行代码重写器,都需要了解机器指令是如何编码的。不幸的是,指定这样的编码是乏味且容易出错的。用户通常必须指定指令布局的数千个细节,例如操作码和字段位置值、合法操作数和跳转偏移量编码。我们已经建立了一个叫做派生的工具,它从现有的软件中提取这些细节:系统汇编器。用户只需为他们想要编码的指令提供程序集语法。通过向汇编器输入指令的排列并对输出进行方程求解,自动从汇编器获得逆向工程指令编码知识。派生是健壮和通用的。它派生用于SPARC、MIPS、Alpha、PowerPC、ARM和x86的指令编码。在最后一种情况下,它处理可变大小的指令、大指令、由操作数大小决定的指令编码以及其他CISC特性。派生系统也非常简单:它比同等的、更传统的系统小6倍。最后,它的声明性规范消除了困扰以前方法的错误规范错误,例如将非法寄存器用作操作数或不正确的字段偏移量和大小。本文讨论了我们当前的派生原型,解释了它如何计算指令编码,并讨论了从已安装的软件中提取功能的能力的更一般含义。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Derive: a tool that automatically reverse-engineers instruction encodings
Many binary tools, such as disassemblers, dynamic code generation systems, and executable code rewriters, need to understand how machine instructions are encoded. Unfortunately, specifying such encodings is tedious and error-prone. Users must typically specify thousands of details of instruction layout, such as opcode and field locations values, legal operands, and jump offset encodings. We have built a tool called DERIVE that extracts these details from existing software: the system assembler. Users need only provide the assembly syntax for the instructions for which they want encodings. DERIVE automatically reverse-engineers instruction encoding knowledge from the assembler by feeding it permutations of instructions and doing equation solving on the output. DERIVE is robust and general. It derives instruction encodings for SPARC, MIPS, Alpha, PowerPC, ARM, and x86. In the last case, it handles variable-sized instructions, large instructions, instruction encodings determined by operand size, and other CISC features. DERIVE is also remarkably simple: it is a factor of 6 smaller than equivalent, more traditional systems. Finally, its declarative specifications eliminate the mis-specification errors that plague previous approaches, such as illegal registers used as operands or incorrect field offsets and sizes. This paper discusses our current DERIVE prototype, explains how it computes instruction encodings, and also discusses the more general implications of the ability to extract functionality from installed software.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信