从二进制码中恢复算法的高级中间表示

A. Bugerya, I. Kulagin, V. Padaryan, M. A. Solovev, A. Tikhonov
{"title":"从二进制码中恢复算法的高级中间表示","authors":"A. Bugerya, I. Kulagin, V. Padaryan, M. A. Solovev, A. Tikhonov","doi":"10.1109/IVMEM.2019.00015","DOIUrl":null,"url":null,"abstract":"One of the tasks of binary code security analysis is detection of undocumented features in software. This task is hard to automate, and it requires participation of a cybersecurity expert. The way of representation of the algorithm under analysis strongly determines the analysis effort and quality of its results. Existing intermediate representations and languages are intended for use in software that either carries out optimizing transformations or analyzes binary code. Such representations and intermediate languages are unsuitable for manual data flow analysis. This paper proposes a high-level hierarchical flowchart-based representation of a program algorithm as well as an algorithm for its construction. The proposed representation is based on a hypergraph and it allows both automatic and manual data flow analysis on different detail levels. The hypergraph nodes represent functions. Every node contains a set of other nodes which are fragments. The fragment is a linear sequence of instructions that does not contain call and ret instructions. Edges represent data flows between nodes and correspond to memory buffers and registers. In the future this representation can be used to implement automatic analysis algorithms. An approach is proposed to increasing quality of the developed algorithm representation using grouping of single data flows into one flow connecting logical algorithm modules.","PeriodicalId":166102,"journal":{"name":"2019 Ivannikov Memorial Workshop (IVMEM)","volume":"114 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Recovery of High-Level Intermediate Representations of Algorithms from Binary Code\",\"authors\":\"A. Bugerya, I. Kulagin, V. Padaryan, M. A. Solovev, A. Tikhonov\",\"doi\":\"10.1109/IVMEM.2019.00015\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"One of the tasks of binary code security analysis is detection of undocumented features in software. This task is hard to automate, and it requires participation of a cybersecurity expert. The way of representation of the algorithm under analysis strongly determines the analysis effort and quality of its results. Existing intermediate representations and languages are intended for use in software that either carries out optimizing transformations or analyzes binary code. Such representations and intermediate languages are unsuitable for manual data flow analysis. This paper proposes a high-level hierarchical flowchart-based representation of a program algorithm as well as an algorithm for its construction. The proposed representation is based on a hypergraph and it allows both automatic and manual data flow analysis on different detail levels. The hypergraph nodes represent functions. Every node contains a set of other nodes which are fragments. The fragment is a linear sequence of instructions that does not contain call and ret instructions. Edges represent data flows between nodes and correspond to memory buffers and registers. In the future this representation can be used to implement automatic analysis algorithms. An approach is proposed to increasing quality of the developed algorithm representation using grouping of single data flows into one flow connecting logical algorithm modules.\",\"PeriodicalId\":166102,\"journal\":{\"name\":\"2019 Ivannikov Memorial Workshop (IVMEM)\",\"volume\":\"114 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 Ivannikov Memorial Workshop (IVMEM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IVMEM.2019.00015\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 Ivannikov Memorial Workshop (IVMEM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IVMEM.2019.00015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

二进制代码安全分析的任务之一是检测软件中未记录的特性。这项任务很难自动化,需要网络安全专家的参与。被分析算法的表示方式在很大程度上决定了分析的效果和结果的质量。现有的中间表示和语言旨在用于执行优化转换或分析二进制代码的软件。这种表示和中间语言不适合手工数据流分析。本文提出了一种基于高级层次流程图的程序算法表示方法及其构造算法。提出的表示是基于超图的,它允许在不同的细节级别上进行自动和手动数据流分析。超图节点表示函数。每个节点包含一组其他节点,这些节点是片段。片段是一个不包含调用和ret指令的线性指令序列。边表示节点之间的数据流,并对应于内存缓冲区和寄存器。在未来,这种表示可以用于实现自动分析算法。提出了一种通过将单个数据流分组为一个连接逻辑算法模块的流来提高所开发算法表示质量的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Recovery of High-Level Intermediate Representations of Algorithms from Binary Code
One of the tasks of binary code security analysis is detection of undocumented features in software. This task is hard to automate, and it requires participation of a cybersecurity expert. The way of representation of the algorithm under analysis strongly determines the analysis effort and quality of its results. Existing intermediate representations and languages are intended for use in software that either carries out optimizing transformations or analyzes binary code. Such representations and intermediate languages are unsuitable for manual data flow analysis. This paper proposes a high-level hierarchical flowchart-based representation of a program algorithm as well as an algorithm for its construction. The proposed representation is based on a hypergraph and it allows both automatic and manual data flow analysis on different detail levels. The hypergraph nodes represent functions. Every node contains a set of other nodes which are fragments. The fragment is a linear sequence of instructions that does not contain call and ret instructions. Edges represent data flows between nodes and correspond to memory buffers and registers. In the future this representation can be used to implement automatic analysis algorithms. An approach is proposed to increasing quality of the developed algorithm representation using grouping of single data flows into one flow connecting logical algorithm modules.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信