A. Rimsa, J. N. Amaral, Fernando Magno Quintão Pereira
Title: Efficient and Precise Dynamic Construction of Control Flow Graphs
Published in: Proceedings of the XXIII Brazilian Symposium on Programming Languages, 2019-09-23
DOI: https://doi.org/10.1145/3355378.3355383
Citations: 2
Abstract
The extraction of high-level information from binary code is an important problem in programming languages; its solution supports the detection of malware in binary code and the construction of dynamic program slices. The Control Flow Graph (CFG) is one of the instruments used to represent the structure of binary programs. Most solutions for reconstructing CFGs from binary programs rely on purely static techniques, based either on data-flow analyses or on type inference. In contrast, in this work we use a purely dynamic approach for this purpose. Our technique can be used alone or in combination with static analysis tools. We demonstrate that it is possible to verify completeness in several real-world programs. We also show how to combine our technique with DynInst, the current state-of-the-art static CFG reconstructor. By providing DynInst with extra information, we improve its capacity to deal with indirect jumps. Our dynamic CFG reconstructor has been implemented on top of Valgrind. When applied to cBench, this implementation is able to completely cover 36% of all the functions available in that suite. It adds an average overhead of 43x to the execution of the original programs. Although substantial, this overhead is almost four times lower than the overhead of DCFG, a tool distributed by Intel and built on top of PinPlay.
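To make the idea of a purely dynamic approach concrete, the following is a minimal sketch of how a CFG can be accumulated from an execution trace. It is not the authors' algorithm or their Valgrind-based implementation: the trace format (a flat list of executed basic-block start addresses), the function names `build_dynamic_cfg` and `successors`, and the example addresses are all assumptions made for illustration.

```python
from collections import defaultdict


def build_dynamic_cfg(trace):
    """Accumulate nodes and edges from a sequence of executed block addresses.

    Assumption: `trace` lists basic-block start addresses in execution order.
    A real tool would obtain these through binary instrumentation.
    """
    nodes = set()
    edges = defaultdict(int)  # (src, dst) -> number of observed transfers
    prev = None
    for block in trace:
        nodes.add(block)
        if prev is not None:
            edges[(prev, block)] += 1
        prev = block
    return nodes, dict(edges)


def successors(edges, block):
    """Return the successors of `block` observed so far, sorted by address."""
    return sorted(dst for (src, dst) in edges if src == block)


# Hypothetical trace: entry at 0x00, a loop between 0x10 and 0x20, exit at 0x30.
trace = [0x00, 0x10, 0x20, 0x10, 0x20, 0x10, 0x30]
nodes, edges = build_dynamic_cfg(trace)
```

In this sketch, only edges that were actually executed appear in the graph, which is why a dynamic reconstruction is precise but may be incomplete: a branch target that no input exercises never shows up. Here the block at `0x10` has two observed successors, `0x20` and `0x30`.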