Reconstruction of Composite Types for Decompilation
Authors: K. Troshina, Yegor Derevenets, A. Chernov
Venue: 2010 10th IEEE Working Conference on Source Code Analysis and Manipulation
Published: 2010-09-12
DOI: 10.1109/SCAM.2010.24 (https://doi.org/10.1109/SCAM.2010.24)
Decompilation is the reconstruction of a program in a high-level language from a program in a low-level language. This paper presents a method for automatically reconstructing composite types (structures, arrays, and combinations of them) in a high-level program during decompilation. The assembly code is obtained by disassembling binary code or traces collected by a simulator. The proposed method expresses memory access operations as (base, offset) pairs, then builds equivalence classes for the bases used in the program and accumulates offsets for each equivalence class. For strictly conforming C programs, the approach is substantiated by the C language semantics as defined in the international standard. Experimental results show, however, that it is also applicable to real-world programs. Experimental results were obtained for a number of open-source programs as well as for traces collected from them. The method is an essential part of TyDec, a program decompilation tool being developed by the authors. The TyDec decompiler can be used as a standalone tool or as a plug-in for the Interactive Trace Explorer TrEx, which is being developed at the Institute for System Programming, Russian Academy of Sciences.
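The core idea in the abstract (memory accesses as (base, offset) pairs, equivalence classes over bases, offsets accumulated per class) can be illustrated with a small sketch. This is not TyDec's implementation. The class name `BaseClasses`, the union-find representation, the recorded access size, and the rule that a pointer copy merges two bases are all assumptions made for illustration; the abstract does not specify how classes are built.

```python
from collections import defaultdict


class BaseClasses:
    """Hypothetical union-find over base names, with offsets kept per class."""

    def __init__(self):
        self.parent = {}
        # class representative -> set of (byte offset, access size) pairs
        self.offsets = defaultdict(set)

    def find(self, base):
        """Return the representative of the class containing `base`."""
        self.parent.setdefault(base, base)
        root = base
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[base] != root:
            self.parent[base], base = root, self.parent[base]
        return root

    def union(self, a, b):
        """Merge the classes of two bases and combine their offsets."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        self.parent[rb] = ra
        self.offsets[ra] |= self.offsets.pop(rb, set())
        return ra

    def access(self, base, offset, size):
        """Record one memory access expressed as a (base, offset) pair."""
        self.offsets[self.find(base)].add((offset, size))

    def layout(self, base):
        """Accumulated offsets for the class of `base`, in address order."""
        return sorted(self.offsets[self.find(base)])


def demo():
    classes = BaseClasses()
    # Assumed accesses: base name, byte offset, access size in bytes.
    classes.access("p", 0, 4)
    classes.access("p", 4, 4)
    classes.access("q", 8, 1)
    # Assumption: a copy such as q = p places both bases in one class.
    classes.union("p", "q")
    return classes.layout("q")


print(demo())  # [(0, 4), (4, 4), (8, 1)]
```

In this sketch the sorted offsets of a class act as a candidate field layout: three distinct offsets with two access sizes suggest a structure with at least three members. Distinguishing arrays from structures would need further information, such as stride patterns, which the sketch does not model.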