Yapeng Ye, Zhuo Zhang, Qingkai Shi, Yousra Aafer, X. Zhang
{"title":"D-ARM: Disassembling ARM Binaries by Lightweight Superset Instruction Interpretation and Graph Modeling","authors":"Yapeng Ye, Zhuo Zhang, Qingkai Shi, Yousra Aafer, X. Zhang","doi":"10.1109/SP46215.2023.10179307","DOIUrl":null,"url":null,"abstract":"ARM binary analysis has a wide range of applications in ARM system security. A fundamental challenge is ARM disassembly. ARM, particularly AArch32, has a number of unique features making disassembly distinct from x86 disassembly, such as the mixing of ARM and Thumb instruction modes, implicit mode switching within an application, and more prevalent use of inlined data. Existing techniques cannot achieve high accuracy when binaries become complex and have undergone obfuscation. We propose a novel ARM binary disassembly technique that is particularly designed to address challenges in legacy code for 32-bit ARM binaries. It features a lightweight superset instruction interpretation method to derive rich semantic information and a graph-theory based method that aggregates such information to produce final results. Our comparative evaluation with a number of state-of-the-art disassemblers, including Ghidra, IDA, P-Disasm, XDA, D-Disasm, and Spedi, on thousands of binaries generated from SPEC2000 and SPEC2006 with various settings, and real-world applications collected online show that our technique D-ARM substantially outperforms the baselines.","PeriodicalId":439989,"journal":{"name":"2023 IEEE Symposium on Security and Privacy (SP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE Symposium on Security and Privacy (SP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SP46215.2023.10179307","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
ARM binary analysis has a wide range of applications in ARM system security. A fundamental challenge is ARM disassembly. ARM, particularly AArch32, has a number of unique features making disassembly distinct from x86 disassembly, such as the mixing of ARM and Thumb instruction modes, implicit mode switching within an application, and more prevalent use of inlined data. Existing techniques cannot achieve high accuracy when binaries become complex and have undergone obfuscation. We propose a novel ARM binary disassembly technique that is particularly designed to address challenges in legacy code for 32-bit ARM binaries. It features a lightweight superset instruction interpretation method to derive rich semantic information and a graph-theory based method that aggregates such information to produce final results. Our comparative evaluation with a number of state-of-the-art disassemblers, including Ghidra, IDA, P-Disasm, XDA, D-Disasm, and Spedi, on thousands of binaries generated from SPEC2000 and SPEC2006 with various settings, and real-world applications collected online show that our technique D-ARM substantially outperforms the baselines.