Neville Grech, Lexi Brent, Bernhard Scholz, Y. Smaragdakis
{"title":"Gigahorse: Thorough, Declarative Decompilation of Smart Contracts","authors":"Neville Grech, Lexi Brent, Bernhard Scholz, Y. Smaragdakis","doi":"10.1109/ICSE.2019.00120","DOIUrl":null,"url":null,"abstract":"The rise of smart contractsThe rise of smart contracts–autonomous applications running on blockchains–has led to a growing number of threats, necessitating sophisticated program analysis. However, smart contracts, which transact valuable tokens and cryptocurrencies, are compiled to very low-level bytecode. This bytecode is the ultimate semantics and means of enforcement of the contract. We present the Gigahorse toolchain. At its core is a reverse compiler (i.e., a decompiler) that decompiles smart contracts from Ethereum Virtual Machine (EVM) bytecode into a highlevel 3-address code representation. The new intermediate representation of smart contracts makes implicit data- and controlflow dependencies of the EVM bytecode explicit. Decompilation obviates the need for a contract’s source and allows the analysis of both new and deployed contracts. Gigahorse advances the state of the art on several fronts. It gives the highest analysis precision and completeness among decompilers for Ethereum smart contracts–e.g., Gigahorse can decompile over 99.98% of deployed contracts, compared to 88% for the recently-published Vandal decompiler and under 50% for the state-of-the-practice Porosity decompiler. Importantly, Gigahorse offers a full-featured toolchain for further analyses (and a \"batteries included\" approach, with multiple clients already implemented), together with the highest performance and scalability. Key to these improvements is Gigahorse’s use of a declarative, logic-based specification, which allows high-level insights to inform low-level decompilation.autonomous applications running on blockchains---has led to a growing number of threats, necessitating sophisticated program analysis. However, smart contracts, which transact valuable tokens and cryptocurrencies, are compiled to very low-level bytecode. This bytecode is the ultimate semantics and means of enforcement of the contract. We present the Gigahorse toolchain. At its core is a reverse compiler (i.e., a decompiler) that decompiles smart contracts from Ethereum Virtual Machine (EVM) bytecode into a high-level 3-address code representation. The new intermediate representation of smart contracts makes implicit data- and control-flow dependencies of the EVM bytecode explicit. Decompilation obviates the need for a contract's source and allows the analysis of both new and deployed contracts. Gigahorse advances the state of the art on several fronts. It gives the highest analysis precision and completeness among decompilers for Ethereum smart contracts---e.g., Gigahorse can decompile over 99.98\\% of deployed contracts, compared to 88\\% for the recently-published Vandal decompiler and under 50\\% for the state-of-the-practice Porosity decompiler. Importantly, Gigahorse offers a full-featured toolchain for further analyses (and a ``batteries included'' approach, with multiple clients already implemented), together with the highest performance and scalability. Key to these improvements is Gigahorse's use of a declarative, logic-based specification, which allows high-level insights to inform low-level decompilation.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"20 1","pages":"1176-1186"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"72","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2019.00120","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 72
Abstract
The rise of smart contractsThe rise of smart contracts–autonomous applications running on blockchains–has led to a growing number of threats, necessitating sophisticated program analysis. However, smart contracts, which transact valuable tokens and cryptocurrencies, are compiled to very low-level bytecode. This bytecode is the ultimate semantics and means of enforcement of the contract. We present the Gigahorse toolchain. At its core is a reverse compiler (i.e., a decompiler) that decompiles smart contracts from Ethereum Virtual Machine (EVM) bytecode into a highlevel 3-address code representation. The new intermediate representation of smart contracts makes implicit data- and controlflow dependencies of the EVM bytecode explicit. Decompilation obviates the need for a contract’s source and allows the analysis of both new and deployed contracts. Gigahorse advances the state of the art on several fronts. It gives the highest analysis precision and completeness among decompilers for Ethereum smart contracts–e.g., Gigahorse can decompile over 99.98% of deployed contracts, compared to 88% for the recently-published Vandal decompiler and under 50% for the state-of-the-practice Porosity decompiler. Importantly, Gigahorse offers a full-featured toolchain for further analyses (and a "batteries included" approach, with multiple clients already implemented), together with the highest performance and scalability. Key to these improvements is Gigahorse’s use of a declarative, logic-based specification, which allows high-level insights to inform low-level decompilation.autonomous applications running on blockchains---has led to a growing number of threats, necessitating sophisticated program analysis. However, smart contracts, which transact valuable tokens and cryptocurrencies, are compiled to very low-level bytecode. This bytecode is the ultimate semantics and means of enforcement of the contract. We present the Gigahorse toolchain. At its core is a reverse compiler (i.e., a decompiler) that decompiles smart contracts from Ethereum Virtual Machine (EVM) bytecode into a high-level 3-address code representation. The new intermediate representation of smart contracts makes implicit data- and control-flow dependencies of the EVM bytecode explicit. Decompilation obviates the need for a contract's source and allows the analysis of both new and deployed contracts. Gigahorse advances the state of the art on several fronts. It gives the highest analysis precision and completeness among decompilers for Ethereum smart contracts---e.g., Gigahorse can decompile over 99.98\% of deployed contracts, compared to 88\% for the recently-published Vandal decompiler and under 50\% for the state-of-the-practice Porosity decompiler. Importantly, Gigahorse offers a full-featured toolchain for further analyses (and a ``batteries included'' approach, with multiple clients already implemented), together with the highest performance and scalability. Key to these improvements is Gigahorse's use of a declarative, logic-based specification, which allows high-level insights to inform low-level decompilation.