Han Liu, Zhiqiang Yang, Yu Jiang, Wenqi Zhao, Jiaguang Sun
{"title":"通过智能合约胎记为以太坊启用克隆检测","authors":"Han Liu, Zhiqiang Yang, Yu Jiang, Wenqi Zhao, Jiaguang Sun","doi":"10.1109/ICPC.2019.00024","DOIUrl":null,"url":null,"abstract":"The Ethereum ecosystem has introduced a pervasive blockchain platform with programmable transactions. Everyone is allowed to develop and deploy smart contracts. Such flexibility can lead to a large collection of similar contracts, i.e., clones, especially when Ethereum applications are highly domain-specific and may share similar functionalities within the same domain, e.g., token contracts often provide interfaces for money transfer and balance inquiry. While smart contract clones have a wide range of impact across different applications, e.g., security, they are relatively little studied. Although clone detection has been a long-standing research topic, blockchain smart contracts introduce new challenges, e.g., syntactic diversity due to trade-off between storage and execution, understanding high-level business logic etc.. In this paper, we highlighted the very first attempt to clone detection of Ethereum smart contracts. To overcome the new challenges, we introduce the concept of smart contract birthmark, i.e., a semantic-preserving and computable representation for smart contract bytecode. The birthmark captures high-level semantics by effectively sketching symbolic execution traces (e.g., data access dependencies, path conditions) and maintain syntactic regularities (e.g., type and number of instructions) as well. Then, the clone detection problem is reduced to a computation of statistical similarity between two contract birthmarks. We have implemented a clone detector called EClone and evaluated it on Ethereum. The empirical results demonstrated the potential of EClone in accurately identifying clones. We have also extended EClone for vulnerability search and managed to detect CVE-2018-10376 instances.","PeriodicalId":6853,"journal":{"name":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","volume":"51 1","pages":"105-115"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":"{\"title\":\"Enabling Clone Detection For Ethereum Via Smart Contract Birthmarks\",\"authors\":\"Han Liu, Zhiqiang Yang, Yu Jiang, Wenqi Zhao, Jiaguang Sun\",\"doi\":\"10.1109/ICPC.2019.00024\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Ethereum ecosystem has introduced a pervasive blockchain platform with programmable transactions. Everyone is allowed to develop and deploy smart contracts. Such flexibility can lead to a large collection of similar contracts, i.e., clones, especially when Ethereum applications are highly domain-specific and may share similar functionalities within the same domain, e.g., token contracts often provide interfaces for money transfer and balance inquiry. While smart contract clones have a wide range of impact across different applications, e.g., security, they are relatively little studied. Although clone detection has been a long-standing research topic, blockchain smart contracts introduce new challenges, e.g., syntactic diversity due to trade-off between storage and execution, understanding high-level business logic etc.. In this paper, we highlighted the very first attempt to clone detection of Ethereum smart contracts. To overcome the new challenges, we introduce the concept of smart contract birthmark, i.e., a semantic-preserving and computable representation for smart contract bytecode. The birthmark captures high-level semantics by effectively sketching symbolic execution traces (e.g., data access dependencies, path conditions) and maintain syntactic regularities (e.g., type and number of instructions) as well. Then, the clone detection problem is reduced to a computation of statistical similarity between two contract birthmarks. We have implemented a clone detector called EClone and evaluated it on Ethereum. The empirical results demonstrated the potential of EClone in accurately identifying clones. We have also extended EClone for vulnerability search and managed to detect CVE-2018-10376 instances.\",\"PeriodicalId\":6853,\"journal\":{\"name\":\"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)\",\"volume\":\"51 1\",\"pages\":\"105-115\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"28\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICPC.2019.00024\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPC.2019.00024","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Enabling Clone Detection For Ethereum Via Smart Contract Birthmarks
The Ethereum ecosystem has introduced a pervasive blockchain platform with programmable transactions. Everyone is allowed to develop and deploy smart contracts. Such flexibility can lead to a large collection of similar contracts, i.e., clones, especially when Ethereum applications are highly domain-specific and may share similar functionalities within the same domain, e.g., token contracts often provide interfaces for money transfer and balance inquiry. While smart contract clones have a wide range of impact across different applications, e.g., security, they are relatively little studied. Although clone detection has been a long-standing research topic, blockchain smart contracts introduce new challenges, e.g., syntactic diversity due to trade-off between storage and execution, understanding high-level business logic etc.. In this paper, we highlighted the very first attempt to clone detection of Ethereum smart contracts. To overcome the new challenges, we introduce the concept of smart contract birthmark, i.e., a semantic-preserving and computable representation for smart contract bytecode. The birthmark captures high-level semantics by effectively sketching symbolic execution traces (e.g., data access dependencies, path conditions) and maintain syntactic regularities (e.g., type and number of instructions) as well. Then, the clone detection problem is reduced to a computation of statistical similarity between two contract birthmarks. We have implemented a clone detector called EClone and evaluated it on Ethereum. The empirical results demonstrated the potential of EClone in accurately identifying clones. We have also extended EClone for vulnerability search and managed to detect CVE-2018-10376 instances.