Hanting Chu;Pengcheng Zhang;Hai Dong;Yan Xiao;Shunhui Ji
Journal: IEEE Transactions on Reliability, vol. 74, no. 3, pp. 3544-3558
Published: 2024-10-29
DOI: 10.1109/TR.2024.3480010
URL: https://ieeexplore.ieee.org/document/10737415/
Impact factor: 5.7; JCR Q1 (Computer Science, Hardware & Architecture)
DeepFusion: Smart Contract Vulnerability Detection Via Deep Learning and Data Fusion
Smart contracts execute transactions worth hundreds of millions of dollars daily, so their security has attracted considerable attention in recent years. Traditional vulnerability detection methods rely heavily on manually developed rules and features, which leads to low accuracy, high false-positive rates, and poor scalability. Deep learning approaches were designed to alleviate these problems, but most rely on a single type of feature, which may leave the model with insufficient information during learning. The lack of available labeled vulnerability datasets is a further major limitation. To address these issues, we collect and construct a labeled dataset covering five smart contract vulnerabilities and propose DeepFusion, a vulnerability detection method that fuses two kinds of code representation information: program slice information and abstract syntax tree (AST) structural information. First, we develop automated tools that extract vulnerability-related slices from the source code and structural information from the AST generated from that source code. Second, the code features and the global structural features are fused. Finally, the fused data are fed into a Bidirectional Long Short-Term Memory with Attention (BiLSTM+ATT) model for smart contract vulnerability detection. The BiLSTM can capture long-term dependencies in both directions, which makes it well suited to the serialized information generated by DeepFusion, while the attention mechanism highlights the characteristic information of vulnerabilities. We conducted experiments on a dataset of real smart contracts that we collected. The experimental results show that our method significantly outperforms existing methods in detecting reentrancy, timestamp dependence, integer overflow and underflow, use of tx.origin for authentication, and unprotected self-destruct instructions, by 6.36%, 6.42%, 16.5%, 21.29%, and 25.05%, respectively. To the best of our knowledge, this is the first time the latter two vulnerabilities have been detected using deep learning methods.
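The abstract says the attention mechanism highlights the tokens most characteristic of a vulnerability, but it does not give the formulation. The Python sketch below illustrates one common form of attention pooling over per-token hidden states; it is an assumption for illustration, not the DeepFusion implementation. The names `attention_pool`, `states`, and `query` are hypothetical, and the hidden states are hand-written stand-ins for BiLSTM outputs rather than values from a trained model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, w):
    """Collapse per-token hidden states into a single context vector.

    hidden_states: list of T vectors (each a list of d floats), standing in
                   for the BiLSTM output at each token position.
    w:             query vector of length d (learned in a real model,
                   fixed by hand here for illustration).
    Returns (weights, context): T attention weights that sum to 1, and the
    d-dimensional weighted sum of the hidden states.
    """
    # Score each position: dot product of w with tanh of the hidden state.
    scores = [sum(wi * math.tanh(hi) for wi, hi in zip(w, h))
              for h in hidden_states]
    weights = softmax(scores)
    d = len(w)
    # Context vector: attention-weighted sum over positions.
    context = [sum(a * h[j] for a, h in zip(weights, hidden_states))
               for j in range(d)]
    return weights, context


# Hypothetical hidden states for three tokens; the middle one is made
# large on purpose so that it dominates the attention weights.
states = [[0.1, 0.0], [2.0, 1.5], [0.2, -0.1]]
query = [1.0, 1.0]
weights, context = attention_pool(states, query)
```

In a full model the context vector would typically be passed to a classifier layer, and `query` would be trained jointly with the BiLSTM so that positions belonging to vulnerable statements receive the largest weights.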
About the journal:
IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.