Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts

IF 6.9 3区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Nami Ashizawa , Naoto Yanai , Jason Paul Cruz , Shingo Okamura
{"title":"Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts","authors":"Nami Ashizawa ,&nbsp;Naoto Yanai ,&nbsp;Jason Paul Cruz ,&nbsp;Shingo Okamura","doi":"10.1016/j.bcra.2022.100101","DOIUrl":null,"url":null,"abstract":"<div><p>Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties. Being the most prominent platform that supports smart contracts, Ethereum has been targeted by many attacks and plagued by security incidents. Consequently, many smart contract vulnerabilities have been discovered in the past decade. To detect and prevent such vulnerabilities, different security analysis tools, including static and dynamic analysis tools, have been created, but their performance decreases drastically when codes to be analyzed are constantly being rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool that detects smart contract vulnerabilities. Eth2Vec maintains its robustness against code rewrites; i.e., it can detect vulnerabilities even in rewritten codes. Other machine-learning-based static analysis tools require features, which analysts create manually, as inputs. In contrast, Eth2Vec uses a neural network for language processing to automatically learn the features of vulnerable contracts. In doing so, Eth2Vec can detect vulnerabilities in smart contracts by comparing the similarities between the codes of a target contract and those of the learned contracts. We performed experiments with existing open databases, such as Etherscan, and Eth2Vec was able to outperform a recent model based on support vector machine in terms of well-known metrics, i.e., precision, recall, and F1-score.</p></div>","PeriodicalId":53141,"journal":{"name":"Blockchain-Research and Applications","volume":"3 4","pages":"Article 100101"},"PeriodicalIF":6.9000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096720922000422/pdfft?md5=c155d37a333d4b006542a4a3e93bd67c&pid=1-s2.0-S2096720922000422-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Blockchain-Research and Applications","FirstCategoryId":"1093","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2096720922000422","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties. Being the most prominent platform that supports smart contracts, Ethereum has been targeted by many attacks and plagued by security incidents. Consequently, many smart contract vulnerabilities have been discovered in the past decade. To detect and prevent such vulnerabilities, different security analysis tools, including static and dynamic analysis tools, have been created, but their performance decreases drastically when codes to be analyzed are constantly being rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool that detects smart contract vulnerabilities. Eth2Vec maintains its robustness against code rewrites; i.e., it can detect vulnerabilities even in rewritten codes. Other machine-learning-based static analysis tools require features, which analysts create manually, as inputs. In contrast, Eth2Vec uses a neural network for language processing to automatically learn the features of vulnerable contracts. In doing so, Eth2Vec can detect vulnerabilities in smart contracts by comparing the similarities between the codes of a target contract and those of the learned contracts. We performed experiments with existing open databases, such as Etherscan, and Eth2Vec was able to outperform a recent model based on support vector machine in terms of well-known metrics, i.e., precision, recall, and F1-score.

Eth2Vec:学习以太坊智能合约漏洞检测的合约范围代码表示
以太坊智能合约是在以太坊区块链上部署和执行的计算机程序,用于在互不信任的各方之间执行协议。作为支持智能合约的最突出的平台,以太坊一直是许多攻击的目标,并受到安全事件的困扰。因此,在过去十年中发现了许多智能合约漏洞。为了检测和防止此类漏洞,已经创建了不同的安全分析工具,包括静态和动态分析工具,但是当要分析的代码不断被重写时,它们的性能会急剧下降。在本文中,我们提出了Eth2Vec,这是一种基于机器学习的静态分析工具,可以检测智能合约漏洞。Eth2Vec对代码重写保持健壮性;也就是说,它甚至可以在重写的代码中检测到漏洞。其他基于机器学习的静态分析工具需要分析人员手动创建的功能作为输入。相比之下,Eth2Vec使用神经网络进行语言处理,自动学习脆弱合约的特征。通过这样做,Eth2Vec可以通过比较目标合约代码与学习合约代码之间的相似性来检测智能合约中的漏洞。我们对现有的开放数据库(如Etherscan)进行了实验,Eth2Vec能够在众所周知的指标(即精度,召回率和f1分数)方面优于基于支持向量机的最新模型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
11.30
自引率
3.60%
发文量
0
期刊介绍: Blockchain: Research and Applications is an international, peer reviewed journal for researchers, engineers, and practitioners to present the latest advances and innovations in blockchain research. The journal publishes theoretical and applied papers in established and emerging areas of blockchain research to shape the future of blockchain technology.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信