GSVD: Common Vulnerability Dataset for Smart Contracts on BSC and Polygon

Journal: Advanced Information Technologies and Applications
Published: 2023-03-25
DOI: 10.5121/csit.2023.130601 (https://doi.org/10.5121/csit.2023.130601)

Abstract

The blockchain 2.0 era, marked by smart contracts and Ethereum, arrived a few years ago. Its technologies have expanded the application scenarios of blockchain and driven the boom in decentralized finance. However, smart contract vulnerabilities and security issues keep emerging, and hackers have exploited them to cause huge economic losses. In recent years, a large body of research on analyzing and detecting smart contract vulnerabilities has appeared, but there is still no common detection tool or corresponding test dataset. In this paper, we build GSVD (Generalized Smart Contract Vulnerability Dataset), which consists of four offline datasets built from smart contracts on two chains, Polygon and BSC: two small Solidity datasets containing 153 labeled smart contract source files, which can be used to test the performance of vulnerability mining tools; and two large Solidity datasets containing 52,202 unlabeled real-world smart contract source files, which can be used to validate theories and tools against a large volume of real data. The paper also provides a scripting framework that accompanies the GSVD dataset; it runs a variety of popular automated vulnerability detection tools on these datasets and generates analysis results for the contracts and their potential vulnerabilities. We tested the Minor dataset in GSVD with three actively maintained tools (Slither, Manticore, and Mythril) and found that all tools combined detected 61.1% of the labeled vulnerabilities; Mythril had the highest individual detection rate, 42.6%. We conclude that current smart contract vulnerability mining tools still have ample room for improvement, owing to the limitations of their underlying methods. In addition, our dataset can contribute to that goal by supplying mining tools with plentiful information from real contracts.
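The abstract reports a per-tool detection rate and a combined rate but does not say how findings are matched against labels. The following is a minimal sketch of one plausible way to compute both figures, assuming a finding counts as a detection when its (contract file, vulnerability category) pair matches a label. The function names, the category strings, and the toy data are all hypothetical and are not taken from the GSVD framework.

```python
from typing import Dict, Set, Tuple

# Assumption: a labeled vulnerability is identified by (contract file, category).
Finding = Tuple[str, str]


def detection_rate(labeled: Set[Finding], reported: Set[Finding]) -> float:
    """Fraction of labeled vulnerabilities that appear among the reported findings."""
    if not labeled:
        raise ValueError("labeled set must not be empty")
    return len(labeled & reported) / len(labeled)


def summarize(labeled: Set[Finding],
              per_tool: Dict[str, Set[Finding]]) -> Dict[str, float]:
    """Per-tool rates plus a 'combined' rate over the union of all tools' findings."""
    rates = {tool: detection_rate(labeled, found) for tool, found in per_tool.items()}
    combined: Set[Finding] = set().union(*per_tool.values()) if per_tool else set()
    rates["combined"] = detection_rate(labeled, combined)
    return rates


# Toy data for illustration only; these are not GSVD results.
labeled = {
    ("a.sol", "reentrancy"),
    ("b.sol", "integer-overflow"),
    ("c.sol", "unchecked-call"),
    ("d.sol", "tx-origin"),
}
per_tool = {
    "slither": {("a.sol", "reentrancy"), ("x.sol", "reentrancy")},  # one hit, one unlabeled report
    "mythril": {("a.sol", "reentrancy"), ("b.sol", "integer-overflow")},
    "manticore": set(),
}
rates = summarize(labeled, per_tool)
print(rates)
```

Because the combined rate is taken over the union of findings, it is at least as high as the best single tool and at most the sum of the individual rates; overlapping detections are counted once.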