基于深度学习的输入类型对智能合约漏洞检测性能的影响:初步研究

Information Pub Date : 2024-05-24 DOI:10.3390/info15060302
Izdehar M. Aldyaflah, Wenbing Zhao, Shunkun Yang, Xiong Luo
{"title":"基于深度学习的输入类型对智能合约漏洞检测性能的影响:初步研究","authors":"Izdehar M. Aldyaflah, Wenbing Zhao, Shunkun Yang, Xiong Luo","doi":"10.3390/info15060302","DOIUrl":null,"url":null,"abstract":"Stemming vulnerabilities out of a smart contract prior to its deployment is essential to ensure the security of decentralized applications. As such, numerous tools and machine-learning-based methods have been proposed to help detect vulnerabilities in smart contracts. Furthermore, various ways of encoding the smart contracts for analysis have also been proposed. However, the impact of these input methods has not been systematically studied, which is the primary goal of this paper. In this preliminary study, we experimented with four common types of input, including Word2Vec, FastText, Bag-of-Words (BoW), and Term Frequency–Inverse Document Frequency (TF-IDF). To focus on the comparison of these input types, we used the same deep-learning model, i.e., convolutional neural networks, in all experiments. Using a public dataset, we compared the vulnerability detection performance of the four input types both in the binary classification scenarios and the multiclass classification scenario. Our findings show that TF-IDF is the best overall input type among the four. TF-IDF has excellent detection performance in all scenarios: (1) it has the best F1 score and accuracy in binary classifications for all vulnerability types except for the delegate vulnerability where TF-IDF comes in a close second, and (2) it comes in a very close second behind BoW (within 0.8%) in the multiclass classification.","PeriodicalId":510156,"journal":{"name":"Information","volume":"7 40","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Impact of Input Types on Smart Contract Vulnerability Detection Performance Based on Deep Learning: A Preliminary Study\",\"authors\":\"Izdehar M. Aldyaflah, Wenbing Zhao, Shunkun Yang, Xiong Luo\",\"doi\":\"10.3390/info15060302\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Stemming vulnerabilities out of a smart contract prior to its deployment is essential to ensure the security of decentralized applications. As such, numerous tools and machine-learning-based methods have been proposed to help detect vulnerabilities in smart contracts. Furthermore, various ways of encoding the smart contracts for analysis have also been proposed. However, the impact of these input methods has not been systematically studied, which is the primary goal of this paper. In this preliminary study, we experimented with four common types of input, including Word2Vec, FastText, Bag-of-Words (BoW), and Term Frequency–Inverse Document Frequency (TF-IDF). To focus on the comparison of these input types, we used the same deep-learning model, i.e., convolutional neural networks, in all experiments. Using a public dataset, we compared the vulnerability detection performance of the four input types both in the binary classification scenarios and the multiclass classification scenario. Our findings show that TF-IDF is the best overall input type among the four. TF-IDF has excellent detection performance in all scenarios: (1) it has the best F1 score and accuracy in binary classifications for all vulnerability types except for the delegate vulnerability where TF-IDF comes in a close second, and (2) it comes in a very close second behind BoW (within 0.8%) in the multiclass classification.\",\"PeriodicalId\":510156,\"journal\":{\"name\":\"Information\",\"volume\":\"7 40\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/info15060302\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/info15060302","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

要确保去中心化应用的安全性,必须在部署智能合约之前将漏洞扼杀在摇篮里。因此,人们提出了许多工具和基于机器学习的方法来帮助检测智能合约中的漏洞。此外,人们还提出了各种对智能合约进行编码的分析方法。然而,这些输入方法的影响尚未得到系统研究,而这正是本文的主要目标。在这项初步研究中,我们尝试了四种常见的输入类型,包括 Word2Vec、FastText、词包(BoW)和词频-反向文档频率(TF-IDF)。为了集中比较这些输入类型,我们在所有实验中使用了相同的深度学习模型,即卷积神经网络。我们使用公共数据集,比较了四种输入类型在二元分类场景和多分类场景下的漏洞检测性能。我们的研究结果表明,TF-IDF 是四种输入类型中整体性能最好的一种。TF-IDF 在所有场景下都具有出色的检测性能:(1) 在所有漏洞类型的二进制分类中,TF-IDF 的 F1 分数和准确率都是最高的,只有委托漏洞紧随其后;(2) 在多类分类中,TF-IDF 紧随 BoW 之后(0.8% 以内)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
The Impact of Input Types on Smart Contract Vulnerability Detection Performance Based on Deep Learning: A Preliminary Study
Stemming vulnerabilities out of a smart contract prior to its deployment is essential to ensure the security of decentralized applications. As such, numerous tools and machine-learning-based methods have been proposed to help detect vulnerabilities in smart contracts. Furthermore, various ways of encoding the smart contracts for analysis have also been proposed. However, the impact of these input methods has not been systematically studied, which is the primary goal of this paper. In this preliminary study, we experimented with four common types of input, including Word2Vec, FastText, Bag-of-Words (BoW), and Term Frequency–Inverse Document Frequency (TF-IDF). To focus on the comparison of these input types, we used the same deep-learning model, i.e., convolutional neural networks, in all experiments. Using a public dataset, we compared the vulnerability detection performance of the four input types both in the binary classification scenarios and the multiclass classification scenario. Our findings show that TF-IDF is the best overall input type among the four. TF-IDF has excellent detection performance in all scenarios: (1) it has the best F1 score and accuracy in binary classifications for all vulnerability types except for the delegate vulnerability where TF-IDF comes in a close second, and (2) it comes in a very close second behind BoW (within 0.8%) in the multiclass classification.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信