GIWRF-SMOTE: Gini impurity-based weighted random forest with SMOTE for effective malware attack and anomaly detection in IoT-Edge

Impact factor: 1.4 · JCR Q2 (Multidisciplinary Sciences)

Smart Science · Pub date: 2022-12-06 · DOI: 10.1080/23080477.2022.2152933

J. Manokaran, G. Vairavel

Smart Science 11 (1), pp. 276-292. Publication type: Journal Article.

Citations: 1

Abstract

The Internet of Things (IoT) is a smart technology that has transformed conventional ways of living into smart living. As IoT devices have become ubiquitous, malware attacks on IoT networks have also increased. Many studies have proposed methods to detect malware attacks, but these methods fall short in accuracy, error rate, and comprehensiveness. Cloud-based IoT infrastructure additionally introduces latency and security problems. Machine learning (ML)-based edge computing can overcome these problems by automating responses and moving computation closer to the network edge, where the data is created. This work compares the performance of six prominent ML algorithms, namely logistic regression (LR), naive Bayes (NB), support vector machine (SVM), decision tree (DT), random forest (RF), and k-nearest neighbor (KNN), for accurately predicting malware attacks in an IoT-edge environment. To improve the prediction accuracy of the ML algorithms, the imbalanced data is balanced using the synthetic minority oversampling technique (SMOTE), and the optimal features are selected using the Gini impurity-based weighted RF (GIWRF) feature selection technique. The experimental results show that, among the six ML algorithms, RF with GIWRF attained the highest accuracy, 99.39%.
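The two preprocessing ideas named in the abstract, SMOTE oversampling and Gini impurity-based feature scoring, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names (`gini_impurity`, `best_gini_decrease`, `smote`), the toy data, and the single-split scoring are all assumptions made for illustration. In a random forest, Gini importance is accumulated over many splits in many trees, and the paper's class weighting is not modeled here.

```python
import math
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) over the classes present in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_decrease(values, labels):
    """Largest impurity decrease achievable by one threshold split on a feature."""
    n = len(labels)
    parent = gini_impurity(labels)
    best = 0.0
    for threshold in set(values):
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        children = (len(left) / n) * gini_impurity(left) \
            + (len(right) / n) * gini_impurity(right)
        best = max(best, parent - children)
    return best


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up data: two features per sample; label 1 = attack (minority class).
benign = [(0.1, 5.0), (0.2, 4.0), (0.15, 6.0), (0.3, 5.5), (0.25, 4.5), (0.05, 5.2)]
attack = [(0.9, 5.1), (0.8, 4.9)]

# Oversample the minority class until both classes have the same size.
attack_balanced = attack + smote(attack, n_new=len(benign) - len(attack), k=1)
X = benign + attack_balanced
y = [0] * len(benign) + [1] * len(attack_balanced)

# Score each feature; a higher score means the feature separates the classes better.
scores = [best_gini_decrease([row[j] for row in X], y) for j in range(2)]
print(scores)
```

On this toy data the first feature separates the two classes perfectly while the second does not, so a selection step that keeps the highest-scoring features would retain the first one. Balancing before scoring matters because the impurity of a heavily skewed label set is already close to zero, which leaves little room for any split to show a decrease.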


Source journal

Smart Science (Engineering: Engineering (all))

CiteScore

4.70

Self-citation rate

4.30%

Publication count

Journal description: Smart Science (ISSN 2308-0477) is an international, peer-reviewed journal that publishes significant original scientific research, as well as reviews and analyses of current research and science policy. We welcome submissions of high-quality papers from all fields of science and from any source. Articles of an interdisciplinary nature are particularly welcome. Smart Science aims to be among the top multidisciplinary journals covering a broad spectrum of smart topics in the fields of materials science, chemistry, physics, engineering, medicine, and biology. Smart Science is currently focusing on the topics of Smart Manufacturing (CPS, IoT and AI) for Industry 4.0, Smart Energy, and Smart Chemistry and Materials. Other specific research areas covered by the journal include, but are not limited to:

1. Smart Science in the Future
2. Smart Manufacturing:
   - Cyber-Physical System (CPS)
   - Internet of Things (IoT) and Internet of Brain (IoB)
   - Artificial Intelligence
   - Smart Computing
   - Smart Design/Machine
   - Smart Sensing
   - Smart Information and Networks
3. Smart Energy and Thermal/Fluidic Science
4. Smart Chemistry and Materials