A Machine Learning approach for anomaly detection on the Internet of Things based on Locality-Sensitive Hashing
Mireya Lucia Hernandez-Jaimes, Alfonso Martinez-Cruz, Kelsey Alejandra Ramírez-Gutiérrez
Journal: Integration, the VLSI Journal (impact factor 2.2; JCR Q3, Computer Science, Hardware & Architecture)
Published: 2024-01-25
DOI: 10.1016/j.vlsi.2024.102159
URL: https://www.sciencedirect.com/science/article/pii/S0167926024000221
Citations: 0
Abstract
The growing connectivity of devices on the Internet of Things (IoT) has created fertile ground for attacks. Consequently, current anomaly-based intrusion detection systems (AIDS) integrate artificial intelligence algorithms, such as machine learning (ML) and deep learning (DL), to manage high data volumes, recognize complex patterns, and detect unknown anomalies. However, the effectiveness of these methods depends on the quality and meaningfulness of the features extracted from IoT communications. As the IoT grows, feature extraction and selection are becoming increasingly difficult because of data heterogeneity, the massive volume of generated data, and the lack of feature standardization. Moreover, current proposals rely on complex feature extraction and selection techniques. This study therefore introduces a novel approach to ML modeling, using decision trees and random forests, to detect anomalies in the IoT. It aims to remove the dependency on feature extraction and selection by integrating fingerprinting techniques based on locality-sensitive hashing (LSH), which represent network packet information in a format suitable for ML modeling and for detecting harmful sequences of network packets. Anomaly detection performance was assessed on two benchmark IoT datasets, ToN-IoT and MQTT-IoT, which contain cyberattacks that threaten IoT networks. The proposal outperforms other methods in accuracy, precision, and false positive rate (FPR), with values of 99.82%, 99.93%, and 0.13%, respectively.
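As a rough illustration of the fingerprinting idea in the abstract, the sketch below computes an LSH fingerprint of a raw packet payload and turns it into a fixed-length bit vector that a decision tree or random forest could consume. The abstract does not name the LSH function the authors use, so SimHash is swapped in here as one common LSH scheme; the 3-byte n-grams, the 64-bit width, the sample payloads, and every function name are assumptions made for illustration only. The last helper shows how accuracy, precision, and FPR follow from confusion-matrix counts.

```python
import hashlib
from typing import Dict, List


def simhash(data: bytes, ngram: int = 3, bits: int = 64) -> int:
    """SimHash fingerprint over overlapping byte n-grams (illustrative LSH choice)."""
    counts = [0] * bits
    # Slide a window over the payload; an empty or short payload yields one gram.
    for i in range(max(len(data) - ngram + 1, 1)):
        gram = data[i:i + ngram]
        digest = hashlib.blake2b(gram, digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Each n-gram votes +1 or -1 on every bit position.
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    fingerprint = 0
    for b in range(bits):
        if counts[b] > 0:
            fingerprint |= 1 << b
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def to_bit_vector(fingerprint: int, bits: int = 64) -> List[int]:
    """Expand a fingerprint into a fixed-length 0/1 feature vector."""
    return [(fingerprint >> b) & 1 for b in range(bits)]


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, and false positive rate from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),
    }


if __name__ == "__main__":
    # Hypothetical payloads; not taken from ToN-IoT or MQTT-IoT.
    a = simhash(b"PUBLISH sensors/temp 21.5")
    b = simhash(b"PUBLISH sensors/temp 21.7")
    c = simhash(b"\x00\xff" * 12)
    print("near-duplicate distance:", hamming(a, b))
    print("unrelated distance:", hamming(a, c))
    print("feature vector length:", len(to_bit_vector(a)))
```

Because similar payloads tend to share most n-grams, their fingerprints tend to differ in few bits, which is what lets a classifier work on the fingerprint directly instead of on hand-engineered features. The bit vectors could then be passed to any tree-based learner.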
About the journal:
Integration's aim is to cover every aspect of the VLSI area, with an emphasis on cross-fertilization between various fields of science, and the design, verification, test and applications of integrated circuits and systems, as well as closely related topics in process and device technologies. Individual issues will feature peer-reviewed tutorials and articles as well as reviews of recent publications. The intended coverage of the journal can be assessed by examining the following (non-exclusive) list of topics:
Specification methods and languages; Analog/Digital Integrated Circuits and Systems; VLSI architectures; Algorithms, methods and tools for modeling, simulation, synthesis and verification of integrated circuits and systems of any complexity; Embedded systems; High-level synthesis for VLSI systems; Logic synthesis and finite automata; Testing, design-for-test and test generation algorithms; Physical design; Formal verification; Algorithms implemented in VLSI systems; Systems engineering; Heterogeneous systems.