Preliminary Study on the Effect of Traffic Representation on Accuracy Degradation in Machine Learning-based IoT Device Identification
Nik Aqil, Firdaus Afifi, Faiz Zaki, N. B. Anuar
2022 IEEE International Conference on Computing (ICOCO), 2022-11-14
DOI: 10.1109/ICOCO56118.2022.10031725 (https://doi.org/10.1109/ICOCO56118.2022.10031725)
Citations: 0
Abstract
The Internet of Things (IoT) has gained attention for its rapid growth over the past few years. IoT devices such as temperature and humidity sensors and voice controllers are deployed widely, from household appliances to industrial machines. However, alongside this rapid growth and the benefits IoT offers, users are exposed to various security threats, such as data breaches and IoT-specific malware. Researchers use IoT device identification as one solution to these security issues: it helps network administrators attribute network traffic to its originating devices. However, researchers often overlook an important issue in IoT device identification, namely accuracy degradation over time. This paper therefore explores the severity of accuracy degradation in IoT device identification across three traffic representation approaches: flow, sub-flow, and packet. The experiments use a private dataset and the public UNSW IoT Traffic Traces dataset. Based on the experimental findings, the sub-flow-based approach recorded the best overall performance, with only 8% degradation on the private dataset and 1% degradation on the public dataset. Meanwhile, although the packet-based approach degraded by only 5% on the private dataset, it recorded an accuracy decrease of up to 11% on the public dataset.
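The abstract does not say how degradation is computed, so the following is only a minimal sketch of one plausible reading: the drop, in percentage points, between a model's accuracy on traffic captured soon after training and its accuracy on traffic captured later. The function names, device labels, and predictions are hypothetical and are not taken from the paper or its datasets.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted device label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def degradation_points(baseline_acc, later_acc):
    """Accuracy drop in percentage points between two evaluation periods.

    Assumption: degradation is an absolute difference, not a relative one.
    """
    return (baseline_acc - later_acc) * 100


# Hypothetical labels: the same trained model is evaluated on traffic
# captured shortly after training and on traffic captured later.
early_true = ["camera", "plug", "sensor", "camera", "plug"]
early_pred = ["camera", "plug", "sensor", "camera", "plug"]
late_true = ["camera", "plug", "sensor", "camera", "plug"]
late_pred = ["camera", "plug", "plug", "camera", "sensor"]

baseline = accuracy(early_true, early_pred)
later = accuracy(late_true, late_pred)
drop = degradation_points(baseline, later)

print(f"baseline={baseline:.2f} later={later:.2f} drop={drop:.1f} points")
```

Running the same comparison once per representation (flow, sub-flow, packet) would give the kind of side-by-side degradation figures the abstract reports, under the stated assumption about how degradation is defined.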