IoT device identification method based on transformer and clustering
Authors: Litong Deng, Dinglin Gu, Zhi Lin
Journal: Computer Networks, Volume 273, Article 111791
Publication date: 2025-10-15
DOI: 10.1016/j.comnet.2025.111791
URL: https://www.sciencedirect.com/science/article/pii/S1389128625007571
With the rapid proliferation of Internet of Things (IoT) technologies, mitigating unauthorized device intrusions and impersonation attacks has become a critical security challenge. Device identification plays a crucial role in detecting anomalous behavior and thereby improves security during device operation. However, existing identification methods rely predominantly on manual feature engineering, which requires extensive domain knowledge and a time-consuming feature selection process. This not only increases computational overhead but also risks omitting essential information, which limits identification performance. To address these challenges, this paper proposes a sample construction method that converts network traffic into multibyte token sequences and uses the Transformer architecture to model both the temporal and contextual relationships of raw traffic packets. This approach eliminates the need for complex feature engineering and generates samples from just one minute of network traffic, enabling accurate and efficient IoT device identification. To tackle the open-set identification problem and strengthen security management during device access, the study extends the end-to-end identification framework by integrating metric learning with HDBSCAN clustering to generate distinctive device fingerprints. The method not only classifies known devices effectively but also reliably detects previously unseen devices. Experimental results on two public datasets, UNSW and Yourthings, show that the proposed method attains accuracy rates of 99.89% and 99.68%, respectively. It also outperforms existing approaches in recognition accuracy, generalization capability, and scalability.
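The sample construction step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the token width, the padding rule, the sequence length, or how packets are grouped, so everything below is an assumption made for illustration: two-byte big-endian tokens, zero padding of a trailing partial chunk, a 512-token cap, and grouping by a hypothetical device identifier into 60-second windows. The names `Packet`, `bytes_to_tokens`, and `build_samples` are likewise hypothetical and do not come from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical packet record: (timestamp in seconds, device id, raw bytes).
Packet = Tuple[float, str, bytes]


def bytes_to_tokens(payload: bytes, width: int = 2) -> List[int]:
    """Split raw bytes into fixed-width big-endian integer tokens.

    A trailing partial chunk is zero-padded on the right (an assumption).
    """
    tokens = []
    for i in range(0, len(payload), width):
        chunk = payload[i:i + width].ljust(width, b"\x00")
        tokens.append(int.from_bytes(chunk, "big"))
    return tokens


def build_samples(
    packets: Iterable[Packet],
    window_s: float = 60.0,
    width: int = 2,
    max_len: int = 512,
) -> Dict[Tuple[str, int], List[int]]:
    """Build one token sequence per (device, time window).

    Packets are processed in timestamp order, so each sequence preserves
    the temporal order of the traffic. Sequences are cut off at max_len.
    """
    samples: Dict[Tuple[str, int], List[int]] = defaultdict(list)
    for ts, device, payload in sorted(packets, key=lambda p: p[0]):
        key = (device, int(ts // window_s))
        room = max_len - len(samples[key])
        if room > 0:
            samples[key].extend(bytes_to_tokens(payload, width)[:room])
    return dict(samples)


if __name__ == "__main__":
    demo = [
        (0.5, "cam", b"\x00\x01"),
        (61.0, "cam", b"\xff"),
        (10.0, "cam", b"\x00\x02\x00\x03"),
    ]
    for key, seq in build_samples(demo).items():
        print(key, seq)
```

Each resulting integer sequence could then be fed to an embedding layer of a sequence model such as a Transformer encoder. With two-byte tokens the vocabulary has 65,536 entries, which is one reason a real system might choose a different width or reserve extra identifiers for padding and separators.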
About the journal:
Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.