IoT device identification method based on transformer and clustering

Impact factor 4.6; Zone 2 (Computer Science); JCR Q1 (Computer Science, Hardware & Architecture)
Litong Deng, Dinglin Gu, Zhi Lin
Computer Networks, volume 273, Article 111791. Published 2025-10-15.
DOI: 10.1016/j.comnet.2025.111791
https://www.sciencedirect.com/science/article/pii/S1389128625007571
Citations: 0

Abstract

With the rapid proliferation of Internet of Things (IoT) technologies, mitigating unauthorized device intrusions and impersonation attacks has become a critical security challenge. Device identification plays a crucial role in detecting anomalous behavior and thereby improves security during device operation. However, existing identification methods rely predominantly on manual feature engineering, which requires extensive domain knowledge and a time-consuming feature selection process. This not only increases computational overhead but also risks omitting essential information, which limits identification performance. To address these challenges, this paper proposes a sample construction method that converts network traffic into multibyte token sequences and uses the Transformer architecture to model both the temporal and contextual relationships of raw traffic packets. This approach eliminates the need for complex feature engineering and generates samples from just one minute of network traffic, enabling accurate and efficient IoT device identification. To tackle the open-set identification problem and strengthen security management during device access, the study extends the end-to-end identification framework by integrating metric learning with HDBSCAN clustering to generate distinctive device fingerprints. The method not only classifies known devices effectively but also reliably detects previously unseen devices. Experimental results on two public datasets, UNSW and Yourthings, show accuracy rates of 99.89% and 99.68%, respectively. The method also outperforms existing approaches in recognition accuracy, generalization capability, and scalability.
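The sample construction step described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the abstract gives only the one-minute window and the idea of multibyte tokens, so the token width (`TOKEN_BYTES`), the per-packet byte limit (`MAX_PACKET_BYTES`), the separator token (`SEP_TOKEN`), and the function names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical values for illustration; the abstract does not state them.
TOKEN_BYTES = 2         # bytes per token
MAX_PACKET_BYTES = 64   # leading bytes kept from each packet
WINDOW_SECONDS = 60     # one-minute window, as stated in the abstract
SEP_TOKEN = 256 ** TOKEN_BYTES  # hypothetical separator id, outside the byte-token range


def packet_to_tokens(payload: bytes) -> List[int]:
    """Split the leading bytes of one packet into fixed-width integer tokens."""
    data = payload[:MAX_PACKET_BYTES]
    # Zero-pad so the length is a multiple of TOKEN_BYTES.
    if len(data) % TOKEN_BYTES:
        data += b"\x00" * (TOKEN_BYTES - len(data) % TOKEN_BYTES)
    return [
        int.from_bytes(data[i:i + TOKEN_BYTES], "big")
        for i in range(0, len(data), TOKEN_BYTES)
    ]


def build_samples(packets: Iterable[Tuple[float, bytes]]) -> Dict[int, List[int]]:
    """Group (timestamp, raw_bytes) packets into one-minute windows.

    Each window becomes one token sequence: the packets' tokens in
    arrival order, with a separator token appended after each packet.
    """
    samples: Dict[int, List[int]] = defaultdict(list)
    for ts, payload in sorted(packets, key=lambda p: p[0]):
        window = int(ts // WINDOW_SECONDS)
        samples[window].extend(packet_to_tokens(payload))
        samples[window].append(SEP_TOKEN)
    return dict(samples)
```

Each resulting sequence would then be fed to a Transformer encoder as one sample. How the paper orders, truncates, or separates packets within a window is not specified in the abstract, so those choices here are placeholders.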
Source journal
Computer Networks (Engineering & Technology: Telecommunications)
CiteScore: 10.80
Self-citation rate: 3.60%
Articles published: 434
Review time: 8.6 months
Journal description: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.