Traffic through two lenses: A dual-branch vision transformer for IoT traffic classification
Authors: Wen Yang, Chenxi Tang, Chaowei Tang, Jingwen Lu, Jing Si, Zhuo Zeng, Wenyu Ma
Journal: Computer Networks, volume 269, Article 111469 (JCR Q1, Computer Science, Hardware & Architecture; impact factor 4.6)
Published: 2025-06-23
DOI: 10.1016/j.comnet.2025.111469
URL: https://www.sciencedirect.com/science/article/pii/S1389128625004360
Citations: 0
Abstract
Internet of Things (IoT) traffic classification identifies different communication activities by analyzing network traffic, helping to ensure efficient and reliable Quality of Service (QoS) for businesses. However, with the dramatic increase in the number of IoT devices, traffic from smart homes, industrial sensor networks, and other sources has become highly diverse and complex. This not only makes network resource management much harder but also poses a serious threat to cyberspace security. To address this challenge, we propose Bimodal TrafficNet, a network traffic classification model based on the Vision Transformer (ViT) that improves classification accuracy and generalization by fusing two modalities: traffic images and statistical features. The model contains two core branches: a traffic image branch (I-Branch) and a statistical feature branch (F-Branch). The I-Branch focuses on capturing the detailed features of network traffic, whereas the F-Branch focuses on the global behavior of traffic and uses a Bimodal Cross-Attention (BCA) module to strengthen the synergy between the two branches. In addition, the I-Branch introduces a Pixel-Level Interactive Attention (PLIA) module to further refine the representation of traffic image features. Experimental results show that Bimodal TrafficNet outperforms existing methods on four public datasets: Edge-IIoTset, CICIoT2022, ISCXVPN2016, and USTC-TFC2016.
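The abstract does not specify how the BCA module is built, so the following is only a minimal sketch of the general idea of cross-attention between two token sets, not the authors' implementation. It assumes single-head scaled dot-product attention with no learned projections, and the fusion step (residual add, mean pooling, concatenation) and the names `cross_attention` and `fuse_bimodal` are hypothetical, chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query token attends over all key/value tokens."""
    dim = len(keys[0])
    attended = []
    for q in queries:
        # Similarity of this query to each key, scaled by sqrt(dim)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        attended.append(out)
    return attended


def mean_pool(tokens):
    """Average a list of token vectors into one vector."""
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def fuse_bimodal(image_tokens, stat_tokens):
    """Hypothetical fusion (an assumption, not from the paper): each branch queries
    the other, the result is added back residually, and the pooled outputs are concatenated."""
    img_ctx = cross_attention(image_tokens, stat_tokens, stat_tokens)
    stat_ctx = cross_attention(stat_tokens, image_tokens, image_tokens)
    img_out = [[a + b for a, b in zip(t, c)] for t, c in zip(image_tokens, img_ctx)]
    stat_out = [[a + b for a, b in zip(t, c)] for t, c in zip(stat_tokens, stat_ctx)]
    return mean_pool(img_out) + mean_pool(stat_out)


if __name__ == "__main__":
    # Toy inputs: three image-patch tokens and one statistical-feature token, dimension 2
    image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    stat_tokens = [[0.5, 0.5]]
    print(fuse_bimodal(image_tokens, stat_tokens))
```

In a real ViT-style model the queries, keys, and values would pass through learned linear projections and multiple heads; those are omitted here to keep the sketch dependency-free.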
About the journal:
Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.