Enhanced RoBERTaSN Model for Industrial IoT Text Similarity Analysis in Smart Manufacturing Systems
Maochun Xu, Qiang Liu, Gang Li, Chengmeng Li, Lei Ma, Ke Lin
Internet Technology Letters, 8(6), published 2025-10-02. DOI: 10.1002/itl2.70155
https://onlinelibrary.wiley.com/doi/10.1002/itl2.70155
Citations: 0
Abstract
In Industrial Internet of Things (IIoT) environments, smart manufacturing systems generate large volumes of text, such as equipment logs and maintenance reports, and fault diagnosis and predictive maintenance depend on accurate similarity analysis of that text. Traditional methods underperform in Industry 5.0 scenarios because of technical vocabulary and domain-specific language. This paper presents RoBERTaSN, an enhanced model that combines RoBERTa with a Siamese network and adds self-attention and dual pooling optimized for industrial texts. It enables precise similarity calculation between fault descriptions and historical records. In experiments on industrial datasets (e.g., equipment fault logs and maintenance reports), RoBERTaSN reaches 94.2% accuracy in fault diagnosis text matching, 7.8 percentage points above traditional TF-IDF (86.4%) and 6.0 percentage points above BERT (88.2% accuracy); BiMPM is reported at an 84.67% F1-score. These results address semantic challenges in smart factories and support the Industry 5.0 goals of human-machine collaboration and intelligent decision-making.
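The Siamese comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define "dual pooling", so the sketch assumes it means concatenating mean pooling and max pooling, and it assumes cosine similarity as the comparison function. The token vectors are toy values standing in for the output of a shared RoBERTa encoder, the self-attention layer is omitted, and all function names are hypothetical.

```python
import math


def mean_pool(tokens):
    """Average each embedding dimension over all token vectors."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def max_pool(tokens):
    """Take the maximum of each embedding dimension over all token vectors."""
    return [max(col) for col in zip(*tokens)]


def dual_pool(tokens):
    """Assumed form of 'dual pooling': mean-pooled and max-pooled vectors concatenated."""
    return mean_pool(tokens) + max_pool(tokens)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def siamese_similarity(tokens_a, tokens_b):
    """Both inputs go through the same pooling (the shared 'twin' branch), then are compared."""
    return cosine(dual_pool(tokens_a), dual_pool(tokens_b))


# Toy 2-dimensional token embeddings (hypothetical values, not real encoder output).
fault_report = [[1.0, 0.0], [0.0, 1.0]]
history_same = [[0.0, 1.0], [1.0, 0.0]]      # same tokens, different order
history_other = [[-1.0, 0.0], [0.0, -1.0]]   # opposite direction in embedding space

sim_same = siamese_similarity(fault_report, history_same)
sim_other = siamese_similarity(fault_report, history_other)
print(sim_same, sim_other)
```

Because both texts pass through identical pooling, reordering the tokens does not change the pooled vector in this toy setup, so the first pair scores as a near-perfect match while the second pair scores below zero. In a real system the encoder's contextual embeddings would make the result order-sensitive.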