Maochun Xu, Qiang Liu, Gang Li, Chengmeng Li, Lei Ma, Ke Lin
Internet Technology Letters, vol. 8, no. 6. Published 2025-10-02. DOI: 10.1002/itl2.70155. https://onlinelibrary.wiley.com/doi/10.1002/itl2.70155
Enhanced RoBERTaSN Model for Industrial IoT Text Similarity Analysis in Smart Manufacturing Systems
In Industrial Internet of Things (IIoT) environments, smart manufacturing systems generate massive volumes of textual data (equipment logs, maintenance reports, etc.) that require accurate similarity analysis for fault diagnosis and predictive maintenance. Traditional methods underperform in Industry 5.0 scenarios because of technical vocabulary and domain-specific language. This paper presents RoBERTaSN, an enhanced model that combines RoBERTa with a Siamese network and adds self-attention and dual pooling optimized for industrial texts. It enables precise similarity calculations between fault descriptions and historical records. Experiments on industrial datasets (e.g., equipment fault logs, maintenance reports) yield 94.2% accuracy in fault diagnosis text matching, which is 7.8 percentage points higher than traditional TF-IDF (86.4%) and 6.0 percentage points higher than mainstream pretrained models (BERT: 88.2% accuracy; BiMPM: 84.67% F1-score). These results address semantic challenges in smart factories and advance Industry 5.0's goals of human–machine collaboration and intelligent decision-making.
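The abstract names the building blocks (a shared encoder in a Siamese arrangement, self-attention, dual pooling, and a similarity score) but gives no architectural details. The sketch below only illustrates how such pieces can fit together. It makes several assumptions that are not from the paper: "dual pooling" is taken to mean concatenated mean and max pooling, the self-attention is a parameter-free scaled dot-product with no learned projections, cosine similarity is used as the score, and the token vectors are made-up placeholders rather than RoBERTa outputs. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def self_attention(tokens: Sequence[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Assumption: no learned projections, purely for illustration.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        attended.append(
            [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
        )
    return attended


def mean_pool(tokens: Sequence[Vector]) -> Vector:
    """Average each embedding dimension over all tokens."""
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def max_pool(tokens: Sequence[Vector]) -> Vector:
    """Take the maximum of each embedding dimension over all tokens."""
    return [max(col) for col in zip(*tokens)]


def dual_pool(tokens: Sequence[Vector]) -> Vector:
    """Assumed form of dual pooling: mean and max vectors concatenated."""
    return mean_pool(tokens) + max_pool(tokens)


def encode(tokens: Sequence[Vector]) -> Vector:
    """One Siamese branch: self-attention followed by dual pooling."""
    return dual_pool(self_attention(tokens))


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def siamese_similarity(tokens_a: Sequence[Vector], tokens_b: Sequence[Vector]) -> float:
    """Apply the same encoder to both texts, then compare the sentence vectors."""
    return cosine_similarity(encode(tokens_a), encode(tokens_b))


# Placeholder token embeddings (hypothetical values, not model outputs).
fault_new = [[0.9, 0.1, 0.3], [0.8, 0.2, 0.1], [0.7, 0.0, 0.4]]
fault_old = [[0.8, 0.2, 0.3], [0.9, 0.1, 0.2]]
unrelated = [[-0.5, 0.9, -0.2], [-0.4, 0.8, -0.6]]

print("new vs. historical fault:", siamese_similarity(fault_new, fault_old))
print("new vs. unrelated record:", siamese_similarity(fault_new, unrelated))
```

In a real system the placeholder vectors would be replaced by contextual token embeddings from a pretrained encoder, and the attention and pooling layers would carry trained weights. The point here is only the Siamese pattern: both texts pass through the same `encode` function, so their sentence vectors live in the same space and can be compared directly.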