Authors: Guoying Sun; Yanan Cheng; Ke Kong; Zhaoxin Zhang; Dong Zhao
Journal: IEEE Transactions on Industrial Informatics, vol. 21, no. 5, pp. 3966-3975 (Journal Article)
Publication date: 2025-02-19
DOI: 10.1109/TII.2025.3537607
URL: https://ieeexplore.ieee.org/document/10892358/
Impact factor: 9.9; JCR: Q1 (Automation & Control Systems)
Text Classification Based on Label Data Augmentation and Graph Neural Network
Although methods based on graph neural networks can handle the uneven text lengths found in text classification datasets, they struggle with the data sparsity of short texts. Some researchers try to reduce the sparsity of the graph by adding labels to its structure, but most treat labels only as another kind of node feature alongside words and documents, which is not sufficient to construct denser matrices. To address these problems, three label data augmentation strategies are proposed to build a dense graph, and attention mechanisms are used to update node features. In addition, a node feature updating method that uses global and local weights simultaneously is proposed. Multiple comparative experiments on five benchmark datasets show that the proposed method outperforms the compared methods, with accuracy and micro-F1 improving by at least 0.012 on four of the benchmark datasets.
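The abstract does not spell out the three augmentation strategies, so the following is only a minimal sketch of the general idea it builds on: adding label nodes to a document-word graph makes the graph denser. The node types, the toy corpus, the `build_edges` and `density` helpers, and the rule of linking a label to the words of its training documents are all illustrative assumptions, not the authors' method.

```python
def build_edges(docs, labels=None):
    """Return undirected edges over document, word, and (optionally) label nodes.

    docs   : list of token lists
    labels : list of class names, with None for unlabeled (test) documents
    Assumed linking rule: a label connects to each of its training documents
    and to every word those documents contain.
    """
    edges = set()
    for i, tokens in enumerate(docs):
        for w in set(tokens):
            edges.add((("doc", i), ("word", w)))
    if labels is not None:
        for i, y in enumerate(labels):
            if y is None:
                continue  # no label information may leak from test documents
            edges.add((("doc", i), ("label", y)))
            for w in set(docs[i]):
                edges.add((("label", y), ("word", w)))
    return edges


def density(edges):
    """Fraction of possible node pairs that are connected."""
    nodes = {n for e in edges for n in e}
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


# Hypothetical toy corpus of short texts; the last document is unlabeled.
docs = [["cheap", "flights"], ["flight", "delay"],
        ["stock", "price"], ["stock", "drop"]]
labels = ["travel", "travel", "finance", None]

plain = build_edges(docs)
augmented = build_edges(docs, labels)
print(f"density without labels: {density(plain):.3f}")
print(f"density with labels:    {density(augmented):.3f}")
```

On this toy corpus the label nodes add both document-label and label-word edges, so the augmented graph is denser than the plain document-word graph. How the paper weights these edges, and how its attention-based update combines global and local weights, cannot be recovered from the abstract.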
About the journal:
The IEEE Transactions on Industrial Informatics is a multidisciplinary journal dedicated to publishing technical papers that connect theory with practical applications of informatics in industrial settings. It focuses on the utilization of information in intelligent, distributed, and agile industrial automation and control systems. The scope includes topics such as knowledge-based and AI-enhanced automation, intelligent computer control systems, flexible and collaborative manufacturing, industrial informatics in software-defined vehicles and robotics, computer vision, industrial cyber-physical and industrial IoT systems, real-time and networked embedded systems, security in industrial processes, industrial communications, systems interoperability, and human-machine interaction.