将无监督语义和语法深度集成到异构图中,用于归纳文本分类

IF 5 2区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yue Gao, Xiangling Fu, Xien Liu, Ji Wu
{"title":"将无监督语义和语法深度集成到异构图中,用于归纳文本分类","authors":"Yue Gao, Xiangling Fu, Xien Liu, Ji Wu","doi":"10.1007/s40747-023-01228-8","DOIUrl":null,"url":null,"abstract":"Abstract Graph-based neural networks and unsupervised pre-trained models are both cutting-edge text representation methods, given their outstanding ability to capture global information and contextualized information, respectively. However, both representation methods meet obstacles to further performance improvements. On one hand, graph-based neural networks lack knowledge orientation to guide textual interpretation during global information interaction. On the other hand, unsupervised pre-trained models imply rich semantic and syntactic knowledge which lacks sufficient induction and expression. Therefore, how to effectively integrate graph-based global information and unsupervised contextualized semantic and syntactic information to achieve better text representation is an important issue pending for solution. In this paper, we propose a representation method that deeply integrates Unsupervised Semantics and Syntax into heterogeneous Graphs (USS-Graph) for inductive text classification. By constructing a heterogeneous graph whose edges and nodes are totally generated by knowledge from unsupervised pre-trained models, USS-Graph can harmonize the two perspectives of information under a bidirectionally weighted graph structure and thereby realizing the intra-fusion of graph-based global information and unsupervised contextualized semantic and syntactic information. Based on USS-Graph, we also propose a series of optimization measures to further improve the knowledge integration and representation performance. Extensive experiments conducted on benchmark datasets show that USS-Graph consistently achieves state-of-the-art performances on inductive text classification tasks. Additionally, extended experiments are conducted to deeply analyze the characteristics of USS-Graph and the effectiveness of our proposed optimization measures for further knowledge integration and information complementation.","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"31 1","pages":"0"},"PeriodicalIF":5.0000,"publicationDate":"2023-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deeply integrating unsupervised semantics and syntax into heterogeneous graphs for inductive text classification\",\"authors\":\"Yue Gao, Xiangling Fu, Xien Liu, Ji Wu\",\"doi\":\"10.1007/s40747-023-01228-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Graph-based neural networks and unsupervised pre-trained models are both cutting-edge text representation methods, given their outstanding ability to capture global information and contextualized information, respectively. However, both representation methods meet obstacles to further performance improvements. On one hand, graph-based neural networks lack knowledge orientation to guide textual interpretation during global information interaction. On the other hand, unsupervised pre-trained models imply rich semantic and syntactic knowledge which lacks sufficient induction and expression. Therefore, how to effectively integrate graph-based global information and unsupervised contextualized semantic and syntactic information to achieve better text representation is an important issue pending for solution. In this paper, we propose a representation method that deeply integrates Unsupervised Semantics and Syntax into heterogeneous Graphs (USS-Graph) for inductive text classification. By constructing a heterogeneous graph whose edges and nodes are totally generated by knowledge from unsupervised pre-trained models, USS-Graph can harmonize the two perspectives of information under a bidirectionally weighted graph structure and thereby realizing the intra-fusion of graph-based global information and unsupervised contextualized semantic and syntactic information. Based on USS-Graph, we also propose a series of optimization measures to further improve the knowledge integration and representation performance. Extensive experiments conducted on benchmark datasets show that USS-Graph consistently achieves state-of-the-art performances on inductive text classification tasks. Additionally, extended experiments are conducted to deeply analyze the characteristics of USS-Graph and the effectiveness of our proposed optimization measures for further knowledge integration and information complementation.\",\"PeriodicalId\":10524,\"journal\":{\"name\":\"Complex & Intelligent Systems\",\"volume\":\"31 1\",\"pages\":\"0\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2023-09-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Complex & Intelligent Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s40747-023-01228-8\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s40747-023-01228-8","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

基于图的神经网络和无监督预训练模型都是前沿的文本表示方法,它们分别具有捕获全局信息和上下文化信息的出色能力。然而,这两种表示方法在进一步提高性能方面都遇到了障碍。一方面,基于图的神经网络在全局信息交互过程中缺乏知识导向来指导文本解释。另一方面,无监督预训练模型隐含着丰富的语义和句法知识,但缺乏足够的归纳和表达。因此,如何有效地将基于图的全局信息与无监督的上下文化语义和句法信息相结合,实现更好的文本表示是一个亟待解决的重要问题。在本文中,我们提出了一种将无监督语义和语法深度集成到异构图(USS-Graph)中的表示方法,用于归纳文本分类。us - graph通过构建一个边缘和节点完全由无监督预训练模型的知识生成的异构图,在双向加权图结构下协调信息的两个视角,从而实现基于图的全局信息与无监督的上下文化语义和句法信息的内融合。在USS-Graph的基础上,提出了一系列优化措施,进一步提高知识集成和表示性能。在基准数据集上进行的大量实验表明,USS-Graph在归纳文本分类任务上始终达到最先进的性能。此外,我们还进行了扩展实验,深入分析了USS-Graph的特点以及我们提出的优化措施的有效性,以进一步进行知识整合和信息互补。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Deeply integrating unsupervised semantics and syntax into heterogeneous graphs for inductive text classification

Deeply integrating unsupervised semantics and syntax into heterogeneous graphs for inductive text classification
Abstract Graph-based neural networks and unsupervised pre-trained models are both cutting-edge text representation methods, given their outstanding ability to capture global information and contextualized information, respectively. However, both representation methods meet obstacles to further performance improvements. On one hand, graph-based neural networks lack knowledge orientation to guide textual interpretation during global information interaction. On the other hand, unsupervised pre-trained models imply rich semantic and syntactic knowledge which lacks sufficient induction and expression. Therefore, how to effectively integrate graph-based global information and unsupervised contextualized semantic and syntactic information to achieve better text representation is an important issue pending for solution. In this paper, we propose a representation method that deeply integrates Unsupervised Semantics and Syntax into heterogeneous Graphs (USS-Graph) for inductive text classification. By constructing a heterogeneous graph whose edges and nodes are totally generated by knowledge from unsupervised pre-trained models, USS-Graph can harmonize the two perspectives of information under a bidirectionally weighted graph structure and thereby realizing the intra-fusion of graph-based global information and unsupervised contextualized semantic and syntactic information. Based on USS-Graph, we also propose a series of optimization measures to further improve the knowledge integration and representation performance. Extensive experiments conducted on benchmark datasets show that USS-Graph consistently achieves state-of-the-art performances on inductive text classification tasks. Additionally, extended experiments are conducted to deeply analyze the characteristics of USS-Graph and the effectiveness of our proposed optimization measures for further knowledge integration and information complementation.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Complex & Intelligent Systems
Complex & Intelligent Systems COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-
CiteScore
9.60
自引率
10.30%
发文量
297
期刊介绍: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信