Incorporating external knowledge for text matching model

IF 3.1 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computer Speech and Language | Pub Date: 2024-03-12 | DOI: 10.1016/j.csl.2024.101638

Kexin Jiang, Guozhe Jin, Zhenguo Zhang, Rongyi Cui, Yahui Zhao

Volume 87, Article 101638. Full text: https://www.sciencedirect.com/science/article/pii/S0885230824000214

Citations: 0

Abstract

Text matching is a computational task that involves comparing and establishing the semantic relationship between two textual inputs. The prevailing approach in text matching entails the computation of textual representations or employing attention mechanisms to facilitate interaction with the text. These techniques have demonstrated notable efficacy in various text-matching scenarios. However, these methods primarily focus on modeling the sentence pairs themselves and rarely incorporate additional information to enrich the models. In this study, we address the challenge of text matching in natural language processing by proposing a novel approach that leverages external knowledge sources, namely Wiktionary for word definitions and a knowledge graph for text triplet information. Unlike conventional methods that primarily rely on textual representations and attention mechanisms, our approach enhances semantic understanding by integrating relevant external information. We introduce a fusion module to amalgamate the semantic insights derived from the text and the external knowledge. Our methodology’s efficacy is evidenced through comprehensive experiments conducted on diverse datasets, encompassing natural language inference, text classification, and medical natural language inference. The results unequivocally indicate a significant enhancement in model performance, underscoring the effectiveness of incorporating external knowledge into text-matching tasks.
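
The abstract does not say how the fusion module combines the two sources, so the following is only a minimal sketch of one common option, a scalar gated fusion, and should not be read as the authors' method. The function name `gated_fusion`, the parameters `gate_weights` and `gate_bias`, and the toy vectors are all hypothetical and chosen for illustration; in practice the gate parameters would be learned and the vectors would come from text and knowledge encoders.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec: Vector, knowledge_vec: Vector,
                 gate_weights: Vector, gate_bias: float = 0.0) -> Vector:
    """Blend a text vector with an external-knowledge vector.

    Assumption (not from the paper): a single scalar gate g is computed
    from the concatenated inputs, and the output is
    g * text + (1 - g) * knowledge, element by element.
    """
    if len(text_vec) != len(knowledge_vec):
        raise ValueError("text and knowledge vectors must have the same length")
    concat = text_vec + knowledge_vec
    if len(gate_weights) != len(concat):
        raise ValueError("gate_weights must match the concatenated length")
    # Linear score over the concatenation, squashed into a gate value.
    score = sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias
    g = sigmoid(score)
    return [g * t + (1.0 - g) * k for t, k in zip(text_vec, knowledge_vec)]


# Toy example: all-zero gate weights give g = 0.5, i.e. a plain average.
text = [1.0, 0.0]        # stand-in for a sentence-pair representation
knowledge = [0.0, 1.0]   # stand-in for a definition/triplet representation
fused = gated_fusion(text, knowledge, gate_weights=[0.0, 0.0, 0.0, 0.0])
print(fused)  # [0.5, 0.5]
```

A gate of this kind lets a model lean on the external knowledge when it is informative and fall back on the text representation otherwise; a large positive `gate_bias`, for instance, pushes the output toward the text vector.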

Source journal

Computer Speech and Language (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 11.30

Self-citation rate: 4.70%

Articles published:

Review time: 22.9 weeks

Journal description: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.