结合语境证据提高汉语内隐语篇关系识别

IF 4.6 3区计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science Pub Date : 2023-11-25 DOI:10.1007/s11704-023-2503-4

Sheng Xu, Peifeng Li, Qiaoming Zhu

{"title":"结合语境证据提高汉语内隐语篇关系识别","authors":"Sheng Xu, Peifeng Li, Qiaoming Zhu","doi":"10.1007/s11704-023-2503-4","DOIUrl":null,"url":null,"abstract":"<p>The discourse analysis task, which focuses on understanding the semantics of long text spans, has received increasing attention in recent years. As a critical component of discourse analysis, discourse relation recognition aims to identify the rhetorical relations between adjacent discourse units (e.g., clauses, sentences, and sentence groups), called arguments, in a document. Previous works focused on capturing the semantic interactions between arguments to recognize their discourse relations, ignoring important textual information in the surrounding contexts. However, in many cases, more than capturing semantic interactions from the texts of the two arguments are needed to identify their rhetorical relations, requiring mining more contextual clues. In this paper, we propose a method to convert the RST-style discourse trees in the training set into dependency-based trees and train a contextual evidence selector on these transformed structures. In this way, the selector can learn the ability to automatically pick critical textual information from the context (i.e., as evidence) for arguments to assist in discriminating their relations. Then we encode the arguments concatenated with corresponding evidence to obtain the enhanced argument representations. Finally, we combine original and enhanced argument representations to recognize their relations. In addition, we introduce auxiliary tasks to guide the training of the evidence selector to strengthen its selection ability. The experimental results on the Chinese CDTB dataset show that our method outperforms several state-of-the-art baselines in both micro and macro F1 scores.</p>","PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":"40 1","pages":""},"PeriodicalIF":4.6000,"publicationDate":"2023-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Incorporating contextual evidence to improve implicit discourse relation recognition in Chinese\",\"authors\":\"Sheng Xu, Peifeng Li, Qiaoming Zhu\",\"doi\":\"10.1007/s11704-023-2503-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The discourse analysis task, which focuses on understanding the semantics of long text spans, has received increasing attention in recent years. As a critical component of discourse analysis, discourse relation recognition aims to identify the rhetorical relations between adjacent discourse units (e.g., clauses, sentences, and sentence groups), called arguments, in a document. Previous works focused on capturing the semantic interactions between arguments to recognize their discourse relations, ignoring important textual information in the surrounding contexts. However, in many cases, more than capturing semantic interactions from the texts of the two arguments are needed to identify their rhetorical relations, requiring mining more contextual clues. In this paper, we propose a method to convert the RST-style discourse trees in the training set into dependency-based trees and train a contextual evidence selector on these transformed structures. In this way, the selector can learn the ability to automatically pick critical textual information from the context (i.e., as evidence) for arguments to assist in discriminating their relations. Then we encode the arguments concatenated with corresponding evidence to obtain the enhanced argument representations. Finally, we combine original and enhanced argument representations to recognize their relations. In addition, we introduce auxiliary tasks to guide the training of the evidence selector to strengthen its selection ability. The experimental results on the Chinese CDTB dataset show that our method outperforms several state-of-the-art baselines in both micro and macro F1 scores.</p>\",\"PeriodicalId\":12640,\"journal\":{\"name\":\"Frontiers of Computer Science\",\"volume\":\"40 1\",\"pages\":\"\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2023-11-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Frontiers of Computer Science\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11704-023-2503-4\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers of Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11704-023-2503-4","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

近年来，以理解长文本跨度语义为重点的语篇分析任务受到越来越多的关注。作为语篇分析的一个重要组成部分，语篇关系识别旨在识别文档中相邻的语篇单位(如子句、句子和句组)之间的修辞关系。以往的研究侧重于捕捉论点之间的语义交互，以识别它们的话语关系，而忽略了周围语境中的重要文本信息。然而，在许多情况下，需要从两个论点的文本中捕获语义交互来识别它们的修辞关系，需要挖掘更多的上下文线索。在本文中，我们提出了一种方法，将训练集中的rst风格的话语树转换为基于依赖的树，并在这些转换后的结构上训练上下文证据选择器。通过这种方式，选择器可以学习自动从上下文(即作为证据)中为参数挑选关键文本信息的能力，以帮助区分它们之间的关系。然后将参数与相应的证据串联起来进行编码，得到增强的参数表示。最后，我们结合原始和增强的参数表示来识别它们之间的关系。此外，我们引入辅助任务来指导证据选择者的训练，以增强其选择能力。在中国CDTB数据集上的实验结果表明，我们的方法在微观和宏观F1得分方面都优于几种最先进的基线。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Incorporating contextual evidence to improve implicit discourse relation recognition in Chinese

The discourse analysis task, which focuses on understanding the semantics of long text spans, has received increasing attention in recent years. As a critical component of discourse analysis, discourse relation recognition aims to identify the rhetorical relations between adjacent discourse units (e.g., clauses, sentences, and sentence groups), called arguments, in a document. Previous works focused on capturing the semantic interactions between arguments to recognize their discourse relations, ignoring important textual information in the surrounding contexts. However, in many cases, more than capturing semantic interactions from the texts of the two arguments are needed to identify their rhetorical relations, requiring mining more contextual clues. In this paper, we propose a method to convert the RST-style discourse trees in the training set into dependency-based trees and train a contextual evidence selector on these transformed structures. In this way, the selector can learn the ability to automatically pick critical textual information from the context (i.e., as evidence) for arguments to assist in discriminating their relations. Then we encode the arguments concatenated with corresponding evidence to obtain the enhanced argument representations. Finally, we combine original and enhanced argument representations to recognize their relations. In addition, we introduce auxiliary tasks to guide the training of the evidence selector to strengthen its selection ability. The experimental results on the Chinese CDTB dataset show that our method outperforms several state-of-the-art baselines in both micro and macro F1 scores.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Frontiers of Computer Science COMPUTER SCIENCE, INFORMATION SYSTEMS-COMPUTER SCIENCE, SOFTWARE ENGINEERING

CiteScore

8.60

自引率

2.40%

发文量

799

审稿时长

6-12 weeks

期刊介绍： Frontiers of Computer Science aims to provide a forum for the publication of peer-reviewed papers to promote rapid communication and exchange between computer scientists. The journal publishes research papers and review articles in a wide range of topics, including: architecture, software, artificial intelligence, theoretical computer science, networks and communication, information systems, multimedia and graphics, information security, interdisciplinary, etc. The journal especially encourages papers from new emerging and multidisciplinary areas, as well as papers reflecting the international trends of research and development and on special topics reporting progress made by Chinese computer scientists.