Multi-Task Multi-Attention Transformer for Generative Named Entity Recognition

IF 4.1 | CAS Tier 2, Computer Science | JCR Q1, ACOUSTICS
Ying Mo;Jiahao Liu;Hongyin Tang;Qifan Wang;Zenglin Xu;Jingang Wang;Xiaojun Quan;Wei Wu;Zhoujun Li
{"title":"Multi-Task Multi-Attention Transformer for Generative Named Entity Recognition","authors":"Ying Mo;Jiahao Liu;Hongyin Tang;Qifan Wang;Zenglin Xu;Jingang Wang;Xiaojun Quan;Wei Wu;Zhoujun Li","doi":"10.1109/TASLP.2024.3458796","DOIUrl":null,"url":null,"abstract":"Most previous sequential labeling models are task-specific, while recent years have witnessed the rise of generative models due to the advantage of unifying all named entity recognition (NER) tasks into the encoder-decoder framework. Although achieving promising performance, our pilot studies demonstrate that existing generative models are ineffective at detecting entity boundaries and estimating entity types. In this paper, we propose a multi-task Transformer, which incorporates an entity boundary detection task into the named entity recognition task. More concretely, we achieve entity boundary detection by classifying the relations between tokens within the sentence. To improve the accuracy of entity-type mapping during decoding, we adopt an external knowledge base to calculate the prior entity-type distributions and then incorporate the information into the model via the self- and cross-attention mechanisms. We perform experiments on extensive NER benchmarks, including flat, nested, and discontinuous NER datasets involving long entities. It substantially increases nearly \n<inline-formula><tex-math>$+0.3 \\sim +1.5\\;{F_1}$</tex-math></inline-formula>\n scores across a broad spectrum or performs closely to the best generative NER model. Experimental results show that our approach improves the performance of the generative NER model considerably.","PeriodicalId":13332,"journal":{"name":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","volume":"32 ","pages":"4171-4183"},"PeriodicalIF":4.1000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10679732/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ACOUSTICS","Score":null,"Total":0}
Citations: 0

Abstract

Most previous sequence labeling models are task-specific, while recent years have witnessed the rise of generative models owing to their ability to unify all named entity recognition (NER) tasks in a single encoder-decoder framework. Although these models achieve promising performance, our pilot studies demonstrate that existing generative models are ineffective at detecting entity boundaries and estimating entity types. In this paper, we propose a multi-task Transformer, which incorporates an entity boundary detection task into the named entity recognition task. More concretely, we achieve entity boundary detection by classifying the relations between tokens within the sentence. To improve the accuracy of entity-type mapping during decoding, we adopt an external knowledge base to calculate prior entity-type distributions and then incorporate this information into the model via self- and cross-attention mechanisms. We perform experiments on extensive NER benchmarks, including flat, nested, and discontinuous NER datasets involving long entities. Our model increases $F_1$ scores by roughly $+0.3 \sim +1.5$ across a broad spectrum of benchmarks, or performs close to the best generative NER model. Experimental results show that our approach improves the performance of generative NER models considerably.
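The abstract names two mechanisms but gives no implementation detail: entity boundary detection cast as token-pair relation classification trained jointly with the generative objective, and prior entity-type distributions from an external knowledge base injected through attention. Below is a minimal PyTorch sketch of how such components might look; it is not the authors' code, and every name (`BoundaryRelationHead`, `prior_biased_attention`, `multitask_loss`, the relation label set, the loss weight `alpha`) is an illustrative assumption.

```python
# Hypothetical sketch of the two mechanisms described in the abstract;
# all module/variable names are assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BoundaryRelationHead(nn.Module):
    """Scores a relation label for every (head, tail) token pair."""

    def __init__(self, hidden_size: int, num_relations: int = 2):
        super().__init__()
        self.head_proj = nn.Linear(hidden_size, hidden_size)
        self.tail_proj = nn.Linear(hidden_size, hidden_size)
        self.bilinear = nn.Bilinear(hidden_size, hidden_size, num_relations)

    def forward(self, enc: torch.Tensor) -> torch.Tensor:
        # enc: (batch, seq_len, hidden) encoder states.
        b, t, h = enc.shape
        head = self.head_proj(enc).unsqueeze(2).expand(b, t, t, h)
        tail = self.tail_proj(enc).unsqueeze(1).expand(b, t, t, h)
        # Relation logits for each token pair: (batch, seq, seq, R).
        logits = self.bilinear(head.reshape(-1, h), tail.reshape(-1, h))
        return logits.view(b, t, t, -1)


def prior_biased_attention(q, k, v, type_prior_logp):
    """Scaled dot-product attention with a log-prior bias on the logits.

    type_prior_logp: (batch, q_len, k_len) log of the prior entity-type
    distribution derived from an external knowledge base -- one plausible
    reading of "incorporate via attention", assumed here.
    """
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    attn = F.softmax(scores + type_prior_logp, dim=-1)
    return attn @ v


def multitask_loss(gen_logits, gen_labels, rel_logits, rel_labels, alpha=0.5):
    """Joint objective: generation loss plus weighted boundary loss."""
    # gen_logits: (batch, tgt_len, vocab); gen_labels: (batch, tgt_len).
    gen_loss = F.cross_entropy(
        gen_logits.transpose(1, 2), gen_labels, ignore_index=-100
    )
    # rel_logits: (batch, seq, seq, R); rel_labels: (batch, seq, seq).
    rel_loss = F.cross_entropy(rel_logits.permute(0, 3, 1, 2), rel_labels)
    return gen_loss + alpha * rel_loss
```

The additive log-prior bias mirrors how relative-position or alignment biases are commonly injected into Transformer attention logits before the softmax; whether the paper fuses the prior this way, and how it weights the two losses, is not specified in the abstract.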
Source Journal
IEEE/ACM Transactions on Audio, Speech, and Language Processing (Acoustics; Engineering, Electrical & Electronic)
CiteScore: 11.30
Self-citation rate: 11.10%
Articles published: 217
Journal description: The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech, and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering, and document indexing and retrieval, as well as general language modeling.