Unified Neural Lexical Analysis Via Two-Stage Span Tagging

IF 7.3 | Zone 2, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Yantuan Xian, Yefen Zhu, Zhentao Yu, Yuxin Huang, Junjun Guo, Yan Xiang
CAAI Transactions on Intelligence Technology, vol. 10, no. 4, pp. 1254-1267. Published 2025-05-10. DOI: 10.1049/cit2.70015
https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cit2.70015

Abstract

Lexical analysis is a fundamental task in natural language processing that involves several subtasks, such as word segmentation (WS), part-of-speech (POS) tagging, and named entity recognition (NER). Recent work has shown that exploiting the relatedness between these subtasks can be beneficial. This paper proposes a unified neural framework that addresses these subtasks simultaneously. Unlike the sequence tagging paradigm, the proposed method tackles multitask lexical analysis via two-stage sequence span classification. First, the model detects word and named entity boundaries by multi-label classification over character spans in a sentence. Then, it assigns POS labels to words and entity labels to named entities by multi-class classification. Furthermore, a Gated Task Transformation (GTT) is proposed to encourage the model to share valuable features between tasks. The proposed model was evaluated on public Chinese and Thai datasets, where it achieved state-of-the-art results.
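
The two-stage procedure described in the abstract can be illustrated with a small sketch. This is not the authors' model: the names `enumerate_spans`, `two_stage_tag`, and `toy_detect`, the `max_len` limit, and the lookup tables are all hypothetical stand-ins for the neural span classifiers, chosen only to show the control flow (multi-label boundary detection over character spans, followed by multi-class labelling of the detected spans).

```python
from typing import Callable, List, Set, Tuple

Span = Tuple[int, int]            # [start, end) character offsets
Labeled = Tuple[int, int, str]    # span plus its assigned label


def enumerate_spans(text: str, max_len: int) -> List[Span]:
    """List every character span of at most max_len characters."""
    return [
        (i, j)
        for i in range(len(text))
        for j in range(i + 1, min(len(text), i + max_len) + 1)
    ]


def two_stage_tag(
    text: str,
    detect: Callable[[str], Set[str]],
    classify_pos: Callable[[str], str],
    classify_ent: Callable[[str], str],
    max_len: int = 4,
) -> Tuple[List[Labeled], List[Labeled]]:
    """Stage 1 marks spans as word and/or entity; stage 2 labels them."""
    words: List[Labeled] = []
    entities: List[Labeled] = []
    for start, end in enumerate_spans(text, max_len):
        piece = text[start:end]
        kinds = detect(piece)  # stage 1: multi-label, a span may be both
        if "word" in kinds:    # stage 2: multi-class POS label
            words.append((start, end, classify_pos(piece)))
        if "entity" in kinds:  # stage 2: multi-class entity label
            entities.append((start, end, classify_ent(piece)))
    return words, entities


# Hypothetical toy "classifiers": dictionary lookups, not trained models.
WORD_POS = {"我": "PRON", "爱": "VERB", "北京": "PROPN"}
ENTITY_TYPE = {"北京": "LOC"}


def toy_detect(piece: str) -> Set[str]:
    kinds: Set[str] = set()
    if piece in WORD_POS:
        kinds.add("word")
    if piece in ENTITY_TYPE:
        kinds.add("entity")
    return kinds


words, entities = two_stage_tag(
    "我爱北京", toy_detect, WORD_POS.__getitem__, ENTITY_TYPE.__getitem__
)
print(words)     # [(0, 1, 'PRON'), (1, 2, 'VERB'), (2, 4, 'PROPN')]
print(entities)  # [(2, 4, 'LOC')]
```

Note that the span "北京" receives both a word label and an entity label in stage 1, which is why that stage is multi-label. A learned detector would also have to resolve overlapping candidate spans, which this toy lookup sidesteps.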


Source journal
CAAI Transactions on Intelligence Technology (Computer Science, Artificial Intelligence)
CiteScore: 11.00
Self-citation rate: 3.90%
Articles published: 134
Review time: 35 weeks
Journal description: CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. It is a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI), providing research that is openly accessible to read and share worldwide.