ELCA: Enhanced boundary location for Chinese named entity recognition via contextual association

IF 0.9 4区 计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yizhao Wang, Shun Mao, Yuncheng Jiang
{"title":"ELCA: Enhanced boundary location for Chinese named entity recognition via contextual association","authors":"Yizhao Wang, Shun Mao, Yuncheng Jiang","doi":"10.3233/ida-230383","DOIUrl":null,"url":null,"abstract":"Named Entity Recognition (NER) is a fundamental task that aids in the completion of other tasks such as text understanding, information retrieval and question answering in Natural Language Processing (NLP). In recent years, the use of a mix of character-word structure and dictionary information forChinese NER has been demonstrated to be effective. As a representative of hybrid models, Lattice-LSTM has obtained better benchmarking results in several publicly available Chinese NER datasets. However, Lattice-LSTM does not address the issue of long-distance entities or the detection of several entities with the same character. At the same time, the ambiguity of entity boundary information also leads to a decrease in the accuracy of embedding NER. This paper proposes ELCA: Enhanced Boundary Location for Chinese Named Entity Recognition Via Contextual Association, a method that solves the problem of long-distance dependent entities by using sentence-level position information. At the same time, it uses adaptive word convolution to overcome the problem of several entities sharing the same character. ELCA achieves the state-of-the-art outcomes in Chinese Word Segmentation and Chinese NER.","PeriodicalId":50355,"journal":{"name":"Intelligent Data Analysis","volume":"23 1","pages":""},"PeriodicalIF":0.9000,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Data Analysis","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/ida-230383","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Named Entity Recognition (NER) is a fundamental task that aids in the completion of other tasks such as text understanding, information retrieval and question answering in Natural Language Processing (NLP). In recent years, the use of a mix of character-word structure and dictionary information forChinese NER has been demonstrated to be effective. As a representative of hybrid models, Lattice-LSTM has obtained better benchmarking results in several publicly available Chinese NER datasets. However, Lattice-LSTM does not address the issue of long-distance entities or the detection of several entities with the same character. At the same time, the ambiguity of entity boundary information also leads to a decrease in the accuracy of embedding NER. This paper proposes ELCA: Enhanced Boundary Location for Chinese Named Entity Recognition Via Contextual Association, a method that solves the problem of long-distance dependent entities by using sentence-level position information. At the same time, it uses adaptive word convolution to overcome the problem of several entities sharing the same character. ELCA achieves the state-of-the-art outcomes in Chinese Word Segmentation and Chinese NER.
ELCA:通过上下文关联加强中文命名实体识别的边界定位
命名实体识别(NER)是一项基本任务,有助于完成自然语言处理(NLP)中的文本理解、信息检索和问题解答等其他任务。近年来,在中文 NER 中混合使用字词结构和词典信息已被证明是有效的。作为混合模型的代表,Lattice-LSTM 在几个公开的中文 NER 数据集中取得了较好的基准结果。然而,Lattice-LSTM 并没有解决远距离实体或多个同字实体的检测问题。同时,实体边界信息的模糊性也会导致嵌入 NER 的准确率下降。本文提出的 ELCA:通过上下文关联增强中文命名实体识别的边界定位,是一种利用句子级位置信息解决长距离依存实体问题的方法。同时,它还利用自适应词卷积克服了多个实体共享同一字符的问题。ELCA 在中文分词和中文近义词识别方面取得了最先进的成果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Intelligent Data Analysis
Intelligent Data Analysis 工程技术-计算机:人工智能
CiteScore
2.20
自引率
5.90%
发文量
85
审稿时长
3.3 months
期刊介绍: Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信