Named Entity Recognition Incorporating Chinese Segmentation Information

Shiyuan Yu, Shuming Guo, Ruiyang Huang, Jianpeng Zhang, Ke Su, Nan Hu
DOI: https://doi.org/10.1109/IEEECONF52377.2022.10013348
Published in: 2021 International Conference on Advanced Computing and Endogenous Security
Publication date: 2022-04-21

Abstract

Word-level information is crucial for Chinese named entity recognition (NER). Most recent work improves performance by injecting word-level information into character-level representations through existing lexicons, but maintaining those lexicons is a major challenge. In this paper, we present the NIMSI model, which incorporates information from multiple segmentations to enhance recognition. It uses a three-step procedure (a "trilogy") to align character-level attention with word-level attention and thereby construct features from the segmentation information in Chinese text. We also use a simple but effective method to incorporate multi-segmentation information directly into character-level representations. Experiments on three benchmark datasets show that our model effectively incorporates segmentation information and mitigates the impact of segmentation errors.
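
The abstract does not specify how segmentation information is encoded, so the following is only an illustrative sketch and not the NIMSI method itself. It assumes a common convention: each segmenter's output is turned into per-character B/M/E/S position tags (begin, middle, end, single), and the tags from several segmenters are stacked so that every character carries one tag per segmentation. The function names and the example sentence are hypothetical.

```python
from typing import List


def bmes_tags(words: List[str]) -> List[str]:
    """Return one B/M/E/S position tag per character of a segmented sentence."""
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # ignore empty tokens
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # begin, zero or more middles, end
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def multi_segmentation_features(segmentations: List[List[str]]) -> List[List[str]]:
    """Align several segmentations of one sentence at the character level.

    Returns one row per character; each row holds one tag per segmentation.
    """
    sentence = "".join(segmentations[0])
    per_segmenter = []
    for words in segmentations:
        if "".join(words) != sentence:
            raise ValueError("all segmentations must cover the same characters")
        per_segmenter.append(bmes_tags(words))
    # Transpose: rows become characters, columns become segmenters.
    return [list(column) for column in zip(*per_segmenter)]


# Two plausible (hypothetical) segmentations of the same sentence.
segs = [["南京市", "长江大桥"], ["南京", "市长", "江大桥"]]
features = multi_segmentation_features(segs)
for ch, tags in zip("".join(segs[0]), features):
    print(ch, tags)
```

Where the two segmentations disagree (for example on the third and fourth characters), the stacked tags differ, which is the kind of signal a downstream character-level model could use to hedge against a single segmenter's errors.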