A Novel Chinese Resume Named Entity Recognition Model Based on Lexical Enhancement

Jinshang Luo, Ying Liu, Mengshu Hou
{"title":"A Novel Chinese Resume Named Entity Recognition Model Based on Lexical Enhancement","authors":"Jinshang Luo, Ying Liu, Mengshu Hou","doi":"10.1145/3581807.3581856","DOIUrl":null,"url":null,"abstract":"The resume's popularity on the Internet has greatly increased with the development of the communication form. It is a concern of researchers to analyze the resumes of job applicants using the Named Entity Recognition (NER) method. The difficulty of Chinese Resume NER rests with word segmentation ambiguity and domain knowledge complexity. To tackle the issue, a novel lexical enhancement Long Short-Term Memory (LSTM) model with the average encoding strategy (LEAE-LSTM) is proposed. First, through the pre-trained models, the representations of characters and words are encoded separately. The lexical features with complementary information are introduced for the character sequence by matching the lexicon. Furthermore, to improve contextual awareness, the multi-metadata embeddings are combined as the input of the LSTM layer. The sentence's implicit correlations are picked up by the self-attention mechanism. Experiments on the benchmark resume dataset demonstrate that LEAE-LSTM surpasses other state-of-the-art methods. For the Chinese resume dataset, LEAE-LSTM gains a 1.8% improvement in F1 score over the baseline model Lattice LSTM.","PeriodicalId":292813,"journal":{"name":"Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3581807.3581856","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

With the growth of online communication, resumes have become increasingly common on the Internet, and analyzing job applicants' resumes with Named Entity Recognition (NER) has drawn researchers' attention. The difficulty of Chinese resume NER lies in word segmentation ambiguity and the complexity of domain knowledge. To tackle these issues, a novel lexical-enhancement Long Short-Term Memory (LSTM) model with an average encoding strategy (LEAE-LSTM) is proposed. First, character and word representations are encoded separately with pre-trained models, and lexical features carrying complementary information are introduced for the character sequence by matching it against a lexicon. The multi-metadata embeddings are then combined as the input to the LSTM layer to improve contextual awareness, and a self-attention mechanism captures implicit correlations within the sentence. Experiments on the benchmark resume dataset demonstrate that LEAE-LSTM surpasses other state-of-the-art methods, gaining a 1.8% improvement in F1 score over the baseline Lattice LSTM on the Chinese resume dataset.
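The abstract outlines the architecture only at a high level. The following PyTorch sketch illustrates one plausible reading of it: character embeddings fused with mean-pooled embeddings of lexicon-matched words, a BiLSTM over the combined input, and self-attention on top. All dimensions, the `matched_word_ids`/`word_mask` inputs, the mean-pooling interpretation of the "average encoding strategy", and the linear output layer are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the LEAE-LSTM idea described in the abstract (assumptions noted above).
import torch
import torch.nn as nn


class LEAELSTMSketch(nn.Module):
    def __init__(self, char_vocab, word_vocab, char_dim=50, word_dim=50,
                 hidden_dim=128, num_labels=9):
        super().__init__()
        # Character and matched-word embeddings are encoded separately
        # (the paper obtains these from pre-trained models).
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        # BiLSTM over the combined ("multi-metadata") embeddings.
        self.lstm = nn.LSTM(char_dim + word_dim, hidden_dim // 2,
                            batch_first=True, bidirectional=True)
        # Self-attention to capture implicit correlations within the sentence.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=4,
                                          batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_labels)

    def forward(self, char_ids, matched_word_ids, word_mask):
        # char_ids:         (batch, seq_len)            character indices
        # matched_word_ids: (batch, seq_len, max_match) lexicon words matched per character
        # word_mask:        (batch, seq_len, max_match) 1 where a match exists
        chars = self.char_emb(char_ids)                       # (B, T, Dc)
        words = self.word_emb(matched_word_ids)               # (B, T, M, Dw)
        # "Average encoding": mean-pool the matched-word vectors per character
        # (our reading of the averaging strategy; an assumption).
        mask = word_mask.unsqueeze(-1).float()
        word_feat = (words * mask).sum(2) / mask.sum(2).clamp(min=1e-6)
        x = torch.cat([chars, word_feat], dim=-1)             # fuse lexical features
        h, _ = self.lstm(x)
        h, _ = self.attn(h, h, h)
        return self.classifier(h)                             # per-character label logits


# Toy usage:
model = LEAELSTMSketch(char_vocab=3000, word_vocab=50000)
logits = model(torch.randint(0, 3000, (2, 20)),
               torch.randint(0, 50000, (2, 20, 4)),
               torch.ones(2, 20, 4, dtype=torch.long))
print(logits.shape)  # torch.Size([2, 20, 9])
```

A sequence-labeling model of this kind would typically be trained with a per-character cross-entropy or CRF objective over BIO-style tags; the paper's exact decoding layer is not specified in the abstract.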