Research on Chinese Naming Recognition Model Based on BERT Embedding

Qing Cai
{"title":"Research on Chinese Naming Recognition Model Based on BERT Embedding","authors":"Qing Cai","doi":"10.1109/ICSESS47205.2019.9040736","DOIUrl":null,"url":null,"abstract":"Named entity recognition (NER) is one of the foundations of natural language processing(NLP). In the method of Chinese named entity recognition based on neural network, the vector representation of words is an important step. Traditional word embedding method map words or chars into a single vector, which can not represent the polysemy of words. To solve this problem, a named entity recognition method based on BERT Embedding model is proposed. The method enhances the semantic representation of words by BERT(Bidirectional Encoder Representations from Transformers) pre-trained language model. BERT can generates the semantic vectors dynamically according to the context of the words, and then inputs the word vectors into BiGRU-CRF for training. The whole model can be trained during training. It is also possible to fix the BERT and train only the BiGRU-CRF part. Experiments show that the two training methods of the model reach 95.43% F1 and 94.18% F1 in MSRA corpus, respectively, which are better than the current optimal Lattice-LSTM model.","PeriodicalId":203944,"journal":{"name":"2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS)","volume":"188 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSESS47205.2019.9040736","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

Named entity recognition (NER) is one of the foundations of natural language processing (NLP). In neural-network approaches to Chinese named entity recognition, the vector representation of words is an important step. Traditional word embedding methods map each word or character to a single vector, which cannot represent polysemy. To solve this problem, a named entity recognition method based on the BERT embedding model is proposed. The method enhances the semantic representation of words with the BERT (Bidirectional Encoder Representations from Transformers) pre-trained language model. BERT generates semantic vectors dynamically according to the context of each word, and these word vectors are then fed into a BiGRU-CRF for training. The whole model can be fine-tuned end to end; alternatively, BERT can be fixed and only the BiGRU-CRF part trained. Experiments show that the two training regimes reach 95.43% and 94.18% F1 on the MSRA corpus, respectively, both better than the current best Lattice-LSTM model.
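The abstract describes a pipeline in which contextual BERT embeddings feed a BiGRU encoder whose emissions are decoded by a CRF, with two training regimes (full fine-tuning vs. frozen BERT). Below is a minimal sketch of that architecture, assuming PyTorch with the HuggingFace `transformers` package and the third-party `pytorch-crf` package (`torchcrf`); the hidden size, checkpoint name, and `freeze_bert` flag are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of a BERT + BiGRU-CRF tagger, per the abstract's description.
# Assumes: pip install torch transformers pytorch-crf
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF


class BertBiGRUCRF(nn.Module):
    def __init__(self, num_tags: int, hidden_size: int = 256,
                 freeze_bert: bool = False):
        super().__init__()
        # BERT yields a context-dependent vector per character/token.
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        if freeze_bert:
            # Second training regime from the abstract: fix BERT and
            # train only the BiGRU-CRF part.
            for p in self.bert.parameters():
                p.requires_grad = False
        self.bigru = nn.GRU(self.bert.config.hidden_size, hidden_size,
                            bidirectional=True, batch_first=True)
        self.emit = nn.Linear(2 * hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        # Dynamic, context-sensitive embeddings from BERT.
        embeds = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        gru_out, _ = self.bigru(embeds)
        emissions = self.emit(gru_out)
        mask = attention_mask.bool()
        if tags is not None:
            # Training: negative log-likelihood under the CRF.
            return -self.crf(emissions, tags, mask=mask)
        # Inference: Viterbi decoding of the best tag sequence.
        return self.crf.decode(emissions, mask=mask)
```

Switching `freeze_bert` between `False` and `True` corresponds to the paper's two reported configurations (95.43% vs. 94.18% F1 on MSRA); only the optimizer's trainable parameter set changes, not the architecture.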