Named Entity Recognition Based on BERT-BiLSTM-SPAN in Low Resource Scenarios

Maobin Weng, Weiwen Zhang
{"title":"Named Entity Recognition Based on BERT-BiLSTM-SPAN in Low Resource Scenarios","authors":"Maobin Weng, Weiwen Zhang","doi":"10.1109/ICCRD56364.2023.10080054","DOIUrl":null,"url":null,"abstract":"The task of named entity recognition (NER) is crucial in the creation of knowledge graphs. With the advancement of deep learning, the pre-training model BERT has become the mainstream solution for NER. However, lack of corpus leads to poor performance of NER models using BERT alone. In low resource scenarios, previous work has focused on merging complex information to model or transfer learning from high resource corpora. Therefore, a simple but effective strategy for fully utilizing the corpus is required. In this paper, we focus on recognizing entities under resource constraints. We propose BERT-BiLSTM-SPAN for low resource scenarios, where BERT is used as an embedding layer, combined with BiLSTM and a decoding layer using a span pointer decoding algorithm. To make our model more robust, we employ adversarial training and data augmentation techniques. We conduct experiments on the marine news dataset. The BERT-BiLSTM-SPAN achieves an 80.11% F1-score. Furthermore, experimental results of data augmentation and adversarial training are both encouraging. 
Therefore, our proposed solutions show suitability in low resource scenarios.","PeriodicalId":324375,"journal":{"name":"2023 15th International Conference on Computer Research and Development (ICCRD)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 15th International Conference on Computer Research and Development (ICCRD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCRD56364.2023.10080054","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

The task of named entity recognition (NER) is crucial in the creation of knowledge graphs. With the advancement of deep learning, the pre-trained model BERT has become the mainstream solution for NER. However, a lack of corpora leads to poor performance when NER models use BERT alone. In low resource scenarios, previous work has focused on merging complex information into the model or on transfer learning from high resource corpora. Therefore, a simple but effective strategy for fully utilizing the corpus is required. In this paper, we focus on recognizing entities under resource constraints. We propose BERT-BiLSTM-SPAN for low resource scenarios, where BERT is used as an embedding layer, combined with a BiLSTM layer and a decoding layer that uses a span pointer decoding algorithm. To make our model more robust, we employ adversarial training and data augmentation techniques. We conduct experiments on a marine news dataset, on which BERT-BiLSTM-SPAN achieves an 80.11% F1-score. Furthermore, the experimental results for both data augmentation and adversarial training are encouraging. Therefore, our proposed solutions show suitability in low resource scenarios.
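The span pointer decoding layer predicts, for each token, whether it starts or ends an entity of a given type; entity spans are then formed by pairing each predicted start with the nearest following end of the same type. The abstract does not give the exact algorithm, so the following is only a minimal illustrative sketch of that pairing step (the function name `decode_spans` and the per-token tag format are assumptions, not the authors' code):

```python
def decode_spans(start_tags, end_tags):
    """Pair start/end pointer predictions into entity spans.

    start_tags, end_tags: per-token label lists of equal length,
    e.g. "PER", "LOC", or "O" for no entity boundary at that token.
    Returns a list of (start_index, end_index, entity_type) triples.
    """
    spans = []
    for i, tag in enumerate(start_tags):
        if tag == "O":
            continue
        # Match this start with the nearest end of the same entity type.
        for j in range(i, len(end_tags)):
            if end_tags[j] == tag:
                spans.append((i, j, tag))
                break
    return spans
```

For example, with a start prediction at token 0 ("PER") and an end prediction at token 1, the decoder emits the span (0, 1, "PER") covering both tokens.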
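A common data augmentation technique for low-resource NER is mention replacement: each labeled entity in a sentence is swapped for another mention of the same type, multiplying the effective training data. The abstract does not specify which augmentation method the paper uses, so the sketch below is only one plausible illustration under that assumption (the function `augment_by_mention_replacement` and the BIO tag scheme are hypothetical choices):

```python
import random

def augment_by_mention_replacement(tokens, labels, entity_pool, seed=0):
    """Replace each labeled entity mention with another mention of the
    same type drawn from a pool, keeping the BIO label sequence valid.

    tokens, labels: parallel lists, labels in BIO format ("B-PER", "I-PER", "O").
    entity_pool: dict mapping entity type -> list of replacement token lists.
    """
    rng = random.Random(seed)
    out_tokens, out_labels = [], []
    i = 0
    while i < len(tokens):
        if labels[i].startswith("B-"):
            etype = labels[i][2:]
            # Skip past the original mention's continuation tokens.
            j = i + 1
            while j < len(tokens) and labels[j] == f"I-{etype}":
                j += 1
            # Splice in a replacement mention with matching BIO labels.
            replacement = rng.choice(entity_pool[etype])
            out_tokens.extend(replacement)
            out_labels.append(f"B-{etype}")
            out_labels.extend([f"I-{etype}"] * (len(replacement) - 1))
            i = j
        else:
            out_tokens.append(tokens[i])
            out_labels.append(labels[i])
            i += 1
    return out_tokens, out_labels
```

Because the replacement mention may differ in length from the original, the label sequence is rebuilt rather than copied, which keeps B-/I- tags consistent.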