Named-entity recognition method of key population information based on improved BiLSTM-CRF model
Authors: Bi-Jun Ren, Fanliang Bu, Yihui Fu, Zhiwen Hou
Published in: 2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), 2021-06-28
DOI: 10.1109/ICAICA52286.2021.9497963
Citations: 0
Abstract
Control of key populations bears on national security and social stability. To address the difficulty of extracting key population information entities, this paper proposes a sequence labeling method based on dynamic word vectors. The Word2vec pre-training model is replaced with the BERT model, which strengthens the feature extraction capability of the traditional entity extraction model and more fully captures the multiple semantic and syntactic properties of words. The BiLSTM-CRF model is improved as follows: the BERT model is embedded upstream of the model, where it converts the original corpus into a dynamic vectorized representation; the resulting word vectors are fed into the BiLSTM layer for semantic encoding, which further mines the semantic features of the entity context; finally, the CRF layer outputs the label sequence with the maximum probability. After training on a key population information data set, the model reached an F1 value of 0.90.
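The final step of the pipeline, in which the CRF layer outputs the label sequence with the maximum probability, is usually computed with Viterbi decoding. The following is a minimal sketch of that step only, not the authors' implementation. It assumes per-token emission scores (which in the described model would come from the BiLSTM layer over BERT vectors) and a label-to-label transition matrix. The label set `O`, `B-PER`, `I-PER` and all numeric scores are hypothetical values chosen for illustration.

```python
from typing import List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    labels: Sequence[str],
) -> Tuple[List[str], float]:
    """Return the highest-scoring label sequence and its score.

    emissions[t][j]   : score of label j at token t (assumed BiLSTM output)
    transitions[i][j] : score of moving from label i to label j (CRF parameters)
    """
    n_labels = len(labels)
    # score[j] holds the best score of any path ending in label j so far.
    score = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_score: List[float] = []
        pointers: List[int] = []
        for j in range(n_labels):
            # Best previous label for arriving at label j.
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Pick the best final label, then follow the backpointers to the start.
    last = max(range(n_labels), key=lambda j: score[j])
    best_score = score[last]
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return [labels[i] for i in path], best_score


# Hypothetical label set and toy scores for a three-token sentence.
LABELS = ["O", "B-PER", "I-PER"]
EMISSIONS = [
    [0.1, 2.0, 0.5],
    [0.3, 0.2, 1.5],
    [2.0, 0.1, 0.4],
]
TRANSITIONS = [
    [0.5, 0.2, -10.0],  # from O: moving directly to I-PER is heavily penalized
    [0.1, -1.0, 1.0],   # from B-PER
    [0.3, -1.0, 0.5],   # from I-PER
]

if __name__ == "__main__":
    best_path, best_score = viterbi_decode(EMISSIONS, TRANSITIONS, LABELS)
    print(best_path, best_score)
```

The transition matrix is what distinguishes a CRF output layer from independent per-token classification: the large negative score for `O` followed by `I-PER` keeps the decoder from emitting an inside tag without a preceding begin tag, even where the emission score alone would favor it.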