Design of Chinese named entity recognition algorithm based on BiLSTM-CRF model
Authors: Luan Di, Xie Ling, Wang Guangwen
Published in: 2021 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS), 2021-12-10
DOI: 10.1109/TOCS53301.2021.9688786 (https://doi.org/10.1109/TOCS53301.2021.9688786)
Citation count: 0
Abstract
This paper implements a Chinese named entity recognition algorithm based on a bidirectional LSTM (BiLSTM) and a conditional random field (CRF) model. Named entity recognition is an important task in natural language processing. It is both a typical time-series data processing problem and a typical sequence labeling problem. Because of the semantic ambiguity and polysemy of Chinese, named entity recognition is especially difficult for Chinese text. The BiLSTM uses two LSTM networks that read the sequence in opposite directions, which gives the model context from both sides of each position. The CRF layer models the transition relationships between adjacent output labels and further improves recognition accuracy. To prevent overfitting, dropout is also applied in the network. The algorithm is implemented on the TensorFlow platform, and its recognition rate is significantly higher than that of a single LSTM model. The experiments also examine how the embedding dimension, the parameter optimizer, and the dropout rate affect recognition accuracy.
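The role of the CRF layer described in the abstract can be illustrated with a small decoding example. The sketch below is not the paper's TensorFlow implementation: it is a plain-Python Viterbi decoder, and the tag set, the emission scores (standing in for BiLSTM outputs), and the transition scores are all hypothetical values chosen for illustration. It shows how a transition penalty can rule out a label sequence that per-position scoring alone would pick.

```python
from typing import Dict, List, Tuple

Emissions = List[Dict[str, float]]
Transitions = Dict[Tuple[str, str], float]


def greedy_decode(emissions: Emissions, tags: List[str]) -> List[str]:
    """Pick the best tag at each position independently (no CRF)."""
    return [max(tags, key=lambda tag: scores[tag]) for scores in emissions]


def viterbi_decode(emissions: Emissions, transitions: Transitions,
                   tags: List[str]) -> List[str]:
    """Return the tag sequence with the highest total score.

    emissions[t][tag] is the per-position score for a tag.
    transitions[(prev, cur)] is the score for moving from prev to cur;
    pairs that are not listed score 0.0.
    """
    # best[tag] = best score of any path that ends in `tag` at the current step
    best = {tag: emissions[0][tag] for tag in tags}
    backpointers: List[Dict[str, str]] = []

    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Best previous tag for reaching `cur` at this step
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Walk the backpointers from the best final tag to recover the path
    path = [max(tags, key=lambda tag: best[tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set and scores for a three-character person name.
tags = ["O", "B-PER", "I-PER"]
emissions = [
    {"O": 1.0, "B-PER": 0.8, "I-PER": 0.1},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 1.0},
    {"O": 0.3, "B-PER": 0.1, "I-PER": 0.9},
]
transitions = {
    ("O", "I-PER"): -10.0,      # an entity cannot continue straight after "O"
    ("B-PER", "I-PER"): 0.5,
    ("I-PER", "I-PER"): 0.5,
}

print(greedy_decode(emissions, tags))
print(viterbi_decode(emissions, transitions, tags))
```

With these scores, independent per-position decoding starts the name with "O" and then continues with "I-PER", which is not a valid BIO sequence, while the decoder that accounts for transitions begins the entity with "B-PER". In a trained BiLSTM-CRF the transition scores are learned parameters rather than hand-set values.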