Design of Chinese named entity recognition algorithm based on BiLSTM-CRF model
Luan Di, Xie Ling, Wang Guangwen. 2021 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS), published 2021-12-10. DOI: 10.1109/TOCS53301.2021.9688786 (https://doi.org/10.1109/TOCS53301.2021.9688786)
This paper implements a Chinese named entity recognition (NER) algorithm based on a bidirectional LSTM (BiLSTM) and a conditional random field (CRF) model. Named entity recognition is an important task in natural language processing. It is both a typical sequential-data processing problem and a typical sequence labeling problem. Semantic ambiguity and polysemy in Chinese make Chinese NER more difficult. The BiLSTM uses two LSTM networks that read the sequence in opposite directions, giving the model context from both sides of each token. The CRF layer constrains the transitions between adjacent output labels, which further improves recognition accuracy. To prevent overfitting, dropout is also applied in the network. The algorithm is implemented on the TensorFlow platform, and the recognition rate is significantly higher than with a single LSTM model. The experiments also examine how the embedding dimension, the parameter optimizer, and the dropout rate affect recognition accuracy.
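To illustrate how a CRF layer can constrain transitions between output labels, the following is a minimal sketch in plain Python, not the authors' TensorFlow implementation. It runs Viterbi decoding over per-character emission scores (which a BiLSTM would normally produce) plus a label-transition matrix. The BIO label set, the score values, and the names `viterbi_decode` and `greedy_decode` are all assumptions chosen for illustration; in a trained model the transition scores would be learned rather than set by hand.

```python
from typing import List, Tuple


def greedy_decode(emissions: List[List[float]], labels: List[str]) -> List[str]:
    """Pick the highest-scoring label at each position independently."""
    return [labels[max(range(len(labels)), key=lambda j: emit[j])]
            for emit in emissions]


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]],
                   labels: List[str]) -> Tuple[List[str], float]:
    """Return the label path with the best total score and that score.

    emissions[t][j]   : score of label j at position t (hypothetical values)
    transitions[i][j] : score of moving from label i to label j
    """
    n = len(labels)
    # score[j] is the best score of any path ending in label j so far.
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path back from the best final label.
    last = max(range(n), key=lambda j: score[j])
    best_score = score[last]
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return [labels[i] for i in path], best_score


# Assumed BIO label set for a single entity type (person names).
labels = ["O", "B-PER", "I-PER"]

# Hypothetical emission scores for a three-character input.
emissions = [
    [1.0, 0.8, 0.1],
    [0.2, 0.3, 1.0],
    [1.0, 0.1, 0.2],
]

# Hand-set transition scores: heavily penalise the invalid move O -> I-PER.
transitions = [
    [0.0, 0.0, -1e4],  # from O
    [0.0, 0.0, 0.0],   # from B-PER
    [0.0, 0.0, 0.0],   # from I-PER
]

greedy = greedy_decode(emissions, labels)
path, best = viterbi_decode(emissions, transitions, labels)
print("greedy :", greedy)
print("viterbi:", path, best)
```

In this toy example, choosing labels position by position produces a sequence in which `I-PER` directly follows `O`, which is not valid under the BIO scheme. Decoding with the transition penalty yields a valid sequence instead, which is the kind of benefit the abstract attributes to the CRF layer.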