A Chinese Named Entity Recognition Method Fusing Word and Radical Features
Shan Deng, Kai-Biao Lin, Ping Lu
Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition, 2022-09-23
DOI: 10.1145/3573942.3574055 (https://doi.org/10.1145/3573942.3574055)
Named Entity Recognition (NER) is a subtask of natural language processing, and its accuracy is crucial for downstream tasks. In Chinese NER, word information is often added to enrich the semantic and boundary information of Chinese words, but these methods ignore the radical information of Chinese characters. This paper proposes a multi-feature fusion model (MFFM) for Chinese NER. First, the input sequence is fed to a BERT layer, a word embedding layer, and a radical embedding layer; then the outputs of these three layers are combined and passed to a Bidirectional Long Short-Term Memory (BiLSTM) layer to model contextual information; finally, a conditional random field (CRF) labels the sequence. The proposed model avoids introducing complex structures while effectively capturing the character features of the context, which improves recognition performance. Experimental results show that MFFM reaches an F1 score of 71.02% on the Weibo dataset, 3.12% higher than the BERT model, and 82.78% on the OntoNotes4.0 dataset, 0.85% higher than the BERT model.
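The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch of its first and last stages. It rests on several assumptions that the abstract does not confirm: that the three layer outputs are "combined" by per-character concatenation, and that the CRF is a linear-chain CRF decoded with the standard Viterbi algorithm. The BiLSTM between the two stages is omitted, and the names `fuse_features` and `viterbi_decode` are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def fuse_features(
    char_vecs: Sequence[Sequence[float]],
    word_vecs: Sequence[Sequence[float]],
    radical_vecs: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate the three feature vectors of each character (assumed fusion)."""
    if not (len(char_vecs) == len(word_vecs) == len(radical_vecs)):
        raise ValueError("each feature sequence needs one vector per character")
    return [
        list(c) + list(w) + list(r)
        for c, w, r in zip(char_vecs, word_vecs, radical_vecs)
    ]


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index path for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    scores = list(emissions[0])  # best score of any path ending in each tag
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_scores, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for reaching tag j at position t.
            best_i = max(range(num_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(range(num_tags), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

In this sketch the transition scores are what let the CRF reject invalid label sequences: giving the O-to-I transition a large negative score makes the decoder prefer a B-then-I path even when O has the higher emission score at the first position.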