Weiquan Zhang, Suqin Tang, Danni He, Tinghui Li, Changchun Pan
Proceedings of the 2022 6th International Conference on Innovation in Artificial Intelligence. Published 2022-03-04. DOI: 10.1145/3529466.3529478 (https://doi.org/10.1145/3529466.3529478)
Named Entity Recognition of Zhuang Language Based on the Feature of Initial Letter in Word
Named entity recognition (NER) is a fundamental task underlying intelligent information processing and knowledge representation learning for the Zhuang language. Because Zhuang lacks annotated corpora for named entity labeling, this study proposes a BiLSTM-CNN-CRF network model that incorporates the uppercase/lowercase feature of each word's initial letter. First, word2vec is trained on unlabeled Zhuang text to obtain Zhuang word vectors. Next, a convolutional neural network extracts character-level features of Zhuang words, yielding a character feature vector. These two vectors are concatenated with a randomly initialized initial-letter case feature vector, and the concatenated vectors are fed into the BiLSTM-CNN-CRF model for training, producing an end-to-end NER model for Zhuang. Experimental results show that, without relying on hand-crafted features or external dictionaries, the proposed method outperforms the comparison models, achieving an F1 score of 80.37% on the named entity recognition task and enabling automated named entity recognition for Zhuang.
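The input construction described in the abstract (word vector, character feature vector, and a randomly generated initial-case feature vector joined into one input per token) can be illustrated with a minimal sketch. This is not the authors' implementation: the three case categories, the embedding dimension, the function names, and the example tokens are all assumptions made for illustration, and the word and character vectors are passed in as plain lists rather than produced by word2vec or a CNN.

```python
import random

# Hypothetical case categories; the abstract does not give the exact label set.
CASE_IDS = {"upper": 0, "lower": 1, "other": 2}


def initial_case_id(token):
    """Return a category id describing the case of the token's first character."""
    first = token[:1]
    if first.isalpha() and first.isupper():
        return CASE_IDS["upper"]
    if first.isalpha() and first.islower():
        return CASE_IDS["lower"]
    # Digits, punctuation, and empty tokens fall through to "other".
    return CASE_IDS["other"]


def make_case_table(dim, seed=0):
    """Build a randomly initialised embedding table, one row per case category."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in CASE_IDS]


def build_input(word_vec, char_vec, token, case_table):
    """Concatenate the word, character, and initial-case vectors for one token."""
    return list(word_vec) + list(char_vec) + case_table[initial_case_id(token)]


# Usage with placeholder vectors (stand-ins for word2vec and CNN outputs).
case_table = make_case_table(dim=4)
token_input = build_input([0.1, 0.2], [0.3], "Gvangjsih", case_table)
```

In a trainable model the case table would be a learned embedding layer updated along with the rest of the network; here it is fixed after random initialisation purely to show the shape of the concatenated input that would be passed on to the sequence model.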