{"title":"Chinese Text Classification Method Based on BERT Word Embedding","authors":"Ziniu Wang, Zhilin Huang, Jianling Gao","doi":"10.1145/3395260.3395273","DOIUrl":null,"url":null,"abstract":"In this paper, we enhance the semantic representation of the word through the BERT pre-training language model, dynamically generates the semantic vector according to the context of the character, and then inputs the character vector embedded as a character-level word vector sequence into the CapsNet.We builted the BiGRU module in the capsule network for text feature extraction, and introduced attention mechanism to focus on key information.We use the corpus of baidu's Chinese question and answer data set and only take the types of questions as classified samples to conduct experiments.We used the separate BERT network and the CapsNet as a comparative experiment. Finally, the experimental results show that the model effect is better than using one of the models alone, and the effect is improved.","PeriodicalId":103490,"journal":{"name":"Proceedings of the 2020 5th International Conference on Mathematics and Artificial Intelligence","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 5th International Conference on Mathematics and Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3395260.3395273","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6
Abstract
In this paper, we enhance the semantic representation of words through the BERT pre-trained language model, dynamically generating semantic vectors according to the context of each character, and then feed the character embeddings as a character-level word vector sequence into a capsule network (CapsNet). We built a BiGRU module into the capsule network for text feature extraction and introduced an attention mechanism to focus on key information. We conduct experiments on Baidu's Chinese question-answering corpus, using only the question types as classification labels. A standalone BERT network and a standalone CapsNet serve as comparative baselines. The experimental results show that the combined model outperforms either model used alone.
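The attention step described above (focusing on key positions in the BiGRU output sequence) can be sketched as simple attention pooling. This is a minimal illustrative sketch, not the authors' implementation: the function name `attention_pool`, the query vector `w`, and all shapes are assumptions for illustration.

```python
# Hypothetical sketch (not the paper's code): attention pooling over
# BiGRU hidden states, producing one vector weighted toward key positions.
import numpy as np

def attention_pool(H, w):
    """H: (seq_len, hidden) BiGRU outputs; w: (hidden,) attention query.
    Returns a (hidden,) vector: the softmax-weighted sum of hidden states."""
    scores = H @ w                                  # (seq_len,) alignment scores
    scores = scores - scores.max()                  # shift for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax attention weights
    return alpha @ H                                # weighted sum of hidden states

# Toy example: 4 time steps, hidden size 3; the last step aligns best with w.
H = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.],
              [1., 1., 1.]])
w = np.array([1., 1., 1.])
v = attention_pool(H, w)  # pooled representation, shape (3,)
```

In the full model this pooled vector (or the attention-reweighted sequence) would feed the capsule layers; here it simply shows how the attention weights concentrate mass on the most query-aligned time step.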