Dongxue Bao, Donghong Qin, Xianye Liang, Lila Hong
DOI: 10.1145/3507548.3507574 (https://doi.org/10.1145/3507548.3507574)
Published in: Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence
Publication date: 2021-12-04
Citations: 2
Short Text Classification Model Based on BERT and Fusion Network
Abstract: Short texts lack contextual information, arrive in large volumes, and have sparse features, and traditional text feature representations cannot dynamically capture the key classification information carried by polysemous words and contextual semantics. To address these problems, this paper proposes B-BAtt-MPC (BERT-BiLSTM-Attention-Max-Pooling-Concat), a network model built on the pre-trained BERT language model that integrates BiLSTM, an attention mechanism, and a max-pooling mechanism. First, the BERT model extracts rich, multi-dimensional feature information such as contextual semantics and grammar. Second, the BERT output vectors are passed through the BiLSTM, attention layer, and max-pooling layer to obtain the most important features. To further optimize the classification model, the BERT and BiLSTM output vectors are also fused and fed into max-pooling. Finally, the classification result is obtained by fusing the two feature vectors produced by max-pooling. Experimental results on two datasets show that the proposed model captures the important, semantically rich features needed for short text classification and improves classification performance.
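The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is one possible reading, not the authors' implementation: the BERT and BiLSTM outputs are replaced by random stand-in vectors, the attention scoring is a simple dot product with a hypothetical `query` vector, and "fusing" is assumed to mean concatenation (as the "Concat" in the model name suggests). All function names and dimensions are illustrative assumptions.

```python
import math
import random
from typing import List

Vector = List[float]
Sequence = List[Vector]  # one feature vector per token


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_pool(seq: Sequence) -> Vector:
    """Max over the token axis, taken separately for each feature dimension."""
    return [max(column) for column in zip(*seq)]


def attention_reweight(seq: Sequence, query: Vector) -> Sequence:
    """Scale each token vector by its softmax attention weight.

    Assumption: dot-product scoring against a single query vector.
    """
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in seq]
    weights = softmax(scores)
    return [[w * h for h in vec] for w, vec in zip(weights, seq)]


def concat_per_token(a: Sequence, b: Sequence) -> Sequence:
    """Concatenate two aligned token sequences along the feature axis."""
    return [x + y for x, y in zip(a, b)]


def fusion_features(bert_out: Sequence, bilstm_out: Sequence, query: Vector) -> Vector:
    """Build the fused sentence vector from the two max-pooled branches."""
    # Branch 1: BiLSTM output -> attention -> max-pooling.
    attention_branch = max_pool(attention_reweight(bilstm_out, query))
    # Branch 2: BERT and BiLSTM outputs fused per token -> max-pooling.
    concat_branch = max_pool(concat_per_token(bert_out, bilstm_out))
    # Final fusion: concatenate the two pooled vectors.
    return attention_branch + concat_branch


def classify(features: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Linear layer followed by softmax over the classes."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def random_sequence(rng: random.Random, tokens: int, dim: int) -> Sequence:
    """Random stand-in for an encoder output (hypothetical helper)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(tokens)]


# Hypothetical sizes: 5 tokens, 8-dim "BERT" vectors, 6-dim "BiLSTM" vectors, 3 classes.
rng = random.Random(0)
bert_out = random_sequence(rng, tokens=5, dim=8)
bilstm_out = random_sequence(rng, tokens=5, dim=6)
query = [rng.uniform(-1.0, 1.0) for _ in range(6)]

features = fusion_features(bert_out, bilstm_out, query)
weights = random_sequence(rng, tokens=3, dim=len(features))  # 3 x 20 weight matrix
probabilities = classify(features, weights, bias=[0.0, 0.0, 0.0])
print(len(features), probabilities)
```

In a real model the stand-in sequences would come from a pre-trained BERT encoder and a trained BiLSTM, and the query vector and classifier weights would be learned; the sketch only shows how the two pooled branches could be combined into one feature vector.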