Multi-attention mechanism based on gate recurrent unit for English text classification
Haiying Liu
EAI Endorsed Transactions on Scalable Information Systems, published 2022-01-27
DOI: 10.4108/eai.27-1-2022.173166 (https://doi.org/10.4108/eai.27-1-2022.173166)
Citations: 0
Abstract
Text classification is a core task in natural language processing. To address the limitations of current deep learning-based methods when classifying long English texts, this paper proposes a classification model that applies a multi-attention mechanism on top of gated recurrent units (GRUs) to focus on the important parts of a text. First, sentences and documents are encoded according to the hierarchical structure of English documents. Second, an attention mechanism is applied separately at each level. Building on the global object vector, max pooling is used to extract a sentence-specific object vector, so that the encoded document vector carries more pronounced category features and attends to the most distinctive semantic features of each text. Finally, documents are classified from the resulting document representation. Experimental results on public datasets show that the model achieves better classification performance on long English texts with hierarchical structure.
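The two-level attention described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the global object vector and the max-pooled specific vector are combined, so the code assumes they are simply added to form the attention query. The GRU encoders are omitted, and the input is assumed to already hold per-word hidden states. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_pool(states: List[Vector]) -> Vector:
    # Element-wise maximum over all positions (words or sentences).
    return [max(column) for column in zip(*states)]


def attention_pool(states: List[Vector], query: Vector) -> Tuple[Vector, List[float]]:
    # Score each state against the query, then return the weighted sum.
    weights = softmax([dot(h, query) for h in states])
    dim = len(states[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return pooled, weights


def encode_level(states: List[Vector], global_query: Vector) -> Tuple[Vector, List[float]]:
    # ASSUMPTION: the specific object vector is the max-pooled states,
    # and it is added to the global object vector to form the query.
    specific = max_pool(states)
    query = [g + s for g, s in zip(global_query, specific)]
    return attention_pool(states, query)


def encode_document(
    doc: List[List[Vector]], word_query: Vector, sent_query: Vector
) -> Tuple[Vector, List[float]]:
    # Word level: one vector per sentence. Sentence level: one vector per document.
    sentence_vectors = [encode_level(sentence, word_query)[0] for sentence in doc]
    return encode_level(sentence_vectors, sent_query)


# Toy input: 2 sentences of 2-dimensional "hidden states" (stand-ins for GRU outputs).
toy_doc = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [2.0, -1.0], [0.0, 0.0]],
]
doc_vector, sentence_weights = encode_document(toy_doc, [0.1, 0.2], [0.3, -0.1])
```

The resulting `doc_vector` would then be passed to a classifier, such as a softmax layer. In a trained model, the hidden states and the two global query vectors would be learned rather than fixed by hand.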