{"title":"基于门递归单元的英语文本分类多注意机制","authors":"Haiying Liu","doi":"10.4108/eai.27-1-2022.173166","DOIUrl":null,"url":null,"abstract":"Text classification is one of the core tasks in the field of natural language processing. Aiming at the advantages and disadvantages of current deep learning-based English text classification methods in long text classification, this paper proposes an English text classification model, which introduces multi-attention mechanism based on gate recurrent unit (GRU) to focus on important parts of English text. Firstly, sentences and documents are encoded according to the hierarchical structure of English documents. Second, it uses the attention mechanism separately at each level. On the basis of the global object vector, the maximum pooling is used to extract the specific object vector of sentence, so that the encoded document vector has more obvious category features and can pay more attention to the most distinctive semantic features of each English text. Finally, documents are classified according to the constructed English document representation. Experimental results on public data sets show that this model has better classification performance for long English texts with hierarchical structure.","PeriodicalId":1,"journal":{"name":"Accounts of Chemical Research","volume":null,"pages":null},"PeriodicalIF":16.4000,"publicationDate":"2022-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-attention mechanism based on gate recurrent unit for English text classification\",\"authors\":\"Haiying Liu\",\"doi\":\"10.4108/eai.27-1-2022.173166\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Text classification is one of the core tasks in the field of natural language processing. Aiming at the advantages and disadvantages of current deep learning-based English text classification methods in long text classification, this paper proposes an English text classification model, which introduces multi-attention mechanism based on gate recurrent unit (GRU) to focus on important parts of English text. Firstly, sentences and documents are encoded according to the hierarchical structure of English documents. Second, it uses the attention mechanism separately at each level. On the basis of the global object vector, the maximum pooling is used to extract the specific object vector of sentence, so that the encoded document vector has more obvious category features and can pay more attention to the most distinctive semantic features of each English text. Finally, documents are classified according to the constructed English document representation. Experimental results on public data sets show that this model has better classification performance for long English texts with hierarchical structure.\",\"PeriodicalId\":1,\"journal\":{\"name\":\"Accounts of Chemical Research\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":16.4000,\"publicationDate\":\"2022-01-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Accounts of Chemical Research\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4108/eai.27-1-2022.173166\",\"RegionNum\":1,\"RegionCategory\":\"化学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"CHEMISTRY, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Accounts of Chemical Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4108/eai.27-1-2022.173166","RegionNum":1,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}
Multi-attention mechanism based on gate recurrent unit for English text classification
Text classification is one of the core tasks in the field of natural language processing. Aiming at the advantages and disadvantages of current deep learning-based English text classification methods in long text classification, this paper proposes an English text classification model, which introduces multi-attention mechanism based on gate recurrent unit (GRU) to focus on important parts of English text. Firstly, sentences and documents are encoded according to the hierarchical structure of English documents. Second, it uses the attention mechanism separately at each level. On the basis of the global object vector, the maximum pooling is used to extract the specific object vector of sentence, so that the encoded document vector has more obvious category features and can pay more attention to the most distinctive semantic features of each English text. Finally, documents are classified according to the constructed English document representation. Experimental results on public data sets show that this model has better classification performance for long English texts with hierarchical structure.
期刊介绍:
Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance.
Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research affording the reader a closer look at the content and significance of an article. Through this provision of a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure for the research.