Attention Model with Multi-layer Supervision for Text Classification
Chunyi Yue, Hanqiang Cao, Guoping Xu, Youli Dong
Proceedings of the 2020 5th International Conference on Mathematics and Artificial Intelligence, April 10, 2020. DOI: 10.1145/3395260.3395290
Text classification is a classic task in natural language processing. In this study, we propose an attention model with multi-layer supervision for this task. In our model, the context vector from the previous layer is used directly as the attention query to select the required features, and multi-layer supervision is applied to classification: the prediction losses of all layers are combined in the global cost function. The main contribution of our model is that the context vector serves not only as attention but also as the representation of the input text for classification at each layer. We conducted experiments on five benchmark text classification data sets, and the results indicate that our model improves classification performance on most of them.
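The mechanism described above — the previous layer's context vector acting as the attention query, and each layer's context vector being classified and contributing a loss term — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the initial context vector (mean of the hidden states), the shared classifier weights `W_cls`, and the unweighted sum of per-layer losses are all assumptions made for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def forward(H, W_cls, y, n_layers=3):
    """One forward pass of the sketched multi-layer supervised attention model.

    H      : (T, d) token hidden states for one text
    W_cls  : (d, n_classes) classifier weights, shared across layers (assumption)
    y      : int, gold class label
    Returns (per_layer_losses, total_loss).
    """
    c = H.mean(axis=0)            # initial context vector (assumption)
    losses = []
    for _ in range(n_layers):
        scores = H @ c            # previous context vector acts as the attention query
        alpha = softmax(scores)   # attention weights over the T tokens
        c = alpha @ H             # new context vector = weighted sum of hidden states
        p = softmax(c @ W_cls)    # the context vector doubles as the text representation
        losses.append(-np.log(p[y] + 1e-12))  # per-layer cross-entropy loss
    return losses, sum(losses)    # global cost combines the losses of all layers
```

Summing the per-layer losses is what makes the supervision "multi-layer": gradients flow into every layer's classification head, not just the final one.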