Multi feature fusion paper classification model based on attention mechanism
Authors: C. Fan, Yongchun Li, Yuexin Wu
Journal: Icon, vol. 33(1), pp. 308-312
Published: 2023-03-01
DOI: 10.1109/ICNLP58431.2023.00063 (https://doi.org/10.1109/ICNLP58431.2023.00063)
Citation count: 0
Abstract
In recent years, the number of published scientific research papers has grown steadily, making efficient and accurate classification of these papers an important problem. However, leading paper classification platforms, both domestic and international, such as China National Knowledge Infrastructure and Microsoft Academic Network, rely heavily on the structured or semi-structured text in papers for classification and make insufficient use of the unstructured text. To address this problem, we propose a multi-feature fusion paper classification model based on an attention mechanism (AttentionMFF), which fuses features from the structured and unstructured text in papers to improve classification performance. AttentionMFF first extracts features from the different texts in a paper with a BERT layer, then fuses these features with an attention mechanism, and finally obtains the category through a linear layer. Experiments on the arXiv paper dataset show that AttentionMFF achieves a higher F1-score than the TextCNN model and a BERT model that uses only the abstract as its feature.
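The fuse-then-classify step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of attention used, so the scaled dot-product scoring against a single query vector below is an assumption, not the authors' method. The feature vectors stand in for BERT outputs (no encoder is run here), and all names (`attention_fuse`, `classify`, `query`, the toy weights) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(
    features: Sequence[Vector], query: Vector
) -> Tuple[Vector, List[float]]:
    """Fuse one feature vector per text field into a single vector.

    Assumption: each field gets one scaled dot-product score against a
    (hypothetical) learned query vector; the fused vector is the
    softmax-weighted sum of the field vectors.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(f, query) / scale for f in features])
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights


def linear(x: Vector, weight: Sequence[Vector], bias: Vector) -> Vector:
    """Linear layer: one logit per class (one row of `weight` per class)."""
    return [dot(row, x) + b for row, b in zip(weight, bias)]


def classify(
    features: Sequence[Vector],
    query: Vector,
    weight: Sequence[Vector],
    bias: Vector,
) -> int:
    """Return the index of the highest-scoring class."""
    fused, _ = attention_fuse(features, query)
    logits = linear(fused, weight, bias)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy stand-ins for BERT features of two text fields (e.g. title, abstract).
title_vec = [1.0, 0.0]
abstract_vec = [0.0, 1.0]
toy_query = [0.0, 0.0]                  # zero query: equal attention weights
toy_weight = [[1.0, 0.0], [0.0, 2.0]]   # two classes, two input dimensions
toy_bias = [0.0, 0.0]

predicted = classify([title_vec, abstract_vec], toy_query, toy_weight, toy_bias)
```

In a real model, the query vector, linear weights, and the BERT encoder would be trained jointly. Here they are fixed toy values chosen only to make the data flow visible.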