Latent semantic indexing and convolutional neural network for multi-label and multi-class text classification
Authors: Oscar Quispe, Alexander Ocsa, R. Coronado
Published in: 2017 IEEE Latin American Conference on Computational Intelligence (LA-CCI), 2017-11-01
DOI: 10.1109/LA-CCI.2017.8285711 (https://doi.org/10.1109/LA-CCI.2017.8285711)
Citations: 10
Abstract
Classifying real-world text should not necessarily be treated as a binary or multi-class problem, since a text may belong to one or more labels. This type of problem is called multi-label classification. In this paper, we propose using latent semantic indexing for text representation, convolutional neural networks for feature extraction, and a single multilayer perceptron for multi-label classification of real text data. The experiments show that the model outperforms state-of-the-art techniques when the dataset contains long documents, whereas precision is poor when the texts are short.
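The distinction the abstract draws between multi-class and multi-label classification can be illustrated with a minimal sketch. It assumes the final layer of a classifier emits one raw score (logit) per label. The abstract does not specify the output activation or the decision threshold, so the per-label sigmoid, the 0.5 threshold, the function names, and the example labels and scores below are all illustrative assumptions rather than the authors' method.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Map one raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_multiclass(logits, labels):
    """Multi-class: labels compete, so exactly one label is returned."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]


def predict_multilabel(logits, labels, threshold=0.5):
    """Multi-label: each label is decided independently, so zero, one,
    or several labels may be returned (threshold is an assumed value)."""
    return [lab for lab, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Hypothetical labels and scores for a single document.
labels = ["sports", "politics", "economy"]
logits = [2.0, 0.3, -1.5]

print(predict_multiclass(logits, labels))
print(predict_multilabel(logits, labels))
```

With these scores, the multi-class rule selects a single label, while the multi-label rule keeps every label whose independent probability clears the threshold, which is how one document can end up with more than one label.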